This is running in YARN cluster mode. It was restarted automatically and
continued fine.
I was trying to see what went wrong. AFAIK there were no task failures.
There is nothing in the executor logs. The log I gave is from the driver.

After some digging, I did see in the Kafka logs that there was a rebalance
around this time. So will the driver fail and exit in such cases?
I've seen drivers exit after a job has hit its max retry attempts. This is
different though, right?

Srikanth


On Tue, Feb 7, 2017 at 5:25 PM, Tathagata Das <tathagata.das1...@gmail.com>
wrote:

> Does restarting after a few minutes solve the problem? It could be a
> transient issue that lasts long enough for all of Spark's task-level retries
> to fail.
>
> On Tue, Feb 7, 2017 at 4:34 PM, Srikanth <srikanth...@gmail.com> wrote:
>
>> Hello,
>>
>> I had a Spark Streaming app that reads from Kafka running for a few hours,
>> after which it failed with this error:
>>
>> 17/02/07 20:04:10 ERROR JobScheduler: Error generating jobs for time 1486497850000 ms
>> java.lang.IllegalStateException: No current assignment for partition mt_event-5
>>      at org.apache.kafka.clients.consumer.internals.SubscriptionState.assignedState(SubscriptionState.java:251)
>>      at org.apache.kafka.clients.consumer.internals.SubscriptionState.needOffsetReset(SubscriptionState.java:315)
>>      at org.apache.kafka.clients.consumer.KafkaConsumer.seekToEnd(KafkaConsumer.java:1170)
>>      at org.apache.spark.streaming.kafka010.DirectKafkaInputDStream.latestOffsets(DirectKafkaInputDStream.scala:179)
>>      at org.apache.spark.streaming.kafka010.DirectKafkaInputDStream.compute(DirectKafkaInputDStream.scala:196)
>>      at org.apache.spark.streaming.dstream.DStream$$anonfun$getOrCompute$1$$anonfun$1$$anonfun$apply$7.apply(DStream.scala:341)
>>      at org.apache.spark.streaming.dstream.DStream$$anonfun$getOrCompute$1$$anonfun$1$$anonfun$apply$7.apply(DStream.scala:341)
>>
>> ....
>> ....
>>
>> 17/02/07 20:04:10 ERROR ApplicationMaster: User class threw exception: java.lang.IllegalStateException: No current assignment for partition mt_event-5
>> java.lang.IllegalStateException: No current assignment for partition mt_event-5
>>
>> ....
>> ....
>>
>> 17/02/07 20:05:00 WARN JobGenerator: Timed out while stopping the job generator (timeout = 50000)
>>
>>
>> The driver did not recover from this error and failed. The previous batch ran
>> 5 seconds earlier. There are no indications in the logs that a rebalance happened.
>> According to the Kafka admin, the Kafka cluster was healthy when this happened
>> and no maintenance was being done.
>>
>> Any idea what could have gone wrong and why this is a fatal error?
>>
>> Regards,
>> Srikanth
>>
>>
>
