Vadim Chekan created SPARK-2009:
-----------------------------------

             Summary: Key not found exception when slow receiver starts
                 Key: SPARK-2009
                 URL: https://issues.apache.org/jira/browse/SPARK-2009
             Project: Spark
          Issue Type: Bug
          Components: Streaming
    Affects Versions: 1.0.0
            Reporter: Vadim Chekan


I got a "java.util.NoSuchElementException: key not found: 1401756085000 ms"
exception when using a Kafka stream with a batchPeriod of 1 second.

Investigation showed that the cause is that ReceiverLauncher.startReceivers
is asynchronous (it runs in a separate thread):
https://github.com/vchekan/spark/blob/master/streaming/src/main/scala/org/apache/spark/streaming/scheduler/ReceiverTracker.scala#L206

A slow-starting receiver, such as Kafka, can easily take more than 2 seconds
to start. As a result, "compute" is not called even once on
ReceiverInputDStream before the first batch job is executed, so
receivedBlockInfo remains empty. The batch job then calls
ReceiverInputDStream.getReceivedBlockInfo, which throws the "key not found"
exception.

The patch makes getReceivedBlockInfo more robust by tolerating missing values.
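
To illustrate the failure mode and the idea behind the fix, here is a minimal
standalone sketch. It is not the actual patch and does not use Spark's real
classes: Time, BlockInfo, ReceivedBlockLookup, strictLookup and tolerantLookup
are hypothetical stand-ins, and the map merely plays the role of
receivedBlockInfo. It only assumes that the lookup is keyed by batch time in a
mutable HashMap.

```scala
import scala.collection.mutable

// Hypothetical stand-ins for illustration only, not Spark's real classes.
final case class Time(millis: Long) {
  override def toString: String = s"$millis ms"
}
final case class BlockInfo(blockId: String)

object ReceivedBlockLookup {
  type Blocks = mutable.HashMap[Time, Array[BlockInfo]]

  // Strict lookup: HashMap.apply throws NoSuchElementException
  // ("key not found: ...") when no entry exists for the batch time.
  def strictLookup(blocks: Blocks, time: Time): Array[BlockInfo] =
    blocks(time)

  // Tolerant lookup: a batch with no recorded blocks yields an empty
  // array, which covers the case where the receiver has not started yet.
  def tolerantLookup(blocks: Blocks, time: Time): Array[BlockInfo] =
    blocks.getOrElse(time, Array.empty[BlockInfo])
}
```

With an empty map, strictLookup fails for any batch time, while
tolerantLookup returns an empty array and lets the first batch job proceed
with no input.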



--
This message was sent by Atlassian JIRA
(v6.2#6252)
