[ https://issues.apache.org/jira/browse/SPARK-2009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14019610#comment-14019610 ]
Kanwaldeep commented on SPARK-2009:
-----------------------------------
I have exactly the same problem. I'm also seeing that data is being read from
Kafka and offsets are being incremented even though we are failing to process
the data. Another concern is that the Spark application monitoring UI does not
show any such exception.
> Key not found exception when slow receiver starts
> -------------------------------------------------
>
> Key: SPARK-2009
> URL: https://issues.apache.org/jira/browse/SPARK-2009
> Project: Spark
> Issue Type: Bug
> Components: Streaming
> Affects Versions: 1.0.0
> Reporter: Vadim Chekan
>
> I got a "java.util.NoSuchElementException: key not found: 1401756085000 ms"
> exception when using a Kafka stream with a 1 sec batchPeriod.
> Investigation showed that the cause is that ReceiverLauncher.startReceivers
> is asynchronous (it runs in its own thread).
> https://github.com/vchekan/spark/blob/master/streaming/src/main/scala/org/apache/spark/streaming/scheduler/ReceiverTracker.scala#L206
> A slow-starting receiver, such as Kafka, can easily take more than 2 sec to
> start. As a result, "compute" is not called even once on
> ReceiverInputDStream before the first batch job is executed, and
> receivedBlockInfo remains empty. The batch job then calls
> ReceiverInputDStream.getReceivedBlockInfo, which throws the "key not found"
> exception.
> The patch makes getReceivedBlockInfo more robust by tolerating missing values.
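
For readers following the thread, the failure mode and the kind of fix described above can be illustrated with a minimal sketch. This is not the actual patch or Spark's real class: the object ReceivedBlockInfoSketch, its methods, and the use of plain Long keys and String block names are all assumptions made for illustration. It only shows how a strict lookup on a Scala mutable HashMap throws for a batch time that was never registered, while a getOrElse lookup returns an empty result instead.

```scala
import scala.collection.mutable

// Hypothetical stand-in for the per-batch block bookkeeping; NOT Spark's actual code.
object ReceivedBlockInfoSketch {
  // Batch time in ms -> names of the blocks received for that batch (assumed types).
  private val receivedBlockInfo = mutable.HashMap.empty[Long, Array[String]]

  // Register the blocks for a batch, as would happen once the receiver is running.
  def record(batchTimeMs: Long, blocks: Array[String]): Unit =
    receivedBlockInfo(batchTimeMs) = blocks

  // Strict lookup: HashMap.apply throws NoSuchElementException ("key not found: ...")
  // when the batch time was never registered, e.g. because the receiver is still starting.
  def strictLookup(batchTimeMs: Long): Array[String] =
    receivedBlockInfo(batchTimeMs)

  // Tolerant lookup: a missing batch time is treated as "no blocks received yet".
  def tolerantLookup(batchTimeMs: Long): Array[String] =
    receivedBlockInfo.getOrElse(batchTimeMs, Array.empty[String])
}
```

With this sketch, calling tolerantLookup for a batch that ran before the receiver started yields an empty array, so the batch job can proceed with no input rather than fail.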