[
https://issues.apache.org/jira/browse/SPARK-8163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14578161#comment-14578161
]
SaintBacchus commented on SPARK-8163:
-------------------------------------
Hi [~sowen], the description covers exactly how I ran into the problem.
Since my English is poor, you may not have understood what I wrote, so here are the steps:
First, the Producer pushed a lot of data to the Kafka Brokers.
Second, after a while (about 10s), I shut down the streaming application.
Third, I recovered it from the checkpoint file.
The result is that Streaming skipped many batches.
I really think this is a big problem, so I am reopening this issue.
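
To make the timeline in these steps concrete, here is a toy model in plain Python. It is not Spark code and does not use Spark internals. It assumes jobs run one at a time, that a new batch is recorded every batch interval, and (following the report) that recovery resumes from the last recorded batch. The function name simulate and its parameters are made up for illustration.

```python
def simulate(batch_interval_s, first_batch_cost_s, crash_at_s, normal_cost_s=1):
    """Toy model of the reported scenario; all times are in whole seconds.

    Returns (generated, completed, skipped) lists of batch times.
    """
    # A batch is recorded at every interval until the crash.
    generated = list(range(batch_interval_s, crash_at_s + 1, batch_interval_s))

    completed = []
    clock = 0  # time at which the single job slot becomes free
    for index, batch_time in enumerate(generated):
        # Only the first batch is slow (it drains the backlog from Kafka).
        cost = first_batch_cost_s if index == 0 else normal_cost_s
        start = max(clock, batch_time)
        end = start + cost
        if end > crash_at_s:
            break  # the job was still running when the application died
        completed.append(batch_time)
        clock = end

    # Assumption taken from the report: recovery restarts at the last
    # recorded batch, so earlier unfinished batches are never processed.
    resume_from = generated[-1] if generated else None
    skipped = [t for t in generated if t not in completed and t != resume_from]
    return generated, completed, skipped


generated, completed, skipped = simulate(
    batch_interval_s=2, first_batch_cost_s=10, crash_at_s=10
)
print("generated:", generated)
print("completed:", completed)
print("skipped:  ", skipped)
```

With a 2 second interval, a 10 second first batch and a crash at 10 seconds, the model records five batches, completes none of them, and skips every batch before the last one, which matches the behavior described above.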
> CheckPoint mechanism did not work well when error happened in big streaming
> ---------------------------------------------------------------------------
>
> Key: SPARK-8163
> URL: https://issues.apache.org/jira/browse/SPARK-8163
> Project: Spark
> Issue Type: Bug
> Components: Streaming
> Affects Versions: 1.4.0
> Reporter: SaintBacchus
>
> I tested it with a Kafka DStream.
> Sometimes the Kafka Producer had pushed a lot of data to the Kafka Brokers, and then
> the Streaming Receiver wanted to pull this data without a rate limit.
> For this first batch, Streaming may take 10 or more seconds to consume the
> data (the batch interval was 2 seconds).
> I want to describe in more detail what Streaming was doing at this moment:
> the SC was doing its job; the JobGenerator was still sending new batches to
> the StreamingContext, and the StreamingContext wrote them to the checkpoint files; and
> the Receiver was still busy receiving data from Kafka and also recorded
> these events in the checkpoint.
> Then an unexpected error occurred, which shut down the Streaming
> application.
> Then we wanted to recover the application from the checkpoint files. But since
> the StreamingContext had already recorded the next few batches, it was recovered
> from the last batch. So Streaming had already missed the first batch and
> did not know what data had actually been consumed by the Receiver.
> Setting spark.streaming.concurrentJobs=2 could avoid this problem, but some
> applications cannot do this.
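
As a rough sketch of how the workaround mentioned in the description might be passed to an application, the snippet below renders Spark properties as spark-submit --conf arguments. The helper spark_submit_conf_flags is hypothetical. spark.streaming.receiver.maxRate is a documented Spark setting that the issue text does not mention; it is included only because the description says the Receiver pulled data without a rate limit, and the value 10000 is a placeholder, not a recommendation.

```python
def spark_submit_conf_flags(conf):
    """Render a dict of Spark properties as spark-submit --conf arguments."""
    flags = []
    for key in sorted(conf):
        flags.extend(["--conf", f"{key}={conf[key]}"])
    return flags


streaming_conf = {
    # Workaround named in the issue description.
    "spark.streaming.concurrentJobs": 2,
    # Assumption: cap each receiver's intake (records per second).
    # The value is a placeholder chosen for illustration.
    "spark.streaming.receiver.maxRate": 10000,
}

print(" ".join(spark_submit_conf_flags(streaming_conf)))
```

As the description notes, some applications cannot run with concurrentJobs set to 2, so this only shows how the setting would be supplied. It does not address the checkpoint behavior itself.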