How to work around NoOffsetForPartitionException when using Spark Streaming

2018-06-01 Thread Martin Peng
Hi, we see the exception below when using Spark Kafka streaming 0.10 on a normal Kafka topic. I am not sure why the offset is missing in ZK, but since Spark Streaming overrides the offset reset policy to none in the code, I cannot set the reset policy to latest (I don't really care about data loss now). Is there any
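A common way around this (a minimal sketch, not taken from the thread; the topic name "system-logs", the group id, the broker address, and the partition/offset values are all hypothetical) is to hand the 0-10 direct stream explicit starting offsets via ConsumerStrategies.Subscribe, so the consumer never has to fall back on the reset policy for partitions that have no committed offset:

import org.apache.kafka.common.TopicPartition
import org.apache.kafka.common.serialization.StringDeserializer
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka010.{ConsumerStrategies, KafkaUtils, LocationStrategies}

object OffsetWorkaround {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("offset-workaround")
    val ssc  = new StreamingContext(conf, Seconds(30))

    val kafkaParams = Map[String, Object](
      "bootstrap.servers"  -> "broker1:9092",            // hypothetical broker
      "key.deserializer"   -> classOf[StringDeserializer],
      "value.deserializer" -> classOf[StringDeserializer],
      "group.id"           -> "log-consumer-group",      // hypothetical group id
      "auto.offset.reset"  -> "latest",
      "enable.auto.commit" -> (false: java.lang.Boolean)
    )

    // Explicit starting offsets sidestep the "no committed offset + reset=none"
    // situation; the partitions and offsets below are placeholders.
    val startingOffsets = Map(
      new TopicPartition("system-logs", 0) -> 0L,
      new TopicPartition("system-logs", 1) -> 0L
    )

    val stream = KafkaUtils.createDirectStream[String, String](
      ssc,
      LocationStrategies.PreferConsistent,
      ConsumerStrategies.Subscribe[String, String](Seq("system-logs"), kafkaParams, startingOffsets)
    )

    stream.map(_.value).print()

    ssc.start()
    ssc.awaitTermination()
  }
}

Saving the ending offsets of each batch somewhere durable (for example back to Kafka via commitAsync) and feeding them in as the starting offsets on restart keeps this workaround consistent across restarts.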

Re: Spark Job crash due to File Not found when shuffle intermittently

2017-07-25 Thread Martin Peng
handle task failure, so if the job ended normally this error can be ignored. Second, when using BypassMergeSortShuffleWriter, it first writes the data file and then writes an index file. You can check "Failed to delete temporary index file at
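If the FileNotFoundException only shows up while failed or speculative tasks are being retried, one defensive tweak (a sketch, not something proposed in this thread; the keys are standard Spark shuffle/network settings and the values are illustrative, not tuned) is to make shuffle fetches more tolerant of transient file churn:

import org.apache.spark.SparkConf

val conf = new SparkConf()
  .setAppName("shuffle-hardening")
  // Retry fetching shuffle blocks a few more times before failing the task.
  .set("spark.shuffle.io.maxRetries", "10")
  .set("spark.shuffle.io.retryWait", "10s")
  // Give slow executors more time before a fetch is declared dead.
  .set("spark.network.timeout", "300s")

The same keys can be passed with --conf at spark-submit time instead of being hard-coded.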

Re: Spark Job crash due to File Not found when shuffle intermittently

2017-07-24 Thread Martin Peng
Is there anyone who can shed some light on this issue? Thanks, Martin. 2017-07-21 18:58 GMT-07:00 Martin Peng <wei...@gmail.com>: > Hi, I have several Spark jobs, including both batch and streaming jobs, to process the system logs and analyze them. We are using Kaf

Spark Job crash due to File Not found when shuffle intermittently

2017-07-21 Thread Martin Peng
Hi, I have several Spark jobs, including both batch and streaming jobs, to process the system logs and analyze them. We are using Kafka as the pipeline connecting the jobs. After upgrading to Spark 2.1.0 + Spark Kafka Streaming 010, I found that some of the jobs (both batch and streaming) throw the below

The stability of Spark Stream Kafka 010

2017-06-29 Thread Martin Peng
Hi, we plan to upgrade our Spark Kafka library from 0.8.1 to 0.10 to simplify our infrastructure code logic. Does anybody know when the 010 version will graduate from experimental to stable? May I use this 010 version together with Spark 1.5.1?
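For reference, a minimal build.sbt sketch (version numbers are illustrative) of how the 0-10 integration is normally declared; the artifact is released in lockstep with Spark itself, which is why pairing it with a much older Spark core is the part worth verifying first:

// build.sbt (sketch): keep the integration's version aligned with the Spark version in use.
libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-streaming"            % "2.1.0" % "provided",
  "org.apache.spark" %% "spark-streaming-kafka-0-10" % "2.1.0"
)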