Kafka doesn't yet have a good way to distinguish between "it's ok
to reset at the beginning of a job" and "it's ok to reset any time
offsets are out of range":
https://issues.apache.org/jira/browse/KAFKA-3370
Because of that, it's better IMHO to err on the side of obvious
failures, as opposed to silently losing data.
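To illustrate, here is a minimal sketch of a plain Kafka consumer config
(broker and group values are placeholders): the single auto.offset.reset
policy governs both situations, which is exactly the limitation KAFKA-3370
is about.

    import java.util.Properties
    import org.apache.kafka.clients.consumer.KafkaConsumer
    import org.apache.kafka.common.serialization.StringDeserializer

    val props = new Properties()
    props.put("bootstrap.servers", "localhost:9092")  // placeholder broker
    props.put("group.id", "my-streaming-app")         // placeholder group
    props.put("key.deserializer", classOf[StringDeserializer].getName)
    props.put("value.deserializer", classOf[StringDeserializer].getName)
    // "earliest" / "latest" silently reset whenever the committed offset
    // has been deleted by retention, mid-run or at startup; "none" throws
    // OffsetOutOfRangeException instead, an obvious failure rather than
    // silent data loss. There is no "reset only at job start" option.
    props.put("auto.offset.reset", "none")
    val consumer = new KafkaConsumer[String, String](props)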
Yes, that's right.
I understand this means data loss. The restart doesn't have to be all that
silent; it requires us to set a flag, and I thought auto.offset.reset was
that flag.
But there isn't much I can do at this point given that retention has
cleaned things up.
The app has to start. Let admins address the data loss afterwards.
Just so I'm clear on what's happening...
- you're running a job that auto-commits offsets to kafka.
- you stop that job for longer than your retention
- you start that job back up, and it errors because the last committed
offset is no longer available
- you think that instead of erroring, the job should have honored
auto.offset.reset and started from the offsets that are still available
My retention is 1d, which isn't terribly low. The problem is that every
time I restart after retention expiry, I get this exception instead of it
honoring auto.offset.reset.
It isn't a corner case where retention expired after the driver created a
batch. It's easily reproducible and consistent.
On Tue, Sep 6, 2016, Cody Koeninger wrote:
You don't want auto.offset.reset on executors; you want executors to
do what the driver told them to do. Otherwise you're going to get
really horrible data inconsistency issues if the executors silently
reset.
If your retention is so low that it expires in between when the driver
creates a batch and when the executors read it, you have bigger problems.
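For reference, a sketch of the driver-side setup with the
spark-streaming-kafka-0-10 integration (topic, broker, and group names are
placeholders): auto.offset.reset is honored on the driver, but the
integration forces it to "none" on the executors, so they fail loudly
instead of silently resetting.

    import org.apache.kafka.common.serialization.StringDeserializer
    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}
    import org.apache.spark.streaming.kafka010.KafkaUtils
    import org.apache.spark.streaming.kafka010.LocationStrategies.PreferConsistent
    import org.apache.spark.streaming.kafka010.ConsumerStrategies.Subscribe

    val conf = new SparkConf().setAppName("kafka-restart-sketch") // placeholder
    val ssc = new StreamingContext(conf, Seconds(5))

    // Driver-side Kafka params; values below are placeholders.
    val kafkaParams = Map[String, Object](
      "bootstrap.servers" -> "localhost:9092",
      "key.deserializer" -> classOf[StringDeserializer],
      "value.deserializer" -> classOf[StringDeserializer],
      "group.id" -> "my-streaming-app",
      // Used by the driver when computing offset ranges; executor
      // consumers get this overridden to "none" so they never reset
      // on their own.
      "auto.offset.reset" -> "earliest",
      "enable.auto.commit" -> (false: java.lang.Boolean)
    )

    val stream = KafkaUtils.createDirectStream[String, String](
      ssc,
      PreferConsistent,
      Subscribe[String, String](Array("mytopic"), kafkaParams)
    )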
This isn't a production setup. We kept retention low intentionally.
My original question was: why do I get the exception on restart instead of
it honoring auto.offset.reset?
On Tue, Sep 6, 2016 at 10:48 AM, Cody Koeninger wrote:
If you leave enable.auto.commit set to true, it will commit offsets to
kafka, but you will get undefined delivery semantics.
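The usual alternative, sketched below using the stream from a setup like
the one above, is to leave auto-commit off and commit the offset ranges
yourself once the batch's output has succeeded:

    import org.apache.spark.streaming.kafka010.{CanCommitOffsets, HasOffsetRanges}

    // Commit offsets back to Kafka only after the output action succeeds,
    // instead of relying on enable.auto.commit=true.
    stream.foreachRDD { rdd =>
      val offsetRanges = rdd.asInstanceOf[HasOffsetRanges].offsetRanges
      // ... write the batch's results somewhere ...
      stream.asInstanceOf[CanCommitOffsets].commitAsync(offsetRanges)
    }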
If you just want to restart from a fresh state, the easiest thing to
do is use a new consumer group name.
But if that keeps happening, you should look into why your retention is
expiring before your job can be restarted.
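E.g., assuming a kafkaParams map like the sketch above:

    // A brand-new group.id has no committed offsets, so the stream starts
    // wherever auto.offset.reset says ("earliest" / "latest").
    val freshParams = kafkaParams + ("group.id" -> "my-streaming-app-v2") // placeholder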
Hi,
Upon restarting, my Spark Streaming app fails with this error:
Exception in thread "main" org.apache.spark.SparkException: Job aborted due
to stage failure: Task 0 in stage 1.0 failed 1 times, most recent failure:
Lost task 0.0 in stage 1.0 (TID 6, localhost):
org.apache.kafka.clients.consumer.OffsetOutOfRangeException