Thanks, Cody, for the reply. My thinking was that time is required anyway
to write and commit the offsets to any of the external systems,
all of which are sync.
So why not use a sync commit to Kafka itself to store the offsets? It helps add
another dependency on the application side to check if say
1. No, prefetched message offsets aren't exposed.
2. No, I'm not aware of any plans for sync commit, and I'm not sure
that makes sense. You have to be able to deal with repeat messages in
the event of failure in any case, so the only difference sync commit
would make would be (possibly) slower
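To illustrate the point about repeat messages: a minimal sketch, not Spark or Kafka code, of a sink that tolerates replays by keying each write on (topic, partition, offset). The class and function names, the record shape, and the in-memory store are all hypothetical and stand in for whatever external system the job writes to.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical record shape: (topic, partition, offset, value).
Record = Tuple[str, int, int, str]


class IdempotentSink:
    """Toy in-memory sink keyed by (topic, partition, offset)."""

    def __init__(self) -> None:
        self.rows: Dict[Tuple[str, int, int], str] = {}
        self.writes = 0  # counts every write attempt, including replays

    def write(self, record: Record) -> None:
        topic, partition, offset, value = record
        self.writes += 1
        # A replayed record overwrites the same key with the same value,
        # so the stored result does not change.
        self.rows[(topic, partition, offset)] = value


def process_batch(sink: IdempotentSink, batch: List[Record],
                  fail_after: Optional[int] = None) -> None:
    """Write a batch; optionally raise after `fail_after` writes to
    simulate a crash that happens before the offsets are committed."""
    for i, record in enumerate(batch):
        if fail_after is not None and i == fail_after:
            raise RuntimeError("simulated failure before offset commit")
        sink.write(record)


def run_demo() -> IdempotentSink:
    batch = [("events", 0, off, f"payload-{off}") for off in range(5)]
    sink = IdempotentSink()
    committed = 0  # next offset to read; advanced only after a full batch

    try:
        process_batch(sink, batch, fail_after=3)
        committed = 5
    except RuntimeError:
        pass  # no commit happened, so the batch is replayed from offset 0

    process_batch(sink, batch[committed:])
    committed = 5
    return sink


if __name__ == "__main__":
    sink = run_demo()
    print(len(sink.rows), "distinct rows after", sink.writes, "writes")
```

In this sketch the first three records are written twice, yet the sink ends up with one row per offset. That holds whether the commit is sync or async, because a failure between the write and the commit leads to a replay in both cases.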
Hi Experts,
I have a question about what could potentially happen with Spark Streaming 2.2.0 +
Kafka. LocationStrategies says that the "new Kafka consumer API will pre-fetch
messages into buffers".
If we store offsets in Kafka, we can currently only use async commits.
So,
1 - Could it happen that we commit