GitHub user tdas commented on the pull request:
https://github.com/apache/spark/pull/4805#issuecomment-78121376
@koeninger I agree that the purpose of this PR was just to update ZK so that
tools work. I am thinking beyond that, about the bigger goal of making this new
API a feature superset of the old API. Maybe this PR is not the right place to
discuss that, and we can move the discussion elsewhere if you prefer.
However, for the sake of answering your questions, let me give the answers here.
What I was proposing is the following. The simpler direct API (the one that
does not take explicit offsets) can pick up the initial offsets from ZK if
group.id is specified. This is exactly analogous to how we find the latest or
earliest offset in that API, except that instead of the latest/earliest offset
of the topic we look up the offset of the consumer group. The group offsets in
ZK are not used at all for any further processing or restarting, so the
exactly-once semantics are not broken.
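To make the proposed lookup order concrete, here is a minimal sketch in plain
Python. It is not the Spark or Kafka API: the function name
`resolve_start_offsets`, its parameters, and the dict-based inputs are all
hypothetical stand-ins, and the "smallest"/"largest" values only mirror the
earliest/latest choice mentioned above.

```python
from typing import Dict, Iterable, Optional, Tuple

TopicPartition = Tuple[str, int]  # (topic, partition), hypothetical alias


def resolve_start_offsets(
    partitions: Iterable[TopicPartition],
    earliest: Dict[TopicPartition, int],
    latest: Dict[TopicPartition, int],
    group_offsets: Optional[Dict[TopicPartition, int]] = None,
    reset: str = "largest",
) -> Dict[TopicPartition, int]:
    """Pick the initial offset for each partition.

    If the consumer group has an offset for a partition (assumed to have
    been read from ZK by the caller), use it. Otherwise fall back to the
    earliest or latest offset of the topic, as the API does today.
    """
    fallback = earliest if reset == "smallest" else latest
    committed = group_offsets or {}
    start = {}
    for tp in partitions:
        offset = committed.get(tp)
        start[tp] = offset if offset is not None else fallback[tp]
    return start


# Example inputs (made-up numbers for illustration only).
parts = [("events", 0), ("events", 1)]
earliest = {("events", 0): 0, ("events", 1): 0}
latest = {("events", 0): 100, ("events", 1): 80}
group = {("events", 0): 42}  # the group has an offset for partition 0 only

with_group = resolve_start_offsets(parts, earliest, latest, group)
without_group = resolve_start_offsets(parts, earliest, latest)
```

The group offsets are consulted once, when the starting point is chosen, and
never again, which is why this would not affect exactly-once semantics.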
The use case where this is useful is a simplified code upgrade. If the user
wants to upgrade the code, he/she can stop the context gracefully, which will
ensure that the ZK consumer group offsets are updated with the last offsets
processed. Then the new code, when started (not restarted from checkpoint), can
pick up the consumer group offsets from ZK and continue where the previous code
left off.
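The upgrade hand-off described above can be simulated with a toy model. This is
only a sketch under stated assumptions: `OffsetStore` is a hypothetical
in-memory stand-in for the consumer group offsets in ZK,
`run_until_graceful_stop` is a hypothetical stand-in for a streaming job, and
a single partition is used to keep it short.

```python
PARTITION = 0  # single partition, to keep the sketch small


class OffsetStore:
    """Hypothetical in-memory stand-in for consumer group offsets in ZK."""

    def __init__(self):
        self._committed = {}

    def commit(self, group_id, offsets):
        self._committed.setdefault(group_id, {}).update(offsets)

    def fetch(self, group_id):
        return dict(self._committed.get(group_id, {}))


def run_until_graceful_stop(store, group_id, records, max_records, process):
    """Process up to max_records, then stop gracefully.

    The store is read once, at startup, to find where to begin. It is
    written once, at graceful stop, with the next offset to process.
    """
    start = store.fetch(group_id).get(PARTITION, 0)
    end = min(start + max_records, len(records))
    for offset in range(start, end):
        process(records[offset])
    store.commit(group_id, {PARTITION: end})
    return end


store = OffsetStore()
records = ["a", "b", "c", "d", "e", "f"]

seen_by_old_code = []
run_until_graceful_stop(store, "my-group", records, 4, seen_by_old_code.append)

# The upgraded code starts fresh (no checkpoint) with the same group id.
seen_by_new_code = []
run_until_graceful_stop(store, "my-group", records, 10, seen_by_new_code.append)
```

In this model the new code resumes at record "e", so no record is skipped or
processed twice across the upgrade, provided the old code stopped gracefully.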
Without the ability to pick up consumer group offsets at startup (that is,
currently), the only way to do this is for users to save the
offsets somewhere (file, database, etc.) and manage the offsets themselves. I
just want to simplify this process.
Either way, I am moving this discussion to this JIRA
https://issues.apache.org/jira/browse/SPARK-6249