Github user tdas commented on the pull request:

    https://github.com/apache/spark/pull/4805#issuecomment-78121376
  
    @koeninger I agree the purpose of this PR was just to update ZK so that 
tools work. I am thinking beyond that, about the bigger goal of making this new 
API a feature superset of the old API. Maybe this PR is not the right place for 
that discussion, and we can move it to a different location if you prefer. 
However, for the sake of answering your questions, let me give the answers here. 
    
    What I was proposing is the following. The simpler direct API (the one that 
does not take explicit offsets) can pick up the initial offsets from ZK if 
group.id is specified. This is very similar to how we find the latest or 
earliest offset in that API, except that instead of the latest/earliest offset 
of the topic we look up the offset stored for the consumer group. The group 
offsets in ZK are not used at all for any further processing or restarting, so 
the exactly-once semantics are not broken. 
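
    To make the proposal concrete, here is a rough sketch of the lookup I have 
in mind, written as plain Python rather than against the actual Spark code. 
Everything in it is hypothetical: the function name, the three fetch callbacks, 
and the choice to fall back per partition to latest/earliest when the group has 
no stored offset are all assumptions for illustration, not an existing API. 

```python
from typing import Callable, Dict, List, Tuple

TopicPartition = Tuple[str, int]
Offsets = Dict[TopicPartition, int]


def initial_offsets(
    kafka_params: Dict[str, str],
    partitions: List[TopicPartition],
    fetch_group_offsets: Callable[[str], Offsets],  # hypothetical ZK lookup
    fetch_earliest: Callable[[], Offsets],          # hypothetical
    fetch_latest: Callable[[], Offsets],            # hypothetical
) -> Offsets:
    """Pick the starting offset for each partition, once, at stream start."""
    # Existing behaviour: start from earliest or latest, depending on config.
    reset = kafka_params.get("auto.offset.reset", "largest")
    defaults = fetch_earliest() if reset == "smallest" else fetch_latest()

    # Proposed behaviour: if group.id is set, prefer the group's stored offsets.
    group_id = kafka_params.get("group.id")
    committed = fetch_group_offsets(group_id) if group_id else {}

    # Assumption: partitions with no stored group offset use the default.
    return {tp: committed.get(tp, defaults[tp]) for tp in partitions}


# Small in-memory stand-ins for Kafka and ZK, for illustration only.
parts = [("events", 0), ("events", 1)]
earliest = {("events", 0): 0, ("events", 1): 0}
latest = {("events", 0): 120, ("events", 1): 95}
zk = {"my-group": {("events", 0): 100}}

with_group = initial_offsets(
    {"group.id": "my-group"}, parts,
    lambda g: zk.get(g, {}), lambda: earliest, lambda: latest,
)
without_group = initial_offsets(
    {}, parts,
    lambda g: zk.get(g, {}), lambda: earliest, lambda: latest,
)
```

    The stored group offsets are read only in this one place, at startup; 
nothing later in the sketch depends on them. 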
    
    The use case where this is useful is a simplified code upgrade. If the user 
wants to upgrade the code, he/she can stop the context gracefully, which will 
ensure the ZK consumer group offsets are updated with the last offsets 
processed. Then the new code, when started (not restarted from checkpoint), can 
pick up the consumer group offsets from ZK and continue where the previous code 
had left off. 
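
    As a toy illustration of that upgrade flow (not real Spark or ZK code), 
the following sketch uses a made-up in-memory store in place of ZK and a 
made-up Job class in place of the streaming context. The class and method 
names are assumptions chosen only to show the ordering of the steps. 

```python
from typing import Dict, Tuple

TopicPartition = Tuple[str, int]
Offsets = Dict[TopicPartition, int]


class InMemoryOffsetStore:
    """Hypothetical stand-in for the consumer group offsets kept in ZK."""

    def __init__(self) -> None:
        self._offsets: Dict[str, Offsets] = {}

    def commit(self, group_id: str, offsets: Offsets) -> None:
        self._offsets[group_id] = dict(offsets)

    def fetch(self, group_id: str) -> Offsets:
        return dict(self._offsets.get(group_id, {}))


class Job:
    """Hypothetical stand-in for a running streaming job."""

    def __init__(self, group_id: str, store: InMemoryOffsetStore, start: Offsets) -> None:
        self.group_id = group_id
        self.store = store
        self.position = dict(start)  # the job tracks its own offsets while running

    def process_batch(self, until: Offsets) -> None:
        self.position.update(until)

    def stop_gracefully(self) -> None:
        # On graceful stop, record the last offsets processed for the group.
        self.store.commit(self.group_id, self.position)


store = InMemoryOffsetStore()

# Old code runs, processes two batches, then is stopped gracefully.
old_job = Job("my-group", store, {("events", 0): 0})
old_job.process_batch({("events", 0): 50})
old_job.process_batch({("events", 0): 80})
old_job.stop_gracefully()

# New code starts fresh (no checkpoint) and reads the group offsets once.
new_job = Job("my-group", store, store.fetch("my-group"))
```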
    
    Without the ability to pick up consumer group offsets at startup (that is, 
as things stand today), the only way to do this is for users to save the 
offsets somewhere (file, database, etc.) and manage them themselves. I just 
want to simplify this process. 
    
    Either way, I am moving this discussion to this JIRA:
    https://issues.apache.org/jira/browse/SPARK-6249



