Xu Mingmin created BEAM-3093:
--------------------------------

             Summary: add an option 'FirstPollOffsetStrategy' to KafkaIO
                 Key: BEAM-3093
                 URL: https://issues.apache.org/jira/browse/BEAM-3093
             Project: Beam
          Issue Type: Improvement
          Components: sdk-java-core
            Reporter: Xu Mingmin
            Assignee: Kenneth Knowles


This is a feature borrowed from Storm KafkaSpout.

*What's the issue?*
In KafkaIO, when offset is stored either in checkpoint or auto_committed, it 
cannot be changed in application, to force to read from earliest/latest. --This 
feature is important to reset the start offset when relaunching a job.

*Proposed solution:*
By borrowing the FirstPollOffsetStrategy concept, users can have more options:
1). *{{EARLIEST}}*: always start_from_beginning no matter of what's in 
checkpoint/auto_commit;
2). *{{LATEST}}*: always start_from_latest no matter of what's in 
checkpoint/auto_commit;
3). *{{UNCOMMITTED_EARLIEST}}*: if no offset in checkpoint/auto_commit then 
start_from_beginning if, otherwise start_from_previous_offset;
4). *{{UNCOMMITTED_LATEST}}*: if no offset in checkpoint/auto_commit then 
start_from_latest, otherwise start_from_previous_offset;

[~rangadi], any comments?



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

Reply via email to