Xu Mingmin created BEAM-3093: -------------------------------- Summary: add an option 'FirstPollOffsetStrategy' to KafkaIO Key: BEAM-3093 URL: https://issues.apache.org/jira/browse/BEAM-3093 Project: Beam Issue Type: Improvement Components: sdk-java-core Reporter: Xu Mingmin Assignee: Kenneth Knowles
This is a feature borrowed from Storm KafkaSpout. *What's the issue?* In KafkaIO, when offset is stored either in checkpoint or auto_committed, it cannot be changed in application, to force to read from earliest/latest. --This feature is important to reset the start offset when relaunching a job. *Proposed solution:* By borrowing the FirstPollOffsetStrategy concept, users can have more options: 1). *{{EARLIEST}}*: always start_from_beginning no matter of what's in checkpoint/auto_commit; 2). *{{LATEST}}*: always start_from_latest no matter of what's in checkpoint/auto_commit; 3). *{{UNCOMMITTED_EARLIEST}}*: if no offset in checkpoint/auto_commit then start_from_beginning if, otherwise start_from_previous_offset; 4). *{{UNCOMMITTED_LATEST}}*: if no offset in checkpoint/auto_commit then start_from_latest, otherwise start_from_previous_offset; [~rangadi], any comments? -- This message was sent by Atlassian JIRA (v6.4.14#64029)