tzulitai commented on PR #52: URL: https://github.com/apache/flink-connector-kafka/pull/52#issuecomment-1738590569
@Tan-JiaLiang there's something here that doesn't quite add up, and I'm trying to understand it. If I'm understanding correctly, the scenario being fixed is the following (assume the topic has 3 partitions, A / B / C):

- t=0: Some data is written to partitions A and B. C is empty.
- t=1: Start the job with the `latest-offsets` offsets initializer. `currentOffsets` for partitions A and B will be some meaningful value, while for partition C we have `-1`.
- t=2: Take a savepoint.
- t=3: Add more records to the topic. All of A / B / C get new records.
- t=4: Restore from the savepoint. KafkaSource only reads the new records from partitions A + B, while for C we don't see any records?

If that's the case, shouldn't partition C be started from the earliest offset at t=4 instead of from the latest offset, which seems to be what this PR's change does?
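To make the question concrete, here is a toy model of the t=0 to t=4 timeline in plain Java. It does not use Flink or Kafka classes. The class `RestoreOffsetSketch`, the `Fallback` enum, the helper methods, and the record counts (5 / 3 / 0 initially, 2 more per partition at t=3) are all hypothetical and chosen only for illustration. The single assumption carried over from the scenario is that an empty partition is checkpointed as `-1` and must be resolved to a concrete offset on restore.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class RestoreOffsetSketch {
    // Marker stored in the savepoint for a partition with no concrete offset (assumption).
    static final long UNSET = -1L;

    // Hypothetical strategies for resolving an UNSET offset at restore time.
    enum Fallback { EARLIEST, LATEST }

    // Pick the offset a partition starts reading from after restore.
    static long resolveStart(long checkpointed, long logEndAtRestore, Fallback fallback) {
        if (checkpointed != UNSET) {
            return checkpointed; // a concrete checkpointed offset always wins
        }
        return fallback == Fallback.EARLIEST ? 0L : logEndAtRestore;
    }

    // Number of already-written records the reader sees between start and log end.
    static long recordsVisible(long start, long logEnd) {
        return Math.max(0L, logEnd - start);
    }

    public static void main(String[] args) {
        // t=1 and t=2: offsets captured in the savepoint (A and B have data, C is empty).
        Map<String, Long> savepoint = new LinkedHashMap<>();
        savepoint.put("A", 5L);
        savepoint.put("B", 3L);
        savepoint.put("C", UNSET);

        // t=3: two more records are written to every partition.
        Map<String, Long> logEnd = new LinkedHashMap<>();
        logEnd.put("A", 7L);
        logEnd.put("B", 5L);
        logEnd.put("C", 2L);

        // t=4: restore and compare both fallback strategies.
        for (Fallback fallback : Fallback.values()) {
            for (String partition : savepoint.keySet()) {
                long start = resolveStart(savepoint.get(partition), logEnd.get(partition), fallback);
                long visible = recordsVisible(start, logEnd.get(partition));
                System.out.println(fallback + " " + partition + ": start=" + start + ", visible=" + visible);
            }
        }
    }
}
```

Under these assumed numbers, A and B each see their 2 new records with either strategy. Partition C sees 0 of the records written at t=3 when `-1` is resolved to the latest offset at restore time, and sees both when it is resolved to the earliest offset, which is the difference the question above is asking about.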
