liujinhui1994 commented on a change in pull request #2227:
URL: https://github.com/apache/hudi/pull/2227#discussion_r579630158
##########
File path:
hudi-utilities/src/main/java/org/apache/hudi/utilities/sources/helpers/KafkaOffsetGen.java
##########
@@ -227,12 +232,23 @@ public KafkaOffsetGen(TypedProperties props) {
   // else return earliest offsets
   private Map<TopicPartition, Long> checkupValidOffsets(KafkaConsumer consumer,
       Option<String> lastCheckpointStr, Set<TopicPartition> topicPartitions) {
-    Map<TopicPartition, Long> checkpointOffsets = CheckpointUtils.strToOffsets(lastCheckpointStr.get());
     Map<TopicPartition, Long> earliestOffsets = consumer.beginningOffsets(topicPartitions);
+    if (checkTopicCheckPoint(lastCheckpointStr)) {
+      Map<TopicPartition, Long> checkpointOffsets = CheckpointUtils.strToOffsets(lastCheckpointStr.get());
+      boolean checkpointOffsetReseter = checkpointOffsets.entrySet().stream()
+          .anyMatch(offset -> offset.getValue() < earliestOffsets.get(offset.getKey()));
+      return checkpointOffsetReseter ? earliestOffsets : checkpointOffsets;
+    }
+
+    switch (autoResetValue) {
+      case EARLIEST:
+        return earliestOffsets;
+      case LATEST:
+        return consumer.endOffsets(topicPartitions);
+      default:
Review comment:
I also don’t think it’s necessary to add a NONE case here.
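
For reference, the resolution order in this hunk can be sketched in isolation as below. This is only a rough illustration, not the code in the PR: `OffsetResolutionSketch`, `resolve`, and the `AutoReset` enum are hypothetical names, plain `String` keys stand in for `TopicPartition`, and a missing earliest offset is treated as 0 (an assumption; the hunk calls `earliestOffsets.get(...)` directly). It assumes only EARLIEST and LATEST are supported, with the default branch rejecting anything else.

```java
import java.util.Map;
import java.util.Optional;

public class OffsetResolutionSketch {

  // Hypothetical stand-in for the configured auto-reset value; not Hudi's actual enum.
  public enum AutoReset { EARLIEST, LATEST }

  // Picks the starting offsets: checkpoint first, then the auto-reset setting.
  public static Map<String, Long> resolve(Optional<Map<String, Long>> checkpoint,
                                          Map<String, Long> earliest,
                                          Map<String, Long> latest,
                                          AutoReset autoReset) {
    if (checkpoint.isPresent()) {
      Map<String, Long> offsets = checkpoint.get();
      // If any checkpointed offset is older than what the broker still retains,
      // fall back to the earliest offsets for all partitions.
      // Assumption: a partition with no known earliest offset is treated as 0.
      boolean stale = offsets.entrySet().stream()
          .anyMatch(e -> e.getValue() < earliest.getOrDefault(e.getKey(), 0L));
      return stale ? earliest : offsets;
    }
    // No usable checkpoint: follow the configured reset behaviour.
    switch (autoReset) {
      case EARLIEST:
        return earliest;
      case LATEST:
        return latest;
      default:
        throw new IllegalArgumentException("Unsupported auto reset value: " + autoReset);
    }
  }
}
```

With only two supported values, the default branch is reached only if a new enum constant is added later, which is why a separate NONE case looks unnecessary in this sketch.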
##########
File path:
hudi-utilities/src/main/java/org/apache/hudi/utilities/sources/helpers/KafkaOffsetGen.java
##########
@@ -257,6 +273,11 @@ public boolean checkTopicExists(KafkaConsumer consumer) {
     return result.containsKey(topicName);
   }
 
+  public boolean checkTopicCheckPoint(Option<String> lastCheckpointStr) {
Review comment:
Thanks for pointing that out.