wangxianghu commented on a change in pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#discussion_r556990030



##########
File path: hudi-utilities/src/main/java/org/apache/hudi/utilities/sources/helpers/KafkaOffsetGen.java
##########
@@ -165,6 +169,7 @@ public KafkaOffsetGen(TypedProperties props) {
     }
     DataSourceUtils.checkRequiredProperties(props, Collections.singletonList(Config.KAFKA_TOPIC_NAME));
     topicName = props.getString(Config.KAFKA_TOPIC_NAME);
+    kafkaCheckpointTimestamp = props.getString(Config.KAFKA_CHECKPOINT_TIMESTAMP);

Review comment:
       If `Config.KAFKA_CHECKPOINT_TIMESTAMP` is not set, an exception will be
thrown here. That is not expected when the user wants to resume from a
checkpoint by providing offsets instead of a timestamp.
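
To illustrate the point, here is a minimal sketch of treating the timestamp as an optional setting. It uses plain `java.util.Properties` rather than Hudi's `TypedProperties` so that it runs on its own; the class name, the method name, and the property key string are assumptions made for this example and are not taken from the PR.

```java
import java.util.Optional;
import java.util.Properties;

// Standalone sketch: read an optional checkpoint timestamp without failing
// when the key is absent. All names here are hypothetical.
final class OptionalCheckpointTimestamp {

    // Hypothetical key; the real value of Config.KAFKA_CHECKPOINT_TIMESTAMP
    // is not shown in the diff above.
    static final String KAFKA_CHECKPOINT_TIMESTAMP = "example.kafka.checkpoint.timestamp";

    // Returns an empty Optional when the key is missing or blank,
    // otherwise the configured timestamp parsed as a long.
    static Optional<Long> readCheckpointTimestamp(Properties props) {
        String raw = props.getProperty(KAFKA_CHECKPOINT_TIMESTAMP);
        if (raw == null || raw.trim().isEmpty()) {
            return Optional.empty();
        }
        return Optional.of(Long.parseLong(raw.trim()));
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        // Key not set: no exception, just an empty result.
        System.out.println(readCheckpointTimestamp(props).isPresent());

        props.setProperty(KAFKA_CHECKPOINT_TIMESTAMP, "1610000000000");
        // Key set: the parsed timestamp is available.
        System.out.println(readCheckpointTimestamp(props).get());
    }
}
```

With a lookup like this, a user who supplies only offsets never hits the missing-key path.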

##########
File path: 
hudi-utilities/src/main/java/org/apache/hudi/utilities/sources/helpers/KafkaOffsetGen.java
##########
@@ -182,6 +187,10 @@ public KafkaOffsetGen(TypedProperties props) {
               .map(x -> new TopicPartition(x.topic(), x.partition())).collect(Collectors.toSet());
 
       // Determine the offset ranges to read from
+      if (kafkaCheckpointTimestamp != null) {
+        lastCheckpointStr = Option.of(getOffsetsByTimestamp(consumer, partitionInfoList, topicName, Long.parseLong(kafkaCheckpointTimestamp)));
+      }
+
       if (lastCheckpointStr.isPresent() && !lastCheckpointStr.get().isEmpty()) {

Review comment:
       We cannot simply overwrite `lastCheckpointStr` here. If the user has
configured `Config.KAFKA_CHECKPOINT_TIMESTAMP`, Hudi will start consuming from
that timestamp on every run and never move on, right?
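
One possible way to address this, sketched below, is to let an existing checkpoint take precedence and fall back to the timestamp-derived offsets only when no checkpoint has been committed yet. This is a standalone illustration using `java.util.Optional` instead of Hudi's `Option`; the class name, the method name, and the checkpoint strings are placeholders invented for the example, not code from the PR.

```java
import java.util.Optional;

// Standalone sketch of a precedence rule between a committed checkpoint
// and offsets derived from a configured timestamp. Names are hypothetical.
final class CheckpointPrecedence {

    // Prefer a non-empty committed checkpoint; otherwise use the offsets
    // resolved from the timestamp (which may itself be absent).
    static Optional<String> resolveStartingCheckpoint(
            Optional<String> lastCheckpointStr,
            Optional<String> timestampDerivedOffsets) {
        if (lastCheckpointStr.isPresent() && !lastCheckpointStr.get().isEmpty()) {
            return lastCheckpointStr;
        }
        return timestampDerivedOffsets;
    }

    public static void main(String[] args) {
        // Placeholder strings standing in for serialized offsets.
        Optional<String> fromTimestamp = Optional.of("offsets-from-timestamp");

        // First run: nothing committed yet, so the timestamp decides.
        System.out.println(resolveStartingCheckpoint(Optional.empty(), fromTimestamp).get());

        // Later runs: the committed checkpoint wins, so consumption advances.
        Optional<String> committed = Optional.of("offsets-from-last-commit");
        System.out.println(resolveStartingCheckpoint(committed, fromTimestamp).get());
    }
}
```

Under this rule the timestamp only affects the first run, so later runs continue from wherever the previous commit left off.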




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

