codope commented on code in PR #12392:
URL: https://github.com/apache/hudi/pull/12392#discussion_r1865345488


##########
hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/spark/sql/hudi/streaming/HoodieStreamSourceV1.scala:
##########
@@ -150,8 +148,11 @@ class HoodieStreamSourceV1(sqlContext: SQLContext,
       .getOrElse(initialOffsets)
     var endOffset = HoodieSourceOffset(end)
 
-    startOffset = HoodieSourceOffset(getCorrectV1CommitTime(startOffset.completionTime))
-    endOffset = HoodieSourceOffset(getCorrectV1CommitTime(endOffset.completionTime))
+    // We update the offsets here since until this point the latest offsets have been
+    // calculated no matter if it is in the expected version.
+    // We translate them here, then the rest logic should be intact.
+    startOffset = HoodieSourceOffset(getV1CommitTime(startOffset.completionTime))

Review Comment:
   Let's rename the attribute in `HoodieSourceOffset` from `completionTime` to 
`commitTimeOffset` for better readability. In line 130 the timestamp is actually 
the requestedTime, but because of the attribute name you still have to write 
`startOffset.completionTime` here. So the logic is correct, but the naming is 
misleading.
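
A rough sketch of what the rename could look like at this call site. `SourceOffsetSketch` is a hypothetical, simplified stand-in for `HoodieSourceOffset` (the real class has more to it than one field), and the body of `getV1CommitTime` below is only a placeholder, since the actual translation logic is not part of this hunk:

```scala
// Hypothetical, simplified stand-in for HoodieSourceOffset. The field is named
// for what it is used as (an offset into the commit timeline), regardless of
// whether the stored value is a requested time or a completion time.
final case class SourceOffsetSketch(commitTimeOffset: String)

object OffsetRenameSketch {
  // Placeholder only: the real getV1CommitTime translates the instant time to
  // the expected version; that logic is assumed here, not reproduced.
  def getV1CommitTime(instantTime: String): String = instantTime

  // The call site reads the same way whichever kind of timestamp the offset holds.
  def translate(offset: SourceOffsetSketch): SourceOffsetSketch =
    SourceOffsetSketch(getV1CommitTime(offset.commitTimeOffset))
}
```

With a neutral name, the call site no longer claims the value is a completion time when it may be a requested time.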



