voonhous commented on code in PR #7997:
URL: https://github.com/apache/hudi/pull/7997#discussion_r1118217290


##########
hudi-common/src/main/java/org/apache/hudi/common/model/HoodieFileGroup.java:
##########
@@ -122,7 +129,24 @@ public HoodieFileGroupId getFileGroupId() {
    * some log files, that are based off a commit or delta commit.
    */
   private boolean isFileSliceCommitted(FileSlice slice) {
-    if (!compareTimestamps(slice.getBaseInstantTime(), LESSER_THAN_OR_EQUALS, 
lastInstant.get().getTimestamp())) {
+    if (compareTimestamps(slice.getBaseInstantTime(), GREATER_THAN, 
lastInstant.get().getTimestamp())) {
+      return false;
+    }
+
+    if (!slice.getBaseFile().isPresent() && 
timeline.isBeforeTimelineStarts(slice.getBaseInstantTime())) {

Review Comment:
   IIUC, this issue is only specific to Flink writers as the the write and 
rollback logic might execute concurrently. 
   
   While this will fix the issue, i agree that the cleaner way is for MOR to 
delete the log files so that the responsibility of writers and rollback 
executors are clearly defined. i.e. Writers should not know or handle the 
quirks of rollback executors. The current way of fixing things does introduce 
code smell.
   
   To be honest, I am getting a lot of resistance and pushback with this fix, 
so I am going to take small steps. I have already spent 8 hours writing a test 
case and there doesn't seem to be an agreement in which direction everybody 
wants to take. 
   
   @danny0405 @bvaradar LMK which direction the both of you would like to take 
to fix this issue. 
   
   Thank you.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to