kbuci commented on PR #11580:
URL: https://github.com/apache/hudi/pull/11580#issuecomment-2380097399

   > I was chasing some test failures in this patch and realized that Flink might have an issue. In [this](https://github.com/apache/hudi/blob/ed65de1460468ad33a374a66606c0baae6cc129b/hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/util/CompactionUtil.java#L78) block of code, we generate a compaction instant time in the past. So the additional validation in this patch may not sit well with Flink.
   
   @nsivabalan My understanding of MOR compaction on the latest 0.x is likely out of date, so apologies if this comment doesn't make sense, but I assumed that (in 0.x) once a compaction plan with instant time T targeting a file group is created, any write (deltacommit) with an instant time greater than T will create a new log file with an instant time of T (assuming appends are disabled). If that is the case, then consider a MOR dataset with timeline [C0.deltacommit, C2.deltacommit.inflight] where a compaction plan is then scheduled with an earlier timestamp, yielding [C0.deltacommit, C1.compaction.requested, C2.deltacommit.requested]. There might be no issue on the base table as long as C2 fails itself during write-conflict resolution. But if this MOR dataset has a metadata table, we might find ourselves in the same case we discussed offline (the first scenario in https://issues.apache.org/jira/browse/HUDI-7507): specifically, if the writer that worked on C2 (or a greater instant) scheduled a compaction on the metadata table.
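   To make the ordering concern above concrete, here is a minimal sketch of the scenario. This is not Hudi's actual timeline API; the `Instant` record, `canScheduleCompaction` check, and instant-time strings are all hypothetical, standing in for a strict validation that requires a new compaction instant to be later than every pending deltacommit:
   
   ```java
   import java.util.ArrayList;
   import java.util.List;
   
   // Hypothetical model of a Hudi-like timeline; not the real Hudi API.
   public class TimelineSketch {
       record Instant(String time, String action, String state) {}
   
       static final List<Instant> timeline = new ArrayList<>();
   
       // A strict check (in the spirit of the validation this patch adds):
       // a new compaction instant must be later than every pending deltacommit.
       static boolean canScheduleCompaction(String compactionTime) {
           for (Instant i : timeline) {
               if (i.action().equals("deltacommit")
                       && !i.state().equals("completed")
                       && i.time().compareTo(compactionTime) > 0) {
                   return false; // a pending write with a greater instant time exists
               }
           }
           return true;
       }
   
       public static void main(String[] args) {
           timeline.add(new Instant("C0", "deltacommit", "completed"));
           timeline.add(new Instant("C2", "deltacommit", "inflight"));
   
           // Flink's scheduling may pick an instant time in the past ("C1"),
           // which a strict check rejects, even though the base table could
           // stay consistent if C2 later fails write-conflict resolution.
           System.out.println(canScheduleCompaction("C1")); // false
           System.out.println(canScheduleCompaction("C3")); // true
       }
   }
   ```
   
   The sketch compares instant times lexicographically, which works here because the illustrative timestamps share the same length and prefix; the real question is whether that rejection is correct or overly strict for the base table when C2 is guaranteed to fail conflict resolution.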
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
