kbuci commented on PR #11580: URL: https://github.com/apache/hudi/pull/11580#issuecomment-2380097399
> I was chasing some test failures in this patch and realized that Flink might have an issue. In [this](https://github.com/apache/hudi/blob/ed65de1460468ad33a374a66606c0baae6cc129b/hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/util/CompactionUtil.java#L78) code block, we generate a compaction time in the past. So the additional validation in this patch may not sit well with Flink.

@nsivabalan My understanding of MOR compaction on the latest 0.x is likely out of date, so apologies if this comment does not make sense, but I assumed that (in 0.x) once a compaction plan with instant time T targeting a file group is created, any write (deltacommit) with an instant time greater than T will create a new log file with an instant time of T (assuming appends are disabled).

If that is the case, consider a MOR dataset with [C0.deltacommit, C2.deltacommit.inflight] where a compaction plan is then scheduled with an earlier timestamp, yielding [C0.deltacommit, C1.compaction.requested, C2.deltacommit.inflight]. There might be no issue on the base table as long as C2 fails during write-conflict resolution. But if this MOR dataset has a metadata table, we might find ourselves in the same case we discussed offline (the first scenario in https://issues.apache.org/jira/browse/HUDI-7507), specifically if the writer that worked on C2 (or a greater instant) scheduled a compaction on the metadata table.
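To make the ordering concern above concrete, here is a minimal sketch (not Hudi's actual API; the function name and timeline representation are hypothetical) of the invariant being discussed: a compaction instant scheduled "in the past" is unsafe whenever a pending deltacommit already holds a greater instant time, since that writer may not route its log blocks under the new plan. Instant times are compared lexicographically, as Hudi does.

```python
def is_compaction_instant_safe(compaction_time, timeline):
    """Hypothetical check: a compaction scheduled at `compaction_time` is
    only safe if no pending (requested/inflight) deltacommit has a greater
    instant time. `timeline` maps instant time -> (action, state)."""
    pending = [t for t, (action, state) in timeline.items()
               if action == "deltacommit" and state != "completed"]
    # Lexicographic string comparison mirrors Hudi instant-time ordering.
    return all(t <= compaction_time for t in pending)

# Timeline from the scenario above: C0 completed, C2 inflight, and a
# compaction scheduled with the earlier timestamp C1.
timeline = {
    "C0": ("deltacommit", "completed"),
    "C2": ("deltacommit", "inflight"),
}
print(is_compaction_instant_safe("C1", timeline))  # False: pending C2 > C1
print(is_compaction_instant_safe("C3", timeline))  # True: no pending instant exceeds C3
```

Under this framing, scheduling C1 after C2 is already inflight violates the invariant, which is exactly the situation the Flink code path can produce.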
