sivabalan narayanan created HUDI-2712:
-----------------------------------------
Summary: rollback of a commit which has new partitions fails with
metadata table
Key: HUDI-2712
URL: https://issues.apache.org/jira/browse/HUDI-2712
Project: Apache Hudi
Issue Type: Bug
Affects Versions: 0.10.0
Reporter: sivabalan narayanan
When a commit is being rolled back and that commit added new partitions that
were not present in the table before, the files in those new partitions may not
be included in the rollback plan, so they end up dangling without being
cleaned up.
Example:
commit1: p1 (5 files), p2 (5 files)
commit2: p1 (3 files), p2 (3 files), p3 (2 files); the write partially failed.
When commit3 is triggered, it rolls back commit2.
When generating the rollback plan, we first fetch all partitions from
TableFileSystemView, which reads from the metadata table when it is enabled.
This may return only p1 and p2 and not p3 (since commit2 did not complete).
We then do fs.list on those partitions and keep the files that match commit2.
So, in this case, the files added to p3 may never be rolled back.
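
The scenario above can be reproduced with a small toy model. This is only a
sketch of the reasoning, not Hudi code: the storage layout, file names and all
function names below are made up for illustration, and listing partitions
directly from storage is shown only as a contrast, not as the actual fix.

```python
# Toy model of the rollback scenario; none of these names are Hudi APIs.
from typing import Dict, List, Set, Tuple

File = Tuple[str, str]  # (file_path, commit_that_wrote_it)


def files(partition: str, commit: str, count: int) -> List[File]:
    """Build `count` hypothetical files written to `partition` by `commit`."""
    return [(f"{partition}/{commit}_file{i}", commit) for i in range(count)]


# commit1 wrote p1 and p2; commit2 wrote p1, p2 and the new partition p3.
storage: Dict[str, List[File]] = {
    "p1": files("p1", "commit1", 5) + files("p1", "commit2", 3),
    "p2": files("p2", "commit1", 5) + files("p2", "commit2", 3),
    "p3": files("p3", "commit2", 2),
}


def partitions_from_metadata(store: Dict[str, List[File]],
                             completed: Set[str]) -> List[str]:
    """Assumed metadata-table behavior: only partitions touched by a
    completed commit are visible."""
    return sorted(p for p, fs in store.items()
                  if any(commit in completed for _, commit in fs))


def partitions_from_storage(store: Dict[str, List[File]]) -> List[str]:
    """Direct listing: every partition that physically exists."""
    return sorted(store)


def rollback_plan(store: Dict[str, List[File]], partitions: List[str],
                  commit: str) -> List[str]:
    """List each given partition and keep the files written by `commit`."""
    return [path for p in partitions
            for path, c in store[p] if c == commit]


def dangling(store: Dict[str, List[File]], plan: List[str],
             commit: str) -> List[str]:
    """Files written by `commit` that the plan does not cover."""
    planned = set(plan)
    return sorted(path for fs in store.values()
                  for path, c in fs if c == commit and path not in planned)


completed_commits = {"commit1"}  # commit2 failed part way through

plan_md = rollback_plan(
    storage, partitions_from_metadata(storage, completed_commits), "commit2")
plan_fs = rollback_plan(storage, partitions_from_storage(storage), "commit2")

print(len(plan_md), dangling(storage, plan_md, "commit2"))
print(len(plan_fs), dangling(storage, plan_fs, "commit2"))
```

With the metadata-based partition list, the plan covers 6 files and the 2 files
in p3 are left dangling; with the direct listing, the plan covers all 8 files
written by commit2.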