Lokesh Jain created HUDI-8208:
---------------------------------
Summary: Fix partition stats with compaction or clustering
Key: HUDI-8208
URL: https://issues.apache.org/jira/browse/HUDI-8208
Project: Apache Hudi
Issue Type: Bug
Components: metadata
Reporter: Lokesh Jain
Assignee: Lokesh Jain
Fix For: 1.0.0
Consider a partition with 10 file slices. If compaction is triggered for one
file slice and produces fs1_1, the partition stats are updated for that file
slice under the same key (the partition path). The older partition stat record
for that partition path still accounts for the other 9 file slices
(fs2_0 - fs10_0) plus the older version of the compacted slice (fs1_0). The
final value on read is therefore a merge of all versions of the file slices
(fs2_0 - fs10_0, fs1_0, fs1_1), whereas it should account only for the latest
version of fs1 (fs1_1) alongside fs2_0 - fs10_0.
Upon compaction or clustering, the partition stats for the affected partition
should be recomputed, and the older records for that partition should be
invalidated.
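The scenario can be illustrated with a small toy model. This is only a sketch
under assumptions: the Stat class, the merge and aggregate helpers, and all of
the numbers are hypothetical and are not Hudi's actual classes or merge logic.
It assumes a stat record carries a min, a max and a value count, and that
records sharing a key are merged on read.

```python
from dataclasses import dataclass
from functools import reduce


@dataclass(frozen=True)
class Stat:
    """Toy column stat (hypothetical, not a Hudi class)."""
    min_value: int
    max_value: int
    value_count: int


def merge(a: Stat, b: Stat) -> Stat:
    """Merge two stat records that share a key (the partition path)."""
    return Stat(
        min(a.min_value, b.min_value),
        max(a.max_value, b.max_value),
        a.value_count + b.value_count,
    )


def aggregate(stats) -> Stat:
    """Fold the stats of several file slices into one partition stat."""
    return reduce(merge, stats)


# Ten file slices at version 0; fs1_0 holds the widest value range.
slices = {f"fs{i}_0": Stat(10, 50, 100) for i in range(2, 11)}
slices["fs1_0"] = Stat(0, 100, 100)
old_partition_stat = aggregate(slices.values())

# Compaction rewrites fs1 into fs1_1 (made-up values for illustration).
fs1_1 = Stat(20, 40, 100)

# Behaviour described in the issue: the new record is merged with the old
# partition record, so fs1_0 is still reflected in the result.
stale = merge(old_partition_stat, fs1_1)

# Proposed behaviour: drop the old record and recompute the partition stat
# from the latest version of every file slice.
latest = {name: s for name, s in slices.items() if name != "fs1_0"}
latest["fs1_1"] = fs1_1
recomputed = aggregate(latest.values())

print("merged with stale record:", stale)
print("recomputed from latest:  ", recomputed)
```

In this sketch the merged value keeps the range of fs1_0 and counts fs1 twice,
while the recomputed value reflects only fs1_1 and fs2_0 - fs10_0.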