Prashant Wason created HUDI-1717:
------------------------------------

             Summary: Metadata Table reader does not show correct view of the metadata
                 Key: HUDI-1717
                 URL: https://issues.apache.org/jira/browse/HUDI-1717
             Project: Apache Hudi
          Issue Type: Bug
            Reporter: Prashant Wason

Dataset timeline:  C1 C2 C3 Compaction.inflight C4 C5
Metadata timeline: DC1 DC2 DC3 (DC = deltaCommit)

Assume the dataset timeline has some completed commits (C1, C2, ..., C5) and an async compaction operation in progress. Also assume that the metadata table is synced only up to C3.

The MetadataTableWriter will not sync any more instants to the Metadata Table, because the next instant on the dataset timeline (Compaction.inflight) is incomplete.

The same sync logic is also used by the MetadataReader to perform its in-memory merge of the timeline. Hence the reader also ignores C4 and C5, and so provides an incorrect, stale view of the FileSlices and FileGroups.

Any future ingestion into this table MAY insert data into older versions of the FileSlices, which will show up as data loss when the table is queried.
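
The scenario above can be reproduced with a small standalone model. This is only a sketch of the behaviour described in this issue, not Hudi code: the `Instant` class and the `instants_to_sync` function are hypothetical names, and the "stop at the first incomplete instant" rule is taken from the description above rather than from the actual implementation.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Instant:
    """Hypothetical stand-in for a timeline instant (not Hudi's HoodieInstant)."""
    name: str
    completed: bool


def instants_to_sync(timeline: List[Instant], last_synced: Optional[str]) -> List[Instant]:
    """Return the instants after `last_synced`, stopping at the first incomplete one.

    Assumption: this mirrors the sync rule described in the issue, where an
    incomplete instant blocks every instant that follows it.
    """
    names = [i.name for i in timeline]
    start = names.index(last_synced) + 1 if last_synced is not None else 0
    to_sync = []
    for instant in timeline[start:]:
        if not instant.completed:
            break  # the inflight instant blocks everything after it
        to_sync.append(instant)
    return to_sync


# Dataset timeline from the issue: C1 C2 C3 Compaction.inflight C4 C5
dataset_timeline = [
    Instant("C1", True), Instant("C2", True), Instant("C3", True),
    Instant("Compaction.inflight", False),
    Instant("C4", True), Instant("C5", True),
]

# The metadata table is synced up to C3 (DC1 DC2 DC3).
pending = instants_to_sync(dataset_timeline, last_synced="C3")

# What a reader sharing this rule would see after its in-memory merge.
visible = ["C1", "C2", "C3"] + [i.name for i in pending]
completed = [i.name for i in dataset_timeline if i.completed]
missing = [name for name in completed if name not in visible]

print(visible)  # ['C1', 'C2', 'C3']
print(missing)  # ['C4', 'C5']
```

In this model the reader's view stops at C3 even though C4 and C5 are complete on the dataset timeline, which is the stale view reported here. Blocking on the inflight compaction may be a reasonable rule for the writer, but the same rule applied on the read path hides commits that have already completed.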