Prashant Wason created HUDI-1717:
------------------------------------

             Summary: Metadata Table reader does not show correct view of the metadata
                 Key: HUDI-1717
                 URL: https://issues.apache.org/jira/browse/HUDI-1717
             Project: Apache Hudi
          Issue Type: Bug
            Reporter: Prashant Wason


Dataset timeline: C1 C2 C3 Compaction.inflight C4 C5

Metadata timeline: DC1 DC2 DC3. (DC=deltaCommit)

Assume the dataset timeline has some completed commits (C1, C2 ... C5) and an 
async compaction operation in progress. Also assume that the metadata table is 
synced only up to C3.

The MetadataTableWriter will not sync any more instants to the Metadata Table 
because the next instant on the dataset timeline (Compaction.inflight) is 
incomplete.

The same sync logic is also used by the MetadataReader to perform the in-memory 
merge of the timeline. Hence, the reader will also not consider C4 and C5, 
thereby providing an incorrect and older view of the FileSlices and FileGroups.
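
The behaviour described above can be illustrated with a minimal sketch. This is 
not Hudi code: the Instant type, the instants_to_sync function and the timeline 
representation are hypothetical names made up for illustration, and the only 
rule modelled is "stop syncing at the first incomplete instant".

```python
from typing import List, NamedTuple


class Instant(NamedTuple):
    """Hypothetical stand-in for a timeline instant (not a Hudi class)."""
    name: str        # e.g. "C3" or "Compaction"
    completed: bool  # False for an inflight (incomplete) instant


def instants_to_sync(timeline: List[Instant], last_synced: str) -> List[str]:
    """Return completed instants after last_synced, stopping at the
    first incomplete instant (the rule assumed in this sketch)."""
    names = [instant.name for instant in timeline]
    start = names.index(last_synced) + 1
    to_sync = []
    for instant in timeline[start:]:
        if not instant.completed:
            break  # an incomplete instant blocks everything after it
        to_sync.append(instant.name)
    return to_sync


# Dataset timeline from the example: C1 C2 C3 Compaction.inflight C4 C5
dataset_timeline = [
    Instant("C1", True), Instant("C2", True), Instant("C3", True),
    Instant("Compaction", False),
    Instant("C4", True), Instant("C5", True),
]

# The metadata table is synced up to C3 (DC1 DC2 DC3).
synced = ["C1", "C2", "C3"]

# The reader merges whatever the sync logic returns on top of the synced state.
reader_view = synced + instants_to_sync(dataset_timeline, last_synced="C3")

# What the reader should see: every completed commit on the dataset timeline.
actual_completed = [i.name for i in dataset_timeline if i.completed]

print(reader_view)
print(actual_completed)
```

In this sketch reader_view stops at C3 while actual_completed also contains C4 
and C5, which is the stale view the issue describes.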

Any future ingestion into this table MAY insert data into older versions of the 
FileSlices, which will result in data loss when queried.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
