Ryan Pifer created HUDI-1508:
--------------------------------

             Summary: Partition update with global index in MOR tables 
resulting in duplicate values during read optimized queries
                 Key: HUDI-1508
                 URL: https://issues.apache.org/jira/browse/HUDI-1508
             Project: Apache Hudi
          Issue Type: Bug
            Reporter: Ryan Pifer


Hudi handles a partition path update by locating the existing record, deleting 
it from the previous partition, and inserting it into the new partition. In 
Merge-on-Read tables the delete operation, like any update operation, is 
written to a log file. The insert into the new partition, however, is written 
to a parquet file. A query using `QUERY_TYPE_READ_OPTIMIZED_OPT_VAL` fetches 
only parquet files, so it never sees the logged delete and returns 2 records 
for the same primary key: the stale record in the previous partition and the 
new record in the new partition.
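
The behaviour described above can be illustrated with a small toy model. This 
is only a sketch: `ToyMorTable` and its methods are hypothetical names, not 
Hudi APIs. It assumes that inserts go to base (parquet) files, that updates 
and deletes go to log files, and that a global index maps each key to its 
current partition.

```python
from collections import defaultdict


class ToyMorTable:
    """Hypothetical in-memory stand-in for a Merge-on-Read table (not Hudi code)."""

    def __init__(self):
        self.base = defaultdict(dict)   # partition -> {key: value}, the "parquet" files
        self.logs = defaultdict(list)   # partition -> [(op, key, value)], the log files
        self.global_index = {}          # key -> partition currently holding the key

    def upsert(self, key, partition, value):
        old = self.global_index.get(key)
        if old is None:
            # Brand-new key: the insert lands in a base file.
            self.base[partition][key] = value
        elif old == partition:
            # Same partition: the update is appended to a log file.
            self.logs[partition].append(("update", key, value))
        else:
            # Partition changed: the delete is logged in the old partition,
            # while the insert lands in a base file of the new partition.
            self.logs[old].append(("delete", key, None))
            self.base[partition][key] = value
        self.global_index[key] = partition

    def read_optimized(self):
        # Reads base files only and ignores every log entry.
        return [(p, k, v)
                for p, recs in sorted(self.base.items())
                for k, v in sorted(recs.items())]

    def snapshot(self):
        # Merges each partition's log entries onto its base records.
        rows = []
        for p in sorted(set(self.base) | set(self.logs)):
            merged = dict(self.base.get(p, {}))
            for op, k, v in self.logs.get(p, []):
                if op == "delete":
                    merged.pop(k, None)
                else:
                    merged[k] = v
            rows.extend((p, k, v) for k, v in sorted(merged.items()))
        return rows


table = ToyMorTable()
table.upsert("id-1", "2020/01/01", "v1")   # initial insert
table.upsert("id-1", "2020/01/02", "v2")   # same key, new partition path

print(table.read_optimized())  # two rows for id-1, one per partition
print(table.snapshot())        # one row for id-1, in the new partition
```

In this model the read optimized view returns the key twice because the 
delete exists only in the old partition's log, while the merged (snapshot) 
view returns it once.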



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
