Ryan Pifer created HUDI-1508:
--------------------------------
Summary: Partition update with global index in MOR tables
resulting in duplicate values during read optimized queries
Key: HUDI-1508
URL: https://issues.apache.org/jira/browse/HUDI-1508
Project: Apache Hudi
Issue Type: Bug
Reporter: Ryan Pifer
Hudi handles a partition path update by locating the existing record, deleting
it from the previous partition, and inserting it into the new partition. In
Merge-on-Read tables the delete, like any update, is written to a log file,
whereas the insert into the new partition is written to a parquet file. A query
using `QUERY_TYPE_READ_OPTIMIZED_OPT_VAL` reads only parquet files, so it never
sees the delete in the log file and returns 2 records for the same primary key:
the stale record in the previous partition and the new record in the new
partition.
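
To make the sequence concrete, here is a minimal toy model in plain Python. It
does not use any Hudi API: the `MorTable` class, its methods, and the sample
key and partition values are all hypothetical, and the dictionaries only stand
in for parquet base files and log files. It is a sketch of the behaviour
described above, assuming a global index that maps each key to its current
partition.

```python
from collections import defaultdict


class MorTable:
    """Toy stand-in for a Merge-on-Read table (hypothetical, not Hudi code)."""

    def __init__(self):
        self.base = defaultdict(dict)  # partition -> {key: value}, plays the role of parquet files
        self.log = defaultdict(list)   # partition -> [(op, key, value)], plays the role of log files
        self.index = {}                # global index: key -> current partition

    def upsert(self, key, partition, value):
        old = self.index.get(key)
        if old is None:
            # First write of this key: a plain insert into a base file.
            self.base[partition][key] = value
        elif old == partition:
            # Same partition: the update is appended to a log file.
            self.log[partition].append(("update", key, value))
        else:
            # Partition changed: delete goes to the old partition's log file,
            # insert goes to a base file in the new partition.
            self.log[old].append(("delete", key, None))
            self.base[partition][key] = value
        self.index[key] = partition

    def read_optimized(self):
        # Reads base files only and ignores log files.
        return [(p, k, v)
                for p, recs in sorted(self.base.items())
                for k, v in sorted(recs.items())]

    def snapshot(self):
        # Merges each partition's log entries onto its base records.
        rows = []
        for p in sorted(set(self.base) | set(self.log)):
            merged = dict(self.base.get(p, {}))
            for op, k, v in self.log.get(p, []):
                if op == "delete":
                    merged.pop(k, None)
                else:
                    merged[k] = v
            rows.extend((p, k, v) for k, v in sorted(merged.items()))
        return rows


table = MorTable()
table.upsert("id-1", "2020/12/01", "v1")  # initial insert
table.upsert("id-1", "2020/12/02", "v2")  # same key, new partition path

ro_rows = table.read_optimized()
snap_rows = table.snapshot()
print(ro_rows)    # two rows for "id-1", one per partition
print(snap_rows)  # one row for "id-1", in the new partition only
```

In this model the read optimized view returns the key twice because the delete
exists only in the old partition's log, while the merged view applies that
delete and returns the key once.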
--
This message was sent by Atlassian Jira
(v8.3.4#803005)