sivabalan narayanan created HUDI-6758:
-----------------------------------------
Summary: Avoid duplicated log blocks on the LogRecordReader
Key: HUDI-6758
URL: https://issues.apache.org/jira/browse/HUDI-6758
Project: Apache Hudi
Issue Type: Bug
Components: reader-core
Reporter: sivabalan narayanan
Due to Spark task retries, duplicate log blocks can be added during a write.
Since we do not delete anything during marker-based reconciliation on the
writer side, the reader can see these duplicated log blocks. For most
payload implementations this is not an issue. For the expression payload,
however, it can cause data inconsistency, because an expression could be
evaluated twice (e.g., colA*2).
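
To make the failure mode concrete, here is a minimal, self-contained sketch.
It is not Hudi code: the Block class, its fields, and the
(instantTime, blockSeqNo) key are hypothetical, and it assumes a retried
task rewrites a block with the same key as the original attempt. It shows
how a duplicated block doubles the effect of an expression like colA*2,
and how a reader could skip blocks it has already seen.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class LogBlockDedupSketch {

    /** Hypothetical stand-in for a log block; not Hudi's actual log block class. */
    static final class Block {
        final String instantTime; // commit instant that produced the block
        final int blockSeqNo;     // assumed: position of the block within that write
        final int multiplier;     // models an expression payload such as colA * 2

        Block(String instantTime, int blockSeqNo, int multiplier) {
            this.instantTime = instantTime;
            this.blockSeqNo = blockSeqNo;
            this.multiplier = multiplier;
        }

        /** Assumed identity of a block; stable across retries in this sketch. */
        String key() {
            return instantTime + "#" + blockSeqNo;
        }
    }

    /** Applies every block's expression in order, as a reader with no dedup would. */
    static int applyAll(int colA, List<Block> blocks) {
        int value = colA;
        for (Block b : blocks) {
            value *= b.multiplier;
        }
        return value;
    }

    /** Keeps only the first block seen for each (instantTime, blockSeqNo) key. */
    static List<Block> dedupe(List<Block> blocks) {
        Set<String> seen = new HashSet<>();
        List<Block> result = new ArrayList<>();
        for (Block b : blocks) {
            if (seen.add(b.key())) { // add() returns false for a repeated key
                result.add(b);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Block> blocks = Arrays.asList(
            new Block("001", 0, 2),
            new Block("001", 0, 2)); // same block written again by a retried task
        System.out.println(applyAll(5, blocks));         // prints 20 (expression applied twice)
        System.out.println(applyAll(5, dedupe(blocks))); // prints 10 (expression applied once)
    }
}
```

Whether the real reader can rely on such a key depends on what the writer
records in each block; the sketch only illustrates why idempotent payloads
tolerate duplicates while an expression payload does not.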