sivabalan narayanan created HUDI-6758:
-----------------------------------------

             Summary: Avoid duplicated log blocks on the LogRecordReader
                 Key: HUDI-6758
                 URL: https://issues.apache.org/jira/browse/HUDI-6758
             Project: Apache Hudi
          Issue Type: Bug
          Components: reader-core
            Reporter: sivabalan narayanan


Due to Spark retries, duplicated log blocks can be added during a write. 
Because we don't delete anything during marker-based reconciliation on the 
writer side, the reader can see these duplicated log blocks. For most 
payload implementations, this should not be an issue. But for the expression 
payload, it can result in data inconsistency, since an expression could be 
evaluated twice (e.g., colA*2 applied two times to the same record).

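To make the failure mode concrete, here is a minimal, self-contained sketch 
of the double evaluation and of one possible reader-side guard. It is not 
Hudi's actual LogRecordReader code: the LogBlock class, the blockId field, 
and the replay/dropDuplicates helpers are hypothetical names invented for 
illustration, and the sketch assumes a retried block can be recognized by 
some stable identifier, which the issue does not specify.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.LongUnaryOperator;

public class DuplicateLogBlockSketch {

    /** Hypothetical stand-in for a log block: an id plus the expression it carries. */
    static final class LogBlock {
        final String blockId;               // assumed stable across a retried write
        final LongUnaryOperator expression; // e.g. colA -> colA * 2

        LogBlock(String blockId, LongUnaryOperator expression) {
            this.blockId = blockId;
            this.expression = expression;
        }
    }

    /** Applies every block's expression in order, the way a naive reader would. */
    static long replay(long colA, List<LogBlock> blocks) {
        long value = colA;
        for (LogBlock block : blocks) {
            value = block.expression.applyAsLong(value);
        }
        return value;
    }

    /** Keeps only the first block seen for each (hypothetical) block id. */
    static List<LogBlock> dropDuplicates(List<LogBlock> blocks) {
        Set<String> seen = new HashSet<>();
        List<LogBlock> unique = new ArrayList<>();
        for (LogBlock block : blocks) {
            if (seen.add(block.blockId)) {
                unique.add(block);
            }
        }
        return unique;
    }

    public static void main(String[] args) {
        LongUnaryOperator doubleColA = v -> v * 2;
        // The same logical block written twice because of a task retry.
        List<LogBlock> scanned = Arrays.asList(
                new LogBlock("block-1", doubleColA),
                new LogBlock("block-1", doubleColA));

        System.out.println(replay(10, scanned));                 // 40: expression applied twice
        System.out.println(replay(10, dropDuplicates(scanned))); // 20: applied once
    }
}
```

A payload that simply overwrites the record with the latest value would give 
the same result with or without the duplicate, which matches the note above 
that most payload implementations are unaffected.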

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
