hudi-bot opened a new issue, #16045:
URL: https://github.com/apache/hudi/issues/16045

   During Spark stage retries, the Spark driver may already have all the 
information it needs to reconcile the commit and proceed with the next steps, 
while a stray executor may still be writing to a data file and finish later 
(before its JVM exits). 
   
   Extra files left on the dataset that were excluded from the reconcile commit 
step can show up as a data quality issue for query engines, in the form of 
duplicate records.
   
   This change introduces completion markers, which try to prevent the dataset 
from experiencing data quality issues in such corner-case scenarios.
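
The issue text does not spell out how completion markers work, so the following is only a toy sketch of one plausible reading, not Hudi's implementation. It assumes a writer publishes a marker after finishing a data file, the reconcile step drops any file that is uncommitted or has no marker, and a writer that finishes after the commit was finalised removes its own file. The `.markers` directory, the file names, and all function names are hypothetical.

```python
import shutil
import tempfile
from pathlib import Path

MARKER_DIR = ".markers"  # hypothetical layout, not Hudi's real marker path


def start_commit(table_dir: Path) -> None:
    """Driver side: open a commit by creating the marker directory."""
    (table_dir / MARKER_DIR).mkdir()


def write_file(table_dir: Path, name: str, rows: list[str]) -> bool:
    """Executor side: write a data file, then try to publish its completion marker.

    Returns False (and deletes the file) if the commit was already finalised.
    """
    data_path = table_dir / name
    data_path.write_text("\n".join(rows))
    marker_dir = table_dir / MARKER_DIR
    if not marker_dir.is_dir():
        # Assumption: a missing marker directory means the driver has already
        # reconciled the commit, so this writer is a straggler.
        data_path.unlink()
        return False
    (marker_dir / name).touch()
    return True


def reconcile(table_dir: Path, committed: set[str]) -> list[str]:
    """Driver side: delete files that are uncommitted or lack a marker."""
    marker_dir = table_dir / MARKER_DIR
    removed = []
    for path in sorted(table_dir.iterdir()):
        if path.name == MARKER_DIR:
            continue
        if path.name not in committed or not (marker_dir / path.name).exists():
            path.unlink()
            removed.append(path.name)
    # Finalise the commit: later writers can no longer publish markers.
    shutil.rmtree(marker_dir)
    return removed


def demo() -> tuple[list[str], bool, list[str]]:
    with tempfile.TemporaryDirectory() as tmp:
        table = Path(tmp)
        start_commit(table)
        write_file(table, "file1_attempt0.data", ["a", "b"])  # failed attempt
        write_file(table, "file1_attempt1.data", ["a", "b"])  # successful retry
        removed = reconcile(table, {"file1_attempt1.data"})
        # A stray executor finishes only after the commit was reconciled.
        late_ok = write_file(table, "file1_attempt2.data", ["a", "b"])
        remaining = sorted(p.name for p in table.iterdir())
        return removed, late_ok, remaining
```

In this sketch the late attempt leaves nothing behind, so a reader of the directory sees only the committed file. The check-then-delete step is not atomic here; a real system would need to close that window, which this example does not attempt.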
   
   ## JIRA info
   
   - Link: https://issues.apache.org/jira/browse/HUDI-6416
   - Type: Bug
   - Epic: https://issues.apache.org/jira/browse/HUDI-7967


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

