[ 
https://issues.apache.org/jira/browse/HUDI-5420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ethan Guo updated HUDI-5420:
----------------------------
    Description: When a write transaction writes uncommitted log files in a 
delta commit, e.g., due to Spark task retries, these log files stay in the file 
system after the successful delta commit for some time (unlike uncommitted base 
files which are deleted based on the markers).  The delta commit metadata does 
not contain these log files, and the metadata table does not contain these 
entries either.  Currently, the metadata table validator does not treat this 
discrepancy as valid and thus throws errors.

> Fix metadata table validator to exclude uncommitted log files in successful 
> deltacommits
> ----------------------------------------------------------------------------------------
>
>                 Key: HUDI-5420
>                 URL: https://issues.apache.org/jira/browse/HUDI-5420
>             Project: Apache Hudi
>          Issue Type: Bug
>            Reporter: Ethan Guo
>            Assignee: Ethan Guo
>            Priority: Critical
>             Fix For: 0.13.0
>
>
> When a write transaction writes uncommitted log files in a delta commit, 
> e.g., due to Spark task retries, these log files stay in the file system 
> after the successful delta commit for some time (unlike uncommitted base 
> files which are deleted based on the markers).  The delta commit metadata 
> does not contain these log files, and the metadata table does not contain 
> these entries either.  Currently, the metadata table validator does not 
> treat this discrepancy as valid and thus throws errors.
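
The exclusion described in the issue can be illustrated with a minimal sketch. 
This is not the validator code from Hudi; the function names 
(is_log_file, files_to_validate, validate) and the file names are hypothetical, 
and the rule that a log file is any file whose name contains ".log." is a 
simplifying assumption made only for this example.

```python
from typing import Iterable, Set


def is_log_file(name: str) -> bool:
    # Simplifying assumption for this sketch: any file name containing
    # ".log." is treated as a log file; everything else is a base file.
    return ".log." in name


def files_to_validate(fs_files: Iterable[str], committed_files: Set[str]) -> Set[str]:
    # Keep every base file, but keep a log file only if a completed
    # delta commit references it. Leftover log files from retried tasks
    # are dropped before the comparison.
    return {f for f in fs_files if not is_log_file(f) or f in committed_files}


def validate(fs_files: Iterable[str],
             committed_files: Set[str],
             metadata_table_files: Iterable[str]) -> bool:
    # Compare the filtered file system listing with the metadata table listing.
    return files_to_validate(fs_files, committed_files) == set(metadata_table_files)


# Hypothetical listing: the second log file was written by a retried task
# and is absent from the delta commit metadata.
fs_listing = {"f1_001.parquet", ".f1_001.log.1_0-1-0", ".f1_001.log.1_0-1-1"}
committed = {"f1_001.parquet", ".f1_001.log.1_0-1-0"}
metadata_listing = {"f1_001.parquet", ".f1_001.log.1_0-1-0"}

naive_match = set(fs_listing) == metadata_listing
filtered_match = validate(fs_listing, committed, metadata_listing)
```

In this example a direct comparison of the two listings reports a mismatch, 
while the filtered comparison passes because the uncommitted log file is 
ignored.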



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
