[ 
https://issues.apache.org/jira/browse/HUDI-6761?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Y Ethan Guo updated HUDI-6761:
------------------------------
    Fix Version/s: 1.0.2

> Fix rollbacks with MDT for MOR data table with log files
> --------------------------------------------------------
>
>                 Key: HUDI-6761
>                 URL: https://issues.apache.org/jira/browse/HUDI-6761
>             Project: Apache Hudi
>          Issue Type: Bug
>          Components: metadata
>            Reporter: sivabalan narayanan
>            Assignee: sivabalan narayanan
>            Priority: Major
>             Fix For: 1.0.2
>
>
> There are a few rollback scenarios in which some log files from the data
> table (DT) may not be synced to the metadata table (MDT). For the cleaner in
> particular, every valid file in the data table (that is, every file visible
> through fs.listStatus) must be synced to the MDT. We cannot afford to miss
> any log file.
>  
> Two major gaps need to be fixed:
> 1. Log files from the original commit being rolled back.
> For example, t5.dc fails midway in the DT after adding lf2.
> We start a rollback commit, t6.rb. When t6 syncs to the MDT, we should also
> track lf2 and ensure it is synced to the MDT.
> 2. Log files added by previous attempts of the rollback.
> In the scenario above, the rollback could have added a log file (a rollback
> command block) called lf3.
> If that rollback fails and is re-attempted, it could add another file
> called lf4. So when this rollback syncs to the MDT, we need to ensure that
> lf3 is also synced and not missed.
>  
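The two gaps above can be illustrated with a minimal sketch. This is a toy model, not Hudi's implementation: the function names `log_files_to_sync` and `missing_from_mdt` are hypothetical, plain Python sets stand in for the storage listing and the MDT file index, and `lf1` is an assumed pre-existing log file that is already tracked.

```python
def log_files_to_sync(failed_commit_log_files, rollback_attempt_log_files):
    """Collect every log file a rollback sync should report to the MDT.

    failed_commit_log_files: log files written by the commit being rolled
        back (gap 1, e.g. lf2 from t5.dc).
    rollback_attempt_log_files: one set per rollback attempt, including
        earlier failed attempts (gap 2, e.g. lf3 and then lf4 for t6.rb).
    """
    to_sync = set(failed_commit_log_files)
    for attempt_files in rollback_attempt_log_files:
        to_sync.update(attempt_files)
    return to_sync


def missing_from_mdt(listed_on_storage, tracked_in_mdt):
    """Files visible in a storage listing that the MDT does not track."""
    return set(listed_on_storage) - set(tracked_in_mdt)


# Scenario from the description (lf1 is an assumed, already-synced file).
storage_listing = {"lf1", "lf2", "lf3", "lf4"}

# Syncing only the files of the latest rollback attempt leaves gaps.
mdt_naive = {"lf1"} | {"lf4"}
gaps_naive = missing_from_mdt(storage_listing, mdt_naive)  # lf2 and lf3

# Syncing the union of the failed commit's files and all attempts' files
# leaves nothing untracked.
mdt_full = {"lf1"} | log_files_to_sync({"lf2"}, [{"lf3"}, {"lf4"}])
gaps_full = missing_from_mdt(storage_listing, mdt_full)  # empty set
```

The sketch only shows which files need to be accounted for. How the real rollback plan discovers lf2 and lf3 (for example, from a storage listing or from rollback metadata) is not specified in the description and is left open here.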



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
