[ 
https://issues.apache.org/jira/browse/HUDI-2468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

sivabalan narayanan updated HUDI-2468:
--------------------------------------
    Description: 
Let's say there is only one commit, and it has also been applied to the 
metadata table.

Now the user, for some reason, wants to roll back this commit in the data 
table. 

When this reaches the metadata code path, we first go through the bootstrap 
code path. Here we check the last synced instant from the metadata table and 
compare it with the data timeline. Since the corresponding commit in the data 
timeline is inflight, the code deduces that the last synced instant is out of 
the active timeline and that the metadata table needs to be re-bootstrapped. 

But we also have a condition that bootstrapping can be done only if there are 
no inflight instants in the data timeline. The very same commit is actually 
inflight in the data timeline, so we fail here. 

 

This could also be an issue when trying to roll back a bootstrap commit in the 
data table. 

Let's say we bootstrap a data table, which results in just one commit. If we 
later try to roll it back, we hit the same issue as above. All tests in 
TestBootstrap fail because of this when metadata is enabled. 

Possible fix:

We can pass information about the instant currently being operated on when 
instantiating the metadata table writer, and ignore that instant among the 
inflight instants when checking the bootstrap prerequisite. But I am wondering 
whether there is a better approach. 
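
To make the idea concrete, here is a minimal, self-contained sketch of the 
prerequisite check with and without the proposed exclusion. It does not use 
Hudi's actual classes: Instant, canBootstrap and the currentInstant parameter 
are hypothetical names made up for illustration, and the real check may look 
quite different.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

public class BootstrapCheckSketch {

    /** Hypothetical stand-in for a data timeline instant (not Hudi's class). */
    static final class Instant {
        final String timestamp;
        final boolean inflight;

        Instant(String timestamp, boolean inflight) {
            this.timestamp = timestamp;
            this.inflight = inflight;
        }
    }

    /**
     * Bootstrap prerequisite: no inflight instants in the data timeline,
     * except the one currently being operated on (if any is passed in).
     * Passing Optional.empty() models the behavior described in the issue.
     */
    static boolean canBootstrap(List<Instant> dataTimeline,
                                Optional<String> currentInstant) {
        for (Instant instant : dataTimeline) {
            boolean isCurrent = currentInstant.isPresent()
                    && currentInstant.get().equals(instant.timestamp);
            if (instant.inflight && !isCurrent) {
                return false; // some other writer is still in progress
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // The only commit, inflight because it is being rolled back.
        List<Instant> timeline = Arrays.asList(new Instant("001", true));

        // Without the exclusion the check blocks on the commit itself.
        System.out.println(canBootstrap(timeline, Optional.empty()));   // false

        // With the exclusion the rollback of "001" can proceed.
        System.out.println(canBootstrap(timeline, Optional.of("001"))); // true

        // An unrelated inflight instant still blocks bootstrap.
        List<Instant> busy = Arrays.asList(
                new Instant("001", true), new Instant("002", true));
        System.out.println(canBootstrap(busy, Optional.of("001")));     // false
    }
}
```

In this sketch the exclusion is narrow on purpose: only the instant passed in 
is ignored, so any other inflight instant still prevents bootstrapping.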

 

  was:
Let's say there is only one commit, and it has also been applied to the 
metadata table.

Now the user, for some reason, wants to roll back this commit in the data 
table. 

When this reaches the metadata code path, we first go through the bootstrap 
code path. Here we check the last synced instant from the metadata table and 
compare it with the data timeline. Since the corresponding commit in the data 
timeline is inflight, the code deduces that the last synced instant is out of 
the active timeline and that the metadata table needs to be re-bootstrapped. 

But we also have a condition that bootstrapping can be done only if there are 
no inflight instants in the data timeline. The very same commit is actually 
inflight in the data timeline, so we fail here. 

Possible fix:

We can pass information about the instant currently being operated on when 
instantiating the metadata table writer, and ignore that instant among the 
inflight instants when checking the bootstrap prerequisite. But I am wondering 
whether there is a better approach. 

 


> Fix rollback of first commit after being synced to metadata table
> -----------------------------------------------------------------
>
>                 Key: HUDI-2468
>                 URL: https://issues.apache.org/jira/browse/HUDI-2468
>             Project: Apache Hudi
>          Issue Type: Sub-task
>            Reporter: sivabalan narayanan
>            Assignee: sivabalan narayanan
>            Priority: Major
>             Fix For: 0.10.0
>
>
> Let's say there is only one commit, and it has also been applied to the 
> metadata table.
> Now the user, for some reason, wants to roll back this commit in the data 
> table. 
> When this reaches the metadata code path, we first go through the bootstrap 
> code path. Here we check the last synced instant from the metadata table and 
> compare it with the data timeline. Since the corresponding commit in the data 
> timeline is inflight, the code deduces that the last synced instant is out of 
> the active timeline and that the metadata table needs to be re-bootstrapped. 
> But we also have a condition that bootstrapping can be done only if there are 
> no inflight instants in the data timeline. The very same commit is actually 
> inflight in the data timeline, so we fail here. 
>  
> This could also be an issue when trying to roll back a bootstrap commit in 
> the data table. 
> Let's say we bootstrap a data table, which results in just one commit. If we 
> later try to roll it back, we hit the same issue as above. 
> All tests in TestBootstrap fail because of this when metadata is enabled. 
> Possible fix:
> We can pass information about the instant currently being operated on when 
> instantiating the metadata table writer, and ignore that instant among the 
> inflight instants when checking the bootstrap prerequisite. But I am 
> wondering whether there is a better approach. 
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
