[ 
https://issues.apache.org/jira/browse/HUDI-5407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

sivabalan narayanan updated HUDI-5407:
--------------------------------------
    Description: 
Under rare conditions, rollbacks in the metadata table (MDT) are not effective. 
Apparently, we have set the cleaning policy to lazy, hence rollbacks happen only 
when the cleaner kicks in and not when we start a new commit. Given that the MDT 
is a single-writer table, rollback blocks are effective only when the commit to 
roll back is just prior to the rollback block. 

 

A scenario where this could fail with inline compaction: 

 
{code:java}
Data table timeline
t1.dc   t2.comp.req        |Crash  t3.dc  t2.comp.inflight  t2.commit

MDT timeline
t1.dc   t2.comp.inflight   |Crash  t3.dc  t4.rb(t2)         t2.dc

{code}
 

The first attempt of t2 in the MDT should be rolled back since it crashed 
midway. In other words, any log blocks written by that attempt of t2 in the MDT 
should be deemed invalid. 

 

But here is how the log blocks are actually laid out: 

log1 (t1), log2 (t2, first attempt), crash ... log3 (t3), log4 (t4.rb, rolling 
back t2) ... log5 (t2)

 

So, when we read the log blocks via AbstractLogRecordReader, ideally we want to 
ignore log2. But when we encounter log4, a rollback block, we check only the 
previous log block for a commit matching the one to roll back. Since the 
previous block (log3, written by t3) does not match t2, we assume log4 is a 
duplicate rollback and hence still deem log2 a valid log block. 

Hence the MDT could serve extra data files that are not valid from an FS-based 
listing standpoint. 
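
The reader behavior described above can be sketched with a small, simplified 
model. This is only an illustration under stated assumptions: the class 
RollbackScanSketch, the Block type, and both methods are hypothetical and are 
not Hudi's AbstractLogRecordReader code. The first method mimics the 
"check only the previous block" logic; the second shows one possible 
alternative that invalidates every earlier block written by the rolled-back 
instant, which is not necessarily the fix applied for this issue. 

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Simplified model of the scenario in this issue; NOT Hudi's actual reader code.
class RollbackScanSketch {

    // Hypothetical stand-in for a log block: either a data block written by
    // `instant`, or a rollback block whose `target` is the instant to roll back.
    static final class Block {
        final String name;
        final String instant;
        final String target; // null for data blocks

        Block(String name, String instant, String target) {
            this.name = name;
            this.instant = instant;
            this.target = target;
        }

        boolean isRollback() {
            return target != null;
        }
    }

    // Log layout from the description: log1(t1) log2(t2) log3(t3) log4(t4.rb(t2)) log5(t2)
    static List<Block> scenario() {
        return Arrays.asList(
            new Block("log1", "t1", null),
            new Block("log2", "t2", null),   // first attempt of t2, crashed
            new Block("log3", "t3", null),
            new Block("log4", "t4", "t2"),   // rollback block targeting t2
            new Block("log5", "t2", null));  // second attempt of t2
    }

    // Behavior described in the issue: a rollback block only inspects the block
    // immediately before it; on a mismatch it is treated as a duplicate rollback.
    static List<String> validBlocksPreviousOnly(List<Block> log) {
        List<Block> valid = new ArrayList<>();
        for (Block b : log) {
            if (!b.isRollback()) {
                valid.add(b);
                continue;
            }
            int last = valid.size() - 1;
            if (last >= 0 && valid.get(last).instant.equals(b.target)) {
                valid.remove(last);
            }
        }
        return names(valid);
    }

    // Assumed alternative: drop every earlier block written by the target instant.
    static List<String> validBlocksScanBack(List<Block> log) {
        List<Block> valid = new ArrayList<>();
        for (Block b : log) {
            if (!b.isRollback()) {
                valid.add(b);
                continue;
            }
            final String target = b.target;
            valid.removeIf(prev -> prev.instant.equals(target));
        }
        return names(valid);
    }

    private static List<String> names(List<Block> blocks) {
        List<String> out = new ArrayList<>();
        for (Block b : blocks) {
            out.add(b.name);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(validBlocksPreviousOnly(scenario())); // [log1, log2, log3, log5]
        System.out.println(validBlocksScanBack(scenario()));     // [log1, log3, log5]
    }
}
```

In this model the previous-only check leaves log2 in the valid set, while the 
scan-back variant drops it and still keeps log5, because log5 is written after 
the rollback block. 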

 

Impact:

Without this fix, log blocks that should be ignored are considered valid. 

 

 

 


> Rollbacks in MDT is not effective
> ---------------------------------
>
>                 Key: HUDI-5407
>                 URL: https://issues.apache.org/jira/browse/HUDI-5407
>             Project: Apache Hudi
>          Issue Type: Bug
>          Components: metadata
>            Reporter: sivabalan narayanan
>            Assignee: sivabalan narayanan
>            Priority: Critical
>              Labels: pull-request-available
>             Fix For: 0.13.0
>



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
