[
https://issues.apache.org/jira/browse/HUDI-5407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
sivabalan narayanan updated HUDI-5407:
--------------------------------------
Description:
In rare cases, rollbacks in the MDT (metadata table) are not effective. The
cleaning policy is set to lazy, so rollbacks happen only when the cleaner kicks
in, not when a new commit starts. Because the MDT is a single-writer table, a
rollback block is effective only when the commit being rolled back immediately
precedes the rollback block.
Scenario where this can fail with inline compaction:
{code:java}
Data table timeline
t1.dc t2.comp.req. |Crash t3.dc t2.comp.inflight t2.commit
MDT timeline
t1.dc. t2.comp.inflight |Crash t3.dc t4.rb(t2) t2.dc
{code}
The first attempt of t2 in the MDT should be rolled back since it crashed
midway. In other words, any log blocks written by that first attempt of t2 in
the MDT should be deemed invalid.
However, the log blocks end up laid out as follows:
log1 (t1), log2 (t2, first attempt), crash, log3 (t3), log4 (t4.rb, rolling
back t2), ..., log5 (t2)
When we read the log blocks via AbstractLogRecordReader, we ideally want to
ignore log2. But when we encounter the rollback block log4, we check only the
previous log block for a commit matching the one to roll back. Since the
previous block (log3, written by t3) does not match t2, we assume log4 is a
duplicate rollback and still deem log2 a valid log block.
As a result, the MDT could serve extra data files that are not valid from an
FS-based listing standpoint.
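The reader behavior described above can be illustrated with a small,
self-contained model. This is only a sketch under stated assumptions: the
class, the Block type, and both scan methods are hypothetical and heavily
simplified, and none of it is the actual AbstractLogRecordReader code. The
second method shows one possible way a rollback could look past the
immediately preceding block; it is not necessarily what the fix does.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Simplified model of the log layout in this issue. All names are
// hypothetical; this is not Hudi's AbstractLogRecordReader.
class RollbackScanSketch {

    // A log block: either data written by `instant`, or a rollback block
    // written by `instant` that targets `rollbackTarget`.
    static final class Block {
        final String name;
        final String instant;
        final String rollbackTarget; // null for data blocks

        Block(String name, String instant, String rollbackTarget) {
            this.name = name;
            this.instant = instant;
            this.rollbackTarget = rollbackTarget;
        }
    }

    // Layout from the description:
    // log1(t1) log2(t2 first attempt) log3(t3) log4(t4.rb -> t2) log5(t2)
    static List<Block> sampleLayout() {
        return Arrays.asList(
            new Block("log1", "t1", null),
            new Block("log2", "t2", null),
            new Block("log3", "t3", null),
            new Block("log4", "t4", "t2"),
            new Block("log5", "t2", null));
    }

    // Behavior described in the issue: a rollback block only inspects the
    // block immediately before it. On a mismatch it is treated as a
    // duplicate rollback and nothing is dropped.
    static List<String> scanCheckingPreviousOnly(List<Block> blocks) {
        List<Block> valid = new ArrayList<>();
        for (Block b : blocks) {
            if (b.rollbackTarget == null) {
                valid.add(b);
            } else if (!valid.isEmpty()
                    && valid.get(valid.size() - 1).instant.equals(b.rollbackTarget)) {
                valid.remove(valid.size() - 1);
            }
        }
        return names(valid);
    }

    // Assumed alternative: a rollback block invalidates every earlier block
    // written by the target instant, wherever it sits in the log.
    static List<String> scanCheckingAllEarlier(List<Block> blocks) {
        List<Block> valid = new ArrayList<>();
        for (Block b : blocks) {
            if (b.rollbackTarget == null) {
                valid.add(b);
            } else {
                valid.removeIf(earlier -> earlier.instant.equals(b.rollbackTarget));
            }
        }
        return names(valid);
    }

    private static List<String> names(List<Block> blocks) {
        List<String> out = new ArrayList<>();
        for (Block b : blocks) {
            out.add(b.name);
        }
        return out;
    }

    public static void main(String[] args) {
        // prints [log1, log2, log3, log5]: log2 is wrongly kept
        System.out.println(scanCheckingPreviousOnly(sampleLayout()));
        // prints [log1, log3, log5]: log2 is dropped, the retried log5 stays
        System.out.println(scanCheckingAllEarlier(sampleLayout()));
    }
}
```

In this model the retried t2 block (log5) survives in both scans because it is
written after the rollback block; only the handling of log2 differs.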
Impact:
Without this fix, log blocks that should be ignored are considered valid.
> Rollbacks in MDT is not effective
> ---------------------------------
>
> Key: HUDI-5407
> URL: https://issues.apache.org/jira/browse/HUDI-5407
> Project: Apache Hudi
> Issue Type: Bug
> Components: metadata
> Reporter: sivabalan narayanan
> Assignee: sivabalan narayanan
> Priority: Critical
> Labels: pull-request-available
> Fix For: 0.13.0
--
This message was sent by Atlassian Jira
(v8.20.10#820010)