[
https://issues.apache.org/jira/browse/HUDI-2436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
sivabalan narayanan updated HUDI-2436:
--------------------------------------
Description:
I don't fully understand
[this|https://github.com/apache/hudi/pull/3651/files#r708820142] point. I will
follow up with Vinoth on the exact scenario.
Here is my understanding of the scenario for cloud stores that do not
support append.
If there was a crash during a commit, then when we list the log files to be
logged in the rollback plan, the last log file (the one being written when the
crash happened) may not be part of the plan. That should be fine: the file is
not visible via listing anyway, so I assume it will not be visible during
compaction either. We will proceed with the rollback by adding another log
block (file), and this will get replayed to the metadata table.
If you are talking about the case where a crash happens while the rollback
itself is being logged, just before committing to the metadata table, we should
be OK here too. We will retry the rollback, which will redo the action phase
and add new log blocks (covering the same old logs that were part of the failed
writes; the only difference is that it may not be able to delete them
successfully). This will then get applied to the metadata table. We just have
to ensure that, when applying changes to the metadata table, we consider all
files from the plan and not just the ones that were successfully deleted.
- With HDFS-type stores, where appends are allowed, we just create new log
blocks, so this should not be an issue.
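The last point, about applying every file from the plan to the metadata table,
can be illustrated with a minimal sketch. This is not Hudi code: the class
name, method names, and file names below are hypothetical, and the sketch
assumes that on a retried rollback a delete reports failure for a file that an
earlier attempt already removed.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only; none of these names exist in Hudi.
public class RollbackMetadataSketch {

    // Risky variant: report only files whose delete call succeeded in this attempt.
    public static Set<String> filesFromSuccessfulDeletes(Map<String, Boolean> deleteResults) {
        Set<String> files = new LinkedHashSet<>();
        for (Map.Entry<String, Boolean> entry : deleteResults.entrySet()) {
            if (entry.getValue()) {
                files.add(entry.getKey());
            }
        }
        return files;
    }

    // Variant described above: report every file named in the rollback plan,
    // regardless of whether this particular attempt managed to delete it.
    public static Set<String> filesFromPlan(List<String> plannedFiles) {
        return new LinkedHashSet<>(plannedFiles);
    }

    public static void main(String[] args) {
        List<String> plan = Arrays.asList("log.1", "log.2", "log.3");

        // Assumed history: the first rollback attempt deleted log.1 and log.2,
        // then crashed before committing to the metadata table. On retry the
        // deletes for those two files report failure because they are gone.
        Map<String, Boolean> retryResults = new LinkedHashMap<>();
        retryResults.put("log.1", false);
        retryResults.put("log.2", false);
        retryResults.put("log.3", true);

        System.out.println("from successful deletes: " + filesFromSuccessfulDeletes(retryResults));
        System.out.println("from plan:               " + filesFromPlan(plan));
    }
}
```

Under these assumptions, the first variant would tell the metadata table about
log.3 only, while the plan-based variant covers all three files, which is what
the retry needs for the metadata table to stay in sync.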
was:
In cloud stores that do not support append, if there was a crash while writing
to a log file and a rollback was later triggered for this failed commit, this
last log file may not be returned when listing all log files written for the
commit of interest. But is there a chance that the log file to which the
rollback is committed/logged uses the same log file name as the one that
failed?
- With HDFS-type stores, where appends are allowed, we just create new log
blocks, so this should not be an issue.
> rollback in cloud stores w/o append, wrt collecting failed log files to be
> deleted/logged
> -----------------------------------------------------------------------------------------
>
> Key: HUDI-2436
> URL: https://issues.apache.org/jira/browse/HUDI-2436
> Project: Apache Hudi
> Issue Type: Sub-task
> Components: Writer Core
> Reporter: sivabalan narayanan
> Assignee: sivabalan narayanan
> Priority: Major
> Fix For: 0.10.0
>