[ 
https://issues.apache.org/jira/browse/HUDI-2436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

sivabalan narayanan updated HUDI-2436:
--------------------------------------
    Description: 
I don't fully get 
[this|https://github.com/apache/hudi/pull/3651/files#r708820142] point. Will 
follow up with Vinoth on the exact scenario.
Here is my understanding of the scenario, for cloud stores that do not 
support append.

If there was a crash during a commit, then when listing the log files to be 
logged in the rollback plan, the last one (the one being written when the 
crash happened) may not be part of the plan. But that should be fine: it is 
not available via listing anyway, so I assume it will not be visible during 
compaction either. We will proceed with the rollback by adding another log 
block (file), and this will get replayed to the metadata table.

If you are talking about the case where a crash happens while the rollback 
itself is being logged, just before committing to the metadata table, we 
should be OK here too. We will retry the rollback, which will redo the action 
phase and add new log blocks (with the same old logs that were part of the 
failed writes, just that it may not be able to delete them successfully). 
This will get applied to the metadata table. We just have to ensure that, 
when applying changes to the metadata table, we consider all files from the 
plan and not just the ones that got successfully deleted.
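
A minimal sketch of that last point, in Python for brevity. All names here 
(RollbackAttempt, files_for_metadata_sync, the log file names) are 
hypothetical and are not Hudi classes or APIs; it only illustrates why the 
metadata table update should be driven by the plan rather than by the 
deletes that happened to succeed in the current attempt.

```python
from dataclasses import dataclass, field


@dataclass
class RollbackAttempt:
    """Hypothetical record of one rollback attempt (not a Hudi class)."""
    planned_files: list  # every log file listed in the rollback plan
    deleted_files: set = field(default_factory=set)  # deletes that succeeded in this attempt


def files_for_metadata_sync(attempt):
    """Return the files to apply to the metadata table for this attempt.

    Uses the full plan rather than only the successful deletes, so a retry
    after a crash still reports files that an earlier attempt already removed.
    """
    return sorted(set(attempt.planned_files))


def files_from_successful_deletes_only(attempt):
    """The variant to avoid: it drops files an earlier attempt already deleted."""
    return sorted(attempt.deleted_files)


# First attempt deleted log.1, then crashed before the metadata table commit.
# On retry, log.1 is already gone, so only log.2 is deleted in this attempt.
retry = RollbackAttempt(planned_files=["log.1", "log.2"], deleted_files={"log.2"})
print(files_for_metadata_sync(retry))             # ['log.1', 'log.2']
print(files_from_successful_deletes_only(retry))  # ['log.2']
```

In the retry case the second variant would never tell the metadata table 
about log.1, which is the gap described above.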

 

- With HDFS-type stores, where appends are allowed, we just create new log 
blocks, and hence this should not be an issue. 

  was:
In cloud stores that do not support append, if there was a crash while 
writing to a log file and a rollback was later triggered for this failed 
commit, this last log file may not be returned when listing all log files 
written for the commit of interest. But is there a chance that the log file 
to which the rollback is committed/logged uses the same log file name as the 
one that failed? 

 

- With HDFS-type stores, where appends are allowed, we just create new log 
blocks, and hence this should not be an issue. 


> rollback in cloud stores w/o append, wrt collecting failed log files to be 
> deleted/logged
> -----------------------------------------------------------------------------------------
>
>                 Key: HUDI-2436
>                 URL: https://issues.apache.org/jira/browse/HUDI-2436
>             Project: Apache Hudi
>          Issue Type: Sub-task
>          Components: Writer Core
>            Reporter: sivabalan narayanan
>            Assignee: sivabalan narayanan
>            Priority: Major
>             Fix For: 0.10.0
>
>



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
