[ https://issues.apache.org/jira/browse/HUDI-7361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
ASF GitHub Bot updated HUDI-7361:
---------------------------------
Labels: pull-request-available (was: )
> Fix a concurrency issue caused by rollbackFailedWrites
> ------------------------------------------------------
>
> Key: HUDI-7361
> URL: https://issues.apache.org/jira/browse/HUDI-7361
> Project: Apache Hudi
> Issue Type: Bug
> Components: writer-core
> Reporter: eric
> Priority: Major
> Labels: pull-request-available
> Attachments: jobmanager_log.txt, taskmanager_log.txt
>
>
> {quote}CREATE TABLE tbl (
> ......
> ) WITH (
> 'connector' = 'hudi',
> 'path' = '/tblpath',
> 'table.type' = 'COPY_ON_WRITE',
> 'write.bucket_assign.tasks'='5',
> 'write.operation'='insert',
> 'write.tasks'='5',
> 'clustering.schedule.enabled'='true',
> 'clustering.async.enabled'='true',
> 'clustering.delta_commits'='3',
> 'clustering.tasks'='5',
> 'hoodie.cleaner.policy.failed.writes'='LAZY'
> );
> {quote}
> *Table parameters are as above*
>
> *From the jobmanager and taskmanager logs, we can summarize how the failure
> is triggered:*
> Before the write client completes commit 20240126154725671, the clean table
> service starts to run. As part of the clean process, it checks for failed
> writes and rolls them back (rollbackFailedWrites).
> This check verifies whether the heartbeat of each inflight instant has timed
> out and rolls back the instants whose heartbeats have timed out. Meanwhile,
> the write client completes commit 20240126154725671 and deletes the
> heartbeat file of this instant.
> The clean table service client then reads a last heartbeat of 0 for this
> instant, treats the heartbeat as timed out, and rolls the instant back even
> though its commit had already completed.
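
The interleaving described above can be reproduced in a small in-memory model. This is only a sketch under stated assumptions: the class HeartbeatRaceSketch, its methods, and the recheckTimeline flag are hypothetical names made up for illustration and are not Hudi APIs, and the guarded path shows one possible mitigation, not necessarily what the linked pull request does. The only behavior taken from the report is that a deleted heartbeat file reads back as a last heartbeat of 0.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical in-memory model of the race; none of these names are Hudi APIs.
final class HeartbeatRaceSketch {
    // instant time -> last heartbeat in epoch millis (stands in for heartbeat files)
    private final Map<String, Long> heartbeats = new ConcurrentHashMap<>();
    // instants that are currently inflight on the (modelled) timeline
    private final Set<String> inflight = ConcurrentHashMap.newKeySet();

    // Writer starts an instant and records its first heartbeat.
    void startWrite(String instant, long nowMs) {
        inflight.add(instant);
        heartbeats.put(instant, nowMs);
    }

    // Writer completes the commit: the instant is no longer inflight
    // and its heartbeat is deleted.
    void completeCommit(String instant) {
        inflight.remove(instant);
        heartbeats.remove(instant);
    }

    // A missing heartbeat reads as 0, as observed in the report.
    long lastHeartbeat(String instant) {
        return heartbeats.getOrDefault(instant, 0L);
    }

    boolean isExpired(String instant, long nowMs, long timeoutMs) {
        return nowMs - lastHeartbeat(instant) > timeoutMs;
    }

    // Cleaner step 1: take a snapshot of the inflight instants.
    Set<String> snapshotInflight() {
        return new HashSet<>(inflight);
    }

    // Cleaner step 2: pick rollback candidates from the (possibly stale) snapshot.
    // With recheckTimeline=true (an assumed guard), instants that are no longer
    // inflight are skipped even if their heartbeat looks expired.
    Set<String> instantsToRollback(Set<String> snapshot, long nowMs,
                                   long timeoutMs, boolean recheckTimeline) {
        Set<String> result = new HashSet<>();
        for (String instant : snapshot) {
            if (!isExpired(instant, nowMs, timeoutMs)) {
                continue;
            }
            if (recheckTimeline && !inflight.contains(instant)) {
                continue;
            }
            result.add(instant);
        }
        return result;
    }

    public static void main(String[] args) {
        HeartbeatRaceSketch model = new HeartbeatRaceSketch();
        String instant = "20240126154725671";
        long t0 = 1_706_255_245_000L;  // arbitrary epoch millis for the demo
        long timeoutMs = 60_000L;      // arbitrary heartbeat timeout for the demo

        model.startWrite(instant, t0);
        Set<String> snapshot = model.snapshotInflight(); // cleaner lists inflight instants
        model.completeCommit(instant);                   // writer commits in between

        System.out.println("unguarded: "
            + model.instantsToRollback(snapshot, t0 + 5_000L, timeoutMs, false));
        System.out.println("guarded:   "
            + model.instantsToRollback(snapshot, t0 + 5_000L, timeoutMs, true));
    }
}
```

In this model the unguarded check selects the already committed instant for rollback because the cleaner decides from a stale snapshot plus a heartbeat value of 0, while re-reading the inflight set before rolling back filters it out. Whether a re-check, a lock around the commit and heartbeat deletion, or a different approach fits Hudi best is left to the actual fix.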
--
This message was sent by Atlassian Jira
(v8.20.10#820010)