[
https://issues.apache.org/jira/browse/HUDI-7361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
eric updated HUDI-7361:
-----------------------
Description:
{quote}CREATE TABLE tbl (
......
) WITH (
'connector' = 'hudi',
'path' = '/tblpath',
'table.type' = 'COPY_ON_WRITE',
'write.bucket_assign.tasks'='5',
'write.operation'='insert',
'write.tasks'='5',
'clustering.schedule.enabled'='true',
'clustering.async.enabled'='true',
'clustering.delta_commits'='3',
'clustering.tasks'='5',
'hoodie.cleaner.policy.failed.writes'='LAZY'
);
{quote}
*The table parameters are as above.*
*From the jobmanager and taskmanager logs, we can summarize how the failure is
triggered:*
Before the write client completes commit 20240126154725671, the clean table
service starts to work, and the rollback of failed writes has to be checked and
completed during the clean process.
This method checks whether the heartbeat of each inflight instant has timed
out and rolls back the instants whose heartbeats have expired. At the same
time, the write client completes commit 20240126154725671 and deletes the
heartbeat file of this instant.
The clean table service client therefore reads a last heartbeat of 0 for this
instant, treats it as expired, and rolls the instant back.
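The interleaving described above can be replayed deterministically in a small sketch. This is not Hudi code: the class, the in-memory maps standing in for heartbeat files and the timeline, and the method names are all hypothetical, and the timeout value is an arbitrary assumption. The "guarded" variant shows only one possible mitigation (re-checking whether the instant has completed before rolling back), not necessarily the fix adopted for this issue.

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical simulation of the race; none of these names are Hudi APIs.
public class HeartbeatRaceSketch {
    // Stand-in for heartbeat files: instant time -> last heartbeat (ms).
    static final Map<String, Long> heartbeats = new ConcurrentHashMap<>();
    // Stand-in for completed instants on the timeline.
    static final Set<String> completed = ConcurrentHashMap.newKeySet();

    // A missing heartbeat file is read as 0, as in the report.
    static long lastHeartbeat(String instant) {
        return heartbeats.getOrDefault(instant, 0L);
    }

    static boolean isExpired(String instant, long nowMs, long timeoutMs) {
        return nowMs - lastHeartbeat(instant) > timeoutMs;
    }

    // Naive check: the instant was listed as inflight earlier,
    // and only the heartbeat is consulted now.
    static boolean naiveShouldRollback(String instant, long nowMs, long timeoutMs) {
        return isExpired(instant, nowMs, timeoutMs);
    }

    // Assumed mitigation: also confirm the instant has not completed meanwhile.
    static boolean guardedShouldRollback(String instant, long nowMs, long timeoutMs) {
        return isExpired(instant, nowMs, timeoutMs) && !completed.contains(instant);
    }

    // Writer side: finish the commit, then delete the heartbeat.
    static void completeCommit(String instant) {
        completed.add(instant);
        heartbeats.remove(instant);
    }

    public static void main(String[] args) {
        String instant = "20240126154725671";
        long start = 1_000_000L;    // arbitrary clock value for the sketch
        long now = start + 1_000L;  // one second later
        long timeout = 120_000L;    // assumed heartbeat timeout

        heartbeats.put(instant, start); // 1. writer is alive and heartbeating
        // 2. cleaner lists the instant as inflight (implicit in this sketch)
        completeCommit(instant);        // 3. writer commits and removes the heartbeat
        // 4. cleaner now checks the heartbeat of the instant it listed in step 2
        System.out.println("naive rollback: " + naiveShouldRollback(instant, now, timeout));
        System.out.println("guarded rollback: " + guardedShouldRollback(instant, now, timeout));
    }
}
```

In this sketch the naive check reports that the committed instant should be rolled back, because the deleted heartbeat reads as 0, while the guarded check does not.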
> Fix a concurrency issue caused by rollbackFailedWrites
> ------------------------------------------------------
>
> Key: HUDI-7361
> URL: https://issues.apache.org/jira/browse/HUDI-7361
> Project: Apache Hudi
> Issue Type: Bug
> Components: writer-core
> Reporter: eric
> Priority: Major
> Attachments: jobmanager_log.txt, taskmanager_log.txt
>