Jonathan Vexler created HUDI-6718:
-------------------------------------

             Summary: Concurrent cleaner commit same instance conflict 
                 Key: HUDI-6718
                 URL: https://issues.apache.org/jira/browse/HUDI-6718
             Project: Apache Hudi
          Issue Type: Bug
          Components: cleaning, multi-writer, table-service
            Reporter: Jonathan Vexler
            Assignee: Jonathan Vexler


Timeline:

 
{code:java}
-rw-r--r--   1 jon  wheel     0B Aug 16 19:58 20230816195843234.commit.requested
-rw-r--r--   1 jon  wheel     0B Aug 16 19:58 20230816195845557.commit.requested
-rw-r--r--   1 jon  wheel   2.2K Aug 16 19:58 20230816195843234.inflight
-rw-r--r--   1 jon  wheel   813B Aug 16 19:58 20230816195845557.inflight
-rw-r--r--   1 jon  wheel   2.6K Aug 16 19:58 20230816195845557.commit
-rw-r--r--   1 jon  wheel   2.6K Aug 16 19:58 20230816195843234.commit
-rw-r--r--   1 jon  wheel   1.7K Aug 16 19:58 20230816195855285.clean.requested
-rw-r--r--   1 jon  wheel   1.7K Aug 16 19:58 20230816195855285.clean.inflight
-rw-r--r--   1 jon  wheel   1.8K Aug 16 19:58 20230816195855389.clean.requested
-rw-r--r--   1 jon  wheel   1.7K Aug 16 19:58 20230816195855285.clean
{code}
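To make the state of each instant in the listing easier to read, here is a minimal Python sketch that groups the file names by timestamp and action and reports which instants never reached completion. The helper names (parse_instant, latest_states, pending) are hypothetical and not part of Hudi. The sketch assumes that a bare "<timestamp>.inflight" file is an inflight commit and that "<timestamp>.<action>" with no state suffix marks a completed instant, which is how the listing above appears to be laid out.

```python
# File names copied from the timeline listing above.
TIMELINE_FILES = [
    "20230816195843234.commit.requested",
    "20230816195845557.commit.requested",
    "20230816195843234.inflight",
    "20230816195845557.inflight",
    "20230816195845557.commit",
    "20230816195843234.commit",
    "20230816195855285.clean.requested",
    "20230816195855285.clean.inflight",
    "20230816195855389.clean.requested",
    "20230816195855285.clean",
]

# Ordering of states used to pick the most advanced state per instant.
STATE_ORDER = {"requested": 0, "inflight": 1, "completed": 2}


def parse_instant(name):
    """Split a timeline file name into (timestamp, action, state)."""
    parts = name.split(".")
    timestamp = parts[0]
    if len(parts) == 2:
        # Assumption: '<ts>.inflight' is an inflight commit,
        # and '<ts>.<action>' is a completed instant.
        if parts[1] == "inflight":
            return timestamp, "commit", "inflight"
        return timestamp, parts[1], "completed"
    return timestamp, parts[1], parts[2]


def latest_states(names):
    """Return {(timestamp, action): most advanced state seen}."""
    latest = {}
    for name in names:
        timestamp, action, state = parse_instant(name)
        key = (timestamp, action)
        if key not in latest or STATE_ORDER[state] > STATE_ORDER[latest[key]]:
            latest[key] = state
    return latest


def pending(names, action):
    """Timestamps of the given action that have not completed."""
    return sorted(
        timestamp
        for (timestamp, act), state in latest_states(names).items()
        if act == action and state != "completed"
    )


print("pending cleans:", pending(TIMELINE_FILES, "clean"))
print("pending commits:", pending(TIMELINE_FILES, "commit"))
```

Run against the listing, this reports 20230816195855389 as the only clean that is still pending, while both commits and the clean at 20230816195855285 completed.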
Contents of the two clean requests:

 
{code:java}
avrocat hudi/output/.hoodie/20230816195855285.clean.requested
{"earliestInstantToRetain": {"HoodieActionInstant": {"timestamp": 
"20230816195654386", "action": "commit", "state": "COMPLETED"}}, 
"lastCompletedCommitTimestamp": "20230816195845557", "policy": 
"KEEP_LATEST_COMMITS", "filesToBeDeletedPerPartition": {"map": {}}, "version": 
{"int": 2}, "filePathsToBeDeletedPerPartition": {"map": {"1970/01/01": 
[{"filePath": {"string": 
"file:/tmp/hudi/output/1970/01/01/f66cf644-9e9f-477f-863c-eb62d1c6b14d-0_0-1391-2009_20230816195619275.parquet"},
 "isBootstrapBaseFile": {"boolean": false}}]}}, "partitionsToBeDeleted": 
{"array": []}}
{code}
{code:java}
avrocat hudi/output/.hoodie/20230816195855389.clean.requested 
{"earliestInstantToRetain": {"HoodieActionInstant": {"timestamp": 
"20230816195704584", "action": "commit", "state": "COMPLETED"}}, 
"lastCompletedCommitTimestamp": "20230816195845557", "policy": 
"KEEP_LATEST_COMMITS", "filesToBeDeletedPerPartition": {"map": {}}, "version": 
{"int": 2}, "filePathsToBeDeletedPerPartition": {"map": {"1970/01/01": 
[{"filePath": {"string": 
"file:/tmp/hudi/output/1970/01/01/f66cf644-9e9f-477f-863c-eb62d1c6b14d-0_0-1391-2009_20230816195619275.parquet"},
 "isBootstrapBaseFile": {"boolean": false}}], "1970/01/20": [{"filePath": 
{"string": 
"file:/tmp/hudi/output/1970/01/20/05942caf-2d53-4345-845c-5e42abaca797-0_0-1454-2121_20230816195635690.parquet"},
 "isBootstrapBaseFile": {"boolean": false}}]}}, "partitionsToBeDeleted": 
{"array": []}}
{code}
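Both requests above schedule the same 1970/01/01 base file for deletion. As an illustration only, the following Python sketch compares two decoded clean plans and lists the file paths they have in common. The function names are hypothetical and not part of Hudi, and the plans are trimmed to the one field the comparison needs, with the paths copied from the avrocat output above.

```python
FILE_A = ("file:/tmp/hudi/output/1970/01/01/"
          "f66cf644-9e9f-477f-863c-eb62d1c6b14d-0_0-1391-2009_20230816195619275.parquet")
FILE_B = ("file:/tmp/hudi/output/1970/01/20/"
          "05942caf-2d53-4345-845c-5e42abaca797-0_0-1454-2121_20230816195635690.parquet")


def entry(path):
    """Build one file entry in the shape printed by avrocat."""
    return {"filePath": {"string": path}, "isBootstrapBaseFile": {"boolean": False}}


# Trimmed copies of 20230816195855285.clean.requested and
# 20230816195855389.clean.requested (only the field compared here).
PLAN_285 = {"filePathsToBeDeletedPerPartition": {"map": {
    "1970/01/01": [entry(FILE_A)],
}}}
PLAN_389 = {"filePathsToBeDeletedPerPartition": {"map": {
    "1970/01/01": [entry(FILE_A)],
    "1970/01/20": [entry(FILE_B)],
}}}


def paths_by_partition(plan):
    """Map each partition to the set of file paths the plan would delete."""
    per_partition = plan["filePathsToBeDeletedPerPartition"]["map"]
    return {
        partition: {e["filePath"]["string"] for e in entries}
        for partition, entries in per_partition.items()
    }


def overlapping_paths(plan_a, plan_b):
    """Return {partition: sorted paths} scheduled for deletion by both plans."""
    a = paths_by_partition(plan_a)
    b = paths_by_partition(plan_b)
    overlap = {}
    for partition in a.keys() & b.keys():
        shared = a[partition] & b[partition]
        if shared:
            overlap[partition] = sorted(shared)
    return overlap


print(overlapping_paths(PLAN_285, PLAN_389))
```

For these two plans the overlap is the single parquet file under 1970/01/01; the 1970/01/20 file appears only in the second request.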
The following PR exposed the issue:

[https://github.com/apache/hudi/pull/8602]

This does not cause data corruption, but the writer needs to be restarted.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
