[
https://issues.apache.org/jira/browse/HUDI-6718?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jonathan Vexler updated HUDI-6718:
----------------------------------
Description:
Timeline:
{code:java}
-rw-r--r-- 1 jon wheel 0B Aug 16 19:58 20230816195843234.commit.requested
-rw-r--r-- 1 jon wheel 0B Aug 16 19:58 20230816195845557.commit.requested
-rw-r--r-- 1 jon wheel 2.2K Aug 16 19:58 20230816195843234.inflight
-rw-r--r-- 1 jon wheel 813B Aug 16 19:58 20230816195845557.inflight
-rw-r--r-- 1 jon wheel 2.6K Aug 16 19:58 20230816195845557.commit
-rw-r--r-- 1 jon wheel 2.6K Aug 16 19:58 20230816195843234.commit
-rw-r--r-- 1 jon wheel 1.7K Aug 16 19:58 20230816195855285.clean.requested
-rw-r--r-- 1 jon wheel 1.7K Aug 16 19:58 20230816195855285.clean.inflight
-rw-r--r-- 1 jon wheel 1.8K Aug 16 19:58 20230816195855389.clean.requested
-rw-r--r-- 1 jon wheel 1.7K Aug 16 19:58 20230816195855285.clean
{code}
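To make the listing above easier to read, here is a minimal sketch that groups the clean-related files by instant time. TimelineSummary is a hypothetical helper written for this illustration, not a Hudi class, and it assumes only the suffix conventions visible in the listing (.requested, .inflight, and no state suffix for a completed instant).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical helper (not part of Hudi): groups timeline file names by instant time. */
final class TimelineSummary {

    /** Maps a timeline file name to a coarse state based on its suffix. */
    static String stateOf(String fileName) {
        if (fileName.endsWith(".requested")) {
            return "REQUESTED";
        }
        if (fileName.endsWith(".inflight")) {
            return "INFLIGHT";
        }
        // Assumption taken from the listing: no state suffix means completed.
        return "COMPLETED";
    }

    /** Returns instant time -> states seen, keeping the input order per instant. */
    static Map<String, List<String>> statesByInstant(List<String> fileNames) {
        Map<String, List<String>> result = new TreeMap<>();
        for (String name : fileNames) {
            String instantTime = name.substring(0, name.indexOf('.'));
            result.computeIfAbsent(instantTime, k -> new ArrayList<>()).add(stateOf(name));
        }
        return result;
    }

    public static void main(String[] args) {
        // Clean-related file names copied from the timeline listing.
        List<String> cleanFiles = List.of(
            "20230816195855285.clean.requested",
            "20230816195855285.clean.inflight",
            "20230816195855389.clean.requested",
            "20230816195855285.clean");
        statesByInstant(cleanFiles)
            .forEach((instant, states) -> System.out.println(instant + " -> " + states));
    }
}
```

Read this way, clean instant 20230816195855285 has requested, inflight, and completed files, while 20230816195855389 has only a requested file.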
Clean requests:
{code:java}
avrocat hudi/output/.hoodie/20230816195855285.clean.requested
{"earliestInstantToRetain": {"HoodieActionInstant": {"timestamp":
"20230816195654386", "action": "commit", "state": "COMPLETED"}},
"lastCompletedCommitTimestamp": "20230816195845557", "policy":
"KEEP_LATEST_COMMITS", "filesToBeDeletedPerPartition": {"map": {}}, "version":
{"int": 2}, "filePathsToBeDeletedPerPartition": {"map": {"1970/01/01":
[{"filePath": {"string":
"file:/tmp/hudi/output/1970/01/01/f66cf644-9e9f-477f-863c-eb62d1c6b14d-0_0-1391-2009_20230816195619275.parquet"},
"isBootstrapBaseFile": {"boolean": false}}]}}, "partitionsToBeDeleted":
{"array": []}}
{code}
{code:java}
avrocat hudi/output/.hoodie/20230816195855389.clean.requested
{"earliestInstantToRetain": {"HoodieActionInstant": {"timestamp":
"20230816195704584", "action": "commit", "state": "COMPLETED"}},
"lastCompletedCommitTimestamp": "20230816195845557", "policy":
"KEEP_LATEST_COMMITS", "filesToBeDeletedPerPartition": {"map": {}}, "version":
{"int": 2}, "filePathsToBeDeletedPerPartition": {"map": {"1970/01/01":
[{"filePath": {"string":
"file:/tmp/hudi/output/1970/01/01/f66cf644-9e9f-477f-863c-eb62d1c6b14d-0_0-1391-2009_20230816195619275.parquet"},
"isBootstrapBaseFile": {"boolean": false}}], "1970/01/20": [{"filePath":
{"string":
"file:/tmp/hudi/output/1970/01/20/05942caf-2d53-4345-845c-5e42abaca797-0_0-1454-2121_20230816195635690.parquet"},
"isBootstrapBaseFile": {"boolean": false}}]}}, "partitionsToBeDeleted":
{"array": []}}
{code}
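Both clean requests list the same base file in partition 1970/01/01. The sketch below makes that overlap explicit by intersecting the two sets of planned deletions. CleanPlanOverlap is a hypothetical helper for illustration only; the paths are copied from the two requests above, shortened to be relative to file:/tmp/hudi/output.

```java
import java.util.LinkedHashSet;
import java.util.Set;

/** Hypothetical helper (not part of Hudi): finds files that two clean plans both delete. */
final class CleanPlanOverlap {

    /** Returns the file paths present in both plans. */
    static Set<String> overlap(Set<String> firstPlan, Set<String> secondPlan) {
        Set<String> common = new LinkedHashSet<>(firstPlan);
        common.retainAll(secondPlan);
        return common;
    }

    public static void main(String[] args) {
        String fileIn0101 =
            "1970/01/01/f66cf644-9e9f-477f-863c-eb62d1c6b14d-0_0-1391-2009_20230816195619275.parquet";
        String fileIn0120 =
            "1970/01/20/05942caf-2d53-4345-845c-5e42abaca797-0_0-1454-2121_20230816195635690.parquet";

        // Planned deletions from 20230816195855285.clean.requested.
        Set<String> plan285 = Set.of(fileIn0101);
        // Planned deletions from 20230816195855389.clean.requested.
        Set<String> plan389 = Set.of(fileIn0101, fileIn0120);

        System.out.println("Planned by both requests: " + overlap(plan285, plan389));
    }
}
```

The only file in the intersection is the 1970/01/01 base file, so the second request plans a deletion that the first request already covers.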
Console output:
Note that the transaction starts twice for the same instant.
{code:java}
424775 [pool-75-thread-1] INFO org.apache.hudi.table.action.clean.CleanActionExecutor [] - Finishing previously unfinished cleaner instant=[==>20230816195855285__clean__INFLIGHT__20230816195855525]
424775 [pool-75-thread-1] INFO org.apache.hudi.table.action.clean.CleanActionExecutor [] - Using cleanerParallelism: 1
424779 [pool-91-thread-1] INFO org.apache.hudi.common.table.timeline.HoodieActiveTimeline [] - Loaded instants upto : Option{val=[==>20230816195855389__clean__REQUESTED__20230816195855634]}
424779 [pool-91-thread-1] INFO org.apache.hudi.client.transaction.TransactionManager [] - Transaction starting for Option{val=[==>20230816195855285__clean__INFLIGHT]} with latest completed transaction instant Optional.empty
424779 [pool-91-thread-1] INFO org.apache.hudi.client.transaction.lock.LockManager [] - LockProvider org.apache.hudi.client.transaction.lock.InProcessLockProvider
424779 [pool-91-thread-1] INFO org.apache.hudi.client.transaction.lock.InProcessLockProvider [] - Base Path file:/tmp/hudi/output, Lock Instance java.util.concurrent.locks.ReentrantReadWriteLock@78f60539[Write locks = 0, Read locks = 0], Thread pool-91-thread-1, In-process lock state ACQUIRING
424779 [pool-91-thread-1] INFO org.apache.hudi.client.transaction.lock.InProcessLockProvider [] - Base Path file:/tmp/hudi/output, Lock Instance java.util.concurrent.locks.ReentrantReadWriteLock@78f60539[Write locks = 1, Read locks = 0], Thread pool-91-thread-1, In-process lock state ACQUIRED
424779 [pool-91-thread-1] INFO org.apache.hudi.client.transaction.TransactionManager [] - Transaction started for Option{val=[==>20230816195855285__clean__INFLIGHT]} with latest completed transaction instant Optional.empty
{code}
The following PR exposed the issue:
[https://github.com/apache/hudi/pull/8602]
This does not cause data corruption, but the writer needs to be restarted.
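The console output shows two executor threads (pool-75-thread-1 and pool-91-thread-1) working on the same inflight clean instant. As a generic illustration only, the sketch below shows the check-then-act pattern under a ReentrantReadWriteLock, where several threads try to complete one instant and only the first attempt succeeds. This is not Hudi code and not the fix for this issue; SingleCompletionGuard, completeOnce, and countSuccessfulCompletions are hypothetical names.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical guard (not part of Hudi): lets an instant be completed at most once. */
final class SingleCompletionGuard {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    // Only accessed while holding the write lock.
    private final Set<String> completedInstants = new HashSet<>();

    /** Returns true only for the first caller that completes the given instant. */
    boolean completeOnce(String instantTime) {
        lock.writeLock().lock();
        try {
            // The check and the state change happen under the same lock.
            return completedInstants.add(instantTime);
        } finally {
            lock.writeLock().unlock();
        }
    }

    /** Runs the given number of threads against one instant and counts the successes. */
    static int countSuccessfulCompletions(String instantTime, int threads) {
        SingleCompletionGuard guard = new SingleCompletionGuard();
        AtomicInteger successes = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int i = 0; i < threads; i++) {
            pool.submit(() -> {
                if (guard.completeOnce(instantTime)) {
                    successes.incrementAndGet();
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("worker threads did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return successes.get();
    }

    public static void main(String[] args) {
        // Two threads race on the clean instant from the log above.
        System.out.println(countSuccessfulCompletions("20230816195855285", 2));
    }
}
```

With a guard of this shape, the losing thread sees that the instant is already completed and can skip it instead of attempting a second completion.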
> Concurrent cleaner commit same instance conflict
> -------------------------------------------------
>
> Key: HUDI-6718
> URL: https://issues.apache.org/jira/browse/HUDI-6718
> Project: Apache Hudi
> Issue Type: Bug
> Components: cleaning, multi-writer, table-service
> Reporter: Jonathan Vexler
> Assignee: Jonathan Vexler
> Priority: Major
>
--
This message was sent by Atlassian Jira
(v8.20.10#820010)