Jason-liujc commented on issue #9512:
URL: https://github.com/apache/hudi/issues/9512#issuecomment-1721575819
Thanks!
We are using some of the retry parameters to see if we can allow all these
writers to go through eventually with optimistic retries.
These are the Hudi options related to concurrency that we have set:
```
// 5 minutes
"hoodie.write.lock.client.wait_time_ms_between_retry" -> "300000",
// 10 minutes
"hoodie.write.lock.max_wait_time_ms_between_retry" -> "600000",
// 40 retries
"hoodie.write.lock.num_retries" -> "40",
// 3 minutes
"hoodie.write.lock.wait_time_ms" -> "180000",
// 3 minutes
"hoodie.write.lock.wait_time_ms_between_retry" -> "180000"
```
I'm trying to run this in the following environment:
```
Environment Description
Hudi version : 0.13.0 (EMR 6.11)
Spark version : 3.3.1
Hive version : 3.1.3
Hadoop version : 3.3.3
Storage (HDFS/S3/GCS..) : S3
Running on Docker? (yes/no) : No
```
However, when I spin up multiple writers (7+), I still see some failures
after 1 hour.
The errors are all the same:
```
java.util.ConcurrentModificationException: Cannot resolve conflicts for overlapping writes
    at org.apache.hudi.client.transaction.SimpleConcurrentFileWritesConflictResolutionStrategy.resolveConflict(SimpleConcurrentFileWritesConflictResolutionStrategy.java:108)
    at org.apache.hudi.client.utils.TransactionUtils.lambda$resolveWriteConflictIfAny$0(TransactionUtils.java:85)
    at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1384)
    at java.util.stream.Streams$ConcatSpliterator.forEachRemaining(Streams.java:742)
    at java.util.stream.Streams$ConcatSpliterator.forEachRemaining(Streams.java:742)
    at java.util.stream.ReferencePipeline$Head.forEach(ReferencePipeline.java:647)
```
Are there additional configs I need to use? Based on the retry
parameters I have set, I'd expect it to run for at least 4 hours.
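As a rough sanity check on that expectation, the retry window can be estimated from the values in the options above. This is only a minimal sketch: it assumes the total wait is simply the number of retries multiplied by the wait between retries, which may not match how Hudi actually combines these settings. The `RetryBudget` object and `budgetMinutes` helper are hypothetical names used for illustration.

```scala
// Naive estimate of the lock-retry window, using the values from the options above.
// Assumption: total wait = retries x wait per retry. Hudi's real back-off
// logic may combine these settings differently.
object RetryBudget {
  val numRetries: Int = 40                   // hoodie.write.lock.num_retries
  val waitBetweenRetryMs: Long = 180000L     // hoodie.write.lock.wait_time_ms_between_retry
  val maxWaitBetweenRetryMs: Long = 600000L  // hoodie.write.lock.max_wait_time_ms_between_retry

  // Total wait in whole minutes for a given retry count and per-retry wait.
  def budgetMinutes(retries: Int, waitMs: Long): Long =
    retries * waitMs / 60000L

  def main(args: Array[String]): Unit = {
    println(s"With base wait: ${budgetMinutes(numRetries, waitBetweenRetryMs)} min")
    println(s"With max wait:  ${budgetMinutes(numRetries, maxWaitBetweenRetryMs)} min")
  }
}
```

Under this naive model the window falls somewhere between 120 and 400 minutes, depending on which wait value applies to each retry.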