[ 
https://issues.apache.org/jira/browse/HUDI-5028?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

sivabalan narayanan updated HUDI-5028:
--------------------------------------
    Sprint: 2022/10/18, 2022/11/01, 2022/11/29  (was: 2022/10/18, 2022/11/01)

> Handle conflicting writes for multi-writers with OCC enabled
> ------------------------------------------------------------
>
>                 Key: HUDI-5028
>                 URL: https://issues.apache.org/jira/browse/HUDI-5028
>             Project: Apache Hudi
>          Issue Type: Task
>          Components: deltastreamer, writer-core
>            Reporter: Sagar Sumit
>            Assignee: sivabalan narayanan
>            Priority: Major
>
> Even when OCC is enabled and the lock provider configs are set, one of the 
> writers occasionally aborts (even with retries) because another writer has 
> acquired a lock for a long-running transaction. We see the following 
> message and stack trace:
> {code:java}
> Found conflicting writes between first operation = {actionType=deltacommit, 
> instantTime=20221004012622654, actionState=INFLIGHT'}, second operation = 
> {actionType=deltacommit, instantTime=20221004012430279, 
> actionState=COMPLETED'} , intersecting file ids 
> [f5aacff1-8f27-48af-89e9-8618a3cbe700-0]{code}
> {code:java}
> java.util.concurrent.ExecutionException: org.apache.hudi.exception.HoodieException: java.util.ConcurrentModificationException: Cannot resolve conflicts for overlapping writes
>   at java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357)
>   at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1908)
>   at org.apache.hudi.async.HoodieAsyncService.waitForShutdown(HoodieAsyncService.java:103)
>   at org.apache.hudi.utilities.deltastreamer.internal.OnehouseDeltaStreamer.lambda$sync$1(OnehouseDeltaStreamer.java:145)
>   at org.apache.hudi.common.util.Option.ifPresent(Option.java:97)
>   at org.apache.hudi.utilities.deltastreamer.internal.OnehouseDeltaStreamer.sync(OnehouseDeltaStreamer.java:142)
>   at org.apache.hudi.utilities.deltastreamer.internal.OnehouseDeltaStreamer.main(OnehouseDeltaStreamer.java:196)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
>   at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:955)
>   at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
>   at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
>   at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
>   at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1043)
>   at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1052)
>   at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
> Caused by: org.apache.hudi.exception.HoodieException: java.util.ConcurrentModificationException: Cannot resolve conflicts for overlapping writes
>   at org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer$DeltaSyncService.lambda$startService$0(HoodieDeltaStreamer.java:741)
>   at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1604)
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:750)
> Caused by: org.apache.hudi.exception.HoodieWriteConflictException: java.util.ConcurrentModificationException: Cannot resolve conflicts for overlapping writes
>   at org.apache.hudi.client.transaction.SimpleConcurrentFileWritesConflictResolutionStrategy.resolveConflict(SimpleConcurrentFileWritesConflictResolutionStrategy.java:102)
>   at org.apache.hudi.client.utils.TransactionUtils.lambda$resolveWriteConflictIfAny$0(TransactionUtils.java:85)
>   at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1384)
>   at java.util.stream.Streams$ConcatSpliterator.forEachRemaining(Streams.java:742)
>   at java.util.stream.Streams$ConcatSpliterator.forEachRemaining(Streams.java:742)
>   at java.util.stream.ReferencePipeline$Head.forEach(ReferencePipeline.java:647)
>   at org.apache.hudi.client.utils.TransactionUtils.resolveWriteConflictIfAny(TransactionUtils.java:79)
>   at org.apache.hudi.client.SparkRDDWriteClient.preCommit(SparkRDDWriteClient.java:463)
>   at org.apache.hudi.client.BaseHoodieWriteClient.commitStats(BaseHoodieWriteClient.java:243)
>   at org.apache.hudi.client.SparkRDDWriteClient.commit(SparkRDDWriteClient.java:120)
>   at org.apache.hudi.utilities.deltastreamer.DeltaSync.writeToSink(DeltaSync.java:697)
>   at org.apache.hudi.utilities.deltastreamer.DeltaSync.syncOnce(DeltaSync.java:358)
>   at org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer$DeltaSyncService.lambda$startService$0(HoodieDeltaStreamer.java:699)
>   ... 4 more {code}
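
The log message above reports a conflict because the in-flight and the completed deltacommit touched the same file id. As a rough illustration of that kind of check, here is a minimal sketch. It is not Hudi's SimpleConcurrentFileWritesConflictResolutionStrategy; the class WriteConflictSketch and its methods intersectingFileIds and failOnOverlap are hypothetical names, and the only assumption taken from the report is that a non-empty intersection of file ids leads to a ConcurrentModificationException.

```java
import java.util.ConcurrentModificationException;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical sketch of a file-id overlap check; not Hudi's actual implementation. */
class WriteConflictSketch {

    /** Returns the file ids written by both operations, sorted for stable output. */
    static Set<String> intersectingFileIds(Set<String> inflightFileIds, Set<String> completedFileIds) {
        Set<String> overlap = new TreeSet<>(inflightFileIds);
        overlap.retainAll(completedFileIds); // keep only ids present in both sets
        return overlap;
    }

    /** Throws if the two operations touched any common file id; otherwise returns normally. */
    static void failOnOverlap(Set<String> inflightFileIds, Set<String> completedFileIds) {
        Set<String> overlap = intersectingFileIds(inflightFileIds, completedFileIds);
        if (!overlap.isEmpty()) {
            throw new ConcurrentModificationException(
                "Cannot resolve conflicts for overlapping writes, intersecting file ids " + overlap);
        }
    }
}
```

Under this sketch, two writers that touch disjoint file ids both commit, while a writer whose file ids overlap with an already completed commit is rejected when it tries to commit, which matches the abort described in the report.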



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
