nsivabalan commented on issue #6711:
URL: https://github.com/apache/hudi/issues/6711#issuecomment-1400757064
I tried multi-writer from two different spark-shells, and one of them fails while writing to Hudi.
```
scala> df2.write.format("hudi").
| options(getQuickstartWriteConfigs).
| option(PRECOMBINE_FIELD_OPT_KEY, "ts").
| option(RECORDKEY_FIELD_OPT_KEY, "uuid").
| option(PARTITIONPATH_FIELD_OPT_KEY, "partitionpath").
| option(TABLE_NAME, tableName).
| option("hoodie.write.concurrency.mode","optimistic_concurrency_control").
| option("hoodie.cleaner.policy.failed.writes","LAZY").
| option("hoodie.write.lock.provider","org.apache.hudi.client.transaction.lock.ZookeeperBasedLockProvider").
| option("hoodie.write.lock.zookeeper.url","localhost:2181").
| option("hoodie.write.lock.zookeeper.port","2181").
| option("hoodie.write.lock.zookeeper.lock_key","locks").
| option("hoodie.write.lock.zookeeper.base_path","/tmp/locks/.lock").
| mode(Append).
| save(basePath)
warning: there was one deprecation warning; re-run with -deprecation for details
[Stage 14:> (0 + 3) / 3]
# WARNING: Unable to attach Serviceability Agent. Unable to attach even with module exceptions: [org.apache.hudi.org.openjdk.jol.vm.sa.SASupportException: Sense failed., org.apache.hudi.org.openjdk.jol.vm.sa.SASupportException: Sense failed., org.apache.hudi.org.openjdk.jol.vm.sa.SASupportException: Sense failed.]
23/01/23 10:00:20 WARN MetricsConfig: Cannot locate configuration: tried hadoop-metrics2-hbase.properties,hadoop-metrics2.properties
org.apache.hudi.exception.HoodieWriteConflictException: java.util.ConcurrentModificationException: Cannot resolve conflicts for overlapping writes
  at org.apache.hudi.client.transaction.SimpleConcurrentFileWritesConflictResolutionStrategy.resolveConflict(SimpleConcurrentFileWritesConflictResolutionStrategy.java:102)
  at org.apache.hudi.client.utils.TransactionUtils.lambda$resolveWriteConflictIfAny$0(TransactionUtils.java:85)
  at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1382)
  at java.util.stream.Streams$ConcatSpliterator.forEachRemaining(Streams.java:742)
  at java.util.stream.Streams$ConcatSpliterator.forEachRemaining(Streams.java:742)
  at java.util.stream.ReferencePipeline$Head.forEach(ReferencePipeline.java:580)
  at org.apache.hudi.client.utils.TransactionUtils.resolveWriteConflictIfAny(TransactionUtils.java:79)
  at org.apache.hudi.client.SparkRDDWriteClient.preCommit(SparkRDDWriteClient.java:491)
  at org.apache.hudi.client.BaseHoodieWriteClient.commitStats(BaseHoodieWriteClient.java:234)
  at org.apache.hudi.client.SparkRDDWriteClient.commit(SparkRDDWriteClient.java:126)
  at org.apache.hudi.HoodieSparkSqlWriter$.commitAndPerformPostOperations(HoodieSparkSqlWriter.scala:698)
  at org.apache.hudi.HoodieSparkSqlWriter$.write(HoodieSparkSqlWriter.scala:343)
  at org.apache.hudi.DefaultSource.createRelation(DefaultSource.scala:145)
  at org.apache.spark.sql.execution.datasources.SaveIntoDataSourceCommand.run(SaveIntoDataSourceCommand.scala:45)
  at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70)
  at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68)
  at org.apache.spark.sql.execution.command.ExecutedCommandExec.doExecute(commands.scala:86)
  at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:131)
  at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:127)
  at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:155)
  at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
  at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:152)
  at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:127)
  at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:83)
  at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:81)
  at org.apache.spark.sql.DataFrameWriter$$anonfun$runCommand$1.apply(DataFrameWriter.scala:696)
  at org.apache.spark.sql.DataFrameWriter$$anonfun$runCommand$1.apply(DataFrameWriter.scala:696)
  at org.apache.spark.sql.execution.SQLExecution$$anonfun$withNewExecutionId$1.apply(SQLExecution.scala:80)
  at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:127)
  at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:75)
  at org.apache.spark.sql.DataFrameWriter.runCommand(DataFrameWriter.scala:696)
  at org.apache.spark.sql.DataFrameWriter.saveToV1Source(DataFrameWriter.scala:305)
  at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:291)
  at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:249)
  ... 75 elided
Caused by: java.util.ConcurrentModificationException: Cannot resolve conflicts for overlapping writes
  ... 109 more
scala>
```
The write to Hudi fails and the next command prompt is shown.
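For what it's worth, a `HoodieWriteConflictException` under optimistic concurrency control is the expected outcome when two writers touch overlapping file groups: one commit wins, the other is aborted and the table stays consistent. The usual remedy is for the application to retry the losing write. A minimal sketch of that pattern in plain Scala (the `retryWrite` helper and the `maxAttempts` value are illustrative, not a Hudi API):

```scala
import scala.util.{Failure, Success, Try}

// Hypothetical helper: re-runs a write action that may fail with a
// write-conflict exception when another writer committed first.
def retryWrite(maxAttempts: Int)(write: () => Unit): Unit = {
  var attempt = 1
  var done = false
  while (!done) {
    Try(write()) match {
      case Success(_) => done = true
      case Failure(e) if attempt < maxAttempts =>
        // The aborted commit leaves the table intact, so simply
        // re-issuing the write against the latest state is safe.
        println(s"Attempt $attempt failed (${e.getMessage}); retrying")
        attempt += 1
      case Failure(e) => throw e
    }
  }
}

// In the spark-shell, the failing write above could be wrapped as:
// retryWrite(maxAttempts = 3) { () =>
//   df2.write.format("hudi"). /* same options as above */ .mode(Append).save(basePath)
// }
```

Narrowing the retry to conflict exceptions only (rather than all `Failure`s) would be a reasonable refinement in real code.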
Here is an excerpt from my other shell, which succeeded:
```
scala> df2.write.format("hudi").
| options(getQuickstartWriteConfigs).
| option(PRECOMBINE_FIELD_OPT_KEY, "ts").
| option(RECORDKEY_FIELD_OPT_KEY, "uuid").
| option(PARTITIONPATH_FIELD_OPT_KEY, "partitionpath").
| option(TABLE_NAME, tableName).
| option("hoodie.write.concurrency.mode","optimistic_concurrency_control").
| option("hoodie.cleaner.policy.failed.writes","LAZY").
| option("hoodie.write.lock.provider","org.apache.hudi.client.transaction.lock.ZookeeperBasedLockProvider").
| option("hoodie.write.lock.zookeeper.url","localhost:2181").
| option("hoodie.write.lock.zookeeper.port","2181").
| option("hoodie.write.lock.zookeeper.lock_key","locks").
| option("hoodie.write.lock.zookeeper.base_path","/tmp/locks/.lock").
| mode(Append).
| save(basePath)
warning: one deprecation; for details, enable `:setting -deprecation' or `:replay -deprecation'
# WARNING: Unable to attach Serviceability Agent. Unable to attach even with module exceptions: [org.apache.hudi.org.openjdk.jol.vm.sa.SASupportException: Sense failed., org.apache.hudi.org.openjdk.jol.vm.sa.SASupportException: Sense failed., org.apache.hudi.org.openjdk.jol.vm.sa.SASupportException: Sense failed.]
23/01/23 10:00:19 WARN MetricsConfig: Cannot locate configuration: tried hadoop-metrics2-hbase.properties,hadoop-metrics2.properties
scala>
```
If you can provide us with a reproducible script, that would be nice. As of now, it is not reproducible on our end.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]