maikouliujian commented on issue #7653:
URL: https://github.com/apache/hudi/issues/7653#issuecomment-1415695253
> I tried multi-writers from two different spark-shells, and one of them fails while writing to Hudi.
>
> ```
>
>
> scala> df2.write.format("hudi").
> | options(getQuickstartWriteConfigs).
> | option(PRECOMBINE_FIELD_OPT_KEY, "ts").
> | option(RECORDKEY_FIELD_OPT_KEY, "uuid").
> | option(PARTITIONPATH_FIELD_OPT_KEY, "partitionpath").
> | option(TABLE_NAME, tableName).
> | option("hoodie.write.concurrency.mode","optimistic_concurrency_control").
> | option("hoodie.cleaner.policy.failed.writes","LAZY").
> | option("hoodie.write.lock.provider","org.apache.hudi.client.transaction.lock.ZookeeperBasedLockProvider").
> | option("hoodie.write.lock.zookeeper.url","localhost:2181").
> | option("hoodie.write.lock.zookeeper.port","2181").
> | option("hoodie.write.lock.zookeeper.lock_key","locks").
> | option("hoodie.write.lock.zookeeper.base_path","/tmp/locks/.lock").
> | mode(Append).
> | save(basePath)
> warning: there was one deprecation warning; re-run with -deprecation for details
> [Stage 14:> (0 + 3) / 3]# WARNING: Unable to attach Serviceability Agent. Unable to attach even with module exceptions: [org.apache.hudi.org.openjdk.jol.vm.sa.SASupportException: Sense failed., org.apache.hudi.org.openjdk.jol.vm.sa.SASupportException: Sense failed., org.apache.hudi.org.openjdk.jol.vm.sa.SASupportException: Sense failed.]
> 23/01/23 10:00:20 WARN MetricsConfig: Cannot locate configuration: tried hadoop-metrics2-hbase.properties,hadoop-metrics2.properties
> org.apache.hudi.exception.HoodieWriteConflictException: java.util.ConcurrentModificationException: Cannot resolve conflicts for overlapping writes
> at org.apache.hudi.client.transaction.SimpleConcurrentFileWritesConflictResolutionStrategy.resolveConflict(SimpleConcurrentFileWritesConflictResolutionStrategy.java:102)
> at org.apache.hudi.client.utils.TransactionUtils.lambda$resolveWriteConflictIfAny$0(TransactionUtils.java:85)
> at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1382)
> at java.util.stream.Streams$ConcatSpliterator.forEachRemaining(Streams.java:742)
> at java.util.stream.Streams$ConcatSpliterator.forEachRemaining(Streams.java:742)
> at java.util.stream.ReferencePipeline$Head.forEach(ReferencePipeline.java:580)
> at org.apache.hudi.client.utils.TransactionUtils.resolveWriteConflictIfAny(TransactionUtils.java:79)
> at org.apache.hudi.client.SparkRDDWriteClient.preCommit(SparkRDDWriteClient.java:491)
> at org.apache.hudi.client.BaseHoodieWriteClient.commitStats(BaseHoodieWriteClient.java:234)
> at org.apache.hudi.client.SparkRDDWriteClient.commit(SparkRDDWriteClient.java:126)
> at org.apache.hudi.HoodieSparkSqlWriter$.commitAndPerformPostOperations(HoodieSparkSqlWriter.scala:698)
> at org.apache.hudi.HoodieSparkSqlWriter$.write(HoodieSparkSqlWriter.scala:343)
> at org.apache.hudi.DefaultSource.createRelation(DefaultSource.scala:145)
> at org.apache.spark.sql.execution.datasources.SaveIntoDataSourceCommand.run(SaveIntoDataSourceCommand.scala:45)
> at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70)
> at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68)
> at org.apache.spark.sql.execution.command.ExecutedCommandExec.doExecute(commands.scala:86)
> at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:131)
> at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:127)
> at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:155)
> at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
> at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:152)
> at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:127)
> at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:83)
> at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:81)
> at org.apache.spark.sql.DataFrameWriter$$anonfun$runCommand$1.apply(DataFrameWriter.scala:696)
> at org.apache.spark.sql.DataFrameWriter$$anonfun$runCommand$1.apply(DataFrameWriter.scala:696)
> at org.apache.spark.sql.execution.SQLExecution$$anonfun$withNewExecutionId$1.apply(SQLExecution.scala:80)
> at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:127)
> at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:75)
> at org.apache.spark.sql.DataFrameWriter.runCommand(DataFrameWriter.scala:696)
> at org.apache.spark.sql.DataFrameWriter.saveToV1Source(DataFrameWriter.scala:305)
> at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:291)
> at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:249)
> ... 75 elided
> Caused by: java.util.ConcurrentModificationException: Cannot resolve conflicts for overlapping writes
> ... 109 more
>
> scala>
> ```
>
> The write to Hudi fails, and the next command prompt is shown.
>
> Excerpt from my other shell, which succeeded:
>
> ```
> scala> df2.write.format("hudi").
> | options(getQuickstartWriteConfigs).
> | option(PRECOMBINE_FIELD_OPT_KEY, "ts").
> | option(RECORDKEY_FIELD_OPT_KEY, "uuid").
> | option(PARTITIONPATH_FIELD_OPT_KEY, "partitionpath").
> | option(TABLE_NAME, tableName).
> | option("hoodie.write.concurrency.mode","optimistic_concurrency_control").
> | option("hoodie.cleaner.policy.failed.writes","LAZY").
> | option("hoodie.write.lock.provider","org.apache.hudi.client.transaction.lock.ZookeeperBasedLockProvider").
> | option("hoodie.write.lock.zookeeper.url","localhost:2181").
> | option("hoodie.write.lock.zookeeper.port","2181").
> | option("hoodie.write.lock.zookeeper.lock_key","locks").
> | option("hoodie.write.lock.zookeeper.base_path","/tmp/locks/.lock").
> | mode(Append).
> | save(basePath)
> warning: one deprecation; for details, enable `:setting -deprecation' or `:replay -deprecation'
> # WARNING: Unable to attach Serviceability Agent. Unable to attach even with module exceptions: [org.apache.hudi.org.openjdk.jol.vm.sa.SASupportException: Sense failed., org.apache.hudi.org.openjdk.jol.vm.sa.SASupportException: Sense failed., org.apache.hudi.org.openjdk.jol.vm.sa.SASupportException: Sense failed.]
> 23/01/23 10:00:19 WARN MetricsConfig: Cannot locate configuration: tried hadoop-metrics2-hbase.properties,hadoop-metrics2.properties
>
> scala>
> ```
>
> If you could provide us with a reproducible script, that would be nice. As of now, it's not reproducible from our end.
Does Hudi 0.11.0 not support multiple writers via the Spark datasource?
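
For reference, my understanding is that under optimistic concurrency control a writer that loses the conflict check is expected to fail fast, and the application is responsible for retrying the write. Below is a minimal retry sketch, assuming the same spark-shell session as the quoted snippets (`df2`, `tableName`, and `basePath` from the quickstart); the `writeWithRetry` helper and the retry/backoff parameters are hypothetical, not a Hudi API:

```scala
import org.apache.hudi.QuickstartUtils._
import org.apache.hudi.DataSourceWriteOptions._
import org.apache.hudi.config.HoodieWriteConfig._
import org.apache.hudi.exception.HoodieWriteConflictException
import org.apache.spark.sql.{DataFrame, SaveMode}

// Hypothetical helper: retry the whole write when a concurrent overlapping
// write wins the conflict check. The options mirror the quoted shells.
def writeWithRetry(df: DataFrame, tableName: String, basePath: String,
                   maxAttempts: Int = 3): Unit = {
  var attempt = 1
  var done = false
  while (!done) {
    try {
      df.write.format("hudi").
        options(getQuickstartWriteConfigs).
        option(PRECOMBINE_FIELD_OPT_KEY, "ts").
        option(RECORDKEY_FIELD_OPT_KEY, "uuid").
        option(PARTITIONPATH_FIELD_OPT_KEY, "partitionpath").
        option(TABLE_NAME, tableName).
        option("hoodie.write.concurrency.mode", "optimistic_concurrency_control").
        option("hoodie.cleaner.policy.failed.writes", "LAZY").
        option("hoodie.write.lock.provider", "org.apache.hudi.client.transaction.lock.ZookeeperBasedLockProvider").
        option("hoodie.write.lock.zookeeper.url", "localhost:2181").
        option("hoodie.write.lock.zookeeper.port", "2181").
        option("hoodie.write.lock.zookeeper.lock_key", "locks").
        option("hoodie.write.lock.zookeeper.base_path", "/tmp/locks/.lock").
        mode(SaveMode.Append).
        save(basePath)
      done = true
    } catch {
      case _: HoodieWriteConflictException if attempt < maxAttempts =>
        // Both writers touched the same file groups; back off and try again.
        Thread.sleep(1000L * attempt)
        attempt += 1
    }
  }
}

writeWithRetry(df2, tableName, basePath)
```

If that reading is right, the `HoodieWriteConflictException` in the failing shell would be expected behavior for two writers updating the same file groups, rather than a sign that multi-writer is unsupported.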