nochimow opened a new issue #3431:
URL: https://github.com/apache/hudi/issues/3431
**Describe the problem you faced**
Hi all,
We are currently facing sporadic failures with the error:
`org.apache.hudi.exception.HoodieUpsertException: Failed to upsert for commit time`
I searched for this error and everything I found relates it to multi-writer
scenarios, which is not our case: we use the single-writer Hudi config on
Hudi 0.8 in AWS Glue jobs, and our jobs do not run in parallel for the same
dataset.
The error is not fully reproducible, but it mainly happens on our biggest
datasets. For a while, increasing the number and size of the AWS Glue workers
made it go away, but we have hit it even with 14 G.2X workers. We also do not
think the problem is purely data-size related, because some larger datasets
run fine on the same machine configuration.
Can you help us find what other causes may throw this error?
**Stack trace**
```
ERROR:__main__:WRITE:HUDI:TABLE:S3:ERROR: An error occurred while calling o193.save.
: org.apache.hudi.exception.HoodieUpsertException: Failed to upsert for commit time 20210806184859
	at org.apache.hudi.table.action.commit.AbstractWriteHelper.write(AbstractWriteHelper.java:62)
	at org.apache.hudi.table.action.commit.SparkUpsertCommitActionExecutor.execute(SparkUpsertCommitActionExecutor.java:46)
	at org.apache.hudi.table.HoodieSparkCopyOnWriteTable.upsert(HoodieSparkCopyOnWriteTable.java:94)
	at org.apache.hudi.table.HoodieSparkCopyOnWriteTable.upsert(HoodieSparkCopyOnWriteTable.java:84)
	at org.apache.hudi.client.SparkRDDWriteClient.upsert(SparkRDDWriteClient.java:154)
	at org.apache.hudi.DataSourceUtils.doWriteOperation(DataSourceUtils.java:214)
	at org.apache.hudi.HoodieSparkSqlWriter$.write(HoodieSparkSqlWriter.scala:186)
	at org.apache.hudi.DefaultSource.createRelation(DefaultSource.scala:145)
	at org.apache.spark.sql.execution.datasources.SaveIntoDataSourceCommand.run(SaveIntoDataSourceCommand.scala:45)
	at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70)
	at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68)
	at org.apache.spark.sql.execution.command.ExecutedCommandExec.doExecute(commands.scala:86)
	at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:131)
	at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:127)
	at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:155)
	at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
	at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:152)
	at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:127)
	at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:80)
	at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:80)
	at org.apache.spark.sql.DataFrameWriter$$anonfun$runCommand$1.apply(DataFrameWriter.scala:676)
	at org.apache.spark.sql.DataFrameWriter$$anonfun$runCommand$1.apply(DataFrameWriter.scala:676)
	at org.apache.spark.sql.execution.SQLExecution$$anonfun$withNewExecutionId$1.apply(SQLExecution.scala:78)
	at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:125)
	at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:73)
	at org.apache.spark.sql.DataFrameWriter.runCommand(DataFrameWriter.scala:676)
	at org.apache.spark.sql.DataFrameWriter.saveToV1Source(DataFrameWriter.scala:285)
	at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:271)
	at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:229)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
	at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
	at py4j.Gateway.invoke(Gateway.java:282)
	at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
	at py4j.commands.CallCommand.execute(CallCommand.java:79)
	at py4j.GatewayConnection.run(GatewayConnection.java:238)
	at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.IllegalArgumentException
	at org.apache.hudi.common.util.ValidationUtils.checkArgument(ValidationUtils.java:31)
	at org.apache.hudi.common.table.timeline.HoodieActiveTimeline.transitionState(HoodieActiveTimeline.java:396)
	at org.apache.hudi.common.table.timeline.HoodieActiveTimeline.transitionRequestedToInflight(HoodieActiveTimeline.java:453)
	at org.apache.hudi.table.action.commit.BaseCommitActionExecutor.saveWorkloadProfileMetadataToInflight(BaseCommitActionExecutor.java:114)
	at org.apache.hudi.table.action.commit.BaseSparkCommitActionExecutor.execute(BaseSparkCommitActionExecutor.java:128)
	at org.apache.hudi.table.action.commit.BaseSparkCommitActionExecutor.execute(BaseSparkCommitActionExecutor.java:78)
	at org.apache.hudi.table.action.commit.AbstractWriteHelper.write(AbstractWriteHelper.java:55)
	... 39 more
```
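For what it's worth, the `Caused by` shows `ValidationUtils.checkArgument` failing inside `HoodieActiveTimeline.transitionState`, i.e. a sanity check fails while the commit is moved from REQUESTED to INFLIGHT, which suggests the expected requested-instant file may not be visible under `.hoodie/` on S3 at that moment. One way to inspect this is to list `s3://<base_path>/.hoodie/` (with `aws s3 ls` or boto3) and feed the file names into something like the sketch below. The file-name patterns (`<instant>.commit.requested`, `<instant>.inflight`, `<instant>.commit`) are our assumption about what Hudi 0.8 writes for COW tables, so verify them against your own listing:

```python
# Hypothetical helper to spot timeline instants stuck in a given state.
# Assumption: Hudi 0.8 COW timeline files look like
#   <instant>.commit.requested, <instant>.inflight, <instant>.commit
# (plus clean/rollback actions); verify against a real .hoodie/ listing.

def group_instants(timeline_files):
    """Map each instant time to the set of timeline states seen for it."""
    states = {}
    for name in timeline_files:
        parts = name.split(".", 1)
        if len(parts) != 2 or not parts[0].isdigit():
            continue  # skip hoodie.properties, archived/, etc.
        states.setdefault(parts[0], set()).add(parts[1])
    return states

def stuck_in_requested(timeline_files):
    """Instants that only ever reached a *.requested state."""
    return sorted(
        instant
        for instant, states in group_instants(timeline_files).items()
        if states and all(s.endswith("requested") for s in states)
    )
```

If an instant shows up as stuck in REQUESTED right before a failed run, that would point at a timeline/visibility problem rather than data size.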
**Hoodie configs:**
```python
{
    "hoodie.datasource.write.keygenerator.class": "org.apache.hudi.keygen.ComplexKeyGenerator",
    "hoodie.datasource.write.payload.class": "org.apache.hudi.common.model.DefaultHoodieRecordPayload",
    "hoodie.datasource.hive_sync.partition_extractor_class": "org.apache.hudi.hive.MultiPartKeysValueExtractor",
    "hoodie.table.name": table_name,
    "hoodie.datasource.write.recordkey.field": IDX_COL,
    "hoodie.datasource.write.partitionpath.field": pks,
    "hoodie.datasource.write.hive_style_partitioning": "true",
    "hoodie.datasource.write.precombine.field": tiebreaker,
    "hoodie.datasource.write.operation": operation,
    "hoodie.write.concurrency.mode": "single_writer",
    "hoodie.cleaner.commits.retained": 1,
    "hoodie.fail.on.timeline.archiving": False,
    "hoodie.keep.max.commits": 3,
    "hoodie.keep.min.commits": 2,
    "hoodie.bloom.index.use.caching": True,
    "hoodie.parquet.compression.codec": "snappy"
}
```
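One side note on the config block above: it mixes Python booleans and ints with strings. PySpark's `.options(**hudi_options)` stringifies these for you, but normalizing them explicitly makes the values that actually reach Hudi unambiguous. A minimal sketch (the helper name and the `df`/`base_path` placeholders are ours, not a Hudi API):

```python
def to_spark_options(opts):
    """Render option values as the lowercase strings Spark/Hudi expect."""
    return {
        k: (("true" if v else "false") if isinstance(v, bool) else str(v))
        for k, v in opts.items()
    }

# Usage inside the Glue script (df, base_path, hudi_options are placeholders):
#   df.write.format("hudi") \
#       .options(**to_spark_options(hudi_options)) \
#       .mode("append") \
#       .save(base_path)
```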
**Environment Description**
AWS Glue Job
* Hudi version : 0.8
* Spark version : "Spark 2.4 - Python 3 with improved job times (Glue Version 2.0)"
* Storage (HDFS/S3/GCS..) : S3
* Running on Docker? (yes/no) : no
**Additional context**
Infrastructure: Glue Job + S3
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]