nochimow opened a new issue #3431:
URL: https://github.com/apache/hudi/issues/3431


   **Describe the problem you faced**
   Hi all,
   We are sporadically hitting the error `org.apache.hudi.exception.HoodieUpsertException: Failed to upsert for commit time`.
   
   Everything I found about this error relates it to multi-writer scenarios, but that is not our case: we use the single-writer Hudi config, running Hudi 0.8 in AWS Glue jobs, and our jobs never run in parallel against the same dataset.
   
   The error is not fully reproducible, but it mainly happens on my biggest datasets. For a while, increasing the number and size of the AWS Glue workers made it go away, but we have also hit it with 14 G.2X workers. We don't think the problem is purely data-size related, since some larger datasets ran fine on the same machine configuration.
   
   Can you help me find what other causes may throw this error?
   
   **Stack trace**
   ```
   ERROR:__main__:WRITE:HUDI:TABLE:S3:ERROR: An error occurred while calling o193.save.
   org.apache.hudi.exception.HoodieUpsertException: Failed to upsert for commit time 20210806184859
   	at org.apache.hudi.table.action.commit.AbstractWriteHelper.write(AbstractWriteHelper.java:62)
   	at org.apache.hudi.table.action.commit.SparkUpsertCommitActionExecutor.execute(SparkUpsertCommitActionExecutor.java:46)
   	at org.apache.hudi.table.HoodieSparkCopyOnWriteTable.upsert(HoodieSparkCopyOnWriteTable.java:94)
   	at org.apache.hudi.table.HoodieSparkCopyOnWriteTable.upsert(HoodieSparkCopyOnWriteTable.java:84)
   	at org.apache.hudi.client.SparkRDDWriteClient.upsert(SparkRDDWriteClient.java:154)
   	at org.apache.hudi.DataSourceUtils.doWriteOperation(DataSourceUtils.java:214)
   	at org.apache.hudi.HoodieSparkSqlWriter$.write(HoodieSparkSqlWriter.scala:186)
   	at org.apache.hudi.DefaultSource.createRelation(DefaultSource.scala:145)
   	at org.apache.spark.sql.execution.datasources.SaveIntoDataSourceCommand.run(SaveIntoDataSourceCommand.scala:45)
   	at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70)
   	at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68)
   	at org.apache.spark.sql.execution.command.ExecutedCommandExec.doExecute(commands.scala:86)
   	at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:131)
   	at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:127)
   	at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:155)
   	at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
   	at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:152)
   	at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:127)
   	at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:80)
   	at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:80)
   	at org.apache.spark.sql.DataFrameWriter$$anonfun$runCommand$1.apply(DataFrameWriter.scala:676)
   	at org.apache.spark.sql.DataFrameWriter$$anonfun$runCommand$1.apply(DataFrameWriter.scala:676)
   	at org.apache.spark.sql.execution.SQLExecution$$anonfun$withNewExecutionId$1.apply(SQLExecution.scala:78)
   	at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:125)
   	at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:73)
   	at org.apache.spark.sql.DataFrameWriter.runCommand(DataFrameWriter.scala:676)
   	at org.apache.spark.sql.DataFrameWriter.saveToV1Source(DataFrameWriter.scala:285)
   	at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:271)
   	at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:229)
   	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
   	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   	at java.lang.reflect.Method.invoke(Method.java:498)
   	at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
   	at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
   	at py4j.Gateway.invoke(Gateway.java:282)
   	at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
   	at py4j.commands.CallCommand.execute(CallCommand.java:79)
   	at py4j.GatewayConnection.run(GatewayConnection.java:238)
   	at java.lang.Thread.run(Thread.java:748)
   Caused by: java.lang.IllegalArgumentException
   	at org.apache.hudi.common.util.ValidationUtils.checkArgument(ValidationUtils.java:31)
   	at org.apache.hudi.common.table.timeline.HoodieActiveTimeline.transitionState(HoodieActiveTimeline.java:396)
   	at org.apache.hudi.common.table.timeline.HoodieActiveTimeline.transitionRequestedToInflight(HoodieActiveTimeline.java:453)
   	at org.apache.hudi.table.action.commit.BaseCommitActionExecutor.saveWorkloadProfileMetadataToInflight(BaseCommitActionExecutor.java:114)
   	at org.apache.hudi.table.action.commit.BaseSparkCommitActionExecutor.execute(BaseSparkCommitActionExecutor.java:128)
   	at org.apache.hudi.table.action.commit.BaseSparkCommitActionExecutor.execute(BaseSparkCommitActionExecutor.java:78)
   	at org.apache.hudi.table.action.commit.AbstractWriteHelper.write(AbstractWriteHelper.java:55)
   	... 39 more
   ```
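
   The `Caused by` section shows the real failure: `ValidationUtils.checkArgument` throws `IllegalArgumentException` inside `HoodieActiveTimeline.transitionState` while the writer tries to move the new commit from REQUESTED to INFLIGHT. As a rough illustration only (this is a hypothetical, simplified model, not Hudi's actual code or exact file naming), the precondition that fails is that the instant's `*.requested` marker must still exist on the timeline at transition time:

   ```python
   def transition_requested_to_inflight(timeline_files, instant):
       """Hypothetical sketch: move `instant` from REQUESTED to INFLIGHT.

       Mirrors the checkArgument in HoodieActiveTimeline.transitionState:
       the requested marker file must exist before the transition.
       """
       requested = f"{instant}.commit.requested"  # assumed, simplified naming
       if requested not in timeline_files:
           # This condition is what surfaces as IllegalArgumentException
           # via ValidationUtils.checkArgument in the trace above.
           raise ValueError(f"requested instant {requested} not found on timeline")
       timeline_files.remove(requested)
       timeline_files.add(f"{instant}.inflight")
       return timeline_files
   ```

   If that reading is right, inspecting the table's `.hoodie/` folder on S3 for the failing commit time's `*.requested` file might help; the aggressive retention settings below (`hoodie.cleaner.commits.retained: 1`, `hoodie.keep.min.commits: 2`) could plausibly be a factor, though that is speculation on my part.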
   
   **Hoodie configs:**
   ```python
   "hoodie.datasource.write.keygenerator.class": "org.apache.hudi.keygen.ComplexKeyGenerator",
   "hoodie.datasource.write.payload.class": "org.apache.hudi.common.model.DefaultHoodieRecordPayload",
   "hoodie.datasource.hive_sync.partition_extractor_class": "org.apache.hudi.hive.MultiPartKeysValueExtractor",
   "hoodie.table.name": table_name,
   "hoodie.datasource.write.recordkey.field": IDX_COL,
   "hoodie.datasource.write.partitionpath.field": pks,
   "hoodie.datasource.write.hive_style_partitioning": "true",
   "hoodie.datasource.write.precombine.field": tiebreaker,
   "hoodie.datasource.write.operation": operation,
   "hoodie.write.concurrency.mode": "single_writer",
   "hoodie.cleaner.commits.retained": 1,
   "hoodie.fail.on.timeline.archiving": False,
   "hoodie.keep.max.commits": 3,
   "hoodie.keep.min.commits": 2,
   "hoodie.bloom.index.use.caching": True,
   "hoodie.parquet.compression.codec": "snappy"
   ```
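
   For context, a self-contained sketch of how these options are assembled and passed to the writer in the Glue job. The parameter names are stand-ins for our own variables (`table_name`, `IDX_COL`, `pks`, `tiebreaker`, `operation`); the commented-out write call at the end is the standard PySpark/Hudi pattern, shown only for orientation:

   ```python
   def build_hudi_options(table_name, record_key, partition_path, precombine, operation):
       """Assemble the Hudi write options listed above as a plain dict."""
       return {
           "hoodie.datasource.write.keygenerator.class": "org.apache.hudi.keygen.ComplexKeyGenerator",
           "hoodie.datasource.write.payload.class": "org.apache.hudi.common.model.DefaultHoodieRecordPayload",
           "hoodie.datasource.hive_sync.partition_extractor_class": "org.apache.hudi.hive.MultiPartKeysValueExtractor",
           "hoodie.table.name": table_name,
           "hoodie.datasource.write.recordkey.field": record_key,
           "hoodie.datasource.write.partitionpath.field": partition_path,
           "hoodie.datasource.write.hive_style_partitioning": "true",
           "hoodie.datasource.write.precombine.field": precombine,
           "hoodie.datasource.write.operation": operation,
           "hoodie.write.concurrency.mode": "single_writer",
           "hoodie.cleaner.commits.retained": 1,
           "hoodie.fail.on.timeline.archiving": False,
           "hoodie.keep.max.commits": 3,
           "hoodie.keep.min.commits": 2,
           "hoodie.bloom.index.use.caching": True,
           "hoodie.parquet.compression.codec": "snappy",
       }

   # In the Glue job the dict is then passed to the DataFrame writer, roughly:
   # df.write.format("hudi").options(**build_hudi_options(...)).mode("append").save(s3_path)
   ```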
                           
   **Environment Description**
   AWS Glue Job
   * Hudi version : 0.8
   * Spark version : "Spark 2.4 - Python 3 with improved job times (Glue Version 2.0)"
   * Storage (HDFS/S3/GCS..) : S3
   * Running on Docker? (yes/no) : No
   
   **Additional context**
   
   Infrastructure: Glue Job + S3


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
