bkosuru commented on issue #4864: URL: https://github.com/apache/hudi/issues/4864#issuecomment-1047227041
Hi Sivabalan,

Without the INSERT_DROP_DUPS_OPT_KEY setting, the job runs fine (a sketch of the write with that option enabled follows the stack trace below). Here is the stack trace:

User class threw exception: org.apache.spark.SparkException: Job aborted due to stage failure: Task 198 in stage 7.0 failed 4 times, most recent failure: Lost task 198.3 in stage 7.0 (TID 8888, xyz1.cnet.com, executor 467): ExecutorLostFailure (executor 467 exited caused by one of the running tasks) Reason: Container marked as failed: container_e330_16441790_15827_02_00078 on host: xyz1.cnet.com. Exit status: 143.
Diagnostics:
[2022-02-13 08:14:04.532] Container killed on request. Exit code is 143
[2022-02-13 08:14:04.532] Container exited with a non-zero exit code 143.
[2022-02-13 08:14:04.537] Killed by external signal

Driver stacktrace:
    at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1892)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1880)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1879)
    at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
    at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
    at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1879)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:930)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:930)
    at scala.Option.foreach(Option.scala:257)
    at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:930)
    at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:2113)
    at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:2062)
    at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:2051)
    at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:49)
    at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:741)
    at org.apache.spark.SparkContext.runJob(SparkContext.scala:2081)
    at org.apache.spark.SparkContext.runJob(SparkContext.scala:2102)
    at org.apache.spark.SparkContext.runJob(SparkContext.scala:2121)
    at org.apache.spark.rdd.RDD$$anonfun$take$1.apply(RDD.scala:1386)
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
    at org.apache.spark.rdd.RDD.withScope(RDD.scala:385)
    at org.apache.spark.rdd.RDD.take(RDD.scala:1359)
    at org.apache.spark.rdd.RDD$$anonfun$isEmpty$1.apply$mcZ$sp(RDD.scala:1494)
    at org.apache.spark.rdd.RDD$$anonfun$isEmpty$1.apply(RDD.scala:1494)
    at org.apache.spark.rdd.RDD$$anonfun$isEmpty$1.apply(RDD.scala:1494)
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
    at org.apache.spark.rdd.RDD.withScope(RDD.scala:385)
    at org.apache.spark.rdd.RDD.isEmpty(RDD.scala:1493)
    at org.apache.spark.api.java.JavaRDDLike$class.isEmpty(JavaRDDLike.scala:544)
    at org.apache.spark.api.java.AbstractJavaRDDLike.isEmpty(JavaRDDLike.scala:45)
    at org.apache.hudi.HoodieSparkSqlWriter$.write(HoodieSparkSqlWriter.scala:181)
    at org.apache.hudi.DefaultSource.createRelation(DefaultSource.scala:145)
    at org.apache.spark.sql.execution.datasources.SaveIntoDataSourceCommand.run(SaveIntoDataSourceCommand.scala:45)
    at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70)
    at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68)
    at org.apache.spark.sql.execution.command.ExecutedCommandExec.doExecute(commands.scala:86)
    at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:131)
    at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:127)
    at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:155)
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
    at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:152)
    at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:127)
    at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:80)
    at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:80)
    at org.apache.spark.sql.DataFrameWriter$$anonfun$runCommand$1.apply(DataFrameWriter.scala:677)
    at org.apache.spark.sql.DataFrameWriter$$anonfun$runCommand$1.apply(DataFrameWriter.scala:677)
    at org.apache.spark.sql.execution.SQLExecution$$anonfun$withNewExecutionId$1.apply(SQLExecution.scala:78)
    at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:125)
    at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:73)
    at org.apache.spark.sql.DataFrameWriter.runCommand(DataFrameWriter.scala:677)
    at org.apache.spark.sql.DataFrameWriter.saveToV1Source(DataFrameWriter.scala:285)
    at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:271)
    at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:229)
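For context on the option mentioned above, here is a minimal sketch of how such a write is typically configured through the Hudi Spark datasource. This is not the real job: the input DataFrame, record key and precombine fields, table name and path are all placeholders; only the `hoodie.datasource.write.insert.drop.duplicates` key (what `INSERT_DROP_DUPS_OPT_KEY` resolves to) is the setting in question.

```scala
import org.apache.spark.sql.{SaveMode, SparkSession}

val spark = SparkSession.builder()
  .appName("hudi-insert-drop-dups-sketch")
  .getOrCreate()
import spark.implicits._

// Placeholder input; the real job reads its own source data.
val df = Seq(("k1", "2022-02-13", "v1"), ("k2", "2022-02-13", "v2"))
  .toDF("record_key", "ts", "value")

df.write
  .format("org.apache.hudi")
  .option("hoodie.datasource.write.operation", "insert")
  // INSERT_DROP_DUPS_OPT_KEY; removing this line is the case that runs fine.
  .option("hoodie.datasource.write.insert.drop.duplicates", "true")
  .option("hoodie.datasource.write.recordkey.field", "record_key")   // placeholder key field
  .option("hoodie.datasource.write.precombine.field", "ts")          // placeholder precombine field
  .option("hoodie.table.name", "target_table")                       // placeholder table name
  .mode(SaveMode.Append)
  .save("/tmp/hudi/target_table")                                    // placeholder path
```

The only difference between the failing run and the one that succeeds is that single option; everything else in the write is unchanged.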
