Re: [I] [SUPPORT] OOM errors while creating a table using Bulk Insert operation [hudi]

via GitHub Mon, 21 Oct 2024 11:26:37 -0700


dataproblems commented on issue #12116:
URL: https://github.com/apache/hudi/issues/12116#issuecomment-2427410163


   @ad1happy2go - I've removed that config for the second screenshot. I 
couldn't find the executor with the exact error message shown in the 
screenshot, but here's what I do see: 
   
   ```
   24/10/21 04:15:25 INFO ShuffleBlockFetcherIterator: Started 40 remote fetches in 1160 ms
   24/10/21 04:15:25 INFO ShuffleBlockFetcherIterator: Started 42 remote fetches in 1160 ms
   24/10/21 04:15:25 ERROR Utils: Aborting task
   org.apache.spark.TaskKilledException: null
        at org.apache.spark.TaskContextImpl.killTaskIfInterrupted(TaskContextImpl.scala:267) ~[spark-core_2.12-3.4.1-amzn-2.jar:3.4.1-amzn-2]
        at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:36) ~[spark-core_2.12-3.4.1-amzn-2.jar:3.4.1-amzn-2]
        at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:460) ~[scala-library-2.12.15.jar:?]
        at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage2.sort_addToSorter_0$(Unknown Source) ~[?:?]
        at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage2.sort_doSort_0$(Unknown Source) ~[?:?]
        at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage2.processNext(Unknown Source) ~[?:?]
        at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:35) ~[spark-sql_2.12-3.4.1-amzn-2.jar:3.4.1-amzn-2]
        at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage2.hasNext(Unknown Source) ~[?:?]
        at org.apache.spark.sql.execution.WholeStageCodegenExec$$anon$1.hasNext(WholeStageCodegenExec.scala:959) ~[spark-sql_2.12-3.4.1-amzn-2.jar:3.4.1-amzn-2]
        at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:490) ~[scala-library-2.12.15.jar:?]
        at org.apache.spark.sql.execution.datasources.v2.WritingSparkTask.$anonfun$run$1(WriteToDataSourceV2Exec.scala:464) ~[spark-sql_2.12-3.4.1-amzn-2.jar:3.4.1-amzn-2]
        at org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1575) ~[spark-core_2.12-3.4.1-amzn-2.jar:3.4.1-amzn-2]
        at org.apache.spark.sql.execution.datasources.v2.WritingSparkTask.run(WriteToDataSourceV2Exec.scala:509) ~[spark-sql_2.12-3.4.1-amzn-2.jar:3.4.1-amzn-2]
        at org.apache.spark.sql.execution.datasources.v2.WritingSparkTask.run$(WriteToDataSourceV2Exec.scala:448) ~[spark-sql_2.12-3.4.1-amzn-2.jar:3.4.1-amzn-2]
        at org.apache.spark.sql.execution.datasources.v2.DataWritingSparkTask$.run(WriteToDataSourceV2Exec.scala:514) ~[spark-sql_2.12-3.4.1-amzn-2.jar:3.4.1-amzn-2]
        at org.apache.spark.sql.execution.datasources.v2.V2TableWriteExec.$anonfun$writeWithV2$2(WriteToDataSourceV2Exec.scala:411) ~[spark-sql_2.12-3.4.1-amzn-2.jar:3.4.1-amzn-2]
        at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:92) ~[spark-core_2.12-3.4.1-amzn-2.jar:3.4.1-amzn-2]
        at org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:161) ~[spark-core_2.12-3.4.1-amzn-2.jar:3.4.1-amzn-2]
        at org.apache.spark.scheduler.Task.run(Task.scala:141) ~[spark-core_2.12-3.4.1-amzn-2.jar:3.4.1-amzn-2]
        at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:563) ~[spark-core_2.12-3.4.1-amzn-2.jar:3.4.1-amzn-2]
        at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1541) ~[spark-core_2.12-3.4.1-amzn-2.jar:3.4.1-amzn-2]
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:566) ~[spark-core_2.12-3.4.1-amzn-2.jar:3.4.1-amzn-2]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[?:1.8.0_422]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[?:1.8.0_422]
        at java.lang.Thread.run(Thread.java:750) ~[?:1.8.0_422]
   24/10/21 04:15:25 ERROR DataWritingSparkTask: Aborting commit for partition 4117 (task 20594, attempt 0, stage 23.0)
   24/10/21 04:15:25 ERROR DataWritingSparkTask: Aborted commit for partition 4117 (task 20594, attempt 0, stage 23.0)
   24/10/21 04:15:25 ERROR Utils: Aborting task
   org.apache.spark.TaskKilledException: null
        at org.apache.spark.TaskContextImpl.killTaskIfInterrupted(TaskContextImpl.scala:267) ~[spark-core_2.12-3.4.1-amzn-2.jar:3.4.1-amzn-2]
        at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:36) ~[spark-core_2.12-3.4.1-amzn-2.jar:3.4.1-amzn-2]
        at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:460) ~[scala-library-2.12.15.jar:?]
        at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage2.sort_addToSorter_0$(Unknown Source) ~[?:?]
        at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage2.sort_doSort_0$(Unknown Source) ~[?:?]
        at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage2.processNext(Unknown Source) ~[?:?]
        at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:35) ~[spark-sql_2.12-3.4.1-amzn-2.jar:3.4.1-amzn-2]
        at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage2.hasNext(Unknown Source) ~[?:?]
        at org.apache.spark.sql.execution.WholeStageCodegenExec$$anon$1.hasNext(WholeStageCodegenExec.scala:959) ~[spark-sql_2.12-3.4.1-amzn-2.jar:3.4.1-amzn-2]
        at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:490) ~[scala-library-2.12.15.jar:?]
        at org.apache.spark.sql.execution.datasources.v2.WritingSparkTask.$anonfun$run$1(WriteToDataSourceV2Exec.scala:464) ~[spark-sql_2.12-3.4.1-amzn-2.jar:3.4.1-amzn-2]
        at org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1575) ~[spark-core_2.12-3.4.1-amzn-2.jar:3.4.1-amzn-2]
        at org.apache.spark.sql.execution.datasources.v2.WritingSparkTask.run(WriteToDataSourceV2Exec.scala:509) ~[spark-sql_2.12-3.4.1-amzn-2.jar:3.4.1-amzn-2]
        at org.apache.spark.sql.execution.datasources.v2.WritingSparkTask.run$(WriteToDataSourceV2Exec.scala:448) ~[spark-sql_2.12-3.4.1-amzn-2.jar:3.4.1-amzn-2]
        at org.apache.spark.sql.execution.datasources.v2.DataWritingSparkTask$.run(WriteToDataSourceV2Exec.scala:514) ~[spark-sql_2.12-3.4.1-amzn-2.jar:3.4.1-amzn-2]
        at org.apache.spark.sql.execution.datasources.v2.V2TableWriteExec.$anonfun$writeWithV2$2(WriteToDataSourceV2Exec.scala:411) ~[spark-sql_2.12-3.4.1-amzn-2.jar:3.4.1-amzn-2]
        at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:92) ~[spark-core_2.12-3.4.1-amzn-2.jar:3.4.1-amzn-2]
        at org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:161) ~[spark-core_2.12-3.4.1-amzn-2.jar:3.4.1-amzn-2]
        at org.apache.spark.scheduler.Task.run(Task.scala:141) ~[spark-core_2.12-3.4.1-amzn-2.jar:3.4.1-amzn-2]
        at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:563) ~[spark-core_2.12-3.4.1-amzn-2.jar:3.4.1-amzn-2]
        at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1541) ~[spark-core_2.12-3.4.1-amzn-2.jar:3.4.1-amzn-2]
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:566) ~[spark-core_2.12-3.4.1-amzn-2.jar:3.4.1-amzn-2]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[?:1.8.0_422]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[?:1.8.0_422]
        at java.lang.Thread.run(Thread.java:750) ~[?:1.8.0_422]
   24/10/21 04:15:25 ERROR DataWritingSparkTask: Aborting commit for partition 4116 (task 20593, attempt 0, stage 23.0)
   24/10/21 04:15:25 ERROR DataWritingSparkTask: Aborted commit for partition 4116 (task 20593, attempt 0, stage 23.0)
   24/10/21 04:15:25 INFO Executor: Executor killed task 4117.0 in stage 23.0 (TID 20594), reason: Stage cancelled
   24/10/21 04:15:25 INFO Executor: Executor killed task 4116.0 in stage 23.0 (TID 20593), reason: Stage cancelled
   24/10/21 04:15:26 ERROR CoarseGrainedExecutorBackend: RECEIVED SIGNAL TERM
   ```
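
In case it helps with triage: both tasks in this log were killed with `reason: Stage cancelled`, so the `TaskKilledException` traces are probably a downstream symptom, not the original failure. Below is a minimal Python sketch for tallying kill reasons across executor logs. It assumes each log record sits on a single line; `summarize_kills` and the regex are hypothetical helpers written for this comment, not part of Spark or Hudi.

```python
import re
from collections import Counter

# Matches executor lines of the form seen in the log above, e.g.
# "... INFO Executor: Executor killed task 4117.0 in stage 23.0 (TID 20594), reason: Stage cancelled"
KILLED = re.compile(
    r"Executor killed task (?P<task>[\d.]+) in stage (?P<stage>[\d.]+) "
    r"\(TID (?P<tid>\d+)\), reason: (?P<reason>.+)$"
)


def summarize_kills(log_text):
    """Return (Counter of kill reasons, True if a SIGTERM line was seen)."""
    reasons = Counter()
    saw_sigterm = False
    for line in log_text.splitlines():
        match = KILLED.search(line.strip())
        if match:
            reasons[match.group("reason")] += 1
        if "RECEIVED SIGNAL TERM" in line:
            saw_sigterm = True
    return reasons, saw_sigterm


# Sample records copied from the executor log in this comment.
sample = """\
24/10/21 04:15:25 INFO Executor: Executor killed task 4117.0 in stage 23.0 (TID 20594), reason: Stage cancelled
24/10/21 04:15:25 INFO Executor: Executor killed task 4116.0 in stage 23.0 (TID 20593), reason: Stage cancelled
24/10/21 04:15:26 ERROR CoarseGrainedExecutorBackend: RECEIVED SIGNAL TERM
"""
reasons, saw_sigterm = summarize_kills(sample)
print(dict(reasons), saw_sigterm)
```

If every kill on an executor reports a stage cancellation, the first failure is likely on a different executor or in the driver log.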


