[ https://issues.apache.org/jira/browse/SPARK-9809?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Carson Wang updated SPARK-9809:
-------------------------------
    Description: 
When a stage failed and another stage was resubmitted with only part of the partitions to compute, all the tasks failed with the error message: java.util.NoSuchElementException: key not found: peakExecutionMemory. This is because the internal accumulators are not properly initialized for this stage, while other code assumes the internal accumulators always exist.

Job aborted due to stage failure: Task 4 in stage 12.0 failed 4 times, most recent failure: Lost task 4.3 in stage 12.0 (TID 4460, 10.1.2.40): java.util.NoSuchElementException: key not found: peakExecutionMemory
    at scala.collection.MapLike$class.default(MapLike.scala:228)
    at scala.collection.AbstractMap.default(Map.scala:58)
    at scala.collection.MapLike$class.apply(MapLike.scala:141)
    at scala.collection.AbstractMap.apply(Map.scala:58)
    at org.apache.spark.util.collection.ExternalSorter.writePartitionedFile(ExternalSorter.scala:699)
    at org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:80)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
    at org.apache.spark.scheduler.Task.run(Task.scala:88)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
    at java.lang.Thread.run(Thread.java:722)

  was:
Job aborted due to stage failure: Task 4 in stage 12.0 failed 4 times, most recent failure: Lost task 4.3 in stage 12.0 (TID 4460, 10.1.2.40): java.util.NoSuchElementException: key not found: peakExecutionMemory
    at scala.collection.MapLike$class.default(MapLike.scala:228)
    at scala.collection.AbstractMap.default(Map.scala:58)
    at scala.collection.MapLike$class.apply(MapLike.scala:141)
    at scala.collection.AbstractMap.apply(Map.scala:58)
    at org.apache.spark.util.collection.ExternalSorter.writePartitionedFile(ExternalSorter.scala:699)
    at org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:80)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
    at org.apache.spark.scheduler.Task.run(Task.scala:88)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
    at java.lang.Thread.run(Thread.java:722)
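For illustration only, the following is a minimal, self-contained Scala sketch, not Spark's actual code: the object AccumulatorLookupDemo and the internalAccumulators helper are made-up stand-ins. It shows the failure mode behind the reported exception, namely that Map.apply on a key the stage never registered throws java.util.NoSuchElementException, while a defensive lookup (or initializing the internal accumulators for every submitted stage attempt) avoids the crash.

    import java.util.NoSuchElementException

    // Minimal sketch (not Spark code): a stand-in for a per-task map of
    // internal accumulators. In the failing case the resubmitted stage never
    // put "peakExecutionMemory" into the map.
    object AccumulatorLookupDemo {

      def internalAccumulators(initialized: Boolean): Map[String, Long] =
        if (initialized) Map("peakExecutionMemory" -> 0L)
        else Map.empty[String, Long]

      def main(args: Array[String]): Unit = {
        val accums = internalAccumulators(initialized = false)

        // Map.apply on a missing key is what produces
        // java.util.NoSuchElementException: key not found: peakExecutionMemory
        try {
          val peak = accums("peakExecutionMemory")
          println(s"peak = $peak")
        } catch {
          case e: NoSuchElementException => println(s"task would crash: $e")
        }

        // A defensive lookup does not crash; the real fix is to make sure the
        // internal accumulators exist for every stage attempt.
        val peakOrDefault = accums.getOrElse("peakExecutionMemory", 0L)
        println(s"peakOrDefault = $peakOrDefault")
      }
    }

The sketch only demonstrates the Map lookup semantics behind the stack trace below; the actual fix belongs in how Spark initializes internal accumulators for resubmitted stages.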
> Task crashes because the internal accumulators are not properly initialized
> ---------------------------------------------------------------------------
>
>                 Key: SPARK-9809
>                 URL: https://issues.apache.org/jira/browse/SPARK-9809
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 1.5.0
>            Reporter: Carson Wang
>            Assignee: Apache Spark
>            Priority: Blocker
>
> When a stage failed and another stage was resubmitted with only part of the
> partitions to compute, all the tasks failed with the error message:
> java.util.NoSuchElementException: key not found: peakExecutionMemory.
> This is because the internal accumulators are not properly initialized for
> this stage, while other code assumes the internal accumulators always exist.
> Job aborted due to stage failure: Task 4 in stage 12.0 failed 4 times, most
> recent failure: Lost task 4.3 in stage 12.0 (TID 4460,
> 10.1.2.40): java.util.NoSuchElementException: key not found:
> peakExecutionMemory
>     at scala.collection.MapLike$class.default(MapLike.scala:228)
>     at scala.collection.AbstractMap.default(Map.scala:58)
>     at scala.collection.MapLike$class.apply(MapLike.scala:141)
>     at scala.collection.AbstractMap.apply(Map.scala:58)
>     at org.apache.spark.util.collection.ExternalSorter.writePartitionedFile(ExternalSorter.scala:699)
>     at org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:80)
>     at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
>     at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
>     at org.apache.spark.scheduler.Task.run(Task.scala:88)
>     at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
>     at java.lang.Thread.run(Thread.java:722)

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org