Github user jiangxb1987 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21729#discussion_r202725810

    --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala ---
    @@ -87,7 +87,7 @@ private[spark] class TaskSetManager(
       // Set the coresponding index of Boolean var when the task killed by other attempt tasks,
       // this happened while we set the `spark.speculation` to true. The task killed by others
       // should not resubmit while executor lost.
    -  private val killedByOtherAttempt: Array[Boolean] = new Array[Boolean](numTasks)
    +  private val killedByOtherAttempt = new HashSet[Long]
    --- End diff --

    For instance, when you have corrupted shuffle data you may want to verify that it was not caused by killing tasks, and that requires tracking all killed `taskId`s corresponding to a partition. With a HashMap as @mridulm proposed, it would be easy to add extra logging for debugging. But I just looked at the code again and found that expanding the logInfo in L735 can also cover my case, so it seems fine to use a HashSet and save some memory.
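The trade-off under discussion can be sketched in isolation. This is a hypothetical standalone example, not the actual `TaskSetManager` code: it contrasts the old per-partition `Array[Boolean]` (fixed cost proportional to `numTasks`) with the new `HashSet[Long]` of killed `taskId`s (cost proportional to the number of kills, which is typically small even with `spark.speculation` enabled). The names `recordKill`, `numTasks`, and the sample ids are invented for illustration.

```scala
import scala.collection.mutable.HashSet

object KilledTrackingSketch {
  val numTasks = 4

  // Old representation: one Boolean per partition index,
  // allocated up front regardless of how many kills occur.
  val killedFlags: Array[Boolean] = new Array[Boolean](numTasks)

  // New representation: only the taskIds that were actually killed
  // by another attempt, so memory grows with kills, not partitions.
  val killedByOtherAttempt = new HashSet[Long]

  // Hypothetical helper: record that the task attempt `taskId`
  // running partition `partitionIndex` was killed by another attempt.
  def recordKill(partitionIndex: Int, taskId: Long): Unit = {
    killedFlags(partitionIndex) = true
    killedByOtherAttempt += taskId
  }

  def main(args: Array[String]): Unit = {
    recordKill(2, 1001L)
    assert(killedFlags(2))                        // partition 2 is marked
    assert(killedByOtherAttempt.contains(1001L))  // exact taskId retained
    println(killedByOtherAttempt.size)
  }
}
```

Note the HashSet also preserves *which* attempt was killed, which the Boolean array could not express; the per-taskId information for extra debug logging is what the HashMap variant would have added on top.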