tgravescs commented on a change in pull request #27207: [WIP][SPARK-18886][CORE] Make Locality wait time measure resource under utilization due to delay scheduling.
URL: https://github.com/apache/spark/pull/27207#discussion_r374753943
##########
File path: core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala
##########
@@ -100,6 +101,11 @@ private[spark] class TaskSchedulerImpl(
   // on this class. Protected by `this`
   private val taskSetsByStageIdAndAttempt = new HashMap[Int, HashMap[Int, TaskSetManager]]
 
+  // keyed by task set stage id

Review comment:
This is going to be a problem. You really want this keyed by task set: the stage id here is not unique per task set, so if you have multiple task sets for the same stage, their entries could overlap and you could remove one too early. We could use taskSet.id, which has both the stage id and the stage attempt id.
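
To illustrate the concern, here is a minimal standalone sketch, not the PR's code. `TaskSetStub` is a hypothetical stand-in for Spark's TaskSet that keeps only the two ids, and the map values are placeholder strings; the `id` format (stage id, a dot, stage attempt id) is assumed from the description above. It shows that two attempts of the same stage collide when the map is keyed by stage id, but stay separate when keyed by the combined id.

```scala
import scala.collection.mutable.HashMap

// Hypothetical, simplified stand-in for Spark's TaskSet (assumption: only ids matter here).
final case class TaskSetStub(stageId: Int, stageAttemptId: Int) {
  // Assumed format: combines the stage id and the stage attempt id.
  val id: String = s"$stageId.$stageAttemptId"
}

object KeyingSketch {
  // Returns (entries left when keyed by stage id, entries left when keyed by task set id)
  // after the first attempt's entry is removed.
  def demo(): (Int, Int) = {
    val attempt0 = TaskSetStub(stageId = 3, stageAttemptId = 0)
    val attempt1 = TaskSetStub(stageId = 3, stageAttemptId = 1)

    // Keyed by stage id: both attempts share the key, so the second write
    // overwrites the first and one removal drops the state for both.
    val byStageId = HashMap[Int, String]()
    byStageId(attempt0.stageId) = "state for attempt 0"
    byStageId(attempt1.stageId) = "state for attempt 1"
    byStageId.remove(attempt0.stageId)

    // Keyed by task set id: each attempt has its own entry, so removing
    // attempt 0 leaves attempt 1 untouched.
    val byTaskSetId = HashMap[String, String]()
    byTaskSetId(attempt0.id) = "state for attempt 0"
    byTaskSetId(attempt1.id) = "state for attempt 1"
    byTaskSetId.remove(attempt0.id)

    (byStageId.size, byTaskSetId.size)
  }
}
```

With stage-id keys, the still-running attempt loses its entry as soon as the other attempt is cleaned up, which is the "remove too early" case described in the comment.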
