[ https://issues.apache.org/jira/browse/SPARK-38019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Dongjoon Hyun updated SPARK-38019:
----------------------------------
    Parent: (was: SPARK-35781)
    Issue Type: Bug  (was: Sub-task)

> ExecutorMonitor.timedOutExecutors should be deterministic
> ----------------------------------------------------------
>
>                 Key: SPARK-38019
>                 URL: https://issues.apache.org/jira/browse/SPARK-38019
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 3.2.0, 3.2.1, 3.3.0
>            Reporter: Dongjoon Hyun
>            Assignee: Dongjoon Hyun
>            Priority: Major
>             Fix For: 3.3.0, 3.2.2
>
>
> Since the AS-IS timedOutExecutors returns its result in a non-deterministic order, it kills executors in a random order when Dynamic Allocation is enabled.
> spark/core/src/main/scala/org/apache/spark/scheduler/dynalloc/ExecutorMonitor.scala
> {code}
> private val executors = new ConcurrentHashMap[String, Tracker]()
> ...
> timedOutExecs = executors.asScala
> {code}
> This random behavior not only confuses users but also makes K8s decommission tests flaky, as in the following case on Java 17 in an Apple Silicon environment. The K8s test expects the decommission of executor 1, but executor 2 is chosen instead.
> {code}
> 22/01/25 06:11:16 DEBUG ExecutorMonitor: Executors 1,2 do not have active shuffle data after job 0 finished.
> 22/01/25 06:11:16 DEBUG ExecutorAllocationManager: max needed for rpId: 0 numpending: 0, tasksperexecutor: 1
> 22/01/25 06:11:16 DEBUG ExecutorAllocationManager: No change in number of executors
> 22/01/25 06:11:16 DEBUG ExecutorAllocationManager: Request to remove executorIds: (2,0), (1,0)
> 22/01/25 06:11:16 DEBUG ExecutorAllocationManager: Not removing idle executor 1 because there are only 1 executor(s) left (minimum number of executor limit 1)
> 22/01/25 06:11:16 INFO KubernetesClusterSchedulerBackend: Decommission executors: 2
> {code}
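The root cause is that iteration over a ConcurrentHashMap carries no ordering guarantee, so the filtered set of timed-out executors can come back in a different order on each run. Below is a minimal standalone Scala sketch of one way to restore determinism (an illustration, not the actual SPARK-38019 patch): filter the timed-out entries, then sort the surviving executor IDs before returning them. The one-field Tracker here is a hypothetical stand-in for the real ExecutorMonitor.Tracker.

{code}
import java.util.concurrent.ConcurrentHashMap
import scala.collection.JavaConverters._

object DeterministicTimedOut {
  // Hypothetical stand-in for ExecutorMonitor's internal Tracker state;
  // the real class tracks idle time, shuffle data, etc.
  final case class Tracker(timedOut: Boolean)

  def main(args: Array[String]): Unit = {
    val executors = new ConcurrentHashMap[String, Tracker]()
    executors.put("2", Tracker(timedOut = true))
    executors.put("1", Tracker(timedOut = true))

    // ConcurrentHashMap iteration order is unspecified, so this filtered
    // view may yield executors in a different order across runs/JVMs.
    val unordered = executors.asScala.collect {
      case (id, tracker) if tracker.timedOut => id
    }.toSeq

    // Sorting the surviving IDs makes the result reproducible, so the
    // caller (e.g. idle-executor removal) always sees the same sequence.
    // Numeric sort is an illustrative choice; any total order works.
    val ordered = unordered.sortBy(_.toInt)

    println(ordered) // always List(1, 2)
  }
}
{code}

With a deterministic order, the decommission test above would consistently see executor 1 removed first, instead of whichever executor the hash-map iteration happened to yield.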