Github user Ngone51 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20770#discussion_r173116730

    --- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala ---
    @@ -140,8 +141,8 @@ class DAGScheduler(
       private[scheduler] def numTotalJobs: Int = nextJobId.get()
       private val nextStageId = new AtomicInteger(0)
    -  private[scheduler] val jobIdToStageIds = new HashMap[Int, HashSet[Int]]
    -  private[scheduler] val stageIdToStage = new HashMap[Int, Stage]
    +  private[scheduler] val jobIdToStageIds = new TrieMap[Int, HashSet[Int]]
    +  private[scheduler] val stageIdToStage = new TrieMap[Int, Stage]
    --- End diff --

    Do we really need ```TrieMap``` for concurrency here, given that job scheduling is FIFO (even after moving ```createResultStage()```)?

    BTW, out of curiosity, what are the advantages of ```TrieMap``` compared to ```ConcurrentHashMap```?
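
    For context on the second question, here is a minimal sketch of one behavior that usually comes up when comparing the two maps: `TrieMap.snapshot()` returns an atomic point-in-time copy, while both maps offer an atomic `putIfAbsent`. It is an illustration only, not code from the pull request; the `TrieMapSketch` object and the `String` values standing in for `Stage` are assumptions made for the example.

```scala
import java.util.concurrent.ConcurrentHashMap
import scala.collection.concurrent.TrieMap

object TrieMapSketch {
  def main(args: Array[String]): Unit = {
    // Hypothetical stand-in for stageIdToStage; String replaces Stage here.
    val stages = TrieMap[Int, String](0 -> "stage-0", 1 -> "stage-1")

    // snapshot() captures the map atomically; later writes to `stages`
    // are not visible through `frozen`.
    val frozen = stages.snapshot()
    stages.put(2, "stage-2")
    stages.remove(0)

    assert(frozen.keySet == Set(0, 1))
    assert(stages.keySet == Set(1, 2))

    // Both maps provide an atomic putIfAbsent. TrieMap returns an Option
    // holding the existing value; ConcurrentHashMap returns the value or null.
    assert(stages.putIfAbsent(1, "other") == Some("stage-1"))

    val chm = new ConcurrentHashMap[Int, String]()
    chm.put(1, "stage-1")
    assert(chm.putIfAbsent(1, "other") == "stage-1")
  }
}
```

    Whether a consistent snapshot matters for `DAGScheduler` depends on how these maps are read from other threads, which the diff above does not show.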