[GitHub] spark pull request #22371: [SPARK-25386][CORE] Don't need to synchronize the...

srowen Mon, 10 Sep 2018 09:19:17 -0700

Github user srowen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22371#discussion_r216383808
  
    --- Diff: core/src/main/scala/org/apache/spark/shuffle/IndexShuffleBlockResolver.scala ---
    @@ -138,13 +154,22 @@ private[spark] class IndexShuffleBlockResolver(
           mapId: Int,
           lengths: Array[Long],
           dataTmp: File): Unit = {
    +    val mapLocks = shuffleIdToLocks.get(shuffleId)
    +    require(mapLocks != null, "Shuffle should be registered to IndexShuffleBlockResolver first")
    +    val lock = mapLocks.synchronized {
    --- End diff --
    
    The theory is that far fewer threads would contend here, because the lock is per shuffle ID.
    
    If contention does turn out to be an issue, then your idea of a second-level ConcurrentHashMap might help. It's more complex than a usual Map, but it allows safe concurrent access by a limited number of threads.
    
    Otherwise, a ConcurrentHashMap might be overkill for the second-level Map.
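
For illustration, the second-level ConcurrentHashMap idea could look roughly like the sketch below. This is a minimal sketch under stated assumptions, not the code in this PR: `ShuffleLockRegistry`, `registerShuffle`, `lockFor`, and `unregisterShuffle` are hypothetical names, and only `shuffleIdToLocks` and the `require` check mirror the diff above.

```scala
import java.util.concurrent.ConcurrentHashMap
import java.util.function.{Function => JFunction}

// Hypothetical helper (not part of Spark): a two-level registry of lock objects,
// keyed first by shuffle ID and then by map ID.
class ShuffleLockRegistry {
  // shuffleId -> (mapId -> lock object)
  private val shuffleIdToLocks =
    new ConcurrentHashMap[Int, ConcurrentHashMap[Int, Object]]()

  // Factory for a fresh lock object; an explicit Function avoids relying on
  // SAM conversion, which older Scala versions do not apply by default.
  private val newLock = new JFunction[Int, Object] {
    override def apply(mapId: Int): Object = new Object()
  }

  def registerShuffle(shuffleId: Int): Unit = {
    shuffleIdToLocks.putIfAbsent(shuffleId, new ConcurrentHashMap[Int, Object]())
  }

  // Returns the lock for (shuffleId, mapId), creating it atomically if absent.
  // No synchronized block is needed around the second-level lookup.
  def lockFor(shuffleId: Int, mapId: Int): Object = {
    val mapLocks = shuffleIdToLocks.get(shuffleId)
    require(mapLocks != null, "Shuffle should be registered first")
    mapLocks.computeIfAbsent(mapId, newLock)
  }

  def unregisterShuffle(shuffleId: Int): Unit = {
    shuffleIdToLocks.remove(shuffleId)
  }
}
```

A caller would then write something like `registry.lockFor(shuffleId, mapId).synchronized { ... }`, so that only threads working on the same map output contend with each other.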


