DaveDeCaprio commented on a change in pull request #24028: [SPARK-26917][SQL] Further reduce locks in CacheManager
URL: https://github.com/apache/spark/pull/24028#discussion_r264009687
 
 

 ##########
 File path: sql/core/src/main/scala/org/apache/spark/sql/execution/CacheManager.scala
 ##########
 @@ -144,16 +144,10 @@ class CacheManager extends Logging {
       } else {
         _.sameResult(plan)
       }
-    val plansToUncache = mutable.Buffer[CachedData]()
-    readLock {
-      val it = cachedData.iterator()
-      while (it.hasNext) {
-        val cd = it.next()
-        if (shouldRemove(cd.plan)) {
-          plansToUncache += cd
-        }
-      }
+    val cachedDataCopy = readLock {
+      cachedData.asScala.clone()
 
 Review comment:
   It is a shallow copy: only the collection is cloned, not the cached entries 
it holds. The change only makes a difference under very heavy load. We haven't 
seen any performance impact in smaller cases, and in our case it prevents the 
entire Spark application from getting into a locked state where only one job 
can run at a time.
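
   For illustration, here is a minimal sketch of the pattern, not the actual 
CacheManager code. `Entry`, `TinyCache`, `add`, `size`, and `removeMatching` 
are hypothetical names, and a plain `ArrayBuffer` stands in for the real 
`cachedData` collection. The read lock is held only long enough to take a 
shallow snapshot, the predicate runs with no lock held, and the write lock is 
taken only for the removal itself.

```scala
import java.util.concurrent.locks.ReentrantReadWriteLock
import scala.collection.mutable

// Hypothetical stand-in for CachedData; only a plan name is modelled here.
final case class Entry(plan: String)

class TinyCache {
  private val lock = new ReentrantReadWriteLock()
  private val entries = mutable.ArrayBuffer.empty[Entry]

  // Run `body` while holding the shared read lock.
  private def readLock[A](body: => A): A = {
    val l = lock.readLock()
    l.lock()
    try body finally l.unlock()
  }

  // Run `body` while holding the exclusive write lock.
  private def writeLock[A](body: => A): A = {
    val l = lock.writeLock()
    l.lock()
    try body finally l.unlock()
  }

  def add(e: Entry): Unit = writeLock { entries += e; () }

  def size: Int = readLock { entries.size }

  def removeMatching(shouldRemove: Entry => Boolean): Seq[Entry] = {
    // Shallow copy: a new list that references the same Entry objects.
    val snapshot = readLock { entries.toList }
    // The potentially expensive predicate runs outside of any lock.
    val toRemove = snapshot.filter(shouldRemove)
    // Only the mutation itself needs the write lock.
    // Assumption: entries are matched by equality in this sketch.
    writeLock { toRemove.foreach(e => entries -= e) }
    toRemove
  }
}
```

   Because the snapshot can be stale by the time the write lock is taken, an 
entry added in between is simply not considered by that call. The sketch 
accepts that trade-off in exchange for shorter lock hold times.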

