DaveDeCaprio opened a new pull request #23469: [SPARK-26548][SQL] Don't hold 
CacheManager write lock while computing executedPlan
URL: https://github.com/apache/spark/pull/23469
 
 
   ## What changes were proposed in this pull request?
   
   Addresses SPARK-26548. In Spark 2.4.0, the CacheManager holds a write lock 
while computing the executedPlan for a cached logicalPlan.  With very large 
query plans this can be an expensive operation, taking minutes to run, and 
the entire cache is blocked during that time.  This PR changes the code so 
that the write lock is obtained only after the executedPlan has been 
generated.  The lock is then held only for the time needed to update the 
shared data structure.
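
The locking change described above can be sketched as follows. This is a minimal, self-contained illustration and not Spark's actual `CacheManager` code: `PlanCache`, `cacheHoldingLock`, `cacheComputeFirst`, and the `compute` parameter are hypothetical names, with `compute` standing in for the expensive executedPlan generation.

```scala
import java.util.concurrent.locks.ReentrantReadWriteLock
import scala.collection.mutable.ArrayBuffer

// Hypothetical stand-in for a cache guarded by a read/write lock.
// All names here are illustrative assumptions, not Spark's API.
final class PlanCache[K, V] {
  private val lock = new ReentrantReadWriteLock()
  private val entries = ArrayBuffer.empty[(K, V)]

  private def withReadLock[A](body: => A): A = {
    val l = lock.readLock()
    l.lock()
    try body finally l.unlock()
  }

  private def withWriteLock[A](body: => A): A = {
    val l = lock.writeLock()
    l.lock()
    try body finally l.unlock()
  }

  def lookup(key: K): Option[V] = withReadLock {
    entries.collectFirst { case (k, v) if k == key => v }
  }

  // Before: `compute` is evaluated while the write lock is held,
  // so every reader and writer waits for the expensive computation.
  def cacheHoldingLock(key: K)(compute: => V): Unit = withWriteLock {
    if (!entries.exists(_._1 == key)) {
      entries += ((key, compute))
    }
  }

  // After: evaluate `compute` first, then take the write lock only
  // for the update of the shared buffer.
  def cacheComputeFirst(key: K)(compute: => V): Unit = {
    if (lookup(key).isEmpty) {
      val value = compute // expensive work, no lock held
      withWriteLock {
        // Re-check under the lock: another thread may have added the key.
        if (!entries.exists(_._1 == key)) {
          entries += ((key, value))
        }
      }
    }
  }
}
```

One trade-off of this sketch is that two threads caching the same key concurrently may both run `compute`. The re-check under the write lock keeps the buffer free of duplicates, at the cost of some discarded work.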
   
   @gatorsmile and @cloud-fan - You have committed patches in this area before.  
This is a small incremental change.
   
   ## How was this patch tested?
   
   This has been tested on a live system where the blocking was causing major 
issues, and it is working well there.  CacheManager has no dedicated unit 
test, but it is used in many places internally as part of the SharedState.
   

