dhruve commented on issue #22015: 
[SPARK-20286][SPARK-24786][Core][DynamicAllocation] Release executors on 
unpersisting RDD
URL: https://github.com/apache/spark/pull/22015#issuecomment-465145564
 
 
   Let me explain the change to make it clearer. In the regular case, if an 
executor doesn't hold any cached data, we release it when no tasks are 
scheduled on it within the timeout interval (set by 
`spark.dynamicAllocation.executorIdleTimeout`). This is the default behavior.
   
   For executors that hold cached data, the expiry interval is infinite by 
default (configured by `spark.dynamicAllocation.cachedExecutorIdleTimeout`). 
Since we don't know when the next task will need to access the cached data, we 
keep these executors for as long as required. Now let's say the user unpersists 
an RDD: all the cached blocks for this RDD are removed from all the executors. 
Executors that were holding cached data only for this RDD no longer hold any, 
so they are equivalent to any other executor. With the current behavior, if no 
tasks are scheduled on these executors, which previously had cached data but 
currently don't, Spark doesn't release them and holds them indefinitely, or for 
whatever `spark.dynamicAllocation.cachedExecutorIdleTimeout` is set to.
   
   This change checks whether the executor has any cached data. If it does, 
nothing changes. If it doesn't, the executor is treated the same way as any 
other executor acquired through dynamic allocation: if no tasks are scheduled 
on it within the time interval (set by 
`spark.dynamicAllocation.executorIdleTimeout`), we release it. This just 
ensures Spark doesn't waste cluster resources.
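
   To make the before/after difference concrete, here is a rough sketch of 
the timeout decision in plain Python. It is a toy model, not the code in this 
PR: the `Executor` class, the `ever_cached` flag, and the function names are 
hypothetical, and the two constants simply stand in for the settings named 
above (60s and infinity are assumed to be their defaults).

```python
import math
from dataclasses import dataclass

# Stand-ins for the two settings discussed above, in seconds (assumed defaults).
EXECUTOR_IDLE_TIMEOUT = 60.0             # spark.dynamicAllocation.executorIdleTimeout
CACHED_EXECUTOR_IDLE_TIMEOUT = math.inf  # spark.dynamicAllocation.cachedExecutorIdleTimeout


@dataclass
class Executor:
    """Toy model of an executor; hypothetical, not a Spark class."""
    cached_blocks: int         # cached RDD blocks currently held
    idle_seconds: float        # time since a task last ran on this executor
    ever_cached: bool = False  # True if it held cached data at some point


def idle_timeout_before(ex: Executor) -> float:
    # Behavior described as current: an executor that once held cached data
    # keeps the cached timeout even after its blocks have been unpersisted.
    if ex.cached_blocks > 0 or ex.ever_cached:
        return CACHED_EXECUTOR_IDLE_TIMEOUT
    return EXECUTOR_IDLE_TIMEOUT


def idle_timeout_after(ex: Executor) -> float:
    # Behavior with this change: only cached data held right now matters.
    if ex.cached_blocks > 0:
        return CACHED_EXECUTOR_IDLE_TIMEOUT
    return EXECUTOR_IDLE_TIMEOUT


def should_release(ex: Executor, timeout_fn) -> bool:
    # An executor is released once it has been idle longer than its timeout.
    return ex.idle_seconds > timeout_fn(ex)


# An executor whose only cached RDD was unpersisted, idle for two minutes.
unpersisted = Executor(cached_blocks=0, idle_seconds=120.0, ever_cached=True)
released_before = should_release(unpersisted, idle_timeout_before)
released_after = should_release(unpersisted, idle_timeout_after)
```

   In this model `released_before` is false and `released_after` is true, 
while an executor that still holds cached blocks is kept in both cases.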
