dhruve commented on issue #22015: [SPARK-20286][SPARK-24786][Core][DynamicAllocation] Release executors on unpersisting RDD URL: https://github.com/apache/spark/pull/22015#issuecomment-465145564

Let me explain the change to make it clearer.

In regular cases, if an executor doesn't hold any cached data, we release it when no tasks are scheduled on it within the timeout interval (set by `spark.dynamicAllocation.executorIdleTimeout`). This is the default behavior. For executors that do hold cached data, the expiry interval is infinite by default (configured by `spark.dynamicAllocation.cachedExecutorIdleTimeout`). Since we don't know when the next task will need to access the cached data, we keep those executors around for as long as required.

Now let's say the user unpersists an RDD: all of the cached blocks for that RDD are removed from all the executors. The executors that previously held cached data no longer hold any of it, so they are equivalent to any other executor. With the current behavior, if no tasks are scheduled on these executors, Spark doesn't release them and holds them indefinitely (or for whatever `cachedExecutorIdleTimeout` is set to).

This change checks whether an executor has any cached data. If it does, nothing changes. If it doesn't, the executor is treated the same way as any other executor acquired through dynamic allocation: if no tasks are scheduled on it within the idle timeout (set by `spark.dynamicAllocation.executorIdleTimeout`), we release it. So we are just ensuring Spark doesn't waste cluster resources.
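To make the scenario concrete, here is a minimal Scala sketch of the cache/unpersist flow described above. The application name and config values (timeouts, dataset size) are illustrative assumptions, not taken from the PR; only the two `spark.dynamicAllocation.*` properties and the `cache()`/`unpersist()` calls come from the discussion.

```scala
import org.apache.spark.sql.SparkSession

// Illustrative setup: dynamic allocation with the two timeouts discussed above.
val spark = SparkSession.builder()
  .appName("unpersist-idle-timeout-example")          // hypothetical app name
  .config("spark.dynamicAllocation.enabled", "true")
  .config("spark.shuffle.service.enabled", "true")
  // Idle executors without cached blocks are released after this interval.
  .config("spark.dynamicAllocation.executorIdleTimeout", "60s")
  // Executors holding cached blocks are kept indefinitely by default.
  .config("spark.dynamicAllocation.cachedExecutorIdleTimeout", "infinity")
  .getOrCreate()

val rdd = spark.sparkContext.parallelize(1 to 1000000).cache()
rdd.count()        // materializes the cache; executors now hold cached blocks

// Later, the cached data is no longer needed:
rdd.unpersist()    // removes the RDD's blocks from every executor

// Before this change: those executors remain held for cachedExecutorIdleTimeout
// (infinite by default) even though they no longer hold any cached data.
// With this change: an executor with no cached blocks and no scheduled tasks
// is released after executorIdleTimeout, like any other idle executor.
```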
