rdblue edited a comment on issue #23401: [SPARK-26513][Core] : Trigger GC on executor node idle
URL: https://github.com/apache/spark/pull/23401#issuecomment-450701012

> Dynamic scale down is often done fairly conservatively when combined with cached blocks

I agree. We (Netflix) actually don't internally recommend caching when using dynamic allocation for most ETL workloads, and we recommend careful settings with ML workloads. So when I'm talking about dynamic allocation, I mean a case where executors time out fairly quickly as a stage enters its long tail, which is when this would primarily take effect. I don't think it would hurt, but I don't think it would help much either.

I also don't mean to say that I think this is a bad idea. It may be worth putting in as an option for certain workloads. I'm just skeptical that it is a good practice for a default and I didn't see much validation across workloads in the paper.
