[ https://issues.apache.org/jira/browse/SPARK-17637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16831014#comment-16831014 ]
Josh Rosen commented on SPARK-17637:
------------------------------------

I think this old feature suggestion is still very relevant, and I'd love to see someone pick it up and rework it for a newer version of Spark (or come up with an alternative improvement).

Here's my use case for packed scheduling of tasks: I have a job shaped roughly like {{input.flatMap(parse).repartition(40).map(transform).write.parquet()}}, with ~1000 map tasks but only 40 reduce tasks (because I want to write exactly 40 output files). Those reduce tasks are single-core, CPU-bound, and don't require much memory, so they can't really take advantage of idle resources from unused task slots on their executors. I want to run the initial map phase with huge parallelism (because my input dataset is massive) and then scale down resource usage during the final reduce phase so that I minimize the number of idle task slots.

I'm currently using dynamic allocation with Spark on YARN with the external shuffle service enabled. What happens today is that the final reduce phase's tasks get round-robin scheduled across the executors, spreading the work thin and leaving most of the executor cores idle. If these tasks were instead densely packed, then I'd be able to scale down all but a couple of my executors, freeing resources for other YARN applications. When the current round-robin policy is coupled with multi-core executors, it can sometimes make this shape of workload less resource-efficient (in terms of vcore_seconds) than an equivalent Scalding / Hadoop MapReduce job.

As a short-term workaround, I can reconfigure my application to run a larger number of smaller executors, but that's somewhat painful compared to being able to flip a single flag to toggle the task scheduling policy.
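To make the resource argument concrete, here is a toy simulation of the two placement policies for the 40 reduce tasks described above. This is plain Python, not Spark scheduler code, and the fleet size and core count are made-up numbers chosen to match the scenario (a large map phase followed by a small reduce phase):

```python
# Toy model (not Spark's actual scheduler): compare round-robin vs. packed
# placement of 40 single-core reduce tasks across a fleet of multi-slot
# executors left over from a highly parallel map phase.

def round_robin(num_tasks, num_executors):
    """Hand each task to the next executor in turn, spreading load evenly,
    roughly what today's round-robin policy does when all slots are free."""
    used = [0] * num_executors
    for i in range(num_tasks):
        used[i % num_executors] += 1
    return used

def packed(num_tasks, num_executors, cores_per_executor):
    """Fill every slot on one executor before touching the next one."""
    used = [0] * num_executors
    for i in range(num_tasks):
        used[i // cores_per_executor] += 1
    return used

def busy_executors(assignment):
    """Executors running at least one task; the rest could be released
    by dynamic allocation once their idle timeout expires."""
    return sum(1 for c in assignment if c > 0)

if __name__ == "__main__":
    tasks, executors, cores = 40, 100, 8  # assumed fleet sized for the map phase
    print(busy_executors(round_robin(tasks, executors)))       # 40 executors, 1 task each
    print(busy_executors(packed(tasks, executors, cores)))     # 5 executors, fully packed
```

Under round-robin, 40 executors each hold a single task and cannot be released; under packing, the same work occupies 5 fully loaded executors, and the remaining 95 go idle and can be returned to YARN.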
> Packed scheduling for Spark tasks across executors
> --------------------------------------------------
>
>                 Key: SPARK-17637
>                 URL: https://issues.apache.org/jira/browse/SPARK-17637
>             Project: Spark
>          Issue Type: Improvement
>          Components: Scheduler
>            Reporter: Zhan Zhang
>            Assignee: Zhan Zhang
>            Priority: Minor
>
> Currently the Spark scheduler implements round-robin scheduling of tasks to
> executors, which is great as it distributes the load evenly across the
> cluster, but it leads to significant resource waste in some cases,
> especially when dynamic allocation is enabled.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)