[ https://issues.apache.org/jira/browse/SPARK-17637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16831014#comment-16831014 ]
Josh Rosen commented on SPARK-17637:
------------------------------------

I think this old feature suggestion is still very relevant, and I'd love to see someone pick it up and rework it for a newer version of Spark (or come up with an alternative improvement).

Here's my use case for packed scheduling of tasks: I have a job shaped roughly like {{input.flatMap(parse).repartition(40).map(transform).write.parquet()}}, with ~1000 map tasks but only 40 reduce tasks (because I want to write exactly 40 output files). Those reduce tasks are single-core, CPU-bound, and don't require much memory, so they can't really take advantage of idle resources from unused task slots on their executors. I want to run the initial map phase with huge parallelism (because my input dataset is massive) and then scale down resource usage during the final reduce phase so that I minimize the number of idle task slots.

I'm currently using dynamic allocation with Spark on YARN with the external shuffle service enabled. What happens today is that the final reduce phase's tasks get round-robin scheduled across the executors, spreading the work thin and leaving most of the executor cores idle. If these tasks were instead densely packed, then I'd be able to scale down all but a couple of my executors, freeing resources for other YARN applications. When the current round-robin policy is coupled with multi-core executors, it can sometimes make this shape of workload less resource-efficient (in terms of vcore_seconds) than an equivalent Scalding / Hadoop MapReduce job.

As a short-term workaround, I can reconfigure my application to run a larger number of smaller executors, but that's somewhat painful compared to being able to flip a single flag to toggle the task scheduling policy.
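To make the resource argument concrete, here is a toy simulation of the two placement policies for the 40 reduce tasks described above. This is plain Python, not Spark scheduler code, and the fleet size and core count are made-up numbers chosen to match the scenario (a large map phase followed by a small reduce phase):

```python
# Toy model (not Spark's actual scheduler): compare round-robin vs. packed
# placement of 40 single-core reduce tasks across a fleet of multi-slot
# executors left over from a highly parallel map phase.

def round_robin(num_tasks, num_executors):
    """Hand each task to the next executor in turn, spreading load evenly,
    roughly what today's round-robin policy does when all slots are free."""
    used = [0] * num_executors
    for i in range(num_tasks):
        used[i % num_executors] += 1
    return used

def packed(num_tasks, num_executors, cores_per_executor):
    """Fill every slot on one executor before touching the next one."""
    used = [0] * num_executors
    for i in range(num_tasks):
        used[i // cores_per_executor] += 1
    return used

def busy_executors(assignment):
    """Executors running at least one task; the rest could be released
    by dynamic allocation once their idle timeout expires."""
    return sum(1 for c in assignment if c > 0)

if __name__ == "__main__":
    tasks, executors, cores = 40, 100, 8  # assumed fleet sized for the map phase
    print(busy_executors(round_robin(tasks, executors)))       # 40 executors, 1 task each
    print(busy_executors(packed(tasks, executors, cores)))     # 5 executors, fully packed
```

Under round-robin, 40 executors each hold a single task and cannot be released; under packing, the same work occupies 5 fully loaded executors, and the remaining 95 go idle and can be returned to YARN.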
> Packed scheduling for Spark tasks across executors
> --------------------------------------------------
>
>                 Key: SPARK-17637
>                 URL: https://issues.apache.org/jira/browse/SPARK-17637
>             Project: Spark
>          Issue Type: Improvement
>          Components: Scheduler
>            Reporter: Zhan Zhang
>            Assignee: Zhan Zhang
>            Priority: Minor
>
> Currently the Spark scheduler implements round-robin scheduling of tasks to
> executors, which is great as it distributes the load evenly across the
> cluster, but it leads to significant resource waste in some cases,
> especially when dynamic allocation is enabled.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)