[ https://issues.apache.org/jira/browse/SPARK-3174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14125172#comment-14125172 ]

Patrick Wendell commented on SPARK-3174:
----------------------------------------

Hey Sandy - thanks for posting the design here. This proposes moving blocks off
of executors before they are decommissioned. This might cause issues for
long-running or ETL-style workloads, since the accumulated state on the
machines could be very large (e.g. gigabytes of data). Another approach would
be to use the YARN shuffle service directly to decouple the shuffle data from
Spark executors. Just wanted to mention it as a possibility. I think Andrew is
looking at this in parallel as well.
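
A hedged sketch of what that decoupling looks like from the application side,
assuming the external-shuffle-service style of configuration (the property
names below reflect that approach generally and are not something this design
doc specifies):

    import org.apache.spark.{SparkConf, SparkContext}

    // Hedged sketch: host shuffle data in a long-lived service on each YARN
    // NodeManager so map output survives executor decommission.
    val conf = new SparkConf()
      .setAppName("elastic-app")
      // Serve shuffle blocks from the node-local service instead of the
      // executor that wrote them.
      .set("spark.shuffle.service.enabled", "true")
      // Allow the executor count to grow and shrink with load.
      .set("spark.dynamicAllocation.enabled", "true")
      .set("spark.dynamicAllocation.minExecutors", "2")
      .set("spark.dynamicAllocation.maxExecutors", "50")
    val sc = new SparkContext(conf)

With shuffle files owned by the node rather than the executor, only cached RDD
blocks would still need migration before decommissioning an executor.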

> Under YARN, add and remove executors based on load
> --------------------------------------------------
>
>                 Key: SPARK-3174
>                 URL: https://issues.apache.org/jira/browse/SPARK-3174
>             Project: Spark
>          Issue Type: Improvement
>          Components: YARN
>    Affects Versions: 1.0.2
>            Reporter: Sandy Ryza
>            Assignee: Andrew Or
>         Attachments: SPARK-3174design.pdf
>
>
> A common complaint with Spark in a multi-tenant environment is that 
> applications have a fixed allocation that doesn't grow and shrink with their 
> resource needs.  We're blocked on YARN-1197 for dynamically changing the 
> resources within executors, but we can still allocate and discard whole 
> executors.
> I think it would be useful to have some heuristics (roughly sketched below) that
> * Request more executors when many pending tasks are building up
> * Request more executors when RDDs can't fit in memory
> * Discard executors when few tasks are running / pending and there's not much 
> in memory
> Bonus points: migrate blocks from executors we're about to discard to 
> executors with free space.
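>
> A rough sketch of the first and third heuristics (all of the signal sources
> and allocator hooks here are hypothetical stand-ins, not existing Spark
> APIs):
>
>     // Hedged sketch of the request/discard heuristics. Every signal and
>     // hook is a stub; a real implementation would read from the scheduler
>     // backend and block manager, and talk to the YARN allocator.
>     object ScalingHeuristic {
>       val tasksPerExecutor = 8         // assumed task slots per executor
>       val lowWatermark     = 4         // "few tasks running / pending"
>       val cacheThresholdB  = 1L << 30  // "not much in memory": < 1 GB cached
>
>       // Stubbed load signals.
>       def pendingTasks(): Int = 0
>       def runningTasks(): Int = 0
>       def cachedBytes(): Long = 0L
>
>       // Stubbed allocator hooks.
>       def requestExecutors(n: Int): Unit = println(s"request $n executor(s)")
>       def releaseOneExecutor(): Unit = println("release 1 executor")
>
>       def rebalance(): Unit = {
>         val pending = pendingTasks()
>         if (pending > 0) {
>           // Pending tasks are building up: grow the allocation.
>           requestExecutors(math.ceil(pending.toDouble / tasksPerExecutor).toInt)
>         } else if (runningTasks() < lowWatermark &&
>                    cachedBytes() < cacheThresholdB) {
>           // Few tasks and little cached data: give an executor back.
>           releaseOneExecutor()
>         }
>       }
>     }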


