[ https://issues.apache.org/jira/browse/SPARK-21122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16209918#comment-16209918 ]
Craig Ingram commented on SPARK-21122:
--------------------------------------
Finally getting back around to this. Thanks for the feedback [~tgraves]. I
agree with pretty much everything you pointed out. Newer versions of YARN do
have a [Cluster Metrics
API|https://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/ResourceManagerRest.html#Cluster_Metrics_API],
though I believe it was only introduced in 2.8. Regardless, working with
YARN's PreemptionMessage is a better path forward than what I initially
proposed. I wish there were an elegant way to do this generically, but I
believe I can at least build an abstraction over the PreemptionMessage that
other RM clients could implement. For now, I will focus on a YARN-specific
solution.
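
A minimal sketch of what that abstraction might look like; PreemptionNotice,
YarnPreemptionAdapter, and executorIdFor are hypothetical names, while the
PreemptionMessage and contract types are the real YARN records API:

    import org.apache.hadoop.yarn.api.records.PreemptionMessage
    import scala.collection.JavaConverters._

    // RM-agnostic notice a scheduler backend could hand to the allocation policy.
    case class PreemptionNotice(
        forcedExecutorIds: Seq[String],     // the RM will reclaim these regardless
        negotiableExecutorIds: Seq[String]) // the app may release these voluntarily

    object YarnPreemptionAdapter {
      // Translate YARN's PreemptionMessage into the generic notice, given a
      // mapping from YARN container IDs to Spark executor IDs.
      def toNotice(msg: PreemptionMessage,
                   executorIdFor: String => Option[String]): PreemptionNotice = {
        val strict = Option(msg.getStrictContract)
          .map(_.getContainers.asScala.map(_.getId.toString).toSeq)
          .getOrElse(Seq.empty)
        val negotiable = Option(msg.getContract)
          .map(_.getContainers.asScala.map(_.getId.toString).toSeq)
          .getOrElse(Seq.empty)
        PreemptionNotice(strict.flatMap(executorIdFor),
                         negotiable.flatMap(executorIdFor))
      }
    }
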
> Address starvation issues when dynamic allocation is enabled
> ------------------------------------------------------------
>
> Key: SPARK-21122
> URL: https://issues.apache.org/jira/browse/SPARK-21122
> Project: Spark
> Issue Type: Improvement
> Components: Spark Core, YARN
> Affects Versions: 2.2.0, 2.3.0
> Reporter: Craig Ingram
> Attachments: Preventing Starvation with Dynamic Allocation Enabled.pdf
>
>
> When dynamic resource allocation is enabled on a cluster, it’s currently
> possible for one application to consume all the cluster’s resources,
> effectively starving any other application trying to start. This is
> particularly painful in a notebook environment where notebooks may be idle
> for tens of minutes while the user is figuring out what to do next (or eating
> their lunch). Ideally the application should give resources back to the
> cluster when monitoring indicates other applications are pending.
> Before delving into the specifics of the solution, there are some workarounds
> to this problem that are worth mentioning:
> * Set spark.dynamicAllocation.maxExecutors to a small value so that users
> are unlikely to consume the entire cluster even when many of them are doing
> work (see the configuration sketch after this list). This approach will hurt
> cluster utilization.
> * If using YARN, enable preemption and have each application (or
> organization) run in a separate queue. The downside is that when YARN
> preempts, it knows nothing about the executor it is killing: it is just as
> likely to kill a long-running executor with cached data as one that just
> spun up. Moreover, given a feature like
> https://issues.apache.org/jira/browse/SPARK-21097 (preserving cached data on
> executor decommission), YARN may not wait long enough between attempting a
> graceful shutdown and forcefully killing the executor. The blocks that
> belonged to that executor would then be lost and have to be recomputed.
> * Configure YARN to use the capacity scheduler with multiple scheduler
> queues, and put high-priority notebook users into a high-priority queue.
> This prevents high-priority users from being starved out by low-priority
> notebook users, but does not prevent users in the same priority class from
> starving each other.
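> As a point of reference, the first workaround above is purely configuration.
> A minimal sketch (the cap of 20 is illustrative; tune it per cluster):
>
>     import org.apache.spark.SparkConf
>
>     // Cap how much of the cluster any one application can hold on to.
>     val conf = new SparkConf()
>       .set("spark.dynamicAllocation.enabled", "true")
>       .set("spark.shuffle.service.enabled", "true") // needed for dynamic allocation on YARN
>       .set("spark.dynamicAllocation.maxExecutors", "20") // illustrative cap
>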
> Obviously any solution to this problem that depends on YARN would leave
> other resource managers out in the cold. The solution proposed in this
> ticket will afford Spark clusters the flexibility to hook in different
> resource allocation policies to fulfill their users' needs regardless of
> resource manager choice.
> Initially the focus will be on notebook environments. When many users share
> such an environment, the goal is fair resource allocation. Given that all
> users will be using the same memory configuration, this solution will focus
> primarily on fair sharing of cores.
> The fair resource allocation policy should initially pick executors to
> remove based on three factors: idleness, presence of cached data, and
> uptime. The policy will favor removing executors that are idle, short-lived,
> and have no cached data. It will only preemptively remove executors when
> there are pending applications or outstanding core requests (otherwise the
> default dynamic allocation timeout/removal process is followed). The policy
> will also allow an application's resource consumption to expand based on
> cluster utilization. For example, if there are 3 applications running but 2
> of them are idle, the policy will allow a busy application with pending
> tasks to consume more than 1/3 of the cluster's resources.
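> A minimal sketch of how this policy might rank removal candidates and
> compute the expandable share; every name here is hypothetical rather than
> an existing Spark API:
>
>     // Hypothetical per-executor snapshot the policy would consume.
>     case class ExecutorSnapshot(id: String,
>                                 idleMillis: Long,
>                                 uptimeMillis: Long,
>                                 cachedBlocks: Int)
>
>     // Favor removing executors that are idle, hold no cached data, and are
>     // short-lived: most idle first, then fewest cached blocks, then
>     // shortest uptime.
>     def removalOrder(execs: Seq[ExecutorSnapshot]): Seq[ExecutorSnapshot] =
>       execs.sortBy(e => (-e.idleMillis, e.cachedBlocks, e.uptimeMillis))
>
>     // Let a busy application expand past its static 1/N share when other
>     // applications are idle: 12 cores and 3 apps, but only 1 busy => 12.
>     def allowedCores(totalCores: Int, busyApps: Int): Int =
>       totalCores / math.max(1, busyApps)
>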
> More complexity could be added later to take advantage of task/stage
> metrics, histograms, and heuristics (e.g. favoring removal of executors
> whose in-flight tasks are quick to rerun). The important thing is to
> benchmark effectively before adding complexity so we can measure the impact
> of each change.