[ https://issues.apache.org/jira/browse/SPARK-31107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Dongjoon Hyun updated SPARK-31107:
----------------------------------
Affects Version/s: 3.1.0 (was: 3.0.0)
> Extend FairScheduler to support pool level resource isolation
> -------------------------------------------------------------
>
> Key: SPARK-31107
> URL: https://issues.apache.org/jira/browse/SPARK-31107
> Project: Spark
> Issue Type: Improvement
> Components: Spark Core
> Affects Versions: 3.1.0
> Reporter: liupengcheng
> Priority: Major
>
> Currently, Spark provides only two scheduling modes, FIFO and FAIR, and both
> show drawbacks in high-concurrency SQL scenarios:
> FIFO: a large SQL query that occupies all the resources can easily cause
> congestion.
> FAIR: the taskSets of one pool may occupy all the resources, because there is
> no hard limit on the maximum usage of each pool. This case is frequently
> met under high workloads.
> So we propose adding a maxShare argument to FairScheduler to control the
> maximum number of running tasks for each pool.
> One thing that needs attention is that the `ExecutorAllocationManager` must
> still be able to release resources:
> e.g. suppose we have 100 executors. If tasks are scheduled across all
> executors with a max concurrency of 50, there are cases where the executors
> never become idle and cannot be released.
> One idea is to bind executors to each pool, so that tasks are scheduled only
> on the executors of the pool they belong to.
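
The effect of the proposed per-pool cap can be sketched with a toy model in
Python. This is not Spark code: the `max_share` field, the pool names `etl` and
`adhoc`, and the `assign_slots` helper are all hypothetical, and the fairness
rule (lowest running/weight ratio first) is a simplification of Spark's actual
comparator, which also takes minShare into account.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Pool:
    name: str
    pending: int                      # tasks waiting to run
    weight: int = 1
    max_share: Optional[int] = None   # hypothetical hard cap on running tasks
    running: int = 0

    def schedulable(self) -> bool:
        # A pool can take a slot only if it has work and is under its cap.
        under_cap = self.max_share is None or self.running < self.max_share
        return self.pending > 0 and under_cap


def assign_slots(pools: List[Pool], free_slots: int) -> int:
    """Hand out free task slots one at a time; return how many stay unused."""
    while free_slots > 0:
        candidates = [p for p in pools if p.schedulable()]
        if not candidates:
            break
        # Simplified fairness: favour the lowest running/weight ratio,
        # breaking ties by pool name.
        pool = min(candidates, key=lambda p: (p.running / p.weight, p.name))
        pool.pending -= 1
        pool.running += 1
        free_slots -= 1
    return free_slots


etl = Pool("etl", pending=500, max_share=50)
adhoc = Pool("adhoc", pending=10)
unused = assign_slots([etl, adhoc], free_slots=100)
print(etl.running, adhoc.running, unused)  # prints: 50 10 40
```

In this run the capped pool stops at 50 running tasks and 40 slots stay unused.
Without `max_share`, the same pool would take every slot the other pool does
not claim. Whether the unused slots translate into releasable executors depends
on how tasks are placed, which is the concern raised in the description.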