[
https://issues.apache.org/jira/browse/SPARK-24375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16516411#comment-16516411
]
Mridul Muralidharan commented on SPARK-24375:
---------------------------------------------
[~jiangxb1987] A couple of comments based on the document and your elaboration
above:
* Is the 'barrier' logic pluggable, instead of only being a global sync point?
* Dynamic resource allocation (DRA) triggers allocation of additional resources
based on pending tasks. Hence *We may add a check of total available slots
before scheduling tasks from a barrier stage taskset.* does not necessarily
work in that context.
* Currently DRA in Spark allocates resources uniformly. Are we envisioning
changes as part of this effort to allocate heterogeneous executor resources
based on pending tasks (at least initially, for barrier support for GPUs)?
* How is fault tolerance handled w.r.t. waiting on incorrect barriers? Is there
any way to identify the barrier? Example:
{code}
try {
  ... snippet A ...
  // Barrier 1
  context.barrier()
  ... snippet B ...
} catch { ... }
... snippet C ...
// Barrier 2
context.barrier()
{code}
** In the face of exceptions, some tasks will wait on barrier 2 and others on
barrier 1, causing issues.
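To make the concern concrete, here is a minimal sketch of what "identifying
the barrier" could look like. It is a plain-Python simulation, not Spark code:
IdentifiedBarrierCoordinator, BarrierMismatch, first_barrier_reached and the
barrier IDs are all hypothetical names made up for illustration, and nothing
here is part of the proposed API. The assumption is that each barrier call
carries an explicit ID, so a coordinator can fail fast when tasks of the same
stage are blocked on different barriers.

```python
class BarrierMismatch(Exception):
    """Raised when tasks of one stage are waiting on different barriers."""


class IdentifiedBarrierCoordinator:
    """Hypothetical coordinator: every barrier call carries an explicit ID."""

    def __init__(self, num_tasks):
        self.num_tasks = num_tasks
        self.waiting = {}  # task_id -> barrier_id the task is blocked on

    def arrive(self, task_id, barrier_id):
        """Record an arrival; return True once all tasks reached this barrier."""
        self.waiting[task_id] = barrier_id
        if len(set(self.waiting.values())) > 1:
            # Tasks disagree about which sync point they reached, so the
            # barrier can never be released. Fail fast instead of hanging.
            raise BarrierMismatch(
                f"tasks are waiting on different barriers: {self.waiting}"
            )
        if len(self.waiting) == self.num_tasks:
            self.waiting.clear()  # release everyone, reset for the next barrier
            return True
        return False


def first_barrier_reached(fails_in_snippet_a):
    """Mirror of the example above: which barrier does a task hit first?"""
    try:
        if fails_in_snippet_a:
            raise RuntimeError("snippet A failed")
        return "barrier-1"  # normal path blocks on barrier 1
    except RuntimeError:
        pass  # the catch block swallows the error and skips barrier 1
    return "barrier-2"  # failed task falls through to barrier 2


coordinator = IdentifiedBarrierCoordinator(num_tasks=2)
coordinator.arrive(0, first_barrier_reached(False))  # task 0 waits on barrier 1
try:
    coordinator.arrive(1, first_barrier_reached(True))  # task 1 is at barrier 2
    mismatch_detected = False
except BarrierMismatch:
    mismatch_detected = True
```

With an anonymous global barrier, the two tasks in this scenario would simply
be counted as having reached "the" barrier even though they are at different
points in the code. Tagging each call with an ID is one possible way to turn
that into an explicit failure.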
> Design sketch: support barrier scheduling in Apache Spark
> ---------------------------------------------------------
>
> Key: SPARK-24375
> URL: https://issues.apache.org/jira/browse/SPARK-24375
> Project: Spark
> Issue Type: Story
> Components: Spark Core
> Affects Versions: 3.0.0
> Reporter: Xiangrui Meng
> Assignee: Jiang Xingbo
> Priority: Major
>
> This task is to outline a design sketch for the barrier scheduling SPIP
> discussion. It doesn't need to be a complete design before the vote. But it
> should at least cover both Scala/Java and PySpark.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)