[
https://issues.apache.org/jira/browse/SPARK-29153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Thomas Graves resolved SPARK-29153.
-----------------------------------
Fix Version/s: 3.1.0
Assignee: Thomas Graves
Resolution: Fixed
> ResourceProfile conflict resolution stage level scheduling
> ----------------------------------------------------------
>
> Key: SPARK-29153
> URL: https://issues.apache.org/jira/browse/SPARK-29153
> Project: Spark
> Issue Type: Story
> Components: Scheduler
> Affects Versions: 3.0.0
> Reporter: Thomas Graves
> Assignee: Thomas Graves
> Priority: Major
> Fix For: 3.1.0
>
>
> For the stage level scheduling, if a stage has ResourceProfiles from multiple
> RDD that conflict we have to resolve that conflict.
> We may have 2 approaches.
> # default to error out if conflicting, that way user realizes what is going
> on, have a config to turn this on and off.
> # If config to error out if off, then resolve the conflict. See below from
> the design doc on the SPIP.
> For the merge strategy we can choose the max from the ResourceProfiles to
> make the largest container required. This in general will work but there are
> a few cases people may have intended them to be a sum. For instance lets say
> one RDD needs X memory and another RDD needs Y memory. It might be when those
> get combined into a stage you really need X+Y memory vs the max(X, Y).
> Another example might be union, where you would want to sum the resources of
> each RDD. I think we can document what we choose for now and later on add in
> the ability to have other alternatives then max. Or perhaps we do need to
> change what we do either per operation or per resource type.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]