[
https://issues.apache.org/jira/browse/YUNIKORN-1217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Peter Bacsko updated YUNIKORN-1217:
-----------------------------------
Description:
When running a Spark workload with gang scheduling, the driver and executor
pods have different annotations.
It is critical that we process the driver first, because it has the task group
definitions. Based on
[https://yunikorn.apache.org/docs/next/user_guide/gang_scheduling/], the
executor only needs {{yunikorn.apache.org/taskGroupName}}.
So when we add the pods in the recovery code path, we have to start with the
driver.
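
A rough sketch of the ordering idea, not the actual shim code: sort the recovered pods so that any pod carrying task group definitions comes before the rest. The {{Pod}} struct, the helper names, and the {{yunikorn.apache.org/task-groups}} annotation key are assumptions for illustration only; the real recovery path works on Kubernetes pod objects, and the key should be checked against the gang scheduling guide.

```go
package main

import (
	"fmt"
	"sort"
)

// Pod is a hypothetical, stripped-down stand-in for a Kubernetes pod object.
type Pod struct {
	Name        string
	Annotations map[string]string
}

// taskGroupsKey is ASSUMED to be the annotation that carries the task group
// definitions; verify the exact key against the gang scheduling guide.
const taskGroupsKey = "yunikorn.apache.org/task-groups"

// hasTaskGroupDefs reports whether the pod carries task group definitions.
// For a Spark workload this is expected to be true only for the driver.
func hasTaskGroupDefs(p Pod) bool {
	_, ok := p.Annotations[taskGroupsKey]
	return ok
}

// orderForRecovery returns a copy of pods with definition-carrying pods first.
// The sort is stable, so the relative order inside each group is preserved.
func orderForRecovery(pods []Pod) []Pod {
	ordered := append([]Pod(nil), pods...)
	sort.SliceStable(ordered, func(i, j int) bool {
		return hasTaskGroupDefs(ordered[i]) && !hasTaskGroupDefs(ordered[j])
	})
	return ordered
}

func main() {
	pods := []Pod{
		{Name: "spark-exec-1", Annotations: map[string]string{}},
		{Name: "spark-driver", Annotations: map[string]string{taskGroupsKey: "[]"}},
		{Name: "spark-exec-2", Annotations: map[string]string{}},
	}
	for _, p := range orderForRecovery(pods) {
		fmt.Println(p.Name)
	}
}
```

With the sample input above, the driver is listed first and the two executors keep their original relative order. A stable sort is used so that recovery does not reshuffle pods beyond what the driver-first requirement needs.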
> Ensure that Spark driver pod is processed before executor pods during recovery
> ------------------------------------------------------------------------------
>
> Key: YUNIKORN-1217
> URL: https://issues.apache.org/jira/browse/YUNIKORN-1217
> Project: Apache YuniKorn
> Issue Type: Sub-task
> Components: shim - kubernetes
> Reporter: Peter Bacsko
> Assignee: Peter Bacsko
> Priority: Major
>
> When running a Spark workload with gang scheduling, the driver and executor
> pods have different annotations.
> It is critical that we process the driver first, because it has the task
> group definitions. Based on
> [https://yunikorn.apache.org/docs/next/user_guide/gang_scheduling/], the
> executor only needs {{yunikorn.apache.org/taskGroupName}}.
> So when we add the pods in the recovery code path, we have to start with the
> driver.