[ 
https://issues.apache.org/jira/browse/YUNIKORN-582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17303735#comment-17303735
 ] 

Weiwei Yang commented on YUNIKORN-582:
--------------------------------------

Thanks [~ayubpathan]. I think this makes a lot of sense.
We could add a scheduling policy parameter to the taskGroup definition for the 
app that specifies whether the app wants "soft" or "hard" gang scheduling. 
The "hard" option is the current behavior. "Soft" means that if we can't 
obtain all resources for the gang members within a given time, we fall back to 
scheduling it as a non-gang app. [~wilfreds], [~kmarton], please let me know 
if this makes sense.
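
To make the proposal concrete, here is a minimal sketch of the hard/soft 
decision, written in Python purely for illustration. The names GangStyle, 
GangRequest, decide, and the timeout field are hypothetical. They are not 
part of YuniKorn's API, and the real scheduler logic may look quite different.

```python
from dataclasses import dataclass
from enum import Enum


class GangStyle(Enum):
    """Hypothetical policy values for the proposed taskGroup parameter."""
    HARD = "hard"  # current behavior: fail the app if the gang cannot be satisfied
    SOFT = "soft"  # proposed: fall back to scheduling the app as a non-gang app


@dataclass
class GangRequest:
    """Illustrative snapshot of an app's gang state (all fields are assumptions)."""
    members_required: int    # gang members the task groups ask for
    members_allocated: int   # gang members that have resources so far
    waited_seconds: float    # time spent waiting for the gang
    timeout_seconds: float   # the "given time" after which we stop waiting
    style: GangStyle = GangStyle.HARD


def decide(req: GangRequest) -> str:
    """Return the action the scheduler would take for this request."""
    if req.members_allocated >= req.members_required:
        return "run_as_gang"        # every gang member got its resources
    if req.waited_seconds < req.timeout_seconds:
        return "keep_waiting"       # still inside the allowed time
    if req.style is GangStyle.SOFT:
        return "fallback_non_gang"  # soft: schedule as a normal app
    return "fail_app"               # hard: keep today's behavior
```

With this sketch, an app that times out with only part of its gang allocated 
is failed under "hard" and scheduled as a regular app under "soft".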

> Consider a fallback mechanism to schedule the app in case of gang failure 
> instead of rejecting the app.
> ------------------------------------------------------------------------------------------------------
>
>                 Key: YUNIKORN-582
>                 URL: https://issues.apache.org/jira/browse/YUNIKORN-582
>             Project: Apache YuniKorn
>          Issue Type: Sub-task
>          Components: core - scheduler
>            Reporter: Ayub Pathan
>            Priority: Major
>
> In cases where the app encounters gang issues because placeholder pod 
> allocation failed (for various reasons), YuniKorn currently marks the app as 
> failed. 
> Instead, consider a configurable option for hard or soft gang scheduling 
> that allows a fallback mechanism to schedule the app successfully. This needs 
> to be brainstormed to see if it makes sense. Let us use this Jira for 
> documenting all the thoughts.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
