cheersyang commented on a change in pull request #61:
URL: 
https://github.com/apache/incubator-yunikorn-site/pull/61#discussion_r660195875



##########
File path: docs/user_guide/gang_scheduling.md
##########
@@ -206,6 +214,64 @@ Annotations:
 Once the job is submitted to the scheduler, the job won’t be scheduled 
immediately.
 Instead, the scheduler will ensure it gets its minimal resources before 
actually starting the driver/executors. 
 
+## Gang scheduling Styles
+
+Initially when the app encountered gang issues due to placeholder pod 
allocation(failed due to various reasons), we marked the application failed 
without retrying it. This wasn’t a really user friendly experience, so it led 
to a demand of making the gangs scheduling style configurable and make it 
possible to succeed to schedule the app through a fallback mechanism.
+
+To solve this issue we defined two Gang scheduling styles: Soft and Hard.

Review comment:
       I feel we can simplify this to shorter lines, such as "Two gang 
scheduling styles are supported: Soft and Hard. The style is configured at 
the app level and defines how the app behaves if gang scheduling fails."

##########
File path: docs/user_guide/gang_scheduling.md
##########
@@ -206,6 +214,64 @@ Annotations:
 Once the job is submitted to the scheduler, the job won’t be scheduled 
immediately.
 Instead, the scheduler will ensure it gets its minimal resources before 
actually starting the driver/executors. 
 
+## Gang scheduling Styles
+
+Initially when the app encountered gang issues due to placeholder pod 
allocation(failed due to various reasons), we marked the application failed 
without retrying it. This wasn’t a really user friendly experience, so it led 
to a demand of making the gangs scheduling style configurable and make it 
possible to succeed to schedule the app through a fallback mechanism.
+
+To solve this issue we defined two Gang scheduling styles: Soft and Hard.
+
+- `Hard style`: when this style is used, we will have the initial behavior, 
more precisely if the application cannot be scheduled according to gang 
scheduling rules, and it times out, it will be marked as failed, without 
retrying to schedule it.

Review comment:
       When the app cannot be gang scheduled, it will be marked as failed 
without any retry to schedule it.

##########
File path: docs/user_guide/gang_scheduling.md
##########
@@ -101,6 +101,14 @@ could not schedule all the placeholder pods, it will 
eventually give up after a
 freed up and used by other apps. If non of the placeholders can be allocated, 
this timeout won't kick-in. To avoid the placeholder
 pods stuck forever, please refer to 
[troubleshooting](trouble_shooting.md#gang-scheduling) for solutions.
 
+` gangSchedulingStyle`
+
+Possible values: *Soft*, *Hard*

Review comment:
       Possible values -> Valid values

##########
File path: docs/user_guide/gang_scheduling.md
##########
@@ -206,6 +214,64 @@ Annotations:
 Once the job is submitted to the scheduler, the job won’t be scheduled 
immediately.
 Instead, the scheduler will ensure it gets its minimal resources before 
actually starting the driver/executors. 
 
+## Gang scheduling Styles
+
+Initially when the app encountered gang issues due to placeholder pod 
allocation(failed due to various reasons), we marked the application failed 
without retrying it. This wasn’t a really user friendly experience, so it led 
to a demand of making the gangs scheduling style configurable and make it 
possible to succeed to schedule the app through a fallback mechanism.
+
+To solve this issue we defined two Gang scheduling styles: Soft and Hard.
+
+- `Hard style`: when this style is used, we will have the initial behavior, 
more precisely if the application cannot be scheduled according to gang 
scheduling rules, and it times out, it will be marked as failed, without 
retrying to schedule it.
+- `Soft style`: using this style will make it possible to schedule a gang 
application as a normal, simple application if it cannot be scheduled and 
started by following the gang scheduling rules. This means that in case of the 
placeholder timeout the placeholders will be deleted and the application state 
will transition to Resuming state. After all the placeholders are deleted, the 
application will transition into Accepted state and the app’s pods will be 
scheduled according to the non-gang application scheduling logic.

Review comment:
       When the app cannot be gang scheduled, it will fall back to normal 
scheduling, and the non-gang scheduling strategy will be used to achieve 
best-effort scheduling. When this happens, the app transitions to the Resuming 
state and all the remaining placeholder pods will be cleaned up.
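
To make the difference between the two styles concrete, here is a minimal sketch of what happens on a placeholder timeout, based only on the descriptions quoted above. The names `Style`, `State`, and `on_placeholder_timeout` are hypothetical and are not taken from the YuniKorn code base; the real scheduler may structure this differently.

```python
from enum import Enum
from typing import List


class Style(Enum):
    """Hypothetical stand-in for the gangSchedulingStyle values."""
    HARD = "Hard"
    SOFT = "Soft"


class State(Enum):
    """Only the application states mentioned in the text above."""
    ACCEPTED = "Accepted"
    RESUMING = "Resuming"
    FAILED = "Failed"


def on_placeholder_timeout(style: Style) -> List[State]:
    """Return the states an app passes through after a placeholder timeout."""
    if style is Style.HARD:
        # Hard: the app is marked as failed and scheduling is not retried.
        return [State.FAILED]
    # Soft: the app is Resuming while the placeholders are deleted, then it
    # moves to Accepted and its pods are scheduled with non-gang logic.
    return [State.RESUMING, State.ACCEPTED]
```

With this sketch, `on_placeholder_timeout(Style.SOFT)` yields the Resuming and Accepted states in that order, while `Style.HARD` yields only Failed.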




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.