Si Latt created YUNIKORN-1076:
---------------------------------

             Summary: Robust handling of invalid Task Group annotation
                 Key: YUNIKORN-1076
                 URL: https://issues.apache.org/jira/browse/YUNIKORN-1076
             Project: Apache YuniKorn
          Issue Type: Bug
          Components: core - scheduler
            Reporter: Si Latt


For gang scheduling, task group information has to be defined in the annotation 
section. If the provided YAML for the task group info is invalid, for example 
because the double quotes around keys are missing, parsing fails and the 
exception is logged only in the YK log. The Kubernetes event log, however, 
gives no indication that an exception occurred during gang scheduling. Other 
gang scheduling events are still logged, which gives users the wrong 
impression that the pods were gang scheduled without any issue.

The current behavior is dangerous because it causes gang scheduling to stop 
working in production without users noticing any issue. We should probably 
take the following actions:
 # Reject / fail any app with an invalid task group annotation.
 # Emit the exception as a Kubernetes event, so that critical failures in 
the cluster are surfaced to users.
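
The two actions above could look roughly like the following sketch. It is 
illustrative only and is not the YuniKorn implementation: the annotation key, 
the "name" and "minMember" fields, the assumption that the annotation value 
is a JSON list, and the helpers parse_task_groups and admit_app are all 
assumptions made for this example. Actually posting the returned message as a 
Kubernetes event is left out.

```python
import json

# Assumed annotation key; the issue text does not name it.
TASK_GROUPS_ANNOTATION = "yunikorn.apache.org/task-groups"


class InvalidTaskGroupsError(ValueError):
    """Raised when the task group annotation cannot be used."""


def parse_task_groups(annotations):
    """Return the parsed task groups, or raise InvalidTaskGroupsError."""
    raw = annotations.get(TASK_GROUPS_ANNOTATION)
    if raw is None:
        return []  # no gang scheduling requested for this app
    try:
        groups = json.loads(raw)
    except json.JSONDecodeError as exc:
        # Covers the case from the report: keys without double quotes.
        raise InvalidTaskGroupsError(
            f"task group annotation is not valid JSON: {exc}"
        ) from exc
    if not isinstance(groups, list):
        raise InvalidTaskGroupsError("task group annotation must be a list")
    for index, group in enumerate(groups):
        if not isinstance(group, dict):
            raise InvalidTaskGroupsError(f"task group {index} is not an object")
        name = group.get("name")
        if not isinstance(name, str) or not name:
            raise InvalidTaskGroupsError(f"task group {index} has no name")
        members = group.get("minMember")
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(members, bool) or not isinstance(members, int) or members < 1:
            raise InvalidTaskGroupsError(
                f"task group {name!r} needs a positive integer minMember"
            )
    return groups


def admit_app(annotations):
    """Return (accepted, message); the message is meant for a Kubernetes event."""
    try:
        parse_task_groups(annotations)
    except InvalidTaskGroupsError as exc:
        return False, f"app rejected: {exc}"
    return True, ""
```

The point of the sketch is that a parse failure becomes an explicit rejection 
with a user-visible message instead of a line in the scheduler log.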



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

