Anna Faytelson created YUNIKORN-1247:
----------------------------------------

             Summary: Helm Hooks on Role/RoleBinding in rbac.yaml Causes Fresh 
Install Deployment Failure With helm upgrade --install
                 Key: YUNIKORN-1247
                 URL: https://issues.apache.org/jira/browse/YUNIKORN-1247
             Project: Apache YuniKorn
          Issue Type: Bug
          Components: deployment
            Reporter: Anna Faytelson
         Attachments: helm-hook-pre-install-failure.txt, 
success-helm-pre-install-hook.txt

YuniKorn is being deployed with helm upgrade --install with the --atomic flag 
set. On a fresh install of YuniKorn on a cluster, the RoleBinding for the 
scheduler could not be created because the Role had not been created 
beforehand, which caused the deployment to fail. Whether this failure also 
occurs with helm upgrade --install without the --atomic flag still needs to be 
investigated. Either way, if resources are rolled out in this order (service 
account, role binding, then role), the RoleBinding creation will always fail, 
because it needs a Role to bind to.

The "helm.sh/hook-weight" annotations on the Role and RoleBinding are both set 
to "1" in rbac.yaml for the scheduler, so they are treated the same, when in 
fact the Role needs to be created before the RoleBinding. The same applies to 
the Role and RoleBinding in admission-controller-rbac.yaml.

I have attached a file "helm-hook-pre-install-failure.txt" with the logs from 
the helm upgrade --install --atomic command where this failure was first seen. 
The lines in question are:

 
{code:java}
client.go:299: [debug] Starting delete for "yunikorn-rbac" RoleBinding
client.go:328: [debug] rolebindings.rbac.authorization.k8s.io "yunikorn-rbac" not found
client.go:128: [debug] creating 1 resource(s)
install.go:441: [debug] Install failed and atomic is set, uninstalling release
uninstall.go:95: [debug] uninstall: Deleting yunikorn
{code}
For debugging, I updated rbac.yaml and admission-controller-rbac.yaml in my 
copy of the YuniKorn chart to give the Role resources a "helm.sh/hook-weight" 
of "0", so that they are created before the RoleBindings, and left the 
"helm.sh/hook-weight" of the RoleBindings at "1". Once we packaged this chart 
and pushed it for use, the helm upgrade --install command succeeded. The log 
from that run is in the "success-helm-pre-install-hook.txt" file.
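
The weight change described above can be sketched roughly as follows. This is 
only an illustration: the resource names, rules, namespace and the pre-install 
hook type are assumptions and are not copied from the chart. Only the two 
hook-weight values reflect the change that was made.

{code:yaml}
# Illustrative only: names and rules are placeholders, not the chart's real values.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: example-scheduler-role          # hypothetical name
  annotations:
    "helm.sh/hook": pre-install         # assumed hook type
    "helm.sh/hook-weight": "0"          # lower weight, so the Role is created first
rules:
  - apiGroups: [""]
    resources: ["configmaps"]           # placeholder rule
    verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: example-scheduler-rolebinding   # hypothetical name
  annotations:
    "helm.sh/hook": pre-install         # assumed hook type
    "helm.sh/hook-weight": "1"          # unchanged, so it is created after the Role
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: example-scheduler-role          # must match the Role above
subjects:
  - kind: ServiceAccount
    name: example-scheduler-sa          # hypothetical name
    namespace: default                  # placeholder namespace
{code}

Helm runs hooks with lower weights first, so with these values the Role 
already exists by the time the RoleBinding is created.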

The Chart.yaml:
{code:java}
apiVersion: v2
name: yunikorn
description: YuniKorn scheduler for Kubernetes
type: application
version: 1.0.7
appVersion: 1.0.1
{code}
Thank you!



--
This message was sent by Atlassian Jira
(v8.20.7#820007)
