Peter Bacsko created YUNIKORN-1169:
--------------------------------------

             Summary: Fix ApplicationMetadata restoration during recovery
                 Key: YUNIKORN-1169
                 URL: https://issues.apache.org/jira/browse/YUNIKORN-1169
             Project: Apache YuniKorn
          Issue Type: Bug
          Components: shim - kubernetes
            Reporter: Peter Bacsko


The following code in {{general.go}} handles the recovery part:

{noformat}
        for _, pod := range appPods {
                log.Logger().Debug("Looking at pod for recovery candidates", zap.String("podNamespace", pod.Namespace), zap.String("podName", pod.Name))
                // general filter passes, and pod is assigned
                // this means the pod is already scheduled by scheduler for an existing app
                if utils.GeneralPodFilter(pod) && utils.IsAssignedPod(pod) {
                        if meta, ok := os.getAppMetadata(pod); ok {
                                podsRecovered++
                                log.Logger().Debug("Adding appID as recovery candidate", zap.String("appID", meta.ApplicationID))
                                if _, exist := existingApps[meta.ApplicationID]; !exist {
                                        existingApps[meta.ApplicationID] = meta
                                }
...
{noformat}

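The crucial part is the handling of the {{existingApps}} map. An entry is 
written only once per application ID, by the first pod encountered for that 
application; metadata from later pods of the same application is ignored. 
However, there is no guarantee that all pods have the same tags or 
ownerReferences, so the recovered metadata depends on which pod is processed 
first.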
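
To make the effect concrete, here is a minimal, self-contained sketch of the 
same "insert only if absent" pattern. It is not the shim code: the 
{{appMetadata}} type, the {{collectFirstWins}} function and the tag values are 
hypothetical stand-ins, assumed only for illustration.

```go
package main

import "fmt"

// appMetadata is a hypothetical, simplified stand-in for the metadata
// returned by getAppMetadata; only the fields needed here are kept.
type appMetadata struct {
	ApplicationID string
	Tags          map[string]string
}

// collectFirstWins mirrors the pattern from the snippet above: the first
// metadata seen for an application ID is stored, and whatever later pods
// of the same application carry is silently dropped.
func collectFirstWins(metas []appMetadata) map[string]appMetadata {
	existingApps := make(map[string]appMetadata)
	for _, meta := range metas {
		if _, exist := existingApps[meta.ApplicationID]; !exist {
			existingApps[meta.ApplicationID] = meta
		}
	}
	return existingApps
}

func main() {
	// Two pods of the same application whose tags differ (made-up values).
	metas := []appMetadata{
		{ApplicationID: "app-1", Tags: map[string]string{}},
		{ApplicationID: "app-1", Tags: map[string]string{"example/tag": "value"}},
	}
	recovered := collectFirstWins(metas)
	// One application is recovered, and it has none of the second pod's tags.
	fmt.Println(len(recovered), len(recovered["app-1"].Tags)) // prints "1 0"
}
```

If the two entries are swapped, the recovered application carries the tag 
instead, so the result depends on the iteration order of the pods.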

The scope of this JIRA is to analyze the possible side effects of this code and 
come up with a better solution. One bug caused by this behavior has already 
been identified (see YUNIKORN-1161).



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
