pbacsko commented on PR #918:
URL: https://github.com/apache/yunikorn-k8shim/pull/918#issuecomment-2401887556

   > I think we need to handle orphan pods for foreign pods similar to yunikorn 
ones. In other words, if a pod references a non-existent node, it needs to be 
"orphaned" and not sent to the core. If a node appears, any currently orphaned 
pods (foreign or otherwise) which reference the new node need to be adopted and 
registered with the core.
   
   @craigcondit I looked at the code in more details, here are my findings:
   1. Adding orphan foreign pod: handled inside `updateForeignPod()`:
   ```
   if oldPod == nil && utils.IsAssignedPod(pod) && !utils.IsPodTerminated(pod) {
                if ctx.schedulerCache.UpdatePod(pod) {
                        // pod was accepted by a real node
                        log.Log(log.ShimContext).Debug("pod is assigned to a 
node, trigger occupied resource update",
                                zap.String("namespace", pod.Namespace),
                                zap.String("podName", pod.Name),
                                zap.String("podStatusBefore", podStatusBefore),
                                zap.String("podStatusCurrent", 
string(pod.Status.Phase)))
                        allocReq := common.CreateAllocationForForeignPod(pod)
                        if err := 
ctx.apiProvider.GetAPIs().SchedulerAPI.UpdateAllocation(allocReq); err != nil {
                                log.Log(log.ShimContext).Error("failed to add 
foreign allocation to the core",
                                        zap.Error(err))
                        }
                } else {
                        // pod is orphaned (references an unknown node)
                        log.Log(log.ShimContext).Info("skipping occupied 
resource update for assigned orphaned pod",
                                zap.String("namespace", pod.Namespace),
                                zap.String("podName", pod.Name),
                                zap.String("nodeName", pod.Spec.NodeName))
                }
                return
        }
   ```
   2. Pod becomes assigned to a real node: handled together with YK pods 
(`adoptedPods` slice) 
   3. Orphan pod update: same as addition, handled in `updateForeignPod()`
   4. Deleting orphan pod: this is where probably the only change is necessary, 
don't send message to the core:
   ```
   func (ctx *Context) deleteForeignPod(pod *v1.Pod) {
        releaseReq := common.CreateReleaseRequestForForeignPod(string(pod.UID), 
constants.DefaultPartition)
        if err := 
ctx.apiProvider.GetAPIs().SchedulerAPI.UpdateAllocation(releaseReq); err != nil 
{ <--- don't send if orphan pod
                log.Log(log.ShimContext).Error("failed to remove foreign 
allocation from the core",
                        zap.Error(err))
        }
   
        log.Log(log.ShimContext).Debug("removing pod from cache", 
zap.String("podName", pod.Name))
        ctx.schedulerCache.RemovePod(pod)
   }
   ```
   
   Is there anything else?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@yunikorn.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Reply via email to