Peter Bacsko created YUNIKORN-1197:
--------------------------------------

             Summary: Placeholders are replaced during recovery
                 Key: YUNIKORN-1197
                 URL: https://issues.apache.org/jira/browse/YUNIKORN-1197
             Project: Apache YuniKorn
          Issue Type: Sub-task
          Components: shim - kubernetes
            Reporter: Peter Bacsko


When YuniKorn (YK) is restarted, some placeholders that are still running are
replaced immediately during recovery, even though the timeout has not yet
expired.

Example:
{noformat}
2022-04-27T11:43:47.145Z        INFO    cache/context_recovery.go:182   node state      {"nodeName": "minikube", "nodeState": "Healthy"}
2022-04-27T11:43:47.145Z        INFO    cache/context_recovery.go:196   nodes recovery is successful    {"recoveredNodes": 1}
2022-04-27T11:43:47.145Z        INFO    shim/scheduler.go:226   scheduler recovery succeed
2022-04-27T11:43:47.145Z        INFO    cache/nodes.go:238      scheduler node event    {"name": "minikube", "current state ": "New", "transition to ": "RecoverNode"}
2022-04-27T11:43:47.145Z        INFO    shim/scheduler.go:356   No outstanding apps found for a while   {"timeout": "2m0s"}
2022-04-27T11:43:47.145Z        INFO    cache/application.go:557        Skip the reservation stage      {"appID": "batch-sleep-job"}
2022-04-27T11:43:47.145Z        INFO    cache/context.go:318    trigger scheduler configuration reloading
2022-04-27T11:43:48.148Z        INFO    objects/application.go:585      Ask added successfully to application   {"appID": "batch-sleep-job", "ask": "ce3558cd-2a02-47d8-9bb7-93b2aadf9cc8", "placeholder": false, "pendingDelta": "map[memory:10000000 vcore:10]"}
2022-04-27T11:43:48.148Z        INFO    objects/application.go:585      Ask added successfully to application   {"appID": "batch-sleep-job", "ask": "c88d0bba-ef94-4728-ad54-da30f72646ee", "placeholder": false, "pendingDelta": "map[memory:10000000 vcore:10]"}
2022-04-27T11:43:48.148Z        INFO    objects/application.go:585      Ask added successfully to application   {"appID": "batch-sleep-job", "ask": "412d750d-f8c2-4b9c-a4cf-c7077c5384e1", "placeholder": false, "pendingDelta": "map[memory:10000000 vcore:10]"}
2022-04-27T11:43:48.148Z        INFO    objects/application.go:585      Ask added successfully to application   {"appID": "batch-sleep-job", "ask": "54607708-a8f3-4ff3-b73c-210111a54625", "placeholder": false, "pendingDelta": "map[memory:10000000 vcore:10]"}
2022-04-27T11:43:48.148Z        INFO    objects/application.go:585      Ask added successfully to application   {"appID": "batch-sleep-job", "ask": "f080aad1-6b08-4d83-8802-8dbf853a89cd", "placeholder": false, "pendingDelta": "map[memory:10000000 vcore:10]"}
2022-04-27T11:43:48.156Z        INFO    scheduler/partition.go:863      scheduler replace placeholder processed {"appID": "batch-sleep-job", "allocationKey": "ce3558cd-2a02-47d8-9bb7-93b2aadf9cc8", "UUID": "5f0d5e0d-0668-4297-82ba-c8ebb585b0f7", "placeholder released UUID": "312d7df9-000c-4035-9170-9ea96ef9e718"}
2022-04-27T11:43:48.156Z        INFO    scheduler/partition.go:863      scheduler replace placeholder processed {"appID": "batch-sleep-job", "allocationKey": "c88d0bba-ef94-4728-ad54-da30f72646ee", "UUID": "a80035cc-9751-4dc2-9a36-9a649ae50922", "placeholder released UUID": "a2c072c7-3814-4464-bcd8-64f3e3b79b4e"}
2022-04-27T11:43:48.156Z        INFO    scheduler/partition.go:863      scheduler replace placeholder processed {"appID": "batch-sleep-job", "allocationKey": "412d750d-f8c2-4b9c-a4cf-c7077c5384e1", "UUID": "126f996f-ca79-4895-a593-ffa51a6fc40e", "placeholder released UUID": "a48d2a0a-c9cc-446b-8f33-bf7952e5771c"}
2022-04-27T11:43:48.156Z        INFO    scheduler/partition.go:863      scheduler replace placeholder processed {"appID": "batch-sleep-job", "allocationKey": "54607708-a8f3-4ff3-b73c-210111a54625", "UUID": "d82c103e-85ce-4375-9448-75d251549326", "placeholder released UUID": "84e6a8bc-42ab-45da-9af6-c4067b2a3561"}
2022-04-27T11:43:48.156Z        INFO    scheduler/partition.go:863      scheduler replace placeholder processed {"appID": "batch-sleep-job", "allocationKey": "f080aad1-6b08-4d83-8802-8dbf853a89cd", "UUID": "2441b758-4c12-44d8-ab70-6c6b3fb100de", "placeholder released UUID": "ec4f534c-628a-4dc3-87d1-73f782da8c46"}
2022-04-27T11:43:48.156Z        INFO    cache/application.go:675        try to release pod from application     {"appID": "batch-sleep-job", "allocationUUID": "312d7df9-000c-4035-9170-9ea96ef9e718", "terminationType": "PLACEHOLDER_REPLACED"}
2022-04-27T11:43:48.168Z        INFO    cache/application.go:675        try to release pod from application     {"appID": "batch-sleep-job", "allocationUUID": "a2c072c7-3814-4464-bcd8-64f3e3b79b4e", "terminationType": "PLACEHOLDER_REPLACED"}
2022-04-27T11:43:48.174Z        INFO    cache/application.go:675        try to release pod from application     {"appID": "batch-sleep-job", "allocationUUID": "a48d2a0a-c9cc-446b-8f33-bf7952e5771c", "terminationType": "PLACEHOLDER_REPLACED"}
2022-04-27T11:43:48.180Z        INFO    cache/application.go:675        try to release pod from application     {"appID": "batch-sleep-job", "allocationUUID": "84e6a8bc-42ab-45da-9af6-c4067b2a3561", "terminationType": "PLACEHOLDER_REPLACED"}
2022-04-27T11:43:48.199Z        INFO    cache/application.go:675        try to release pod from application     {"appID": "batch-sleep-job", "allocationUUID": "ec4f534c-628a-4dc3-87d1-73f782da8c46", "terminationType": "PLACEHOLDER_REPLACED"}
2022-04-27T11:43:49.671Z        INFO    general/general.go:285  task completes  {"appType": "general", "namespace": "default", "podName": "tg-groupa-batch-sleep-job-3", "podUID": "84e6a8bc-42ab-45da-9af6-c4067b2a3561", "podStatus": "Failed"}
{noformat}
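
To put a number on "immediately", the gaps between the relevant log entries
can be computed from their timestamps. The following is only a minimal
sketch for reading the excerpt above: the helper names (parse_ts, elapsed_ms)
and the constants are hypothetical and are not part of YuniKorn; the
timestamps are copied from the log.

```python
from datetime import datetime, timedelta

# Timestamps copied verbatim from the log excerpt above.
RECOVERY_DONE = "2022-04-27T11:43:47.145Z"         # "scheduler recovery succeed"
ASK_ADDED = "2022-04-27T11:43:48.148Z"             # first "Ask added successfully"
PLACEHOLDER_REPLACED = "2022-04-27T11:43:48.156Z"  # first "replace placeholder processed"


def parse_ts(ts: str) -> datetime:
    """Parse the UTC timestamp format used at the start of each log line."""
    return datetime.strptime(ts, "%Y-%m-%dT%H:%M:%S.%fZ")


def elapsed_ms(start: str, end: str) -> int:
    """Return the whole number of milliseconds between two log timestamps."""
    return (parse_ts(end) - parse_ts(start)) // timedelta(milliseconds=1)


print("ask added -> placeholder replaced (ms):",
      elapsed_ms(ASK_ADDED, PLACEHOLDER_REPLACED))
print("recovery done -> placeholder replaced (ms):",
      elapsed_ms(RECOVERY_DONE, PLACEHOLDER_REPLACED))
```

For this excerpt the first gap is 8 ms and the second is 1011 ms, so the
replacement is processed about one second after recovery is reported as
successful.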



