[ 
https://issues.apache.org/jira/browse/YUNIKORN-2141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Qi Zhu updated YUNIKORN-2141:
-----------------------------
    Description: 
Details of the bug:
 # The real pod is created and waits to be scheduled after the placeholders are bound:

{"stream":"stdout","log":"2023-11-08T15:16:14.912Z\tINFO\tcache/task_state.go:380\tTask
 state transition\t\{\"app\": \"spark-28105bdfe17b494887c0c443f8a3ab0f\", 
\"task\": \"8837de6e-d888-4549-9baf-254c8a807421\", \"taskAlias\": 
\"dex-app-q5nslqd5/ogautaealleventsdynamicu2klogfm2-50-eb8bde8baf814091-driver\",
 \"source\": \"New\", \"destination\": \"Pending\", \"event\": \"InitTask\"}"}

{"stream":"stdout","log":"2023-11-08T15:16:14.912Z\tINFO\tcache/task_state.go:380\tTask
 state transition\t\{\"app\": \"spark-28105bdfe17b494887c0c443f8a3ab0f\", 
\"task\": \"8837de6e-d888-4549-9baf-254c8a807421\", \"taskAlias\": 
\"dex-app-q5nslqd5/ogautaealleventsdynamicu2klogfm2-50-eb8bde8baf814091-driver\",
 \"source\": \"Pending\", \"destination\": \"Scheduling\", \"event\": 
\"SubmitTask\"}"}



 # The scheduler processes the placeholder replacement and sends a release 
allocation request to the shim:

{"stream":"stdout","log":"2023-11-08T15:16:14.912Z\tINFO\tscheduler/partition.go:828\tscheduler
 replace placeholder processed\t\{\"appID\": 
\"spark-28105bdfe17b494887c0c443f8a3ab0f\", \"allocationKey\": 
\"8837de6e-d888-4549-9baf-254c8a807421\", \"uuid\": 
\"9508439d-60a2-404e-9c84-bd2c6783b5c7\", \"placeholder released uuid\": 
\"cc243ba1-7054-4b07-8344-6afb1424b1e0\"}"}

{"stream":"stdout","log":"2023-11-08T15:16:14.913Z\tINFO\tcache/application.go:637\ttry
 to release pod from application\t\{\"appID\": 
\"spark-28105bdfe17b494887c0c443f8a3ab0f\", \"allocationUUID\": 
\"cc243ba1-7054-4b07-8344-6afb1424b1e0\", \"terminationType\": 
\"PLACEHOLDER_REPLACED\"}"}



 # Meanwhile (about six seconds later, per the timestamps), preemption tries to 
preempt the allocation whose release request has already been sent:
{"stream":"stdout","log":"2023-11-08T15:16:20.870Z\tINFO\tobjects/preemption.go:563\tPreempting
 task\t\{\"applicationID\": \"spark-28105bdfe17b494887c0c443f8a3ab0f\", 
\"allocationKey\": \"e6e91651-7152-42f5-8504-355590fa0079\", \"nodeID\": 
\"ip-10-157-240-201.ec2.internal\", \"resources\": \"map[memory:3430940672 
pods:1 vcore:2100]\"}"}

{"stream":"stdout","log":"2023-11-08T15:16:20.871Z\tINFO\tcache/application.go:637\ttry
 to release pod from application\t\{\"appID\": 
\"spark-28105bdfe17b494887c0c443f8a3ab0f\", \"allocationUUID\": 
\"cc243ba1-7054-4b07-8344-6afb1424b1e0\", \"terminationType\": 
\"PREEMPTED_BY_SCHEDULER\"}"}


 # The pod is deleted, which triggers CompleteTask, and the allocation is 
removed with terminationType PREEMPTED_BY_SCHEDULER:
{"stream":"stdout","log":"2023-11-08T15:16:45.489Z\tINFO\tgeneral/general.go:204\tdelete
 pod\t\{\"appType\": \"general\", \"namespace\": \"dex-app-q5nslqd5\", 
\"podName\": \"tg-spark-driver-spark-28105bdfe17b494887c0c4-0\", \"podUID\": 
\"e6e91651-7152-42f5-8504-355590fa0079\"}"}

{"stream":"stdout","log":"2023-11-08T15:16:45.489Z\tINFO\tcache/task_state.go:380\tTask
 state transition\t\{\"app\": \"spark-28105bdfe17b494887c0c443f8a3ab0f\", 
\"task\": \"e6e91651-7152-42f5-8504-355590fa0079\", \"taskAlias\": 
\"dex-app-q5nslqd5/tg-spark-driver-spark-28105bdfe17b494887c0c4-0\", 
\"source\": \"Bound\", \"destination\": \"Completed\", \"event\": 
\"CompleteTask\"}"}

{"stream":"stdout","log":"2023-11-08T15:16:45.489Z\tINFO\tscheduler/partition.go:1245\tremoving
 allocation from application\t\{\"appID\": 
\"spark-28105bdfe17b494887c0c443f8a3ab0f\", \"allocationId\": 
\"cc243ba1-7054-4b07-8344-6afb1424b1e0\", \"terminationType\": 
\"PREEMPTED_BY_SCHEDULER\"}"}
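
The sequence above suggests that preemption should skip any allocation whose release has already been requested. The following is a minimal sketch of such a guard, not the YuniKorn implementation: the Allocation type, its Released flag, and the preemptionCandidates function are hypothetical names introduced only for illustration, and the second allocation ID is made up.

```go
package main

import "fmt"

// Allocation is a hypothetical stand-in for a scheduler allocation.
// It is NOT the YuniKorn type; the fields are assumptions for this sketch.
type Allocation struct {
	UUID        string
	Placeholder bool
	Released    bool // assumed to be set once a release request was sent to the shim
}

// preemptionCandidates returns the allocations that may still be considered
// as preemption victims. Allocations already marked as released are skipped,
// so a second release with a different termination type is never sent.
func preemptionCandidates(allocs []*Allocation) []*Allocation {
	var out []*Allocation
	for _, a := range allocs {
		if a.Released {
			continue // release already in flight, do not preempt again
		}
		out = append(out, a)
	}
	return out
}

func main() {
	allocs := []*Allocation{
		// The placeholder from the logs above, already released for replacement.
		{UUID: "cc243ba1-7054-4b07-8344-6afb1424b1e0", Placeholder: true, Released: true},
		// A made-up allocation that has not been released.
		{UUID: "hypothetical-other-allocation"},
	}
	for _, a := range preemptionCandidates(allocs) {
		fmt.Println(a.UUID)
	}
}
```

With a guard of this kind, the placeholder released at 15:16:14.913 would no longer be eligible as a victim when preemption runs at 15:16:20.870, so its termination type would stay PLACEHOLDER_REPLACED.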

  was:
The details about the bug:
 # The real pod created and waiting for scheduling
{"stream":"stdout","log":"2023-11-08T15:16:14.912Z\tINFO\tcache/task_state.go:380\tTask
 state transition\t\{\"app\": \"spark-28105bdfe17b494887c0c443f8a3ab0f\", 
\"task\": \"8837de6e-d888-4549-9baf-254c8a807421\", \"taskAlias\": 
\"dex-app-q5nslqd5/ogautaealleventsdynamicu2klogfm2-50-eb8bde8baf814091-driver\",
 \"source\": \"New\", \"destination\": \"Pending\", \"event\": \"InitTask\"}"}
{"stream":"stdout","log":"2023-11-08T15:16:14.912Z\tINFO\tcache/task_state.go:380\tTask
 state transition\t\{\"app\": \"spark-28105bdfe17b494887c0c443f8a3ab0f\", 
\"task\": \"8837de6e-d888-4549-9baf-254c8a807421\", \"taskAlias\": 
\"dex-app-q5nslqd5/ogautaealleventsdynamicu2klogfm2-50-eb8bde8baf814091-driver\",
 \"source\": \"Pending\", \"destination\": \"Scheduling\", \"event\": 
\"SubmitTask\"}"}
{"stream":"stdout","log":"2023-11-08T15:16:14.912Z\tINFO\tobjects/application.go:669\task
 added successfully to application\t\{\"appID\": 
\"spark-28105bdfe17b494887c0c443f8a3ab0f\", \"ask\": 
\"8837de6e-d888-4549-9baf-254c8a807421\", \"placeholder\": false, 
\"pendingDelta\": \"map[memory:3430940672 pods:1 vcore:2100]\"}"}


 #  


> Should not preempt placeholders which has been released
> -------------------------------------------------------
>
>                 Key: YUNIKORN-2141
>                 URL: https://issues.apache.org/jira/browse/YUNIKORN-2141
>             Project: Apache YuniKorn
>          Issue Type: Bug
>          Components: core - scheduler
>            Reporter: Qi Zhu
>            Assignee: Qi Zhu
>            Priority: Major



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
