[ 
https://issues.apache.org/jira/browse/YUNIKORN-2069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17779561#comment-17779561
 ] 

Craig Condit edited comment on YUNIKORN-2069 at 10/25/23 4:03 PM:
------------------------------------------------------------------

Hi [~rrajesh], I assigned this one to you. I think the general approach should be:
 # Create two queues: one with a very low guaranteed limit (500m CPU) and one 
with a very high guaranteed limit (much larger than a node size, say 5000 CPU).
 # Select a schedulable node from the cluster.
 # Schedule a number of small, low-priority pause tasks of 500m CPU each in the 
low-guarantee queue (enough to fill the selected node), with a node selector 
that only allows placement on that node. Wait for all tasks to reach the 
running state. Since a node may not have an exact multiple of 500m available, 
round down when calculating the number of tasks needed.
 # Schedule a larger (1000m CPU) high-priority task in the high-guarantee 
queue, with the same node selector.
 # Wait (perhaps up to 60 seconds) for the larger task to be scheduled. This 
should require several of the low-priority tasks to be preempted.
 # Shut down and clean up all tasks created.
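
As a rough illustration of the sizing arithmetic in steps 3 to 5, here is a minimal Go sketch. It is not taken from the YuniKorn e2e suite: the function names fillerTaskCount and minVictims are hypothetical, and the 3700m of available CPU is an invented figure. Only the 500m and 1000m request sizes come from the steps above.

```go
package main

import "fmt"

// Request sizes taken from the comment: 500m CPU per low-priority pause
// task and 1000m CPU for the high-priority task.
const (
	fillerMilliCPU    int64 = 500
	preemptorMilliCPU int64 = 1000
)

// fillerTaskCount (hypothetical helper) returns how many filler tasks fit
// into the available CPU on the node, rounding down.
func fillerTaskCount(availableMilliCPU, perTaskMilliCPU int64) int64 {
	if availableMilliCPU <= 0 || perTaskMilliCPU <= 0 {
		return 0
	}
	return availableMilliCPU / perTaskMilliCPU // integer division rounds down
}

// minVictims (hypothetical helper) returns the smallest number of filler
// tasks that must be preempted before the larger task fits on the node.
func minVictims(availableMilliCPU, perTaskMilliCPU, largeTaskMilliCPU int64) int64 {
	fillers := fillerTaskCount(availableMilliCPU, perTaskMilliCPU)
	if fillers == 0 {
		return 0
	}
	// CPU still free after the node has been filled with filler tasks.
	leftover := availableMilliCPU - fillers*perTaskMilliCPU
	needed := largeTaskMilliCPU - leftover
	if needed <= 0 {
		return 0 // the larger task fits without preemption
	}
	victims := (needed + perTaskMilliCPU - 1) / perTaskMilliCPU // round up
	if victims > fillers {
		victims = fillers
	}
	return victims
}

func main() {
	// Assumed value for illustration; a real test would read the node's
	// allocatable CPU minus what is already in use.
	available := int64(3700)
	fmt.Printf("fillers=%d victims=%d\n",
		fillerTaskCount(available, fillerMilliCPU),
		minVictims(available, fillerMilliCPU, preemptorMilliCPU))
}
```

The victim count is only a lower bound that a test could assert on. The scheduler may preempt more tasks than that.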



> Add e2e test covering YUNIKORN-2068
> -----------------------------------
>
>                 Key: YUNIKORN-2069
>                 URL: https://issues.apache.org/jira/browse/YUNIKORN-2069
>             Project: Apache YuniKorn
>          Issue Type: Test
>            Reporter: Craig Condit
>            Assignee: Rajesh Kanhaiya Lal
>            Priority: Major
>
> YUNIKORN-2068 fixed a deadlock that occurred during preemption. We need to 
> create a test case that can preempt tasks from a specific node (probably 
> using node selectors).


