[
https://issues.apache.org/jira/browse/YUNIKORN-2895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17886295#comment-17886295
]
Wilfred Spiegelenburg commented on YUNIKORN-2895:
-------------------------------------------------
There is something more broken. We should never see an ask that has already
been allocated at that point in the scheduling cycle. The application is
write locked a number of function calls before we get here, and it remains
locked. We filter all allocations inside that lock and only proceed with the
ones that have not been allocated.
The allocation should only be manipulated through the application, which
means the allocated flag cannot change while scheduling is in progress.
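To make that concrete, here is a minimal sketch of the locking pattern; the
names ({{Application}}, {{Ask}}, {{tryAllocate}}) are illustrative only, not
the actual core identifiers:
{code:go}
package scheduler

import "sync"

type Ask struct {
	allocationKey string
	allocated     bool
}

type Application struct {
	sync.RWMutex
	requests map[string]*Ask
}

// tryAllocate takes the write lock before filtering, so the allocated
// flag on an ask cannot flip while the scheduling cycle is in progress.
func (a *Application) tryAllocate() []*Ask {
	a.Lock()
	defer a.Unlock()
	var outstanding []*Ask
	for _, ask := range a.requests {
		if !ask.allocated { // filter inside the lock
			outstanding = append(outstanding, ask)
		}
	}
	return outstanding
}
{code}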
I think the issue is located in the maintenance of the {{sortedRequests}} on
the application. That list used to be rebuilt each cycle, but now we insert
into and delete from the slice. I think we broke things during recovery.
Recovery uses the same path as a node addition, so this *could* happen on
any node add, or maybe even on a simple add of a new ask.
When we call {{application.AddAllocationAsk}} we check that the object is
not allocated. That is always true, as we create a new ask object from the
SI, so we skip to the next step. This triggers a check for an already known
outstanding ask. If that outstanding ask is not allocated we replace the
object with the new one, and we make sure we update the pending resources
on the queues and the application if those have changed.
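A sketch of that flow, using the same illustrative types extended with a
pending amount (resource accounting reduced to a single int64). The
unconditional map write at the end is where the issues below originate:
{code:go}
package scheduler

import "sync"

type Ask struct {
	allocationKey string
	allocated     bool
	pending       int64
}

type Application struct {
	sync.RWMutex
	requests map[string]*Ask
	pending  int64 // aggregate pending resources on queue and app
}

func (a *Application) AddAllocationAsk(newAsk *Ask) {
	a.Lock()
	defer a.Unlock()
	// the new ask is built fresh from the SI, so this always passes
	if newAsk.allocated {
		return
	}
	if old, ok := a.requests[newAsk.allocationKey]; ok && !old.allocated {
		// known outstanding ask: update pending for the difference
		a.pending += newAsk.pending - old.pending
	} else if !ok {
		// brand new ask: all of its resources are pending
		a.pending += newAsk.pending
	}
	// when the old ask IS allocated we reach this overwrite without any
	// pending adjustment: the first issue below
	a.requests[newAsk.allocationKey] = newAsk
}
{code}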
{*}First issue{*}: if the old ask _IS_ allocated we still replace it with
the new one in the requests map, and we skip the adjustment of the pending
resources that uses the already registered ask. This is where it breaks
down: the requests map should never contain already allocated objects. It
means we have a reference leak, and thus a memory leak: long after the
allocation is removed, a reference is kept in requests that does not get
removed until we clean up the application, so the GC cannot collect it. For
long running applications with lots of requests this can become significant.
{*}Second issue{*}: also caused by the replacement. The new object is not
marked allocated, which causes a big problem because we will try to
schedule it. We can now have an unallocated and an allocated object with
the same key, one in requests and one in allocations. After we schedule the
second one, the allocations list is updated and we lose the original info.
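A small, hypothetical illustration of that state: two distinct objects
share one allocation key, one in allocations and one in requests, and
scheduling the second one overwrites the original entry:
{code:go}
package main

import "fmt"

type Ask struct {
	allocationKey string
	allocated     bool
}

func main() {
	// state after the faulty replacement: two objects, one key
	orig := &Ask{allocationKey: "alloc-1", allocated: true}
	repl := &Ask{allocationKey: "alloc-1", allocated: false}

	allocations := map[string]*Ask{orig.allocationKey: orig}
	requests := map[string]*Ask{repl.allocationKey: repl}

	// the scheduler sees repl as outstanding and allocates it again
	repl.allocated = true
	allocations[repl.allocationKey] = repl

	// the original allocation info is gone from allocations, while
	// requests still pins the now allocated replacement
	fmt.Println(allocations["alloc-1"] == orig) // false
	fmt.Println(requests["alloc-1"].allocated)  // true
}
{code}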
{*}Third issue{*}: independent of the state, we proceed to add the ask to
the requests. The requests are stored in a map keyed on the allocation key,
which means we only ever track a single ask per key, never any duplicates.
The sorted requests, however, are a sorted slice of references to objects.
There is no check in the add into the sorted request slice that replaces an
existing entry: we will happily add a second object with the same key. Both
are then considered when scheduling, which can easily cause issues there.
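A sketch of the mismatch, with a plain sorted insert as described plus a
hypothetical guarded variant ({{insertSortedUnique}}); neither is the real
code nor a proposed patch:
{code:go}
package scheduler

import "sort"

type Ask struct {
	allocationKey string
	priority      int64
}

// insertSorted mirrors the described behaviour: a plain sorted insert
// with no check for an existing entry carrying the same key.
func insertSorted(sorted []*Ask, ask *Ask) []*Ask {
	i := sort.Search(len(sorted), func(j int) bool {
		return sorted[j].priority <= ask.priority // highest priority first
	})
	sorted = append(sorted, nil)
	copy(sorted[i+1:], sorted[i:])
	sorted[i] = ask
	return sorted
}

// insertSortedUnique drops any entry with the same allocation key first,
// keeping the slice consistent with the requests map.
func insertSortedUnique(sorted []*Ask, ask *Ask) []*Ask {
	for i, existing := range sorted {
		if existing.allocationKey == ask.allocationKey {
			sorted = append(sorted[:i], sorted[i+1:]...)
			break
		}
	}
	return insertSorted(sorted, ask)
}
{code}
Whatever the real fix looks like, the map and the slice need to be kept in
step on every add, replace and delete.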
This code for adding allocations and asks needs a proper review. Over time,
with multiple changes on top of each other, we have introduced issues here.
> Don't add duplicated allocation to node when the allocation ask fails
> ---------------------------------------------------------------------
>
> Key: YUNIKORN-2895
> URL: https://issues.apache.org/jira/browse/YUNIKORN-2895
> Project: Apache YuniKorn
> Issue Type: Bug
> Components: core - scheduler
> Reporter: Qi Zhu
> Assignee: Qi Zhu
> Priority: Critical
>
> When I revisit the new update allocation logic: a duplicated allocation
> can be added to the node when the allocation is already allocated, and we
> then try to add the allocation to the node again and do not revert it.