[ 
https://issues.apache.org/jira/browse/YUNIKORN-2646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17870352#comment-17870352
 ] 

Wilfred Spiegelenburg edited comment on YUNIKORN-2646 at 8/2/24 4:29 AM:
-------------------------------------------------------------------------

It is a false positive detection. The code explicitly prevents the case from 
happening. See [this 
comment|https://issues.apache.org/jira/browse/YUNIKORN-2646?focusedCommentId=17850240&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17850240],
 restated here in slightly different words:

The detector is not smart enough to understand this part of the logic and just 
sees the order:
 # FIRST: the Application A lock is taken, followed by the Application B lock.
 # SECOND: the Application B lock is taken, followed by the Application A lock.

That triggers the detection. This sequence is only possible because our code 
guarantees that, between FIRST and SECOND, all locks are released without 
exception. That guarantee cannot be expressed in the detector's rules.

BTW: running with deadlock detection in production is a really bad idea. It 
causes a lot of overhead.


> Deadlock detected during preemption
> -----------------------------------
>
>                 Key: YUNIKORN-2646
>                 URL: https://issues.apache.org/jira/browse/YUNIKORN-2646
>             Project: Apache YuniKorn
>          Issue Type: Bug
>          Components: core - scheduler
>            Reporter: Dmitry
>            Assignee: Wilfred Spiegelenburg
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 1.6.0, 1.5.2
>
>         Attachments: yunikorn-logs-lock.txt.gz, yunikorn-logs.txt.gz
>
>
> Hitting deadlocks in 1.5.1
> The log is attached


