[
https://issues.apache.org/jira/browse/YUNIKORN-2646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17870352#comment-17870352
]
Wilfred Spiegelenburg edited comment on YUNIKORN-2646 at 8/2/24 4:29 AM:
-------------------------------------------------------------------------
This is a false positive. The code explicitly prevents the case from
happening. See [this
comment|https://issues.apache.org/jira/browse/YUNIKORN-2646?focusedCommentId=17850240&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17850240],
restated here in slightly different words:
The detector is not smart enough to understand this part of the logic and just
sees the order:
# FIRST: Application A lock is taken followed by Application B.
# SECOND: Application B lock is taken followed by Application A.
That triggers the detection. The sequence is only possible because our code
guarantees that, without exception, all locks are released between FIRST and
SECOND. That guarantee cannot be expressed in the detector's rules.
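To illustrate why a purely order-based check reports this sequence, here is a minimal sketch in Go. It is not YuniKorn code and not the actual deadlock detector: the orderDetector type, its acquire and release methods, and the runSequence helper are all hypothetical names invented for this example. The sketch assumes a detector that only remembers which lock was taken while another was held, and flags the inverse order whenever it shows up later.

```go
package main

import (
	"fmt"
	"sync"
)

// orderDetector is a hypothetical, simplified lock-order checker.
// It remembers every (held, acquired) pair it has seen and has no
// notion of locks being fully released between two sequences.
type orderDetector struct {
	held []string           // locks currently held, in acquisition order
	seen map[[2]string]bool // ordered pairs observed so far
}

func newOrderDetector() *orderDetector {
	return &orderDetector{seen: make(map[[2]string]bool)}
}

// acquire records that name was taken and reports whether the inverse
// order (name first, then a currently held lock) was seen earlier.
func (d *orderDetector) acquire(name string) bool {
	inverted := false
	for _, h := range d.held {
		if d.seen[[2]string{name, h}] {
			inverted = true
		}
		d.seen[[2]string{h, name}] = true
	}
	d.held = append(d.held, name)
	return inverted
}

// release removes name from the set of held locks.
func (d *orderDetector) release(name string) {
	for i, h := range d.held {
		if h == name {
			d.held = append(d.held[:i], d.held[i+1:]...)
			return
		}
	}
}

// runSequence replays FIRST and SECOND on a single goroutine with real
// mutexes and returns whether the toy detector flagged an inversion.
func runSequence() bool {
	var appA, appB sync.Mutex
	d := newOrderDetector()
	flagged := false

	// FIRST: Application A lock is taken followed by Application B.
	appA.Lock()
	flagged = d.acquire("A") || flagged
	appB.Lock()
	flagged = d.acquire("B") || flagged
	appB.Unlock()
	d.release("B")
	appA.Unlock()
	d.release("A")

	// All locks are released here, so SECOND cannot deadlock with FIRST.

	// SECOND: Application B lock is taken followed by Application A.
	appB.Lock()
	flagged = d.acquire("B") || flagged
	appA.Lock()
	flagged = d.acquire("A") || flagged
	appA.Unlock()
	d.release("A")
	appB.Unlock()
	d.release("B")

	return flagged
}

func main() {
	fmt.Println("inversion reported:", runSequence())
}
```

The program runs to completion without blocking, yet the toy detector still reports an inversion, because it compares acquisition orders only and cannot take into account that every lock was released between the two sequences.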
BTW: running with deadlock detection in production is a really bad idea. It
causes a lot of overhead.
> Deadlock detected during preemption
> -----------------------------------
>
> Key: YUNIKORN-2646
> URL: https://issues.apache.org/jira/browse/YUNIKORN-2646
> Project: Apache YuniKorn
> Issue Type: Bug
> Components: core - scheduler
> Reporter: Dmitry
> Assignee: Wilfred Spiegelenburg
> Priority: Major
> Labels: pull-request-available
> Fix For: 1.6.0, 1.5.2
>
> Attachments: yunikorn-logs-lock.txt.gz, yunikorn-logs.txt.gz
>
>
> Hitting deadlocks in 1.5.1
> The log is attached