[ 
https://issues.apache.org/jira/browse/LUCENE-9661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17262219#comment-17262219
 ] 

Michael McCandless commented on LUCENE-9661:
--------------------------------------------

Whoa, thank you for digging to the root cause so quickly [~danmuzi]!

Do you know if this corresponds to a recent change?  It looks like 
{{BaseTermsEnum}} was factored out in Feb 2019.
{quote}However, it is difficult how to write a test for this issue.
Is there a good way?
{quote}
Hmm, I'm confused – it looks like your attached patch already has a 
{{testDeadlock}} test case?  Is it difficult to make that a real test (move it to 
a separate {{TestFooBar}} source)?  Or would that break the test, since it 
relies specifically on when the classloader initializes {{TermsEnum}} and 
{{BaseTermsEnum}}?
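
To make the ordering concern concrete, here is a minimal, self-contained sketch of how such a test might look. It assumes the deadlock follows the common pattern where a parent class's static initializer instantiates its own subclass; the exact cycle is not spelled out in this thread. {{Base}}, {{Derived}}, {{ClassInitDeadlockSketch}} and {{initConcurrently}} are hypothetical names, not Lucene classes, and the sleep is only there to widen the race window.

```java
import java.util.concurrent.CountDownLatch;

public class ClassInitDeadlockSketch {

  // Signals that Base's static initializer has started running.
  static final CountDownLatch BASE_INIT_STARTED = new CountDownLatch(1);

  // Hypothetical parent class: its static initializer instantiates a subclass.
  static class Base {
    static final Base EMPTY;

    static {
      BASE_INIT_STARTED.countDown();
      sleepQuietly(300);     // give the other thread time to start initializing Derived
      EMPTY = new Derived(); // requires Derived to be initialized
    }
  }

  // Hypothetical subclass: initializing it first requires initializing Base.
  static class Derived extends Base {}

  static void sleepQuietly(long millis) {
    try {
      Thread.sleep(millis);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
  }

  // Initializes Base and Derived from two threads; returns true only if both
  // threads finish within the timeout (applied to each join).
  static boolean initConcurrently(long timeoutMillis) {
    Thread a = new Thread(() -> new Base(), "init-base");
    Thread b = new Thread(() -> {
      try {
        BASE_INIT_STARTED.await(); // wait until thread a is inside Base's initializer
      } catch (InterruptedException e) {
        return;
      }
      new Derived();
    }, "init-derived");
    // Daemon threads, so a hung initializer does not keep the JVM alive.
    a.setDaemon(true);
    b.setDaemon(true);
    a.start();
    b.start();
    try {
      a.join(timeoutMillis);
      b.join(timeoutMillis);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    }
    return !a.isAlive() && !b.isAlive();
  }

  public static void main(String[] args) {
    System.out.println("both threads finished: " + initConcurrently(1000));
  }
}
```

A JVM initializes each class at most once, so a check like this only exercises the race if neither class has been initialized earlier in the same JVM. That is the reason moving it into a shared test suite could silently defeat it, and running it in a separate, freshly started JVM is one way around that.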

Did we create tests for the past couple of classloader deadlocks?

> Another classloader deadlock?
> -----------------------------
>
>                 Key: LUCENE-9661
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9661
>             Project: Lucene - Core
>          Issue Type: Bug
>    Affects Versions: master (9.0)
>            Reporter: Michael McCandless
>            Priority: Major
>         Attachments: deadlock_test.patch
>
>
> The {{java}} processes spawned by our Lucene nightly benchmarks sometimes 
> randomly hang, apparently while loading classes across threads, under 
> contention.
> I've opened [this {{luceneutil}} issue with some 
> details|https://github.com/mikemccand/luceneutil/issues/89], but 
> [~uschindler] suggested I open an issue here as well, since he has been 
> seeing this in CI builds too.
> It is rare, maybe once a week in the nightly benchmarks (which spawn many 
> {{java}} processes with many threads across 128 CPU cores).  It is clearly a 
> deadlock – when it strikes, the process hangs forever until I notice and 
> {{kill -9}} it.  I posted a couple of {{jstacks}} in the issue above.
> [~rcmuir] suggested using {{classcycle}} to maybe statically dig into 
> possible deadlocks ... I have not tried that yet.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org
