[ 
https://issues.apache.org/jira/browse/DAFFODIL-2704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17554561#comment-17554561
 ] 

Steve Lawrence commented on DAFFODIL-2704:
------------------------------------------

It looks like we might have some sort of deadlock related to our generated 
properties. Profiling shows multiple tests being run in parallel.

Here is the call tree of one test that is stuck:

!image-2022-06-15-08-34-12-731.png!

It seems to be hanging while trying to initialize BitOrder.MostSignificantBitFirst.

Here is a call tree of another test that is stuck.

!image-2022-06-15-08-36-21-352.png!

Note that this test is triggering the SimpleNamedServiceLoader initialization. 
It looks to be stuck initializing BitOrder while initializing 
BitOrder.LeastSignificantBitFirst.

So I wonder if there's a synchronization issue, where two things are trying 
to initialize the BitOrder properties at the same time and getting 
deadlocked?
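
For reference, here is a minimal sketch of how this kind of hang can arise on the JVM in general. It is not Daffodil code: the class names (InitDeadlockDemo, First, Second) are hypothetical, and it is written in plain Java rather than Scala, on the assumption that our generated property objects end up being initialized through ordinary JVM class initializers. Two classes reference each other from their static initializers, and two threads each start initializing a different one. A latch forces the unlucky interleaving so the hang is reproducible instead of intermittent.

```java
import java.util.concurrent.CountDownLatch;

// Hypothetical demo class, not part of Daffodil.
final class InitDeadlockDemo {
    // Released once both static initializers have started, so each thread
    // is already inside one class initializer before it touches the other class.
    static final CountDownLatch bothStarted = new CountDownLatch(2);

    static void rendezvous() {
        bothStarted.countDown();
        try {
            bothStarted.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    static class First {
        static {
            rendezvous();
            Second.touch(); // needs Second to be initialized
        }
        static String touch() { return "first"; }
    }

    static class Second {
        static {
            rendezvous();
            First.touch(); // needs First to be initialized
        }
        static String touch() { return "second"; }
    }

    // Returns true if both threads are still blocked after the join timeouts.
    static boolean runAndDetectHang() {
        Thread a = new Thread(() -> First.touch(), "init-first");
        Thread b = new Thread(() -> Second.touch(), "init-second");
        // Daemon threads, so the stuck initializers do not keep the JVM alive.
        a.setDaemon(true);
        b.setDaemon(true);
        a.start();
        b.start();
        try {
            a.join(1000);
            b.join(1000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return a.isAlive() && b.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("both threads stuck: " + runAndDetectHang());
    }
}
```

Each thread is in the middle of initializing one class and waits for the other thread to finish initializing the second, so neither can proceed. In a thread dump or profiler this shows up as threads parked inside class initialization, which looks similar to the call trees above. Whether this is actually what happens with BitOrder still needs to be confirmed.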

> sbt tests are intermittently hanging
> ------------------------------------
>
>                 Key: DAFFODIL-2704
>                 URL: https://issues.apache.org/jira/browse/DAFFODIL-2704
>             Project: Daffodil
>          Issue Type: Bug
>            Reporter: Steve Lawrence
>            Assignee: Steve Lawrence
>            Priority: Critical
>             Fix For: 3.4.0
>
>         Attachments: image-2022-06-15-08-34-12-731.png, 
> image-2022-06-15-08-36-21-352.png
>
>
> Starting at commit c81b213b02ac414a39290787cf2eea14d83fdc26, running 
> "sbt daffodil-io/test" multiple times will eventually cause it to hang.
> The cause isn't clear to me, but I've confirmed that the hang happens 
> somewhere in the SimpleNamedServiceLoader loadClass function. Seems likely 
> that it's getting stuck in the while loop and can't break out for some reason.
> I don't think the above commit caused the issue, but it did add a new call to 
> SimpleNamedServiceLoader, and it loads a lot more classes than the usual use 
> of this class. We're probably using service loaders incorrectly in this 
> function somehow.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)
