Ivan Daschinskiy created IGNITE-13564:
-----------------------------------------

             Summary: Improve SYSTEM_WORKER_BLOCKED reporting.
                 Key: IGNITE-13564
                 URL: https://issues.apache.org/jira/browse/IGNITE-13564
             Project: Ignite
          Issue Type: Improvement
    Affects Versions: 2.8.1, 2.9
            Reporter: Ivan Daschinskiy
            Assignee: Ivan Daschinskiy
             Fix For: 2.10


Currently, reporting of blocked system threads has major drawbacks.

1. Because a blocked system worker is detected by another thread, the failure 
handler does not receive full information about the problem. {{FailureContext}} 
has only two fields, {{type}} and {{err}}. The throwable {{err}} is created in 
the detecting thread, so the context of the actual problem is lost. 
2. The current implementation does not print the full stack trace of the 
blocked thread in {{org.apache.ignite.internal.worker.WorkersRegistry#onIdle}}. 
3. The current approach does not work when there is only one thread in the 
registry. This case is not checked, and it can cause the single thread to loop 
infinitely, calling {{onIdle}}.
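
To illustrate drawbacks 1 and 2, here is a minimal sketch in plain Java. It is 
not Ignite code: the class {{BlockedThreadTrace}} and its methods are 
hypothetical names used only for this example. It shows that a throwable 
created in a detecting thread can be given the stack of the blocked thread, and 
that every frame of that stack can be printed without truncation.

```java
/** Illustrative helper, not part of Ignite: works with the stack of a thread other than the caller. */
final class BlockedThreadTrace {
    private BlockedThreadTrace() {
        // Static helpers only.
    }

    /**
     * Creates a throwable in the calling (detecting) thread, then replaces its
     * stack trace with the one of the blocked thread. Without the replacement,
     * the throwable would describe the detector, not the blocked worker.
     */
    static Throwable describe(Thread blocked) {
        Throwable err = new IllegalStateException("Blocked system worker: " + blocked.getName());

        err.setStackTrace(blocked.getStackTrace());

        return err;
    }

    /** Formats every frame of the given thread's current stack, with no frame limit. */
    static String fullDump(Thread t) {
        StringBuilder sb = new StringBuilder();

        sb.append("Thread [name=").append(t.getName())
            .append(", state=").append(t.getState()).append("]\n");

        for (StackTraceElement frame : t.getStackTrace())
            sb.append("    at ").append(frame).append('\n');

        return sb.toString();
    }
}
```

Note that {{Thread#getStackTrace}} returns a snapshot, so the frames describe 
the blocked thread at the moment of detection and may be empty if the thread 
has already terminated.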

These drawbacks can lead to a complete loss of information about the blocked 
system thread.

I suggest:
1. Add another parameter to {{FailureContext}}, namely {{worker}}.
2. Fix thread dump printing.
3. Add an assertion for the case when there is only one system thread in the 
registry.
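
Suggestions 1 and 3 could look roughly like the sketch below. All names here 
({{SketchFailureType}}, {{SketchFailureContext}}, {{SketchRegistry}}, 
{{ensureDetectable}}) are hypothetical stand-ins invented for this example; 
they are not the real Ignite classes, and the actual change may be shaped 
differently.

```java
import java.util.Collection;

/** Hypothetical stand-in for a failure type enum; only the value named in this issue is listed. */
enum SketchFailureType {
    SYSTEM_WORKER_BLOCKED
}

/** Hypothetical failure context that carries the blocked worker in addition to type and err. */
final class SketchFailureContext {
    private final SketchFailureType type;
    private final Throwable err;
    private final Thread worker; // The proposed additional field.

    SketchFailureContext(SketchFailureType type, Throwable err, Thread worker) {
        this.type = type;
        this.err = err;
        this.worker = worker;
    }

    SketchFailureType type() {
        return type;
    }

    Throwable error() {
        return err;
    }

    /** Gives a failure handler direct access to the blocked worker's thread. */
    Thread worker() {
        return worker;
    }

    @Override public String toString() {
        return "SketchFailureContext [type=" + type + ", err=" + err
            + ", worker=" + (worker == null ? null : worker.getName()) + ']';
    }
}

/** Hypothetical registry-side check for the single-worker case. */
final class SketchRegistry {
    private SketchRegistry() {
        // Static helpers only.
    }

    /**
     * Fails fast when fewer than two workers are registered, because detection
     * in this sketch relies on one worker being checked by another. An explicit
     * throw is used so the check is active even when the JVM runs without -ea.
     */
    static void ensureDetectable(Collection<Thread> workers) {
        if (workers.size() < 2)
            throw new AssertionError("At least two system workers are required, found: " + workers.size());
    }
}
```

With such a field, a failure handler could build its report from 
{{worker()}} rather than from the stack stored in {{err}}.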



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
