[ 
https://issues.apache.org/jira/browse/IGNITE-9679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrey Kuznetsov updated IGNITE-9679:
-------------------------------------
    Description: 
The newly implemented liveness checks for critical worker threads should be 
described in the Ignite documentation. A brief description of the 
functionality follows.

An Ignite node has a number of critical worker threads that must be alive and 
responsive; otherwise the node's health is not guaranteed. These threads 
monitor each other periodically and track two aspects of the thread being 
checked:
- whether it is alive;
- whether it updates its internal heartbeat timestamp.
Whenever at least one of these conditions is violated, the checker thread logs 
an error and calls the currently configured {{FailureHandler}}.
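
The two checks above can be illustrated with a minimal sketch. This is not 
Ignite's implementation: the class {{LivenessSketch}}, its nested types, and 
the failure names {{TERMINATED}} and {{BLOCKED}} are hypothetical and exist 
only for illustration. It assumes each worker exposes its thread and a 
heartbeat timestamp, and that a single timeout decides when a live worker 
counts as blocked.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical sketch; none of these names come from the Ignite code base. */
class LivenessSketch {
    /** Kinds of liveness violations (illustrative names). */
    enum Failure { TERMINATED, BLOCKED }

    /** Stand-in for a failure handler callback. */
    interface Handler {
        void onFailure(String workerName, Failure failure);
    }

    /** A monitored worker: its thread plus the time of its last heartbeat. */
    static final class Worker {
        final String name;
        final Thread thread;
        final AtomicLong heartbeatTs;

        Worker(String name, Thread thread, long initialHeartbeatTs) {
            this.name = name;
            this.thread = thread;
            this.heartbeatTs = new AtomicLong(initialHeartbeatTs);
        }

        /** Called by the worker itself whenever it makes progress. */
        void updateHeartbeat(long now) {
            heartbeatTs.set(now);
        }
    }

    /** One pass of the checker over all workers, evaluated at time {@code now}. */
    static void check(List<Worker> workers, long now, long blockedTimeoutMs, Handler handler) {
        for (Worker w : workers) {
            if (!w.thread.isAlive()) {
                // Check 1: the thread is not running at all.
                handler.onFailure(w.name, Failure.TERMINATED);
            } else if (now - w.heartbeatTs.get() > blockedTimeoutMs) {
                // Check 2: the thread runs but has not updated its heartbeat in time.
                handler.onFailure(w.name, Failure.BLOCKED);
            }
        }
    }

    public static void main(String[] args) {
        List<String> events = new ArrayList<>();
        Thread self = Thread.currentThread(); // a thread that is certainly alive

        Worker healthy = new Worker("healthy", self, 1_000L);
        Worker stuck = new Worker("stuck", self, 0L);
        // A thread that was never started reports isAlive() == false.
        Worker dead = new Worker("dead", new Thread(() -> { }), 1_000L);

        check(Arrays.asList(healthy, stuck, dead), 1_200L, 500L,
            (name, failure) -> events.add(name + ":" + failure));

        System.out.println(events);
    }
}
```

In this sketch the handler is invoked for every violation; whether a real 
handler acts on a blocked worker is a separate policy decision.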

The {{IgniteConfiguration.SystemWorkerBlockedTimeout}} configuration property 
affects monitoring behavior. Monitoring settings can be changed at runtime via 
{{FailureHandlingMxBean}}.

By default, liveness checks are enabled, but detection of a blocked system 
worker does not lead to failure handler invocation; see 
{{FailureProcessor#getDefaultFailureHandler}}.


  was:
Newly implemented critical worker thread liveness checks should be mentioned in 
Ignite Documentation. Brief description of the functionality follows.

Ignite node has a number of critical worker threads that should be alive and 
responsive, otherwise node's health is not guaranteed. These threads monitor 
each other periodically and track two aspects for a thread being checked:
- whether it's alive;
- whether it updates its internal heartbeat timestamp.
Both checks use {{IgniteConfiguration.failureDetectionTimeout}} property as a 
threshold value.
Whenever at least one of the above conditions is violated, checker thread logs 
the error and calls currently configured {{FailureHandler}}.

Liveness checks are enabled by default, but can be disabled through 
{{WorkersControlMXBean.healthMonitoringEnabled}} property.



> Document critical workers liveness checking implementation
> ----------------------------------------------------------
>
>                 Key: IGNITE-9679
>                 URL: https://issues.apache.org/jira/browse/IGNITE-9679
>             Project: Ignite
>          Issue Type: Task
>          Components: documentation
>            Reporter: Andrey Kuznetsov
>            Priority: Major
>             Fix For: 2.7
>
>
> The newly implemented liveness checks for critical worker threads should be 
> described in the Ignite documentation. A brief description of the 
> functionality follows.
> An Ignite node has a number of critical worker threads that must be alive 
> and responsive; otherwise the node's health is not guaranteed. These threads 
> monitor each other periodically and track two aspects of the thread being 
> checked:
> - whether it is alive;
> - whether it updates its internal heartbeat timestamp.
> Whenever at least one of these conditions is violated, the checker thread 
> logs an error and calls the currently configured {{FailureHandler}}.
> The {{IgniteConfiguration.SystemWorkerBlockedTimeout}} configuration 
> property affects monitoring behavior. Monitoring settings can be changed at 
> runtime via {{FailureHandlingMxBean}}.
> By default, liveness checks are enabled, but detection of a blocked system 
> worker does not lead to failure handler invocation; see 
> {{FailureProcessor#getDefaultFailureHandler}}.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
