[ https://issues.apache.org/jira/browse/IGNITE-9679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Andrey Kuznetsov updated IGNITE-9679:
-------------------------------------
    Description:
Newly implemented critical worker thread liveness checks should be mentioned in the Ignite documentation. A brief description of the functionality follows.

An Ignite node has a number of critical worker threads that must be alive and responsive; otherwise the node's health is not guaranteed. These threads monitor each other periodically and track two aspects of the thread being checked:
- whether it is alive;
- whether it updates its internal heartbeat timestamp.

Whenever at least one of these conditions is violated, the checker thread logs the error and calls the currently configured {{FailureHandler}}. The {{IgniteConfiguration.SystemWorkerBlockedTimeout}} configuration property affects monitoring behavior. At runtime, monitoring settings can be changed via {{FailureHandlingMxBean}}.

By default, liveness checks are enabled, but detection of a blocked system worker does not lead to failure handler invocation; see {{FailureProcessor#getDefaultFailureHandler}}.

  was:
Newly implemented critical worker thread liveness checks should be mentioned in the Ignite documentation. A brief description of the functionality follows.

An Ignite node has a number of critical worker threads that must be alive and responsive; otherwise the node's health is not guaranteed. These threads monitor each other periodically and track two aspects of the thread being checked:
- whether it is alive;
- whether it updates its internal heartbeat timestamp.

Both checks use the {{IgniteConfiguration.failureDetectionTimeout}} property as a threshold value. Whenever at least one of these conditions is violated, the checker thread logs the error and calls the currently configured {{FailureHandler}}.

Liveness checks are enabled by default, but can be disabled through the {{WorkersControlMXBean.healthMonitoringEnabled}} property.

> Document critical workers liveness checking implementation
> ----------------------------------------------------------
>
>                 Key: IGNITE-9679
>                 URL: https://issues.apache.org/jira/browse/IGNITE-9679
>             Project: Ignite
>          Issue Type: Task
>          Components: documentation
>            Reporter: Andrey Kuznetsov
>            Priority: Major
>             Fix For: 2.7
>
>
> Newly implemented critical worker thread liveness checks should be mentioned
> in Ignite Documentation. Brief description of the functionality follows.
> Ignite node has a number of critical worker threads that should be alive and
> responsive, otherwise node's health is not guaranteed. These threads monitor
> each other periodically and track two aspects for a thread being checked:
> - whether it's alive;
> - whether it updates its internal heartbeat timestamp.
> Whenever at least one of the above conditions is violated, checker thread
> logs the error and calls currently configured {{FailureHandler}}.
> {{IgniteConfiguration.SystemWorkerBlockedTimeout}} configuration property
> affects monitoring behavior. At runtime monitoring settings can be changed
> via {{FailureHandlingMxBean}}.
> By default, liveness checks are enabled, but blocked system worker detection
> will not lead to failure handler invocation, see
> {{FailureProcessor#getDefaultFailureHandler}} .
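
For documentation purposes, the two checks described in the issue (thread liveness and heartbeat freshness) can be illustrated with a small standalone sketch. This is not Ignite's implementation: the class `WorkerLivenessSketch`, its nested `Worker` type, the `checkOnce` method, and the `Consumer<String>` standing in for a failure handler are all hypothetical names chosen for illustration, and the time is passed in explicitly so that the behavior is easy to follow.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Consumer;

/** Hypothetical watchdog illustrating the two liveness checks; not Ignite code. */
final class WorkerLivenessSketch {
    /** A monitored worker: its thread plus the last heartbeat timestamp it reported. */
    static final class Worker {
        final String name;
        final Thread thread;
        volatile long heartbeatTs;

        Worker(String name, Thread thread, long now) {
            this.name = name;
            this.thread = thread;
            this.heartbeatTs = now;
        }

        /** Called by the worker itself to signal that it is making progress. */
        void updateHeartbeat(long now) {
            heartbeatTs = now;
        }
    }

    private final Map<String, Worker> workers = new ConcurrentHashMap<>();
    private final long blockedTimeoutMs;
    private final Consumer<String> failureHandler; // stand-in for a failure handler

    WorkerLivenessSketch(long blockedTimeoutMs, Consumer<String> failureHandler) {
        this.blockedTimeoutMs = blockedTimeoutMs;
        this.failureHandler = failureHandler;
    }

    void register(Worker w) {
        workers.put(w.name, w);
    }

    /** Runs one round of checks at time {@code now}; returns the number of violations. */
    int checkOnce(long now) {
        int violations = 0;
        for (Worker w : workers.values()) {
            if (!w.thread.isAlive()) {
                // Check 1: the worker thread is not alive.
                failureHandler.accept(w.name + ": thread is not alive");
                violations++;
            } else if (now - w.heartbeatTs > blockedTimeoutMs) {
                // Check 2: the heartbeat timestamp is older than the timeout.
                failureHandler.accept(w.name + ": heartbeat not updated in time");
                violations++;
            }
        }
        return violations;
    }

    public static void main(String[] args) {
        List<String> failures = new ArrayList<>();
        WorkerLivenessSketch watchdog = new WorkerLivenessSketch(100L, failures::add);

        // The current thread serves as an always-alive worker for the demo.
        Worker w = new Worker("demo-worker", Thread.currentThread(), 0L);
        watchdog.register(w);

        watchdog.checkOnce(50L);   // heartbeat is fresh, nothing reported
        watchdog.checkOnce(500L);  // heartbeat is stale, handler is called
        failures.forEach(System.out::println);
    }
}
```

In this sketch a single timeout value plays the role that the issue attributes to {{IgniteConfiguration.SystemWorkerBlockedTimeout}}; how Ignite schedules the checks and which worker checks which is not covered here.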