[ https://issues.apache.org/jira/browse/KUDU-2287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16408880#comment-16408880 ]
Todd Lipcon commented on KUDU-2287:
-----------------------------------

This would also be very nice to include in ksck

> Add replica metric tracking time since there was a valid leader
> ---------------------------------------------------------------
>
>                 Key: KUDU-2287
>                 URL: https://issues.apache.org/jira/browse/KUDU-2287
>             Project: Kudu
>          Issue Type: New Feature
>          Components: ksck, metrics, supportability
>    Affects Versions: 1.7.0
>            Reporter: Todd Lipcon
>            Priority: Major
>
> Currently monitoring systems can report that the Kudu cluster is perfectly
> healthy when in fact some tablet has gotten "stuck" with no leader (eg due to
> some network connectivity problem or a bug). If we exposed a numeric metric
> on a tablet indicating the time since a replica was healthy, or number of
> failed election attempts, etc, we could easily monitor for this case.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)