[ https://issues.apache.org/jira/browse/IMPALA-7857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16692401#comment-16692401 ]

ASF subversion and git services commented on IMPALA-7857:
---------------------------------------------------------

Commit 93a0ce857f181f2fe4248252428fc2adfdf1bdb7 in impala's branch 
refs/heads/master from [~tarmstr...@cloudera.com]
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=93a0ce8 ]

IMPALA-7857: log more information about statestore failure detection

This adds a couple of log messages for state transitions in the
statestore's failure detector.

Testing:
Ran test_statestore.py and checked for presence of new log messages.

Added a new test to test_statestore that exercises handling of
intermittent heartbeat failures (required to produce one of the new log
messages).

Change-Id: Ie6ff85bee117000e4434dcffd3d1680a79905f14
Reviewed-on: http://gerrit.cloudera.org:8080/11937
Reviewed-by: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenk...@cloudera.com>


> Log more information about statestore failure detector
> ------------------------------------------------------
>
>                 Key: IMPALA-7857
>                 URL: https://issues.apache.org/jira/browse/IMPALA-7857
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Distributed Exec
>            Reporter: Tim Armstrong
>            Assignee: Tim Armstrong
>            Priority: Major
>              Labels: statestore, supportability
>
> For debugging heartbeat failures (or non-failures) it would be useful to log 
> enough information to infer the current state of the failure detector from 
> logs. Specifically:
> * Upon a failure, we should log the number of consecutive failures according 
> to the failure detector, and possibly also how many more failures remain 
> before it is considered failed.
> * We should log when the failure count is reset to 0 by a successful 
> heartbeat.
> Currently, if there are occasional failures, it's hard to tell with certainty 
> whether the failure count was reset correctly.
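
To make the two requested log messages concrete, here is a minimal Python sketch of a consecutive-failure detector that logs the failure count, the number of failures remaining before the threshold, and every reset to 0. It is not Impala's implementation: the class name MissedHeartbeatDetector, the max_missed parameter, the update() method and the message wording are all hypothetical, chosen only for illustration.

```python
import logging

logger = logging.getLogger("failure_detector_sketch")


class MissedHeartbeatDetector:
    """Hypothetical consecutive-failure detector; not Impala's actual class."""

    def __init__(self, max_missed):
        # Assumed threshold: consecutive failures before a peer counts as failed.
        self.max_missed = max_missed
        self.missed = {}  # peer id -> consecutive failed heartbeats

    def update(self, peer, seen):
        """Record one heartbeat result; return True if the peer is now failed."""
        previous = self.missed.get(peer, 0)
        if seen:
            # Log only real state transitions, so the reset is visible in logs.
            if previous > 0:
                logger.info("%s: heartbeat succeeded, resetting failure count "
                            "from %d to 0", peer, previous)
            self.missed[peer] = 0
            return False
        count = previous + 1
        self.missed[peer] = count
        remaining = max(self.max_missed - count, 0)
        logger.info("%s: %d consecutive failed heartbeat(s), %d more until "
                    "considered failed", peer, count, remaining)
        return count >= self.max_missed
```

With max_missed=3, the sequence fail, fail, success, fail leaves the count at 1 rather than 3, and the reset message between the second and third failure is what lets a reader of the logs confirm that.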



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
