Jean-Daniel Cryans created KUDU-1525:
----------------------------------------

             Summary: Create metrics for errors
                 Key: KUDU-1525
                 URL: https://issues.apache.org/jira/browse/KUDU-1525
             Project: Kudu
          Issue Type: Improvement
          Components: supportability
            Reporter: Jean-Daniel Cryans


One class of issue is hard to debug: operations that fail semi-silently on the 
client side. We currently have glog_warning_messages and glog_error_messages, 
but more granular metrics would help. A few I have in mind:
 - RPC errors, basically any "recv error".
 - server-level errors, such as when the server reports TOO BUSY.
 - any kind of insert rejection. Right now we have row key duplicates and 
memory pressure, but we're missing things like txn_tracker rejections and "not 
a leader".
 - Raft errors, such as dropping a follower because it is lagging too much and 
we no longer have the WALs around.

There are probably more, but the above would be a good start.
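
To make "more granular" concrete, here is a minimal sketch of one counter per 
error category, written in Python for brevity. The category names and the 
ErrorMetrics class are hypothetical and only mirror the list above; they are 
not existing Kudu metrics or APIs.

```python
from collections import Counter
from enum import Enum


class ErrorCategory(Enum):
    # Hypothetical names mirroring the categories proposed in this issue;
    # none of these are existing Kudu metric names.
    RPC_ERROR = "rpc_errors"
    SERVER_TOO_BUSY = "server_too_busy_errors"
    INSERT_REJECTION = "insert_rejections"
    RAFT_ERROR = "raft_errors"


class ErrorMetrics:
    """Keeps one monotonically increasing counter per error category."""

    def __init__(self):
        self._counts = Counter()

    def record(self, category, count=1):
        # Reject unknown categories so a typo cannot create a stray counter.
        if not isinstance(category, ErrorCategory):
            raise TypeError("category must be an ErrorCategory")
        if count < 0:
            raise ValueError("counters only go up")
        self._counts[category] += count

    def snapshot(self):
        # Report every category, including those still at zero, so that
        # dashboards see a stable set of metric names.
        return {c.value: self._counts[c] for c in ErrorCategory}


metrics = ErrorMetrics()
metrics.record(ErrorCategory.RPC_ERROR)            # e.g. a "recv error"
metrics.record(ErrorCategory.INSERT_REJECTION, 2)  # e.g. two rejected writes
print(metrics.snapshot())
```

Counting by category rather than by log severity is the point: a spike in 
insert rejections can then be told apart from a spike in RPC errors without 
reading the logs.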



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
