[jira] [Commented] (CASSANDRA-9793) Log when messages are dropped due to cross_node_timeout

Tyler Hobbs (JIRA) Fri, 07 Aug 2015 09:48:07 -0700

    [ 
https://issues.apache.org/jira/browse/CASSANDRA-9793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14662067#comment-14662067
 ]


Tyler Hobbs commented on CASSANDRA-9793:
----------------------------------------

The logging level for dropped messages was set to ERROR by this patch.  I've 
pushed a follow-up commit to lower it to INFO as 
{{4d1b8b418609bd2c431a0ea905a667ddaaa50bbe}}.

(The error-level logging was also causing a lot of dtests to fail.)

> Log when messages are dropped due to cross_node_timeout
> -------------------------------------------------------
>
>                 Key: CASSANDRA-9793
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9793
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Brandon Williams
>            Assignee: Stefania
>             Fix For: 2.1.9, 2.0.17, 2.2.1
>
>
> When a node has clock skew and cross node timeouts are enabled, there's no 
> indication that the messages were dropped due to the cross node timeout, just 
> that messages were dropped.  This can mistakenly lead you down a path of 
> troubleshooting a load shedding situation when really you just have clock 
> drift on one node.  This is also not simple to troubleshoot, since you have 
> to determine that this node will answer requests, but other nodes won't 
> answer requests from it.  If the problem goes away on a reboot (and the 
> machine does one-shot time sync, not continuous) it becomes even harder to 
> detect because you're left with a weird piece of evidence such as "it's fine 
> after a reboot, but comes back in about X days every time."
> It would help tremendously if there were a log message indicating how many 
> messages (don't need them broken down by type) were eagerly dropped due to 
> the cross node timeout.
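
For illustration only, the kind of accounting requested above could be sketched as follows. This is not the code from the patch: the class name {{CrossNodeDropCounter}}, its methods, and the wording of the log line are all hypothetical, and it assumes each message carries a timestamp taken on the sender's clock alongside a local arrival timestamp. The summary is logged at INFO, in line with the follow-up commit.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.logging.Logger;

/** Hypothetical sketch, not Cassandra's actual implementation. */
class CrossNodeDropCounter {
    private static final Logger logger =
            Logger.getLogger(CrossNodeDropCounter.class.getName());

    private final long timeoutMillis;
    private final AtomicLong internalDrops = new AtomicLong();
    private final AtomicLong crossNodeDrops = new AtomicLong();

    CrossNodeDropCounter(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    /**
     * Returns true if the message has expired and should be dropped.
     *
     * @param sentAtMillis     timestamp from the sender's clock (may be skewed)
     * @param receivedAtMillis local clock when the message arrived
     * @param nowMillis        local clock when the message is processed
     */
    boolean recordIfExpired(long sentAtMillis, long receivedAtMillis, long nowMillis) {
        // Expired using only the local clock: genuine queueing delay.
        if (nowMillis - receivedAtMillis > timeoutMillis) {
            internalDrops.incrementAndGet();
            return true;
        }
        // Expired only when measured from the sender's timestamp:
        // this is where clock skew between nodes shows up.
        if (nowMillis - sentAtMillis > timeoutMillis) {
            crossNodeDrops.incrementAndGet();
            return true;
        }
        return false;
    }

    long internalDrops() { return internalDrops.get(); }

    long crossNodeDrops() { return crossNodeDrops.get(); }

    /** Logs a summary at INFO and resets both counters; returns null if nothing was dropped. */
    String summarizeAndReset() {
        long internal = internalDrops.getAndSet(0);
        long cross = crossNodeDrops.getAndSet(0);
        if (internal == 0 && cross == 0) {
            return null;
        }
        String msg = String.format(
                "%d messages dropped: %d for internal timeout and %d for cross node timeout",
                internal + cross, internal, cross);
        logger.info(msg);
        return msg;
    }
}
```

Keeping the two counters separate is the point of the sketch: a non-zero cross node count alongside a zero internal count suggests clock drift rather than load shedding.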



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
