[ https://issues.apache.org/jira/browse/CASSANDRA-9793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14662067#comment-14662067 ]
Tyler Hobbs commented on CASSANDRA-9793:
----------------------------------------
This patch set the logging level for dropped messages to ERROR. I've
pushed a follow-up commit, {{4d1b8b418609bd2c431a0ea905a667ddaaa50bbe}},
that lowers it to INFO.
(The ERROR-level logging was also causing a lot of dtests to fail.)
> Log when messages are dropped due to cross_node_timeout
> -------------------------------------------------------
>
> Key: CASSANDRA-9793
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9793
> Project: Cassandra
> Issue Type: Improvement
> Components: Core
> Reporter: Brandon Williams
> Assignee: Stefania
> Fix For: 2.1.9, 2.0.17, 2.2.1
>
>
> When a node has clock skew and cross-node timeouts are enabled, there's no
> indication that messages were dropped because of the cross-node timeout, only
> that messages were dropped. This can mistakenly lead you down a path of
> troubleshooting a load-shedding situation when really you just have clock
> drift on one node. It is also not simple to troubleshoot, since you have
> to determine that this node will answer requests, but other nodes won't
> answer requests from it. If the problem goes away on a reboot (and the
> machine does a one-shot time sync, not a continuous one) it becomes even
> harder to detect, because you're left with a weird piece of evidence such as
> "it's fine after a reboot, but comes back in about X days every time."
> It would help tremendously if there were a log message indicating how many
> messages (they don't need to be broken down by type) were eagerly dropped due
> to the cross-node timeout.
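
For illustration only, the behaviour requested above could be sketched roughly as follows. This is not the Cassandra patch: the class name DropCounter, its methods, the millisecond timestamps, and the log wording are all hypothetical, and the sketch assumes that with the cross-node timeout enabled a message's age is measured from the sender's timestamp rather than from the local arrival time.

```python
import logging

logger = logging.getLogger("dropped_messages")


class DropCounter:
    """Hypothetical helper: counts timed-out messages by the clock that aged them."""

    def __init__(self, timeout_ms):
        self.timeout_ms = timeout_ms
        self.internal_dropped = 0
        self.cross_node_dropped = 0

    def should_drop(self, sent_at_ms, received_at_ms, now_ms, cross_node_timeout):
        # Assumption: with cross-node timeouts enabled, the age is computed from
        # the sender's timestamp, so clock skew between nodes inflates it.
        start_ms = sent_at_ms if cross_node_timeout else received_at_ms
        if now_ms - start_ms <= self.timeout_ms:
            return False
        if cross_node_timeout:
            self.cross_node_dropped += 1
        else:
            self.internal_dropped += 1
        return True

    def log_and_reset(self):
        # Emit one summary line per interval at INFO, then clear the counters.
        counts = (self.internal_dropped, self.cross_node_dropped)
        if sum(counts) > 0:
            logger.info(
                "%d messages dropped: %d for internal timeout, %d for cross-node timeout",
                sum(counts), counts[0], counts[1],
            )
        self.internal_dropped = 0
        self.cross_node_dropped = 0
        return counts
```

With a 2000 ms timeout, a message stamped 1000 by a sender whose clock runs 5 s slow and received locally at 6000 would be dropped when the cross-node timeout is enabled but kept when it is not. A separate cross-node count makes that case distinguishable from ordinary load shedding.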