Brandon Williams created CASSANDRA-9793:
-------------------------------------------
Summary: Log when messages are dropped due to cross_node_timeout
Key: CASSANDRA-9793
URL: https://issues.apache.org/jira/browse/CASSANDRA-9793
Project: Cassandra
Issue Type: Improvement
Components: Core
Reporter: Brandon Williams
Fix For: 2.1.x, 2.0.x
When a node has clock skew and cross node timeouts are enabled, there's no
indication that messages were dropped due to the cross node timeout, just that
messages were dropped. This can mistakenly lead you down a path of
troubleshooting a load shedding situation when really you just have clock drift
on one node. This is also not simple to troubleshoot, since you have to
determine that this node will answer requests, but other nodes won't answer
requests from it. If the problem goes away on a reboot (and the machine does
one-shot time sync, not continuous) it becomes even harder to detect because
you're left with a weird piece of evidence such as "it's fine after a reboot,
but comes back in about X days every time."
It would help tremendously if there were a log message indicating how many
messages (don't need them broken down by type) were eagerly dropped due to the
cross node timeout.
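
As a rough illustration of the kind of accounting being asked for, here is a
minimal sketch. It is not Cassandra's actual code: the class
CrossNodeDropCounter, its methods, and the wording of the log line are all
hypothetical, and it assumes a message counts as expired when the receiver's
clock minus the sender's timestamp exceeds the timeout.

```java
import java.util.concurrent.atomic.AtomicLong;

/**
 * Hypothetical helper (not part of Cassandra): counts messages that are
 * dropped on arrival because the sender's timestamp makes them look expired.
 */
final class CrossNodeDropCounter {
    private final long timeoutMillis;
    private final AtomicLong droppedSinceLastReport = new AtomicLong();

    CrossNodeDropCounter(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    /**
     * Returns true if the message should be dropped. The age is computed from
     * the sender's timestamp, so a sender whose clock runs behind makes its
     * messages look older than they really are.
     */
    boolean shouldDrop(long senderTimestampMillis, long localNowMillis) {
        boolean expired = localNowMillis - senderTimestampMillis > timeoutMillis;
        if (expired) {
            droppedSinceLastReport.incrementAndGet();
        }
        return expired;
    }

    /**
     * Returns a one-line summary and resets the counter, or null when nothing
     * was dropped since the last call. Not broken down by message type.
     */
    String drainReport() {
        long dropped = droppedSinceLastReport.getAndSet(0);
        return dropped == 0
                ? null
                : dropped + " messages dropped due to cross node timeout";
    }
}
```

A periodic task could call drainReport() and write any non-null result to the
log, so that a steady stream of these lines on the receiving nodes points at
clock drift rather than load shedding.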