[
https://issues.apache.org/jira/browse/CASSANDRA-3273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Brandon Williams updated CASSANDRA-3273:
----------------------------------------
Attachment: 3273.txt
bq. What if we reset the intervals when we get a node back-from-the-dead?
That makes sense when we observe a generation change: the node either
rebooted or was taken over by a new machine, so relearning the network
characteristics is a good idea.
If only the heartbeat changed, something went wrong (most likely in the
network), and we should remember that to avoid flapping next time. However,
in a long partition, where the generation won't change, we don't want to
record the partition time as an interval: if the partition recurs soon, it
would take us a very long time to mark the host down again.
This patch clears the intervals on a generation change and handles the long
partition case by defining a reasonable maximum interval to record, in this
case the rpc timeout. Adapting beyond that rather than failing quickly doesn't
make much sense that I can think of, but I'll entertain a higher hard-set
default if anyone disagrees.
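To make the two rules concrete, here is a minimal sketch of the idea, not the
code in 3273.txt. The class name, method names, and constructor parameters are
all hypothetical, and the sketch assumes that an interval above the maximum is
simply dropped; the actual patch may handle it differently (for example, by
clamping).

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical sketch; none of these names are taken from 3273.txt.
final class ArrivalWindowSketch {
    private final Deque<Long> intervalsMillis = new ArrayDeque<>();
    private final int capacity;            // how many intervals to remember
    private final long maxIntervalMillis;  // assumed cap, e.g. the rpc timeout
    private long lastArrivalMillis;
    private int lastGeneration = -1;       // assumes real generations are >= 0

    ArrivalWindowSketch(int capacity, long maxIntervalMillis) {
        this.capacity = capacity;
        this.maxIntervalMillis = maxIntervalMillis;
    }

    // Called whenever a heartbeat from the remote node is observed.
    void report(int generation, long nowMillis) {
        if (generation != lastGeneration) {
            // Generation change: the node rebooted or was replaced,
            // so forget what we learned and start over.
            intervalsMillis.clear();
            lastGeneration = generation;
            lastArrivalMillis = nowMillis;
            return;
        }
        long interval = nowMillis - lastArrivalMillis;
        lastArrivalMillis = nowMillis;
        if (interval > maxIntervalMillis) {
            // Long partition: do not let the outage length enter the window.
            return;
        }
        if (intervalsMillis.size() == capacity) {
            intervalsMillis.removeFirst(); // evict the oldest interval
        }
        intervalsMillis.addLast(interval);
    }

    int size() {
        return intervalsMillis.size();
    }

    // Mean of the recorded intervals, or 0.0 if none are recorded yet.
    double mean() {
        if (intervalsMillis.isEmpty()) {
            return 0.0;
        }
        long sum = 0;
        for (long i : intervalsMillis) {
            sum += i;
        }
        return (double) sum / intervalsMillis.size();
    }
}
```

With a 10-second cap, heartbeats one second apart are recorded, an hour-long
gap under the same generation is ignored, and a heartbeat carrying a new
generation empties the window.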
> FailureDetector can take a very long time to mark a host down
> -------------------------------------------------------------
>
> Key: CASSANDRA-3273
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3273
> Project: Cassandra
> Issue Type: Bug
> Components: Core
> Reporter: Brandon Williams
> Assignee: Brandon Williams
> Attachments: 3273.txt
>
>
> There are two ways to trigger this:
> * Bring a node up very briefly in a mixed-version cluster and then terminate
> it
> * Bring a node up, terminate it for a very long time, then bring it back up
> and take it down again
> In the first case, a very short arrival interval can be recorded, because
> the versioning logic requires reconnecting, which can happen very quickly.
> This is easily solved by rejecting any interval below a reasonable bound,
> for instance the gossiper interval.
> The second case is harder to solve: an extremely large interval is
> recorded, namely the time the node was left dead the first time. This throws
> off the mean of the intervals, so marking the node down the second time
> takes much longer than it should.
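
To illustrate the second case with numbers, the sketch below assumes an
exponential model in which phi grows as t / (mean * ln 10), and a conviction
threshold of 8. Both are assumptions made for illustration, as are the class
and method names and the sample intervals; none of them come from the issue
text.

```java
// Hypothetical sketch: the exponential model and the threshold of 8 are
// assumptions for illustration, not details stated in this issue.
final class ConvictionDelaySketch {
    // Arithmetic mean of the recorded inter-arrival intervals.
    static double mean(long[] intervalsMillis) {
        long sum = 0;
        for (long i : intervalsMillis) {
            sum += i;
        }
        return (double) sum / intervalsMillis.length;
    }

    // Under the assumed model phi(t) = t / (mean * ln 10), so the silence
    // needed to reach the threshold is threshold * mean * ln 10.
    static double millisToConvict(double meanMillis, double phiThreshold) {
        return phiThreshold * meanMillis * Math.log(10.0);
    }

    public static void main(String[] args) {
        // Nine healthy one-second intervals.
        long[] healthy = {1000, 1000, 1000, 1000, 1000, 1000, 1000, 1000, 1000};
        // The same nine plus one hour-long interval from the first outage.
        long[] skewed = {1000, 1000, 1000, 1000, 1000, 1000, 1000, 1000, 1000,
                         3_600_000};
        System.out.println(millisToConvict(mean(healthy), 8.0) / 1000.0 + " s");
        System.out.println(millisToConvict(mean(skewed), 8.0) / 1000.0 + " s");
    }
}
```

Under these assumptions, the healthy window convicts after roughly 18 seconds
of silence, while the single hour-long interval raises the mean to 360900 ms
and pushes conviction out to roughly 1.8 hours.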