[ https://issues.apache.org/jira/browse/CASSANDRA-15138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Hiroyuki Yamada updated CASSANDRA-15138:
----------------------------------------
    Discovered By: User Report

> A cluster (RF=3) not recovering after two nodes are stopped
> -----------------------------------------------------------
>
>                 Key: CASSANDRA-15138
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15138
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Cluster/Membership
>            Reporter: Hiroyuki Yamada
>            Priority: Normal
>
> I faced a weird issue when recovering a cluster after two nodes were stopped.
> It is easily reproducible and looks like a bug that should be fixed.
> The following are the steps to reproduce it.
> === STEPS TO REPRODUCE ===
> * Create a 3-node cluster with RF=3
>   - node1 (seed), node2, node3
> * Start requests to the cluster with cassandra-stress (it keeps running
>   until the end of the test)
>   - what we did: cassandra-stress mixed cl=QUORUM duration=10m
>     -errors ignore -node node1,node2,node3 -rate threads\>=16
>     threads\<=256
>   - (It doesn't have to be this many threads. It can be 1.)
> * Stop node3 normally (with systemctl stop, or kill without -9)
>   - the system is still available, as expected, because a quorum of
>     replicas is still available
> * Stop node2 normally (with systemctl stop, or kill without -9)
>   - the system is NOT available after node2 is stopped, as expected
>   - the client gets `UnavailableException: Not enough replicas
>     available for query at consistency QUORUM`
>   - the client gets the errors right away (within a few ms)
>   - so far everything is as expected
> * Wait for 1 minute
> * Bring node2 back up
>   - {color:#FF0000}The issue happens here.{color}
>   - the client gets `ReadTimeoutException` or `WriteTimeoutException`,
>     depending on whether the request is a read or a write, even after
>     node2 is up
>   - the client gets the errors after about 5000 ms or 2000 ms, which are
>     the request timeouts for write and read requests respectively
>   - what node1 reports with `nodetool status` and what node2 reports
>     are not consistent (node2 thinks node1 is down)
>   - It takes a very long time to recover from this state
> === END STEPS TO REPRODUCE ===
> Some additional important information to note:
> * If we don't start cassandra-stress, the issue doesn't occur.
> * Restarting node1 makes the cluster recover right after the restart.
> * Setting a lower value for dynamic_snitch_reset_interval_in_ms (60000
>   or so) fixes the issue.
> * If we `kill -9` the nodes, the issue doesn't occur.
> * Hints seem unrelated; I tested with hints disabled and it made no
>   difference.
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
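The workaround the reporter mentions is a cassandra.yaml setting; as a sketch, the change would look like the fragment below (the default for this setting is 600000 ms, i.e. 10 minutes, so this lowers it to 1 minute as described in the report):

```
# cassandra.yaml fragment -- reporter's workaround: reset the dynamic
# snitch state every 60 s instead of the default 600000 ms (10 min)
dynamic_snitch_reset_interval_in_ms: 60000
```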
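For readers following the availability transitions in the steps above, here is a minimal sketch (plain Python, not Cassandra code) of the quorum arithmetic involved: QUORUM requires floor(RF/2) + 1 live replicas, so with RF=3 the cluster tolerates one stopped node but not two. The function names are illustrative only.

```python
def quorum(rf: int) -> int:
    """Replicas required for QUORUM: floor(RF / 2) + 1."""
    return rf // 2 + 1


def quorum_available(rf: int, live_replicas: int) -> bool:
    """True if enough replicas are up to serve a QUORUM request."""
    return live_replicas >= quorum(rf)


rf = 3
assert quorum(rf) == 2
assert quorum_available(rf, 3)      # all three nodes up
assert quorum_available(rf, 2)      # node3 stopped: still available
assert not quorum_available(rf, 1)  # node2 also stopped: UnavailableException
```

Note the distinction this makes visible: with only one replica up the coordinator fails fast with UnavailableException (it knows a quorum cannot be reached), whereas the timeouts reported after node2 returns mean the coordinator believed a quorum existed but replies never arrived within the request timeout, which points at stale gossip/snitch state rather than quorum math.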