[ https://issues.apache.org/jira/browse/CASSANDRA-15138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17127216#comment-17127216 ]
Yuji Ito commented on CASSANDRA-15138:
--------------------------------------
Here are the logs relevant to this issue.
h3. Node1 (the surviving node):
{code:java}
TRACE [GossipStage:1] 2020-06-05 06:08:42,866 Gossiper.java:1101 - marking as down /10.42.2.90
INFO [GossipStage:1] 2020-06-05 06:08:42,866 Gossiper.java:1105 - InetAddress /10.42.2.90 is now DOWN
TRACE [GossipStage:1] 2020-06-05 06:08:42,866 MessagingService.java:658 - Resetting pool for /10.42.2.90
...
DEBUG [GossipStage:1] 2020-06-05 06:08:42,868 FailureDetector.java:313 - Forcing conviction of /10.42.2.90
...
DEBUG [GossipStage:1] 2020-06-05 06:08:42,866 OutboundTcpConnection.java:205 - Enqueuing socket close for /10.42.2.90
DEBUG [GossipStage:1] 2020-06-05 06:08:42,866 OutboundTcpConnection.java:205 - Enqueuing socket close for /10.42.2.90
DEBUG [MessagingService-Outgoing-/10.42.2.90-Small] 2020-06-05 06:08:42,866 OutboundTcpConnection.java:411 - Socket to /10.42.2.90 closed
DEBUG [GossipStage:1] 2020-06-05 06:08:42,866 OutboundTcpConnection.java:205 - Enqueuing socket close for /10.42.2.90
TRACE [GossipStage:1] 2020-06-05 06:08:42,867 Server.java:630 - Status changed event : /10.42.2.90, DOWN
...
DEBUG [MessagingService-Outgoing-/10.42.2.90-Gossip] 2020-06-05 06:08:42,868 OutboundTcpConnection.java:411 - Socket to /10.42.2.90 closed
...
DEBUG [MessagingService-Outgoing-/10.42.2.90-Small] 2020-06-05 06:08:42,869 OutboundTcpConnection.java:425 - Attempting to connect to /10.42.2.90
DEBUG [MessagingService-Outgoing-/10.42.2.90-Small] 2020-06-05 06:08:42,873 OutboundTcpConnection.java:533 - Done connecting to /10.42.2.90
...
DEBUG [MessagingService-Outgoing-/10.42.2.90-Gossip] 2020-06-05 06:08:48,732 OutboundTcpConnection.java:411 - Socket to /10.42.2.90 closed
...
TRACE [GossipStage:1] 2020-06-05 06:27:51,065 OutboundTcpConnection.java:628 - Expiration of MessagingService-Outgoing-/10.42.2.90-Small took 234µs
...
DEBUG [MessagingService-Outgoing-/10.42.2.90-Small] 2020-06-05 06:28:09,750 OutboundTcpConnection.java:411 - Socket to /10.42.2.90 closed
DEBUG [MessagingService-Outgoing-/10.42.2.90-Small] 2020-06-05 06:28:09,750 OutboundTcpConnection.java:350 - Error writing to /10.42.2.90
java.io.IOException: Connection timed out
    at sun.nio.ch.FileDispatcherImpl.write0(Native Method) ~[na:1.8.0_252]
    at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47) ~[na:1.8.0_252]
    at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93) ~[na:1.8.0_252]
    at sun.nio.ch.IOUtil.write(IOUtil.java:51) ~[na:1.8.0_252]
    at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:468) ~[na:1.8.0_252]
    at org.apache.cassandra.io.util.BufferedDataOutputStreamPlus.doFlush(BufferedDataOutputStreamPlus.java:323) ~[main/:na]
    at org.apache.cassandra.io.util.BufferedDataOutputStreamPlus.flush(BufferedDataOutputStreamPlus.java:331) ~[main/:na]
    at org.apache.cassandra.net.OutboundTcpConnection.writeConnected(OutboundTcpConnection.java:342) [main/:na]
    at org.apache.cassandra.net.OutboundTcpConnection.run(OutboundTcpConnection.java:263) [main/:na]
{code}
h3. Node2 (the second node to be stopped):
{code:java}
DEBUG [StorageServiceShutdownHook] 2020-06-05 06:08:42,840 StorageService.java:1480 - DRAINING: starting drain process
...
INFO [StorageServiceShutdownHook] 2020-06-05 06:08:42,850 Gossiper.java:1647 - Announcing shutdown
...
TRACE [StorageServiceShutdownHook] 2020-06-05 06:08:42,858 MessagingService.java:942 - ip-10-42-2-90.ap-northeast-1.compute.internal/10.42.2.90 sending GOSSIP_SHUTDOWN to 3506286@/10.42.2.16
...
TRACE [StorageServiceShutdownHook] 2020-06-05 06:08:42,858 MessagingService.java:942 - ip-10-42-2-90.ap-northeast-1.compute.internal/10.42.2.90 sending GOSSIP_SHUTDOWN to 3506286@/10.42.2.16
...
TRACE [MessagingService-Incoming-/10.42.2.16] 2020-06-05 06:08:42,868 IncomingTcpConnection.java:120 - Closing socket Socket[addr=/10.42.2.16,port=52646,localport=7000] - isclosed: false
TRACE [ACCEPT-ip-10-42-2-90.ap-northeast-1.compute.internal/10.42.2.90] 2020-06-05 06:08:42,872 MessagingService.java:1315 - Connection version 11 from /10.42.2.16
TRACE [MessagingService-Incoming-/10.42.2.16] 2020-06-05 06:08:42,873 MessagingService.java:1087 - Setting version 11 for /10.42.2.16
TRACE [MessagingService-Incoming-/10.42.2.16] 2020-06-05 06:08:42,873 IncomingTcpConnection.java:155 - Set version for /10.42.2.16 to 11 (will use 11)
...
TRACE [StorageServiceShutdownHook] 2020-06-05 06:08:44,859 MessagingService.java:1351 - Closing accept() thread
TRACE [StorageServiceShutdownHook] 2020-06-05 06:08:44,859 IncomingTcpConnection.java:120 - Closing socket Socket[addr=/10.42.2.16,port=53386,localport=7000] - isclosed: false
TRACE [StorageServiceShutdownHook] 2020-06-05 06:08:44,859 IncomingTcpConnection.java:120 - Closing socket Socket[addr=/10.42.2.90,port=53024,localport=7000] - isclosed: false
TRACE [ACCEPT-ip-10-42-2-90.ap-northeast-1.compute.internal/10.42.2.90] 2020-06-05 06:08:44,859 MessagingService.java:1327 - Asynchronous close seen by server thread
TRACE [StorageServiceShutdownHook] 2020-06-05 06:08:44,859 IncomingTcpConnection.java:120 - Closing socket Socket[addr=/10.42.2.16,port=53368,localport=7000] - isclosed: false
{code}
> A cluster (RF=3) not recovering after two nodes are stopped
> -----------------------------------------------------------
>
> Key: CASSANDRA-15138
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15138
> Project: Cassandra
> Issue Type: Bug
> Components: Cluster/Membership
> Reporter: Hiroyuki Yamada
> Priority: Normal
>
> I faced a weird issue when recovering a cluster after two nodes were stopped.
> It is easily reproducible and looks like a bug that needs to be fixed.
> The following are the steps to reproduce it.
> === STEPS TO REPRODUCE ===
> * Create a 3-node cluster with RF=3
> - node1(seed), node2, node3
> * Start requests to the cluster with cassandra-stress (it keeps running until the end of the test)
> - what we did: cassandra-stress mixed cl=QUORUM duration=10m -errors ignore -node node1,node2,node3 -rate threads\>=16 threads\<=256
> - (it doesn't have to be this many threads; even 1 is enough)
> * Stop node3 normally (with systemctl stop or kill (without -9))
> - the system is still available, as expected, because a quorum of replicas is still available
> * Stop node2 normally (with systemctl stop or kill (without -9))
> - as expected, the system is NOT available after node2 is stopped
> - the client gets `UnavailableException: Not enough replicas available for query at consistency QUORUM`
> - the client gets the errors right away (within a few ms)
> - so far, everything is as expected
> * Wait for 1 minute
> * Bring node2 back up
> - {color:#ff0000}The issue happens here.{color}
> - the client gets `ReadTimeoutException` or `WriteTimeoutException` depending on whether the request is a read or a write, even after node2 is up (see the client sketch after these steps)
> - the client gets the errors only after about 5000 ms or 2000 ms, the read and write request timeouts respectively
> - what node1 reports with `nodetool status` and what node2 reports are inconsistent (node2 thinks node1 is down)
> - it takes a very long time to recover from this state
> === END OF STEPS TO REPRODUCE ===
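>
> For anyone who wants to observe the client-side symptom without cassandra-stress, here is a minimal probe sketch using the DataStax Java driver (3.x). The keyspace and table names ("ks", "t") and the contact point names are placeholders, not part of the original setup; the idea is simply that `UnavailableException` fails fast while two replicas are down, whereas the timeouts only surface after the request timeout once node2 is back.
> {code:java}
> import com.datastax.driver.core.Cluster;
> import com.datastax.driver.core.ConsistencyLevel;
> import com.datastax.driver.core.Session;
> import com.datastax.driver.core.SimpleStatement;
> import com.datastax.driver.core.exceptions.NoHostAvailableException;
> import com.datastax.driver.core.exceptions.ReadTimeoutException;
> import com.datastax.driver.core.exceptions.UnavailableException;
> import com.datastax.driver.core.exceptions.WriteTimeoutException;
>
> public class QuorumProbe
> {
>     public static void main(String[] args) throws InterruptedException
>     {
>         // Placeholder contact points; replace with the real node addresses.
>         try (Cluster cluster = Cluster.builder()
>                                       .addContactPoints("node1", "node2", "node3")
>                                       .build();
>              Session session = cluster.connect("ks"))
>         {
>             SimpleStatement read = new SimpleStatement("SELECT * FROM t WHERE id = 1");
>             read.setConsistencyLevel(ConsistencyLevel.QUORUM);
>
>             while (true)
>             {
>                 long start = System.currentTimeMillis();
>                 try
>                 {
>                     session.execute(read);
>                     System.out.println("OK in " + (System.currentTimeMillis() - start) + " ms");
>                 }
>                 catch (UnavailableException | NoHostAvailableException e)
>                 {
>                     // While two replicas are down: fails fast, within a few ms.
>                     System.out.println("Unavailable after " + (System.currentTimeMillis() - start) + " ms");
>                 }
>                 catch (ReadTimeoutException | WriteTimeoutException e)
>                 {
>                     // After node2 is brought back: fails only once the request timeout expires.
>                     System.out.println("Timed out after " + (System.currentTimeMillis() - start) + " ms");
>                 }
>                 Thread.sleep(1000);
>             }
>         }
>     }
> }
> {code}
>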
> Some additional important information to note:
> * If we don't run cassandra-stress, the issue does not occur.
> * Restarting node1 makes the cluster recover right after the restart.
> * Setting a lower value for dynamic_snitch_reset_interval_in_ms (e.g., 60000) fixes the issue (see the cassandra.yaml snippet after this list).
> * If we stop the nodes with `kill -9`, the issue does not occur.
> * Hints seem unrelated; testing with hints disabled made no difference.
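>
> A minimal sketch of that workaround in cassandra.yaml (60000 is just the value mentioned above; the parameter's stock default is 600000 ms, i.e. 10 minutes):
> {code}
> # Reset dynamic snitch scores more often than the 10-minute default.
> dynamic_snitch_reset_interval_in_ms: 60000
> {code}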
>