We have a six-node cluster, and when we attempted to join a new node to it, CPU load on two nodes gradually climbed to an abnormally high level. Stopping the join and shutting down Cassandra on the two high-load nodes restored the cluster's health (we have RF=3).
Does anyone have any insight into this Cassandra behavior? We have done node joins many times before; the most recent was just 4 days earlier. The following unusual messages appeared in the logs of the two nodes during the relevant time period. We are using Cassandra 2.0.10.

Jun 30 16:47:30 cass-22.pelotime.com cassandra-server ERROR [GossipStage:1] CassandraDaemon.java (line 199) Exception in thread Thread[GossipStage:1,5,main]
Jun 30 16:47:30 cass-22.pelotime.com java.lang.NullPointerException
Jun 30 16:47:30 cass-22.pelotime.com at org.apache.cassandra.gms.Gossiper.convict(Gossiper.java:301)
Jun 30 16:47:30 cass-22.pelotime.com at org.apache.cassandra.gms.FailureDetector.forceConviction(FailureDetector.java:251)
Jun 30 16:47:30 cass-22.pelotime.com at org.apache.cassandra.gms.GossipShutdownVerbHandler.doVerb(GossipShutdownVerbHandler.java:37)
Jun 30 16:47:30 cass-22.pelotime.com at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:62)
Jun 30 16:47:30 cass-22.pelotime.com at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
Jun 30 16:47:30 cass-22.pelotime.com at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
Jun 30 16:47:30 cass-22.pelotime.com at java.lang.Thread.run(Unknown Source)
Jun 30 16:47:30 cass-22.pelotime.com cassandra-server INFO [GossipStage:2] Gossiper.java (line 910) Node /10.0.251.77 is now part of the cluster
Jun 30 16:47:35 cass-22.pelotime.com cassandra-server INFO [HANDSHAKE-/10.0.251.77] OutboundTcpConnection.java (line 386) Handshaking version with /10.0.251.77
Jun 30 16:47:35 cass-22.pelotime.com cassandra-server INFO [RequestResponseStage:138] Gossiper.java (line 876) InetAddress /10.0.251.77 is now UP
Jun 30 16:47:38 cass-22.pelotime.com cassandra-server INFO [GossipStage:2] Gossiper.java (line 890) InetAddress /10.0.251.77 is now DOWN
Jun 30 16:48:02 cass-22.pelotime.com cassandra-server INFO [HANDSHAKE-/10.0.251.77] OutboundTcpConnection.java (line 386) Handshaking version with /10.0.251.77
Jun 30 16:48:05 cass-22.pelotime.com cassandra-server INFO [GossipTasks:1] Gossiper.java (line 658) FatClient /10.0.251.77 has been silent for 30000ms, removing from gossip
Jun 30 16:48:05 cass-22.pelotime.com cassandra-server INFO [HANDSHAKE-/10.0.251.77] OutboundTcpConnection.java (line 386) Handshaking ve

Jun 30 16:48:59 cass-24.pelotime.com cassandra-server INFO [HANDSHAKE-/10.0.251.77] OutboundTcpConnection.java (line 386) Handshaking version with /10.0.251.77
Jun 30 16:48:59 cass-24.pelotime.com cassandra-server INFO [RequestResponseStage:26] Gossiper.java (line 876) InetAddress /10.0.251.77 is now UP
Jun 30 16:48:59 cass-24.pelotime.com cassandra-server INFO [HANDSHAKE-/10.0.251.77] OutboundTcpConnection.java (line 386) Handshaking version with /10.0.251.77
Jun 30 16:50:52 cass-24.pelotime.com cassandra-server ERROR [STREAM-OUT-/10.0.251.77] StreamSession.java (line 454) [Stream #5f2251e0-1f69-11e5-94c0-d9033a25abe9] Streaming error occurred
Jun 30 16:50:52 cass-24.pelotime.com java.io.IOException: Broken pipe
Jun 30 16:50:52 cass-24.pelotime.com at sun.nio.ch.FileChannelImpl.transferTo0(Native Method)
Jun 30 16:50:52 cass-24.pelotime.com at sun.nio.ch.FileChannelImpl.transferToDirectly(Unknown Source)
Jun 30 16:50:52 cass-24.pelotime.com at sun.nio.ch.FileChannelImpl.transferTo(Unknown Source)
Jun 30 16:50:52 cass-24.pelotime.com at org.apache.cassandra.streaming.compress.CompressedStreamWriter.write(CompressedStreamWriter.java:74)
Jun 30 16:50:52 cass-24.pelotime.com at org.apache.cassandra.streaming.messages.OutgoingFileMessage$1.serialize(OutgoingFileMessage.java:59)
Jun 30 16:50:52 cass-24.pelotime.com at org.apache.cassandra.streaming.messages.OutgoingFileMessage$1.serialize(OutgoingFileMessage.java:42)
Jun 30 16:50:52 cass-24.pelotime.com at org.apache.cassandra.streaming.messages.StreamMessage.serialize(StreamMessage.java:45)
Jun 30 16:50:52 cass-24.pelotime.com at org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.sendMessage(ConnectionHandler.java:339)
Jun 30 16:50:52 cass-24.pelotime.com at org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.run(ConnectionHandler.java:319)
Jun 30 16:50:52 cass-24.pelotime.com at java.lang.Thread.run(Unknown Source)
Jun 30 16:50:52 cass-24.pelotime.com cassandra-server ERROR [STREAM-OUT-/10.0.251.77] StreamSession.java (line 454) [Stream #5f2251e0-1f69-11e5-94c0-d9033a25abe9] Streaming error occurred
Jun 30 16:50:52 cass-24.pelotime.com java.io.IOException: Broken pipe
Jun 30 16:50:52 cass-24.pelotime.com at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
Jun 30 16:50:52 cass-24.pelotime.com at sun.nio.ch.SocketDispatcher.write(Unknown Source)
Jun 30 16:50:52 cass-24.pelotime.com at sun.nio.ch.IOUtil.writeFromNativeBuffer(Unknown Source)
Jun 30 16:50:52 cass-24.pelotime.com at sun.nio.ch.IOUtil.write(Unknown Source)
Jun 30 16:50:52 cass-24.pelotime.com at sun.nio.ch.SocketChannelImpl.write(Unknown Source)
Jun 30 16:50:52 cass-24.pelotime.com at org.apache.cassandra.streaming.messages.StreamMessage.serialize(StreamMessage.java:44)
Jun 30 16:50:52 cass-24.pelotime.com at org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.sendMessage(ConnectionHandler.java:339)
Jun 30 16:50:52 cass-24.pelotime.com at org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.run(ConnectionHandler.java:319)
Jun 30 16:50:52 cass-24.pelotime.com at java.lang.Thread.run(Unknown Source)
Jun 30 16:50:52 cass-24.pelotime.com cassandra-server ERROR [STREAM-OUT-/10.0.251.77] StreamSession.java (line 454) [Stream #5f2251e0-1f69-11e5-94c0-d9033a25abe9] Streaming error occurred
Jun 30 16:50:52 cass-24.pelotime.com java.io.IOException: Broken pipe
Jun 30 16:50:52 cass-24.pelotime.com at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
Jun 30 16:50:52 cass-24.pelotime.com at sun.nio.ch.SocketDispatcher.write(Unknown Source)
Jun 30 16:50:52 cass-24.pelotime.com at sun.nio.ch.IOUtil.writeFromNativeBuffer(Unknown Source)
Jun 30 16:50:52 cass-24.pelotime.com at sun.nio.ch.IOUtil.write(Unknown Source)
Jun 30 16:50:52 cass-24.pelotime.com at sun.nio.ch.SocketChannelImpl.write(Unknown Source)
Jun 30 16:50:52 cass-24.pelotime.com at org.apache.cassandra.streaming.messages.StreamMessage.serialize(StreamMessage.java:44)
Jun 30 16:50:52 cass-24.pelotime.com at org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.sendMessage(ConnectionHandler.java:339)
Jun 30 16:50:52 cass-24.pelotime.com at org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.run(ConnectionHandler.java:319)
Jun 30 16:50:52 cass-24.pelotime.com at java.lang.Thread.run(Unknown Source)
Jun 30 16:50:52 cass-24.pelotime.com cassandra-server ERROR [STREAM-OUT-/10.0.251.77] StreamSession.java (line 454) [Stream #5f2251e0-1f69-11e5-94c0-d9033a25abe9] Streaming error occurred
Jun 30 16:50:52 cass-24.pelotime.com java.io.IOException: Broken pipe
Jun 30 16:50:52 cass-24.pelotime.com at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
