I am trying once more with more aggressive TCP keep-alive settings, as recommended here:
<https://docs.datastax.com/en/cassandra/2.1/cassandra/troubleshooting/trblshootIdleFirewall.html>

    sudo sysctl -w net.ipv4.tcp_keepalive_time=60 net.ipv4.tcp_keepalive_probes=3 net.ipv4.tcp_keepalive_intvl=10

(I added these to /etc/sysctl.conf and ran sysctl -p /etc/sysctl.conf on all nodes.)

Let's see what happens. I don't know what else to try. I have also increased streaming_socket_timeout_in_ms even further.

On Fri, May 27, 2016 at 4:56 PM, Paulo Motta <pauloricard...@gmail.com> wrote:

> I'm afraid raising streaming_socket_timeout_in_ms won't help much in this
> case because the incoming connection on the source node is timing out on
> the network layer, and streaming_socket_timeout_in_ms controls the socket
> timeout in the app layer and throws SocketTimeoutException (not
> java.io.IOException: Connection timed out). So you should probably use more
> aggressive tcp keep-alive settings (net.ipv4.tcp_keepalive_*) on both hosts,
> did you try tuning that? Even that might not be sufficient as some routers
> tend to ignore tcp keep-alives and just kill idle connections.
>
> As said before, this will ultimately be fixed by adding keep-alive to the
> app layer on CASSANDRA-11841. If tuning tcp keep-alives does not help, one
> extreme approach would be to backport this to 2.1 (unless some experienced
> operator out there has a more creative approach).
>
> @eevans, I'm not sure he is using a mixed version cluster, it seems he
> finished the upgrade from 2.1.13 to 2.1.14 before performing the rebuild.
>
> 2016-05-27 11:39 GMT-03:00 Eric Evans <john.eric.ev...@gmail.com>:
>
>> From the various stacktraces in this thread, it's obvious you are
>> mixing versions 2.1.13 and 2.1.14. Topology changes like this aren't
>> supported with mixed Cassandra versions. Sometimes it will work,
>> sometimes it won't (and it will definitely not work in this instance).
>>
>> You should either upgrade your 2.1.13 nodes to 2.1.14 first, or add
>> the new nodes using 2.1.13, and upgrade after.
>>
>> On Fri, May 27, 2016 at 8:41 AM, George Sigletos <sigle...@textkernel.nl> wrote:
>>
>> >>>> ERROR [STREAM-IN-/192.168.1.141] 2016-05-26 09:08:05,027 StreamSession.java:505 - [Stream #74c57bc0-231a-11e6-a698-1b05ac77baf9] Streaming error occurred
>> >>>> java.lang.RuntimeException: Outgoing stream handler has been closed
>> >>>>     at org.apache.cassandra.streaming.ConnectionHandler.sendMessage(ConnectionHandler.java:138) ~[apache-cassandra-2.1.14.jar:2.1.14]
>> >>>>     at org.apache.cassandra.streaming.StreamSession.receive(StreamSession.java:568) ~[apache-cassandra-2.1.14.jar:2.1.14]
>> >>>>     at org.apache.cassandra.streaming.StreamSession.messageReceived(StreamSession.java:457) ~[apache-cassandra-2.1.14.jar:2.1.14]
>> >>>>     at org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:263) ~[apache-cassandra-2.1.14.jar:2.1.14]
>> >>>>     at java.lang.Thread.run(Unknown Source) [na:1.7.0_79]
>> >>>>
>> >>>> And this is from the source node:
>> >>>>
>> >>>> ERROR [STREAM-OUT-/172.31.22.104] 2016-05-26 11:08:05,097 StreamSession.java:505 - [Stream #74c57bc0-231a-11e6-a698-1b05ac77baf9] Streaming error occurred
>> >>>> java.io.IOException: Broken pipe
>> >>>>     at sun.nio.ch.FileChannelImpl.transferTo0(Native Method) ~[na:1.7.0_79]
>> >>>>     at sun.nio.ch.FileChannelImpl.transferToDirectly(Unknown Source) ~[na:1.7.0_79]
>> >>>>     at sun.nio.ch.FileChannelImpl.transferTo(Unknown Source) ~[na:1.7.0_79]
>> >>>>     at org.apache.cassandra.streaming.compress.CompressedStreamWriter.write(CompressedStreamWriter.java:84) ~[apache-cassandra-2.1.14.jar:2.1.14]
>> >>>>     at org.apache.cassandra.streaming.messages.OutgoingFileMessage.serialize(OutgoingFileMessage.java:88) ~[apache-cassandra-2.1.14.jar:2.1.14]
>> >>>>     at org.apache.cassandra.streaming.messages.OutgoingFileMessage$1.serialize(OutgoingFileMessage.java:49) ~[apache-cassandra-2.1.14.jar:2.1.14]
>> >>>>     at org.apache.cassandra.streaming.messages.OutgoingFileMessage$1.serialize(OutgoingFileMessage.java:41) ~[apache-cassandra-2.1.14.jar:2.1.14]
>> >>>>     at org.apache.cassandra.streaming.messages.StreamMessage.serialize(StreamMessage.java:45) ~[apache-cassandra-2.1.14.jar:2.1.14]
>> >>>>     at org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.sendMessage(ConnectionHandler.java:358) [apache-cassandra-2.1.14.jar:2.1.14]
>> >>>>     at org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.run(ConnectionHandler.java:330) [apache-cassandra-2.1.14.jar:2.1.14]
>>
>> >>>>>>>>>>> ERROR [STREAM-IN-/192.168.1.140] 2016-05-24 22:44:57,704 StreamSession.java:620 - [Stream #2c290460-20d4-11e6-930f-1b05ac77baf9] Remote peer 192.168.1.140 failed stream session.
>> >>>>>>>>>>> ERROR [STREAM-OUT-/192.168.1.140] 2016-05-24 22:44:57,705 StreamSession.java:505 - [Stream #2c290460-20d4-11e6-930f-1b05ac77baf9] Streaming error occurred
>> >>>>>>>>>>> java.io.IOException: Connection timed out
>> >>>>>>>>>>>     at sun.nio.ch.FileDispatcherImpl.write0(Native Method) ~[na:1.7.0_79]
>> >>>>>>>>>>>     at sun.nio.ch.SocketDispatcher.write(Unknown Source) ~[na:1.7.0_79]
>> >>>>>>>>>>>     at sun.nio.ch.IOUtil.writeFromNativeBuffer(Unknown Source) ~[na:1.7.0_79]
>> >>>>>>>>>>>     at sun.nio.ch.IOUtil.write(Unknown Source) ~[na:1.7.0_79]
>> >>>>>>>>>>>     at sun.nio.ch.SocketChannelImpl.write(Unknown Source) ~[na:1.7.0_79]
>> >>>>>>>>>>>     at org.apache.cassandra.io.util.DataOutputStreamAndChannel.write(DataOutputStreamAndChannel.java:48) ~[apache-cassandra-2.1.13.jar:2.1.13]
>> >>>>>>>>>>>     at org.apache.cassandra.streaming.messages.StreamMessage.serialize(StreamMessage.java:44) ~[apache-cassandra-2.1.13.jar:2.1.13]
>> >>>>>>>>>>>     at org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.sendMessage(ConnectionHandler.java:351) [apache-cassandra-2.1.13.jar:2.1.13]
>> >>>>>>>>>>>     at org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.run(ConnectionHandler.java:323) [apache-cassandra-2.1.13.jar:2.1.13]
>> >>>>>>>>>>>     at java.lang.Thread.run(Unknown Source) [na:1.7.0_79]
>> >>>>>>>>>>> INFO [STREAM-IN-/192.168.1.140] 2016-05-24 22:44:58,625 StreamResultFuture.java:180 - [Stream #2c290460-20d4-11e6-930f-1b05ac77baf9] Session with /192.168.1.140 is complete
>> >>>>>>>>>>> WARN [STREAM-IN-/192.168.1.140] 2016-05-24 22:44:58,627 StreamResultFuture.java:207 - [Stream #2c290460-20d4-11e6-930f-1b05ac77baf9] Stream failed
>> >>>>>>>>>>> ERROR [RMI TCP Connection(24)-127.0.0.1] 2016-05-24 22:44:58,628 StorageService.java:1075 - Error while rebuilding node
>> >>>>>>>>>>> org.apache.cassandra.streaming.StreamException: Stream failed
>> >>>>>>>>>>>     at org.apache.cassandra.streaming.management.StreamEventJMXNotifier.onFailure(StreamEventJMXNotifier.java:85) ~[apache-cassandra-2.1.13.jar:2.1.13]
>> >>>>>>>>>>>     at com.google.common.util.concurrent.Futures$4.run(Futures.java:1172) ~[guava-16.0.jar:na]
>> >>>>>>>>>>>     at com.google.common.util.concurrent.MoreExecutors$SameThreadExecutorService.execute(MoreExecutors.java:297) ~[guava-16.0.jar:na]
>> >>>>>>>>>>>     at com.google.common.util.concurrent.ExecutionList.executeListener(ExecutionList.java:156) ~[guava-16.0.jar:na]
>> >>>>>>>>>>>     at com.google.common.util.concurrent.ExecutionList.execute(ExecutionList.java:145) ~[guava-16.0.jar:na]
>> >>>>>>>>>>>     at com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:202) ~[guava-16.0.jar:na]
>> >>>>>>>>>>>     at org.apache.cassandra.streaming.StreamResultFuture.maybeComplete(StreamResultFuture.java:208) ~[apache-cassandra-2.1.13.jar:2.1.13]
>> >>>>>>>>>>>     at org.apache.cassandra.streaming.StreamResultFuture.handleSessionComplete(StreamResultFuture.java:184) ~[apache-cassandra-2.1.13.jar:2.1.13]
>> >>>>>>>>>>>     at org.apache.cassandra.streaming.StreamSession.closeSession(StreamSession.java:415) ~[apache-cassandra-2.1.13.jar:2.1.13]
>> >>>>>>>>>>>     at org.apache.cassandra.streaming.StreamSession.sessionFailed(StreamSession.java:621) ~[apache-cassandra-2.1.13.jar:2.1.13]
>> >>>>>>>>>>>     at org.apache.cassandra.streaming.StreamSession.messageReceived(StreamSession.java:475) ~[apache-cassandra-2.1.13.jar:2.1.13]
>> >>>>>>>>>>>     at org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:256) ~[apache-cassandra-2.1.13.jar:2.1.13]
>> >>>>>>>>>>>     at java.lang.Thread.run(Unknown Source) ~[na:1.7.0_79]
>> >>>>>>>>>>> ERROR [STREAM-OUT-/192.168.1.140] 2016-05-24 22:44:58,629 StreamSession.java:505 - [Stream #2c290460-20d4-11e6-930f-1b05ac77baf9] Streaming error occurred
>> >>>>>>>>>>> java.io.IOException: Broken pipe
>> >>>>>>>>>>>     at sun.nio.ch.FileDispatcherImpl.write0(Native Method) ~[na:1.7.0_79]
>> >>>>>>>>>>>     at sun.nio.ch.SocketDispatcher.write(Unknown Source) ~[na:1.7.0_79]
>> >>>>>>>>>>>     at sun.nio.ch.IOUtil.writeFromNativeBuffer(Unknown Source) ~[na:1.7.0_79]
>> >>>>>>>>>>>     at sun.nio.ch.IOUtil.write(Unknown Source) ~[na:1.7.0_79]
>> >>>>>>>>>>>     at sun.nio.ch.SocketChannelImpl.write(Unknown Source) ~[na:1.7.0_79]
>> >>>>>>>>>>>     at org.apache.cassandra.io.util.DataOutputStreamAndChannel.write(DataOutputStreamAndChannel.java:48) ~[apache-cassandra-2.1.13.jar:2.1.13]
>> >>>>>>>>>>>     at org.apache.cassandra.streaming.messages.StreamMessage.serialize(StreamMessage.java:44) ~[apache-cassandra-2.1.13.jar:2.1.13]
>> >>>>>>>>>>>     at org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.sendMessage(ConnectionHandler.java:351) [apache-cassandra-2.1.13.jar:2.1.13]
>> >>>>>>>>>>>     at org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.run(ConnectionHandler.java:331) [apache-cassandra-2.1.13.jar:2.1.13]
>> >>>>>>>>>>>     at java.lang.Thread.run(Unknown Source) [na:1.7.0_79]
>>
>> --
>> Eric Evans
>> john.eric.ev...@gmail.com
>>
>
>
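
For what it's worth, here is a rough sanity check on the keep-alive values at the top of this message. It assumes the usual Linux behaviour: an idle connection gets its first probe after tcp_keepalive_time seconds, and the peer is declared dead after tcp_keepalive_probes unanswered probes sent tcp_keepalive_intvl seconds apart. The sketch is in Python purely for illustration; parse_sysctl_args and dead_peer_detection_seconds are hypothetical helper names, not part of Cassandra or sysctl, and the "defaults" are the commonly cited Linux values, which may differ on your hosts.

```python
def parse_sysctl_args(args):
    """Parse 'key=value' tokens (as passed to `sysctl -w`) into a dict of ints."""
    settings = {}
    for token in args.split():
        key, _, value = token.partition("=")
        settings[key] = int(value)
    return settings


def dead_peer_detection_seconds(settings):
    """Idle time before the first probe plus the time spent on unanswered probes."""
    idle = settings["net.ipv4.tcp_keepalive_time"]
    probes = settings["net.ipv4.tcp_keepalive_probes"]
    interval = settings["net.ipv4.tcp_keepalive_intvl"]
    return idle + probes * interval


# The values from the sysctl command in this message.
NEW = parse_sysctl_args(
    "net.ipv4.tcp_keepalive_time=60 "
    "net.ipv4.tcp_keepalive_probes=3 "
    "net.ipv4.tcp_keepalive_intvl=10"
)

# Assumption: commonly cited Linux defaults; verify with `sysctl net.ipv4`.
DEFAULTS = parse_sysctl_args(
    "net.ipv4.tcp_keepalive_time=7200 "
    "net.ipv4.tcp_keepalive_probes=9 "
    "net.ipv4.tcp_keepalive_intvl=75"
)

print(dead_peer_detection_seconds(NEW))       # 90
print(dead_peer_detection_seconds(DEFAULTS))  # 7875
```

So with the new settings the first probe goes out after 60 seconds of idleness instead of two hours, which should help if a firewall in between drops idle connections sooner than the default interval. It will not help if the router ignores keep-alive probes altogether, as Paulo pointed out.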