I am trying once more with more aggressive TCP keep-alive settings, as
recommended here:
<https://docs.datastax.com/en/cassandra/2.1/cassandra/troubleshooting/trblshootIdleFirewall.html>

sudo sysctl -w net.ipv4.tcp_keepalive_time=60 \
    net.ipv4.tcp_keepalive_probes=3 net.ipv4.tcp_keepalive_intvl=10

(I also added these to /etc/sysctl.conf and ran sysctl -p /etc/sysctl.conf
on all nodes.)
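
For reference, a rough sketch of what the three values in the sysctl command
above imply. The function name keepalive_dead_after is only an illustration,
not part of any tool. It assumes the usual Linux behaviour: the first probe is
sent after tcp_keepalive_time seconds of idleness, and the connection is
declared dead after tcp_keepalive_probes unanswered probes sent
tcp_keepalive_intvl seconds apart.

```shell
# Hypothetical helper: seconds of silence before the kernel gives up on a peer.
# Arguments: tcp_keepalive_time, tcp_keepalive_probes, tcp_keepalive_intvl
keepalive_dead_after() {
    time_s=$1
    probes=$2
    intvl_s=$3
    # idle time before the first probe + all unanswered probes
    echo $(( time_s + probes * intvl_s ))
}

# Settings from this mail: first probe after 60 s idle
keepalive_dead_after 60 3 10      # prints 90

# Linux defaults (7200 s idle, 9 probes, 75 s apart)
keepalive_dead_after 7200 9 75    # prints 7875
```

So with these settings an idle streaming socket should carry a probe every 60
seconds at most, which only helps if the firewall's idle timeout is longer
than that and it actually counts keep-alive packets as traffic.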

Let's see what happens. I don't know what else to try. I have even increased
streaming_socket_timeout_in_ms further.



On Fri, May 27, 2016 at 4:56 PM, Paulo Motta <pauloricard...@gmail.com>
wrote:

> I'm afraid raising streaming_socket_timeout_in_ms won't help much in this
> case because the incoming connection on the source node is timing out on
> the network layer, and streaming_socket_timeout_in_ms controls the socket
> timeout in the app layer and throws SocketTimeoutException (not
> java.io.IOException: Connection timed out). So you should probably use
> more aggressive tcp keep-alive settings (net.ipv4.tcp_keepalive_*) on both
> hosts. Did you try tuning that? Even that might not be sufficient, as some
> routers tend to ignore tcp keep-alives and just kill idle connections.
>
> As said before, this will ultimately be fixed by adding keep-alive to the
> app layer in CASSANDRA-11841. If tuning tcp keep-alives does not help, one
> extreme approach would be to backport this to 2.1 (unless some experienced
> operator out there has a more creative approach).
>
> @eevans, I'm not sure he is using a mixed version cluster; it seems he
> finished the upgrade from 2.1.13 to 2.1.14 before performing the rebuild.
>
> 2016-05-27 11:39 GMT-03:00 Eric Evans <john.eric.ev...@gmail.com>:
>
>> From the various stacktraces in this thread, it's obvious you are
>> mixing versions 2.1.13 and 2.1.14.  Topology changes like this aren't
>> supported with mixed Cassandra versions.  Sometimes it will work,
>> sometimes it won't (and it will definitely not work in this instance).
>>
>> You should either upgrade your 2.1.13 nodes to 2.1.14 first, or add
>> the new nodes using 2.1.13, and upgrade after.
>>
>> On Fri, May 27, 2016 at 8:41 AM, George Sigletos <sigle...@textkernel.nl>
>> wrote:
>>
>> >>>> ERROR [STREAM-IN-/192.168.1.141] 2016-05-26 09:08:05,027
>> >>>> StreamSession.java:505 - [Stream
>> #74c57bc0-231a-11e6-a698-1b05ac77baf9]
>> >>>> Streaming error occurred
>> >>>> java.lang.RuntimeException: Outgoing stream handler has been closed
>> >>>>         at
>> >>>>
>> org.apache.cassandra.streaming.ConnectionHandler.sendMessage(ConnectionHandler.java:138)
>> >>>> ~[apache-cassandra-2.1.14.jar:2.1.14]
>> >>>>         at
>> >>>>
>> org.apache.cassandra.streaming.StreamSession.receive(StreamSession.java:568)
>> >>>> ~[apache-cassandra-2.1.14.jar:2.1.14]
>> >>>>         at
>> >>>>
>> org.apache.cassandra.streaming.StreamSession.messageReceived(StreamSession.java:457)
>> >>>> ~[apache-cassandra-2.1.14.jar:2.1.14]
>> >>>>         at
>> >>>>
>> org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:263)
>> >>>> ~[apache-cassandra-2.1.14.jar:2.1.14]
>> >>>>         at java.lang.Thread.run(Unknown Source) [na:1.7.0_79]
>> >>>>
>> >>>> And this is from the source node:
>> >>>>
>> >>>> ERROR [STREAM-OUT-/172.31.22.104] 2016-05-26 11:08:05,097
>> >>>> StreamSession.java:505 - [Stream
>> #74c57bc0-231a-11e6-a698-1b05ac77baf9]
>> >>>> Streaming error occurred
>> >>>> java.io.IOException: Broken pipe
>> >>>>         at sun.nio.ch.FileChannelImpl.transferTo0(Native Method)
>> >>>> ~[na:1.7.0_79]
>> >>>>         at sun.nio.ch.FileChannelImpl.transferToDirectly(Unknown
>> Source)
>> >>>> ~[na:1.7.0_79]
>> >>>>         at sun.nio.ch.FileChannelImpl.transferTo(Unknown Source)
>> >>>> ~[na:1.7.0_79]
>> >>>>         at
>> >>>>
>> org.apache.cassandra.streaming.compress.CompressedStreamWriter.write(CompressedStreamWriter.java:84)
>> >>>> ~[apache-cassandra-2.1.14.jar:2.1.14]
>> >>>>         at
>> >>>>
>> org.apache.cassandra.streaming.messages.OutgoingFileMessage.serialize(OutgoingFileMessage.java:88)
>> >>>> ~[apache-cassandra-2.1.14.jar:2.1.14]
>> >>>>         at
>> >>>>
>> org.apache.cassandra.streaming.messages.OutgoingFileMessage$1.serialize(OutgoingFileMessage.java:49)
>> >>>> ~[apache-cassandra-2.1.14.jar:2.1.14]
>> >>>>         at
>> >>>>
>> org.apache.cassandra.streaming.messages.OutgoingFileMessage$1.serialize(OutgoingFileMessage.java:41)
>> >>>> ~[apache-cassandra-2.1.14.jar:2.1.14]
>> >>>>         at
>> >>>>
>> org.apache.cassandra.streaming.messages.StreamMessage.serialize(StreamMessage.java:45)
>> >>>> ~[apache-cassandra-2.1.14.jar:2.1.14]
>> >>>>         at
>> >>>>
>> org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.sendMessage(ConnectionHandler.java:358)
>> >>>> [apache-cassandra-2.1.14.jar:2.1.14]
>> >>>>         at
>> >>>>
>> org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.run(ConnectionHandler.java:330)
>> >>>> [apache-cassandra-2.1.14.jar:2.1.14]
>>
>>
>> >>>>>>>>>>> ERROR [STREAM-IN-/192.168.1.140] 2016-05-24 22:44:57,704
>> >>>>>>>>>>> StreamSession.java:620 - [Stream
>> #2c290460-20d4-11e6-930f-1b05ac77baf9]
>> >>>>>>>>>>> Remote peer 192.168.1.140 failed stream session.
>> >>>>>>>>>>> ERROR [STREAM-OUT-/192.168.1.140] 2016-05-24 22:44:57,705
>> >>>>>>>>>>> StreamSession.java:505 - [Stream
>> #2c290460-20d4-11e6-930f-1b05ac77baf9]
>> >>>>>>>>>>> Streaming error occurred
>> >>>>>>>>>>> java.io.IOException: Connection timed out
>> >>>>>>>>>>>         at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
>> >>>>>>>>>>> ~[na:1.7.0_79]
>> >>>>>>>>>>>         at sun.nio.ch.SocketDispatcher.write(Unknown Source)
>> >>>>>>>>>>> ~[na:1.7.0_79]
>> >>>>>>>>>>>         at sun.nio.ch.IOUtil.writeFromNativeBuffer(Unknown
>> >>>>>>>>>>> Source) ~[na:1.7.0_79]
>> >>>>>>>>>>>         at sun.nio.ch.IOUtil.write(Unknown Source)
>> ~[na:1.7.0_79]
>> >>>>>>>>>>>         at sun.nio.ch.SocketChannelImpl.write(Unknown Source)
>> >>>>>>>>>>> ~[na:1.7.0_79]
>> >>>>>>>>>>>         at
>> >>>>>>>>>>>
>> org.apache.cassandra.io.util.DataOutputStreamAndChannel.write(DataOutputStreamAndChannel.java:48)
>> >>>>>>>>>>> ~[apache-cassandra-2.1.13.jar:2.1.13]
>> >>>>>>>>>>>         at
>> >>>>>>>>>>>
>> org.apache.cassandra.streaming.messages.StreamMessage.serialize(StreamMessage.java:44)
>> >>>>>>>>>>> ~[apache-cassandra-2.1.13.jar:2.1.13]
>> >>>>>>>>>>>         at
>> >>>>>>>>>>>
>> org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.sendMessage(ConnectionHandler.java:351)
>> >>>>>>>>>>> [apache-cassandra-2.1.13.jar:2.1.13]
>> >>>>>>>>>>>         at
>> >>>>>>>>>>>
>> org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.run(ConnectionHandler.java:323)
>> >>>>>>>>>>> [apache-cassandra-2.1.13.jar:2.1.13]
>> >>>>>>>>>>>         at java.lang.Thread.run(Unknown Source) [na:1.7.0_79]
>> >>>>>>>>>>> INFO  [STREAM-IN-/192.168.1.140] 2016-05-24 22:44:58,625
>> >>>>>>>>>>> StreamResultFuture.java:180 - [Stream
>> #2c290460-20d4-11e6-930f-1b05ac77baf9]
>> >>>>>>>>>>> Session with /192.168.1.140 is complete
>> >>>>>>>>>>> WARN  [STREAM-IN-/192.168.1.140] 2016-05-24 22:44:58,627
>> >>>>>>>>>>> StreamResultFuture.java:207 - [Stream
>> #2c290460-20d4-11e6-930f-1b05ac77baf9]
>> >>>>>>>>>>> Stream failed
>> >>>>>>>>>>> ERROR [RMI TCP Connection(24)-127.0.0.1] 2016-05-24
>> 22:44:58,628
>> >>>>>>>>>>> StorageService.java:1075 - Error while rebuilding node
>> >>>>>>>>>>> org.apache.cassandra.streaming.StreamException: Stream failed
>> >>>>>>>>>>>         at
>> >>>>>>>>>>>
>> org.apache.cassandra.streaming.management.StreamEventJMXNotifier.onFailure(StreamEventJMXNotifier.java:85)
>> >>>>>>>>>>> ~[apache-cassandra-2.1.13.jar:2.1.13]
>> >>>>>>>>>>>         at
>> >>>>>>>>>>>
>> com.google.common.util.concurrent.Futures$4.run(Futures.java:1172)
>> >>>>>>>>>>> ~[guava-16.0.jar:na]
>> >>>>>>>>>>>         at
>> >>>>>>>>>>>
>> com.google.common.util.concurrent.MoreExecutors$SameThreadExecutorService.execute(MoreExecutors.java:297)
>> >>>>>>>>>>> ~[guava-16.0.jar:na]
>> >>>>>>>>>>>         at
>> >>>>>>>>>>>
>> com.google.common.util.concurrent.ExecutionList.executeListener(ExecutionList.java:156)
>> >>>>>>>>>>> ~[guava-16.0.jar:na]
>> >>>>>>>>>>>         at
>> >>>>>>>>>>>
>> com.google.common.util.concurrent.ExecutionList.execute(ExecutionList.java:145)
>> >>>>>>>>>>> ~[guava-16.0.jar:na]
>> >>>>>>>>>>>         at
>> >>>>>>>>>>>
>> com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:202)
>> >>>>>>>>>>> ~[guava-16.0.jar:na]
>> >>>>>>>>>>>         at
>> >>>>>>>>>>>
>> org.apache.cassandra.streaming.StreamResultFuture.maybeComplete(StreamResultFuture.java:208)
>> >>>>>>>>>>> ~[apache-cassandra-2.1.13.jar:2.1.13]
>> >>>>>>>>>>>         at
>> >>>>>>>>>>>
>> org.apache.cassandra.streaming.StreamResultFuture.handleSessionComplete(StreamResultFuture.java:184)
>> >>>>>>>>>>> ~[apache-cassandra-2.1.13.jar:2.1.13]
>> >>>>>>>>>>>         at
>> >>>>>>>>>>>
>> org.apache.cassandra.streaming.StreamSession.closeSession(StreamSession.java:415)
>> >>>>>>>>>>> ~[apache-cassandra-2.1.13.jar:2.1.13]
>> >>>>>>>>>>>         at
>> >>>>>>>>>>>
>> org.apache.cassandra.streaming.StreamSession.sessionFailed(StreamSession.java:621)
>> >>>>>>>>>>> ~[apache-cassandra-2.1.13.jar:2.1.13]
>> >>>>>>>>>>>         at
>> >>>>>>>>>>>
>> org.apache.cassandra.streaming.StreamSession.messageReceived(StreamSession.java:475)
>> >>>>>>>>>>> ~[apache-cassandra-2.1.13.jar:2.1.13]
>> >>>>>>>>>>>         at
>> >>>>>>>>>>>
>> org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:256)
>> >>>>>>>>>>> ~[apache-cassandra-2.1.13.jar:2.1.13]
>> >>>>>>>>>>>         at java.lang.Thread.run(Unknown Source) ~[na:1.7.0_79]
>> >>>>>>>>>>> ERROR [STREAM-OUT-/192.168.1.140] 2016-05-24 22:44:58,629
>> >>>>>>>>>>> StreamSession.java:505 - [Stream
>> #2c290460-20d4-11e6-930f-1b05ac77baf9]
>> >>>>>>>>>>> Streaming error occurred
>> >>>>>>>>>>> java.io.IOException: Broken pipe
>> >>>>>>>>>>>         at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
>> >>>>>>>>>>> ~[na:1.7.0_79]
>> >>>>>>>>>>>         at sun.nio.ch.SocketDispatcher.write(Unknown Source)
>> >>>>>>>>>>> ~[na:1.7.0_79]
>> >>>>>>>>>>>         at sun.nio.ch.IOUtil.writeFromNativeBuffer(Unknown
>> >>>>>>>>>>> Source) ~[na:1.7.0_79]
>> >>>>>>>>>>>         at sun.nio.ch.IOUtil.write(Unknown Source)
>> ~[na:1.7.0_79]
>> >>>>>>>>>>>         at sun.nio.ch.SocketChannelImpl.write(Unknown Source)
>> >>>>>>>>>>> ~[na:1.7.0_79]
>> >>>>>>>>>>>         at
>> >>>>>>>>>>>
>> org.apache.cassandra.io.util.DataOutputStreamAndChannel.write(DataOutputStreamAndChannel.java:48)
>> >>>>>>>>>>> ~[apache-cassandra-2.1.13.jar:2.1.13]
>> >>>>>>>>>>>         at
>> >>>>>>>>>>>
>> org.apache.cassandra.streaming.messages.StreamMessage.serialize(StreamMessage.java:44)
>> >>>>>>>>>>> ~[apache-cassandra-2.1.13.jar:2.1.13]
>> >>>>>>>>>>>         at
>> >>>>>>>>>>>
>> org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.sendMessage(ConnectionHandler.java:351)
>> >>>>>>>>>>> [apache-cassandra-2.1.13.jar:2.1.13]
>> >>>>>>>>>>>         at
>> >>>>>>>>>>>
>> org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.run(ConnectionHandler.java:331)
>> >>>>>>>>>>> [apache-cassandra-2.1.13.jar:2.1.13]
>> >>>>>>>>>>>         at java.lang.Thread.run(Unknown Source) [na:1.7.0_79]
>>
>>
>>
>> --
>> Eric Evans
>> john.eric.ev...@gmail.com
>>
>
>
