I don't see an issue with the size of the data per node. You can attempt the rebuild again and play around with the streaming throughput if your network can handle it.
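Concretely, something like this on the source-DC nodes (a sketch only — the 400 Mb/s figure is just an illustration, not a recommendation; tune it to what your network tolerates):

```shell
# Show the current outbound streaming cap (ships with a 200 Mb/s default)
nodetool getstreamthroughput

# Raise it on the fly; takes effect without a restart
nodetool setstreamthroughput 400

# Or lift the throttle completely for the duration of the rebuild
nodetool setstreamthroughput 0
```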
It can be changed on-the-fly with nodetool: nodetool setstreamthroughput

This article is also worth a read:
https://support.datastax.com/hc/en-us/articles/205409646-How-to-performance-tune-data-streaming-activities-like-repair-and-bootstrap

-- Jacob Shadix

On Fri, Apr 7, 2017 at 9:23 AM, Roland Otta <roland.o...@willhaben.at> wrote:
> good point!
>
> on the source side i can see the following error
>
> ERROR [STREAM-OUT-/192.168.0.114:34094] 2017-04-06 17:18:56,532 StreamSession.java:529 - [Stream #41606030-1ad9-11e7-9f16-51230e2be4e9] Streaming error occurred on session with peer 10.192.116.1 through 192.168.0.114
> org.apache.cassandra.io.FSReadError: java.io.IOException: Broken pipe
>         at org.apache.cassandra.io.util.ChannelProxy.transferTo(ChannelProxy.java:145) ~[apache-cassandra-3.7.jar:3.7]
>         at org.apache.cassandra.streaming.compress.CompressedStreamWriter.lambda$write$0(CompressedStreamWriter.java:90) ~[apache-cassandra-3.7.jar:3.7]
>         at org.apache.cassandra.io.util.BufferedDataOutputStreamPlus.applyToChannel(BufferedDataOutputStreamPlus.java:350) ~[apache-cassandra-3.7.jar:3.7]
>         at org.apache.cassandra.streaming.compress.CompressedStreamWriter.write(CompressedStreamWriter.java:90) ~[apache-cassandra-3.7.jar:3.7]
>         at org.apache.cassandra.streaming.messages.OutgoingFileMessage.serialize(OutgoingFileMessage.java:91) ~[apache-cassandra-3.7.jar:3.7]
>         at org.apache.cassandra.streaming.messages.OutgoingFileMessage$1.serialize(OutgoingFileMessage.java:48) ~[apache-cassandra-3.7.jar:3.7]
>         at org.apache.cassandra.streaming.messages.OutgoingFileMessage$1.serialize(OutgoingFileMessage.java:40) ~[apache-cassandra-3.7.jar:3.7]
>         at org.apache.cassandra.streaming.messages.StreamMessage.serialize(StreamMessage.java:48) ~[apache-cassandra-3.7.jar:3.7]
>         at org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.sendMessage(ConnectionHandler.java:370) ~[apache-cassandra-3.7.jar:3.7]
>         at org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.run(ConnectionHandler.java:342) ~[apache-cassandra-3.7.jar:3.7]
>         at java.lang.Thread.run(Thread.java:745) [na:1.8.0_77]
> Caused by: java.io.IOException: Broken pipe
>         at sun.nio.ch.FileChannelImpl.transferTo0(Native Method) ~[na:1.8.0_77]
>         at sun.nio.ch.FileChannelImpl.transferToDirectlyInternal(FileChannelImpl.java:428) ~[na:1.8.0_77]
>         at sun.nio.ch.FileChannelImpl.transferToDirectly(FileChannelImpl.java:493) ~[na:1.8.0_77]
>         at sun.nio.ch.FileChannelImpl.transferTo(FileChannelImpl.java:608) ~[na:1.8.0_77]
>         at org.apache.cassandra.io.util.ChannelProxy.transferTo(ChannelProxy.java:141) ~[apache-cassandra-3.7.jar:3.7]
>         ... 10 common frames omitted
> DEBUG [STREAM-OUT-/192.168.0.114:34094] 2017-04-06 17:18:56,532 ConnectionHandler.java:110 - [Stream #41606030-1ad9-11e7-9f16-51230e2be4e9] Closing stream connection handler on /10.192.116.1
> INFO  [STREAM-OUT-/192.168.0.114:34094] 2017-04-06 17:18:56,532 StreamResultFuture.java:187 - [Stream #41606030-1ad9-11e7-9f16-51230e2be4e9] Session with /10.192.116.1 is complete
> WARN  [STREAM-OUT-/192.168.0.114:34094] 2017-04-06 17:18:56,532 StreamResultFuture.java:214 - [Stream #41606030-1ad9-11e7-9f16-51230e2be4e9] Stream failed
>
> the dataset is approx 300 GB / node.
>
> does that mean that cassandra does not try to reconnect (for streaming) in case of short network dropouts?
>
> On Fri, 2017-04-07 at 08:53 -0400, Jacob Shadix wrote:
>> Did you look at the logs on the source DC as well? How big is the dataset?
>>
>> -- Jacob Shadix
>>
>> On Fri, Apr 7, 2017 at 7:16 AM, Roland Otta <roland.o...@willhaben.at> wrote:
>>> Hi!
>>>
>>> we are on 3.7.
>>>
>>> we have some debug messages ... but i guess they are not related to that issue
>>>
>>> DEBUG [GossipStage:1] 2017-04-07 13:11:00,440 FailureDetector.java:456 - Ignoring interval time of 2002469610 for /192.168.0.27
>>> DEBUG [GossipStage:1] 2017-04-07 13:11:00,441 FailureDetector.java:456 - Ignoring interval time of 2598593732 for /10.192.116.4
>>> DEBUG [GossipStage:1] 2017-04-07 13:11:00,441 FailureDetector.java:456 - Ignoring interval time of 2002612298 for /10.192.116.5
>>> DEBUG [GossipStage:1] 2017-04-07 13:11:00,441 FailureDetector.java:456 - Ignoring interval time of 2002660534 for /10.192.116.9
>>> DEBUG [GossipStage:1] 2017-04-07 13:11:00,465 FailureDetector.java:456 - Ignoring interval time of 2027212880 for /10.192.116.3
>>> DEBUG [GossipStage:1] 2017-04-07 13:11:00,465 FailureDetector.java:456 - Ignoring interval time of 2027279042 for /192.168.0.188
>>> DEBUG [GossipStage:1] 2017-04-07 13:11:00,465 FailureDetector.java:456 - Ignoring interval time of 2027313992 for /10.192.116.10
>>>
>>> beside that the debug.log is clean
>>>
>>> all the mentioned cassandra.yaml parameters are the shipped defaults (streaming_socket_timeout_in_ms does not exist at all in my cassandra.yaml)
>>> i also checked the pending compactions. there are no pending compactions at the moment.
>>>
>>> bg - roland otta
>>>
>>> On Fri, 2017-04-07 at 06:47 -0400, Jacob Shadix wrote:
>>>> What version are you running? Do you see any errors in the system.log (SocketTimeout, for instance)?
>>>>
>>>> And what values do you have for the following in cassandra.yaml:
>>>> - stream_throughput_outbound_megabits_per_sec
>>>> - compaction_throughput_mb_per_sec
>>>> - streaming_socket_timeout_in_ms
>>>>
>>>> -- Jacob Shadix
>>>>
>>>> On Fri, Apr 7, 2017 at 6:00 AM, Roland Otta <roland.o...@willhaben.at> wrote:
>>>>> hi,
>>>>>
>>>>> we are trying to set up a new datacenter and are initializing the data with nodetool rebuild.
>>>>>
>>>>> after some hours it seems that the node stopped streaming (at least there is no more streaming traffic on the network interface).
>>>>>
>>>>> nodetool netstats shows that the streaming is still in progress:
>>>>>
>>>>> Mode: NORMAL
>>>>> Bootstrap 6918dc90-1ad6-11e7-9f16-51230e2be4e9
>>>>> Rebuild 41606030-1ad9-11e7-9f16-51230e2be4e9
>>>>>     /192.168.0.26
>>>>>         Receiving 257 files, 145444246572 bytes total. Already received 1 files, 1744027 bytes total
>>>>>             bds/adcounter_total 76456/47310255 bytes(0%) received from idx:0/192.168.0.26
>>>>>             bds/upselling_event 1667571/1667571 bytes(100%) received from idx:0/192.168.0.26
>>>>>     /192.168.0.188
>>>>>     /192.168.0.27
>>>>>         Receiving 169 files, 79355302464 bytes total. Already received 1 files, 81585975 bytes total
>>>>>             bds/ad_event_history 81585975/81585975 bytes(100%) received from idx:0/192.168.0.27
>>>>>     /192.168.0.189
>>>>>         Receiving 140 files, 19673034809 bytes total. Already received 1 files, 5996604 bytes total
>>>>>             bds/adcounter_per_day 5956840/42259846 bytes(14%) received from idx:0/192.168.0.189
>>>>>             bds/user_event 39764/39764 bytes(100%) received from idx:0/192.168.0.189
>>>>> Read Repair Statistics:
>>>>> Attempted: 0
>>>>> Mismatch (Blocking): 0
>>>>> Mismatch (Background): 0
>>>>> Pool Name        Active   Pending   Completed   Dropped
>>>>> Large messages      n/a         2           3         0
>>>>> Small messages      n/a         0    68632465         0
>>>>> Gossip messages     n/a         0      217661         0
>>>>>
>>>>> it is in that state for approx 15 hours now
>>>>>
>>>>> does it make sense waiting for the streaming to finish or do i have to restart the node, discard data and restart the rebuild?
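For what it's worth, a back-of-the-envelope check on whether the volumes in the netstats output square with a multi-hour rebuild (a sketch only; it assumes Cassandra's shipped default cap of 200 Mb/s for stream_throughput_outbound_megabits_per_sec and ignores compaction and disk overhead):

```python
# Lower-bound streaming time for a given data volume at a given throughput cap.
# 200 Mb/s is the shipped default for stream_throughput_outbound_megabits_per_sec.

def stream_hours(total_bytes: int, cap_megabits_per_sec: float = 200.0) -> float:
    """Best-case transfer time in hours at the given per-node throughput cap."""
    bits = total_bytes * 8
    seconds = bits / (cap_megabits_per_sec * 1_000_000)
    return seconds / 3600

# Largest pending session in the netstats output above: ~145.4 GB
print(f"{stream_hours(145444246572):.1f} h")       # -> 1.6 h
# The full ~300 GB per node mentioned in the thread:
print(f"{stream_hours(300_000_000_000):.1f} h")    # -> 3.3 h
```

By that math even the whole per-node dataset should clear in a few hours at the default cap, so 15 hours with no traffic on the interface suggests a dead session (consistent with the "Stream failed" on the source side) rather than a merely slow one.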