Logged "nodetool failuredetector" every 5 seconds. It doesn't seem to be an issue with the phi_convict_threshold value.
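
For anyone who wants to repeat the check, here is a minimal sketch of that kind of periodic logging. The helper name log_every, the sample count, and the log file name are assumptions for illustration only; the only real command involved is nodetool failuredetector.

```shell
#!/bin/sh
# Hypothetical helper (not part of Cassandra): run a command COUNT times,
# INTERVAL seconds apart, printing a UTC timestamp line before each run.
log_every() {
  interval=$1
  count=$2
  shift 2
  i=0
  while [ "$i" -lt "$count" ]; do
    printf '=== %s ===\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)"
    "$@"
    i=$((i + 1))
    # Sleep between samples, but not after the last one.
    if [ "$i" -lt "$count" ]; then
      sleep "$interval"
    fi
  done
}

# Example (not run here): one hour of samples at 5-second intervals.
# log_every 5 720 nodetool failuredetector >> failuredetector.log 2>&1
```

The phi values in the resulting log can then be compared against the configured phi_convict_threshold for the period in which the streams failed.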

On Tue, Oct 5, 2021 at 4:35 PM Surbhi Gupta <surbhi.gupt...@gmail.com> wrote:

> Hi,
>
> Try to adjust phi_convict_threshold and see if that helps.
> When we migrated from on-prem to AWS, this was one of the factors to
> consider.
>
> Thanks
>
>
> On Tue, Oct 5, 2021 at 4:00 AM MyWorld <timeplus.1...@gmail.com> wrote:
>
>> Hi all,
>>
>> Need urgent help.
>> We have one physical data center of 5 nodes with 1 TB of data on each
>> (Location: Dallas). Currently we are using Cassandra ver 3.0.9. Now we are
>> adding one more data center of 5 nodes (Location: GCP-US) and have joined it
>> to the existing one.
>>
>> While running the nodetool rebuild command, we are getting the following error:
>> On GCP node (where we ran the rebuild command):
>>
>>> ERROR [STREAM-IN-/192.x.x.x] 2021-10-05 15:56:52,246 StreamSession.java:639 - [Stream #66646d30-25a2-11ec-903b-774f88efe725] Remote peer 192.x.x.x failed stream session.
>>> INFO [STREAM-IN-/192.x.x.x] 2021-10-05 15:56:52,266 StreamResultFuture.java:183 - [Stream #66646d30-25a2-11ec-903b-774f88efe725] Session with /192.x.x.x is complete
>>
>>
>> On DL source node:
>>
>>> INFO [STREAM-IN-/34.x.x.x] 2021-10-05 15:55:53,785 StreamResultFuture.java:183 - [Stream #66646d30-25a2-11ec-903b-774f88efe725] Session with /34.x.x.x is complete
>>> ERROR [STREAM-OUT-/34.x.x.x] 2021-10-05 15:55:53,785 StreamSession.java:534 - [Stream #66646d30-25a2-11ec-903b-774f88efe725] Streaming error occurred
>>> java.lang.RuntimeException: Transfer of file /var/lib/cassandra/data/clickstream/glusr_usr_paid_url_mv-3c49c392b35511e9bd0a8f42dfb09617/mc-45676-big-Data.db already completed or aborted (perhaps session failed?).
>>>     at org.apache.cassandra.streaming.messages.OutgoingFileMessage.startTransfer(OutgoingFileMessage.java:120) ~[apache-cassandra-3.0.9.jar:3.0.9]
>>>     at org.apache.cassandra.streaming.messages.OutgoingFileMessage$1.serialize(OutgoingFileMessage.java:50) ~[apache-cassandra-3.0.9.jar:3.0.9]
>>>     at org.apache.cassandra.streaming.messages.OutgoingFileMessage$1.serialize(OutgoingFileMessage.java:42) ~[apache-cassandra-3.0.9.jar:3.0.9]
>>>     at org.apache.cassandra.streaming.messages.StreamMessage.serialize(StreamMessage.java:48) ~[apache-cassandra-3.0.9.jar:3.0.9]
>>>     at org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.sendMessage(ConnectionHandler.java:387) ~[apache-cassandra-3.0.9.jar:3.0.9]
>>>     at org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.run(ConnectionHandler.java:367) ~[apache-cassandra-3.0.9.jar:3.0.9]
>>>     at java.lang.Thread.run(Thread.java:748) [na:1.8.0_192]
>>> WARN [STREAM-IN-/34.x.x.x] 2021-10-05 15:55:53,786 StreamResultFuture.java:210 - [Stream #66646d30-25a2-11ec-903b-774f88efe725] Stream failed
>>
>>
>> Before starting this rebuild, we made the following changes:
>> 1. Set setstreamthroughput to 600 Mb/sec
>> 2. Set setinterdcstreamthroughput to 600 Mb/sec
>> 3. streaming_socket_timeout_in_ms is 24 hrs
>> 4. Disabled autocompaction on the GCP node, as it was heavily utilising CPU
>> resources
>>
>> FYI, the GCP rebuild process starts with data streaming from 3 nodes, and all
>> fail one by one after streaming for a few hours.
>> Please help us correct this issue.
>> Is there any other way to rebuild such big data?
>> We have a few tables with 200 - 400 GB of data and some smaller tables.
>> Also, we have materialized views in our environment.
>>
>> Regards,
>> Ashish Gupta
>>
>>
>
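
For reference, the four pre-rebuild changes listed in the quoted message could be written out roughly as below. This is only a dry-run sketch that prints the commands and the cassandra.yaml line rather than applying anything; the variable names and the print_settings function are assumptions, and the conversion of 24 hours to milliseconds is the only computed value.

```shell
#!/bin/sh
# Dry run: print the settings from the list in the quoted message; nothing is applied.
# Both throughput values are in megabits per second, the unit nodetool uses.
STREAM_MBITS=600
INTERDC_MBITS=600
TIMEOUT_HOURS=24

# cassandra.yaml takes the streaming socket timeout in milliseconds.
TIMEOUT_MS=$((TIMEOUT_HOURS * 60 * 60 * 1000))

print_settings() {
  echo "nodetool setstreamthroughput $STREAM_MBITS"
  echo "nodetool setinterdcstreamthroughput $INTERDC_MBITS"
  echo "nodetool disableautocompaction"
  echo "streaming_socket_timeout_in_ms: $TIMEOUT_MS"
}

print_settings
```

Note that 600 megabits per second is about 75 megabytes per second, which is worth keeping in mind when estimating how long the 200 - 400 GB tables take to stream.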