Re: New node block in autobootstrap
> Forgot to set replication for new data center :(

I was feeling like it could be it :-). From the other thread:

> It should be run from DC3 servers, after altering keyspace to add keyspaces to the new datacenter. Is this the way you're doing it?
>
> - Are all the nodes using the same version ('nodetool version')?
> - What does 'nodetool status keyspace_name1' output?
> - Are you sure to be using Network Topology Strategy on 'keyspace_name1'? Have you modified this schema to add replications on DC3?
>
> My guess is something could be wrong with the configuration.

I was starting to wonder about this one though, so thanks for letting us know about it :-).

C*heers,
---
Alain Rodriguez - @arodream - al...@thelastpickle.com
France

The Last Pickle - Apache Cassandra Consulting
http://www.thelastpickle.com

2016-09-28 23:54 GMT+02:00 techpyaasa .:
> Forgot to set replication for new data center :(
>
> On Wed, Sep 28, 2016 at 11:33 PM, Jonathan Haddad wrote:
>> What was the reason?
>>
>> On Wed, Sep 28, 2016 at 9:58 AM techpyaasa . wrote:
>>> Very sorry...I got the reason for this issue..
>>> Please ignore.
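Since the root cause here was a keyspace that did not replicate to the new data center, a small sketch of the schema change may help. The helper below only builds the CQL text. The function name and the replication factors are assumptions for illustration (the thread never states them); the keyspace and data center names are the ones mentioned in the thread. The replication map in ALTER KEYSPACE replaces the previous one, so every data center that should hold replicas has to be listed, not just the new one.

```python
from typing import Mapping


def alter_keyspace_cql(keyspace: str, replication_by_dc: Mapping[str, int]) -> str:
    """Build an ALTER KEYSPACE statement using NetworkTopologyStrategy.

    Hypothetical helper: it only formats a string, it does not talk to a cluster.
    """
    if not replication_by_dc:
        raise ValueError("at least one data center is required")
    options = ", ".join(f"'{dc}': {rf}" for dc, rf in replication_by_dc.items())
    return (
        f"ALTER KEYSPACE {keyspace} WITH replication = "
        f"{{'class': 'NetworkTopologyStrategy', {options}}};"
    )


# Replication factors of 3 are assumed values, not taken from the thread.
statement = alter_keyspace_cql("keyspace_name1", {"DC1": 3, "DC3": 3})
print(statement)
```

The resulting statement would be run once (for example from cqlsh) before 'nodetool rebuild' is started on the DC3 nodes; 'nodetool status keyspace_name1' can then be used to check ownership, as suggested above.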
Re: New node block in autobootstrap
Forgot to set replication for new data center :(

On Wed, Sep 28, 2016 at 11:33 PM, Jonathan Haddad wrote:
> What was the reason?
>
> On Wed, Sep 28, 2016 at 9:58 AM techpyaasa . wrote:
>> Very sorry...I got the reason for this issue..
>> Please ignore.
Re: New node block in autobootstrap
What was the reason?

On Wed, Sep 28, 2016 at 9:58 AM techpyaasa . wrote:
> Very sorry...I got the reason for this issue..
> Please ignore.
Re: New node block in autobootstrap
Very sorry...I got the reason for this issue..
Please ignore.

On Wed, Sep 28, 2016 at 10:14 PM, techpyaasa . wrote:
> @Paulo
>
> We have done changes as you said
> net.ipv4.tcp_keepalive_time=60
> net.ipv4.tcp_keepalive_probes=3
> net.ipv4.tcp_keepalive_intvl=10
>
> and increased streaming_socket_timeout_in_ms to 48 hours ,
> "phi_convict_threshold : 9".
>
> And once again recommissioned new data center (DC3) , ran " nodetool
> rebuild 'DC1' " , but this time NO data got streamed and 'nodetool rebuild'
> got exit without any exception.
Re: New node block in autobootstrap
@Paulo

We have made the changes you suggested:

net.ipv4.tcp_keepalive_time=60
net.ipv4.tcp_keepalive_probes=3
net.ipv4.tcp_keepalive_intvl=10

We also increased streaming_socket_timeout_in_ms to 48 hours and set "phi_convict_threshold : 9".

We then recommissioned the new data center (DC3) once again and ran "nodetool rebuild 'DC1'", but this time NO data got streamed and 'nodetool rebuild' exited without any exception.

Please check the logs below:

INFO [RMI TCP Connection(10)-xxx.xxx.12.140] 2016-09-28 09:18:44,571 StorageService.java (line 914) rebuild from dc: IDC
INFO [RMI TCP Connection(10)-xxx.xxx.12.140] 2016-09-28 09:18:47,520 StreamResultFuture.java (line 87) [Stream #3a47f8d0-8597-11e6-bd17-3f6744d54a01] Executing streaming plan for Rebuild
INFO [RMI TCP Connection(10)-xxx.xxx.12.140] 2016-09-28 09:18:47,521 StreamResultFuture.java (line 91) [Stream #3a47f8d0-8597-11e6-bd17-3f6744d54a01] Beginning stream session with /xxx.xxx.198.75
INFO [RMI TCP Connection(10)-xxx.xxx.12.140] 2016-09-28 09:18:47,522 StreamResultFuture.java (line 91) [Stream #3a47f8d0-8597-11e6-bd17-3f6744d54a01] Beginning stream session with /xxx.xxx.198.132
INFO [StreamConnectionEstablisher:1] 2016-09-28 09:18:47,522 StreamSession.java (line 214) [Stream #3a47f8d0-8597-11e6-bd17-3f6744d54a01] Starting streaming to /xxx.xxx.198.75
INFO [RMI TCP Connection(10)-xxx.xxx.12.140] 2016-09-28 09:18:47,522 StreamResultFuture.java (line 91) [Stream #3a47f8d0-8597-11e6-bd17-3f6744d54a01] Beginning stream session with /xxx.xxx.198.133
INFO [StreamConnectionEstablisher:2] 2016-09-28 09:18:47,522 StreamSession.java (line 214) [Stream #3a47f8d0-8597-11e6-bd17-3f6744d54a01] Starting streaming to /xxx.xxx.198.132
INFO [StreamConnectionEstablisher:3] 2016-09-28 09:18:47,523 StreamSession.java (line 214) [Stream #3a47f8d0-8597-11e6-bd17-3f6744d54a01] Starting streaming to /xxx.xxx.198.133
INFO [RMI TCP Connection(10)-xxx.xxx.12.140] 2016-09-28 09:18:47,523 StreamResultFuture.java (line 91) [Stream #3a47f8d0-8597-11e6-bd17-3f6744d54a01] Beginning stream session with /xxx.xxx.198.167
INFO [RMI TCP Connection(10)-xxx.xxx.12.140] 2016-09-28 09:18:47,524 StreamResultFuture.java (line 91) [Stream #3a47f8d0-8597-11e6-bd17-3f6744d54a01] Beginning stream session with /xxx.xxx.198.78
INFO [StreamConnectionEstablisher:4] 2016-09-28 09:18:47,524 StreamSession.java (line 214) [Stream #3a47f8d0-8597-11e6-bd17-3f6744d54a01] Starting streaming to /xxx.xxx.198.167
INFO [StreamConnectionEstablisher:5] 2016-09-28 09:18:47,525 StreamSession.java (line 214) [Stream #3a47f8d0-8597-11e6-bd17-3f6744d54a01] Starting streaming to /xxx.xxx.198.78
INFO [RMI TCP Connection(10)-xxx.xxx.12.140] 2016-09-28 09:18:47,524 StreamResultFuture.java (line 91) [Stream #3a47f8d0-8597-11e6-bd17-3f6744d54a01] Beginning stream session with /xxx.xxx.198.126
INFO [RMI TCP Connection(10)-xxx.xxx.12.140] 2016-09-28 09:18:47,525 StreamResultFuture.java (line 91) [Stream #3a47f8d0-8597-11e6-bd17-3f6744d54a01] Beginning stream session with /xxx.xxx.198.191
INFO [StreamConnectionEstablisher:6] 2016-09-28 09:18:47,526 StreamSession.java (line 214) [Stream #3a47f8d0-8597-11e6-bd17-3f6744d54a01] Starting streaming to /xxx.xxx.198.126
INFO [StreamConnectionEstablisher:7] 2016-09-28 09:18:47,526 StreamSession.java (line 214) [Stream #3a47f8d0-8597-11e6-bd17-3f6744d54a01] Starting streaming to /xxx.xxx.198.191
INFO [RMI TCP Connection(10)-xxx.xxx.12.140] 2016-09-28 09:18:47,526 StreamResultFuture.java (line 91) [Stream #3a47f8d0-8597-11e6-bd17-3f6744d54a01] Beginning stream session with /xxx.xxx.198.168
INFO [RMI TCP Connection(10)-xxx.xxx.12.140] 2016-09-28 09:18:47,527 StreamResultFuture.java (line 91) [Stream #3a47f8d0-8597-11e6-bd17-3f6744d54a01] Beginning stream session with /xxx.xxx.198.169
INFO [StreamConnectionEstablisher:8] 2016-09-28 09:18:47,527 StreamSession.java (line 214) [Stream #3a47f8d0-8597-11e6-bd17-3f6744d54a01] Starting streaming to /xxx.xxx.198.168
INFO [StreamConnectionEstablisher:9] 2016-09-28 09:18:47,528 StreamSession.java (line 214) [Stream #3a47f8d0-8597-11e6-bd17-3f6744d54a01] Starting streaming to /xxx.xxx.198.169
INFO [STREAM-IN-/xxx.xxx.198.132] 2016-09-28 09:18:47,713 StreamResultFuture.java (line 186) [Stream #3a47f8d0-8597-11e6-bd17-3f6744d54a01] Session with /xxx.xxx.198.132 is complete
INFO [STREAM-IN-/xxx.xxx.198.191] 2016-09-28 09:18:47,715 StreamResultFuture.java (line 186) [Stream #3a47f8d0-8597-11e6-bd17-3f6744d54a01] Session with /xxx.xxx.198.191 is complete
INFO [STREAM-IN-/xxx.xxx.198.133] 2016-09-28 09:18:47,716 StreamResultFuture.java (line 186) [Stream #3a47f8d0-8597-11e6-bd17-3f6744d54a01] Session with /xxx.xxx.198.133 is complete
INFO [STREAM-IN-/xxx.xxx.198.169] 2016-09-28 09:18:47,716 StreamResultFuture.java (line 186) [Stream #3a47f8d0-8597-11e6-bd17-3f6744d54a01] Session with /xxx.xxx.198.169 is complete
INFO [STREAM-IN-/xxx.xxx.198.167] 2016-09-28 09:18:47,715 StreamResultFuture.java (line 186)
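As its name suggests, streaming_socket_timeout_in_ms is expressed in milliseconds, so "48 hours" has to be converted before it goes into the configuration. A minimal sketch of that arithmetic follows; the helper name is made up for illustration.

```python
MS_PER_HOUR = 60 * 60 * 1000  # 3,600,000 ms in one hour


def hours_to_ms(hours: int) -> int:
    """Convert a duration in hours to milliseconds."""
    return hours * MS_PER_HOUR


timeout_24h = hours_to_ms(24)
timeout_48h = hours_to_ms(48)
print(timeout_24h)  # 86400000
print(timeout_48h)  # 172800000
```

A value several orders of magnitude smaller than these is worth double-checking, since it would give a timeout of seconds rather than hours.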
Re: New node block in autobootstrap
Ok... Thanks for the reply...

I'm going to retry nodetool rebuild with the following changes, as you said:

net.ipv4.tcp_keepalive_time=60
net.ipv4.tcp_keepalive_probes=3
net.ipv4.tcp_keepalive_intvl=10

I hope these changes are enough on the new node where I'm running 'nodetool rebuild', and that they are NOT required on all the existing nodes from which data is going to be streamed. Am I right?

On Sep 28, 2016 1:04 AM, "Paulo Motta" wrote:
> Yeah this is likely to be caused by idle connections being shut down, so
> you may need to update your tcp_keepalive* and/or network/firewall settings.
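To make the three keepalive values concrete, here is a rough sketch of what they imply on Linux: the first probe goes out after tcp_keepalive_time seconds of idleness, and an unresponsive peer is given up on after tcp_keepalive_probes further probes spaced tcp_keepalive_intvl seconds apart. The function names and the 300-second firewall timeout are assumptions for illustration; the thread does not say what the firewall's idle timeout is.

```python
def dead_peer_detection_seconds(time_s: int, probes: int, intvl_s: int) -> int:
    """Approximate idle time before an unresponsive peer is declared dead."""
    return time_s + probes * intvl_s


def first_probe_beats_firewall(time_s: int, firewall_idle_timeout_s: int) -> bool:
    """True if the first keepalive probe is sent before the firewall would drop the idle flow."""
    return time_s < firewall_idle_timeout_s


# Values from the thread: 60 s idle, 3 probes, 10 s apart.
detection = dead_peer_detection_seconds(60, 3, 10)
print(detection)  # 90

# 300 s is a hypothetical firewall idle timeout, used only as an example.
print(first_probe_beats_firewall(60, 300))  # True
```

The point of the small tcp_keepalive_time is the second check: probes only keep an idle streaming connection open if they start before any firewall or NAT device in the path times the connection out.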
Re: New node block in autobootstrap
Yes, this is most likely caused by idle connections being shut down, so you may need to update your tcp_keepalive* and/or network/firewall settings.

2016-09-27 15:29 GMT-03:00 laxmikanth sadula:
> Hi Paulo,
>
> Thanks for the reply...
>
> I'm getting the following streaming exceptions during nodetool rebuild
> in C* 2.0.17
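To make the keepalive advice a little more concrete, here is a minimal sketch of the reasoning, not a tuning recommendation. It assumes the usual Linux semantics: the first probe is sent after tcp_keepalive_time seconds of idleness, and the peer is declared dead after tcp_keepalive_probes unanswered probes spaced tcp_keepalive_intvl seconds apart. The 3600-second firewall idle timeout is a purely hypothetical figure, the class and method names are illustrative, and the 60/3/10 values are the ones reported elsewhere in this thread.

```python
from dataclasses import dataclass


@dataclass
class Keepalive:
    time_s: int   # net.ipv4.tcp_keepalive_time: idle seconds before the first probe
    probes: int   # net.ipv4.tcp_keepalive_probes: unanswered probes before giving up
    intvl_s: int  # net.ipv4.tcp_keepalive_intvl: seconds between probes

    def survives_idle_timeout(self, firewall_idle_timeout_s: int) -> bool:
        """True if a probe is sent before the firewall would drop the idle connection."""
        return self.time_s < firewall_idle_timeout_s

    def dead_peer_detection_s(self) -> int:
        """Seconds of silence before the kernel reports the connection as dead."""
        return self.time_s + self.probes * self.intvl_s


# Stock Linux defaults versus the values reported elsewhere in this thread.
LINUX_DEFAULT = Keepalive(time_s=7200, probes=9, intvl_s=75)
THREAD_VALUES = Keepalive(time_s=60, probes=3, intvl_s=10)

# Hypothetical firewall that silently drops connections idle for one hour.
FIREWALL_IDLE_TIMEOUT_S = 3600

for name, ka in (("default", LINUX_DEFAULT), ("thread", THREAD_VALUES)):
    print(name, ka.survives_idle_timeout(FIREWALL_IDLE_TIMEOUT_S), ka.dead_peer_detection_s())
```

With the stock defaults the first probe would only go out after two hours, long after such a firewall has dropped the connection, which is consistent with the "Connection timed out" and "Broken pipe" errors seen on long streaming sessions.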
Re: New node block in autobootstrap
Hi Paulo,

Thanks for the reply...

I'm getting the following streaming exceptions during nodetool rebuild in C* 2.0.17:

04:24:49,759 StreamSession.java (line 461) [Stream #5e1b7f40-8496-11e6-8847-1b88665e430d] Streaming error occurred
java.io.IOException: Connection timed out
    at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
    at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47)
    at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93)
    at sun.nio.ch.IOUtil.write(IOUtil.java:65)
    at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:487)
    at org.apache.cassandra.streaming.messages.StreamMessage.serialize(StreamMessage.java:44)
    at org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.sendMessage(ConnectionHandler.java:339)
    at org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.run(ConnectionHandler.java:311)
    at java.lang.Thread.run(Thread.java:745)
DEBUG [STREAM-OUT-/xxx.xxx.98.168] 2016-09-27 04:24:49,764 ConnectionHandler.java (line 104) [Stream #5e1b7f40-8496-11e6-8847-1b88665e430d] Closing stream connection handler on /xxx.xxx.98.168
 INFO [STREAM-OUT-/xxx.xxx.98.168] 2016-09-27 04:24:49,764 StreamResultFuture.java (line 186) [Stream #5e1b7f40-8496-11e6-8847-1b88665e430d] Session with /xxx.xxx.98.168 is complete
ERROR [STREAM-OUT-/xxx.xxx.98.168] 2016-09-27 04:24:49,764 StreamSession.java (line 461) [Stream #5e1b7f40-8496-11e6-8847-1b88665e430d] Streaming error occurred
java.io.IOException: Broken pipe
    at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
    at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47)
    at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93)
    at sun.nio.ch.IOUtil.write(IOUtil.java:65)
    at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:487)
    at org.apache.cassandra.streaming.messages.StreamMessage.serialize(StreamMessage.java:44)
    at org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.sendMessage(ConnectionHandler.java:339)
    at org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.run(ConnectionHandler.java:319)
    at java.lang.Thread.run(Thread.java:745)
DEBUG [STREAM-IN-/xxx.xxx.98.168] 2016-09-27 04:24:49,909 ConnectionHandler.java (line 244) [Stream #5e1b7f40-8496-11e6-8847-1b88665e430d] Received File (Header (cfId: 68af9ee0-96f8-3b1d-a418-e5ae844f2cc2, #3, version: jb, estimated keys: 4736, transfer size: 2306880, compressed?: true), file: /home/cassandra/data_directories/data/keyspace_name1/archiving_metadata/keyspace_name1-archiving_metadata-tmp-jb-27-Data.db)
ERROR [STREAM-IN-/xxx.xxx.98.168] 2016-09-27 04:24:49,909 StreamSession.java (line 461) [Stream #5e1b7f40-8496-11e6-8847-1b88665e430d] Streaming error occurred
java.lang.RuntimeException: Outgoing stream handler has been closed
    at org.apache.cassandra.streaming.ConnectionHandler.sendMessage(ConnectionHandler.java:126)
    at org.apache.cassandra.streaming.StreamSession.receive(StreamSession.java:524)
    at org.apache.cassandra.streaming.StreamSession.messageReceived(StreamSession.java:413)
    at org.apache.cassandra.streaming.ConnectionHandler$IncomingMessageHandler.run(ConnectionHandler.java:245)
    at java.lang.Thread.run(Thread.java:745)

On Sep 27, 2016 11:48 PM, "Paulo Motta" wrote:
> What type of streaming timeout are you getting? Do you have a stack trace?
> What version are you on?
>
> See more information about tuning tcp_keepalive* here:
> https://docs.datastax.com/en/cassandra/2.0/cassandra/troubleshooting/trblshootIdleFirewall.html
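For anyone who needs to look for the same failures in their own logs, the following is a rough sketch of a scanner for lines like the ones above. The regular expression is an assumption derived only from the log lines quoted in this message; other Cassandra versions lay out their log lines differently, so it may need adjusting. The function name is illustrative.

```python
import re

# Matches ERROR lines emitted by a streaming thread that report a streaming error,
# capturing the thread name (which carries the peer address) and the stream id.
STREAM_ERROR = re.compile(
    r"ERROR\s+\[(?P<thread>STREAM-(?:IN|OUT)-/[^\]]+)\].*?"
    r"\[Stream #(?P<stream_id>[0-9a-f-]+)\]\s+Streaming error occurred"
)


def find_stream_errors(lines):
    """Return (thread, stream_id) pairs for every streaming error line found."""
    hits = []
    for line in lines:
        match = STREAM_ERROR.search(line)
        if match:
            hits.append((match.group("thread"), match.group("stream_id")))
    return hits


# Two lines copied from the log excerpt above: one error, one informational.
sample = [
    "ERROR [STREAM-OUT-/xxx.xxx.98.168] 2016-09-27 04:24:49,764 StreamSession.java "
    "(line 461) [Stream #5e1b7f40-8496-11e6-8847-1b88665e430d] Streaming error occurred",
    " INFO [STREAM-OUT-/xxx.xxx.98.168] 2016-09-27 04:24:49,764 StreamResultFuture.java "
    "(line 186) [Stream #5e1b7f40-8496-11e6-8847-1b88665e430d] Session with /xxx.xxx.98.168 is complete",
]

for thread, stream_id in find_stream_errors(sample):
    print(thread, stream_id)
```

To run it against a real log, pass an open file object instead of the sample list; the exception itself is on the line that follows each match.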
Re: New node block in autobootstrap
What type of streaming timeout are you getting? Do you have a stack trace? What version are you on?

See more information about tuning tcp_keepalive* here:
https://docs.datastax.com/en/cassandra/2.0/cassandra/troubleshooting/trblshootIdleFirewall.html

2016-09-27 14:07 GMT-03:00 laxmikanth sadula:
> @Paulo Motta
>
> We are also facing streaming timeout exceptions during 'nodetool rebuild'.
> I set streaming_socket_timeout_in_ms to 8640 (24 hours) as suggested in
> the DataStax blog -
> https://support.datastax.com/hc/en-us/articles/206502913-FAQ-How-to-reduce-the-impact-of-streaming-errors-or-failures
> - but we are still getting streaming exceptions.
>
> Also, what are the recommended settings/values for kernel tcp_keepalive
> that would help streaming succeed?
>
> Thank you
Re: New node block in autobootstrap
@Paulo Motta

We are also facing streaming timeout exceptions during 'nodetool rebuild'. I set streaming_socket_timeout_in_ms to 8640 (24 hours) as suggested in the DataStax blog - https://support.datastax.com/hc/en-us/articles/206502913-FAQ-How-to-reduce-the-impact-of-streaming-errors-or-failures - but we are still getting streaming exceptions.

Also, what are the recommended settings/values for kernel tcp_keepalive that would help streaming succeed?

Thank you

On Tue, Aug 16, 2016 at 12:21 AM, Paulo Motta wrote:
> What version are you on? This looks like a typical case where there was a
> problem with streaming (hanging, etc.). Do you have access to the logs?
> Maybe look for streaming errors. Streaming errors are typically related to
> timeouts, so you should review your Cassandra
> streaming_socket_timeout_in_ms and kernel tcp_keepalive settings.
>
> If you're on 2.2+ you can resume a failed bootstrap with nodetool
> bootstrap resume. Some streaming hang problems were also fixed recently,
> so I'd advise you to upgrade to the latest version of your particular
> series for a more robust version.
>
> Is there any reason why you didn't use the replace procedure
> (-Dreplace_address) to replace the node with the same tokens? This would
> be a bit faster than the remove + bootstrap procedure.

--
Regards,
Laxmikanth
99621 38051
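As a quick sanity check on the units, here is a small sketch of the arithmetic. The helper names are illustrative and not part of Cassandra; the only assumption is the one the setting's name states, namely that streaming_socket_timeout_in_ms is given in milliseconds. On that basis 24 hours is 86400000 ms, while 8640 ms is under nine seconds, so the figure quoted above may have lost some digits and is worth double-checking in cassandra.yaml.

```python
MS_PER_HOUR = 60 * 60 * 1000


def hours_to_ms(hours):
    """Convert a duration in hours to whole milliseconds."""
    return int(hours * MS_PER_HOUR)


def ms_to_seconds(ms):
    """Convert a duration in milliseconds to seconds."""
    return ms / 1000.0


print(hours_to_ms(24))      # value to use for a 24-hour timeout
print(hours_to_ms(48))      # value to use for a 48-hour timeout
print(ms_to_seconds(8640))  # how long 8640 ms actually is, in seconds
```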
Re: New node block in autobootstrap
Hello Paulo,

Thank you for your reply.

The version is 2.2.6.

I received the logs today and can confirm that three streams failed after a timeout. We will try to resume the bootstrap as you recommended.

I didn't use -Dreplace_address for two reasons:

   1. Someone had already tried to reset the node somehow. Because this person is on vacation, nobody really knows what he did. I suppose he just trashed the data directory and launched the node again, without -Dreplace_address and without removing the node first. I was unsure how valid the tokens were, so I preferred to remove the node and go back to a clean situation.
   2. Since the node being replaced and the new node have the same endpoint address (this is a fresh version of the same node), I was not sure that replace_address would not get confused.

Since I had time and was not sure that replacing the node would work in my situation, I chose the slow, safe way. Maybe I could have used it.

--
Jérôme Mainaud
jer...@mainaud.com

2016-08-15 20:51 GMT+02:00 Paulo Motta:
> What version are you on? This looks like a typical case where there was a
> problem with streaming (hanging, etc.). Do you have access to the logs?
> Maybe look for streaming errors. Streaming errors are typically related to
> timeouts, so you should review your Cassandra
> streaming_socket_timeout_in_ms and kernel tcp_keepalive settings.
>
> If you're on 2.2+ you can resume a failed bootstrap with nodetool
> bootstrap resume. Some streaming hang problems were also fixed recently,
> so I'd advise you to upgrade to the latest version of your particular
> series for a more robust version.
>
> Is there any reason why you didn't use the replace procedure
> (-Dreplace_address) to replace the node with the same tokens? This would
> be a bit faster than the remove + bootstrap procedure.
Re: New node block in autobootstrap
What version are you on? This looks like a typical case where there was a problem with streaming (hanging, etc.). Do you have access to the logs? Maybe look for streaming errors. Streaming errors are typically related to timeouts, so you should review your Cassandra streaming_socket_timeout_in_ms and kernel tcp_keepalive settings.

If you're on 2.2+ you can resume a failed bootstrap with nodetool bootstrap resume. Some streaming hang problems were also fixed recently, so I'd advise you to upgrade to the latest version of your particular series for a more robust version.

Is there any reason why you didn't use the replace procedure (-Dreplace_address) to replace the node with the same tokens? This would be a bit faster than the remove + bootstrap procedure.

2016-08-15 15:37 GMT-03:00 Jérôme Mainaud:
> Hello,
>
> A client of mine has problems adding a node to the cluster.
> After 4 days, the node is still in joining mode, it doesn't have the same
> level of load as the others, and there seems to be no streaming from or to
> the new node.
>
> This node has a history.
>
>    1. At the beginning, it was a seed in the cluster.
>    2. Ops detected that the client had problems with it.
>    3. They tried to reset it but failed. In the process they launched
>       several repair and rebuild processes on the node.
>    4. Then they asked me to help them.
>    5. We stopped the node,
>    6. removed it from the list of seeds (more precisely, it was replaced
>       by another node),
>    7. removed it from the cluster (I chose not to use decommission since
>       the node's data was compromised),
>    8. deleted all files from the data, commitlog and saved_caches
>       directories.
>    9. After the leaving process ended, it was started as a fresh new node
>       and began autobootstrap.
>
> As I don't have direct access to the cluster I don't have a lot of
> information, but I will have more tomorrow (logs and results of some
> commands). I can also ask people for any required information.
>
> Does anyone have any idea what could have happened and what I should
> investigate first?
> What would you do to unblock the situation?
>
> Context: The cluster consists of two DCs, each with 15 nodes. Average load
> is around 3 TB per node. The joining node froze a little after 2 TB.
>
> Thank you for your help.
> Cheers,
>
> --
> Jérôme Mainaud
> jer...@mainaud.com
New node block in autobootstrap
Hello,

A client of mine has problems adding a node to the cluster.
After 4 days, the node is still in joining mode, it doesn't have the same level of load as the others, and there seems to be no streaming from or to the new node.

This node has a history.

   1. At the beginning, it was a seed in the cluster.
   2. Ops detected that the client had problems with it.
   3. They tried to reset it but failed. In the process they launched several repair and rebuild processes on the node.
   4. Then they asked me to help them.
   5. We stopped the node,
   6. removed it from the list of seeds (more precisely, it was replaced by another node),
   7. removed it from the cluster (I chose not to use decommission since the node's data was compromised),
   8. deleted all files from the data, commitlog and saved_caches directories.
   9. After the leaving process ended, it was started as a fresh new node and began autobootstrap.

As I don't have direct access to the cluster I don't have a lot of information, but I will have more tomorrow (logs and results of some commands). I can also ask people for any required information.

Does anyone have any idea what could have happened and what I should investigate first?
What would you do to unblock the situation?

Context: The cluster consists of two DCs, each with 15 nodes. Average load is around 3 TB per node. The joining node froze a little after 2 TB.

Thank you for your help.
Cheers,

--
Jérôme Mainaud
jer...@mainaud.com