Re: Cassandra 2.1.2 node stuck on joining the cluster

2014-12-08 Thread Omri Bahumi
Any chance you have something along the path that causes the
connectivity issues?
What's the network connectivity between this node and the other node?

Can you try transferring a big file between the two servers? Perhaps
you have an MTU issue that causes TCP path MTU discovery to fail.
Can you send large pings between the servers? Try pinging from
both sides with large packets (5000, 1).
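
A minimal sketch of the large-ping test, assuming Linux iputils ping (where -M do sets the Don't Fragment bit and -s sets the ICMP payload size). The helper names icmp_payload and mtu_probe_cmds are hypothetical, and the MTU values 1500 and 9000 are only illustrative (standard Ethernet and a common jumbo-frame size). The script only prints the commands so they can be reviewed before being run from each side:

```shell
#!/bin/sh
# Hypothetical helper: largest ICMP payload that fits in a single IPv4
# packet for a given MTU (20-byte IPv4 header + 8-byte ICMP header).
icmp_payload() {
    echo $(( $1 - 28 ))
}

# Hypothetical helper: print (not run) one ping command per candidate MTU.
# -M do forbids fragmentation, so a ping that is too large for the path
# fails instead of being silently fragmented (Linux iputils assumed).
mtu_probe_cmds() {
    peer=$1
    for mtu in 1500 9000; do
        echo "ping -c 3 -M do -s $(icmp_payload "$mtu") $peer"
    done
}

# 192.168.200.16 is the peer address from the netstats output in this thread.
mtu_probe_cmds 192.168.200.16
```

If the smaller size gets replies but the larger one fails or hangs, that would point at an MTU mismatch somewhere along the path.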

On Mon, Dec 8, 2014 at 3:22 PM, Krzysztof Zarzycki wrote:
> Hi Cassandra users,
>
> I'm trying but failing to join a new (well old, but wiped out/decommissioned)
> node to an existing cluster.
>
> Currently I have a cluster that consists of 2 nodes and runs C* 2.1.2. I
> start a third node with 2.1.2, it gets to joining state, it bootstraps, i.e.
> streams some data as shown by nodetool netstats, but after some time, it
> gets stuck. From that point nothing gets streamed, the new node stays in
> joining state. I restarted the node multiple times, each time it streamed more
> data, but then got stuck again.
>
> Other facts:
>
> I don't see any errors in the log on any of the nodes.
> The connectivity seems fine, I can ping, netcat to port 7000 all ways.
> I have ~ 200 GB load per running node, replication 2, 16 tokens.
> The load of the new node has reached around 300 GB now.
>
> The bootstrapping process stops in the middle of streaming some table,
> always after sending exactly 10MB of some SSTable, e.g.:
>
> $ nodetool netstats | grep -P -v "bytes\(100"
> Mode: NORMAL
> Bootstrap e0abc160-7ca8-11e4-9bc2-cf6aed12690e
>     /192.168.200.16
>         Sending 516 files, 12493900 bytes total
>             /home/data/cassandra/data/some_ks/page_view-2a2410103f4411e4a266db7096512b05/some_ks-page_view-ka-13890-Data.db 10485760/167797071 bytes(6%) sent to idx:0/192.168.200.16
> Read Repair Statistics:
> Attempted: 2016371
> Mismatch (Blocking): 0
> Mismatch (Background): 168721
> Pool Name    Active   Pending   Completed
> Commands        n/a         0    55802918
> Responses       n/a         0      425963
>
>
> I've been trying to join this node for several days and I don't know what to do
> with it... I'll be grateful for any help!
>
>
> Cheers,
>
> Krzysztof Zarzycki
>
>


Cassandra 2.1.2 node stuck on joining the cluster

2014-12-08 Thread Krzysztof Zarzycki
Hi Cassandra users,

I'm trying but failing to join a new (well old, but wiped
out/decommissioned) node to an existing cluster.

Currently I have a cluster that consists of 2 nodes and runs C* 2.1.2. I
start a third node with 2.1.2, it gets to joining state, it bootstraps,
i.e. streams some data as shown by nodetool netstats, but after some time,
it gets stuck. From that point nothing gets streamed, the new node stays in
joining state. I restarted the node multiple times, each time it streamed more
data, but then got stuck again.

Other facts:

   - I don't see any errors in the log on any of the nodes.
   - The connectivity seems fine, I can ping, netcat to port 7000 all ways.
   - I have ~ 200 GB load per running node, replication 2, 16 tokens.
   - The load of the new node has reached around 300 GB now.
   - The bootstrapping process stops in the middle of streaming some table,
     *always* after sending exactly 10MB of some SSTable, e.g.:

   $ nodetool netstats | grep -P -v "bytes\(100"
   Mode: NORMAL
   Bootstrap e0abc160-7ca8-11e4-9bc2-cf6aed12690e
       /192.168.200.16
           Sending 516 files, 12493900 bytes total
               /home/data/cassandra/data/some_ks/page_view-2a2410103f4411e4a266db7096512b05/some_ks-page_view-ka-13890-Data.db 10485760/167797071 bytes(6%) sent to idx:0/192.168.200.16
   Read Repair Statistics:
   Attempted: 2016371
   Mismatch (Blocking): 0
   Mismatch (Background): 168721
   Pool Name    Active   Pending   Completed
   Commands        n/a         0    55802918
   Responses       n/a         0      425963
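
As a quick sanity check on the numbers in that output, the stall point of 10485760 bytes is exactly 10 * 1024 * 1024, and it matches the reported 6% of the 167797071-byte file. A small sketch using plain shell arithmetic (the variable names sent and total are just for illustration):

```shell
#!/bin/sh
# Byte counts copied from the netstats output above.
sent=10485760
total=167797071

# Whole MiB sent, and any leftover bytes beyond a MiB boundary.
echo "$(( sent / 1024 / 1024 )) MiB, remainder $(( sent % (1024 * 1024) )) bytes"

# Integer percentage of the file transferred so far.
echo "$(( sent * 100 / total ))% of the file"
```

This prints "10 MiB, remainder 0 bytes" and "6% of the file", so the transfer stops on an exact 10 MiB boundary rather than at an arbitrary offset.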


I've been trying to join this node for several days and I don't know what to do
with it... I'll be grateful for any help!


Cheers,

Krzysztof Zarzycki