Re: nodetool removenode causing the schema out of sync

2017-07-13 Thread Jai Bheemsen Rao Dhanwada
Also, changing the compaction throughput on the fly while removing nodes is
not scalable, as we have hundreds of nodes.
I can try it and test, though.
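
A minimal sketch of how the per-node throttle change could be scripted for many nodes, assuming the throttle in question is the stream throughput discussed in this thread. The host addresses and the value of 100 are hypothetical placeholders; the function only builds the nodetool command lines and does not run them.

```python
from typing import List


def throttle_commands(hosts: List[str], mbits_per_sec: int) -> List[List[str]]:
    """Build one `nodetool setstreamthroughput` invocation per host.

    `hosts` and `mbits_per_sec` are supplied by the caller; nothing here
    is taken from a real cluster.
    """
    if mbits_per_sec < 0:
        raise ValueError("throughput must be non-negative")
    return [
        ["nodetool", "-h", host, "setstreamthroughput", str(mbits_per_sec)]
        for host in hosts
    ]


if __name__ == "__main__":
    # Hypothetical host list; in practice this would come from an inventory.
    for cmd in throttle_commands(["10.0.0.1", "10.0.0.2"], 100):
        print(" ".join(cmd))
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` once the printed commands look right, and the same loop can restore the previous value after the removenode completes.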



Re: nodetool removenode causing the schema out of sync

2017-07-13 Thread Jai Bheemsen Rao Dhanwada
Yes, I did removenode and removenode force, and encountered the same issue in
both cases.



Re: nodetool removenode causing the schema out of sync

2017-07-13 Thread Subroto Barua
Set streamthroughput higher than 200 on the source side and lower on the target
node.
Just curious, have you tried removenode force?




Re: nodetool removenode causing the schema out of sync

2017-07-13 Thread Jai Bheemsen Rao Dhanwada
Thank you Sean,

Do you mean setting setstreamthroughput to a lower value on the node where we
are running "nodetool removenode"?




RE: nodetool removenode causing the schema out of sync

2017-07-13 Thread Durity, Sean R
Late to this party, but Jeff is talking about nodetool setstreamthroughput. The
default in most versions is 200 Mb/s (set in the yaml file as
stream_throughput_outbound_megabits_per_sec). This is an outbound throttle only,
so if streams from multiple nodes are going to one node, that node can get
inundated.

The nodetool command lets you change this on the fly (no bounce required), but
I don’t think it affects any current streaming from that node (only future
streams). You can use nodetool getstreamthroughput to see the current value.
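
To make the "outbound only" point concrete, here is a rough arithmetic sketch. It assumes every source node streams at its full outbound cap, which is an upper bound rather than a measurement; the function names and the example figure of 5 sources are illustrative and not taken from the cluster in this thread.

```python
DEFAULT_OUTBOUND_MBITS = 200  # default outbound cap cited above


def worst_case_inbound_mbits(num_sources: int,
                             outbound_mbits: int = DEFAULT_OUTBOUND_MBITS) -> int:
    """Upper bound on what one receiving node could see, in Mb/s."""
    if num_sources < 0 or outbound_mbits < 0:
        raise ValueError("arguments must be non-negative")
    return num_sources * outbound_mbits


def outbound_cap_for_target(num_sources: int, target_inbound_mbits: int) -> int:
    """Per-source outbound cap that keeps the aggregate under a target."""
    if num_sources <= 0:
        raise ValueError("need at least one source")
    # Never return 0 here: this sketch always wants an explicit positive cap.
    return max(1, target_inbound_mbits // num_sources)


if __name__ == "__main__":
    # Hypothetical case: 5 nodes streaming to the same destination.
    print(worst_case_inbound_mbits(5))      # 5 * 200
    print(outbound_cap_for_target(5, 200))  # 200 // 5
```

With 5 sources at the default, the receiver could in principle be offered 1000 Mb/s; holding it near 200 Mb/s would mean capping each source at roughly 40 Mb/s.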


Sean Durity





The information in this Internet Email is confidential and may be legally 
privileged. It is intended solely for the addressee. Access to this Email by 
anyone else is unauthorized. If you are not the intended recipient, any 
disclosure, copying, distribution or any action taken or omitted to be taken in 
reliance on it, is prohibited and may be unlawful. When addressed to our 
clients any opinions or advice contained in this Email are subject to the terms 
and conditions expressed in any applicable governing The Home Depot terms of 
business or client engagement letter. The Home Depot disclaims all 
responsibility and liability for the accuracy and content of this attachment 
and for any damages or losses arising from any inaccuracies, errors, viruses, 
e.g., worms, trojan horses, etc., or other items of a destructive nature, which 
may be contained in this attachment and shall not be liable for direct, 
indirect, consequential or special damages in connection with this e-mail 
message or its attachment.


Re: nodetool removenode causing the schema out of sync

2017-06-29 Thread Jai Bheemsen Rao Dhanwada
Thanks Jeff,

Can you please suggest what value to tweak from the Cassandra side?



Re: nodetool removenode causing the schema out of sync

2017-06-29 Thread Jeff Jirsa


On 2017-06-29 13:45 (-0700), Jai Bheemsen Rao Dhanwada  
wrote: 
> Hello Jeff,
> 
> Sorry, the version I am using is 2.1.16; my first email had a typo.
> When I say schema out of sync:
> 
> 1. nodetool describecluster shows the same schema version for all nodes.

Ok got it, this is what I was most concerned with. 

> 2. nodetool removenode shows the node-down messages in the logs.
> 3. nodetool describecluster, during these 1-2 minutes, shows several nodes as
> UNREACHABLE, and they recover within a minute or two.

This is likely due to overhead of streaming - you're probably running pretty 
close to your tipping point, and your streaming throughput creates enough GC 
pressure on the destinations to make them flap a bit. If you use the streaming 
throughput throttle, you may be able to help mitigate that somewhat (at the 
cost of speed).



-
To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
For additional commands, e-mail: user-h...@cassandra.apache.org



Re: nodetool removenode causing the schema out of sync

2017-06-29 Thread Jai Bheemsen Rao Dhanwada
Hello Jeff,

Sorry, the version I am using is 2.1.16; my first email had a typo.
When I say schema out of sync:

1. nodetool describecluster shows the same schema version for all nodes.
2. nodetool removenode shows the node-down messages in the logs.
3. nodetool describecluster, during these 1-2 minutes, shows several nodes as
UNREACHABLE, and they recover within a minute or two.



Re: nodetool removenode causing the schema out of sync

2017-06-29 Thread Jeff Jirsa

2.1.16 is old, but it's not as old as 2.1.6, which is what you originally put, 
and would be much more concerning.

It is true, however, that 'removenode' involves streaming data, and streaming 
data can be GC intensive (especially with compression enabled), which means if 
your cluster is on the edge of health you may cause it to teeter over the edge 
during streaming, causing nodes to flap (the DOWN messages in the logs). That 
doesn't really explain the schema change, though - how confident are you that 
the schema was properly in sync prior to the removenode?

- Jeff




Re: nodetool removenode causing the schema out of sync

2017-06-29 Thread Jai Bheemsen Rao Dhanwada
Hello Jeff,

Yes, 2.1.16 is an old version, and we are planning to upgrade in a few months.

Only the gossiper info is logged, stating that it marked several nodes down,
and nothing else.




Re: nodetool removenode causing the schema out of sync

2017-06-28 Thread Jeff Jirsa


On 2017-06-28 18:51 (-0700), Jai Bheemsen Rao Dhanwada  
wrote: 
> Hello,
> 
> We are using C* version 2.1.6 and lately we are seeing an issue where
> nodetool removenode causes the schema to go out of sync and causes clients
> to fail for 2-3 minutes.
> 
> The C* cluster spans 8 datacenters with RF=3 and has 50 nodes.
> We have 130 keyspaces and 500 CFs in the cluster.
> 
> Here is the sequence of actions that were performed:
> 
> 1. One node failed abruptly in the cluster due to a hardware issue.
> 2. Removed the node from the cluster using nodetool removenode from a live
> node.
> 3. Immediately I see the schema on all nodes go out of sync, and in the logs of
> all the C* nodes I see they mark a few other (random) nodes as down and
> eventually recover after 2 minutes.
> 
> Logs in the nodes:
> 
> INFO  [GossipTasks:1] 2017-06-27 20:34:39,707 Gossiper.java:1008 -
> InetAddress /10.10.10.20 is now DOWN
> INFO  [GossipTasks:1] 2017-06-27 20:34:39,714 Gossiper.java:1008 -
> InetAddress /10.10.11.14 is now DOWN
> 
> Does anyone have an idea why removenode is causing the cluster to go out of sync?
> 

That's not really expected - I've never seen behavior like that. However, 2.1.6
is pretty old (just about 2 years, give or take), and there have been hundreds
or (more likely) thousands of fixes since then.

Is the gossiper line the only thing logged? Anything about invalid generations?
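
One way to answer that question across many nodes is to pull the DOWN events out of the logs programmatically. The sketch below assumes each log entry is on a single line in the shape quoted above (the email wraps them); the regular expression is written against those two sample lines only and may need adjusting for other Cassandra versions or log layouts.

```python
import re
from datetime import datetime
from typing import Iterable, List, Tuple

# Pattern inferred from the two sample lines in this thread; an assumption,
# not an official log format specification.
DOWN_RE = re.compile(
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\s+"
    r"Gossiper\.java:\d+\s+-\s+"
    r"InetAddress /(?P<addr>\S+) is now DOWN"
)


def down_events(lines: Iterable[str]) -> List[Tuple[datetime, str]]:
    """Return (timestamp, address) for every gossiper DOWN line."""
    events = []
    for line in lines:
        match = DOWN_RE.search(line)
        if match:
            ts = datetime.strptime(match.group("ts"), "%Y-%m-%d %H:%M:%S,%f")
            events.append((ts, match.group("addr")))
    return events


# The two log entries quoted above, joined back onto single lines.
SAMPLE = [
    "INFO  [GossipTasks:1] 2017-06-27 20:34:39,707 Gossiper.java:1008 - "
    "InetAddress /10.10.10.20 is now DOWN",
    "INFO  [GossipTasks:1] 2017-06-27 20:34:39,714 Gossiper.java:1008 - "
    "InetAddress /10.10.11.14 is now DOWN",
]

if __name__ == "__main__":
    for ts, addr in down_events(SAMPLE):
        print(ts.isoformat(), addr)
```

Collecting these tuples from every node makes it easier to see whether the same peers are marked down everywhere at the same moment, or whether each node flaps a different set.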

