RE: Jmx metrics shows node down

2019-07-29 Thread ZAIDI, ASAD A
Another way to purge gossip info from each node is to:


1.  Gracefully stop Cassandra, i.e. run nodetool drain, then kill the 
Cassandra PID.

2.  Move/delete the files in $DATADIR/system/peers/

3.  Add -Dcassandra.load_ring_state=false to the jvm.options file (or 
JVM_OPTS="$JVM_OPTS -Dcassandra.load_ring_state=false" to cassandra-env.sh).

4.  Restart the Cassandra service.

5.  Repeat the above steps on each node in the DC/cluster.

6.  Once gossip info is purged, remove the JVM option added in step 3 and 
restart the instance again.

Depending on cluster size and load, you may get this done swiftly.
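
The steps above can be sketched as a dry-run plan in Python. This is only an illustration: it builds the command strings and executes nothing. The function names, the backup directory, the use of systemctl to stop and start the service, and GNU sed for the cleanup are all assumptions, not something the steps above prescribe.

```python
import shlex
from typing import List

# JVM flag from step 3.
FLAG = "-Dcassandra.load_ring_state=false"


def purge_gossip_plan(data_dir: str, jvm_options: str, backup_dir: str,
                      service: str = "cassandra") -> List[str]:
    """Return the shell commands for steps 1-4 on one node (nothing is run)."""
    data = shlex.quote(data_dir)
    return [
        "nodetool drain",                                     # step 1: drain the node
        f"sudo systemctl stop {service}",                     # step 1: stop the process (assumes systemd)
        f"mv {data}/system/peers/* {shlex.quote(backup_dir)}/",  # step 2: move the peers files aside
        f"printf '%s\\n' '{FLAG}' >> {shlex.quote(jvm_options)}",  # step 3: add the JVM option
        f"sudo systemctl start {service}",                    # step 4: restart
    ]


def cleanup_plan(jvm_options: str, service: str = "cassandra") -> List[str]:
    """Return the shell commands for step 6 on one node (nothing is run)."""
    return [
        f"sed -i '/{FLAG}/d' {shlex.quote(jvm_options)}",     # remove the option (assumes GNU sed)
        f"sudo systemctl stop {service}",
        f"sudo systemctl start {service}",
    ]
```

Printing the returned list for each node gives a checklist to review before running anything by hand; step 5 is simply the same plan repeated per node.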

~Asad


From: yuping wang [mailto:yupingwyp1...@gmail.com]
Sent: Monday, July 29, 2019 10:56 AM
To: user@cassandra.apache.org
Subject: Re: Jmx metrics shows node down

Is there a workaround to shorten the 72 hours? (You said "by default"; I am 
wondering if one can set a non-default value.)

Thanks,
Yuping

On Jul 29, 2019, at 7:28 AM, Oleksandr Shulgin 
<oleksandr.shul...@zalando.de> wrote:
On Mon, Jul 29, 2019 at 1:21 PM Rahul Reddy 
<rahulreddy1...@gmail.com> wrote:

I decommissioned 2 nodes from the cluster. As expected, nodetool status no 
longer lists the nodes, but JMX metrics still show those 2 nodes as down. 
nodetool gossipinfo shows the 2 nodes in LEFT state. Why does JMX still show 
those nodes as down even after 24 hours? The Cassandra version is 3.11.3.

AFAIK, the nodes are not removed from gossip for 72 hours by default.

Does anything else need to be done?

Wait another 48 hours? ;-)

--
Alex



Re: Jmx metrics shows node down

2019-07-29 Thread yuping wang
Is there a workaround to shorten the 72 hours? (You said "by default"; I am 
wondering if one can set a non-default value.)

Thanks,
Yuping 




Re: Jmx metrics shows node down

2019-07-29 Thread yuping wang
We have the same issue. We also observed that the JMX metric cleared only 
after exactly 72 hours.

On Jul 29, 2019, at 11:23 AM, Rahul Reddy  wrote:

Also, the system.peers table has no information on the old nodes; the ghost 
nodes show up only in JMX.



Re: Jmx metrics shows node down

2019-07-29 Thread Rahul Reddy
Also, the system.peers table has no information on the old nodes; the ghost
nodes show up only in JMX.



Re: Jmx metrics shows node down

2019-07-29 Thread Rahul Reddy
We have removed nodes from a cluster many times but have never seen the JMX
metric report them as down for 72 hours. So a node has to be completely
removed from gossip before the metric shows what we expect? This would be a
problem for using the metric to alert the on-call.
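
One possible way around the alerting problem is to skip "down" endpoints that gossip already reports as LEFT. The Python sketch below is only an illustration under stated assumptions: the function names are hypothetical, and the parser assumes that nodetool gossipinfo prints each endpoint on an unindented line ending in /<address>, followed by indented lines such as STATUS:<version>:LEFT,... (the version field is treated as optional).

```python
import re
from typing import Dict, Iterable, List

# Assumed layout of a status line: "STATUS:<version>:<state>,..." or "STATUS:<state>,..."
STATUS_RE = re.compile(r"STATUS:(?:\d+:)?([A-Za-z_]+)")


def gossip_statuses(gossipinfo_text: str) -> Dict[str, str]:
    """Map endpoint address -> gossip status parsed from gossipinfo-style text."""
    statuses: Dict[str, str] = {}
    endpoint = None
    for line in gossipinfo_text.splitlines():
        if not line.strip():
            continue
        if not line[0].isspace():
            # Unindented line starts a new endpoint block, e.g. "/10.0.0.2".
            endpoint = line.strip().split("/")[-1]
            continue
        match = STATUS_RE.search(line)
        if match and endpoint is not None:
            statuses[endpoint] = match.group(1).upper()
    return statuses


def nodes_to_alert(down_nodes: Iterable[str], statuses: Dict[str, str]) -> List[str]:
    """Keep only the down nodes that gossip does not report as LEFT."""
    return [node for node in down_nodes if statuses.get(node) != "LEFT"]
```

A node that is down but missing from the gossip output is still alerted on, which errs on the side of noise rather than silence.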



Re: Jmx metrics shows node down

2019-07-29 Thread Oleksandr Shulgin
On Mon, Jul 29, 2019 at 1:21 PM Rahul Reddy 
wrote:

>
> I decommissioned 2 nodes from the cluster. As expected, nodetool status no
> longer lists the nodes, but JMX metrics still show those 2 nodes as down.
> nodetool gossipinfo shows the 2 nodes in LEFT state. Why does JMX still show
> those nodes as down even after 24 hours? The Cassandra version is 3.11.3.
>

AFAIK, the nodes are not removed from gossip for 72 hours by default.


> Does anything else need to be done?
>

Wait another 48 hours? ;-)
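
The arithmetic behind that answer, as a small Python sketch. It takes the 72-hour default mentioned above as given; the function name and the timestamps in the usage note are made up for illustration.

```python
from datetime import datetime, timedelta

# Default retention of left nodes in gossip, as stated in this thread.
GOSSIP_RETENTION = timedelta(hours=72)


def time_until_purged(left_at: datetime, now: datetime) -> timedelta:
    """Time remaining before a node that left at `left_at` drops out of gossip."""
    remaining = GOSSIP_RETENTION - (now - left_at)
    # Never report a negative wait once the retention period has passed.
    return max(remaining, timedelta(0))
```

For a node that left 24 hours ago, time_until_purged(datetime(2019, 7, 28, 13, 0), datetime(2019, 7, 29, 13, 0)) returns timedelta(hours=48).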

--
Alex