RE: Jmx metrics shows node down
Another way to purge gossip info from each node:

1. Gracefully stop Cassandra, i.e. run nodetool drain, then kill the Cassandra PID.
2. Move or delete the files in $DATADIR/system/peers/.
3. Add -Dcassandra.load_ring_state=false to the jvm.options file (or JVM_OPTS="$JVM_OPTS -Dcassandra.load_ring_state=false" to cassandra-env.sh).
4. Restart the Cassandra service.
5. Repeat the steps above on each node in the DC/cluster.
6. Once the gossip info is purged, remove the JVM option added in step 3 and restart the instance again.

Depending on cluster and load size, you may get this done swiftly.

~Asad

From: yuping wang [mailto:yupingwyp1...@gmail.com]
Sent: Monday, July 29, 2019 10:56 AM
To: user@cassandra.apache.org
Subject: Re: Jmx metrics shows node down

Is there workaround to shorten 72 hours to something shorter? (you said by default, wondering if one can set a non-default value?)

Thanks,
Yuping

On Jul 29, 2019, at 7:28 AM, Oleksandr Shulgin <oleksandr.shul...@zalando.de> wrote:
> On Mon, Jul 29, 2019 at 1:21 PM Rahul Reddy <rahulreddy1...@gmail.com> wrote:
> > Decommissioned 2 nodes from cluster nodetool status doesn't list the nodes
> > as expected but jmx metrics shows still those 2 nodes has down. Nodetool
> > gossip shows the 2 nodes in Left state. Why does my jmx still shows those
> > nodes down even after 24 hours. Cassandra version 3.11.3 ?
>
> AFAIK, the nodes are not removed from gossip for 72 hours by default.
>
> > Anything else need to be done?
>
> Wait another 48 hours? ;-)
>
> --
> Alex
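The per-node part of Asad's purge procedure (drain, stop, move the peers files aside) could be scripted roughly as below. This is only a sketch under stated assumptions: the data directory path, the backup location, the use of systemctl with a service named cassandra, and the peers-* directory glob are not from the thread and will differ between installations. It defaults to a dry run that only prints the commands.

```shell
#!/usr/bin/env bash
# Sketch of the per-node steps (drain, stop, move peers files aside).
# All paths and the service name below are assumptions, not from the thread.

DATA_DIR="${DATA_DIR:-/var/lib/cassandra/data}"    # assumed data directory
BACKUP_DIR="${BACKUP_DIR:-/var/tmp/peers-backup}"  # hypothetical backup location
DRY_RUN="${DRY_RUN:-1}"                            # 1 = print only, 0 = execute

run() {
  # Print the command; execute it only when DRY_RUN=0.
  echo "+ $*"
  if [ "$DRY_RUN" = "0" ]; then
    "$@"
  fi
}

purge_peers() {
  run nodetool drain                  # flush memtables and stop accepting writes
  run sudo systemctl stop cassandra   # assumed service manager and service name
  run mkdir -p "$BACKUP_DIR"
  # Move the peers table files aside instead of deleting them.
  # The glob assumes the table directory is named peers-<table id>.
  for f in "$DATA_DIR"/system/peers-*/*; do
    [ -e "$f" ] || continue           # skip when the glob matched nothing
    run mv "$f" "$BACKUP_DIR"/
  done
  echo "# next: add -Dcassandra.load_ring_state=false to the JVM options, then start Cassandra"
}
```

Calling purge_peers with the defaults prints the commands without touching anything; DRY_RUN=0 purge_peers would actually run them. Adding and later removing the JVM option is left as a manual step on purpose.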
Re: Jmx metrics shows node down
Is there a workaround to shorten the 72 hours? (You said "by default"; I am wondering whether one can set a non-default value.)

Thanks,
Yuping

On Jul 29, 2019, at 7:28 AM, Oleksandr Shulgin wrote:
> On Mon, Jul 29, 2019 at 1:21 PM Rahul Reddy wrote:
> > Decommissioned 2 nodes from cluster nodetool status doesn't list the nodes
> > as expected but jmx metrics shows still those 2 nodes has down. Nodetool
> > gossip shows the 2 nodes in Left state. Why does my jmx still shows those
> > nodes down even after 24 hours. Cassandra version 3.11.3 ?
>
> AFAIK, the nodes are not removed from gossip for 72 hours by default.
>
> > Anything else need to be done?
>
> Wait another 48 hours? ;-)
>
> --
> Alex
Re: Jmx metrics shows node down
We have the same issue. We observed that the JMX metric only cleared after exactly 72 hours, too.

On Jul 29, 2019, at 11:23 AM, Rahul Reddy wrote:

And also system.peers table doesn't have the information on old nodes only ghost nodes to be there in JMX

> On Mon, Jul 29, 2019, 7:39 AM Rahul Reddy wrote:
> We removed many times nodes from a cluster but never seen the jmx metric down
> stay for 72 hours. So it has to be completely removed from gossip to show the
> metric as expected? This would be problem for using the metric to alert on
> call
>
>> On Mon, Jul 29, 2019, 7:28 AM Oleksandr Shulgin wrote:
>>> On Mon, Jul 29, 2019 at 1:21 PM Rahul Reddy wrote:
>>>
>>> Decommissioned 2 nodes from cluster nodetool status doesn't list the nodes
>>> as expected but jmx metrics shows still those 2 nodes has down. Nodetool
>>> gossip shows the 2 nodes in Left state. Why does my jmx still shows those
>>> nodes down even after 24 hours. Cassandra version 3.11.3 ?
>>
>> AFAIK, the nodes are not removed from gossip for 72 hours by default.
>>
>>> Anything else need to be done?
>>
>> Wait another 48 hours? ;-)
>>
>> --
>> Alex
Re: Jmx metrics shows node down
Also, the system.peers table has no information on the old nodes; the ghost nodes only show up in JMX.

On Mon, Jul 29, 2019, 7:39 AM Rahul Reddy wrote:

> We removed many times nodes from a cluster but never seen the jmx metric
> down stay for 72 hours. So it has to be completely removed from gossip to
> show the metric as expected? This would be problem for using the metric to
> alert on call
>
> On Mon, Jul 29, 2019, 7:28 AM Oleksandr Shulgin <oleksandr.shul...@zalando.de> wrote:
>
>> On Mon, Jul 29, 2019 at 1:21 PM Rahul Reddy wrote:
>>
>>> Decommissioned 2 nodes from cluster nodetool status doesn't list the
>>> nodes as expected but jmx metrics shows still those 2 nodes has down.
>>> Nodetool gossip shows the 2 nodes in Left state. Why does my jmx still
>>> shows those nodes down even after 24 hours. Cassandra version 3.11.3 ?
>>
>> AFAIK, the nodes are not removed from gossip for 72 hours by default.
>>
>>> Anything else need to be done?
>>
>> Wait another 48 hours? ;-)
>>
>> --
>> Alex
Re: Jmx metrics shows node down
We have removed nodes from a cluster many times but have never seen the JMX "down" metric persist for 72 hours. So does a node have to be completely removed from gossip before the metric shows the expected value? That would be a problem for using the metric for on-call alerting.

On Mon, Jul 29, 2019, 7:28 AM Oleksandr Shulgin <oleksandr.shul...@zalando.de> wrote:

> On Mon, Jul 29, 2019 at 1:21 PM Rahul Reddy wrote:
>
>> Decommissioned 2 nodes from cluster nodetool status doesn't list the
>> nodes as expected but jmx metrics shows still those 2 nodes has down.
>> Nodetool gossip shows the 2 nodes in Left state. Why does my jmx still
>> shows those nodes down even after 24 hours. Cassandra version 3.11.3 ?
>
> AFAIK, the nodes are not removed from gossip for 72 hours by default.
>
>> Anything else need to be done?
>
> Wait another 48 hours? ;-)
>
> --
> Alex
Re: Jmx metrics shows node down
On Mon, Jul 29, 2019 at 1:21 PM Rahul Reddy wrote:

> Decommissioned 2 nodes from cluster nodetool status doesn't list the
> nodes as expected but jmx metrics shows still those 2 nodes has down.
> Nodetool gossip shows the 2 nodes in Left state. Why does my jmx still
> shows those nodes down even after 24 hours. Cassandra version 3.11.3 ?

AFAIK, the nodes are not removed from gossip for 72 hours by default.

> Anything else need to be done?

Wait another 48 hours? ;-)

--
Alex
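To see which decommissioned nodes gossip still remembers, and when their state is due to expire, the output of nodetool gossipinfo can be filtered for endpoints in the LEFT state. The sketch below assumes the 3.x-style output layout (an "/ip" line per endpoint followed by indented "STATUS:<version>:LEFT,<token>,<expiry in ms since epoch>" lines); that layout is an assumption here and should be checked against the actual output on your version. The function name left_nodes is made up for this example.

```shell
#!/usr/bin/env bash
# Print "<endpoint> <expiry ms>" for every endpoint whose gossip STATUS is LEFT.
# Reads gossipinfo-style text on stdin; the line layout is an assumption.

left_nodes() {
  awk '
    # Endpoint header line, e.g. "/10.0.0.2": remember the address.
    /^\//                   { endpoint = substr($0, 2) }
    # STATUS line in LEFT state: the last comma-separated field is taken
    # to be the expiry timestamp.
    /^[ \t]*STATUS:.*LEFT,/ { n = split($0, f, ","); print endpoint, f[n] }
  '
}
```

Typical use would be nodetool gossipinfo | left_nodes. Comparing the printed timestamp with the current time shows roughly how much of the 72-hour window remains.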