Re: removenode stuck - cassandra 4.1.0

2023-01-23 Thread Joe Obernberger

Thank you - I was just impatient.  :)

-Joe

On 1/23/2023 12:56 PM, Jeff Jirsa wrote:

Those hosts are likely sending streams.

If you do `nodetool netstats` on the replicas of the node you're
removing, you should see byte counters and file counters - they should
all be incrementing. If one of them isn't incrementing, that one is
probably stuck.
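
A rough way to automate that check is to sample `nodetool netstats` twice per replica and compare the streaming part of the output. The Python sketch below is only an illustration. It assumes that the streaming details are printed before the "Read Repair Statistics" header, that an idle node prints "Not sending any streams", and that remote JMX access via `nodetool -h` is allowed. The helper names, the host list and the 60-second interval are made up for the example.

```python
import subprocess
import time

# Assumption: everything before this header is streaming-related output.
SECTION_END = "Read Repair Statistics"
# Assumption: an idle node reports this phrase instead of stream details.
IDLE_MARKER = "Not sending any streams"


def streaming_section(netstats_output: str) -> str:
    """Return the text that precedes the read-repair statistics."""
    return netstats_output.split(SECTION_END, 1)[0].strip()


def looks_stalled(before: str, after: str) -> bool:
    """True if streams are reported but nothing changed between two samples."""
    first = streaming_section(before)
    second = streaming_section(after)
    if IDLE_MARKER in second:
        return False  # nothing is streaming, so nothing can be stuck
    return first == second  # identical counters suggest no progress


def sample(host: str) -> str:
    """Run `nodetool netstats` against one host and return its stdout."""
    result = subprocess.run(
        ["nodetool", "-h", host, "netstats"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


if __name__ == "__main__":
    # Hypothetical host list: use the replicas named by `removenode status`.
    hosts = ["172.16.100.248", "172.16.100.249"]
    first_pass = {host: sample(host) for host in hosts}
    time.sleep(60)  # arbitrary interval; slow streams may need longer
    for host in hosts:
        if looks_stalled(first_pass[host], sample(host)):
            print(f"{host}: streaming counters did not change, possibly stuck")
```

An unchanged sample is only a hint: a large file on a throttled link can look idle over a short interval, so it is worth repeating the comparison before concluding that a stream is stuck.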


There's at least one bug in 4.1 where (I think) the rate limiters
interact in a way that can cause this.
https://issues.apache.org/jira/browse/CASSANDRA-18110 describes it and
has a workaround.




On Mon, Jan 23, 2023 at 9:41 AM Joe Obernberger wrote:


I had a drive fail (first drive in the list) on a Cassandra cluster.
I've stopped the node (as it no longer starts), and am trying to remove
it from the cluster, but the removenode command is hung (been running
for 3 hours so far):
nodetool removenode status is always reporting the same token as being
removed.  Help?

nodetool removenode status
RemovalStatus: Removing token (-9196617215347134065). Waiting for
replication confirmation from
[/172.16.100.248,/172.16.100.249,/172.16.100.251,/172.16.100.252,
/172.16.100.34,/172.16.100.35,/172.16.100.36,/172.16.100.37,
/172.16.100.38,/172.16.100.42,/172.16.100.44,/172.16.100.45].

Thanks.

-Joe




Re: removenode stuck - cassandra 4.1.0

2023-01-23 Thread Jeff Jirsa
Those hosts are likely sending streams.

If you do `nodetool netstats` on the replicas of the node you're removing,
you should see byte counters and file counters - they should all be
incrementing. If one of them isn't incrementing, that one is probably stuck.

There's at least one bug in 4.1 where (I think) the rate limiters interact
in a way that can cause this.
https://issues.apache.org/jira/browse/CASSANDRA-18110 describes it and has
a workaround.



On Mon, Jan 23, 2023 at 9:41 AM Joe Obernberger <
joseph.obernber...@gmail.com> wrote:

> I had a drive fail (first drive in the list) on a Cassandra cluster.
> I've stopped the node (as it no longer starts), and am trying to remove
> it from the cluster, but the removenode command is hung (been running
> for 3 hours so far):
> nodetool removenode status is always reporting the same token as being
> removed.  Help?
>
> nodetool removenode status
> RemovalStatus: Removing token (-9196617215347134065). Waiting for
> replication confirmation from
> [/172.16.100.248,/172.16.100.249,/172.16.100.251,/172.16.100.252,
> /172.16.100.34,/172.16.100.35,/172.16.100.36,/172.16.100.37,
> /172.16.100.38,/172.16.100.42,/172.16.100.44,/172.16.100.45].
>
> Thanks.
>
> -Joe
>
>
>