I always prefer to decommission, but the issue here is that these servers are on-prem and disks die from time to time. It's a very large cluster, in multiple datacenters around the world, so it can take some time before we have a replacement, which is why we usually need to run removenode in such cases.
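For reference, this is roughly what we run; the host ID below is just an illustrative placeholder (the real one comes from nodetool status):

  # preferred, run on the node that is leaving (when it's still alive):
  nodetool decommission

  # what we end up doing when the hardware is already gone, run from any live node:
  nodetool removenode 11111111-2222-3333-4444-555555555555

  # and what we try to avoid, but have had to fall back to:
  nodetool removenode force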
Other than that there are no issues in the cluster and the load is reasonable; when this happens, following a removenode, this huge number of pending NTR is what I see, and the weird thing is that it's only on some nodes. I have been running with a very small native_transport_max_concurrent_requests_in_bytes setting for a few days now on some nodes (a few MB, compared to the default 0.8 of a 60 GB heap), and it looks like it's good enough for the app, so I will roll it out to the entire dc and test a removal again. (I've put an example of how I'm setting it, along with the other knobs we turned, at the bottom of this mail below the quoted thread.)

On Tue, Mar 9, 2021 at 10:51 AM Kane Wilson <k...@raft.so> wrote:

> It's unlikely to help in this case, but you should be using nodetool
> decommission on the node you want to remove rather than removenode from
> another node (and definitely don't force removal).
>
> native_transport_max_concurrent_requests_in_bytes defaults to 10% of the
> heap, which I suppose depending on your configuration could potentially
> result in a smaller number of concurrent requests than previously. It's
> worth a shot setting it higher to see if the issue is related. Is this
> the only issue you see on the cluster? I assume load on the cluster is
> still low/reasonable and the only symptom you're seeing is the increased
> NTR requests?
>
> raft.so - Cassandra consulting, support, and managed services
>
> On Mon, Mar 8, 2021 at 10:47 PM Gil Ganz <gilg...@gmail.com> wrote:
>
>> Hey,
>> We have a 3.11.9 cluster (recently upgraded from 2.1.14), and after the
>> upgrade we have an issue when we remove a node.
>>
>> The moment I run the removenode command, 3 servers in the same dc start
>> to have a high amount of pending native-transport-requests (getting to
>> around 1M) and clients are having issues due to that. We are using
>> vnodes (32), so I don't see why I would have 3 servers busier than
>> others (RF is 3, but I don't see why that would be related).
>>
>> Each node has a few TB of data, and in the past we were able to remove
>> a node in about half a day. Today what happens is that in the first 1-2
>> hours we have these issues with some nodes, then things go quiet, the
>> removal is still running and clients are ok, then a few hours later the
>> same issue is back (with the same nodes being the problematic ones) and
>> clients have issues again, leading us to run removenode force.
>>
>> Reducing stream throughput and the number of compactors has helped
>> mitigate the issues a bit, but we still have this issue of pending
>> native-transport-requests getting to insane numbers and clients
>> suffering, eventually causing us to run remove force. Any idea?
>>
>> I saw that since 3.11.6 there is a parameter
>> native_transport_max_concurrent_requests_in_bytes; I'm looking into
>> setting it, perhaps that will prevent the number of pending tasks from
>> getting so high.
>>
>> Gil
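P.S. In case it's useful to anyone else hitting this, here is roughly how I'm setting the limit and the other knobs we turned while the removal was streaming. The values below are only examples, not our exact numbers, and would need tuning per heap and workload:

  # in cassandra.yaml on 3.11.6+ (I set it there and restarted the node);
  # the per-ip variant also exists if you want to cap a single client:
  native_transport_max_concurrent_requests_in_bytes: 10485760         # ~10 MiB
  native_transport_max_concurrent_requests_in_bytes_per_ip: 2097152   # ~2 MiB

  # mitigations while the removenode streams are running:
  nodetool setstreamthroughput 50        # megabits/s
  nodetool setconcurrentcompactors 2
  nodetool tpstats | grep -i Native      # watch pending Native-Transport-Requests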