Hello Alain,
Thanks a lot for the confirmation.
Yes, this procedure seems like a workaround. But for my use case, where 
system_auth contains a small amount of data and the consistency level for 
authentication/authorization is switched to LOCAL_ONE, I think it is good 
enough.
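For reference, this is roughly how the consistency level is switched in our 
setup (a sketch assuming the cassandra.yaml options added by CASSANDRA-12988 in 
4.1; the exact option names are worth double-checking):

    # cassandra.yaml (assumed option names from CASSANDRA-12988)
    auth_read_consistency_level: LOCAL_ONE
    auth_write_consistency_level: LOCAL_ONE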
I completely get that this could be improved since there might be requirements 
from other users that cannot be covered with the proposed procedure.

BR
MK
From: Alain Rodriguez <al...@casterix.fr>
Sent: April 22, 2024 18:27
To: user@cassandra.apache.org
Cc: Michalis Kotsiouros (EXT) <michalis.kotsiouros....@ericsson.com>
Subject: Re: Datacenter decommissioning on Cassandra 4.1.4

Hi Michalis,

It's been a while since I last removed a DC, but I see there is now a 
protection to avoid accidentally leaving a DC without auth capability.

This was introduced in C* 4.1 through CASSANDRA-17478 
(https://issues.apache.org/jira/browse/CASSANDRA-17478).

The process of dropping a data center might have been overlooked while doing 
this work.

"It's never correct for an operator to remove a DC from system_auth replication 
settings while there are currently nodes up in that DC."

I believe this assertion is not correct. As Jon and Jeff mentioned, when 
removing an entire DC we usually remove the replication before decommissioning 
any node, for the reasons Jeff explained. The existing documentation is also 
clear about this: 
https://docs.datastax.com/en/cassandra-oss/3.0/cassandra/operations/opsDecomissionDC.html
 and 
https://thelastpickle.com/blog/2019/02/26/data-center-switch.html.

Michalis, the solution you suggest seems to be the (good/only?) way to go, even 
though it looks like a workaround, not really "clean", and something we need to 
improve. It was also mentioned here: 
https://dba.stackexchange.com/questions/331732/not-a-able-to-decommission-the-old-datacenter#answer-334890.
 It should work quickly, but only because this keyspace holds a fairly small 
amount of data; it will still not be optimal or as fast as it should be (it 
should be a near no-op, as Jeff explained above). It also obliges you to use 
the "--force" option, which could lead you to remove one of your nodes in 
another DC by mistake (and in a loaded cluster, or a 3-node cluster with 
RF = 3, this could hurt...). Having to operate using "nodetool decommission 
--force" cannot be standard, but for now I can't think of anything better for 
you. Maybe wait for someone else's confirmation, it's been a while since I 
operated Cassandra :).
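To be concrete about that risk, the operation on each node of the DC being 
removed would look roughly like this (just a sketch, the host/DC involved is 
whatever you have):

    nodetool status                 # first confirm this node really belongs to the DC being removed
    nodetool decommission --force   # proceeds even if it drops live replicas below the configured RF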

I think it would make sense to fix this somehow in Cassandra. Maybe we should 
ensure that no other keyspace has an RF > 0 for this data center instead of 
looking at active nodes, or that no clients are connected to the nodes, or add 
a manual flag somewhere, or something else? Even though I understand the 
motivation to protect users against a wrongly distributed system_auth keyspace, 
I think this protection should not be kept with this implementation. If that 
makes sense, I can create a ticket for this problem.
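As a manual version of that first check today, the replication settings of 
every keyspace can be listed from cqlsh and inspected for the DC being dropped 
(just a sketch of the idea):

    -- each keyspace's replication map; the DC being removed should no longer appear in any of them
    SELECT keyspace_name, replication FROM system_schema.keyspaces;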

C*heers,

Alain Rodriguez

casterix.fr




On Mon, Apr 8, 2024 at 16:26, Michalis Kotsiouros (EXT) via user 
<user@cassandra.apache.org> wrote:
Hello Jon and Jeff,
Thanks a lot for your replies.
I completely get your points.
Some more clarification about my issue.
When trying to update the replication before the decommission, I get the 
following error message when I remove the replication for the system_auth 
keyspace:
ConfigurationException: Following datacenters have active nodes and must be 
present in replication options for keyspace system_auth: [datacenter1]
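For clarity, the statement that triggers it is roughly the following (the 
surviving DC name and RF are placeholders for my setup):

    ALTER KEYSPACE system_auth
      WITH replication = {'class': 'NetworkTopologyStrategy', 'datacenter2': 3};
    -- fails with the ConfigurationException above while nodes in datacenter1 are still up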

This error message does not appear in the rest of the application keyspaces.
So, may I change the procedure to the following (a command-level sketch follows 
the list):

  1.  Make sure no clients are still writing to any nodes in the datacenter.
  2.  Run a full repair with nodetool repair.
  3.  Change all keyspaces, apart from the system_auth keyspace, so they no 
longer reference the datacenter being removed.
  4.  Run nodetool decommission using the --force option on every node in the 
datacenter being removed.
  5.  Change the system_auth keyspace so it no longer references the datacenter 
being removed.
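A rough command-level sketch of those steps (keyspace names, the surviving DC 
name and the RF values are placeholders for my setup):

    # step 2, on every node
    nodetool repair --full

    -- step 3, in cqlsh, repeated for every application keyspace
    ALTER KEYSPACE my_app_keyspace
      WITH replication = {'class': 'NetworkTopologyStrategy', 'datacenter2': 3};

    # step 4, on every node in the datacenter being removed
    nodetool decommission --force

    -- step 5, in cqlsh, once the old datacenter is empty
    ALTER KEYSPACE system_auth
      WITH replication = {'class': 'NetworkTopologyStrategy', 'datacenter2': 3};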
BR
MK



From: Jeff Jirsa <jji...@gmail.com>
Sent: April 08, 2024 17:19
To: cassandra <user@cassandra.apache.org>
Cc: Michalis Kotsiouros (EXT) <michalis.kotsiouros....@ericsson.com>
Subject: Re: Datacenter decommissioning on Cassandra 4.1.4

To Jon’s point, if you remove from replication after step 1 or step 2 (probably 
step 2 if your goal is to be strictly correct), the nodetool decommission phase 
becomes almost a no-op.

If you use the order below, the last nodes to decommission will cause the 
surviving machines to run out of space (assuming you have more than a few nodes 
to start with).



On Apr 8, 2024, at 6:58 AM, Jon Haddad <j...@jonhaddad.com> wrote:

You shouldn’t decom an entire DC before removing it from replication.

—

Jon Haddad
Rustyrazorblade Consulting
rustyrazorblade.com


On Mon, Apr 8, 2024 at 6:26 AM Michalis Kotsiouros (EXT) via user 
<user@cassandra.apache.org> wrote:
Hello community,
In our deployments, we usually rebuild the Cassandra datacenters for 
maintenance or recovery operations.
The procedure we have used since the days of Cassandra 3.x is the one in the 
DataStax documentation: Decommissioning a datacenter | Apache Cassandra 3.x 
(https://docs.datastax.com/en/cassandra-oss/3.x/cassandra/operations/opsDecomissionDC.html)
After upgrading to Cassandra 4.1.4, we have realized that there are some 
stricter rules that do not allow removing the replication while active 
Cassandra nodes still exist in a datacenter.
This check makes the above-mentioned procedure obsolete.
I am thinking to use the following as an alternative:

  1.  Make sure no clients are still writing to any nodes in the datacenter.
  2.  Run a full repair with nodetool repair.
  3.  Run nodetool decommission using the --force option on every node in the 
datacenter being removed.
  4.  Change all keyspaces so they no longer reference the datacenter being 
removed.

What is the procedure followed by other users? Do you see any risk in following 
the proposed procedure?

BR
MK
