Re: Datacenter decommissioning on Cassandra 4.1.4

2024-04-22 Thread Sebastian Marsching
Recently, I successfully used the following procedure when decommissioning a datacenter: 1. Reduced the replication factor for this DC to zero for all keyspaces except the system_auth keyspace. For that keyspace, I reduced the RF to one. 2. Decommissioned all nodes except one in the DC using
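The two-step RF reduction described above can be sketched as a small helper that generates the required ALTER KEYSPACE statements. This is a hypothetical illustration, not code from the thread: the input shape, keyspace names, and DC names are made up, and the statements would still have to be run via cqlsh against a real cluster.

```python
def zero_dc_statements(keyspace_rfs, dc):
    """Generate ALTER KEYSPACE statements that remove a datacenter from
    replication before decommissioning it.

    keyspace_rfs maps keyspace name to its per-DC replication factors
    (a made-up input shape for this sketch)."""
    stmts = []
    for ks, rfs in keyspace_rfs.items():
        new_rfs = {d: r for d, r in rfs.items() if d != dc}
        if ks == "system_auth":
            # step 1 of the procedure: keep RF 1 for system_auth so the last
            # remaining node in the DC can still authenticate
            new_rfs[dc] = 1
        opts = ", ".join(f"'{d}': {r}" for d, r in sorted(new_rfs.items()))
        stmts.append(f"ALTER KEYSPACE {ks} WITH replication = "
                     f"{{'class': 'NetworkTopologyStrategy', {opts}}};")
    return stmts

stmts = zero_dc_statements(
    {"app_data": {"dc1": 3, "dc2": 3}, "system_auth": {"dc1": 3, "dc2": 3}},
    dc="dc2",
)
```

After these statements are applied, the nodes in the removed DC no longer own any replicas (except the single system_auth replica), which is what makes the subsequent `nodetool decommission` cheap.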

Re: Documentation about TTL and tombstones

2024-03-18 Thread Sebastian Marsching
> It's actually correct to do it how it is today. > Insertion date does not matter, what matters is the time after tombstones are > supposed to be deleted. > If the delete got to all nodes, sure, no problem, but if any of the nodes > didn't get the delete, and you would get rid of the

Re: Documentation about TTL and tombstones

2024-03-16 Thread Sebastian Marsching
> That's not how gc_grace_seconds works. > gc_grace_seconds controls how much time must pass *after* a tombstone becomes eligible for deletion before it can actually be deleted, in order to give you enough time to run repairs. > > Say you have data that is about to expire on March 16 8am, and > gc_grace_seconds is 10 days.
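The timing rule quoted above is simple date arithmetic; a minimal sketch, using the March 16 example from the quoted message and assuming the default gc_grace_seconds of 10 days (864000 seconds):

```python
from datetime import datetime, timedelta

def purgeable_after(expiry, gc_grace_seconds):
    # A tombstone can only be compacted away once gc_grace_seconds have
    # passed after the data expired or was deleted, which leaves a window
    # in which repairs can propagate the tombstone to all replicas.
    return expiry + timedelta(seconds=gc_grace_seconds)

# data expiring March 16, 8am, with gc_grace_seconds = 10 days
purge_time = purgeable_after(datetime(2024, 3, 16, 8, 0), 864000)
```

So in this example the tombstone cannot be purged before March 26, 8am, regardless of when the data was originally inserted.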

Re: Documentation about TTL and tombstones

2024-03-14 Thread Sebastian Marsching
> by reading the documentation about TTL > https://cassandra.apache.org/doc/4.1/cassandra/operating/compaction/index.html#ttl > It mentions that it creates a tombstone when data expires; how is that > possible without writing the tombstone to the table? I thought TTL > doesn't create

Re: SStables stored in directory with different table ID than the one found in system_schema.tables

2024-02-09 Thread Sebastian Marsching
You might find the following discussion from the mailing-list archive helpful: https://lists.apache.org/thread/6hnypp6vfxj1yc35ptp0xf15f11cx77d This thread discusses a similar situation and gives a few pointers on when it might be safe to simply move the SSTables around. > Am 08.02.2024 um 13:06

Re: Switching to Incremental Repair

2024-02-07 Thread Sebastian Marsching
paired SSTables because some unrepaired SSTables are > being marked as repaired on one node but not on another, you would then > understand why over-streaming can happen. The over-streaming is only > problematic for the repaired SSTables, because they are much bigger than the > unrepai

Re: Switching to Incremental Repair

2024-02-07 Thread Sebastian Marsching
> Caution, using the method you described, the amount of data streamed at the > end with the full repair is not the amount of data written between stopping > the first node and the last node, but depends on the table size, the number > of partitions written, their distribution in the ring and

Re: Switching to Incremental Repair

2024-02-07 Thread Sebastian Marsching
> That's a feature we need to implement in Reaper. I think disallowing the > start of the new incremental repair would be easier to manage than pausing > the full repair that's already running. It's also what I think I'd expect as > a user. > > I'll create an issue to track this. Thank you,

Re: Switching to Incremental Repair

2024-02-07 Thread Sebastian Marsching
> Full repair running for an entire week sounds excessively long. Even if > you've got 1 TB of data per node, 1 week means the repair speed is less than > 2 MB/s, that's very slow. Perhaps you should focus on finding the bottleneck > of the full repair speed and work on that instead. We store
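The "less than 2 MB/s" figure quoted above follows from straightforward arithmetic; a quick sketch (decimal megabytes, 1 TB per node, one week of wall-clock time):

```python
def repair_rate_mb_per_s(data_bytes, duration_seconds):
    # effective repair throughput in (decimal) megabytes per second
    return data_bytes / duration_seconds / 1_000_000

# 1 TB of data per node, repaired over a full week
rate = repair_rate_mb_per_s(1_000_000_000_000, 7 * 24 * 3600)
```

This works out to roughly 1.65 MB/s, which is indeed well below what a healthy node and network should sustain, supporting the suggestion to look for a bottleneck.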

Re: Switching to Incremental Repair

2024-02-03 Thread Sebastian Marsching
Hi, > 2. use an orchestration tool, such as Cassandra Reaper, to take care of that > for you. You will still need to monitor and alert to ensure the repairs are run > successfully, but fixing a stuck or failed repair is not very time sensitive, > you can usually leave it till Monday morning if it

Re: Over streaming in one node during repair.

2024-01-23 Thread Sebastian Marsching
I would check whether some SSTables are marked as repaired while others are not (by running sstablemetadata and checking the value of repairedAt). An inconsistency in the repaired state might explain the over-streaming. During repairs, data from repaired SSTables on one node is only compared
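The check described above boils down to partitioning SSTables by their repairedAt value; a hypothetical sketch, assuming you have already collected the repairedAt values that sstablemetadata prints (a repairedAt of 0 means the SSTable is unrepaired; the file names below are made up):

```python
def split_by_repaired_state(repaired_at):
    """Partition SSTables by repaired state. Input maps an SSTable path to
    the repairedAt value reported by sstablemetadata; 0 means unrepaired."""
    repaired = sorted(p for p, t in repaired_at.items() if t > 0)
    unrepaired = sorted(p for p, t in repaired_at.items() if t == 0)
    return repaired, unrepaired

repaired, unrepaired = split_by_repaired_state({
    "nb-1-big-Data.db": 1706000000000,  # repaired (timestamp set)
    "nb-2-big-Data.db": 0,              # unrepaired
    "nb-3-big-Data.db": 1706000000000,  # repaired
})
```

Comparing these two sets across nodes for the same token ranges would reveal the kind of repaired-state mismatch that leads to over-streaming.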

Re: Design Query Optimization for Cassandra Table with Date-based Filtering

2023-12-14 Thread Sebastian Marsching
Hi Arjun, this is strange. You should be able to use a range query on a column that is part of the clustering key, as long as all columns in the clustering key left to this column are set to fixed values. So, given the table definition that you specified, your query should work (I just
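The rule stated above (a range restriction on a clustering column is valid only when every clustering column to its left is fixed by equality) can be expressed as a tiny validity check. This is an illustrative sketch of the CQL rule, not driver code; the column names are assumed for the example:

```python
def range_restriction_allowed(clustering_columns, eq_columns, range_column):
    """A range restriction (<, <=, >, >=) on a clustering column is only
    valid when every clustering column to its left carries an equality
    restriction."""
    idx = clustering_columns.index(range_column)
    return all(c in eq_columns for c in clustering_columns[:idx])

clustering = ["decimation_level", "bucket_start_time"]  # assumed names
# valid: decimation_level fixed, range on bucket_start_time
ok = range_restriction_allowed(clustering, {"decimation_level"}, "bucket_start_time")
# invalid: range on bucket_start_time without fixing decimation_level
bad = range_restriction_allowed(clustering, set(), "bucket_start_time")
```

In CQL terms: `WHERE pk = ? AND decimation_level = ? AND bucket_start_time >= ?` is fine, while a bare range on the second clustering column without fixing the first is rejected.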

Re: Schema inconsistency in mixed-version cluster

2023-12-12 Thread Sebastian Marsching
> I assume these are column names of a non-system table. > This is correct. It is one of our application tables. The table has the following schema: CREATE TABLE pv_archive.channels ( channel_name text, decimation_level int, bucket_start_time bigint, channel_data_id uuid static,

Re: Schema inconsistency in mixed-version cluster

2023-12-12 Thread Sebastian Marsching
> If an upgrade involves changing the schema, I think backwards compatibility > would be out of the question? That’s a good point. I just noticed that during the upgrade, the output of “nodetool describecluster” showed a schema version disagreement, where the nodes running 3.11.14 were on

Schema inconsistency in mixed-version cluster

2023-12-12 Thread Sebastian Marsching
Hi, while upgrading our production cluster from C* 3.11.14 to 4.1.3, we experienced the issue that some SELECT queries failed due to supposedly no replica being available. The system logs on the C* nodes were full of messages like the following one: ERROR [ReadStage-1] 2023-12-11

Migrating to incremental repair in C* 4.x

2023-11-23 Thread Sebastian Marsching
Hi, we are currently in the process of migrating from C* 3.11 to C* 4.1 and we want to start using incremental repairs after the upgrade has been completed. It seems like all the really bad bugs that made using incremental repairs dangerous in C* 3.x have been fixed in 4.x, and for our

Re: Upgrade from C* 3 to C* 4 per datacenter

2023-10-26 Thread Sebastian Marsching
Hi, as we are currently facing the same challenge (upgrading an existing cluster from C* 3 to C* 4), I wanted to share our strategy with you. It largely is what Scott already suggested, but I have some extra details, so I thought it might still be useful. We duplicated our cluster using the