Just an FYI - I understand you're using this to do an apples-to-apples
comparison, so it's good to have everything the same in both.

But... in case you're not aware: 64KB chunk length is a *terrible*
setting for performance and cost. Disk is always cheaper than CPU, so for
production deployments I would **NEVER** use 64KB. I've given a few talks
on why it's so awful; you can read the JIRA where we changed the default:
https://issues.apache.org/jira/browse/CASSANDRA-13241
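If you want to try the current default on your table, something like the
following should work (a sketch using the xxx.yyy placeholder names from
the schema further down this thread; 16 KiB is the default chunk length
since CASSANDRA-13241):

```sql
-- Sketch: switch to the 16 KiB default chunk length while keeping Zstd.
-- xxx.yyy are the anonymized keyspace/table names from the thread below.
ALTER TABLE xxx.yyy
  WITH compression = {
    'class': 'org.apache.cassandra.io.compress.ZstdCompressor',
    'compression_level': '10',
    'chunk_length_in_kb': '16'
  };
```

Note that only newly flushed or compacted SSTables pick up the new
setting; `nodetool upgradesstables -a xxx yyy` rewrites the existing ones
so the comparison isn't skewed by a mix of chunk sizes.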

Jon

On Tue, Mar 10, 2026 at 3:23 PM dbms-tech <[email protected]> wrote:

> Thanks for taking the time to reply.
>
> In my lab environment, I just altered the table to simulate LCS. I'll
> update with my findings tomorrow.
>
> INITIAL: compaction = {'class':
> 'org.apache.cassandra.db.compaction.UnifiedCompactionStrategy',
> 'max_sstables_to_compact': '64', 'min_sstable_size': '100MiB',
> 'scaling_parameters': 'T4', 'sstable_growth': '0.3333333333333333',
> 'target_sstable_size': '1GiB'}
> ----
> NEW: compaction = {'base_shard_count': '8', 'class':
> 'org.apache.cassandra.db.compaction.UnifiedCompactionStrategy',
> 'scaling_parameters': 'L10', 'target_sstable_size': '256MiB'}
>
>
> On Tue, Mar 10, 2026 at 2:42 PM Patrick McFadin <[email protected]>
> wrote:
>
>> It looks like you are comparing Cassandra 2 LCS to Cassandra 5 UCS with
>> T4, which is a tiered/STCS-like layout. If so, I would not treat the
>> storage increase as an expected Cassandra 5 baseline. Since the compression
>> ratio is nearly identical, I’d retest with UCS L10 for a more
>> apples-to-apples comparison with LCS before drawing conclusions about disk
>> usage.
>>
>> Patrick
>>
>>
>> On Mon, Mar 9, 2026 at 7:39 AM dbms-tech <[email protected]> wrote:
>>
>>> Testing a C5 upgrade from C2 in our lab environment as a PoC. I migrated
>>> a 1.3B-row table from C2 to C5.
>>> On C2, the table was 7.4 TB. On C5, it is 9.5 TB.
>>>
>>> I verified the following ...
>>> ==============================
>>> - There are no stale snapshots in C5.
>>> - C5 size above is exclusively for the table's sstables. No other files
>>> reside in the /data directory.
>>> - tablestats shows 'SSTable Compression Ratio: 0.25316' for C5 and 0.25
>>> for C2.
>>> - I adjusted the C5 table to use: {'chunk_length_in_kb': '64', 'class':
>>> 'org.apache.cassandra.io.compress.ZstdCompressor', 'compression_level':
>>> '10'} to obtain the above compression ratio.
>>> - C5 uses default UCS while C2 uses LCS.
>>> - C2 used DeflateCompressor.
>>> - Compactions are optimal with near zero pending compactions.
>>> - Cluster is fully & regularly repaired using Reaper.
>>>
>>> Inquiries ...
>>> - Is C5 (UCS) expected to use this much more storage compared to C2
>>> (LCS)?
>>> - Other than increasing the compression level and chunk length, what
>>> else can be done to match C2's storage footprint?
>>>
>>> Thanks in advance.
>>>
>>> ==============
>>> C5
>>> ==============
>>> CREATE TABLE xxx.yyy (
>>>
>>>     ddd bigint PRIMARY KEY,
>>>     xxx boolean,
>>>     yyy text,
>>>     zzz text,
>>>     eee bigint
>>> ) WITH additional_write_policy = '99p'
>>>     AND allow_auto_snapshot = true
>>>     AND bloom_filter_fp_chance = 0.01
>>>     AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
>>>     AND cdc = false
>>>     AND comment = ''
>>>     AND compaction = {'class':
>>> 'org.apache.cassandra.db.compaction.UnifiedCompactionStrategy',
>>> 'max_sstables_to_compact': '64', 'min_sstable_size': '100MiB',
>>> 'scaling_parameters': 'T4', 'sstable_growth': '0.3333333333333333',
>>> 'target_sstable_size': '1GiB'}
>>>     AND compression = {'chunk_length_in_kb': '64', 'class':
>>> 'org.apache.cassandra.io.compress.ZstdCompressor', 'compression_level':
>>> '10'}
>>>     AND memtable = 'default'
>>>     AND crc_check_chance = 1.0
>>>     AND default_time_to_live = 0
>>>     AND extensions = {}
>>>     AND gc_grace_seconds = 864000
>>>     AND incremental_backups = true
>>>     AND max_index_interval = 2048
>>>     AND memtable_flush_period_in_ms = 0
>>>     AND min_index_interval = 128
>>>     AND read_repair = 'BLOCKING'
>>>     AND speculative_retry = '99p';
>>>
>>> =================
>>> C5
>>> ==================
>>> nodetool tablestats xxx.yyy
>>> Total number of tables: 1
>>> ----------------
>>> Keyspace: xxx
>>>         Read Count: 764664474
>>>         Read Latency: 1.092679410718903 ms
>>>         Write Count: 127141592
>>>         Write Latency: 0.02513343241761516 ms
>>>         Pending Flushes: 0
>>>                 Table: document
>>>                 SSTable count: 783
>>>                 Old SSTable count: 0
>>>                 Max SSTable size: 4.664GiB
>>>                 Space used (live): 1208104887289
>>>                 Space used (total): 1208104887289
>>>                 Space used by snapshots (total): 0
>>>                 Off heap memory used (total): 1324105435
>>>                 SSTable Compression Ratio: 0.25316
>>>                 Number of partitions (estimate): 126389507
>>>                 Memtable cell count: 23023
>>>                 Memtable data size: 424380268
>>>                 Memtable off heap memory used: 450102571
>>>                 Memtable switch count: 1242
>>>                 Speculative retries: 4016043
>>>                 Local read count: 725885015
>>>                 Local read latency: 1.153 ms
>>>                 Local write count: 24986436
>>>                 Local write latency: 0.052 ms
>>>                 Local read/write ratio: 29.05116
>>>                 Pending flushes: 0
>>>                 Percent repaired: 0.0
>>>                 Bytes repaired: 0B
>>>                 Bytes unrepaired: 4.328TiB
>>>                 Bytes pending repair: 0B
>>>                 Bloom filter false positives: 4722014
>>>                 Bloom filter false ratio: 0.01052
>>>                 Bloom filter space used: 289491392
>>>                 Bloom filter off heap memory used: 289485128
>>>                 Index summary off heap memory used: 0
>>>                 Compression metadata off heap memory used: 584517736
>>>                 Compacted partition minimum bytes: 21
>>>                 Compacted partition maximum bytes: 10090808
>>>                 Compacted partition mean bytes: 21423
>>>                 Average live cells per slice (last five minutes): 1.0
>>>                 Maximum live cells per slice (last five minutes): 1
>>>                 Average tombstones per slice (last five minutes): 1.0
>>>                 Maximum tombstones per slice (last five minutes): 1
>>>                 Droppable tombstone ratio: 0.01064
>>>                 Top partitions by size (last update:
>>> 2026-03-06T16:13:07Z):
>>>
>>>
>>> --
>>>
>>> ----------------------------------------
>>> Thank you
>>>
>>>
>>>
