Testing a C5 upgrade from C2 in our lab environment as a PoC. I migrated a
1.3B-row table from C2 to C5.
On C2, the table was 7.4 TB. On C5, it is 9.5 TB.
I verified the following ...
==============================
- There are no stale snapshots in C5.
- C5 size above is exclusively for the table's sstables. No other files
reside in the /data directory.
- tablestats shows 'SSTable Compression Ratio: 0.25316' for C5 and 0.25 for
C2.
- I adjusted the C5 table to use 'chunk_length_in_kb': '64', 'class':
'org.apache.cassandra.io.compress.ZstdCompressor', 'compression_level': '10'
to obtain the above compression ratio.
- C5 uses the default UCS while C2 uses LCS.
- C2 used DeflateCompressor.
- Compactions are optimal with near zero pending compactions.
- Cluster is fully & regularly repaired using Reaper.
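
As a rough sanity check on the figures above, here is a minimal Python sketch of how much of the size difference the compression ratio alone could explain. It assumes the 7.4 TB and 9.5 TB totals were measured the same way (same replication factor and node count) and that the two tablestats ratios are representative of the whole cluster; the variable names are only illustrative.

```python
# Figures quoted in the post; sizes are on-disk totals in TB.
c2_size_tb = 7.4
c5_size_tb = 9.5
c2_ratio = 0.25      # "SSTable Compression Ratio" reported on C2
c5_ratio = 0.25316   # "SSTable Compression Ratio" reported on C5

# Overall on-disk growth from C2 to C5.
growth = c5_size_tb / c2_size_tb - 1

# Growth that the slightly worse compression ratio would cause by itself.
ratio_effect = c5_ratio / c2_ratio - 1

# Growth left over after factoring out the compression ratio difference.
unexplained = (1 + growth) / (1 + ratio_effect) - 1

print(f"on-disk growth:            {growth:.1%}")
print(f"due to compression ratio:  {ratio_effect:.1%}")
print(f"not explained by ratio:    {unexplained:.1%}")
```

Under those assumptions the compression ratio accounts for only a small part of the difference, which is why the questions below focus on the compaction strategy.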
Inquiries ...
- Is C5 (UCS) expected to utilize this much more storage compared to C2
(LCS)?
- Other than increasing compression_level and chunk_length_in_kb, what else
can be done to match C2 storage?
Thanks in advance.
==============
C5
==============
CREATE TABLE xxx.yyy (
ddd bigint PRIMARY KEY,
xxx boolean,
yyy text,
zzz text,
ddd bigint
) WITH additional_write_policy = '99p'
AND allow_auto_snapshot = true
AND bloom_filter_fp_chance = 0.01
AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
AND cdc = false
AND comment = ''
AND compaction = {'class':
'org.apache.cassandra.db.compaction.UnifiedCompactionStrategy',
'max_sstables_to_compact': '64', 'min_sstable_size': '100MiB',
'scaling_parameters': 'T4', 'sstable_growth': '0.3333333333333333',
'target_sstable_size': '1GiB'}
AND compression = {'chunk_length_in_kb': '64', 'class':
'org.apache.cassandra.io.compress.ZstdCompressor', 'compression_level':
'10'}
AND memtable = 'default'
AND crc_check_chance = 1.0
AND default_time_to_live = 0
AND extensions = {}
AND gc_grace_seconds = 864000
AND incremental_backups = true
AND max_index_interval = 2048
AND memtable_flush_period_in_ms = 0
AND min_index_interval = 128
AND read_repair = 'BLOCKING'
AND speculative_retry = '99p';
=================
C5
==================
nodetool tablestats xxx.yyy
Total number of tables: 1
----------------
Keyspace: xxx
Read Count: 764664474
Read Latency: 1.092679410718903 ms
Write Count: 127141592
Write Latency: 0.02513343241761516 ms
Pending Flushes: 0
Table: document
SSTable count: 783
Old SSTable count: 0
Max SSTable size: 4.664GiB
Space used (live): 1208104887289
Space used (total): 1208104887289
Space used by snapshots (total): 0
Off heap memory used (total): 1324105435
SSTable Compression Ratio: 0.25316
Number of partitions (estimate): 126389507
Memtable cell count: 23023
Memtable data size: 424380268
Memtable off heap memory used: 450102571
Memtable switch count: 1242
Speculative retries: 4016043
Local read count: 725885015
Local read latency: 1.153 ms
Local write count: 24986436
Local write latency: 0.052 ms
Local read/write ratio: 29.05116
Pending flushes: 0
Percent repaired: 0.0
Bytes repaired: 0B
Bytes unrepaired: 4.328TiB
Bytes pending repair: 0B
Bloom filter false positives: 4722014
Bloom filter false ratio: 0.01052
Bloom filter space used: 289491392
Bloom filter off heap memory used: 289485128
Index summary off heap memory used: 0
Compression metadata off heap memory used: 584517736
Compacted partition minimum bytes: 21
Compacted partition maximum bytes: 10090808
Compacted partition mean bytes: 21423
Average live cells per slice (last five minutes): 1.0
Maximum live cells per slice (last five minutes): 1
Average tombstones per slice (last five minutes): 1.0
Maximum tombstones per slice (last five minutes): 1
Droppable tombstone ratio: 0.01064
Top partitions by size (last update: 2026-03-06T16:13:07Z):
--
----------------------------------------
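
For reference, a minimal Python sketch that cross-checks two of the tablestats figures above for this node. It divides "Space used (live)" by the compression ratio to estimate the uncompressed data size and compares it with "Bytes unrepaired". Whether "Bytes unrepaired" is reported as uncompressed bytes is an assumption here, and "Space used (live)" may include non-data components, so this is only an approximation.

```python
TIB = 2 ** 40  # bytes per TiB

# Values copied from the nodetool tablestats output above.
space_used_live = 1_208_104_887_289   # bytes, "Space used (live)"
compression_ratio = 0.25316           # "SSTable Compression Ratio"
bytes_unrepaired_tib = 4.328          # "Bytes unrepaired", in TiB

# On-disk size of this table on this node, in TiB.
on_disk_tib = space_used_live / TIB

# Estimated uncompressed size implied by the compression ratio
# (assumption: the ratio applies uniformly to all live bytes).
estimated_uncompressed_tib = space_used_live / compression_ratio / TIB

# Relative gap between the estimate and the reported unrepaired bytes.
relative_gap = estimated_uncompressed_tib / bytes_unrepaired_tib - 1

print(f"on disk:                {on_disk_tib:.3f} TiB")
print(f"estimated uncompressed: {estimated_uncompressed_tib:.3f} TiB")
print(f"gap vs Bytes unrepaired: {relative_gap:.2%}")
```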
Thank you