As others pointed out, compression will reduce the on-disk size, and replication will increase the total size across nodes.
The other thing to note is that you can have multiple versions of the data in different SSTables, plus tombstones from deletions and TTLs, indexes, any snapshots, and room for the temporary artifacts of compaction. If you just want a quick guesstimate of your space needs, I'd probably use your uncompressed calculation as a heuristic for the per-node storage required.

From: lampahome <pahome.c...@mirlab.org>
Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Date: Monday, December 30, 2019 at 9:37 PM
To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Subject: How bottom of cassandra save data efficiently?

If I use var a as the primary key and var b as the second key, where a and b are 16 bytes and 8 bytes, and the other data is 32 bytes, then one row is a + b + data = 16 + 8 + 32 = 56 bytes.
If I have 100,000 rows to store in Cassandra, will they occupy 56 x 100,000 bytes on my disk? Or will the data be compressed? Thanks.
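
For reference, the back-of-the-envelope arithmetic from this thread can be written out as a small Python sketch. This is only an illustration: the function name, the replication factor of 3, and the compression ratio of 0.5 are assumptions made up for the example, not values from the thread or from Cassandra itself. It also ignores per-row and per-cell overhead, tombstones, indexes, snapshots, and compaction headroom, which is why the uncompressed figure is suggested above as the per-node heuristic.

```python
def estimate_storage_bytes(rows, row_bytes, replication_factor=1, compression_ratio=1.0):
    """Rough size estimate; all parameters besides rows/row_bytes are hypothetical knobs.

    compression_ratio is compressed size / uncompressed size (1.0 = no compression).
    """
    raw = rows * row_bytes                      # uncompressed payload only
    cluster_total = round(raw * replication_factor * compression_ratio)
    return {"raw_bytes": raw, "cluster_total_bytes": cluster_total}


# Numbers from the question: 16 + 8 + 32 = 56 bytes per row, 100,000 rows.
row_bytes = 16 + 8 + 32
baseline = estimate_storage_bytes(100_000, row_bytes)
print(baseline["raw_bytes"])  # 5600000

# Assumed values for illustration only: 3 replicas, data compresses to half its size.
scenario = estimate_storage_bytes(100_000, row_bytes,
                                  replication_factor=3, compression_ratio=0.5)
print(scenario["cluster_total_bytes"])
```

The raw figure is what the question computes (5,600,000 bytes); the second call only shows how replication and compression pull the cluster-wide total in opposite directions.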