As others pointed out, compression will reduce the on-disk size, and 
replication across nodes will increase the total size.

The other thing to note is that you can have multiple versions of the data in 
different sstables, plus tombstones from deletions and TTLs, indexes, any 
snapshots, and room for the temporary artifacts of compactions. If you just 
want a quick guesstimate of your space needs, I'd probably use your 
uncompressed calculation as a heuristic for the per-node storage required.
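
To make that heuristic concrete, here is a rough back-of-the-envelope sketch 
in Python. It is not how Cassandra accounts for space internally. The function 
name, the replication factor of 3, the 3-node cluster, and the 
compression_ratio and overhead_factor knobs are all assumptions for 
illustration, and you would want to replace them with numbers measured on your 
own cluster.

```python
def estimate_storage_bytes(rows, row_bytes, replication_factor=3, nodes=3,
                           compression_ratio=1.0, overhead_factor=1.0):
    """Rough disk estimate; every default here is an illustrative assumption.

    compression_ratio: compressed size / uncompressed size (1.0 = no gain).
    overhead_factor:   multiplier for extra versions, tombstones, indexes,
                       snapshots, and compaction headroom (1.0 = none).
    """
    raw = rows * row_bytes                      # naive payload size
    one_copy = raw * compression_ratio * overhead_factor
    cluster_total = one_copy * replication_factor
    per_node = cluster_total / nodes            # assumes even distribution
    return {
        "raw_bytes": raw,
        "cluster_total_bytes": cluster_total,
        "per_node_bytes": per_node,
    }


# The numbers from the question: 100,000 rows of 56 bytes each.
est = estimate_storage_bytes(rows=100_000, row_bytes=56)
for name, value in est.items():
    print(f"{name}: {value:,.0f}")
```

With the defaults left at 1.0, the per-node figure equals the uncompressed 
calculation whenever the replication factor matches the node count. That is the 
quick heuristic above; lowering compression_ratio or raising overhead_factor 
shows how far the real number could drift in either direction.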

From: lampahome <pahome.c...@mirlab.org>
Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Date: Monday, December 30, 2019 at 9:37 PM
To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Subject: How bottom of cassandra save data efficiently?

Suppose I use var a as the primary key and var b as the second key, where a 
is 16 bytes and b is 8 bytes.

The other data is 32 bytes.

In one row, I have a+b+data = 16+8+32 = 56 bytes.

If I have 100,000 rows to store in Cassandra, will they occupy 56 x 100,000 
bytes on my disk? Or will the data be compressed?

thx
