In my experience, repairing 300GB of compressed data takes longer than 
repairing 300GB of uncompressed data, but I cannot point to an exact number. 
Calculating the differences is mostly CPU bound and works on the uncompressed 
data. 

Streaming also uses compression, applied after the on-disk data has been 
uncompressed.

So if you have 300GB of compressed data, take a look at how long repair takes 
and see if you are comfortable with that. You may also want to test replacing a 
node so you can get the procedure documented and understand how long it takes.  

The idea of the soft 300GB to 500GB limit came about because of a number of 
cases where people had 1 TB on a single node and they were surprised it took 
days to repair or replace. If you know how long things may take, and that fits 
in your operations then go with it. 
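
As a rough way to put numbers on "how long things may take", here is a minimal 
Python sketch that estimates only the time to move the data for a node 
replacement. The function name and the throughput figures are hypothetical 
placeholders, not measurements. 1G networking tops out at about 125 MB/s in 
theory, and sustained streaming rates are usually lower. It ignores the CPU 
bound difference calculation mentioned above, so treat the result as a lower 
bound and measure your own cluster.

```python
def transfer_hours(data_gb: float, throughput_mb_s: float) -> float:
    """Hours needed to move data_gb gigabytes at throughput_mb_s megabytes/second."""
    seconds = (data_gb * 1000) / throughput_mb_s
    return seconds / 3600


# Assumed sustained rates in MB/s; replace with what you observe on your nodes.
for size_gb in (300, 500, 1000):
    for rate_mb_s in (25, 50, 100):
        hours = transfer_hours(size_gb, rate_mb_s)
        print(f"{size_gb} GB at {rate_mb_s} MB/s: {hours:.1f} h")
```

If the real rebuild takes much longer than this estimate, the gap is time 
spent somewhere other than the wire, which is worth knowing before you need 
to replace a node in anger.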

Cheers
 
-----------------
Aaron Morton
Freelance Cassandra Developer
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 18/02/2013, at 10:08 PM, Vegard Berget <p...@fantasista.no> wrote:

>  
> Just out of curiosity :
> 
> When using compression, does this affect this one way or another?  Is 300G 
> (compressed) SSTable size, or total size of data?   
> 
> .vegard,
> 
> 
> ----- Original Message -----
> From: user@cassandra.apache.org
> To: <user@cassandra.apache.org>
> Cc:
> Sent: Mon, 18 Feb 2013 08:41:25 +1300
> Subject: Re: cassandra vs. mongodb quick question
> 
> 
> If you have spinning disk and 1G networking and no virtual nodes, I would 
> still say 300G to 500G is a soft limit. 
> 
> If you are using virtual nodes, SSD, JBOD disk configuration or faster 
> networking you may go higher. 
> 
> The limiting factors are the time it takes to repair, the time it takes to 
> replace a node, and the memory considerations for 100's of millions of rows. 
> If the performance of those operations is acceptable to you, then go crazy. 
> 
> Cheers
> 
> -----------------
> Aaron Morton
> Freelance Cassandra Developer
> New Zealand
> 
> @aaronmorton
> http://www.thelastpickle.com
> 
> On 16/02/2013, at 9:05 AM, "Hiller, Dean" <dean.hil...@nrel.gov> wrote:
> 
> So I found out mongodb varies their node size from 1T to 42T per node 
> depending on the profile.  So if I was going to be writing a lot but rarely 
> changing rows, could I also use cassandra with a per node size of +20T or is 
> that not advisable?
> 
> Thanks,
> Dean
> 
