> The bottleneck now seems to be the repair time. If any node becomes too 
> inconsistent, or needs to be replaced, the rebuild time is over a week.

This is why I've recommended 300GB to 400GB per node in the past. It's not a 
hard limit, but it seems to be a nice balance. You need to take compaction, 
repair, backup / restore, node replacement, upgrading, and disaster recovery 
into consideration.

That said, compression, SSDs and faster networking may mean you can run more 
data per node. Also, Virtual Nodes coming in 1.2.X will increase the 
parallelism of repairing / bootstrapping a node. (They won't help reduce the 
time taken to calculate Merkle trees, though.)
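
For reference, vnodes are enabled per node with num_tokens in cassandra.yaml 
(the 1.2 docs use 256 as the typical value; treat the exact number as something 
to test for your own cluster):

    # cassandra.yaml (1.2+): give the node many small token ranges instead of
    # a single initial_token. 256 is the value used in the 1.2 documentation.
    num_tokens: 256
    # leave initial_token unset when num_tokens is in use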

Cheers

-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 7/09/2012, at 7:28 AM, Dustin Wenz <dustinw...@ebureau.com> wrote:

> This is actually another problem that we've encountered with Cassandra; the 
> range of platforms it can be deployed on is fairly limited. If you want to 
> run with Oracle's JRE (which is apparently recommended), you are pretty much 
> stuck with Linux on x86/64 (I haven't tried the new JDK on ARM yet, but it 
> sounds promising). You could probably do ok on Solaris, too, with a custom 
> Snappy jar and some JNA concessions.
> 
>       - .Dustin
> 
> On Sep 5, 2012, at 10:36 PM, Rob Coli <rc...@palominodb.com> wrote:
> 
>> On Sun, Jul 29, 2012 at 7:40 PM, Dustin Wenz <dustinw...@ebureau.com> wrote:
>>> We've just set up a new 7-node cluster with Cassandra 1.1.2 running under 
>>> OpenJDK6.
>> 
>> It's worth noting that the Cassandra project recommends the Sun JRE. Without
>> the Sun JRE, you might not be able to use JAMM to determine the live
>> ratio. Very few people use OpenJDK in production, so using it also
>> increases the likelihood that you might be the first to encounter a
>> given issue. FWIW!
>> 
>> =Rob
>> 
>> -- 
>> =Robert Coli
>> AIM&GTALK - rc...@palominodb.com
>> YAHOO - rcoli.palominob
>> SKYPE - rcoli_palominodb
> 
