What is the recommended value for num_tokens? I am asking because 
of the issue with sequential repairs running on token range after token range.

Rahul Neelakantan

> On Sep 29, 2014, at 2:29 PM, Robert Coli <rc...@eventbrite.com> wrote:
> 
>> On Fri, Sep 26, 2014 at 9:52 AM, Gene Robichaux <gene.robich...@match.com> 
>> wrote:
>> I am fairly new to Cassandra. We have a 9 node cluster, 5 in one DC and 4 in 
>> another.
>> 
>> Running a repair on a large column family seems to be moving much slower 
>> than I expect.
>> 
> 
> Unfortunately, as others have mentioned, the slowness/broken-ness of repair 
> is a long running (groan!) issue and therefore currently expected. 
> 
> At this time, I do not recommend upgrading to 2.1 in production to attempt to 
> fix it. I am also broadly skeptical that it is as fixed in 2.1 as all that.
> 
> One can increase gc_grace_seconds to 34 days [1] and repair once a month, 
> which should help make repair slightly more tractable.
> 
> For now you should probably evaluate which of your column families you 
> *absolutely must* repair (because you do DELETE-like operations in them, 
> etc.) and only repair those.
> 
> As an aside, you "just lose" with vnodes and clusters of this size. I presume 
> you plan to grow beyond approximately 9 nodes per DC, in which case you 
> probably do want vnodes enabled.
> 
> One note:
>> Looking at nodetool compactionstats, it indicates the Validation phase is 
>> running and that the total bytes is 4.5T (4505336278756).
> 
> This is the uncompressed size; I'm betting your actual on-disk size is closer 
> to 2T? Even though 2.0 has improved performance for nodes with lots of data, 
> 2T per node is still relatively "fat" for a Cassandra node.
> 
> 
> =Rob
> [1] https://issues.apache.org/jira/browse/CASSANDRA-5850
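
For anyone trying the gc_grace_seconds suggestion quoted above, here is a rough sketch of the arithmetic: 34 days expressed in seconds, the slack that leaves over a 31-day repair interval, and a CQL statement that would apply it. The helper names and the keyspace/table names (my_keyspace, my_table) are hypothetical placeholders, not anything from this thread; adapt them to your own schema.

```python
# Sketch only: helper, keyspace, and table names are illustrative assumptions.
SECONDS_PER_DAY = 24 * 60 * 60


def days_to_seconds(days: int) -> int:
    """Convert a whole number of days to seconds."""
    return days * SECONDS_PER_DAY


def repair_slack_days(gc_grace_days: int, repair_interval_days: int) -> int:
    """Days left over between one repair cycle and tombstone expiry."""
    return gc_grace_days - repair_interval_days


def alter_table_cql(keyspace: str, table: str, gc_grace_days: int) -> str:
    """Build a CQL statement that sets gc_grace_seconds on one table."""
    seconds = days_to_seconds(gc_grace_days)
    return f"ALTER TABLE {keyspace}.{table} WITH gc_grace_seconds = {seconds};"


if __name__ == "__main__":
    print(days_to_seconds(34))        # 34 days in seconds
    print(repair_slack_days(34, 31))  # slack against the longest month
    print(alter_table_cql("my_keyspace", "my_table", 34))
```

The slack figure assumes each repair actually finishes inside the monthly window. With repairs as slow as the ones described in this thread, that margin may be the number to watch.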
