Re: Repair taking long time

Robert Coli Mon, 29 Sep 2014 11:30:54 -0700

On Fri, Sep 26, 2014 at 9:52 AM, Gene Robichaux <[email protected]>
wrote:


>  I am fairly new to Cassandra. We have a 9 node cluster, 5 in one DC and
> 4 in another.
>
>
>
> Running a repair on a large column family seems to be moving much slower
> than I expect.
>

Unfortunately, as others have mentioned, the slowness/broken-ness of repair
is a long-running (groan!) issue and is therefore currently expected.

At this time, I do not recommend upgrading to 2.1 in production to attempt
to fix it. I am also broadly skeptical that it is as fixed in 2.1 as all that.

One can increase gc_grace_seconds to 34 days [1] and repair once a month,
which should help make repair slightly more tractable.
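
gc_grace_seconds is set in seconds, so the 34 days above has to be
converted first. A minimal sketch of that arithmetic, which also prints the
CQL you would run; the keyspace and table names are placeholders, not
anything from your cluster:

```python
# Sketch only: converts a day count into the value gc_grace_seconds expects.
SECONDS_PER_DAY = 24 * 60 * 60


def gc_grace_for_days(days: int) -> int:
    """Return the gc_grace_seconds value for a window of `days` days."""
    return days * SECONDS_PER_DAY


def alter_statement(keyspace: str, table: str, days: int) -> str:
    """Build the CQL statement that sets gc_grace_seconds on one table."""
    seconds = gc_grace_for_days(days)
    return f"ALTER TABLE {keyspace}.{table} WITH gc_grace_seconds = {seconds};"


if __name__ == "__main__":
    # "my_keyspace" and "my_table" are hypothetical names for illustration.
    print(alter_statement("my_keyspace", "my_table", 34))
```

For comparison, the default gc_grace_seconds is 864000 (10 days), so this
roughly triples the window in which every node must have been repaired.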

For now you should probably evaluate which of your column families you
*absolutely must* repair (because you do DELETE-like operations in them,
etc.) and only repair those.

As an aside, you "just lose" with vnodes and clusters of this size. I
presume you plan to grow beyond approximately 9 nodes per DC, in which case
you probably do want vnodes enabled.

One note:

>  Looking at nodetool compaction stats it indicates the Validation phase
> is running that the total bytes is 4.5T (4505336278756).


This is the uncompressed size; I'm betting your actual on-disk size is
closer to 2T? Even though 2.0 has improved performance for nodes with lots
of data, 2T per node is still relatively "fat" for a Cassandra node.
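
To put rough numbers on that, here is a small sketch that converts the
reported byte count into terabytes and shows the compression ratio implied
by my 2T guess. The on-disk figure is only that guess, not a measurement;
substitute the real size of the data directory on the node.

```python
# Sketch only: the byte count is the one quoted from compaction stats above.
REPORTED_UNCOMPRESSED_BYTES = 4505336278756

# Assumption: "closer to 2T" taken as 2 * 10**12 bytes; replace with the
# measured on-disk size for a real answer.
GUESSED_ON_DISK_BYTES = 2 * 10**12


def to_decimal_tb(num_bytes: int) -> float:
    """Bytes to decimal terabytes (10**12 bytes)."""
    return num_bytes / 10**12


def to_tib(num_bytes: int) -> float:
    """Bytes to binary tebibytes (2**40 bytes)."""
    return num_bytes / 2**40


def implied_compression_ratio(on_disk: int, uncompressed: int) -> float:
    """Fraction of the uncompressed size that remains on disk."""
    return on_disk / uncompressed


if __name__ == "__main__":
    print(f"{to_decimal_tb(REPORTED_UNCOMPRESSED_BYTES):.2f} TB uncompressed")
    print(f"{to_tib(REPORTED_UNCOMPRESSED_BYTES):.2f} TiB uncompressed")
    ratio = implied_compression_ratio(
        GUESSED_ON_DISK_BYTES, REPORTED_UNCOMPRESSED_BYTES
    )
    print(f"implied on-disk/uncompressed ratio: {ratio:.2f}")
```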


=Rob
[1] https://issues.apache.org/jira/browse/CASSANDRA-5850
