My cluster is very small (300 MB) and compact was taking more than 2 hours.

I ended up bouncing all the nodes.  After that, I was able to run repair
on all nodes, and each one took less than a minute.

If this happens again I will be sure to run compactionstats and netstats.
Thanks for that tip.
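
For next time, here is a rough sketch of how that check could be scripted
across nodes.  It only wraps the two commands Gregg mentioned (nodetool
compactionstats and nodetool netstats).  The function names and the host
addresses are made up for illustration, and it assumes nodetool is on the
PATH of the machine running the script.

```python
import subprocess


def nodetool_command(host, subcommand):
    """Build the argument list for `nodetool -h <host> <subcommand>`."""
    return ["nodetool", "-h", host, subcommand]


def repair_activity_commands(hosts):
    """Return the compactionstats and netstats commands for every host."""
    return [
        nodetool_command(host, subcommand)
        for host in hosts
        for subcommand in ("compactionstats", "netstats")
    ]


def show_repair_activity(hosts):
    """Run each command and print whatever nodetool reports.

    Assumption: nodetool is installed and on PATH; otherwise
    subprocess.run raises FileNotFoundError.
    """
    for cmd in repair_activity_commands(hosts):
        print("$ " + " ".join(cmd))
        result = subprocess.run(cmd, capture_output=True, text=True)
        print(result.stdout or result.stderr)


# Example call (placeholder addresses, not real nodes):
# show_repair_activity(["10.0.0.1", "10.0.0.2"])
```

If either command shows a validation compaction or an active stream on any
node, the repair is most likely still making progress.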

Bill

On Wed, Apr 25, 2012 at 11:49 AM, Gregg Ulrich <gulr...@netflix.com> wrote:

> How much data do you have and how long is "a while"?  In my experience
> repairs can take a very long time.  Check to see if validation compactions
> are running (nodetool compactionstats) or if files are streaming (nodetool
> netstats).  If either of those is in progress then your repair should be
> running.  I've seen 12 node, 50G clusters take days to repair to a new data
> center.
>
> Not sure if 1.0 is different but in 0.X I don't believe killing the
> nodetool process stops the repair.  When we need to stop a repair we have
> bounced all of the participating nodes.  I've been told that there is no
> harm in stopping repairs.
>
> On Apr 24, 2012, at 2:55 PM, Bill Au wrote:
>
> > I am running 1.0.8.  I am adding a new data center to an existing
> > cluster.  Following steps outlined in another thread on the mailing list,
> > things went fine except for the last step, which is to run repair on all
> > the nodes in the new data center.  Repair seems to be hanging indefinitely.
> > There is no activity in system.log.  I did notice that the node being
> > repaired is requesting ranges from nodes in both the existing and new data
> > centers.  Since there is no data in the new data center initially, I thought
> > that may be why repair is hanging.  So I broke out of the repair with a
> > control-C after waiting for a while.  I do see data being added to the new
> > nodes.  When I ran repair a second time, it was still hanging.
> >
> > Why is repair hanging?  Is it safe to use control-C to break out of it?
> > How do I recover from this?
> >
> > Bill
>
>
