nodetool repair saying starting and then nothing, and nothing in any of the server logs either

2014-07-01 Thread Brian Tarbox
I have a six node cluster in AWS (repl:3) and recently noticed that repair was hanging. I've run with the -pr switch. I see this output in the nodetool command line (and also in that node's system.log): Starting repair command #9, repairing 256 ranges for keyspace dev_a but then no other

Re: nodetool repair saying starting and then nothing, and nothing in any of the server logs either

2014-07-01 Thread Kevin Burton
If the boxes are idle, you could use jstack and look at the stack… perhaps it's locked somewhere. Worth a shot. On Tue, Jul 1, 2014 at 9:24 AM, Brian Tarbox tar...@cabotresearch.com wrote: I have a six node cluster in AWS (repl:3) and recently noticed that repair was hanging. I've run with

Re: nodetool repair saying starting and then nothing, and nothing in any of the server logs either

2014-07-01 Thread Robert Coli
On Tue, Jul 1, 2014 at 9:24 AM, Brian Tarbox tar...@cabotresearch.com wrote:
> I have a six node cluster in AWS (repl:3) and recently noticed that repair was hanging. I've run with the -pr switch.
It'll do that. What version of Cassandra?
=Rob

Re: nodetool repair saying starting and then nothing, and nothing in any of the server logs either

2014-07-01 Thread Brian Tarbox
We're running 1.2.13. Any chance that doing a rolling-restart would help? Would running without the -pr improve the odds? Thanks. On Tue, Jul 1, 2014 at 1:40 PM, Robert Coli rc...@eventbrite.com wrote: On Tue, Jul 1, 2014 at 9:24 AM, Brian Tarbox tar...@cabotresearch.com wrote: I have a

Re: nodetool repair saying starting and then nothing, and nothing in any of the server logs either

2014-07-01 Thread Brian Tarbox
Does this output from jstack indicate a problem?
ReadRepairStage:12170 daemon prio=10 tid=0x7f9dcc018800 nid=0x7361 waiting on condition [0x7f9db540c000]
   java.lang.Thread.State: TIMED_WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for
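
One rough way to read a dump like this is to tally threads by state and check whether the JVM reported a deadlock. The following is only a sketch in Python, assuming the jstack output has been saved as text. The helper names `summarize_thread_states` and `has_deadlock` are made up for illustration, and `SAMPLE` contains only the fragment quoted in this message.

```python
import re
from collections import Counter

# Matches the state line jstack prints under each thread header,
# e.g. "java.lang.Thread.State: TIMED_WAITING (parking)".
STATE_RE = re.compile(r"java\.lang\.Thread\.State:\s+([A-Z_]+)")


def summarize_thread_states(dump_text):
    """Count how many threads in a jstack dump are in each state.

    Hypothetical helper, not part of Cassandra or the JDK.
    """
    return Counter(STATE_RE.findall(dump_text))


def has_deadlock(dump_text):
    """Return True if the dump contains the JVM's deadlock report.

    jstack appends a "Found one Java-level deadlock:" section when the
    JVM detects threads that are waiting on each other's monitors.
    """
    return "Java-level deadlock" in dump_text


# Only the fragment quoted in this thread; a real dump lists every thread.
SAMPLE = """\
ReadRepairStage:12170 daemon prio=10 tid=0x7f9dcc018800 nid=0x7361 waiting on condition [0x7f9db540c000]
   java.lang.Thread.State: TIMED_WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
"""

if __name__ == "__main__":
    print(summarize_thread_states(SAMPLE))
    print("deadlock reported:", has_deadlock(SAMPLE))
```

A pool thread parked in TIMED_WAITING is usually just an idle worker waiting for its next task, so a single thread like the one above probably does not point to a hang by itself. BLOCKED threads or a deadlock report would be more interesting to look at.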

Re: nodetool repair saying starting and then nothing, and nothing in any of the server logs either

2014-07-01 Thread Robert Coli
On Tue, Jul 1, 2014 at 11:09 AM, Brian Tarbox tar...@cabotresearch.com wrote:
> We're running 1.2.13.
1.2.17 contains a few streaming fixes which might help.
> Any chance that doing a rolling-restart would help?
Probably not.
> Would running without the -pr improve the odds?
No, that'd

Re: nodetool repair saying starting and then nothing, and nothing in any of the server logs either

2014-07-01 Thread Brian Tarbox
Given that an upgrade is (for various internal reasons) not an option at this point...is there anything I can do to get repair working again? I'll also mention that I see this behavior from all nodes. Thanks. On Tue, Jul 1, 2014 at 2:51 PM, Robert Coli rc...@eventbrite.com wrote: On Tue, Jul

Re: nodetool repair saying starting and then nothing, and nothing in any of the server logs either

2014-07-01 Thread Robert Coli
On Tue, Jul 1, 2014 at 11:54 AM, Brian Tarbox tar...@cabotresearch.com wrote:
> Given that an upgrade is (for various internal reasons) not an option at this point...is there anything I can do to get repair working again? I'll also mention that I see this behavior from all nodes.
I think

Re: nodetool repair saying starting and then nothing, and nothing in any of the server logs either

2014-07-01 Thread Brian Tarbox
> For what purpose are you running repair?
Because I read that we should! :-) We do delete data from one column family quite regularly...from the other CFs occasionally. We almost never run with less than 100% of our nodes up. In this configuration do we *need* to run repair?
Thanks,
On Tue,