Re: getting status of long running repair

2012-05-09 Thread Bill Au
I am running 1.0.8. Two data centers with 8 machines in each DC. Nodes are all up while the repair is running. No dropped Mutations/Messages. I do see HintedHandoff messages. Bill On Tue, May 8, 2012 at 11:15 PM, Vijay vijay2...@gmail.com wrote: What is the version you are using? is it Multi

Re: getting status of long running repair

2012-05-09 Thread Vijay
Are you using Broadcast Address? If yes, then you might be affected by https://issues.apache.org/jira/browse/CASSANDRA-3503 Nodes are all up while repairing is running. I should have been clear: are you seeing the following messages in logs (UP/DOWN) during the period of the repair... INFO

Re: getting status of long running repair

2012-05-08 Thread aaron morton
When you look in the logs please let me know if you see this error… https://issues.apache.org/jira/browse/CASSANDRA-4223 I look at nodetool compactionstats (for the Merkle tree phase), nodetool netstats for the streaming, and this to check for streaming progress: while true; do date; diff
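The checks named above (tpstats for AntiEntropySessions, compactionstats for the Merkle tree phase, netstats for streaming) can be polled together. This is only a rough sketch, not the truncated `while true; do date; diff` command from the message, which is not reproduced here. The names `NODETOOL`, `CASS_HOST`, `repair_snapshot`, and `poll_repair` are made up for illustration, and it assumes `nodetool` is on the PATH and accepts `-h <host>`.

```shell
#!/bin/sh
# Hypothetical helper: NODETOOL and CASS_HOST are assumed names,
# override them to match your environment.
NODETOOL="${NODETOOL:-nodetool}"
CASS_HOST="${CASS_HOST:-localhost}"

# Print one timestamped snapshot of the repair-related nodetool output.
repair_snapshot() {
    date
    # Repair sessions show up as AntiEntropySessions in tpstats.
    "$NODETOOL" -h "$CASS_HOST" tpstats | grep -i 'AntiEntropy'
    # Merkle tree (validation) phase.
    "$NODETOOL" -h "$CASS_HOST" compactionstats
    # Streaming phase.
    "$NODETOOL" -h "$CASS_HOST" netstats
}

# Take <count> snapshots, sleeping <interval> seconds between them.
poll_repair() {
    count="$1"
    interval="$2"
    i=0
    while [ "$i" -lt "$count" ]; do
        repair_snapshot
        i=$((i + 1))
        if [ "$i" -lt "$count" ]; then
            sleep "$interval"
        fi
    done
}

# Example (not run here): poll_repair 60 30
```

If the AntiEntropySessions counts stay non-zero while neither compactionstats nor netstats shows any activity across several snapshots, that matches the "stuck" symptom described in this thread.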

Re: getting status of long running repair

2012-05-08 Thread Bill Au
There are no error messages in my log. I ended up restarting all the nodes in my cluster. After that I was able to run repair successfully on one of the nodes. It took about 40 minutes. Feeling lucky, I ran repair on another node and it is stuck again. tpstats shows 1 active and 1 pending

Re: getting status of long running repair

2012-05-07 Thread Bill Au
I restarted the nodes and then restarted the repair. It is still hanging like before. Do I keep repeating until the repair actually finishes? Bill On Fri, May 4, 2012 at 2:18 PM, Rob Coli rc...@palominodb.com wrote: On Fri, May 4, 2012 at 10:30 AM, Bill Au bill.w...@gmail.com wrote: I know

Re: getting status of long running repair

2012-05-07 Thread Ben Coverston
Check the log files for warnings or errors. They may indicate why your repair failed. On Mon, May 7, 2012 at 10:09 AM, Bill Au bill.w...@gmail.com wrote: I restarted the nodes and then restarted the repair. It is still hanging like before. Do I keep repeating until the repair actually

getting status of long running repair

2012-05-04 Thread Bill Au
I know repair may take a long time to run. I am running repair on a node with about 15 GB of data and it is taking more than 24 hours. Is that normal? Is there any way to get status of the repair? tpstats does show 2 active and 2 pending AntiEntropySessions. But netstats and compactionstats

Re: getting status of long running repair

2012-05-04 Thread Rob Coli
On Fri, May 4, 2012 at 10:30 AM, Bill Au bill.w...@gmail.com wrote: I know repair may take a long time to run.  I am running repair on a node with about 15 GB of data and it is taking more than 24 hours.  Is that normal?  Is there any way to get status of the repair?  tpstats does show 2