Re: Repair Process Taking too long

2012-05-22 Thread aaron morton
It repairs the ranges they have in common. Cheers - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com On 20/05/2012, at 4:05 PM, Raj N wrote: Can I infer from this that if I have 3 replicas, then running repair without -pr on 1 node will repair the

Re: Repair Process Taking too long

2012-05-19 Thread Raj N
Can I infer from this that if I have 3 replicas, then running repair without -pr on 1 node will repair the other 2 replicas as well? -Raj On Sat, Apr 14, 2012 at 2:54 AM, Zhu Han han...@nutstore.net wrote: On Sat, Apr 14, 2012 at 1:57 PM, Igor i...@4friends.od.ua wrote: Hi! What is the

Re: Repair Process Taking too long

2012-04-14 Thread Igor
Hi! What is the difference between 'repair' and '-pr repair'? Does a plain repair touch all token ranges (for all nodes), while -pr touches only the range for which the given node is responsible? On 04/12/2012 05:59 PM, Sylvain Lebresne wrote: On Thu, Apr 12, 2012 at 4:06 PM, Frank Ng buzzt...@gmail.com wrote:

Re: Repair Process Taking too long

2012-04-14 Thread Zhu Han
On Sat, Apr 14, 2012 at 1:57 PM, Igor i...@4friends.od.ua wrote: Hi! What is the difference between 'repair' and '-pr repair'? Does a plain repair touch all token ranges (for all nodes), while -pr touches only the range for which the given node is responsible? -pr only touches the primary range of the node. If
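
The distinction discussed above can be sketched in a few lines of Python. This is only an illustration, not Cassandra code: it assumes a simple ring where the range owned by node p is also replicated on the next RF-1 nodes clockwise (SimpleStrategy-like placement), and the function names `replica_ranges` and `ranges_repaired` are hypothetical.

```python
def replica_ranges(num_nodes, rf):
    """Map each node index to the primary ranges it holds a replica of.

    Assumption: the range owned by node p is replicated on nodes
    p, p+1, ..., p+rf-1 (mod num_nodes), as in a simple ring placement.
    """
    held = {node: [] for node in range(num_nodes)}
    for primary in range(num_nodes):
        for offset in range(rf):
            held[(primary + offset) % num_nodes].append(primary)
    return held


def ranges_repaired(node, num_nodes, rf, primary_only):
    """Ranges touched by a repair started on `node`.

    primary_only=True models 'repair -pr' (only the node's own range);
    primary_only=False models a plain repair (every range the node replicates).
    """
    if primary_only:
        return [node]
    return sorted(replica_ranges(num_nodes, rf)[node])


if __name__ == "__main__":
    nodes, rf = 12, 3  # figures taken from the thread
    print("plain repair on node 5:", ranges_repaired(5, nodes, rf, False))
    print("repair -pr on node 5:  ", ranges_repaired(5, nodes, rf, True))
    # Total range-repairs when the command is run once on every node.
    for flag in (False, True):
        total = sum(len(ranges_repaired(n, nodes, rf, flag)) for n in range(nodes))
        print("primary_only =", flag, "-> total ranges repaired:", total)
```

Under these assumptions, running a plain repair on every node repairs each range RF times, while running -pr on every node repairs each range once.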

Re: Repair Process Taking too long

2012-04-12 Thread Frank Ng
I also noticed that if I use the -pr option, the repair process went down from 30 hours to 9 hours. Is the -pr option safe to use if I want to run repair processes in parallel on nodes that are not replication peers? thanks On Thu, Apr 12, 2012 at 12:06 AM, Frank Ng berryt...@gmail.com wrote:

Re: Repair Process Taking too long

2012-04-12 Thread Sylvain Lebresne
On Thu, Apr 12, 2012 at 4:06 PM, Frank Ng buzzt...@gmail.com wrote: I also noticed that if I use the -pr option, the repair process went down from 30 hours to 9 hours.  Is the -pr option safe to use if I want to run repair processes in parallel on nodes that are not replication peers? There is

Re: Repair Process Taking too long

2012-04-12 Thread Frank Ng
Thanks for the clarification. I'm running repairs as in case 2 (to avoid deleted data coming back). On Thu, Apr 12, 2012 at 10:59 AM, Sylvain Lebresne sylv...@datastax.com wrote: On Thu, Apr 12, 2012 at 4:06 PM, Frank Ng buzzt...@gmail.com wrote: I also noticed that if I use the -pr option,

Re: Repair Process Taking too long

2012-04-11 Thread Frank Ng
Can you expand further on your issue? Were you using Random Partitioner? Thanks. On Tue, Apr 10, 2012 at 5:35 PM, David Leimbach leim...@gmail.com wrote: I had this happen when I had really poorly generated tokens for the ring. Cassandra seems to accept numbers that are too big. You get hot

Re: Repair Process Taking too long

2012-04-11 Thread aaron morton
If you have 1 TB of data it will take a long time to repair. Every bit of data has to be read and a hash generated. This is one of the reasons we often suggest that around 300 to 400 GB per node is a good load in the general case. Look at nodetool compactionstats. Is there a validation
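
As a rough back-of-the-envelope check on the sizing advice above, the sketch below converts the figures reported in this thread (about 1 TB per node, about 30 hours per repair) into an effective throughput and projects it onto smaller loads. It assumes repair time scales linearly with data size, which is a simplification; the function names are hypothetical.

```python
def effective_rate_mb_per_s(data_bytes, hours):
    """Average bytes processed per second, expressed in decimal MB/s."""
    return data_bytes / (hours * 3600) / 1e6


def projected_hours(data_bytes, rate_mb_per_s):
    """Hours needed to process data_bytes at a constant rate.

    Assumption: repair cost is linear in the amount of data on the node.
    """
    return data_bytes / (rate_mb_per_s * 1e6) / 3600


if __name__ == "__main__":
    TB = 1e12  # decimal terabyte
    GB = 1e9
    rate = effective_rate_mb_per_s(1 * TB, 30)  # figures from the thread
    print(f"effective rate: {rate:.2f} MB/s")
    for load_gb in (300, 400):
        print(f"{load_gb} GB at that rate: {projected_hours(load_gb * GB, rate):.1f} h")
```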

Re: Repair Process Taking too long

2012-04-11 Thread Frank Ng
Thank you for confirming that the per node data size is most likely causing the long repair process. I have tried a repair on smaller column families and it was significantly faster. On Wed, Apr 11, 2012 at 9:55 PM, aaron morton aa...@thelastpickle.com wrote: If you have 1TB of data it will

Repair Process Taking too long

2012-04-10 Thread Frank Ng
Hello, I am on Cassandra 1.0.7. My repair processes are taking over 30 hours to complete. Is it normal for the repair process to take this long? I wonder if it's because I am using the ext3 file system. thanks

Re: Repair Process Taking too long

2012-04-10 Thread Igor
Hi. You can check with nodetool which part of the repair process is slow: network streams or validation compactions. Use nodetool netstats or nodetool compactionstats. On 04/10/2012 05:16 PM, Frank Ng wrote: Hello, I am on Cassandra 1.0.7. My repair processes are taking over 30 hours to complete. Is it

Re: Repair Process Taking too long

2012-04-10 Thread Frank Ng
I think both processes are taking a while. When it starts up, netstats and compactionstats show nothing. Is anyone out there successfully using ext3 with repair processes that are faster than this? On Tue, Apr 10, 2012 at 10:42 AM, Igor i...@4friends.od.ua wrote: Hi You can check with nodetool

Re: Repair Process Taking too long

2012-04-10 Thread Igor
On 04/10/2012 07:16 PM, Frank Ng wrote: Short answer - yes. But you are asking the wrong question. I think both processes are taking a while. When it starts up, netstats and compactionstats show nothing. Is anyone out there successfully using ext3 with repair processes that are faster than this?

Re: Repair Process Taking too long

2012-04-10 Thread Jonathan Rhone
Data size, number of nodes, RF? Are you using size-tiered compaction on any of the column families that hold a lot of your data? Do your cassandra logs say you are streaming a lot of ranges? zgrep -E '(Performing streaming repair|out of sync)' On Tue, Apr 10, 2012 at 9:45 AM, Igor
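
The log check suggested above can also be done with a small Python script, sketched here as a rough equivalent of the zgrep pattern. The log file names are not given in the thread, so any paths passed in are assumptions, and `count_repair_lines` is a hypothetical helper name.

```python
import gzip
import re
from pathlib import Path

# Same alternation as the zgrep pattern quoted in the thread.
PATTERN = re.compile(r"Performing streaming repair|out of sync")


def count_repair_lines(paths):
    """Count lines matching PATTERN in each file (plain text or .gz)."""
    counts = {}
    for raw in paths:
        path = Path(raw)
        # Pick the gzip reader for compressed, rotated logs.
        opener = gzip.open if path.suffix == ".gz" else open
        with opener(path, "rt", encoding="utf-8", errors="replace") as handle:
            counts[str(raw)] = sum(1 for line in handle if PATTERN.search(line))
    return counts


if __name__ == "__main__":
    import sys

    # Usage: python count_repair_lines.py <log files...>
    for name, hits in count_repair_lines(sys.argv[1:]).items():
        print(f"{hits:8d}  {name}")
```

A large count of matching lines would suggest that many ranges are being streamed during repair.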

Re: Repair Process Taking too long

2012-04-10 Thread Igor
also - JVM heap size, and anything related to memory pressure On 04/10/2012 07:56 PM, Jonathan Rhone wrote: Data size, number of nodes, RF? Are you using size-tiered compaction on any of the column families that hold a lot of your data? Do your cassandra logs say you are streaming a lot of

Re: Repair Process Taking too long

2012-04-10 Thread Frank Ng
I have 12 nodes with approximately 1TB load per node. The RF is 3. I am considering moving to ext4. I checked the ranges and the numbers go from 1 to the 9000s. On Tue, Apr 10, 2012 at 12:56 PM, Jonathan Rhone rh...@tinyco.com wrote: Data size, number of nodes, RF? Are you using

Re: Repair Process Taking too long

2012-04-10 Thread Frank Ng
I am not using size-tiered compaction. On Tue, Apr 10, 2012 at 12:56 PM, Jonathan Rhone rh...@tinyco.com wrote: Data size, number of nodes, RF? Are you using size-tiered compaction on any of the column families that hold a lot of your data? Do your cassandra logs say you are streaming a

Re: Repair Process Taking too long

2012-04-10 Thread David Leimbach
I had this happen when I had really poorly generated tokens for the ring. Cassandra seems to accept numbers that are too big. You get hot spots when you think you should be balanced and repair never ends (I think there is a 48 hour timeout). On Tuesday, April 10, 2012, Frank Ng wrote: I am
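
For the token problem described above, a minimal sketch of generating evenly spaced tokens may help. It assumes RandomPartitioner with a token space of 0 up to (but not including) 2**127; that bound and the helper names `balanced_tokens` and `is_valid_token` are assumptions for illustration, not taken from the thread.

```python
RING_SIZE = 2 ** 127  # assumed RandomPartitioner token space


def balanced_tokens(num_nodes):
    """Evenly spaced initial tokens, one per node."""
    if num_nodes <= 0:
        raise ValueError("num_nodes must be positive")
    return [i * RING_SIZE // num_nodes for i in range(num_nodes)]


def is_valid_token(token):
    """True if the token lies inside the assumed ring bounds."""
    return 0 <= token < RING_SIZE


if __name__ == "__main__":
    # 12 nodes, as in the cluster described in this thread.
    for index, token in enumerate(balanced_tokens(12)):
        print(f"node {index:2d}: {token}")
```

Comparing the tokens reported by nodetool ring against such a list is one way to spot a token that is too large or a ring that is unevenly divided.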