Re: what's the difference between repair CF separately and repair the entire node?

2011-09-14 Thread Sylvain Lebresne
On Wed, Sep 14, 2011 at 2:38 AM, Yan Chunlu springri...@gmail.com wrote:
 I don't want to repair one CF at a time either.
 The node repair took a week and was still running. compactionstats and
 netstream showed nothing running on any node, and there was no error
 message and no exception, so I really have no idea what it was doing.

To add to the list of things repair does wrong in 0.7, we'll have to add that
if one of the nodes participating in the repair (so any node that shares a range
with the node on which repair was started) goes down (even for a short time),
then the repair will simply hang forever, doing nothing. And no specific
error message will be logged. That could be what happened. Again, recent
releases of 0.8 fix that too.

--
Sylvain
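
Since a hung repair logs nothing, one way to at least notice it is to put a deadline on the client call. The following is a minimal sketch, not something from this thread: the helper name, the host name, and the 12-hour deadline are all assumptions. Note that it only stops the client from waiting; it does not by itself clean up anything on the server side.

```python
import subprocess

def run_with_deadline(cmd, timeout_s):
    """Run cmd and wait at most timeout_s seconds.

    Returns (finished, returncode). finished is False when the deadline
    passed, in which case the child process has been killed.
    """
    try:
        proc = subprocess.run(cmd, timeout=timeout_s)
        return True, proc.returncode
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising TimeoutExpired.
        return False, None

# Hypothetical usage (host name and deadline are assumptions):
# finished, rc = run_with_deadline(
#     ["nodetool", "-h", "node1", "repair", "mykeyspace"], 12 * 3600)
# if not finished:
#     print("repair did not return in time; check whether a neighbour went down")
```

A deadline well above the normal repair time keeps false alarms down; the right value depends on data size and cluster load.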




Re: what's the difference between repair CF separately and repair the entire node?

2011-09-14 Thread Yan Chunlu
Is 0.8 ready for production use? As far as I know, many companies,
including reddit.com, currently use 0.7. How do they deal with the repair
problem?




Re: what's the difference between repair CF separately and repair the entire node?

2011-09-14 Thread Sylvain Lebresne
On Wed, Sep 14, 2011 at 9:27 AM, Yan Chunlu springri...@gmail.com wrote:
 is 0.8 ready for production use?

some related discussion here:
http://www.mail-archive.com/user@cassandra.apache.org/msg17055.html
but my personal answer is yes.

  As far as I know, many companies, including reddit.com, currently use 0.7. How
 do they deal with the repair problem?

Repair problems in 0.7 don't hit everyone equally. For some people, it works
relatively well, even if not in the most efficient way. Also, for some workloads
(if you don't do many deletes, for instance), you can set a big gc_grace_seconds
value (say a month) and only run repair that often, which can make repair
inefficiencies more bearable.
That being said, I can't speak for many companies, but I do advise evaluating
an upgrade to 0.8.

--
Sylvain
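
The gc_grace_seconds trade-off above can be written down as a small check. This is a sketch under stated assumptions: the function name is made up, and adding the expected repair duration to the interval is a conservative rule of my own choosing, not something stated in this thread. The underlying constraint is that each node should be repaired at least once per gc_grace_seconds window, so that a replica which missed a delete learns about it before the tombstone is purged.

```python
ONE_DAY = 24 * 3600

def repair_schedule_ok(gc_grace_seconds, repair_interval_seconds,
                       expected_repair_duration_seconds=0):
    """True if a repair started every repair_interval_seconds is expected
    to finish within each gc_grace_seconds window.

    expected_repair_duration_seconds is a safety margin (an assumption of
    this sketch) for repairs that take a long time to complete.
    """
    return (repair_interval_seconds + expected_repair_duration_seconds
            <= gc_grace_seconds)

gc_grace = 30 * ONE_DAY  # "say a month", as suggested above

print(repair_schedule_ok(gc_grace, 21 * ONE_DAY, 7 * ONE_DAY))  # True
print(repair_schedule_ok(gc_grace, 28 * ONE_DAY, 7 * ONE_DAY))  # False
```

With a repair that can take a week, as reported earlier in the thread, a monthly gc_grace_seconds leaves room for roughly one repair every three weeks under this rule.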






Re: what's the difference between repair CF separately and repair the entire node?

2011-09-14 Thread Sasha Dolgy
It was mentioned in another thread that Twitter uses 0.8 in
production; for me that was a fairly strong testimonial.
On Sep 14, 2011 9:28 AM, Yan Chunlu springri...@gmail.com wrote:
 Is 0.8 ready for production use? As far as I know, many companies,
 including reddit.com, currently use 0.7. How do they deal with the repair
 problem?




Re: what's the difference between repair CF separately and repair the entire node?

2011-09-14 Thread Yan Chunlu
Thanks a lot for the help!

I have read the post and think 0.8 might be good enough for me, especially
0.8.5.

Also, changing gc_grace_seconds is an acceptable solution.






Re: what's the difference between repair CF separately and repair the entire node?

2011-09-14 Thread Anand Somani
On Tue, Sep 13, 2011 at 3:57 PM, Peter Schuller peter.schul...@infidyne.com
 wrote:

  I think it is a serious problem since I cannot repair.  I am
  using Cassandra on production servers. Is there some way to fix it
  without upgrading?  I have heard that 0.8.x is still not quite ready for
  production environments.

 It is a serious issue if you really need to repair one CF at a time.

Why is it serious to repair one CF at a time? If I cannot do it at the
CF level, does that mean I cannot use more than 50% of disk space? Is
this specific to this problem, or is it a general statement? I ask because
I am planning on doing this so I can limit the maximum disk overhead to one CF's
worth (plus some factor). I am going to be testing this in the next couple of
weeks or so.




Re: what's the difference between repair CF separately and repair the entire node?

2011-09-14 Thread Peter Schuller
 It is a serious issue if you really need to repair one CF at a time.

 Why is it serious to repair one CF at a time? If I cannot do it at the
 CF level, does that mean I cannot use more than 50% of disk space? Is
 this specific to this problem, or is it a general statement? I ask because
 I am planning on doing this so I can limit the maximum disk overhead to one CF's
 worth (plus some factor). I am going to be testing this in the next couple of
 weeks or so.

The bug in 0.7 causes data to be streamed for all CFs when doing a
repair on one. So, if you specifically need to repair one CF at
a time, such as because you're trying to repair a small CF quite often
while leaving a huge CF with less frequent repairs, you have an issue.

If you're just wanting to repair the entire keyspace, it doesn't affect you.

I'm not sure how this relates to the 50% disk space bit though.

-- 
/ Peter Schuller (@scode on twitter)


Re: what's the difference between repair CF separately and repair the entire node?

2011-09-13 Thread Peter Schuller
 It's okay but won't do what you want; due to a bug you'll see
 streaming of data for other column families than the one you're trying
 to repair. This will be fixed in 1.0.

 I think we might be running into this. Is CASSANDRA-2280 the issue
 you're referring to?

Yes. Sorry for not providing the reference.

-- 
/ Peter Schuller (@scode on twitter)


Re: what's the difference between repair CF separately and repair the entire node?

2011-09-13 Thread Peter Schuller
 I think it is a serious problem since I cannot repair.  I am
 using Cassandra on production servers. Is there some way to fix it
 without upgrading?  I have heard that 0.8.x is still not quite ready for
 production environments.

It is a serious issue if you really need to repair one CF at a time.
However, looking at your original post it seems this is not
necessarily your issue. Do you need to, or was your concern rather the
overall time repair took?

There are other things that are improved in 0.8 that affect 0.7. In
particular, (1) in 0.7 compaction, including validating compactions
that are part of repair, is non-concurrent, so if your repair starts
while there is a long-running compaction going it will have to wait,
and (2) semi-related is that the merkle tree calculation that is part
of repair/anti-entropy may happen out of sync if one of the nodes
participating happens to be busy with compaction. This in turn causes
additional data to be sent as part of repair.

That might be why your immediately following repair took a long time,
but it's difficult to tell.

If you're having issues with repair and large data sets, I would
generally say that upgrading to 0.8 is recommended. However, if you're
on 0.7.4, beware of
https://issues.apache.org/jira/browse/CASSANDRA-3166

-- 
/ Peter Schuller (@scode on twitter)
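
To illustrate point (2) above, here is a toy sketch of why trees computed at different times cause extra data to be sent. It is not Cassandra's implementation: it only hashes rows into a fixed number of leaf ranges and compares the leaves, and the function names, the bucket assignment, and the sample rows are all assumptions made for illustration.

```python
import hashlib

def leaf_hashes(rows, num_ranges):
    """Hash one replica's rows into num_ranges buckets (the tree's leaves)."""
    digests = [hashlib.sha256() for _ in range(num_ranges)]
    for key in sorted(rows):
        # Assign each key to a bucket with a stable hash of the key.
        bucket = int(hashlib.md5(key.encode()).hexdigest(), 16) % num_ranges
        digests[bucket].update(f"{key}={rows[key]};".encode())
    return [d.hexdigest() for d in digests]

def mismatched_ranges(leaves_a, leaves_b):
    """Indices of the ranges whose hashes differ and would be streamed."""
    return [i for i, (a, b) in enumerate(zip(leaves_a, leaves_b)) if a != b]

# Both replicas hold the same data at time t0.
data_t0 = {"user1": "a", "user2": "b", "user3": "c"}
# A write arrives on both replicas shortly afterwards.
data_t1 = dict(data_t0, user4="d")

# Trees computed at the same moment: nothing to stream.
print(mismatched_ranges(leaf_hashes(data_t0, 8), leaf_hashes(data_t0, 8)))

# One replica was busy and computed its tree later: a range now differs
# even though the replicas themselves are consistent.
print(len(mismatched_ranges(leaf_hashes(data_t0, 8), leaf_hashes(data_t1, 8))))
```

The second comparison reports a differing range purely because of timing, which is the kind of unnecessary streaming described above.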


Re: what's the difference between repair CF separately and repair the entire node?

2011-09-13 Thread Yan Chunlu
I don't want to repair one CF at a time either.

The node repair took a week and was still running. compactionstats and
netstream showed nothing running on any node, and there was no error
message and no exception, so I really have no idea what it was doing. I stopped
it yesterday. Maybe I should run repair again with compaction disabled on all
nodes?

thanks!





Re: what's the difference between repair CF separately and repair the entire node?

2011-09-12 Thread Peter Schuller
 I am using 0.7.4. So is it always okay to do the routine repair on
 a per-Column-Family basis? Thanks!

It's okay but won't do what you want; due to a bug you'll see
streaming of data for other column families than the one you're trying
to repair. This will be fixed in 1.0.

-- 
/ Peter Schuller (@scode on twitter)


Re: what's the difference between repair CF separately and repair the entire node?

2011-09-12 Thread Jim Ancona
On Mon, Sep 12, 2011 at 1:44 PM, Peter Schuller
peter.schul...@infidyne.com wrote:
 I am using 0.7.4. So is it always okay to do the routine repair on
 a per-Column-Family basis? Thanks!

 It's okay but won't do what you want; due to a bug you'll see
 streaming of data for other column families than the one you're trying
 to repair. This will be fixed in 1.0.

I think we might be running into this. Is CASSANDRA-2280 the issue
you're referring to?

Jim


Re: what's the difference between repair CF separately and repair the entire node?

2011-09-12 Thread Yan Chunlu
I think it is a serious problem since I cannot repair.  I am
using Cassandra on production servers. Is there some way to fix it
without upgrading?  I have heard that 0.8.x is still not quite ready for
production environments.

thanks!




Re: what's the difference between repair CF separately and repair the entire node?

2011-09-09 Thread Sylvain Lebresne
On Fri, Sep 9, 2011 at 4:18 AM, Yan Chunlu springri...@gmail.com wrote:
 I have 3 nodes and RF=3.  I tried to repair every node in the cluster by
 using nodetool repair mykeyspace mycf on every column family.  It finished
 within 3 hours; the data size is no more than 50GB.
 After the repair, I immediately tried using nodetool repair to repair
 the entire node, but 48 hours have passed and it is still going on. compactionstats
 shows it is doing an SSTable rebuild.
 So I am frustrated: why is nodetool repair so slow?   How does it
 differ from repairing every CF?

What version of Cassandra are you using? If you are using something < 0.8.2,
then it may be because nodetool repair used to schedule its sub-tasks poorly,
in ways that were counter-productive (fixed by CASSANDRA-2816).

If you are using a more recent version, then it's an interesting report.

 I didn't try to repair the system keyspace; does it also need to be repaired?

It doesn't.

--
Sylvain


what's the difference between repair CF separately and repair the entire node?

2011-09-08 Thread Yan Chunlu
I have 3 nodes and RF=3.  I tried to repair every node in the cluster by
using nodetool repair mykeyspace mycf on every column family.  It finished
within 3 hours; the data size is no more than 50GB.
After the repair, I immediately tried using nodetool repair to repair
the entire node, but 48 hours have passed and it is still going on. compactionstats
shows it is doing an SSTable rebuild.

So I am frustrated: why is nodetool repair so slow?   How does it
differ from repairing every CF?

I didn't try to repair the system keyspace; does it also need to be repaired?
 Thanks!
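
For reference, the two approaches compared in this message can be sketched as the command lines they correspond to. This is only an illustration: the helper names, host names, and column family names are assumptions, and the commands are built as argument lists rather than executed.

```python
def per_cf_repair_commands(host, keyspace, column_families):
    """One 'nodetool repair <keyspace> <cf>' invocation per column family."""
    return [["nodetool", "-h", host, "repair", keyspace, cf]
            for cf in column_families]

def whole_node_repair_command(host):
    """A single 'nodetool repair' with no keyspace or column family."""
    return ["nodetool", "-h", host, "repair"]

# Hypothetical cluster layout: three nodes, two column families.
hosts = ["node1", "node2", "node3"]
cfs = ["mycf", "othercf"]

for host in hosts:
    for cmd in per_cf_repair_commands(host, "mykeyspace", cfs):
        print(" ".join(cmd))
    print(" ".join(whole_node_repair_command(host)))
```

Each list can be passed to a process runner as is, which avoids shell quoting issues with keyspace or column family names.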