Re: (unofficial) Community Poll for Production Operators : Repair

2013-08-05 Thread Robert Coli
On Fri, May 10, 2013 at 11:24 AM, Robert Coli rc...@eventbrite.com wrote:

 I have been wondering how Repair is actually used by operators. If
 people operating Cassandra in production could answer the following
 questions, I would greatly appreciate it.


https://issues.apache.org/jira/browse/CASSANDRA-5850

Filed based in part on feedback from this thread.

Thanks to all participants! :D

=Rob


Re: (unofficial) Community Poll for Production Operators : Repair

2013-05-16 Thread Alain RODRIGUEZ
@Rob: Thanks for the feedback.

Yet I still have some unexplained behavior around repair. Are counters
supposed to be repaired too? I mean, when reading at CL.ONE I can get
different values depending on which node answers, even after a read
repair or a full repair. Shouldn't a repair fix these discrepancies?

The only way I have found to always get the same count is to read at
CL.QUORUM, but that is a workaround, since the data itself remains wrong on
some nodes.

Any clue on this?

Alain



Re: (unofficial) Community Poll for Production Operators : Repair

2013-05-16 Thread Janne Jalkanen

Might you be experiencing this? 
https://issues.apache.org/jira/browse/CASSANDRA-4417

/Janne




Re: (unofficial) Community Poll for Production Operators : Repair

2013-05-16 Thread Alain RODRIGUEZ
I have indeed had some of those in the past. But my point is not so much to
understand how I can get different counts depending on the node (I consider
this a weakness of counters and I am aware of it); what puzzles me is why
those inconsistent, diverging counters never converge, even after a repair.
Your last comment on that JIRA summarizes our problem quite well.

I hope the committers will figure something out.


2013/5/16 Janne Jalkanen janne.jalka...@ecyrd.com


 Might you be experiencing this?
 https://issues.apache.org/jira/browse/CASSANDRA-4417

 /Janne





Re: (unofficial) Community Poll for Production Operators : Repair

2013-05-15 Thread Alain RODRIGUEZ
Rob, I was wondering something. Are you a committer working on improving
repair, or something similar?

Anyway, it would be great if a committer (or any other expert) could give us
some feedback on our comments (whether we are doing things well or not,
whether the things we observe are normal or unexplained, what is going to be
improved about repair in the future...).

I am always interested in hearing about how things work and whether I am
doing things well or not.

Alain







Re: (unofficial) Community Poll for Production Operators : Repair

2013-05-15 Thread horschi
Hi Alain,

have you had a look at the following tickets?

CASSANDRA-4905 - Repair should exclude gcable tombstones from merkle-tree
computation
CASSANDRA-4932 - Agree on a gcbefore/expirebefore value for all replica
during validation compaction
CASSANDRA-4917 - Optimize tombstone creation for ExpiringColumns
CASSANDRA-5398 - Remove localTimestamp from merkle-tree calculation (for
tombstones)

IMHO these should reduce over-repair to some degree, especially when using
TTLs. Some of them are already fixed in 1.2; the rest will (hopefully)
follow :-)

cheers,
Christian






Re: (unofficial) Community Poll for Production Operators : Repair

2013-05-15 Thread André Cruz
On May 10, 2013, at 7:24 PM, Robert Coli rc...@eventbrite.com wrote:

 1) What version of Cassandra do you run, on what hardware?

1.1.5 - 6 nodes, 32GB RAM, 300GB data per node, 900GB 10k RAID1, Intel(R) 
Xeon(R) CPU E5-2609 0 @ 2.40GHz.

 2) What consistency level do you write at? Do you do DELETEs?

QUORUM. Yes, we do deletes.

 3) Do you run a regularly scheduled repair?

Yes.

 4) If you answered yes to 3, what is the frequency of the repair?

Every 2 days.

 5) What has been your subjective experience with the performance of
 repair? (Does it work as you would expect? Does its overhead have a
 significant impact on the performance of your cluster?)

It works as we expect and it has some impact on performance, but it takes a long 
time. We used to run daily repairs, but they started overlapping. The reason 
for the frequent repairs is that we do a lot of deletes, so we lowered 
gc_grace_seconds; otherwise the dataset would grow too large.

André



Re: (unofficial) Community Poll for Production Operators : Repair

2013-05-15 Thread Edward Capriolo
I have actually tested repair in many interesting scenarios. Once I joined a
node and forgot autobootstrap=true, so the data looked like this in the ring:

left node: 8GB
new node: 0GB
right node: 8GB

After repair:

left node: 10GB
new node: 13GB
right node: 12GB

We do not run repair at all. It is better than it was in the 0.6 and 0.7 days,
but a missed delete does not mean much to us. The difference between an 8GB
sstable load and a 12GB one could be a major performance difference for us,
since we do thousands of reads/sec.







Re: (unofficial) Community Poll for Production Operators : Repair

2013-05-15 Thread Robert Coli
On Wed, May 15, 2013 at 1:27 AM, Alain RODRIGUEZ arodr...@gmail.com wrote:
 Rob, I was wondering something. Are you a committer working on improving repair, or something similar?

I am not a committer [1], but I have an active interest in potential
improvements to the best practices for repair. The specific change
that I am considering is a modification to the default
gc_grace_seconds value, which seems picked out of a hat at 10 days. My
view is that the current implementation of repair has such negative
performance consequences that I do not believe that holding onto
tombstones for longer than 10 days could possibly be as bad as the
fixed cost of running repair once every 10 days. I believe that this
value is too low for a default (it also does not map cleanly to the
work week!) and likely should be increased to 14, 21 or 28 days.
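
For concreteness: 14, 21 and 28 days are 1209600, 1814400 and 2419200 seconds
respectively, and operators who agree can already raise the value per column
family today. A minimal sketch, assuming a CQL3 table and illustrative
keyspace/table names:

    # Raise gc_grace_seconds from the 10-day default (864000 s) to 28 days (2419200 s).
    # my_ks and my_cf are placeholder names.
    echo "ALTER TABLE my_ks.my_cf WITH gc_grace_seconds = 2419200;" | cqlsh localhost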

 Anyway, it would be great if a committer (or any other expert) could give us some feedback on our comments (whether we are doing things well or not, whether the things we observe are normal or unexplained, what is going to be improved about repair in the future...)

1) you are doing things according to best practice
2) unfortunately, your experience with significantly degraded
performance, including a blocked go-live due to repair bloat, is pretty
typical
3) the things you are experiencing are part of the current
implementation of repair and are also typical; however, I do not
believe they are fully explained [2]
4) as has been mentioned further down the thread, there are discussions
regarding (and some already-committed) improvements to both the
current repair paradigm and an evolution to a new paradigm

Thanks to all for the responses so far, please keep them coming! :D

=Rob
[1] hence the (unofficial) tag for this thread. I do have minor
patches accepted into the codebase, but always merged by an actual
committer. :)
[2] driftx in #cassandra feels that these things are explained/understood
by the core team, and points to
https://issues.apache.org/jira/browse/CASSANDRA-5280 as a useful
approach to minimizing them.


Re: (unofficial) Community Poll for Production Operators : Repair

2013-05-15 Thread Edward Capriolo
http://basho.com/introducing-riak-1-3/

Introduced Active Anti-Entropy. Riak now has active anti-entropy. In
distributed systems, inconsistencies can arise between replicas due to
failure modes, concurrent updates, and physical data loss or corruption.
Pre-1.3 Riak already had several features for repairing this “entropy”, but
they all required some form of user intervention. Riak 1.3 introduces
automatic, self-healing properties that repair entropy on an ongoing basis.





Re: (unofficial) Community Poll for Production Operators : Repair

2013-05-14 Thread Alain RODRIGUEZ
Hi Rob,

1) 1.2.2 on 6 to 12 EC2 m1.xlarge
2) Quorum R/W. Almost no deletes (just some TTLs)
3) Yes
4) On each node once a week (rolling repairs using crontab)
5) The only behavior that is quite odd or unexplained to me is why a repair
doesn't fix a counter mismatch between 2 nodes. I mean, when I read my
counters at CL.ONE I see inconsistent values (the counter value may change
any time I read it, depending, I guess, on which node I read from); a quick
way to observe this is sketched below. Reading at CL.QUORUM works around
this, but the data is still wrong on some nodes. About performance, it's
quite expensive to run a repair, but doing it in a low-traffic period and in
a rolling fashion works quite well and has no impact on the service.
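
A minimal way to spot-check that divergence, assuming illustrative node names
and table names (cqlsh here reads at its default consistency level, ONE):

    # Read the same counter through each node in turn; differing outputs
    # suggest the replicas disagree. node1..node3 are placeholder hostnames.
    for h in node1 node2 node3; do
      echo "SELECT value FROM my_ks.my_counters WHERE key = 'x';" | cqlsh "$h"
    done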

Hope this will help somehow. Let me know if you need more information.

Alain



2013/5/10 Robert Coli rc...@eventbrite.com

 Hi!

 I have been wondering how Repair is actually used by operators. If
 people operating Cassandra in production could answer the following
 questions, I would greatly appreciate it.

 1) What version of Cassandra do you run, on what hardware?
 2) What consistency level do you write at? Do you do DELETEs?
 3) Do you run a regularly scheduled repair?
 4) If you answered yes to 3, what is the frequency of the repair?
 5) What has been your subjective experience with the performance of
 repair? (Does it work as you would expect? Does its overhead have a
 significant impact on the performance of your cluster?)

 Thanks!

 =Rob



Re: (unofficial) Community Poll for Production Operators : Repair

2013-05-14 Thread Hiller, Dean
We had to roll out a fix in Cassandra because a slow node was slowing down our 
Cassandra clients in 1.2.2 for some reason.  Every time we had a slow node, 
we found out fast as performance degraded.  We tested this in QA and had the 
same issue.  This means a repair made that node slow, which made our clients 
slow.  With this fix, which I think someone on our team is going to try to get 
back into Cassandra, a slow node does not affect our clients anymore.

I am curious though: if someone else were to use the tc program to simulate 
Linux packet delay on a single node, does your clients' response time get much 
slower?  We simulated a 500ms delay on one node to stand in for the slow node… 
it seems the coordinator node was incorrectly waiting for BOTH responses at 
CL_QUORUM instead of just one (as it was itself one of them), or something like 
that.  (I don't know too much, as my colleague was the one who debugged this 
issue.)
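
For reference, injecting the delay looks roughly like the sketch below; the
interface name and the exact tc invocation are assumptions on my part rather
than the exact commands we ran:

    # Add ~500ms of latency to all traffic leaving this node (eth0 is a placeholder).
    tc qdisc add dev eth0 root netem delay 500ms
    # ...drive the client workload and watch response times...
    # Remove the artificial delay afterwards.
    tc qdisc del dev eth0 root netem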

Dean




RE: (unofficial) Community Poll for Production Operators : Repair

2013-05-14 Thread Viktor Jevdokimov
 1) What version of Cassandra do you run, on what hardware?
1.0.12 (upgrade to 1.2.x is planned)
Blade servers with
  1x6 CPU cores with HT (12 vcores) (upgradable to 2x CPUs)
  96GB RAM (upgrade is planned to 128GB, 256GB max)
  1x300GB 15k Data and 1x300GB 10k CommitLog/System SAS HDDs

 2) What consistency level do you write at? Do you do DELETEs?
Write/delete failover policy (where needed): try QUORUM, then ONE, finally ANY.

 3) Do you run a regularly scheduled repair?
NO, read repair is enough (where needed).

 4) If you answered yes to 3, what is the frequency of the repair?
If we ever do it, we'll do it once a day.

 5) What has been your subjective experience with the performance of
 repair? (Does it work as you would expect? Does its overhead have a
 significant impact on the performance of your cluster?)
For our use case it has too significant an impact on the performance of the 
cluster without providing real value.




Best regards / Pagarbiai

Viktor Jevdokimov
Senior Developer

Email: viktor.jevdoki...@adform.com





Re: (unofficial) Community Poll for Production Operators : Repair

2013-05-14 Thread Wei Zhu
1) 1.1.6 on 5 nodes, 24 CPUs, 72GB RAM
2) Local quorum (we only have one DC though). We do deletes through TTLs.
3) Yes.
4) Once a week, rolling repairs with -pr via a cron job (see the crontab sketch below).
5) It definitely has a negative impact on performance. Our data size is
around 100G per node, and during repair it brings in an additional 60G - 80G
of data and creates about 7K compactions (we use LCS with an SSTable size of
10M, which was a mistake we made at the beginning). It takes more than a day
for the compaction tasks to clear, and by then the next compaction starts. We
had to set a client-side (Hector) timeout to deal with it, and the SLA is
still under control for now.
But we had to halt the go-live for another cluster due to the unanticipated
doubling of space during the repair.
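
The crontab entry behind 4) is roughly of this shape; the keyspace name and
schedule are illustrative, and each node gets a different weekly slot so the
repairs roll around the ring:

    # m h dom mon dow  command -- primary-range repair, Sundays at 03:00 on this node
    0 3 * * 0  nodetool -h localhost repair -pr my_keyspace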

Regarding Dean's question about simulating the slow response: someone on IRC 
mentioned a trick of starting Cassandra with -f and then hitting ctrl-z, and 
it works for our tests.
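
For a scripted test, the same pause can be done without a foreground terminal
by suspending and resuming the JVM directly; the pid lookup below is just one
illustrative way to find the process:

    # Suspend the Cassandra JVM so the node looks slow/unresponsive, then resume it.
    CASS_PID=$(pgrep -f CassandraDaemon | head -n 1)
    kill -STOP "$CASS_PID"   # node stops responding (similar to ctrl-z)
    kill -CONT "$CASS_PID"   # resume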

-Wei 