[jira] [Updated] (CASSANDRA-3070) counter repair

2011-12-09 Thread Peter Schuller (Updated) (JIRA)

 [ https://issues.apache.org/jira/browse/CASSANDRA-3070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Peter Schuller updated CASSANDRA-3070:
--

Comment: was deleted

(was: This may be relevant, quoting myself from IRC:

{code}
21:20:01  scode pcmanus: Hey, are you there?
21:20:21  scode pcmanus: I am investigating something which might be https://issues.apache.org/jira/browse/CASSANDRA-3070
21:20:37  scode pcmanus: And I could use the help of someone with his brain all over counters, and Stu isn't here atm. :)
21:21:16  scode pcmanus: https://gist.github.com/8202cb46c8bd00c8391b
21:21:37  scode pcmanus: I am investigating why with CL.ALL and CL.QUORUM, I get seemingly random/varying results when I read a counter.
21:21:53  scode pcmanus: I have the offending sstables on a three-node test setup and am inserting debug printouts in the code to trace the reconciliation.
21:21:57  scode pcmanus: The gist above shows what's happening.
21:22:11  scode pcmanus: The latter is the wrong one, and the former is the correct one.
21:22:28  scode pcmanus: The interesting bit is that I see shards with the same node_id *AND* clock, but *DIFFERENT* counts.
21:22:53  scode pcmanus: My understanding of counters is that there should never (globally across an entire cluster in all sstables) exist two shards for the same node_id+clock but with different counts.
21:22:57  scode pcmanus: Is my understanding correct there?
21:25:10  scode pcmanus: There is one node out of the three that has the offending shard (with a count of 2 instead of 1). Like with 3070, we observed this after having expanded a cluster (though I'm not sure how that would cause it, and we don't know if there existed a problem before the expansion).
 {code}
)
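
The invariant stated in the log (no two shards anywhere with the same node_id and clock but different counts) can be checked mechanically once shards have been dumped from the sstables. The following is only a minimal sketch under assumptions: the `Shard` tuple, the `find_conflicting_shards` helper, and the sample data are hypothetical illustrations, not Cassandra's actual counter context code or the contents of the gist.

```python
from collections import defaultdict
from typing import Dict, Iterable, NamedTuple, Set, Tuple


class Shard(NamedTuple):
    """Hypothetical flat view of one counter shard (not Cassandra's on-disk format)."""
    node_id: str
    clock: int
    count: int


def find_conflicting_shards(shards: Iterable[Shard]) -> Dict[Tuple[str, int], Set[int]]:
    """Return every (node_id, clock) pair that was seen with more than one count."""
    counts_by_key: Dict[Tuple[str, int], Set[int]] = defaultdict(set)
    for shard in shards:
        counts_by_key[(shard.node_id, shard.clock)].add(shard.count)
    # Keep only the keys that violate the invariant.
    return {key: counts for key, counts in counts_by_key.items() if len(counts) > 1}


# Made-up sample: shards for one counter as collected from several replicas.
observed = [
    Shard("node-a", 5, 1),
    Shard("node-a", 5, 2),  # same node_id and clock, different count
    Shard("node-b", 3, 7),
    Shard("node-b", 3, 7),  # exact duplicate, which is fine
]

conflicts = find_conflicting_shards(observed)
for (node_id, clock), counts in conflicts.items():
    print(f"conflict: node_id={node_id} clock={clock} counts={sorted(counts)}")
```

Any key reported here would be a candidate for the kind of discrepancy described above, where one replica holds a different count for an otherwise identical shard.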

 counter repair
 --

 Key: CASSANDRA-3070
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3070
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 0.8.4
Reporter: ivan
Assignee: Sylvain Lebresne
 Attachments: counter_local_quroum_maybeschedulerepairs.txt, 
 counter_local_quroum_maybeschedulerepairs_2.txt, 
 counter_local_quroum_maybeschedulerepairs_3.txt


 Hi!
 We have some counters out of sync but repair doesn't sync values.
 We tried nodetool repair.
 We use LOCAL_QUORUM for reads. A repair row mutation is sent to the other nodes 
 while reading a bad row, but the counters weren't repaired by the mutation.
 Output from two nodes was uploaded. (Some new debug messages were added.)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (CASSANDRA-3070) counter repair

2011-09-01 Thread ivan (JIRA)

 [ https://issues.apache.org/jira/browse/CASSANDRA-3070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ivan updated CASSANDRA-3070:


Attachment: counter_local_quroum_maybeschedulerepairs_3.txt






[jira] [Updated] (CASSANDRA-3070) counter repair

2011-08-23 Thread ivan (JIRA)

 [ https://issues.apache.org/jira/browse/CASSANDRA-3070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ivan updated CASSANDRA-3070:


Attachment: counter_local_quroum_maybeschedulerepairs.txt

 counter repair
 --

 Key: CASSANDRA-3070
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3070
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 0.8.4
Reporter: ivan
 Attachments: counter_local_quroum_maybeschedulerepairs.txt


 Hi!
 We have some counters out of sync but repair doesn't sync values.
 We tried nodetool repair.
 We use LOCAL_QUORUM for reads. A repair row mutation is sent to the other nodes 
 while reading a bad row, but the counters weren't repaired by the mutation.
 Output from two nodes was uploaded. (Some new debug messages were added.)
