[ https://issues.apache.org/jira/browse/CASSANDRA-3070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13095012#comment-13095012 ]
ivan commented on CASSANDRA-3070:
---------------------------------

Hi Sylvain,

our sstables contain sensitive information, so I can't provide them. Sorry.

I reloaded the sstables in our test environment and captured a new output log (). This new log contains two new debug messages:
1. rows containing the "CF resolve" string (printed at the beginning of the resolve method in src/java/org/apache/cassandra/db/ColumnFamily.java)
2. rows containing the "CF addAll" string (printed at the beginning of the addAll method in src/java/org/apache/cassandra/db/ColumnFamily.java)

We have a backup of the sstables with these counters, so I can run any test on them.

We have a 6 node cluster using RF=3. When we experienced problems with some counters, I started to debug the problem. Using LOCAL_QUORUM CL we get the same answer from all servers, but using ONE CL we get a lower number from 2 of the 6 servers. The results from those 2 servers were lower by 3 than the results from the other servers.

I found the following:
- server (10.20.255.55) notices when there is a digest mismatch (using LOCAL_QUORUM)
- server (10.20.255.55) sends a repair (rowmutation) message to the related servers
- server (10.20.255.53) receives this mutation (which contains the same total() received by the client)
- when the mutation is handled by Memtable.put(), ColumnFamily.resolve() produces a different result (the data held in the Memtable is a delta, and the correct counter value is not applied in place of this delta)

I don't know whether the resolved value is correct or not (I suspect it is not, because the total() value seems to be wrong), because I don't know in detail how counters work.

Regards,
ivan

> counter repair
> --------------
>
>                 Key: CASSANDRA-3070
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3070
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 0.8.4
>            Reporter: ivan
>            Assignee: Sylvain Lebresne
>         Attachments: counter_local_quroum_maybeschedulerepairs.txt,
>                      counter_local_quroum_maybeschedulerepairs_2.txt
>
>
> Hi!
> We have some counters out of sync but repair doesn't sync values.
> We tried nodetool repair.
> We use LOCAL_QUORUM for read. A repair row mutation is sent to other nodes
> while reading a bad row but the counters weren't repaired by the mutation.
> Output of two nodes was uploaded. (Some new debug messages were added.)
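
The comment describes comparing the value each server returns at CL ONE and noticing that 2 of the 6 servers report a lower number. A minimal sketch of that comparison is given here, assuming the per-server values have already been collected by some other means. The class name, method names, node labels, and sample values are all hypothetical and are not part of Cassandra or of the attached logs.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative helper only; not Cassandra code. */
public class CounterDivergence {

    /** Returns the value reported by the largest number of replicas (first to reach the top count wins ties). */
    public static long majorityValue(Map<String, Long> valueByReplica) {
        Map<Long, Integer> votes = new HashMap<>();
        long best = 0;
        int bestVotes = 0;
        for (long value : valueByReplica.values()) {
            int n = votes.merge(value, 1, Integer::sum);
            if (n > bestVotes) {
                bestVotes = n;
                best = value;
            }
        }
        return best;
    }

    /** Maps each replica that disagrees with the majority to its offset from the majority value. */
    public static Map<String, Long> divergentReplicas(Map<String, Long> valueByReplica) {
        long expected = majorityValue(valueByReplica);
        Map<String, Long> result = new LinkedHashMap<>();
        for (Map.Entry<String, Long> e : valueByReplica.entrySet()) {
            if (e.getValue() != expected) {
                result.put(e.getKey(), e.getValue() - expected);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical values: four servers agree, two are lower by 3.
        Map<String, Long> reads = new LinkedHashMap<>();
        reads.put("node1", 100L);
        reads.put("node2", 100L);
        reads.put("node3", 97L);
        reads.put("node4", 100L);
        reads.put("node5", 97L);
        reads.put("node6", 100L);
        System.out.println(divergentReplicas(reads)); // prints {node3=-3, node5=-3}
    }
}
```

This only flags which servers disagree and by how much. It says nothing about which value is actually correct, which is the open question in the comment above.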