[ 
https://issues.apache.org/jira/browse/CASSANDRA-3821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13201768#comment-13201768
 ] 

Yuki Morishita commented on CASSANDRA-3821:
-------------------------------------------

Here is my initial look at the issue (might be wrong):

The overcount seems to be caused by concurrent counter mutation replay from 
the commitlog combined with AtomicSortedColumns inside Memtable.
There is a race condition when adding a column to the memtable. When it 
happens, AtomicSortedColumns calls {{IColumn#reconcile}} multiple times until 
the column is stored. This causes the overcount, since a counter column's 
{{reconcile}} is not an idempotent operation.
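
To illustrate the suspected mechanism, here is a toy Python model of a compare-and-set retry loop. It is only a sketch: {{Counter}}, {{AtomicRef}}, {{add_with_retry}} and the {{interfere}} hook are hypothetical names, not Cassandra code. It also assumes that the per-attempt copy is shallow, so the counter object is shared between snapshots; that is one way a repeated {{reconcile}} could overcount, not a confirmed detail of AtomicSortedColumns.

```python
class Counter:
    """Toy counter column: reconcile adds the incoming delta in place."""

    def __init__(self, value=0):
        self.value = value

    def reconcile(self, delta):
        # Not idempotent: applying the same delta twice counts it twice.
        self.value += delta


class AtomicRef:
    """Minimal stand-in for a compare-and-set reference (hypothetical)."""

    def __init__(self, obj):
        self.obj = obj

    def compare_and_set(self, expected, new):
        if self.obj is expected:
            self.obj = new
            return True
        return False


def add_with_retry(ref, name, delta, interfere=None):
    """Apply delta to column `name`, retrying until the CAS succeeds."""
    attempts = 0
    while True:
        attempts += 1
        current = ref.obj
        # ASSUMPTION: shallow copy, so Counter objects are shared
        # between the old snapshot and the modified one.
        modified = dict(current)
        modified.setdefault(name, Counter()).reconcile(delta)
        if interfere is not None and attempts == 1:
            interfere(ref)  # simulate another writer winning the race
        if ref.compare_and_set(current, modified):
            return attempts


def swap_snapshot(ref):
    """Simulated concurrent writer: installs a new (shallow) snapshot."""
    ref.obj = dict(ref.obj)


# No contention: one attempt, the delta is counted once.
clean = AtomicRef({"hits": Counter(0)})
clean_attempts = add_with_retry(clean, "hits", 1)

# Lost race: the first CAS fails, the loop retries, and reconcile
# runs a second time on the shared counter.
raced = AtomicRef({"hits": Counter(0)})
raced_attempts = add_with_retry(raced, "hits", 1, interfere=swap_snapshot)

print(clean_attempts, clean.obj["hits"].value)
print(raced_attempts, raced.obj["hits"].value)
```

In this model the uncontended write stores 1 after one attempt, while the write that loses the first CAS stores 2 after two attempts, even though only a single increment of 1 was requested.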
                
> Counters in super columns don't preserve correct values after cluster restart
> -----------------------------------------------------------------------------
>
>                 Key: CASSANDRA-3821
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3821
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 1.1
>         Environment: ubuntu, 'trunk' branch, used ccm to create a 3 node 
> cluster with rf=3. A dtest was created to demonstrate.
>            Reporter: Tyler Patterson
>
> Set up a 3-node cluster with rf=3. Create a counter super column family and 
> increment a bunch of subcolumns 100 times each, with CL=QUORUM. Then wait a 
> few seconds, restart the cluster, and read the values back. They almost all 
> come back different (and higher) than they are supposed to be.
> Here are some extra things I've noticed:
>  - Reading back the values before the restart always produces correct results.
>  - Doing a nodetool flush before killing the cluster greatly improves the 
> results, though sometimes a value will still be incorrect. You might have to 
> run the test several times to see an incorrect value after a flush.
>  - This problem doesn't happen on C* 1.0.7, unless you don't sleep between 
> doing the increments and killing the cluster. Then it sometimes happens to a 
> lesser degree.
> The dtest that demonstrates this issue is called "super_counter_test.py". Run 
> it like this: nosetests --nocapture super_counter_test.py  You'll need ccm 
> from [email protected]:tpatterson/ccm.git.


        
