[ 
https://issues.apache.org/jira/browse/CASSANDRA-1546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12915352#action_12915352
 ] 

zhu han commented on CASSANDRA-1546:
------------------------------------

Thank you for your detailed response.

{quote}
During a write, after having apply the increment locally, there is a read a one 
column (the one corresponding to the local count).
This is this value that is sent for replication (this thus integrate the 
fleshly written update). This read is a normal read, so it hits as
many sstables as need be, if that's what you mean. 
{quote}

Yep. What I mean is that we don't need to read multiple sstables, only the 
freshest one, to get the latest value of the single column. If the counter 
column is updated frequently, it should reside in the memtable, so we can read 
it from the memtable directly without touching the on-disk sstables. That is, 
the counter incr/decr mutation would need no disk IO, just like a normal 
column mutation. 
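
To make the memtable-only read concrete, here is a minimal Python sketch of the idea. It is a toy model, not Cassandra code: the `CounterStore` class, its method names, and the `disk_reads` counter are all hypothetical, and it assumes the memtable entry holds the full local count rather than a delta that still has to be merged with older sstables.

```python
from typing import Dict, List


class CounterStore:
    """Toy model: one in-memory memtable plus older on-disk sstables.

    Assumption: a memtable entry holds the full local count for a key,
    so a memtable hit needs no merge with older sstables.
    """

    def __init__(self, sstables: List[Dict[str, int]]) -> None:
        self.memtable: Dict[str, int] = {}
        self.sstables = sstables  # newest first; each maps key -> local count
        self.disk_reads = 0       # number of sstables touched so far

    def _read_from_disk(self, key: str) -> int:
        # Scan sstables from newest to oldest and stop at the first hit.
        for sstable in self.sstables:
            self.disk_reads += 1
            if key in sstable:
                return sstable[key]
        return 0

    def read_local_count(self, key: str) -> int:
        # Hot counters are served from the memtable with no disk IO.
        if key in self.memtable:
            return self.memtable[key]
        return self._read_from_disk(key)

    def increment(self, key: str, delta: int) -> int:
        # Apply the increment locally and return the value to replicate.
        new_count = self.read_local_count(key) + delta
        self.memtable[key] = new_count
        return new_count


store = CounterStore([{"hits": 5}, {"hits": 2}])
first = store.increment("hits", 1)   # cold key: falls back to one sstable
second = store.increment("hits", 1)  # hot key: served from the memtable
```

In this model only the first increment of a cold counter touches disk. Every later increment finds the merged value in the memtable.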

This keeps counter updates faster than reads, which is a key competitive 
advantage of Cassandra.

{quote}
Ok, the problem is the following: suppose you issue one increment (+1), then 
you remove the counter, then you increment again (+1).
Say the leader replicate is always the same one, but he receives the two 
increments first. It will 'merge' those two increment, and
we'll end up with one column, whose count is 2 and whose timestamp is the one 
of the last increment. Then it receives the delete.
But as far as he's concerned, this delete is obsolete and will be discarded. 
Even if we were somehow able to detect that the delete
should have delete something, how can we know which parts of the now merged 
count should be kept or not.
{quote}
I see. What I said does not work here.
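
The quoted scenario can be replayed with a small Python sketch. This is only an illustration under stated assumptions: `CounterColumn`, `apply_increment`, `apply_delete`, and `replay` are hypothetical names, increments are assumed to merge into one column that keeps the latest timestamp, and a delete is assumed to be dropped when its timestamp is older than the column's.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class CounterColumn:
    count: int
    timestamp: int


def apply_increment(col: Optional[CounterColumn], delta: int, ts: int) -> CounterColumn:
    # Increments are merged: counts add up, only the newest timestamp survives.
    if col is None:
        return CounterColumn(delta, ts)
    return CounterColumn(col.count + delta, max(col.timestamp, ts))


def apply_delete(col: Optional[CounterColumn], ts: int) -> Optional[CounterColumn]:
    # A delete older than the merged column looks obsolete and is discarded.
    if col is not None and ts < col.timestamp:
        return col
    return None


def replay(ops: List[Tuple[str, int, int]]) -> Optional[CounterColumn]:
    # Each op is (kind, delta, timestamp); delta is ignored for deletes.
    col = None
    for kind, delta, ts in ops:
        if kind == "incr":
            col = apply_increment(col, delta, ts)
        else:
            col = apply_delete(col, ts)
    return col


# The client issued: +1 at t=1, delete at t=2, +1 at t=3.
in_order = replay([("incr", 1, 1), ("delete", 0, 2), ("incr", 1, 3)])
# The replica received both increments before the delete.
reordered = replay([("incr", 1, 1), ("incr", 1, 3), ("delete", 0, 2)])
```

Applied in issue order, the counter ends at 1. With the delete arriving last, the merged column already carries timestamp 3, so the delete is dropped and the count stays at 2. Nothing in the merged column records which part of the count predates the delete.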

What makes things more complicated is that commands from two clients do not 
have any total order at all. For example, suppose two clients issue commands 
at the same time t1, one an increment and the other a deletion. Whether the 
effect of the increment survives after both commands execute is not 
deterministic.
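
A short Python sketch of this tie, under simplifying assumptions: the `apply` and `run` helpers are hypothetical, a counter is modelled as a `(count, last_timestamp)` pair, and a delete takes effect unless it is strictly older than the counter's timestamp.

```python
from typing import List, Tuple

State = Tuple[int, int]  # (count, last_timestamp)


def apply(state: State, op: str, ts: int) -> State:
    count, last_ts = state
    if op == "incr":
        return (count + 1, max(last_ts, ts))
    # Delete: ignored only if strictly older than what was already applied.
    if ts < last_ts:
        return state
    return (0, ts)


def run(ops: List[str], ts: int = 5) -> int:
    # Both commands carry the same timestamp; only arrival order differs.
    state: State = (0, 0)
    for op in ops:
        state = apply(state, op, ts)
    return state[0]


incr_then_delete = run(["incr", "delete"])
delete_then_incr = run(["delete", "incr"])
```

With equal timestamps the final count is 0 in one arrival order and 1 in the other. A fixed tie-break rule would make the result deterministic, but the choice would be arbitrary from the clients' point of view.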

So I agree with you: this issue should not be a blocker for this feature. 
Cassandra cannot provide atomic incr/decr or delete, no matter how hard we 
try, as long as the CAP theorem holds. I even think we should not try to solve 
this tricky problem. Let's expose this constraint to the client application 
directly.

> (Yet another) approach to counting
> ----------------------------------
>
>                 Key: CASSANDRA-1546
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1546
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core
>            Reporter: Sylvain Lebresne
>            Assignee: Sylvain Lebresne
>             Fix For: 0.7.0
>
>         Attachments: 0001-Remove-IClock-from-internals.patch, 
> 0002-Counters.patch, 0003-Generated-thrift-files-changes.patch
>
>
> This could be described as a mix between CASSANDRA-1072 without clocks and 
> CASSANDRA-1421.
> More details in the comment below.

