[
https://issues.apache.org/jira/browse/CASSANDRA-1546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12915352#action_12915352
]
zhu han commented on CASSANDRA-1546:
------------------------------------
Thank you for your detailed response.
{quote}
During a write, after having applied the increment locally, there is a read of
one column (the one corresponding to the local count).
This is the value that is sent for replication (it thus integrates the
freshly written update). This read is a normal read, so it hits as
many sstables as need be, if that's what you mean.
{quote}
Yep. What I mean is that we don't need to read multiple sstables, only the
freshest one, to get the latest value of the single column. If the counter
column is updated frequently, it should reside in the memtable, so we can read
it from the memtable directly, without touching the on-disk sstables. That is,
we would not need any disk I/O for a counter incr/decr mutation, just as with a
normal column mutation.
This would keep counter updates faster than reads, which is a key
competitive advantage of Cassandra.
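To make the memtable-first read concrete, here is a minimal toy sketch, not Cassandra code. The `Replica` class, its fields, and the `disk_reads` counter are hypothetical names for illustration. It assumes the memtable entry holds the full running local count; if it held only a delta since the last flush, the sstables would still have to be merged.

```python
from typing import Dict, List, Optional


class Replica:
    """Toy storage node: one in-memory memtable plus immutable sstables."""

    def __init__(self) -> None:
        self.memtable: Dict[str, int] = {}
        self.sstables: List[Dict[str, int]] = []  # newest first
        self.disk_reads = 0  # counts simulated sstable lookups

    def read_local_count(self, key: str) -> Optional[int]:
        # Fast path: a hot counter is found in the memtable, so no disk I/O.
        if key in self.memtable:
            return self.memtable[key]
        # Slow path: consult sstables newest first and stop at the first hit.
        for sstable in self.sstables:
            self.disk_reads += 1
            if key in sstable:
                return sstable[key]
        return None

    def increment(self, key: str, delta: int) -> int:
        # Assumption: the stored value is the complete local count,
        # so each increment starts from the current value.
        current = self.read_local_count(key) or 0
        self.memtable[key] = current + delta
        return self.memtable[key]

    def flush(self) -> None:
        # Turn the memtable into the newest sstable and start a fresh one.
        self.sstables.insert(0, self.memtable)
        self.memtable = {}
```

In this model, only the first increment after a flush touches an sstable; later increments on the same counter are served from the memtable.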
{quote}
Ok, the problem is the following: suppose you issue one increment (+1), then
you remove the counter, then you increment again (+1).
Say the leader replica is always the same one, but it receives the two
increments first. It will 'merge' those two increments, and
we'll end up with one column, whose count is 2 and whose timestamp is that
of the last increment. Then it receives the delete.
But as far as the replica is concerned, this delete is obsolete and will be
discarded. Even if we were somehow able to detect that the delete
should have deleted something, how could we know which parts of the now-merged
count should be kept?
{quote}
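The scenario quoted above can be replayed with a small sketch. This is a toy model under stated assumptions, not the patch's actual code: `CounterColumn` and `LeaderReplica` are hypothetical names, timestamps are plain integers, and a delete is discarded whenever it is older than the merged column's timestamp.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CounterColumn:
    count: int
    timestamp: int


class LeaderReplica:
    def __init__(self) -> None:
        self.column: Optional[CounterColumn] = None

    def apply_increment(self, delta: int, timestamp: int) -> None:
        if self.column is None:
            self.column = CounterColumn(delta, timestamp)
        else:
            # Merging keeps one column: summed count, latest timestamp.
            self.column = CounterColumn(
                self.column.count + delta,
                max(self.column.timestamp, timestamp),
            )

    def apply_delete(self, timestamp: int) -> bool:
        """Return True if the delete took effect, False if discarded."""
        if self.column is not None and timestamp < self.column.timestamp:
            return False  # older than the merged column, so it looks obsolete
        self.column = None
        return True


# Client order: +1 (ts=1), delete (ts=2), +1 (ts=3).
in_order = LeaderReplica()
in_order.apply_increment(1, 1)
in_order.apply_delete(2)
in_order.apply_increment(1, 3)

# Arrival order at the leader: both increments first, then the delete.
reordered = LeaderReplica()
reordered.apply_increment(1, 1)
reordered.apply_increment(1, 3)
delete_applied = reordered.apply_delete(2)
```

Here `in_order` ends with a count of 1, while `reordered` ends with a count of 2 and a discarded delete. Once the two increments are merged, nothing in the column records which part of the count predates the delete.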
I see. What I said does not work here.
What makes things more complicated is that commands from two clients do not
have any total order at all. For example, take two clients: one issues an
increment and the other issues a deletion, both at time t1. Whether the effect
of the increment remains after both commands execute is not deterministic.
So I agree with you: this issue should not be a blocker for this feature.
Cassandra cannot provide atomic incr/decr or delete, no matter how hard we
try, as long as the CAP theorem holds. I even think we should not try to solve
this tricky problem. Let's expose this constraint to the client application
directly.
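The tie at time t1 can be illustrated with a short sketch. It is a toy model with hypothetical names (`Op`, `apply_by_timestamp`, `T1`) and assumes the replica orders commands by timestamp only, with no further tie-break rule, so commands with equal timestamps stay in arrival order.

```python
from itertools import permutations
from typing import Optional, Sequence, Tuple

Op = Tuple[str, int]  # (kind, timestamp); kind is "incr" or "delete"


def apply_by_timestamp(arrival: Sequence[Op]) -> Optional[int]:
    """Apply commands sorted by timestamp; None means the counter is absent."""
    count: Optional[int] = None
    # sorted() is stable, so equal timestamps keep their arrival order.
    for kind, _timestamp in sorted(arrival, key=lambda op: op[1]):
        if kind == "incr":
            count = (count or 0) + 1
        else:
            count = None
    return count


T1 = 100  # both clients stamp their command with the same time
ops = [("incr", T1), ("delete", T1)]
outcomes = {order: apply_by_timestamp(order) for order in permutations(ops)}
```

The two arrival orders give different results: the counter is absent when the delete arrives last and holds 1 when the increment arrives last. The timestamps alone cannot decide between them.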
> (Yet another) approach to counting
> ----------------------------------
>
> Key: CASSANDRA-1546
> URL: https://issues.apache.org/jira/browse/CASSANDRA-1546
> Project: Cassandra
> Issue Type: New Feature
> Components: Core
> Reporter: Sylvain Lebresne
> Assignee: Sylvain Lebresne
> Fix For: 0.7.0
>
> Attachments: 0001-Remove-IClock-from-internals.patch,
> 0002-Counters.patch, 0003-Generated-thrift-files-changes.patch
>
>
> This could be described as a mix between CASSANDRA-1072 without clocks and
> CASSANDRA-1421.
> More details in the comment below.