[
https://issues.apache.org/jira/browse/CASSANDRA-4775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13701132#comment-13701132
]
Timo Kinnunen commented on CASSANDRA-4775:
------------------------------------------
I've written up my understanding of the similarities and differences between
Cassandra's counters and CRDT Positive-Negative Counters (PN-Counters): how
Cassandra's design differs from the CRDT design in the problem areas, what
changes could bring it closer to how PN-Counters are designed to work, and how
garbage collecting shards, shard ownership changes, decommissioning nodes and
retrying could be implemented under the changed design. It's all here:
https://gist.github.com/Overruler/14c0f3810e870666a328
To summarize, these 3 changes together should make the design more robust:
1) The buffering and bulk processing of incoming requests isn’t specific to
counters, so the unprocessed increments don’t need to be stored in the counter.
Instead, they should be inserted into a work queue that's separate from the
counter and belongs only to the replica. This way unprocessed increments won’t
get propagated to other replicas. When the replica needs to calculate the value
of the counter, it processes the work queue and increments its shards as normal,
with all the locking and writing.
2) To propagate the incremented counter the replica can create a list of all
(or some of) the shards that are replicating the counter and insert an exact
duplicate shard for each of them to receive. The new shards are transmitted to
other replicas like before.
3) Resolving two shards that are owned by the same replica is changed to happen
the same way on every replica. To ensure a shard is never decremented, the node
always keeps the shard with the highest absolute value and ignores timestamps.
Last-Write-Wins must not be allowed to affect the convergence of the values in
shards. I’m not sure how big of a threat this is in practice.
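A minimal sketch of the resolution rule in change 3, in Python for brevity. The
Shard class, its fields and resolve() are hypothetical names for illustration,
not Cassandra's actual classes, and the tie-break on equal absolute values is
my own assumption:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Shard:
    """Hypothetical shard: one replica's running total for a counter."""
    owner: str      # id of the replica that owns this shard
    value: int      # running total held in the shard
    timestamp: int  # carried along, but deliberately ignored by resolve()


def resolve(a: Shard, b: Shard) -> Shard:
    """Keep the shard with the highest absolute value, ignoring timestamps."""
    if a.owner != b.owner:
        raise ValueError("resolve() only applies to shards with the same owner")
    # Assumed tie-break: on equal absolute values, prefer the larger signed
    # value, so every replica picks the same value regardless of argument order.
    return max(a, b, key=lambda s: (abs(s.value), s.value))


newer_but_smaller = Shard(owner="replica-A", value=5, timestamp=200)
older_but_larger = Shard(owner="replica-A", value=8, timestamp=100)

# Last-Write-Wins would keep value=5 here; the rule above keeps value=8.
winner = resolve(newer_but_smaller, older_but_larger)
print(winner.value)  # prints 8
```

Because the choice depends only on the values, the argument order and the
timestamps don't change which value survives.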
With these changes, the values in the shards and in the whole counter should
keep converging.
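As a rough illustration of how the three changes could fit together, here is a
toy two-replica model. Everything in it (the Replica class, its method names,
and storing each shard as a pair of positive and negative totals as in a
textbook PN-Counter) is an assumption made for the example, not a description
of Cassandra's implementation:

```python
class Replica:
    """Toy replica; names and structure are illustrative only."""

    def __init__(self, replica_id):
        self.replica_id = replica_id
        self.work_queue = []  # unprocessed increments, never sent to peers
        self.shards = {}      # owner id -> (positive_total, negative_total)

    def add(self, delta):
        # Change 1: buffer the request locally, outside the counter itself.
        self.work_queue.append(delta)

    def process_queue(self):
        # Fold the buffered increments into this replica's own shard.
        pos, neg = self.shards.get(self.replica_id, (0, 0))
        for delta in self.work_queue:
            if delta >= 0:
                pos += delta
            else:
                neg += -delta
        self.work_queue.clear()
        self.shards[self.replica_id] = (pos, neg)

    def value(self):
        # Reading the counter drains the work queue first.
        self.process_queue()
        return sum(pos - neg for pos, neg in self.shards.values())

    def outgoing(self):
        # Change 2: an exact copy of this replica's own shard for its peers.
        self.process_queue()
        return {self.replica_id: self.shards[self.replica_id]}

    def receive(self, shards):
        # Change 3: per owner, keep the larger totals; no timestamps involved.
        for owner, (pos, neg) in shards.items():
            cur_pos, cur_neg = self.shards.get(owner, (0, 0))
            self.shards[owner] = (max(cur_pos, pos), max(cur_neg, neg))


a, b = Replica("A"), Replica("B")
a.add(3)
a.add(-1)
b.add(10)

msg_a, msg_b = a.outgoing(), b.outgoing()
b.receive(msg_a)
b.receive(msg_a)  # a duplicate delivery (e.g. a retry) changes nothing
a.receive(msg_b)

print(a.value(), b.value())  # prints 12 12
```

Since receive() only ever takes a maximum, delivering a shard twice or
delivering an older copy after a newer one leaves the totals unchanged, which
is the property a retry mechanism would rely on.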
I'm probably using the wrong terms for things somewhere; apologies. Please tell
me if I'm missing something about Cassandra's workings or CRDTs.
> Counters 2.0
> ------------
>
> Key: CASSANDRA-4775
> URL: https://issues.apache.org/jira/browse/CASSANDRA-4775
> Project: Cassandra
> Issue Type: New Feature
> Components: Core
> Reporter: Arya Goudarzi
> Assignee: Aleksey Yeschenko
> Labels: counters
> Fix For: 2.1
>
>
> The existing partitioned counters remain a source of frustration for most
> users almost two years after being introduced. The remaining problems are
> inherent in the design, not something that can be fixed given enough
> time/eyeballs.
> Ideally a solution would give us
> - similar performance
> - less special cases in the code
> - potential for a retry mechanism