[ https://issues.apache.org/jira/browse/CASSANDRA-4775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13693681#comment-13693681 ]
Colin B. commented on CASSANDRA-4775:
-------------------------------------
Would anyone be interested in a type of counter that is about 99% correct, but
not exact?
The HyperLogLog cardinality estimation algorithm would be fairly
straightforward to implement inside Cassandra. It estimates the number of
distinct elements in a set. One way to use it as a counter is to have two
"sets" of timeuuids, A and R. Each time you want to increment the counter, add
a timeuuid to the A set; each time you want to decrement, add a timeuuid to
the R set. The count is the estimated size of A minus the estimated size of R.
Re-adding the same item (timeuuid) to a "set" is idempotent, so a write can be
retried safely. A read would need to access a constant amount of internal
data, and the internal data is a good fit for Cassandra's method of merging
distributed writes.
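To make the idea concrete, here is a rough Python sketch of the A/R scheme. It is not Cassandra code and nothing in it reflects an existing Cassandra API: the class names (HyperLogLog, ApproxCounter), the choice of SHA-1 as the hash, and the register count (2**12) are all assumptions made for illustration. It only tries to show that repeated adds are idempotent and that two replicas can be merged by taking the register-wise maximum.

```python
import hashlib
import math
import uuid


class HyperLogLog:
    """Minimal HyperLogLog sketch with 2**p registers (illustrative only)."""

    def __init__(self, p=12):
        self.p = p
        self.m = 1 << p
        self.registers = [0] * self.m

    def add(self, item: bytes) -> None:
        # 64-bit hash taken from SHA-1; the hash choice is an assumption here.
        h = int.from_bytes(hashlib.sha1(item).digest()[:8], "big")
        idx = h >> (64 - self.p)               # top p bits select a register
        rest = h & ((1 << (64 - self.p)) - 1)  # remaining 64 - p bits
        # Rank = position of the leftmost 1-bit in the remaining bits.
        rank = (64 - self.p) - rest.bit_length() + 1
        # max() makes re-adding the same item a no-op.
        self.registers[idx] = max(self.registers[idx], rank)

    def merge(self, other: "HyperLogLog") -> "HyperLogLog":
        # Register-wise max is commutative, associative and idempotent.
        merged = HyperLogLog(self.p)
        merged.registers = [max(a, b) for a, b in zip(self.registers, other.registers)]
        return merged

    def estimate(self) -> float:
        alpha = 0.7213 / (1 + 1.079 / self.m)  # bias constant for m >= 128
        raw = alpha * self.m ** 2 / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * self.m and zeros:
            # Small-range correction (linear counting).
            return self.m * math.log(self.m / zeros)
        return raw


class ApproxCounter:
    """Hypothetical counter: value = estimate(A) - estimate(R)."""

    def __init__(self, p=12):
        self.added = HyperLogLog(p)    # the "A" set
        self.removed = HyperLogLog(p)  # the "R" set

    def increment(self, op_id: uuid.UUID) -> None:
        self.added.add(op_id.bytes)

    def decrement(self, op_id: uuid.UUID) -> None:
        self.removed.add(op_id.bytes)

    def merge(self, other: "ApproxCounter") -> "ApproxCounter":
        merged = ApproxCounter(self.added.p)
        merged.added = self.added.merge(other.added)
        merged.removed = self.removed.merge(other.removed)
        return merged

    def value(self) -> int:
        return round(self.added.estimate() - self.removed.estimate())


if __name__ == "__main__":
    counter = ApproxCounter()
    op = uuid.uuid1()          # a timeuuid identifying one increment
    counter.increment(op)
    counter.increment(op)      # a retry of the same operation changes nothing
    print(counter.value())
```

One caveat worth keeping in mind: the estimation error of each sketch is relative to the size of its own set, so the error of the difference grows with the total number of increments and decrements, not with the counter's current value.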
A description of the HyperLogLog algorithm is available here:
http://blog.aggregateknowledge.com/2012/10/25/sketch-of-the-day-hyperloglog-cornerstone-of-a-big-data-infrastructure/
> Counters 2.0
> ------------
>
> Key: CASSANDRA-4775
> URL: https://issues.apache.org/jira/browse/CASSANDRA-4775
> Project: Cassandra
> Issue Type: New Feature
> Components: Core
> Reporter: Arya Goudarzi
> Assignee: Aleksey Yeschenko
> Labels: counters
> Fix For: 2.1
>
>
> The existing partitioned counters remain a source of frustration for most
> users almost two years after being introduced. The remaining problems are
> inherent in the design, not something that can be fixed given enough
> time/eyeballs.
> Ideally a solution would give us
> - similar performance
> - fewer special cases in the code
> - potential for a retry mechanism