[
https://issues.apache.org/jira/browse/CASSANDRA-1421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12904942#action_12904942
]
Jonathan Ellis commented on CASSANDRA-1421:
-------------------------------------------
I am open to suggestions for improving either this approach or the
CASSANDRA-1072 one.
But "1421 is slow for workload X" scares me a lot less than "1072 does not
allow higher write CL than ONE or idempotent retries for any workload."
> An eventually consistent approach to counting
> ---------------------------------------------
>
> Key: CASSANDRA-1421
> URL: https://issues.apache.org/jira/browse/CASSANDRA-1421
> Project: Cassandra
> Issue Type: New Feature
> Components: Core
> Reporter: Jonathan Ellis
> Fix For: 0.7.0
>
>
> Counters may be implemented as multiple rows in a column family; that is,
> counters will have a configurable shard parameter; a shard factor of 128
> would have 128 rows.
> An increment will be a (uuid, count) name, value tuple. The row shard will
> be uuid % shardfactor. Timestamp is ignored. This could be implemented w/
> the existing Thrift write api, or we could add a special case method for it.
> Either is fine; the main advantage of the former is it lets increments be
> included in batch mutations.
> (Decrements we get for free as simply negative values.)
> Each node will be responsible for aggregating *the rows replicated to it*
> after GCGraceSeconds have elapsed. Count aggregation will be a scheduled
> task on each machine. This will require a mutex for each shard vs both
> writes and reads.
> This will not have the conflict resolution problem of CASSANDRA-580, or the
> write fragility of CASSANDRA-1072. Normal CL will apply on both read and
> write. Write idempotency is preserved. I expect writes will be faster than
> either, since no reads are required at all on the write path. Reads will be
> slower, but the read overhead can be reduced by lowering GCGraceSeconds to
> below your repair frequency if you are okay with the durability tradeoff
> there (it will not be worse than CASSANDRA-1072, for instance). More disk
> space will be used by this approach, but that is the cheapest resource we
> have.
> Special case code required will be much less than either the 580 or 1072
> approach -- primarily some code in StorageProxy to combine the uuid slices
> with their aggregation columns and sum them for all the shards, the local
> aggregation code, and minor changes to read/write path to add the mutex vs
> aggregation.
>
> We could also get rid of the Clock change and go back to i64 timestamps; if
> we're not going to use Clocks for increments I don't think they have much
> raison d'être. (Those of you just joining us, see
> http://pl.atyp.us/wordpress/?p=2601 for background.) The CASSANDRA-1072
> approach doesn't use Clocks either, or rather, it uses Clocks but not a
> byte[] value, which really means the Clock is unnecessary.
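
To make the quoted description concrete, here is a rough single-process sketch of the scheme: increments are stored as (uuid, count) columns in the row chosen by uuid % shardfactor, a periodic task folds increments older than GCGraceSeconds into a per-shard aggregate, and a read sums the aggregates plus the not-yet-aggregated increments. The class name ShardedCounter, the explicit `now` argument, and keeping a write time next to each increment are assumptions made for illustration; replication, consistency levels, and the per-shard mutex are left out entirely.

```python
import uuid


class ShardedCounter:
    """Toy model of one counter; names and layout are hypothetical."""

    def __init__(self, shard_factor, gc_grace_seconds):
        self.shard_factor = shard_factor
        self.gc_grace_seconds = gc_grace_seconds
        # One "row" per shard: {increment uuid -> (delta, write time)}.
        self.rows = [dict() for _ in range(shard_factor)]
        # One aggregation column per shard.
        self.aggregates = [0] * shard_factor

    def increment(self, delta, op_id=None, now=0.0):
        """Record an increment (a negative delta is a decrement)."""
        if op_id is None:
            op_id = uuid.uuid4()
        shard = op_id.int % self.shard_factor
        # Retrying with the same uuid overwrites the same column,
        # so a retry does not count twice while the column still exists.
        self.rows[shard][op_id] = (delta, now)
        return op_id

    def aggregate(self, now):
        """Fold increments older than gc_grace_seconds into the aggregates."""
        for shard, row in enumerate(self.rows):
            expired = [
                op_id
                for op_id, (_, written_at) in row.items()
                if now - written_at >= self.gc_grace_seconds
            ]
            for op_id in expired:
                delta, _ = row.pop(op_id)
                self.aggregates[shard] += delta

    def read(self):
        """Sum the aggregation columns and the pending increments."""
        pending = sum(delta for row in self.rows for delta, _ in row.values())
        return sum(self.aggregates) + pending


counter = ShardedCounter(shard_factor=4, gc_grace_seconds=10)
op = counter.increment(5, now=0)
counter.increment(5, op_id=op, now=1)   # retry of the same write
counter.increment(-2, now=2)            # decrement
print(counter.read())
counter.aggregate(now=11)               # only the first increment is old enough
print(counter.read())
```

Note that in this sketch a retry arriving after its uuid has been aggregated would be counted again, which is one reason to wait out GCGraceSeconds before aggregating.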