[ 
https://issues.apache.org/jira/browse/CASSANDRA-6412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14964240#comment-14964240
 ] 

Jeff Jirsa edited comment on CASSANDRA-6412 at 10/19/15 11:38 PM:
------------------------------------------------------------------

I implemented some of this just to see how CASSANDRA-8099 works. While talking 
through the tombstone issue, I think I've stumbled upon a secondary problem in 
read repair that's tangential to, but not necessarily isolated from, tombstones. 

Assuming a read with CL > ONE, on digest mismatch, it's not sufficient to 
re-reconcile the winning cells from each replica - you'd need to read repair 
every single cell stored by each node, and then reconcile all of them again. 
For the three most obvious single-primitive resolvers (min/max/first), removing 
tombstones may allow you to simply reconcile the winning cells and read-repair 
the resulting winner back to the replicas, and things should be OK. For more 
complex resolvers (those that may use tuples/UDTs - mean, HLL, etc.), this 
is less likely to hold. 
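
To make the distinction above concrete, here is a minimal sketch (not Cassandra 
code). The write ids, the replica contents and the helper names are all 
assumptions made up for illustration. It compares merging only each replica's 
winning value against reconciling every cell from both replicas: 

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative only: write ids and resolver helpers are hypothetical,
// not Cassandra APIs.
class ResolverSketch {
    // Cells held by replica A, keyed by a hypothetical write id.
    static Map<String, Long> replicaA() {
        Map<String, Long> cells = new HashMap<>();
        cells.put("w1", 10L);
        cells.put("w2", 20L);
        cells.put("w3", 30L); // only replica A received w3
        return cells;
    }

    // Cells held by replica B; it shares w1 and w2 with replica A.
    static Map<String, Long> replicaB() {
        Map<String, Long> cells = new HashMap<>();
        cells.put("w1", 10L);
        cells.put("w2", 20L);
        cells.put("w4", 2L); // only replica B received w4
        return cells;
    }

    static long min(Map<String, Long> cells) {
        return cells.values().stream().mapToLong(Long::longValue).min().getAsLong();
    }

    static double mean(Map<String, Long> cells) {
        return cells.values().stream().mapToLong(Long::longValue).average().getAsDouble();
    }

    // Full reconcile: every distinct write from both replicas, counted once.
    static Map<String, Long> union(Map<String, Long> a, Map<String, Long> b) {
        Map<String, Long> all = new HashMap<>(a);
        all.putAll(b);
        return all;
    }

    public static void main(String[] args) {
        Map<String, Long> a = replicaA();
        Map<String, Long> b = replicaB();
        Map<String, Long> all = union(a, b);

        // min: merging the per-replica winners matches the full reconcile.
        System.out.println(Math.min(min(a), min(b)) == min(all));

        // mean: merging the per-replica winners does not match.
        double fromWinners = (mean(a) + mean(b)) / 2;
        System.out.println(fromWinners + " vs " + mean(all));
    }
}
```

In this made-up scenario, carrying (sum, count) tuples instead of a single mean 
would not help either: adding (60, 3) and (32, 3) counts w1 and w2 twice. 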




was (Author: jjirsa):
I implemented some of this just to see how CASSANDRA-8099 works. While talking 
through the tombstone issue, I think I've stumbled upon a secondary problem in 
read repair that's tangential to, but not necessarily isolated from, tombstones. 

Assuming a read with CL > ONE, on digest mismatch, it's not sufficient to 
re-reconcile the winning cells from each replica - you'd need to read repair 
every single cell stored by each node, and then reconcile all of them again. 
For the three most obvious single-primitive resolvers (min/max/first), removing 
tombstones may allow you to simply reconcile the winning cells and read-repair 
the resulting winner back to the replicas, and things should be OK. For more 
complex resolvers (those that may use tuples/UDTs - mean, HLL, etc.), this 
is less likely to hold. 

It seems to me that without a full vector-clock implementation, this may become 
very difficult (vector clocks would, at least, allow significantly faster read 
repair following a mismatch, as comparing the vectors would make identification 
of the missing components much easier). CASSANDRA-580 was killed years ago - 
and probably for good reason. Is there interest in reconsidering vector clocks? 
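
As a rough illustration of why comparing vectors could make the missing 
components easy to identify, here is a minimal sketch of a per-cell version 
vector. The structure (writer id mapped to the highest sequence number seen) 
and every name in it are assumptions for illustration; nothing like this 
exists in Cassandra, and it is not what CASSANDRA-580 proposed in detail. 

```java
import java.util.Map;
import java.util.TreeMap;

// Illustrative only: a hypothetical version vector (writer id -> highest
// sequence number seen). Not an existing Cassandra structure.
class VersionVectorSketch {
    // For each writer, the inclusive [from, to] range of updates that
    // `remote` has seen but `local` has not.
    static Map<String, long[]> missing(Map<String, Long> local, Map<String, Long> remote) {
        Map<String, long[]> gaps = new TreeMap<>();
        for (Map.Entry<String, Long> e : remote.entrySet()) {
            long seen = local.getOrDefault(e.getKey(), 0L);
            if (e.getValue() > seen) {
                gaps.put(e.getKey(), new long[] { seen + 1, e.getValue() });
            }
        }
        return gaps;
    }

    public static void main(String[] args) {
        Map<String, Long> a = new TreeMap<>();
        a.put("n1", 3L);
        a.put("n2", 1L);

        Map<String, Long> b = new TreeMap<>();
        b.put("n1", 3L);
        b.put("n2", 2L);
        b.put("n3", 1L);

        // Lists only the update ranges replica A would need from replica B.
        missing(a, b).forEach((writer, range) ->
                System.out.println(writer + ": " + range[0] + ".." + range[1]));
    }
}
```

With vectors like these, a mismatch would only require shipping the listed 
ranges rather than every cell stored by each node. 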

> Custom creation and merge functions for user-defined column types
> -----------------------------------------------------------------
>
>                 Key: CASSANDRA-6412
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6412
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core
>            Reporter: Nicolas Favre-Felix
>
> This is a proposal for a new feature, mapping custom types to Cassandra 
> columns.
> These types would provide a creation function and a merge function, to be 
> implemented in Java by the user.
> This feature relates to the concept of CRDTs; the proposal is to replicate 
> "operations" on these types during write, to apply these operations 
> internally during merge (Column.reconcile), and to also merge their values on 
> read.
> The following operations are made possible without reading back any data:
> * MIN or MAX(value) for a column
> * First value for a column
> * Count Distinct
> * HyperLogLog
> * Count-Min
> And any composition of these too, e.g. a Candlestick type includes first, 
> last, min, and max.
> The merge operations exposed by these types need to be commutative; this is 
> the case for many functions used in analytics.
> This feature is incomplete without some integration with CASSANDRA-4775 
> (Counters 2.0) which provides a Read-Modify-Write implementation for 
> distributed counters. Integrating custom creation and merge functions with 
> new counters would let users implement complex CRDTs in Cassandra, including:
> * Averages & related (sum of squares, standard deviation)
> * Graphs
> * Sets
> * Custom registers (even with vector clocks)
> I have a working prototype with implementations for min, max, and Candlestick 
> at https://github.com/acunu/cassandra/tree/crdts - I'd appreciate any 
> feedback on the design and interfaces.
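
For readers unfamiliar with the idea, a commutative merge for the Candlestick 
type mentioned in the description might look roughly like the sketch below. 
This is not the interface of the linked prototype; the class shape, the 
per-value timestamps and the tie-breaking rule are assumptions made for 
illustration. 

```java
// Illustrative only: a hypothetical Candlestick value with a merge that is
// commutative, associative and idempotent.
final class Candlestick {
    final long firstTs;
    final double first;
    final long lastTs;
    final double last;
    final double min;
    final double max;

    Candlestick(long firstTs, double first, long lastTs, double last,
                double min, double max) {
        this.firstTs = firstTs;
        this.first = first;
        this.lastTs = lastTs;
        this.last = last;
        this.min = min;
        this.max = max;
    }

    // Creation function: a single observation is its own first/last/min/max.
    static Candlestick of(long ts, double value) {
        return new Candlestick(ts, value, ts, value, value, value);
    }

    // Merge function: timestamp ties are broken by value so that the
    // result does not depend on argument order.
    static Candlestick merge(Candlestick a, Candlestick b) {
        boolean aFirst = a.firstTs < b.firstTs
                || (a.firstTs == b.firstTs && a.first <= b.first);
        boolean aLast = a.lastTs > b.lastTs
                || (a.lastTs == b.lastTs && a.last >= b.last);
        return new Candlestick(
                aFirst ? a.firstTs : b.firstTs, aFirst ? a.first : b.first,
                aLast ? a.lastTs : b.lastTs, aLast ? a.last : b.last,
                Math.min(a.min, b.min), Math.max(a.max, b.max));
    }
}
```

Because the merge is order-independent, replicas that apply the same 
operations in different orders should converge on the same value. 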



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
