[ 
https://issues.apache.org/jira/browse/CASSANDRA-580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12787092#action_12787092
 ] 

Kelvin Kakugawa commented on CASSANDRA-580:
-------------------------------------------

I've been talking w/ the authors of the interval tree clocks (ITC) paper about 
how to apply ITC to Cassandra, and it looks like we may need to modify the ITC 
algorithm for our use-case.

The crux of the matter is Cassandra's hinted hand-off feature.  The ITC 
algorithm composes an id-tree and event-tree to represent the version of a 
given value.  The id-tree is a nice way to create unique ids on-the-fly for any 
node (by splitting the id-tree, as necessary) and the event-tree represents 
causality.  However, the problem is that for a node to update the event-tree 
for a value, it has to be assigned a part of the id-tree beforehand.
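To make the id-tree splitting concrete, here is a minimal sketch of the "split" operation as I understand it from the ITC paper. The tuple representation and the function name are my own assumptions for illustration, not anything in the Cassandra patches: an id is 0 (owns nothing), 1 (owns the whole interval), or a pair (left, right) of sub-ids.

```python
# Hypothetical representation: an id is 0, 1, or a (left, right) tuple.
# Ids are assumed to be normalized, i.e. (0, 0) is written as 0.

def split(i):
    """Split an id into two disjoint ids that together cover the original."""
    if i == 0:
        # Nothing is owned, so there is nothing to hand out.
        return 0, 0
    if i == 1:
        # The whole interval is owned: give away the right half.
        return (1, 0), (0, 1)
    left, right = i
    if left == 0:
        # Only the right side is owned: split inside it.
        r1, r2 = split(right)
        return (0, r1), (0, r2)
    if right == 0:
        # Only the left side is owned: split inside it.
        l1, l2 = split(left)
        return (l1, 0), (l2, 0)
    # Both sides are owned: keep the left, give away the right.
    return (left, 0), (0, right)


seed = 1
a, b = split(seed)   # two nodes now share the id space
a1, a2 = split(a)    # node a hands part of its share to a third node
```

Note that splitting 0 only yields more 0s, which is the root of the problem described here: a node that owns no part of the id-tree cannot create an id for itself.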

A short example follows:
If a node tries to forward a value but, because of a failure scenario, has to 
store the value locally, it won't be able to update the version of the value 
unless it has already been assigned a part of the id-tree by the set of nodes 
responsible for the value.
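A rough sketch of that failure case, under loud simplifications: the stamp is an (id, events) pair, the event-tree is collapsed to a plain counter (real ITC only inflates the part of the event-tree covered by the id), and all names below are hypothetical. The only thing it is meant to show is the precondition on the id.

```python
class AnonymousStampError(Exception):
    """Raised when a node with no share of the id-tree tries to record an event."""


def peek(stamp):
    """Copy a stamp for transmission without giving away any of the id."""
    _ident, events = stamp
    return (0, events)


def event(stamp):
    """Record an update. Only legal if the stamp owns part of the id-tree."""
    ident, events = stamp
    if ident == 0:
        raise AnonymousStampError("no part of the id-tree assigned")
    # Simplification: a counter stands in for the ITC event-tree.
    return (ident, events + 1)


replica = (1, 3)             # a node responsible for the value
hint_holder = peek(replica)  # a forwarding node only ever sees an anonymous copy

try:
    event(hint_holder)
    hint_versioned = True
except AnonymousStampError:
    # The node holding the hint has no id, so it cannot bump the version.
    hint_versioned = False
```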

The authors have a couple of solutions:
1) Split the id-tree between all nodes in the cluster from the very start.  
This solves the problem, but it mutes the main benefit of ITC over 
traditional version vectors, i.e. dynamically partitioning the id space at 
run-time, and only to the extent necessary, to conserve space.
2) On client reads, do a "fork" instead of a "peek" and share the id-tree 
w/ the client.  However, this is a more complicated approach that may need to 
be worked out some more.

In any case, since we're using an opaque context, these decisions won't affect 
the interface.  However, it's an interesting implementation concern.  Depending 
on the average size of a Cassandra cluster, it may or may not be worth 
pre-forking the id-tree to all nodes from the very start.
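As a back-of-the-envelope way to look at the cluster-size question, the sketch below pre-partitions the id space evenly across n nodes and measures how deep the resulting ids get. The balanced layout and the names prefork and depth are assumptions of mine; repeated forks at run-time could produce a less balanced tree.

```python
# Hypothetical representation: an id is 0, 1, or a (left, right) tuple.

def prefork(n):
    """Return n disjoint ids that together cover the whole id space."""
    if n < 1:
        raise ValueError("need at least one node")
    if n == 1:
        return [1]
    # Give the left half of the interval to n // 2 nodes, the rest to the right.
    left = prefork(n // 2)
    right = prefork(n - n // 2)
    return [(i, 0) for i in left] + [(0, i) for i in right]


def depth(i):
    """Nesting depth of an id, a rough proxy for its encoded size."""
    if i == 0 or i == 1:
        return 0
    return 1 + max(depth(i[0]), depth(i[1]))


for nodes in (4, 32, 1024):
    ids = prefork(nodes)
    print(nodes, max(depth(i) for i in ids))
```

With an even partition the id depth grows with log2 of the node count, so the per-node id stays small; the space concern is more about the event-tree, which may end up with an entry per id that has ever recorded an update.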

> vector clock support
> --------------------
>
>                 Key: CASSANDRA-580
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-580
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core
>         Environment: N/A
>            Reporter: Kelvin Kakugawa
>            Assignee: Kelvin Kakugawa
>         Attachments: 580-interface-1-add-vector-clock.diff, 
> 580-interface-2-add-vector-clock.diff, 580-thrift-v3.patch, 
> 580-thrift-v4.patch, 580-thrift-v5.patch
>
>   Original Estimate: 672h
>  Remaining Estimate: 672h
>
> Allow a ColumnFamily to be versioned via vector clocks, instead of long 
> timestamps.  Purpose: enable incr/decr; flexible conflict resolution.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
