[
https://issues.apache.org/jira/browse/CASSANDRA-580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12787092#action_12787092
]
Kelvin Kakugawa commented on CASSANDRA-580:
-------------------------------------------
I've been talking w/ the authors of the interval tree clocks (ITC) paper about
how to apply ITC to Cassandra, and it looks like we may need to modify the ITC
algorithm for our use case.
The crux of the matter is Cassandra's hinted hand-off feature. The ITC
algorithm composes an id-tree and event-tree to represent the version of a
given value. The id-tree is a nice way to create unique ids on-the-fly for any
node (by splitting the id-tree, as necessary) and the event-tree represents
causality. However, the problem is that for a node to update the event-tree
for a value, it has to be assigned a part of the id-tree beforehand.
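To make the id-tree part concrete, here is a minimal Python sketch of the
split rules as I understand them from the ITC paper. It is not Cassandra code;
the tuple encoding and the names split and owns_ids are just assumptions for
illustration. An id-tree is 0 (owns nothing), 1 (owns everything), or a
(left, right) pair.

```python
def split(i):
    """Split an id-tree into two disjoint id-trees (what a fork does to the id)."""
    if i == 0:
        return 0, 0
    if i == 1:
        return (1, 0), (0, 1)
    left, right = i
    if left == 0:
        # Only the right subtree owns anything, so split that.
        r1, r2 = split(right)
        return (0, r1), (0, r2)
    if right == 0:
        # Only the left subtree owns anything, so split that.
        l1, l2 = split(left)
        return (l1, 0), (l2, 0)
    # Both subtrees own something: hand one to each side.
    return (left, 0), (0, right)


def owns_ids(i):
    """True if the id-tree owns any part of the id space.

    Hypothetical helper: a node whose id is 0 is anonymous and
    has nothing to register an event-tree update against.
    """
    if i in (0, 1):
        return i == 1
    return owns_ids(i[0]) or owns_ids(i[1])


replica_a, replica_b = split(1)   # two replicas share the whole id space
hint_holder = 0                   # a node that was never given an id
```

Here owns_ids(replica_a) is true, but owns_ids(hint_holder) is false, which is
exactly the position a node is in when it has to hold a value it was never
forked an id for.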
A short example follows:
Suppose a node tries to forward a value but, because of a failure scenario, has
to store the value locally. It wouldn't be able to update the version of the
value unless it had already been assigned a part of the id-tree by the
set of nodes responsible for the value.
The authors have a couple of solutions:
1) Split the id-tree between all nodes in the cluster from the very start.
This solves the problem, but it negates the main benefit of ITC over
traditional version vectors, i.e. dynamically partitioning the id space at
run-time, and only to the extent necessary, to conserve space.
2) On client reads, do a "fork" instead of a "peek" and share the id-tree
w/ the client. However, this is a more complicated approach that may need to
be worked out some more.
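A rough Python sketch of what option 1 costs, under my own assumptions: the
tuple encoding (0, 1, or a (left, right) pair), the breadth-first splitting
order, and the names presplit and depth are all hypothetical, and this is not
what the patches attached to this issue do. It just pre-forks a seed id of 1
into one id per node and reports how deep the resulting id-trees get.

```python
from collections import deque


def split(i):
    """Split an id-tree into two disjoint id-trees (ITC split rules)."""
    if i == 0:
        return 0, 0
    if i == 1:
        return (1, 0), (0, 1)
    left, right = i
    if left == 0:
        r1, r2 = split(right)
        return (0, r1), (0, r2)
    if right == 0:
        l1, l2 = split(left)
        return (l1, 0), (l2, 0)
    return (left, 0), (0, right)


def presplit(n):
    """Pre-fork the seed id 1 into n disjoint ids, one per node (n >= 1).

    Always splits the oldest id in the queue, which keeps the trees balanced.
    """
    ids = deque([1])
    while len(ids) < n:
        a, b = split(ids.popleft())
        ids.append(a)
        ids.append(b)
    return list(ids)


def depth(i):
    """Depth of an id-tree; a rough proxy for its encoded size."""
    if i in (0, 1):
        return 0
    return 1 + max(depth(i[0]), depth(i[1]))


for cluster_size in (2, 8, 64):
    ids = presplit(cluster_size)
    print(cluster_size, max(depth(i) for i in ids))
```

With this balanced strategy the id depth grows roughly with log2 of the cluster
size, and every node carries an id whether or not it ever writes the value.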
In any case, since we're using an opaque context, these decisions won't affect
the interface. However, it's an interesting implementation concern. Depending
on the average size of a Cassandra cluster, it may or may not be worth
pre-forking the id-tree to all nodes from the very start.
> vector clock support
> --------------------
>
> Key: CASSANDRA-580
> URL: https://issues.apache.org/jira/browse/CASSANDRA-580
> Project: Cassandra
> Issue Type: New Feature
> Components: Core
> Environment: N/A
> Reporter: Kelvin Kakugawa
> Assignee: Kelvin Kakugawa
> Attachments: 580-interface-1-add-vector-clock.diff,
> 580-interface-2-add-vector-clock.diff, 580-thrift-v3.patch,
> 580-thrift-v4.patch, 580-thrift-v5.patch
>
> Original Estimate: 672h
> Remaining Estimate: 672h
>
> Allow a ColumnFamily to be versioned via vector clocks, instead of long
> timestamps. Purpose: enable incr/decr; flexible conflict resolution.