Hi Timothee,

I read the paper by Mark Shapiro only a couple of month ago while I was thinking about generalising the concept of atomic counters we started to introduce for Oak [1].

To see how these ideas would apply to Oak I POCed some of them (atomic counter, last writer wins, multi value register, atomic set). Those nicely show the capabilities and mechanisms of Oak for handling (and avoiding) various type of conflicts. I plan to do a session on this at this year's .adatpTo conference in Berlin [2].

On a more theoretical note it seems that most of Shapiro's data structures become much simpler on Oak. This is somewhat expected as Oak's consistency model is much stronger (sequential consistency) than the one used in Shapiros work. AFAIR he uses a gossip protocol where independent cluster nodes can receive updates from each other at any time.

OTOH CRDTs have their limitations. The requirements that in the state based style successive states form a semi-lattice where merge operations correspond to the least upper bound of the involved states is quite a strong one. Dually the operations based style requires operations to be commutative, which is a very strict requirement. The fact that such a basic data structures as a general set (with add and remove operations) cannot be implemented as a CRDT (because add and remove do not commute) shows how strict these requirements actually are. In a way this is what we intuitively knew all the time: if the order of operations doesn't matter applying incoming messages is trivial. Or alternatively if for any two states we can find a state that is the successor of both, we know how to merge the initial states. Shapiro's biggest contribution IMO is to state this so clearly in a rigorous and formal manner. This is a great basis for reasoning about existing data structures and about which bit of functionality to rip out to get consistency back.

Anyway, my implementations so far are pretty basic. Also I didn't bother with the more sophisticated data structures like graphs and maps. So this might be a good starting point for a student's project.

Michael


[1] https://issues.apache.org/jira/browse/OAK-2220
[2] https://wiki.day.com/content/wiki/Users/mduerig/Avoiding%20and%20dealing%20with%20conflicting%20updates%20in%20Oak.html



On 4.6.15 10:31 , Timothée Maret wrote:
Hi,

I stumbled upon the CRDT concept [0,1] of data types offering strong
eventual consistency.
I am thinking such data structures could be really useful for apps wanting
to manage huge collections (Set, Map, etc.) but can't afford to linearise
writes.

At Adobe we have such use cases for cross region login infrastructure or
group management applications.

AFAIK, the idea of CRDT is to offer data types that ultimately converge to
the the same value from all observers.
The data structure is structured in such a way that changes to its state
can be done in any order which avoid mineralization yet guarantee
consistency (ultimately).

Looking at Oak, it seems it offers the bases for implementing such CRDTs.
AFAIK, Oak could be tweaked in such a way that conflicts are automatically
resolved according to specific merging rules.

I have read the presentation in [2] and I was wondering if one of those
mode of conflict handling would help to implement CRDTs.
Knowing more about configuration of conflict handling was the intend of my
question in [4].

I have seen Riak [3] is supporting them and we could do the same for Oak
(release a library of CRDTs that leverages Oak).

The added value of this library would be to offer general structures that
are easy to reason about consistency wise and would be bug-free.

Since Oak is a tree, we could likely implement a CRDT Set nicely. From the
CRDT Set we could implement a Map, a graph, and so on.

Would it make sense to investigate CRDTs for Oak ? It may make a really
interesting student project (Google Summer of Code or such).

wdyt ?


Regards,

Timothee


[0] http://hal.upmc.fr/file/index/docid/555588/filename/techreport.pdf
[1] http://arxiv.org/pdf/1201.1784v1.pdf
[2] http://www.slideshare.net/MichaelDrig/oak-39377061?related=1
[3] http://docs.basho.com/riak/2.0.2/dev/using/data-types/
[4]
https://mail-archives.apache.org/mod_mbox/jackrabbit-oak-dev/201505.mbox/browser

  • CRDTs in Oak Timothée Maret
    • Re: CRDTs in Oak Michael Dürig

Reply via email to