Re: MongoMK^2 design proposal

Michael Dürig Tue, 29 Jan 2013 03:44:06 -0800

Hi,

Thanks for putting this together. I think this makes a lot of sense, primarily because it will reduce coupling with the underlying storage mechanism, but also because it proactively tackles our main pain points like write scalability, large and flat hierarchies, fragmentation and remotability.

I think we should give this a try and go forward with it, maybe within a time-boxed POC so we can better evaluate the overall impact and validate the idea.


On 29.1.13 9:10, Jukka Zitting wrote:

[...]

Journals
========

Journals are special, atomically updated documents that record the
state of the repository as a sequence of references to successive
root node records.

A small system could consist of just a single journal and would
serialize all repository updates through atomic updates of that journal.
A larger system that needs more write throughput can have more journals,
linked to each other in a tree hierarchy. Commits to journals in lower
levels of the tree can proceed concurrently, but will need to be
periodically merged back to the root journal. Potential conflicts and
resulting data loss or inconsistency caused by such merges can be avoided
by always committing against the root journal.

Nice idea! This puts clients in control of the trade-off between consistency and availability: they choose the "right" journal to commit to.

I think this approach has a lot of potential, which will only become fully apparent further down the line. In a Twitter-like application, different message streams will probably never conflict and never block each other, since they can just commit to different journals. OTOH, applications that need strong consistency guarantees can just commit to the root journal.

Furthermore, it nicely generalises the branch and merge concept of oak-core, and it fits well with handling conflicts on branches as discussed earlier: http://markmail.org/message/wtaarmdtgyf5lvjt
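
To make the journal tree a bit more concrete, here is a rough sketch of how I read it. All names (JournalSketch, Journal, commit, mergeToParent) are made up for illustration and are not part of the proposal. A plain String stands in for a reference to a root node record, and the merge is a fast-forward only: anything else is reported as a conflict that a real implementation would have to resolve.

```java
import java.util.concurrent.atomic.AtomicReference;

/** Illustrative sketch only; none of these names come from the proposal. */
final class JournalSketch {

    /** A journal tracks the latest root node record reference it knows about. */
    static final class Journal {
        private final Journal parent;                // null for the root journal
        private final AtomicReference<String> head;  // stand-in for a root record reference
        private volatile String base;                // parent's head at the last sync

        Journal(Journal parent, String initialHead) {
            this.parent = parent;
            this.head = new AtomicReference<>(initialHead);
            this.base = initialHead;
        }

        String head() {
            return head.get();
        }

        /**
         * Atomic update: succeeds only if the head is still the reference the
         * caller read earlier. compareAndSet compares by identity, so pass the
         * object previously returned by head().
         */
        boolean commit(String expected, String newRoot) {
            return head.compareAndSet(expected, newRoot);
        }

        /**
         * Fast-forward merge into the parent journal. Returns false when the
         * parent has moved on since the last sync, i.e. a potential conflict.
         * Assumption: merges of one journal are not run concurrently.
         */
        boolean mergeToParent() {
            if (parent == null) {
                return true; // the root journal has nothing to merge into
            }
            String mine = head.get();
            if (parent.commit(base, mine)) {
                base = mine;
                return true;
            }
            return false;
        }
    }
}
```

Two lower-level journals created from the same root head can both accept commits without blocking each other. The first mergeToParent() call fast-forwards the root, and the second returns false because the root has moved in the meantime. A client that cannot tolerate that outcome calls commit() on the root journal directly.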


[...]

Node records
------------

The overall structure of the content tree is stored in node records.
Node records hold the actual content structure of the repository.

A typical node record consists of a template reference followed by
property value references (list references for multivalued properties)
and zero, one or more child node entries as indicated by the template.
If the node has more than one child node, then those entries are stored
as an array of name-node pairs of references.

Maybe we can even pick up an earlier idea and use the node type information (i.e. template records here) to optimise how nodes are stored, that is, whether and which child nodes are inlined. Recursive node types (like nt:folder) are obviously bad candidates for full inlining. For other, non-recursive node types, the child node definitions might provide some valuable information about locality.
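
As a rough illustration of the recursion check, something along these lines might do. The class and method names are hypothetical, and the map from a node type name to the type names its child node definitions allow is an assumed, simplified input: it ignores subtyping and is not the real node type API.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Illustrative sketch only; not part of the proposal or of oak-core. */
final class InlineSketch {

    /**
     * Returns true if nodes of the given type can contain, directly or
     * transitively, child nodes of the same type. Such types are poor
     * candidates for fully inlining their children.
     *
     * childTypes maps a type name to the type names allowed by its child
     * node definitions (a simplified, assumed representation).
     */
    static boolean isRecursive(String type, Map<String, List<String>> childTypes) {
        Deque<String> todo = new ArrayDeque<>(
                childTypes.getOrDefault(type, Collections.emptyList()));
        Set<String> seen = new HashSet<>();
        while (!todo.isEmpty()) {
            String current = todo.pop();
            if (current.equals(type)) {
                return true; // found a path back to the starting type
            }
            if (seen.add(current)) {
                // first visit: follow this type's child definitions as well
                todo.addAll(childTypes.getOrDefault(current, Collections.emptyList()));
            }
        }
        return false;
    }
}
```

With a toy map in which nt:folder allows nt:folder and nt:file children while nt:file only allows nt:resource, isRecursive reports nt:folder as recursive and nt:file as not, so only the latter would be considered for inlining.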



Michael
