Hi,
Thanks for putting this together. I think this makes a lot of sense, primarily because it will reduce coupling with the underlying storage mechanism, but also because it proactively tackles our main pain points: write scalability, large and flat hierarchies, fragmentation and remotability.
I think we should give this a try and go forward with it, maybe within a time-boxed POC so we can better evaluate the overall impact and validate the idea.
On 29.1.13 9:10, Jukka Zitting wrote: [...]
Journals
========

Journals are special, atomically updated documents that record the state of the repository as a sequence of references to successive root node records. A small system could consist of just a single journal and would serialize all repository updates through atomic updates of that journal. A larger system that needs more write throughput can have more journals, linked to each other in a tree hierarchy. Commits to journals in lower levels of the tree can proceed concurrently, but will need to be periodically merged back to the root journal. Potential conflicts and resulting data loss or inconsistency caused by such merges can be avoided by always committing against the root journal.
Nice idea!! This puts clients in control of the trade-off between consistency and availability: they choose the "right" journal to commit to.
I think this approach has a lot of potential, which will only become fully apparent further down the line. In a Twitter-like application, different message streams will probably never conflict and never block each other, since they can simply commit to different journals. OTOH, applications that need strong consistency guarantees can just commit to the root journal.
Furthermore, it nicely generalises the branch and merge concept of oak-core, and it fits well with handling conflicts on branches as discussed earlier: http://markmail.org/message/wtaarmdtgyf5lvjt
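To make the journal tree a bit more concrete, here is a rough sketch in Python. Everything in it is hypothetical: the names (Journal, commit, merge_to_parent), the compare-and-set style of the atomic update, and the resolve callback used when the parent has moved on are assumptions for illustration, not part of the proposal or of oak-core. Root node records are represented by plain strings.

```python
import threading


class Journal:
    """Hypothetical journal: an atomically updated sequence of root record ids."""

    def __init__(self, parent=None, initial_root="r0"):
        self._lock = threading.Lock()
        self.parent = parent
        # A child journal starts from the current head of its parent.
        start = parent.head() if parent is not None else initial_root
        self._roots = [start]
        self._base = start  # parent head this journal was last in sync with

    def head(self):
        with self._lock:
            return self._roots[-1]

    def commit(self, expected_head, new_root):
        """Atomic update (assumed compare-and-set): fails if the head has moved."""
        with self._lock:
            if self._roots[-1] != expected_head:
                return False
            self._roots.append(new_root)
            return True

    def merge_to_parent(self, resolve):
        """Push local changes up one level; resolve(base, ours, theirs) is only
        called when the parent changed since the last sync."""
        if self.parent is None:
            raise ValueError("the root journal has no parent")
        with self._lock:  # block local commits while the merge is in progress
            ours = self._roots[-1]
            while True:
                theirs = self.parent.head()
                if theirs == self._base:
                    merged = ours  # parent unchanged: fast-forward
                else:
                    merged = resolve(self._base, ours, theirs)
                if self.parent.commit(theirs, merged):
                    break  # otherwise the parent moved again: retry
            if merged != ours:
                self._roots.append(merged)
            self._base = merged
            return merged


# Two message streams commit to separate journals and never block each other.
root = Journal()
stream_a = Journal(parent=root)
stream_b = Journal(parent=root)
stream_a.commit(stream_a.head(), "a1")
stream_b.commit(stream_b.head(), "b1")
stream_a.merge_to_parent(lambda base, ours, theirs: ours + "+" + theirs)
stream_b.merge_to_parent(lambda base, ours, theirs: ours + "+" + theirs)
```

A client that needs strong consistency would call commit on root directly and retry on failure; a client that prefers availability commits to a child journal and accepts that the merge may need the resolve step later.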
[...]
Node records
------------

The overall structure of the content tree is stored in node records. Node records hold the actual content structure of the repository. A typical node record consists of a template reference followed by property value references (list references for multivalued properties) and zero, one or more child node entries as indicated by the template. If the node has more than one child nodes, then those entries are stored as an array of name-node pairs of references.
Maybe we can even pick up an earlier idea and use the node type information (i.e. the template records here) to optimise how nodes are stored, that is, whether and which child nodes are inlined. Recursive node types (like nt:folder) are obviously bad candidates for full inlining. For non-recursive node types, however, the child node definitions might provide some valuable information about locality.
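As a rough illustration of how the recursion check could work, here is a small Python sketch. The function names and the CHILD_TYPES table are made up for this example, and the table is a simplified toy, not the real JCR child node definitions. The check only asks whether a type can contain itself through its child node definitions; a type that merely contains a recursive type is not flagged.

```python
def is_recursive(node_type, child_types):
    """True if node_type can reach itself by following child node definitions."""
    seen = set()
    stack = list(child_types.get(node_type, ()))
    while stack:
        current = stack.pop()
        if current == node_type:
            return True
        if current not in seen:
            seen.add(current)
            stack.extend(child_types.get(current, ()))
    return False


def inline_candidates(child_types):
    """Node types whose children could be considered for inlining."""
    return sorted(t for t in child_types if not is_recursive(t, child_types))


# Toy table (assumption): node type -> node types allowed as children.
CHILD_TYPES = {
    "nt:folder": ["nt:folder", "nt:file"],
    "nt:file": ["nt:resource"],
    "nt:resource": [],
}

print(inline_candidates(CHILD_TYPES))  # ['nt:file', 'nt:resource']
```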
Michael
