Re: [jr3] clustering

Michael Dürig Thu, 01 Mar 2012 09:06:06 -0800

how does MVCC fit into this? multiple revisions of the same
JCR/MK node could be stored on a B-tree node. whenever
an update happens the garbage collection could kick in and
purge outdated revisions. providing a consistent journal across
all servers is not clear to me right now.


I think MVCC is not a problem as such. To the contrary, since it is
append only it should even be less problematic. IMO garbage collection
is an entirely different story and we shouldn't worry too much about it
until we have a good working model for clustering itself.

Wrt. the journal: isn't that just the list of versions of the root node?
This should be for free then. But I think I'm missing something here...


the model I have in mind doesn't have root node versions that
correspond to MK revisions. Is this mandated somehow by the MK
API design?

in my model only the nodes that changed get new revisions.
and reading from the tree at a given revision means that, for each
node, it will pick the latest revision of that node which is less
than or equal to the given revision.

e.g. if you have a node /a/b/c which was changed three times,
in revisions 2, 7, and 12, and a client reads at revision 9, the
implementation will return revision 7.
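
The lookup described above can be sketched as a floor search over a
node's own revision list. This is only a minimal illustration, not
anything from the MK code base: the NodeHistory class and its
write/read methods are hypothetical names, and it assumes revisions
for a node are appended in increasing order.

```python
from bisect import bisect_right


class NodeHistory:
    """Revisions of one node, stored only when the node itself changed.

    Hypothetical helper for illustration; not part of any MK API.
    """

    def __init__(self):
        self._revisions = []  # revision numbers, kept in increasing order
        self._states = []     # node state written at the matching revision

    def write(self, revision, state):
        # Assumption: append only, so revisions must strictly increase.
        if self._revisions and revision <= self._revisions[-1]:
            raise ValueError("revisions must increase")
        self._revisions.append(revision)
        self._states.append(state)

    def read(self, revision):
        # Pick the latest stored revision <= the requested revision.
        i = bisect_right(self._revisions, revision)
        if i == 0:
            return None  # the node did not exist yet at that revision
        return self._revisions[i - 1], self._states[i - 1]


# /a/b/c changed in revisions 2, 7 and 12.
history = NodeHistory()
for rev in (2, 7, 12):
    history.write(rev, f"state@{rev}")
```

With this history, history.read(9) yields the entry written at
revision 7, and a read below revision 2 yields None.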

I don't see why the parent node needs to be updated
when a child node is added, removed or updated.

Hmm I see. I came up with a similar approach loooong time ago. Even
before the Microkernel. Anyway, I think the Microkernel API does not
mandate root node versions corresponding to revisions. In fact I think
the approach you are proposing will scale better wrt. write contention
on the root node since there is no need for writing a new root node on
every write operation. However, getting a consistent journal across
cluster nodes seems more difficult here as you said.
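
To illustrate why the journal is harder without root node versions:
the journal would have to be assembled from the change logs of all
servers. A rough sketch, assuming each server keeps its own log of
(revision, path) entries sorted by revision, and assuming revision
numbers are globally ordered and unique across servers (which is
exactly the hard part and is not solved here). The function name and
the sample logs are made up for illustration.

```python
from heapq import merge


def merged_journal(*server_logs):
    """Merge per-server change logs into one journal.

    Assumption: every log is already sorted by revision and revision
    numbers are comparable across servers.
    """
    return list(merge(*server_logs))


# Hypothetical logs of two servers holding different parts of the tree.
server_a = [(2, "/a/b/c"), (7, "/a/b/c"), (12, "/a/b/c")]
server_b = [(3, "/x"), (9, "/x/y")]

journal = merged_journal(server_a, server_b)
```

The merge itself is cheap; what it does not show is how to know that
every server has already reported all of its changes up to a given
revision, which is what a consistent journal would need.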

How does backup work? this is quite tricky because it is
difficult to get a consistent snapshot of the distributed
tree.


MVCC should make that easy: just make a backup of the head revision at
that time.


hmm, I'm not sure that will scale. consider a large repository
where traversing all nodes takes a long time.

I think backup should be supported at a lower level to be
efficient.


Hmm right, that makes sense.

Michael


e.g. something like what is proposed in section 4.9 of [0].

regards
  marcel

[0] http://cs.ucla.edu/~kohler/class/08w-dsi/aguilera07sinfonia.pdf
