Hi Robert,
> 1. What are your thoughts on the basic multiplexing implementation
> as done in this prototype?
It's an interesting approach and would allow storing data
in backends optimized for the usage patterns of specific subtrees.
As you already noted, the big challenge is how to deal with
consistency rules imposed by commit hooks and other subsystems
when part of the tree is shared and another part is local.
So far my view on this topic is: this is possible, but only with
a number of limitations. Mounted trees must not contribute to
indexes defined in the 'root' store. This implies that mounted trees
must not contain referenceable nodes, which in turn implies that
mounted trees must not contain versionable nodes, because those are
by definition referenceable. At the same time, this also avoids the
problem of the global version store.
Mounted trees also must not have access control entries, because
those would need to be reflected in the global persistent store.
In short, the mounted trees must be entirely self-contained.
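To make the limitations above concrete, here is a rough sketch of a check that flags nodes in a mounted tree that would break the "self-contained" rule. It is only an illustration: MountConstraintsSketch and its methods are hypothetical names, not Oak API, and I am assuming that the mixins mix:referenceable, mix:versionable and rep:AccessControllable are a good enough marker for the three cases. It does not cover the index restriction.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch, not part of Oak: flags mixins that a mounted tree should not carry.
final class MountConstraintsSketch {

    // Assumption: these mixins stand for referenceable, versionable and
    // access-controlled nodes respectively.
    static final Set<String> FORBIDDEN_MIXINS = new HashSet<>(Arrays.asList(
            "mix:referenceable", "mix:versionable", "rep:AccessControllable"));

    /** True if path is the mount point itself or a descendant of it. */
    static boolean isUnderMount(String path, String mountPath) {
        return path.equals(mountPath) || path.startsWith(mountPath + "/");
    }

    /** Returns one message per forbidden mixin; empty if the node is outside all mounts. */
    static List<String> violations(String path, Set<String> mixins, List<String> mountPaths) {
        List<String> result = new ArrayList<>();
        boolean mounted = false;
        for (String mountPath : mountPaths) {
            if (isUnderMount(path, mountPath)) {
                mounted = true;
                break;
            }
        }
        if (!mounted) {
            return result; // nodes in the root store are not restricted
        }
        // Sorted iteration keeps the message order stable.
        for (String mixin : new TreeSet<>(mixins)) {
            if (FORBIDDEN_MIXINS.contains(mixin)) {
                result.add(path + " must not have mixin " + mixin);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> mounts = Arrays.asList("/tmp");
        Set<String> mixins = new HashSet<>(Arrays.asList("mix:versionable"));
        System.out.println(violations("/tmp/upload", mixins, mounts));
        System.out.println(violations("/content/page", mixins, mounts));
    }
}
```

In a real implementation such a check would more likely live in a commit hook or validator, but where exactly is an open question.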
> 2. Do you have any hints on where I should start debugging the
> error with the missing Tree in the DocumentNodeStore test [6]?
I haven't looked at the code yet, but I assume the problem could be
caused by the DocumentStore reference in NodeDocument.
A NodeDocument keeps a reference to the DocumentStore it
was loaded from and uses this store to read other NodeDocuments,
e.g. to find the commit status of changes in the current
NodeDocument. This is probably the case here: the NodeDocument
created below /tmp references a commit root document in the
other DocumentStore, which cannot be accessed using the local
DocumentStore.
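If that guess is right, the failure can be reproduced with a very small model. The sketch below is not Oak code: Store, MapStore, RoutingStore and Doc are hypothetical stand-ins, and I am assuming that the "0" in the _commitRoot entry of 1:/tmp points at the root document 0:/. A document loaded from the private store cannot see 0:/ and therefore treats its change as not committed, while the same lookup through a routing store succeeds.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified model; none of these types are Oak classes.
final class CommitRootLookupSketch {

    /** Minimal store: document id -> (revision -> commit marker). */
    interface Store {
        Map<String, String> revisionsOf(String id);
    }

    /** One backing collection, e.g. "nodes" or "private_nodes". */
    static final class MapStore implements Store {
        final Map<String, Map<String, String>> docs = new HashMap<>();

        @Override
        public Map<String, String> revisionsOf(String id) {
            return docs.get(id);
        }
    }

    /** Sends ids at or below the mount path to the mounted store, the rest to root. */
    static final class RoutingStore implements Store {
        private final Store root;
        private final Store mounted;
        private final String mountPath;

        RoutingStore(Store root, Store mounted, String mountPath) {
            this.root = root;
            this.mounted = mounted;
            this.mountPath = mountPath;
        }

        @Override
        public Map<String, String> revisionsOf(String id) {
            String path = id.substring(id.indexOf(':') + 1); // "1:/tmp" -> "/tmp"
            boolean inMount = path.equals(mountPath) || path.startsWith(mountPath + "/");
            return (inMount ? mounted : root).revisionsOf(id);
        }
    }

    /** A document that remembers the store it was loaded from. */
    static final class Doc {
        private final Store loadedFrom;
        private final String commitRootId;

        Doc(Store loadedFrom, String commitRootId) {
            this.loadedFrom = loadedFrom;
            this.commitRootId = commitRootId;
        }

        /** A change counts as committed only if the commit root marks the revision "c". */
        boolean isCommitted(String revision) {
            Map<String, String> revisions = loadedFrom.revisionsOf(commitRootId);
            return revisions != null && "c".equals(revisions.get(revision));
        }
    }

    public static void main(String[] args) {
        String rev = "r14e6da2f6df-0-1";

        MapStore nodes = new MapStore();
        Map<String, String> rootRevisions = new HashMap<>();
        rootRevisions.put(rev, "c");
        nodes.docs.put("0:/", rootRevisions);

        MapStore privateNodes = new MapStore();
        privateNodes.docs.put("1:/tmp", new HashMap<>());

        Store multiplexed = new RoutingStore(nodes, privateNodes, "/tmp");

        // Loaded from the private store only: the commit root 0:/ is not visible.
        System.out.println(new Doc(privateNodes, "0:/").isCommitted(rev));
        // Loaded through the routing store: the commit root is found.
        System.out.println(new Doc(multiplexed, "0:/").isCommitted(rev));
    }
}
```

This prints false and then true. One thing to check in the prototype would therefore be which DocumentStore instance ends up referenced by documents read from the mounted store.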
Regards
Marcel
On 08/07/15 15:27, "Robert Munteanu" wrote:
>Hi,
>
>I am working on a prototype to multiplex multiple DocumentStore
>instances behind a single DocumentStore. The prototype is advanced
>enough to start a discussion, and I also have some bugs to track down
>that someone with more knowledge of Oak internals could probably
>explain much more easily.
>
>== Use case and high-level approach ==
>
>The scenario for this multiplexing is the following:
>
>- multiple Oak instances configured using a DocumentNodeStore
>- all DocumentNodeStore instances connect to the same physical backend,
>e.g. a mongod/mongos instance
>- each Oak instance needs a private area that is not shared with the
>other instances (e.g. /tmp)
>
>The concept is similar to Unix filesystem mounts managed in /etc/fstab.
>A 'root' store manages the whole repository, while at certain points
>other sub-stores take over.
>
>An example configuration can be:
>
>/ <- root store
> /apps <- sub-store 1
> /libs <- sub-store 1
> /tmp <- sub-store 2
>
>== What works ==
>
>I have created a proof-of-concept implementation [1]. It's probably not
>as fast as it could be, but seems to work at the DocumentStore level.
>The key pieces are:
>
>- added a MultiplexingDocumentStore [2] which wraps two or more
>DocumentStore instances
>- allowed the MongoDocumentStore to prefix collection names [3]
>- updated the DocumentMK.Builder [4] to allow configuring a MongoDB
>backend with mounts
>
>This works fine at the DocumentStore level, as shown by the
>MultiplexingDocumentStoreTest [5].
>
>== What does not work ==
>
>I seem to have missed something as the implementation does not work as
>expected at the DocumentNodeStore level. I have written a test case [6]
>which creates and saves a Tree in a DocumentNodeStore backed by a
>MultiplexingDocumentStore.
>
>A sub-store is mounted at /tmp, and I create two trees in my test:
>
>- one at /content
>- one at /tmp
>
>The write succeeds, but when trying to retrieve the trees, the one at
>/content is found, but the one at /tmp is not...
>
>The MongoDB collections appear to have the right data:
>
>- nodes (corresponding to the root store) holds
>
>{
>  "_id" : "0:/",
>  "_revisions" : {
>    "r14e6da2e8e0-0-1" : "c",
>    "r14e6da2f6df-0-1" : "c"
>  },
>  "_modified" : NumberLong(1436358470),
>  "_deleted" : {
>    "r14e6da2e8e0-0-1" : "false"
>  },
>  "_modCount" : NumberLong(4),
>  "_lastRev" : {
>    "r0-0-1" : "r14e6da2e8e0-0-1"
>  },
>  "_children" : true,
>  "_commitRoot" : {
>  }
>}
>{
>  "_id" : "1:/content",
>  "_modified" : NumberLong(1436358470),
>  "_commitRoot" : {
>    "r14e6da2f6df-0-1" : "0"
>  },
>  "_deleted" : {
>    "r14e6da2f6df-0-1" : "false"
>  },
>  "_modCount" : NumberLong(1)
>}
>
>- private_nodes (corresponding to the store mounted at /tmp) holds
>
>{
>  "_id" : "1:/tmp",
>  "_modified" : NumberLong(1436358470),
>  "_commitRoot" : {
>    "r14e6da2f6df-0-1" : "0"
>  },
>  "_deleted" : {
>    "r14e6da2f6df-0-1" : "false"
>  },
>  "_modCount" : NumberLong(1)
>}
>
>== What is not expected to work now ==
>
>A number of Oak subsystems (ACLs, indexing, etc.) need to be adapted
>for this to be fully usable. This is acknowledged but needs to be
>handled separately, after I get the basic implementation right.
>
>To wrap up the email, two questions:
>
>1. What are your thoughts on the basic multiplexing implementation as
>done in this prototype?
>2. Do you have any hints on where I should start debugging the error
>with the missing Tree in the DocumentNodeStore test [6]?
>
>Thanks,
>
>Robert
>
>
>[1]: https://github.com/apache/jackrabbit-oak/compare/apache:trunk...rombert:features/docstore-multiplex?expand=1
>[2]: https://github.com/apache/jackrabbit-oak/compare/apache:trunk...rombert:features/docstore-multiplex?expand=1#diff-2
>[3]: https://github.com/apache/jackrabbit-oak/compare/apache:trunk...rombert:features/docstore-multiplex?expand=1#diff-3
>[4]: https://github.com/apache/jackrabbit-oak/compare/apache:trunk...rombert:features/docstore-multiplex?expand=1#diff-1
>[5]: https://github.com/apache/jackrabbit-oak/compare/apache:trunk...rombert:features/docstore-multiplex?expand=1#diff-5
>[6]: https://github.com/apache/jackrabbit-oak/compare/apache:trunk...rombert:features/docstore-multiplex?expand=1#diff-6