Hi,

I am working on a prototype to multiplex multiple DocumentStore instances behind a single DocumentStore. The prototype is advanced enough to start a discussion, and I also have some bugs to track down, which someone with more knowledge of Oak internals could probably explain much more easily.

== Use case and high-level approach ==

The scenario for this multiplexing is the following:

- multiple Oak instances configured using a DocumentNodeStore
- all DocumentNodeStore instances connect to the same physical backend, e.g. a mongod/mongos instance
- each Oak instance needs a private area that is not shared with the other instances (e.g. /tmp)

The concept is similar to Unix filesystem mounts managed in /etc/fstab. A 'root' store manages the whole repository, while at certain points other sub-stores take over. An example configuration can be:

  /      <- root store
  /apps  <- sub-store 1
  /libs  <- sub-store 1
  /tmp   <- sub-store 2

== What works ==

I have created a proof-of-concept implementation [1]. It's probably not as fast as it could be, but it seems to work at the DocumentStore level. The key pieces are:

- added a MultiplexingDocumentStore [2] which wraps two or more DocumentStore instances
- allowed the MongoDocumentStore to prefix collection names [3]
- updated the DocumentMK.Builder [4] to allow configuring a MongoDB backend with mounts

This works fine at the DocumentStore level, as shown by the MultiplexingDocumentStoreTest [5].

== What does not work ==

I seem to have missed something, as the implementation does not work as expected at the DocumentNodeStore level. I have written a test case [6] which creates and saves a Tree in a DocumentNodeStore backed by a MultiplexingDocumentStore. A sub-store is mounted at /tmp, and I create two trees in my test:

- one at /content
- one at /tmp

The write succeeds, but when trying to retrieve the trees, the one at /content is found and the one at /tmp is not.
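
To make the expected routing concrete, here is a rough sketch of the lookup a mount-based multiplexer has to perform. It is an illustration only, not the code from [2]: the MountTable class and its mount/storeFor methods are hypothetical names, and it assumes document keys of the form depth:path (such as 1:/tmp) and picks the store with the longest matching mount path, falling back to the root store.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Hypothetical helper (not part of Oak): maps a document key to the
 * store mounted at the longest matching path, fstab-style.
 */
class MountTable<S> {

    private final S rootStore;
    // mount path -> store responsible for that subtree
    private final Map<String, S> mounts = new HashMap<>();

    MountTable(S rootStore) {
        this.rootStore = rootStore;
    }

    void mount(String path, S store) {
        mounts.put(path, store);
    }

    /** Assumes keys of the form "depth:path", e.g. "1:/tmp". */
    S storeFor(String key) {
        String path = key.substring(key.indexOf(':') + 1);
        S best = rootStore;
        int bestLength = -1;
        for (Map.Entry<String, S> entry : mounts.entrySet()) {
            String mountPath = entry.getKey();
            // Match the mount point itself or anything below it,
            // but not siblings that merely share a prefix (/tmpfoo).
            boolean matches = path.equals(mountPath)
                    || path.startsWith(mountPath + "/");
            if (matches && mountPath.length() > bestLength) {
                best = entry.getValue();
                bestLength = mountPath.length();
            }
        }
        return best;
    }
}
```

With a table created as new MountTable<>("root") and a single mount("/tmp", "private"), the keys 0:/ and 1:/content resolve to the root store, while 1:/tmp and 2:/tmp/a resolve to the private one.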

The MongoDB collections appear to hold the right data:

- nodes (corresponding to the root store) holds

  { "_id" : "0:/", "_revisions" : { "r14e6da2e8e0-0-1" : "c", "r14e6da2f6df-0-1" : "c" }, "_modified" : NumberLong(1436358470), "_deleted" : { "r14e6da2e8e0-0-1" : "false" }, "_modCount" : NumberLong(4), "_lastRev" : { "r0-0-1" : "r14e6da2e8e0-0-1" }, "_children" : true, "_commitRoot" : { } }
  { "_id" : "1:/content", "_modified" : NumberLong(1436358470), "_commitRoot" : { "r14e6da2f6df-0-1" : "0" }, "_deleted" : { "r14e6da2f6df-0-1" : "false" }, "_modCount" : NumberLong(1) }

- private_nodes (corresponding to the store mounted at /tmp) holds

  { "_id" : "1:/tmp", "_modified" : NumberLong(1436358470), "_commitRoot" : { "r14e6da2f6df-0-1" : "0" }, "_deleted" : { "r14e6da2f6df-0-1" : "false" }, "_modCount" : NumberLong(1) }

== What is not expected to work now ==

A number of Oak subsystems (ACLs, indexing, etc.) need to be adapted for this to be fully usable. This is acknowledged, but needs to be handled separately, after I get the basic implementation right.

To wrap up the email, two questions:

1. What are your thoughts on the basic multiplexing implementation as done in this prototype?
2. Do you have any hints on where I should start debugging the error with the missing Tree in the DocumentNodeStore test [6]?

Thanks,

Robert

[1]: https://github.com/apache/jackrabbit-oak/compare/apache:trunk...rombert:features/docstore-multiplex?expand=1
[2]: https://github.com/apache/jackrabbit-oak/compare/apache:trunk...rombert:features/docstore-multiplex?expand=1#diff-2
[3]: https://github.com/apache/jackrabbit-oak/compare/apache:trunk...rombert:features/docstore-multiplex?expand=1#diff-3
[4]: https://github.com/apache/jackrabbit-oak/compare/apache:trunk...rombert:features/docstore-multiplex?expand=1#diff-1
[5]: https://github.com/apache/jackrabbit-oak/compare/apache:trunk...rombert:features/docstore-multiplex?expand=1#diff-5
[6]: https://github.com/apache/jackrabbit-oak/compare/apache:trunk...rombert:features/docstore-multiplex?expand=1#diff-6