Hi Arek, I like the idea generally. As was mentioned, some stores will require additional parameters not mentioned in the examples, e.g. S3DataStore will require an AWS access key, secret key, bucket name, etc.
As these configurations become more complex the chance for error increases, so we'd want to be sure detailed error messages are provided that not only flag a faulty configuration but also give hints as to what the problem is. The validation would need to detect possible conflicts or overlaps that can't be resolved from context or via implementation rules, as well as gaps (e.g. no default store defined), etc.

It may be worth considering the implementation of an oak-run mode that can scan the config and report status via stdout as a help for defining the nstab.

This seems like it would be really useful for the federated data store concept I'm working on, as you mentioned.

-MR

On Fri, Apr 28, 2017 at 4:56 AM, Arek Kita <[email protected]> wrote:

> Hi,
>
> I've noticed recently that with many different NodeStore
> implementation (Segment, Document, Multiplexing) but also DataStore
> implementation (File, S3, Azure) and some composite ones like
> (Hierarchical, Federated - that was already mentioned in [0]) it
> becomes more and more difficult to set up everything correctly and be
> able to know the current persistence state of repository (especially
> with pretty aged repos).
>
> Moreover, the configuration pattern that is based on individual PID of
> one service becomes problematic (i.e. recent change for
> SegmentNodeStoreService).
>
> From the operations and user perspective everything should be treated
> as a whole IMHO no matter which service handles which fragment of
> persistence layout. Oak should know itself how to "autowire" different
> parts, obviously with some hints and pointers from users as they want
> to run Oak in their own preferred layout.
>
> My proposal would be to integrate everything together to a pretty old
> concept called "fstab". For our purposes I would call it "nstab".
>
> This could look like [1] for the most simple case (with internal
> blobs), [2] for typical SegmentMK + FDS, [3] for SegmentMK + S3DS, [4]
> for MultiplexingNodeStore with some areas of repo set as read only. I
> think we could also model Hierarchical and Federated DataStores as
> well in the future.
>
> Examples are for illustration purposes but I guess such setup will
> help changing layout without a need to inspect many OSGi
> configurations in a current setup and making sure some conflicting
> ones aren't active.
>
> The schema is also similar to an UNIX-way of configuring filesystem so
> it will help Oak users to understand the layout (at least better than
> it is now). I see also advantage for automated tooling like
> oak-upgrade for complex cases in the future - user just provides
> source nstab and target nstab in order to migrate repository.
>
> The config should be also simpler avoiding things like customBlobStore
> (it will be inferred from context).
>
> WDYT? I have some thoughts how could this be implemented but first I
> would like to know your opinions on that.
>
> Thanks in advance for feedback!
> Arek
>
>
> [0] http://oak.markmail.org/thread/22dvuo6b7ab5ib7m
> [1] https://gist.githubusercontent.com/kitarek/f755dab6e889d1dfc5a1c595727f0171/raw/53d41ac7f935886783afd6c85d60e38e565a9259/nstab.1
> [2] https://gist.githubusercontent.com/kitarek/f755dab6e889d1dfc5a1c595727f0171/raw/53d41ac7f935886783afd6c85d60e38e565a9259/nstab.2
> [3] https://gist.githubusercontent.com/kitarek/f755dab6e889d1dfc5a1c595727f0171/raw/53d41ac7f935886783afd6c85d60e38e565a9259/nstab.3
> [4] https://gist.githubusercontent.com/kitarek/f755dab6e889d1dfc5a1c595727f0171/raw/53d41ac7f935886783afd6c85d60e38e565a9259/nstab.4
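
To make the validation idea from my reply a bit more concrete, here is a minimal sketch in Python (not oak-run code) of the kind of check such a scan mode could perform: missing required parameters, two stores claiming the same mount, and no default store. Everything about the format is an assumption made for illustration. The three-column line layout, the store type names, and the option names (accessKey, secretKey, bucket, path) are hypothetical and are not taken from the gists or from any Oak API.

```python
# Hypothetical nstab checker. The line format '<type> <mount> <key=value,...>'
# and every store type / option name below are illustrative assumptions.
REQUIRED_OPTIONS = {
    "segment": {"path"},
    "file": {"path"},
    "s3": {"accessKey", "secretKey", "bucket"},
}


def parse_nstab(text):
    """Turn the assumed nstab text into a list of entry dicts."""
    entries = []
    for lineno, raw in enumerate(text.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()  # strip comments, skip blank lines
        if not line:
            continue
        fields = line.split()
        options = {}
        if len(fields) > 2:
            for pair in fields[2].split(","):
                key, _, value = pair.partition("=")
                options[key] = value
        entries.append({
            "line": lineno,
            "type": fields[0],
            "mount": fields[1] if len(fields) > 1 else None,
            "options": options,
        })
    return entries


def validate(entries, default_mount="/"):
    """Return human-readable problems; an empty list means no issue was found."""
    problems = []
    seen = {}  # mount point -> line number that first defined it
    for entry in entries:
        where = f"line {entry['line']}"
        if entry["mount"] is None:
            problems.append(f"{where}: store '{entry['type']}' has no mount point")
            continue
        required = REQUIRED_OPTIONS.get(entry["type"])
        if required is None:
            problems.append(f"{where}: unknown store type '{entry['type']}'")
        else:
            missing = sorted(required - set(entry["options"]))
            if missing:
                problems.append(
                    f"{where}: store '{entry['type']}' is missing: {', '.join(missing)}"
                )
        if entry["mount"] in seen:
            problems.append(
                f"{where}: mount '{entry['mount']}' already defined on line "
                f"{seen[entry['mount']]}"
            )
        else:
            seen[entry["mount"]] = entry["line"]
    if default_mount not in seen:
        problems.append(f"no default store defined for '{default_mount}'")
    return problems


SAMPLE = """\
# type     mount    options (all names hypothetical)
segment    /        path=repository/segmentstore
s3         /blobs   bucket=oak-blobs,accessKey=AKIA-EXAMPLE
file       /blobs   path=repository/datastore
"""

if __name__ == "__main__":
    for problem in validate(parse_nstab(SAMPLE)):
        print(problem)
```

Each message carries the line number and names the specific missing option or the conflicting line, which is the "hint as to what the problem is" part. A real implementation would of course also need overlap rules for nested mounts, which this sketch does not attempt.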
