Hi Matt,

On Thu, Apr 20, 2017 at 11:50 PM, Matt Ryan <o...@mvryan.org> wrote:
> Oak would then be
> able to choose which data store to use based on a number of criteria, like
> file size, JCR path, node type, existence of a node property, a node
> property value, or other items, or a combination of items. In my thinking
> these are defined in configuration so the federated data store would know
> how to select which data store is used to store which binary.

This would need some more details. The way a binary gets written using the
JCR API is:

1. Code creates a Binary using ValueFactory, say by spooling the stream.
   By this time the binary has already been added to the DataStore.
2. The returned binary reference is then stored as part of the JCR Node by
   setting the passed Binary as a property value.

So making the storage of a Binary a function of the final Node would
require some more thought.

A federated store has 2 aspects:

1. Writing a binary - Destination store selection = f(node, path, user
   option)
2. Reading a binary - This would be simple, as the actual store information
   would be encoded within the blobId (like some url?), and the BlobStore to
   be used for reading would then be selected based on the scheme in the
   blobId.

Further, the current Blob related API is used in the following ways:

B1. Code logic dealing with blob creation - JCR ValueFactory,
    NodeStore#createBlob. These only work with the BlobStore API.
B2. Code logic dealing with BlobGC - It uses methods in
    GarbageCollectableBlobStore.

Amit added BlobStore#writeBlob(InputStream, BlobOption) as part of
OAK-5174. This can now be extended to support the federated use case.

One possible approach could be as below:

1. The setup would have multiple BlobStore service implementations
   registered.
2. These services would have a property "type" defined to indicate the
   scheme.
3. The setup would have a default BlobStore and multiple secondary stores.
4. Any code in #B1 above would be dealing with a FederatedBlobStore, aka
   the "master"/primary store.
5. The NodeStores would be bound to this "master" BlobStore.

FederatedBlobStore would use the default store for any Binary created via
NodeStore#createBlob. However, any call to
BlobStore#writeBlob(InputStream, BlobOption) would be passed to the other
stores, which can indicate whether they can handle the call or not. If yes,
they would return the Blob ID. We can also look into exposing the new
method as part of the NodeStore API.

OakValueFactory can then wrap the "context", i.e. path, node etc., as part
of BlobOption, which can then be used for store selection.

How this impacts the GC logic also needs to be thought about.

Chetan Mehrotra

PS: The above is more of a brain dump in thinking-out-loud mode :)
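
To make the approach above a bit more concrete, here is a rough, self-contained Java sketch of the two aspects: write-time selection based on a context, and read-time selection based on the scheme in the blob ID. Everything in it is an assumption made for illustration. SimpleBlobStore, InMemoryStore and FederatedStore are hypothetical types and not the Oak BlobStore API, the context is a plain Map instead of BlobOption, the "scheme:localId" blob ID format is invented, and the "/content/archive" path is made up.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

public class FederatedBlobStoreSketch {

    /** Hypothetical stand-in for a BlobStore service; this is NOT the Oak interface. */
    public interface SimpleBlobStore {
        String type(); // plays the role of the "type" service property (the scheme)
        /** Returns a store-local id, or null if this store declines the write. */
        String write(InputStream in, Map<String, String> context);
        byte[] read(String localId);
    }

    /** In-memory store that accepts a write only if its predicate matches the context. */
    public static class InMemoryStore implements SimpleBlobStore {
        private final String type;
        private final Predicate<Map<String, String>> accepts;
        private final Map<String, byte[]> blobs = new HashMap<>();

        public InMemoryStore(String type, Predicate<Map<String, String>> accepts) {
            this.type = type;
            this.accepts = accepts;
        }

        @Override public String type() { return type; }

        @Override public String write(InputStream in, Map<String, String> context) {
            // Decide before touching the stream, so a declined write leaves it unread.
            if (!accepts.test(context)) {
                return null;
            }
            String localId = Integer.toString(blobs.size());
            blobs.put(localId, readFully(in));
            return localId;
        }

        @Override public byte[] read(String localId) { return blobs.get(localId); }
    }

    /** The "master" store: offers writes to secondaries first, then falls back to the default. */
    public static class FederatedStore {
        private final SimpleBlobStore defaultStore; // assumed to accept every write
        private final List<SimpleBlobStore> secondaries;
        private final Map<String, SimpleBlobStore> byType = new HashMap<>();

        public FederatedStore(SimpleBlobStore defaultStore, SimpleBlobStore... secondaries) {
            this.defaultStore = defaultStore;
            this.secondaries = Arrays.asList(secondaries);
            byType.put(defaultStore.type(), defaultStore);
            for (SimpleBlobStore s : secondaries) {
                byType.put(s.type(), s);
            }
        }

        public String write(InputStream in, Map<String, String> context) {
            for (SimpleBlobStore s : secondaries) {
                String localId = s.write(in, context);
                if (localId != null) {
                    return s.type() + ":" + localId; // encode the scheme in the blob ID
                }
            }
            return defaultStore.type() + ":" + defaultStore.write(in, context);
        }

        public byte[] read(String blobId) {
            // Reading only needs the scheme prefix to find the right store.
            int sep = blobId.indexOf(':');
            SimpleBlobStore store = byType.get(blobId.substring(0, sep));
            return store.read(blobId.substring(sep + 1));
        }
    }

    static byte[] readFully(InputStream in) {
        try {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[8192];
            for (int n; (n = in.read(buf)) != -1; ) {
                out.write(buf, 0, n);
            }
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        InMemoryStore def = new InMemoryStore("default", ctx -> true);
        InMemoryStore archive = new InMemoryStore("archive",
                ctx -> ctx.getOrDefault("path", "").startsWith("/content/archive"));
        FederatedStore fed = new FederatedStore(def, archive);

        Map<String, String> ctx = new HashMap<>();
        ctx.put("path", "/content/archive/report.pdf");
        String id = fed.write(
                new ByteArrayInputStream("hello".getBytes(StandardCharsets.UTF_8)), ctx);
        System.out.println(id + " -> " + new String(fed.read(id), StandardCharsets.UTF_8));
    }
}
```

The sketch leaves out the hard parts mentioned above: the context is only known if the caller can supply it before the stream is spooled, and GC would have to walk every registered store rather than a single one.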