2017-07-26 17:38 GMT+02:00 Matt Ryan <[email protected]>:
> Hi Arek,
>
>
>> Regarding CompositeBlobStore -- what if customer changes the storage
>> rules in the meantime (refers also to Curation section). This will
>> result in the new layout for writes but binaries won't be read
>> correctly, am I right?
>> I guess this could be resolved by nesting: CompositeBlobStore with old
>> rules and CompositeBlobStore with new rules in OverlayBlobStore, but I
>> see that "curation" is for now deferred to other topic but still I
>> think this is quite important IMO to think about it a little upfront
>> at least for this blob store type.
>
>
> Agreed. I’m just not sure how to go about it yet, open to ideas here. :)
>
> Perhaps a better approach would be to only have one type. Use the overlay
> approach in Composite blob store, meaning reads will be tried to all
> delegates, but if rules are supplied we use those for priority before
> checking any other stores. In that case, when configuration change happens
> the blob could still be found, but it would be suboptimal because it isn’t
> in the expected location. The blob store could at that point start an async
> background job to move it to the correct location if needed.
>
> WDYT?
>

+1

Without focusing yet on the solution or the implementation under the hood, but only on the problem statement, I think the two are very similar to each other. "Composite" means "more than one" to me. The current Composite is really a RoutingDataStore (sorry for the name), whilst OverlayDataStore is an OrderedDataStore to me. Both represent the typical composite pattern [2], where the composite object implements the same interface as a leaf, plus something that allows it to compose leaves (without digging into whether that is an ordered list, a map with a DS per criterion, etc.).

To me it looks like the implementation will be very similar for both RoutingDataStore and OrderedDataStore (sorry for the names again). In each case we are routing reads and writes, and in both we need a "migration" process to move binaries (whether online or offline), much like we have the async indexing process.

Integrating them makes sense from the caller's perspective because it is simpler: one composite data store which is still a DataStore. The two differ only in the rules defined in the configuration, so I think this leaves us open to adding more rules in the future without creating another DataStore type and then migrating between types, supplying new configurations, etc. I'm also thinking of Oak users asking "How do I decide which one to use? Do I need it at all?" With a single composite they don't have to decide; they just specify the rules they need and that's all. It looks like a somewhat simpler solution.

More importantly, it seems that one helps the other to reach the desired state (obviously with a temporary lookup overhead, but without downtime). I imagine that changing the rules, no matter whether you do it dynamically or while restarting the repository, and whether the migration is done automatically or offline via an additional tool, always implies a temporary overhead, because for a large binary repository you won't be able to reach the new desired state immediately...
...unless you already have a layout of Data Stores in place that matches the new rules, which looks like an edge case to me.

Maybe in fact only the routing rules (and thus the migration rules) under the hood should be pluggable.

I'm in favour of a solution where a configuration change lets binaries migrate automatically to the new desired layout, no matter whether they are divided into fast|slow|slower, important|archival|duplicated, etc., while in front of the CompositeDataStore the whole system keeps working (perhaps a bit slower for some binaries where there is a MISS).

I guess the current CompositeDataStore could have a fallback strategy like strict=true|false for the edge cases, where strict means that only the defined rules are taken into account. But then what do we do when the rules haven't changed but properties in the JCR repo are changed outside our control (which would mean moving the binary synchronously with the repository write)? Again I prefer async migration under the hood: after the write you might have an immediate read, and with the fallback the binary will be retrieved correctly anyway, no matter whether it has been copied yet or not.

Please correct me if I went much further than I should :)

Thanks,
Arek

[2] https://en.wikipedia.org/wiki/Composite_pattern
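
P.S. To make the "one composite with rules plus fallback" idea a bit more concrete, here is a rough sketch. It is only an illustration under my own assumptions, not the Oak or Jackrabbit DataStore API: SimpleStore, MemoryStore, CompositeStore, the rules expressed as predicates on a blob id, and the misplaced() list are all hypothetical names. The composite implements the same interface as its leaves, routes writes by the first matching rule, and on a read MISS falls back to the other delegates unless strict is set, remembering the id so an async migration job could move the blob later.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.function.Predicate;

// Hypothetical, simplified store interface; NOT the real Oak/Jackrabbit DataStore.
interface SimpleStore {
    void put(String id, byte[] data);
    Optional<byte[]> get(String id);
}

// Leaf: a trivial in-memory store, for illustration only.
class MemoryStore implements SimpleStore {
    private final Map<String, byte[]> blobs = new HashMap<>();

    @Override public void put(String id, byte[] data) { blobs.put(id, data); }

    @Override public Optional<byte[]> get(String id) {
        return Optional.ofNullable(blobs.get(id));
    }
}

// Composite: same interface as a leaf, plus an ordered list of routing rules.
class CompositeStore implements SimpleStore {
    // Assumption: a rule is just a predicate on the blob id.
    private static final class Rule {
        final Predicate<String> matches;
        final SimpleStore delegate;

        Rule(Predicate<String> matches, SimpleStore delegate) {
            this.matches = matches;
            this.delegate = delegate;
        }
    }

    private final List<Rule> rules = new ArrayList<>();
    private final List<String> misplaced = new ArrayList<>();
    private final boolean strict;

    CompositeStore(boolean strict) { this.strict = strict; }

    CompositeStore addRule(Predicate<String> matches, SimpleStore delegate) {
        rules.add(new Rule(matches, delegate));
        return this;
    }

    // Delegate of the first matching rule, or null if no rule matches.
    private SimpleStore preferred(String id) {
        for (Rule r : rules) {
            if (r.matches.test(id)) return r.delegate;
        }
        return null;
    }

    // Writes always go to the location the current rules ask for.
    @Override public void put(String id, byte[] data) {
        SimpleStore target = preferred(id);
        if (target == null) throw new IllegalStateException("no rule matches " + id);
        target.put(id, data);
    }

    // Reads try the expected location first, then fall back unless strict.
    @Override public Optional<byte[]> get(String id) {
        SimpleStore target = preferred(id);
        if (target != null) {
            Optional<byte[]> hit = target.get(id);
            if (hit.isPresent()) return hit;
        }
        if (strict) return Optional.empty();
        for (Rule r : rules) {
            if (r.delegate == target) continue; // already checked above
            Optional<byte[]> hit = r.delegate.get(id);
            if (hit.isPresent()) {
                misplaced.add(id); // candidate for a later async migration
                return hit;
            }
        }
        return Optional.empty();
    }

    // Ids that were found outside their expected location.
    List<String> misplaced() { return misplaced; }
}
```

With this shape, a rule change is just a new CompositeStore over the same delegates: a blob written under the old rules is still readable with strict=false (with the extra lookup on a MISS), while strict=true reports it as absent until it has been migrated.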
