2017-07-26 17:38 GMT+02:00 Matt Ryan <[email protected]>:
> Hi Arek,
>
>
>> Regarding CompositeBlobStore -- what if customer changes the storage
>> rules in the meantime (refers also to Curation section). This will
>> result in the new layout for writes but binaries won't be read
>> correctly, am I right?
>> I guess this could be resolved by nesting: CompositeBlobStore with old
>> rules and CompositeBlobStore with new rules in OverlayBlobStore, but I
>> see that "curation" is for now deferred to other topic but still I
>> think this is quite important IMO to think about it a little upfront
>> at least for this blob store type.
>
>
> Agreed.  I’m just not sure how to go about it yet, open to ideas here. :)
>
> Perhaps a better approach would be to only have one type.  Use the overlay
> approach in Composite blob store, meaning reads will be tried to all
> delegates, but if rules are supplied we use those for priority before
> checking any other stores.  In that case, when configuration change happens
> the blob could still be found, but it would be suboptimal because it isn’t
> in the expected location.  The blob store could at that point start an async
> background job to move it to the correct location if needed.
>
> WDYT?
>

+1

Setting aside the solution and implementation for now and looking only
at the problem statement, I think the two are very similar. To me,
"composite" simply means more than one. What is currently called
Composite is really a RoutingDataStore (sorry for the name), whilst
OverlayDataStore is an OrderedDataStore to me.

Both represent the typical composite pattern [2], where the composite
object implements the same interface as the leaf, plus something that
allows leaves to be composed (without digging into whether that is an
ordered list, a map with one DS per criterion, etc.).
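
To make the pattern concrete, here is a minimal sketch. SimpleStore,
LeafStore and CompositeStore are names I made up for illustration; this
is not Oak's real DataStore/BlobStore API, and "writes go to the first
delegate" is just an assumption to keep the example short.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical, simplified interface; NOT Oak's real DataStore/BlobStore API.
interface SimpleStore {
    Optional<String> read(String id);
    void write(String id, String data);
}

// Leaf: a single in-memory store.
class LeafStore implements SimpleStore {
    private final Map<String, String> blobs = new HashMap<>();

    @Override
    public Optional<String> read(String id) {
        return Optional.ofNullable(blobs.get(id));
    }

    @Override
    public void write(String id, String data) {
        blobs.put(id, data);
    }
}

// Composite: same interface as the leaf, plus a way to compose children.
class CompositeStore implements SimpleStore {
    private final List<SimpleStore> delegates = new ArrayList<>();

    CompositeStore add(SimpleStore delegate) {
        delegates.add(delegate);
        return this;
    }

    @Override
    public Optional<String> read(String id) {
        // Ask each delegate in order; the first hit wins.
        for (SimpleStore delegate : delegates) {
            Optional<String> hit = delegate.read(id);
            if (hit.isPresent()) {
                return hit;
            }
        }
        return Optional.empty();
    }

    @Override
    public void write(String id, String data) {
        // Assumption for this sketch: writes always go to the first delegate.
        if (delegates.isEmpty()) {
            throw new IllegalStateException("no delegates configured");
        }
        delegates.get(0).write(id, data);
    }
}
```

Because CompositeStore is itself a SimpleStore, it can be added as a
delegate of another CompositeStore, which is the nesting idea mentioned
in the quoted question.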

It looks to me like the implementation will be very similar for both
RoutingDataStore and OrderedDataStore (sorry for the names again). In
each case we are routing reads and writes, and in both we need a
"migration" process to move binaries (whether online or offline), much
like the async indexing process we already have.

Integrating them makes sense from the caller's perspective because it
is simpler: one composite data store, which is still a DataStore. The
two differ only in the rules defined in configuration, so this leaves
us open to adding more rules in the future without creating another
DataStore type and then migrating between types, supplying new
configurations, etc. I'm also thinking of Oak users asking: how do I
decide which one to use? Do I need it? With a single composite you
don't have to decide; you just specify the rules you need and that's
all. It looks like a somewhat simpler solution.
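
A rough sketch of what "one composite, behaviour defined only by rules"
could look like. Everything here is hypothetical: the class name, the
plain Maps standing in for delegate stores, and especially the "hint"
string (think of a JCR path or property) that the rules are evaluated
against. With no rules configured it degrades to a plain ordered
lookup; with rules, the matching delegate is tried first and the rest
serve as fallback.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical sketch: a single composite whose behaviour depends only on rules.
// Plain maps stand in for the delegate stores.
class RuleBasedComposite {
    private final List<Map<String, String>> delegates = new ArrayList<>();
    // rules.get(i) guards delegates.get(i); null means "no rule, fallback only".
    private final List<Predicate<String>> rules = new ArrayList<>();

    RuleBasedComposite add(Map<String, String> delegate, Predicate<String> rule) {
        delegates.add(delegate);
        rules.add(rule);
        return this;
    }

    // Index of the first delegate whose rule matches the hint, or 0 if none does.
    int preferred(String hint) {
        for (int i = 0; i < rules.size(); i++) {
            Predicate<String> rule = rules.get(i);
            if (rule != null && rule.test(hint)) {
                return i;
            }
        }
        return 0;
    }

    // Writes follow the current rules (assumes at least one delegate was added).
    void write(String id, String hint, String data) {
        delegates.get(preferred(hint)).put(id, data);
    }

    // Reads try the rule-preferred delegate first, then every other delegate
    // in order, so blobs written under older rules are still found.
    String read(String id, String hint) {
        int first = preferred(hint);
        String hit = delegates.get(first).get(id);
        if (hit != null) {
            return hit;
        }
        for (int i = 0; i < delegates.size(); i++) {
            if (i == first) {
                continue;
            }
            hit = delegates.get(i).get(id);
            if (hit != null) {
                return hit;
            }
        }
        return null; // not found in any delegate
    }
}
```

If the rules are later replaced over the same delegates, reads for
binaries written under the old rules still succeed through the second
loop, only with one extra lookup.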

More importantly, it seems that one helps the other to reach a desired
state (with a temporary lookup overhead, obviously, but without
downtime).
I imagine that changing the rules always implies some temporary
overhead, no matter whether you do it dynamically or while restarting
the repository, and whether the migration to the new desired state runs
automatically or offline via an additional tool. For a large binary
repository you won't be able to reach that state immediately, unless
you already have the layout of Data Stores in place, which looks like
an edge case to me.

Maybe in fact only routing rules (and thus migration rules) under the
hood should be pluggable.

I'm in favour of a solution where a configuration change triggers an
automatic migration to the new desired layout of binaries, no matter
whether they are divided into fast|slow|slower,
important|archival|duplicated, etc., while the whole system in front of
the CompositeDataStore keeps working (maybe a bit slower for some
binaries where there is a MISS).

I guess for the current CompositeDataStore there could be a fallback
strategy such as strict=true|false, controlling whether only the
defined rules are taken into account, to cover the edge cases. But then
what do we do when the rules haven't changed but properties in the JCR
repo are changed outside our control (moving the binary synchronously
with the repo write)? Again, I prefer async migration under the hood:
a write may be followed by an immediate read, and with fallback the
binary will be retrieved correctly anyway, whether or not it has been
copied yet.
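
Roughly what I have in mind for strict=true|false plus async migration,
as a sketch only. MigratingComposite, the two Maps and the injected
Executor are all hypothetical stand-ins (in practice the maps would
need to be thread-safe, e.g. ConcurrentHashMap, and the copy would be a
real binary transfer).

```java
import java.util.Map;
import java.util.concurrent.Executor;

// Hypothetical sketch of a strict/lenient read path with background migration.
class MigratingComposite {
    private final Map<String, String> expected; // store the current rules point to
    private final Map<String, String> fallback; // store an older layout may have used
    private final boolean strict;               // true: only the rules count
    private final Executor migrator;            // runs the background copy

    MigratingComposite(Map<String, String> expected, Map<String, String> fallback,
                       boolean strict, Executor migrator) {
        this.expected = expected;
        this.fallback = fallback;
        this.strict = strict;
        this.migrator = migrator;
    }

    String read(String id) {
        String hit = expected.get(id);
        if (hit != null || strict) {
            // Found where the rules say it should be, or strict mode forbids
            // looking anywhere else (in which case this may be null).
            return hit;
        }
        String stale = fallback.get(id);
        if (stale != null) {
            // Serve the caller right away; move the blob in the background.
            migrator.execute(() -> {
                expected.putIfAbsent(id, stale);
                fallback.remove(id, stale); // remove only if still unchanged
            });
        }
        return stale;
    }
}
```

The point of the lenient path is that the caller gets the binary from
the first read onwards, and the copy to the expected location happens
whenever the executor gets to it.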

Please correct me if I went much further than I should have :)

Thanks,
Arek


[2] https://en.wikipedia.org/wiki/Composite_pattern
