Hi,

I’ve been thinking for the past few days about how a composite blob store might
go about prioritizing the delegate blob stores for reading and writing,
considering concepts like storage filters on a blob store, read-only blob
stores, and archive or “cold” blob stores (which we don’t currently have,
but could in the future).

Storage filters restrict what can be stored in a delegate, for example
allowing only blobs with a certain JCR property.  (I realize this has
implications of its own - I’ll cover those in a separate thread someday.)

I’d like feedback on the following idea:
- Create a new public interface in Oak that can be injected into the
composite blob store and used to handle the delegate prioritization for
reads and writes.
- Create a default implementation of this interface that can be used in
most cases (see below).

This would allow extensibility in this area to implement new or more custom
algorithms for any future use cases, as needed, without tying it to
configuration.
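
As a rough sketch of what such an interface and its injection might look
like (every name here - DelegateStore, DelegatePrioritizer,
CompositeStoreSketch - is hypothetical and does not exist in Oak today):

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a delegate blob store; not an existing Oak type.
interface DelegateStore {
    String name();
}

// Hypothetical strategy interface: returns the delegates in the order
// in which they should be tried for a given blob.
interface DelegatePrioritizer {
    List<DelegateStore> readOrder(List<DelegateStore> delegates, String blobId);
    List<DelegateStore> writeOrder(List<DelegateStore> delegates, String blobId);
}

// Sketch of a composite store that receives its prioritizer through the
// constructor, so the algorithm can be swapped without changing configuration.
final class CompositeStoreSketch {
    private final List<DelegateStore> delegates;
    private final DelegatePrioritizer prioritizer;

    CompositeStoreSketch(List<DelegateStore> delegates, DelegatePrioritizer prioritizer) {
        this.delegates = new ArrayList<>(delegates);
        this.prioritizer = prioritizer;
    }

    // Names of the delegates in the order a read of blobId would consult them.
    List<String> readPlan(String blobId) {
        return names(prioritizer.readOrder(delegates, blobId));
    }

    // Names of the delegates in the order a write of blobId would try them.
    List<String> writePlan(String blobId) {
        return names(prioritizer.writeOrder(delegates, blobId));
    }

    private static List<String> names(List<DelegateStore> ordered) {
        List<String> result = new ArrayList<>();
        for (DelegateStore d : ordered) {
            result.add(d.name());
        }
        return result;
    }
}
```

The composite store only asks the prioritizer for an ordering and never
inspects the delegates itself, which is what would keep the algorithm
replaceable.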

The default implementation would be basically this:
- For reads:
  - Delegates with storage filters first
  - Delegates without storage filters next
  - Read-only delegates next (with filters first, then without)
  - Retry reads on delegates with filters that were previously skipped
(this is a special case)
  - Cold storage delegates last

- For writes:
  - Search for an existing blob first using the “read” algorithm - always
update an existing blob, if one is found (except in cold storage)
  - If not found:
    - Try delegates with storage filters first
    - Delegates without storage filters next

The special case of retrying reads on delegates with filters that were
previously skipped is there to handle configuration changes.  Essentially,
if a blob is stored in a delegate blob store, and the configuration for
that delegate then changes so that the blob wouldn’t be stored there if it
were written now, we still want to be able to locate it during the time
between the configuration change and the point when some background curator
moves the blob to the correct location.
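
To make the retry concrete, here is a minimal sketch of a lookup that
skips delegates whose current filter rejects the blob and comes back to
them afterwards.  FilteredDelegate and RetryReadSketch are hypothetical
names, the filter is simplified to a predicate on the blob id (the filters
discussed above would more likely look at JCR properties), and the
read-only and cold tiers are left out to keep the focus on the retry.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Predicate;

// Hypothetical delegate with a storage filter and an in-memory set of blob ids.
final class FilteredDelegate {
    final String name;
    final Predicate<String> filter;             // current filter; may have changed since blobs were written
    final Set<String> stored = new HashSet<>();

    FilteredDelegate(String name, Predicate<String> filter) {
        this.name = name;
        this.filter = filter;
    }
}

final class RetryReadSketch {
    // Returns the name of the delegate that holds blobId, or null if none does.
    static String locate(List<FilteredDelegate> delegates, String blobId) {
        List<FilteredDelegate> skipped = new ArrayList<>();

        // First pass: only consult delegates whose current filter accepts the blob.
        for (FilteredDelegate d : delegates) {
            if (!d.filter.test(blobId)) {
                skipped.add(d);
                continue;
            }
            if (d.stored.contains(blobId)) {
                return d.name;
            }
        }

        // Retry pass: the blob may have been written before a filter changed
        // and not yet been moved, so check the skipped delegates as well.
        for (FilteredDelegate d : skipped) {
            if (d.stored.contains(blobId)) {
                return d.name;
            }
        }
        return null;
    }
}
```

For example, if an "images" delegate held blob "doc-1" before its filter
was narrowed to ids starting with "img-", locate() would still find the
blob there on the retry pass, after the delegates whose filters accept it
have been checked.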


So in short, I’d do the default implementation as described, but a
different implementation could be injected instead, if someone wanted a
more custom one.


WDYT?


-MR
