Hi,

With respect to Oak data stores, this is something I am hoping to support
later this year after the implementation of the CompositeDataStore (which
I'm still working on).

First, the assumption is that there would be a working CompositeDataStore
that can manage multiple data stores, and can select a data store for a
blob based on something like a JCR property (I'm still figuring this part
out).  In such a case, it would be possible to add a property to blobs that
can be archived, and then the CompositeDataStore could store them in a
different location - think AWS Glacier if there were a Glacier-compatible
data store.  Of course this would require that we also support an access
pattern in Oak where Oak knows that a blob can be retrieved but cannot
reply to a request with the requested blob immediately.  Instead Oak would
have to give a response indicating "I can get it, but it will take a while"
and suggest when it might be available.
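
To make that access pattern a bit more concrete, here is a rough Java
sketch. It is only an illustration under my own assumptions: none of these
types exist in Oak, the "archive" property name is made up, and a plain
string map stands in for JCR properties. It shows a composite store routing
a blob to a delegate based on a property, and a read that returns either
the bytes or a "retry after" hint.

```java
import java.time.Duration;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; these are NOT Oak classes or Oak APIs.
public class CompositeDataStoreSketch {

    /** Result of a read: either the bytes, or a hint for when to retry. */
    static final class BlobResult {
        final byte[] data;          // null while the blob is not yet available
        final Duration retryAfter;  // null when the data is ready

        private BlobResult(byte[] data, Duration retryAfter) {
            this.data = data;
            this.retryAfter = retryAfter;
        }
        static BlobResult ready(byte[] data) { return new BlobResult(data, null); }
        static BlobResult pending(Duration retryAfter) { return new BlobResult(null, retryAfter); }
        boolean isReady() { return data != null; }
    }

    /** Minimal delegate contract assumed for this sketch. */
    interface DelegateStore {
        void put(String blobId, byte[] data);
        BlobResult get(String blobId);
    }

    /** Delegate that can answer immediately (think local disk). */
    static final class ImmediateStore implements DelegateStore {
        private final Map<String, byte[]> blobs = new HashMap<>();
        public void put(String blobId, byte[] data) { blobs.put(blobId, data); }
        public BlobResult get(String blobId) { return BlobResult.ready(blobs.get(blobId)); }
    }

    /** Delegate that can only promise the blob later (think Glacier-like storage). */
    static final class ArchiveStore implements DelegateStore {
        private final Map<String, byte[]> blobs = new HashMap<>();
        private final Duration retrievalDelay;
        ArchiveStore(Duration retrievalDelay) { this.retrievalDelay = retrievalDelay; }
        public void put(String blobId, byte[] data) { blobs.put(blobId, data); }
        // A real store would start a restore job here; the sketch only reports the delay.
        public BlobResult get(String blobId) { return BlobResult.pending(retrievalDelay); }
    }

    /** Picks a delegate per blob based on a (hypothetical) "archive" property. */
    static final class SketchCompositeStore {
        private final DelegateStore standard;
        private final DelegateStore archive;
        private final Map<String, DelegateStore> location = new HashMap<>();

        SketchCompositeStore(DelegateStore standard, DelegateStore archive) {
            this.standard = standard;
            this.archive = archive;
        }

        void put(String blobId, byte[] data, Map<String, String> properties) {
            DelegateStore target = "true".equals(properties.get("archive")) ? archive : standard;
            target.put(blobId, data);
            location.put(blobId, target); // remember where the blob went
        }

        BlobResult get(String blobId) {
            DelegateStore store = location.get(blobId);
            if (store == null) {
                throw new IllegalArgumentException("Unknown blob: " + blobId);
            }
            return store.get(blobId);
        }
    }

    public static void main(String[] args) {
        SketchCompositeStore store = new SketchCompositeStore(
                new ImmediateStore(), new ArchiveStore(Duration.ofHours(4)));
        Map<String, String> archived = new HashMap<>();
        archived.put("archive", "true");

        store.put("live", new byte[] {1, 2, 3}, new HashMap<>());
        store.put("old", new byte[] {4, 5, 6}, archived);

        System.out.println("live ready: " + store.get("live").isReady());
        BlobResult old = store.get("old");
        System.out.println("old ready: " + old.isReady() + ", retry after: " + old.retryAfter);
    }
}
```

The point of returning a result object instead of the raw bytes is that the
caller has to handle the "not yet" case explicitly, which is the part Oak
does not have an access pattern for today.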

That's just one example.  I believe once I figure out the
CompositeDataStore it will be able to support a lot of neat scenarios on
the blob store side of things anyway.

-MR

On Mon, Jun 26, 2017 at 2:22 AM, Davide Giannella <[email protected]> wrote:

> On 26/06/2017 09:00, Michael Dürig wrote:
> >
> > I agree we should have a better look at access patterns, not only for
> > indexing. I recently came across a repository with about 65% of its
> > content in the version store. That content is pretty much archived and
> > never accessed. Yet it fragments the index and thus impacts general
> > access times.
>
> I may say something stupid as usual, but here I can see for example that
> such content could be "moved to a slower repository". So for example
> speaking of segment, it could be stored in a compressed segment (rather
> than plain tar) and the repository could automatically configure
> the indexes to skip such parts and/or additionally create an ad-hoc index
> which could be async by definition, running every, let's say, 10 seconds.
>
> We would gain on the repository size and indexing speed.
>
> Just a couple of ideas off the top of my head.
>
> Davide
>
>
>
