Hi,

With respect to Oak data stores, this is something I am hoping to support later this year, after the CompositeDataStore (which I'm still working on) is implemented.

First, the assumption is that there would be a working CompositeDataStore that can manage multiple data stores and can select a data store for a blob based on something like a JCR property (I'm still figuring this part out). In such a case, it would be possible to add a property to blobs that can be archived, and then the CompositeDataStore could store them in a different location. Think AWS Glacier, if there were a Glacier-compatible data store.

Of course, this would require Oak to also support an access pattern where it knows that a blob can be retrieved but cannot return it immediately. Instead, Oak would have to give a response indicating "I can get it, but it will take a while" and suggest when it might be available.

That's just one example. I believe that once I figure out the CompositeDataStore, it will be able to support a lot of neat scenarios on the blob store side of things.

-MR

On Mon, Jun 26, 2017 at 2:22 AM, Davide Giannella <[email protected]> wrote:

> On 26/06/2017 09:00, Michael Dürig wrote:
> >
> > I agree we should have a better look at access patterns, not only for
> > indexing. I recently came across a repository with about 65% of its
> > content in the version store. That content is pretty much archived and
> > never accessed. Yet it fragments the index and thus impacts general
> > access times.
>
> I may say something stupid as usual, but here I can see for example that
> such content could be "moved to a slower repository". So for example
> speaking of segment, it could be stored in a compressed segment (rather
> than plain tar) and the repository could either automatically configure
> the indexes to skip such part or/and additionally create an ad-hoc index
> which could async by definition every, let's say, 10 seconds.
>
> We would gain on the repository size and indexing speed.
>
> Just a couple of ideas off the top of my head.
>
> Davide
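
To make the delegation idea in the reply above a bit more concrete, here is a minimal sketch in Java. It is only an illustration under stated assumptions, not the CompositeDataStore design (which is still being worked out) and not Oak's API: the names `CompositeSketch`, `Composite`, `InMemoryDelegate`, `ReadResult`, the `archived` property, and the four-hour delay are all hypothetical. It shows a composite choosing a delegate from a blob property at write time, and a read that answers either with the data or with a "retry later" hint.

```java
import java.time.Duration;
import java.util.HashMap;
import java.util.Map;

// Illustrative only: none of these types are Oak APIs.
final class CompositeSketch {

    // Either the blob bytes, or a hint about when to ask again.
    static final class ReadResult {
        final byte[] data;         // null while the blob is not yet retrievable
        final Duration retryAfter; // null when the data is available now

        private ReadResult(byte[] data, Duration retryAfter) {
            this.data = data;
            this.retryAfter = retryAfter;
        }

        boolean isAvailable() {
            return data != null;
        }
    }

    // Hypothetical delegate backed by a map; a non-zero latency stands in
    // for slow archival storage.
    static final class InMemoryDelegate {
        private final Map<String, byte[]> blobs = new HashMap<>();
        private final Duration latency;

        InMemoryDelegate(Duration latency) {
            this.latency = latency;
        }

        void write(String blobId, byte[] data) {
            blobs.put(blobId, data);
        }

        ReadResult read(String blobId) {
            byte[] data = blobs.get(blobId);
            if (data == null) {
                throw new IllegalArgumentException("unknown blob: " + blobId);
            }
            // Slow stores answer "I can get it, but it will take a while".
            return latency.isZero()
                    ? new ReadResult(data, null)
                    : new ReadResult(null, latency);
        }
    }

    // Routes each blob to a delegate based on a (hypothetical) "archived" property.
    static final class Composite {
        private final InMemoryDelegate standard;
        private final InMemoryDelegate archive;
        private final Map<String, InMemoryDelegate> locations = new HashMap<>();

        Composite(InMemoryDelegate standard, InMemoryDelegate archive) {
            this.standard = standard;
            this.archive = archive;
        }

        void write(String blobId, byte[] data, Map<String, String> properties) {
            InMemoryDelegate target =
                    "true".equals(properties.get("archived")) ? archive : standard;
            target.write(blobId, data);
            locations.put(blobId, target); // remember where the blob went
        }

        ReadResult read(String blobId) {
            InMemoryDelegate delegate = locations.get(blobId);
            if (delegate == null) {
                throw new IllegalArgumentException("unknown blob: " + blobId);
            }
            return delegate.read(blobId);
        }
    }

    public static void main(String[] args) {
        Composite store = new Composite(
                new InMemoryDelegate(Duration.ZERO),
                new InMemoryDelegate(Duration.ofHours(4)));
        store.write("report", new byte[] {1, 2, 3}, Map.of());
        store.write("old-version", new byte[] {4, 5, 6}, Map.of("archived", "true"));

        for (String id : new String[] {"report", "old-version"}) {
            ReadResult result = store.read(id);
            System.out.println(id + ": " + (result.isAvailable()
                    ? result.data.length + " bytes available now"
                    : "retry after " + result.retryAfter));
        }
    }
}
```

Returning a result object rather than throwing keeps "not available yet" separate from "does not exist", which seems to be the distinction a Glacier-style delegate would need. How Oak itself would surface such a response to callers is an open question that this sketch does not try to answer.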
