Re: Usecases around Binary handling in Oak

2016-07-26 Thread Bertrand Delacretaz
Hi Chetan,

On Wed, Jun 1, 2016 at 9:30 AM, Chetan Mehrotra wrote:
> ...To move forward on that I have tried to collect the various usecases at [2]
> which I have seen in the past

I've thought about adding an "adopt-a-binary" feature to Sling
recently, to allow it to serve existing (disk or cloud) binaries along
with those stored in Oak.

I think this might help for your use cases. I'm not sure yet whether to
implement it at the Sling level or in a custom BlobStore, but that's not
important at this point.

Here's how I envision this:

1. Client prepares a set of binaries and metadata on disk or on cloud
storage. This happens without interacting with Sling, to make it
easier to farm out the costly binary preparation / metadata extraction
etc.

2. Client POSTs those binaries and their metadata to Sling, but
instead of including the actual binaries it uploads small references
in a specific reference format that's conceptually like
SLINGREF:mystore:myfile.mp4, i.e. a constant prefix plus the binary's URI.

3. When serving such a binary, Sling recognizes the SLINGREF: prefix
(which needs to be made robust/unique) and dereferences it to get an
InputStream.
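
Just to make the idea concrete, here's a rough sketch of what the
dereferencing step could look like (the class and method names, and the
exact SLINGREF layout, are only assumptions for illustration, not an
existing Sling or Oak API):

  import java.io.IOException;
  import java.io.InputStream;
  import java.net.URI;

  /** Resolves "adopted" binaries stored as SLINGREF references. */
  public class AdoptedBinaryResolver {

      private static final String PREFIX = "SLINGREF:";

      /** True if the stored value is a reference rather than a real binary. */
      public boolean isReference(String value) {
          return value != null && value.startsWith(PREFIX);
      }

      /**
       * Dereferences e.g. SLINGREF:mystore:myfile.mp4 by asking the
       * external store for an InputStream of the URI after the prefix.
       */
      public InputStream resolve(String reference, ExternalStore store)
              throws IOException {
          if (!isReference(reference)) {
              throw new IllegalArgumentException("Not a SLINGREF reference: " + reference);
          }
          URI uri = URI.create(reference.substring(PREFIX.length()));
          return store.open(uri);
      }
  }

  /** Abstraction over the disk or cloud storage holding the actual bytes. */
  interface ExternalStore {
      InputStream open(URI uri) throws IOException;
  }

The interesting property is that the repository only ever sees the small
reference; whether the bytes live on disk or in cloud storage is entirely
up to whatever implements the store lookup.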

Of course this means fully delegating the management of those binaries
to external tools, though adopted binaries can later be replaced
transparently with actual non-adopted ones.

Also, if fine-grained access control is needed, it has to be
implemented by the URI resolvers that provide the actual binaries.

But in some cases I think using Sling/Oak as "just" a metadata
decorator for existing binaries might make a lot of sense.

Although this is a Sling idea and only in the back of my mind so far,
I wanted to mention it here as there are parallels with your use
cases.

-Bertrand




> [2] https://wiki.apache.org/jackrabbit/JCR%20Binary%20Usecase


Re: Are dumb segments dumb?

2016-07-26 Thread Francesco Mari
With my latest commits on this branch [1] I enabled every previously
ignored test, fixing them when needed. The only two exceptions are
RecordUsageAnalyserTest and SegmentSizeTest, which were simply deleted.
I also added a couple of tests to cover the cases that now work slightly
differently than before.

[1]: https://github.com/francescomari/jackrabbit-oak/tree/dumb

2016-07-25 17:48 GMT+02:00 Francesco Mari :
> It might be a variation in the process I tried. This shouldn't affect
> the statistics much anyway, given that the population sample is big
> enough in both cases.
>
> 2016-07-25 17:46 GMT+02:00 Michael Dürig :
>>
>> Interesting numbers. Most of them look as I would have expected, i.e. the
>> distributions in the dumb case are more regular (smaller std. dev, mean and
>> median closer to each other), bigger segment sizes, etc.
>>
>> What I don't understand is the total number of records. These numbers differ
>> greatly between current and dumb. Is this a test artefact (i.e. test not
>> reproducible) or are we missing out on something?
>>
>> Michael
>>
>>
>> On 25.7.16 4:01, Francesco Mari wrote:
>>>
>>> I put together some statistics [1] for the process I described above.
>>> The "dumb" variant requires more segments to store the same amount of
>>> data, because of the increased size of serialised record IDs. As you
>>> can see, the number of records per segment is definitely lower in the
>>> dumb variant.
>>>
>>> On the other hand, ignoring the growth of the segment ID reference table
>>> seems to be a good choice. As shown by the average segment size,
>>> dumb segments are usually fuller than their counterparts. Moreover, a
>>> lower standard deviation shows that it's more common to have full dumb
>>> segments.
>>>
>>> In addition, my analysis seems to have uncovered a bug. There are a
>>> lot of segments with no segment ID references and only one record,
>>> which is very likely to be the segment info. The flush thread writes
>>> the current segment buffer every 5 seconds, provided that the buffer
>>> is not empty. It turns out that a segment buffer is never empty, since
>>> it always contains at least one record. As such, we are currently
>>> leaking almost-empty segments every 5 seconds, which waste additional
>>> space on disk because of the padding required by the TAR format.
>>>
>>> [1]:
>>> https://docs.google.com/spreadsheets/d/1gXhmPsm4rDyHnle4TUh-mtB2HRtRyADXALARRFDh7z4/edit?usp=sharing
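
A minimal sketch of the flush behaviour described above, assuming a
simplified scheduler (the class names and the single "segment info"
record are illustrative, not the actual oak-segment code):

  import java.util.concurrent.Executors;
  import java.util.concurrent.ScheduledExecutorService;
  import java.util.concurrent.TimeUnit;

  // Simplified flush loop: every 5 seconds the current buffer is written
  // if it is "not empty". Since every buffer already holds the segment
  // info record, the check is always true, so an almost empty, padded
  // segment is written on every tick.
  class FlushLoop {

      private final SegmentBuffer buffer = new SegmentBuffer();

      void start() {
          ScheduledExecutorService scheduler =
                  Executors.newSingleThreadScheduledExecutor();
          scheduler.scheduleAtFixedRate(() -> {
              if (buffer.recordCount() > 0) { // never false: segment info is record #1
                  buffer.flushToDisk();       // leaks a nearly empty segment
              }
          }, 5, 5, TimeUnit.SECONDS);
      }
  }

  class SegmentBuffer {
      int recordCount() { return 1; /* at least the segment info */ }
      void flushToDisk() { /* write and pad to the TAR block size */ }
  }

A check for more than one record, or for actual writes since the last
flush, would avoid producing those padding-only segments.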
>>>
>>> 2016-07-25 10:05 GMT+02:00 Michael Dürig :


 Hi Jukka,

 Thanks for sharing your perspective and the historical background.

 I agree that repository size shouldn't be a primary concern. However, we
 have seen many repositories (especially with an external data store) where
 the content is extremely fine-grained. Much more than in an initial content
 installation of CQ (which I believe was one of the initial setups for
 collecting statistics). So we should at least understand the impact of the
 patch in various scenarios.

 My main concern is the cache footprint of node records. Those are made up
 of a list of record ids and would thus grow by a factor of 6 with the
 current patch.
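
For a rough sense of where that factor could come from (the byte counts
below are assumptions about the serialised record ID format, not figures
from this thread):

  // Illustrative arithmetic only; the sizes are assumptions.
  int shortRecordIdBytes = 1 + 2;   // segment reference index + record offset
  int fullRecordIdBytes  = 16 + 2;  // full segment UUID       + record offset
  int growthFactor = fullRecordIdBytes / shortRecordIdBytes;  // 18 / 3 = 6

So a node record, being essentially a list of record ids, would grow
roughly sixfold if every id carries the full segment UUID.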

 Locality is not so much of a concern here. I would expect it to actually
 improve as the patch gets rid of the 255 references limit of segments, a
 limit which in practical deployments leads to a degeneration of segment
 sizes (I regularly see median sizes below 5k). See OAK-2896 for some
 background on this.
 Furthermore, we already took a big step forward in improving locality in
 concurrent write scenarios when we introduced the SegmentBufferWriterPool.
 In essence: thread affinity for segments.

 We should probably look more carefully at the micro benchmarks. I
 guess we neglected this part a bit in the past. Unfortunately the CI
 infrastructure isn't making this easy for us... OTOH those benchmarks only
 tell you so much. Many of the problems we recently faced only surfaced at
 scale: huge repos, high concurrent load, many days of traffic.

 Michael





 On 23.7.16 12:34, Jukka Zitting wrote:
>
>
> Hi,
>
> Cool! I'm pretty sure there are various ways in which the format could
> be improved, as the original design was based mostly on intuition,
> guided somewhat by collected stats and the micro-benchmarks used to
> optimize common operations.
>
> Note though that the total size of the repository was not and probably
> shouldn't be a primary metric, since the size of a typical repository is
> governed mostly by binaries and string properties (though it's a good
> idea
> to make