Re: Using BlobStore by default with SegmentNodeStore

2014-09-09 Thread Thomas Mueller
Hi,

In addition to, or instead of, using the BlobStore, we could store the Lucene
index on the file system (persistence = file, path = ...).
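
For illustration, a minimal sketch of how such an index definition might look,
assuming the oak-lucene persistence/path options referred to above; the index
node name and the file system path are just examples:

    import javax.jcr.Node;
    import javax.jcr.Session;

    public class FileBackedLuceneIndex {
        // Sketch: define a Lucene index whose files live on the local
        // file system instead of inside the segment store.
        public static void defineIndex(Session session) throws Exception {
            Node indexDef = session.getRootNode()
                    .getNode("oak:index")
                    .addNode("lucene", "oak:QueryIndexDefinition");
            indexDef.setProperty("type", "lucene");
            indexDef.setProperty("async", "async");
            // "file" instead of the default (index stored in the repository)
            indexDef.setProperty("persistence", "file");
            indexDef.setProperty("path", "/path/to/lucene-index");
            session.save();
        }
    }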

But I would probably only do that on a case-by-case basis. I think it
would reduce, but not solve, the compaction problem. Some numbers from a
test repository I have (not compacted):

* 7 million segments in 3 tar files, of which
* 4.3 million (146 GB) are data segments, and
* 2.7 million (187 GB) are binary segments.

For this case, using external blobs would at most reduce the repository
size by around 60% (187 GB of 333 GB; the remaining ~40%, the 146 GB of
data segments, would still be there). This change might make compaction
more efficient, but I'm not sure.

Regards,
Thomas




On 04/09/14 13:25, Chetan Mehrotra chetan.mehro...@gmail.com wrote:

[...]



Re: Using BlobStore by default with SegmentNodeStore

2014-09-09 Thread Chetan Mehrotra
I have updated OAK-2082 with the test run results. Looking at the results,
I think FDS does provide a benefit in terms of less storage space.

Putting the Lucene index on the file system provides the best storage
efficiency, but it would not work once we have TarMK failover implemented.

Chetan Mehrotra


On Tue, Sep 9, 2014 at 12:44 PM, Thomas Mueller muel...@adobe.com wrote:
[...]



Re: Using BlobStore by default with SegmentNodeStore

2014-09-05 Thread Michael Dürig



On 4.9.14 1:25, Chetan Mehrotra wrote:


Given that such repository growth is troublesome, it might be better if
we configure a BlobStore by default with SegmentNodeStore (or at least
for applications like AEM). This should reduce the rate of repository
growth due to


I'd leave the default as it is for Oak, as this has the beauty of
simplicity. We could just change it for applications where we know that
the inline storing of binaries is troublesome.


OTOH, in the longer term we should address the underlying issue and get
compaction to work properly. If changing the default helps us with that
(i.e. gives us some air to breathe and lets us gain additional
information), I'm all in favour of such a move.




1. De-duplication - BlobStore and DataStore (the current implementations)
implement de-duplication, so adding the same binary again would not cause
size growth.

2. Less fragmentation - As large binary content would no longer be part of
the data tar files, Blob GC would be able to reclaim its space. Currently,
if even one bulk segment in a data tar file is still referenced, cleanup
cannot remove that file; the space can only be reclaimed via compaction.


Do we have enough evidence backing those claims, or is this just what we
would reasonably expect? I.e. if we see that such a change would reduce
growth to an acceptable rate, +1. Otherwise let's gather that evidence ;-)


Michael


Using BlobStore by default with SegmentNodeStore

2014-09-04 Thread Chetan Mehrotra
Hi Team,

Currently SegmentNodeStore does not use a BlobStore by default and
stores the binary data within the data tar files. This has the following
benefits:

1. Simpler backup - The user just needs to back up the segmentstore directory.
2. No Blob GC - The RevisionGC also deletes the binary content, so a
separate Blob GC need not be performed.
3. Faster IO - The binary content is fetched via memory-mapped files and
hence might perform better than streamed IO.

However, of late we are seeing issues where the repository is not able to
reclaim space from deleted binary content as part of the normal cleanup,
and a full-scale compaction needs to be performed to reclaim the space.
Running compaction has issues of its own (see OAK-2045), and currently it
needs to be performed offline to get optimum results.
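
For reference, offline compaction can be sketched roughly like this
(assuming the Oak 1.0.x TarMK API; the FileStore constructor arguments and
method names may differ between versions):

    import java.io.File;
    import java.io.IOException;
    import org.apache.jackrabbit.oak.plugins.segment.file.FileStore;

    public class OfflineCompaction {
        public static void main(String[] args) throws IOException {
            // Open the segment store directly, with no repository running
            // on top of it (256 MB max tar size, memory mapping enabled).
            FileStore store = new FileStore(
                    new File("/path/to/segmentstore"), 256, true);
            try {
                store.compact();  // rewrite the head state into new segments
                store.cleanup();  // remove tar entries no longer referenced
            } finally {
                store.close();
            }
        }
    }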

In quite a few cases it has been seen that repository growth is mostly
due to Lucene index content changes, which lead to the creation of new
binary content and also cause fragmentation due to newer revisions.
Further, as the segment logic does not perform de-duplication, any change
in a Lucene index file would probably re-create the whole index file (this
still needs to be confirmed).

Given that such repository growth is troublesome, it might be better if
we configure a BlobStore by default with SegmentNodeStore (or at least
for applications like AEM). This should reduce the rate of repository
growth due to the following (a configuration sketch follows the list):

1. De-duplication - BlobStore and DataStore (the current implementations)
implement de-duplication, so adding the same binary again would not cause
size growth.

2. Less fragmentation - As large binary content would no longer be part of
the data tar files, Blob GC would be able to reclaim its space. Currently,
if even one bulk segment in a data tar file is still referenced, cleanup
cannot remove that file; the space can only be reclaimed via compaction.
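
For illustration, wiring a SegmentNodeStore to a FileDataStore-backed
BlobStore could look roughly like the sketch below (assuming the Oak 1.0
APIs; constructor signatures, the record-length threshold, and the paths
are illustrative and may differ between versions):

    import java.io.File;
    import javax.jcr.Repository;
    import org.apache.jackrabbit.core.data.FileDataStore;
    import org.apache.jackrabbit.oak.Oak;
    import org.apache.jackrabbit.oak.jcr.Jcr;
    import org.apache.jackrabbit.oak.plugins.blob.datastore.DataStoreBlobStore;
    import org.apache.jackrabbit.oak.plugins.segment.SegmentNodeStore;
    import org.apache.jackrabbit.oak.plugins.segment.file.FileStore;

    public class SegmentStoreWithBlobStore {
        public static void main(String[] args) throws Exception {
            // Binaries above the threshold go to the FileDataStore, which
            // names files by content hash - that is where the
            // de-duplication comes from.
            FileDataStore fds = new FileDataStore();
            fds.setPath("/path/to/datastore");
            fds.setMinRecordLength(4096); // smaller binaries stay inline
            fds.init(null);
            DataStoreBlobStore blobStore = new DataStoreBlobStore(fds);

            // Node records (and small binaries) stay in the segment tar
            // files; large binaries are only referenced from there.
            FileStore fileStore = new FileStore(
                    blobStore, new File("/path/to/segmentstore"), 256, true);
            SegmentNodeStore nodeStore = new SegmentNodeStore(fileStore);

            Repository repository = new Jcr(new Oak(nodeStore)).createRepository();
        }
    }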

Compared to the benefits mentioned initially:

1. Backup - The user needs to back up two folders.
2. Blob GC - It would need to be run separately.
3. Faster IO - That remains to be seen. For Lucene this can be mitigated
to an extent with the CopyOnReadDirectory support proposed in OAK-1724.

Further, we also get the benefit of being able to share the BlobStore
between multiple instances if required!

Thoughts?

Chetan Mehrotra


Re: Using BlobStore by default with SegmentNodeStore

2014-09-04 Thread Davide Giannella
On 04/09/2014 12:25, Chetan Mehrotra wrote:
 ... (supermegacut!)

 Thoughts?

Since you mentioned AEM: the JR2-based deployment already ships two
different directories, one for the repository/segments and one for the blobs.

Both AEM and JR2 users are used to running separate tasks for cleaning up
the blobs, IIRC.

So I'm in favour of having segment+blob as the default. My only concern
is the deployments already in place. We may need (if it does not exist
already) a process/tool for migrating between the two scenarios.

Davide