[ 
https://issues.apache.org/jira/browse/OAK-1312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15588423#comment-15588423
 ] 

Chetan Mehrotra edited comment on OAK-1312 at 10/19/16 11:04 AM:
-----------------------------------------------------------------

h4. Benchmark - Result with bundling enabled

Ran a benchmark using [script|^run-benchmark.sh]; the output is in 
[results|^benchmark-results.txt]. The script also dumps MongoDB stats, Metrics 
stats, etc. Results are also summarized 
[here|https://docs.google.com/spreadsheets/d/1lzwDjwS-HSL0WazYBx9Wtx2ZI3J-fGl-EJ08-rxdAE8]

{noformat}
+--------------+---+-----+-----+-----+------+------+-----+--------+---------+---------+------------+------------+----------+--------+--------+------------------------------+
| Fixtures     | C | min | 10% | 50% | 90%  | max  | N   | Reader | Mutator | Assets# | Mongo Doc# | Mongo Size | Idx Size | Find#  | Query# | Comment                      |
+--------------+---+-----+-----+-----+------+------+-----+--------+---------+---------+------------+------------+----------+--------+--------+------------------------------+
| Oak-Mongo-DS | 5 | 360 | 483 | 710 | 1509 | 2843 | 350 | 75251  | 2504    | 3680    | 56966      | 58         | 43       | 44387  | 2808   | #default                     |
| Oak-Mongo-DS | 5 | 346 | 477 | 787 | 1508 | 2498 | 336 | 41805  | 1798    | 3480    | 8710       | 36         | 5        | 5105   | 1906   | #bundling,ALL                |
| Oak-Mongo-DS | 5 | 312 | 469 | 746 | 1491 | 2630 | 339 | 67085  | 2268    | 3550    | 30162      | 58         | 22       | 26655  | 12008  | #bundling,EXCLUDE_RENDITIONS |
+--------------+---+-----+-----+-----+------+------+-----+--------+---------+---------+------------+------------+----------+--------+--------+------------------------------+

{noformat}

*Environment details*
{noformat}
$ uname -a
Linux chetanm-laptop 3.13.0-37-generic #64-Ubuntu SMP Mon Sep 22 21:28:38 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux

$ java -version
java version "1.8.0_66"
Java(TM) SE Runtime Environment (build 1.8.0_66-b17)
Java HotSpot(TM) 64-Bit Server VM (build 25.66-b17, mixed mode)

$ mongo -version
MongoDB shell version: 2.6.4
{noformat}

*Legend*
* Mongo Doc# - Number of Mongo documents across all collections
* Mongo Size - Size of the Mongo database (MB)
* Idx Size - Size of all indexes in Mongo (MB)
* ALL - Uses bundling pattern {{jcr:content, jcr:content/metadata, jcr:content/renditions/**}}
* EXCLUDE_RENDITIONS - Uses bundling pattern {{jcr:content, jcr:content/metadata}}
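
To make the difference between the two pattern sets concrete, here is a rough sketch of which relative paths each set would cover. This is not the Oak matcher. It assumes a plain name must match a path segment exactly and that {{**}} covers one or more remaining segments; the function names {{matches}} and {{is_bundled}} are made up for illustration.

```python
# Hypothetical sketch of bundling pattern matching; NOT the Oak implementation.
# Assumptions: a plain name matches one path segment exactly, and '**'
# matches one or more remaining segments (any depth below that point).

ALL = ["jcr:content", "jcr:content/metadata", "jcr:content/renditions/**"]
EXCLUDE_RENDITIONS = ["jcr:content", "jcr:content/metadata"]


def matches(pattern: str, rel_path: str) -> bool:
    """Return True if rel_path is covered by pattern under the assumed rules."""
    p_parts = pattern.split("/")
    r_parts = rel_path.split("/")
    for i, p in enumerate(p_parts):
        if p == "**":
            # At least one more segment must remain for '**' to match.
            return len(r_parts) > i
        if i >= len(r_parts) or p != r_parts[i]:
            return False
    # Without '**' the pattern must cover the whole path.
    return len(p_parts) == len(r_parts)


def is_bundled(patterns, rel_path: str) -> bool:
    """Return True if any pattern in the set covers rel_path."""
    return any(matches(p, rel_path) for p in patterns)


if __name__ == "__main__":
    path = "jcr:content/renditions/original/jcr:content"
    print("ALL:", is_bundled(ALL, path))
    print("EXCLUDE_RENDITIONS:", is_bundled(EXCLUDE_RENDITIONS, path))
```

Under these assumptions a rendition node sits in the same document as the asset with ALL, but stays a separate Mongo document with EXCLUDE_RENDITIONS.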


*Highlights*
* With ALL bundling there is a significant reduction in 
** Mongo docs - 56966 -> 8710
** Index size - 43 MB -> 5 MB
** Calls to Mongo for find - 44387 -> 5105
* But there is also a decrease in reads and writes
** Reads 75251 -> 41805
** Updates 2504 -> 1798
* Changing the bundling pattern to EXCLUDE_RENDITIONS recovers most of the reads (67085), at the cost of smaller storage savings (30162 docs, 22 MB index)

So bundling leads to very significant savings in Mongo-level storage. However, 
it has an adverse impact on reads and updates. 
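
For a quick sense of scale, the relative changes between the #default and #bundling,ALL rows can be computed as below. The numbers are copied from the table above; the helper {{pct_change}} and the dictionary keys are just illustrative names.

```python
# Relative change between the #default and #bundling,ALL rows of the table.
# Values are copied from the benchmark table; names are illustrative only.

default = {"docs": 56966, "idx_mb": 43, "find": 44387, "reads": 75251, "updates": 2504}
bundling_all = {"docs": 8710, "idx_mb": 5, "find": 5105, "reads": 41805, "updates": 1798}


def pct_change(before: float, after: float) -> float:
    """Percentage change from before to after (negative means a reduction)."""
    return (after - before) / before * 100.0


changes = {key: pct_change(default[key], bundling_all[key]) for key in default}

if __name__ == "__main__":
    for key, value in changes.items():
        print(f"{key}: {value:+.1f}%")
```

Storage-side metrics drop by well over 80%, while reads drop by a bit under half and updates by a bit over a quarter.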

*Next Steps*
* Merge current branch to trunk - As shown in the previous comment, there is no 
perf impact if bundling is disabled, so it is safe to merge in the disabled state
* Analyze why reads have dropped - Given that access should involve fewer 
remote calls, we need to see why reads are slower
* Benchmark in more real-world scenarios where the read access pattern is more 
realistic
* Benchmark on RDB - [~reschke] Can you run it against any DB setup you have 
once I have merged to trunk?
* Benchmark with Mongo 3.x - [~mreutegg] Can you try it against WiredTiger?

/cc [~mreutegg] [~catholicon] [~ianeboston] [~alexxx]  [~mmarth] [~tmueller]


> Bundle nodes into a document
> ----------------------------
>
>                 Key: OAK-1312
>                 URL: https://issues.apache.org/jira/browse/OAK-1312
>             Project: Jackrabbit Oak
>          Issue Type: Improvement
>          Components: core, documentmk
>            Reporter: Marcel Reutegger
>            Assignee: Chetan Mehrotra
>              Labels: performance
>             Fix For: 1.6
>
>         Attachments: OAK-1312-meta-prop-handling.patch, 
> OAK-1312-review-v1.diff, OAK-1312-review-v2.diff, benchmark-results.txt, 
> run-benchmark.sh
>
>
> For very fine grained content with many nodes and only few properties per 
> node it would be more efficient to bundle multiple nodes into a single 
> MongoDB document. Mostly reading would benefit because there are less 
> roundtrips to the backend. At the same time storage footprint would be lower 
> because metadata overhead is per document.
> Feature branch - 
> https://github.com/chetanmeh/jackrabbit-oak/compare/trunk...chetanmeh:OAK-1312



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
