[
https://issues.apache.org/jira/browse/OAK-1312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15588423#comment-15588423
]
Chetan Mehrotra edited comment on OAK-1312 at 10/19/16 11:04 AM:
-----------------------------------------------------------------
h4. Benchmark - Result with bundling enabled
Ran a benchmark using [script|^run-benchmark.sh] with
[results|^benchmark-results.txt]. The script also dumps MongoDB stats, Metrics
stats, etc. Results are also summarized
[here|https://docs.google.com/spreadsheets/d/1lzwDjwS-HSL0WazYBx9Wtx2ZI3J-fGl-EJ08-rxdAE8]
{noformat}
+--------------+---+-----+-----+-----+------+------+-----+--------+---------+---------+------------+------------+----------+--------+--------+------------------------------+
| Fixture      | C | min | 10% | 50% | 90%  | max  | N   | Reader | Mutator | Assets# | Mongo Doc# | Mongo Size | Idx Size | Find#  | Query# | Comment                      |
+--------------+---+-----+-----+-----+------+------+-----+--------+---------+---------+------------+------------+----------+--------+--------+------------------------------+
| Oak-Mongo-DS | 5 | 360 | 483 | 710 | 1509 | 2843 | 350 | 75251  | 2504    | 3680    | 56966      | 58         | 43       | 44387  | 2808   | #default                     |
| Oak-Mongo-DS | 5 | 346 | 477 | 787 | 1508 | 2498 | 336 | 41805  | 1798    | 3480    | 8710       | 36         | 5        | 5105   | 1906   | #bundling,ALL                |
| Oak-Mongo-DS | 5 | 312 | 469 | 746 | 1491 | 2630 | 339 | 67085  | 2268    | 3550    | 30162      | 58         | 22       | 26655  | 12008  | #bundling,EXCLUDE_RENDITIONS |
+--------------+---+-----+-----+-----+------+------+-----+--------+---------+---------+------------+------------+----------+--------+--------+------------------------------+
{noformat}
*Environment details*
{noformat}
$ uname -a
Linux chetanm-laptop 3.13.0-37-generic #64-Ubuntu SMP Mon Sep 22 21:28:38 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
$ java -version
java version "1.8.0_66"
Java(TM) SE Runtime Environment (build 1.8.0_66-b17)
Java HotSpot(TM) 64-Bit Server VM (build 25.66-b17, mixed mode)
$ mongo -version
MongoDB shell version: 2.6.4
{noformat}
*Legend*
* Mongo Doc# - number of Mongo documents across all collections
* Mongo Size - Size in MB of Mongo DB
* Idx Size - Size of all indexes in Mongo (MB)
* ALL - Uses bundling pattern {{jcr:content, jcr:content/metadata, jcr:content/renditions/**}}
* EXCLUDE_RENDITIONS - Uses bundling pattern {{jcr:content, jcr:content/metadata}}
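To make the two patterns concrete, here is a minimal sketch of how a path relative to the asset node could be checked against them. This is not the Oak matcher: the helper names ({{matches}}, {{is_bundled}}) and the sample paths are hypothetical, and it assumes that a trailing {{/**}} covers every descendant of the prefix but not the prefix node itself.

```python
# Illustrative sketch only; not the Oak implementation.
# Assumption: "prefix/**" matches any descendant of "prefix",
# but not "prefix" itself. Other entries must match exactly.

ALL = ["jcr:content", "jcr:content/metadata", "jcr:content/renditions/**"]
EXCLUDE_RENDITIONS = ["jcr:content", "jcr:content/metadata"]


def matches(pattern: str, rel_path: str) -> bool:
    """Check one pattern entry against a path relative to the asset node."""
    if pattern.endswith("/**"):
        prefix = pattern[: -len("/**")]
        return rel_path.startswith(prefix + "/")
    return rel_path == pattern


def is_bundled(patterns, rel_path: str) -> bool:
    """A relative path is bundled if any entry of the pattern list matches it."""
    return any(matches(p, rel_path) for p in patterns)


if __name__ == "__main__":
    # Hypothetical sample paths below an asset node.
    samples = [
        "jcr:content",
        "jcr:content/metadata",
        "jcr:content/renditions/original",
        "jcr:content/renditions/original/jcr:content",
    ]
    for path in samples:
        print(path, is_bundled(ALL, path), is_bundled(EXCLUDE_RENDITIONS, path))
```

Under these assumptions, the rendition nodes end up in the asset's document only with ALL; with EXCLUDE_RENDITIONS they remain separate Mongo documents.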
*Highlights*
* With ALL bundling there is a significant reduction in
** Mongo docs - 56966 -> 8710
** Index size (MB) - 43 -> 5
** Calls to Mongo for find - 44387 -> 5105
* BUT reads and writes also drop
** Reads 75251 -> 41805
** Updates 2504 -> 1798
* Changing the bundling pattern to EXCLUDE_RENDITIONS helps reads (67085) and updates (2268), at the cost of smaller storage savings
So bundling leads to very significant savings in Mongo-level storage. However,
it has some adverse impact on reads and updates.
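For a rough sense of scale, the relative changes between the #default and #bundling,ALL rows can be computed as below. The numbers are copied from the table; the helper {{pct_change}} and the dictionary layout are only illustrative.

```python
# Relative change of selected columns, #default vs #bundling,ALL.
# Values are taken from the benchmark table above.

default = {
    "Mongo Doc#": 56966,
    "Idx Size": 43,
    "Find#": 44387,
    "Reader": 75251,
    "Mutator": 2504,
}
bundling_all = {
    "Mongo Doc#": 8710,
    "Idx Size": 5,
    "Find#": 5105,
    "Reader": 41805,
    "Mutator": 1798,
}


def pct_change(before: float, after: float) -> float:
    """Signed percentage change from before to after."""
    return (after - before) / before * 100.0


changes = {k: pct_change(default[k], bundling_all[k]) for k in default}

if __name__ == "__main__":
    for name, value in changes.items():
        print(f"{name}: {value:+.1f}%")
```

That works out to roughly 85% fewer Mongo documents and 88% fewer find calls, against about 44% fewer reads and 28% fewer updates.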
*Next Steps*
* Merge current branch to trunk - As shown in the previous comment, if bundling
is disabled there is no perf impact, so it is safe in the disabled state
* Analyze why reads have dropped - Given that access should involve fewer
remote calls, we need to see why reads are slower
* Benchmark in more real-world scenarios where the read access pattern is more
realistic
* Benchmark on RDB - [~reschke] Can you run it against any DB setup you have
once I have done the merge to trunk?
* Benchmark with Mongo 3.x - [~mreutegg] Can you try it against WiredTiger?
/cc [~mreutegg] [~catholicon] [~ianeboston] [~alexxx] [~mmarth] [~tmueller]
> Bundle nodes into a document
> ----------------------------
>
> Key: OAK-1312
> URL: https://issues.apache.org/jira/browse/OAK-1312
> Project: Jackrabbit Oak
> Issue Type: Improvement
> Components: core, documentmk
> Reporter: Marcel Reutegger
> Assignee: Chetan Mehrotra
> Labels: performance
> Fix For: 1.6
>
> Attachments: OAK-1312-meta-prop-handling.patch,
> OAK-1312-review-v1.diff, OAK-1312-review-v2.diff, benchmark-results.txt,
> run-benchmark.sh
>
>
> For very fine-grained content with many nodes and only a few properties per
> node it would be more efficient to bundle multiple nodes into a single
> MongoDB document. Mostly reading would benefit because there are fewer
> roundtrips to the backend. At the same time the storage footprint would be lower
> because metadata overhead is per document.
> Feature branch -
> https://github.com/chetanmeh/jackrabbit-oak/compare/trunk...chetanmeh:OAK-1312
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)