[ https://issues.apache.org/jira/browse/OAK-1312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15588423#comment-15588423 ]
Chetan Mehrotra edited comment on OAK-1312 at 10/19/16 11:04 AM:
-----------------------------------------------------------------

h4. Benchmark - Result with bundling enabled

Ran a benchmark using [script|^run-benchmark.sh] with [results|^benchmark-results.txt]. The script also dumps Mongo DB stats, Metrics stats etc. Results are also summarized [here|https://docs.google.com/spreadsheets/d/1lzwDjwS-HSL0WazYBx9Wtx2ZI3J-fGl-EJ08-rxdAE8]

{noformat}
+--------------+---+-----+-----+-----+------+------+-----+--------+---------+---------+------------+------------+----------+--------+--------+------------------------------+
| Fixture      | C | min | 10% | 50% | 90%  | max  | N   | Reader | Mutator | Assets# | Mongo Doc# | Mongo Size | Idx Size | Find#  | Query# | Comment                      |
+--------------+---+-----+-----+-----+------+------+-----+--------+---------+---------+------------+------------+----------+--------+--------+------------------------------+
| Oak-Mongo-DS | 5 | 360 | 483 | 710 | 1509 | 2843 | 350 | 75251  | 2504    | 3680    | 56966      | 58         | 43       | 44387  | 2808   | #default                     |
| Oak-Mongo-DS | 5 | 346 | 477 | 787 | 1508 | 2498 | 336 | 41805  | 1798    | 3480    | 8710       | 36         | 5        | 5105   | 1906   | #bundling,ALL                |
| Oak-Mongo-DS | 5 | 312 | 469 | 746 | 1491 | 2630 | 339 | 67085  | 2268    | 3550    | 30162      | 58         | 22       | 26655  | 12008  | #bundling,EXCLUDE_RENDITIONS |
+--------------+---+-----+-----+-----+------+------+-----+--------+---------+---------+------------+------------+----------+--------+--------+------------------------------+
{noformat}

*Environment details*
{noformat}
$ uname -a
Linux chetanm-laptop 3.13.0-37-generic #64-Ubuntu SMP Mon Sep 22 21:28:38 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
$ java -version
java version "1.8.0_66"
Java(TM) SE Runtime Environment (build 1.8.0_66-b17)
Java HotSpot(TM) 64-Bit Server VM (build 25.66-b17, mixed mode)
$ mongo -version
MongoDB shell version: 2.6.4
{noformat}

*Legend*
* Mongo Doc# - Number of Mongo documents across all collections
* Mongo Size - Size of the Mongo DB (MB)
* Idx Size - Size of all indexes in Mongo (MB)
* ALL - Uses bundling pattern {{jcr:content, jcr:content/metadata, jcr:content/renditions/**}}
* EXCLUDE_RENDITIONS - Uses bundling pattern {{jcr:content, jcr:content/metadata}}

*Highlights*
* With ALL bundling there is a significant reduction in
** Mongo docs - 56966 -> 8710
** Index size (MB) - 43 -> 5
** Calls to Mongo for find - 44387 -> 5105
* BUT there is also a decrease in reads/writes
** Reads 75251 -> 41805
** Updates 2504 -> 1798
* Changing the bundling pattern helps in improving reads (67085 with EXCLUDE_RENDITIONS)

So bundling leads to very significant savings in Mongo-level storage. However, it has some adverse impact on reads and updates.

*Next Steps*
* Merge current branch to trunk - As shown in the previous comment, if bundling is disabled there is no perf impact, so it is safe in the disabled state
* Analyze why reads have reduced - Given that access should involve fewer remote calls, we need to see why reads are slow
* Benchmark in more real-world scenarios where the read access pattern is more realistic
* Benchmark on RDB - [~reschke] Can you run it against any DB setup you have once I have done the merge to trunk?
* Benchmark with Mongo 3.x - [~mreutegg] Can you try it against WiredTiger?

/cc [~mreutegg] [~catholicon] [~ianeboston] [~alexxx] [~mmarth] [~tmueller]

> Bundle nodes into a document
> ----------------------------
>
>                 Key: OAK-1312
>                 URL: https://issues.apache.org/jira/browse/OAK-1312
>             Project: Jackrabbit Oak
>          Issue Type: Improvement
>          Components: core, documentmk
>            Reporter: Marcel Reutegger
>            Assignee: Chetan Mehrotra
>              Labels: performance
>             Fix For: 1.6
>
>         Attachments: OAK-1312-meta-prop-handling.patch, OAK-1312-review-v1.diff, OAK-1312-review-v2.diff, benchmark-results.txt, run-benchmark.sh
>
>
> For very fine grained content with many nodes and only few properties per
> node it would be more efficient to bundle multiple nodes into a single
> MongoDB document. Mostly reading would benefit because there are fewer
> roundtrips to the backend. At the same time storage footprint would be lower
> because metadata overhead is per document.
> Feature branch -
> https://github.com/chetanmeh/jackrabbit-oak/compare/trunk...chetanmeh:OAK-1312

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
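
The relative changes behind the Highlights in the comment above can be recomputed from the benchmark table. The following is only a minimal sketch, not part of the benchmark script: the names {{RESULTS}}, {{pct_change}} and {{compare}} are hypothetical, and the numbers are copied by hand from the table rows.

```python
# Values copied by hand from the benchmark table in the comment.
# Keys are the table's "Comment" labels; metric names are made up for this sketch.
RESULTS = {
    "default": {
        "reader": 75251, "mutator": 2504,
        "mongo_docs": 56966, "idx_size_mb": 43, "find": 44387,
    },
    "bundling,ALL": {
        "reader": 41805, "mutator": 1798,
        "mongo_docs": 8710, "idx_size_mb": 5, "find": 5105,
    },
    "bundling,EXCLUDE_RENDITIONS": {
        "reader": 67085, "mutator": 2268,
        "mongo_docs": 30162, "idx_size_mb": 22, "find": 26655,
    },
}


def pct_change(baseline, value):
    """Relative change of value against baseline, in percent."""
    return (value - baseline) / baseline * 100.0


def compare(results, baseline="default"):
    """Percent change of every metric in each run, relative to the baseline run."""
    base = results[baseline]
    return {
        name: {
            metric: round(pct_change(base[metric], row[metric]), 1)
            for metric in base
        }
        for name, row in results.items()
        if name != baseline
    }


if __name__ == "__main__":
    for name, changes in compare(RESULTS).items():
        print(name)
        for metric, change in changes.items():
            print(f"  {metric:12s} {change:+.1f}%")
```

Under these assumptions, ALL bundling cuts Mongo documents by roughly 85% and index size by roughly 88%, while the reader count drops by roughly 44%; EXCLUDE_RENDITIONS keeps the reader drop to roughly 11%.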