[
https://issues.apache.org/jira/browse/OAK-1312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15430076#comment-15430076
]
Chetan Mehrotra edited comment on OAK-1312 at 8/22/16 6:46 AM:
---------------------------------------------------------------
Planned feature work is now done and [patch|^OAK-1312-review-v1.diff] is ready
for review.
h3. Implementation Details
Some details are provided [above|#UsageandConfiguration].
h4. Commit Side Changes
{{CommitDiff}} obtains a {{BundlingHandler}} from {{DocumentNodeStore}}, which
takes care of bundling relative nodes. For any newly added node it looks
up a {{DocumentBundlor}} from {{BundledTypesRegistry}} based on the node's
primary type or mixin. {{DocumentBundlor}} generates a {{Matcher}} which
determines if the given nodestate needs to be bundled. As {{CommitDiff}}
traverses downwards, new matchers are generated from the parent matcher.
* The pattern itself is saved as part of the nodestate.
* hasChildren support - Children status is managed separately for bundled and
non-bundled child nodes. This is later used to optimize calls around child
access in {{DocumentNodeState}}.
** For each bundled node, the {{:doc-has-child-bundled}} property is set to
true to indicate that the parent node has a bundled child.
** For each non-bundled node, the {{:doc-has-child-non-bundled}} property is
set to true to indicate that the parent node has a non-bundled child.
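To illustrate how a child matcher can be derived from its parent matcher during the downward traversal, here is a minimal sketch. It is not the code from the patch: the class name {{BundlingMatcherSketch}}, the {{forPatterns}} factory, the pattern syntax (relative paths where {{*}} stands for exactly one node name) and the rule that a node is bundled only when some pattern ends exactly at it are all assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical, simplified stand-in for the matcher idea; not the classes from the patch. */
final class BundlingMatcherSketch {
    // Patterns still compatible with the path walked so far, split into name segments.
    private final List<String[]> candidates;
    // Number of path segments consumed below the bundling root.
    private final int depth;
    // True if some pattern ends exactly at the current node.
    private final boolean matched;

    private BundlingMatcherSketch(List<String[]> candidates, int depth, boolean matched) {
        this.candidates = candidates;
        this.depth = depth;
        this.matched = matched;
    }

    /** Creates the matcher for the bundling root (assumed to match itself). */
    static BundlingMatcherSketch forPatterns(String... patterns) {
        List<String[]> split = new ArrayList<>();
        for (String pattern : patterns) {
            split.add(pattern.split("/"));
        }
        return new BundlingMatcherSketch(split, 0, true);
    }

    /** Derives the matcher for a child node from this (parent) matcher. */
    BundlingMatcherSketch next(String childName) {
        List<String[]> remaining = new ArrayList<>();
        boolean complete = false;
        for (String[] pattern : candidates) {
            if (pattern.length > depth
                    && (pattern[depth].equals("*") || pattern[depth].equals(childName))) {
                remaining.add(pattern);
                if (pattern.length == depth + 1) {
                    complete = true; // this pattern ends at the child
                }
            }
        }
        return new BundlingMatcherSketch(remaining, depth + 1, complete);
    }

    /** Whether the node this matcher belongs to would be bundled. */
    boolean isMatch() {
        return matched;
    }
}
```

With the patterns {{jcr:content}} and {{jcr:content/*}}, the matcher for {{jcr:content}} and for any direct child of it reports a match, while a sibling such as {{other}} or a grandchild of {{jcr:content}} does not.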
h4. Bundling Config
The bundling feature as a whole is controlled via the {{bundlingEnabled}} flag
on {{DocumentNodeStoreService}}. If the flag is disabled then bundling is
disabled for *new nodes only*. The bundling config is stored in the repository;
{{BundlingConfigHandler}} observes changes to it and refreshes the
{{BundledTypesRegistry}} whenever it changes.
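The refresh-on-change behaviour can be pictured roughly as follows. This is only a sketch under assumptions: the class {{RegistryRefreshSketch}}, its methods and the representation of the config as a map from type name to pattern list are invented for illustration and do not reflect the actual {{BundlingConfigHandler}} or {{BundledTypesRegistry}} code.

```java
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical sketch of a registry that is swapped whenever the stored config changes. */
final class RegistryRefreshSketch {
    // Immutable snapshot: node type name -> bundling patterns (assumed representation).
    private final AtomicReference<Map<String, List<String>>> registry =
            new AtomicReference<>(Map.of());
    private final boolean bundlingEnabled;

    RegistryRefreshSketch(boolean bundlingEnabled) {
        this.bundlingEnabled = bundlingEnabled;
    }

    /** Would be called by an observer each time the config node changes. */
    void onConfigChanged(Map<String, List<String>> newConfig) {
        registry.set(Map.copyOf(newConfig)); // readers see either the old or the new snapshot
    }

    /** Lookup for a newly added node: primary type first, then mixins. */
    Optional<List<String>> patternsForNewNode(String primaryType, List<String> mixins) {
        if (!bundlingEnabled) {
            return Optional.empty(); // flag off: new nodes are not bundled
        }
        Map<String, List<String>> snapshot = registry.get();
        if (snapshot.containsKey(primaryType)) {
            return Optional.of(snapshot.get(primaryType));
        }
        for (String mixin : mixins) {
            if (snapshot.containsKey(mixin)) {
                return Optional.of(snapshot.get(mixin));
            }
        }
        return Optional.empty();
    }
}
```

The flag is checked only on the lookup for new nodes, which mirrors the "new nodes only" behaviour described above: nodes that are already bundled carry their pattern in the nodestate and remain readable.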
h4. Reading Side Change
On the reading side, {{DocumentNodeState}} constructs a {{BundlingContext}}. If
a bundling pattern is found, the bundling context filters the properties down
to those belonging to the current node. For any child lookup it determines
whether the child is bundled and, if so, constructs a {{DocumentNodeState}}
instance from the properties of the bundling root. For listing child nodes it
provides a merge iterator over bundled and non-bundled nodes. If it can be
determined that all child nodes are bundled, the call to {{DocumentNodeStore}}
is avoided.
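The child-listing optimization can be sketched as below. The names are hypothetical ({{ChildListingSketch}}, {{childNames}}, the {{storeLookup}} supplier standing in for the call to the store), and for brevity the sketch concatenates the two lists eagerly instead of providing a lazy merge iterator as the patch does.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

/** Hypothetical sketch: skip the store call when no non-bundled children exist. */
final class ChildListingSketch {
    /**
     * @param bundledChildren    child names recovered from the bundling root's properties
     * @param hasNonBundledChild value of the non-bundled child flag on the parent
     * @param storeLookup        stand-in for the (expensive) child query against the store
     */
    static List<String> childNames(List<String> bundledChildren,
                                   boolean hasNonBundledChild,
                                   Supplier<List<String>> storeLookup) {
        List<String> result = new ArrayList<>(bundledChildren);
        if (hasNonBundledChild) {
            // Only now is the store consulted for the separately stored children.
            result.addAll(storeLookup.get());
        }
        return result;
    }
}
```

When the flag is false the supplier is never invoked, which corresponds to the saved roundtrip for nodes such as {{nt:file}} whose only child is bundled.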
h3. Open Questions
# *Config Path* - Currently the bundling config is stored as a node in the
repository itself under {{/jcr:system/documentstore/bundlor}}. Should that be
the final name? Do any steps need to be taken to make it secure?
# *Wildcard Support* - The design supports wildcards in the bundling pattern.
Should we allow that or restrict it for the initial release?
# *Bootstrapping default config* - By default we should ship with a bundling
pattern for {{nt:file}}. The logic for that is implemented in
{{BundlingConfigInitializer}}. For tests it is invoked from within
{{OakMongoNSRepositoryStub}}. How should this initializer be registered with
Oak in a production setup? One approach would be to expose it as an OSGi
service, add a new {{WhiteboardRepositoryInitializer}} implementation and
register that with the Oak class.
[~mreutegg] [~catholicon] Please review the feature patch and provide feedback
so that it can be merged to trunk! The patch is big, but quite a bit of it is
tests. The key parts are the changes in {{DocumentNodeState}}, {{CommitDiff}},
{{Commit}} and {{BundlingHandler}}.
*Update* - Please hold off on the review, as some conflicts are seen during
package installation with this feature enabled. I will ping back once that has
been analyzed.
> Bundle nodes into a document
> ----------------------------
>
> Key: OAK-1312
> URL: https://issues.apache.org/jira/browse/OAK-1312
> Project: Jackrabbit Oak
> Issue Type: Improvement
> Components: core, documentmk
> Reporter: Marcel Reutegger
> Assignee: Chetan Mehrotra
> Labels: performance
> Fix For: 1.6
>
> Attachments: OAK-1312-review-v1.diff
>
>
> For very fine grained content with many nodes and only few properties per
> node it would be more efficient to bundle multiple nodes into a single
> MongoDB document. Mostly reading would benefit because there are fewer
> roundtrips to the backend. At the same time storage footprint would be lower
> because metadata overhead is per document.
> Feature branch -
> https://github.com/chetanmeh/jackrabbit-oak/compare/trunk...chetanmeh:OAK-1312
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)