[
https://issues.apache.org/jira/browse/OAK-2586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14390464#comment-14390464
]
Julian Sedding commented on OAK-2586:
-------------------------------------
After looking into the upgrade code some more, I have some replies to your
questions.
The short answer is: wrapping the {{NodeState}} does not make anything worse
than it was before.
During an upgrade, {{NodeState}}s are copied between two different
{{NodeStore}} instances. The target instance is empty, so only
{{childNodeAdded}} and {{propertyAdded}} events are generated. Comparing
{{NodeState}}s is only required for {{childNodeChanged}} events, however, so
wrapping the {{NodeState}} has no impact in this case.
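To illustrate the point about an empty target, here is a minimal sketch. It
does not use Oak's API: {{SketchNode}} and {{EmptyTargetDiff}} are hypothetical
stand-ins, and the event names are only borrowed as labels. It shows that a
one-level diff against an empty node reports additions only, so the
node-comparison branch is never reached.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a node state; this is not Oak's NodeState API.
final class SketchNode {
    final Map<String, String> properties = new LinkedHashMap<>();
    final Map<String, SketchNode> children = new LinkedHashMap<>();

    SketchNode property(String name, String value) {
        properties.put(name, value);
        return this;
    }

    SketchNode child(String name, SketchNode node) {
        children.put(name, node);
        return this;
    }
}

final class EmptyTargetDiff {
    // Deep comparison; only needed when a child exists on both sides.
    static boolean sameContent(SketchNode a, SketchNode b) {
        if (!a.properties.equals(b.properties)
                || !a.children.keySet().equals(b.children.keySet())) {
            return false;
        }
        for (String name : a.children.keySet()) {
            if (!sameContent(a.children.get(name), b.children.get(name))) {
                return false;
            }
        }
        return true;
    }

    // Lists the events a one-level diff of `after` against `before` would report.
    static List<String> diff(SketchNode before, SketchNode after) {
        List<String> events = new ArrayList<>();
        for (Map.Entry<String, String> e : after.properties.entrySet()) {
            String old = before.properties.get(e.getKey());
            if (old == null) {
                events.add("propertyAdded:" + e.getKey());
            } else if (!old.equals(e.getValue())) {
                events.add("propertyChanged:" + e.getKey());
            }
        }
        for (String name : before.properties.keySet()) {
            if (!after.properties.containsKey(name)) {
                events.add("propertyDeleted:" + name);
            }
        }
        for (Map.Entry<String, SketchNode> e : after.children.entrySet()) {
            SketchNode old = before.children.get(e.getKey());
            if (old == null) {
                events.add("childNodeAdded:" + e.getKey());
            } else if (!sameContent(old, e.getValue())) {
                // The only branch that compares node states.
                events.add("childNodeChanged:" + e.getKey());
            }
        }
        for (String name : before.children.keySet()) {
            if (!after.children.containsKey(name)) {
                events.add("childNodeDeleted:" + name);
            }
        }
        return events;
    }
}
```

Calling {{EmptyTargetDiff.diff(new SketchNode(), source)}} never enters the
{{sameContent}} branch, because no child exists on the empty side.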
In OAK-2619 I propose allowing the upgrade tool to run the same upgrade
multiple times, e.g. run the full upgrade today and run it again tomorrow for
the delta.
In this scenario comparing {{NodeState}}s becomes very important. During my
experiments I realized that there were lots of false-positive
{{childNodeChanged}} events. These caused additional load on the commit hooks
and also indicated that an incremental update was performing more writes than
necessary.
My expectation is that in the extreme case where there is no delta (i.e. a
repeated upgrade from the same source), a single repository traversal
determines that nothing has changed and virtually no time is spent in commit
hooks.
In order to achieve this goal, I implemented a {{NodeStateCopier}} class, which
copies changes in a post-order traversal, essentially only creating/deleting
nodes and updating properties.
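The idea can be sketched as follows. This is not the {{NodeStateCopier}} from
the patch and does not use Oak's API: {{MutableNode}} and {{CopierSketch}} are
hypothetical names, and counting writes is an assumption made purely to
illustrate that a repeated run with no delta performs none.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Map;

// Hypothetical mutable tree node; not Oak's NodeBuilder.
final class MutableNode {
    final Map<String, String> properties = new HashMap<>();
    final Map<String, MutableNode> children = new HashMap<>();
}

final class CopierSketch {
    // Makes `target` match `source` and returns the number of writes applied.
    static int copy(MutableNode source, MutableNode target) {
        int writes = 0;

        // Delete child nodes that no longer exist in the source.
        for (String name : new ArrayList<>(target.children.keySet())) {
            if (!source.children.containsKey(name)) {
                target.children.remove(name);
                writes++;
            }
        }

        // Visit children first (post-order), creating missing ones on the way.
        for (Map.Entry<String, MutableNode> e : source.children.entrySet()) {
            MutableNode child = target.children.get(e.getKey());
            if (child == null) {
                child = new MutableNode();
                target.children.put(e.getKey(), child);
                writes++;
            }
            writes += copy(e.getValue(), child);
        }

        // Then bring this node's own properties up to date.
        for (String name : new ArrayList<>(target.properties.keySet())) {
            if (!source.properties.containsKey(name)) {
                target.properties.remove(name);
                writes++;
            }
        }
        for (Map.Entry<String, String> e : source.properties.entrySet()) {
            if (!e.getValue().equals(target.properties.get(e.getKey()))) {
                target.properties.put(e.getKey(), e.getValue());
                writes++;
            }
        }
        return writes;
    }
}
```

In this sketch a second {{copy}} call on unchanged input visits every node once
and returns 0, which corresponds to the "no delta" case described above.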
I believe without prior knowledge of the differences in the source repository,
the upgrade cannot be made more efficient than a single repository traversal.
Would you agree?
> Support including and excluding paths during upgrade
> ----------------------------------------------------
>
> Key: OAK-2586
> URL: https://issues.apache.org/jira/browse/OAK-2586
> Project: Jackrabbit Oak
> Issue Type: Improvement
> Components: upgrade
> Affects Versions: 1.1.6
> Reporter: Julian Sedding
> Labels: patch
> Attachments: OAK-2586.patch
>
>
> When upgrading a Jackrabbit 2 repository to an Oak repository it can be
> desirable to constrain which paths/sub-trees should be copied from the source
> repository. Not least because this can (drastically) reduce the amount of
> content that needs to be traversed, copied and indexed.
> I suggest allowing the content visible from the source repository to be
> filtered by wrapping the JackrabbitNodeState instance and hiding selected
> paths.