[ https://issues.apache.org/jira/browse/JCR-3263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13399849#comment-13399849 ]
Unico Hommes commented on JCR-3263:
-----------------------------------
Ah yes, good thinking. Thanks Jukka.
> Consistency checker performance improvements
> --------------------------------------------
>
> Key: JCR-3263
> URL: https://issues.apache.org/jira/browse/JCR-3263
> Project: Jackrabbit Content Repository
> Issue Type: Improvement
> Reporter: Unico Hommes
> Fix For: 2.6
>
> Attachments: checkerperformance.patch
>
>
> Currently the consistency checker loads a batch of node ids and, for each
> node id, fetches the corresponding bundle, its child bundles, and its parent
> bundle separately. This makes the consistency checker slower than it needs to
> be: a run may take hours (days?) to complete for large repositories.
> I've been able to make the checker execute about 20 times faster on my local
> machine by loading node prop bundles in batches. For 17000 nodes in the
> workspace the current implementation ran for about 23 seconds, whereas with
> the enhancements I made it finished in 1.2 seconds.
> The problem is that loading node prop bundles in batches may require a lot
> of memory, and the amount needed per batch is hard to predict because the
> sizes of the individual bundles vary.
> Also, the node prop bundle contains much more information than is needed for
> a consistency check.
> What would be ideal in this situation is to introduce a new type - call it
> NodeInfo - that contains only the structural information the checker needs to
> do its work: the node id, the parent id and the child ids. To allow for a
> possible future referential integrity check, it could perhaps also include
> its reference type properties.
> The IterablePersistenceManager interface would then get an additional method:
> Map<NodeId, NodeInfo> getAllNodeInfos();
> If this is an acceptable proposal I would like to work on this and contribute
> a patch.
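To make the proposal above concrete, here is a rough sketch of what a NodeInfo-style record and a structural check over the full map might look like. It is an illustration only, not the attached patch: plain String ids stand in for Jackrabbit's NodeId so the example is self-contained, the fields and the check() method are assumptions, and the map passed to check() plays the role of whatever getAllNodeInfos() would return.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class NodeInfoSketch {

    /** Hypothetical structural record: only what the checker needs. */
    static final class NodeInfo {
        final String id;
        final String parentId;       // null for the root node
        final List<String> childIds;

        NodeInfo(String id, String parentId, List<String> childIds) {
            this.id = id;
            this.parentId = parentId;
            this.childIds = childIds;
        }
    }

    /**
     * Verifies parent/child links against one in-memory map, so no
     * per-node bundle fetches are needed during the check itself.
     */
    static List<String> check(Map<String, NodeInfo> infos) {
        List<String> problems = new ArrayList<>();
        for (NodeInfo info : infos.values()) {
            if (info.parentId != null) {
                NodeInfo parent = infos.get(info.parentId);
                if (parent == null) {
                    problems.add(info.id + ": missing parent " + info.parentId);
                } else if (!parent.childIds.contains(info.id)) {
                    problems.add(info.id + ": not listed as child of " + info.parentId);
                }
            }
            for (String childId : info.childIds) {
                NodeInfo child = infos.get(childId);
                if (child == null) {
                    problems.add(info.id + ": missing child " + childId);
                } else if (!info.id.equals(child.parentId)) {
                    problems.add(info.id + ": child " + childId + " has a different parent");
                }
            }
        }
        return problems;
    }

    public static void main(String[] args) {
        // Tiny made-up workspace: root lists two children, but "b" does not exist.
        Map<String, NodeInfo> infos = new LinkedHashMap<>();
        infos.put("root", new NodeInfo("root", null, Arrays.asList("a", "b")));
        infos.put("a", new NodeInfo("a", "root", Collections.<String>emptyList()));
        for (String problem : check(infos)) {
            System.out.println(problem);
        }
    }
}
```

The point of keeping the record this small is that its memory cost per node is bounded by the number of children rather than by the size of the node's properties, which is the unpredictability the description is worried about.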