Joerg Hoh created OAK-11607:
-------------------------------

             Summary: Node.getNodes() not lazy for orderable nodetype
                 Key: OAK-11607
                 URL: https://issues.apache.org/jira/browse/OAK-11607
             Project: Jackrabbit Oak
          Issue Type: Improvement
          Components: core
    Affects Versions: 1.76.0
            Reporter: Joerg Hoh


In AEM we have a lot of functionality that retrieves child nodes but does not 
consume all children from the resulting NodeIterator.

For example we have a function like this:

{noformat}
    private boolean hasRelevantChildren(Resource resource) {
        for (Iterator<Resource> it = resource.listChildren(); it.hasNext(); ) {
            Resource r = it.next();

            // don't consider repository nodes (e.g. rep:policy) or content resources as children
            if (r.getName().startsWith("rep:") || r.getName().equals(JcrConstants.JCR_CONTENT)
                    || r.getName().equals(JcrConstants.JCR_FROZENNODE)) {
                continue;
            }
            return true;
        }

        return false;
    }
{noformat}

which normally just reads a few nodes from the iterator. Now I have found a 
good number of occurrences of stacktraces like this:

{noformat}
        at org.apache.jackrabbit.oak.plugins.document.DocumentNodeState$2.iterator(DocumentNodeState.java:368)
        at java.lang.Iterable.forEach([email protected]/Iterable.java:74)
        at org.apache.jackrabbit.guava.common.collect.Iterables$5.forEach(Iterables.java:748)
        at org.apache.jackrabbit.guava.common.collect.Iterables$4.forEach(Iterables.java:586)
        at org.apache.jackrabbit.oak.commons.collections.CollectionUtils.toLinkedSet(CollectionUtils.java:139)
        at org.apache.jackrabbit.oak.plugins.tree.impl.AbstractTree.getChildNames(AbstractTree.java:129)
        at org.apache.jackrabbit.oak.plugins.tree.impl.AbstractTree.getChildren(AbstractTree.java:312)
        at org.apache.jackrabbit.oak.core.MutableTree.getChildren(MutableTree.java:178)
        at org.apache.jackrabbit.oak.jcr.delegate.NodeDelegate.getChildren(NodeDelegate.java:343)
        at org.apache.jackrabbit.oak.jcr.session.NodeImpl$8.perform(NodeImpl.java:582)
        at org.apache.jackrabbit.oak.jcr.session.NodeImpl$8.perform(NodeImpl.java:578)
        at org.apache.jackrabbit.oak.jcr.delegate.SessionDelegate.perform(SessionDelegate.java:229)
        at org.apache.jackrabbit.oak.jcr.session.ItemImpl.perform(ItemImpl.java:113)
        at org.apache.jackrabbit.oak.jcr.session.NodeImpl.getNodes(NodeImpl.java:578)
        at org.apache.sling.jcr.resource.internal.helper.jcr.JcrNodeResource.listJcrChildren(JcrNodeResource.java:227)
        at org.apache.sling.jcr.resource.internal.helper.jcr.JcrResourceProvider.listChildren(JcrResourceProvider.java:404)
        at org.apache.sling.resourceresolver.impl.providers.stateful.AuthenticatedResourceProvider.listChildren(AuthenticatedResourceProvider.java:169)
        at org.apache.sling.resourceresolver.impl.helper.ResourceResolverControl.listChildren(ResourceResolverControl.java:297)
        at org.apache.sling.resourceresolver.impl.ResourceResolverImpl.listChildren(ResourceResolverImpl.java:546)
        at org.apache.sling.api.resource.AbstractResource.listChildren(AbstractResource.java:91)
        at org.apache.sling.api.resource.ResourceWrapper.listChildren(ResourceWrapper.java:105)
        at ...hasRelevantChildren(....java:279)
{noformat}

This stack trace suggests that Node.getNodes() is not entirely lazy: deep in 
oak.core, {{AbstractTree.getChildNames()}} is called, which reads _all_ child 
names into a Set, even though the underlying DocumentNodeState itself returns 
an iterator and would therefore be lazy.

This means that for node types with orderable child nodes, merely obtaining the 
NodeIterator (before iterating over it at all) is an expensive operation if a 
lot of child nodes are present.
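
To illustrate the cost difference, here is a minimal, self-contained sketch. It 
does not use any Oak classes; {{countingChildNames}}, {{eagerIterator}} and 
{{lazyIterator}} are hypothetical stand-ins. The eager variant copies every 
name into a LinkedHashSet before returning an iterator (similar in spirit to 
the {{toLinkedSet}} frame in the stack trace), the lazy variant hands out the 
source iterator directly. The counter shows how many names are read from the 
source just to obtain the first child.

```java
import java.util.Iterator;
import java.util.LinkedHashSet;
import java.util.NoSuchElementException;
import java.util.Set;
import java.util.concurrent.atomic.AtomicInteger;

public class EagerVsLazyChildren {

    /** Hypothetical stand-in for a store-backed child listing; counts every name it hands out. */
    static Iterable<String> countingChildNames(int total, AtomicInteger reads) {
        return () -> new Iterator<String>() {
            private int next = 0;

            @Override
            public boolean hasNext() {
                return next < total;
            }

            @Override
            public String next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                reads.incrementAndGet();
                return "child" + next++;
            }
        };
    }

    /** Eager variant: drains the whole source into a set before returning an iterator. */
    static Iterator<String> eagerIterator(Iterable<String> source) {
        Set<String> names = new LinkedHashSet<>();
        source.forEach(names::add);
        return names.iterator();
    }

    /** Lazy variant: returns the source iterator without reading anything up front. */
    static Iterator<String> lazyIterator(Iterable<String> source) {
        return source.iterator();
    }

    /** Number of names read from the source in order to obtain only the first child. */
    static int readsForFirstChild(boolean eager, int total) {
        AtomicInteger reads = new AtomicInteger();
        Iterable<String> source = countingChildNames(total, reads);
        Iterator<String> it = eager ? eagerIterator(source) : lazyIterator(source);
        it.next();
        return reads.get();
    }

    public static void main(String[] args) {
        System.out.println("eager reads: " + readsForFirstChild(true, 100_000));
        System.out.println("lazy reads:  " + readsForFirstChild(false, 100_000));
    }
}
```

With 100,000 children, the eager variant reads all of them before the caller 
sees the first one, while the lazy variant reads a single name.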

We should find a way to optimize this case and avoid reading all child names 
when the Iterator is built, so that getNodes() gets lazy semantics.
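
One possible direction, purely as a sketch and not a description of how Oak 
works or should be changed: walk the explicit child order first, check each 
name for existence one at a time, and only then fall back to the remaining 
child names. Everything here is an assumption for illustration: the class 
{{LazyOrderedChildNames}}, the {{childExists}} predicate (assumed to be a cheap 
per-name lookup) and the {{allChildNames}} iterable (assumed to be lazy).

```java
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.Set;
import java.util.function.Predicate;

public class LazyOrderedChildNames {

    /**
     * Hypothetical sketch: yields explicitly ordered names that exist first,
     * then any remaining child names, without draining allChildNames up front.
     */
    static Iterator<String> lazyOrderedNames(List<String> explicitOrder,
                                             Predicate<String> childExists,
                                             Iterable<String> allChildNames) {
        return new Iterator<String>() {
            private final Iterator<String> ordered = explicitOrder.iterator();
            private final Set<String> emitted = new HashSet<>();
            private Iterator<String> rest; // opened only after the ordered phase is exhausted
            private String nextName;

            private void advance() {
                // Phase 1: explicitly ordered names, checked for existence one by one.
                while (nextName == null && ordered.hasNext()) {
                    String candidate = ordered.next();
                    if (childExists.test(candidate) && emitted.add(candidate)) {
                        nextName = candidate;
                    }
                }
                if (nextName != null) {
                    return;
                }
                // Phase 2: remaining children that were not part of the explicit order.
                if (rest == null) {
                    rest = allChildNames.iterator();
                }
                while (nextName == null && rest.hasNext()) {
                    String candidate = rest.next();
                    if (!emitted.contains(candidate)) {
                        nextName = candidate;
                    }
                }
            }

            @Override
            public boolean hasNext() {
                advance();
                return nextName != null;
            }

            @Override
            public String next() {
                advance();
                if (nextName == null) {
                    throw new NoSuchElementException();
                }
                String result = nextName;
                nextName = null;
                return result;
            }
        };
    }
}
```

A caller that stops after the first few children, like 
{{hasRelevantChildren()}} above, would then trigger only a handful of 
existence checks. Whether such a per-name check is actually cheap in Oak, and 
how this interacts with access control, would need to be verified.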









