[
https://issues.apache.org/jira/browse/SLING-10447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17451871#comment-17451871
]
Nitin Gupta commented on SLING-10447:
-------------------------------------
{quote}It works because the ntBaseLucene index has evaluatePathRestrictions=true
{quote}
evaluatePathRestrictions does work for isDescendantNode, but I don't think it
works for NOT isDescendantNode.
{quote}the amount of nodes that is fetched from the index is drastically less
than without that restriction.
{quote}
Hi [~Henry Kuijpers],
If the test only looked at the final result set, then yes, this could have
appeared to work, not because the index supported it but most probably because
the NOT condition was applied in the query engine.
So something like
{code:java}
(ISDESCENDANTNODE('/content') OR ISDESCENDANTNODE('/libs') OR
ISDESCENDANTNODE('/apps')) {code}
would be served effectively by the index, but
{code:java}
(NOT ISDESCENDANTNODE('/jcr:system') AND NOT
ISDESCENDANTNODE('/content/usergenerated')) {code}
might not.
A good test case to verify this would be the following.
Assuming the query limit is set to 100000, create more than 100000 nodes where
sling:vanityPath is not null, and keep 80% of those nodes under /jcr:system.
Then run this query:
{code:java}
SELECT sling:vanityPath from nt:base WHERE sling:vanityPath IS NOT NULL {code}
This should hit the index traversal read limit.
Then try with
{code:java}
SELECT sling:vanityPath from nt:base WHERE sling:vanityPath IS NOT NULL AND NOT
isdescendantNode('/jcr:system'){code}
You should notice that the query still hits the index traversal read limit,
because the NOT operation is not served directly from the index.
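To make the reasoning concrete, here is a minimal, self-contained sketch of the two cases. It is a plain simulation and not Oak code: the class name, the method names and the way reads are counted are all assumptions made for illustration. It only models the idea that every entry the index hands to the query engine counts against the read limit, even if the engine discards it afterwards.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative simulation only; none of these names exist in Oak or Sling.
public class ReadLimitSketch {

    /** Thrown when more entries are read than the configured limit allows. */
    public static class ReadLimitExceededException extends RuntimeException {
        public ReadLimitExceededException(long limit) {
            super("query read more than " + limit + " entries");
        }
    }

    /** Builds paths that all carry sling:vanityPath; the first underSystem of them live below /jcr:system. */
    public static List<String> buildPaths(int total, int underSystem) {
        List<String> paths = new ArrayList<>(total);
        for (int i = 0; i < total; i++) {
            String root = i < underSystem ? "/jcr:system/jcr:versionStorage" : "/content";
            paths.add(root + "/node" + i);
        }
        return paths;
    }

    /** NOT applied in the query engine: every index entry is read before the path is checked. */
    public static int filterInEngine(List<String> indexEntries, String excludedRoot, long readLimit) {
        long reads = 0;
        int matches = 0;
        for (String path : indexEntries) {
            if (++reads > readLimit) {
                throw new ReadLimitExceededException(readLimit);
            }
            if (!path.startsWith(excludedRoot + "/")) {
                matches++;
            }
        }
        return matches;
    }

    /** Hypothetical index that applies the exclusion itself: only surviving entries count as reads. */
    public static int filterInIndex(List<String> indexEntries, String excludedRoot, long readLimit) {
        long reads = 0;
        for (String path : indexEntries) {
            if (path.startsWith(excludedRoot + "/")) {
                continue; // never handed to the query engine
            }
            if (++reads > readLimit) {
                throw new ReadLimitExceededException(readLimit);
            }
        }
        return (int) reads;
    }

    public static void main(String[] args) {
        // 120000 nodes with sling:vanityPath, 80% of them under /jcr:system.
        List<String> entries = buildPaths(120_000, 96_000);
        long limit = 100_000;
        System.out.println("index-side filter matches: " + filterInIndex(entries, "/jcr:system", limit));
        try {
            filterInEngine(entries, "/jcr:system", limit);
        } catch (ReadLimitExceededException e) {
            System.out.println("engine-side filter: " + e.getMessage());
        }
    }
}
```

If the real behaviour matches the engine-side case, the second query in the test above fails in the same way as the first, even though only 20% of the nodes would end up in the result.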
I can do some additional testing on this in a couple of days, but I am almost
certain that the NOT condition is not directly served from the index.
> sling:vanityPath are being searched during startup in the entire repository,
> including version storage
> ------------------------------------------------------------------------------------------------------
>
> Key: SLING-10447
> URL: https://issues.apache.org/jira/browse/SLING-10447
> Project: Sling
> Issue Type: Bug
> Components: ResourceResolver
> Affects Versions: Resource Resolver 1.7.6
> Reporter: Henry Kuijpers
> Assignee: Robert Munteanu
> Priority: Major
> Fix For: Resource Resolver 1.8.0
>
> Attachments: with-path-restrictions.json,
> without-path-restrictions.json
>
> Time Spent: 6h 50m
> Remaining Estimate: 0h
>
> We have a lot of pages on our production author instance. We also have a lot
> of pages that use sling:vanityPath. Every time a page is replicated, a new
> version is created.
> We have roughly 168.000 pages in our instance. In the /content node, there
> are approx. 4500 pages with vanity URLs. In the version storage, however,
> there are 120.000+ entries that have a sling:vanityPath defined.
> When starting up Apache Sling, the Resource Resolver fetches all the existing
> sling:vanityPath entries, which it fails to do, because of the large number
> of sling:vanityPath entries in the version storage.
> In the code, I specifically see checks (when processing the query results)
> about the version storage. However, this should have been put inside the
> query as a filter, in order to avoid fetching such a large amount of query
> result nodes:
> https://github.com/apache/sling-org-apache-sling-resourceresolver/blob/4406b8fed0fedb48202fc6472fb552c36aa06e35/src/main/java/org/apache/sling/resourceresolver/impl/mapping/MapEntries.java#L1158
> I propose to update the query with a "not isdescendantnode"-check, to make
> sure we do not return any content from the version storage and thus make the
> default query limits of 100.000 nodes work again.
--
This message was sent by Atlassian Jira
(v8.20.1#820001)