[
https://issues.apache.org/jira/browse/SLING-9535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17154400#comment-17154400
]
Wim Symons commented on SLING-9535:
-----------------------------------
Hi,
Some additional comments on this.
Besides the longer startup time mentioned in the warning, the optimization
introduced a long time ago in SLING-2521 causes a more serious problem once
the number of aliases exceeds the Oak query limit.
At that point you'll see this information in the log files:
{code:java}
*WARN* [Apache Sling Repository Startup Thread]
org.apache.jackrabbit.oak.plugins.index.lucene.LucenePropertyIndex
Index-Traversed 100000 nodes with filter Filter(query=SELECT sling:alias FROM
nt:base WHERE sling:alias IS NOT NULL, path=*, property=[sling:alias=[is not
null]])
01.07.2020 20:20:31.258 *WARN* [Apache Sling Repository Startup Thread]
org.apache.jackrabbit.oak.query.FilterIterators The query read or traversed
more than 100 000 nodes. java.lang.UnsupportedOperationException: The query
read or traversed more than 100000 nodes. To avoid affecting other tasks,
processing was stopped.
{code}
When that happens, the in-memory cache of Sling aliases is empty after
startup. A request that uses an existing alias then returns a 404 Page Not
Found error, because the in-memory cache was never initialized.
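The failure mode described above can be sketched with a small toy model. This is not the Sling or Oak code: the class {{AliasCacheModel}}, its methods, and the tiny {{READ_LIMIT}} are hypothetical stand-ins, and the model assumes the loader throws away whatever it read when the query aborts.
```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

// Toy model only; names and behavior are assumptions, not Sling internals.
final class AliasCacheModel {

    // Stand-in for the Oak read limit (100000 in the log above).
    static final int READ_LIMIT = 3;

    private AliasCacheModel() {}

    // Wraps a row iterator so that it aborts once READ_LIMIT rows were read.
    static Iterator<String[]> limited(final Iterator<String[]> rows) {
        return new Iterator<String[]>() {
            private int read = 0;

            @Override
            public boolean hasNext() {
                return rows.hasNext();
            }

            @Override
            public String[] next() {
                if (++read > READ_LIMIT) {
                    throw new UnsupportedOperationException(
                        "The query read or traversed more than " + READ_LIMIT + " nodes.");
                }
                return rows.next();
            }
        };
    }

    // Loads alias -> path pairs; on an aborted query the cache stays empty.
    static Map<String, String> load(Iterator<String[]> rows) {
        Map<String, String> cache = new HashMap<>();
        try {
            while (rows.hasNext()) {
                String[] row = rows.next();
                cache.put(row[0], row[1]);
            }
        } catch (UnsupportedOperationException e) {
            // Assumption: partial results are discarded.
            cache.clear();
        }
        return cache;
    }

    // An alias that is missing from the cache resolves to 404.
    static int status(Map<String, String> cache, String alias) {
        return cache.containsKey(alias) ? 200 : 404;
    }
}
```
In this model a repository with at most {{READ_LIMIT}} matching rows resolves its aliases normally, while one additional row makes every alias lookup fail, including aliases that were read before the limit was hit.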
So changing the query to select only a subset of the content will definitely
reduce the number of results, so that the limit is reached much later.
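A minimal sketch of what such a restricted query could look like. The helper {{ScopedPropertyQuery}} and the example roots are hypothetical and not part of Sling; the sketch assumes the same JCR-SQL dialect as the query in the log and uses a {{jcr:path LIKE}} constraint per configured root, falling back to the repository-wide query when no roots are configured.
```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Sling: one query per search root.
final class ScopedPropertyQuery {

    private ScopedPropertyQuery() {}

    // Builds the queries for the given property, one per root path.
    static List<String> build(String property, List<String> roots) {
        String select = "SELECT " + property + " FROM nt:base WHERE ";
        String notNull = property + " IS NOT NULL";
        List<String> queries = new ArrayList<>();
        if (roots == null || roots.isEmpty()) {
            // No roots configured: keep the current repository-wide query.
            queries.add(select + notNull);
            return queries;
        }
        for (String root : roots) {
            String prefix = root.endsWith("/") ? root : root + "/";
            // Single quotes are doubled; LIKE wildcards in the path are not
            // escaped in this sketch. Matches descendants of the root only.
            String pattern = prefix.replace("'", "''") + "%";
            queries.add(select + "jcr:path LIKE '" + pattern + "' AND " + notNull);
        }
        return queries;
    }

    public static void main(String[] args) {
        // Example roots are assumptions for illustration.
        List<String> roots = Arrays.asList("/content", "/etc/designs");
        for (String query : build("sling:alias", roots)) {
            System.out.println(query);
        }
    }
}
```
With roots such as {{/content}}, nodes below {{/jcr:system/jcr:versionStorage}} are no longer part of the result, which is the case the reporter ran into.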
Another point you might want to take a look at is the {{loadVanityPaths}}
method, which executes the following query:
{code:java}
SELECT sling:vanityPath, sling:redirect, sling:redirectStatus FROM nt:base
WHERE sling:vanityPath IS NOT NULL
{code}
This query has the same problem as described above.
Kind regards
Wim
> Improve performance of sling:alias Query when Optimize alias resolution is
> activated
> ------------------------------------------------------------------------------------
>
> Key: SLING-9535
> URL: https://issues.apache.org/jira/browse/SLING-9535
> Project: Sling
> Issue Type: Improvement
> Components: ResourceResolver
> Reporter: Leonardo Derks
> Priority: Major
> Attachments: image-2020-06-19-18-29-24-335.png,
> image-2020-06-19-18-33-21-657.png
>
>
> Improve performance of sling:alias Query when Optimize alias resolution is
> activated in the Resource Resolver Factory:
> !image-2020-06-19-18-29-24-335.png!
> By checking the logs at startup this query is executed:
> {noformat}
> (query=SELECT sling:alias FROM nt:base WHERE sling:alias IS NOT NULL, path=*,
> property=[sling:alias=[is not null]]){noformat}
> *It would be good if the query were not executed for path=*, but for a
> predefined set of locations instead.*
> (Something similar to what exists for the Vanity Paths would be nice):
> !image-2020-06-19-18-33-21-657.png!
> If none of these are configured, the query is executed with path=*.
>
> In our project several versions are created per page, and it turns out that
> the sling:alias properties found under _/jcr:system/jcr:versionStorage_ are
> also included in the query results, exceeding the 10000 limit mentioned in
> the warning message of the property:
> _This might have an impact on the startup time and on the alias update time
> if the number of aliases is huge (over 10000)_
>
> We might have a different approach to solve our issue but did not want to
> leave this topic in the air. Might be also a good improvement for others.
>
> Thanks
>