[ 
https://issues.apache.org/jira/browse/SLING-9535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17154400#comment-17154400
 ] 

Wim Symons commented on SLING-9535:
-----------------------------------

Hi,

Some additional comments on this.

The longer startup time mentioned in the warning is not the only issue. The 
problem with the optimization introduced a long time ago in SLING-2521 gets 
worse once the number of aliases exceeds the Oak query limit.

At that point you'll see this information in the log files:
{code:java}
*WARN* [Apache Sling Repository Startup Thread] 
org.apache.jackrabbit.oak.plugins.index.lucene.LucenePropertyIndex 
Index-Traversed 100000 nodes with filter Filter(query=SELECT sling:alias FROM 
nt:base WHERE sling:alias IS NOT NULL, path=*, property=[sling:alias=[is not 
null]])
01.07.2020 20:20:31.258 *WARN* [Apache Sling Repository Startup Thread] 
org.apache.jackrabbit.oak.query.FilterIterators The query read or traversed 
more than 100 000 nodes. java.lang.UnsupportedOperationException: The query 
read or traversed more than 100000 nodes. To avoid affecting other tasks, 
processing was stopped.
{code}
When this happens, the in-memory cache of Sling aliases is empty after 
startup. A request that uses an existing alias then returns a 404 Page Not 
Found error, because the in-memory cache was never initialized.

Changing the query to select only a subset of the content will reduce the 
number of results, so the limit will be reached much later, if at all.
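
A minimal sketch of what such a restriction could look like, assuming the 
allowed locations are available as a list of root paths. The class 
{{AliasQuerySketch}} and the method {{buildQueries}} are hypothetical names 
used for illustration only and are not part of Sling. The sketch appends a 
JCR-SQL {{jcr:path LIKE}} constraint to the query shown in the log, one query 
per root, and falls back to the unrestricted query when no roots are 
configured:

```java
import java.util.ArrayList;
import java.util.List;

public class AliasQuerySketch {

    // The unrestricted query as it appears in the log output.
    public static final String BASE_QUERY =
            "SELECT sling:alias FROM nt:base WHERE sling:alias IS NOT NULL";

    // Hypothetical helper: returns one query per configured root path,
    // or the unrestricted query when no roots are configured.
    public static List<String> buildQueries(List<String> roots) {
        List<String> queries = new ArrayList<>();
        if (roots == null || roots.isEmpty()) {
            queries.add(BASE_QUERY);
            return queries;
        }
        for (String root : roots) {
            // Drop a trailing slash so "/apps/" and "/apps" give the same query.
            String path = root.endsWith("/")
                    ? root.substring(0, root.length() - 1)
                    : root;
            // Escape single quotes for the string literal. LIKE wildcards
            // inside the path itself are not escaped in this sketch.
            path = path.replace("'", "''");
            queries.add(BASE_QUERY + " AND jcr:path LIKE '" + path + "/%'");
        }
        return queries;
    }

    public static void main(String[] args) {
        // Example roots are assumptions, not values taken from the issue.
        buildQueries(List.of("/content", "/apps/")).forEach(System.out::println);
    }
}
```

With roots such as {{/content}}, nodes under 
_/jcr:system/jcr:versionStorage_ would no longer match the path constraint.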

Another point you might want to look at is the {{loadVanityPaths}} method, 
which executes the following query:
{code:java}
SELECT sling:vanityPath, sling:redirect, sling:redirectStatus FROM nt:base 
WHERE sling:vanityPath IS NOT NULL
{code}
This query has the same problem as described above.

Kind regards

Wim

 

> Improve performance of sling:alias Query when Optimize alias resolution is 
> activated
> ------------------------------------------------------------------------------------
>
>                 Key: SLING-9535
>                 URL: https://issues.apache.org/jira/browse/SLING-9535
>             Project: Sling
>          Issue Type: Improvement
>          Components: ResourceResolver
>            Reporter: Leonardo Derks
>            Priority: Major
>         Attachments: image-2020-06-19-18-29-24-335.png, 
> image-2020-06-19-18-33-21-657.png
>
>
> Improve performance of sling:alias Query when Optimize alias resolution is 
> activated in the Resource Resolver Factory:
> !image-2020-06-19-18-29-24-335.png!
> By checking the logs at startup this query is executed:
> {noformat}
> (query=SELECT sling:alias FROM nt:base WHERE sling:alias IS NOT NULL, path=*, 
> property=[sling:alias=[is not null]]){noformat}
> *The part that will be good to improve is that the query is not executed for 
> path=*, instead a predefined set of locations is used.*
> (Something similar as it is for the Vanity Paths will be nice):
> !image-2020-06-19-18-33-21-657.png!
> Then if none of these are configured then the query is executed with path=*.
>  
> In our project several versions are created per page, and it turns out that 
> the sling:alias properties found under _/jcr:system/jcr:versionStorage_ are 
> also included in the query, exceeding the 10000 limit mentioned in the 
> warning message of the property:
> _This might have an impact on the startup time and on the alias update time 
> if the number of aliases is huge (over 10000)_
>  
> We might have a different approach to solve our issue but did not want to 
> leave this topic in the air. Might be also a good improvement for others.
>  
> Thanks
>  



