[ 
https://issues.apache.org/jira/browse/OAK-1236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13841334#comment-13841334
 ] 

Alex Parvulescu commented on OAK-1236:
--------------------------------------

Funnily enough, I think the following two statements have the same effect:

bq. would it be faster, to just search for all language roots and then traverse 
the subtree instead of querying it?
and 
bq. I wonder what would happen if there is no index on the mixin type 
sling:Message? Wouldn't that make the query fast?

I've tested this (and fixed OAK-1269 in the process), and it looks like it would 
solve this issue: removing the node type index for sling:Message makes the 
query fall back to a traversal, whose cost is minimal compared to the original 
problem.

On a broader scope, I agree with Jukka that we should look into applying an 
optimization similar to the Jackrabbit case: buffer the left-side results and 
push the intermediate values to the right side of the join as a filter. That 
could be tracked in a dedicated issue, though.
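
To illustrate the idea, here is a rough in-memory sketch. It is not Oak or 
Jackrabbit code: the class JoinSketch, its methods and the rowsRead counter 
are all hypothetical, and a sorted set of paths stands in for the right side 
of the join. It only contrasts re-running the unrestricted right-side query 
per left-side row with buffering the left-side paths and using each one as a 
path filter on the right side.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

public class JoinSketch {
    // Hypothetical counter: right-side rows touched by the last join.
    static long rowsRead = 0;

    // Naive join: scan every sling:Message path once per left-side row
    // and only afterwards filter by the descendant condition.
    static List<String> naiveJoin(List<String> languageRoots, TreeSet<String> messages) {
        List<String> result = new ArrayList<>();
        for (String root : languageRoots) {
            for (String path : messages) {
                rowsRead++;
                if (path.startsWith(root + "/")) {
                    result.add(path);
                }
            }
        }
        return result;
    }

    // Push-down join: buffer the left-side rows, then use each buffered
    // path as a filter so the right side only reads that subtree.
    static List<String> pushDownJoin(List<String> languageRoots, TreeSet<String> messages) {
        List<String> buffered = new ArrayList<>(languageRoots);
        List<String> result = new ArrayList<>();
        for (String root : buffered) {
            // '0' is the character right after '/', so this range holds
            // exactly the paths that start with root + "/".
            for (String path : messages.subSet(root + "/", root + "0")) {
                rowsRead++;
                result.add(path);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up content: two i18n folders, three languages, three messages each.
        TreeSet<String> messages = new TreeSet<>();
        for (String lang : new String[] {"en", "de", "fr"}) {
            for (int i = 0; i < 3; i++) {
                messages.add("/libs/a/i18n/" + lang + "/msg" + i);
                messages.add("/libs/b/i18n/" + lang + "/msg" + i);
            }
        }
        List<String> roots = List.of("/libs/a/i18n/en", "/libs/b/i18n/en");

        rowsRead = 0;
        List<String> naive = naiveJoin(roots, messages);
        System.out.println("naive: " + naive.size() + " hits, " + rowsRead + " rows read");

        rowsRead = 0;
        List<String> pushed = pushDownJoin(roots, messages);
        System.out.println("push-down: " + pushed.size() + " hits, " + rowsRead + " rows read");
    }
}
```

Both variants return the same rows; the difference is only in how many 
right-side rows get read. In the naive variant that number grows with the 
number of left-side hits times the total number of messages, which is the 
pattern described in the issue.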

This issue is now a matter of index configuration, which is outside the 
indexing code, so I will mark it as resolved soon if nobody objects.

> Query: optimize for sling's i18n support
> ----------------------------------------
>
>                 Key: OAK-1236
>                 URL: https://issues.apache.org/jira/browse/OAK-1236
>             Project: Jackrabbit Oak
>          Issue Type: Improvement
>          Components: query
>            Reporter: Alex Parvulescu
>            Assignee: Alex Parvulescu
>
> There are some performance issues with Sling's internationalization support 
> query [0].
> The query for a specific locale looks like the following:
> {noformat}
> //element(*,mix:language)[@jcr:language='en']//element(*,sling:Message)[@sling:message]/(@sling:key|@sling:message)
> {noformat}
> This turns into a join and it looks like it cannot properly leverage the 
> index on the left side to filter out content on the right side of the join.
> I'm going to use a standard CQ setup for the following analysis.
> The left side of the join is quite efficient with a property index
> {noformat}
> //element(*,mix:language)[@jcr:language='en']
> /libs/foundation/components/search/i18n/en
> /libs/foundation/components/mobilefooter/i18n/en
> /libs/commerce/components/search/i18n/en
> /libs/cq/searchpromote/components/pagination/i18n/en
> {noformat}
> This is a fast query; so far so good.
> The trouble begins when running the right side:
> {noformat}
> //element(*,sling:Message)[@sling:message]/(@sling:key|@sling:message)
> {noformat}
> As far as I can see, the biggest issue here is that the second query doesn't 
> leverage the left-side join info. This affects the overall query time twice:
>  - first, it doesn't know that we're only looking for 'en', so the query will 
> traverse all the existing translations in all the languages (up to 91k 
> rows). So it will fetch 91k rows each time, filtering for English at a 
> later phase;
>  - second, it appears to run the query once for each left-side hit, in our 
> case 4 times, making the first issue 4 times worse.
> [0] http://sling.apache.org/site/internationalization-support.html



--
This message was sent by Atlassian JIRA
(v6.1#6144)
