[
https://issues.apache.org/jira/browse/OAK-1236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13841334#comment-13841334
]
Alex Parvulescu commented on OAK-1236:
--------------------------------------
Funnily enough, I think the following two statements have the same effect:
bq. would it be faster, to just search for all language roots and then traverse
the subtree instead of querying it?
and
bq. I wonder what would happen if there is no index on the mixin type
sling:Message? Wouldn't that make the query fast?
I've tested this (and fixed OAK-1269 in the process) and it looks like it would
solve this issue: removing the node type index for sling:Message causes a
traversal instead, whose cost is minimal compared to the original issue.
On a broader scope, I agree with Jukka that we should look into applying an
optimization similar to the one in Jackrabbit: buffer the left-side results and
push the intermediate values to the right side of the join as a filter. That
could be tracked in a dedicated issue.
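
To make the idea concrete, here is a minimal sketch of the two join strategies.
It is not Oak or Jackrabbit code: the class and method names are hypothetical,
node paths are plain strings, and a sorted set stands in for whatever lets the
right side honour a path restriction. It only assumes that the join condition
is "right node is a descendant of left node", as in the query from the
description.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.NavigableSet;

/** Hypothetical illustration only; none of these names are Oak or Jackrabbit APIs. */
final class JoinPushdownSketch {

    /**
     * Naive nested loop: the right side is rescanned in full for every
     * left-side hit, and the descendant check is applied afterwards.
     */
    static List<String> naiveJoin(List<String> leftRoots, Collection<String> rightPaths) {
        List<String> result = new ArrayList<>();
        for (String root : leftRoots) {
            for (String path : rightPaths) {
                if (path.startsWith(root + "/")) {
                    result.add(path);
                }
            }
        }
        return result;
    }

    /**
     * Buffer the left-side results, then hand each buffered root to the
     * right side as a path filter, so only that subtree is read.
     */
    static List<String> pushedDownJoin(List<String> leftRoots, NavigableSet<String> rightPaths) {
        List<String> buffered = new ArrayList<>(leftRoots);
        List<String> result = new ArrayList<>();
        for (String root : buffered) {
            // '0' is the character right after '/', so the half-open range
            // [root + "/", root + "0") covers exactly the paths below root.
            result.addAll(rightPaths.subSet(root + "/", root + "0"));
        }
        return result;
    }
}
```

Both methods return the same rows when the right side is sorted; the difference
is that the second one never touches translations outside the buffered
language roots.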
This issue is now a matter of index configuration, which is outside the
indexing code, so I will mark it as resolved soon if nobody objects.
> Query: optimize for sling's i18n support
> ----------------------------------------
>
> Key: OAK-1236
> URL: https://issues.apache.org/jira/browse/OAK-1236
> Project: Jackrabbit Oak
> Issue Type: Improvement
> Components: query
> Reporter: Alex Parvulescu
> Assignee: Alex Parvulescu
>
> There are some performance issues with Sling's internationalization support
> query [0].
> The query for a specific locale looks like the following
> {noformat}
> //element(*,mix:language)[@jcr:language='en']//element(*,sling:Message)[@sling:message]/(@sling:key|@sling:message)
> {noformat}
> This turns into a join and it looks like it cannot properly leverage the
> index on the left side to filter out content on the right side of the join.
> I'm going to use a standard CQ setup for the following analysis.
> The left side of the join is quite efficient with a property index
> {noformat}
> //element(*,mix:language)[@jcr:language='en']
> /libs/foundation/components/search/i18n/en
> /libs/foundation/components/mobilefooter/i18n/en
> /libs/commerce/components/search/i18n/en
> /libs/cq/searchpromote/components/pagination/i18n/en
> {noformat}
> This is a fast query, so far so good.
> Now the trouble begins when running the right side:
> {noformat}
> //element(*,sling:Message)[@sling:message]/(@sling:key|@sling:message)
> {noformat}
> As far as I can see, the biggest issue here is that the second query doesn't
> leverage the left-side join info. This affects the overall query time in two
> ways:
> - first, it doesn't know that we're only looking for 'en', so the query
> traverses all the existing translations in all languages (up to 91k
> rows). It fetches 91k rows each time and filters for English only at a
> later phase
> - second, it appears to run the query once for each left-side hit, in our
> case 4 times, making the first issue 4 times worse.
> [0] http://sling.apache.org/site/internationalization-support.html