Alex Parvulescu created OAK-1236:
------------------------------------
Summary: Query: optimize for sling's i18n support
Key: OAK-1236
URL: https://issues.apache.org/jira/browse/OAK-1236
Project: Jackrabbit Oak
Issue Type: Improvement
Components: query
Reporter: Alex Parvulescu
Assignee: Alex Parvulescu
There are some performance issues with the query used by Sling's
internationalization support [0].
The query for a specific locale looks like the following:
{noformat}
//element(*,mix:language)[@jcr:language='en']//element(*,sling:Message)[@sling:message]/(@sling:key|@sling:message)
{noformat}
This turns into a join, and it looks like the query engine cannot use the
index hits on the left side to restrict the content scanned on the right side
of the join.
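To make the shape of the join concrete, here is a minimal Python sketch of how the XPath above decomposes: a left selector (the language nodes), a right selector (the message nodes), and a descendant-path condition linking them. This is only an illustration under assumptions; `is_descendant`, `descendant_join`, and the message paths are hypothetical and do not reflect Oak's actual implementation.

```python
def is_descendant(path: str, ancestor: str) -> bool:
    """Return True if `path` lies strictly below `ancestor` in the tree."""
    if path == ancestor:
        return False
    # Appending "/" avoids treating /a/bc as a descendant of /a/b.
    return path.startswith(ancestor.rstrip("/") + "/")


def descendant_join(left_paths, right_paths):
    """Pair each left node with every right node found below it."""
    return [
        (left, right)
        for left in left_paths
        for right in right_paths
        if is_descendant(right, left)
    ]


# Left side: one of the mix:language hits listed in the issue.
left = ["/libs/foundation/components/search/i18n/en"]
# Right side: hypothetical sling:Message paths, made up for illustration.
right = [
    "/libs/foundation/components/search/i18n/en/msg1",
    "/libs/foundation/components/search/i18n/de/msg1",
]
pairs = descendant_join(left, right)
```

Only the message under the `en` node survives the join; the `de` message is read and then discarded, which is the wasted work at the heart of this issue.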
I'm going to use a standard CQ setup for the following analysis.
The left side of the join is quite efficient with a property index:
{noformat}
//element(*,mix:language)[@jcr:language='en']
/libs/foundation/components/search/i18n/en
/libs/foundation/components/mobilefooter/i18n/en
/libs/commerce/components/search/i18n/en
/libs/cq/searchpromote/components/pagination/i18n/en
{noformat}
This is a fast query, so far so good.
The trouble begins when running the right side:
{noformat}
//element(*,sling:Message)[@sling:message]/(@sling:key|@sling:message)
{noformat}
As far as I can see, the biggest issue here is that the second query doesn't
use the information from the left side of the join. This affects the overall
query time in two ways:
- first, it doesn't know that we're only looking for 'en', so the query
traverses all the existing translations in all languages (up to 91k rows).
It fetches those 91k rows each time and filters for English only at a later
phase.
- second, it appears to run the right-side query once for each left-side hit,
in our case 4 times, making the first issue 4 times worse.
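The combined effect of the two points above can be sketched as simple arithmetic. The Python below is a rough cost model, not a measurement: the 4 left-side hits and the 91k right-side rows come from this report, while the per-subtree row counts used for the "filtered" case are purely hypothetical placeholders.

```python
def rows_read_nested_loop(left_hits: int, right_rows: int) -> int:
    """Right side is re-run in full once per left-side hit."""
    return left_hits * right_rows


def rows_read_with_path_filter(rows_per_subtree) -> int:
    """Right side restricted to the subtree of each left-side hit."""
    return sum(rows_per_subtree)


LEFT_HITS = 4        # 'en' language nodes found by the property index
RIGHT_ROWS = 91_000  # all sling:Message rows, in every language

current = rows_read_nested_loop(LEFT_HITS, RIGHT_ROWS)  # 364000

# Hypothetical: number of messages under each of the 4 'en' nodes.
assumed_rows_per_subtree = [50, 20, 30, 10]
filtered = rows_read_with_path_filter(assumed_rows_per_subtree)
```

Under these assumptions the current plan reads 364,000 rows, whereas a plan that pushes the left-side path down into the right-side query would read only the rows below the matching language nodes.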
[0] http://sling.apache.org/site/internationalization-support.html
--
This message was sent by Atlassian JIRA
(v6.1#6144)