[
https://issues.apache.org/jira/browse/OAK-10648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Thomas Mueller updated OAK-10648:
-
Description:
Using null props in a query can cause the query engine to incorrectly estimate
the cost of query plan which can lead to a traversal and slow queries to
execute.
If you look at the query plan below the number of null props documents is quiet
high yet the cost for the query is only 19. When we execute the UNION query the
cost is 38 which is why it is not selected when in reality the original cost
should be much higher.
After removing the null check the cost estimation is drastically different and
correctly reflects the number of documents in the index.
Queries:
{noformat}
SELECT * FROM [cq:Tag]
WHERE [cq:movedTo] IS NULL
AND (LOWER([jcr:title.en]) LIKE '%ksb1325bm%' OR LOWER([jcr:title]) LIKE
'%ksb1325bm%')
{noformat}
{noformat}
SELECT * FROM [cq:Tag]
WHERE [cq:movedTo] IS NULL
AND LOWER([jcr:title.en]) LIKE '%ksb1325bm%'
UNION
SELECT * FROM [cq:Tag]
WHERE [cq:movedTo] IS NULL
AND LOWER([jcr:title]) LIKE '%ksb1325bm%'
{noformat}
Index definition for the "cq:movedTo" property:
{noformat}
"cqMovedTo": {
"notNullCheckEnabled": true,
"nullCheckEnabled": true,
"propertyIndex": true,
"name": "cq:movedTo",
"type": "String"
}
{noformat}
was:
Using null props in a query can cause the query engine to incorrectly estimate
the cost of query plan which can lead to a traversal and slow queries to
execute.
If you look at the query plan below the number of null props documents is quiet
high yet the cost for the query is only 19. When we execute the UNION query the
cost is 38 which is why it is not selected when in reality the original cost
should be much higher.
After removing the null check the cost estimation is drastically different and
correctly reflects the number of documents in the index.
Queries:
{noformat}
SELECT * FROM [cq:Tag]
WHERE [cq:movedTo] IS NULL
AND (LOWER([jcr:title.en]) LIKE '%ksb1325bm%' OR LOWER([jcr:title]) LIKE
'%ksb1325bm%')
{noformat}
{noformat}
SELECT * FROM [cq:Tag]
WHERE [cq:movedTo] IS NULL
AND LOWER([jcr:title.en]) LIKE '%ksb1325bm%'
UNION
SELECT * FROM [cq:Tag]
WHERE [cq:movedTo] IS NULL
AND LOWER([jcr:title]) LIKE '%ksb1325bm%'
{noformat}
> "IS NULL" (Null Props) Cause Incorrect Query Estimation
> ---
>
> Key: OAK-10648
> URL: https://issues.apache.org/jira/browse/OAK-10648
> Project: Jackrabbit Oak
> Issue Type: Bug
> Components: indexing
>Reporter: Patrique Legault
>Priority: Major
> Attachments: Non Union Query Plan.json, Non Union With Null
> Check.json, Screenshot 2024-02-13 at 9.30.43 AM.png, Union Query Plan.json,
> cqTagLucene.json
>
>
> Using null props in a query can cause the query engine to incorrectly
> estimate the cost of query plan which can lead to a traversal and slow
> queries to execute.
> If you look at the query plan below the number of null props documents is
> quiet high yet the cost for the query is only 19. When we execute the UNION
> query the cost is 38 which is why it is not selected when in reality the
> original cost should be much higher.
> After removing the null check the cost estimation is drastically different
> and correctly reflects the number of documents in the index.
> Queries:
> {noformat}
> SELECT * FROM [cq:Tag]
> WHERE [cq:movedTo] IS NULL
> AND (LOWER([jcr:title.en]) LIKE '%ksb1325bm%' OR LOWER([jcr:title]) LIKE
> '%ksb1325bm%')
> {noformat}
>
> {noformat}
> SELECT * FROM [cq:Tag]
> WHERE [cq:movedTo] IS NULL
> AND LOWER([jcr:title.en]) LIKE '%ksb1325bm%'
> UNION
> SELECT * FROM [cq:Tag]
> WHERE [cq:movedTo] IS NULL
> AND LOWER([jcr:title]) LIKE '%ksb1325bm%'
> {noformat}
> Index definition for the "cq:movedTo" property:
> {noformat}
> "cqMovedTo": {
> "notNullCheckEnabled": true,
> "nullCheckEnabled": true,
> "propertyIndex": true,
> "name": "cq:movedTo",
> "type": "String"
> }
> {noformat}
--
This message was sent by Atlassian Jira
(v8.20.10#820010)