[ 
https://issues.apache.org/jira/browse/OAK-10648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17817235#comment-17817235
 ] 

Thomas Mueller commented on OAK-10648:
--------------------------------------

I haven't tested this yet, but the following change seems to be necessary:

{noformat}
oak-search FulltextIndexPlanner

                if (pr.isNotNullRestriction()) {
                    // don't use weight for "is not null" restrictions
                    weight = 1;
---------------- missing code start --------------------------
                } else if (pr.isNullRestriction()) {
                    // don't use weight for "is null" restrictions
                    weight = 1;
---------------- missing code end --------------------------
                } else {
                    if (weight > 1) {
                        // for non-equality conditions such as
                        // where x > 1, x < 2, x like y,...:
                        // use a maximum weight of 3,
                        // so assume we read at least 30%
                        if (!isEqualityRestriction(pr)) {
                            weight = Math.min(3, weight);
                        }
                    }
                }
{noformat}

We should probably add a feature toggle / system property so that we can switch 
back to the original behavior in case an application relies on it.
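
A minimal sketch of how such a toggle could look, under stated assumptions. The 
property name "oak.sketch.ignoreNullRestrictionWeight", the class, and the Kind 
enum are hypothetical and only stand in for the real FulltextIndexPlanner code; 
how Oak itself classifies "is null" in isEqualityRestriction is not modeled here.

```java
final class NullRestrictionWeightSketch {

    // Hypothetical property name, not an existing Oak setting.
    static final String PROP = "oak.sketch.ignoreNullRestrictionWeight";

    // Simplified stand-in for the kind of property restriction.
    enum Kind { NOT_NULL, IS_NULL, EQUALITY, OTHER }

    // Defaults to the new behavior; set the property to "false"
    // to switch back to the original behavior.
    static boolean newBehaviorEnabled() {
        return Boolean.parseBoolean(System.getProperty(PROP, "true"));
    }

    static int adjustWeight(Kind kind, int weight, boolean ignoreNullWeight) {
        if (kind == Kind.NOT_NULL) {
            // don't use weight for "is not null" restrictions
            return 1;
        } else if (kind == Kind.IS_NULL && ignoreNullWeight) {
            // proposed change: don't use weight for "is null" restrictions
            return 1;
        } else if (weight > 1 && kind != Kind.EQUALITY) {
            // non-equality conditions: use a maximum weight of 3
            // (assumption: with the toggle off, "is null" lands here)
            return Math.min(3, weight);
        }
        return weight;
    }
}
```

Reading the property once at startup and passing the resulting flag into the 
planner would keep the cost calculation itself free of global state, which 
should make both code paths easy to cover in unit tests.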

> Null Props Cause Incorrect Query Estimation
> -------------------------------------------
>
>                 Key: OAK-10648
>                 URL: https://issues.apache.org/jira/browse/OAK-10648
>             Project: Jackrabbit Oak
>          Issue Type: Bug
>          Components: indexing
>            Reporter: Patrique Legault
>            Priority: Major
>         Attachments: Non Union Query Plan.json, Non Union With Null 
> Check.json, Screenshot 2024-02-13 at 9.30.43 AM.png, Union Query Plan.json, 
> cqTagLucene.json
>
>
> Using null props in a query can cause the query engine to incorrectly 
> estimate the cost of a query plan, which can lead to a traversal and slow 
> query execution.
>  
> If you look at the query plan below, the number of null props documents is 
> quite high, yet the cost for the query is only 19. When we execute the UNION 
> query the cost is 38, which is why it is not selected, when in reality the 
> original cost should be much higher.
>  
> After removing the null check the cost estimation is drastically different 
> and correctly reflects the number of documents in the index.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
