[ https://issues.apache.org/jira/browse/OAK-10648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17817235#comment-17817235 ]
Thomas Mueller commented on OAK-10648:
--------------------------------------

I didn't test this yet, but the following change seems to be necessary:

{noformat}
oak-search FulltextIndexPlanner

if (pr.isNotNullRestriction()) {
    // don't use weight for "is not null" restrictions
    weight = 1;
---------------- missing code start --------------------------
} else if (pr.isNullRestriction()) {
    // don't use weight for "is null" restrictions
    weight = 1;
---------------- missing code end --------------------------
} else {
    if (weight > 1) {
        // for non-equality conditions such as
        // where x > 1, x < 2, x like y,...:
        // use a maximum weight of 3,
        // so assume we read at least 30%
        if (!isEqualityRestriction(pr)) {
            weight = Math.min(3, weight);
        }
    }
}
{noformat}

We should probably add a feature toggle / system property so that we can switch back to the original behavior in case an application relies on it.

> Null Props Cause Incorrect Query Estimation
> -------------------------------------------
>
>                 Key: OAK-10648
>                 URL: https://issues.apache.org/jira/browse/OAK-10648
>             Project: Jackrabbit Oak
>          Issue Type: Bug
>          Components: indexing
>            Reporter: Patrique Legault
>            Priority: Major
>         Attachments: Non Union Query Plan.json, Non Union With Null Check.json,
>                      Screenshot 2024-02-13 at 9.30.43 AM.png, Union Query Plan.json,
>                      cqTagLucene.json
>
>
> Using null props in a query can cause the query engine to incorrectly
> estimate the cost of a query plan, which can lead to a traversal and slow
> query execution.
>
> If you look at the query plan below, the number of null props documents is
> quite high, yet the cost for the query is only 19. When we execute the UNION
> query the cost is 38, which is why it is not selected, when in reality the
> original cost should be much higher.
>
> After removing the null check, the cost estimation is drastically different
> and correctly reflects the number of documents in the index.
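
The feature toggle suggested in the comment above could look roughly like the following. This is only a minimal sketch with stand-in types, not the actual oak-search code: the class name, the `Kind` enum, the `adjustWeight` helper, and the system property name `oak.sketch.nullRestrictionWeightFix` are all hypothetical and chosen for illustration. It also assumes that an "is null" restriction is not treated as an equality restriction when the toggle is off, which the comment does not state.

```java
// Sketch only: stand-in types, not the real FulltextIndexPlanner.
class NullRestrictionWeightSketch {

    // Hypothetical property name; the comment above does not name one.
    static final String TOGGLE = "oak.sketch.nullRestrictionWeightFix";

    // Simplified stand-in for the kinds of property restriction.
    enum Kind { NOT_NULL, NULL, EQUALITY, OTHER }

    // Reads the toggle; the new behavior is on unless explicitly disabled.
    static boolean fixEnabled() {
        return Boolean.parseBoolean(System.getProperty(TOGGLE, "true"));
    }

    // Mirrors the branch structure of the snippet in the comment above.
    static int adjustWeight(Kind kind, int weight, boolean fixEnabled) {
        if (kind == Kind.NOT_NULL) {
            // don't use weight for "is not null" restrictions
            return 1;
        } else if (kind == Kind.NULL && fixEnabled) {
            // proposed addition: don't use weight for "is null" restrictions
            return 1;
        } else {
            // Assumption: with the toggle off, a NULL restriction falls
            // through here and is handled like a non-equality condition.
            if (weight > 1 && kind != Kind.EQUALITY) {
                // non-equality conditions: cap the weight at 3
                return Math.min(3, weight);
            }
            return weight;
        }
    }

    public static void main(String[] args) {
        int weight = adjustWeight(Kind.NULL, 10, fixEnabled());
        System.out.println("weight for 'is null' restriction: " + weight);
    }
}
```

Running with `-Doak.sketch.nullRestrictionWeightFix=false` would restore the earlier behavior in this sketch, so an application that relies on the old estimates could opt out.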
-- This message was sent by Atlassian Jira (v8.20.10#820010)