[jira] [Commented] (OAK-10643) MongoDocumentStore: improve diagnostics for too large docs
[ https://issues.apache.org/jira/browse/OAK-10643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17817488#comment-17817488 ]

Julian Reschke commented on OAK-10643:
--

trunk: [66b8bef296|https://github.com/apache/jackrabbit-oak/commit/66b8bef296b132e821a26b2486cfa5339393395b]

> MongoDocumentStore: improve diagnostics for too large docs
> --
>
> Key: OAK-10643
> URL: https://issues.apache.org/jira/browse/OAK-10643
> Project: Jackrabbit Oak
> Issue Type: Improvement
> Components: documentmk
> Reporter: Julian Reschke
> Assignee: Julian Reschke
> Priority: Major
> Labels: candidate_oak_1_22
> Fix For: 1.62.0
>
>
> Log or add to exception message (or both):
> - attempted UpdateOp
> - statistics about the document that was too large to be updated (that would
> require a read from Mongo)
> Later on, we may want to extend this so that higher layers
> (DocumentNodeStore) can try some kind of recovery.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
[jira] [Updated] (OAK-10643) MongoDocumentStore: improve diagnostics for too large docs
[ https://issues.apache.org/jira/browse/OAK-10643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Julian Reschke updated OAK-10643:
-
Labels: candidate_oak_1_22 (was: )

> MongoDocumentStore: improve diagnostics for too large docs
> --
>
> Key: OAK-10643
> URL: https://issues.apache.org/jira/browse/OAK-10643
> Project: Jackrabbit Oak
> Issue Type: Improvement
> Components: documentmk
> Reporter: Julian Reschke
> Assignee: Julian Reschke
> Priority: Major
> Labels: candidate_oak_1_22
> Fix For: 1.62.0
>
>
> Log or add to exception message (or both):
> - attempted UpdateOp
> - statistics about the document that was too large to be updated (that would
> require a read from Mongo)
> Later on, we may want to extend this so that higher layers
> (DocumentNodeStore) can try some kind of recovery.
[jira] [Resolved] (OAK-10643) MongoDocumentStore: improve diagnostics for too large docs
[ https://issues.apache.org/jira/browse/OAK-10643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Julian Reschke resolved OAK-10643.
--
Fix Version/s: 1.62.0
Resolution: Fixed

> MongoDocumentStore: improve diagnostics for too large docs
> --
>
> Key: OAK-10643
> URL: https://issues.apache.org/jira/browse/OAK-10643
> Project: Jackrabbit Oak
> Issue Type: Improvement
> Components: documentmk
> Reporter: Julian Reschke
> Assignee: Julian Reschke
> Priority: Major
> Fix For: 1.62.0
>
>
> Log or add to exception message (or both):
> - attempted UpdateOp
> - statistics about the document that was too large to be updated (that would
> require a read from Mongo)
> Later on, we may want to extend this so that higher layers
> (DocumentNodeStore) can try some kind of recovery.
[jira] [Resolved] (OAK-10281) Introduce recoveryDelay to ClusterNodeInfo.isRecoveryNeeded
[ https://issues.apache.org/jira/browse/OAK-10281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stefan Egli resolved OAK-10281.
---
Resolution: Fixed

* merged [https://github.com/apache/jackrabbit-oak/pull/1288]
* created OAK-10651 to look into improvements

> Introduce recoveryDelay to ClusterNodeInfo.isRecoveryNeeded
> ---
>
> Key: OAK-10281
> URL: https://issues.apache.org/jira/browse/OAK-10281
> Project: Jackrabbit Oak
> Issue Type: Task
> Components: documentmk
> Reporter: Stefan Egli
> Assignee: Stefan Egli
> Priority: Major
> Fix For: 1.62.0
>
>
> Oak instances periodically update their leases to signal to peers in the
> cluster that they are still alive. A lease that has timed out is hence taken
> as an indication that the corresponding Oak instance has crashed (and not
> released the lease). It is also assumed that the corresponding, crashing Oak
> instance does not do any further write operations after the lease timeout -
> as it would otherwise have been alive and updated its lease, which it did
> not.
> As already reported elsewhere (e.g. OAK-10254), there is a case where writes
> do happen later than the lease timeout (aka "late writes"): a writing
> thread could get past the lease check, then a stop-the-world pause (e.g. a
> long JVM GC) could halt the thread for more than the lease timeout (e.g.
> 2min), and upon continuation that writing thread could then send the write
> operation to the DocumentStore.
> One way to mitigate this late-write risk is to delay the recovery, i.e. to
> wait with the LastRevRecovery for e.g. 10min after a lease failure. That
> includes putting the state of the clusterNode back into inactive.
> This ticket is about introducing such a recoveryDelay config parameter.
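The delayed-recovery idea from the ticket above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code merged for OAK-10281: the class name RecoveryDelaySketch, the method signature and the parameter names are made up for this example. Only the idea itself (an expired lease triggers recovery only after an additional recoveryDelay has elapsed) comes from the ticket.

```java
/**
 * Hypothetical sketch of a delayed recovery check. Names and signature are
 * illustrative assumptions, not Oak's actual ClusterNodeInfo API.
 */
final class RecoveryDelaySketch {

    private RecoveryDelaySketch() {
    }

    /**
     * @param nowMillis           current time in milliseconds
     * @param leaseEndMillis      time at which the other instance's lease expires
     * @param recoveryDelayMillis extra wait after lease expiry (0 means no delay)
     * @return true once the lease has expired and the delay has elapsed as well
     */
    static boolean isRecoveryNeeded(long nowMillis, long leaseEndMillis, long recoveryDelayMillis) {
        if (recoveryDelayMillis < 0) {
            throw new IllegalArgumentException("recoveryDelayMillis must not be negative");
        }
        // Time since the lease ran out; zero or negative while the lease is still valid.
        long expiredFor = nowMillis - leaseEndMillis;
        return expiredFor > 0 && expiredFor >= recoveryDelayMillis;
    }
}
```

With a delay of 10min, a late write that arrives a few minutes after the lease timeout would still happen before recovery starts, which is the risk the ticket wants to reduce.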
[jira] [Created] (OAK-10651) Improve ClusterNodeInfo.recoveryDelayMillis
Stefan Egli created OAK-10651:
-
Summary: Improve ClusterNodeInfo.recoveryDelayMillis
Key: OAK-10651
URL: https://issues.apache.org/jira/browse/OAK-10651
Project: Jackrabbit Oak
Issue Type: Task
Components: documentmk
Reporter: Stefan Egli

In OAK-10281 a static ClusterNodeInfo.recoveryDelayMillis has been introduced. While not a drama, preferably we'd have it non-static, e.g. bound to some config/context or just to DocumentNodeStore instead.

This ticket is to revisit this static in the context of some broader refactoring that might, e.g., also include the similarly static clock object.

Several ideas were discussed in [PR#1288|https://github.com/apache/jackrabbit-oak/pull/1288#issuecomment-1921925331], e.g. [PR#1292|https://github.com/apache/jackrabbit-oak/pull/1292] or [PR#1301|https://github.com/apache/jackrabbit-oak/pull/1301], that could serve as a basis for future discussions.
[jira] [Comment Edited] (OAK-10648) "IS NULL" (Null Props) Cause Incorrect Query Estimation
[ https://issues.apache.org/jira/browse/OAK-10648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17817235#comment-17817235 ]

Thomas Mueller edited comment on OAK-10648 at 2/14/24 3:00 PM:
---

I didn't test this yet, but the following change seems to be necessary:

https://github.com/apache/jackrabbit-oak/blob/trunk/oak-search/src/main/java/org/apache/jackrabbit/oak/plugins/index/search/spi/query/FulltextIndexPlanner.java#L851

{noformat}
// oak-search FulltextIndexPlanner
if (pr.isNotNullRestriction()) {
    // don't use weight for "is not null" restrictions
    weight = 1;
// -- missing code start --
} else if (pr.isNullRestriction()) {
    // don't use weight for "is null" restrictions
    weight = 1;
// -- missing code end --
} else {
    if (weight > 1) {
        // for non-equality conditions such as
        // where x > 1, x < 2, x like y,...:
        // use a maximum weight of 3,
        // so assume we read at least 30%
        if (!isEqualityRestriction(pr)) {
            weight = Math.min(3, weight);
        }
    }
}
{noformat}

We should probably add a feature toggle / system property so that we can switch back to the original behavior in case an application relies on it.

was (Author: tmueller):
I didn't test this yet, but the following change seems to be necessary:

{noformat}
// oak-search FulltextIndexPlanner
if (pr.isNotNullRestriction()) {
    // don't use weight for "is not null" restrictions
    weight = 1;
// -- missing code start --
} else if (pr.isNullRestriction()) {
    // don't use weight for "is null" restrictions
    weight = 1;
// -- missing code end --
} else {
    if (weight > 1) {
        // for non-equality conditions such as
        // where x > 1, x < 2, x like y,...:
        // use a maximum weight of 3,
        // so assume we read at least 30%
        if (!isEqualityRestriction(pr)) {
            weight = Math.min(3, weight);
        }
    }
}
{noformat}

We should probably add a feature toggle / system property so that we can switch back to the original behavior in case an application relies on it.
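The feature toggle suggested in the comment could look roughly like the sketch below. It is a standalone illustration, not a patch for FulltextIndexPlanner: the class NullRestrictionWeightSketch, the property name and the boolean parameters (which stand in for pr.isNotNullRestriction(), pr.isNullRestriction() and isEqualityRestriction(pr)) are assumptions made for this example.

```java
/**
 * Hypothetical sketch of the proposed weight logic with a system property
 * toggle. All names are illustrative assumptions.
 */
final class NullRestrictionWeightSketch {

    /** Made-up property name; the real name would have to be agreed on in the ticket. */
    static final String TOGGLE_PROPERTY = "sketch.ignoreWeightForNullRestriction";

    private NullRestrictionWeightSketch() {
    }

    /** Reads the toggle; the new behavior is on unless explicitly switched off. */
    static boolean toggleEnabled() {
        return Boolean.parseBoolean(System.getProperty(TOGGLE_PROPERTY, "true"));
    }

    static int effectiveWeight(int weight, boolean notNullRestriction, boolean nullRestriction,
            boolean equalityRestriction, boolean ignoreWeightForNull) {
        if (notNullRestriction) {
            // don't use weight for "is not null" restrictions
            return 1;
        }
        if (nullRestriction && ignoreWeightForNull) {
            // proposed change: don't use weight for "is null" restrictions either
            return 1;
        }
        if (weight > 1 && !equalityRestriction) {
            // non-equality conditions: use a maximum weight of 3
            return Math.min(3, weight);
        }
        return weight;
    }
}
```

Passing false for ignoreWeightForNull reproduces the behavior before the proposed change, which is what an application relying on the current cost estimates would switch back to.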
> "IS NULL" (Null Props) Cause Incorrect Query Estimation > --- > > Key: OAK-10648 > URL: https://issues.apache.org/jira/browse/OAK-10648 > Project: Jackrabbit Oak > Issue Type: Bug > Components: indexing >Reporter: Patrique Legault >Priority: Major > Attachments: Non Union Query Plan.json, Non Union With Null > Check.json, Screenshot 2024-02-13 at 9.30.43 AM.png, Union Query Plan.json, > cqTagLucene.json > > > Using null props in a query can cause the query engine to incorrectly > estimate the cost of query plan which can lead to a traversal and slow > queries to execute. > If you look at the query plan below the number of null props documents is > quiet high yet the cost for the query is only 19. When we execute the UNION > query the cost is 38 which is why it is not selected when in reality the > original cost should be much higher. > After removing the null check the cost estimation is drastically different > and correctly reflects the number of documents in the index. > Queries: > {noformat} > SELECT * FROM [cq:Tag] > WHERE [cq:movedTo] IS NULL > AND (LOWER([jcr:title.en]) LIKE '%ksb1325bm%' OR LOWER([jcr:title]) LIKE > '%ksb1325bm%') > {noformat} > > {noformat} > SELECT * FROM [cq:Tag] > WHERE [cq:movedTo] IS NULL > AND LOWER([jcr:title.en]) LIKE '%ksb1325bm%' > UNION > SELECT * FROM [cq:Tag] > WHERE [cq:movedTo] IS NULL > AND LOWER([jcr:title]) LIKE '%ksb1325bm%' > {noformat} > Index definition for the "cq:movedTo" property: > {noformat} > "cqMovedTo": { > "notNullCheckEnabled": true, > "nullCheckEnabled": true, > "propertyIndex": true, > "name": "cq:movedTo", > "type": "String" > } > {noformat} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (OAK-10650) MongoDocumentStore.findDocuments can fail with BSON exception
[ https://issues.apache.org/jira/browse/OAK-10650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17817405#comment-17817405 ]

Julian Reschke commented on OAK-10650:
--

trunk: [f165691e0b|https://github.com/apache/jackrabbit-oak/commit/f165691e0bff0aa7ed5a2650a11dd52630181b20]

> MongoDocumentStore.findDocuments can fail with BSON exception
> -
>
> Key: OAK-10650
> URL: https://issues.apache.org/jira/browse/OAK-10650
> Project: Jackrabbit Oak
> Issue Type: Bug
> Components: documentmk
> Reporter: Julian Reschke
> Assignee: Julian Reschke
> Priority: Minor
> Labels: candidate_oak_1_22
> Fix For: 1.62.0
>
>
> This can happen in an edge case where the BSON condition exceeds the 16MB
> limit (see the test for OAK-10642).
> The quick fix is to catch the exception and then use a simplified version of
> the method that gets the documents one-by-one.
> Mid-term, we may want to refactor this so that we avoid the exception by
> limiting the size of the BSON condition proactively.
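The quick fix described in the ticket (catch the exception, then fetch the documents one-by-one) follows a common fallback pattern, sketched below. This is not the code from the commit: FindDocumentsFallbackSketch, ConditionTooLargeException and the two function parameters are hypothetical stand-ins for the bulk query, the driver exception and the single-document lookup.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

/**
 * Hypothetical sketch of "bulk lookup with one-by-one fallback".
 * None of these names exist in Oak; they are assumptions for illustration.
 */
final class FindDocumentsFallbackSketch {

    /** Stand-in for the exception raised when the query condition is too large. */
    static final class ConditionTooLargeException extends RuntimeException {
        ConditionTooLargeException(String message) {
            super(message);
        }
    }

    private FindDocumentsFallbackSketch() {
    }

    static <K, D> List<D> findWithFallback(List<K> keys,
            Function<List<K>, List<D>> bulkFind, Function<K, D> findOne) {
        try {
            // Fast path: a single query with a condition covering all keys.
            return bulkFind.apply(keys);
        } catch (ConditionTooLargeException e) {
            // Slow path: one lookup per key, skipping keys without a document.
            List<D> result = new ArrayList<>();
            for (K key : keys) {
                D doc = findOne.apply(key);
                if (doc != null) {
                    result.add(doc);
                }
            }
            return result;
        }
    }
}
```

The mid-term idea from the ticket, limiting the size of the condition proactively, would instead split the keys into batches before querying, so the exception never occurs.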
[jira] [Resolved] (OAK-10650) MongoDocumentStore.findDocuments can fail with BSON exception
[ https://issues.apache.org/jira/browse/OAK-10650?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Julian Reschke resolved OAK-10650.
--
Fix Version/s: 1.62.0
Resolution: Fixed

> MongoDocumentStore.findDocuments can fail with BSON exception
> -
>
> Key: OAK-10650
> URL: https://issues.apache.org/jira/browse/OAK-10650
> Project: Jackrabbit Oak
> Issue Type: Bug
> Components: documentmk
> Reporter: Julian Reschke
> Assignee: Julian Reschke
> Priority: Minor
> Fix For: 1.62.0
>
>
> This can happen in an edge case where the BSON condition exceeds the 16MB
> limit (see the test for OAK-10642).
> The quick fix is to catch the exception and then use a simplified version of
> the method that gets the documents one-by-one.
> Mid-term, we may want to refactor this so that we avoid the exception by
> limiting the size of the BSON condition proactively.
[jira] [Updated] (OAK-10650) MongoDocumentStore.findDocuments can fail with BSON exception
[ https://issues.apache.org/jira/browse/OAK-10650?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Julian Reschke updated OAK-10650:
-
Labels: candidate_oak_1_22 (was: )

> MongoDocumentStore.findDocuments can fail with BSON exception
> -
>
> Key: OAK-10650
> URL: https://issues.apache.org/jira/browse/OAK-10650
> Project: Jackrabbit Oak
> Issue Type: Bug
> Components: documentmk
> Reporter: Julian Reschke
> Assignee: Julian Reschke
> Priority: Minor
> Labels: candidate_oak_1_22
> Fix For: 1.62.0
>
>
> This can happen in an edge case where the BSON condition exceeds the 16MB
> limit (see the test for OAK-10642).
> The quick fix is to catch the exception and then use a simplified version of
> the method that gets the documents one-by-one.
> Mid-term, we may want to refactor this so that we avoid the exception by
> limiting the size of the BSON condition proactively.
[jira] [Created] (OAK-10650) MongoDocumentStore.findDocuments can fail with BSON exception
Julian Reschke created OAK-10650:

Summary: MongoDocumentStore.findDocuments can fail with BSON exception
Key: OAK-10650
URL: https://issues.apache.org/jira/browse/OAK-10650
Project: Jackrabbit Oak
Issue Type: Bug
Components: documentmk
Reporter: Julian Reschke
Assignee: Julian Reschke

This can happen in an edge case where the BSON condition exceeds the 16MB limit (see the test for OAK-10642).

The quick fix is to catch the exception and then use a simplified version of the method that gets the documents one-by-one.

Mid-term, we may want to refactor this so that we avoid the exception by limiting the size of the BSON condition proactively.
[jira] [Comment Edited] (OAK-10641) DocumentStore: improve test coverage for large properties / documents
[ https://issues.apache.org/jira/browse/OAK-10641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17815702#comment-17815702 ]

Julian Reschke edited comment on OAK-10641 at 2/14/24 11:34 AM:

trunk: [07fbc86a9a|https://github.com/apache/jackrabbit-oak/commit/07fbc86a9a8a241f4542cc9cb79f339a0e899c3a] [ed1274c878|https://github.com/apache/jackrabbit-oak/commit/ed1274c87866eaa7b7ef67bee5027150871fc09c]

was (Author: reschke):
trunk: [ed1274c878|https://github.com/apache/jackrabbit-oak/commit/ed1274c87866eaa7b7ef67bee5027150871fc09c]

> DocumentStore: improve test coverage for large properties / documents
> -
>
> Key: OAK-10641
> URL: https://issues.apache.org/jira/browse/OAK-10641
> Project: Jackrabbit Oak
> Issue Type: Test
> Components: documentmk
> Reporter: Julian Reschke
> Assignee: Julian Reschke
> Priority: Major
> Labels: candidate_oak_1_22
> Fix For: 1.62.0
>
>
> In BasicDocumentStore, we already test large string properties upon document
> creation (but only up to 8MB).
> Add tests for document *updates*, and also for adding large properties for
> existing docs.
> Note that these tests will always pass, they just exercise the store impl up
> to the limit and log the results.
[jira] [Updated] (OAK-10635) BundledTypeRegistry's use of shaded Guava problematic when used outside Oak
[ https://issues.apache.org/jira/browse/OAK-10635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Julian Reschke updated OAK-10635:
-
Labels: candidate_oak_1_22 (was: )

> BundledTypeRegistry's use of shaded Guava problematic when used outside Oak
> ---
>
> Key: OAK-10635
> URL: https://issues.apache.org/jira/browse/OAK-10635
> Project: Jackrabbit Oak
> Issue Type: Improvement
> Components: documentmk
> Reporter: Mark Adamcin
> Priority: Minor
> Labels: candidate_oak_1_22
> Fix For: 1.62.0
>
>
> The oak-shaded-guava bundle exports shaded guava packages with a version that
> is defined by Google to match the version of the upstream artifact. While it
> is a semantic versioning scheme, it follows the API contract of the entire
> artifact, and does not distinguish API changes in included packages like
> .base and .collect at a granular level, which can result in otherwise
> avoidable OSGi wiring errors when references to guava types leak outside of
> the greater Oak API boundary, such as when classes are embedded or when guava
> types are explicitly referenced in signatures outside of oak-shaded-guava.
> oak-commons should endeavor to provide a stable facade API for the simpler
> parts of the guava library that are referenced at runtime by other oak
> bundles, such as newHashMap(), ImmutableList.copyOf(), Preconditions.check*,
> and perhaps Closer.
> One example I know of where I could benefit from this approach almost
> immediately is a project where I am embedding
> BundlingConfigInitializer and BundledTypesRegistry from oak-store-document in
> a customized repository configuration. When BundledTypesRegistry is embedded,
> it brings with it imports of ImmutableMap, Maps, and Sets from
> org.apache.jackrabbit.guava.common.collect. With the recent guava upgrade to
> 33.0.0 in OAK-10605 in 1.61-SNAPSHOT, the custom repository bundle fails to
> activate because the previous import-package bounds no longer match:
> {{org.apache.jackrabbit.guava.common.collect;version=[32.1.3,33)}}.
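Why the import range from the ticket stops matching after the upgrade can be shown with a small range check. This is a simplified sketch, not the OSGi framework's resolver: the class VersionRangeSketch is made up for this example, version qualifiers are ignored, and it assumes the shaded package is exported as version 33.0.0 after the upgrade named in the ticket.

```java
/**
 * Hypothetical, simplified check of a half-open version range "[low,high)".
 * Only numeric major.minor.micro parts are handled; qualifiers are not.
 */
final class VersionRangeSketch {

    private VersionRangeSketch() {
    }

    /** Parses "major[.minor[.micro]]" into three ints; missing parts default to 0. */
    static int[] parse(String version) {
        String[] parts = version.trim().split("\\.");
        int[] result = new int[3];
        for (int i = 0; i < 3 && i < parts.length; i++) {
            result[i] = Integer.parseInt(parts[i]);
        }
        return result;
    }

    /** Compares two parsed versions part by part. */
    static int compare(int[] a, int[] b) {
        for (int i = 0; i < 3; i++) {
            if (a[i] != b[i]) {
                return Integer.compare(a[i], b[i]);
            }
        }
        return 0;
    }

    /** True if lowInclusive <= version < highExclusive. */
    static boolean inRange(String version, String lowInclusive, String highExclusive) {
        int[] v = parse(version);
        return compare(v, parse(lowInclusive)) >= 0 && compare(v, parse(highExclusive)) < 0;
    }
}
```

Under these assumptions, inRange("32.1.3", "32.1.3", "33") holds while inRange("33.0.0", "32.1.3", "33") does not, so a bundle built against 32.1.3 can no longer be wired to the upgraded export.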
[jira] [Created] (OAK-10649) MemoryDS: add toggle to limit document size
Julian Reschke created OAK-10649:

Summary: MemoryDS: add toggle to limit document size
Key: OAK-10649
URL: https://issues.apache.org/jira/browse/OAK-10649
Project: Jackrabbit Oak
Issue Type: Improvement
Components: documentmk, test
Reporter: Julian Reschke
Assignee: Julian Reschke

To simplify testing related to MongoDB's 16 MB limit.
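One way such a toggle could work is sketched below. This is purely illustrative and not Oak's MemoryDocumentStore: the class SizeLimitedMemoryStoreSketch and its methods are assumptions, and the size of a document is approximated by the UTF-8 length of a string rather than by its BSON size.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

/**
 * Hypothetical in-memory store with an optional size limit, so that tests can
 * simulate a "document too large" failure without a MongoDB instance.
 */
final class SizeLimitedMemoryStoreSketch {

    /** 16 MB, the MongoDB document size limit the ticket refers to. */
    static final int SIXTEEN_MB = 16 * 1024 * 1024;

    private final Map<String, String> documents = new HashMap<>();

    /** Maximum size in bytes; a value <= 0 switches the limit off. */
    private final int maxBytes;

    SizeLimitedMemoryStoreSketch(int maxBytes) {
        this.maxBytes = maxBytes;
    }

    void put(String id, String content) {
        // Crude size estimate; a real store would measure the serialized document.
        int size = content.getBytes(StandardCharsets.UTF_8).length;
        if (maxBytes > 0 && size > maxBytes) {
            throw new IllegalStateException(
                    "document " + id + " too large: " + size + " > " + maxBytes + " bytes");
        }
        documents.put(id, content);
    }

    String get(String id) {
        return documents.get(id);
    }
}
```

A test would create the store with SIXTEEN_MB (or a much smaller limit to keep the test fast) and assert that an oversized write is rejected.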
[jira] [Updated] (OAK-10648) "IS NULL" (Null Props) Cause Incorrect Query Estimation
[ https://issues.apache.org/jira/browse/OAK-10648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Thomas Mueller updated OAK-10648:
-
Description:
Using null props in a query can cause the query engine to incorrectly estimate the cost of a query plan, which can lead to a traversal and slow query execution.

If you look at the query plan below, the number of null props documents is quite high, yet the cost for the query is only 19. When we execute the UNION query the cost is 38, which is why it is not selected, when in reality the original cost should be much higher.

After removing the null check, the cost estimation is drastically different and correctly reflects the number of documents in the index.

Queries:
{noformat}
SELECT * FROM [cq:Tag]
WHERE [cq:movedTo] IS NULL
AND (LOWER([jcr:title.en]) LIKE '%ksb1325bm%' OR LOWER([jcr:title]) LIKE '%ksb1325bm%')
{noformat}

{noformat}
SELECT * FROM [cq:Tag]
WHERE [cq:movedTo] IS NULL
AND LOWER([jcr:title.en]) LIKE '%ksb1325bm%'
UNION
SELECT * FROM [cq:Tag]
WHERE [cq:movedTo] IS NULL
AND LOWER([jcr:title]) LIKE '%ksb1325bm%'
{noformat}

Index definition for the "cq:movedTo" property:
{noformat}
"cqMovedTo": {
    "notNullCheckEnabled": true,
    "nullCheckEnabled": true,
    "propertyIndex": true,
    "name": "cq:movedTo",
    "type": "String"
}
{noformat}

was:
Using null props in a query can cause the query engine to incorrectly estimate the cost of a query plan, which can lead to a traversal and slow query execution.

If you look at the query plan below, the number of null props documents is quite high, yet the cost for the query is only 19. When we execute the UNION query the cost is 38, which is why it is not selected, when in reality the original cost should be much higher.

After removing the null check, the cost estimation is drastically different and correctly reflects the number of documents in the index.

Queries:
{noformat}
SELECT * FROM [cq:Tag]
WHERE [cq:movedTo] IS NULL
AND (LOWER([jcr:title.en]) LIKE '%ksb1325bm%' OR LOWER([jcr:title]) LIKE '%ksb1325bm%')
{noformat}

{noformat}
SELECT * FROM [cq:Tag]
WHERE [cq:movedTo] IS NULL
AND LOWER([jcr:title.en]) LIKE '%ksb1325bm%'
UNION
SELECT * FROM [cq:Tag]
WHERE [cq:movedTo] IS NULL
AND LOWER([jcr:title]) LIKE '%ksb1325bm%'
{noformat}

> "IS NULL" (Null Props) Cause Incorrect Query Estimation
> ---
>
> Key: OAK-10648
> URL: https://issues.apache.org/jira/browse/OAK-10648
> Project: Jackrabbit Oak
> Issue Type: Bug
> Components: indexing
> Reporter: Patrique Legault
> Priority: Major
> Attachments: Non Union Query Plan.json, Non Union With Null Check.json, Screenshot 2024-02-13 at 9.30.43 AM.png, Union Query Plan.json, cqTagLucene.json
>
>
> Using null props in a query can cause the query engine to incorrectly
> estimate the cost of a query plan, which can lead to a traversal and slow
> query execution.
> If you look at the query plan below, the number of null props documents is
> quite high, yet the cost for the query is only 19. When we execute the UNION
> query the cost is 38, which is why it is not selected, when in reality the
> original cost should be much higher.
> After removing the null check, the cost estimation is drastically different
> and correctly reflects the number of documents in the index.
> Queries:
> {noformat}
> SELECT * FROM [cq:Tag]
> WHERE [cq:movedTo] IS NULL
> AND (LOWER([jcr:title.en]) LIKE '%ksb1325bm%' OR LOWER([jcr:title]) LIKE '%ksb1325bm%')
> {noformat}
>
> {noformat}
> SELECT * FROM [cq:Tag]
> WHERE [cq:movedTo] IS NULL
> AND LOWER([jcr:title.en]) LIKE '%ksb1325bm%'
> UNION
> SELECT * FROM [cq:Tag]
> WHERE [cq:movedTo] IS NULL
> AND LOWER([jcr:title]) LIKE '%ksb1325bm%'
> {noformat}
> Index definition for the "cq:movedTo" property:
> {noformat}
> "cqMovedTo": {
>     "notNullCheckEnabled": true,
>     "nullCheckEnabled": true,
>     "propertyIndex": true,
>     "name": "cq:movedTo",
>     "type": "String"
> }
> {noformat}
[jira] [Updated] (OAK-10648) "IS NULL" (Null Props) Cause Incorrect Query Estimation
[ https://issues.apache.org/jira/browse/OAK-10648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Thomas Mueller updated OAK-10648:
-
Summary: "IS NULL" (Null Props) Cause Incorrect Query Estimation (was: Null Props Cause Incorrect Query Estimation)

> "IS NULL" (Null Props) Cause Incorrect Query Estimation
> ---
>
> Key: OAK-10648
> URL: https://issues.apache.org/jira/browse/OAK-10648
> Project: Jackrabbit Oak
> Issue Type: Bug
> Components: indexing
> Reporter: Patrique Legault
> Priority: Major
> Attachments: Non Union Query Plan.json, Non Union With Null Check.json, Screenshot 2024-02-13 at 9.30.43 AM.png, Union Query Plan.json, cqTagLucene.json
>
>
> Using null props in a query can cause the query engine to incorrectly
> estimate the cost of a query plan, which can lead to a traversal and slow
> query execution.
>
> If you look at the query plan below, the number of null props documents is
> quite high, yet the cost for the query is only 19. When we execute the UNION
> query the cost is 38, which is why it is not selected, when in reality the
> original cost should be much higher.
>
> After removing the null check, the cost estimation is drastically different
> and correctly reflects the number of documents in the index.
> Queries:
> {noformat}
> SELECT * FROM [cq:Tag]
> WHERE [cq:movedTo] IS NULL
> AND (LOWER([jcr:title.en]) LIKE '%ksb1325bm%' OR LOWER([jcr:title]) LIKE '%ksb1325bm%')
> {noformat}
>
> {noformat}
> SELECT * FROM [cq:Tag]
> WHERE [cq:movedTo] IS NULL
> AND LOWER([jcr:title.en]) LIKE '%ksb1325bm%'
> UNION
> SELECT * FROM [cq:Tag]
> WHERE [cq:movedTo] IS NULL
> AND LOWER([jcr:title]) LIKE '%ksb1325bm%'
> {noformat}
[jira] [Updated] (OAK-10648) Null Props Cause Incorrect Query Estimation
[ https://issues.apache.org/jira/browse/OAK-10648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Thomas Mueller updated OAK-10648:
-
Description:
Using null props in a query can cause the query engine to incorrectly estimate the cost of a query plan, which can lead to a traversal and slow query execution.

If you look at the query plan below, the number of null props documents is quite high, yet the cost for the query is only 19. When we execute the UNION query the cost is 38, which is why it is not selected, when in reality the original cost should be much higher.

After removing the null check, the cost estimation is drastically different and correctly reflects the number of documents in the index.

Queries:
{noformat}
SELECT * FROM [cq:Tag]
WHERE [cq:movedTo] IS NULL
AND (LOWER([jcr:title.en]) LIKE '%ksb1325bm%' OR LOWER([jcr:title]) LIKE '%ksb1325bm%')
{noformat}

{noformat}
SELECT * FROM [cq:Tag]
WHERE [cq:movedTo] IS NULL
AND LOWER([jcr:title.en]) LIKE '%ksb1325bm%'
UNION
SELECT * FROM [cq:Tag]
WHERE [cq:movedTo] IS NULL
AND LOWER([jcr:title]) LIKE '%ksb1325bm%'
{noformat}

was:
Using null props in a query can cause the query engine to incorrectly estimate the cost of a query plan, which can lead to a traversal and slow query execution.

If you look at the query plan below, the number of null props documents is quite high, yet the cost for the query is only 19. When we execute the UNION query the cost is 38, which is why it is not selected, when in reality the original cost should be much higher.

After removing the null check, the cost estimation is drastically different and correctly reflects the number of documents in the index.

> Null Props Cause Incorrect Query Estimation
> ---
>
> Key: OAK-10648
> URL: https://issues.apache.org/jira/browse/OAK-10648
> Project: Jackrabbit Oak
> Issue Type: Bug
> Components: indexing
> Reporter: Patrique Legault
> Priority: Major
> Attachments: Non Union Query Plan.json, Non Union With Null Check.json, Screenshot 2024-02-13 at 9.30.43 AM.png, Union Query Plan.json, cqTagLucene.json
>
>
> Using null props in a query can cause the query engine to incorrectly
> estimate the cost of a query plan, which can lead to a traversal and slow
> query execution.
>
> If you look at the query plan below, the number of null props documents is
> quite high, yet the cost for the query is only 19. When we execute the UNION
> query the cost is 38, which is why it is not selected, when in reality the
> original cost should be much higher.
>
> After removing the null check, the cost estimation is drastically different
> and correctly reflects the number of documents in the index.
> Queries:
> {noformat}
> SELECT * FROM [cq:Tag]
> WHERE [cq:movedTo] IS NULL
> AND (LOWER([jcr:title.en]) LIKE '%ksb1325bm%' OR LOWER([jcr:title]) LIKE '%ksb1325bm%')
> {noformat}
>
> {noformat}
> SELECT * FROM [cq:Tag]
> WHERE [cq:movedTo] IS NULL
> AND LOWER([jcr:title.en]) LIKE '%ksb1325bm%'
> UNION
> SELECT * FROM [cq:Tag]
> WHERE [cq:movedTo] IS NULL
> AND LOWER([jcr:title]) LIKE '%ksb1325bm%'
> {noformat}