[jira] [Commented] (OAK-10648) Null Props Cause Incorrect Query Estimation

2024-02-13 Thread Thomas Mueller (Jira)


[ 
https://issues.apache.org/jira/browse/OAK-10648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17817235#comment-17817235
 ] 

Thomas Mueller commented on OAK-10648:
--

I haven't tested this yet, but the following change seems to be necessary:

{noformat}
oak-search FulltextIndexPlanner

if (pr.isNotNullRestriction()) {
    // don't use weight for "is not null" restrictions
    weight = 1;
// ---- missing code start ----
} else if (pr.isNullRestriction()) {
    // don't use weight for "is null" restrictions
    weight = 1;
// ---- missing code end ----
} else {
    if (weight > 1) {
        // for non-equality conditions such as
        // where x > 1, x < 2, x like y,...:
        // use a maximum weight of 3,
        // so assume we read at least 30%
        if (!isEqualityRestriction(pr)) {
            weight = Math.min(3, weight);
        }
    }
}
{noformat}

We should probably add a feature toggle / system property so that we can switch 
back to the original behavior in case an application relies on it.
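
A minimal sketch of such a toggle, assuming a hypothetical system property name 
(oak.fulltext.legacyNullRestrictionWeight) and a simplified stand-in for the 
planner logic; neither the property nor the class exists in Oak:

```java
// Sketch only: the class and the property name are hypothetical, not Oak API.
final class NullRestrictionWeightToggle {

    // Hypothetical system property; "true" restores the pre-change behavior.
    static final String LEGACY_PROP = "oak.fulltext.legacyNullRestrictionWeight";

    /** Returns the weight to use for a single property restriction. */
    static int adjustWeight(int weight, boolean isNotNull, boolean isNull, boolean isEquality) {
        // Read on every call so the toggle can be flipped at runtime.
        boolean legacy = Boolean.getBoolean(LEGACY_PROP);
        if (isNotNull) {
            // don't use weight for "is not null" restrictions
            return 1;
        }
        if (isNull && !legacy) {
            // new behavior: don't use weight for "is null" restrictions either
            return 1;
        }
        if (weight > 1 && !isEquality) {
            // non-equality conditions: cap the weight at 3
            return Math.min(3, weight);
        }
        return weight;
    }
}
```

With the property unset, an "is null" restriction gets weight 1; starting the JVM 
with -Doak.fulltext.legacyNullRestrictionWeight=true makes it fall through to 
the general branch again.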

> Null Props Cause Incorrect Query Estimation
> ---
>
> Key: OAK-10648
> URL: https://issues.apache.org/jira/browse/OAK-10648
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: indexing
>Reporter: Patrique Legault
>Priority: Major
> Attachments: Non Union Query Plan.json, Non Union With Null 
> Check.json, Screenshot 2024-02-13 at 9.30.43 AM.png, Union Query Plan.json, 
> cqTagLucene.json
>
>
> Using null props in a query can cause the query engine to incorrectly 
> estimate the cost of a query plan, which can lead to a traversal and slow 
> query execution.
>  
> If you look at the query plan below, the number of null props documents is 
> quite high, yet the cost for the query is only 19. When we execute the UNION 
> query, the cost is 38, which is why it is not selected, when in reality the 
> original cost should be much higher.
>  
> After removing the null check the cost estimation is drastically different 
> and correctly reflects the number of documents in the index.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (OAK-10643) MongoDocumentStore: improve diagnostics for too large docs

2024-02-13 Thread Julian Reschke (Jira)


[ 
https://issues.apache.org/jira/browse/OAK-10643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17817104#comment-17817104
 ] 

Julian Reschke commented on OAK-10643:
--

Example output from a unit test:

{noformat}
18:14:46.339 ERROR [main] MongoDocumentStore.java:1151  Failed to update 
the document with 
Id=org.apache.jackrabbit.oak.plugins.document.BasicDocumentStoreTest.testMaxAddPropertyUpdate
 with MongoWriteException message = 'Resulting document after update is larger 
than 16777216'. Document statistics: _id: 
org.apache.jackrabbit.oak.plugins.document.BasicDocumentStoreTest.testMaxAddPropertyUpdate,
 _modCount: 16, memory: 31462382; Contents: foo: 31461660 bytes in 15 entries 
(2097444 avg), _id: 228 bytes, _modCount: 16 bytes.
{noformat}

> MongoDocumentStore: improve diagnostics for too large docs
> --
>
> Key: OAK-10643
> URL: https://issues.apache.org/jira/browse/OAK-10643
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: documentmk
>Reporter: Julian Reschke
>Assignee: Julian Reschke
>Priority: Major
>
> Log or add to exception message (or both):
> - attempted UpdateOp
> - statistics about the document that was too large to be updated (that would 
> require a read from Mongo)
> Later on, we may want to extend this so that higher layers 
> (DocumentNodeStore) can try some kind of recovery.





[jira] [Created] (OAK-10648) Null Props Cause Incorrect Query Estimation

2024-02-13 Thread Patrique Legault (Jira)
Patrique Legault created OAK-10648:
--

 Summary: Null Props Cause Incorrect Query Estimation
 Key: OAK-10648
 URL: https://issues.apache.org/jira/browse/OAK-10648
 Project: Jackrabbit Oak
  Issue Type: Bug
  Components: indexing
Reporter: Patrique Legault
 Attachments: Non Union Query Plan.json, Non Union With Null 
Check.json, Screenshot 2024-02-13 at 9.30.43 AM.png, Union Query Plan.json, 
cqTagLucene.json

Using null props in a query can cause the query engine to incorrectly estimate 
the cost of a query plan, which can lead to a traversal and slow query 
execution.

 

If you look at the query plan below, the number of null props documents is quite 
high, yet the cost for the query is only 19. When we execute the UNION query, the 
cost is 38, which is why it is not selected, when in reality the original cost 
should be much higher.

 

After removing the null check the cost estimation is drastically different and 
correctly reflects the number of documents in the index.





[jira] [Commented] (OAK-10645) MongoDS docker container: set default Mongo version to 4.4

2024-02-13 Thread Julian Reschke (Jira)


[ 
https://issues.apache.org/jira/browse/OAK-10645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17816996#comment-17816996
 ] 

Julian Reschke commented on OAK-10645:
--

trunk: 
[1b0a692aa4|https://github.com/apache/jackrabbit-oak/commit/1b0a692aa426f4be8ca97ab65eb6629baec86ce1]

> MongoDS docker container: set default Mongo version to 4.4
> --
>
> Key: OAK-10645
> URL: https://issues.apache.org/jira/browse/OAK-10645
> Project: Jackrabbit Oak
>  Issue Type: Task
>  Components: documentmk
>Reporter: Julian Reschke
>Assignee: Julian Reschke
>Priority: Trivial
> Fix For: 1.62.0
>
>






[jira] [Updated] (OAK-10645) MongoDS docker container: set default Mongo version to 4.4

2024-02-13 Thread Julian Reschke (Jira)


 [ 
https://issues.apache.org/jira/browse/OAK-10645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julian Reschke updated OAK-10645:
-
Priority: Trivial  (was: Major)

> MongoDS docker container: set default Mongo version to 4.4
> --
>
> Key: OAK-10645
> URL: https://issues.apache.org/jira/browse/OAK-10645
> Project: Jackrabbit Oak
>  Issue Type: Task
>  Components: documentmk
>Reporter: Julian Reschke
>Assignee: Julian Reschke
>Priority: Trivial
> Fix For: 1.62.0
>
>






[jira] [Resolved] (OAK-10645) MongoDS docker container: set default Mongo version to 4.4

2024-02-13 Thread Julian Reschke (Jira)


 [ 
https://issues.apache.org/jira/browse/OAK-10645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julian Reschke resolved OAK-10645.
--
Fix Version/s: 1.62.0
   Resolution: Fixed

> MongoDS docker container: set default Mongo version to 4.4
> --
>
> Key: OAK-10645
> URL: https://issues.apache.org/jira/browse/OAK-10645
> Project: Jackrabbit Oak
>  Issue Type: Task
>  Components: documentmk
>Reporter: Julian Reschke
>Assignee: Julian Reschke
>Priority: Major
> Fix For: 1.62.0
>
>






[jira] [Updated] (OAK-10639) NodeImpl: calculate mixinTypes lazy

2024-02-13 Thread Julian Reschke (Jira)


 [ 
https://issues.apache.org/jira/browse/OAK-10639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julian Reschke updated OAK-10639:
-
Labels: candidate_oak_1_22  (was: )

> NodeImpl: calculate mixinTypes lazy
> ---
>
> Key: OAK-10639
> URL: https://issues.apache.org/jira/browse/OAK-10639
> Project: Jackrabbit Oak
>  Issue Type: Task
>  Components: jcr
>Reporter: Joerg Hoh
>Assignee: Joerg Hoh
>Priority: Major
>  Labels: candidate_oak_1_22
> Fix For: 1.62.0
>
>
> NodeImpl.isNodeType() calls ReadOnlyTypeManager.isNodeType(), but calculates 
> all mixinTypes directly, even if the mixinTypes are not required in 
> ReadOnlyTypeManager.isNodeType().
> The calculation of the mixinTypes could be converted into a Supplier<> type; 
> and the resolution could be done only when the mixinTypes are actually 
> required. This would save a few CPU cycles for the common case, where the 
> mixinTypes are not required.
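
The Supplier-based idea described above can be sketched as follows. This is an 
illustration only: LazyMixinSketch, memoize and the simplified isNodeType are 
hypothetical names and do not reflect the actual NodeImpl or 
ReadOnlyTypeManager code.

```java
import java.util.List;
import java.util.function.Supplier;

// Sketch only: hypothetical stand-ins, not the Oak implementation.
final class LazyMixinSketch {

    /** Wraps a supplier so the computation runs at most once (not thread-safe). */
    static <T> Supplier<T> memoize(Supplier<T> delegate) {
        return new Supplier<T>() {
            private T value;
            private boolean computed;

            @Override
            public T get() {
                if (!computed) {
                    value = delegate.get();
                    computed = true;
                }
                return value;
            }
        };
    }

    /** Consults the mixin types only when the primary type does not match. */
    static boolean isNodeType(String primaryType, Supplier<List<String>> mixinTypes, String wanted) {
        if (primaryType.equals(wanted)) {
            return true; // common case: the supplier is never invoked
        }
        return mixinTypes.get().contains(wanted);
    }
}
```

The caller passes memoize(() -> computeMixins()) instead of an eagerly built 
list, so the (assumed expensive) computation is skipped whenever the primary 
type already answers the question.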





[jira] [Resolved] (OAK-10639) NodeImpl: calculate mixinTypes lazy

2024-02-13 Thread Joerg Hoh (Jira)


 [ 
https://issues.apache.org/jira/browse/OAK-10639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joerg Hoh resolved OAK-10639.
-
Fix Version/s: 1.62.0
   Resolution: Fixed

> NodeImpl: calculate mixinTypes lazy
> ---
>
> Key: OAK-10639
> URL: https://issues.apache.org/jira/browse/OAK-10639
> Project: Jackrabbit Oak
>  Issue Type: Task
>  Components: jcr
>Reporter: Joerg Hoh
>Assignee: Joerg Hoh
>Priority: Major
> Fix For: 1.62.0
>
>
> NodeImpl.isNodeType() calls ReadOnlyTypeManager.isNodeType(), but calculates 
> all mixinTypes directly, even if the mixinTypes are not required in 
> ReadOnlyTypeManager.isNodeType().
> The calculation of the mixinTypes could be converted into a Supplier<> type; 
> and the resolution could be done only when the mixinTypes are actually 
> required. This would save a few CPU cycles for the common case, where the 
> mixinTypes are not required.





[jira] [Commented] (OAK-10644) JsopBuilder: remove JDK6ism

2024-02-13 Thread Julian Reschke (Jira)


[ 
https://issues.apache.org/jira/browse/OAK-10644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17816978#comment-17816978
 ] 

Julian Reschke commented on OAK-10644:
--

trunk: 
[f7b20aa777|https://github.com/apache/jackrabbit-oak/commit/f7b20aa777a4d95a8f8a54ee020b86cd35ee30c1]

> JsopBuilder: remove JDK6ism
> ---
>
> Key: OAK-10644
> URL: https://issues.apache.org/jira/browse/OAK-10644
> Project: Jackrabbit Oak
>  Issue Type: Task
>  Components: commons
>Reporter: Julian Reschke
>Assignee: Julian Reschke
>Priority: Minor
>  Labels: candidate_oak_1_22
> Fix For: 1.62.0
>
>
> {noformat}
> default:
> if (c < ' ') {
> buff.append(String.format("\\u%04x", (int) c));
> } else if (c >= 0xd800 && c <= 0xdbff) {
> // isSurrogate(), only available in Java 7
> if (i < length - 1 && Character.isSurrogatePair(c, 
> s.charAt(i + 1))) {
> // ok surrogate
> buff.append(c);
> buff.append(s.charAt(i + 1));
> i += 1;
> } else {
> // broken surrogate -> escape
> buff.append(String.format("\\u%04x", (int) c));
> }
> } else {
> buff.append(c);
> }
> {noformat}
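
One possible way to drop the hand-written range check, sketched under 
assumptions: Character.isSurrogate(char) has been available since Java 7. 
Unlike the quoted check (0xd800 to 0xdbff, high surrogates only), it also 
matches low surrogates, so in this sketch an unpaired low surrogate is escaped 
too. Whether the actual commit does this is not stated in this thread; 
SurrogateEscapeSketch is a hypothetical name.

```java
// Sketch only: mirrors the quoted default branch, not the actual JsopBuilder code.
final class SurrogateEscapeSketch {

    /** Escapes control characters and unpaired surrogates as \\uXXXX. */
    static String escape(String s) {
        StringBuilder buff = new StringBuilder();
        int length = s.length();
        for (int i = 0; i < length; i++) {
            char c = s.charAt(i);
            if (c < ' ') {
                buff.append(String.format("\\u%04x", (int) c));
            } else if (Character.isSurrogate(c)) {
                if (i < length - 1 && Character.isSurrogatePair(c, s.charAt(i + 1))) {
                    // ok surrogate pair: copy both chars unchanged
                    buff.append(c);
                    buff.append(s.charAt(i + 1));
                    i += 1;
                } else {
                    // broken surrogate (high or low) -> escape
                    buff.append(String.format("\\u%04x", (int) c));
                }
            } else {
                buff.append(c);
            }
        }
        return buff.toString();
    }
}
```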





[jira] [Updated] (OAK-10644) JsopBuilder: remove JDK6ism

2024-02-13 Thread Julian Reschke (Jira)


 [ 
https://issues.apache.org/jira/browse/OAK-10644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julian Reschke updated OAK-10644:
-
Labels: candidate_oak_1_22  (was: )

> JsopBuilder: remove JDK6ism
> ---
>
> Key: OAK-10644
> URL: https://issues.apache.org/jira/browse/OAK-10644
> Project: Jackrabbit Oak
>  Issue Type: Task
>  Components: commons
>Reporter: Julian Reschke
>Assignee: Julian Reschke
>Priority: Minor
>  Labels: candidate_oak_1_22
> Fix For: 1.62.0
>
>
> {noformat}
> default:
> if (c < ' ') {
> buff.append(String.format("\\u%04x", (int) c));
> } else if (c >= 0xd800 && c <= 0xdbff) {
> // isSurrogate(), only available in Java 7
> if (i < length - 1 && Character.isSurrogatePair(c, 
> s.charAt(i + 1))) {
> // ok surrogate
> buff.append(c);
> buff.append(s.charAt(i + 1));
> i += 1;
> } else {
> // broken surrogate -> escape
> buff.append(String.format("\\u%04x", (int) c));
> }
> } else {
> buff.append(c);
> }
> {noformat}





[jira] [Resolved] (OAK-10644) JsopBuilder: remove JDK6ism

2024-02-13 Thread Julian Reschke (Jira)


 [ 
https://issues.apache.org/jira/browse/OAK-10644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julian Reschke resolved OAK-10644.
--
Fix Version/s: 1.62.0
   Resolution: Fixed

> JsopBuilder: remove JDK6ism
> ---
>
> Key: OAK-10644
> URL: https://issues.apache.org/jira/browse/OAK-10644
> Project: Jackrabbit Oak
>  Issue Type: Task
>  Components: commons
>Reporter: Julian Reschke
>Assignee: Julian Reschke
>Priority: Minor
> Fix For: 1.62.0
>
>
> {noformat}
> default:
> if (c < ' ') {
> buff.append(String.format("\\u%04x", (int) c));
> } else if (c >= 0xd800 && c <= 0xdbff) {
> // isSurrogate(), only available in Java 7
> if (i < length - 1 && Character.isSurrogatePair(c, 
> s.charAt(i + 1))) {
> // ok surrogate
> buff.append(c);
> buff.append(s.charAt(i + 1));
> i += 1;
> } else {
> // broken surrogate -> escape
> buff.append(String.format("\\u%04x", (int) c));
> }
> } else {
> buff.append(c);
> }
> {noformat}





[jira] [Updated] (OAK-10646) MongoDocumentStore: improve handling of large updates

2024-02-13 Thread Julian Reschke (Jira)


 [ 
https://issues.apache.org/jira/browse/OAK-10646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julian Reschke updated OAK-10646:
-
Description: 
The change for OAK-10127 has only addressed one case - large create operations.

However, there are more code paths that need to be improved:

- "find()" will fail if the serialized conditions exceed 16MB (arguably 
unlikely, but happened in a test - OAK-10642)

- bulk updates can fail when the serialized ops exceed 16MB (although, when run 
one-by-one, they might succeed)

  was:
The change for OAK-10127 has only addressed one case - large create operations.

However, there are more code paths that need to be improved:

- "find()" will fails if the serialized conditions exceed 16MB (arguably 
unlikely, but happened in a test - OAK-10642)

- bulk updates can fail when the serialied ops exceed 16MB (although, when run 
one-by-one, it might suceed)


> MongoDocumentStore: improve handling of large updates
> -
>
> Key: OAK-10646
> URL: https://issues.apache.org/jira/browse/OAK-10646
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: documentmk
>Reporter: Julian Reschke
>Assignee: Julian Reschke
>Priority: Major
>
> The change for OAK-10127 has only addressed one case - large create 
> operations.
> However, there are more code paths that need to be improved:
> - "find()" will fail if the serialized conditions exceed 16MB (arguably 
> unlikely, but happened in a test - OAK-10642)
> - bulk updates can fail when the serialized ops exceed 16MB (although, when 
> run one-by-one, they might succeed)
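
The "bulk fails, one-by-one might succeed" observation suggests a fallback along 
these lines. This is a rough sketch, not the MongoDocumentStore code: 
BulkFallbackSketch and its callbacks are hypothetical, it assumes a failed bulk 
call applied none of the ops (or that ops are idempotent), and a real 
implementation would presumably react only to the size-related failure instead 
of any RuntimeException.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Sketch only: generic fallback strategy with hypothetical callbacks.
final class BulkFallbackSketch {

    /**
     * Tries the whole batch first; if that fails, retries each op on its own.
     * Returns the ops that still failed when applied individually.
     */
    static <T> List<T> applyWithFallback(List<T> ops,
                                         Consumer<List<T>> bulkApply,
                                         Consumer<T> singleApply) {
        try {
            bulkApply.accept(ops);
            return new ArrayList<>();
        } catch (RuntimeException bulkFailure) {
            // Assumption: the failed bulk call left no partial changes behind.
            List<T> failed = new ArrayList<>();
            for (T op : ops) {
                try {
                    singleApply.accept(op);
                } catch (RuntimeException singleFailure) {
                    failed.add(op);
                }
            }
            return failed;
        }
    }
}
```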





[jira] [Updated] (OAK-10645) MongoDS docker container: set default Mongo version to 4.4

2024-02-13 Thread Julian Reschke (Jira)


 [ 
https://issues.apache.org/jira/browse/OAK-10645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julian Reschke updated OAK-10645:
-
Summary: MongoDS docker container: set default Mongo version to 4.4  (was: 
MongoDS docker container: set default Mong version to 4.4)

> MongoDS docker container: set default Mongo version to 4.4
> --
>
> Key: OAK-10645
> URL: https://issues.apache.org/jira/browse/OAK-10645
> Project: Jackrabbit Oak
>  Issue Type: Task
>  Components: documentmk
>Reporter: Julian Reschke
>Assignee: Julian Reschke
>Priority: Major
>



