[jira] [Commented] (OAK-5448) Aggregate logic should optimize for case where patterns do not include wildcard
[ https://issues.apache.org/jira/browse/OAK-5448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15826049#comment-15826049 ]

Chetan Mehrotra commented on OAK-5448:
--------------------------------------

[~mreutegg] reported an issue after this change:

{noformat}
java.lang.IllegalArgumentException: DocValuesField ":dvjcr:content/cq:lastModified" appears more than once in this document (only one value is allowed per field)
    at org.apache.lucene.index.NumericDocValuesWriter.addValue(NumericDocValuesWriter.java:54)
    at org.apache.lucene.index.DocValuesProcessor.addNumericField(DocValuesProcessor.java:153)
    at org.apache.lucene.index.DocValuesProcessor.addField(DocValuesProcessor.java:66)
    at org.apache.lucene.index.TwoStoredFieldsConsumers.addField(TwoStoredFieldsConsumers.java:36)
    at org.apache.lucene.index.DocFieldProcessor.processDocument(DocFieldProcessor.java:236)
    at org.apache.lucene.index.DocumentsWriterPerThread.updateDocument(DocumentsWriterPerThread.java:253)
{noformat}

This occurred because, if more than one relative property has the same parent path, say 'jcr:content/lastModified' and 'jcr:content/status', then "jcr:content" was aggregated twice.

Fixed this with 1779190

> Aggregate logic should optimize for case where patterns do not include wildcard
> -------------------------------------------------------------------------------
>
>                 Key: OAK-5448
>                 URL: https://issues.apache.org/jira/browse/OAK-5448
>             Project: Jackrabbit Oak
>          Issue Type: Improvement
>          Components: lucene
>            Reporter: Chetan Mehrotra
>            Assignee: Chetan Mehrotra
>             Fix For: 1.5.18, 1.6
>
>         Attachments: OAK-5448.patch
>
>
> Aggregate logic in oak-lucene currently tries to apply the matcher on each
> child node of a modified parent node. This is required for those cases where
> the pattern involves a wildcard, like aggregating the '*/*/*' pattern.
> However, this performs poorly if the aggregate does not involve a pattern. For
> example, if we have defined a property definition for 'jcr:content/@status' for
> nt:base
> {noformat}
> + indexRules
>   + nt:base
>     + properties
>       + status
>         - name = "jcr:content/status"
>         - propertyIndex = true
> {noformat}
> For the above definition, the current logic would try to apply the matcher for
> 'jcr:content' on each of the child nodes. So if a folder has 1000
> entries, it would read that many child nodes.
> As a fix, we should check whether the aggregate path has a wildcard or not. If it is
> specific, then the aggregate logic should directly look up the child with the given name.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
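The two ideas in the message above (direct lookup for a non-wildcard path element, and aggregating a shared parent path only once) can be illustrated with a minimal sketch. This is not the code from OAK-5448.patch or from the later fix: the `Node` type, the method names, and the assumption that a path element is a pattern only when it is exactly `*` are all hypothetical and chosen for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Illustrative sketch only; none of these types are Oak classes. */
final class AggregateLookupSketch {

    /** Hypothetical stand-in for a repository node: a name-to-child map. */
    static final class Node {
        private final Map<String, Node> children = new LinkedHashMap<>();
        /** Number of child entries read through this node so far. */
        int reads;

        Node addChild(String name) {
            Node child = new Node();
            children.put(name, child);
            return child;
        }
    }

    /** Assumption: a path element is a pattern only when it is exactly "*". */
    static boolean isPattern(String element) {
        return "*".equals(element);
    }

    /** Returns the names of the children of parent that match one path element. */
    static List<String> matchChildren(Node parent, String element) {
        List<String> matched = new ArrayList<>();
        if (isPattern(element)) {
            // Wildcard: every child has to be visited.
            for (String name : parent.children.keySet()) {
                parent.reads++;
                matched.add(name);
            }
        } else {
            // Exact name: a single direct lookup, whatever the folder size.
            parent.reads++;
            if (parent.children.containsKey(element)) {
                matched.add(element);
            }
        }
        return matched;
    }

    /**
     * Collects the distinct parent paths of relative property names, so that
     * a parent shared by several properties is aggregated only once.
     */
    static Set<String> parentPaths(List<String> relativeProperties) {
        Set<String> parents = new LinkedHashSet<>();
        for (String property : relativeProperties) {
            int idx = property.lastIndexOf('/');
            if (idx > 0) {
                parents.add(property.substring(0, idx));
            }
        }
        return parents;
    }
}
```

With a folder of 1000 entries plus a 'jcr:content' child, `matchChildren(folder, "jcr:content")` in this sketch touches one child entry, while `matchChildren(folder, "*")` touches all of them. Using a set in `parentPaths` is one possible way to avoid visiting "jcr:content" twice for 'jcr:content/lastModified' and 'jcr:content/status'; the actual change in 1779190 may differ.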
[jira] [Commented] (OAK-5448) Aggregate logic should optimize for case where patterns do not include wildcard
[ https://issues.apache.org/jira/browse/OAK-5448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15822770#comment-15822770 ]

Vikas Saurabh commented on OAK-5448:
------------------------------------

Oh, no, I meant that maybe it could be renamed. Nonetheless, it was a very minor point and can be ignored :).
[jira] [Commented] (OAK-5448) Aggregate logic should optimize for case where patterns do not include wildcard
[ https://issues.apache.org/jira/browse/OAK-5448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15822749#comment-15822749 ]

Chetan Mehrotra commented on OAK-5448:
--------------------------------------

bq. Just a minor point though - getElementNameIfNotAPattern feels like a getSomethingOrNull but it throws instead.

That's intentional: the logic should fail fast if the caller does not differentiate between a pattern and an exact matcher.

Fixed with 1778731
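The fail-fast choice discussed in the comment above can be sketched as follows. Only the method name `getElementNameIfNotAPattern` comes from the thread; the signature, the `isPattern` helper, the `*` wildcard convention, and the exception type are assumptions made for this illustration and are not taken from the Oak source.

```java
/** Illustrative sketch only; the signature and body are assumed, not Oak's. */
final class ElementNameSketch {

    /** Assumption: a path element is a pattern only when it is exactly "*". */
    static boolean isPattern(String[] elements, int depth) {
        return "*".equals(elements[depth]);
    }

    /**
     * Returns the exact element name at the given depth.
     * Throws instead of returning null, so a caller that forgot to check
     * isPattern() first fails immediately rather than carrying a null around.
     */
    static String getElementNameIfNotAPattern(String[] elements, int depth) {
        if (isPattern(elements, depth)) {
            throw new IllegalArgumentException(
                    "Element at depth " + depth + " is a pattern: "
                            + String.join("/", elements));
        }
        return elements[depth];
    }
}
```

A caller would be expected to branch on `isPattern` first and call `getElementNameIfNotAPattern` only on the exact-name branch; returning null instead would let a missed check surface later as a less obvious failure.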
[jira] [Commented] (OAK-5448) Aggregate logic should optimize for case where patterns do not include wildcard
[ https://issues.apache.org/jira/browse/OAK-5448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15822588#comment-15822588 ]

Vikas Saurabh commented on OAK-5448:
------------------------------------

[~chetanm], patch lgtm. Just a minor point though - {{getElementNameIfNotAPattern}} feels like a {{getSomethingOrNull}} but it throws instead.
[jira] [Commented] (OAK-5448) Aggregate logic should optimize for case where patterns do not include wildcard
[ https://issues.apache.org/jira/browse/OAK-5448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15821521#comment-15821521 ]

Chetan Mehrotra commented on OAK-5448:
--------------------------------------

Added ignored test in 1778520