[jira] [Commented] (OAK-5448) Aggregate logic should optimize for case where patterns do not include wildcard

2017-01-17 Thread Chetan Mehrotra (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-5448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15826049#comment-15826049
 ] 

Chetan Mehrotra commented on OAK-5448:
--

[~mreutegg] reported an issue after this change:

{noformat}
java.lang.IllegalArgumentException: DocValuesField ":dvjcr:content/cq:lastModified" appears more than once in this document (only one value is allowed per field)
    at org.apache.lucene.index.NumericDocValuesWriter.addValue(NumericDocValuesWriter.java:54)
    at org.apache.lucene.index.DocValuesProcessor.addNumericField(DocValuesProcessor.java:153)
    at org.apache.lucene.index.DocValuesProcessor.addField(DocValuesProcessor.java:66)
    at org.apache.lucene.index.TwoStoredFieldsConsumers.addField(TwoStoredFieldsConsumers.java:36)
    at org.apache.lucene.index.DocFieldProcessor.processDocument(DocFieldProcessor.java:236)
    at org.apache.lucene.index.DocumentsWriterPerThread.updateDocument(DocumentsWriterPerThread.java:253)
{noformat}

This occurred because when more than one relative property shares the same 
parent path, say 'jcr:content/lastModified' and 'jcr:content/status', 
"jcr:content" was aggregated twice. Fixed this with 1779190.
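
To illustrate the kind of duplication described above, here is a minimal sketch that collects each parent path only once. It is not necessarily what 1779190 does; the class and method names ({{RelativeParentPaths}}, {{distinctParents}}) are hypothetical and exist only for this example.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of oak-lucene.
final class RelativeParentPaths {
    private RelativeParentPaths() {}

    /** Returns the distinct parent paths of the relative property names, in first-seen order. */
    static Set<String> distinctParents(List<String> relativePropertyNames) {
        Set<String> parents = new LinkedHashSet<>();
        for (String name : relativePropertyNames) {
            int idx = name.lastIndexOf('/');
            if (idx > 0) {
                // A set ensures 'jcr:content' is recorded once even if
                // several properties live under it.
                parents.add(name.substring(0, idx));
            }
        }
        return parents;
    }

    public static void main(String[] args) {
        // prints [jcr:content]
        System.out.println(distinctParents(Arrays.asList(
                "jcr:content/lastModified", "jcr:content/status", "title")));
    }
}
```

With a set, the parent "jcr:content" would be visited once, so a field derived from it would be added to the document only once.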

> Aggregate logic should optimize for case where patterns do not include 
> wildcard
> ---
>
> Key: OAK-5448
> URL: https://issues.apache.org/jira/browse/OAK-5448
> Project: Jackrabbit Oak
> Issue Type: Improvement
> Components: lucene
> Reporter: Chetan Mehrotra
> Assignee: Chetan Mehrotra
> Fix For: 1.5.18, 1.6
>
> Attachments: OAK-5448.patch
>
>
> Aggregate logic in oak-lucene currently tries to apply the matcher to each 
> child node of a modified parent node. This is required for cases where the 
> pattern involves a wildcard, such as aggregating the '*/*/*' pattern.
> However, this performs poorly if the aggregate does not involve a pattern. 
> For example, suppose we have defined a property definition for 
> 'jcr:content/@status' for nt:base:
> {noformat}
>   + indexRules
>     + nt:base
>       + properties
>         + status
>           - name = "jcr:content/status"
>           - propertyIndex = true
> {noformat}
> For the above definition, the current logic would try to apply the matcher 
> for 'jcr:content' to each of the child nodes. So if a folder has 1000 
> entries, it would read that many child nodes. 
> As a fix, we should check whether the aggregate path has a wildcard. If it 
> is specific, the aggregate logic should directly look up the child with the 
> given name.
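
The proposed fix can be sketched roughly as follows. This is only an illustration under stated assumptions: {{AggregateSegmentSketch}} and its nested {{Node}} are hypothetical stand-ins (a plain map of children, not the Oak node API), and the only wildcard handled is a bare '*' path element. The {{childReads}} counter exists purely to show how many children each strategy examines.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch, not the oak-lucene implementation.
final class AggregateSegmentSketch {
    private AggregateSegmentSketch() {}

    /** Stand-in for a repository node: a name-to-child map plus a read counter. */
    static final class Node {
        private final Map<String, Node> children = new LinkedHashMap<>();
        int childReads; // number of child nodes examined so far

        Node addChild(String name) {
            Node child = new Node();
            children.put(name, child);
            return child;
        }
    }

    /** Assumption: an element is a pattern only if it is the bare '*' wildcard. */
    static boolean isPattern(String element) {
        return "*".equals(element);
    }

    /** Returns the children of parent matching a single path element. */
    static List<Node> matchingChildren(Node parent, String element) {
        List<Node> result = new ArrayList<>();
        if (!isPattern(element)) {
            // Specific name: look the child up directly, one read.
            parent.childReads++;
            Node child = parent.children.get(element);
            if (child != null) {
                result.add(child);
            }
            return result;
        }
        // Wildcard: every child has to be examined.
        for (Node child : parent.children.values()) {
            parent.childReads++;
            result.add(child);
        }
        return result;
    }
}
```

In this sketch, a folder with 1000 entries plus a 'jcr:content' child costs one read for the element 'jcr:content' and 1001 reads for '*'.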



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (OAK-5448) Aggregate logic should optimize for case where patterns do not include wildcard

2017-01-14 Thread Vikas Saurabh (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-5448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15822770#comment-15822770
 ] 

Vikas Saurabh commented on OAK-5448:


Oh, no, I meant that maybe it could be renamed. Nonetheless, it was a very 
minor point and can be ignored :).



[jira] [Commented] (OAK-5448) Aggregate logic should optimize for case where patterns do not include wildcard

2017-01-14 Thread Chetan Mehrotra (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-5448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15822749#comment-15822749
 ] 

Chetan Mehrotra commented on OAK-5448:
--

bq. Just a minor point though - getElementNameIfNotAPattern feels like a 
getSomethingOrNull but it throws instead.

That's intentional: it ensures that the logic fails fast if it does not 
differentiate between a pattern and an exact matcher.

Fixed with 1778731
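
For readers following the naming discussion, here is a minimal sketch of the fail-fast behaviour. Only the method name {{getElementNameIfNotAPattern}} comes from this thread; the surrounding class {{PathElementSketch}}, the {{isPattern}} check, and the choice of {{IllegalStateException}} are assumptions made for illustration.

```java
import java.util.Objects;

// Hypothetical sketch; the real class in the patch may look different.
final class PathElementSketch {
    private final String element;

    PathElementSketch(String element) {
        this.element = Objects.requireNonNull(element, "element");
    }

    /** Assumption: an element containing '*' is treated as a pattern. */
    boolean isPattern() {
        return element.indexOf('*') >= 0;
    }

    /**
     * Returns the exact element name. Throws rather than returning null,
     * so a caller that forgot to check isPattern() fails immediately.
     */
    String getElementNameIfNotAPattern() {
        if (isPattern()) {
            throw new IllegalStateException("Element is a pattern: " + element);
        }
        return element;
    }
}
```

A caller would be expected to check {{isPattern()}} first and only then ask for the element name; skipping that check surfaces as an exception instead of a silent null.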



[jira] [Commented] (OAK-5448) Aggregate logic should optimize for case where patterns do not include wildcard

2017-01-13 Thread Vikas Saurabh (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-5448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15822588#comment-15822588
 ] 

Vikas Saurabh commented on OAK-5448:


[~chetanm], patch lgtm. Just a minor point though - 
{{getElementNameIfNotAPattern}} feels like a {{getSomethingOrNull}} but it 
throws instead.



[jira] [Commented] (OAK-5448) Aggregate logic should optimize for case where patterns do not include wildcard

2017-01-13 Thread Chetan Mehrotra (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-5448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15821521#comment-15821521
 ] 

Chetan Mehrotra commented on OAK-5448:
--

Added an ignored test in 1778520.
