[
https://issues.apache.org/jira/browse/OAK-3219?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Amit Jain updated OAK-3219:
---------------------------
Fix Version/s: (was: 1.3.7)
1.3.8
> Lucene IndexPlanner should also account for number of property constraints
> evaluated while giving cost estimation
> -----------------------------------------------------------------------------------------------------------------
>
> Key: OAK-3219
> URL: https://issues.apache.org/jira/browse/OAK-3219
> Project: Jackrabbit Oak
> Issue Type: Improvement
> Components: lucene
> Reporter: Chetan Mehrotra
> Assignee: Chetan Mehrotra
> Priority: Minor
> Labels: performance
> Fix For: 1.3.8
>
>
> Currently the cost returned by the Lucene index is a function of the number of
> indexed documents present in the index. If the number of indexed entries is
> high, the index is less likely to be selected when some property index also
> supports one of the property constraints.
> {noformat}
> /jcr:root/content/freestyle-cms/customers//element(*,
> cq:Page)[(jcr:content/@title = 'm' or jcr:like(jcr:content/@title, 'm%')) and
> jcr:content/@sling:resourceType = '/components/page/customer']
> {noformat}
> Consider the above query with the following index definitions:
> * A property index on resourceType
> * A Lucene index for cq:Page with properties {{jcr:content/title}},
> {{jcr:content/sling:resourceType}} indexed and also path restriction
> evaluation enabled
> The restrictions each of the two indexes can help evaluate are:
> # Property index
> ## Path restriction
> ## Property restriction on {{sling:resourceType}}
> # Lucene index
> ## NodeType restriction
> ## Property restriction on {{sling:resourceType}}
> ## Property restriction on {{title}}
> ## Path restriction
> The cost estimate currently works like this:
> * Property index - {{f(indexedValueEstimate, estimateOfNodesUnderGivenPath)}}
> ** indexedValueEstimate - For 'sling:resourceType=foo' it is the approximate
> count of nodes having that property set to 'foo'
> ** estimateOfNodesUnderGivenPath - It is derived from an approximate estimate
> of the nodes present under the given path
> * Lucene Index - {{f(totalIndexedEntries)}}
> As the Lucene cost is too simple, it does not reflect reality. The following 2
> changes can be made to improve it:
> * Given that the Lucene index can handle more constraints (4) than the
> property index (2), the cost estimate returned by it should also reflect this
> state. This can be done by setting costPerEntry to 1/(number of property
> restrictions evaluated)
> * Get the count for the queried property value - This is similar to what
> PropertyIndex does and assumes that Lucene can provide that information at
> O(1) cost. In case of multiple supported property restrictions this can be
> the minimum of all the counts
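
The two proposed changes can be sketched roughly as follows. This is only an illustration of the arithmetic described above, not Oak's actual IndexPlanner code: the class name, method names, and all numbers are hypothetical. Capping the estimate at the total number of indexed entries and treating zero evaluated restrictions as one are assumptions of this sketch, not something the issue specifies.

```java
import java.util.List;

/**
 * Hypothetical sketch of the proposed Lucene cost estimate.
 * None of these names come from Oak; they exist only for illustration.
 */
public final class LuceneCostSketch {

    private LuceneCostSketch() {
    }

    /** Change 1: costPerEntry = 1 / (number of property restrictions evaluated). */
    public static double costPerEntry(int evaluatedPropertyRestrictions) {
        // Assumption: with no property restriction evaluated, fall back to 1.0
        // instead of dividing by zero.
        return 1.0 / Math.max(1, evaluatedPropertyRestrictions);
    }

    /**
     * Change 2: use the per-property value counts and take their minimum.
     * Assumption: the result is capped at the total number of indexed entries,
     * which is also the fallback when no per-property count is available.
     */
    public static long estimatedEntryCount(long totalIndexedEntries,
                                           List<Long> perPropertyValueCounts) {
        long estimate = totalIndexedEntries;
        for (long count : perPropertyValueCounts) {
            estimate = Math.min(estimate, count);
        }
        return estimate;
    }

    /** Combined estimate: costPerEntry multiplied by the estimated entry count. */
    public static double estimateCost(long totalIndexedEntries,
                                      List<Long> perPropertyValueCounts) {
        return costPerEntry(perPropertyValueCounts.size())
                * estimatedEntryCount(totalIndexedEntries, perPropertyValueCounts);
    }
}
```

With made-up figures of 100,000 indexed entries, 5,000 entries matching the title restriction, and 400 matching the sling:resourceType restriction, this sketch yields 1/2 * min(5000, 400) = 200, compared with a cost proportional to 100,000 under the current f(totalIndexedEntries) approach.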
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)