Thomas Mueller created OAK-4816:
-----------------------------------

             Summary: Property index: cost estimate with path restriction is too optimistic
                 Key: OAK-4816
                 URL: https://issues.apache.org/jira/browse/OAK-4816
             Project: Jackrabbit Oak
          Issue Type: Improvement
            Reporter: Thomas Mueller
            Assignee: Thomas Mueller
             Fix For: 1.6


The property index cost estimate is too optimistic when a query combines a 
property restriction with a path restriction. The current algorithm, as 
documented at 
http://jackrabbit.apache.org/oak/docs/query/property-index.html#Cost_Estimation, 
assumes that matching entries are evenly distributed over the whole 
repository. This assumption often does not hold. In the extreme case, _all_ 
entries that match the property restriction are in the subtree that matches the 
path restriction. Example: 

* 10'000 nodes with property color "red".
* 1 million nodes in the repository
* 10'000 nodes in the subtree /content
* query {{/jcr:root/content//\*[@color = 'red']}}

Currently, the cost estimate is about 100: there are about 10'000 entries for 
"red", and "/content" contains 1% of all nodes (10'000 * 1% = 100). But in 
reality, all 10'000 entries with color "red" might be in that subtree.

The cost estimate should take that into account and assume that at least 80% 
of the matching nodes are in that subtree (if the subtree contains that many 
nodes).
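
The difference between the two estimates can be sketched as follows. This is 
only an illustration under stated assumptions, not the Oak implementation: the 
class and method names are hypothetical, the 80% value is the one proposed 
above, and capping the result at the subtree size is one possible reading of 
"if the subtree contains that many nodes".

```java
// Hypothetical sketch; none of these names are part of the Oak API.
final class PathRestrictionCostSketch {

    // Current behaviour as described in this issue: matching entries are
    // assumed to be evenly distributed over the whole repository.
    static double evenDistributionEstimate(long matchingEntries,
                                           long totalNodes,
                                           long subtreeNodes) {
        return (double) matchingEntries * subtreeNodes / totalNodes;
    }

    // Proposed behaviour (assumed interpretation): at least minFraction of
    // the matching entries are in the subtree, but never more than the
    // subtree contains. The even-distribution value is kept as a lower bound.
    static double pessimisticEstimate(long matchingEntries,
                                      long totalNodes,
                                      long subtreeNodes,
                                      double minFraction) {
        double even = evenDistributionEstimate(matchingEntries, totalNodes, subtreeNodes);
        double floor = Math.min(minFraction * matchingEntries, subtreeNodes);
        return Math.max(even, floor);
    }

    public static void main(String[] args) {
        long red = 10_000;        // entries with color "red"
        long total = 1_000_000;   // nodes in the repository
        long content = 10_000;    // nodes below /content

        // Roughly 100 with the even-distribution assumption.
        System.out.println(evenDistributionEstimate(red, total, content));
        // Roughly 8000 with the 80% assumption.
        System.out.println(pessimisticEstimate(red, total, content, 0.8));
    }
}
```

With the numbers from the example, the first estimate is about 100 and the 
second about 8'000, which is much closer to the worst case of 10'000.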



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
