[ https://issues.apache.org/jira/browse/OAK-4816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15548822#comment-15548822 ]
Thomas Mueller commented on OAK-4816:
-------------------------------------
http://svn.apache.org/r1763455 (trunk)
> Property index: cost estimate with path restriction is too optimistic
> ---------------------------------------------------------------------
>
> Key: OAK-4816
> URL: https://issues.apache.org/jira/browse/OAK-4816
> Project: Jackrabbit Oak
> Issue Type: Improvement
> Components: query
> Reporter: Thomas Mueller
> Assignee: Thomas Mueller
> Fix For: 1.5.12
>
>
> The property index cost estimation is too optimistic when a query combines a
> property restriction with a path restriction. The current algorithm, as
> documented in
> http://jackrabbit.apache.org/oak/docs/query/property-index.html#Cost_Estimation,
> assumes that matching entries are evenly distributed over the whole
> repository. This assumption often does not hold. In extreme cases, _all_
> entries that match the property restriction are in the subtree that matches
> the path restriction. Example:
> * 10'000 nodes with property color "red"
> * 1 million nodes in the repository
> * 10'000 nodes in the subtree /content
> * query {{/jcr:root/content//\*[@color = 'red']}}
> Currently, the cost estimate is about 100, because there are about 10'000
> entries for "red" and "/content" contains 1% of all nodes. But in reality,
> all 10'000 entries with color "red" might be in that subtree.
> The cost estimation should take that into account and assume that at least
> 80% of the matching nodes are in that subtree (if the subtree contains that
> many nodes).
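
The arithmetic in the description can be sketched as follows. This is only one
reading of the proposal, not the code committed to Oak: the function names, the
parameter names, and the way the 80% floor is capped by the subtree size are
assumptions made for illustration.

```python
def uniform_estimate(matching_entries, subtree_nodes, total_nodes):
    """Estimate assuming matches are spread evenly over the whole repository."""
    return matching_entries * subtree_nodes / total_nodes


def pessimistic_estimate(matching_entries, subtree_nodes, total_nodes,
                         min_fraction=0.8):
    """Estimate assuming at least min_fraction of the matches are in the subtree.

    Hypothetical helper: the floor is capped by the subtree size, because a
    subtree cannot contain more matching nodes than it has nodes.
    """
    uniform = uniform_estimate(matching_entries, subtree_nodes, total_nodes)
    floor = min(min_fraction * matching_entries, subtree_nodes)
    return max(uniform, floor)


# Numbers from the example: 10'000 "red" nodes, 1 million nodes in total,
# 10'000 nodes below /content.
old_cost = uniform_estimate(10_000, 10_000, 1_000_000)
new_cost = pessimistic_estimate(10_000, 10_000, 1_000_000)
print(old_cost, new_cost)
```

With the numbers from the example, the even-distribution estimate is about 100,
while the floored estimate is about 8'000. For a subtree with only 500 nodes,
the floored estimate in this sketch is 500, which reflects the "if the subtree
contains that many nodes" condition.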
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)