[
https://issues.apache.org/jira/browse/OAK-4323?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Thomas Mueller reassigned OAK-4323:
-----------------------------------
Assignee: Thomas Mueller
> Query engine: index cost formula incorrect when using "limit"
> -------------------------------------------------------------
>
> Key: OAK-4323
> URL: https://issues.apache.org/jira/browse/OAK-4323
> Project: Jackrabbit Oak
> Issue Type: Improvement
> Components: query
> Reporter: Thomas Mueller
> Assignee: Thomas Mueller
> Fix For: 1.6
>
>
> As described in OAK-2081, the cost formula currently used in the query engine
> is not correct if "limit" is used, because it doesn't account for false
> positives.
> Example: Let's say there are two indexes:
> * color: 10000 nodes with color=red, but a bit slower (let's say a remote
> index), cost per entry is 1.5.
> * size: 20000 nodes with size=M, but a bit faster (let's say a local index),
> cost per entry is 1.
> Without limit, the index for "color" should be used, as 10000 * 1.5 = 15000
> is lower than 20000 * 1 = 20000.
> With limit=100, we could calculate as follows: there are at most 10000
> matching entries (according to index "color"), so the false positive rate of
> the "size" index is at least 50%. So the cost of "color" is 100 * 1.5 = 150.
> The cost of "size" is nominally 100 * 1 = 100, but with a false positive rate
> of 50%, about 200 entries must be read to get 100 results, so the cost is
> actually 200. Therefore, the index "color" should still be used.
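
The arithmetic in the example above can be sketched as follows. This is only an illustration of the reasoning, not the formula used in Oak: the names `IndexEstimate`, `cost_without_limit`, and `cost_with_limit` are hypothetical, and the sketch assumes the smallest entry count among the candidate indexes is an upper bound on the number of true matches.

```python
from dataclasses import dataclass


@dataclass
class IndexEstimate:
    """Hypothetical container for what an index reports to the planner."""
    name: str
    entries: int           # estimated number of entries the index returns
    cost_per_entry: float  # relative cost of reading one entry


def cost_without_limit(index):
    # Every entry the index returns has to be read.
    return index.entries * index.cost_per_entry


def cost_with_limit(index, limit, all_indexes):
    # Assumption: the smallest estimate bounds the number of true matches,
    # so at most smallest / index.entries of this index's entries are hits.
    smallest = min(i.entries for i in all_indexes)
    hit_rate = smallest / index.entries
    # To collect `limit` results, about limit / hit_rate entries are read,
    # but never more than the index contains.
    entries_to_read = min(index.entries, limit / hit_rate)
    return entries_to_read * index.cost_per_entry


color = IndexEstimate("color", entries=10000, cost_per_entry=1.5)
size = IndexEstimate("size", entries=20000, cost_per_entry=1.0)
indexes = [color, size]

for idx in indexes:
    print(idx.name, cost_without_limit(idx), cost_with_limit(idx, 100, indexes))
```

With the numbers from the description, "color" comes out cheaper both without a limit (15000 versus 20000) and with limit=100 (150 versus 200), whereas ignoring false positives would wrongly favor "size" (100 versus 150).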
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)