[ 
https://issues.apache.org/jira/browse/OAK-6333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chetan Mehrotra updated OAK-6333:
---------------------------------
           Labels:   (was: candidate_oak_1_4 candidate_oak_1_6)
    Fix Version/s: 1.4.18
                   1.6.4

Merged to 
* 1.6 - 1803762
* 1.4 - 1803776

For older branches this needs to be explicitly enabled by setting the system 
property {{oak.lucene.useActualEntryCount}} to {{true}}.
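
As a rough illustration only: a JVM system property like this is usually passed on the command line as -Doak.lucene.useActualEntryCount=true and read as a boolean flag. The class and method names below are hypothetical and are not taken from the Oak code base; only the property name comes from this comment.

```java
class EntryCountFlag {
    // Property name as given in the comment above.
    static final String PROP = "oak.lucene.useActualEntryCount";

    // Boolean.getBoolean returns true only when the system property is set
    // and its value equals "true" (ignoring case); otherwise it returns false.
    static boolean useActualEntryCount() {
        return Boolean.getBoolean(PROP);
    }

    public static void main(String[] args) {
        System.out.println(PROP + " = " + useActualEntryCount());
    }
}
```

With this kind of check the flag stays off unless it is set explicitly, which matches the opt-in behaviour described for the older branches.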

> IndexPlanner should use actual entryCount instead of limiting it to 1000
> ------------------------------------------------------------------------
>
>                 Key: OAK-6333
>                 URL: https://issues.apache.org/jira/browse/OAK-6333
>             Project: Jackrabbit Oak
>          Issue Type: Improvement
>          Components: lucene
>            Reporter: Chetan Mehrotra
>            Assignee: Chetan Mehrotra
>             Fix For: 1.8, 1.7.4, 1.6.4, 1.4.18
>
>
> Currently IndexPlanner uses the following logic for estimating the entryCount:
> # If the index has fulltext indexing enabled and the query has a fulltext 
> constraint clause specified
> ## If {{entryCount}} value is defined then min(entryCount, numDocs)
> ## If not then use {{numDocs}}, i.e. the actual entry count
> # If the index is a pure property index, i.e. none of the property definitions 
> have {{analyzed}} set to true
> ## If {{entryCount}} value is defined then min(entryCount, numDocs)
> ## Else take min(1000, numDocs)
> Revisiting the logic for #2, it appears that in the 1.0.x days (OAK-2200) we capped it 
> to 1000 because cost estimation for property indexes was inaccurate (they 
> used to report low values, causing the lucene index to lose). 
> With support for Counters the cost estimation for property indexes has improved, 
> so we should now remove this cap and let it use numDocs.
> One area where the cap causes issues is when we have two indexes where one is 
> a superset of the other, e.g. /oak:index/asset and /content/en/ 
> /oak:index/asset, where both have some matching properties. Logically, if the query 
> can be handled by the sub index then it should get picked, but currently either of 
> them can be picked, making the query plan non-deterministic.
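
The estimation rules quoted above can be sketched roughly as follows. This is a simplified illustration, not the actual IndexPlanner code: the class, method, and parameter names are hypothetical, and every query that is not a fulltext query on a fulltext index is treated as the "pure property index" case.

```java
class EntryCountEstimate {
    // Cap applied to pure property indexes before this change.
    static final long LEGACY_CAP = 1000;

    /**
     * @param fulltextQuery        index has fulltext enabled and the query has a fulltext constraint
     * @param configuredEntryCount entryCount from the index definition, or null if not defined
     * @param numDocs              actual number of documents in the index
     * @param useActualEntryCount  when true, the 1000 cap is not applied
     */
    static long estimate(boolean fulltextQuery, Long configuredEntryCount,
                         long numDocs, boolean useActualEntryCount) {
        if (configuredEntryCount != null) {
            // An explicit entryCount is honoured in both cases, bounded by numDocs.
            return Math.min(configuredEntryCount, numDocs);
        }
        if (fulltextQuery || useActualEntryCount) {
            return numDocs;
        }
        return Math.min(LEGACY_CAP, numDocs);
    }

    public static void main(String[] args) {
        // Hypothetical sizes: a parent index with 50000 docs, a sub index with 5000.
        System.out.println(estimate(false, null, 50000, false)); // capped
        System.out.println(estimate(false, null, 5000, false));  // capped, same value
        System.out.println(estimate(false, null, 50000, true));  // actual count
        System.out.println(estimate(false, null, 5000, true));   // actual count
    }
}
```

With the cap in place both hypothetical indexes report the same estimate, so the planner has no basis for preferring one over the other; with the actual counts the smaller sub index reports the lower value.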



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
