Chetan Mehrotra created OAK-6333:
------------------------------------

             Summary: IndexPlanner should use actual entryCount instead of 
limiting it to 1000
                 Key: OAK-6333
                 URL: https://issues.apache.org/jira/browse/OAK-6333
             Project: Jackrabbit Oak
          Issue Type: Improvement
          Components: lucene
            Reporter: Chetan Mehrotra
            Assignee: Chetan Mehrotra
             Fix For: 1.8


Currently IndexPlanner uses the following logic to estimate the entryCount:

# If the index has fulltext indexing enabled then
## If the {{entryCount}} value is defined then use min(entryCount, numDocs)
## If not then use {{numDocs}}, i.e. the actual entry count
# If the index is a pure property index, i.e. none of the property definitions
have {{analyzed}} set to true
## Take min(1000, numDocs)
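
The rules above can be sketched roughly as follows. This is an illustrative
sketch only, not the actual IndexPlanner code: the class name, the method names
and the convention that a negative value means "entryCount not defined" are all
assumptions made for the example.

```java
// Hypothetical sketch of the estimation rules described above.
// None of these names come from the Oak code base.
final class EntryCountSketch {
    // Cap applied to pure property indexes today.
    static final long PROPERTY_INDEX_CAP = 1000;

    // Current behaviour. A negative definedEntryCount means "not defined"
    // (an assumption made for this sketch).
    static long currentEstimate(boolean fullText, long definedEntryCount, long numDocs) {
        if (fullText) {
            return definedEntryCount >= 0
                    ? Math.min(definedEntryCount, numDocs)
                    : numDocs;
        }
        // Pure property index: capped at 1000.
        return Math.min(PROPERTY_INDEX_CAP, numDocs);
    }

    // Proposed behaviour: pure property indexes report numDocs uncapped.
    static long proposedEstimate(boolean fullText, long definedEntryCount, long numDocs) {
        if (fullText) {
            return definedEntryCount >= 0
                    ? Math.min(definedEntryCount, numDocs)
                    : numDocs;
        }
        return numDocs;
    }

    public static void main(String[] args) {
        System.out.println(currentEstimate(false, -1, 50_000));  // 1000
        System.out.println(proposedEstimate(false, -1, 50_000)); // 50000
        System.out.println(currentEstimate(true, 500, 2_000));   // 500
        System.out.println(currentEstimate(true, -1, 2_000));    // 2000
    }
}
```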

Revisiting the logic for #2, it appears that in the 1.0.x days (OAK-2200) we
capped it at 1000 because cost estimation for property indexes was inaccurate
(they used to report low values, causing the lucene index to lose).

With support for Counters, the cost estimation for property indexes has
improved, so we should now remove this cap and let it use numDocs.

One area where the cap causes problems is when we have two indexes where one
is a superset of the other, e.g. /oak:index/asset and
/content/en/oak:index/asset, where both have some matching properties. Once
each index holds more than 1000 documents, both report the same capped
estimate. Logically, if the query can be handled by the sub-index then that
index should get picked, but currently either of them can be picked, making
the query plan non-deterministic.
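
To illustrate the tie, here is a small sketch comparing the capped and uncapped
estimates for two such indexes. The document counts are made-up numbers, and
the real planner cost involves more than the entry count, so treat this as a
simplification rather than the actual cost calculation.

```java
// Hypothetical illustration of why the cap hides the size difference
// between a root index and a smaller sub-index.
final class CappedTieSketch {
    // Estimate used today for pure property indexes.
    static long capped(long numDocs) {
        return Math.min(1000, numDocs);
    }

    // Estimate proposed in this issue.
    static long uncapped(long numDocs) {
        return numDocs;
    }

    public static void main(String[] args) {
        long rootIndexDocs = 200_000; // assumed size of /oak:index/asset
        long subIndexDocs = 15_000;   // assumed size of the sub-index

        // With the cap both indexes look identical to the planner.
        System.out.println(capped(rootIndexDocs) == capped(subIndexDocs));    // true

        // Without the cap the smaller sub-index reports fewer entries.
        System.out.println(uncapped(subIndexDocs) < uncapped(rootIndexDocs)); // true
    }
}
```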



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)