David Smiley created LUCENE-4418:
------------------------------------

             Summary: Improve RecursivePrefixTreeFilter's performance heuristic tunables
                 Key: LUCENE-4418
                 URL: https://issues.apache.org/jira/browse/LUCENE-4418
             Project: Lucene - Core
          Issue Type: Improvement
          Components: modules/spatial
            Reporter: David Smiley
            Assignee: David Smiley
            Priority: Minor


RecursivePrefixTreeFilter recursively decomposes grid cells until it reaches a 
threshold grid level (e.g. 4 levels short of max levels), at which point it 
switches to a brute-force scan of the terms, because scanning is faster once 
the number of terms is small.  For example, with max levels of 10 and a 
threshold of 4, it switches to scanning at level 6.  Ideally, the filter would 
know exactly how many terms there are in a given grid cell; that is, given a 
hi and lo term, it would determine how many indexed terms lie in between 
without actually iterating to find out.  
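
To make the arithmetic above concrete, here is a minimal sketch of the 
level-based switch.  The class and method names are hypothetical and chosen 
only for illustration; this is not the filter's actual code.

```java
// Hypothetical sketch of the level-based heuristic; not Lucene's API.
final class ScanLevelSketch {

    /** Grid level at which recursion stops and brute-force scanning begins. */
    static int scanLevel(int maxLevels, int threshold) {
        if (threshold < 0 || threshold > maxLevels) {
            throw new IllegalArgumentException("threshold must be in [0, maxLevels]");
        }
        return maxLevels - threshold;
    }

    /** True when a cell at cellLevel should be scanned rather than decomposed. */
    static boolean shouldScan(int cellLevel, int maxLevels, int threshold) {
        return cellLevel >= scanLevel(maxLevels, threshold);
    }

    public static void main(String[] args) {
        // Max levels of 10 and a threshold of 4 give a scan level of 6.
        System.out.println(scanLevel(10, 4));     // prints 6
        System.out.println(shouldScan(5, 10, 4)); // prints false (keep recursing)
        System.out.println(shouldScan(6, 10, 4)); // prints true (switch to scan)
    }
}
```

Note that the decision depends only on the cell's level, not on how much data 
is actually indexed under that cell.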

Instead, it could use the number of docs that a grid cell has as a heuristic.  
It's not perfect, but I think it's much better because it adapts to the density 
of the actual indexed data.  It's imperfect because many documents could refer 
to the same indexed point, or a few documents with multi-valued data could 
refer to many indexed points.
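
A rough sketch of the doc-count idea, including the two cases where it 
misjudges the real term count.  Everything here is an assumption for 
illustration: the class, the fields, and the threshold of 100 are made up, 
since no default threshold has been chosen yet.

```java
// Hypothetical sketch of a doc-count heuristic; not Lucene's API.
final class DocCountHeuristicSketch {

    /** Per-cell statistics; termCount is what we would ideally know. */
    static final class CellStats {
        final int docCount;  // documents with at least one term in the cell
        final int termCount; // distinct indexed terms under the cell
        CellStats(int docCount, int termCount) {
            this.docCount = docCount;
            this.termCount = termCount;
        }
    }

    /** Scan when the cell looks sparse, using doc count as a proxy for terms. */
    static boolean shouldScan(CellStats cell, int docCountThreshold) {
        return cell.docCount <= docCountThreshold;
    }

    public static void main(String[] args) {
        int threshold = 100; // arbitrary value, for illustration only

        // Many documents share one indexed point: scanning would be cheap,
        // but the doc count suggests a dense cell.
        CellStats samePoint = new CellStats(1000, 1);

        // Two multi-valued documents cover many points: scanning is costly,
        // but the doc count suggests a sparse cell.
        CellStats multiValued = new CellStats(2, 500);

        System.out.println(shouldScan(samePoint, threshold));   // prints false
        System.out.println(shouldScan(multiValued, threshold)); // prints true
    }
}
```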

Before I do this, I need to re-invigorate my testing efforts so I can come up 
with a default threshold.  The best threshold also depends on things like 
query shape complexity. 

