David Smiley created LUCENE-4418:
------------------------------------

             Summary: Improve RecursivePrefixTreeFilter's performance heuristic tunables
                 Key: LUCENE-4418
                 URL: https://issues.apache.org/jira/browse/LUCENE-4418
             Project: Lucene - Core
          Issue Type: Improvement
          Components: modules/spatial
            Reporter: David Smiley
            Assignee: David Smiley
            Priority: Minor

RecursivePrefixTreeFilter recursively decomposes grid cells until it reaches a threshold grid level (e.g. 4 levels short of max levels), at which point it switches to a brute-force scan of the terms, because scanning is faster once the number of terms is small. So if max levels is 10 and the threshold is 4, it switches to scanning at level 6.

Ideally, the filter would know exactly how many terms there are in that grid cell, i.e. given a hi & lo term, determine how many indexed terms lie in between without actually iterating to find out. Instead, it could use the number of docs that a grid cell has as a heuristic. I think that's much better than a fixed level because it's dynamic, based on the density of the actual indexed data. It's not perfect, though: many documents could refer to the same indexed point, or a few documents with multi-valued data could refer to many indexed points.

Before I do this, I need to re-invigorate my testing efforts so I can come up with a default threshold. The right value also depends on things like query shape complexity.
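
To make the two rules in this issue concrete, here is a rough Java sketch of the existing fixed-level switch and of the proposed doc-count heuristic. It is only an illustration: the class, method, and parameter names are hypothetical and are not Lucene's API, and the doc-count threshold of 100 is a made-up placeholder, since choosing a default is exactly what still needs testing.

```java
// Hypothetical sketch: none of these names come from Lucene itself.
final class ScanThresholdSketch {

    /** Grid level at which the fixed rule switches from recursion to scanning. */
    public static int scanLevel(int maxLevels, int levelsFromMax) {
        // Clamp at 0 so a threshold larger than maxLevels means "always scan".
        return Math.max(0, maxLevels - levelsFromMax);
    }

    /** Fixed rule: scan once the current cell is at or below the scan level. */
    public static boolean shouldScanFixed(int cellLevel, int maxLevels, int levelsFromMax) {
        return cellLevel >= scanLevel(maxLevels, levelsFromMax);
    }

    /**
     * Proposed rule: use the number of docs in the cell as a proxy for the
     * number of indexed terms beneath it. This is an imperfect proxy: many
     * docs can share one point, and a few multi-valued docs can cover many.
     */
    public static boolean shouldScanByDocCount(int cellDocCount, int docCountThreshold) {
        return cellDocCount <= docCountThreshold;
    }

    public static void main(String[] args) {
        // The example from the issue: max levels 10, threshold 4.
        System.out.println(scanLevel(10, 4));             // 6
        System.out.println(shouldScanFixed(5, 10, 4));    // false: keep recursing
        System.out.println(shouldScanFixed(6, 10, 4));    // true: switch to scanning

        // Placeholder threshold of 100 docs (an assumption, not a tested default).
        System.out.println(shouldScanByDocCount(40, 100));   // true: sparse cell
        System.out.println(shouldScanByDocCount(5000, 100)); // false: dense cell
    }
}
```

The point of the second rule is that a sparse cell can switch to scanning at a shallow level while a dense cell keeps recursing, whereas the fixed rule treats both the same.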