[
https://issues.apache.org/jira/browse/LUCENE-4644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13539284#comment-13539284
]
David Smiley commented on LUCENE-4644:
--------------------------------------
There are a couple of possible WITHIN semantics, necessitating two
implementations:

*All WITHIN*: Every shape indexed for a document has to be within the query
shape. Consider a document that has a MULTIPOLYGON whose polygons are
disjoint. Or, even if you didn't use one of the WKT multi\* shapes, you might
have called createIndexableFields(shape) multiple times. I've got some
implementation code for this and I can already tell it's going to be pretty
slow, since it needs to visit all the cells that the query shape doesn't
touch in order to collect the docs that have something outside of it. So at
the first level of recursion, it needs to visit 31 of the 32 grid cells and
loop on docsEnum to find all docs with one of those terms. Since these are the
top-most grid cells, each of them will have a large number of matching docs
relative to the smaller ones. I think this implementation necessitates a
term -> docIds cache, used when the docFreq is sufficiently high to warrant
it. The docIds could/should be Bits, and it should only bother caching when
docFreq > 64 (a guess).
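
To make the caching idea concrete, here is a minimal sketch in plain Java. It
is not Lucene code: the class name TermDocIdCache, the use of java.util.BitSet
in place of Lucene's Bits, and the int[] supplier standing in for a loop over
docsEnum are all assumptions made for illustration. The 64 threshold is the
guess stated above.

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

/** Hypothetical term -> docIds cache; an illustration only, not a Lucene API. */
final class TermDocIdCache {
    /** Cache only when docFreq exceeds this value (the guess from the comment). */
    static final int MIN_DOC_FREQ_TO_CACHE = 64;

    private final Map<String, BitSet> cache = new HashMap<>();

    /**
     * Returns the doc ids for a term as a bit set.
     * The postings supplier is a stand-in for iterating a real docsEnum;
     * it is only invoked when the term is not already cached.
     */
    BitSet docIds(String term, int docFreq, Supplier<int[]> postings) {
        BitSet cached = cache.get(term);
        if (cached != null) {
            return cached; // callers must treat this as read-only
        }
        BitSet bits = new BitSet();
        for (int docId : postings.get()) {
            bits.set(docId);
        }
        // Low-frequency terms are cheap to re-read, so they are not cached.
        if (docFreq > MIN_DOC_FREQ_TO_CACHE) {
            cache.put(term, bits);
        }
        return bits;
    }

    boolean isCached(String term) {
        return cache.containsKey(term);
    }
}
```

With something like this, the large top-level cells would be read from the
postings once and then reused across queries, while small cells would skip
the cache entirely.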

*Some WITHIN*: I'm not sure how to name this, but it basically means that at
least one of the indexed shapes for a document is properly WITHIN the query
shape. If you know that each of your documents is indexed with a single shape,
then you'd always want to use this implementation. It should be reasonably
performant. It will need to buffer the query shape slightly, so that it looks
one grid cell beyond the original query shape to see which docs are barely
outside of it, and then use this knowledge to keep such docs out of the
results. There is no shape.buffer(), but that could be added to Spatial4j.
FWIW, JTS implements buffering, and it would be easy to add to some of the
basic shapes (e.g. point, circle, rectangle).
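
The buffering step can be sketched roughly as follows. This is a simplified
in-memory model rather than the prefix tree traversal: Rect, its buffer()
method and SomeWithinSketch are hypothetical names, each document is assumed
to have a single connected shape represented by the grid cells that cover it,
and the query shape is assumed to be a rectangle.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical axis-aligned rectangle with the buffer() that Spatial4j lacks. */
final class Rect {
    final double minX, minY, maxX, maxY;

    Rect(double minX, double minY, double maxX, double maxY) {
        this.minX = minX; this.minY = minY; this.maxX = maxX; this.maxY = maxY;
    }

    /** Grows the rectangle by the given distance on every side. */
    Rect buffer(double distance) {
        return new Rect(minX - distance, minY - distance, maxX + distance, maxY + distance);
    }

    boolean contains(Rect o) {
        return o.minX >= minX && o.maxX <= maxX && o.minY >= minY && o.maxY <= maxY;
    }

    boolean intersects(Rect o) {
        return o.minX <= maxX && o.maxX >= minX && o.minY <= maxY && o.maxY >= minY;
    }
}

final class SomeWithinSketch {
    /**
     * docCells maps a doc id to the grid cells covering its one indexed shape.
     * A doc matches if it has a cell inside the query and no cell that sticks
     * out of the query while still touching the one-cell buffer around it.
     */
    static Set<Integer> within(Rect query, double cellSize, Map<Integer, List<Rect>> docCells) {
        Rect buffered = query.buffer(cellSize); // look one grid cell further out
        Set<Integer> result = new TreeSet<>();
        for (Map.Entry<Integer, List<Rect>> e : docCells.entrySet()) {
            boolean hasInside = false;
            boolean hasBarelyOutside = false;
            for (Rect cell : e.getValue()) {
                if (query.contains(cell)) {
                    hasInside = true;
                } else if (buffered.intersects(cell)) {
                    hasBarelyOutside = true;
                }
            }
            if (hasInside && !hasBarelyOutside) {
                result.add(e.getKey());
            }
        }
        return result;
    }
}
```

Cells beyond the buffered rectangle are ignored here, which is only safe under
the single-shape assumption: a connected shape that leaves the query shape
must pass through the one-cell ring around it.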
> Implement spatial WITHIN query for RecursivePrefixTree
> ------------------------------------------------------
>
> Key: LUCENE-4644
> URL: https://issues.apache.org/jira/browse/LUCENE-4644
> Project: Lucene - Core
> Issue Type: New Feature
> Components: modules/spatial
> Reporter: David Smiley
> Assignee: David Smiley
>