[ 
https://issues.apache.org/jira/browse/LUCENE-6477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14549504#comment-14549504
 ] 

Robert Muir commented on LUCENE-6477:
-------------------------------------

It looks like Mike's patch targets the sandbox entirely already... except for 
some BitDocIdSet changes that need a little explanation :)

Personally, I think this is a good approach when things aren't fully baked, 
especially in this case: the file formats are not fully baked either, and at 
the very least nobody wants to imply backwards compatibility for them.

I am a bit worried about OfflineSorter: it seems it will use java.io.tmpdir, 
which a lot of people probably don't configure for "serious" big files like 
this. It is also not as robust if things can't get cleaned up, on Windows and 
so on. But OfflineSorter has some ByteSequenceReader/Writer abstractions; I 
wonder if long-term we can plug those into our Directory API. Maybe it's good 
to think about a documented way for codecs to freely and easily use scratch 
files like this, where IndexFileDeleter could help out. That is for another 
issue though; it shouldn't hold this one up...
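
To make the scratch-file concern concrete, here is a minimal sketch using only 
java.nio.file. It is not OfflineSorter's or Directory's API: ScratchFiles, 
ScratchTask and withScratchFile are hypothetical names, and the idea of 
passing in a caller-chosen directory (for example one next to the index) 
instead of relying on java.io.tmpdir is an assumption for illustration only.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not part of Lucene.
final class ScratchFiles {
    private ScratchFiles() {}

    /** Work to run against a scratch file. */
    interface ScratchTask<T> {
        T run(Path scratch) throws IOException;
    }

    /**
     * Creates a scratch file inside the caller-chosen directory (rather than
     * java.io.tmpdir), runs the task, then tries to delete the file.
     */
    static <T> T withScratchFile(Path dir, String prefix, ScratchTask<T> task)
            throws IOException {
        Files.createDirectories(dir);
        Path scratch = Files.createTempFile(dir, prefix, ".tmp");
        try {
            return task.run(scratch);
        } finally {
            try {
                Files.deleteIfExists(scratch);
            } catch (IOException e) {
                // Best effort only: on Windows an open handle can block the
                // delete, so a real implementation would need a later
                // cleanup pass (the role suggested for IndexFileDeleter).
            }
        }
    }
}
```

A caller would pass the directory it wants the big temporary files to live in, 
so the choice of disk is explicit instead of inherited from the JVM default.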


> Add BKD tree for spatial shape query intersecting indexed points
> ----------------------------------------------------------------
>
>                 Key: LUCENE-6477
>                 URL: https://issues.apache.org/jira/browse/LUCENE-6477
>             Project: Lucene - Core
>          Issue Type: New Feature
>            Reporter: Michael McCandless
>            Assignee: Michael McCandless
>             Fix For: Trunk, 5.2
>
>         Attachments: LUCENE-6477.patch, LUCENE-6477.patch
>
>
> I'd like to explore using dedicated spatial trees for faster shape
> intersection filters than postings-based implementations.
> I implemented the tree data structure from
> https://www.cs.duke.edu/~pankaj/publications/papers/bkd-sstd.pdf
> The idea is simple: it builds a full binary tree, partitioning 2D
> space, alternately on lat and then lon, into smaller and smaller
> rectangles until a leaf has <= N (default 1024) points.
> It cannot index shapes (just points), but can then do fast shape
> intersection queries on those points.  Multi-valued fields are supported.
> I only implemented the "point is contained in this bounding box" query
> for now, but I think polygon shape querying should be easy to
> implement using the same approach from LUCENE-6450.
> For indexing, you add BKDPointField (takes lat, lon) to your doc, and
> must set up your Codec to use BKDTreeDocValuesFormat for that field.
> This DV format wraps Lucene50DVFormat, but then builds the disk-based
> BKD tree structure on the side.  BKDPointInBBoxQuery then requires this
> DVFormat, and casts it to gain access to the tree.
> I quantize each incoming double lat/lon to 32 bits of precision (so 64
> bits per point), which gives ~9 millimeter lon precision at the equator,
> I think.
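
The quantization and the alternating lat/lon split described in the quoted 
issue can be sketched in memory as follows. This is not the patch's code: 
BkdSketch, the scale constants, and the floor-and-clamp encoding are 
assumptions made for illustration, and the tree lives on the heap instead of 
on disk. The equatorial circumference used for the precision estimate is the 
usual ~40,075.017 km figure.

```java
import java.util.Arrays;
import java.util.Comparator;

// Hypothetical in-memory sketch, not the BKD tree from the patch.
final class BkdSketch {
    private BkdSketch() {}

    // Assumed encoding: spread the full degree range over 2^32 integer cells.
    static final double LAT_SCALE = (1L << 32) / 180.0;
    static final double LON_SCALE = (1L << 32) / 360.0;

    static int quantize(double degrees, double scale) {
        long v = (long) Math.floor(degrees * scale);
        // Clamp so that +90 / +180 still fit in a signed 32-bit int.
        return (int) Math.max(Integer.MIN_VALUE, Math.min(Integer.MAX_VALUE, v));
    }

    static int encodeLat(double lat) { return quantize(lat, LAT_SCALE); }
    static int encodeLon(double lon) { return quantize(lon, LON_SCALE); }

    /** Width of one longitude cell at the equator, in meters. */
    static double lonCellMetersAtEquator() {
        return 40_075_017.0 / (1L << 32);
    }

    static final class Node {
        final int[][] points; // non-null only on leaves; each point is {lat, lon}
        final int dim;        // 0 = lat, 1 = lon (inner nodes only)
        final int split;      // split value along dim (inner nodes only)
        final Node left;
        final Node right;

        Node(int[][] points) {
            this.points = points; this.dim = -1; this.split = 0;
            this.left = null; this.right = null;
        }

        Node(int dim, int split, Node left, Node right) {
            this.points = null; this.dim = dim; this.split = split;
            this.left = left; this.right = right;
        }
    }

    /** Splits pts[from, to) at the median, alternating lat and lon, until a leaf holds <= maxLeaf points. */
    static Node build(int[][] pts, int from, int to, int dim, int maxLeaf) {
        if (to - from <= maxLeaf) {
            return new Node(Arrays.copyOfRange(pts, from, to));
        }
        Arrays.sort(pts, from, to, Comparator.comparingInt(p -> p[dim]));
        int mid = (from + to) >>> 1;
        return new Node(dim, pts[mid][dim],
                build(pts, from, mid, 1 - dim, maxLeaf),
                build(pts, mid, to, 1 - dim, maxLeaf));
    }

    /** Counts points inside the inclusive box [min, max], pruning subtrees the box cannot reach. */
    static int count(Node n, int[] min, int[] max) {
        int c = 0;
        if (n.points != null) {
            for (int[] p : n.points) {
                if (p[0] >= min[0] && p[0] <= max[0] && p[1] >= min[1] && p[1] <= max[1]) {
                    c++;
                }
            }
            return c;
        }
        if (min[n.dim] <= n.split) c += count(n.left, min, max);  // left holds values <= split
        if (max[n.dim] >= n.split) c += count(n.right, min, max); // right holds values >= split
        return c;
    }
}
```

Under these assumptions one longitude cell is about 9.3 mm wide at the 
equator, which agrees with the estimate in the quoted description. A bounding 
box query only descends into subtrees whose side of the split overlaps the 
box, and that pruning is where the speedup over scanning every point comes 
from.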



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

