[ 
https://issues.apache.org/jira/browse/LUCENE-6881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14996364#comment-14996364
 ] 

Michael McCandless commented on LUCENE-6881:
--------------------------------------------

I re-ran the "lat/lon points in rects around London, UK" perf test from 
luceneutil ({{IndexOSM*.java}} and {{SearchOSM*.java}} sources).

This test indexes 60.8 M lat/lon points derived from OpenStreetMap data and 
then runs varying, regularly spaced rectangles (225 queries in all) around 
London, UK.

I used SMS (SerialMergeScheduler) and LogDocsMP (LogDocMergePolicy) to get to 
a 5/5/5 segment structure for all three tests, so only a single thread is used 
throughout, for a fair comparison of indexing times:

*Spatial module, using RecursivePrefixTreeStrategy with PackedQuadPrefixTree at 
25 levels:*
  - 1,464 sec to index
  - 7.8 GB index on disk
  - 239 MB in-heap (ramBytesUsed summed across all segments)
  - 3.98 sec to run 225 searches (best of 100 iters)

*GeoPointField (sandbox)*
  - 497 sec to index
  - 3.2 GB index on disk
  - 86 MB in-heap (ramBytesUsed summed across all segments)
  - 4.48 sec to run 225 searches (best of 100 iters)

*Dimensional values (this patch) using default codec's dimensional format*
  - 744 sec to index
  - 704 MB index on disk
  - 2.3 MB in-heap (ramBytesUsed summed across all segments)
  - 2.85 sec to run 225 searches (best of 100 iters)

The spatial module is purely postings, GeoPointField is postings + doc 
values, and dimensional values use the new BKD tree.

Net/net, indexing time for the dimensional values approach falls between 
GeoPointField and the spatial module, but the resulting index and the heap 
required at search time are much smaller, and searching is faster.
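
To put rough numbers on that comparison, here is a small sketch that divides 
each of the other two results by the dimensional values result. The class and 
method names ({{BenchmarkRatios}}, {{ratio}}) are made up for illustration and 
are not part of luceneutil, and the disk sizes assume 1 GB = 1024 MB, which 
the numbers above do not specify:

```java
public class BenchmarkRatios {
    /** How many times larger (or slower) `other` is than `baseline`. */
    static double ratio(double other, double baseline) {
        return other / baseline;
    }

    public static void main(String[] args) {
        String[] names = {"Spatial module (RPT)", "GeoPointField"};
        String[] metrics = {"index time", "disk size", "heap", "search time"};

        // Figures copied from the results above:
        // index sec, disk MB, heap MB, search sec.
        // Assumption: 1 GB = 1024 MB for the disk sizes.
        double[][] others = {
            {1464, 7.8 * 1024, 239, 3.98},
            {497, 3.2 * 1024, 86, 4.48},
        };
        double[] dimensional = {744, 704, 2.3, 2.85};

        for (int i = 0; i < names.length; i++) {
            for (int m = 0; m < metrics.length; m++) {
                // A value below 1.0 means the other approach was better
                // on that metric (e.g. GeoPointField index time).
                System.out.printf("%s %s: %.2fx dimensional values%n",
                    names[i], metrics[m],
                    ratio(others[i][m], dimensional[m]));
            }
        }
    }
}
```

The ratios are only as meaningful as the single run they come from; they are 
a convenience for reading the results, not an additional measurement.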

The search time for dimensional values is a bit slower than the specialized (to 
lat/lon) doc-values based BKD from LUCENE-6477 / LUCENE-6645 (2.32 sec to run 
225 searches) but I think we can optimize things later.

I haven't tested the 1D case, and I suspect there are important specializations 
we can make there, but I'll save that for a follow-on.


> Cutover all BKD tree implementations to the codec
> -------------------------------------------------
>
>                 Key: LUCENE-6881
>                 URL: https://issues.apache.org/jira/browse/LUCENE-6881
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Michael McCandless
>            Assignee: Michael McCandless
>             Fix For: Trunk
>
>         Attachments: LUCENE-6881.patch, LUCENE-6881.patch
>
>
> This is phase 4 for enabling indexing dimensional values in Lucene
> ... follow-on from LUCENE-6861.
> This issue removes the 3 pre-existing specialized experimental BKD
> implementations (BKD* in sandbox module for 2D lat/lon geo, BKD3D* in
> spatial3d module for 3D x/y/z geo, and range tree in sandbox module)
> and instead switches over to having the codec index the dimensional
> values.



