[jira] [Updated] (LUCENE-7159) improve spatial point/rect vs. polygon performance

Robert Muir (JIRA) Mon, 04 Apr 2016 09:32:59 -0700

     [ 
https://issues.apache.org/jira/browse/LUCENE-7159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Robert Muir updated LUCENE-7159:
--------------------------------
    Attachment: LUCENE-7159.patch

I added javadocs, asserts, a direct test, cleaned up the internal API, and 
re-ran benchmarks with mike's openstreetmap benchmark:

Point in polygon only becomes a bottleneck for complex polygons, so it really 
only helps there, but doesn't hurt simple ones.

More than half the gain here is unrelated to the grid, and only due to very 
minor optimizations in Polygon.relates(), which actually is the bigger 
bottleneck.

I think we can do this as a start, and then as a followup implement relates() 
too? We should be able to speed those up in a similar way, most calls would 
then be rectangle <-> rectangle operations in integer space.

And of course it would still be nice to speed up the relation operations in 
Polygon.java (e.g. compute if convex in ctor or whatever), but that stuff is 
complex.

{noformat}
n=5 25.9 QPS -> 27.0 QPS
n=50 18.2 QPS -> 18.1 QPS
n=500 8.0 QPS -> 9.5 QPS
n=1000 5.0 QPS -> 6.2 QPS
{noformat}

> improve spatial point/rect vs. polygon performance
> --------------------------------------------------
>
>                 Key: LUCENE-7159
>                 URL: https://issues.apache.org/jira/browse/LUCENE-7159
>             Project: Lucene - Core
>          Issue Type: Bug
>            Reporter: Robert Muir
>         Attachments: LUCENE-7159.patch, LUCENE-7159.patch
>
>
> Now that we can query on complex polygons without going OOM (LUCENE-7153), we 
> should do something to address the current 🐢 performance.
> Currently, we use a basic crossings test ({{O\(n)}}) for boundary cases. We 
> defer these expensive per-doc checks on boundary cases to a two phase 
> iterator (LUCENE-7019, LUCENE-7109), so that it can be avoided if e.g. 
> excluded by filters, conjunctions, deleted doc, and so on. This is currently 
> important for performance, but basically its shoving the problem under the 
> rug and hoping it goes away. At least for point in poly, there are a number 
> of faster techniques described here: 
> http://erich.realtimerendering.com/ptinpoly/
> Additionally I am not sure how costly our "tree traversal" (rectangle 
> intersection algorithms). Maybe its nothing to be worried about, but likely 
> it too gets bad if the thing gets complex enough. These don't need to be 
> perfect but need to behave like java's Shape#contains (can conservatively 
> return false), and Shape#intersects (can conservatively return true). Of 
> course, if they are too inaccurate, then things can get slower.
> In cases of precomputed structures we should also consider memory usage: e.g. 
> we shouldn't make a horrible tradeoff there.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

[jira] [Updated] (LUCENE-7159) improve spatial point/rect vs. polygon performance

Reply via email to