[jira] [Updated] (LUCENE-7241) Improve performance of geo3d for polygons with very large numbers of points

Karl Wright (JIRA) Thu, 21 Apr 2016 20:40:26 -0700

     [ 
https://issues.apache.org/jira/browse/LUCENE-7241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Karl Wright updated LUCENE-7241:
--------------------------------
    Description: 
This ticket corresponds to LUCENE-7239, except it's for geo3d polygons.

The trick here is to organize edges by some criteria, e.g. z value range, and 
use that to avoid needing to go through all edges and/or tile large irregular 
polygons.  Then we use the ability to quickly determine intersections to figure 
out whether a point is within the polygon, or not.

The current way geo3d polygons are constructed involves finding a single point, 
or "pole", which all polygon points circle.  This point is known to be either 
"in" or "out" based on the direction of the points.  So we have one place of 
"truth" on the globe that is known at polygon setup time.

If edges are organized by z value, where the z values for an edge are computed 
by the standard way of computing bounds for a plane, then we can readily 
organize edges into a tree structure such that it is easy to find all edges we 
need to check for a given z value.  Then, we merely need to compute how many 
intersections to consider as we navigate from the "truth" point to the point 
being tested.  In practice, this means both having a tree that is organized by 
z, and a tree organized by (x,y), since we need to navigate in both directions. 
 But then we can cheaply count the number of intersections, and once we do 
that, we know whether our point is "in" or "out".

The other performance improvement we need is whether a given plane intersects 
the polygon within provided bounds.  This can be done using the same two trees 
(z and (x,y)), by virtue of picking which tree to use based on the plane's 
minimum bounds in z or (x,y).  And, in practice, we might well use three trees: 
one in x, one in y, and one in z, which would mean we didn't have to compute 
longitudes ever.

An implementation like this trades off the cost of finding point membership in 
near O\(log\(n)) time vs. the extra expense per step of finding that 
membership.  Setup of the query is O\(n) in this scheme, rather than O\(n^2) in 
the current implementation, but once again each individual step is more 
expensive.  Therefore I would expect we'd want to use the current 
implementation for simpler polygons and this sort of implementation for tougher 
polygons.  Choosing which to use is a topic for another ticket.



  was:
This ticket corresponds to LUCENE-7239, except it's for geo3d polygons.

The trick here is to organize edges by some criteria, e.g. z value range, and 
use that to avoid needing to go through all edges and/or tile large irregular 
polygons.  Then we use the ability to quickly determine intersections to figure 
out whether a point is within the polygon, or not.

The current way geo3d polygons are constructed involves finding a single point, 
or "pole", which all polygon points circle.  This point is known to be either 
"in" or "out" based on the direction of the points.  So we have one place of 
"truth" on the globe that is known at polygon setup time.

If edges are organized by z value, where the z values for an edge are computed 
by the standard way of computing bounds for a plane, then we can readily 
organize edges into a tree structure such that it is easy to find all edges we 
need to check for a given z value.  Then, we merely need to compute how many 
intersections to consider as we navigate from the "truth" point to the point 
being tested.  In practice, this means both having a tree that is organized by 
z, and a tree organized by (x,y), since we need to navigate in both directions. 
 But then we can cheaply count the number of intersections, and once we do 
that, we know whether our point is "in" or "out".

The other performance improvement we need is whether a given plane intersects 
the polygon within provided bounds.  This can be done using the same two trees 
(z and (x,y)), by virtue of picking which tree to use based on the plane's 
minimum bounds in z or (x,y).  And, in practice, we might well use three trees: 
one in x, one in y, and one in z, which would mean we didn't have to compute 
longitudes ever.

An implementation like this trades off the cost of finding point membership in 
near O(log(n)) time vs. the extra expense per step of finding that membership.  
Setup of the query is O\(n) in this scheme, rather than O\(n^2) in the current 
implementation, but once again each individual step is more expensive.  
Therefore I would expect we'd want to use the current implementation for 
simpler polygons and this sort of implementation for tougher polygons.  
Choosing which to use is a topic for another ticket.




> Improve performance of geo3d for polygons with very large numbers of points
> ---------------------------------------------------------------------------
>
>                 Key: LUCENE-7241
>                 URL: https://issues.apache.org/jira/browse/LUCENE-7241
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: modules/spatial3d
>    Affects Versions: master
>            Reporter: Karl Wright
>            Assignee: Karl Wright
>
> This ticket corresponds to LUCENE-7239, except it's for geo3d polygons.
> The trick here is to organize edges by some criteria, e.g. z value range, and 
> use that to avoid needing to go through all edges and/or tile large irregular 
> polygons.  Then we use the ability to quickly determine intersections to 
> figure out whether a point is within the polygon, or not.
> The current way geo3d polygons are constructed involves finding a single 
> point, or "pole", which all polygon points circle.  This point is known to be 
> either "in" or "out" based on the direction of the points.  So we have one 
> place of "truth" on the globe that is known at polygon setup time.
> If edges are organized by z value, where the z values for an edge are 
> computed by the standard way of computing bounds for a plane, then we can 
> readily organize edges into a tree structure such that it is easy to find all 
> edges we need to check for a given z value.  Then, we merely need to compute 
> how many intersections to consider as we navigate from the "truth" point to 
> the point being tested.  In practice, this means both having a tree that is 
> organized by z, and a tree organized by (x,y), since we need to navigate in 
> both directions.  But then we can cheaply count the number of intersections, 
> and once we do that, we know whether our point is "in" or "out".
> The other performance improvement we need is whether a given plane intersects 
> the polygon within provided bounds.  This can be done using the same two 
> trees (z and (x,y)), by virtue of picking which tree to use based on the 
> plane's minimum bounds in z or (x,y).  And, in practice, we might well use 
> three trees: one in x, one in y, and one in z, which would mean we didn't 
> have to compute longitudes ever.
> An implementation like this trades off the cost of finding point membership 
> in near O\(log\(n)) time vs. the extra expense per step of finding that 
> membership.  Setup of the query is O\(n) in this scheme, rather than O\(n^2) 
> in the current implementation, but once again each individual step is more 
> expensive.  Therefore I would expect we'd want to use the current 
> implementation for simpler polygons and this sort of implementation for 
> tougher polygons.  Choosing which to use is a topic for another ticket.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

[jira] [Updated] (LUCENE-7241) Improve performance of geo3d for polygons with very large numbers of points

Reply via email to