[
https://issues.apache.org/jira/browse/LUCENE-7241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Karl Wright updated LUCENE-7241:
--------------------------------
Attachment: LUCENE-7241.patch
Once more, attaching code for safekeeping. This implements the tie-in to
GeoPolygonFactory. The new method, makeLargeGeoPolygon, has characteristics
which make it quite distinct from the current makeGeoPolygon methods, which is
why there's a separate factory method and not just a heuristic at this level to
choose the best. These are listed in the javadoc for it, but summarized
quickly here:
(1) No error checking; polygons are assumed to be properly constructed;
(2) Very short edges are supported which are essentially coplanar with
adjoining edges;
(3) Algorithm is optimized for very large polygons ( > 100 sides);
What this has in common with the standard polygon implementation is:
(1) Computing the bounds of the shape remains expensive (O(N));
(2) Computing the outside distance to the shape remains expensive (O(N)).
> Improve performance of geo3d for polygons with very large numbers of points
> ---------------------------------------------------------------------------
>
> Key: LUCENE-7241
> URL: https://issues.apache.org/jira/browse/LUCENE-7241
> Project: Lucene - Core
> Issue Type: Improvement
> Components: modules/spatial3d
> Affects Versions: master
> Reporter: Karl Wright
> Assignee: Karl Wright
> Attachments: LUCENE-7241.patch, LUCENE-7241.patch, LUCENE-7241.patch
>
>
> This ticket corresponds to LUCENE-7239, except it's for geo3d polygons.
> The trick here is to organize edges by some criteria, e.g. z value range, and
> use that to avoid needing to go through all edges and/or tile large irregular
> polygons. Then we use the ability to quickly determine intersections to
> figure out whether a point is within the polygon, or not.
> The current way geo3d polygons are constructed involves finding a single
> point, or "pole", which all polygon points circle. This point is known to be
> either "in" or "out" based on the direction of the points. So we have one
> place of "truth" on the globe that is known at polygon setup time.
> If edges are organized by z value, where the z values for an edge are
> computed by the standard way of computing bounds for a plane, then we can
> readily organize edges into a tree structure such that it is easy to find all
> edges we need to check for a given z value. Then, we merely need to compute
> how many intersections to consider as we navigate from the "truth" point to
> the point being tested. In practice, this means both having a tree that is
> organized by z, and a tree organized by (x,y), since we need to navigate in
> both directions. But then we can cheaply count the number of intersections,
> and once we do that, we know whether our point is "in" or "out".
> The other performance improvement we need is whether a given plane intersects
> the polygon within provided bounds. This can be done using the same two
> trees (z and (x,y)), by virtue of picking which tree to use based on the
> plane's minimum bounds in z or (x,y). And, in practice, we might well use
> three trees: one in x, one in y, and one in z, which would mean we didn't
> have to compute longitudes ever.
> An implementation like this trades off the cost of finding point membership
> in near O\(log\(n)) time vs. the extra expense per step of finding that
> membership. Setup of the query is O\(n) in this scheme, rather than O\(n^2)
> in the current implementation, but once again each individual step is more
> expensive. Therefore I would expect we'd want to use the current
> implementation for simpler polygons and this sort of implementation for
> tougher polygons. Choosing which to use is a topic for another ticket.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]