[ 
https://issues.apache.org/jira/browse/LUCENE-7270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15269524#comment-15269524
 ] 

Robert Muir commented on LUCENE-7270:
-------------------------------------

I think your tree might be doing too much work.

{code}
 private static boolean traverse(final Node node, final EdgeIterator 
edgeIterator, final double minValue, final double maxValue) {
      if (node == null) {
        return true;
      }
      if (minValue <= node.max) {
        
        // Does this node overlap?
        if (minValue <= node.high && maxValue >= node.low) {
          if (edgeIterator.matches(node.edge) == false) {
            return false;
          }
        }
        
        if (traverse(node.left, edgeIterator, minValue, maxValue) == false) {
          return false;
        }
        if (traverse(node.right, edgeIterator, minValue, maxValue) == false) { 
// missing a check here?
          return false;
        }
      }
      return true;
    }
{code}

on the other hand look at the 1D interval tree traversal of Polygon2D:
{code}
    boolean contains(double latitude, double longitude) {
      // crossings algorithm is an odd-even algorithm, so we descend the tree 
xor'ing results along our path
      boolean res = false;
      if (latitude <= max) {
        if (lat1 > latitude != lat2 > latitude) {
          if (longitude < (lon1 - lon2) * (latitude - lat2) / (lat1 - lat2) + 
lon2) {
            res = true;
          }
        }
        if (left != null) {
          res ^= left.contains(latitude, longitude);
        }
        if (right != null && latitude >= low) {
          res ^= right.contains(latitude, longitude);
        }
      }
      return res;
    }
{code}

See how it doesn't traverse the right subtree at all when the point is below 
the lower bound? If you want to see the same logic written another way, see 
https://en.wikipedia.org/wiki/Interval_tree#Java_Example:_Searching_a_point_or_an_interval_in_the_tree

Maybe its easier to see the problem in two dimensions: Polygon2D code uses the 
same logic but in two dimensions for polygon components (and holes) themselves, 
organized by bounding box. So same algorithm just a little bit hairier logic 
because "min" depends on which dimension we split on (x or y). Max of both 
dimensions is "pushed up" so its safe to check that for both dimensions every 
time.

{code}
  public boolean contains(double latitude, double longitude) {
    if (latitude <= maxY && longitude <= maxX) {
      if (componentContains(latitude, longitude)) {
        return true;
      }
      if (left != null) {
        if (left.contains(latitude, longitude)) {
          return true;
        }
      }
      if (right != null && ((splitX == false && latitude >= minLat) || (splitX 
&& longitude >= minLon))) {
        if (right.contains(latitude, longitude)) {
          return true;
        }
      }
    }
    return false;
  }
{code}

Anyway, thats the differences I see. Also null checks are in a different place: 
this way we don't do all those comparisons for "simple" cases and we keep a low 
overhead.

> Use better balanced trees for Geo3d complex polygons
> ----------------------------------------------------
>
>                 Key: LUCENE-7270
>                 URL: https://issues.apache.org/jira/browse/LUCENE-7270
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: modules/spatial3d
>            Reporter: Karl Wright
>            Assignee: Karl Wright
>             Fix For: master, 6.x
>
>
> The current tree data structure in GeoComplexPolygon has a lot of weaknesses. 
>  A better algorithm maybe can be taken from Polygon2D, which basically does 
> the following:
> Each node has:
> - low value (which is for that edge alone)
> - max value (which is for that edge and all children)
> Balanced tree building:
> - sort by low value (which is for the individual edge), and use max value as 
> tie breaker (which is max for edge and all children)
> - build tree after sorting, picking midpoint and recursively building lesser 
> and greater children that way
> Balanced tree traversal (looking for range minValue -> maxValue):
> - Descend the entire tree until the node fails this test:
>       if (minValue <= max) { ... }
>   So if the minimum value being sought is greater than the max for this edge 
> and all children, we stop and don't look at children.
>   (Q: does this represent a good split and a targeted range?  Maybe...  We 
> can certainly try it.)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to