[jira] [Comment Edited] (LUCENE-6699) Integrate lat/lon BKD and spatial3d
[ https://issues.apache.org/jira/browse/LUCENE-6699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14707941#comment-14707941 ] Karl Wright edited comment on LUCENE-6699 at 8/22/15 8:24 AM: -- Hmm, as a final check, I took the original point from the original failure, which is not adjusted and is therefore on the WGS84 surface. Unfortunately, that too fails in the same way: {code} c = new GeoCircle(PlanetModel.WGS84,-0.006450320645814321,0.004660694205115142,0.00489710732634323); //xyzb = new XYZBounds(); //c.getBounds(xyzb); //System.err.println("xmin="+xyzb.getMinimumX()+", xmax="+xyzb.getMaximumX()+",ymin="+xyzb.getMinimumY()+", ymax="+xyzb.getMaximumY()+",zmin="+xyzb.getMinimumZ()+", zmax="+xyzb.getMaximumZ()); //xmin=1.0010356621420726, xmax=1.0011141249179447,ymin=-2.5326643901354566E-4, ymax=0.009584741915757169,zmin=-0.011359874956269283, zmax=-0.0015549504447452225 area = GeoAreaFactory.makeGeoArea(PlanetModel.WGS84,1.0010822580620098,1.0010945779732867,0.007079167343247293,0.007541006774427837,-0.0021855011220022575,-0.001896122718181518); assertTrue(GeoArea.CONTAINS == area.getRelationship(c)); p2 = new GeoPoint(PlanetModel.WGS84,-0.002164069780096702, 0.007505617500830066); assertTrue(PlanetModel.WGS84.pointOnSurface(p2)); assertTrue(!c.isWithin(p2)); assertTrue(!area.isWithin(p2)); // fails {code} So there is something more subtle going on than I originally thought. Looking into it now. was (Author: kwri...@metacarta.com): Hmm, as a final check, I took the original point from the original failure, which is not adjusted and is therefore on the WGS84 surface. Unfortunately, that too fails in the same way: {code} c = new GeoCircle(PlanetModel.WGS84,-0.006450320645814321,0.004660694205115142,0.00489710732634323); //xyzb = new XYZBounds(); //c.getBounds(xyzb); //System.err.println("xmin="+xyzb.getMinimumX()+", xmax="+xyzb.getMaximumX()+",ymin="+xyzb.getMinimumY()+", ymax="+xyzb.getMaximumY()+",zmin="+xyzb.getMinimumZ()+", zmax="+xyzb.getMaximumZ()); //xmin=1.0010356621420726, xmax=1.0011141249179447,ymin=-2.5326643901354566E-4, ymax=0.009584741915757169,zmin=-0.011359874956269283, zmax=-0.0015549504447452225 area = GeoAreaFactory.makeGeoArea(PlanetModel.WGS84,1.0010822580620098,1.0010945779732867,0.007079167343247293,0.007541006774427837,-0.0021855011220022575,-0.001896122718181518); assertTrue(GeoArea.CONTAINS == area.getRelationship(c)); p2 = new GeoPoint(PlanetModel.WGS84,-0.002164069780096702, 0.007505617500830066); assertTrue(PlanetModel.WGS84.pointOnSurface(p2)); assertTrue(!c.isWithin(p2)); assertTrue(!area.isWithin(p2)); {code} So there is something more subtle going on than I originally thought. Looking into it now. > Integrate lat/lon BKD and spatial3d > --- > > Key: LUCENE-6699 > URL: https://issues.apache.org/jira/browse/LUCENE-6699 > Project: Lucene - Core > Issue Type: New Feature >Reporter: Michael McCandless >Assignee: Michael McCandless > Attachments: Geo3DPacking.java, LUCENE-6699.patch, LUCENE-6699.patch, > LUCENE-6699.patch, LUCENE-6699.patch, LUCENE-6699.patch, LUCENE-6699.patch, > LUCENE-6699.patch, LUCENE-6699.patch, LUCENE-6699.patch, LUCENE-6699.patch, > LUCENE-6699.patch, LUCENE-6699.patch, LUCENE-6699.patch, LUCENE-6699.patch, > LUCENE-6699.patch, LUCENE-6699.patch, LUCENE-6699.patch, LUCENE-6699.patch, > LUCENE-6699.patch, LUCENE-6699.patch, LUCENE-6699.patch, LUCENE-6699.patch, > LUCENE-6699.patch, LUCENE-6699.patch, LUCENE-6699.patch, LUCENE-6699.patch > > > I'm opening this for discussion, because I'm not yet sure how to do > this integration, because of my ignorance about spatial in general and > spatial3d in particular :) > Our BKD tree impl is very fast at doing lat/lon shape intersection > (bbox, polygon, soon distance: LUCENE-6698) against previously indexed > points. > I think to integrate with spatial3d, we would first need to record > lat/lon/z into doc values. Somewhere I saw discussion about how we > could stuff all 3 into a single long value with acceptable precision > loss? Or, we could use BinaryDocValues? We need all 3 dims available > to do the fast per-hit query time filtering. > But, second: what do we index into the BKD tree? Can we "just" index > earth surface lat/lon, and then at query time is spatial3d able to > give me an enclosing "surface lat/lon" bbox for a 3d shape? Or > ... must we index all 3 dimensions into the BKD tree (seems like this > could be somewhat wasteful)? -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail:
[jira] [Comment Edited] (LUCENE-6699) Integrate lat/lon BKD and spatial3d
[ https://issues.apache.org/jira/browse/LUCENE-6699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14704015#comment-14704015 ] Karl Wright edited comment on LUCENE-6699 at 8/20/15 12:02 AM: --- I know how to do it, PROVIDED that it is true that for any plane and any ellipsoid, the intersection of the plane and the ellipsoid is a simple ellipse. I don't yet know whether this is true, however. HA. Yes, it is true: http://www.scirp.org/journal/PaperInformation.aspx?PaperID=24506 was (Author: kwri...@metacarta.com): I know how to do it, PROVIDED that it is true that for any plane and any ellipsoid, the intersection of the plane and the ellipsoid is a simple ellipse. I don't yet know whether this is true, however. > Integrate lat/lon BKD and spatial3d > --- > > Key: LUCENE-6699 > URL: https://issues.apache.org/jira/browse/LUCENE-6699 > Project: Lucene - Core > Issue Type: New Feature >Reporter: Michael McCandless >Assignee: Michael McCandless > Attachments: Geo3DPacking.java, LUCENE-6699.patch, LUCENE-6699.patch, > LUCENE-6699.patch, LUCENE-6699.patch, LUCENE-6699.patch, LUCENE-6699.patch, > LUCENE-6699.patch, LUCENE-6699.patch, LUCENE-6699.patch, LUCENE-6699.patch, > LUCENE-6699.patch, LUCENE-6699.patch, LUCENE-6699.patch, LUCENE-6699.patch, > LUCENE-6699.patch, LUCENE-6699.patch, LUCENE-6699.patch, LUCENE-6699.patch, > LUCENE-6699.patch, LUCENE-6699.patch, LUCENE-6699.patch > > > I'm opening this for discussion, because I'm not yet sure how to do > this integration, because of my ignorance about spatial in general and > spatial3d in particular :) > Our BKD tree impl is very fast at doing lat/lon shape intersection > (bbox, polygon, soon distance: LUCENE-6698) against previously indexed > points. > I think to integrate with spatial3d, we would first need to record > lat/lon/z into doc values. Somewhere I saw discussion about how we > could stuff all 3 into a single long value with acceptable precision > loss? Or, we could use BinaryDocValues? We need all 3 dims available > to do the fast per-hit query time filtering. > But, second: what do we index into the BKD tree? Can we "just" index > earth surface lat/lon, and then at query time is spatial3d able to > give me an enclosing "surface lat/lon" bbox for a 3d shape? Or > ... must we index all 3 dimensions into the BKD tree (seems like this > could be somewhat wasteful)? -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Comment Edited] (LUCENE-6699) Integrate lat/lon BKD and spatial3d
[ https://issues.apache.org/jira/browse/LUCENE-6699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14703903#comment-14703903 ] Karl Wright edited comment on LUCENE-6699 at 8/19/15 11:34 PM: --- Ok, I know what is going on, and it is indeed related to the WGS84 model. But I have to think this through carefully. The strategy used to compute the X and Y bounds in XYZBound is subtlely flawed. Working on this now. was (Author: kwri...@metacarta.com): Ok, I know what is going on, and it is indeed related to the WGS84 model. But I have to think this through carefully. The strategy used to compute all three bounds in XYZBound, and the latitude bound in LatLonBounds, is subtlely flawed, I think. Working on this now. > Integrate lat/lon BKD and spatial3d > --- > > Key: LUCENE-6699 > URL: https://issues.apache.org/jira/browse/LUCENE-6699 > Project: Lucene - Core > Issue Type: New Feature >Reporter: Michael McCandless >Assignee: Michael McCandless > Attachments: Geo3DPacking.java, LUCENE-6699.patch, LUCENE-6699.patch, > LUCENE-6699.patch, LUCENE-6699.patch, LUCENE-6699.patch, LUCENE-6699.patch, > LUCENE-6699.patch, LUCENE-6699.patch, LUCENE-6699.patch, LUCENE-6699.patch, > LUCENE-6699.patch, LUCENE-6699.patch, LUCENE-6699.patch, LUCENE-6699.patch, > LUCENE-6699.patch, LUCENE-6699.patch, LUCENE-6699.patch, LUCENE-6699.patch, > LUCENE-6699.patch, LUCENE-6699.patch, LUCENE-6699.patch > > > I'm opening this for discussion, because I'm not yet sure how to do > this integration, because of my ignorance about spatial in general and > spatial3d in particular :) > Our BKD tree impl is very fast at doing lat/lon shape intersection > (bbox, polygon, soon distance: LUCENE-6698) against previously indexed > points. > I think to integrate with spatial3d, we would first need to record > lat/lon/z into doc values. Somewhere I saw discussion about how we > could stuff all 3 into a single long value with acceptable precision > loss? Or, we could use BinaryDocValues? We need all 3 dims available > to do the fast per-hit query time filtering. > But, second: what do we index into the BKD tree? Can we "just" index > earth surface lat/lon, and then at query time is spatial3d able to > give me an enclosing "surface lat/lon" bbox for a 3d shape? Or > ... must we index all 3 dimensions into the BKD tree (seems like this > could be somewhat wasteful)? -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Comment Edited] (LUCENE-6699) Integrate lat/lon BKD and spatial3d
[ https://issues.apache.org/jira/browse/LUCENE-6699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14703543#comment-14703543 ] Karl Wright edited comment on LUCENE-6699 at 8/19/15 6:44 PM: -- Hmm, I couldn't reproduce this with a simple test. Here's the failure detail: {code} [junit4] 2> java.lang.AssertionError: Solid=XYZSolid: {planetmodel=PlanetModel.SPHERE, isWholeWorld=false, minXplane=[A=1.0, B=0.0, C=0.0, D=-0.778774751769, side=1.0], maxXplane=[A=1.0, B=0.0, C=0.0, D=-0.780900134368, side=-1.0], minYplane=[A=0.0, B=1.0, C=0.0, D=0.002943435994670142, side=1.0], maxYplane=[A=0.0, B=1.0, C=0.0, D=0.0029114063562165494, side=-1.0], minZplane=[A=0.0, B=0.0, C=1.0, D=0.005971010432932473, side=1.0], maxZplane=[A=0.0, B=0.0, C=1.0, D=0.005938981247250581, side=-1.0]}; Shape=GeoCircle: {planetmodel=PlanetModel.SPHERE, center=[X=0.779838725235, Y=-0.0029274211758186968, Z=-0.0059549958440800015], radius=1.601488279374338E-5(9.175851934781766E-4)} {code} Here's the test code I created that passes: {code} c = new GeoCircle(PlanetModel.SPHERE, -0.00595503104063, -0.00292747726474, 1.601488279374338E-5); xyzb = new XYZBounds(); c.getBounds(xyzb); GeoArea area = GeoAreaFactory.makeGeoArea(PlanetModel.SPHERE, xyzb.getMinimumX(), xyzb.getMaximumX(), xyzb.getMinimumY(), xyzb.getMaximumY(), xyzb.getMinimumZ(), xyzb.getMaximumZ()); int relationship = area.getRelationship(c); assertTrue(relationship == GeoArea.WITHIN || relationship == GeoArea.OVERLAPS); {code} Here's the math I did to get there: {code} >>> Z=-0.0059549958440800015 >>> Y=-0.0029274211758186968 >>> X=0.779838725235 >>> print math.asin(Z) -0.00595503104063 >>> print math.atan2(Y,X) -0.00292747726474 >>> {code} was (Author: kwri...@metacarta.com): Hmm, I couldn't reproduce this with a simple test. Here's the failure detail: {code} [junit4] 2> java.lang.AssertionError: Solid=XYZSolid: {planetmodel=PlanetModel.SPHERE, isWholeWorld=false, minXplane=[A=1.0, B=0.0, C=0.0, D=-0.778774751769, side=1.0], maxXplane=[A=1.0, B=0.0, C=0.0, D=-0.780900134368, side=-1.0], minYplane=[A=0.0, B=1.0, C=0.0, D=0.002943435994670142, side=1.0], maxYplane=[A=0.0, B=1.0, C=0.0, D=0.0029114063562165494, side=-1.0], minZplane=[A=0.0, B=0.0, C=1.0, D=0.005971010432932473, side=1.0], maxZplane=[A=0.0, B=0.0, C=1.0, D=0.005938981247250581, side=-1.0]}; Shape=GeoCircle: {planetmodel=PlanetModel.SPHERE, center=[X=0.779838725235, Y=-0.0029274211758186968, Z=-0.0059549958440800015], radius=1.601488279374338E-5(9.175851934781766E-4)} {code} Here's the test code I created that passes: {code} c = new GeoCircle(PlanetModel.SPHERE, -0.00595503104063, -0.00292747726474, 1.601488279374338E-5); xyzb = new XYZBounds(); c.getBounds(xyzb); GeoArea area = GeoAreaFactory.makeGeoArea(PlanetModel.SPHERE, xyzb.getMinimumX(), xyzb.getMaximumX(), xyzb.getMinimumY(), xyzb.getMaximumY(), xyzb.getMinimumZ(), xyzb.getMaximumZ()); int relationship = area.getRelationship(c); assertTrue(relationship == GeoArea.WITHIN || relationship == GeoArea.OVERLAPS); {code} > Integrate lat/lon BKD and spatial3d > --- > > Key: LUCENE-6699 > URL: https://issues.apache.org/jira/browse/LUCENE-6699 > Project: Lucene - Core > Issue Type: New Feature >Reporter: Michael McCandless >Assignee: Michael McCandless > Attachments: Geo3DPacking.java, LUCENE-6699.patch, LUCENE-6699.patch, > LUCENE-6699.patch, LUCENE-6699.patch, LUCENE-6699.patch, LUCENE-6699.patch, > LUCENE-6699.patch, LUCENE-6699.patch, LUCENE-6699.patch, LUCENE-6699.patch, > LUCENE-6699.patch, LUCENE-6699.patch, LUCENE-6699.patch, LUCENE-6699.patch, > LUCENE-6699.patch, LUCENE-6699.patch, LUCENE-6699.patch, LUCENE-6699.patch, > LUCENE-6699.patch, LUCENE-6699.patch > > > I'm opening this for discussion, because I'm not yet sure how to do > this integration, because of my ignorance about spatial in general and > spatial3d in particular :) > Our BKD tree impl is very fast at doing lat/lon shape intersection > (bbox, polygon, soon distance: LUCENE-6698) against previously indexed > points. > I think to integrate with spatial3d, we would first need to record > lat/lon/z into doc values. Somewhere I saw discussion about how we > could stuff all 3 into a single long value with acceptable precision > loss? Or, we could use BinaryDocValues? We need all 3 dims available > to do the fast per-hit query time filtering. > But, second: what do we index into the BKD tree? Can we "just" index > earth surface lat/lon, and then at query time is spatial3d able to > give me an enclosing "surface lat/lon" bbox for a 3d shape? Or > ... must we index all 3 dimensions into the BKD tree (seems like this > could
[jira] [Comment Edited] (LUCENE-6699) Integrate lat/lon BKD and spatial3d
[ https://issues.apache.org/jira/browse/LUCENE-6699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14703091#comment-14703091 ] Karl Wright edited comment on LUCENE-6699 at 8/19/15 2:27 PM: -- Ok, I'm swamped at the moment, so anything you can do to describe the sequence of interactions with Geo3D that demonstrate a problem or inconsistency would be very useful. I will have time Thursday evening and Friday morning to look at those in detail I think. ;-) was (Author: kwri...@metacarta.com): Ok, I'm swamped at the moment, so anything you can do to describe the sequence of interactions with Geo3D that demonstrate a problem would be very useful. I will have time Thursday evening and Friday morning to look at those in detail I think. ;-) > Integrate lat/lon BKD and spatial3d > --- > > Key: LUCENE-6699 > URL: https://issues.apache.org/jira/browse/LUCENE-6699 > Project: Lucene - Core > Issue Type: New Feature >Reporter: Michael McCandless >Assignee: Michael McCandless > Attachments: Geo3DPacking.java, LUCENE-6699.patch, LUCENE-6699.patch, > LUCENE-6699.patch, LUCENE-6699.patch, LUCENE-6699.patch, LUCENE-6699.patch, > LUCENE-6699.patch, LUCENE-6699.patch, LUCENE-6699.patch, LUCENE-6699.patch, > LUCENE-6699.patch, LUCENE-6699.patch, LUCENE-6699.patch, LUCENE-6699.patch, > LUCENE-6699.patch, LUCENE-6699.patch, LUCENE-6699.patch, LUCENE-6699.patch, > LUCENE-6699.patch > > > I'm opening this for discussion, because I'm not yet sure how to do > this integration, because of my ignorance about spatial in general and > spatial3d in particular :) > Our BKD tree impl is very fast at doing lat/lon shape intersection > (bbox, polygon, soon distance: LUCENE-6698) against previously indexed > points. > I think to integrate with spatial3d, we would first need to record > lat/lon/z into doc values. Somewhere I saw discussion about how we > could stuff all 3 into a single long value with acceptable precision > loss? Or, we could use BinaryDocValues? We need all 3 dims available > to do the fast per-hit query time filtering. > But, second: what do we index into the BKD tree? Can we "just" index > earth surface lat/lon, and then at query time is spatial3d able to > give me an enclosing "surface lat/lon" bbox for a 3d shape? Or > ... must we index all 3 dimensions into the BKD tree (seems like this > could be somewhat wasteful)? -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Comment Edited] (LUCENE-6699) Integrate lat/lon BKD and spatial3d
[ https://issues.apache.org/jira/browse/LUCENE-6699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14700292#comment-14700292 ] Karl Wright edited comment on LUCENE-6699 at 8/17/15 9:57 PM: -- So what is the code? Are you constructing an XYZSolid from the XYZBounds for the shape? The shape should definitely be contained by an XYZ solid constructed from the XYZBounds for the shape. Round off should not be a concern. It's possible, though, that you might be misinterpreting the result from getRelationship(). Either that, or some specific shape has code problems I am unaware of and need to debug. So code would help. ;-) was (Author: kwri...@metacarta.com): So what is the code? Are you constructing an XYZSolid from the XYZBounds for the shape? > Integrate lat/lon BKD and spatial3d > --- > > Key: LUCENE-6699 > URL: https://issues.apache.org/jira/browse/LUCENE-6699 > Project: Lucene - Core > Issue Type: New Feature >Reporter: Michael McCandless >Assignee: Michael McCandless > Attachments: Geo3DPacking.java, LUCENE-6699.patch, LUCENE-6699.patch, > LUCENE-6699.patch, LUCENE-6699.patch, LUCENE-6699.patch, LUCENE-6699.patch, > LUCENE-6699.patch, LUCENE-6699.patch, LUCENE-6699.patch, LUCENE-6699.patch, > LUCENE-6699.patch, LUCENE-6699.patch, LUCENE-6699.patch, LUCENE-6699.patch, > LUCENE-6699.patch, LUCENE-6699.patch > > > I'm opening this for discussion, because I'm not yet sure how to do > this integration, because of my ignorance about spatial in general and > spatial3d in particular :) > Our BKD tree impl is very fast at doing lat/lon shape intersection > (bbox, polygon, soon distance: LUCENE-6698) against previously indexed > points. > I think to integrate with spatial3d, we would first need to record > lat/lon/z into doc values. Somewhere I saw discussion about how we > could stuff all 3 into a single long value with acceptable precision > loss? Or, we could use BinaryDocValues? We need all 3 dims available > to do the fast per-hit query time filtering. > But, second: what do we index into the BKD tree? Can we "just" index > earth surface lat/lon, and then at query time is spatial3d able to > give me an enclosing "surface lat/lon" bbox for a 3d shape? Or > ... must we index all 3 dimensions into the BKD tree (seems like this > could be somewhat wasteful)? -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Comment Edited] (LUCENE-6699) Integrate lat/lon BKD and spatial3d
[ https://issues.apache.org/jira/browse/LUCENE-6699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14695068#comment-14695068 ] Karl Wright edited comment on LUCENE-6699 at 8/13/15 11:29 AM: --- Hmm, either there was an svn hiccup, or you got the wrong patch. ;-) Actually, it appears that I uploaded the same patch twice, so it's my fault. But in any case, attaching a new one based on the current branch status. was (Author: kwri...@metacarta.com): Hmm, either there was an svn hiccup, or you got the wrong patch. ;-) Attaching a new one based on the current branch status. > Integrate lat/lon BKD and spatial3d > --- > > Key: LUCENE-6699 > URL: https://issues.apache.org/jira/browse/LUCENE-6699 > Project: Lucene - Core > Issue Type: New Feature >Reporter: Michael McCandless >Assignee: Michael McCandless > Attachments: Geo3DPacking.java, LUCENE-6699.patch, LUCENE-6699.patch, > LUCENE-6699.patch, LUCENE-6699.patch, LUCENE-6699.patch, LUCENE-6699.patch, > LUCENE-6699.patch > > > I'm opening this for discussion, because I'm not yet sure how to do > this integration, because of my ignorance about spatial in general and > spatial3d in particular :) > Our BKD tree impl is very fast at doing lat/lon shape intersection > (bbox, polygon, soon distance: LUCENE-6698) against previously indexed > points. > I think to integrate with spatial3d, we would first need to record > lat/lon/z into doc values. Somewhere I saw discussion about how we > could stuff all 3 into a single long value with acceptable precision > loss? Or, we could use BinaryDocValues? We need all 3 dims available > to do the fast per-hit query time filtering. > But, second: what do we index into the BKD tree? Can we "just" index > earth surface lat/lon, and then at query time is spatial3d able to > give me an enclosing "surface lat/lon" bbox for a 3d shape? Or > ... must we index all 3 dimensions into the BKD tree (seems like this > could be somewhat wasteful)? -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Comment Edited] (LUCENE-6699) Integrate lat/lon BKD and spatial3d
[ https://issues.apache.org/jira/browse/LUCENE-6699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14692456#comment-14692456 ] Karl Wright edited comment on LUCENE-6699 at 8/11/15 11:20 PM: --- Ok, did not understand that. We don't yet have the ability to get a Bounds result for a shape that is x,y,z instead of lat/lon. (That was the optimization we discussed but decided to do as a second step). But I presume you *do* want the ability to know, for a given planet model, the actual bounds of the planet. ;-) That's gotta go somewhere. was (Author: kwri...@metacarta.com): Ok, did not understand that. We don't yet have the ability to get a Bounds result for a shape that is x,y,z instead of lat/lon. But I presume you *do* want the ability to know, for a given planet model, the actual bounds of the planet. ;-) That's gotta go somewhere. > Integrate lat/lon BKD and spatial3d > --- > > Key: LUCENE-6699 > URL: https://issues.apache.org/jira/browse/LUCENE-6699 > Project: Lucene - Core > Issue Type: New Feature >Reporter: Michael McCandless >Assignee: Michael McCandless > Attachments: Geo3DPacking.java, LUCENE-6699.patch, LUCENE-6699.patch, > LUCENE-6699.patch, LUCENE-6699.patch > > > I'm opening this for discussion, because I'm not yet sure how to do > this integration, because of my ignorance about spatial in general and > spatial3d in particular :) > Our BKD tree impl is very fast at doing lat/lon shape intersection > (bbox, polygon, soon distance: LUCENE-6698) against previously indexed > points. > I think to integrate with spatial3d, we would first need to record > lat/lon/z into doc values. Somewhere I saw discussion about how we > could stuff all 3 into a single long value with acceptable precision > loss? Or, we could use BinaryDocValues? We need all 3 dims available > to do the fast per-hit query time filtering. > But, second: what do we index into the BKD tree? Can we "just" index > earth surface lat/lon, and then at query time is spatial3d able to > give me an enclosing "surface lat/lon" bbox for a 3d shape? Or > ... must we index all 3 dimensions into the BKD tree (seems like this > could be somewhat wasteful)? -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Comment Edited] (LUCENE-6699) Integrate lat/lon BKD and spatial3d
[ https://issues.apache.org/jira/browse/LUCENE-6699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14647540#comment-14647540 ] Nicholas Knize edited comment on LUCENE-6699 at 7/30/15 12:23 PM: -- bq. Right, but you'd be comparing 2 sines, 2 cosines, and 1 sqrt against only the cost of unpacking Encoding/Decoding ECEF into 96 Bits: {noformat} Avg computation: 664.6969666857143 ns Trials: 3500 Total time: 23264.393834 ms Avg computation: 664.829008375 ns Trials: 4000 Total time: 26593.160335 ms Avg computation: 667.362547134 ns Trials: 4500 Total time: 30031.314621 ms Avg computation: 668.46880436 ns Trials: 5000 Total time: 33423.440218 ms Avg computation: 667.8703028909091 ns Trials: 5500 Total time: 36732.866659 ms Avg computation: 669.375388866 ns Trials: 6000 Total time: 40162.523332 ms Avg computation: 668.4362739230769 ns Trials: 6500 Total time: 43448.357805 ms Avg computation: 667.9539851 ns Trials: 7000 Total time: 46756.778957 ms Avg computation: 667.363829733 ns Trials: 7500 Total time: 50052.28723 ms Avg computation: 675.024778375 ns Trials: 8000 Total time: 54001.98227 ms Avg computation: 674.1673578352941 ns Trials: 8500 Total time: 57304.225416 ms Avg computation: 673.472343978 ns Trials: 9000 Total time: 60612.510958 ms Avg computation: 673.0372402842105 ns Trials: 9500 Total time: 63938.537827 ms Avg computation: 672.55224382 ns Trials: 1 Total time: 67255.224382 ms {noformat} Compared to packing/unpacking lat/lon into 64 bits using using GeoPointField morton bit twiddling: {noformat} Avg computation: 60.397136 ns Trials: 3500 Total time: 2113.89976 ms Avg computation: 61.6391708 ns Trials: 4000 Total time: 2465.566832 ms Avg computation: 62.7440744 ns Trials: 4500 Total time: 2823.48334 ms Avg computation: 63.5108 ns Trials: 5000 Total time: 3175.54 ms Avg computation: 64.18207294545455 ns Trials: 5500 Total time: 3530.014012 ms Avg computation: 64.7368465667 ns Trials: 6000 Total time: 3884.210794 ms Avg computation: 65.18073341538461 ns Trials: 6500 Total time: 4236.747672 ms Avg computation: 65.5902512 ns Trials: 7000 Total time: 4591.317584 ms Avg computation: 65.0290225333 ns Trials: 7500 Total time: 4877.17669 ms Avg computation: 63.6249806 ns Trials: 8000 Total time: 5089.998448 ms Avg computation: 62.4193206 ns Trials: 8500 Total time: 5305.642251 ms Avg computation: 61.34443397776 ns Trials: 9000 Total time: 5520.999058 ms Avg computation: 61.402236642105265 ns Trials: 9500 Total time: 5833.212481 ms Avg computation: 61.10019762 ns Trials: 1 Total time: 6110.019762 ms {noformat} So using the 3D BitSet approach is 10 times longer, with the obvious culprit being the for loop for each bit. This can be optimized, though, using a 3-way bit twiddle and 2 longs if a 64 bit 3D packing yields unacceptable loss of precision. bq. so I'd think a benchmarking should not overemphasize seeks as being costly Maybe not. But it does depend on the on-disk representation of the tree (and I typically don't use SSDs as an excuse for not paying attention to good file layout). I was mentioning this in the context of index size as a function of a "wasteful" vs. efficient encoding. was (Author: nknize): bq. Right, but you'd be comparing 2 sines, 2 cosines, and 1 sqrt against only the cost of unpacking Encoding/Decoding ECEF into 96 Bits: {noformat} Avg computation: 664.6969666857143 ns Trials: 3500 Total time: 23264.393834 ms Avg computation: 664.829008375 ns Trials: 4000 Total time: 26593.160335 ms Avg computation: 667.362547134 ns Trials: 4500 Total time: 30031.314621 ms Avg computation: 668.46880436 ns Trials: 5000 Total time: 33423.440218 ms Avg computation: 667.8703028909091 ns Trials: 5500 Total time: 36732.866659 ms Avg computation: 669.375388866 ns Trials: 6000 Total time: 40162.523332 ms Avg computation: 668.4362739230769 ns Trials: 6500 Total time: 43448.357805 ms Avg computation: 667.9539851 ns Trials: 7000 Total time: 46756.778957 ms Avg computation: 667.363829733 ns Trials: 7500 Total time: 50052.28723 ms Avg computation: 675.024778375 ns Trials: 8000 Total time: 54001.98227 ms Avg computation: 674.1673578352941 ns Trials: 8500 Total time: 57304.225416 ms Avg computation: 673.472343978 ns Trials: 9000 Total time: 60612.510958 ms Avg computation: 673.0372402842105 ns Trials: 9500 Total time: 63938.537827 ms Avg computation: 672.55224382 ns Trials: 1 Total time: 67255.224382 ms {noformat} Compared to packing/unpacking lat/lon into 64 bits using using GeoPointField morton bit twiddling: {noformat} Avg computation: 6
[jira] [Comment Edited] (LUCENE-6699) Integrate lat/lon BKD and spatial3d
[ https://issues.apache.org/jira/browse/LUCENE-6699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14647128#comment-14647128 ] Nicholas Knize edited comment on LUCENE-6699 at 7/30/15 12:00 PM: -- bq. The obvious alternative would be to store just lat/lon, exactly like is done for Nicholas's code Just to clarify, so there's no confusion, the attached code converts from lat/lon to ECEF (either full range in meters, or scaled to the unit spheroid) and stores the result in a 96 bit BitSet. GeoPointField stores encoded lat/lon (if that's what you were referring to). The attached was intended to be used for a Geo3d integration w/ GeoPointField. bq. I expect the conversion for every record would turn out to be more expensive. Just to note again, the conversion from lat/lon to ECEF XYZ is a non-iterative conversion (2 sines, 2 cosines, 1 sqrt). Conversion cost vs. wateful representation is the interesting question, so I threw together a quick encoding/decoding benchmark (on my i7 16GB System76) and ran it on 100M points (converting from lla to ECEF and back). The law of large numbers took over at around 35M. For interest the results are provided: {noformat} Avg computation: 620.431975767 ns Trials: 3000 Total time: 18612.959273 ms Avg computation: 621.3008362285715 ns Trials: 3500 Total time: 21745.529268 ms Avg computation: 621.582647925 ns Trials: 4000 Total time: 24863.305917 ms Avg computation: 621.372458955 ns Trials: 4500 Total time: 27961.760653 ms Avg computation: 621.16271364 ns Trials: 5000 Total time: 31058.135682 ms Avg computation: 621.3857686909091 ns Trials: 5500 Total time: 34176.217278 ms Avg computation: 621.8110524 ns Trials: 6000 Total time: 37308.663144 ms Avg computation: 621.6768083230769 ns Trials: 6500 Total time: 40408.992541 ms Avg computation: 621.5324306714285 ns Trials: 7000 Total time: 43507.270147 ms Avg computation: 621.444053693 ns Trials: 7500 Total time: 46608.304027 ms Avg computation: 621.7594845875 ns Trials: 8000 Total time: 49740.758767 ms Avg computation: 621.9327540705882 ns Trials: 8500 Total time: 52864.284096 ms Avg computation: 621.942943456 ns Trials: 9000 Total time: 55974.864911 ms Avg computation: 621.8868688947368 ns Trials: 9500 Total time: 59079.252545 ms Avg computation: 621.98037608 ns Trials: 1 Total time: 62198.037608 ms {noformat} Again, those are both conversions to ECEF and back to LLA. So roughly 1 minute for 100M points. Halve those numbers and you have the cost for converting either direction. Next we could benchmark tree access and compare? I suspect traversal cost should be a function of the on-disk representation and chosen split method (that's where RTrees tend to become costly). Since BKD splits a sorted space, and it can exploit file system cache? I suspect there aren't many random seeks? Seems benchmarking might be relatively straightforward? was (Author: nknize): bq. The obvious alternative would be to store just lat/lon, exactly like is done for Nicholas's code Just to clarify, so there's no confusion, the attached code converts from lat/lon to ECEF (either full range in meters, or scaled to the unit spheroid) and stores the result in a 96 bit BitSet. GeoPointField stores encoded lat/lon (if that's what you were referring to). The attached was intended to be used for a Geo3d integration w/ GeoPointField. bq. I expect the conversion for every record would turn out to be more expensive. Just to note again, the conversion from lat/lon to ECEF XYZ is a non-iterative conversion (2 sines, 2 cosines, 1 sqrt). Conversion cost vs. wateful representation is the interesting question, so I threw together a quick encoding/decoding benchmark (on my i7 16GB System76) and ran it on 1B points (converting from lla to ECEF and back). The law of large numbers took over at around 350M. For interest the results are provided: {noformat} Avg computation: 620.431975767 ns Trials: 3000 Total time: 18612.959273 ms Avg computation: 621.3008362285715 ns Trials: 3500 Total time: 21745.529268 ms Avg computation: 621.582647925 ns Trials: 4000 Total time: 24863.305917 ms Avg computation: 621.372458955 ns Trials: 4500 Total time: 27961.760653 ms Avg computation: 621.16271364 ns Trials: 5000 Total time: 31058.135682 ms Avg computation: 621.3857686909091 ns Trials: 5500 Total time: 34176.217278 ms Avg computation: 621.8110524 ns Trials: 6000 Total time: 37308.663144 ms Avg computation: 621.6768083230769 ns Trials: 6500 Total time: 40408.992541 ms Avg computation: 621.5324306714285 ns Trials: 7000 Total time: 43507.270147 ms Avg computation: 621.444053693 ns Trials: 7500 Total time: 46608.304027 ms Avg computation: 621.75
[jira] [Comment Edited] (LUCENE-6699) Integrate lat/lon BKD and spatial3d
[ https://issues.apache.org/jira/browse/LUCENE-6699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14647128#comment-14647128 ] Nicholas Knize edited comment on LUCENE-6699 at 7/30/15 3:54 AM: - bq. The obvious alternative would be to store just lat/lon, exactly like is done for Nicholas's code Just to clarify, so there's no confusion, the attached code converts from lat/lon to ECEF (either full range in meters, or scaled to the unit spheroid) and stores the result in a 96 bit BitSet. GeoPointField stores encoded lat/lon (if that's what you were referring to). The attached was intended to be used for a Geo3d integration w/ GeoPointField. bq. I expect the conversion for every record would turn out to be more expensive. Just to note again, the conversion from lat/lon to ECEF XYZ is a non-iterative conversion (2 sines, 2 cosines, 1 sqrt). Conversion cost vs. wateful representation is the interesting question, so I threw together a quick encoding/decoding benchmark (on my i7 16GB System76) and ran it on 1B points (converting from lla to ECEF and back). The law of large numbers took over at around 350M. For interest the results are provided: {noformat} Avg computation: 620.431975767 ns Trials: 3000 Total time: 18612.959273 ms Avg computation: 621.3008362285715 ns Trials: 3500 Total time: 21745.529268 ms Avg computation: 621.582647925 ns Trials: 4000 Total time: 24863.305917 ms Avg computation: 621.372458955 ns Trials: 4500 Total time: 27961.760653 ms Avg computation: 621.16271364 ns Trials: 5000 Total time: 31058.135682 ms Avg computation: 621.3857686909091 ns Trials: 5500 Total time: 34176.217278 ms Avg computation: 621.8110524 ns Trials: 6000 Total time: 37308.663144 ms Avg computation: 621.6768083230769 ns Trials: 6500 Total time: 40408.992541 ms Avg computation: 621.5324306714285 ns Trials: 7000 Total time: 43507.270147 ms Avg computation: 621.444053693 ns Trials: 7500 Total time: 46608.304027 ms Avg computation: 621.7594845875 ns Trials: 8000 Total time: 49740.758767 ms Avg computation: 621.9327540705882 ns Trials: 8500 Total time: 52864.284096 ms Avg computation: 621.942943456 ns Trials: 9000 Total time: 55974.864911 ms Avg computation: 621.8868688947368 ns Trials: 9500 Total time: 59079.252545 ms Avg computation: 621.98037608 ns Trials: 1 Total time: 62198.037608 ms {noformat} Again, those are both conversions to ECEF and back to LLA. So roughly 1 minute for 1B points. Halve those numbers and you have the cost for converting either direction. Next we could benchmark tree access and compare? I suspect traversal cost should be a function of the on-disk representation and chosen split method (that's where RTrees tend to become costly). Since BKD splits a sorted space, and it can exploit file system cache? I suspect there aren't many random seeks? Seems benchmarking might be relatively straightforward? was (Author: nknize): bq. The obvious alternative would be to store just lat/lon, exactly like is done for Nicholas's code Just to clarify, so there's no confusion, the attached code converts from lat/lon to ECEF (either full range in meters, or scaled to the unit spheroid) and stores the result in a 96 bit BitSet. GeoPointField stores encoded lat/lon (if that's what you were referring to). The attached was intended to be used for a Geo3d integration w/ GeoPointField. bq. I expect the conversion for every record would turn out to be more expensive. Just to note again, the conversion from lat/lon to ECEF XYZ is a non-iterative conversion (2 sines, 2 cosines, 1 sqrt). I threw together a quick benchmark (on my i7 16GB System76) and ran it on 1B points (converting from lla to ECEF and back). The law of large numbers took over at around 350M. For interest the results are provided: {noformat} Avg computation: 620.431975767 ns Trials: 3000 Total time: 18612.959273 ms Avg computation: 621.3008362285715 ns Trials: 3500 Total time: 21745.529268 ms Avg computation: 621.582647925 ns Trials: 4000 Total time: 24863.305917 ms Avg computation: 621.372458955 ns Trials: 4500 Total time: 27961.760653 ms Avg computation: 621.16271364 ns Trials: 5000 Total time: 31058.135682 ms Avg computation: 621.3857686909091 ns Trials: 5500 Total time: 34176.217278 ms Avg computation: 621.8110524 ns Trials: 6000 Total time: 37308.663144 ms Avg computation: 621.6768083230769 ns Trials: 6500 Total time: 40408.992541 ms Avg computation: 621.5324306714285 ns Trials: 7000 Total time: 43507.270147 ms Avg computation: 621.444053693 ns Trials: 7500 Total time: 46608.304027 ms Avg computation: 621.7594845875 ns Trials: 8000 Total time: 49740.758767 ms Avg computation: 621.9327540705882 ns
[jira] [Comment Edited] (LUCENE-6699) Integrate lat/lon BKD and spatial3d
[ https://issues.apache.org/jira/browse/LUCENE-6699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14646331#comment-14646331 ] Nicholas Knize edited comment on LUCENE-6699 at 7/29/15 4:26 PM: - I had thrown this code together a while ago before I put the GeoPointField / Geo3d integration work on hold. This rough draft converts 3D LLA coordinates into ECEF cartesian coordinates using the WGS84 based non-iterative approach in GeoProjectionUtils (derived from the conversion approach illustrated in the Manual of Photogrammetry using 2 sin/cos and 1 sqrt). The ECEF cartesian coordinate is scaled to a unit spheroid (since this is presumably what Geo3D requires) and each of the 32 bits are interleaved (XYZ) akin to MortonEncoding. Decoding procedures are also provided. While this encoding is not nicely represented as a long (not yet convinced 21 bits per will preserve "acceptable" precision, though we could allocate bits differently since larger altitudes degrade w/ conversion) it is BinaryDocValue friendly and enables a space partitioning/prefix coded approach similar to the way GeoPointField currently works. The Most-Significant 3 bits represent an Oct-Cell at the first level, next 3 for level 2, etc. was (Author: nknize): I had thrown this code together a while ago before I put the GeoPointField / Geo3d integration work on hold. This rough draft converts 3D LLA coordinates into ECEF cartesian coordinates using the WGS84 based non-iterative approach in GeoProjectionUtils (derived from the conversion approach illustrated in the Manual of Photogrammetry using 2 sin/cos and 1 sqrt). The ECEF cartesian coordinate is scaled to a unit spheroid (since this is presumably what Geo3D requires) and each of the 32 bits are interleaved (XYZ) akin to MortonEncoding. Decoding procedures are also provided. While this encoding is not nicely represented as a long (not yet convinced 21 bits per will preserve "acceptable" precision) it is BinaryDocValue friendly and enables a space partitioning/prefix coded approach similar to the way GeoPointField currently works. The Most-Significant 3 bits represent an Oct-Cell at the first level, next 3 for level 2, etc. > Integrate lat/lon BKD and spatial3d > --- > > Key: LUCENE-6699 > URL: https://issues.apache.org/jira/browse/LUCENE-6699 > Project: Lucene - Core > Issue Type: New Feature >Reporter: Michael McCandless >Assignee: Michael McCandless > Attachments: Geo3DPacking.java > > > I'm opening this for discussion, because I'm not yet sure how to do > this integration, because of my ignorance about spatial in general and > spatial3d in particular :) > Our BKD tree impl is very fast at doing lat/lon shape intersection > (bbox, polygon, soon distance: LUCENE-6698) against previously indexed > points. > I think to integrate with spatial3d, we would first need to record > lat/lon/z into doc values. Somewhere I saw discussion about how we > could stuff all 3 into a single long value with acceptable precision > loss? Or, we could use BinaryDocValues? We need all 3 dims available > to do the fast per-hit query time filtering. > But, second: what do we index into the BKD tree? Can we "just" index > earth surface lat/lon, and then at query time is spatial3d able to > give me an enclosing "surface lat/lon" bbox for a 3d shape? Or > ... must we index all 3 dimensions into the BKD tree (seems like this > could be somewhat wasteful)? -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Comment Edited] (LUCENE-6699) Integrate lat/lon BKD and spatial3d
[ https://issues.apache.org/jira/browse/LUCENE-6699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14646121#comment-14646121 ] Karl Wright edited comment on LUCENE-6699 at 7/29/15 2:57 PM: -- bq. Essentially the BKD tree needs a way to recursively chop up the earth surface into smaller and smaller curved-rectangle-like (I think?) shapes on the earth's surface, both at indexing time and at search time. With x/y/z splitting, actually you're not quite doing that. Instead, you are chopping up the space that the entire world lives in (not just its surface). One these 3d rectangles may or may not actually intersect the surface (which is where all the geo shapes actually lie). If it does intersect, it might intersect on only one side of the world, or it might intersect on two sides of the world. A long, thin 3d rectangle could well encompass a little bit of territory in (say) the UK as well as Alaska, for instance. But as you describe the algorithm, I'm not sure that this is important at all to know. bq. At search time, for a given cell, it needs to "relate" to the query shape to know if the query shape full contains the cell, does not overlap with the cell, or partially overlaps. The GeoArea interface gives you all of that, which is why I wanted to implement objects that *aren't* limited to the surface but *do* implement GeoArea. bq. But I don't understand the "degenerate in X/Y" Area objects... it seems like GeoArea would be used for the lat/lon "approximation" (outer bounding box?) to a GeoShape? Seems like it's better if we can do all functions using proper earth-surface shapes? GeoArea objects are not constrained to be surface objects. Right now the only implementers of GeoArea are bounding boxes, bounded in latitude and longitude, but that's merely due to lack of need for anything else. The relationship types GeoArea objects can determine are against general GeoShape objects (which *are* surface objects), so the semantics are perfect for what you are trying to do. You can determine whether a lat/lon rectangle overlaps, contains, is within, or doesn't intersect at all with, any arbitrary spatial3d surface shape. But let's be clear: for BKD in the x,y,z space, GeoArea objects bounding in lat/lon are useless. So we need to invent GeoArea objects that represent 3d rectangles. If we try to talk about a general XYZ-bounded area, then there will be up to two bounds for X (min and max), two bounds for Y (min and max), and two bounds for Z (min and max). The degenerate cases come into play when min-X == max-X, or min-Y == max-Y, etc, or when there *is* no min-X bound but just a max-X one, for instance. This can be conditional logic but the whole thing is faster and more efficient if there's an object for each funky case. Hope this helps. [~mikemccand] If this all is clear now, and you want to proceed, let me know and I can start working on it tonight. was (Author: kwri...@metacarta.com): bq. Essentially the BKD tree needs a way to recursively chop up the earth surface into smaller and smaller curved-rectangle-like (I think?) shapes on the earth's surface, both at indexing time and at search time. With x/y/z splitting, actually you're not quite doing that. Instead, you are chopping up the space that the entire world lives in (not just its surface). One these 3d rectangles may or may not actually intersect the surface (which is where all the geo shapes actually lie). If it does intersect, it might intersect on only one side of the world, or it might intersect on two sides of the world. A long, thin 3d rectangle could well encompass a little bit of territory in (say) the UK as well as Alaska, for instance. But as you describe the algorithm, I'm not sure that this is important at all to know. bq. At search time, for a given cell, it needs to "relate" to the query shape to know if the query shape full contains the cell, does not overlap with the cell, or partially overlaps. The GeoArea interface gives you all of that, which is why I wanted to implement objects that *aren't* limited to the surface but *do* implement GeoArea. bq. But I don't understand the "degenerate in X/Y" Area objects... it seems like GeoArea would be used for the lat/lon "approximation" (outer bounding box?) to a GeoShape? Seems like it's better if we can do all functions using proper earth-surface shapes? GeoArea objects are not constrained to be surface objects. Right now the only implementers of GeoArea are bounding boxes, bounded in latitude and longitude, but that's merely due to lack of need for anything else. The relationship types GeoArea objects can determine are against general GeoShape objects (which *are* surface objects), so the semantics are perfect for what you are trying to do. You can determine whether a lat/lon rectangle overlaps, contains, i
[jira] [Comment Edited] (LUCENE-6699) Integrate lat/lon BKD and spatial3d
[ https://issues.apache.org/jira/browse/LUCENE-6699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14645839#comment-14645839 ] Karl Wright edited comment on LUCENE-6699 at 7/29/15 10:41 AM: --- [~mikemccand] bq. So for this to work, I need to be able to efficiently ask the query shape whether it separately overlaps with a range in each of the 3 dimensions. E.g. "do you intersect 0.3 <= x < 0.7", and same for y and z. I assume this is fast/simple to do with the spatial3d APIs? Hmm. There's a "bounds()" method which obtains the lat/lon bounds of a shape. But for 3d we'd not be able to use that. However: bq. Hmm, one source of efficiency for the BKD tree is it can recognize at search time when a given indexed cell is fully contained by the query shape, and then it doesn't need to test every point against the shape (it just blindly collects all docIDs in that cell). But in the 3D case, I think that optimization would never apply? Well, IF we presume that the records all lie on the surface of the world, then you can ask the question, "do these x, y, z bounds, when intersected with the world surface, all lie within the shape, or overlap the shape?" I'll have to think about how efficient that can be made to work, but you ARE after all describing a second surface shape by your x-y-z ranges, so in theory this should be doable. In particular, the GeoBBox construct allows you to find the relationship between a specific kind of surface shape (currently only lat/lon rectangles and their degenerate 3d equivalents) and any other GeoShape. We could try to introduce x-y-z 3d rectangles as shapes; they'd be completely describable by planes, so really all the same logic would apply. I would need to create a new GeoAreaFactory method, and a family of x-y-z bounding shapes, e.g. GeoXYZArea, and it should all work. To be sure, have a look at the functionality that GeoArea gives you. If that looks like what you need, and you can convince yourself that it would be reasonably efficient as far as BKD is concerned, then I'll go ahead and create a patch that does what you need. Thinking in more detail about BKD efficiency, I realize that GeoXYZArea could describe a huge number of 3D rectangles that have NO intersection with the world surface. That's worrisome because it implies that BKD over the (x,y,z) space may not be a very good way of organizing records? Or does it? was (Author: kwri...@metacarta.com): [~mikemccand] bq. So for this to work, I need to be able to efficiently ask the query shape whether it separately overlaps with a range in each of the 3 dimensions. E.g. "do you intersect 0.3 <= x < 0.7", and same for y and z. I assume this is fast/simple to do with the spatial3d APIs? Hmm. There's a "bounds()" method which obtains the lat/lon bounds of a shape. But for 3d we'd not be able to use that. However: bq. Hmm, one source of efficiency for the BKD tree is it can recognize at search time when a given indexed cell is fully contained by the query shape, and then it doesn't need to test every point against the shape (it just blindly collects all docIDs in that cell). But in the 3D case, I think that optimization would never apply? Well, IF we presume that the records all lie on the surface of the world, then you can ask the question, "do these x, y, z bounds, when intersected with the world surface, all lie within the shape, or overlap the shape?" I'll have to think about how efficient that can be made to work, but you ARE after all describing a second surface shape by your x-y-z ranges, so in theory this should be doable. In particular, the GeoBBox construct allows you to find the relationship between a specific kind of surface shape (currently only lat/lon rectangles and their degenerate 3d equivalents) and any other GeoShape. We could try to introduce x-y-z rectangles as shapes; they'd be completely describable by planes, so really all the same logic would apply. I would need to create a new GeoAreaFactory method, and a family of x-y-z bounding shapes, e.g. GeoXYZArea, and it should all work. To be sure, have a look at the functionality that GeoArea gives you. If that looks like what you need, and you can convince yourself that it would be reasonably efficient as far as BKD is concerned, then I'll go ahead and create a patch that does what you need. > Integrate lat/lon BKD and spatial3d > --- > > Key: LUCENE-6699 > URL: https://issues.apache.org/jira/browse/LUCENE-6699 > Project: Lucene - Core > Issue Type: New Feature >Reporter: Michael McCandless >Assignee: Michael McCandless > > I'm opening this for discussion, because I'm not yet sure how to do > this integration, because of my ignorance about spatial in general and > spatial3
[jira] [Comment Edited] (LUCENE-6699) Integrate lat/lon BKD and spatial3d
[ https://issues.apache.org/jira/browse/LUCENE-6699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14645839#comment-14645839 ] Karl Wright edited comment on LUCENE-6699 at 7/29/15 10:26 AM: --- [~mikemccand] bq. So for this to work, I need to be able to efficiently ask the query shape whether it separately overlaps with a range in each of the 3 dimensions. E.g. "do you intersect 0.3 <= x < 0.7", and same for y and z. I assume this is fast/simple to do with the spatial3d APIs? Hmm. There's a "bounds()" method which obtains the lat/lon bounds of a shape. But for 3d we'd not be able to use that. However: bq. Hmm, one source of efficiency for the BKD tree is it can recognize at search time when a given indexed cell is fully contained by the query shape, and then it doesn't need to test every point against the shape (it just blindly collects all docIDs in that cell). But in the 3D case, I think that optimization would never apply? Well, IF we presume that the records all lie on the surface of the world, then you can ask the question, "do these x, y, z bounds, when intersected with the world surface, all lie within the shape, or overlap the shape?" I'll have to think about how efficient that can be made to work, but you ARE after all describing a second surface shape by your x-y-z ranges, so in theory this should be doable. In particular, the GeoBBox construct allows you to find the relationship between a specific kind of surface shape (currently only lat/lon rectangles and their degenerate 3d equivalents) and any other GeoShape. We could try to introduce x-y-z rectangles as shapes; they'd be completely describable by planes, so really all the same logic would apply. I would need to create a new GeoAreaFactory method, and a family of x-y-z bounding shapes, e.g. GeoXYZArea, and it should all work. To be sure, have a look at the functionality that GeoArea gives you. If that looks like what you need, and you can convince yourself that it would be reasonably efficient as far as BKD is concerned, then I'll go ahead and create a patch that does what you need. was (Author: kwri...@metacarta.com): [~mikemccand] bq. So for this to work, I need to be able to efficiently ask the query shape whether it separately overlaps with a range in each of the 3 dimensions. E.g. "do you intersect 0.3 <= x < 0.7", and same for y and z. I assume this is fast/simple to do with the spatial3d APIs? Hmm. There's a "bounds()" method which obtains the lat/lon bounds of a shape. But for 3d we'd not be able to use that. However: bq. Hmm, one source of efficiency for the BKD tree is it can recognize at search time when a given indexed cell is fully contained by the query shape, and then it doesn't need to test every point against the shape (it just blindly collects all docIDs in that cell). But in the 3D case, I think that optimization would never apply? Well, IF we presume that the records all lie on the surface of the world, then you can ask the question, "do these x, y, z bounds, when intersected with the world surface, all lie within the shape, or overlap the shape?" I'll have to think about how efficient that can be made to work, but you ARE after all describing a second surface shape by your x-y-z ranges, so in theory this should be doable. In particular, the GeoBBox construct allows you to find the relationship between a specific kind surface shape (currently only lat/lon rectangles and their degenerate 3d equivalents) and any other GeoShape. We could try to introduce x-y-z rectangles as shapes; they'd be completely describable by planes, so really all the same logic would apply. I would need to create a new GeoAreaFactory method, and a family of x-y-z bounding shapes, e.g. GeoXYZArea, and it should all work. To be sure, have a look at the functionality that GeoArea gives you. If that looks like what you need, and you can convince yourself that it would be reasonably efficient as far as BKD is concerned, then I'll go ahead and create a patch that does what you need. > Integrate lat/lon BKD and spatial3d > --- > > Key: LUCENE-6699 > URL: https://issues.apache.org/jira/browse/LUCENE-6699 > Project: Lucene - Core > Issue Type: New Feature >Reporter: Michael McCandless >Assignee: Michael McCandless > > I'm opening this for discussion, because I'm not yet sure how to do > this integration, because of my ignorance about spatial in general and > spatial3d in particular :) > Our BKD tree impl is very fast at doing lat/lon shape intersection > (bbox, polygon, soon distance: LUCENE-6698) against previously indexed > points. > I think to integrate with spatial3d, we would first need to record > lat/lon/z into doc values. Somewhere I saw discussion about how we
[jira] [Comment Edited] (LUCENE-6699) Integrate lat/lon BKD and spatial3d
[ https://issues.apache.org/jira/browse/LUCENE-6699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14645839#comment-14645839 ] Karl Wright edited comment on LUCENE-6699 at 7/29/15 10:24 AM: --- [~mikemccand] bq. So for this to work, I need to be able to efficiently ask the query shape whether it separately overlaps with a range in each of the 3 dimensions. E.g. "do you intersect 0.3 <= x < 0.7", and same for y and z. I assume this is fast/simple to do with the spatial3d APIs? Hmm. There's a "bounds()" method which obtains the lat/lon bounds of a shape. But for 3d we'd not be able to use that. However: bq. Hmm, one source of efficiency for the BKD tree is it can recognize at search time when a given indexed cell is fully contained by the query shape, and then it doesn't need to test every point against the shape (it just blindly collects all docIDs in that cell). But in the 3D case, I think that optimization would never apply? Well, IF we presume that the records all lie on the surface of the world, then you can ask the question, "do these x, y, z bounds, when intersected with the world surface, all lie within the shape, or overlap the shape?" I'll have to think about how efficient that can be made to work, but you ARE after all describing a second surface shape by your x-y-z ranges, so in theory this should be doable. In particular, the GeoBBox construct allows you to find the relationship between a specific kind surface shape (currently only lat/lon rectangles and their degenerate 3d equivalents) and any other GeoShape. We could try to introduce x-y-z rectangles as shapes; they'd be completely describable by planes, so really all the same logic would apply. I would need to create a new GeoAreaFactory method, and a family of x-y-z bounding shapes, e.g. GeoXYZArea, and it should all work. To be sure, have a look at the functionality that GeoArea gives you. If that looks like what you need, and you can convince yourself that it would be reasonably efficient as far as BKD is concerned, then I'll go ahead and create a patch that does what you need. was (Author: kwri...@metacarta.com): bq. So for this to work, I need to be able to efficiently ask the query shape whether it separately overlaps with a range in each of the 3 dimensions. E.g. "do you intersect 0.3 <= x < 0.7", and same for y and z. I assume this is fast/simple to do with the spatial3d APIs? Hmm. There's a "bounds()" method which obtains the lat/lon bounds of a shape. But for 3d we'd not be able to use that. However: bq. Hmm, one source of efficiency for the BKD tree is it can recognize at search time when a given indexed cell is fully contained by the query shape, and then it doesn't need to test every point against the shape (it just blindly collects all docIDs in that cell). But in the 3D case, I think that optimization would never apply? Well, IF we presume that the records all lie on the surface of the world, then you can ask the question, "do these x, y, z bounds, when intersected with the world surface, all lie within the shape, or overlap the shape?" I'll have to think about how efficient that can be made to work, but you ARE after all describing a second surface shape by your x-y-z ranges, so in theory this should be doable. In particular, the GeoBBox construct allows you to find the relationship between a specific kind surface shape (currently only lat/lon rectangles and their degenerate 3d equivalents) and any other GeoShape. We could try to introduce x-y-z rectangles as shapes; they'd be completely describable by planes, so really all the same logic would apply. I would need to create a new GeoAreaFactory method, and a family of x-y-z bounding shapes, e.g. GeoXYZArea, and it should all work. To be sure, have a look at the functionality that GeoArea gives you. If that looks like what you need, and you can convince yourself that it would be reasonably efficient as far as BKD is concerned, then I'll go ahead and create a patch that does what you need. > Integrate lat/lon BKD and spatial3d > --- > > Key: LUCENE-6699 > URL: https://issues.apache.org/jira/browse/LUCENE-6699 > Project: Lucene - Core > Issue Type: New Feature >Reporter: Michael McCandless >Assignee: Michael McCandless > > I'm opening this for discussion, because I'm not yet sure how to do > this integration, because of my ignorance about spatial in general and > spatial3d in particular :) > Our BKD tree impl is very fast at doing lat/lon shape intersection > (bbox, polygon, soon distance: LUCENE-6698) against previously indexed > points. > I think to integrate with spatial3d, we would first need to record > lat/lon/z into doc values. Somewhere I saw discussion about how we > could stuff al
[jira] [Comment Edited] (LUCENE-6699) Integrate lat/lon BKD and spatial3d
[ https://issues.apache.org/jira/browse/LUCENE-6699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14644248#comment-14644248 ] Karl Wright edited comment on LUCENE-6699 at 7/29/15 5:49 AM: -- bq. Can we "just" index earth surface lat/lon, and then at query time is spatial3d able to give me an enclosing "surface lat/lon" bbox for a 3d shape? [~mikemccand] Ok, so now you have me confused a bit as to what your requirements are for BKD. If you want to split the globe up into lat/lon rectangles, and use BKD that way to descend, then obviously you'd need points to be stored in lat/lon. But that would make less sense for geospatial3d, because what you're really trying to do is assess membership in a shape, or distance also in regards to a shape, both of which require (x,y,z) not lat/lon. Yes, you can convert to (x,y,z) from lat/lon, but the conversion is relatively expensive. Instead, I could imagine just staying natively in (x,y,z), and doing your splits in that space, e.g. split in x, then in y, then in z. So you'd have a GeoPoint3D which would pack (x,y,z) in a format you could rapidly extract, and a splitting algorithm that would use the known ranges for these values. Does that make sense to you? Would that work with BKD? was (Author: kwri...@metacarta.com): bq. Can we "just" index earth surface lat/lon, and then at query time is spatial3d able to give me an enclosing "surface lat/lon" bbox for a 3d shape? Ok, so now you have me confused a bit as to what your requirements are for BKD. If you want to split the globe up into lat/lon rectangles, and use BKD that way to descend, then obviously you'd need points to be stored in lat/lon. But that would make less sense for geospatial3d, because what you're really trying to do is assess membership in a shape, or distance also in regards to a shape, both of which require (x,y,z) not lat/lon. Yes, you can convert to (x,y,z) from lat/lon, but the conversion is relatively expensive. Instead, I could imagine just staying natively in (x,y,z), and doing your splits in that space, e.g. split in x, then in y, then in z. So you'd have a GeoPoint3D which would pack (x,y,z) in a format you could rapidly extract, and a splitting algorithm that would use the known ranges for these values. Does that make sense to you? Would that work with BKD? > Integrate lat/lon BKD and spatial3d > --- > > Key: LUCENE-6699 > URL: https://issues.apache.org/jira/browse/LUCENE-6699 > Project: Lucene - Core > Issue Type: New Feature >Reporter: Michael McCandless >Assignee: Michael McCandless > > I'm opening this for discussion, because I'm not yet sure how to do > this integration, because of my ignorance about spatial in general and > spatial3d in particular :) > Our BKD tree impl is very fast at doing lat/lon shape intersection > (bbox, polygon, soon distance: LUCENE-6698) against previously indexed > points. > I think to integrate with spatial3d, we would first need to record > lat/lon/z into doc values. Somewhere I saw discussion about how we > could stuff all 3 into a single long value with acceptable precision > loss? Or, we could use BinaryDocValues? We need all 3 dims available > to do the fast per-hit query time filtering. > But, second: what do we index into the BKD tree? Can we "just" index > earth surface lat/lon, and then at query time is spatial3d able to > give me an enclosing "surface lat/lon" bbox for a 3d shape? Or > ... must we index all 3 dimensions into the BKD tree (seems like this > could be somewhat wasteful)? -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Comment Edited] (LUCENE-6699) Integrate lat/lon BKD and spatial3d
[ https://issues.apache.org/jira/browse/LUCENE-6699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14644248#comment-14644248 ] Karl Wright edited comment on LUCENE-6699 at 7/28/15 11:23 AM: --- bq. Can we "just" index earth surface lat/lon, and then at query time is spatial3d able to give me an enclosing "surface lat/lon" bbox for a 3d shape? Ok, so now you have me confused a bit as to what your requirements are for BKD. If you want to split the globe up into lat/lon rectangles, and use BKD that way to descend, then obviously you'd need points to be stored in lat/lon. But that would make less sense for geospatial3d, because what you're really trying to do is assess membership in a shape, or distance also in regards to a shape, both of which require (x,y,z) not lat/lon. Yes, you can convert to (x,y,z) from lat/lon, but the conversion is relatively expensive. Instead, I could imagine just staying natively in (x,y,z), and doing your splits in that space, e.g. split in x, then in y, then in z. So you'd have a GeoPoint3D which would pack (x,y,z) in a format you could rapidly extract, and a splitting algorithm that would use the known ranges for these values. Does that make sense to you? Would that work with BKD? was (Author: kwri...@metacarta.com): bq. Can we "just" index earth surface lat/lon, and then at query time is spatial3d able to give me an enclosing "surface lat/lon" bbox for a 3d shape? Ok, so now you have me confused a bit as to what your requirements are for BKD. If you want to split the globe up into lat/lon rectangles, and use BKD that way to descend, then obviously you'd need points to be stored in lat/lon. But that would make less sense for geospatial3d, because what you're really trying to do is assess membership in a shape, or distance also in regards to a shape, both of which require (x,y,z) not lat/lon. Yes, you can convert to (x,y,z) from lat/lon, but the conversion is relatively expensive. Instead, I could imagine just staying natively in (x,y,z), and doing your splits in that space, e.g. split in x, then in y, then in z. So you'd have a GeoPoint3D which would pack (x,y,z) in a format you could rapidly extract, and a splitting algorithm that would use the known ranges for these values. Does that make sense to you? Would that work with BKD? > Integrate lat/lon BKD and spatial3d > --- > > Key: LUCENE-6699 > URL: https://issues.apache.org/jira/browse/LUCENE-6699 > Project: Lucene - Core > Issue Type: New Feature >Reporter: Michael McCandless >Assignee: Michael McCandless > > I'm opening this for discussion, because I'm not yet sure how to do > this integration, because of my ignorance about spatial in general and > spatial3d in particular :) > Our BKD tree impl is very fast at doing lat/lon shape intersection > (bbox, polygon, soon distance: LUCENE-6698) against previously indexed > points. > I think to integrate with spatial3d, we would first need to record > lat/lon/z into doc values. Somewhere I saw discussion about how we > could stuff all 3 into a single long value with acceptable precision > loss? Or, we could use BinaryDocValues? We need all 3 dims available > to do the fast per-hit query time filtering. > But, second: what do we index into the BKD tree? Can we "just" index > earth surface lat/lon, and then at query time is spatial3d able to > give me an enclosing "surface lat/lon" bbox for a 3d shape? Or > ... must we index all 3 dimensions into the BKD tree (seems like this > could be somewhat wasteful)? -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org