[
https://issues.apache.org/jira/browse/LUCENE-6480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14541933#comment-14541933
]
Nicholas Knize edited comment on LUCENE-6480 at 5/13/15 1:43 PM:
-----------------------------------------------------------------
Its sounds much like the simple morton interleaving I'm using for the 2D case?
But since you're using an extra bit for the 3rd dimension you lose precision in
the horizontal direction. We could start w/ that as a phase one? Instead of
worrying about the sign bit the values in the 2D case are scaled 0:360, 0:180
and divided into 32 bits per lat/lon (see GeoUtils.java). Extending to 3D
divide 0:360, 0:180, 0:?? by 21 and extend BitUtil.interleave to the 3 value
case. Its super fast since its done by bit twiddling using magic numbers
(although the magic numbers will need to be reworked). The question is the max
value of the altitude? The larger the value the less precise, but you could
conceivably go as far as 3,300 (km) to cover the earth's atmosphere? Maybe
that's configurable.
As a phase 2 there has been some work in this area for 3 and 4d hilbert order
(still using 64 bit), which will better preserve locality. (I mentioned it in a
comment in the previous issue). Locality is important since it will drive the
complexity of the range search and how much the postings list will actually
help (e.g. stepping one unit in the 3rd dimension can result in a boundary
range that requires post-filtering a significant number of high precision
terms).
The more I think about it, this might be efficiently done using a statically
computed lookup table (we'd have to tinker)? i.e., one hilbert order for the
3d unit cube is 000, 001, 101, 100, 110, 111, 011, 010, and the order of the
suboctants at each succeeding level is a permutation of this base unit cube.
For example, the next rotated level (for suboctant 000) gives the binary order:
000 000, 000 010, 000 110, 000 100, 000 101, 000 111, 000 011, 000 001.
There's a paper that describes how to compute the suboctant permutation rather
efficiently, and it could be statically computed and represented using 1. base
unit ordering, 2. substitution list. So for level 2, each suboctant ordering
is: base order (000, 001, 101, 100, 110, 111, 011, 010), substitution list (2
8) (3 5), (2 8 4) (3 7 5), (2 8 4) (3 7 5), (1 3) (2 4) (5 7) (6 8), (1 3) (2
4) (5 7) (6 8), (1 5 7) (2 4 6), (1 7) (4 6). Something to think about as an
enhancement. I'll try to find the paper as I've got this worked out in my
notebook from some previous work (lol).
was (Author: nknize):
Its sounds much like the simple morton interleaving I'm using for the 2D case?
But since you're using an extra bit for the 3rd dimension you lose precision in
the horizontal direction. We could start w/ that as a phase one? Instead of
worrying about the sign bit the values in the 2D case are scaled 0:360, 0:180
and divided into 32 bits per lat/lon (see GeoUtils.java). Extending to 3D
divide 0:360, 0:180, 0:?? by 21 and extend BitUtil.interleave to the 3 value
case. Its super fast since its done by bit twiddling using magic numbers
(although the magic numbers will need to be reworked). The question is the max
value of the altitude? The larger the value the less precise, but you could
conceivably go as far as 3,300 (km) to cover the earth's atmosphere? Maybe
that's configurable.
As a phase 2 there has been some work in this area for 3 and 4d hilbert order
(still using 64 bit), which will better preserve locality. (I mentioned it in a
comment in the previous issue). Locality is important since it will drive the
complexity of the range search and how much the postings list will actually
help (e.g. stepping one unit in the 3rd dimension can result in a boundary
range that requires post-filtering a significant number of high precision
terms).
The more I think about it, this might be efficiently done using a statically
computed lookup table (we'd have to tinker)? i.e., one hilbert order for the
3d unit cube is 000, 001, 101, 100, 110, 111, 011, 010, and the order of the
suboctants at each succeeding level is a permutation of this base unit cube.
For example, the next rotated level (for suboctant 000) gives the binary order:
000 000, 000 010, 000 110, 000 100, 000 101, 000 111, 000 011, 000 001.
There's a paper that describes how to compute the suboctant permutation rather
efficiently, and it could be statically computed and represented using 1. base
unit ordering, 2. substitution list. So for level 2, each suboctant ordering
is: base order (000, 001, 101, 100, 110, 111, 011, 010), substitution list (2
8) (3 5), (2 8 4) (3 7 5), (2 8 4) (3 7 5), (1 3) (2 4) (5 7) (6 8), (1 3) (2
4) (5 7) (6 8), (1 5 7) (2 4 6), (1 7) (4 6). Something to think about as an
enhancement. I'll try to find the paper.
> Extend Simple GeoPointField Type to 3d
> ---------------------------------------
>
> Key: LUCENE-6480
> URL: https://issues.apache.org/jira/browse/LUCENE-6480
> Project: Lucene - Core
> Issue Type: New Feature
> Components: core/index
> Reporter: Nicholas Knize
>
> [LUCENE-6450 | https://issues.apache.org/jira/browse/LUCENE-6450] proposes a
> simple GeoPointField type to lucene core. This field uses 64bit encoding of 2
> dimensional points to construct sorted term representations of GeoPoints
> (aka: GeoHashing).
> This feature investigates adding support for encoding 3 dimensional
> GeoPoints, either by extending GeoPointField to a Geo3DPointField or adding
> an additional 3d constructor.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]