Hi all,

I am currently working on the project "Multidimensional Space Search with
Lucene 6" for DAS. Here is a list of the functionality (with some example
scenarios) that we can provide using Lucene 6.1's multidimensional space
search.
(This is a brief summary of the details I have collected; for more
information and the test code, please visit my Git repo
<https://github.com/janakact/test_lucene>.)

Multidimensional points in Lucene 6 fall into two categories, depending on
the queries they support and on how the points are distributed in the
space.

   1. General Multidimensional space points.
   2. Locations on the planet surface.


*1. General Multidimensional Space Points*
This is the generic type of multidimensional point. The points live in a
vector space of some dimension K, and a point in a K-dimensional space is
represented by K numeric values.

For example, a 3D point is represented by 3 float/double values (floating
point is used because distance is measured as a floating-point number).

*Possible Queries for API*

   - Search for points that exactly match the given values.
   - Search for points whose value is one of a given set of values.
   - Search for points within a given range.
   - Get the number of points that exactly match a given point.
   - Get the number of points within a given range. (Ranges are
   multidimensional ranges. In 3D, they are boxes.)
   - Divide points into range buckets and get the count in each bucket.
   (A range bucket is a range with a label attached to it.)
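To make the range and bucket queries above concrete, here is a minimal plain-Java sketch of what they mean. It is not Lucene code: the class and method names (BoxQuerySketch, Box, count, countPerBucket) are made up for illustration, and it scans all points linearly. In Lucene the same box would be handed to a point range query (for double values, DoublePoint.newRangeQuery with arrays of lower and upper bounds) so that the index does the work instead.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustration only: the semantics of K-dimensional range (box) queries. */
final class BoxQuerySketch {

    /** A labelled K-dimensional range: lower[i] <= x[i] <= upper[i] in every dimension. */
    static final class Box {
        final String label;
        final double[] lower;
        final double[] upper;

        Box(String label, double[] lower, double[] upper) {
            this.label = label;
            this.lower = lower;
            this.upper = upper;
        }

        /** True if the point lies inside the box (bounds inclusive). */
        boolean contains(double[] point) {
            for (int i = 0; i < point.length; i++) {
                if (point[i] < lower[i] || point[i] > upper[i]) {
                    return false;
                }
            }
            return true;
        }
    }

    /** Number of points inside one box. */
    static int count(double[][] points, Box box) {
        int n = 0;
        for (double[] p : points) {
            if (box.contains(p)) {
                n++;
            }
        }
        return n;
    }

    /** Count per labelled bucket; buckets may overlap, so a point can be counted twice. */
    static Map<String, Integer> countPerBucket(double[][] points, Box... buckets) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (Box b : buckets) {
            counts.put(b.label, count(points, b));
        }
        return counts;
    }
}
```

An exact-match query is the special case where the lower and upper bounds are equal in every dimension.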

*Scenarios*

Since this is the more general definition of a space, it has many possible
applications. The dimensions can be used to represent any numeric field.
With 3-dimensional points we can represent locations in a 3D space, and
with 4 dimensions we can represent both space and time. (X, Y, Z and time
are all numeric fields, so doubles can be used.)

*Independent Parameters*

It can also be used to represent completely independent parameters.
Consider a set of employees: a multidimensional space can be used to
represent their different attributes.

*Example:* Age, salary, height, average number of leaves per month. These
are 4 numeric fields which are completely independent. Together they can be
represented as a 4-dimensional space, in which each person is a point. The
user can then use Lucene to query these people.

   - How many people are 25-30 years old, 160cm to 180cm tall, earn a
   salary of 50,000 to 75,000 and take an average of 1-5 leaves per month?
   - Alternatively, the user can divide people into buckets and count them
   according to the ranges for each parameter.

(Of course, this can also be done by indexing those parameters separately
and combining the queries with the 'AND' keyword, but indexing them together
as a multidimensional space will make searching more efficient.)
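As a worked instance of the employee question above, the following plain-Java sketch spells out the 4D box with the numbers from the example and shows that it is simply the 'AND' of one range per dimension. The Employee class, the dimension order and the method names are assumptions made for this illustration; nothing here is Lucene API.

```java
/** Illustration only: the employee query as a 4-dimensional box. */
final class EmployeeRangeSketch {

    /** One employee is one point in a 4-dimensional space. */
    static final class Employee {
        final double age;
        final double salary;
        final double heightCm;
        final double leavesPerMonth;

        Employee(double age, double salary, double heightCm, double leavesPerMonth) {
            this.age = age;
            this.salary = salary;
            this.heightCm = heightCm;
            this.leavesPerMonth = leavesPerMonth;
        }

        /** Assumed dimension order: age, salary, height (cm), leaves per month. */
        double[] asPoint() {
            return new double[] {age, salary, heightCm, leavesPerMonth};
        }
    }

    // Bounds taken from the example question, in the same dimension order.
    static final double[] LOWER = {25, 50_000, 160, 1};
    static final double[] UPPER = {30, 75_000, 180, 5};

    /** Counts employees inside the box: every dimension must be within its range. */
    static long countMatches(Employee[] staff) {
        long matches = 0;
        for (Employee e : staff) {
            double[] p = e.asPoint();
            boolean inside = true;
            for (int d = 0; d < p.length; d++) {
                inside &= p[d] >= LOWER[d] && p[d] <= UPPER[d];
            }
            if (inside) {
                matches++;
            }
        }
        return matches;
    }
}
```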

*2. Locations on the planet surface. (Latitude, Longitude)*
Here the points represent locations on the surface of the planet. This is a
more specific type of search that Lucene provides for indexing and
searching geographical locations.

These points are created using only the latitude and longitude values of
the locations.
*Please note that altitude is not yet supported by Lucene.*

Since this type is designed specifically for location search, it offers
more useful queries than General Multidimensional Points.

*Possible Queries for the API*

   - Search for the K nearest points to a given location (returns the
   given number of points).
   - Search for the points within a given radius of a given point.
   - Sort points by their distance from a given location.
   - Search for the points inside a polygon. (Polygons are geometric shapes
   on the surface of the planet. Example: the map of a country.)
   - Get the number of points inside a polygon.*
   - Get the number of points in each bucket, where buckets are specified
   as polygons.
   - Get the number of points in each bucket, where buckets are specified
   by the distance from a given location.

* Composite polygons are possible.

*Scenarios*

*Airport Scenario*
If we index the world's airports as GeoPoints, the following queries become
possible. (Here is the test code I implemented as an example:
<https://github.com/janakact/test_lucene/blob/master/src/test/java/TestMultiDimensionalQueries.java>)

   - Find the closest set of airports to a given town.
   - Find the set of airports within a given radius of a particular town.
   - Find the set of airports inside a country. (The country can be given
   as a polygon.)
   - Find the set of airports within a given range of latitudes and
   longitudes. This is a latitude/longitude box query. (For example:
   airports close to the equator.)
   - Find the set of airports close to a given path. (The path can be
   something like a road: find the airports which are less than 50km away
   from a given highway.)
   - Count the airports in each country by giving country maps as polygons.
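The distance-based airport queries above (closest airports, airports within a radius, sorting by distance) can be sketched in plain Java as follows. This is only an illustration of the semantics, assuming a spherical Earth with a mean radius of 6371 km and the haversine formula; the class and method names are invented for this sketch, points are {latitude, longitude} pairs in degrees, and every query is a linear scan rather than an index lookup as in Lucene.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Illustration only: distance queries over {lat, lon} points in degrees. */
final class GeoDistanceSketch {

    static final double EARTH_RADIUS_KM = 6371.0; // mean radius, spherical model

    /** Great-circle distance in km between two locations (haversine formula). */
    static double haversineKm(double lat1, double lon1, double lat2, double lon2) {
        double dLat = Math.toRadians(lat2 - lat1);
        double dLon = Math.toRadians(lon2 - lon1);
        double a = Math.sin(dLat / 2) * Math.sin(dLat / 2)
                + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2))
                * Math.sin(dLon / 2) * Math.sin(dLon / 2);
        return 2 * EARTH_RADIUS_KM * Math.asin(Math.sqrt(a));
    }

    /** Copy of the points sorted by distance from (lat, lon), closest first. */
    static List<double[]> sortedByDistance(List<double[]> points, double lat, double lon) {
        List<double[]> sorted = new ArrayList<>(points);
        sorted.sort(Comparator.comparingDouble(
                (double[] p) -> haversineKm(lat, lon, p[0], p[1])));
        return sorted;
    }

    /** The k closest points to (lat, lon), or all points if there are fewer than k. */
    static List<double[]> nearest(List<double[]> points, double lat, double lon, int k) {
        List<double[]> sorted = sortedByDistance(points, lat, lon);
        return sorted.subList(0, Math.min(k, sorted.size()));
    }

    /** All points at most radiusKm away from (lat, lon). */
    static List<double[]> withinRadius(List<double[]> points, double lat, double lon,
                                       double radiusKm) {
        List<double[]> hits = new ArrayList<>();
        for (double[] p : points) {
            if (haversineKm(lat, lon, p[0], p[1]) <= radiusKm) {
                hits.add(p);
            }
        }
        return hits;
    }
}
```

Counting points per distance bucket follows the same pattern: compute the distance once per point and increment the bucket whose distance range contains it.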

*Indexing airplane paths*

   - It is possible to query for paths which pass through an area of
   interest.

The examples above cover most of the functionality that Lucene's space
search provides.
Here are some other examples:

   - Find the number of television users a satellite can cover (by indexing
   the receivers' locations).
   - Find the number of stationary telescopes that can be used to observe a
   solar eclipse (by indexing the telescope locations). The area where the
   solar eclipse is visible can be represented as a polygon:
   <http://eclipse.gsfc.nasa.gov/SEplot/SEplot2001/SE2016Sep01A.GIF>
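Both of these examples come down to counting indexed points inside a polygon. As a rough sketch of that idea, the plain-Java code below uses the standard ray-casting (even-odd) test on latitude/longitude treated as flat coordinates. That is a simplifying assumption: it ignores the date line, the poles and the curvature of polygon edges, all of which a real geographic polygon query has to handle. The class and method names are invented for this illustration.

```java
/** Illustration only: counting {lat, lon} points inside a simple polygon. */
final class PolygonCountSketch {

    /**
     * Ray-casting test. The polygon is given as parallel arrays of vertex
     * latitudes and longitudes and is treated as a flat (planar) shape.
     */
    static boolean contains(double[] polyLats, double[] polyLons, double lat, double lon) {
        boolean inside = false;
        int n = polyLats.length;
        for (int i = 0, j = n - 1; i < n; j = i++) {
            // Does the edge (j -> i) cross the horizontal line at this latitude?
            boolean crosses = (polyLats[i] > lat) != (polyLats[j] > lat);
            if (crosses) {
                // Longitude at which the edge crosses that latitude.
                double lonAtLat = polyLons[i]
                        + (lat - polyLats[i]) / (polyLats[j] - polyLats[i])
                        * (polyLons[j] - polyLons[i]);
                if (lon < lonAtLat) {
                    inside = !inside; // each crossing to the east flips the state
                }
            }
        }
        return inside;
    }

    /** Number of points (as {lat, lon}) that fall inside the polygon. */
    static int countInside(double[] polyLats, double[] polyLons, double[][] points) {
        int count = 0;
        for (double[] p : points) {
            if (contains(polyLats, polyLons, p[0], p[1])) {
                count++;
            }
        }
        return count;
    }
}
```

Counting per bucket with polygon buckets is then one countInside call per labelled polygon.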

So, that's it.
Thank you.

Regards,
Janaka Chathuranga

_______________________________________________
Architecture mailing list
[email protected]
https://mail.wso2.org/cgi-bin/mailman/listinfo/architecture
