The "SolrAdaptersForLuceneSpatial4" page has been changed by DavidSmiley:
http://wiki.apache.org/solr/SolrAdaptersForLuceneSpatial4?action=diff&rev1=6&rev2=7

<!> Solr4.0
- Note: This page is a working draft of documentation. It needs to be migrated/merged/moved/renamed into the existing Solr spatial wiki content somehow.
+ Note: This page is a working draft of documentation. It needs to be migrated/merged/moved/renamed into the existing Solr [[SpatialSearch|spatial wiki content]] somehow.
= Lucene / Solr 4 Spatial =
- This document describes how to use the new spatial functionality in Lucene / Solr 4. The bulk of the implementation lives in the new Lucene 4 spatial module. It replaces the former "Lucene spatial contrib" in v3. The Solr piece is small as it only needs to provide field types which are essentially adapters to the code in the Lucene spatial module. Furthermore, understand that the shape implementations and other core spatial code that isn't related to Lucene is held in another new open-source project called [[https://github.com/spatial4j/spatial4j|Spatial4j]]. Presently, polygon support requires an additional dependency -- [[http://sourceforge.net/projects/jts-topo-suite/|JTS]].
+ This document describes how to use the new spatial functionality in Lucene / Solr 4. The bulk of the implementation lives in the new Lucene 4 spatial module. It replaces the former "Lucene spatial contrib" in v3. The Solr piece is small as it only needs to provide field types which are essentially adapters to the code in the Lucene spatial module. The shape implementations and other core spatial code that isn't related to Lucene are held in a new open-source project called [[https://github.com/spatial4j/spatial4j|Spatial4j]]. Presently, polygon support requires an additional dependency -- [[http://sourceforge.net/projects/jts-topo-suite/|JTS]].
+
+ There is a basic [[https://github.com/ryantxu/spatial-solr-sandbox|demo application]] that exercises a variety of these features.
It's not "live" so you'll have to download and build it first. It's a bit rough around the edges as it's mostly used by the Lucene spatial developers.
== New features, over Solr 3 spatial ==
@@ -15, +17 @@
These features describe what developer-users of Lucene/Solr 4 will appreciate. Under the hood, it's a framework designed to be extended for different so-called "spatial strategies". I'll assume here the RecursivePrefixTreeStrategy as it should address most use-cases and it has the best tests.
+ * Polygon, LineString and other new shapes. All shapes are supported as indexed shapes and query shapes. Shapes other than point, rectangle and circle are supported via JTS -- an otherwise optional dependency. See JTS caveats below for more information.
* Multi-valued indexed fields. This is critical for storing the results of automatic place extraction from text using natural language processing techniques with a gazetteer (a variant of "geocoding"), since a variable number of locations will be found.
- * Index shapes with area, not just points. An indexed shape is essentially pixelated (i.e. gridded) to a configured resolution per shape. By default that resolution is defined by a percentage of the overall shape size, and it applies to query shapes too. Note: If extremely high precision of shape edges needs to be retained for accurate indexing, then this solution probably won't scale too well at indexing time (big indexes, slow indexing). On the other hand, query shapes generally scale well to the maximum configured precision regardless of shape size. Note: indexing shapes with area sorely [[https://issues.apache.org/jira/browse/LUCENE-4419|needs testing]].
+ * Index non-point shapes as well as points. Non-point shapes are essentially pixelated (i.e. gridded) to a configured resolution per shape -- an approximation. By default that resolution is defined by a percentage of the overall shape size, and it applies to query shapes too.
Note: If extremely high precision of shape edges needs to be retained for accurate indexing, then this solution probably won't scale too well at indexing time (big indexes, slow indexing). On the other hand, ''query'' shapes generally scale well to the maximum configured precision regardless of shape size. Note: indexing shapes with area sorely [[https://issues.apache.org/jira/browse/LUCENE-4419|needs testing]].
- * Polygon, LineString and other new shapes. All shapes are supported as indexed shapes and query shapes. Note: Shapes other than point, rectangle and circle are supported via JTS -- an otherwise optional dependency. JTS views the world as a flat plane; the latitude and longitude are mapped to this plane directly. It uses Euclidean math operations, not Geodesic ones. By and large this isn't a problem, although it can be if the vertices are particularly far apart longitudinally. Spatial4j adapts shapes that cross the dateline to be compatible with JTS, and so you shouldn't notice a problem (notwithstanding unknown bugs). It does not support shapes covering the poles yet. Consequently if you want to index or query by the Antarctica polygon for example, you are out of luck for now.
* Rectangles with user-specifiable corners. Oddly, Solr 3 spatial only supports the bounding box of a circle.
* Multi-value distance sort / score boost. Note: this is a preliminary unoptimized implementation that uses a fair amount of RAM, even when multiValued=false. An alternative should be provided in the future.
* Configurable precision which can vary per shape at query time (and sort of at index time). This improves performance.
@@ -49, +51 @@
* geo="false": Disable geospatial mode, which is enabled (true) by default. If you set it to false, you really should specify worldBounds and probably maxDistErr as well.
* worldBounds="minX minY maxX maxY": Set the valid numerical ranges for x & y.
By default, for non-geospatial use, these are the limits of a Java double; however, those values have been shown not to work (yet).
- There are other parameters not yet documented as they are more obscure, such as using other distance calculation formulas, and specifying the grid encoding (geohash vs quad).
+ There are other parameters too:
+ * distCalculator="haversine": Set the distance calculation algorithm. Others are: lawOfCosines (warning: faulty), vincentySphere, cartesian, and cartesian^2.
+ * prefixTree="geohash": Choose the spatial grid implementation. "geohash" uses the Geohash algorithm, which has 32 children at each level; "quad" has 4 children at each level and also supports non-geospatial use (geo=false).
+ * maxLevels="10": Set the maximum level (aka grid depth). It's easier to think in terms of a real distance and use maxDistErr instead.
And finally, specify a field that uses this field type:
{{{ <field name="geo" type="location_rpt" indexed="true" stored="true" multiValued="true" /> }}}
@@ -76, +81 @@
{{{ <field name="geo">POLYGON((-10 30, -40 40, -10 -20, 40 20, 0 0, -10 30))</field> }}}
In WKT, coordinates are in "x y" (lon lat) order, and the coordinates are each separated by commas.
- == Shape / Polygon / WKT notes ==
+ == JTS / WKT / Polygon notes ==
+ Shapes other than point, circle, or rectangle require JTS, an otherwise optional dependency. If you want to use WKT but only need the basic shapes, you still need JTS -- a restriction likely to be addressed in the near future.
+
+ * JTS views the world as a flat plane; the latitude and longitude are mapped to this plane directly. It uses Euclidean math operations, not Geodesic ones. By and large this isn't a problem, although it can be if the vertices are particularly far apart longitudinally. Spatial4j adapts shapes that cross the dateline to be compatible with JTS, and so you shouldn't notice a problem (notwithstanding unknown bugs). It does not support shapes covering the poles yet.
Consequently, if you want to index or query by the Antarctica polygon, for example, you are out of luck for now.
- * Only Polygon, and Multipolygon WKT types have been tested. GeometryCollection will not work but the others should in theory. Holes in polygons haven't been tested but they there is code to support them.
+ * Only the Polygon and MultiPolygon WKT types have been tested. GeometryCollection will not work but the others should in theory. Holes in polygons haven't been tested but there is code to support them.
* The implementation doesn't support WKT that encompasses a pole. The only shape that can encompass a pole is a Circle. Technically, a longitude-wrapping (-180 to +180) lat-lon box that touches a pole does too.
- * Polygons and other WKT must have each vertex less than 180 degrees in longitude difference than the vertex before it, or else it will be confused as going the wrong way around the globe. Dateline crossing is supported.
+ * Each vertex of a polygon or other WKT shape must differ in longitude by less than 180 degrees from the vertex before it, or else the edge will be interpreted as going the wrong way around the globe. Dateline crossing '''is''' supported.
- * All wkt input coordinates are normalized into the standard geospatial lat-lon boundaries. So, -184 longitude becomes +176, for example. Both +180 and -180 are kept distinct.
+ * All WKT coordinates are normalized into the standard geospatial lat-lon boundaries. So, -184 longitude becomes +176, for example. Both +180 and -180 are kept distinct (true for all of Spatial4j, not just JTS).
== Search ==