Dear Wiki user, You have subscribed to a wiki page or wiki category on "Solr Wiki" for change notification.
The "SolrAdaptersForLuceneSpatial4" page has been changed by DavidSmiley:
http://wiki.apache.org/solr/SolrAdaptersForLuceneSpatial4?action=diff&rev1=17&rev2=18

Comment:
Whole section on sorting.

  == Sorting and Relevancy ==
- Here is an example of a circle of degree radius of .1, sorting and filtering by this.
+ A common spatial requirement is to sort the search results by distance from a point such as the center of a map window. Again, this works quite differently than Solr 3 spatial. Here, the spatial queries seen earlier are capable of returning a distance-based score, which can then be sorted on, used in relevancy boosting, and even returned in search results.
- {{{ http://localhost:8983/solr/select?q=*:*&sort=query($sortsq)+asc&wt=xml&fl=distdeg:query($sortsq),store_geohash&fq={!%20v=$geoq}&sortsq={!%20score=distance%20v=$geoq}&geoq=store_geohash:"Intersects(Circle(37.489651,-77.665085 d=.1))" }}}
+ Here, we show parameters that simultaneously do a spatial search filter, sort, and return the distance (as the score):
- The setup in schema.xml:
+ {{{ &fl=*,score&sort=score asc&q={! score=distance}geo:"Intersects(Circle(54.729696,-98.525391 d=10))" }}}
- {{{ <fieldType name="geohash" class="solr.SpatialRecursivePrefixTreeFieldType" units="degrees" /> }}}
+ A user keyword search in this case would be added as an 'fq' param, most likely with the leading {!edismax}. Notice the score=distance local-param here. Without this (or if set to "none"), the query would yield a constant 1.0 for all documents. With "distance", the score is the distance in degrees from the center of the query shape to the indexed point(s). You'll probably want to sort these values ascending. Another option is "recipDistance", which uses the reciprocal function such that distance 0 yields a score of 1, and a distance at the edge of the query shape yields ~0.1, trailing down closer to 0 beyond that.
The "recipDistance" option is intended for use in boosting relevancy, such as using it in dismax's boost parameter.
- {{{ <field name="store_geohash" type="geohash" indexed="true" stored="true"/> }}}
+ If you want to sort and to have the distance in the results like in the last example, but don't want the spatial filter, you can do this too. Use this approach, in which we sort by a function query referring to a query's score:
- You can convert from 1 degree lat/long to an approximation of 111 km or 68.9722 miles. d=.14498 is equal to approx 10 miles.
+ {{{ &fl=*,distdeg:query($sortsq)&sort=query($sortsq) asc&sortsq={! score=distance}geo:"Intersects(Circle(54.729696,-98.525391 d=10))" }}}
- {{{ http://localhost:8983/solr/select?q=*:*&sort=query($sortsq)+asc&wt=xml&fl=distmi:mul(query($sortsq),68.9722),distkm:mul(query($sortsq),111),distdeg:query($sortsq),store_geohash&fq={!%20v=$geoq}&sortsq={!%20score=distance%20v=$geoq}&geoq=store_geohash:"Intersects(Circle(37.489651,-77.665085 d=.144985951))" }}}
+ The parameter "sortsq" was named arbitrarily (it's not special); it's referred to in both the "fl" parameter and the "sort" parameter, so both use the same distance-yielding query. If a document has no point in the spatial field, the distance used is 0. Use of this query in two places will result in some redundant calculations, but only for the results actually returned, not the potentially millions of matched documents.
- === Returning The Distance in Search Results ===
+ If you only need to return the distance but ''don't need to sort'', then the most performant approach is to calculate it on the client based on the lat & lon from the search results. Google for the haversine algorithm and your language of choice and you'll find a code snippet. If you ask Solr to do it then it'll put all the points in memory needlessly, but it'll certainly work. This shortcoming may be addressed in the future.
- TODO
+ Notes:
+ * If you index non-point data (e.g.
polygons), then the PrefixTree based strategy will supply the center points of those shapes for sorting purposes.
+ * If you supply multiple points or other shapes, then the distance to the closest one is used. If you need different behavior then file an issue in JIRA and explain your use-case.
+ * The PrefixTree based field type currently has a sub-par implementation for caching the indexed points in memory. Even if multiValued="false", it's going to use the same big array of List of Point objects in memory. It's wasteful and the implementation is not friendly to real-time search requirements. Until a better implementation arrives, if you have single-valued point fields then use LatLonType for sorting instead. LatLonType also allows the choice of a float based coordinate field, which halves memory compared to doubles yet still gives precision finer than 3 meters.
+ Sorting in Solr, whether it be on a number/date or one of these spatial fields, requires some memory for each document, and spatial sorting can involve some non-trivial math performed numerous times. Consequently, don't apply sorting without an actual need / requirement, versus a "hey, why not?" choice. The first time you sort on a field (spatial or not) it will load some data into memory. This "first time" is the first time since the last commit, to be precise. You probably want to put the sort query into firstSearcher & newSearcher so that an end user's search won't get hit with that penalty.
+
+ === Units, Conversion ===
+
+ Degrees to kilometers: degrees * 111.2
+ Degrees to miles: degrees * 69.09
+
+ Just divide instead of multiply to go the other way.
+
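
For the client-side distance calculation and the unit conversions mentioned in the page, here is a minimal Python sketch. It is not part of the wiki page or of Solr: the function names and the Earth radius constant are assumptions chosen for illustration, and only the 111.2 and 69.09 factors come from the page itself.

```python
import math

# Assumed mean Earth radius in km; not taken from the wiki page.
EARTH_RADIUS_KM = 6371.0087714

# Conversion factors quoted in the "Units, Conversion" section.
KM_PER_DEGREE = 111.2
MILES_PER_DEGREE = 69.09


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two lat/lon points given in degrees."""
    phi1 = math.radians(lat1)
    phi2 = math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    # Clamp to guard against tiny floating-point overshoot above 1.0.
    a = min(1.0, a)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def degrees_to_km(degrees):
    """Multiply to go from degrees to kilometers."""
    return degrees * KM_PER_DEGREE


def km_to_degrees(km):
    """Divide to go the other way."""
    return km / KM_PER_DEGREE


def degrees_to_miles(degrees):
    return degrees * MILES_PER_DEGREE


def miles_to_degrees(miles):
    return miles / MILES_PER_DEGREE


if __name__ == "__main__":
    # Distance from the query center used in the examples to a made-up document point.
    print(haversine_km(54.729696, -98.525391, 55.0, -98.0))
    # A d=10 degree circle radius expressed in km and miles.
    print(degrees_to_km(10), degrees_to_miles(10))
```

With the assumed radius, one degree along the equator comes out at roughly 111.2 km, which agrees with the factor on the page; a client could pass the lat & lon returned in each search result to haversine_km instead of asking Solr for the distance.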