Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Solr Wiki" for change 
notification.

The "SolrAdaptersForLuceneSpatial4" page has been changed by DavidSmiley:
http://wiki.apache.org/solr/SolrAdaptersForLuceneSpatial4?action=diff&rev1=17&rev2=18

Comment:
Whole section on sorting.

  
  == Sorting and Relevancy ==
  
- Here is an example of a circle of degree radius of .1, sorting and filtering 
by this.
+ A common spatial requirement is to sort the search results by distance from a 
point such as the center of a map window.  Again, this works quite differently 
from Solr 3 spatial.  Here, the spatial queries seen earlier are capable of 
returning a distance-based score, which can then be sorted on, used in relevancy 
boosting, and even returned in search results.
  
- {{{   
http://localhost:8983/solr/select?q=*:*&sort=query($sortsq)+asc&wt=xml&fl=distdeg:query($sortsq),store_geohash&fq={!%20v=$geoq}&sortsq={!%20score=distance%20v=$geoq}&geoq=store_geohash:"Intersects(Circle(37.489651,-77.665085
 d=.1))" }}}
+ The following parameters filter by a spatial query, sort by distance, and 
return the distance (as the score), all at once:
  
- The setup in schema.xml:
+ {{{ &fl=*,score&sort=score asc&q={! 
score=distance}geo:"Intersects(Circle(54.729696,-98.525391 d=10))" }}}
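
As a rough sketch, the parameters above could be assembled and URL-encoded from a client as follows. The helper name `spatial_sort_params` is made up for illustration, and the base URL is assumed to be a default local Solr; adjust both to your setup.

```python
from urllib.parse import urlencode


def spatial_sort_params(field, lat, lon, radius_deg):
    """Build params that filter by a circle, sort by distance,
    and return the distance as the score (hypothetical helper)."""
    circle = "Intersects(Circle(%s,%s d=%s))" % (lat, lon, radius_deg)
    return {
        # score=distance makes the spatial query score each doc by its distance
        "q": '{! score=distance}%s:"%s"' % (field, circle),
        "fl": "*,score",
        "sort": "score asc",
    }


params = spatial_sort_params("geo", 54.729696, -98.525391, 10)
# Assumed base URL; urlencode takes care of escaping spaces, quotes and braces.
print("http://localhost:8983/solr/select?" + urlencode(params))
```

Letting `urlencode` do the escaping avoids hand-writing sequences such as `%20` in the local-params.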
  
- {{{     <fieldType name="geohash"   
class="solr.SpatialRecursivePrefixTreeFieldType" units="degrees" /> }}}
+ In this case a user keyword search would be added as an 'fq' param, 
most likely with a leading {!edismax}. Notice the score=distance local-param 
here.  Without it (or if it is set to "none"), the query would yield a constant 1.0 
for all documents.  With "distance", the score is the distance in degrees from the 
center of the query shape to the indexed point(s).  You'll probably want to 
sort these values ascending.  Another option is "recipDistance" which will use 
the reciprocal function such that distance 0 yields a score of 1, and a 
distance at the edge of the query shape yields ~0.1, trailing down closer to 0 
beyond that.  The "recipDistance" option is intended for use in boosting 
relevancy, such as using it in dismax's boost parameter.
  
- {{{     <field name="store_geohash" type="geohash" indexed="true" 
stored="true"/> }}}
+ If you want to sort by distance and return it in the results as in the last 
example, but without the spatial filter, you can do that too.  Sort by a 
function query that refers to a query's score:
  
- You can convert from 1 degree lat/long to an approximation of 111 km or 
68.9722 miles. d=.14498 is equal to approx 10 miles.
+ {{{ &fl=*,distdeg:query($sortsq)&sort=query($sortsq) asc&sortsq={! 
score=distance}geo:"Intersects(Circle(54.729696,-98.525391 d=10))" }}}
  
- {{{   
http://localhost:8983/solr/select?q=*:*&sort=query($sortsq)+asc&wt=xml&fl=distmi:mul(query($sortsq),68.9722),distkm:mul(query($sortsq),111),distdeg:query($sortsq),store_geohash&fq={!%20v=$geoq}&sortsq={!%20score=distance%20v=$geoq}&geoq=store_geohash:"Intersects(Circle(37.489651,-77.665085
 d=.144985951))" }}}
+ The parameter name "sortsq" is arbitrary (it's not special); both the "fl" 
parameter and the "sort" parameter refer to it, so they share the same 
distance-yielding query. If a document has no point in the spatial field, the 
distance used is 0.  Using this query in two places results in some 
redundant calculations, but only for the results actually returned, not the 
potentially millions of matched documents.
  
- === Returning The Distance in Search Results ===
+ If you only need to return the distance but ''don't need to sort'', then the 
most performant approach is to calculate it on the client based on the lat & 
lon from the search results.  Google for the haversine algorithm and your 
language of choice and you'll find a code snippet.  If you ask Solr to do it 
then it'll put all the points in memory needlessly, but it'll certainly work. 
This shortcoming may be addressed in the future.
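
For the client-side calculation, a minimal haversine sketch might look like this. The function name and the mean Earth radius of 6371 km are assumptions made for illustration; the result is in kilometers, not in degrees as Solr's "distance" score is.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two lat/lon points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point overshoot near antipodes.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))


# Distance from the example query center to a point one degree further north.
print(haversine_km(54.729696, -98.525391, 55.729696, -98.525391))
```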
  
- TODO
+ Notes:
+  * If you index non-point data (e.g. polygons), then the PrefixTree based 
strategy will supply the center points of those shapes for sorting purposes.
+  * If you supply multiple points or other shapes, then the distance to the 
closest one is used. If you need different behavior then file an issue in JIRA 
and explain your use-case.
+  * The PrefixTree based field type currently has a sub-par implementation for 
caching the indexed points in memory.  Even if multiValued="false", it's 
going to use the same big array of List of Point objects in memory.  It's 
wasteful and the implementation is not friendly to real-time search 
requirements.  Until a better implementation arrives, if you have single-valued 
point fields then use LatLonType for sorting instead.  LatLonType also allows 
the choice of a float based coordinate field, which halves memory compared to 
doubles while still being precise to within 3 meters.
  
+ Sorting in Solr, whether on a number/date or on one of these spatial fields, 
requires some memory for each document, and spatial sorting can involve some 
non-trivial math performed numerous times.  Consequently, don't apply sorting 
without an actual need / requirement, versus a "hey, why not?" choice.  The 
first time you sort on a field (spatial or not) it will load some data into 
memory.  This "first time" is the first time since the last commit, to be 
precise.  You probably want to put the sort query into firstSearcher & 
newSearcher so that an end user's search won't get hit with that penalty.
+ 
+ === Units, Conversion ===
+ 
+  * Degrees to kilometers: degrees * 111.2
+  * Degrees to miles: degrees * 69.09
+ 
+ Just divide instead of multiply to go the other way.
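
The conversions above can be wrapped in small helpers, for example to turn a radius in miles into the degree value for "d=". This is only a sketch; the function and constant names are made up for illustration, and the factors are the approximations listed above.

```python
KM_PER_DEGREE = 111.2      # approximate factor from this section
MILES_PER_DEGREE = 69.09   # approximate factor from this section


def degrees_to_km(degrees):
    return degrees * KM_PER_DEGREE


def degrees_to_miles(degrees):
    return degrees * MILES_PER_DEGREE


def km_to_degrees(km):
    # Going the other way: divide instead of multiply.
    return km / KM_PER_DEGREE


def miles_to_degrees(miles):
    return miles / MILES_PER_DEGREE


# A 10 mile radius expressed in degrees, rounded for use in d=...
print(round(miles_to_degrees(10), 5))
```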
+ 
