[ 
https://issues.apache.org/jira/browse/SOLR-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12924098#action_12924098
 ] 

Lance Norskog edited comment on SOLR-2155 at 10/22/10 8:32 PM:
---------------------------------------------------------------

I've reread the patch a few times and I understand it better now, and yes there 
should be no equator/prime meridian problems. I retract any overt or implied 
criticism.

bq. First of all, I'm re-using the existing geohash field support in Solr which 
indexes the lat-lons as actual geohashes (i.e. the character representation), 
not in a bitwise fashion. But that doesn't really matter - it would be a 
worthwhile optimization to index them in that fashion as it would be more 
compact.

Using the canonical geohash gives facet values that can be copied and pasted 
into other software. Thinking again, this is a great feature. Would it be worth 
optimizing geohash with a Trie version?  Trie fields (can be made to) show up 
correctly in facets.

And thank you for the word 
[gazetteer|http://encyclopedia2.thefreedictionary.com/Gazateer].

About unit tests: I've stumbled so many times with floating point that I only 
trust real-world data. A good unit test would index a 
[gazetteer|http://encyclopedia2.thefreedictionary.com/Gazateer] of world data 
and compare randomly chosen points, for example OpenStreetMap or Wikipedia 
locations.
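
A rough sketch of what such a check could look like, in Java. Everything here is 
an assumption for illustration: the class and method names are hypothetical and 
are not taken from the patch or from GeoHashUtils, and the random points near 
(0, 0) are only a stand-in for gazetteer records. It encodes each point, decodes 
the hash back to its cell, and counts points that fall outside their own cell, 
which is where an equator/prime meridian bug would show up.

```java
import java.util.Random;

// Hypothetical sketch; not the code from GeoHashPrefixFilter.patch.
final class GeohashRoundTripSketch {
    private static final String BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz";

    /** Encodes a point by alternately halving the longitude and latitude ranges. */
    static String encode(double lat, double lon, int length) {
        double latMin = -90, latMax = 90, lonMin = -180, lonMax = 180;
        StringBuilder hash = new StringBuilder(length);
        boolean lonBit = true; // the first bit always splits longitude
        int bits = 0, value = 0;
        while (hash.length() < length) {
            if (lonBit) {
                double mid = (lonMin + lonMax) / 2;
                if (lon >= mid) { value = (value << 1) | 1; lonMin = mid; }
                else { value <<= 1; lonMax = mid; }
            } else {
                double mid = (latMin + latMax) / 2;
                if (lat >= mid) { value = (value << 1) | 1; latMin = mid; }
                else { value <<= 1; latMax = mid; }
            }
            lonBit = !lonBit;
            if (++bits == 5) { // five bits make one base-32 character
                hash.append(BASE32.charAt(value));
                bits = 0;
                value = 0;
            }
        }
        return hash.toString();
    }

    /** Decodes a geohash to its cell as {latMin, latMax, lonMin, lonMax}. */
    static double[] decodeBox(String hash) {
        double latMin = -90, latMax = 90, lonMin = -180, lonMax = 180;
        boolean lonBit = true;
        for (int i = 0; i < hash.length(); i++) {
            int value = BASE32.indexOf(hash.charAt(i));
            for (int mask = 16; mask > 0; mask >>= 1) {
                boolean upper = (value & mask) != 0;
                if (lonBit) {
                    double mid = (lonMin + lonMax) / 2;
                    if (upper) lonMin = mid; else lonMax = mid;
                } else {
                    double mid = (latMin + latMax) / 2;
                    if (upper) latMin = mid; else latMax = mid;
                }
                lonBit = !lonBit;
            }
        }
        return new double[] {latMin, latMax, lonMin, lonMax};
    }

    /** Counts random points within one degree of (0, 0) that fall outside their decoded cell. */
    static int countRoundTripFailures(long seed, int samples, int length) {
        Random random = new Random(seed);
        int failures = 0;
        for (int i = 0; i < samples; i++) {
            double lat = (random.nextDouble() - 0.5) * 2.0; // straddles the equator
            double lon = (random.nextDouble() - 0.5) * 2.0; // straddles the prime meridian
            double[] box = decodeBox(encode(lat, lon, length));
            boolean inside = box[0] <= lat && lat <= box[1] && box[2] <= lon && lon <= box[3];
            if (!inside) failures++;
        }
        return failures;
    }
}
```

In a real test the random generator would be replaced by records read from a 
gazetteer, and the same containment check would run against the filter's output.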


> Geospatial search using geohash prefixes
> ----------------------------------------
>
>                 Key: SOLR-2155
>                 URL: https://issues.apache.org/jira/browse/SOLR-2155
>             Project: Solr
>          Issue Type: Improvement
>            Reporter: David Smiley
>         Attachments: GeoHashPrefixFilter.patch
>
>
> There currently isn't a solution in Solr for doing geospatial filtering on 
> documents that have a variable number of points.  This scenario occurs when 
> there is location extraction (e.g. via a "gazetteer") occurring on free text.  
> None, one, or many geospatial locations might be extracted from any given 
> document and users want to limit their search results to those occurring in a 
> user-specified area.
> I've implemented this by extending the GeoHash-based work in Lucene/Solr 
> with a geohash-prefix-based filter.  A geohash refers to a lat-lon box on the 
> earth.  Each successive character added further subdivides the box into a 4x8 
> (or 8x4 depending on the even/odd length of the geohash) grid.  The first 
> step in this scheme is figuring out which geohash grid squares cover the 
> user's search query.  I've added various extra methods to GeoHashUtils (and 
> added tests) to assist in this purpose.  The next step is an actual Lucene 
> Filter, GeoHashPrefixFilter, that uses these geohash prefixes in 
> TermsEnum.seek() to skip to relevant grid squares in the index.  Once a 
> matching geohash grid square is found, the points therein are compared 
> against the user's query to see if they match.  I created an abstraction GeoShape 
> extended by subclasses named PointDistance... and CartesianBox.... to support 
> different queried shapes so that the filter need not care about these details.
> This work was presented at LuceneRevolution in Boston on October 8th.
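
To make the grid description above concrete, here is a minimal Java sketch of 
standard geohash encoding. It is an illustration under stated assumptions, not 
the patch's code: the class GeohashSketch and its methods are hypothetical names 
and do not mirror GeoHashUtils. It shows that each character carries five 
interleaved bits, so one more character splits a cell into 8x4 or 4x8 sub-cells, 
and that a shorter hash of the same point is a prefix of a longer one.

```java
// Hypothetical sketch; names are illustrative and not taken from GeoHashUtils.
final class GeohashSketch {
    private static final String BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz";

    /** Encodes a point by alternately halving the longitude and latitude ranges. */
    static String encode(double lat, double lon, int length) {
        double latMin = -90, latMax = 90, lonMin = -180, lonMax = 180;
        StringBuilder hash = new StringBuilder(length);
        boolean lonBit = true; // the first bit always splits longitude
        int bits = 0, value = 0;
        while (hash.length() < length) {
            if (lonBit) {
                double mid = (lonMin + lonMax) / 2;
                if (lon >= mid) { value = (value << 1) | 1; lonMin = mid; }
                else { value <<= 1; lonMax = mid; }
            } else {
                double mid = (latMin + latMax) / 2;
                if (lat >= mid) { value = (value << 1) | 1; latMin = mid; }
                else { value <<= 1; latMax = mid; }
            }
            lonBit = !lonBit;
            if (++bits == 5) { // five bits make one base-32 character
                hash.append(BASE32.charAt(value));
                bits = 0;
                value = 0;
            }
        }
        return hash.toString();
    }

    /**
     * Returns {columns, rows} of the sub-grid created by appending one character
     * to a hash of the given length. An even length means the next character
     * starts on a longitude bit (3 longitude bits, 2 latitude bits).
     */
    static int[] subGrid(int currentLength) {
        return (currentLength % 2 == 0) ? new int[] {8, 4} : new int[] {4, 8};
    }

    public static void main(String[] args) {
        String coarse = encode(57.64911, 10.40744, 5);
        String fine = encode(57.64911, 10.40744, 9);
        System.out.println(coarse);                  // u4pru
        System.out.println(fine.startsWith(coarse)); // true
    }
}
```

Because a coarse hash is a prefix of every finer hash inside its cell, seeking 
to a prefix in a sorted term dictionary reaches all indexed points in that grid 
square, which is the property the prefix filter relies on.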
