[jira] [Commented] (SOLR-2155) Geospatial search using geohash prefixes

2013-05-22 Thread kevenz (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13663989#comment-13663989
 ] 

kevenz commented on SOLR-2155:
--

hi David, I'm using solr 4.3, I have indexed docs with a polygon field,  and 
I'd like to search the polygon docs according to the given point.

I've put the jts-1.13.jar into the WEB-INF/lib directory, and I've added the 
doc to solr successfully. My question is how to search? I'm new to lucene and 
solr, any help would be appreciated.

scheme.xml: 
fieldType name=location_rpt   
class=solr.SpatialRecursivePrefixTreeFieldType
   
spatialContextFactory=com.spatial4j.core.context.jts.JtsSpatialContextFactory
   distErrPct=0.025
   maxDistErr=0.09
   units=degrees
/
field name=geo  type=location_rpt  indexed=true stored=true  
multiValued=true /

java code:
String sql = indexType:219 + AND + 
geo:Contains(POINT(114.078327401257,22.5424866754136));
SolrQuery query = new SolrQuery();
query.setQuery(sql);
QueryResponse rsp = server.query(query);
SolrDocumentList docsList = rsp.getResults();

Then I got an error at java.lang.IllegalArgumentException: missing parens: 
Contains. Is there any suggestion?

 Geospatial search using geohash prefixes
 

 Key: SOLR-2155
 URL: https://issues.apache.org/jira/browse/SOLR-2155
 Project: Solr
  Issue Type: Improvement
Reporter: David Smiley
Assignee: David Smiley
 Attachments: GeoHashPrefixFilter.patch, GeoHashPrefixFilter.patch, 
 GeoHashPrefixFilter.patch, Solr2155-1.0.2-project.zip, 
 Solr2155-1.0.3-project.zip, Solr2155-1.0.4-project.zip, 
 Solr2155-for-1.0.2-3.x-port.patch, 
 SOLR-2155_GeoHashPrefixFilter_with_sorting_no_poly.patch, SOLR.2155.p3.patch, 
 SOLR.2155.p3tests.patch


 {panel:title=NOTICE} The status of this issue is a plugin for Solr 3.x 
 located here: https://github.com/dsmiley/SOLR-2155.  Look at the introductory 
 readme and download the plugin .jar file.  Lucene 4's new spatial module is 
 largely based on this code.  The Solr 4 glue for it should come very soon but 
 as of this writing it's hosted temporarily at https://github.com/spatial4j.  
 For more information on using SOLR-2155 with Solr 3, see 
 http://wiki.apache.org/solr/SpatialSearch#SOLR-2155  This JIRA issue is 
 closed because it won't be committed in its current form.
 {panel}
 There currently isn't a solution in Solr for doing geospatial filtering on 
 documents that have a variable number of points.  This scenario occurs when 
 there is location extraction (i.e. via a gazateer) occurring on free text.  
 None, one, or many geospatial locations might be extracted from any given 
 document and users want to limit their search results to those occurring in a 
 user-specified area.
 I've implemented this by furthering the GeoHash based work in Lucene/Solr 
 with a geohash prefix based filter.  A geohash refers to a lat-lon box on the 
 earth.  Each successive character added further subdivides the box into a 4x8 
 (or 8x4 depending on the even/odd length of the geohash) grid.  The first 
 step in this scheme is figuring out which geohash grid squares cover the 
 user's search query.  I've added various extra methods to GeoHashUtils (and 
 added tests) to assist in this purpose.  The next step is an actual Lucene 
 Filter, GeoHashPrefixFilter, that uses these geohash prefixes in 
 TermsEnum.seek() to skip to relevant grid squares in the index.  Once a 
 matching geohash grid is found, the points therein are compared against the 
 user's query to see if it matches.  I created an abstraction GeoShape 
 extended by subclasses named PointDistance... and CartesianBox to support 
 different queried shapes so that the filter need not care about these details.
 This work was presented at LuceneRevolution in Boston on October 8th.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-2155) Geospatial search using geohash prefixes

2013-05-22 Thread David Smiley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13664227#comment-13664227
 ] 

David Smiley commented on SOLR-2155:


Kevenz, please ask your question on the Solr-user list. It doesn't pertain to 
SOLR-2155. I'll look for your question and answer there.

 Geospatial search using geohash prefixes
 

 Key: SOLR-2155
 URL: https://issues.apache.org/jira/browse/SOLR-2155
 Project: Solr
  Issue Type: Improvement
Reporter: David Smiley
Assignee: David Smiley
 Attachments: GeoHashPrefixFilter.patch, GeoHashPrefixFilter.patch, 
 GeoHashPrefixFilter.patch, Solr2155-1.0.2-project.zip, 
 Solr2155-1.0.3-project.zip, Solr2155-1.0.4-project.zip, 
 Solr2155-for-1.0.2-3.x-port.patch, 
 SOLR-2155_GeoHashPrefixFilter_with_sorting_no_poly.patch, SOLR.2155.p3.patch, 
 SOLR.2155.p3tests.patch


 {panel:title=NOTICE} The status of this issue is a plugin for Solr 3.x 
 located here: https://github.com/dsmiley/SOLR-2155.  Look at the introductory 
 readme and download the plugin .jar file.  Lucene 4's new spatial module is 
 largely based on this code.  The Solr 4 glue for it should come very soon but 
 as of this writing it's hosted temporarily at https://github.com/spatial4j.  
 For more information on using SOLR-2155 with Solr 3, see 
 http://wiki.apache.org/solr/SpatialSearch#SOLR-2155  This JIRA issue is 
 closed because it won't be committed in its current form.
 {panel}
 There currently isn't a solution in Solr for doing geospatial filtering on 
 documents that have a variable number of points.  This scenario occurs when 
 there is location extraction (i.e. via a gazateer) occurring on free text.  
 None, one, or many geospatial locations might be extracted from any given 
 document and users want to limit their search results to those occurring in a 
 user-specified area.
 I've implemented this by furthering the GeoHash based work in Lucene/Solr 
 with a geohash prefix based filter.  A geohash refers to a lat-lon box on the 
 earth.  Each successive character added further subdivides the box into a 4x8 
 (or 8x4 depending on the even/odd length of the geohash) grid.  The first 
 step in this scheme is figuring out which geohash grid squares cover the 
 user's search query.  I've added various extra methods to GeoHashUtils (and 
 added tests) to assist in this purpose.  The next step is an actual Lucene 
 Filter, GeoHashPrefixFilter, that uses these geohash prefixes in 
 TermsEnum.seek() to skip to relevant grid squares in the index.  Once a 
 matching geohash grid is found, the points therein are compared against the 
 user's query to see if it matches.  I created an abstraction GeoShape 
 extended by subclasses named PointDistance... and CartesianBox to support 
 different queried shapes so that the filter need not care about these details.
 This work was presented at LuceneRevolution in Boston on October 8th.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-2155) Geospatial search using geohash prefixes

2013-03-01 Thread Sandeep Tucknat (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13590636#comment-13590636
 ] 

Sandeep Tucknat commented on SOLR-2155:
---

I have a similar requirement to Sujan. I am doing a filtering spatial query, 
trying to find businesses (with multiple locations stored in a multi-valued 
field) available within a radius of a given point. I also need to know how many 
locations are actually within the radius as well as which one is the closest. 
Wondering if that's possible with the Solr 3 or 4 spatial implementation.

 Geospatial search using geohash prefixes
 

 Key: SOLR-2155
 URL: https://issues.apache.org/jira/browse/SOLR-2155
 Project: Solr
  Issue Type: Improvement
Reporter: David Smiley
Assignee: David Smiley
 Attachments: GeoHashPrefixFilter.patch, GeoHashPrefixFilter.patch, 
 GeoHashPrefixFilter.patch, Solr2155-1.0.2-project.zip, 
 Solr2155-1.0.3-project.zip, Solr2155-1.0.4-project.zip, 
 Solr2155-for-1.0.2-3.x-port.patch, 
 SOLR-2155_GeoHashPrefixFilter_with_sorting_no_poly.patch, SOLR.2155.p3.patch, 
 SOLR.2155.p3tests.patch


 {panel:title=NOTICE} The status of this issue is a plugin for Solr 3.x 
 located here: https://github.com/dsmiley/SOLR-2155.  Look at the introductory 
 readme and download the plugin .jar file.  Lucene 4's new spatial module is 
 largely based on this code.  The Solr 4 glue for it should come very soon but 
 as of this writing it's hosted temporarily at https://github.com/spatial4j.  
 For more information on using SOLR-2155 with Solr 3, see 
 http://wiki.apache.org/solr/SpatialSearch#SOLR-2155  This JIRA issue is 
 closed because it won't be committed in its current form.
 {panel}
 There currently isn't a solution in Solr for doing geospatial filtering on 
 documents that have a variable number of points.  This scenario occurs when 
 there is location extraction (i.e. via a gazateer) occurring on free text.  
 None, one, or many geospatial locations might be extracted from any given 
 document and users want to limit their search results to those occurring in a 
 user-specified area.
 I've implemented this by furthering the GeoHash based work in Lucene/Solr 
 with a geohash prefix based filter.  A geohash refers to a lat-lon box on the 
 earth.  Each successive character added further subdivides the box into a 4x8 
 (or 8x4 depending on the even/odd length of the geohash) grid.  The first 
 step in this scheme is figuring out which geohash grid squares cover the 
 user's search query.  I've added various extra methods to GeoHashUtils (and 
 added tests) to assist in this purpose.  The next step is an actual Lucene 
 Filter, GeoHashPrefixFilter, that uses these geohash prefixes in 
 TermsEnum.seek() to skip to relevant grid squares in the index.  Once a 
 matching geohash grid is found, the points therein are compared against the 
 user's query to see if it matches.  I created an abstraction GeoShape 
 extended by subclasses named PointDistance... and CartesianBox to support 
 different queried shapes so that the filter need not care about these details.
 This work was presented at LuceneRevolution in Boston on October 8th.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-2155) Geospatial search using geohash prefixes

2013-03-01 Thread David Smiley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13590665#comment-13590665
 ] 

David Smiley commented on SOLR-2155:


Sujan, Sandeep,
The filter doesn't ultimately know which, just that the document (business) 
matched.  At the time you display the search results, which is only the top-X 
(20?  100?) you could then figure out which addresses matched and which is 
closest.  Since this is only done on the limited number of documents you're 
displaying, it should scale fine. If your docs many many locations then ideally 
Solr would have a mechanism to filter out the locations outside the filter from 
the multi-value so that you needn't do this yourself client-side.  That 
optimization is on my TODO list.

For Solr 3 use SOLR-2155 (see the banner at the top of this JIRA issue) and for 
Solr 4, see the location_rpt field in the default schema to get started.

 Geospatial search using geohash prefixes
 

 Key: SOLR-2155
 URL: https://issues.apache.org/jira/browse/SOLR-2155
 Project: Solr
  Issue Type: Improvement
Reporter: David Smiley
Assignee: David Smiley
 Attachments: GeoHashPrefixFilter.patch, GeoHashPrefixFilter.patch, 
 GeoHashPrefixFilter.patch, Solr2155-1.0.2-project.zip, 
 Solr2155-1.0.3-project.zip, Solr2155-1.0.4-project.zip, 
 Solr2155-for-1.0.2-3.x-port.patch, 
 SOLR-2155_GeoHashPrefixFilter_with_sorting_no_poly.patch, SOLR.2155.p3.patch, 
 SOLR.2155.p3tests.patch


 {panel:title=NOTICE} The status of this issue is a plugin for Solr 3.x 
 located here: https://github.com/dsmiley/SOLR-2155.  Look at the introductory 
 readme and download the plugin .jar file.  Lucene 4's new spatial module is 
 largely based on this code.  The Solr 4 glue for it should come very soon but 
 as of this writing it's hosted temporarily at https://github.com/spatial4j.  
 For more information on using SOLR-2155 with Solr 3, see 
 http://wiki.apache.org/solr/SpatialSearch#SOLR-2155  This JIRA issue is 
 closed because it won't be committed in its current form.
 {panel}
 There currently isn't a solution in Solr for doing geospatial filtering on 
 documents that have a variable number of points.  This scenario occurs when 
 there is location extraction (i.e. via a gazateer) occurring on free text.  
 None, one, or many geospatial locations might be extracted from any given 
 document and users want to limit their search results to those occurring in a 
 user-specified area.
 I've implemented this by furthering the GeoHash based work in Lucene/Solr 
 with a geohash prefix based filter.  A geohash refers to a lat-lon box on the 
 earth.  Each successive character added further subdivides the box into a 4x8 
 (or 8x4 depending on the even/odd length of the geohash) grid.  The first 
 step in this scheme is figuring out which geohash grid squares cover the 
 user's search query.  I've added various extra methods to GeoHashUtils (and 
 added tests) to assist in this purpose.  The next step is an actual Lucene 
 Filter, GeoHashPrefixFilter, that uses these geohash prefixes in 
 TermsEnum.seek() to skip to relevant grid squares in the index.  Once a 
 matching geohash grid is found, the points therein are compared against the 
 user's query to see if it matches.  I created an abstraction GeoShape 
 extended by subclasses named PointDistance... and CartesianBox to support 
 different queried shapes so that the filter need not care about these details.
 This work was presented at LuceneRevolution in Boston on October 8th.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-2155) Geospatial search using geohash prefixes

2013-03-01 Thread Sandeep Tucknat (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13590679#comment-13590679
 ] 

Sandeep Tucknat commented on SOLR-2155:
---

First of all, thanks for the prompt response! It feels good to see you are 
supporting the approach we took in the interim :) I was just thinking that in 
order to do the ranking, the filter has to go through all the values of the 
field and it shouldn't be hard to persist this information and return to the 
client. We'll be waiting for that optimization to come through! Many thanks! 
let me know if I can help in any way (3 weeks in spatial or solr).

 Geospatial search using geohash prefixes
 

 Key: SOLR-2155
 URL: https://issues.apache.org/jira/browse/SOLR-2155
 Project: Solr
  Issue Type: Improvement
Reporter: David Smiley
Assignee: David Smiley
 Attachments: GeoHashPrefixFilter.patch, GeoHashPrefixFilter.patch, 
 GeoHashPrefixFilter.patch, Solr2155-1.0.2-project.zip, 
 Solr2155-1.0.3-project.zip, Solr2155-1.0.4-project.zip, 
 Solr2155-for-1.0.2-3.x-port.patch, 
 SOLR-2155_GeoHashPrefixFilter_with_sorting_no_poly.patch, SOLR.2155.p3.patch, 
 SOLR.2155.p3tests.patch


 {panel:title=NOTICE} The status of this issue is a plugin for Solr 3.x 
 located here: https://github.com/dsmiley/SOLR-2155.  Look at the introductory 
 readme and download the plugin .jar file.  Lucene 4's new spatial module is 
 largely based on this code.  The Solr 4 glue for it should come very soon but 
 as of this writing it's hosted temporarily at https://github.com/spatial4j.  
 For more information on using SOLR-2155 with Solr 3, see 
 http://wiki.apache.org/solr/SpatialSearch#SOLR-2155  This JIRA issue is 
 closed because it won't be committed in its current form.
 {panel}
 There currently isn't a solution in Solr for doing geospatial filtering on 
 documents that have a variable number of points.  This scenario occurs when 
 there is location extraction (i.e. via a gazateer) occurring on free text.  
 None, one, or many geospatial locations might be extracted from any given 
 document and users want to limit their search results to those occurring in a 
 user-specified area.
 I've implemented this by furthering the GeoHash based work in Lucene/Solr 
 with a geohash prefix based filter.  A geohash refers to a lat-lon box on the 
 earth.  Each successive character added further subdivides the box into a 4x8 
 (or 8x4 depending on the even/odd length of the geohash) grid.  The first 
 step in this scheme is figuring out which geohash grid squares cover the 
 user's search query.  I've added various extra methods to GeoHashUtils (and 
 added tests) to assist in this purpose.  The next step is an actual Lucene 
 Filter, GeoHashPrefixFilter, that uses these geohash prefixes in 
 TermsEnum.seek() to skip to relevant grid squares in the index.  Once a 
 matching geohash grid is found, the points therein are compared against the 
 user's query to see if it matches.  I created an abstraction GeoShape 
 extended by subclasses named PointDistance... and CartesianBox to support 
 different queried shapes so that the filter need not care about these details.
 This work was presented at LuceneRevolution in Boston on October 8th.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-2155) Geospatial search using geohash prefixes

2013-03-01 Thread David Smiley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13590706#comment-13590706
 ] 

David Smiley commented on SOLR-2155:


bq. I was just thinking that in order to do the ranking, the filter has to go 
through all the values of the field and it shouldn't be hard to persist this 
information and return to the client

It doesn't; this is a common misconception!  It does *not* calculate the 
distance between every indexed matched point and the center point of a circle 
query shape.  If it did, it wouldn't be so fast :-)  If you want an 
implementation like that then look at LatLonType which is a brute force 
algorithm and hence is not as scalable, and doesn't support multi-value either. 
 To try and help explain how this can possibly be, understand that there are 
large grid cells that fit within the query shape and for these large grid 
cells, the index knows which documents are all in that grid cell and so it 
simply matches those documents without knowing or calculating more precisely 
where those underlying points actually are.  So it's not like the filter code 
has all this information you want and simply isn't exposing it to you.  I need 
to drive this point home at my next conference presentation at Lucene/Solr 
Revolution 2013 in May (San Diego CA).

 Geospatial search using geohash prefixes
 

 Key: SOLR-2155
 URL: https://issues.apache.org/jira/browse/SOLR-2155
 Project: Solr
  Issue Type: Improvement
Reporter: David Smiley
Assignee: David Smiley
 Attachments: GeoHashPrefixFilter.patch, GeoHashPrefixFilter.patch, 
 GeoHashPrefixFilter.patch, Solr2155-1.0.2-project.zip, 
 Solr2155-1.0.3-project.zip, Solr2155-1.0.4-project.zip, 
 Solr2155-for-1.0.2-3.x-port.patch, 
 SOLR-2155_GeoHashPrefixFilter_with_sorting_no_poly.patch, SOLR.2155.p3.patch, 
 SOLR.2155.p3tests.patch


 {panel:title=NOTICE} The status of this issue is a plugin for Solr 3.x 
 located here: https://github.com/dsmiley/SOLR-2155.  Look at the introductory 
 readme and download the plugin .jar file.  Lucene 4's new spatial module is 
 largely based on this code.  The Solr 4 glue for it should come very soon but 
 as of this writing it's hosted temporarily at https://github.com/spatial4j.  
 For more information on using SOLR-2155 with Solr 3, see 
 http://wiki.apache.org/solr/SpatialSearch#SOLR-2155  This JIRA issue is 
 closed because it won't be committed in its current form.
 {panel}
 There currently isn't a solution in Solr for doing geospatial filtering on 
 documents that have a variable number of points.  This scenario occurs when 
 there is location extraction (i.e. via a gazateer) occurring on free text.  
 None, one, or many geospatial locations might be extracted from any given 
 document and users want to limit their search results to those occurring in a 
 user-specified area.
 I've implemented this by furthering the GeoHash based work in Lucene/Solr 
 with a geohash prefix based filter.  A geohash refers to a lat-lon box on the 
 earth.  Each successive character added further subdivides the box into a 4x8 
 (or 8x4 depending on the even/odd length of the geohash) grid.  The first 
 step in this scheme is figuring out which geohash grid squares cover the 
 user's search query.  I've added various extra methods to GeoHashUtils (and 
 added tests) to assist in this purpose.  The next step is an actual Lucene 
 Filter, GeoHashPrefixFilter, that uses these geohash prefixes in 
 TermsEnum.seek() to skip to relevant grid squares in the index.  Once a 
 matching geohash grid is found, the points therein are compared against the 
 user's query to see if it matches.  I created an abstraction GeoShape 
 extended by subclasses named PointDistance... and CartesianBox to support 
 different queried shapes so that the filter need not care about these details.
 This work was presented at LuceneRevolution in Boston on October 8th.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-2155) Geospatial search using geohash prefixes

2013-03-01 Thread Sandeep Tucknat (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13590810#comment-13590810
 ] 

Sandeep Tucknat commented on SOLR-2155:
---

Once again, thanks for the prompt response AND the information! It makes sense, 
you have a forward index of grid cells to documents, no geo comparisons 
required at run time. I won't be able to attend the conference but will 
definitely look forward to your presentation!

 Geospatial search using geohash prefixes
 

 Key: SOLR-2155
 URL: https://issues.apache.org/jira/browse/SOLR-2155
 Project: Solr
  Issue Type: Improvement
Reporter: David Smiley
Assignee: David Smiley
 Attachments: GeoHashPrefixFilter.patch, GeoHashPrefixFilter.patch, 
 GeoHashPrefixFilter.patch, Solr2155-1.0.2-project.zip, 
 Solr2155-1.0.3-project.zip, Solr2155-1.0.4-project.zip, 
 Solr2155-for-1.0.2-3.x-port.patch, 
 SOLR-2155_GeoHashPrefixFilter_with_sorting_no_poly.patch, SOLR.2155.p3.patch, 
 SOLR.2155.p3tests.patch


 {panel:title=NOTICE} The status of this issue is a plugin for Solr 3.x 
 located here: https://github.com/dsmiley/SOLR-2155.  Look at the introductory 
 readme and download the plugin .jar file.  Lucene 4's new spatial module is 
 largely based on this code.  The Solr 4 glue for it should come very soon but 
 as of this writing it's hosted temporarily at https://github.com/spatial4j.  
 For more information on using SOLR-2155 with Solr 3, see 
 http://wiki.apache.org/solr/SpatialSearch#SOLR-2155  This JIRA issue is 
 closed because it won't be committed in its current form.
 {panel}
 There currently isn't a solution in Solr for doing geospatial filtering on 
 documents that have a variable number of points.  This scenario occurs when 
 there is location extraction (i.e. via a gazateer) occurring on free text.  
 None, one, or many geospatial locations might be extracted from any given 
 document and users want to limit their search results to those occurring in a 
 user-specified area.
 I've implemented this by furthering the GeoHash based work in Lucene/Solr 
 with a geohash prefix based filter.  A geohash refers to a lat-lon box on the 
 earth.  Each successive character added further subdivides the box into a 4x8 
 (or 8x4 depending on the even/odd length of the geohash) grid.  The first 
 step in this scheme is figuring out which geohash grid squares cover the 
 user's search query.  I've added various extra methods to GeoHashUtils (and 
 added tests) to assist in this purpose.  The next step is an actual Lucene 
 Filter, GeoHashPrefixFilter, that uses these geohash prefixes in 
 TermsEnum.seek() to skip to relevant grid squares in the index.  Once a 
 matching geohash grid is found, the points therein are compared against the 
 user's query to see if it matches.  I created an abstraction GeoShape 
 extended by subclasses named PointDistance... and CartesianBox to support 
 different queried shapes so that the filter need not care about these details.
 This work was presented at LuceneRevolution in Boston on October 8th.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-2155) Geospatial search using geohash prefixes

2012-12-07 Thread Sujan (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13526849#comment-13526849
 ] 

Sujan commented on SOLR-2155:
-

Is there a way to know the address within a document that matches the location 
search and not just the document, 
for example, i might have Store A with location addresses 100 Main Street, 
NY, 1 and 100 NotMainStreet, NY, 00010 which are like say 40 miles 
apart. I search for 2 and 100 Main Street, NY, 1 matches, I want 
result to indicate location 100 Main Street, NY of Store A matches, not 
sure if that functionality exists.

 Geospatial search using geohash prefixes
 

 Key: SOLR-2155
 URL: https://issues.apache.org/jira/browse/SOLR-2155
 Project: Solr
  Issue Type: Improvement
Reporter: David Smiley
Assignee: David Smiley
 Attachments: GeoHashPrefixFilter.patch, GeoHashPrefixFilter.patch, 
 GeoHashPrefixFilter.patch, Solr2155-1.0.2-project.zip, 
 Solr2155-1.0.3-project.zip, Solr2155-1.0.4-project.zip, 
 Solr2155-for-1.0.2-3.x-port.patch, 
 SOLR-2155_GeoHashPrefixFilter_with_sorting_no_poly.patch, SOLR.2155.p3.patch, 
 SOLR.2155.p3tests.patch


 {panel:title=NOTICE} The status of this issue is a plugin for Solr 3.x 
 located here: https://github.com/dsmiley/SOLR-2155.  Look at the introductory 
 readme and download the plugin .jar file.  Lucene 4's new spatial module is 
 largely based on this code.  The Solr 4 glue for it should come very soon but 
 as of this writing it's hosted temporarily at https://github.com/spatial4j.  
 For more information on using SOLR-2155 with Solr 3, see 
 http://wiki.apache.org/solr/SpatialSearch#SOLR-2155  This JIRA issue is 
 closed because it won't be committed in its current form.
 {panel}
 There currently isn't a solution in Solr for doing geospatial filtering on 
 documents that have a variable number of points.  This scenario occurs when 
 there is location extraction (i.e. via a gazateer) occurring on free text.  
 None, one, or many geospatial locations might be extracted from any given 
 document and users want to limit their search results to those occurring in a 
 user-specified area.
 I've implemented this by furthering the GeoHash based work in Lucene/Solr 
 with a geohash prefix based filter.  A geohash refers to a lat-lon box on the 
 earth.  Each successive character added further subdivides the box into a 4x8 
 (or 8x4 depending on the even/odd length of the geohash) grid.  The first 
 step in this scheme is figuring out which geohash grid squares cover the 
 user's search query.  I've added various extra methods to GeoHashUtils (and 
 added tests) to assist in this purpose.  The next step is an actual Lucene 
 Filter, GeoHashPrefixFilter, that uses these geohash prefixes in 
 TermsEnum.seek() to skip to relevant grid squares in the index.  Once a 
 matching geohash grid is found, the points therein are compared against the 
 user's query to see if it matches.  I created an abstraction GeoShape 
 extended by subclasses named PointDistance... and CartesianBox to support 
 different queried shapes so that the filter need not care about these details.
 This work was presented at LuceneRevolution in Boston on October 8th.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-2155) Geospatial search using geohash prefixes

2012-10-17 Thread Robert Tseng (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13477653#comment-13477653
 ] 

Robert Tseng commented on SOLR-2155:


Hi All,

New to Solr here!  I have a question for you all on gh_geofilt.  My document 
has rows of path, think of KML lineString, of which I want to do bounding box 
check which of them fall within a box.  Each row basically has an id field and 
a multivalued field describing the line with multiple points.

What I want returned is all lines that fall within but I read Solr is not very 
good, yet, in returning large number of hits.  Hence the row params to limit 
result to top N rows.  My two questions are:

1. If want to retrieve all rows, do I query twice from solrj.  Once to get 
number of hits so I can set the number of rows that grabs all row in a second 
call? Or two I should chunk up the query call using the start params as an 
offset? 

2. If it's only returning top N, is it based on score?  What is considered high 
score? A row with most number of hits in the box? Cloest to the center?

 Geospatial search using geohash prefixes
 

 Key: SOLR-2155
 URL: https://issues.apache.org/jira/browse/SOLR-2155
 Project: Solr
  Issue Type: Improvement
Reporter: David Smiley
Assignee: David Smiley
 Attachments: GeoHashPrefixFilter.patch, GeoHashPrefixFilter.patch, 
 GeoHashPrefixFilter.patch, Solr2155-1.0.2-project.zip, 
 Solr2155-1.0.3-project.zip, Solr2155-1.0.4-project.zip, 
 Solr2155-for-1.0.2-3.x-port.patch, 
 SOLR-2155_GeoHashPrefixFilter_with_sorting_no_poly.patch, SOLR.2155.p3.patch, 
 SOLR.2155.p3tests.patch


 {panel:title=NOTICE} The status of this issue is a plugin for Solr 3.x 
 located here: https://github.com/dsmiley/SOLR-2155.  Look at the introductory 
 readme and download the plugin .jar file.  Lucene 4's new spatial module is 
 largely based on this code.  The Solr 4 glue for it should come very soon but 
 as of this writing it's hosted temporarily at https://github.com/spatial4j.  
 For more information on using SOLR-2155 with Solr 3, see 
 http://wiki.apache.org/solr/SpatialSearch#SOLR-2155  This JIRA issue is 
 closed because it won't be committed in its current form.
 {panel}
 There currently isn't a solution in Solr for doing geospatial filtering on 
 documents that have a variable number of points.  This scenario occurs when 
 there is location extraction (i.e. via a gazateer) occurring on free text.  
 None, one, or many geospatial locations might be extracted from any given 
 document and users want to limit their search results to those occurring in a 
 user-specified area.
 I've implemented this by furthering the GeoHash based work in Lucene/Solr 
 with a geohash prefix based filter.  A geohash refers to a lat-lon box on the 
 earth.  Each successive character added further subdivides the box into a 4x8 
 (or 8x4 depending on the even/odd length of the geohash) grid.  The first 
 step in this scheme is figuring out which geohash grid squares cover the 
 user's search query.  I've added various extra methods to GeoHashUtils (and 
 added tests) to assist in this purpose.  The next step is an actual Lucene 
 Filter, GeoHashPrefixFilter, that uses these geohash prefixes in 
 TermsEnum.seek() to skip to relevant grid squares in the index.  Once a 
 matching geohash grid is found, the points therein are compared against the 
 user's query to see if it matches.  I created an abstraction GeoShape 
 extended by subclasses named PointDistance... and CartesianBox to support 
 different queried shapes so that the filter need not care about these details.
 This work was presented at LuceneRevolution in Boston on October 8th.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-2155) Geospatial search using geohash prefixes

2012-10-17 Thread Kristopher Davidsohn (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13477656#comment-13477656
 ] 

Kristopher Davidsohn commented on SOLR-2155:


Hi,
I am no longer with the company. Your email has been forwarded to my supervisor 
for attention. If you need immediate assistance, please call CityGrid’s front 
desk at 310-360-4500. Thank you!
Thank you.


 Geospatial search using geohash prefixes
 

 Key: SOLR-2155
 URL: https://issues.apache.org/jira/browse/SOLR-2155
 Project: Solr
  Issue Type: Improvement
Reporter: David Smiley
Assignee: David Smiley
 Attachments: GeoHashPrefixFilter.patch, GeoHashPrefixFilter.patch, 
 GeoHashPrefixFilter.patch, Solr2155-1.0.2-project.zip, 
 Solr2155-1.0.3-project.zip, Solr2155-1.0.4-project.zip, 
 Solr2155-for-1.0.2-3.x-port.patch, 
 SOLR-2155_GeoHashPrefixFilter_with_sorting_no_poly.patch, SOLR.2155.p3.patch, 
 SOLR.2155.p3tests.patch


 {panel:title=NOTICE} The status of this issue is a plugin for Solr 3.x 
 located here: https://github.com/dsmiley/SOLR-2155.  Look at the introductory 
 readme and download the plugin .jar file.  Lucene 4's new spatial module is 
 largely based on this code.  The Solr 4 glue for it should come very soon but 
 as of this writing it's hosted temporarily at https://github.com/spatial4j.  
 For more information on using SOLR-2155 with Solr 3, see 
 http://wiki.apache.org/solr/SpatialSearch#SOLR-2155  This JIRA issue is 
 closed because it won't be committed in its current form.
 {panel}
 There currently isn't a solution in Solr for doing geospatial filtering on 
 documents that have a variable number of points.  This scenario occurs when 
 there is location extraction (i.e. via a gazateer) occurring on free text.  
 None, one, or many geospatial locations might be extracted from any given 
 document and users want to limit their search results to those occurring in a 
 user-specified area.
 I've implemented this by furthering the GeoHash based work in Lucene/Solr 
 with a geohash prefix based filter.  A geohash refers to a lat-lon box on the 
 earth.  Each successive character added further subdivides the box into a 4x8 
 (or 8x4 depending on the even/odd length of the geohash) grid.  The first 
 step in this scheme is figuring out which geohash grid squares cover the 
 user's search query.  I've added various extra methods to GeoHashUtils (and 
 added tests) to assist in this purpose.  The next step is an actual Lucene 
 Filter, GeoHashPrefixFilter, that uses these geohash prefixes in 
 TermsEnum.seek() to skip to relevant grid squares in the index.  Once a 
 matching geohash grid is found, the points therein are compared against the 
 user's query to see if it matches.  I created an abstraction GeoShape 
 extended by subclasses named PointDistance... and CartesianBox to support 
 different queried shapes so that the filter need not care about these details.
 This work was presented at LuceneRevolution in Boston on October 8th.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: [jira] [Commented] (SOLR-2155) Geospatial search using geohash prefixes

2012-10-17 Thread Erick Erickson
Robert:

It's much better to ask usage questions like this on the user's list
rather than in a
comment on a JIRA, the user's list gets seen by a much wider audience. See
the solr-user section here: http://lucene.apache.org/solr/discussion.html

Quick answers:
1 you're better off chunking things up. But beware the deep paging
problem, as
 you go farther and farther into the list, response will slow down.
2 top N is indeed based on score, see the scoring algorithm in the
Similarity (Lucene)
 class. And it's just the top rows documents. Their scores could
be  0.9 or
  0.001, just the top N. If two docs have the exact same
score, the tie is
 broken by the internal Lucene doc ID so the results are consistent.

Why do you want to return all the docs anyway? Some kind of dump?
Perhaps there's
a better way do do what you want.

But again, please move the discussion over to the user's list.

Best
Erick

On Wed, Oct 17, 2012 at 2:20 AM, Robert Tseng (JIRA) j...@apache.org wrote:

 [ 
 https://issues.apache.org/jira/browse/SOLR-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13477653#comment-13477653
  ]

 Robert Tseng commented on SOLR-2155:
 

 Hi All,

 New to Solr here!  I have a question for you all on gh_geofilt.  My document 
 has rows of path, think of KML lineString, of which I want to do bounding box 
 check which of them fall within a box.  Each row basically has an id field 
 and a multivalued field describing the line with multiple points.

 What I want returned is all lines that fall within but I read Solr is not 
 very good, yet, in returning large number of hits.  Hence the row params to 
 limit result to top N rows.  My two questions are:

 1. If want to retrieve all rows, do I query twice from solrj.  Once to get 
 number of hits so I can set the number of rows that grabs all row in a second 
 call? Or two I should chunk up the query call using the start params as an 
 offset?

 2. If it's only returning top N, is it based on score?  What is considered 
 high score? A row with most number of hits in the box? Cloest to the center?

 Geospatial search using geohash prefixes
 

 Key: SOLR-2155
 URL: https://issues.apache.org/jira/browse/SOLR-2155
 Project: Solr
  Issue Type: Improvement
Reporter: David Smiley
Assignee: David Smiley
 Attachments: GeoHashPrefixFilter.patch, GeoHashPrefixFilter.patch, 
 GeoHashPrefixFilter.patch, Solr2155-1.0.2-project.zip, 
 Solr2155-1.0.3-project.zip, Solr2155-1.0.4-project.zip, 
 Solr2155-for-1.0.2-3.x-port.patch, 
 SOLR-2155_GeoHashPrefixFilter_with_sorting_no_poly.patch, 
 SOLR.2155.p3.patch, SOLR.2155.p3tests.patch


 {panel:title=NOTICE} The status of this issue is a plugin for Solr 3.x 
 located here: https://github.com/dsmiley/SOLR-2155.  Look at the 
 introductory readme and download the plugin .jar file.  Lucene 4's new 
 spatial module is largely based on this code.  The Solr 4 glue for it should 
 come very soon but as of this writing it's hosted temporarily at 
 https://github.com/spatial4j.  For more information on using SOLR-2155 with 
 Solr 3, see http://wiki.apache.org/solr/SpatialSearch#SOLR-2155  This JIRA 
 issue is closed because it won't be committed in its current form.
 {panel}
 There currently isn't a solution in Solr for doing geospatial filtering on 
 documents that have a variable number of points.  This scenario occurs when 
 there is location extraction (i.e. via a gazateer) occurring on free text. 
  None, one, or many geospatial locations might be extracted from any given 
 document and users want to limit their search results to those occurring in 
 a user-specified area.
 I've implemented this by furthering the GeoHash based work in Lucene/Solr 
 with a geohash prefix based filter.  A geohash refers to a lat-lon box on 
 the earth.  Each successive character added further subdivides the box into 
 a 4x8 (or 8x4 depending on the even/odd length of the geohash) grid.  The 
 first step in this scheme is figuring out which geohash grid squares cover 
 the user's search query.  I've added various extra methods to GeoHashUtils 
 (and added tests) to assist in this purpose.  The next step is an actual 
 Lucene Filter, GeoHashPrefixFilter, that uses these geohash prefixes in 
 TermsEnum.seek() to skip to relevant grid squares in the index.  Once a 
 matching geohash grid is found, the points therein are compared against the 
 user's query to see if it matches.  I created an abstraction GeoShape 
 extended by subclasses named PointDistance... and CartesianBox to 
 support different queried shapes so that the filter need not care about 
 these details.
 This work was presented at LuceneRevolution in Boston on October 8th.

 --
 This message is automatically generated by JIRA.
 If you think it was sent incorrectly, 

[jira] [Commented] (SOLR-2155) Geospatial search using geohash prefixes

2012-09-17 Thread David Smiley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13456974#comment-13456974
 ] 

David Smiley commented on SOLR-2155:


{panel:title=announcement}
SOLR-3304 was just committed to Solr 4.  If you are using SOLR-2155 in Solr 3, 
then you don't need to worry about patching Solr in Solr 4 or anything to get 
the same functionality (and more).  The field type is 
SpatialRecursivePrefixTreeFieldType which defaults to geospatial use with 
geohashes.  It also has a distErrPct option which specifies the precision of 
the shape which speeds things up some.
{panel}

 Geospatial search using geohash prefixes
 

 Key: SOLR-2155
 URL: https://issues.apache.org/jira/browse/SOLR-2155
 Project: Solr
  Issue Type: Improvement
Reporter: David Smiley
Assignee: David Smiley
 Attachments: GeoHashPrefixFilter.patch, GeoHashPrefixFilter.patch, 
 GeoHashPrefixFilter.patch, Solr2155-1.0.2-project.zip, 
 Solr2155-1.0.3-project.zip, Solr2155-1.0.4-project.zip, 
 Solr2155-for-1.0.2-3.x-port.patch, 
 SOLR-2155_GeoHashPrefixFilter_with_sorting_no_poly.patch, SOLR.2155.p3.patch, 
 SOLR.2155.p3tests.patch


 {panel:title=NOTICE} The status of this issue is a plugin for Solr 3.x 
 located here: https://github.com/dsmiley/SOLR-2155.  Look at the introductory 
 readme and download the plugin .jar file.  Lucene 4's new spatial module is 
 largely based on this code.  The Solr 4 glue for it should come very soon but 
 as of this writing it's hosted temporarily at https://github.com/spatial4j.  
 For more information on using SOLR-2155 with Solr 3, see 
 http://wiki.apache.org/solr/SpatialSearch#SOLR-2155  This JIRA issue is 
 closed because it won't be committed in its current form.
 {panel}
 There currently isn't a solution in Solr for doing geospatial filtering on 
 documents that have a variable number of points.  This scenario occurs when 
 there is location extraction (i.e. via a gazateer) occurring on free text.  
 None, one, or many geospatial locations might be extracted from any given 
 document and users want to limit their search results to those occurring in a 
 user-specified area.
 I've implemented this by furthering the GeoHash based work in Lucene/Solr 
 with a geohash prefix based filter.  A geohash refers to a lat-lon box on the 
 earth.  Each successive character added further subdivides the box into a 4x8 
 (or 8x4 depending on the even/odd length of the geohash) grid.  The first 
 step in this scheme is figuring out which geohash grid squares cover the 
 user's search query.  I've added various extra methods to GeoHashUtils (and 
 added tests) to assist in this purpose.  The next step is an actual Lucene 
 Filter, GeoHashPrefixFilter, that uses these geohash prefixes in 
 TermsEnum.seek() to skip to relevant grid squares in the index.  Once a 
 matching geohash grid is found, the points therein are compared against the 
 user's query to see if it matches.  I created an abstraction GeoShape 
 extended by subclasses named PointDistance... and CartesianBox to support 
 different queried shapes so that the filter need not care about these details.
 This work was presented at LuceneRevolution in Boston on October 8th.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-2155) Geospatial search using geohash prefixes

2012-07-06 Thread Kristopher Davidsohn (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13408186#comment-13408186
 ] 

Kristopher Davidsohn commented on SOLR-2155:


Hi David,
Thanks for the response. Unfortunately I do already have that line in my 
solrconfig, and my index is optimized as well... Also for information's sake 
I'm running solr 3.4

 Geospatial search using geohash prefixes
 

 Key: SOLR-2155
 URL: https://issues.apache.org/jira/browse/SOLR-2155
 Project: Solr
  Issue Type: Improvement
Reporter: David Smiley
Assignee: David Smiley
 Attachments: GeoHashPrefixFilter.patch, GeoHashPrefixFilter.patch, 
 GeoHashPrefixFilter.patch, 
 SOLR-2155_GeoHashPrefixFilter_with_sorting_no_poly.patch, SOLR.2155.p3.patch, 
 SOLR.2155.p3tests.patch, Solr2155-1.0.2-project.zip, 
 Solr2155-1.0.3-project.zip, Solr2155-1.0.4-project.zip, 
 Solr2155-for-1.0.2-3.x-port.patch


 {panel:title=NOTICE} The status of this issue is a plugin for Solr 3.x 
 located here: https://github.com/dsmiley/SOLR-2155.  Look at the introductory 
 readme and download the plugin .jar file.  Lucene 4's new spatial module is 
 largely based on this code.  The Solr 4 glue for it should come very soon but 
 as of this writing it's hosted temporarily at https://github.com/spatial4j.  
 For more information on using SOLR-2155 with Solr 3, see 
 http://wiki.apache.org/solr/SpatialSearch#SOLR-2155  This JIRA issue is 
 closed because it won't be committed in its current form.
 {panel}
 There currently isn't a solution in Solr for doing geospatial filtering on 
 documents that have a variable number of points.  This scenario occurs when 
 there is location extraction (i.e. via a gazateer) occurring on free text.  
 None, one, or many geospatial locations might be extracted from any given 
 document and users want to limit their search results to those occurring in a 
 user-specified area.
 I've implemented this by furthering the GeoHash based work in Lucene/Solr 
 with a geohash prefix based filter.  A geohash refers to a lat-lon box on the 
 earth.  Each successive character added further subdivides the box into a 4x8 
 (or 8x4 depending on the even/odd length of the geohash) grid.  The first 
 step in this scheme is figuring out which geohash grid squares cover the 
 user's search query.  I've added various extra methods to GeoHashUtils (and 
 added tests) to assist in this purpose.  The next step is an actual Lucene 
 Filter, GeoHashPrefixFilter, that uses these geohash prefixes in 
 TermsEnum.seek() to skip to relevant grid squares in the index.  Once a 
 matching geohash grid is found, the points therein are compared against the 
 user's query to see if it matches.  I created an abstraction GeoShape 
 extended by subclasses named PointDistance... and CartesianBox to support 
 different queried shapes so that the filter need not care about these details.
 This work was presented at LuceneRevolution in Boston on October 8th.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-2155) Geospatial search using geohash prefixes

2012-07-05 Thread Kristopher Davidsohn (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13407625#comment-13407625
 ] 

Kristopher Davidsohn commented on SOLR-2155:


Hi, I have the latest version of this patch, and in this version (and prior 
ones I've used as well actually) I could not get the distance sorting to work. 
I have a geohash field set up like so in the schema: 
{code:xml}fieldType name=geohash class=solr2155.solr.schema.GeoHashField 
length=12 / 
field name=locLatLon_hash type=geohash indexed=true stored=true/{code}

and a sample query I am testing: 
{code}q=(locName%3Ario^5.0)sort=geodist()+ascrows=15start=0fq={!bbox}sfield=locLatLon_hashpt=40.2114%2C-111.6980d=80.45qt=standard{code}
 

The results come out in no real discernable order (not distance, and not score) 
and it seems like I may be missing something. Does anyone have any advice or 
ideas on what might be the issue? Or is distance sorting not supported in this 
particular case?

 Geospatial search using geohash prefixes
 

 Key: SOLR-2155
 URL: https://issues.apache.org/jira/browse/SOLR-2155
 Project: Solr
  Issue Type: Improvement
Reporter: David Smiley
Assignee: David Smiley
 Attachments: GeoHashPrefixFilter.patch, GeoHashPrefixFilter.patch, 
 GeoHashPrefixFilter.patch, 
 SOLR-2155_GeoHashPrefixFilter_with_sorting_no_poly.patch, SOLR.2155.p3.patch, 
 SOLR.2155.p3tests.patch, Solr2155-1.0.2-project.zip, 
 Solr2155-1.0.3-project.zip, Solr2155-1.0.4-project.zip, 
 Solr2155-for-1.0.2-3.x-port.patch


 {panel:title=NOTICE} The status of this issue is a plugin for Solr 3.x 
 located here: https://github.com/dsmiley/SOLR-2155.  Look at the introductory 
 readme and download the plugin .jar file.  Lucene 4's new spatial module is 
 largely based on this code.  The Solr 4 glue for it should come very soon but 
 as of this writing it's hosted temporarily at https://github.com/spatial4j.  
 For more information on using SOLR-2155 with Solr 3, see 
 http://wiki.apache.org/solr/SpatialSearch#SOLR-2155  This JIRA issue is 
 closed because it won't be committed in its current form.
 {panel}
 There currently isn't a solution in Solr for doing geospatial filtering on 
 documents that have a variable number of points.  This scenario occurs when 
 there is location extraction (i.e. via a gazateer) occurring on free text.  
 None, one, or many geospatial locations might be extracted from any given 
 document and users want to limit their search results to those occurring in a 
 user-specified area.
 I've implemented this by furthering the GeoHash based work in Lucene/Solr 
 with a geohash prefix based filter.  A geohash refers to a lat-lon box on the 
 earth.  Each successive character added further subdivides the box into a 4x8 
 (or 8x4 depending on the even/odd length of the geohash) grid.  The first 
 step in this scheme is figuring out which geohash grid squares cover the 
 user's search query.  I've added various extra methods to GeoHashUtils (and 
 added tests) to assist in this purpose.  The next step is an actual Lucene 
 Filter, GeoHashPrefixFilter, that uses these geohash prefixes in 
 TermsEnum.seek() to skip to relevant grid squares in the index.  Once a 
 matching geohash grid is found, the points therein are compared against the 
 user's query to see if it matches.  I created an abstraction GeoShape 
 extended by subclasses named PointDistance... and CartesianBox to support 
 different queried shapes so that the filter need not care about these details.
 This work was presented at LuceneRevolution in Boston on October 8th.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-2155) Geospatial search using geohash prefixes

2012-07-05 Thread David Smiley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13407710#comment-13407710
 ] 

David Smiley commented on SOLR-2155:


Kristopher,
Perhaps you didn't make the changes to solrconfig.xml that you need to make, 
namely:
{code:xml}
!-- Optional: replace built-in geodist() with our own modified one for 
multi-valued geo sort --
valueSourceParser name=geodist 
class=solr2155.solr.search.function.distance.HaversineConstFunction$HaversineValueSourceParser
 /
{code}
This is documented on the solr wiki link at the top of this issue.

 Geospatial search using geohash prefixes
 

 Key: SOLR-2155
 URL: https://issues.apache.org/jira/browse/SOLR-2155
 Project: Solr
  Issue Type: Improvement
Reporter: David Smiley
Assignee: David Smiley
 Attachments: GeoHashPrefixFilter.patch, GeoHashPrefixFilter.patch, 
 GeoHashPrefixFilter.patch, 
 SOLR-2155_GeoHashPrefixFilter_with_sorting_no_poly.patch, SOLR.2155.p3.patch, 
 SOLR.2155.p3tests.patch, Solr2155-1.0.2-project.zip, 
 Solr2155-1.0.3-project.zip, Solr2155-1.0.4-project.zip, 
 Solr2155-for-1.0.2-3.x-port.patch


 {panel:title=NOTICE} The status of this issue is a plugin for Solr 3.x 
 located here: https://github.com/dsmiley/SOLR-2155.  Look at the introductory 
 readme and download the plugin .jar file.  Lucene 4's new spatial module is 
 largely based on this code.  The Solr 4 glue for it should come very soon but 
 as of this writing it's hosted temporarily at https://github.com/spatial4j.  
 For more information on using SOLR-2155 with Solr 3, see 
 http://wiki.apache.org/solr/SpatialSearch#SOLR-2155  This JIRA issue is 
 closed because it won't be committed in its current form.
 {panel}
 There currently isn't a solution in Solr for doing geospatial filtering on 
 documents that have a variable number of points.  This scenario occurs when 
 there is location extraction (i.e. via a gazateer) occurring on free text.  
 None, one, or many geospatial locations might be extracted from any given 
 document and users want to limit their search results to those occurring in a 
 user-specified area.
 I've implemented this by furthering the GeoHash based work in Lucene/Solr 
 with a geohash prefix based filter.  A geohash refers to a lat-lon box on the 
 earth.  Each successive character added further subdivides the box into a 4x8 
 (or 8x4 depending on the even/odd length of the geohash) grid.  The first 
 step in this scheme is figuring out which geohash grid squares cover the 
 user's search query.  I've added various extra methods to GeoHashUtils (and 
 added tests) to assist in this purpose.  The next step is an actual Lucene 
 Filter, GeoHashPrefixFilter, that uses these geohash prefixes in 
 TermsEnum.seek() to skip to relevant grid squares in the index.  Once a 
 matching geohash grid is found, the points therein are compared against the 
 user's query to see if it matches.  I created an abstraction GeoShape 
 extended by subclasses named PointDistance... and CartesianBox to support 
 different queried shapes so that the filter need not care about these details.
 This work was presented at LuceneRevolution in Boston on October 8th.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-2155) Geospatial search using geohash prefixes

2012-05-21 Thread Alexander Kanarsky (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13280608#comment-13280608
 ] 

Alexander Kanarsky commented on SOLR-2155:
--

Oleg, if you're talking about the ssplex thing, it is simple but stable. You 
can see how it works on our site (look for custom area search icon, right top 
corner on Map panel, for example http://www.trulia.com/for_sale/Fremont,CA/My 
Custom 
Area__cr_dFpvpgVae%40i]ne%40sjAg%40}H}XvBoFwhApXuAvgAah%40vQnhAmI`G`Jn`%40ehAbyA_sp)
Please let me know if you have any questions about it.

 Geospatial search using geohash prefixes
 

 Key: SOLR-2155
 URL: https://issues.apache.org/jira/browse/SOLR-2155
 Project: Solr
  Issue Type: Improvement
Reporter: David Smiley
Assignee: David Smiley
 Attachments: GeoHashPrefixFilter.patch, GeoHashPrefixFilter.patch, 
 GeoHashPrefixFilter.patch, 
 SOLR-2155_GeoHashPrefixFilter_with_sorting_no_poly.patch, SOLR.2155.p3.patch, 
 SOLR.2155.p3tests.patch, Solr2155-1.0.2-project.zip, 
 Solr2155-1.0.3-project.zip, Solr2155-1.0.4-project.zip, 
 Solr2155-for-1.0.2-3.x-port.patch


 {panel:title=NOTICE} The status of this issue is a plugin for Solr 3.x 
 located here: https://github.com/dsmiley/SOLR-2155.  Look at the introductory 
 readme and download the plugin .jar file.  Lucene 4's new spatial module is 
 largely based on this code.  The Solr 4 glue for it should come very soon but 
 as of this writing it's hosted temporarily at https://github.com/spatial4j.  
 For more information on using SOLR-2155 with Solr 3, see 
 http://wiki.apache.org/solr/SpatialSearch#SOLR-2155  This JIRA issue is 
 closed because it won't be committed in its current form.
 {panel}
 There currently isn't a solution in Solr for doing geospatial filtering on 
 documents that have a variable number of points.  This scenario occurs when 
 there is location extraction (i.e. via a gazateer) occurring on free text.  
 None, one, or many geospatial locations might be extracted from any given 
 document and users want to limit their search results to those occurring in a 
 user-specified area.
 I've implemented this by furthering the GeoHash based work in Lucene/Solr 
 with a geohash prefix based filter.  A geohash refers to a lat-lon box on the 
 earth.  Each successive character added further subdivides the box into a 4x8 
 (or 8x4 depending on the even/odd length of the geohash) grid.  The first 
 step in this scheme is figuring out which geohash grid squares cover the 
 user's search query.  I've added various extra methods to GeoHashUtils (and 
 added tests) to assist in this purpose.  The next step is an actual Lucene 
 Filter, GeoHashPrefixFilter, that uses these geohash prefixes in 
 TermsEnum.seek() to skip to relevant grid squares in the index.  Once a 
 matching geohash grid is found, the points therein are compared against the 
 user's query to see if it matches.  I created an abstraction GeoShape 
 extended by subclasses named PointDistance... and CartesianBox to support 
 different queried shapes so that the filter need not care about these details.
 This work was presented at LuceneRevolution in Boston on October 8th.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-2155) Geospatial search using geohash prefixes

2012-05-14 Thread Oleg Shevelyov (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13274546#comment-13274546
 ] 

Oleg Shevelyov commented on SOLR-2155:
--

Hi David! Thanks for your quick response. At the moment I decided to go for 
JTeam's Spatial Solr Plugin (SSP 2.0) which has a patch with polygon search. 
The project looks well-documented and it seems faster to apply their stuff than 
modify 2155 sources. If tests show it doesn't work well, I'll get back to 2155 
and let you know. Thank you anyway.

 Geospatial search using geohash prefixes
 

 Key: SOLR-2155
 URL: https://issues.apache.org/jira/browse/SOLR-2155
 Project: Solr
  Issue Type: Improvement
Reporter: David Smiley
Assignee: David Smiley
 Attachments: GeoHashPrefixFilter.patch, GeoHashPrefixFilter.patch, 
 GeoHashPrefixFilter.patch, 
 SOLR-2155_GeoHashPrefixFilter_with_sorting_no_poly.patch, SOLR.2155.p3.patch, 
 SOLR.2155.p3tests.patch, Solr2155-1.0.2-project.zip, 
 Solr2155-1.0.3-project.zip, Solr2155-1.0.4-project.zip, 
 Solr2155-for-1.0.2-3.x-port.patch


 {panel:title=NOTICE} The status of this issue is a plugin for Solr 3.x 
 located here: https://github.com/dsmiley/SOLR-2155.  Look at the introductory 
 readme and download the plugin .jar file.  Lucene 4's new spatial module is 
 largely based on this code.  The Solr 4 glue for it should come very soon but 
 as of this writing it's hosted temporarily at https://github.com/spatial4j.  
 For more information on using SOLR-2155 with Solr 3, see 
 http://wiki.apache.org/solr/SpatialSearch#SOLR-2155  This JIRA issue is 
 closed because it won't be committed in its current form.
 {panel}
 There currently isn't a solution in Solr for doing geospatial filtering on 
 documents that have a variable number of points.  This scenario occurs when 
 there is location extraction (i.e. via a gazateer) occurring on free text.  
 None, one, or many geospatial locations might be extracted from any given 
 document and users want to limit their search results to those occurring in a 
 user-specified area.
 I've implemented this by furthering the GeoHash based work in Lucene/Solr 
 with a geohash prefix based filter.  A geohash refers to a lat-lon box on the 
 earth.  Each successive character added further subdivides the box into a 4x8 
 (or 8x4 depending on the even/odd length of the geohash) grid.  The first 
 step in this scheme is figuring out which geohash grid squares cover the 
 user's search query.  I've added various extra methods to GeoHashUtils (and 
 added tests) to assist in this purpose.  The next step is an actual Lucene 
 Filter, GeoHashPrefixFilter, that uses these geohash prefixes in 
 TermsEnum.seek() to skip to relevant grid squares in the index.  Once a 
 matching geohash grid is found, the points therein are compared against the 
 user's query to see if it matches.  I created an abstraction GeoShape 
 extended by subclasses named PointDistance... and CartesianBox to support 
 different queried shapes so that the filter need not care about these details.
 This work was presented at LuceneRevolution in Boston on October 8th.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-2155) Geospatial search using geohash prefixes

2012-05-11 Thread Oleg Shevelyov (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13273417#comment-13273417
 ] 

Oleg Shevelyov commented on SOLR-2155:
--

Hi David, does new 1.0.5 version include polygon search? If not, please, could 
you clarify where to apply GeoHashPrefixFilter patch? It doesn't match solr 3.1 
sources, and obviously higher versions as well. I saw you mentioned that you 
successfully implemented polygon search but I still don't get how to make it 
work. Thanks

 Geospatial search using geohash prefixes
 

 Key: SOLR-2155
 URL: https://issues.apache.org/jira/browse/SOLR-2155
 Project: Solr
  Issue Type: Improvement
Reporter: David Smiley
Assignee: David Smiley
 Attachments: GeoHashPrefixFilter.patch, GeoHashPrefixFilter.patch, 
 GeoHashPrefixFilter.patch, 
 SOLR-2155_GeoHashPrefixFilter_with_sorting_no_poly.patch, SOLR.2155.p3.patch, 
 SOLR.2155.p3tests.patch, Solr2155-1.0.2-project.zip, 
 Solr2155-1.0.3-project.zip, Solr2155-1.0.4-project.zip, 
 Solr2155-for-1.0.2-3.x-port.patch


 {panel:title=NOTICE} The status of this issue is a plugin for Solr 3.x 
 located here: https://github.com/dsmiley/SOLR-2155.  Look at the introductory 
 readme and download the plugin .jar file.  Lucene 4's new spatial module is 
 largely based on this code.  The Solr 4 glue for it should come very soon but 
 as of this writing it's hosted temporarily at https://github.com/spatial4j.  
 For more information on using SOLR-2155 with Solr 3, see 
 http://wiki.apache.org/solr/SpatialSearch#SOLR-2155  This JIRA issue is 
 closed because it won't be committed in its current form.
 {panel}
 There currently isn't a solution in Solr for doing geospatial filtering on 
 documents that have a variable number of points.  This scenario occurs when 
 there is location extraction (i.e. via a gazateer) occurring on free text.  
 None, one, or many geospatial locations might be extracted from any given 
 document and users want to limit their search results to those occurring in a 
 user-specified area.
 I've implemented this by furthering the GeoHash based work in Lucene/Solr 
 with a geohash prefix based filter.  A geohash refers to a lat-lon box on the 
 earth.  Each successive character added further subdivides the box into a 4x8 
 (or 8x4 depending on the even/odd length of the geohash) grid.  The first 
 step in this scheme is figuring out which geohash grid squares cover the 
 user's search query.  I've added various extra methods to GeoHashUtils (and 
 added tests) to assist in this purpose.  The next step is an actual Lucene 
 Filter, GeoHashPrefixFilter, that uses these geohash prefixes in 
 TermsEnum.seek() to skip to relevant grid squares in the index.  Once a 
 matching geohash grid is found, the points therein are compared against the 
 user's query to see if it matches.  I created an abstraction GeoShape 
 extended by subclasses named PointDistance... and CartesianBox to support 
 different queried shapes so that the filter need not care about these details.
 This work was presented at LuceneRevolution in Boston on October 8th.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-2155) Geospatial search using geohash prefixes

2012-05-11 Thread David Smiley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13273540#comment-13273540
 ] 

David Smiley commented on SOLR-2155:


Hi Oleg.  No, it was stripped out a long while ago.  But come to think of it, 
now that this issue isn't going to get committed and is also hosted somewhere 
outside Apache (it's on GitHub), I can re-introduce the polygon support that 
was formerly there.  It's not a priority for me right now but if you find the 
last .patch file on this issue that includes the JTS support (which in some 
comment above I mentioned stripping it out so you could grab the version prior 
to that), then you could resurrect it.  There was just one source file, plus a 
small hook into my query parser above.  JTS did all the work, really.  If you 
want to try and bring it back, then do so and send me a pull-request on github. 
 All said and done, it's a very small amount of work; the integration was done 
it just needs to be brought back.

 Geospatial search using geohash prefixes
 

 Key: SOLR-2155
 URL: https://issues.apache.org/jira/browse/SOLR-2155
 Project: Solr
  Issue Type: Improvement
Reporter: David Smiley
Assignee: David Smiley
 Attachments: GeoHashPrefixFilter.patch, GeoHashPrefixFilter.patch, 
 GeoHashPrefixFilter.patch, 
 SOLR-2155_GeoHashPrefixFilter_with_sorting_no_poly.patch, SOLR.2155.p3.patch, 
 SOLR.2155.p3tests.patch, Solr2155-1.0.2-project.zip, 
 Solr2155-1.0.3-project.zip, Solr2155-1.0.4-project.zip, 
 Solr2155-for-1.0.2-3.x-port.patch


 {panel:title=NOTICE} The status of this issue is a plugin for Solr 3.x 
 located here: https://github.com/dsmiley/SOLR-2155.  Look at the introductory 
 readme and download the plugin .jar file.  Lucene 4's new spatial module is 
 largely based on this code.  The Solr 4 glue for it should come very soon but 
 as of this writing it's hosted temporarily at https://github.com/spatial4j.  
 For more information on using SOLR-2155 with Solr 3, see 
 http://wiki.apache.org/solr/SpatialSearch#SOLR-2155  This JIRA issue is 
 closed because it won't be committed in its current form.
 {panel}
 There currently isn't a solution in Solr for doing geospatial filtering on 
 documents that have a variable number of points.  This scenario occurs when 
 there is location extraction (i.e. via a gazateer) occurring on free text.  
 None, one, or many geospatial locations might be extracted from any given 
 document and users want to limit their search results to those occurring in a 
 user-specified area.
 I've implemented this by furthering the GeoHash based work in Lucene/Solr 
 with a geohash prefix based filter.  A geohash refers to a lat-lon box on the 
 earth.  Each successive character added further subdivides the box into a 4x8 
 (or 8x4 depending on the even/odd length of the geohash) grid.  The first 
 step in this scheme is figuring out which geohash grid squares cover the 
 user's search query.  I've added various extra methods to GeoHashUtils (and 
 added tests) to assist in this purpose.  The next step is an actual Lucene 
 Filter, GeoHashPrefixFilter, that uses these geohash prefixes in 
 TermsEnum.seek() to skip to relevant grid squares in the index.  Once a 
 matching geohash grid is found, the points therein are compared against the 
 user's query to see if it matches.  I created an abstraction GeoShape 
 extended by subclasses named PointDistance... and CartesianBox to support 
 different queried shapes so that the filter need not care about these details.
 This work was presented at LuceneRevolution in Boston on October 8th.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-2155) Geospatial search using geohash prefixes

2012-04-18 Thread David Smiley (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13257212#comment-13257212
 ] 

David Smiley commented on SOLR-2155:


There is a new version, v1.0.5, available on my [GitHub 
repo|https://github.com/dsmiley/SOLR-2155/downloads].  Changelog:
 * Fixed bug affecting sorting by distance when the index was not in an 
optimized state.
 * Norms are omitted automatically now; they aren't used.

The bug Bill reported is fairly serious and affects anyone doing sorting by 
this field when the index isn't optimized.

 Geospatial search using geohash prefixes
 

 Key: SOLR-2155
 URL: https://issues.apache.org/jira/browse/SOLR-2155
 Project: Solr
  Issue Type: Improvement
Reporter: David Smiley
Assignee: David Smiley
 Attachments: GeoHashPrefixFilter.patch, GeoHashPrefixFilter.patch, 
 GeoHashPrefixFilter.patch, 
 SOLR-2155_GeoHashPrefixFilter_with_sorting_no_poly.patch, SOLR.2155.p3.patch, 
 SOLR.2155.p3tests.patch, Solr2155-1.0.2-project.zip, 
 Solr2155-1.0.3-project.zip, Solr2155-1.0.4-project.zip, 
 Solr2155-for-1.0.2-3.x-port.patch


 {panel:title=NOTICE} The status of this issue is a plugin for Solr 3.x 
 located here: https://github.com/dsmiley/SOLR-2155.  Look at the introductory 
 readme and download the plugin .jar file.  Lucene 4's new spatial module is 
 largely based on this code.  The Solr 4 glue for it should come very soon but 
 as of this writing it's hosted temporarily at https://github.com/spatial4j.  
 For more information on using SOLR-2155 with Solr 3, see 
 http://wiki.apache.org/solr/SpatialSearch#SOLR-2155  This JIRA issue is 
 closed because it won't be committed in its current form.
 {panel}
 There currently isn't a solution in Solr for doing geospatial filtering on 
 documents that have a variable number of points.  This scenario occurs when 
 there is location extraction (i.e. via a gazateer) occurring on free text.  
 None, one, or many geospatial locations might be extracted from any given 
 document and users want to limit their search results to those occurring in a 
 user-specified area.
 I've implemented this by furthering the GeoHash based work in Lucene/Solr 
 with a geohash prefix based filter.  A geohash refers to a lat-lon box on the 
 earth.  Each successive character added further subdivides the box into a 4x8 
 (or 8x4 depending on the even/odd length of the geohash) grid.  The first 
 step in this scheme is figuring out which geohash grid squares cover the 
 user's search query.  I've added various extra methods to GeoHashUtils (and 
 added tests) to assist in this purpose.  The next step is an actual Lucene 
 Filter, GeoHashPrefixFilter, that uses these geohash prefixes in 
 TermsEnum.seek() to skip to relevant grid squares in the index.  Once a 
 matching geohash grid is found, the points therein are compared against the 
 user's query to see if it matches.  I created an abstraction GeoShape 
 extended by subclasses named PointDistance... and CartesianBox to support 
 different queried shapes so that the filter need not care about these details.
 This work was presented at LuceneRevolution in Boston on October 8th.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-2155) Geospatial search using geohash prefixes

2012-04-15 Thread Bill Bell (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13254357#comment-13254357
 ] 

Bill Bell commented on SOLR-2155:
-

Could it be if they only have 1 entry in the store_geohash field?

 Geospatial search using geohash prefixes
 

 Key: SOLR-2155
 URL: https://issues.apache.org/jira/browse/SOLR-2155
 Project: Solr
  Issue Type: Improvement
Reporter: David Smiley
 Attachments: GeoHashPrefixFilter.patch, GeoHashPrefixFilter.patch, 
 GeoHashPrefixFilter.patch, 
 SOLR-2155_GeoHashPrefixFilter_with_sorting_no_poly.patch, SOLR.2155.p3.patch, 
 SOLR.2155.p3tests.patch, Solr2155-1.0.2-project.zip, 
 Solr2155-1.0.3-project.zip, Solr2155-1.0.4-project.zip, 
 Solr2155-for-1.0.2-3.x-port.patch


 {panel:title=NOTICE} There is a .zip attached to the issue with a .jar you 
 can drop in to Solr 3.x.  Lucene 4's new spatial module is largely based on 
 this code.  The Solr 4 glue for it should come very soon but as of this 
 writing it's hosted temporarily at Spatial4j.com.  For more information on 
 using SOLR-2155 with Solr 3, see 
 http://wiki.apache.org/solr/SpatialSearch#SOLR-2155  This JIRA issue is 
 closed because it won't be committed in its current form.
 {panel}
 There currently isn't a solution in Solr for doing geospatial filtering on 
 documents that have a variable number of points.  This scenario occurs when 
 there is location extraction (i.e. via a gazateer) occurring on free text.  
 None, one, or many geospatial locations might be extracted from any given 
 document and users want to limit their search results to those occurring in a 
 user-specified area.
 I've implemented this by furthering the GeoHash based work in Lucene/Solr 
 with a geohash prefix based filter.  A geohash refers to a lat-lon box on the 
 earth.  Each successive character added further subdivides the box into a 4x8 
 (or 8x4 depending on the even/odd length of the geohash) grid.  The first 
 step in this scheme is figuring out which geohash grid squares cover the 
 user's search query.  I've added various extra methods to GeoHashUtils (and 
 added tests) to assist in this purpose.  The next step is an actual Lucene 
 Filter, GeoHashPrefixFilter, that uses these geohash prefixes in 
 TermsEnum.seek() to skip to relevant grid squares in the index.  Once a 
 matching geohash grid is found, the points therein are compared against the 
 user's query to see if it matches.  I created an abstraction GeoShape 
 extended by subclasses named PointDistance... and CartesianBox to support 
 different queried shapes so that the filter need not care about these details.
 This work was presented at LuceneRevolution in Boston on October 8th.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-2155) Geospatial search using geohash prefixes

2012-04-15 Thread Bill Bell (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13254356#comment-13254356
 ] 

Bill Bell commented on SOLR-2155:
-

I tried SOLR 2155 1.0.3 and the same results.

I am able to simplify the issue with a simple query:

{code}
http://localhost:8983/solr/citystateprovider/select?fl=score,display_name,store_lat_lon,store_geohash,city_stateq.alt=*:*d=100pt=39.740112,-104.984856defType=dismaxrows=100echoParams=allsfield=store_geohashsort=geodisth%28%29%20asc;
{code}

AND the one that works:

{code}

http://localhost:8983/solr/citystateprovider/select?fl=score,display_name,store_lat_lon,store_geohash,city_stateq.alt=*:*pt=39.740112,-104.984856defType=dismaxrows=100echoParams=allsfield=store_lat_lonsort=geodist%28%29%20asc;

{code}



 Geospatial search using geohash prefixes
 

 Key: SOLR-2155
 URL: https://issues.apache.org/jira/browse/SOLR-2155
 Project: Solr
  Issue Type: Improvement
Reporter: David Smiley
 Attachments: GeoHashPrefixFilter.patch, GeoHashPrefixFilter.patch, 
 GeoHashPrefixFilter.patch, 
 SOLR-2155_GeoHashPrefixFilter_with_sorting_no_poly.patch, SOLR.2155.p3.patch, 
 SOLR.2155.p3tests.patch, Solr2155-1.0.2-project.zip, 
 Solr2155-1.0.3-project.zip, Solr2155-1.0.4-project.zip, 
 Solr2155-for-1.0.2-3.x-port.patch


 {panel:title=NOTICE} There is a .zip attached to the issue with a .jar you 
 can drop in to Solr 3.x.  Lucene 4's new spatial module is largely based on 
 this code.  The Solr 4 glue for it should come very soon but as of this 
 writing it's hosted temporarily at Spatial4j.com.  For more information on 
 using SOLR-2155 with Solr 3, see 
 http://wiki.apache.org/solr/SpatialSearch#SOLR-2155  This JIRA issue is 
 closed because it won't be committed in its current form.
 {panel}
 There currently isn't a solution in Solr for doing geospatial filtering on 
 documents that have a variable number of points.  This scenario occurs when 
 there is location extraction (i.e. via a gazateer) occurring on free text.  
 None, one, or many geospatial locations might be extracted from any given 
 document and users want to limit their search results to those occurring in a 
 user-specified area.
 I've implemented this by furthering the GeoHash based work in Lucene/Solr 
 with a geohash prefix based filter.  A geohash refers to a lat-lon box on the 
 earth.  Each successive character added further subdivides the box into a 4x8 
 (or 8x4 depending on the even/odd length of the geohash) grid.  The first 
 step in this scheme is figuring out which geohash grid squares cover the 
 user's search query.  I've added various extra methods to GeoHashUtils (and 
 added tests) to assist in this purpose.  The next step is an actual Lucene 
 Filter, GeoHashPrefixFilter, that uses these geohash prefixes in 
 TermsEnum.seek() to skip to relevant grid squares in the index.  Once a 
 matching geohash grid is found, the points therein are compared against the 
 user's query to see if it matches.  I created an abstraction GeoShape 
 extended by subclasses named PointDistance... and CartesianBox to support 
 different queried shapes so that the filter need not care about these details.
 This work was presented at LuceneRevolution in Boston on October 8th.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-2155) Geospatial search using geohash prefixes

2012-04-15 Thread Bill Bell (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13254359#comment-13254359
 ] 

Bill Bell commented on SOLR-2155:
-

Ohh. This is interesting, if I switch to sort using the new geodisth() for the 
sfield=store_lat_lon fields I get the same results as what I use 
sfield=store_geohash (the wrong ones).



 Geospatial search using geohash prefixes
 

 Key: SOLR-2155
 URL: https://issues.apache.org/jira/browse/SOLR-2155
 Project: Solr
  Issue Type: Improvement
Reporter: David Smiley
 Attachments: GeoHashPrefixFilter.patch, GeoHashPrefixFilter.patch, 
 GeoHashPrefixFilter.patch, 
 SOLR-2155_GeoHashPrefixFilter_with_sorting_no_poly.patch, SOLR.2155.p3.patch, 
 SOLR.2155.p3tests.patch, Solr2155-1.0.2-project.zip, 
 Solr2155-1.0.3-project.zip, Solr2155-1.0.4-project.zip, 
 Solr2155-for-1.0.2-3.x-port.patch


 {panel:title=NOTICE} There is a .zip attached to the issue with a .jar you 
 can drop in to Solr 3.x.  Lucene 4's new spatial module is largely based on 
 this code.  The Solr 4 glue for it should come very soon but as of this 
 writing it's hosted temporarily at Spatial4j.com.  For more information on 
 using SOLR-2155 with Solr 3, see 
 http://wiki.apache.org/solr/SpatialSearch#SOLR-2155  This JIRA issue is 
 closed because it won't be committed in its current form.
 {panel}
 There currently isn't a solution in Solr for doing geospatial filtering on 
 documents that have a variable number of points.  This scenario occurs when 
 there is location extraction (i.e. via a gazateer) occurring on free text.  
 None, one, or many geospatial locations might be extracted from any given 
 document and users want to limit their search results to those occurring in a 
 user-specified area.
 I've implemented this by furthering the GeoHash based work in Lucene/Solr 
 with a geohash prefix based filter.  A geohash refers to a lat-lon box on the 
 earth.  Each successive character added further subdivides the box into a 4x8 
 (or 8x4 depending on the even/odd length of the geohash) grid.  The first 
 step in this scheme is figuring out which geohash grid squares cover the 
 user's search query.  I've added various extra methods to GeoHashUtils (and 
 added tests) to assist in this purpose.  The next step is an actual Lucene 
 Filter, GeoHashPrefixFilter, that uses these geohash prefixes in 
 TermsEnum.seek() to skip to relevant grid squares in the index.  Once a 
 matching geohash grid is found, the points therein are compared against the 
 user's query to see if it matches.  I created an abstraction GeoShape 
 extended by subclasses named PointDistance... and CartesianBox to support 
 different queried shapes so that the filter need not care about these details.
 This work was presented at LuceneRevolution in Boston on October 8th.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-2155) Geospatial search using geohash prefixes

2012-04-15 Thread Bill Bell (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13254392#comment-13254392
 ] 

Bill Bell commented on SOLR-2155:
-

The distance is calculated properly, but the sort is not working. Probably due 
to the wrong function being called.

You need to add a test case to check the sort order coming back from Solr.







 Geospatial search using geohash prefixes
 

 Key: SOLR-2155
 URL: https://issues.apache.org/jira/browse/SOLR-2155
 Project: Solr
  Issue Type: Improvement
Reporter: David Smiley
 Attachments: GeoHashPrefixFilter.patch, GeoHashPrefixFilter.patch, 
 GeoHashPrefixFilter.patch, 
 SOLR-2155_GeoHashPrefixFilter_with_sorting_no_poly.patch, SOLR.2155.p3.patch, 
 SOLR.2155.p3tests.patch, Solr2155-1.0.2-project.zip, 
 Solr2155-1.0.3-project.zip, Solr2155-1.0.4-project.zip, 
 Solr2155-for-1.0.2-3.x-port.patch


 {panel:title=NOTICE} There is a .zip attached to the issue with a .jar you 
 can drop in to Solr 3.x.  Lucene 4's new spatial module is largely based on 
 this code.  The Solr 4 glue for it should come very soon but as of this 
 writing it's hosted temporarily at Spatial4j.com.  For more information on 
 using SOLR-2155 with Solr 3, see 
 http://wiki.apache.org/solr/SpatialSearch#SOLR-2155  This JIRA issue is 
 closed because it won't be committed in its current form.
 {panel}
 There currently isn't a solution in Solr for doing geospatial filtering on 
 documents that have a variable number of points.  This scenario occurs when 
 there is location extraction (i.e. via a gazateer) occurring on free text.  
 None, one, or many geospatial locations might be extracted from any given 
 document and users want to limit their search results to those occurring in a 
 user-specified area.
 I've implemented this by furthering the GeoHash based work in Lucene/Solr 
 with a geohash prefix based filter.  A geohash refers to a lat-lon box on the 
 earth.  Each successive character added further subdivides the box into a 4x8 
 (or 8x4 depending on the even/odd length of the geohash) grid.  The first 
 step in this scheme is figuring out which geohash grid squares cover the 
 user's search query.  I've added various extra methods to GeoHashUtils (and 
 added tests) to assist in this purpose.  The next step is an actual Lucene 
 Filter, GeoHashPrefixFilter, that uses these geohash prefixes in 
 TermsEnum.seek() to skip to relevant grid squares in the index.  Once a 
 matching geohash grid is found, the points therein are compared against the 
 user's query to see if it matches.  I created an abstraction GeoShape 
 extended by subclasses named PointDistance... and CartesianBox to support 
 different queried shapes so that the filter need not care about these details.
 This work was presented at LuceneRevolution in Boston on October 8th.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-2155) Geospatial search using geohash prefixes

2012-04-15 Thread Bill Bell (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13254398#comment-13254398
 ] 

Bill Bell commented on SOLR-2155:
-

sort is definitely not working in all cases using geohash field.

I am going to compare with other sorts... Maybe we need to extend another class?


 Geospatial search using geohash prefixes
 

 Key: SOLR-2155
 URL: https://issues.apache.org/jira/browse/SOLR-2155
 Project: Solr
  Issue Type: Improvement
Reporter: David Smiley
 Attachments: GeoHashPrefixFilter.patch, GeoHashPrefixFilter.patch, 
 GeoHashPrefixFilter.patch, 
 SOLR-2155_GeoHashPrefixFilter_with_sorting_no_poly.patch, SOLR.2155.p3.patch, 
 SOLR.2155.p3tests.patch, Solr2155-1.0.2-project.zip, 
 Solr2155-1.0.3-project.zip, Solr2155-1.0.4-project.zip, 
 Solr2155-for-1.0.2-3.x-port.patch


 {panel:title=NOTICE} There is a .zip attached to the issue with a .jar you 
 can drop in to Solr 3.x.  Lucene 4's new spatial module is largely based on 
 this code.  The Solr 4 glue for it should come very soon but as of this 
 writing it's hosted temporarily at Spatial4j.com.  For more information on 
 using SOLR-2155 with Solr 3, see 
 http://wiki.apache.org/solr/SpatialSearch#SOLR-2155  This JIRA issue is 
 closed because it won't be committed in its current form.
 {panel}
 There currently isn't a solution in Solr for doing geospatial filtering on 
 documents that have a variable number of points.  This scenario occurs when 
 there is location extraction (i.e. via a gazateer) occurring on free text.  
 None, one, or many geospatial locations might be extracted from any given 
 document and users want to limit their search results to those occurring in a 
 user-specified area.
 I've implemented this by furthering the GeoHash based work in Lucene/Solr 
 with a geohash prefix based filter.  A geohash refers to a lat-lon box on the 
 earth.  Each successive character added further subdivides the box into a 4x8 
 (or 8x4 depending on the even/odd length of the geohash) grid.  The first 
 step in this scheme is figuring out which geohash grid squares cover the 
 user's search query.  I've added various extra methods to GeoHashUtils (and 
 added tests) to assist in this purpose.  The next step is an actual Lucene 
 Filter, GeoHashPrefixFilter, that uses these geohash prefixes in 
 TermsEnum.seek() to skip to relevant grid squares in the index.  Once a 
 matching geohash grid is found, the points therein are compared against the 
 user's query to see if it matches.  I created an abstraction GeoShape 
 extended by subclasses named PointDistance... and CartesianBox to support 
 different queried shapes so that the filter need not care about these details.
 This work was presented at LuceneRevolution in Boston on October 8th.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-2155) Geospatial search using geohash prefixes

2012-04-14 Thread David Smiley (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13254269#comment-13254269
 ] 

David Smiley commented on SOLR-2155:


Weird. Is this reproducible using an index of just these 4 documents (all geo 
single-valued, apparently) with your query here?

 Geospatial search using geohash prefixes
 

 Key: SOLR-2155
 URL: https://issues.apache.org/jira/browse/SOLR-2155
 Project: Solr
  Issue Type: Improvement
Reporter: David Smiley
 Attachments: GeoHashPrefixFilter.patch, GeoHashPrefixFilter.patch, 
 GeoHashPrefixFilter.patch, 
 SOLR-2155_GeoHashPrefixFilter_with_sorting_no_poly.patch, SOLR.2155.p3.patch, 
 SOLR.2155.p3tests.patch, Solr2155-1.0.2-project.zip, 
 Solr2155-1.0.3-project.zip, Solr2155-1.0.4-project.zip, 
 Solr2155-for-1.0.2-3.x-port.patch


 {panel:title=NOTICE} There is a .zip attached to the issue with a .jar you 
 can drop in to Solr 3.x.  Lucene 4's new spatial module is largely based on 
 this code.  The Solr 4 glue for it should come very soon but as of this 
 writing it's hosted temporarily at Spatial4j.com.  For more information on 
 using SOLR-2155 with Solr 3, see 
 http://wiki.apache.org/solr/SpatialSearch#SOLR-2155  This JIRA issue is 
 closed because it won't be committed in its current form.
 {panel}
 There currently isn't a solution in Solr for doing geospatial filtering on 
 documents that have a variable number of points.  This scenario occurs when 
 there is location extraction (i.e. via a gazateer) occurring on free text.  
 None, one, or many geospatial locations might be extracted from any given 
 document and users want to limit their search results to those occurring in a 
 user-specified area.
 I've implemented this by furthering the GeoHash based work in Lucene/Solr 
 with a geohash prefix based filter.  A geohash refers to a lat-lon box on the 
 earth.  Each successive character added further subdivides the box into a 4x8 
 (or 8x4 depending on the even/odd length of the geohash) grid.  The first 
 step in this scheme is figuring out which geohash grid squares cover the 
 user's search query.  I've added various extra methods to GeoHashUtils (and 
 added tests) to assist in this purpose.  The next step is an actual Lucene 
 Filter, GeoHashPrefixFilter, that uses these geohash prefixes in 
 TermsEnum.seek() to skip to relevant grid squares in the index.  Once a 
 matching geohash grid is found, the points therein are compared against the 
 user's query to see if it matches.  I created an abstraction GeoShape 
 extended by subclasses named PointDistance... and CartesianBox to support 
 different queried shapes so that the filter need not care about these details.
 This work was presented at LuceneRevolution in Boston on October 8th.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-2155) Geospatial search using geohash prefixes

2012-03-29 Thread Harley Parks (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13242025#comment-13242025
 ] 

Harley Parks commented on SOLR-2155:


Thanks David. so, i will be sure to add the custom cache.

running into some issues when using a range query on a multiValue GeoHash. 

sometimes the values are outside of the range provided.

is that expected? and if so, why?
Thanks


 Geospatial search using geohash prefixes
 

 Key: SOLR-2155
 URL: https://issues.apache.org/jira/browse/SOLR-2155
 Project: Solr
  Issue Type: Improvement
Reporter: David Smiley
 Attachments: GeoHashPrefixFilter.patch, GeoHashPrefixFilter.patch, 
 GeoHashPrefixFilter.patch, 
 SOLR-2155_GeoHashPrefixFilter_with_sorting_no_poly.patch, SOLR.2155.p3.patch, 
 SOLR.2155.p3tests.patch, Solr2155-1.0.2-project.zip, 
 Solr2155-1.0.3-project.zip, Solr2155-1.0.4-project.zip, 
 Solr2155-for-1.0.2-3.x-port.patch


 There currently isn't a solution in Solr for doing geospatial filtering on 
 documents that have a variable number of points.  This scenario occurs when 
 there is location extraction (i.e. via a gazateer) occurring on free text.  
 None, one, or many geospatial locations might be extracted from any given 
 document and users want to limit their search results to those occurring in a 
 user-specified area.
 I've implemented this by furthering the GeoHash based work in Lucene/Solr 
 with a geohash prefix based filter.  A geohash refers to a lat-lon box on the 
 earth.  Each successive character added further subdivides the box into a 4x8 
 (or 8x4 depending on the even/odd length of the geohash) grid.  The first 
 step in this scheme is figuring out which geohash grid squares cover the 
 user's search query.  I've added various extra methods to GeoHashUtils (and 
 added tests) to assist in this purpose.  The next step is an actual Lucene 
 Filter, GeoHashPrefixFilter, that uses these geohash prefixes in 
 TermsEnum.seek() to skip to relevant grid squares in the index.  Once a 
 matching geohash grid is found, the points therein are compared against the 
 user's query to see if it matches.  I created an abstraction GeoShape 
 extended by subclasses named PointDistance... and CartesianBox to support 
 different queried shapes so that the filter need not care about these details.
 This work was presented at LuceneRevolution in Boston on October 8th.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-2155) Geospatial search using geohash prefixes

2012-03-29 Thread Harley Parks (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13242031#comment-13242031
 ] 

Harley Parks commented on SOLR-2155:


my assumption is that coordinates are lat, lng
and, if north is up on the page, the range is from the Lower-Left TO 
Upper-Right.

examples of range query:

 q=GeoTagGeoHash:[-15,149 TO -4,165]
 q=GeoTagGeoHash:[19,-160 TO 23,-154]
q=GeoTagGeoHash:[10,20 TO 30,40]

 Geospatial search using geohash prefixes
 

 Key: SOLR-2155
 URL: https://issues.apache.org/jira/browse/SOLR-2155
 Project: Solr
  Issue Type: Improvement
Reporter: David Smiley
 Attachments: GeoHashPrefixFilter.patch, GeoHashPrefixFilter.patch, 
 GeoHashPrefixFilter.patch, 
 SOLR-2155_GeoHashPrefixFilter_with_sorting_no_poly.patch, SOLR.2155.p3.patch, 
 SOLR.2155.p3tests.patch, Solr2155-1.0.2-project.zip, 
 Solr2155-1.0.3-project.zip, Solr2155-1.0.4-project.zip, 
 Solr2155-for-1.0.2-3.x-port.patch


 There currently isn't a solution in Solr for doing geospatial filtering on 
 documents that have a variable number of points.  This scenario occurs when 
 there is location extraction (i.e. via a gazateer) occurring on free text.  
 None, one, or many geospatial locations might be extracted from any given 
 document and users want to limit their search results to those occurring in a 
 user-specified area.
 I've implemented this by furthering the GeoHash based work in Lucene/Solr 
 with a geohash prefix based filter.  A geohash refers to a lat-lon box on the 
 earth.  Each successive character added further subdivides the box into a 4x8 
 (or 8x4 depending on the even/odd length of the geohash) grid.  The first 
 step in this scheme is figuring out which geohash grid squares cover the 
 user's search query.  I've added various extra methods to GeoHashUtils (and 
 added tests) to assist in this purpose.  The next step is an actual Lucene 
 Filter, GeoHashPrefixFilter, that uses these geohash prefixes in 
 TermsEnum.seek() to skip to relevant grid squares in the index.  Once a 
 matching geohash grid is found, the points therein are compared against the 
 user's query to see if it matches.  I created an abstraction GeoShape 
 extended by subclasses named PointDistance... and CartesianBox to support 
 different queried shapes so that the filter need not care about these details.
 This work was presented at LuceneRevolution in Boston on October 8th.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-2155) Geospatial search using geohash prefixes

2012-03-29 Thread David Smiley (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13242042#comment-13242042
 ] 

David Smiley commented on SOLR-2155:


I don't recall ever declaring that the range query syntax works. I reviewed the 
code and I didn't override getRangeQuery() which is required for it to work 
correctly -- instead the parent class's implementation is in effect which 
unsurprisingly doesn't work.  Instead, use my query parser to supply the query 
box.

 Geospatial search using geohash prefixes
 

 Key: SOLR-2155
 URL: https://issues.apache.org/jira/browse/SOLR-2155
 Project: Solr
  Issue Type: Improvement
Reporter: David Smiley
 Attachments: GeoHashPrefixFilter.patch, GeoHashPrefixFilter.patch, 
 GeoHashPrefixFilter.patch, 
 SOLR-2155_GeoHashPrefixFilter_with_sorting_no_poly.patch, SOLR.2155.p3.patch, 
 SOLR.2155.p3tests.patch, Solr2155-1.0.2-project.zip, 
 Solr2155-1.0.3-project.zip, Solr2155-1.0.4-project.zip, 
 Solr2155-for-1.0.2-3.x-port.patch


 There currently isn't a solution in Solr for doing geospatial filtering on 
 documents that have a variable number of points.  This scenario occurs when 
 there is location extraction (i.e. via a gazateer) occurring on free text.  
 None, one, or many geospatial locations might be extracted from any given 
 document and users want to limit their search results to those occurring in a 
 user-specified area.
 I've implemented this by furthering the GeoHash based work in Lucene/Solr 
 with a geohash prefix based filter.  A geohash refers to a lat-lon box on the 
 earth.  Each successive character added further subdivides the box into a 4x8 
 (or 8x4 depending on the even/odd length of the geohash) grid.  The first 
 step in this scheme is figuring out which geohash grid squares cover the 
 user's search query.  I've added various extra methods to GeoHashUtils (and 
 added tests) to assist in this purpose.  The next step is an actual Lucene 
 Filter, GeoHashPrefixFilter, that uses these geohash prefixes in 
 TermsEnum.seek() to skip to relevant grid squares in the index.  Once a 
 matching geohash grid is found, the points therein are compared against the 
 user's query to see if it matches.  I created an abstraction GeoShape 
 extended by subclasses named PointDistance... and CartesianBox to support 
 different queried shapes so that the filter need not care about these details.
 This work was presented at LuceneRevolution in Boston on October 8th.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-2155) Geospatial search using geohash prefixes

2012-03-27 Thread David Smiley (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13239218#comment-13239218
 ] 

David Smiley commented on SOLR-2155:


Bill,
I don't think 1.0.4 introduced any problem as it was fairly trivial.  The INFO 
log message tells me you are using multi-value sorting which has to put all the 
values into memory after each commit.  Did a commit happen prior to this log 
message?  FYI you should put a warming query into newSearcher for any sorting 
you do in Solr so that a user never sees the time hit for loading related 
caches.  Mikhail Khludnev suggested the fieldValueCache be explicitly 
configured but I don't see it being relevant.  AFAIK each sorted or faceted 
multi-valued field gets one entry in that cache.

 Geospatial search using geohash prefixes
 

 Key: SOLR-2155
 URL: https://issues.apache.org/jira/browse/SOLR-2155
 Project: Solr
  Issue Type: Improvement
Reporter: David Smiley
 Attachments: GeoHashPrefixFilter.patch, GeoHashPrefixFilter.patch, 
 GeoHashPrefixFilter.patch, 
 SOLR-2155_GeoHashPrefixFilter_with_sorting_no_poly.patch, SOLR.2155.p3.patch, 
 SOLR.2155.p3tests.patch, Solr2155-1.0.2-project.zip, 
 Solr2155-1.0.3-project.zip, Solr2155-1.0.4-project.zip, 
 Solr2155-for-1.0.2-3.x-port.patch


 There currently isn't a solution in Solr for doing geospatial filtering on 
 documents that have a variable number of points.  This scenario occurs when 
 there is location extraction (i.e. via a gazateer) occurring on free text.  
 None, one, or many geospatial locations might be extracted from any given 
 document and users want to limit their search results to those occurring in a 
 user-specified area.
 I've implemented this by furthering the GeoHash based work in Lucene/Solr 
 with a geohash prefix based filter.  A geohash refers to a lat-lon box on the 
 earth.  Each successive character added further subdivides the box into a 4x8 
 (or 8x4 depending on the even/odd length of the geohash) grid.  The first 
 step in this scheme is figuring out which geohash grid squares cover the 
 user's search query.  I've added various extra methods to GeoHashUtils (and 
 added tests) to assist in this purpose.  The next step is an actual Lucene 
 Filter, GeoHashPrefixFilter, that uses these geohash prefixes in 
 TermsEnum.seek() to skip to relevant grid squares in the index.  Once a 
 matching geohash grid is found, the points therein are compared against the 
 user's query to see if it matches.  I created an abstraction GeoShape 
 extended by subclasses named PointDistance... and CartesianBox to support 
 different queried shapes so that the filter need not care about these details.
 This work was presented at LuceneRevolution in Boston on October 8th.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-2155) Geospatial search using geohash prefixes

2012-03-27 Thread Bill Bell (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13239230#comment-13239230
 ] 

Bill Bell commented on SOLR-2155:
-

It seems that a max size of 10 would be too small. If we have an average of 3 
geohash fields per doc, and we have 2M rows how do we set these caches?

We will get back to you on when it occurs. We don't commit during the day. 
Maybe during a gc?

 Geospatial search using geohash prefixes
 

 Key: SOLR-2155
 URL: https://issues.apache.org/jira/browse/SOLR-2155
 Project: Solr
  Issue Type: Improvement
Reporter: David Smiley
 Attachments: GeoHashPrefixFilter.patch, GeoHashPrefixFilter.patch, 
 GeoHashPrefixFilter.patch, 
 SOLR-2155_GeoHashPrefixFilter_with_sorting_no_poly.patch, SOLR.2155.p3.patch, 
 SOLR.2155.p3tests.patch, Solr2155-1.0.2-project.zip, 
 Solr2155-1.0.3-project.zip, Solr2155-1.0.4-project.zip, 
 Solr2155-for-1.0.2-3.x-port.patch


 There currently isn't a solution in Solr for doing geospatial filtering on 
 documents that have a variable number of points.  This scenario occurs when 
 there is location extraction (i.e. via a gazateer) occurring on free text.  
 None, one, or many geospatial locations might be extracted from any given 
 document and users want to limit their search results to those occurring in a 
 user-specified area.
 I've implemented this by furthering the GeoHash based work in Lucene/Solr 
 with a geohash prefix based filter.  A geohash refers to a lat-lon box on the 
 earth.  Each successive character added further subdivides the box into a 4x8 
 (or 8x4 depending on the even/odd length of the geohash) grid.  The first 
 step in this scheme is figuring out which geohash grid squares cover the 
 user's search query.  I've added various extra methods to GeoHashUtils (and 
 added tests) to assist in this purpose.  The next step is an actual Lucene 
 Filter, GeoHashPrefixFilter, that uses these geohash prefixes in 
 TermsEnum.seek() to skip to relevant grid squares in the index.  Once a 
 matching geohash grid is found, the points therein are compared against the 
 user's query to see if it matches.  I created an abstraction GeoShape 
 extended by subclasses named PointDistance... and CartesianBox to support 
 different queried shapes so that the filter need not care about these details.
 This work was presented at LuceneRevolution in Boston on October 8th.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-2155) Geospatial search using geohash prefixes

2012-03-27 Thread Harley Parks (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13239710#comment-13239710
 ] 

Harley Parks commented on SOLR-2155:


interesting... conversation regarding both tuning and data requirements. 
I would guess that the size needs to be in power of 2's

since the example given is related to a custom type, it should be noted that a 
fieldValueCache is by default created, for each document id.

the example given by default in solr 3.4 if needed:
   fieldValueCache class=solr.FastLRUCache
size=512
autowarmCount=128
showItems=32 /

Additionally, the custom cache uses the same name, which might not matter, or 
does it? the custom cache may override the fieldValueCache created by default. 

 Geospatial search using geohash prefixes
 

 Key: SOLR-2155
 URL: https://issues.apache.org/jira/browse/SOLR-2155
 Project: Solr
  Issue Type: Improvement
Reporter: David Smiley
 Attachments: GeoHashPrefixFilter.patch, GeoHashPrefixFilter.patch, 
 GeoHashPrefixFilter.patch, 
 SOLR-2155_GeoHashPrefixFilter_with_sorting_no_poly.patch, SOLR.2155.p3.patch, 
 SOLR.2155.p3tests.patch, Solr2155-1.0.2-project.zip, 
 Solr2155-1.0.3-project.zip, Solr2155-1.0.4-project.zip, 
 Solr2155-for-1.0.2-3.x-port.patch


 There currently isn't a solution in Solr for doing geospatial filtering on 
 documents that have a variable number of points.  This scenario occurs when 
 there is location extraction (i.e. via a gazateer) occurring on free text.  
 None, one, or many geospatial locations might be extracted from any given 
 document and users want to limit their search results to those occurring in a 
 user-specified area.
 I've implemented this by furthering the GeoHash based work in Lucene/Solr 
 with a geohash prefix based filter.  A geohash refers to a lat-lon box on the 
 earth.  Each successive character added further subdivides the box into a 4x8 
 (or 8x4 depending on the even/odd length of the geohash) grid.  The first 
 step in this scheme is figuring out which geohash grid squares cover the 
 user's search query.  I've added various extra methods to GeoHashUtils (and 
 added tests) to assist in this purpose.  The next step is an actual Lucene 
 Filter, GeoHashPrefixFilter, that uses these geohash prefixes in 
 TermsEnum.seek() to skip to relevant grid squares in the index.  Once a 
 matching geohash grid is found, the points therein are compared against the 
 user's query to see if it matches.  I created an abstraction GeoShape 
 extended by subclasses named PointDistance... and CartesianBox to support 
 different queried shapes so that the filter need not care about these details.
 This work was presented at LuceneRevolution in Boston on October 8th.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-2155) Geospatial search using geohash prefixes

2012-03-27 Thread Harley Parks (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13239711#comment-13239711
 ] 

Harley Parks commented on SOLR-2155:


interesting... conversation regarding both tuning and data requirements. 
I would guess that the size needs to be in power of 2's

since the example given is related to a custom type, it should be noted that a 
fieldValueCache is by default created, for each document id.

the example given by default in solr 3.4 if needed:
   fieldValueCache class=solr.FastLRUCache
size=512
autowarmCount=128
showItems=32 /

Additionally, the custom cache uses the same name, which might not matter, or 
does it? the custom cache may override the fieldValueCache created by default. 

 Geospatial search using geohash prefixes
 

 Key: SOLR-2155
 URL: https://issues.apache.org/jira/browse/SOLR-2155
 Project: Solr
  Issue Type: Improvement
Reporter: David Smiley
 Attachments: GeoHashPrefixFilter.patch, GeoHashPrefixFilter.patch, 
 GeoHashPrefixFilter.patch, 
 SOLR-2155_GeoHashPrefixFilter_with_sorting_no_poly.patch, SOLR.2155.p3.patch, 
 SOLR.2155.p3tests.patch, Solr2155-1.0.2-project.zip, 
 Solr2155-1.0.3-project.zip, Solr2155-1.0.4-project.zip, 
 Solr2155-for-1.0.2-3.x-port.patch


 There currently isn't a solution in Solr for doing geospatial filtering on 
 documents that have a variable number of points.  This scenario occurs when 
 there is location extraction (i.e. via a gazateer) occurring on free text.  
 None, one, or many geospatial locations might be extracted from any given 
 document and users want to limit their search results to those occurring in a 
 user-specified area.
 I've implemented this by furthering the GeoHash based work in Lucene/Solr 
 with a geohash prefix based filter.  A geohash refers to a lat-lon box on the 
 earth.  Each successive character added further subdivides the box into a 4x8 
 (or 8x4 depending on the even/odd length of the geohash) grid.  The first 
 step in this scheme is figuring out which geohash grid squares cover the 
 user's search query.  I've added various extra methods to GeoHashUtils (and 
 added tests) to assist in this purpose.  The next step is an actual Lucene 
 Filter, GeoHashPrefixFilter, that uses these geohash prefixes in 
 TermsEnum.seek() to skip to relevant grid squares in the index.  Once a 
 matching geohash grid is found, the points therein are compared against the 
 user's query to see if it matches.  I created an abstraction GeoShape 
 extended by subclasses named PointDistance... and CartesianBox to support 
 different queried shapes so that the filter need not care about these details.
 This work was presented at LuceneRevolution in Boston on October 8th.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: [jira] [Commented] (SOLR-2155) Geospatial search using geohash prefixes

2012-03-27 Thread William Bell
Harley,

I don't really think this is the same thing. Although I think it is
poor form to name it the same, these are 2 different caches. I am
still not sure what increasing the custom cache does. I looked at the
code and it is not clear. I did confirm that this seems t be happening
NOT on commit, but throughout the day. Maybe on Garbage collection?

On Tue, Mar 27, 2012 at 11:42 AM, Harley Parks (Commented) (JIRA)
j...@apache.org wrote:

    [ 
 https://issues.apache.org/jira/browse/SOLR-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13239711#comment-13239711
  ]

 Harley Parks commented on SOLR-2155:
 

 interesting... conversation regarding both tuning and data requirements.
 I would guess that the size needs to be in power of 2's

 since the example given is related to a custom type, it should be noted that 
 a fieldValueCache is by default created, for each document id.

 the example given by default in solr 3.4 if needed:
       fieldValueCache class=solr.FastLRUCache
                        size=512
                        autowarmCount=128
                        showItems=32 /

 Additionally, the custom cache uses the same name, which might not matter, or 
 does it? the custom cache may override the fieldValueCache created by default.

 Geospatial search using geohash prefixes
 

                 Key: SOLR-2155
                 URL: https://issues.apache.org/jira/browse/SOLR-2155
             Project: Solr
          Issue Type: Improvement
            Reporter: David Smiley
         Attachments: GeoHashPrefixFilter.patch, GeoHashPrefixFilter.patch, 
 GeoHashPrefixFilter.patch, 
 SOLR-2155_GeoHashPrefixFilter_with_sorting_no_poly.patch, 
 SOLR.2155.p3.patch, SOLR.2155.p3tests.patch, Solr2155-1.0.2-project.zip, 
 Solr2155-1.0.3-project.zip, Solr2155-1.0.4-project.zip, 
 Solr2155-for-1.0.2-3.x-port.patch


 There currently isn't a solution in Solr for doing geospatial filtering on 
 documents that have a variable number of points.  This scenario occurs when 
 there is location extraction (i.e. via a gazateer) occurring on free text. 
  None, one, or many geospatial locations might be extracted from any given 
 document and users want to limit their search results to those occurring in 
 a user-specified area.
 I've implemented this by furthering the GeoHash based work in Lucene/Solr 
 with a geohash prefix based filter.  A geohash refers to a lat-lon box on 
 the earth.  Each successive character added further subdivides the box into 
 a 4x8 (or 8x4 depending on the even/odd length of the geohash) grid.  The 
 first step in this scheme is figuring out which geohash grid squares cover 
 the user's search query.  I've added various extra methods to GeoHashUtils 
 (and added tests) to assist in this purpose.  The next step is an actual 
 Lucene Filter, GeoHashPrefixFilter, that uses these geohash prefixes in 
 TermsEnum.seek() to skip to relevant grid squares in the index.  Once a 
 matching geohash grid is found, the points therein are compared against the 
 user's query to see if it matches.  I created an abstraction GeoShape 
 extended by subclasses named PointDistance... and CartesianBox to 
 support different queried shapes so that the filter need not care about 
 these details.
 This work was presented at LuceneRevolution in Boston on October 8th.

 --
 This message is automatically generated by JIRA.
 If you think it was sent incorrectly, please contact your JIRA 
 administrators: 
 https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
 For more information on JIRA, see: http://www.atlassian.com/software/jira



 -
 To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
 For additional commands, e-mail: dev-h...@lucene.apache.org




-- 
Bill Bell
billnb...@gmail.com
cell 720-256-8076

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-2155) Geospatial search using geohash prefixes

2012-03-27 Thread David Smiley (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13240190#comment-13240190
 ] 

David Smiley commented on SOLR-2155:


Ok so I got to the bottom of this. When I originally coded the cache entry 
thing, my intention was to use the same cache as UnInvertedField (multi-value 
faceting) -- THE fieldValueCache since it seemed related.  But I did that 
wrong and instead it looked up a custom user cache by that same name.  The 
difference is that the official one is configured with fieldValueCache and 
the other is cache name=fieldValueCache (documented in the read me).  In 
hind site, I should have chosen some special name like solr2155.  Now, if you 
don't configure the cache then *each* time you try to sort, it has to rebuild 
the cache.  So basically it's required to add this cache configuration assuming 
your sorting.

 Geospatial search using geohash prefixes
 

 Key: SOLR-2155
 URL: https://issues.apache.org/jira/browse/SOLR-2155
 Project: Solr
  Issue Type: Improvement
Reporter: David Smiley
 Attachments: GeoHashPrefixFilter.patch, GeoHashPrefixFilter.patch, 
 GeoHashPrefixFilter.patch, 
 SOLR-2155_GeoHashPrefixFilter_with_sorting_no_poly.patch, SOLR.2155.p3.patch, 
 SOLR.2155.p3tests.patch, Solr2155-1.0.2-project.zip, 
 Solr2155-1.0.3-project.zip, Solr2155-1.0.4-project.zip, 
 Solr2155-for-1.0.2-3.x-port.patch


 There currently isn't a solution in Solr for doing geospatial filtering on 
 documents that have a variable number of points.  This scenario occurs when 
 there is location extraction (i.e. via a gazateer) occurring on free text.  
 None, one, or many geospatial locations might be extracted from any given 
 document and users want to limit their search results to those occurring in a 
 user-specified area.
 I've implemented this by furthering the GeoHash based work in Lucene/Solr 
 with a geohash prefix based filter.  A geohash refers to a lat-lon box on the 
 earth.  Each successive character added further subdivides the box into a 4x8 
 (or 8x4 depending on the even/odd length of the geohash) grid.  The first 
 step in this scheme is figuring out which geohash grid squares cover the 
 user's search query.  I've added various extra methods to GeoHashUtils (and 
 added tests) to assist in this purpose.  The next step is an actual Lucene 
 Filter, GeoHashPrefixFilter, that uses these geohash prefixes in 
 TermsEnum.seek() to skip to relevant grid squares in the index.  Once a 
 matching geohash grid is found, the points therein are compared against the 
 user's query to see if it matches.  I created an abstraction GeoShape 
 extended by subclasses named PointDistance... and CartesianBox to support 
 different queried shapes so that the filter need not care about these details.
 This work was presented at LuceneRevolution in Boston on October 8th.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-2155) Geospatial search using geohash prefixes

2012-03-26 Thread Bill Bell (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13238921#comment-13238921
 ] 

Bill Bell commented on SOLR-2155:
-

David,

We are seeing weird slow performance on your new 1.0.4 release.

INFO: [providersearch] webapp=/solr path=/select 
params={d=160.9344facet=falsewt=jsonrows=6start=0pt=42.6450,-73.facet.field=5star_45f.5star_45.facet.mincount=1qt=providersearchspecdistfq=specialties_ids:(45+)qq=city_state_lower:albany,+nyf.5star_45.facet.limit=-1}
 hits=960 status=0 QTime=8222 

Hitting that with a slightly different lat long comes back almost instantly. 
I'm not sure why sometimes they take seconds instead of milliseconds.  There is 
also this log entry a few lines before the long query:

Mar 22, 2012 11:26:29 AM solr2155.solr.search.function.GeoHashValueSource init
INFO: field 'store_geohash' in RAM: loaded min/avg/max per doc #: 
(1,1.1089503,11) #2270017

Are we missing something? Shall we go back to 1.0.3 ?

Shall I increase the following?  What does this actually do?

cache name=fieldValueCache
class=solr.FastLRUCache size=10 initialSize=1
autowarmCount=1/ 



 Geospatial search using geohash prefixes
 

 Key: SOLR-2155
 URL: https://issues.apache.org/jira/browse/SOLR-2155
 Project: Solr
  Issue Type: Improvement
Reporter: David Smiley
 Attachments: GeoHashPrefixFilter.patch, GeoHashPrefixFilter.patch, 
 GeoHashPrefixFilter.patch, 
 SOLR-2155_GeoHashPrefixFilter_with_sorting_no_poly.patch, SOLR.2155.p3.patch, 
 SOLR.2155.p3tests.patch, Solr2155-1.0.2-project.zip, 
 Solr2155-1.0.3-project.zip, Solr2155-1.0.4-project.zip, 
 Solr2155-for-1.0.2-3.x-port.patch


 There currently isn't a solution in Solr for doing geospatial filtering on 
 documents that have a variable number of points.  This scenario occurs when 
 there is location extraction (i.e. via a gazateer) occurring on free text.  
 None, one, or many geospatial locations might be extracted from any given 
 document and users want to limit their search results to those occurring in a 
 user-specified area.
 I've implemented this by furthering the GeoHash based work in Lucene/Solr 
 with a geohash prefix based filter.  A geohash refers to a lat-lon box on the 
 earth.  Each successive character added further subdivides the box into a 4x8 
 (or 8x4 depending on the even/odd length of the geohash) grid.  The first 
 step in this scheme is figuring out which geohash grid squares cover the 
 user's search query.  I've added various extra methods to GeoHashUtils (and 
 added tests) to assist in this purpose.  The next step is an actual Lucene 
 Filter, GeoHashPrefixFilter, that uses these geohash prefixes in 
 TermsEnum.seek() to skip to relevant grid squares in the index.  Once a 
 matching geohash grid is found, the points therein are compared against the 
 user's query to see if it matches.  I created an abstraction GeoShape 
 extended by subclasses named PointDistance... and CartesianBox to support 
 different queried shapes so that the filter need not care about these details.
 This work was presented at LuceneRevolution in Boston on October 8th.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-2155) Geospatial search using geohash prefixes

2012-03-21 Thread Harley Parks (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13235330#comment-13235330
 ] 

Harley Parks commented on SOLR-2155:


Just doing some testing on the new jar file.
are there rules on how to structure bounding box. lower left is south west and 
upper right is north east?

using the gh_geohash, was tricky as it's coordinates are flipped: long,lat to 
get west, south, east, north box.
but works!
   {!gh_geofilt%20sfield=GeoTagGeoHash%20box=129,-16,-180,1}

q=*:*fq={!geofilt sfield=GeoTagGeoHash pt=60,100 d=1}

ranged query works too:
q=GeoTagGeoHash:[-15,149 TO -4,165]
q=GeoTagGeoHash:[19,-160 TO 23,-154]

good stuff.

 Geospatial search using geohash prefixes
 

 Key: SOLR-2155
 URL: https://issues.apache.org/jira/browse/SOLR-2155
 Project: Solr
  Issue Type: Improvement
Reporter: David Smiley
 Attachments: GeoHashPrefixFilter.patch, GeoHashPrefixFilter.patch, 
 GeoHashPrefixFilter.patch, 
 SOLR-2155_GeoHashPrefixFilter_with_sorting_no_poly.patch, SOLR.2155.p3.patch, 
 SOLR.2155.p3tests.patch, Solr2155-1.0.2-project.zip, 
 Solr2155-1.0.3-project.zip, Solr2155-1.0.4-project.zip, 
 Solr2155-for-1.0.2-3.x-port.patch


 There currently isn't a solution in Solr for doing geospatial filtering on 
 documents that have a variable number of points.  This scenario occurs when 
 there is location extraction (i.e. via a gazateer) occurring on free text.  
 None, one, or many geospatial locations might be extracted from any given 
 document and users want to limit their search results to those occurring in a 
 user-specified area.
 I've implemented this by furthering the GeoHash based work in Lucene/Solr 
 with a geohash prefix based filter.  A geohash refers to a lat-lon box on the 
 earth.  Each successive character added further subdivides the box into a 4x8 
 (or 8x4 depending on the even/odd length of the geohash) grid.  The first 
 step in this scheme is figuring out which geohash grid squares cover the 
 user's search query.  I've added various extra methods to GeoHashUtils (and 
 added tests) to assist in this purpose.  The next step is an actual Lucene 
 Filter, GeoHashPrefixFilter, that uses these geohash prefixes in 
 TermsEnum.seek() to skip to relevant grid squares in the index.  Once a 
 matching geohash grid is found, the points therein are compared against the 
 user's query to see if it matches.  I created an abstraction GeoShape 
 extended by subclasses named PointDistance... and CartesianBox to support 
 different queried shapes so that the filter need not care about these details.
 This work was presented at LuceneRevolution in Boston on October 8th.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-2155) Geospatial search using geohash prefixes

2012-03-21 Thread Harley Parks (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13235332#comment-13235332
 ] 

Harley Parks commented on SOLR-2155:


Just doing some testing on the new jar file.
are there rules on how to structure bounding box. lower left is south west and 
upper right is north east?

using the gh_geohash, was tricky as it's coordinates are flipped: long,lat to 
get west, south, east, north box.
but works!
   {!gh_geofilt%20sfield=GeoTagGeoHash%20box=129,-16,-180,1}

q=*:*fq={!geofilt sfield=GeoTagGeoHash pt=60,100 d=1}

ranged query works too:
q=GeoTagGeoHash:[-15,149 TO -4,165]
q=GeoTagGeoHash:[19,-160 TO 23,-154]

good stuff.

 Geospatial search using geohash prefixes
 

 Key: SOLR-2155
 URL: https://issues.apache.org/jira/browse/SOLR-2155
 Project: Solr
  Issue Type: Improvement
Reporter: David Smiley
 Attachments: GeoHashPrefixFilter.patch, GeoHashPrefixFilter.patch, 
 GeoHashPrefixFilter.patch, 
 SOLR-2155_GeoHashPrefixFilter_with_sorting_no_poly.patch, SOLR.2155.p3.patch, 
 SOLR.2155.p3tests.patch, Solr2155-1.0.2-project.zip, 
 Solr2155-1.0.3-project.zip, Solr2155-1.0.4-project.zip, 
 Solr2155-for-1.0.2-3.x-port.patch


 There currently isn't a solution in Solr for doing geospatial filtering on 
 documents that have a variable number of points.  This scenario occurs when 
 there is location extraction (i.e. via a gazateer) occurring on free text.  
 None, one, or many geospatial locations might be extracted from any given 
 document and users want to limit their search results to those occurring in a 
 user-specified area.
 I've implemented this by furthering the GeoHash based work in Lucene/Solr 
 with a geohash prefix based filter.  A geohash refers to a lat-lon box on the 
 earth.  Each successive character added further subdivides the box into a 4x8 
 (or 8x4 depending on the even/odd length of the geohash) grid.  The first 
 step in this scheme is figuring out which geohash grid squares cover the 
 user's search query.  I've added various extra methods to GeoHashUtils (and 
 added tests) to assist in this purpose.  The next step is an actual Lucene 
 Filter, GeoHashPrefixFilter, that uses these geohash prefixes in 
 TermsEnum.seek() to skip to relevant grid squares in the index.  Once a 
 matching geohash grid is found, the points therein are compared against the 
 user's query to see if it matches.  I created an abstraction GeoShape 
 extended by subclasses named PointDistance... and CartesianBox to support 
 different queried shapes so that the filter need not care about these details.
 This work was presented at LuceneRevolution in Boston on October 8th.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-2155) Geospatial search using geohash prefixes

2012-03-19 Thread Harley Parks (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13233208#comment-13233208
 ] 

Harley Parks commented on SOLR-2155:


Nice! Thanks David, that really helps us out. 



 Geospatial search using geohash prefixes
 

 Key: SOLR-2155
 URL: https://issues.apache.org/jira/browse/SOLR-2155
 Project: Solr
  Issue Type: Improvement
Reporter: David Smiley
 Attachments: GeoHashPrefixFilter.patch, GeoHashPrefixFilter.patch, 
 GeoHashPrefixFilter.patch, 
 SOLR-2155_GeoHashPrefixFilter_with_sorting_no_poly.patch, SOLR.2155.p3.patch, 
 SOLR.2155.p3tests.patch, Solr2155-1.0.2-project.zip, 
 Solr2155-1.0.3-project.zip, Solr2155-1.0.4-project.zip, 
 Solr2155-for-1.0.2-3.x-port.patch


 There currently isn't a solution in Solr for doing geospatial filtering on 
 documents that have a variable number of points.  This scenario occurs when 
 there is location extraction (i.e. via a gazateer) occurring on free text.  
 None, one, or many geospatial locations might be extracted from any given 
 document and users want to limit their search results to those occurring in a 
 user-specified area.
 I've implemented this by furthering the GeoHash based work in Lucene/Solr 
 with a geohash prefix based filter.  A geohash refers to a lat-lon box on the 
 earth.  Each successive character added further subdivides the box into a 4x8 
 (or 8x4 depending on the even/odd length of the geohash) grid.  The first 
 step in this scheme is figuring out which geohash grid squares cover the 
 user's search query.  I've added various extra methods to GeoHashUtils (and 
 added tests) to assist in this purpose.  The next step is an actual Lucene 
 Filter, GeoHashPrefixFilter, that uses these geohash prefixes in 
 TermsEnum.seek() to skip to relevant grid squares in the index.  Once a 
 matching geohash grid is found, the points therein are compared against the 
 user's query to see if it matches.  I created an abstraction GeoShape 
 extended by subclasses named PointDistance... and CartesianBox to support 
 different queried shapes so that the filter need not care about these details.
 This work was presented at LuceneRevolution in Boston on October 8th.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-2155) Geospatial search using geohash prefixes

2012-03-19 Thread Bill Bell (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13233230#comment-13233230
 ] 

Bill Bell commented on SOLR-2155:
-

I already had updated http://wiki.apache.org/solr/SpatialSearchDev with the 
SOLR-2155 info. 



 Geospatial search using geohash prefixes
 

 Key: SOLR-2155
 URL: https://issues.apache.org/jira/browse/SOLR-2155
 Project: Solr
  Issue Type: Improvement
Reporter: David Smiley
 Attachments: GeoHashPrefixFilter.patch, GeoHashPrefixFilter.patch, 
 GeoHashPrefixFilter.patch, 
 SOLR-2155_GeoHashPrefixFilter_with_sorting_no_poly.patch, SOLR.2155.p3.patch, 
 SOLR.2155.p3tests.patch, Solr2155-1.0.2-project.zip, 
 Solr2155-1.0.3-project.zip, Solr2155-1.0.4-project.zip, 
 Solr2155-for-1.0.2-3.x-port.patch


 There currently isn't a solution in Solr for doing geospatial filtering on 
 documents that have a variable number of points.  This scenario occurs when 
 there is location extraction (i.e. via a gazateer) occurring on free text.  
 None, one, or many geospatial locations might be extracted from any given 
 document and users want to limit their search results to those occurring in a 
 user-specified area.
 I've implemented this by furthering the GeoHash based work in Lucene/Solr 
 with a geohash prefix based filter.  A geohash refers to a lat-lon box on the 
 earth.  Each successive character added further subdivides the box into a 4x8 
 (or 8x4 depending on the even/odd length of the geohash) grid.  The first 
 step in this scheme is figuring out which geohash grid squares cover the 
 user's search query.  I've added various extra methods to GeoHashUtils (and 
 added tests) to assist in this purpose.  The next step is an actual Lucene 
 Filter, GeoHashPrefixFilter, that uses these geohash prefixes in 
 TermsEnum.seek() to skip to relevant grid squares in the index.  Once a 
 matching geohash grid is found, the points therein are compared against the 
 user's query to see if it matches.  I created an abstraction GeoShape 
 extended by subclasses named PointDistance... and CartesianBox to support 
 different queried shapes so that the filter need not care about these details.
 This work was presented at LuceneRevolution in Boston on October 8th.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-2155) Geospatial search using geohash prefixes

2012-03-17 Thread Bill Bell (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13232133#comment-13232133
 ] 

Bill Bell commented on SOLR-2155:
-

I did figure out the {!gh_geofilt}


The parameters are point and radius, not pt and d.

Radius also is in meters not km.

The performance of this gh geofilt is also most the same as geofilt. So not 
sure why you would need it.



 Geospatial search using geohash prefixes
 

 Key: SOLR-2155
 URL: https://issues.apache.org/jira/browse/SOLR-2155
 Project: Solr
  Issue Type: Improvement
Reporter: David Smiley
 Attachments: GeoHashPrefixFilter.patch, GeoHashPrefixFilter.patch, 
 GeoHashPrefixFilter.patch, 
 SOLR-2155_GeoHashPrefixFilter_with_sorting_no_poly.patch, SOLR.2155.p3.patch, 
 SOLR.2155.p3tests.patch, Solr2155-1.0.2-project.zip, 
 Solr2155-1.0.3-project.zip, Solr2155-for-1.0.2-3.x-port.patch


 There currently isn't a solution in Solr for doing geospatial filtering on 
 documents that have a variable number of points.  This scenario occurs when 
 there is location extraction (i.e. via a gazateer) occurring on free text.  
 None, one, or many geospatial locations might be extracted from any given 
 document and users want to limit their search results to those occurring in a 
 user-specified area.
 I've implemented this by furthering the GeoHash based work in Lucene/Solr 
 with a geohash prefix based filter.  A geohash refers to a lat-lon box on the 
 earth.  Each successive character added further subdivides the box into a 4x8 
 (or 8x4 depending on the even/odd length of the geohash) grid.  The first 
 step in this scheme is figuring out which geohash grid squares cover the 
 user's search query.  I've added various extra methods to GeoHashUtils (and 
 added tests) to assist in this purpose.  The next step is an actual Lucene 
 Filter, GeoHashPrefixFilter, that uses these geohash prefixes in 
 TermsEnum.seek() to skip to relevant grid squares in the index.  Once a 
 matching geohash grid is found, the points therein are compared against the 
 user's query to see if it matches.  I created an abstraction GeoShape 
 extended by subclasses named PointDistance... and CartesianBox to support 
 different queried shapes so that the filter need not care about these details.
 This work was presented at LuceneRevolution in Boston on October 8th.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-2155) Geospatial search using geohash prefixes

2012-03-17 Thread Harley Parks (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13232139#comment-13232139
 ] 

Harley Parks commented on SOLR-2155:


Bill: 

the main advantage is gh_geofilt (or the name in queryParser settup in 
solrconfig.xml) can be used as part of the query versus the filter... at least 
that is what I thought... here are my test strings in solr/admin/form

it might be helpful to be able to pass parameters in like 
geodist(sfield,lat,lng).
gh_geofilt(sfield,lat,lng,d) 

this worked for me in the solr/admin/form:

!gh_geofilt sfield=GeoTagGeoHash pt=-9,160 d=5


this also worked for me in the solr/admin/form:

!gh_geofilt sfield=GeoTagGeoHash point=-9,160 radius=5

but this also worked:
!geofilt sfield=GeoTagGeoHash point=-9,160 radius=5
!geofilt sfield=GeoTagGeoHash pt=-9,160 d=5



 Geospatial search using geohash prefixes
 

 Key: SOLR-2155
 URL: https://issues.apache.org/jira/browse/SOLR-2155
 Project: Solr
  Issue Type: Improvement
Reporter: David Smiley
 Attachments: GeoHashPrefixFilter.patch, GeoHashPrefixFilter.patch, 
 GeoHashPrefixFilter.patch, 
 SOLR-2155_GeoHashPrefixFilter_with_sorting_no_poly.patch, SOLR.2155.p3.patch, 
 SOLR.2155.p3tests.patch, Solr2155-1.0.2-project.zip, 
 Solr2155-1.0.3-project.zip, Solr2155-for-1.0.2-3.x-port.patch


 There currently isn't a solution in Solr for doing geospatial filtering on 
 documents that have a variable number of points.  This scenario occurs when 
 there is location extraction (i.e. via a gazateer) occurring on free text.  
 None, one, or many geospatial locations might be extracted from any given 
 document and users want to limit their search results to those occurring in a 
 user-specified area.
 I've implemented this by furthering the GeoHash based work in Lucene/Solr 
 with a geohash prefix based filter.  A geohash refers to a lat-lon box on the 
 earth.  Each successive character added further subdivides the box into a 4x8 
 (or 8x4 depending on the even/odd length of the geohash) grid.  The first 
 step in this scheme is figuring out which geohash grid squares cover the 
 user's search query.  I've added various extra methods to GeoHashUtils (and 
 added tests) to assist in this purpose.  The next step is an actual Lucene 
 Filter, GeoHashPrefixFilter, that uses these geohash prefixes in 
 TermsEnum.seek() to skip to relevant grid squares in the index.  Once a 
 matching geohash grid is found, the points therein are compared against the 
 user's query to see if it matches.  I created an abstraction GeoShape 
 extended by subclasses named PointDistance... and CartesianBox to support 
 different queried shapes so that the filter need not care about these details.
 This work was presented at LuceneRevolution in Boston on October 8th.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-2155) Geospatial search using geohash prefixes

2012-03-17 Thread Bill Bell (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13232156#comment-13232156
 ] 

Bill Bell commented on SOLR-2155:
-

D is in km and radius is in meters. So Ou would need radius=5000

Geofilt does to use point and radius as far as I can tell.

Also hg_geofilt does not use pt and d. 

I just don't want people confused.

 Geospatial search using geohash prefixes
 

 Key: SOLR-2155
 URL: https://issues.apache.org/jira/browse/SOLR-2155
 Project: Solr
  Issue Type: Improvement
Reporter: David Smiley
 Attachments: GeoHashPrefixFilter.patch, GeoHashPrefixFilter.patch, 
 GeoHashPrefixFilter.patch, 
 SOLR-2155_GeoHashPrefixFilter_with_sorting_no_poly.patch, SOLR.2155.p3.patch, 
 SOLR.2155.p3tests.patch, Solr2155-1.0.2-project.zip, 
 Solr2155-1.0.3-project.zip, Solr2155-for-1.0.2-3.x-port.patch


 There currently isn't a solution in Solr for doing geospatial filtering on 
 documents that have a variable number of points.  This scenario occurs when 
 there is location extraction (i.e. via a gazateer) occurring on free text.  
 None, one, or many geospatial locations might be extracted from any given 
 document and users want to limit their search results to those occurring in a 
 user-specified area.
 I've implemented this by furthering the GeoHash based work in Lucene/Solr 
 with a geohash prefix based filter.  A geohash refers to a lat-lon box on the 
 earth.  Each successive character added further subdivides the box into a 4x8 
 (or 8x4 depending on the even/odd length of the geohash) grid.  The first 
 step in this scheme is figuring out which geohash grid squares cover the 
 user's search query.  I've added various extra methods to GeoHashUtils (and 
 added tests) to assist in this purpose.  The next step is an actual Lucene 
 Filter, GeoHashPrefixFilter, that uses these geohash prefixes in 
 TermsEnum.seek() to skip to relevant grid squares in the index.  Once a 
 matching geohash grid is found, the points therein are compared against the 
 user's query to see if it matches.  I created an abstraction GeoShape 
 extended by subclasses named PointDistance... and CartesianBox to support 
 different queried shapes so that the filter need not care about these details.
 This work was presented at LuceneRevolution in Boston on October 8th.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: [jira] [Commented] (SOLR-2155) Geospatial search using geohash prefixes

2012-03-17 Thread Bill Bell
I updated the Spatial wiki

Sent from my Mobile device
720-256-8076

On Mar 13, 2012, at 8:31 AM, David Smiley (Commented) (JIRA) 
j...@apache.org wrote:

 
[ 
 https://issues.apache.org/jira/browse/SOLR-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13228408#comment-13228408
  ] 
 
 David Smiley commented on SOLR-2155:
 
 
 Harley,
  Apparently I haven't been clear because this question does come up often, 
 and I sympathize with you all because the comments on this issue are 
 ridiculously long.  What I should have done and still can do is add info to 
 the Solr wiki.  Ever since ~September 2011, you no longer need to patch Solr 
 and you can use any 3x release.  The specific comment with further info 
 announcing this is:
 https://issues.apache.org/jira/browse/SOLR-2155?focusedCommentId=13117350page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13117350
 If you look at the attachments to the issue, you'll notice the latest version 
 is 1.0.3.
 
 When Solr 3.6 comes out (soon!), my co-author and I will write an online 
 addendum to discuss the changes in Solr 3.5  Solr 3.6 that affect the 
 content of the book, or are interesting things that we would have written 
 about if we were still writing it.  I'll add a clarification to the existing 
 info box on 144 mentioning SOLR-2155 that this feature is available in plugin 
 form to 3x without patching Solr.
 
 
 
 Geospatial search using geohash prefixes
 
 
Key: SOLR-2155
URL: https://issues.apache.org/jira/browse/SOLR-2155
Project: Solr
 Issue Type: Improvement
   Reporter: David Smiley
Attachments: GeoHashPrefixFilter.patch, GeoHashPrefixFilter.patch, 
 GeoHashPrefixFilter.patch, 
 SOLR-2155_GeoHashPrefixFilter_with_sorting_no_poly.patch, 
 SOLR.2155.p3.patch, SOLR.2155.p3tests.patch, Solr2155-1.0.2-project.zip, 
 Solr2155-1.0.3-project.zip, Solr2155-for-1.0.2-3.x-port.patch
 
 
 There currently isn't a solution in Solr for doing geospatial filtering on 
 documents that have a variable number of points.  This scenario occurs when 
 there is location extraction (i.e. via a gazateer) occurring on free text. 
  None, one, or many geospatial locations might be extracted from any given 
 document and users want to limit their search results to those occurring in 
 a user-specified area.
 I've implemented this by furthering the GeoHash based work in Lucene/Solr 
 with a geohash prefix based filter.  A geohash refers to a lat-lon box on 
 the earth.  Each successive character added further subdivides the box into 
 a 4x8 (or 8x4 depending on the even/odd length of the geohash) grid.  The 
 first step in this scheme is figuring out which geohash grid squares cover 
 the user's search query.  I've added various extra methods to GeoHashUtils 
 (and added tests) to assist in this purpose.  The next step is an actual 
 Lucene Filter, GeoHashPrefixFilter, that uses these geohash prefixes in 
 TermsEnum.seek() to skip to relevant grid squares in the index.  Once a 
 matching geohash grid is found, the points therein are compared against the 
 user's query to see if it matches.  I created an abstraction GeoShape 
 extended by subclasses named PointDistance... and CartesianBox to 
 support different queried shapes so that the filter need not care about 
 these details.
 This work was presented at LuceneRevolution in Boston on October 8th.
 
 --
 This message is automatically generated by JIRA.
 If you think it was sent incorrectly, please contact your JIRA 
 administrators: 
 https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
 For more information on JIRA, see: http://www.atlassian.com/software/jira
 
 
 
 -
 To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
 For additional commands, e-mail: dev-h...@lucene.apache.org
 

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-2155) Geospatial search using geohash prefixes

2012-03-15 Thread Harley Parks (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13229928#comment-13229928
 ] 

Harley Parks commented on SOLR-2155:


Sorry, about the editing.

But Thanks For the Feedback.

so perhaps there is something wrong with my configuration, since the search 
results do not return lat, long but the geohash string.
 
The maven build went great. 

I did finally figure out how to use the geofilt function using the pt and d.

I am reindexing each time, but yes, I delete the data folder, and reindex.

I am placing the jar file into the tomcats solr/lib folder, after a restart, 
and after changing solrconfig and schema, geohash string is displayed, not the 
lat,long.
Schema
Field Type:
fieldType name=geohash class=solr2155.solr.schema.GeoHashField 
length=12/
this is the field:
   field name=GeoTagGeoHash type=geohash indexed=true stored=true 
multiValued=true /

this is the info from solr/admin/ field types - GEOHASH
Field Type: geohash
Fields: GEOTAGGEOHASH
Tokenized: true
Class Name: solr2155.solr.schema.GeoHashField
Index Analyzer: org.apache.solr.analysis.TokenizerChain
Tokenizer Class: solr2155.solr.schema.GeoHashField$1
Query Analyzer: org.apache.solr.schema.FieldType$DefaultAnalyzer

Still, the query returns values like:

arr name=GeoTagGeoHash
strrw3sh9g8c6mx/str
strrw3f3xc9dnh3/str
strrw3ckbue74y7/str
/arr

so, if this is not the right, is there anything I can do to troubleshoot?

 Geospatial search using geohash prefixes
 

 Key: SOLR-2155
 URL: https://issues.apache.org/jira/browse/SOLR-2155
 Project: Solr
  Issue Type: Improvement
Reporter: David Smiley
 Attachments: GeoHashPrefixFilter.patch, GeoHashPrefixFilter.patch, 
 GeoHashPrefixFilter.patch, 
 SOLR-2155_GeoHashPrefixFilter_with_sorting_no_poly.patch, SOLR.2155.p3.patch, 
 SOLR.2155.p3tests.patch, Solr2155-1.0.2-project.zip, 
 Solr2155-1.0.3-project.zip, Solr2155-for-1.0.2-3.x-port.patch


 There currently isn't a solution in Solr for doing geospatial filtering on 
 documents that have a variable number of points.  This scenario occurs when 
 there is location extraction (i.e. via a gazateer) occurring on free text.  
 None, one, or many geospatial locations might be extracted from any given 
 document and users want to limit their search results to those occurring in a 
 user-specified area.
 I've implemented this by furthering the GeoHash based work in Lucene/Solr 
 with a geohash prefix based filter.  A geohash refers to a lat-lon box on the 
 earth.  Each successive character added further subdivides the box into a 4x8 
 (or 8x4 depending on the even/odd length of the geohash) grid.  The first 
 step in this scheme is figuring out which geohash grid squares cover the 
 user's search query.  I've added various extra methods to GeoHashUtils (and 
 added tests) to assist in this purpose.  The next step is an actual Lucene 
 Filter, GeoHashPrefixFilter, that uses these geohash prefixes in 
 TermsEnum.seek() to skip to relevant grid squares in the index.  Once a 
 matching geohash grid is found, the points therein are compared against the 
 user's query to see if it matches.  I created an abstraction GeoShape 
 extended by subclasses named PointDistance... and CartesianBox to support 
 different queried shapes so that the filter need not care about these details.
 This work was presented at LuceneRevolution in Boston on October 8th.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-2155) Geospatial search using geohash prefixes

2012-03-15 Thread Bill Bell (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13229945#comment-13229945
 ] 

Bill Bell commented on SOLR-2155:
-

David,

What is an example URL call for multiValued field? Does geofilt work?

/select?q=*:*fq={!geofilt}sort=geodist() ascsfield=store_hashd=10

Or do we need to use gh_geofilt? like this?

/select?q=*:*fq={!gh_geofilt}sort=geodist() ascsfield=store_hashd=10


 Geospatial search using geohash prefixes
 

 Key: SOLR-2155
 URL: https://issues.apache.org/jira/browse/SOLR-2155
 Project: Solr
  Issue Type: Improvement
Reporter: David Smiley
 Attachments: GeoHashPrefixFilter.patch, GeoHashPrefixFilter.patch, 
 GeoHashPrefixFilter.patch, 
 SOLR-2155_GeoHashPrefixFilter_with_sorting_no_poly.patch, SOLR.2155.p3.patch, 
 SOLR.2155.p3tests.patch, Solr2155-1.0.2-project.zip, 
 Solr2155-1.0.3-project.zip, Solr2155-for-1.0.2-3.x-port.patch


 There currently isn't a solution in Solr for doing geospatial filtering on 
 documents that have a variable number of points.  This scenario occurs when 
 there is location extraction (i.e. via a gazateer) occurring on free text.  
 None, one, or many geospatial locations might be extracted from any given 
 document and users want to limit their search results to those occurring in a 
 user-specified area.
 I've implemented this by furthering the GeoHash based work in Lucene/Solr 
 with a geohash prefix based filter.  A geohash refers to a lat-lon box on the 
 earth.  Each successive character added further subdivides the box into a 4x8 
 (or 8x4 depending on the even/odd length of the geohash) grid.  The first 
 step in this scheme is figuring out which geohash grid squares cover the 
 user's search query.  I've added various extra methods to GeoHashUtils (and 
 added tests) to assist in this purpose.  The next step is an actual Lucene 
 Filter, GeoHashPrefixFilter, that uses these geohash prefixes in 
 TermsEnum.seek() to skip to relevant grid squares in the index.  Once a 
 matching geohash grid is found, the points therein are compared against the 
 user's query to see if it matches.  I created an abstraction GeoShape 
 extended by subclasses named PointDistance... and CartesianBox to support 
 different queried shapes so that the filter need not care about these details.
 This work was presented at LuceneRevolution in Boston on October 8th.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-2155) Geospatial search using geohash prefixes

2012-03-15 Thread Harley Parks (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230289#comment-13230289
 ] 

Harley Parks commented on SOLR-2155:


Bill: 
the doc's on queryParser state the name of the function can then be used as the 
main query, gh_geofilt, perhaps something like: /select?q={!gh_geofilt}... but, 
good question.
geofilt is working for me on multivalued fields.

my issue is the query result returns the geohash string, not the geohash lat, 
long.
In building the v 1.0.3 jar file for solr2155, I used jdk 6. as I didn't see 
any errors, so hopefully, that's fine.
so, I'm going to see if solr 3.5 will perhaps resolve my issue.

 Geospatial search using geohash prefixes
 

 Key: SOLR-2155
 URL: https://issues.apache.org/jira/browse/SOLR-2155
 Project: Solr
  Issue Type: Improvement
Reporter: David Smiley
 Attachments: GeoHashPrefixFilter.patch, GeoHashPrefixFilter.patch, 
 GeoHashPrefixFilter.patch, 
 SOLR-2155_GeoHashPrefixFilter_with_sorting_no_poly.patch, SOLR.2155.p3.patch, 
 SOLR.2155.p3tests.patch, Solr2155-1.0.2-project.zip, 
 Solr2155-1.0.3-project.zip, Solr2155-for-1.0.2-3.x-port.patch


 There currently isn't a solution in Solr for doing geospatial filtering on 
 documents that have a variable number of points.  This scenario occurs when 
 there is location extraction (i.e. via a gazateer) occurring on free text.  
 None, one, or many geospatial locations might be extracted from any given 
 document and users want to limit their search results to those occurring in a 
 user-specified area.
 I've implemented this by furthering the GeoHash based work in Lucene/Solr 
 with a geohash prefix based filter.  A geohash refers to a lat-lon box on the 
 earth.  Each successive character added further subdivides the box into a 4x8 
 (or 8x4 depending on the even/odd length of the geohash) grid.  The first 
 step in this scheme is figuring out which geohash grid squares cover the 
 user's search query.  I've added various extra methods to GeoHashUtils (and 
 added tests) to assist in this purpose.  The next step is an actual Lucene 
 Filter, GeoHashPrefixFilter, that uses these geohash prefixes in 
 TermsEnum.seek() to skip to relevant grid squares in the index.  Once a 
 matching geohash grid is found, the points therein are compared against the 
 user's query to see if it matches.  I created an abstraction GeoShape 
 extended by subclasses named PointDistance... and CartesianBox to support 
 different queried shapes so that the filter need not care about these details.
 This work was presented at LuceneRevolution in Boston on October 8th.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-2155) Geospatial search using geohash prefixes

2012-03-15 Thread Harley Parks (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230346#comment-13230346
 ] 

Harley Parks commented on SOLR-2155:


Oh.. I may have messed up my build, since i did not include the solr 3.4 jar 
files in the class path...
is there an enviorment variable that maven will use? such as CLASSPATH or a lib 
folder in the project being built?

 Geospatial search using geohash prefixes
 

 Key: SOLR-2155
 URL: https://issues.apache.org/jira/browse/SOLR-2155
 Project: Solr
  Issue Type: Improvement
Reporter: David Smiley
 Attachments: GeoHashPrefixFilter.patch, GeoHashPrefixFilter.patch, 
 GeoHashPrefixFilter.patch, 
 SOLR-2155_GeoHashPrefixFilter_with_sorting_no_poly.patch, SOLR.2155.p3.patch, 
 SOLR.2155.p3tests.patch, Solr2155-1.0.2-project.zip, 
 Solr2155-1.0.3-project.zip, Solr2155-for-1.0.2-3.x-port.patch


 There currently isn't a solution in Solr for doing geospatial filtering on 
 documents that have a variable number of points.  This scenario occurs when 
 there is location extraction (i.e. via a gazateer) occurring on free text.  
 None, one, or many geospatial locations might be extracted from any given 
 document and users want to limit their search results to those occurring in a 
 user-specified area.
 I've implemented this by furthering the GeoHash based work in Lucene/Solr 
 with a geohash prefix based filter.  A geohash refers to a lat-lon box on the 
 earth.  Each successive character added further subdivides the box into a 4x8 
 (or 8x4 depending on the even/odd length of the geohash) grid.  The first 
 step in this scheme is figuring out which geohash grid squares cover the 
 user's search query.  I've added various extra methods to GeoHashUtils (and 
 added tests) to assist in this purpose.  The next step is an actual Lucene 
 Filter, GeoHashPrefixFilter, that uses these geohash prefixes in 
 TermsEnum.seek() to skip to relevant grid squares in the index.  Once a 
 matching geohash grid is found, the points therein are compared against the 
 user's query to see if it matches.  I created an abstraction GeoShape 
 extended by subclasses named PointDistance... and CartesianBox to support 
 different queried shapes so that the filter need not care about these details.
 This work was presented at LuceneRevolution in Boston on October 8th.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-2155) Geospatial search using geohash prefixes

2012-03-15 Thread Harley Parks (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230398#comment-13230398
 ] 

Harley Parks commented on SOLR-2155:


All of the Class Paths in the solr1.0.3 project point to apache solr 3.4 
libraries on the apache website... so no action needed, to answer my own 
question. I'm stumped.

 Geospatial search using geohash prefixes
 

 Key: SOLR-2155
 URL: https://issues.apache.org/jira/browse/SOLR-2155
 Project: Solr
  Issue Type: Improvement
Reporter: David Smiley
 Attachments: GeoHashPrefixFilter.patch, GeoHashPrefixFilter.patch, 
 GeoHashPrefixFilter.patch, 
 SOLR-2155_GeoHashPrefixFilter_with_sorting_no_poly.patch, SOLR.2155.p3.patch, 
 SOLR.2155.p3tests.patch, Solr2155-1.0.2-project.zip, 
 Solr2155-1.0.3-project.zip, Solr2155-for-1.0.2-3.x-port.patch


 There currently isn't a solution in Solr for doing geospatial filtering on 
 documents that have a variable number of points.  This scenario occurs when 
 there is location extraction (i.e. via a gazateer) occurring on free text.  
 None, one, or many geospatial locations might be extracted from any given 
 document and users want to limit their search results to those occurring in a 
 user-specified area.
 I've implemented this by furthering the GeoHash based work in Lucene/Solr 
 with a geohash prefix based filter.  A geohash refers to a lat-lon box on the 
 earth.  Each successive character added further subdivides the box into a 4x8 
 (or 8x4 depending on the even/odd length of the geohash) grid.  The first 
 step in this scheme is figuring out which geohash grid squares cover the 
 user's search query.  I've added various extra methods to GeoHashUtils (and 
 added tests) to assist in this purpose.  The next step is an actual Lucene 
 Filter, GeoHashPrefixFilter, that uses these geohash prefixes in 
 TermsEnum.seek() to skip to relevant grid squares in the index.  Once a 
 matching geohash grid is found, the points therein are compared against the 
 user's query to see if it matches.  I created an abstraction GeoShape 
 extended by subclasses named PointDistance... and CartesianBox to support 
 different queried shapes so that the filter need not care about these details.
 This work was presented at LuceneRevolution in Boston on October 8th.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-2155) Geospatial search using geohash prefixes

2012-03-15 Thread Harley Parks (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230781#comment-13230781
 ] 

Harley Parks commented on SOLR-2155:


I used a fresh install of solr 3.4, Just in case our implementation had 
something to do with it, still not able to return the lat,long and only get the 
geohash.

what's interesting is I can change the field's type between solr-2155's geohash 
 solr's geohash, the query returned format changes from geohash to lat,long 
without reindexing.

Where in the code does the geohash value get returned in a query?

Help.

 Geospatial search using geohash prefixes
 

 Key: SOLR-2155
 URL: https://issues.apache.org/jira/browse/SOLR-2155
 Project: Solr
  Issue Type: Improvement
Reporter: David Smiley
 Attachments: GeoHashPrefixFilter.patch, GeoHashPrefixFilter.patch, 
 GeoHashPrefixFilter.patch, 
 SOLR-2155_GeoHashPrefixFilter_with_sorting_no_poly.patch, SOLR.2155.p3.patch, 
 SOLR.2155.p3tests.patch, Solr2155-1.0.2-project.zip, 
 Solr2155-1.0.3-project.zip, Solr2155-for-1.0.2-3.x-port.patch


 There currently isn't a solution in Solr for doing geospatial filtering on 
 documents that have a variable number of points.  This scenario occurs when 
 there is location extraction (i.e. via a gazateer) occurring on free text.  
 None, one, or many geospatial locations might be extracted from any given 
 document and users want to limit their search results to those occurring in a 
 user-specified area.
 I've implemented this by furthering the GeoHash based work in Lucene/Solr 
 with a geohash prefix based filter.  A geohash refers to a lat-lon box on the 
 earth.  Each successive character added further subdivides the box into a 4x8 
 (or 8x4 depending on the even/odd length of the geohash) grid.  The first 
 step in this scheme is figuring out which geohash grid squares cover the 
 user's search query.  I've added various extra methods to GeoHashUtils (and 
 added tests) to assist in this purpose.  The next step is an actual Lucene 
 Filter, GeoHashPrefixFilter, that uses these geohash prefixes in 
 TermsEnum.seek() to skip to relevant grid squares in the index.  Once a 
 matching geohash grid is found, the points therein are compared against the 
 user's query to see if it matches.  I created an abstraction GeoShape 
 extended by subclasses named PointDistance... and CartesianBox to support 
 different queried shapes so that the filter need not care about these details.
 This work was presented at LuceneRevolution in Boston on October 8th.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-2155) Geospatial search using geohash prefixes

2012-03-15 Thread Harley Parks (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13230843#comment-13230843
 ] 

Harley Parks commented on SOLR-2155:


Sorry, for the endless droning.
But, for now, the plan is to store the latlong and the geohash in two fields.
search on the geohash, and map on the latlong.


 Geospatial search using geohash prefixes
 

 Key: SOLR-2155
 URL: https://issues.apache.org/jira/browse/SOLR-2155
 Project: Solr
  Issue Type: Improvement
Reporter: David Smiley
 Attachments: GeoHashPrefixFilter.patch, GeoHashPrefixFilter.patch, 
 GeoHashPrefixFilter.patch, 
 SOLR-2155_GeoHashPrefixFilter_with_sorting_no_poly.patch, SOLR.2155.p3.patch, 
 SOLR.2155.p3tests.patch, Solr2155-1.0.2-project.zip, 
 Solr2155-1.0.3-project.zip, Solr2155-for-1.0.2-3.x-port.patch


 There currently isn't a solution in Solr for doing geospatial filtering on 
 documents that have a variable number of points.  This scenario occurs when 
 there is location extraction (i.e. via a gazateer) occurring on free text.  
 None, one, or many geospatial locations might be extracted from any given 
 document and users want to limit their search results to those occurring in a 
 user-specified area.
 I've implemented this by furthering the GeoHash based work in Lucene/Solr 
 with a geohash prefix based filter.  A geohash refers to a lat-lon box on the 
 earth.  Each successive character added further subdivides the box into a 4x8 
 (or 8x4 depending on the even/odd length of the geohash) grid.  The first 
 step in this scheme is figuring out which geohash grid squares cover the 
 user's search query.  I've added various extra methods to GeoHashUtils (and 
 added tests) to assist in this purpose.  The next step is an actual Lucene 
 Filter, GeoHashPrefixFilter, that uses these geohash prefixes in 
 TermsEnum.seek() to skip to relevant grid squares in the index.  Once a 
 matching geohash grid is found, the points therein are compared against the 
 user's query to see if it matches.  I created an abstraction GeoShape 
 extended by subclasses named PointDistance... and CartesianBox to support 
 different queried shapes so that the filter need not care about these details.
 This work was presented at LuceneRevolution in Boston on October 8th.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-2155) Geospatial search using geohash prefixes

2012-03-14 Thread Harley Parks (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13229423#comment-13229423
 ] 

Harley Parks commented on SOLR-2155:


did i need to rebuild the index after making changes to the schema?
What does a valid url query for geohash look like?

 Geospatial search using geohash prefixes
 

 Key: SOLR-2155
 URL: https://issues.apache.org/jira/browse/SOLR-2155
 Project: Solr
  Issue Type: Improvement
Reporter: David Smiley
 Attachments: GeoHashPrefixFilter.patch, GeoHashPrefixFilter.patch, 
 GeoHashPrefixFilter.patch, 
 SOLR-2155_GeoHashPrefixFilter_with_sorting_no_poly.patch, SOLR.2155.p3.patch, 
 SOLR.2155.p3tests.patch, Solr2155-1.0.2-project.zip, 
 Solr2155-1.0.3-project.zip, Solr2155-for-1.0.2-3.x-port.patch


 There currently isn't a solution in Solr for doing geospatial filtering on 
 documents that have a variable number of points.  This scenario occurs when 
 there is location extraction (i.e. via a gazateer) occurring on free text.  
 None, one, or many geospatial locations might be extracted from any given 
 document and users want to limit their search results to those occurring in a 
 user-specified area.
 I've implemented this by furthering the GeoHash based work in Lucene/Solr 
 with a geohash prefix based filter.  A geohash refers to a lat-lon box on the 
 earth.  Each successive character added further subdivides the box into a 4x8 
 (or 8x4 depending on the even/odd length of the geohash) grid.  The first 
 step in this scheme is figuring out which geohash grid squares cover the 
 user's search query.  I've added various extra methods to GeoHashUtils (and 
 added tests) to assist in this purpose.  The next step is an actual Lucene 
 Filter, GeoHashPrefixFilter, that uses these geohash prefixes in 
 TermsEnum.seek() to skip to relevant grid squares in the index.  Once a 
 matching geohash grid is found, the points therein are compared against the 
 user's query to see if it matches.  I created an abstraction GeoShape 
 extended by subclasses named PointDistance... and CartesianBox to support 
 different queried shapes so that the filter need not care about these details.
 This work was presented at LuceneRevolution in Boston on October 8th.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-2155) Geospatial search using geohash prefixes

2012-03-14 Thread Harley Parks (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13229777#comment-13229777
 ] 

Harley Parks commented on SOLR-2155:


Wait...  there is GeoHashUtils.decode(geohashString)... I wonder if I can 
create a field in solr that returns lat, lng pair?

 Geospatial search using geohash prefixes
 

 Key: SOLR-2155
 URL: https://issues.apache.org/jira/browse/SOLR-2155
 Project: Solr
  Issue Type: Improvement
Reporter: David Smiley
 Attachments: GeoHashPrefixFilter.patch, GeoHashPrefixFilter.patch, 
 GeoHashPrefixFilter.patch, 
 SOLR-2155_GeoHashPrefixFilter_with_sorting_no_poly.patch, SOLR.2155.p3.patch, 
 SOLR.2155.p3tests.patch, Solr2155-1.0.2-project.zip, 
 Solr2155-1.0.3-project.zip, Solr2155-for-1.0.2-3.x-port.patch


 There currently isn't a solution in Solr for doing geospatial filtering on 
 documents that have a variable number of points.  This scenario occurs when 
 there is location extraction (i.e. via a gazateer) occurring on free text.  
 None, one, or many geospatial locations might be extracted from any given 
 document and users want to limit their search results to those occurring in a 
 user-specified area.
 I've implemented this by furthering the GeoHash based work in Lucene/Solr 
 with a geohash prefix based filter.  A geohash refers to a lat-lon box on the 
 earth.  Each successive character added further subdivides the box into a 4x8 
 (or 8x4 depending on the even/odd length of the geohash) grid.  The first 
 step in this scheme is figuring out which geohash grid squares cover the 
 user's search query.  I've added various extra methods to GeoHashUtils (and 
 added tests) to assist in this purpose.  The next step is an actual Lucene 
 Filter, GeoHashPrefixFilter, that uses these geohash prefixes in 
 TermsEnum.seek() to skip to relevant grid squares in the index.  Once a 
 matching geohash grid is found, the points therein are compared against the 
 user's query to see if it matches.  I created an abstraction GeoShape 
 extended by subclasses named PointDistance... and CartesianBox to support 
 different queried shapes so that the filter need not care about these details.
 This work was presented at LuceneRevolution in Boston on October 8th.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-2155) Geospatial search using geohash prefixes

2012-03-14 Thread David Smiley (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13229869#comment-13229869
 ] 

David Smiley commented on SOLR-2155:


Harley,
  You shouldn't need to know a thing about geohashes to use SOLR-2155.  You use 
it identically to LatLonType insofar as you add the data in lat,lon format and 
get it out the same way in search results, and you can use the built-in geofilt 
query parser (my gh_geofilt is has other options and is optional).  Perhaps you 
are inadvertently using Solr's geohash field?  And/or maybe you forgot to 
reindex?

p.s. please use the JIRA comment edit feature sparingly; it sends interested 
parties notifications each time.

 Geospatial search using geohash prefixes
 

 Key: SOLR-2155
 URL: https://issues.apache.org/jira/browse/SOLR-2155
 Project: Solr
  Issue Type: Improvement
Reporter: David Smiley
 Attachments: GeoHashPrefixFilter.patch, GeoHashPrefixFilter.patch, 
 GeoHashPrefixFilter.patch, 
 SOLR-2155_GeoHashPrefixFilter_with_sorting_no_poly.patch, SOLR.2155.p3.patch, 
 SOLR.2155.p3tests.patch, Solr2155-1.0.2-project.zip, 
 Solr2155-1.0.3-project.zip, Solr2155-for-1.0.2-3.x-port.patch


 There currently isn't a solution in Solr for doing geospatial filtering on 
 documents that have a variable number of points.  This scenario occurs when 
 there is location extraction (i.e. via a gazateer) occurring on free text.  
 None, one, or many geospatial locations might be extracted from any given 
 document and users want to limit their search results to those occurring in a 
 user-specified area.
 I've implemented this by furthering the GeoHash based work in Lucene/Solr 
 with a geohash prefix based filter.  A geohash refers to a lat-lon box on the 
 earth.  Each successive character added further subdivides the box into a 4x8 
 (or 8x4 depending on the even/odd length of the geohash) grid.  The first 
 step in this scheme is figuring out which geohash grid squares cover the 
 user's search query.  I've added various extra methods to GeoHashUtils (and 
 added tests) to assist in this purpose.  The next step is an actual Lucene 
 Filter, GeoHashPrefixFilter, that uses these geohash prefixes in 
 TermsEnum.seek() to skip to relevant grid squares in the index.  Once a 
 matching geohash grid is found, the points therein are compared against the 
 user's query to see if it matches.  I created an abstraction GeoShape 
 extended by subclasses named PointDistance... and CartesianBox to support 
 different queried shapes so that the filter need not care about these details.
 This work was presented at LuceneRevolution in Boston on October 8th.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: [jira] [Commented] (SOLR-2155) Geospatial search using geohash prefixes

2012-03-13 Thread Erick Erickson
Be brave, just go for building it G...
It's actually surprisingly easy.

Start here:

http://wiki.apache.org/solr/HowToContribute


you should be able to get this all running (assuming
you have svn, ant and a jdk) in about 15 minutes,
exclusive of the checkout time. I was amazed first
time I tried it.

One note: Solr 3.x all go against the 1.5 JDK. It should
work against the 1.6, but note that the release was compiled
with 1.5.



Applying the patches is patching source code and recompiling


Best
Erick

On Tue, Mar 13, 2012 at 12:17 AM, Harley Parks (Commented) (JIRA)
j...@apache.org wrote:

    [ 
 https://issues.apache.org/jira/browse/SOLR-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13228218#comment-13228218
  ]

 Harley Parks commented on SOLR-2155:
 

 So, basic question.. perhaps needs to be posted else where.
 I'm working with Solr 3.4, using the GeoHash to store multiple locations for 
 a document.
 if geofilt or geodist doesn't work with the GeoHash, is the only way to add 
 this patch into solr 3.4?
 I'm using a tomcat and solr, and jumping to 4.0 might be a while, even if 
 it's released soon.

 I'm not real clear how to apply the patch, as I would need basically create 
 the solr.war file from the source... and compile all of the other sources... 
 painful, but once setup, perhaps rewarding.

 Ideally, I would have a jar file from the patch, that I drop into the 
 solr/lib and make the needed changes to the config files.

 So, I'm real interested in getting something stable, so i'm watching the 
 above mentioned links.

 David Smiley, would you be able to amend your book - Apache Solr 3 ESS, which 
 mentions solr-2155, to include how to implement this patch? or do I need to 
 get brave, and build the source?

 Geospatial search using geohash prefixes
 

                 Key: SOLR-2155
                 URL: https://issues.apache.org/jira/browse/SOLR-2155
             Project: Solr
          Issue Type: Improvement
            Reporter: David Smiley
         Attachments: GeoHashPrefixFilter.patch, GeoHashPrefixFilter.patch, 
 GeoHashPrefixFilter.patch, 
 SOLR-2155_GeoHashPrefixFilter_with_sorting_no_poly.patch, 
 SOLR.2155.p3.patch, SOLR.2155.p3tests.patch, Solr2155-1.0.2-project.zip, 
 Solr2155-1.0.3-project.zip, Solr2155-for-1.0.2-3.x-port.patch


 There currently isn't a solution in Solr for doing geospatial filtering on 
 documents that have a variable number of points.  This scenario occurs when 
 there is location extraction (i.e. via a gazateer) occurring on free text. 
  None, one, or many geospatial locations might be extracted from any given 
 document and users want to limit their search results to those occurring in 
 a user-specified area.
 I've implemented this by furthering the GeoHash based work in Lucene/Solr 
 with a geohash prefix based filter.  A geohash refers to a lat-lon box on 
 the earth.  Each successive character added further subdivides the box into 
 a 4x8 (or 8x4 depending on the even/odd length of the geohash) grid.  The 
 first step in this scheme is figuring out which geohash grid squares cover 
 the user's search query.  I've added various extra methods to GeoHashUtils 
 (and added tests) to assist in this purpose.  The next step is an actual 
 Lucene Filter, GeoHashPrefixFilter, that uses these geohash prefixes in 
 TermsEnum.seek() to skip to relevant grid squares in the index.  Once a 
 matching geohash grid is found, the points therein are compared against the 
 user's query to see if it matches.  I created an abstraction GeoShape 
 extended by subclasses named PointDistance... and CartesianBox to 
 support different queried shapes so that the filter need not care about 
 these details.
 This work was presented at LuceneRevolution in Boston on October 8th.

 --
 This message is automatically generated by JIRA.
 If you think it was sent incorrectly, please contact your JIRA 
 administrators: 
 https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
 For more information on JIRA, see: http://www.atlassian.com/software/jira



 -
 To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
 For additional commands, e-mail: dev-h...@lucene.apache.org


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-2155) Geospatial search using geohash prefixes

2012-03-13 Thread David Smiley (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13228408#comment-13228408
 ] 

David Smiley commented on SOLR-2155:


Harley,
  Apparently I haven't been clear because this question does come up often, and 
I sympathize with you all because the comments on this issue are ridiculously 
long.  What I should have done and still can do is add info to the Solr wiki.  
Ever since ~September 2011, you no longer need to patch Solr and you can use 
any 3x release.  The specific comment with further info announcing this is:
https://issues.apache.org/jira/browse/SOLR-2155?focusedCommentId=13117350page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13117350
If you look at the attachments to the issue, you'll notice the latest version 
is 1.0.3.

When Solr 3.6 comes out (soon!), my co-author and I will write an online 
addendum to discuss the changes in Solr 3.5  Solr 3.6 that affect the content 
of the book, or are interesting things that we would have written about if we 
were still writing it.  I'll add a clarification to the existing info box on 
144 mentioning SOLR-2155 that this feature is available in plugin form to 3x 
without patching Solr.



 Geospatial search using geohash prefixes
 

 Key: SOLR-2155
 URL: https://issues.apache.org/jira/browse/SOLR-2155
 Project: Solr
  Issue Type: Improvement
Reporter: David Smiley
 Attachments: GeoHashPrefixFilter.patch, GeoHashPrefixFilter.patch, 
 GeoHashPrefixFilter.patch, 
 SOLR-2155_GeoHashPrefixFilter_with_sorting_no_poly.patch, SOLR.2155.p3.patch, 
 SOLR.2155.p3tests.patch, Solr2155-1.0.2-project.zip, 
 Solr2155-1.0.3-project.zip, Solr2155-for-1.0.2-3.x-port.patch


 There currently isn't a solution in Solr for doing geospatial filtering on 
 documents that have a variable number of points.  This scenario occurs when 
 there is location extraction (i.e. via a gazateer) occurring on free text.  
 None, one, or many geospatial locations might be extracted from any given 
 document and users want to limit their search results to those occurring in a 
 user-specified area.
 I've implemented this by furthering the GeoHash based work in Lucene/Solr 
 with a geohash prefix based filter.  A geohash refers to a lat-lon box on the 
 earth.  Each successive character added further subdivides the box into a 4x8 
 (or 8x4 depending on the even/odd length of the geohash) grid.  The first 
 step in this scheme is figuring out which geohash grid squares cover the 
 user's search query.  I've added various extra methods to GeoHashUtils (and 
 added tests) to assist in this purpose.  The next step is an actual Lucene 
 Filter, GeoHashPrefixFilter, that uses these geohash prefixes in 
 TermsEnum.seek() to skip to relevant grid squares in the index.  Once a 
 matching geohash grid is found, the points therein are compared against the 
 user's query to see if it matches.  I created an abstraction GeoShape 
 extended by subclasses named PointDistance... and CartesianBox to support 
 different queried shapes so that the filter need not care about these details.
 This work was presented at LuceneRevolution in Boston on October 8th.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-2155) Geospatial search using geohash prefixes

2012-03-13 Thread Harley Parks (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13228939#comment-13228939
 ] 

Harley Parks commented on SOLR-2155:


I am missing the  package solr2155.lucene.spatial.geometry.shape;



 Geospatial search using geohash prefixes
 

 Key: SOLR-2155
 URL: https://issues.apache.org/jira/browse/SOLR-2155
 Project: Solr
  Issue Type: Improvement
Reporter: David Smiley
 Attachments: GeoHashPrefixFilter.patch, GeoHashPrefixFilter.patch, 
 GeoHashPrefixFilter.patch, 
 SOLR-2155_GeoHashPrefixFilter_with_sorting_no_poly.patch, SOLR.2155.p3.patch, 
 SOLR.2155.p3tests.patch, Solr2155-1.0.2-project.zip, 
 Solr2155-1.0.3-project.zip, Solr2155-for-1.0.2-3.x-port.patch


 There currently isn't a solution in Solr for doing geospatial filtering on 
 documents that have a variable number of points.  This scenario occurs when 
 there is location extraction (i.e. via a gazateer) occurring on free text.  
 None, one, or many geospatial locations might be extracted from any given 
 document and users want to limit their search results to those occurring in a 
 user-specified area.
 I've implemented this by furthering the GeoHash based work in Lucene/Solr 
 with a geohash prefix based filter.  A geohash refers to a lat-lon box on the 
 earth.  Each successive character added further subdivides the box into a 4x8 
 (or 8x4 depending on the even/odd length of the geohash) grid.  The first 
 step in this scheme is figuring out which geohash grid squares cover the 
 user's search query.  I've added various extra methods to GeoHashUtils (and 
 added tests) to assist in this purpose.  The next step is an actual Lucene 
 Filter, GeoHashPrefixFilter, that uses these geohash prefixes in 
 TermsEnum.seek() to skip to relevant grid squares in the index.  Once a 
 matching geohash grid is found, the points therein are compared against the 
 user's query to see if it matches.  I created an abstraction GeoShape 
 extended by subclasses named PointDistance... and CartesianBox to support 
 different queried shapes so that the filter need not care about these details.
 This work was presented at LuceneRevolution in Boston on October 8th.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-2155) Geospatial search using geohash prefixes

2012-03-13 Thread Harley Parks (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13228938#comment-13228938
 ] 

Harley Parks commented on SOLR-2155:


I am missing the  package solr2155.lucene.spatial.geometry.shape;



 Geospatial search using geohash prefixes
 

 Key: SOLR-2155
 URL: https://issues.apache.org/jira/browse/SOLR-2155
 Project: Solr
  Issue Type: Improvement
Reporter: David Smiley
 Attachments: GeoHashPrefixFilter.patch, GeoHashPrefixFilter.patch, 
 GeoHashPrefixFilter.patch, 
 SOLR-2155_GeoHashPrefixFilter_with_sorting_no_poly.patch, SOLR.2155.p3.patch, 
 SOLR.2155.p3tests.patch, Solr2155-1.0.2-project.zip, 
 Solr2155-1.0.3-project.zip, Solr2155-for-1.0.2-3.x-port.patch


 There currently isn't a solution in Solr for doing geospatial filtering on 
 documents that have a variable number of points.  This scenario occurs when 
 there is location extraction (i.e. via a gazateer) occurring on free text.  
 None, one, or many geospatial locations might be extracted from any given 
 document and users want to limit their search results to those occurring in a 
 user-specified area.
 I've implemented this by furthering the GeoHash based work in Lucene/Solr 
 with a geohash prefix based filter.  A geohash refers to a lat-lon box on the 
 earth.  Each successive character added further subdivides the box into a 4x8 
 (or 8x4 depending on the even/odd length of the geohash) grid.  The first 
 step in this scheme is figuring out which geohash grid squares cover the 
 user's search query.  I've added various extra methods to GeoHashUtils (and 
 added tests) to assist in this purpose.  The next step is an actual Lucene 
 Filter, GeoHashPrefixFilter, that uses these geohash prefixes in 
 TermsEnum.seek() to skip to relevant grid squares in the index.  Once a 
 matching geohash grid is found, the points therein are compared against the 
 user's query to see if it matches.  I created an abstraction GeoShape 
 extended by subclasses named PointDistance... and CartesianBox to support 
 different queried shapes so that the filter need not care about these details.
 This work was presented at LuceneRevolution in Boston on October 8th.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-2155) Geospatial search using geohash prefixes

2012-03-12 Thread Harley Parks (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13228218#comment-13228218
 ] 

Harley Parks commented on SOLR-2155:


So, basic question.. perhaps needs to be posted else where.
I'm working with Solr 3.4, using the GeoHash to store multiple locations for a 
document.
if geofilt or geodist doesn't work with the GeoHash, is the only way to add 
this patch into solr 3.4?
I'm using a tomcat and solr, and jumping to 4.0 might be a while, even if it's 
released soon.

I'm not real clear how to apply the patch, as I would need basically create the 
solr.war file from the source... and compile all of the other sources... 
painful, but once setup, perhaps rewarding. 

Ideally, I would have a jar file from the patch, that I drop into the solr/lib 
and make the needed changes to the config files.

So, I'm real interested in getting something stable, so i'm watching the above 
mentioned links.

David Smiley, would you be able to amend your book - Apache Solr 3 ESS, which 
mentions solr-2155, to include how to implement this patch? or do I need to get 
brave, and build the source?

 Geospatial search using geohash prefixes
 

 Key: SOLR-2155
 URL: https://issues.apache.org/jira/browse/SOLR-2155
 Project: Solr
  Issue Type: Improvement
Reporter: David Smiley
 Attachments: GeoHashPrefixFilter.patch, GeoHashPrefixFilter.patch, 
 GeoHashPrefixFilter.patch, 
 SOLR-2155_GeoHashPrefixFilter_with_sorting_no_poly.patch, SOLR.2155.p3.patch, 
 SOLR.2155.p3tests.patch, Solr2155-1.0.2-project.zip, 
 Solr2155-1.0.3-project.zip, Solr2155-for-1.0.2-3.x-port.patch


 There currently isn't a solution in Solr for doing geospatial filtering on 
 documents that have a variable number of points.  This scenario occurs when 
 there is location extraction (i.e. via a gazateer) occurring on free text.  
 None, one, or many geospatial locations might be extracted from any given 
 document and users want to limit their search results to those occurring in a 
 user-specified area.
 I've implemented this by furthering the GeoHash based work in Lucene/Solr 
 with a geohash prefix based filter.  A geohash refers to a lat-lon box on the 
 earth.  Each successive character added further subdivides the box into a 4x8 
 (or 8x4 depending on the even/odd length of the geohash) grid.  The first 
 step in this scheme is figuring out which geohash grid squares cover the 
 user's search query.  I've added various extra methods to GeoHashUtils (and 
 added tests) to assist in this purpose.  The next step is an actual Lucene 
 Filter, GeoHashPrefixFilter, that uses these geohash prefixes in 
 TermsEnum.seek() to skip to relevant grid squares in the index.  Once a 
 matching geohash grid is found, the points therein are compared against the 
 user's query to see if it matches.  I created an abstraction GeoShape 
 extended by subclasses named PointDistance... and CartesianBox to support 
 different queried shapes so that the filter need not care about these details.
 This work was presented at LuceneRevolution in Boston on October 8th.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-2155) Geospatial search using geohash prefixes

2012-02-16 Thread David Smiley (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13209613#comment-13209613
 ] 

David Smiley commented on SOLR-2155:


If someone watching this issue has an interest in this capability winding its 
way into Solr out of the box, then I suggest you vote (and maybe watch) 
LUCENE-3795.  That issue is the first step, the subsequent step is a follow-on 
issue that will bring LSP's spatial-solr module which uses spatial-lucene 
(LUCENE-3795). I don't intend or support committing SOLR-2155 as is.  Spatial 
done-right should involve a good framework; SOLR-2155 isn't a framework and 
Lucene's existing defunct spatial-contrib module isn't good.  That's where LSP 
comes in, and LUCENE-3795 is the first step to get it incorporated into 
Lucene/Solr.

 Geospatial search using geohash prefixes
 

 Key: SOLR-2155
 URL: https://issues.apache.org/jira/browse/SOLR-2155
 Project: Solr
  Issue Type: Improvement
Reporter: David Smiley
 Attachments: GeoHashPrefixFilter.patch, GeoHashPrefixFilter.patch, 
 GeoHashPrefixFilter.patch, 
 SOLR-2155_GeoHashPrefixFilter_with_sorting_no_poly.patch, SOLR.2155.p3.patch, 
 SOLR.2155.p3tests.patch, Solr2155-1.0.2-project.zip, 
 Solr2155-1.0.3-project.zip, Solr2155-for-1.0.2-3.x-port.patch


 There currently isn't a solution in Solr for doing geospatial filtering on 
 documents that have a variable number of points.  This scenario occurs when 
 there is location extraction (i.e. via a gazateer) occurring on free text.  
 None, one, or many geospatial locations might be extracted from any given 
 document and users want to limit their search results to those occurring in a 
 user-specified area.
 I've implemented this by furthering the GeoHash based work in Lucene/Solr 
 with a geohash prefix based filter.  A geohash refers to a lat-lon box on the 
 earth.  Each successive character added further subdivides the box into a 4x8 
 (or 8x4 depending on the even/odd length of the geohash) grid.  The first 
 step in this scheme is figuring out which geohash grid squares cover the 
 user's search query.  I've added various extra methods to GeoHashUtils (and 
 added tests) to assist in this purpose.  The next step is an actual Lucene 
 Filter, GeoHashPrefixFilter, that uses these geohash prefixes in 
 TermsEnum.seek() to skip to relevant grid squares in the index.  Once a 
 matching geohash grid is found, the points therein are compared against the 
 user's query to see if it matches.  I created an abstraction GeoShape 
 extended by subclasses named PointDistance... and CartesianBox to support 
 different queried shapes so that the filter need not care about these details.
 This work was presented at LuceneRevolution in Boston on October 8th.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-2155) Geospatial search using geohash prefixes

2012-01-14 Thread Bill Bell (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13186295#comment-13186295
 ] 

Bill Bell commented on SOLR-2155:
-

David,

We really need to get this into the trunk. What is left to do? Would really 
love to use this, but until it is moved into the trunk most clients won't. 

Maybe we can scale it back a bit, add the feature and iterate?

 Geospatial search using geohash prefixes
 

 Key: SOLR-2155
 URL: https://issues.apache.org/jira/browse/SOLR-2155
 Project: Solr
  Issue Type: Improvement
Reporter: David Smiley
 Attachments: GeoHashPrefixFilter.patch, GeoHashPrefixFilter.patch, 
 GeoHashPrefixFilter.patch, 
 SOLR-2155_GeoHashPrefixFilter_with_sorting_no_poly.patch, SOLR.2155.p3.patch, 
 SOLR.2155.p3tests.patch, Solr2155-1.0.2-project.zip, 
 Solr2155-1.0.3-project.zip, Solr2155-for-1.0.2-3.x-port.patch


 There currently isn't a solution in Solr for doing geospatial filtering on 
 documents that have a variable number of points.  This scenario occurs when 
 there is location extraction (i.e. via a gazateer) occurring on free text.  
 None, one, or many geospatial locations might be extracted from any given 
 document and users want to limit their search results to those occurring in a 
 user-specified area.
 I've implemented this by furthering the GeoHash based work in Lucene/Solr 
 with a geohash prefix based filter.  A geohash refers to a lat-lon box on the 
 earth.  Each successive character added further subdivides the box into a 4x8 
 (or 8x4 depending on the even/odd length of the geohash) grid.  The first 
 step in this scheme is figuring out which geohash grid squares cover the 
 user's search query.  I've added various extra methods to GeoHashUtils (and 
 added tests) to assist in this purpose.  The next step is an actual Lucene 
 Filter, GeoHashPrefixFilter, that uses these geohash prefixes in 
 TermsEnum.seek() to skip to relevant grid squares in the index.  Once a 
 matching geohash grid is found, the points therein are compared against the 
 user's query to see if it matches.  I created an abstraction GeoShape 
 extended by subclasses named PointDistance... and CartesianBox to support 
 different queried shapes so that the filter need not care about these details.
 This work was presented at LuceneRevolution in Boston on October 8th.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-2155) Geospatial search using geohash prefixes

2012-01-12 Thread Srikanth Kallurkar (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13185172#comment-13185172
 ] 

Srikanth Kallurkar commented on SOLR-2155:
--

Hi David, Apologies for taking a long time to reply. I incorporated your 
suggestions and did see some speedup. I started with geohash length at 12 and 
got reasonable speed up for 8. T
But I am bit confused by (1) accuracy associated with length and (2) the query 
time operation in this patch. 
(1) So, is the accuracy associated with how big/small an enclosing box is?
(2) The entire geohash field is loaded into memory at query time. Is this done 
because for the lat-lon comparisons, the patch cannot use lucene string 
matching mechanisms. Also, because the whole field is in memory, how are 
updates to the index handled. Meaning, if new lat-lons are added, do they get 
added to the in-memory geohashes. 

Thanks,
Srikanth

 Geospatial search using geohash prefixes
 

 Key: SOLR-2155
 URL: https://issues.apache.org/jira/browse/SOLR-2155
 Project: Solr
  Issue Type: Improvement
Reporter: David Smiley
 Attachments: GeoHashPrefixFilter.patch, GeoHashPrefixFilter.patch, 
 GeoHashPrefixFilter.patch, 
 SOLR-2155_GeoHashPrefixFilter_with_sorting_no_poly.patch, SOLR.2155.p3.patch, 
 SOLR.2155.p3tests.patch, Solr2155-1.0.2-project.zip, 
 Solr2155-1.0.3-project.zip, Solr2155-for-1.0.2-3.x-port.patch


 There currently isn't a solution in Solr for doing geospatial filtering on 
 documents that have a variable number of points.  This scenario occurs when 
 there is location extraction (i.e. via a gazateer) occurring on free text.  
 None, one, or many geospatial locations might be extracted from any given 
 document and users want to limit their search results to those occurring in a 
 user-specified area.
 I've implemented this by furthering the GeoHash based work in Lucene/Solr 
 with a geohash prefix based filter.  A geohash refers to a lat-lon box on the 
 earth.  Each successive character added further subdivides the box into a 4x8 
 (or 8x4 depending on the even/odd length of the geohash) grid.  The first 
 step in this scheme is figuring out which geohash grid squares cover the 
 user's search query.  I've added various extra methods to GeoHashUtils (and 
 added tests) to assist in this purpose.  The next step is an actual Lucene 
 Filter, GeoHashPrefixFilter, that uses these geohash prefixes in 
 TermsEnum.seek() to skip to relevant grid squares in the index.  Once a 
 matching geohash grid is found, the points therein are compared against the 
 user's query to see if it matches.  I created an abstraction GeoShape 
 extended by subclasses named PointDistance... and CartesianBox to support 
 different queried shapes so that the filter need not care about these details.
 This work was presented at LuceneRevolution in Boston on October 8th.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-2155) Geospatial search using geohash prefixes

2012-01-12 Thread David Smiley (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13185224#comment-13185224
 ] 

David Smiley commented on SOLR-2155:


Srikanth,
# According to the table on wikipedia, a geohash length of 8 translates to +- 
19 meters error.  This means that if you do a filter query, it may match points 
that are =19 meters outside of the query shape erroneously, or it may exclude 
points = 19 meters within the query shape erroneously.  The accuracy is 
associated to the indexing of the data, not the query shape.
# At query time, if you only do geospatial filters, then there is *no* RAM 
requirement for this field.  *If* you sort by distance or influence relevancy 
by distance, then the indexed points are all brought into memory inverted so 
that the center of the query shape can be compared against all indexed points 
for documents matching the query.  LatLonType does the same thing for the same 
reason, but this code is multiValue aware, returning the closest indexed point 
a document has to the query shape.  When a commit happens on Solr, this 
in-memory cache is unused and garbage collected, and the data is brought into 
memory again.

 Geospatial search using geohash prefixes
 

 Key: SOLR-2155
 URL: https://issues.apache.org/jira/browse/SOLR-2155
 Project: Solr
  Issue Type: Improvement
Reporter: David Smiley
 Attachments: GeoHashPrefixFilter.patch, GeoHashPrefixFilter.patch, 
 GeoHashPrefixFilter.patch, 
 SOLR-2155_GeoHashPrefixFilter_with_sorting_no_poly.patch, SOLR.2155.p3.patch, 
 SOLR.2155.p3tests.patch, Solr2155-1.0.2-project.zip, 
 Solr2155-1.0.3-project.zip, Solr2155-for-1.0.2-3.x-port.patch


 There currently isn't a solution in Solr for doing geospatial filtering on 
 documents that have a variable number of points.  This scenario occurs when 
 there is location extraction (i.e. via a gazateer) occurring on free text.  
 None, one, or many geospatial locations might be extracted from any given 
 document and users want to limit their search results to those occurring in a 
 user-specified area.
 I've implemented this by furthering the GeoHash based work in Lucene/Solr 
 with a geohash prefix based filter.  A geohash refers to a lat-lon box on the 
 earth.  Each successive character added further subdivides the box into a 4x8 
 (or 8x4 depending on the even/odd length of the geohash) grid.  The first 
 step in this scheme is figuring out which geohash grid squares cover the 
 user's search query.  I've added various extra methods to GeoHashUtils (and 
 added tests) to assist in this purpose.  The next step is an actual Lucene 
 Filter, GeoHashPrefixFilter, that uses these geohash prefixes in 
 TermsEnum.seek() to skip to relevant grid squares in the index.  Once a 
 matching geohash grid is found, the points therein are compared against the 
 user's query to see if it matches.  I created an abstraction GeoShape 
 extended by subclasses named PointDistance... and CartesianBox to support 
 different queried shapes so that the filter need not care about these details.
 This work was presented at LuceneRevolution in Boston on October 8th.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-2155) Geospatial search using geohash prefixes

2012-01-12 Thread Srikanth Kallurkar (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13185248#comment-13185248
 ] 

Srikanth Kallurkar commented on SOLR-2155:
--

Aha! Thanks for the explaination. 

 Geospatial search using geohash prefixes
 

 Key: SOLR-2155
 URL: https://issues.apache.org/jira/browse/SOLR-2155
 Project: Solr
  Issue Type: Improvement
Reporter: David Smiley
 Attachments: GeoHashPrefixFilter.patch, GeoHashPrefixFilter.patch, 
 GeoHashPrefixFilter.patch, 
 SOLR-2155_GeoHashPrefixFilter_with_sorting_no_poly.patch, SOLR.2155.p3.patch, 
 SOLR.2155.p3tests.patch, Solr2155-1.0.2-project.zip, 
 Solr2155-1.0.3-project.zip, Solr2155-for-1.0.2-3.x-port.patch


 There currently isn't a solution in Solr for doing geospatial filtering on 
 documents that have a variable number of points.  This scenario occurs when 
 there is location extraction (i.e. via a gazateer) occurring on free text.  
 None, one, or many geospatial locations might be extracted from any given 
 document and users want to limit their search results to those occurring in a 
 user-specified area.
 I've implemented this by furthering the GeoHash based work in Lucene/Solr 
 with a geohash prefix based filter.  A geohash refers to a lat-lon box on the 
 earth.  Each successive character added further subdivides the box into a 4x8 
 (or 8x4 depending on the even/odd length of the geohash) grid.  The first 
 step in this scheme is figuring out which geohash grid squares cover the 
 user's search query.  I've added various extra methods to GeoHashUtils (and 
 added tests) to assist in this purpose.  The next step is an actual Lucene 
 Filter, GeoHashPrefixFilter, that uses these geohash prefixes in 
 TermsEnum.seek() to skip to relevant grid squares in the index.  Once a 
 matching geohash grid is found, the points therein are compared against the 
 user's query to see if it matches.  I created an abstraction GeoShape 
 extended by subclasses named PointDistance... and CartesianBox to support 
 different queried shapes so that the filter need not care about these details.
 This work was presented at LuceneRevolution in Boston on October 8th.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-2155) Geospatial search using geohash prefixes

2011-12-08 Thread David Smiley (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13165396#comment-13165396
 ] 

David Smiley commented on SOLR-2155:


Srikanth: If I were you, I would increase ramBufferSizeMB in solrconfig.xml to 
a good amount -- perhaps to as much as 256MB.  Remember to do this in 
mainIndex, NOT indexDefaults.  Secondly and most importantly, I would 
configure the geohash length attribute on the field type to have sufficient 
search detail for your needs, but no more. Remember, making the geohash shorter 
and thus courser granularity doesn't mean you lose the accuracy in your 
stored value -- which is retained verbatim as you provided.  For a guide on 
what geohash length to use, consult the wikipedia page which has a table.  
Please let me know how this works out for you.

 Geospatial search using geohash prefixes
 

 Key: SOLR-2155
 URL: https://issues.apache.org/jira/browse/SOLR-2155
 Project: Solr
  Issue Type: Improvement
Reporter: David Smiley
 Attachments: GeoHashPrefixFilter.patch, GeoHashPrefixFilter.patch, 
 GeoHashPrefixFilter.patch, 
 SOLR-2155_GeoHashPrefixFilter_with_sorting_no_poly.patch, SOLR.2155.p3.patch, 
 SOLR.2155.p3tests.patch, Solr2155-1.0.2-project.zip, 
 Solr2155-1.0.3-project.zip, Solr2155-for-1.0.2-3.x-port.patch


 There currently isn't a solution in Solr for doing geospatial filtering on 
 documents that have a variable number of points.  This scenario occurs when 
 there is location extraction (i.e. via a gazateer) occurring on free text.  
 None, one, or many geospatial locations might be extracted from any given 
 document and users want to limit their search results to those occurring in a 
 user-specified area.
 I've implemented this by furthering the GeoHash based work in Lucene/Solr 
 with a geohash prefix based filter.  A geohash refers to a lat-lon box on the 
 earth.  Each successive character added further subdivides the box into a 4x8 
 (or 8x4 depending on the even/odd length of the geohash) grid.  The first 
 step in this scheme is figuring out which geohash grid squares cover the 
 user's search query.  I've added various extra methods to GeoHashUtils (and 
 added tests) to assist in this purpose.  The next step is an actual Lucene 
 Filter, GeoHashPrefixFilter, that uses these geohash prefixes in 
 TermsEnum.seek() to skip to relevant grid squares in the index.  Once a 
 matching geohash grid is found, the points therein are compared against the 
 user's query to see if it matches.  I created an abstraction GeoShape 
 extended by subclasses named PointDistance... and CartesianBox to support 
 different queried shapes so that the filter need not care about these details.
 This work was presented at LuceneRevolution in Boston on October 8th.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: [jira] [Commented] (SOLR-2155) Geospatial search using geohash prefixes

2011-12-08 Thread Mikhail Khludnev
Srikanth,

Have you tried to do a basic profiling or sampling? Just take a few thread
dumps by jstack. If the code is so greedy for CPU, you'll have it in a
stack.

Regards

On Thu, Dec 8, 2011 at 8:57 PM, Srikanth Kallurkar (Commented) (JIRA) 
j...@apache.org wrote:


[
 https://issues.apache.org/jira/browse/SOLR-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13165326#comment-13165326]

 Srikanth Kallurkar commented on SOLR-2155:
 --

 In my use case, I have a large number of lat-lons for each document - on
 the order of about 2K lat-lon pairs. Since the time we started using
 geohash prefix filter, the time to index has significantly degraded - by
 about 2-3 times. Are there any suggestions for speeding up the indexing
 process. I was trying to read the comments here, but am not sure if any
 index time caching mechanism is used (or could be used) to lookup geohashes.

 Thanks,
 Srikanth



  Geospatial search using geohash prefixes
  
 
  Key: SOLR-2155
  URL: https://issues.apache.org/jira/browse/SOLR-2155
  Project: Solr
   Issue Type: Improvement
 Reporter: David Smiley
  Attachments: GeoHashPrefixFilter.patch,
 GeoHashPrefixFilter.patch, GeoHashPrefixFilter.patch,
 SOLR-2155_GeoHashPrefixFilter_with_sorting_no_poly.patch,
 SOLR.2155.p3.patch, SOLR.2155.p3tests.patch, Solr2155-1.0.2-project.zip,
 Solr2155-1.0.3-project.zip, Solr2155-for-1.0.2-3.x-port.patch
 
 
  There currently isn't a solution in Solr for doing geospatial filtering
 on documents that have a variable number of points.  This scenario occurs
 when there is location extraction (i.e. via a gazateer) occurring on free
 text.  None, one, or many geospatial locations might be extracted from any
 given document and users want to limit their search results to those
 occurring in a user-specified area.
  I've implemented this by furthering the GeoHash based work in
 Lucene/Solr with a geohash prefix based filter.  A geohash refers to a
 lat-lon box on the earth.  Each successive character added further
 subdivides the box into a 4x8 (or 8x4 depending on the even/odd length of
 the geohash) grid.  The first step in this scheme is figuring out which
 geohash grid squares cover the user's search query.  I've added various
 extra methods to GeoHashUtils (and added tests) to assist in this purpose.
  The next step is an actual Lucene Filter, GeoHashPrefixFilter, that uses
 these geohash prefixes in TermsEnum.seek() to skip to relevant grid squares
 in the index.  Once a matching geohash grid is found, the points therein
 are compared against the user's query to see if it matches.  I created an
 abstraction GeoShape extended by subclasses named PointDistance... and
 CartesianBox to support different queried shapes so that the filter
 need not care about these details.
  This work was presented at LuceneRevolution in Boston on October 8th.

 --
 This message is automatically generated by JIRA.
 If you think it was sent incorrectly, please contact your JIRA
 administrators:
 https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
 For more information on JIRA, see: http://www.atlassian.com/software/jira



 -
 To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
 For additional commands, e-mail: dev-h...@lucene.apache.org




-- 
Sincerely yours
Mikhail Khludnev
Developer
Grid Dynamics
tel. 1-415-738-8644
Skype: mkhludnev
http://www.griddynamics.com
 mkhlud...@griddynamics.com


[jira] [Commented] (SOLR-2155) Geospatial search using geohash prefixes

2011-10-17 Thread Olivier Jacquet (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13128980#comment-13128980
 ] 

Olivier Jacquet commented on SOLR-2155:
---

I just wanted to mention another use case for multivalued point fields since 
everyone is always talking about this in a location context.

The PointType can also be used to categorize other other stuff. In my case 
we're storing qualifications of persons as a tuple of experience, function and 
skill (eg. senior, developer, java) which are internally represented by 
numerical ids. Now with Solr I would like to be able to do the query: return 
everything that is a java developer which would be the same as asking for all 
points on a certain line.

 Geospatial search using geohash prefixes
 

 Key: SOLR-2155
 URL: https://issues.apache.org/jira/browse/SOLR-2155
 Project: Solr
  Issue Type: Improvement
Reporter: David Smiley
Assignee: Grant Ingersoll
 Attachments: GeoHashPrefixFilter.patch, GeoHashPrefixFilter.patch, 
 GeoHashPrefixFilter.patch, 
 SOLR-2155_GeoHashPrefixFilter_with_sorting_no_poly.patch, SOLR.2155.p3.patch, 
 SOLR.2155.p3tests.patch, Solr2155-1.0.2-project.zip, 
 Solr2155-1.0.3-project.zip, Solr2155-for-1.0.2-3.x-port.patch


 There currently isn't a solution in Solr for doing geospatial filtering on 
 documents that have a variable number of points.  This scenario occurs when 
 there is location extraction (i.e. via a gazateer) occurring on free text.  
 None, one, or many geospatial locations might be extracted from any given 
 document and users want to limit their search results to those occurring in a 
 user-specified area.
 I've implemented this by furthering the GeoHash based work in Lucene/Solr 
 with a geohash prefix based filter.  A geohash refers to a lat-lon box on the 
 earth.  Each successive character added further subdivides the box into a 4x8 
 (or 8x4 depending on the even/odd length of the geohash) grid.  The first 
 step in this scheme is figuring out which geohash grid squares cover the 
 user's search query.  I've added various extra methods to GeoHashUtils (and 
 added tests) to assist in this purpose.  The next step is an actual Lucene 
 Filter, GeoHashPrefixFilter, that uses these geohash prefixes in 
 TermsEnum.seek() to skip to relevant grid squares in the index.  Once a 
 matching geohash grid is found, the points therein are compared against the 
 user's query to see if it matches.  I created an abstraction GeoShape 
 extended by subclasses named PointDistance... and CartesianBox to support 
 different queried shapes so that the filter need not care about these details.
 This work was presented at LuceneRevolution in Boston on October 8th.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-2155) Geospatial search using geohash prefixes

2011-10-17 Thread David Smiley (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13129002#comment-13129002
 ] 

David Smiley commented on SOLR-2155:


Oliver: Your scenario is interesting but I wouldn't recommend spatial for that. 
A key part of spatial is the use of a numerical range. In your case there are 
discrete values.  Instead, I recommend you experiment with phrase queries, and 
if you are expert in Lucene then span queries.  As a toy hack example, imagine 
indexing each of these values in the form senior developer java (3 words, one 
for each part). We assume each value tokenizes as one token.  Then search for 
the developer java in which the was substituted as a kind of wildcard for 
the first position to find java developers in all levels of experience. The 
is a stopword and in effect creates a wildcard placeholder.  If you search the 
solr-user list then you will see information on this topic.  I've solved this 
problem in a different more difficult way because my values were not single 
tokens, but based on the example you present, the solution I present here isn't 
bad.  If you want to discuss this further I recommend the solr-user list.

 Geospatial search using geohash prefixes
 

 Key: SOLR-2155
 URL: https://issues.apache.org/jira/browse/SOLR-2155
 Project: Solr
  Issue Type: Improvement
Reporter: David Smiley
Assignee: Grant Ingersoll
 Attachments: GeoHashPrefixFilter.patch, GeoHashPrefixFilter.patch, 
 GeoHashPrefixFilter.patch, 
 SOLR-2155_GeoHashPrefixFilter_with_sorting_no_poly.patch, SOLR.2155.p3.patch, 
 SOLR.2155.p3tests.patch, Solr2155-1.0.2-project.zip, 
 Solr2155-1.0.3-project.zip, Solr2155-for-1.0.2-3.x-port.patch


 There currently isn't a solution in Solr for doing geospatial filtering on 
 documents that have a variable number of points.  This scenario occurs when 
 there is location extraction (i.e. via a gazateer) occurring on free text.  
 None, one, or many geospatial locations might be extracted from any given 
 document and users want to limit their search results to those occurring in a 
 user-specified area.
 I've implemented this by furthering the GeoHash based work in Lucene/Solr 
 with a geohash prefix based filter.  A geohash refers to a lat-lon box on the 
 earth.  Each successive character added further subdivides the box into a 4x8 
 (or 8x4 depending on the even/odd length of the geohash) grid.  The first 
 step in this scheme is figuring out which geohash grid squares cover the 
 user's search query.  I've added various extra methods to GeoHashUtils (and 
 added tests) to assist in this purpose.  The next step is an actual Lucene 
 Filter, GeoHashPrefixFilter, that uses these geohash prefixes in 
 TermsEnum.seek() to skip to relevant grid squares in the index.  Once a 
 matching geohash grid is found, the points therein are compared against the 
 user's query to see if it matches.  I created an abstraction GeoShape 
 extended by subclasses named PointDistance... and CartesianBox to support 
 different queried shapes so that the filter need not care about these details.
 This work was presented at LuceneRevolution in Boston on October 8th.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-2155) Geospatial search using geohash prefixes

2011-10-07 Thread David Smiley (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13122801#comment-13122801
 ] 

David Smiley commented on SOLR-2155:


The patch file is against Solr trunk (v4). It is a little out of date but it 
should be easy to port it. Perhaps only some import statements would change due 
to some moves; I'm not sure. The Solr 4 solution I've been working on with 
Chris  Ryan is a larger spatial framework called LSP (a temporary name): 
http://code.google.com/p/lucene-spatial-playground/  Feel free to email me 
about LSP directly for assistance with it. Development on LSP has slowed but 
will pick up a lot within a few weeks.

 Geospatial search using geohash prefixes
 

 Key: SOLR-2155
 URL: https://issues.apache.org/jira/browse/SOLR-2155
 Project: Solr
  Issue Type: Improvement
Reporter: David Smiley
Assignee: Grant Ingersoll
 Attachments: GeoHashPrefixFilter.patch, GeoHashPrefixFilter.patch, 
 GeoHashPrefixFilter.patch, 
 SOLR-2155_GeoHashPrefixFilter_with_sorting_no_poly.patch, SOLR.2155.p3.patch, 
 SOLR.2155.p3tests.patch, Solr2155-1.0.2-project.zip, 
 Solr2155-1.0.3-project.zip, Solr2155-for-1.0.2-3.x-port.patch


 There currently isn't a solution in Solr for doing geospatial filtering on 
 documents that have a variable number of points.  This scenario occurs when 
 there is location extraction (i.e. via a gazateer) occurring on free text.  
 None, one, or many geospatial locations might be extracted from any given 
 document and users want to limit their search results to those occurring in a 
 user-specified area.
 I've implemented this by furthering the GeoHash based work in Lucene/Solr 
 with a geohash prefix based filter.  A geohash refers to a lat-lon box on the 
 earth.  Each successive character added further subdivides the box into a 4x8 
 (or 8x4 depending on the even/odd length of the geohash) grid.  The first 
 step in this scheme is figuring out which geohash grid squares cover the 
 user's search query.  I've added various extra methods to GeoHashUtils (and 
 added tests) to assist in this purpose.  The next step is an actual Lucene 
 Filter, GeoHashPrefixFilter, that uses these geohash prefixes in 
 TermsEnum.seek() to skip to relevant grid squares in the index.  Once a 
 matching geohash grid is found, the points therein are compared against the 
 user's query to see if it matches.  I created an abstraction GeoShape 
 extended by subclasses named PointDistance... and CartesianBox to support 
 different queried shapes so that the filter need not care about these details.
 This work was presented at LuceneRevolution in Boston on October 8th.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-2155) Geospatial search using geohash prefixes

2011-10-06 Thread David Smiley (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13121996#comment-13121996
 ] 

David Smiley commented on SOLR-2155:


Frederick, a rough inspection of your problem suggests that the GeoHashField is 
declared multiValue=true but the field in your POJO is not correspondingly a 
ListString like it should be.  If you only need a single value then I suggest 
you use LatLonType instead, since it's what comes with Solr.

 Geospatial search using geohash prefixes
 

 Key: SOLR-2155
 URL: https://issues.apache.org/jira/browse/SOLR-2155
 Project: Solr
  Issue Type: Improvement
Reporter: David Smiley
Assignee: Grant Ingersoll
 Attachments: GeoHashPrefixFilter.patch, GeoHashPrefixFilter.patch, 
 GeoHashPrefixFilter.patch, 
 SOLR-2155_GeoHashPrefixFilter_with_sorting_no_poly.patch, SOLR.2155.p3.patch, 
 SOLR.2155.p3tests.patch, Solr2155-1.0.2-project.zip, 
 Solr2155-1.0.3-project.zip, Solr2155-for-1.0.2-3.x-port.patch


 There currently isn't a solution in Solr for doing geospatial filtering on 
 documents that have a variable number of points.  This scenario occurs when 
 there is location extraction (i.e. via a gazateer) occurring on free text.  
 None, one, or many geospatial locations might be extracted from any given 
 document and users want to limit their search results to those occurring in a 
 user-specified area.
 I've implemented this by furthering the GeoHash based work in Lucene/Solr 
 with a geohash prefix based filter.  A geohash refers to a lat-lon box on the 
 earth.  Each successive character added further subdivides the box into a 4x8 
 (or 8x4 depending on the even/odd length of the geohash) grid.  The first 
 step in this scheme is figuring out which geohash grid squares cover the 
 user's search query.  I've added various extra methods to GeoHashUtils (and 
 added tests) to assist in this purpose.  The next step is an actual Lucene 
 Filter, GeoHashPrefixFilter, that uses these geohash prefixes in 
 TermsEnum.seek() to skip to relevant grid squares in the index.  Once a 
 matching geohash grid is found, the points therein are compared against the 
 user's query to see if it matches.  I created an abstraction GeoShape 
 extended by subclasses named PointDistance... and CartesianBox to support 
 different queried shapes so that the filter need not care about these details.
 This work was presented at LuceneRevolution in Boston on October 8th.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-2155) Geospatial search using geohash prefixes

2011-10-06 Thread arin ghazarian (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13122469#comment-13122469
 ] 

arin ghazarian commented on SOLR-2155:
--

Hi David,
i am interested in using this geohash prefix filtering/bbox feature in solr 4.x 
with solrcloud, do you have any plans to convert this plugin to a solr 4 
compatible one?
Thanks,
 arin



 Geospatial search using geohash prefixes
 

 Key: SOLR-2155
 URL: https://issues.apache.org/jira/browse/SOLR-2155
 Project: Solr
  Issue Type: Improvement
Reporter: David Smiley
Assignee: Grant Ingersoll
 Attachments: GeoHashPrefixFilter.patch, GeoHashPrefixFilter.patch, 
 GeoHashPrefixFilter.patch, 
 SOLR-2155_GeoHashPrefixFilter_with_sorting_no_poly.patch, SOLR.2155.p3.patch, 
 SOLR.2155.p3tests.patch, Solr2155-1.0.2-project.zip, 
 Solr2155-1.0.3-project.zip, Solr2155-for-1.0.2-3.x-port.patch


 There currently isn't a solution in Solr for doing geospatial filtering on 
 documents that have a variable number of points.  This scenario occurs when 
 there is location extraction (i.e. via a gazateer) occurring on free text.  
 None, one, or many geospatial locations might be extracted from any given 
 document and users want to limit their search results to those occurring in a 
 user-specified area.
 I've implemented this by furthering the GeoHash based work in Lucene/Solr 
 with a geohash prefix based filter.  A geohash refers to a lat-lon box on the 
 earth.  Each successive character added further subdivides the box into a 4x8 
 (or 8x4 depending on the even/odd length of the geohash) grid.  The first 
 step in this scheme is figuring out which geohash grid squares cover the 
 user's search query.  I've added various extra methods to GeoHashUtils (and 
 added tests) to assist in this purpose.  The next step is an actual Lucene 
 Filter, GeoHashPrefixFilter, that uses these geohash prefixes in 
 TermsEnum.seek() to skip to relevant grid squares in the index.  Once a 
 matching geohash grid is found, the points therein are compared against the 
 user's query to see if it matches.  I created an abstraction GeoShape 
 extended by subclasses named PointDistance... and CartesianBox to support 
 different queried shapes so that the filter need not care about these details.
 This work was presented at LuceneRevolution in Boston on October 8th.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-2155) Geospatial search using geohash prefixes

2011-09-26 Thread geert-jan brits (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13114585#comment-13114585
 ] 

geert-jan brits commented on SOLR-2155:
---

I have the impression that this code is meant for drawing shapes and to see if 
geospatial enriched documents are within this shape. Is that correct?

Perhaps my use-case is also supported, because it's in the 'multi-geopoint 
domain' as well.

I envision documents having multiple lat/long points. I would like to query 
(sort / filter on) documents by their 'closest point' to a given user-defined 
lat/long point. Documents would either contain a bag of lat/long pairs or a 
polygon made up out of these lat/long pairs and the query would become: return 
the closest distance from a user-defined point to the polygon. 

Before delving in the above code or in the LSP-code myself, perhaps someone can 
say if this type of querying is supported? 




 Geospatial search using geohash prefixes
 

 Key: SOLR-2155
 URL: https://issues.apache.org/jira/browse/SOLR-2155
 Project: Solr
  Issue Type: Improvement
Reporter: David Smiley
Assignee: Grant Ingersoll
 Attachments: GeoHashPrefixFilter.patch, GeoHashPrefixFilter.patch, 
 GeoHashPrefixFilter.patch, 
 SOLR-2155_GeoHashPrefixFilter_with_sorting_no_poly.patch, SOLR.2155.p3.patch, 
 SOLR.2155.p3tests.patch


 There currently isn't a solution in Solr for doing geospatial filtering on 
 documents that have a variable number of points.  This scenario occurs when 
 there is location extraction (i.e. via a gazateer) occurring on free text.  
 None, one, or many geospatial locations might be extracted from any given 
 document and users want to limit their search results to those occurring in a 
 user-specified area.
 I've implemented this by furthering the GeoHash based work in Lucene/Solr 
 with a geohash prefix based filter.  A geohash refers to a lat-lon box on the 
 earth.  Each successive character added further subdivides the box into a 4x8 
 (or 8x4 depending on the even/odd length of the geohash) grid.  The first 
 step in this scheme is figuring out which geohash grid squares cover the 
 user's search query.  I've added various extra methods to GeoHashUtils (and 
 added tests) to assist in this purpose.  The next step is an actual Lucene 
 Filter, GeoHashPrefixFilter, that uses these geohash prefixes in 
 TermsEnum.seek() to skip to relevant grid squares in the index.  Once a 
 matching geohash grid is found, the points therein are compared against the 
 user's query to see if it matches.  I created an abstraction GeoShape 
 extended by subclasses named PointDistance... and CartesianBox to support 
 different queried shapes so that the filter need not care about these details.
 This work was presented at LuceneRevolution in Boston on October 8th.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-2155) Geospatial search using geohash prefixes

2011-09-26 Thread David Smiley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13114671#comment-13114671
 ] 

David Smiley commented on SOLR-2155:


geert-jan, your impression of the capabilities of the code is correct including 
your use-case for the most part.  Your use-case describes that the document 
might have an indexed polygon -- that is supported in LSP but not this patch. 
Sorting by (multi-value) indexed shapes is supported only for points, not yet 
other shapes like polygons. But you could basically get this by indexing an 
additional multi-point field containing the polygon vertices and center-point, 
and then sorting on that. If you have a particular sorting use-case with 
nuances that my solution does not cover then please let me know.

 Geospatial search using geohash prefixes
 

 Key: SOLR-2155
 URL: https://issues.apache.org/jira/browse/SOLR-2155
 Project: Solr
  Issue Type: Improvement
Reporter: David Smiley
Assignee: Grant Ingersoll
 Attachments: GeoHashPrefixFilter.patch, GeoHashPrefixFilter.patch, 
 GeoHashPrefixFilter.patch, 
 SOLR-2155_GeoHashPrefixFilter_with_sorting_no_poly.patch, SOLR.2155.p3.patch, 
 SOLR.2155.p3tests.patch


 There currently isn't a solution in Solr for doing geospatial filtering on 
 documents that have a variable number of points.  This scenario occurs when 
 there is location extraction (i.e. via a gazateer) occurring on free text.  
 None, one, or many geospatial locations might be extracted from any given 
 document and users want to limit their search results to those occurring in a 
 user-specified area.
 I've implemented this by furthering the GeoHash based work in Lucene/Solr 
 with a geohash prefix based filter.  A geohash refers to a lat-lon box on the 
 earth.  Each successive character added further subdivides the box into a 4x8 
 (or 8x4 depending on the even/odd length of the geohash) grid.  The first 
 step in this scheme is figuring out which geohash grid squares cover the 
 user's search query.  I've added various extra methods to GeoHashUtils (and 
 added tests) to assist in this purpose.  The next step is an actual Lucene 
 Filter, GeoHashPrefixFilter, that uses these geohash prefixes in 
 TermsEnum.seek() to skip to relevant grid squares in the index.  Once a 
 matching geohash grid is found, the points therein are compared against the 
 user's query to see if it matches.  I created an abstraction GeoShape 
 extended by subclasses named PointDistance... and CartesianBox to support 
 different queried shapes so that the filter need not care about these details.
 This work was presented at LuceneRevolution in Boston on October 8th.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-2155) Geospatial search using geohash prefixes

2011-09-26 Thread geert-jan brits (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13114686#comment-13114686
 ] 

geert-jan brits commented on SOLR-2155:
---

David, to clarify:
My use-case could be either represented as: 
 1. a bag of points, in which case I want to be able to return the closest 
point to a user-defined point and sort on the distance
 2. a polygon made of the points (where the points are the vertices of the 
polygon) and return the closest distance from a user-defined point to the 
polygon. 

Either of the solutions suffices for me, from your answer I can't entirely see 
if that was clear. 

You mention: Sorting by (multi-value) indexed shapes is supported only for 
points. 
Does this mean that representation 1.) above is supported? It wasn't entirely 
clear for me from your response. 

Let me give you the use-case, (and why the sort on center-point / centroid is 
not going to work): 

Consider a travel application in which walks/itineraries can be defined. Most 
of the walks are defined as roundtrips (i.e: beginpoint = endpoint). In my 
representation (for now) a walk visits certain Points of interest (poi) (which 
each have a lat/long point defined)  in a certain order. 

A lot of walks can be started at any given Poi. (bc. of the roundtrip nature). 
I want a user to be able to request walks that are nearby. (sorted based on 
distance). For each walk the distance becomes the closest Poi (thus point) 
defined in the walk related to the user-defined point. 

Does this make sense? 


 Geospatial search using geohash prefixes
 

 Key: SOLR-2155
 URL: https://issues.apache.org/jira/browse/SOLR-2155
 Project: Solr
  Issue Type: Improvement
Reporter: David Smiley
Assignee: Grant Ingersoll
 Attachments: GeoHashPrefixFilter.patch, GeoHashPrefixFilter.patch, 
 GeoHashPrefixFilter.patch, 
 SOLR-2155_GeoHashPrefixFilter_with_sorting_no_poly.patch, SOLR.2155.p3.patch, 
 SOLR.2155.p3tests.patch


 There currently isn't a solution in Solr for doing geospatial filtering on 
 documents that have a variable number of points.  This scenario occurs when 
 there is location extraction (i.e. via a gazateer) occurring on free text.  
 None, one, or many geospatial locations might be extracted from any given 
 document and users want to limit their search results to those occurring in a 
 user-specified area.
 I've implemented this by furthering the GeoHash based work in Lucene/Solr 
 with a geohash prefix based filter.  A geohash refers to a lat-lon box on the 
 earth.  Each successive character added further subdivides the box into a 4x8 
 (or 8x4 depending on the even/odd length of the geohash) grid.  The first 
 step in this scheme is figuring out which geohash grid squares cover the 
 user's search query.  I've added various extra methods to GeoHashUtils (and 
 added tests) to assist in this purpose.  The next step is an actual Lucene 
 Filter, GeoHashPrefixFilter, that uses these geohash prefixes in 
 TermsEnum.seek() to skip to relevant grid squares in the index.  Once a 
 matching geohash grid is found, the points therein are compared against the 
 user's query to see if it matches.  I created an abstraction GeoShape 
 extended by subclasses named PointDistance... and CartesianBox to support 
 different queried shapes so that the filter need not care about these details.
 This work was presented at LuceneRevolution in Boston on October 8th.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-2155) Geospatial search using geohash prefixes

2011-09-26 Thread David Smiley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13114703#comment-13114703
 ] 

David Smiley commented on SOLR-2155:


bq. You mention: Sorting by (multi-value) indexed shapes is supported only for 
points. Does this mean that representation 1.) above is supported? It wasn't 
entirely clear for me from your response.

Yes, it does.  This patch  LSP support sorting documents by a multi-value 
point field.

Based on the description of your use-case and my interpretation of what you say 
(which I am not 100% sure of), I don't think it's pertinent that the walks 
are polygons. What is pertinent is that they are a collection of points (Pois). 
 So if your walk corresponds to a Solr document, you could have a poi field 
that is an indexed multi-value geospatial point field. Then you could sort 
walks (documents) according to the pois that are closest to a user-specified 
point in the query.  I would add a large bounding-box filter which would 
improve performance.  It's not clear to me there is any need for polygon 
support.

FYI I use the well-known JTS library for polygon support.


 Geospatial search using geohash prefixes
 

 Key: SOLR-2155
 URL: https://issues.apache.org/jira/browse/SOLR-2155
 Project: Solr
  Issue Type: Improvement
Reporter: David Smiley
Assignee: Grant Ingersoll
 Attachments: GeoHashPrefixFilter.patch, GeoHashPrefixFilter.patch, 
 GeoHashPrefixFilter.patch, 
 SOLR-2155_GeoHashPrefixFilter_with_sorting_no_poly.patch, SOLR.2155.p3.patch, 
 SOLR.2155.p3tests.patch


 There currently isn't a solution in Solr for doing geospatial filtering on 
 documents that have a variable number of points.  This scenario occurs when 
 there is location extraction (i.e. via a gazateer) occurring on free text.  
 None, one, or many geospatial locations might be extracted from any given 
 document and users want to limit their search results to those occurring in a 
 user-specified area.
 I've implemented this by furthering the GeoHash based work in Lucene/Solr 
 with a geohash prefix based filter.  A geohash refers to a lat-lon box on the 
 earth.  Each successive character added further subdivides the box into a 4x8 
 (or 8x4 depending on the even/odd length of the geohash) grid.  The first 
 step in this scheme is figuring out which geohash grid squares cover the 
 user's search query.  I've added various extra methods to GeoHashUtils (and 
 added tests) to assist in this purpose.  The next step is an actual Lucene 
 Filter, GeoHashPrefixFilter, that uses these geohash prefixes in 
 TermsEnum.seek() to skip to relevant grid squares in the index.  Once a 
 matching geohash grid is found, the points therein are compared against the 
 user's query to see if it matches.  I created an abstraction GeoShape 
 extended by subclasses named PointDistance... and CartesianBox to support 
 different queried shapes so that the filter need not care about these details.
 This work was presented at LuceneRevolution in Boston on October 8th.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-2155) Geospatial search using geohash prefixes

2011-09-26 Thread geert-jan brits (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13114726#comment-13114726
 ] 

geert-jan brits commented on SOLR-2155:
---

Great thanks, I believe you interpretation of my use-case is correct .
I will go the Multi-point route first, without the polygons. 

Just to clarify: I realize I added to the confusion by bringing polygons to the 
table where they aren't necessary for the problem I described. 
I did this because I thought that perhaps distance of point to polygon' was 
implemented in LSP, while 'point to collection of points' was not.  

In that case 'transforming the problem space' by representing a 'collection of 
points' as a polygon and querying for distance of point to polygon instead 
would have given me what I wanted. This is all superfluous now, because doing 
'point to collection of points' IS possible. 

I will check out the code, thanks again!



 Geospatial search using geohash prefixes
 

 Key: SOLR-2155
 URL: https://issues.apache.org/jira/browse/SOLR-2155
 Project: Solr
  Issue Type: Improvement
Reporter: David Smiley
Assignee: Grant Ingersoll
 Attachments: GeoHashPrefixFilter.patch, GeoHashPrefixFilter.patch, 
 GeoHashPrefixFilter.patch, 
 SOLR-2155_GeoHashPrefixFilter_with_sorting_no_poly.patch, SOLR.2155.p3.patch, 
 SOLR.2155.p3tests.patch


 There currently isn't a solution in Solr for doing geospatial filtering on 
 documents that have a variable number of points.  This scenario occurs when 
 there is location extraction (i.e. via a gazateer) occurring on free text.  
 None, one, or many geospatial locations might be extracted from any given 
 document and users want to limit their search results to those occurring in a 
 user-specified area.
 I've implemented this by furthering the GeoHash based work in Lucene/Solr 
 with a geohash prefix based filter.  A geohash refers to a lat-lon box on the 
 earth.  Each successive character added further subdivides the box into a 4x8 
 (or 8x4 depending on the even/odd length of the geohash) grid.  The first 
 step in this scheme is figuring out which geohash grid squares cover the 
 user's search query.  I've added various extra methods to GeoHashUtils (and 
 added tests) to assist in this purpose.  The next step is an actual Lucene 
 Filter, GeoHashPrefixFilter, that uses these geohash prefixes in 
 TermsEnum.seek() to skip to relevant grid squares in the index.  Once a 
 matching geohash grid is found, the points therein are compared against the 
 user's query to see if it matches.  I created an abstraction GeoShape 
 extended by subclasses named PointDistance... and CartesianBox to support 
 different queried shapes so that the filter need not care about these details.
 This work was presented at LuceneRevolution in Boston on October 8th.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-2155) Geospatial search using geohash prefixes

2011-09-26 Thread geert-jan brits (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13114839#comment-13114839
 ] 

geert-jan brits commented on SOLR-2155:
---

David, 

I try not to swamp this discussion, but I have a totally different issue for 
which I might misuse this patch / LSP. 

It's about pois having multiple openinghours (depending on day of week, special 
festivitydays, and sometimes even multiple timeslots per day) 
I want to query, for example, all pois that are open NOW, and that will remain 
open until NOW+3H. 

For background see: 
http://lucene.472066.n3.nabble.com/multiple-dateranges-timeslots-per-doc-modeling-openinghours-td3368790.html
 on why all normal approaches don't work (afaik): basically it's about needing 
multiple opening/closing times and having them be pairwise related.

I have the feeling that opening/closing datetimes might be modelled as multiple 
lat/long points. But I would need a query of the form: 

Given a user defined point x, return all docs that have a point p defined for 
which: 
 - x.latitude  p.latitude
 - x.longitude  p.longitude

Is this possible? (As far as I see GeoFilt, BBox, GeoDist don't provide me with 
what I need)

Basically this is how I envision encoding it:
 - each open,closedelta)-tuple is represented as a (lat/long)point 
 - open is matched on latitude
 - closedelta (closedelta is represented as delta from open) is matched on 
longitude
 - granularity is 5 minutes
- open can be a max of 100 days in future - ~30.000 distinct values. 
- closedelta can be at most 24 hours - ~300 distinct values

The above lat/long query applied to the domain would become: 
Given a user defined open/closedelta-datetime x, return all docs that have a 
open/close-datetime p defined for which: 
 - x.open  p.open (poi is already open at requested opening time) 
 - x.closedelta   p.closedelta (poi is not yet closed on the requested closing 
time) 

In other words, the poi is open from the requested open-datetime until at least 
the requested close-datetime.

Ok, good exercise in writing this down, the question remains is this query 
possible (perhaps with some coding-efforts)?

Thanks, 
Geert-Jan  
 



 Geospatial search using geohash prefixes
 

 Key: SOLR-2155
 URL: https://issues.apache.org/jira/browse/SOLR-2155
 Project: Solr
  Issue Type: Improvement
Reporter: David Smiley
Assignee: Grant Ingersoll
 Attachments: GeoHashPrefixFilter.patch, GeoHashPrefixFilter.patch, 
 GeoHashPrefixFilter.patch, 
 SOLR-2155_GeoHashPrefixFilter_with_sorting_no_poly.patch, SOLR.2155.p3.patch, 
 SOLR.2155.p3tests.patch


 There currently isn't a solution in Solr for doing geospatial filtering on 
 documents that have a variable number of points.  This scenario occurs when 
 there is location extraction (i.e. via a gazateer) occurring on free text.  
 None, one, or many geospatial locations might be extracted from any given 
 document and users want to limit their search results to those occurring in a 
 user-specified area.
 I've implemented this by furthering the GeoHash based work in Lucene/Solr 
 with a geohash prefix based filter.  A geohash refers to a lat-lon box on the 
 earth.  Each successive character added further subdivides the box into a 4x8 
 (or 8x4 depending on the even/odd length of the geohash) grid.  The first 
 step in this scheme is figuring out which geohash grid squares cover the 
 user's search query.  I've added various extra methods to GeoHashUtils (and 
 added tests) to assist in this purpose.  The next step is an actual Lucene 
 Filter, GeoHashPrefixFilter, that uses these geohash prefixes in 
 TermsEnum.seek() to skip to relevant grid squares in the index.  Once a 
 matching geohash grid is found, the points therein are compared against the 
 user's query to see if it matches.  I created an abstraction GeoShape 
 extended by subclasses named PointDistance... and CartesianBox to support 
 different queried shapes so that the filter need not care about these details.
 This work was presented at LuceneRevolution in Boston on October 8th.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-2155) Geospatial search using geohash prefixes

2011-09-26 Thread David Smiley (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13115244#comment-13115244
 ] 

David Smiley commented on SOLR-2155:


Your use-case is a feature I have intended to have LSP address in a direct 
manner when I have time. In the mean time, there are a couple approaches that 
should work.

The first approach that comes to mind is to use the LSP QuadPrefixTree with 
LSP's ability to index rectangles.  You would treat the x dimension as time, 
and ignore the y dimension (use 0). What helps make this possible is LSP's 
unique ability to index shapes other than points, and in an efficient manner.  
The only spatial filter query operation that LSP supports right now is an 
intersection. If your query is simply a point (a specific time) then this is 
fine, or if it is a time duration and you want all stores that were open for at 
least part of this time, then it's fine. If your query is a time duration and 
you want it to reside completely _within_ an indexed time duration, then 
no-can-do for now.  Based on the nature of your use-case, it may suffice to use 
multiple spatial filter queries, each one a point (time) at each hour interval 
of the desired query duration.

The second approach is similar to your suggestion but for y = closing time, not 
the delta.  y should always be  x. I just did some sample Venn diagrams to 
verify this approach. If you want to find documents with an indexed duration 
that completely overlaps your query time, then you do a bounding box filter 
query from x=0-starttime and y=endtime-max (where max is the maximum indexable 
time). When you initialize the LSP QuadPrefixTree you need to tell it the range 
of values.  Some time ago when writing tests, I discovered it simply can't 
handle Double.MAX_VALUE, but I imagine it will handle your 30,000.  If you want 
to use this patch (SOLR-2155) and not LSP then you will instead have to map 
your times to latitude-longitude ranges and use a Geohash grid length with 
granularity sufficient to differentiate your smallest unit of time (5min).

I think the 2nd approach is simplest and ideal based on what you've said about 
your needs.

If you want help with LSP then email me directly: david.w.smi...@gmail.com

 Geospatial search using geohash prefixes
 

 Key: SOLR-2155
 URL: https://issues.apache.org/jira/browse/SOLR-2155
 Project: Solr
  Issue Type: Improvement
Reporter: David Smiley
Assignee: Grant Ingersoll
 Attachments: GeoHashPrefixFilter.patch, GeoHashPrefixFilter.patch, 
 GeoHashPrefixFilter.patch, 
 SOLR-2155_GeoHashPrefixFilter_with_sorting_no_poly.patch, SOLR.2155.p3.patch, 
 SOLR.2155.p3tests.patch


 There currently isn't a solution in Solr for doing geospatial filtering on 
 documents that have a variable number of points.  This scenario occurs when 
 there is location extraction (i.e. via a gazateer) occurring on free text.  
 None, one, or many geospatial locations might be extracted from any given 
 document and users want to limit their search results to those occurring in a 
 user-specified area.
 I've implemented this by furthering the GeoHash based work in Lucene/Solr 
 with a geohash prefix based filter.  A geohash refers to a lat-lon box on the 
 earth.  Each successive character added further subdivides the box into a 4x8 
 (or 8x4 depending on the even/odd length of the geohash) grid.  The first 
 step in this scheme is figuring out which geohash grid squares cover the 
 user's search query.  I've added various extra methods to GeoHashUtils (and 
 added tests) to assist in this purpose.  The next step is an actual Lucene 
 Filter, GeoHashPrefixFilter, that uses these geohash prefixes in 
 TermsEnum.seek() to skip to relevant grid squares in the index.  Once a 
 matching geohash grid is found, the points therein are compared against the 
 user's query to see if it matches.  I created an abstraction GeoShape 
 extended by subclasses named PointDistance... and CartesianBox to support 
 different queried shapes so that the filter need not care about these details.
 This work was presented at LuceneRevolution in Boston on October 8th.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-2155) Geospatial search using geohash prefixes

2011-07-25 Thread David Smiley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13070339#comment-13070339
 ] 

David Smiley commented on SOLR-2155:


Bill, have you checked out LSP?: 
http://code.google.com/p/lucene-spatial-playground/  A couple folks have been 
kicking the tires last week.  We've got some benchmarking and (more) testing to 
do, and there's still the polygon date-line  pole wrapping feature-gap that 
hasn't been implemented yet. There's always more I want to do with it but its 
in decent shape now. It can even index shapes with area (i.e. not just points 
but other shapes) -- the only Lucene/Solr native implementation that I know of, 
Ryan too.

 Geospatial search using geohash prefixes
 

 Key: SOLR-2155
 URL: https://issues.apache.org/jira/browse/SOLR-2155
 Project: Solr
  Issue Type: Improvement
Reporter: David Smiley
Assignee: Grant Ingersoll
 Attachments: GeoHashPrefixFilter.patch, GeoHashPrefixFilter.patch, 
 GeoHashPrefixFilter.patch, 
 SOLR-2155_GeoHashPrefixFilter_with_sorting_no_poly.patch, SOLR.2155.p3.patch, 
 SOLR.2155.p3tests.patch


 There currently isn't a solution in Solr for doing geospatial filtering on 
 documents that have a variable number of points.  This scenario occurs when 
 there is location extraction (i.e. via a gazateer) occurring on free text.  
 None, one, or many geospatial locations might be extracted from any given 
 document and users want to limit their search results to those occurring in a 
 user-specified area.
 I've implemented this by furthering the GeoHash based work in Lucene/Solr 
 with a geohash prefix based filter.  A geohash refers to a lat-lon box on the 
 earth.  Each successive character added further subdivides the box into a 4x8 
 (or 8x4 depending on the even/odd length of the geohash) grid.  The first 
 step in this scheme is figuring out which geohash grid squares cover the 
 user's search query.  I've added various extra methods to GeoHashUtils (and 
 added tests) to assist in this purpose.  The next step is an actual Lucene 
 Filter, GeoHashPrefixFilter, that uses these geohash prefixes in 
 TermsEnum.seek() to skip to relevant grid squares in the index.  Once a 
 matching geohash grid is found, the points therein are compared against the 
 user's query to see if it matches.  I created an abstraction GeoShape 
 extended by subclasses named PointDistance... and CartesianBox to support 
 different queried shapes so that the filter need not care about these details.
 This work was presented at LuceneRevolution in Boston on October 8th.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-2155) Geospatial search using geohash prefixes

2011-07-24 Thread William Bell (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13070263#comment-13070263
 ] 

William Bell commented on SOLR-2155:


OK... Would it ask too much to commit this? It is very stable and works very 
well.

I know that there might be a new version coming out, but isn't that always the 
case?

I would love more people the weigh in... Vote?

 Geospatial search using geohash prefixes
 

 Key: SOLR-2155
 URL: https://issues.apache.org/jira/browse/SOLR-2155
 Project: Solr
  Issue Type: Improvement
Reporter: David Smiley
Assignee: Grant Ingersoll
 Attachments: GeoHashPrefixFilter.patch, GeoHashPrefixFilter.patch, 
 GeoHashPrefixFilter.patch, 
 SOLR-2155_GeoHashPrefixFilter_with_sorting_no_poly.patch, SOLR.2155.p3.patch, 
 SOLR.2155.p3tests.patch


 There currently isn't a solution in Solr for doing geospatial filtering on 
 documents that have a variable number of points.  This scenario occurs when 
 there is location extraction (i.e. via a gazateer) occurring on free text.  
 None, one, or many geospatial locations might be extracted from any given 
 document and users want to limit their search results to those occurring in a 
 user-specified area.
 I've implemented this by furthering the GeoHash based work in Lucene/Solr 
 with a geohash prefix based filter.  A geohash refers to a lat-lon box on the 
 earth.  Each successive character added further subdivides the box into a 4x8 
 (or 8x4 depending on the even/odd length of the geohash) grid.  The first 
 step in this scheme is figuring out which geohash grid squares cover the 
 user's search query.  I've added various extra methods to GeoHashUtils (and 
 added tests) to assist in this purpose.  The next step is an actual Lucene 
 Filter, GeoHashPrefixFilter, that uses these geohash prefixes in 
 TermsEnum.seek() to skip to relevant grid squares in the index.  Once a 
 matching geohash grid is found, the points therein are compared against the 
 user's query to see if it matches.  I created an abstraction GeoShape 
 extended by subclasses named PointDistance... and CartesianBox to support 
 different queried shapes so that the filter need not care about these details.
 This work was presented at LuceneRevolution in Boston on October 8th.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-2155) Geospatial search using geohash prefixes

2011-05-20 Thread Bill Bell (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13036689#comment-13036689
 ] 

Bill Bell commented on SOLR-2155:
-

Why are we abandoning this? I thought it was a good enhancement. I need this 
feature to be committed so that I can do multiple points per row.

We can mark it experimental?


 Geospatial search using geohash prefixes
 

 Key: SOLR-2155
 URL: https://issues.apache.org/jira/browse/SOLR-2155
 Project: Solr
  Issue Type: Improvement
Reporter: David Smiley
Assignee: Grant Ingersoll
 Attachments: GeoHashPrefixFilter.patch, GeoHashPrefixFilter.patch, 
 GeoHashPrefixFilter.patch, 
 SOLR-2155_GeoHashPrefixFilter_with_sorting_no_poly.patch, SOLR.2155.p3.patch, 
 SOLR.2155.p3tests.patch


 There currently isn't a solution in Solr for doing geospatial filtering on 
 documents that have a variable number of points.  This scenario occurs when 
 there is location extraction (i.e. via a gazateer) occurring on free text.  
 None, one, or many geospatial locations might be extracted from any given 
 document and users want to limit their search results to those occurring in a 
 user-specified area.
 I've implemented this by furthering the GeoHash based work in Lucene/Solr 
 with a geohash prefix based filter.  A geohash refers to a lat-lon box on the 
 earth.  Each successive character added further subdivides the box into a 4x8 
 (or 8x4 depending on the even/odd length of the geohash) grid.  The first 
 step in this scheme is figuring out which geohash grid squares cover the 
 user's search query.  I've added various extra methods to GeoHashUtils (and 
 added tests) to assist in this purpose.  The next step is an actual Lucene 
 Filter, GeoHashPrefixFilter, that uses these geohash prefixes in 
 TermsEnum.seek() to skip to relevant grid squares in the index.  Once a 
 matching geohash grid is found, the points therein are compared against the 
 user's query to see if it matches.  I created an abstraction GeoShape 
 extended by subclasses named PointDistance... and CartesianBox to support 
 different queried shapes so that the filter need not care about these details.
 This work was presented at LuceneRevolution in Boston on October 8th.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-2155) Geospatial search using geohash prefixes

2011-05-20 Thread Grant Ingersoll (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13036760#comment-13036760
 ] 

Grant Ingersoll commented on SOLR-2155:
---

I don't think we are abandoning it, I think David and Ryan, etc. have decided 
to bake things in the playground for a bit and then may move it back.  Lance, 
the code is in Google Code, search the archives.

 Geospatial search using geohash prefixes
 

 Key: SOLR-2155
 URL: https://issues.apache.org/jira/browse/SOLR-2155
 Project: Solr
  Issue Type: Improvement
Reporter: David Smiley
Assignee: Grant Ingersoll
 Attachments: GeoHashPrefixFilter.patch, GeoHashPrefixFilter.patch, 
 GeoHashPrefixFilter.patch, 
 SOLR-2155_GeoHashPrefixFilter_with_sorting_no_poly.patch, SOLR.2155.p3.patch, 
 SOLR.2155.p3tests.patch


 There currently isn't a solution in Solr for doing geospatial filtering on 
 documents that have a variable number of points.  This scenario occurs when 
 there is location extraction (i.e. via a gazateer) occurring on free text.  
 None, one, or many geospatial locations might be extracted from any given 
 document and users want to limit their search results to those occurring in a 
 user-specified area.
 I've implemented this by furthering the GeoHash based work in Lucene/Solr 
 with a geohash prefix based filter.  A geohash refers to a lat-lon box on the 
 earth.  Each successive character added further subdivides the box into a 4x8 
 (or 8x4 depending on the even/odd length of the geohash) grid.  The first 
 step in this scheme is figuring out which geohash grid squares cover the 
 user's search query.  I've added various extra methods to GeoHashUtils (and 
 added tests) to assist in this purpose.  The next step is an actual Lucene 
 Filter, GeoHashPrefixFilter, that uses these geohash prefixes in 
 TermsEnum.seek() to skip to relevant grid squares in the index.  Once a 
 matching geohash grid is found, the points therein are compared against the 
 user's query to see if it matches.  I created an abstraction GeoShape 
 extended by subclasses named PointDistance... and CartesianBox to support 
 different queried shapes so that the filter need not care about these details.
 This work was presented at LuceneRevolution in Boston on October 8th.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-2155) Geospatial search using geohash prefixes

2011-05-19 Thread Lance Norskog (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13036505#comment-13036505
 ] 

Lance Norskog commented on SOLR-2155:
-

Where is lucene-spatial-playground?

 Geospatial search using geohash prefixes
 

 Key: SOLR-2155
 URL: https://issues.apache.org/jira/browse/SOLR-2155
 Project: Solr
  Issue Type: Improvement
Reporter: David Smiley
Assignee: Grant Ingersoll
 Attachments: GeoHashPrefixFilter.patch, GeoHashPrefixFilter.patch, 
 GeoHashPrefixFilter.patch, 
 SOLR-2155_GeoHashPrefixFilter_with_sorting_no_poly.patch, SOLR.2155.p3.patch, 
 SOLR.2155.p3tests.patch


 There currently isn't a solution in Solr for doing geospatial filtering on 
 documents that have a variable number of points.  This scenario occurs when 
 there is location extraction (i.e. via a gazateer) occurring on free text.  
 None, one, or many geospatial locations might be extracted from any given 
 document and users want to limit their search results to those occurring in a 
 user-specified area.
 I've implemented this by furthering the GeoHash based work in Lucene/Solr 
 with a geohash prefix based filter.  A geohash refers to a lat-lon box on the 
 earth.  Each successive character added further subdivides the box into a 4x8 
 (or 8x4 depending on the even/odd length of the geohash) grid.  The first 
 step in this scheme is figuring out which geohash grid squares cover the 
 user's search query.  I've added various extra methods to GeoHashUtils (and 
 added tests) to assist in this purpose.  The next step is an actual Lucene 
 Filter, GeoHashPrefixFilter, that uses these geohash prefixes in 
 TermsEnum.seek() to skip to relevant grid squares in the index.  Once a 
 matching geohash grid is found, the points therein are compared against the 
 user's query to see if it matches.  I created an abstraction GeoShape 
 extended by subclasses named PointDistance... and CartesianBox to support 
 different queried shapes so that the filter need not care about these details.
 This work was presented at LuceneRevolution in Boston on October 8th.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-2155) Geospatial search using geohash prefixes

2011-04-02 Thread Grant Ingersoll (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13015012#comment-13015012
 ] 

Grant Ingersoll commented on SOLR-2155:
---

If the intent is to bring in the lucene-spatial-playground into the ASF, why 
not just start a branch?  It will make provenance so much easier.

 Geospatial search using geohash prefixes
 

 Key: SOLR-2155
 URL: https://issues.apache.org/jira/browse/SOLR-2155
 Project: Solr
  Issue Type: Improvement
Reporter: David Smiley
Assignee: Grant Ingersoll
 Attachments: GeoHashPrefixFilter.patch, GeoHashPrefixFilter.patch, 
 GeoHashPrefixFilter.patch, 
 SOLR-2155_GeoHashPrefixFilter_with_sorting_no_poly.patch, SOLR.2155.p3.patch, 
 SOLR.2155.p3tests.patch


 There currently isn't a solution in Solr for doing geospatial filtering on 
 documents that have a variable number of points.  This scenario occurs when 
 there is location extraction (i.e. via a gazateer) occurring on free text.  
 None, one, or many geospatial locations might be extracted from any given 
 document and users want to limit their search results to those occurring in a 
 user-specified area.
 I've implemented this by furthering the GeoHash based work in Lucene/Solr 
 with a geohash prefix based filter.  A geohash refers to a lat-lon box on the 
 earth.  Each successive character added further subdivides the box into a 4x8 
 (or 8x4 depending on the even/odd length of the geohash) grid.  The first 
 step in this scheme is figuring out which geohash grid squares cover the 
 user's search query.  I've added various extra methods to GeoHashUtils (and 
 added tests) to assist in this purpose.  The next step is an actual Lucene 
 Filter, GeoHashPrefixFilter, that uses these geohash prefixes in 
 TermsEnum.seek() to skip to relevant grid squares in the index.  Once a 
 matching geohash grid is found, the points therein are compared against the 
 user's query to see if it matches.  I created an abstraction GeoShape 
 extended by subclasses named PointDistance... and CartesianBox to support 
 different queried shapes so that the filter need not care about these details.
 This work was presented at LuceneRevolution in Boston on October 8th.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-2155) Geospatial search using geohash prefixes

2011-04-01 Thread David Smiley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13014841#comment-13014841
 ] 

David Smiley commented on SOLR-2155:


To anyone listening: I'll continue to support my latest patch here with any bug 
fixes or basic things.  As of today I'll principally be working directly with 
Ryan McKinley on his lucene-spatial-playground code-base. He ported my patch 
to this framework as the predominant means of searching for points (single or 
multi-value) and I'm going to finish what he started.  This new framework is 
superior to the geospatial mess in Lucene/Solr right now (no offense to any 
involved).  It won't be long before it's ready for broad use as a replacement 
for anything existing.  I look forward to exploring new indexing techniques 
with this framework, and for it to eventually become part of Lucene/Solr.

 Geospatial search using geohash prefixes
 

 Key: SOLR-2155
 URL: https://issues.apache.org/jira/browse/SOLR-2155
 Project: Solr
  Issue Type: Improvement
Reporter: David Smiley
Assignee: Grant Ingersoll
 Attachments: GeoHashPrefixFilter.patch, GeoHashPrefixFilter.patch, 
 GeoHashPrefixFilter.patch, 
 SOLR-2155_GeoHashPrefixFilter_with_sorting_no_poly.patch, SOLR.2155.p3.patch, 
 SOLR.2155.p3tests.patch


 There currently isn't a solution in Solr for doing geospatial filtering on 
 documents that have a variable number of points.  This scenario occurs when 
 there is location extraction (i.e. via a gazateer) occurring on free text.  
 None, one, or many geospatial locations might be extracted from any given 
 document and users want to limit their search results to those occurring in a 
 user-specified area.
 I've implemented this by furthering the GeoHash based work in Lucene/Solr 
 with a geohash prefix based filter.  A geohash refers to a lat-lon box on the 
 earth.  Each successive character added further subdivides the box into a 4x8 
 (or 8x4 depending on the even/odd length of the geohash) grid.  The first 
 step in this scheme is figuring out which geohash grid squares cover the 
 user's search query.  I've added various extra methods to GeoHashUtils (and 
 added tests) to assist in this purpose.  The next step is an actual Lucene 
 Filter, GeoHashPrefixFilter, that uses these geohash prefixes in 
 TermsEnum.seek() to skip to relevant grid squares in the index.  Once a 
 matching geohash grid is found, the points therein are compared against the 
 user's query to see if it matches.  I created an abstraction GeoShape 
 extended by subclasses named PointDistance... and CartesianBox to support 
 different queried shapes so that the filter need not care about these details.
 This work was presented at LuceneRevolution in Boston on October 8th.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-2155) Geospatial search using geohash prefixes

2011-04-01 Thread Lance Norskog (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13014873#comment-13014873
 ] 

Lance Norskog commented on SOLR-2155:
-

Excellent! Geo is a complex topic, too big for a one-man project.

Lance 

 Geospatial search using geohash prefixes
 

 Key: SOLR-2155
 URL: https://issues.apache.org/jira/browse/SOLR-2155
 Project: Solr
  Issue Type: Improvement
Reporter: David Smiley
Assignee: Grant Ingersoll
 Attachments: GeoHashPrefixFilter.patch, GeoHashPrefixFilter.patch, 
 GeoHashPrefixFilter.patch, 
 SOLR-2155_GeoHashPrefixFilter_with_sorting_no_poly.patch, SOLR.2155.p3.patch, 
 SOLR.2155.p3tests.patch


 There currently isn't a solution in Solr for doing geospatial filtering on 
 documents that have a variable number of points.  This scenario occurs when 
 there is location extraction (i.e. via a gazateer) occurring on free text.  
 None, one, or many geospatial locations might be extracted from any given 
 document and users want to limit their search results to those occurring in a 
 user-specified area.
 I've implemented this by furthering the GeoHash based work in Lucene/Solr 
 with a geohash prefix based filter.  A geohash refers to a lat-lon box on the 
 earth.  Each successive character added further subdivides the box into a 4x8 
 (or 8x4 depending on the even/odd length of the geohash) grid.  The first 
 step in this scheme is figuring out which geohash grid squares cover the 
 user's search query.  I've added various extra methods to GeoHashUtils (and 
 added tests) to assist in this purpose.  The next step is an actual Lucene 
 Filter, GeoHashPrefixFilter, that uses these geohash prefixes in 
 TermsEnum.seek() to skip to relevant grid squares in the index.  Once a 
 matching geohash grid is found, the points therein are compared against the 
 user's query to see if it matches.  I created an abstraction GeoShape 
 extended by subclasses named PointDistance... and CartesianBox to support 
 different queried shapes so that the filter need not care about these details.
 This work was presented at LuceneRevolution in Boston on October 8th.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



RE: [jira] [Commented] (SOLR-2155) Geospatial search using geohash prefixes

2011-03-27 Thread Smiley, David W.
Just a small piece of feedback on your note Bill.  I don't need more help in 
knowing the best way to support multi-value sort.  I've figured out an approach 
I feel confident in (and it's 1/3 coded)... were it not for my new baby (and 
the 2nd edition of my book for sure!) you'd be using it right now.  It 
frustrates me very much that I haven't had sufficient time yet.  I'm shooting 
for Friday April 1st at the absolute latest now.  It's not going to have any 
fancy tie-in with the filtering yet; I'll leave that for a future patch.  

~ David

From: William Bell [billnb...@gmail.com]
Sent: Sunday, March 27, 2011 12:19 AM
To: dev@lucene.apache.org
Cc: Chris Male; yo...@lucidimagination.com
Subject: Re: [jira] [Commented] (SOLR-2155) Geospatial search using geohash 
prefixes

Maybe I am too close to this issue and not looking at global
implications like you are

SOLR-2155 seems fairly close to good to go. There are a couple open
issues that David Smiley has been asking for input on.
I would recommend we answer those questions, and commit it. Then we
can look at modules, etc.

If a rewrite was on the works, a committer should have said something
a LONG time ago. Like back in October or something. Are we talking
redesign or refactoring? I think core spatial things should remain in
Lucene.

Even though Spatial support in 3.1 is basic it is stable, and VERY
fast. We ran regression tests and the performance was 10-100x faster
than the plugin solution of Solr Spatial from Patrick. Whatever fancy
support for polygons, etc we support it needs to be even faster than
what we have with 3.1.

What I like about this patch more than anything is the support for
multiple-lat longs per document. I have several clients who need this
feature. For example, one doctor with multiple offices. It would be
nice if a committer would work with David Smiley to get this done.

1. We would like to know if the pole issue can be solved and how.
2. We would like to know the best way to support the multi lat long
(without the copy happening) and get the values from multigeodist(). I
have pushed up a good example, and I would like someone to please
comment and maybe even show me some code to do that. There has been
some discussions on this issue - my solution uses VS and it is fast.
There might be faster and more simple ways to handle the N number of
points.

On another note, it is frustrating when David and I put in some time
on this, and it sits out there with us begging for a committer to
assist, and then when Grant starts discussions, it is summarility
discarded with a new design without any input from the original
contributors? Is this how we want to do things here?

We should have Grant or Yonik work with David to get this patch done,

Then we can discuss Spatial V2 and the design of it.

Bill




On Sat, Mar 26, 2011 at 7:19 PM, Chris Male gento...@gmail.com wrote:


 On Sun, Mar 27, 2011 at 7:30 AM, Yonik Seeley yo...@lucidimagination.com
 wrote:

 On Sat, Mar 26, 2011 at 2:17 PM, Ryan McKinley ryan...@gmail.com wrote:
 
  No, the question is: what justification is there for adding spatial
  support to solr-only, leaving lucene with a broken contrib module,
  versus adding it where it belongs and exposing it to solr?
 
  There need not be any linkage to lucene to improve a Solr feature.
  If you disagree, we should vote to clarify - this is too important
  (and too much of a negative for Solr).
 
 
  I don't think there is *requirement* to move the core spatial stuff to
  lucene, but I think there is huge benefit to both communities if
  things have as few dependencies as possible.  To be frank, the spatial
  support in solr is pretty hairy -- it works for some use cases, but is
  not extendable and quite basic.  Calling it 'distance' seems more
  appropriate then 'spatial'


 Having something basic that works (and has a clean enough high level
 HTTP interface) was clearly a win for Solr users.

 Of course a more fully featured spatial module would be a win for
 everyone, but that's ignoring the more generic issue at hand here:
 a patch that improves Solr's spatial
 should not be blocked on the grounds that it does not improve Lucene's
 spatial enough.

 I don't think we need to see it that way, we want to improve both Solr and
 Lucene's spatial support, not block either.  As you say, having a module is
 a win for everyone, Solr and Lucene alike, so it seems obvious that we
 should go down that path and the code in SOLR-2155 would make a great first
 addition.


 Likewise, the ridiculous notion that Queries don't belong in Solr
 needs to be put to rest.

 Issues in and around this seem to be coming up a lot these days (I'm
 thinking FunctionQuerys too).  Sounds like something that really does need
 to be openly discussed.


 -Yonik

 -
 To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
 For additional commands, e

[jira] [Commented] (SOLR-2155) Geospatial search using geohash prefixes

2011-03-26 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13011613#comment-13011613
 ] 

Robert Muir commented on SOLR-2155:
---

I don't really think things like this (queries etc) should go into just Solr, 
while we leave the lucene-contrib spatial package broken.

Lets put things in the right places?

 Geospatial search using geohash prefixes
 

 Key: SOLR-2155
 URL: https://issues.apache.org/jira/browse/SOLR-2155
 Project: Solr
  Issue Type: Improvement
Reporter: David Smiley
Assignee: Grant Ingersoll
 Attachments: GeoHashPrefixFilter.patch, GeoHashPrefixFilter.patch, 
 GeoHashPrefixFilter.patch, SOLR.2155.p3.patch, SOLR.2155.p3tests.patch


 There currently isn't a solution in Solr for doing geospatial filtering on 
 documents that have a variable number of points.  This scenario occurs when 
 there is location extraction (i.e. via a gazateer) occurring on free text.  
 None, one, or many geospatial locations might be extracted from any given 
 document and users want to limit their search results to those occurring in a 
 user-specified area.
 I've implemented this by furthering the GeoHash based work in Lucene/Solr 
 with a geohash prefix based filter.  A geohash refers to a lat-lon box on the 
 earth.  Each successive character added further subdivides the box into a 4x8 
 (or 8x4 depending on the even/odd length of the geohash) grid.  The first 
 step in this scheme is figuring out which geohash grid squares cover the 
 user's search query.  I've added various extra methods to GeoHashUtils (and 
 added tests) to assist in this purpose.  The next step is an actual Lucene 
 Filter, GeoHashPrefixFilter, that uses these geohash prefixes in 
 TermsEnum.seek() to skip to relevant grid squares in the index.  Once a 
 matching geohash grid is found, the points therein are compared against the 
 user's query to see if it matches.  I created an abstraction GeoShape 
 extended by subclasses named PointDistance... and CartesianBox to support 
 different queried shapes so that the filter need not care about these details.
 This work was presented at LuceneRevolution in Boston on October 8th.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-2155) Geospatial search using geohash prefixes

2011-03-26 Thread Grant Ingersoll (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13011615#comment-13011615
 ] 

Grant Ingersoll commented on SOLR-2155:
---

Yeah, I agree.  I haven't looked at the patch yet.  It was my understanding 
that Chris Male was going to move lucene/contrib/spatial to modules and gut the 
broken stuff in it.  I think there is a separate issue open for that one.  
Presumably, once spatial and function queries are moved to modules, then we 
will have a properly working spatial package.

I obviously can move it, but I don't have time to do the gutting (we really 
should have deprecated the tier stuff for this release).

 Geospatial search using geohash prefixes
 

 Key: SOLR-2155
 URL: https://issues.apache.org/jira/browse/SOLR-2155
 Project: Solr
  Issue Type: Improvement
Reporter: David Smiley
Assignee: Grant Ingersoll
 Attachments: GeoHashPrefixFilter.patch, GeoHashPrefixFilter.patch, 
 GeoHashPrefixFilter.patch, SOLR.2155.p3.patch, SOLR.2155.p3tests.patch


 There currently isn't a solution in Solr for doing geospatial filtering on 
 documents that have a variable number of points.  This scenario occurs when 
 there is location extraction (i.e. via a gazateer) occurring on free text.  
 None, one, or many geospatial locations might be extracted from any given 
 document and users want to limit their search results to those occurring in a 
 user-specified area.
 I've implemented this by furthering the GeoHash based work in Lucene/Solr 
 with a geohash prefix based filter.  A geohash refers to a lat-lon box on the 
 earth.  Each successive character added further subdivides the box into a 4x8 
 (or 8x4 depending on the even/odd length of the geohash) grid.  The first 
 step in this scheme is figuring out which geohash grid squares cover the 
 user's search query.  I've added various extra methods to GeoHashUtils (and 
 added tests) to assist in this purpose.  The next step is an actual Lucene 
 Filter, GeoHashPrefixFilter, that uses these geohash prefixes in 
 TermsEnum.seek() to skip to relevant grid squares in the index.  Once a 
 matching geohash grid is found, the points therein are compared against the 
 user's query to see if it matches.  I created an abstraction GeoShape 
 extended by subclasses named PointDistance... and CartesianBox to support 
 different queried shapes so that the filter need not care about these details.
 This work was presented at LuceneRevolution in Boston on October 8th.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-2155) Geospatial search using geohash prefixes

2011-03-26 Thread Robert Muir (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13011616#comment-13011616
 ] 

Robert Muir commented on SOLR-2155:
---

well what would the deprecation have suggested as an alternative?

 Geospatial search using geohash prefixes
 

 Key: SOLR-2155
 URL: https://issues.apache.org/jira/browse/SOLR-2155
 Project: Solr
  Issue Type: Improvement
Reporter: David Smiley
Assignee: Grant Ingersoll
 Attachments: GeoHashPrefixFilter.patch, GeoHashPrefixFilter.patch, 
 GeoHashPrefixFilter.patch, SOLR.2155.p3.patch, SOLR.2155.p3tests.patch


 There currently isn't a solution in Solr for doing geospatial filtering on 
 documents that have a variable number of points.  This scenario occurs when 
 there is location extraction (i.e. via a gazateer) occurring on free text.  
 None, one, or many geospatial locations might be extracted from any given 
 document and users want to limit their search results to those occurring in a 
 user-specified area.
 I've implemented this by furthering the GeoHash based work in Lucene/Solr 
 with a geohash prefix based filter.  A geohash refers to a lat-lon box on the 
 earth.  Each successive character added further subdivides the box into a 4x8 
 (or 8x4 depending on the even/odd length of the geohash) grid.  The first 
 step in this scheme is figuring out which geohash grid squares cover the 
 user's search query.  I've added various extra methods to GeoHashUtils (and 
 added tests) to assist in this purpose.  The next step is an actual Lucene 
 Filter, GeoHashPrefixFilter, that uses these geohash prefixes in 
 TermsEnum.seek() to skip to relevant grid squares in the index.  Once a 
 matching geohash grid is found, the points therein are compared against the 
 user's query to see if it matches.  I created an abstraction GeoShape 
 extended by subclasses named PointDistance... and CartesianBox to support 
 different queried shapes so that the filter need not care about these details.
 This work was presented at LuceneRevolution in Boston on October 8th.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-2155) Geospatial search using geohash prefixes

2011-03-26 Thread Chris Male (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13011619#comment-13011619
 ] 

Chris Male commented on SOLR-2155:
--

In LUCENE-2599 I deprecated the spatial contrib.  The problem is as Robert 
raises, deprecating the code without providing an alternative isn't that user 
friendly.  I think as part of this issue we should start up the spatial module 
and work towards moving what we can there.  Moving function queries is going to 
take some time since they are very coupled to Solr.  But that shouldn't 
preclude us from putting into the module what we can.  Once we have a module 
that provides a reasonable set of functionality, then we can 
deprecate/gut/remove the spatial contrib.

 Geospatial search using geohash prefixes
 

 Key: SOLR-2155
 URL: https://issues.apache.org/jira/browse/SOLR-2155
 Project: Solr
  Issue Type: Improvement
Reporter: David Smiley
Assignee: Grant Ingersoll
 Attachments: GeoHashPrefixFilter.patch, GeoHashPrefixFilter.patch, 
 GeoHashPrefixFilter.patch, SOLR.2155.p3.patch, SOLR.2155.p3tests.patch


 There currently isn't a solution in Solr for doing geospatial filtering on 
 documents that have a variable number of points.  This scenario occurs when 
 there is location extraction (i.e. via a gazateer) occurring on free text.  
 None, one, or many geospatial locations might be extracted from any given 
 document and users want to limit their search results to those occurring in a 
 user-specified area.
 I've implemented this by furthering the GeoHash based work in Lucene/Solr 
 with a geohash prefix based filter.  A geohash refers to a lat-lon box on the 
 earth.  Each successive character added further subdivides the box into a 4x8 
 (or 8x4 depending on the even/odd length of the geohash) grid.  The first 
 step in this scheme is figuring out which geohash grid squares cover the 
 user's search query.  I've added various extra methods to GeoHashUtils (and 
 added tests) to assist in this purpose.  The next step is an actual Lucene 
 Filter, GeoHashPrefixFilter, that uses these geohash prefixes in 
 TermsEnum.seek() to skip to relevant grid squares in the index.  Once a 
 matching geohash grid is found, the points therein are compared against the 
 user's query to see if it matches.  I created an abstraction GeoShape 
 extended by subclasses named PointDistance... and CartesianBox to support 
 different queried shapes so that the filter need not care about these details.
 This work was presented at LuceneRevolution in Boston on October 8th.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: [jira] [Commented] (SOLR-2155) Geospatial search using geohash prefixes

2011-03-26 Thread Grant Ingersoll
Not really related to this issue, so moving to dev@...

On Mar 26, 2011, at 7:52 AM, Robert Muir (JIRA) wrote:

 
[ 
 https://issues.apache.org/jira/browse/SOLR-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13011616#comment-13011616
  ] 
 
 Robert Muir commented on SOLR-2155:
 ---
 
 well what would the deprecation have suggested as an alternative?

It's a good question.  The tier stuff, IMO and confirmed by others is broken 
for most of the world.  I sunk a good week into fixing it and was so entangled 
in the spaghetti that I gave up.  What we laid out on another issue (I forget 
the number, but I think C Male owns it and says he has a rewrite) is to move to 
modules, keep what we can (geohash and some of the utils) and gut the rest.  
That combined w/ moving function queries to modules would make all of spatial a 
good solution for the large majority of users.  The only thing that would 
remain to be back to our current state (at least in terms of features) would be 
to implement a tier approach.  I've proposed the Military Grid System (there is 
an open JIRA issue for it) as something that looks to be as a good candidate.  
It's well documented on the web and uses a metric for all distances and has the 
benefit that all of NATO uses it, albeit for different purposes.  It also 
addresses the poles and the meridians as first class citizens.  It just needs 
an implementer.  Having said that, I'm not 100% certain.  I also don't know 
that the tier stuff is absolutely necessary.  The combination of what we have 
in function queries plus trie fields makes for a very fast spatial lookup at 
this point.

I'm totally open to other suggestions, however.

Longer term, I've got a lot of ideas for spatial, but that's a different thread.

-Grant
-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: [jira] [Commented] (SOLR-2155) Geospatial search using geohash prefixes

2011-03-26 Thread Yonik Seeley
On Sat, Mar 26, 2011 at 7:32 AM, Robert Muir (JIRA) j...@apache.org wrote:
 I don't really think things like this (queries etc) should go into just Solr

I disagree strongly with the sentiment that queries don't belong in Solr.
Everything developed in/for lucene need not be exported to Solr immediately.
Everything developed in/for solr need not be exported to Lucene immediately.

If the work has been done, and the patch works for Solr, that should
be enough.  Period.

-Yonik
http://www.lucenerevolution.org -- Lucene/Solr User Conference, May
25-26, San Francisco

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: [jira] [Commented] (SOLR-2155) Geospatial search using geohash prefixes

2011-03-26 Thread Grant Ingersoll

On Mar 26, 2011, at 8:24 AM, Robert Muir wrote:

 On Sat, Mar 26, 2011 at 8:06 AM, Grant Ingersoll gsing...@apache.org wrote:
 Not really related to this issue, so moving to dev@...
 
 On Mar 26, 2011, at 7:52 AM, Robert Muir (JIRA) wrote:
 
 
[ 
 https://issues.apache.org/jira/browse/SOLR-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13011616#comment-13011616
  ]
 
 Robert Muir commented on SOLR-2155:
 ---
 
 well what would the deprecation have suggested as an alternative?
 
 It's a good question.  The tier stuff, IMO and confirmed by others is broken 
 for most of the world.  I sunk a good week into fixing it and was so 
 entangled in the spaghetti that I gave up.  What we laid out on another 
 issue (I forget the number, but I think C Male owns it and says he has a 
 rewrite) is to move to modules, keep what we can (geohash and some of the 
 utils) and gut the rest.  That combined w/ moving function queries to 
 modules would make all of spatial a good solution for the large majority of 
 users.  The only thing that would remain to be back to our current state (at 
 least in terms of features) would be to implement a tier approach.  I've 
 proposed the Military Grid System (there is an open JIRA issue for it) as 
 something that looks to be as a good candidate.  It's well documented on the 
 web and uses a metric for all distances and has the benefit that all of NATO 
 uses it, albeit for different purposes.  It also addresses the poles and the 
 meridians as first class citizens.  It just needs an implementer.  Having 
 said that, I'm not 100% certain.  I also don't know that the tier stuff is 
 absolutely necessary.  The combination of what we have in function queries 
 plus trie fields makes for a very fast spatial lookup at this point.
 
 I'm totally open to other suggestions, however.
 
 Longer term, I've got a lot of ideas for spatial, but that's a different 
 thread.
 
 
 I guess the reason I asked my question is more high-level: on one hand
 there are suggestions that lucene's spatial package should have been
 deprecated in 3.1, but on the other hand the very first feature on
 solr 3.1's new feature list is 'improved geospatial support'.
 

It really should say: Added Geospatial Support, as it was non-existent in Solr 
before.

Most of the work for adding in spatial in Solr consisted of improving things in 
Solr to make it easy to leverage the one spatial feature we really added: 
distance based functions and parsing support.  Everything else was generally 
useful things: sorting by function, poly fields, etc.  I started on tier 
support, but dropped it when I realized it was broken beyond repair.  The Solr 
stuff uses, IMO, the stuff in Lucene that works and ignores the rest.  I seem 
to recall Chris had said that once I got done w/ the Solr stuff he would do the 
modules work, but it hasn't happened yet.

I'd say in 3.2, since it sounds like Chris did at least deprecate 
contrib/spatial, that we work to get all of this resolved:  spatial - modules, 
function queries - modules.  Naturally we should do it on trunk, too.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: [jira] [Commented] (SOLR-2155) Geospatial search using geohash prefixes

2011-03-26 Thread Grant Ingersoll

On Mar 26, 2011, at 9:48 AM, Yonik Seeley wrote:

 On Sat, Mar 26, 2011 at 7:32 AM, Robert Muir (JIRA) j...@apache.org wrote:
 I don't really think things like this (queries etc) should go into just Solr
 
 I disagree strongly with the sentiment that queries don't belong in Solr.
 Everything developed in/for lucene need not be exported to Solr immediately.
 Everything developed in/for solr need not be exported to Lucene immediately.
 
 If the work has been done, and the patch works for Solr, that should
 be enough.  Period.
 

I agree it's enough for the contributor to do that, but as committers we need 
to look at the bigger picture in this particular case, which is the move of 
spatial to modules.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: [jira] [Commented] (SOLR-2155) Geospatial search using geohash prefixes

2011-03-26 Thread Yonik Seeley
On Sat, Mar 26, 2011 at 9:57 AM, Grant Ingersoll gsing...@apache.org wrote:

 On Mar 26, 2011, at 9:48 AM, Yonik Seeley wrote:

 On Sat, Mar 26, 2011 at 7:32 AM, Robert Muir (JIRA) j...@apache.org wrote:
 I don't really think things like this (queries etc) should go into just Solr

 I disagree strongly with the sentiment that queries don't belong in Solr.
 Everything developed in/for lucene need not be exported to Solr immediately.
 Everything developed in/for solr need not be exported to Lucene immediately.

 If the work has been done, and the patch works for Solr, that should
 be enough.  Period.


 I agree it's enough for the contributor to do that, but as committers we need 
 to look at the bigger picture in this particular case, which is the move of 
 spatial to modules.

That's a separate asynchronous issue.
Progress should not be blocked in Solr in the meantime.

-Yonik

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: [jira] [Commented] (SOLR-2155) Geospatial search using geohash prefixes

2011-03-26 Thread Robert Muir
On Sat, Mar 26, 2011 at 9:48 AM, Yonik Seeley
yo...@lucidimagination.com wrote:
 On Sat, Mar 26, 2011 at 7:32 AM, Robert Muir (JIRA) j...@apache.org wrote:
 I don't really think things like this (queries etc) should go into just Solr

 I disagree strongly with the sentiment that queries don't belong in Solr.
 Everything developed in/for lucene need not be exported to Solr immediately.
 Everything developed in/for solr need not be exported to Lucene immediately.

 If the work has been done, and the patch works for Solr, that should
 be enough.  Period.


Its not enough for me: you can expect me to start raising questions
and objections when things are committed to the wrong place in the
codebase, its totally appropriate. We merged development, all
committers can commit to the correct places, there are no excuses.

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: [jira] [Commented] (SOLR-2155) Geospatial search using geohash prefixes

2011-03-26 Thread Nicolas Helleringer

  I started on tier support, but dropped it when I realized it was broken
 beyond repair.


I did no know one could break code beyond repair

Nicolas


Re: [jira] [Commented] (SOLR-2155) Geospatial search using geohash prefixes

2011-03-26 Thread Yonik Seeley
On Sat, Mar 26, 2011 at 10:05 AM, Robert Muir rcm...@gmail.com wrote:
 On Sat, Mar 26, 2011 at 9:48 AM, Yonik Seeley
 yo...@lucidimagination.com wrote:
 On Sat, Mar 26, 2011 at 7:32 AM, Robert Muir (JIRA) j...@apache.org wrote:
 I don't really think things like this (queries etc) should go into just Solr

 I disagree strongly with the sentiment that queries don't belong in Solr.
 Everything developed in/for lucene need not be exported to Solr immediately.
 Everything developed in/for solr need not be exported to Lucene immediately.

 If the work has been done, and the patch works for Solr, that should
 be enough.  Period.


 Its not enough for me: you can expect me to start raising questions
 and objections when things are committed to the wrong place in the
 codebase, its totally appropriate. We merged development, all
 committers can commit to the correct places, there are no excuses.

If you're saying Queries don't belong in Solr, I'm a huge -1 on that.
There's no correct place for queries in general - it's all in the context.
If there's a better place for the query that can be achieved with a
mv, then fine.
But there's often much more work involved, dependencies on other solr features,
or fleshing out a real Java API rather than treating something as
simple implementation.

-Yonik

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-2155) Geospatial search using geohash prefixes

2011-03-26 Thread David Smiley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13011653#comment-13011653
 ] 

David Smiley commented on SOLR-2155:


I plan to finish a couple improvements to this patch within 2 weeks time: 
distance function queries to work with multi-value, and polygon queries that 
span the date line.  I've been delayed by some life events (new baby).  
Furthermore, I'll try and ensure that the work here is applicable to pure 
Lucene users (i.e. sans Solr). 

One thing I'm unsure of is how to integrate (or not integrate) existing Lucene 
 Solr spatial code with this patch.  In this patch I chose to re-use some 
basic shape classes in Lucene's spatial contrib simply because they were 
already there, but I could just as easily of not.  My preference going forward 
would be to outright replace Lucene's spatial contrib with this patch.  I also 
think LatLonType and PointType could become deprecated since this patch is not 
only more capable (multiValue support) but faster too.  Well with filtering, 
sorting is TBD. I'm also inclined to name the field type LatLonGeohashType to 
re-enforce the fact that it works with lat  lon; geohash is an implementation 
detail. In the future it might even not be geohash, strictly speaking, once we 
optimize the encoding.

 Geospatial search using geohash prefixes
 

 Key: SOLR-2155
 URL: https://issues.apache.org/jira/browse/SOLR-2155
 Project: Solr
  Issue Type: Improvement
Reporter: David Smiley
Assignee: Grant Ingersoll
 Attachments: GeoHashPrefixFilter.patch, GeoHashPrefixFilter.patch, 
 GeoHashPrefixFilter.patch, SOLR.2155.p3.patch, SOLR.2155.p3tests.patch


 There currently isn't a solution in Solr for doing geospatial filtering on 
 documents that have a variable number of points.  This scenario occurs when 
 there is location extraction (i.e. via a gazateer) occurring on free text.  
 None, one, or many geospatial locations might be extracted from any given 
 document and users want to limit their search results to those occurring in a 
 user-specified area.
 I've implemented this by furthering the GeoHash based work in Lucene/Solr 
 with a geohash prefix based filter.  A geohash refers to a lat-lon box on the 
 earth.  Each successive character added further subdivides the box into a 4x8 
 (or 8x4 depending on the even/odd length of the geohash) grid.  The first 
 step in this scheme is figuring out which geohash grid squares cover the 
 user's search query.  I've added various extra methods to GeoHashUtils (and 
 added tests) to assist in this purpose.  The next step is an actual Lucene 
 Filter, GeoHashPrefixFilter, that uses these geohash prefixes in 
 TermsEnum.seek() to skip to relevant grid squares in the index.  Once a 
 matching geohash grid is found, the points therein are compared against the 
 user's query to see if it matches.  I created an abstraction GeoShape 
 extended by subclasses named PointDistance... and CartesianBox to support 
 different queried shapes so that the filter need not care about these details.
 This work was presented at LuceneRevolution in Boston on October 8th.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



  1   2   >