[jira] [Commented] (SOLR-2155) Geospatial search using geohash prefixes

2011-09-26 Thread geert-jan brits (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13114839#comment-13114839
 ] 

geert-jan brits commented on SOLR-2155:
---

David, 

I don't want to swamp this discussion, but I have a totally different problem 
for which I might (mis)use this patch / LSP. 

It's about POIs having multiple opening hours (depending on day of week, special 
holidays, and sometimes even multiple timeslots per day). 
I want to query, for example, all POIs that are open NOW and that will remain 
open until NOW+3H. 

For background see: 
http://lucene.472066.n3.nabble.com/multiple-dateranges-timeslots-per-doc-modeling-openinghours-td3368790.html
 on why the normal approaches don't work (afaik): basically, a doc needs 
multiple opening/closing times, and each opening time must stay paired with its 
own closing time.

I have the feeling that opening/closing datetimes might be modelled as multiple 
lat/long points. But I would need a query of the form: 

Given a user defined point x, return all docs that have a point p defined for 
which: 
 - x.latitude > p.latitude
 - x.longitude < p.longitude

Is this possible? (As far as I can see, GeoFilt, BBox, and GeoDist don't provide 
what I need.)

Basically this is how I envision encoding it:
 - each (open, closedelta) tuple is represented as a (lat/long) point 
 - open is matched on latitude
 - closedelta (the closing time, represented as a delta from open) is matched on 
longitude
 - granularity is 5 minutes
 - open can be at most 100 days in the future -> ~30,000 distinct values
 - closedelta can be at most 24 hours -> ~300 distinct values

The above lat/long query applied to this domain would become: 
Given a user-defined open/closedelta-datetime x, return all docs that have an 
open/closedelta-datetime p defined for which: 
 - x.open > p.open (POI is already open at the requested opening time) 
 - x.closedelta < p.closedelta (POI is not yet closed at the requested closing 
time) 

In other words, the POI is open from the requested open-datetime until at least 
the requested close-datetime.

OK, writing this down was a good exercise. The question remains: is this query 
possible (perhaps with some coding effort)?
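
The encoding and query described above can be sketched in a few lines of 
Python. This is only an illustration under stated assumptions: the names 
encode_slot, matches_as_stated and open_throughout, and the use of a fixed 
epoch, are hypothetical and not part of the patch or LSP. The first predicate 
applies the two conditions exactly as written. The second compares absolute 
times (open + closedelta), because comparing the deltas alone does not account 
for a request that starts later than the stored opening time.

```python
from datetime import datetime, timedelta

TICK = timedelta(minutes=5)  # granularity from the comment


def to_ticks(moment, epoch):
    """Number of whole 5-minute ticks between epoch and moment."""
    return (moment - epoch) // TICK


def encode_slot(open_dt, close_dt, epoch):
    """Encode one opening slot as an (open, closedelta) pair of tick counts."""
    open_tick = to_ticks(open_dt, epoch)
    return open_tick, to_ticks(close_dt, epoch) - open_tick


def matches_as_stated(x, p):
    """The two conditions exactly as written: x.open > p.open, x.closedelta < p.closedelta."""
    return x[0] > p[0] and x[1] < p[1]


def open_throughout(x, p):
    """Check on absolute times: stored slot p covers the whole requested window x."""
    return p[0] <= x[0] and x[0] + x[1] <= p[0] + p[1]


# Hypothetical data: one POI slot, open 09:00-12:00 on the epoch day.
EPOCH = datetime(2011, 9, 26)
slot = encode_slot(datetime(2011, 9, 26, 9, 0), datetime(2011, 9, 26, 12, 0), EPOCH)

# Request 11:00-13:00: later start, shorter delta, but it runs past closing time.
late = encode_slot(datetime(2011, 9, 26, 11, 0), datetime(2011, 9, 26, 13, 0), EPOCH)

# Request 09:30-11:30: fully inside the stored slot.
early = encode_slot(datetime(2011, 9, 26, 9, 30), datetime(2011, 9, 26, 11, 30), EPOCH)

# Distinct values at 5-minute granularity (the "~30,000" and "~300" estimates).
OPEN_VALUES = 100 * 24 * 60 // 5   # 100 days
DELTA_VALUES = 24 * 60 // 5        # 24 hours
```

With this data, the 11:00-13:00 request satisfies both conditions as stated 
but fails the absolute-time check, so a query on (open, open + closedelta) 
points may be closer to the intent than one on (open, closedelta) points.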

Thanks, 
Geert-Jan  
 



> Geospatial search using geohash prefixes
> 
>
> Key: SOLR-2155
> URL: https://issues.apache.org/jira/browse/SOLR-2155
> Project: Solr
> Issue Type: Improvement
> Reporter: David Smiley
> Assignee: Grant Ingersoll
> Attachments: GeoHashPrefixFilter.patch, GeoHashPrefixFilter.patch, 
> GeoHashPrefixFilter.patch, 
> SOLR-2155_GeoHashPrefixFilter_with_sorting_no_poly.patch, SOLR.2155.p3.patch, 
> SOLR.2155.p3tests.patch
>
>
> There currently isn't a solution in Solr for doing geospatial filtering on 
> documents that have a variable number of points.  This scenario occurs when 
> there is location extraction (i.e. via a "gazetteer") occurring on free text.  
> None, one, or many geospatial locations might be extracted from any given 
> document and users want to limit their search results to those occurring in a 
> user-specified area.
> I've implemented this by furthering the GeoHash based work in Lucene/Solr 
> with a geohash prefix based filter.  A geohash refers to a lat-lon box on the 
> earth.  Each successive character added further subdivides the box into a 4x8 
> (or 8x4 depending on the even/odd length of the geohash) grid.  The first 
> step in this scheme is figuring out which geohash grid squares cover the 
> user's search query.  I've added various extra methods to GeoHashUtils (and 
> added tests) to assist in this purpose.  The next step is an actual Lucene 
> Filter, GeoHashPrefixFilter, that uses these geohash prefixes in 
> TermsEnum.seek() to skip to relevant grid squares in the index.  Once a 
> matching geohash grid square is found, the points therein are compared against 
> the user's query to see if they match.  I created an abstraction, GeoShape, 
> extended by subclasses named PointDistance... and CartesianBox, to support 
> different queried shapes so that the filter need not care about these details.
> This work was presented at LuceneRevolution in Boston on October 8th.
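
The geohash prefix idea in the description above can be illustrated with a 
small Python sketch. It is not the patch's code and does not use Lucene's API: 
geohash_encode is the standard public geohash encoding, and terms_with_prefix 
is a hypothetical stand-in that uses a sorted list and bisect to mimic seeking 
to a prefix in a sorted term index.

```python
from bisect import bisect_left

BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # geohash alphabet (no a, i, l, o)


def geohash_encode(lat, lon, precision):
    """Standard geohash: interleave longitude/latitude bisection bits, 5 bits per character."""
    lat_range = [-90.0, 90.0]
    lon_range = [-180.0, 180.0]
    chars = []
    bits = 0
    bit_count = 0
    use_lon = True  # bits alternate, starting with longitude
    while len(chars) < precision:
        rng, value = (lon_range, lon) if use_lon else (lat_range, lat)
        mid = (rng[0] + rng[1]) / 2
        if value >= mid:
            bits = (bits << 1) | 1
            rng[0] = mid  # keep the upper half
        else:
            bits <<= 1
            rng[1] = mid  # keep the lower half
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:
            chars.append(BASE32[bits])
            bits = 0
            bit_count = 0
    return "".join(chars)


def terms_with_prefix(sorted_terms, prefix):
    """Seek to the first term >= prefix, then scan while the prefix still matches."""
    start = bisect_left(sorted_terms, prefix)
    matches = []
    for term in sorted_terms[start:]:
        if not term.startswith(prefix):
            break
        matches.append(term)
    return matches


# Hypothetical index of geohash terms, kept sorted like a term dictionary.
index_terms = sorted(["u4pruy", "u4x000", "9q8yyk", "u4pabc"])
cell = geohash_encode(57.64911, 10.40744, 3)
candidates = terms_with_prefix(index_terms, cell)
```

Because a longer geohash always starts with the geohash of its enclosing 
cell, every point inside a grid square shares that square's prefix, which is 
what makes the seek-and-scan step possible. The candidates would still need 
to be checked against the actual query shape.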

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira



-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Issue Comment Edited] (SOLR-2155) Geospatial search using geohash prefixes

2011-09-26 Thread geert-jan brits (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13114726#comment-13114726
 ] 

geert-jan brits edited comment on SOLR-2155 at 9/26/11 3:45 PM:


Great, thanks. I believe your interpretation of my use-case is correct.
I will go the multi-point route first, without the polygons. 

Just to clarify: I realize I added to the confusion by bringing polygons to the 
table where they aren't necessary for the problem I described. 
I did this because I thought that perhaps 'distance of point to polygon' was 
implemented in LSP, while 'distance of point to collection of points' was not.  

In that case 'transforming the problem space' by representing a 'collection of 
points' as a polygon and querying for 'distance of point to polygon' instead 
would have given me what I wanted. This is all superfluous now, because doing 
'distance of point to collection of points' IS possible. 
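
As a side note on the two formulations, a minimal planar sketch in Python 
shows that 'distance of point to collection of points' and 'distance of point 
to polygon' (taken here as distance to the polygon's boundary) can give 
different values for the same vertices. The function names are hypothetical, 
and treating coordinates as flat x/y is an assumption made only to keep the 
example short.

```python
from math import hypot


def dist_to_points(p, points):
    """Distance from p to the nearest point in a bag of points."""
    return min(hypot(p[0] - q[0], p[1] - q[1]) for q in points)


def dist_to_segment(p, a, b):
    """Distance from p to the closest point on segment a-b."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0:
        return hypot(p[0] - a[0], p[1] - a[1])
    # Project p onto the segment and clamp to its end points.
    t = ((p[0] - a[0]) * dx + (p[1] - a[1]) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return hypot(p[0] - (a[0] + t * dx), p[1] - (a[1] + t * dy))


def dist_to_polygon_boundary(p, vertices):
    """Distance from p to the nearest edge of the closed polygon."""
    n = len(vertices)
    return min(dist_to_segment(p, vertices[i], vertices[(i + 1) % n])
               for i in range(n))


square = [(0, 0), (4, 0), (4, 4), (0, 4)]
user = (2, -1)
to_vertices = dist_to_points(user, square)          # nearest corner
to_boundary = dist_to_polygon_boundary(user, square)  # nearest edge
```

Here the nearest edge is closer than the nearest vertex, so the polygon 
formulation would only approximate the closest-point distance.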

I will check out the code, thanks again!




  

[jira] [Commented] (SOLR-2155) Geospatial search using geohash prefixes

2011-09-26 Thread geert-jan brits (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13114726#comment-13114726
 ] 

geert-jan brits commented on SOLR-2155:
---

Great, thanks. I believe your interpretation of my use-case is correct.
I will go the multi-point route first, without the polygons. 

Just to clarify: I realize I added to the confusion by bringing polygons to the 
table where they aren't necessary for the problem I described. 
I did this because I thought that perhaps 'distance of point to polygon' was 
implemented in LSP, while 'point to collection of points' was not.  

In that case 'transforming the problem space' by representing a 'collection of 
points' as a polygon and querying for 'distance of point to polygon' instead 
would have given me what I wanted. This is all superfluous now, because doing 
'point to collection of points' IS possible. 

I will check out the code, thanks again!




[jira] [Issue Comment Edited] (SOLR-2155) Geospatial search using geohash prefixes

2011-09-26 Thread geert-jan brits (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13114686#comment-13114686
 ] 

geert-jan brits edited comment on SOLR-2155 at 9/26/11 2:16 PM:


David, to clarify:
My use-case could be represented as either: 
 1. a bag of points, in which case I want to be able to return the closest 
point to a user-defined point and sort on the distance
 2. a polygon made of the points (where the points are the vertices of the 
polygon) and return the closest distance from a user-defined point to the 
polygon. 

Either solution would suffice for me; from your answer I can't entirely tell 
whether that was clear. 

You mention: "Sorting by (multi-value) indexed shapes is supported only for 
points". 
Does this mean that representation 1 above is supported? It wasn't entirely 
clear to me from your response. 

Let me give you the use-case (and why sorting on the center point / centroid is 
not going to work): 

Consider a travel application in which walks/itineraries can be defined. Most 
of the walks are defined as round trips (i.e., begin point = end point). In my 
representation (for now) a walk visits certain points of interest (POIs), which 
each have a lat/long point defined, in a certain order. 

A lot of walks can be started at any given POI (because of the round-trip 
nature). I want a user to be able to request walks that are nearby (sorted by 
distance). For each walk, the distance is that from the user-defined point to 
the closest POI (thus point) defined in the walk. 

Does this make sense?
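
The 'nearby walks' ordering described above can be sketched as follows. This 
is a minimal Python illustration, not LSP code: the function names and the 
sample walks are hypothetical, and the haversine formula with a mean Earth 
radius of 6371 km is assumed as the distance measure.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (assumed distance model)


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1 = map(radians, a)
    lat2, lon2 = map(radians, b)
    h = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * asin(sqrt(h))


def walk_distance_km(user, pois):
    """Distance of a walk = distance from the user to its closest POI."""
    return min(haversine_km(user, poi) for poi in pois)


def nearby_walks(user, walks):
    """Return walk names sorted by closest-POI distance to the user."""
    return sorted(walks, key=lambda name: walk_distance_km(user, walks[name]))


# Hypothetical walks: "long" passes through the user's location but also
# visits a far-away POI, so its centroid is far from the user.
user = (52.37, 4.90)
walks = {
    "short": [(52.40, 4.95), (52.41, 4.96)],
    "long": [(52.37, 4.90), (53.00, 6.00)],
}
ordered = nearby_walks(user, walks)
```

The "long" walk sorts first because one of its POIs coincides with the user's 
location, even though its centroid is much farther away than that of "short".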

P.S.: having only thought of representing this problem as polygons to support 
the 'find closest point' query, I skimmed over the fact that for my notion of a 
walk (an ordered collection of points), connecting the points in the order 
specified may generate a complex (self-intersecting) polygon. Are such 
polygons supported in the LSP? 
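
Whether a given walk actually yields a self-intersecting polygon can be 
checked before indexing. The following Python sketch is an assumption-laden 
illustration: it treats lat/long as planar coordinates, only detects edges 
that properly cross (not ones that merely touch or overlap), and uses 
hypothetical function names.

```python
def orient(a, b, c):
    """Sign of the cross product (b - a) x (c - a): 1 left turn, -1 right turn, 0 collinear."""
    v = (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])
    return (v > 0) - (v < 0)


def segments_cross(p1, p2, q1, q2):
    """True if segments p1-p2 and q1-q2 cross at a single interior point."""
    return (orient(p1, p2, q1) * orient(p1, p2, q2) < 0
            and orient(q1, q2, p1) * orient(q1, q2, p2) < 0)


def is_self_intersecting(points):
    """Check the closed polygon formed by visiting points in order."""
    n = len(points)
    edges = [(points[i], points[(i + 1) % n]) for i in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # Neighbouring edges share a vertex and cannot properly cross.
            if j == i + 1 or (i == 0 and j == n - 1):
                continue
            if segments_cross(edges[i][0], edges[i][1], edges[j][0], edges[j][1]):
                return True
    return False


square_walk = [(0, 0), (1, 0), (1, 1), (0, 1)]   # simple polygon
bowtie_walk = [(0, 0), (1, 1), (1, 0), (0, 1)]   # edges cross in the middle
```

The pairwise check is quadratic in the number of POIs, which is probably 
acceptable for walks with a handful of points.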



[jira] [Commented] (SOLR-2155) Geospatial search using geohash prefixes

2011-09-26 Thread geert-jan brits (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13114686#comment-13114686
 ] 

geert-jan brits commented on SOLR-2155:
---

David, to clarify:
My use-case could be represented as either: 
 1. a bag of points, in which case I want to be able to return the closest 
point to a user-defined point and sort on the distance
 2. a polygon made of the points (where the points are the vertices of the 
polygon) and return the closest distance from a user-defined point to the 
polygon. 

Either solution would suffice for me; from your answer I can't entirely tell 
whether that was clear. 

You mention: "Sorting by (multi-value) indexed shapes is supported only for 
points". 
Does this mean that representation 1 above is supported? It wasn't entirely 
clear to me from your response. 

Let me give you the use-case (and why sorting on the center point / centroid is 
not going to work): 

Consider a travel application in which walks/itineraries can be defined. Most 
of the walks are defined as round trips (i.e., begin point = end point). In my 
representation (for now) a walk visits certain points of interest (POIs), which 
each have a lat/long point defined, in a certain order. 

A lot of walks can be started at any given POI (because of the round-trip 
nature). I want a user to be able to request walks that are nearby (sorted by 
distance). For each walk, the distance is that from the user-defined point to 
the closest POI (thus point) defined in the walk. 

Does this make sense? 



[jira] [Commented] (SOLR-2155) Geospatial search using geohash prefixes

2011-09-26 Thread geert-jan brits (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13114585#comment-13114585
 ] 

geert-jan brits commented on SOLR-2155:
---

I have the impression that this code is meant for drawing shapes and checking 
whether geospatially enriched documents fall within a shape. Is that correct?

Perhaps my use-case is also supported, because it's in the 'multi-geopoint 
domain' as well.

I envision documents having multiple lat/long points. I would like to query 
(sort / filter on) documents by their 'closest point' to a given user-defined 
lat/long point. Documents would contain either a bag of lat/long pairs or a 
polygon made up of these lat/long pairs, and the query would become: return 
the closest distance from a user-defined point to the polygon. 

Before delving into the above code or the LSP code myself, perhaps someone can 
say whether this type of querying is supported? 



