[jira] [Commented] (SOLR-2155) Geospatial search using geohash prefixes
[ https://issues.apache.org/jira/browse/SOLR-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13114839#comment-13114839 ]

geert-jan brits commented on SOLR-2155:
---

David,

I don't want to swamp this discussion, but I have a totally different problem for which I might (mis)use this patch / LSP. It concerns POIs that have multiple opening hours (depending on the day of the week, special holidays, and sometimes even multiple time slots per day). I want to query, for example, all POIs that are open NOW and that will remain open until NOW+3H.

For background, see http://lucene.472066.n3.nabble.com/multiple-dateranges-timeslots-per-doc-modeling-openinghours-td3368790.html on why the normal approaches don't work (afaik): basically, a document needs multiple opening/closing times, and each opening time has to stay paired with its own closing time.

I have the feeling that opening/closing datetimes might be modelled as multiple lat/long points. But I would need a query of the form:

Given a user-defined point x, return all docs that have a point p defined for which:
- x.latitude > p.latitude
- x.longitude < p.longitude

Is this possible? (As far as I can see, GeoFilt, BBox and GeoDist don't give me what I need.)

This is how I envision encoding it:
- each (open, closedelta) tuple is represented as a (lat/long) point
- open is matched on latitude
- closedelta (the closing time, represented as a delta from open) is matched on longitude
- granularity is 5 minutes
- open can be at most 100 days in the future -> ~30,000 distinct values
- closedelta can be at most 24 hours -> ~300 distinct values

Applied to this domain, the lat/long query above becomes:

Given a user-defined open/closedelta-datetime x, return all docs that have an open/close-datetime p defined for which:
- x.open > p.open (the POI is already open at the requested opening time)
- x.closedelta < p.closedelta (the POI is not yet closed at the requested closing time)

In other words, the POI is open from the requested open-datetime until at least the requested close-datetime.

OK, writing this down was a good exercise. The question remains: is this query possible (perhaps with some coding effort)?

Thanks,
Geert-Jan

> Geospatial search using geohash prefixes
>
> Key: SOLR-2155
> URL: https://issues.apache.org/jira/browse/SOLR-2155
> Project: Solr
> Issue Type: Improvement
> Reporter: David Smiley
> Assignee: Grant Ingersoll
> Attachments: GeoHashPrefixFilter.patch, GeoHashPrefixFilter.patch, GeoHashPrefixFilter.patch, SOLR-2155_GeoHashPrefixFilter_with_sorting_no_poly.patch, SOLR.2155.p3.patch, SOLR.2155.p3tests.patch
>
> There currently isn't a solution in Solr for doing geospatial filtering on documents that have a variable number of points. This scenario occurs when there is location extraction (i.e. via a "gazetteer") occurring on free text. None, one, or many geospatial locations might be extracted from any given document, and users want to limit their search results to those occurring in a user-specified area.
>
> I've implemented this by furthering the GeoHash based work in Lucene/Solr with a geohash prefix based filter. A geohash refers to a lat-lon box on the earth. Each successive character added further subdivides the box into a 4x8 (or 8x4, depending on the even/odd length of the geohash) grid. The first step in this scheme is figuring out which geohash grid squares cover the user's search query. I've added various extra methods to GeoHashUtils (and added tests) to assist in this purpose. The next step is an actual Lucene Filter, GeoHashPrefixFilter, that uses these geohash prefixes in TermsEnum.seek() to skip to relevant grid squares in the index. Once a matching geohash grid is found, the points therein are compared against the user's query to see if they match. I created an abstraction GeoShape, extended by subclasses named PointDistance... and CartesianBox, to support different queried shapes so that the filter need not care about these details.
>
> This work was presented at LuceneRevolution in Boston on October 8th.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
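The slot arithmetic and the two inequalities from the opening-hours comment above can be sketched in a few lines of Python. This is only an illustration of the encoding as described there, not code from the patch or from LSP; the names SLOT_MINUTES, to_slot and matches_query are hypothetical, and the check implements nothing more than the two comparisons exactly as written in the comment.

```python
# Sketch of the (open, closedelta) encoding described in the comment.
# All names here are hypothetical; nothing below comes from the patch.

SLOT_MINUTES = 5            # granularity stated in the comment
MAX_OPEN_DAYS = 100         # open lies at most 100 days in the future
MAX_CLOSEDELTA_HOURS = 24   # closedelta is at most 24 hours

# Number of distinct values on each axis at 5-minute granularity.
OPEN_SLOTS = MAX_OPEN_DAYS * 24 * 60 // SLOT_MINUTES          # "~30,000"
CLOSEDELTA_SLOTS = MAX_CLOSEDELTA_HOURS * 60 // SLOT_MINUTES  # "~300"


def to_slot(minutes):
    """Convert a number of minutes into a 5-minute slot index."""
    return minutes // SLOT_MINUTES


def matches_query(query, doc_points):
    """Return True if some point p of the document satisfies
    x.open > p.open and x.closedelta < p.closedelta for the query x."""
    x_open, x_closedelta = query
    return any(
        x_open > p_open and x_closedelta < p_closedelta
        for p_open, p_closedelta in doc_points
    )


# One document with two (open, closedelta) tuples, in slots counted from "now":
# opens in 8 hours for 9 hours, and opens in 19 hours for 3 hours.
doc = [
    (to_slot(8 * 60), to_slot(9 * 60)),
    (to_slot(19 * 60), to_slot(3 * 60)),
]
```

With these definitions, a query such as matches_query((to_slot(10 * 60), to_slot(3 * 60)), doc) compares the query tuple against every tuple of the document, which is the per-document, multi-point behaviour the comment asks about.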
[jira] [Issue Comment Edited] (SOLR-2155) Geospatial search using geohash prefixes
[ https://issues.apache.org/jira/browse/SOLR-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13114726#comment-13114726 ]

geert-jan brits edited comment on SOLR-2155 at 9/26/11 3:45 PM:

Great, thanks. I believe your interpretation of my use-case is correct. I will go the multi-point route first, without the polygons.

Just to clarify: I realize I added to the confusion by bringing polygons to the table when they aren't necessary for the problem I described. I did this because I thought that perhaps 'distance of point to polygon' was implemented in LSP, while 'distance of point to collection of points' was not. In that case, 'transforming the problem space' by representing a 'collection of points' as a polygon and querying for 'distance of point to polygon' instead would have given me what I wanted.

This is all superfluous now, because doing 'distance of point to collection of points' IS possible.

I will check out the code, thanks again!

was (Author: gbrits):

Great, thanks. I believe your interpretation of my use-case is correct. I will go the multi-point route first, without the polygons.

Just to clarify: I realize I added to the confusion by bringing polygons to the table when they aren't necessary for the problem I described. I did this because I thought that perhaps 'distance of point to polygon' was implemented in LSP, while 'point to collection of points' was not. In that case, 'transforming the problem space' by representing a 'collection of points' as a polygon and querying for 'distance of point to polygon' instead would have given me what I wanted.

This is all superfluous now, because doing 'point to collection of points' IS possible.

I will check out the code, thanks again!
[jira] [Commented] (SOLR-2155) Geospatial search using geohash prefixes
[ https://issues.apache.org/jira/browse/SOLR-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13114726#comment-13114726 ]

geert-jan brits commented on SOLR-2155:
---

Great, thanks. I believe your interpretation of my use-case is correct. I will go the multi-point route first, without the polygons.

Just to clarify: I realize I added to the confusion by bringing polygons to the table when they aren't necessary for the problem I described. I did this because I thought that perhaps 'distance of point to polygon' was implemented in LSP, while 'point to collection of points' was not. In that case, 'transforming the problem space' by representing a 'collection of points' as a polygon and querying for 'distance of point to polygon' instead would have given me what I wanted.

This is all superfluous now, because doing 'point to collection of points' IS possible.

I will check out the code, thanks again!
[jira] [Issue Comment Edited] (SOLR-2155) Geospatial search using geohash prefixes
[ https://issues.apache.org/jira/browse/SOLR-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13114686#comment-13114686 ]

geert-jan brits edited comment on SOLR-2155 at 9/26/11 2:16 PM:

David, to clarify: my use-case could be represented as either:
1. a bag of points, in which case I want to be able to return the closest point to a user-defined point and sort on the distance
2. a polygon made of the points (where the points are the vertices of the polygon), returning the closest distance from a user-defined point to the polygon.

Either of the solutions suffices for me; from your answer I can't entirely tell whether that was clear.

You mention: "Sorting by (multi-value) indexed shapes is supported only for points". Does this mean that representation 1 above is supported? It wasn't entirely clear to me from your response.

Let me give you the use-case (and why sorting on the center point / centroid is not going to work):

Consider a travel application in which walks/itineraries can be defined. Most of the walks are defined as round trips (i.e. begin point = end point). In my representation (for now), a walk visits certain points of interest (POIs), each of which has a lat/long point defined, in a certain order. A lot of walks can be started at any given POI (because of the round-trip nature).

I want a user to be able to request walks that are nearby (sorted by distance). For each walk, the distance is the distance from the user-defined point to the closest POI (thus point) defined in the walk.

Does this make sense?

P.S.: Having only thought of representing this problem as polygons to support the 'find closest point' query, I skimmed over the fact that, for my notion of a walk (an ordered collection of points), connecting the points in the order specified may generate a complex (self-intersecting) polygon. Are these polygons supported in the LSP?

was (Author: gbrits):

David, to clarify: my use-case could be represented as either:
1. a bag of points, in which case I want to be able to return the closest point to a user-defined point and sort on the distance
2. a polygon made of the points (where the points are the vertices of the polygon), returning the closest distance from a user-defined point to the polygon.

Either of the solutions suffices for me; from your answer I can't entirely tell whether that was clear.

You mention: "Sorting by (multi-value) indexed shapes is supported only for points". Does this mean that representation 1 above is supported? It wasn't entirely clear to me from your response.

Let me give you the use-case (and why sorting on the center point / centroid is not going to work):

Consider a travel application in which walks/itineraries can be defined. Most of the walks are defined as round trips (i.e. begin point = end point). In my representation (for now), a walk visits certain points of interest (POIs), each of which has a lat/long point defined, in a certain order. A lot of walks can be started at any given POI (because of the round-trip nature).

I want a user to be able to request walks that are nearby (sorted by distance). For each walk, the distance is the distance from the user-defined point to the closest POI (thus point) defined in the walk.

Does this make sense?
[jira] [Commented] (SOLR-2155) Geospatial search using geohash prefixes
[ https://issues.apache.org/jira/browse/SOLR-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13114686#comment-13114686 ]

geert-jan brits commented on SOLR-2155:
---

David, to clarify: my use-case could be represented as either:
1. a bag of points, in which case I want to be able to return the closest point to a user-defined point and sort on the distance
2. a polygon made of the points (where the points are the vertices of the polygon), returning the closest distance from a user-defined point to the polygon.

Either of the solutions suffices for me; from your answer I can't entirely tell whether that was clear.

You mention: "Sorting by (multi-value) indexed shapes is supported only for points". Does this mean that representation 1 above is supported? It wasn't entirely clear to me from your response.

Let me give you the use-case (and why sorting on the center point / centroid is not going to work):

Consider a travel application in which walks/itineraries can be defined. Most of the walks are defined as round trips (i.e. begin point = end point). In my representation (for now), a walk visits certain points of interest (POIs), each of which has a lat/long point defined, in a certain order. A lot of walks can be started at any given POI (because of the round-trip nature).

I want a user to be able to request walks that are nearby (sorted by distance). For each walk, the distance is the distance from the user-defined point to the closest POI (thus point) defined in the walk.

Does this make sense?
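The "bag of points" representation from the comment above (a walk's distance is the distance from the user's point to its closest POI) can be sketched as follows. This is a plain in-memory illustration, not how the patch or LSP computes or sorts distances; the haversine formula is swapped in as one common great-circle distance, and the names haversine_km, walk_distance_km and sort_walks are hypothetical.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, an approximation


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1 = map(radians, a)
    lat2, lon2 = map(radians, b)
    h = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * asin(sqrt(h))


def walk_distance_km(user_point, pois):
    """Distance from the user to a walk = distance to its closest POI."""
    return min(haversine_km(user_point, poi) for poi in pois)


def sort_walks(user_point, walks):
    """Return walk names ordered by their closest-POI distance to the user.

    `walks` maps a walk name to its list of (lat, lon) POIs.
    """
    return sorted(walks, key=lambda name: walk_distance_km(user_point, walks[name]))


# Hypothetical sample data: two walks, one of which has a POI near the user.
walks = {
    "city-loop": [(52.37, 4.89), (52.09, 5.12)],
    "harbour-loop": [(51.92, 4.48)],
}
user = (52.08, 5.10)
nearest_first = sort_walks(user, walks)
```

Taking the minimum over all POIs, rather than the distance to a centroid, is what makes a round trip count as "nearby" whenever any of its possible starting points is close to the user.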
[jira] [Commented] (SOLR-2155) Geospatial search using geohash prefixes
[ https://issues.apache.org/jira/browse/SOLR-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13114585#comment-13114585 ]

geert-jan brits commented on SOLR-2155:
---

I have the impression that this code is meant for drawing shapes and checking whether geospatially enriched documents fall within such a shape. Is that correct?

Perhaps my use-case is also supported, because it's in the 'multi-geopoint domain' as well. I envision documents having multiple lat/long points. I would like to query (sort / filter on) documents by their 'closest point' to a given user-defined lat/long point.

Documents would contain either a bag of lat/long pairs or a polygon made up of these lat/long pairs, and the query would become: return the closest distance from a user-defined point to the polygon.

Before delving into the above code or the LSP code myself: can someone say whether this type of querying is supported?
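The geohash prefix idea in the issue description quoted in this thread can be illustrated with a small, self-contained Python sketch. It is not the GeoHashPrefixFilter code: geohash_encode is a textbook encoder written here for illustration, and terms_with_prefix only imitates the "seek to a prefix in sorted terms" step with a binary search over a Python list, standing in for TermsEnum.seek(). Both function names are hypothetical.

```python
from bisect import bisect_left

BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # standard geohash alphabet


def geohash_encode(lat, lon, length=6):
    """Encode a lat/lon pair as a geohash string of the given length.

    Each character carries 5 bits, alternating longitude and latitude,
    so every extra character splits the current box into 32 cells.
    """
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars = []
    bits, bit_count = 0, 0
    use_lon = True  # the first bit refines longitude
    while len(chars) < length:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            if lon >= mid:
                bits = (bits << 1) | 1
                lon_lo = mid
            else:
                bits <<= 1
                lon_hi = mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                bits = (bits << 1) | 1
                lat_lo = mid
            else:
                bits <<= 1
                lat_hi = mid
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:
            chars.append(BASE32[bits])
            bits, bit_count = 0, 0
    return "".join(chars)


def terms_with_prefix(sorted_terms, prefix):
    """Return all terms starting with `prefix` from a sorted list.

    Binary search jumps to the first candidate; matching terms are
    contiguous in sorted order, so the scan stops at the first mismatch.
    """
    start = bisect_left(sorted_terms, prefix)
    matches = []
    for term in sorted_terms[start:]:
        if not term.startswith(prefix):
            break
        matches.append(term)
    return matches
```

For example, geohash_encode(42.605, -5.603, 5) gives "ezs42", and a longer hash of the same point always starts with the shorter one. That prefix property is what lets a filter treat a short geohash as a grid square and jump straight to the indexed points inside it.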