How do you propose we start? I can begin by writing a spec for the implementation, modeled on how Postgres and others implement this.
On Thu, Jun 29, 2017 at 12:08 AM, Julian Hyde <jh...@apache.org> wrote:
> No, anyone brave enough to be an early adopter would already be on this list.
>
> James (Phoenix), Jacques (Dremio), Fabian (Flink), Jinfeng/Aman (Drill) can
> be proxies for their communities. And others who are building
> interesting stuff I don’t know about.
>
> Julian
>
>> On Jun 28, 2017, at 10:29 AM, Atri Sharma <atri.j...@gmail.com> wrote:
>>
>> Should we start a thread with potential users, e.g. the Phoenix community?
>>
>> On Wed, Jun 28, 2017 at 10:55 PM, Julian Hyde <jh...@apache.org> wrote:
>>> I’d also like to hear from potential users of this feature. They could try
>>> this functionality as it becomes available, and help us prioritize features.
>>>
>>> Julian
>>>
>>>
>>>> On Jun 28, 2017, at 9:10 AM, Atri Sharma <atri.j...@gmail.com> wrote:
>>>>
>>>> And I have created the JIRA:
>>>>
>>>> https://issues.apache.org/jira/browse/CALCITE-1861
>>>>
>>>>
>>>>
>>>> On Wed, Jun 28, 2017 at 7:02 AM, Julian Hyde <jh...@apache.org> wrote:
>>>>> Is anyone looking for a neat project in Calcite that would have a big
>>>>> impact? I'm thinking that we could add support for spatial indexes to
>>>>> Calcite in such a way that downstream projects such as Phoenix and
>>>>> Flink could easily benefit from it.
>>>>>
>>>>> GIS (Geographic Information Systems, aka Spatial database) is really
>>>>> useful functionality to have in your database. To find restaurants
>>>>> less than 1 km from downtown San Francisco, you could run
>>>>>
>>>>>   select *
>>>>>   from restaurants as r
>>>>>   where st_distance(point(-122.4194, 37.7749), r.coordinates) <= 1;
>>>>>
>>>>> There are mature SQL implementations of GIS in PostGIS, Oracle Spatial
>>>>> and Microsoft SQL Server; and OpenGIS has standardized SQL
>>>>> extensions[1].
>>>>>
>>>>> Now, the SQL-GIS standard is rather large, and involves implementing
>>>>> lots of data types and scalar functions. We could get to that
>>>>> eventually. But I contend that many, many applications would be
>>>>> satisfied by points and distances (like the query above) and a spatial
>>>>> index to make them run quickly. And I believe that we can add spatial
>>>>> index support to Calcite using a logical rewrite rule.
>>>>>
>>>>> Rewriting spatial queries to indexes on space-filling curves is a
>>>>> well-established technique [2].
>>>>>
>>>>> Suppose that the restaurants table, above, had columns latitude and
>>>>> longitude and a computed numeric column h = hilbert(latitude,
>>>>> longitude). Hilbert curves are space-filling curves such that if two
>>>>> points are close in space then their h values will be close. So, if
>>>>> there is an index on h, we can find all restaurants close to a given
>>>>> point using a range scan of the index.
>>>>>
>>>>> So, the above query could be rewritten to something like
>>>>>
>>>>>   select *
>>>>>   from restaurants as r
>>>>>   where (r.h between 123456 and 123599
>>>>>       or r.h between 256789 and 259887)
>>>>>   and st_distance_internal(point(-122.4194, 37.7749), r.coordinates) <= 1;
>>>>>
>>>>> The range predicates on r.h quickly eliminate 99.9% of the rows in the
>>>>> database, and the call to st_distance_internal eliminates the
>>>>> remaining false positives.
>>>>>
>>>>> That rewrite can be done using a logical rewrite rule, and the
>>>>> resulting query will be faster on just about any database, but
>>>>> especially one with key-sorted tables (like Phoenix/HBase) or
>>>>> range-partitioned tables. The database does not need to have a
>>>>> dedicated "spatial index" data structure.
>>>>>
>>>>> Julian
>>>>>
>>>>> [1] http://www.opengeospatial.org/standards/sfs
>>>>>
>>>>> [2] http://math.bme.hu/~gnagy/mmsz/eloadasok/BisztrayDenes2014.pdf
>>>>
>>>>
>>>>
>>>> --
>>>> Regards,
>>>>
>>>> Atri
>>>> l'apprenant
>>>
>>
>>
>>
>> --
>> Regards,
>>
>> Atri
>> l'apprenant
>

--
Regards,

Atri
l'apprenant
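P.S. To make the computed column h = hilbert(latitude, longitude) from the quoted proposal concrete, here is a minimal sketch in Python. It uses the well-known iterative mapping from grid coordinates to a position along a Hilbert curve. The function names (hilbert_index, quantize), the grid resolution (order), and the choice to quantize over the full latitude/longitude range are all assumptions for illustration; nothing here is taken from Calcite or from any existing implementation.

```python
def hilbert_index(x, y, order):
    """Map integer cell (x, y) on a 2^order x 2^order grid to its
    position along a Hilbert curve (hypothetical helper)."""
    n = 1 << order
    d = 0
    s = n >> 1
    while s > 0:
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/flip the quadrant so the sub-curve is oriented correctly.
        if ry == 0:
            if rx == 1:
                x = n - 1 - x
                y = n - 1 - y
            x, y = y, x
        s >>= 1
    return d


def quantize(value, lo, hi, order):
    """Scale a value in [lo, hi] to an integer cell in [0, 2^order - 1]."""
    n = 1 << order
    cell = int((value - lo) / (hi - lo) * n)
    return min(max(cell, 0), n - 1)


def hilbert(latitude, longitude, order=16):
    """Assumed shape of the computed column: quantize, then index."""
    x = quantize(longitude, -180.0, 180.0, order)
    y = quantize(latitude, -90.0, 90.0, order)
    return hilbert_index(x, y, order)


# Example: the point used in the query above.
h = hilbert(37.7749, -122.4194)
```

Consecutive h values always fall in neighbouring grid cells, which is what makes a range scan on h useful. A circular search region generally maps to several disjoint h ranges, which is why the rewritten query above has more than one BETWEEN predicate and still needs the exact distance check to remove false positives.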