How do you propose we start here? I can start writing a spec based on
the implementation, based on how Postgres and others implement them.

On Thu, Jun 29, 2017 at 12:08 AM, Julian Hyde <jh...@apache.org> wrote:
> No, anyone brave enough to be an early adopter would already be on this list.
>
> James (Phoenix), Jacques (Dremio), Fabian (Flink), Jinfeng/Aman (Drill) can 
> be proxies for their their communities. And others who are building 
> interesting stuff I don’t know about.
>
> Julian
>
>> On Jun 28, 2017, at 10:29 AM, Atri Sharma <atri.j...@gmail.com> wrote:
>>
>> Should we start a thread with potential users, eg Phoenix community?
>>
>> On Wed, Jun 28, 2017 at 10:55 PM, Julian Hyde <jh...@apache.org> wrote:
>>> I’d also like to hear from potential users of this feature. They could try 
>>> this functionality as it becomes available, and help us prioritize features.
>>>
>>> Julian
>>>
>>>
>>>> On Jun 28, 2017, at 9:10 AM, Atri Sharma <atri.j...@gmail.com> wrote:
>>>>
>>>> And I have created the JIRA:
>>>>
>>>> https://issues.apache.org/jira/browse/CALCITE-1861
>>>>
>>>>
>>>>
>>>> On Wed, Jun 28, 2017 at 7:02 AM, Julian Hyde <jh...@apache.org> wrote:
>>>>> Is anyone looking for a neat project in Calcite that would have a big
>>>>> impact? I'm thinking that we could add support for spatial indexes to
>>>>> Calcite in such a way that downstream projects such as Phoenix and
>>>>> Flink could easily benefit from it.
>>>>>
>>>>> GIS (Geographic Information Systems, aka Spatial database) is really
>>>>> useful functionality to have in your database. To find restaurants
>>>>> less than 1 km from downtown San Francisco, you could run
>>>>>
>>>>> select *
>>>>> from restaurants as r
>>>>> where st_distance(point(-122.4194, 37.7749), r.coordinates) <= 1;
>>>>>
>>>>> There are mature SQL implementations of GIS in PostGIS, Oracle Spatial
>>>>> and Microsoft SQL Server; and OpenGIS has standardized SQL
>>>>> extensions[1].
>>>>>
>>>>> Now, the SQL-GIS standard is rather large, and involves implementing
>>>>> lots of data types and scalar functions. We could get to that
>>>>> eventually. But I contend that many, many applications would be
>>>>> satisfied by points and distances (like the query above) and a spatial
>>>>> index to make them run quickly. And I believe that we can add spatial
>>>>> index support to Calcite using a logical rewrite rule.
>>>>>
>>>>> Rewriting spatial queries to indexes on space-filling curves is a
>>>>> well-established technique [2].
>>>>>
>>>>> Suppose that the restaurants table, above, had columns latitude and
>>>>> longitude and a computed numeric column h = hilbert(latitude,
>>>>> longitude). Hilbert curves are space-filling curves such that if two
>>>>> points are close in space then their h values will be close. So, if
>>>>> there is an index on h, we can find all restaurants close to a given
>>>>> point using a range scan of the index.
>>>>>
>>>>> So, the above query could be rewritten to something like
>>>>>
>>>>> select *
>>>>> from restaurants as r
>>>>> where (r.h between 123456 and 123599
>>>>>     or r.h between 256789 and 259887)
>>>>>   and st_distance_internal(point(-122.4194, 37.7749), r.coordinates) <= 1;
>>>>>
>>>>> The range predicates on r.h quickly eliminate 99.9% of the rows in the
>>>>> database, and the call to st_distance_internal eliminates the
>>>>> remaining false positives.
>>>>>
>>>>> That rewrite can be done using a logical rewrite rule, and the
>>>>> resulting query will be faster on just about any database, but
>>>>> especially one with key-sorted tables (like Phoenix/HBase) or
>>>>> range-partitioned tables. The database does not need to have a
>>>>> dedicated "spatial index" data structure.
>>>>>
>>>>> Julian
>>>>>
>>>>> [1] http://www.opengeospatial.org/standards/sfs
>>>>>
>>>>> [2] http://math.bme.hu/~gnagy/mmsz/eloadasok/BisztrayDenes2014.pdf
>>>>
>>>>
>>>>
>>>> --
>>>> Regards,
>>>>
>>>> Atri
>>>> l'apprenant
>>>
>>
>>
>>
>> --
>> Regards,
>>
>> Atri
>> l'apprenant
>



-- 
Regards,

Atri
l'apprenant

Reply via email to