We already have two JIRA cases for Arrow integration:
https://issues.apache.org/jira/browse/CALCITE-2040 and
https://issues.apache.org/jira/browse/CALCITE-2173.

I think this is an extremely important area of work for the Calcite
project, because it helps us realize the vision of a deconstructed
database[1]. There is a lot of work to do, much of it very interesting
(e.g. writing a thread scheduler, IPC mechanisms, and algorithms for
sort, join and aggregation that work effectively on Arrow data
structures).

If you want to help Masayuki, please step up!

Julian

[1] https://www.slideshare.net/julienledem/from-flat-files-to-deconstructed-database

On Thu, Jun 28, 2018 at 2:24 PM, Michael Mior <[email protected]> wrote:
> That's great! If you could create a JIRA case to track your progress, that
> would be helpful for others who might want to follow along or contribute.
> Thanks!
>
> --
> Michael Mior
> [email protected]
>
>
>
> On Tue, Jun 26, 2018 at 10:36, Masayuki Takahashi <[email protected]> wrote:
>
>> Hi Julian,
>>
>> > Masayuki Takahashi has started to develop an Arrow adapter for
>> Calcite[2], but a lot of work remains to implement all SQL built-in
>> functions and basic relational operators. Building on top of Gandiva we
>> could save a lot of this effort.
>>
>> I will start by setting up a Gandiva development environment and look
>> into how to incorporate it.
>>
>> Thanks.
>>
>>
>>
>> On Sat, Jun 23, 2018 at 3:54, Julian Hyde <[email protected]> wrote:
>> >
>> > Suppose a company wishes to build a graph database using their own
>> innovative graph index data structure. They nevertheless need to implement
>> core relational algebra, core data types, and core built-in functions (+,
>> CASE, SUM, SUBSTRING). And they want to implement these on a
>> memory-efficient data structure (tens of thousands of rows, stored
>> column-oriented, per memory block). This is a massive effort.
>> >
>> > With Calcite+Gandiva+Arrow they just need to create a sequence of
>> relational operators (using RelBuilder, say) and efficient machine code is
>> generated. They can then start adding their own data types, built-in
>> functions, and relational operators, using the same architecture.
>> >
>> > Julian
>> >
>> >
>> > > On Jun 22, 2018, at 11:33 AM, Xiening Dai <[email protected]> wrote:
>> > >
>> > > I was in a talk regarding Gandiva yesterday. Impressive work!
>> > >
>> > > But I am not sure why Calcite would want to integrate with it. To me,
>> Gandiva is on the execution side; in which scenarios would a query planner
>> need an Arrow engine? I read the original JIRA case about implementing a file
>> enumerator, but the intent is still not clear to me. I would appreciate it if
>> you could elaborate. Thanks.
>> > >
>> > >
>> > >> On Jun 22, 2018, at 11:20 AM, Julian Hyde <[email protected]> wrote:
>> > >>
>> > >> There is a discussion on dev@arrow about Gandiva, a kernel for
>> Arrow[1].
>> > >>
>> > >> I think it would be an interesting library on which to build our
>> Arrow engine. (Without a kernel, Arrow is just a data format, but with
>> Gandiva it becomes an engine upon which we can implement all relational
>> operations, albeit on a multi-threaded single node. Potentially this
>> approach can process each row in a few machine cycles, i.e. billions of
>> records per second. Therefore single-node would be sufficient for many
>> queries.)
>> > >>
>> > >> Masayuki Takahashi has started to develop an Arrow adapter for
>> Calcite[2], but a lot of work remains to implement all SQL built-in
>> functions and basic relational operators. Building on top of Gandiva we
>> could save a lot of this effort.
>> > >>
>> > >> Julian
>> > >>
>> > >> [1]
>> https://lists.apache.org/thread.html/f099b3d1e2aaf9803c5c756f872a594baf17e9f25974e3496c9706d9@%3Cdev.arrow.apache.org%3E
>> > >>
>> > >> [2] https://issues.apache.org/jira/browse/CALCITE-2173
>> > >
>> >
>>
>>
>> --
>> 高橋 真之
>>
