There is a discussion on dev@arrow about Gandiva, a kernel for Arrow[1]. I think it would be an interesting library on which to build our Arrow engine. (Without a kernel, Arrow is just a data format; with Gandiva it becomes an engine on which we can implement all relational operations, albeit on a single multi-threaded node. This approach can potentially process each row in a few machine cycles, i.e. billions of records per second, so a single node would be sufficient for many queries.)
Masayuki Takahashi has started to develop an Arrow adapter for Calcite[2], but a lot of work remains to implement all SQL built-in functions and basic relational operators. Building on top of Gandiva, we could save a lot of this effort.

Julian

[1] https://lists.apache.org/thread.html/f099b3d1e2aaf9803c5c756f872a594baf17e9f25974e3496c9706d9@%3Cdev.arrow.apache.org%3E
[2] https://issues.apache.org/jira/browse/CALCITE-2173
