weijie.tong created DRILL-7087:
----------------------------------
Summary: Integrate Arrow's Gandiva into Drill
Key: DRILL-7087
URL: https://issues.apache.org/jira/browse/DRILL-7087
Project: Apache Drill
Issue Type: Improvement
Components: Execution - Codegen, Execution - Relational Operators
Reporter: weijie.tong
This is preliminary work toward integrating Arrow into Drill by invoking its
Gandiva feature. Comparing the in-memory column representations of Arrow and
Drill, the two currently differ in how they represent nulls internally: Drill
uses 1 byte per row, while Arrow uses 1 bit per row, to indicate a null. Also,
all Arrow columns are currently nullable. Apart from those basic differences,
the two use the same memory representation for the different data types.
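
To illustrate the null-representation gap described above, here is a minimal
sketch in plain Java (no Drill or Arrow classes) that packs a byte-per-row
indicator array into a bit-per-row bitmap. The class and method names are
hypothetical. The polarity (a nonzero byte or a set bit means "value present")
and the least-significant-bit-first ordering within each byte are assumptions
made for the example, not something taken from the actual patch.

```java
public class NullBitmapSketch {

    // Hypothetical helper: packs a byte-per-row "is set" array into a
    // bit-per-row validity bitmap. Assumes nonzero byte == value present
    // and LSB-first bit order within each bitmap byte.
    static byte[] toValidityBitmap(byte[] isSet) {
        byte[] bitmap = new byte[(isSet.length + 7) / 8];
        for (int i = 0; i < isSet.length; i++) {
            if (isSet[i] != 0) {
                bitmap[i >> 3] |= (byte) (1 << (i & 7));
            }
        }
        return bitmap;
    }

    // Reads the validity bit for one row back out of the bitmap.
    static boolean isValid(byte[] bitmap, int index) {
        return (bitmap[index >> 3] & (1 << (index & 7))) != 0;
    }

    public static void main(String[] args) {
        // Ten rows; rows 1, 4, 5, 6 and 9 are null in this made-up input.
        byte[] isSet = {1, 0, 1, 1, 0, 0, 0, 1, 1, 0};
        byte[] bitmap = toValidityBitmap(isSet);
        for (int i = 0; i < isSet.length; i++) {
            System.out.println("row " + i + " valid=" + isValid(bitmap, i));
        }
    }
}
```

A copy like this costs one pass over the indicator bytes per batch. Whether the
real integration converts, or shares the representation some other way, is left
open here.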
The integration strategy is to invoke the native methods of Arrow's JniWrapper
directly, passing the ValueVector's memory address.
I have done an implementation in our own Drill version by integrating Gandiva
into Drill's project operator. It shows a performance gain of nearly 1x in
expression computation.
So if there is no objection, I will submit a related PR to contribute this
feature. This issue also depends on a related Arrow issue [ARROW-4819].
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)