weijie.tong created DRILL-7087:
----------------------------------

             Summary: Integrate Arrow's Gandiva into Drill
                 Key: DRILL-7087
                 URL: https://issues.apache.org/jira/browse/DRILL-7087
             Project: Apache Drill
          Issue Type: Improvement
          Components: Execution - Codegen, Execution - Relational Operators
            Reporter: weijie.tong


This is a first step toward integrating Arrow into Drill by invoking Arrow's 
Gandiva feature. Comparing the in-memory column representations of Arrow and 
Drill, the two currently differ in how they represent nulls internally: Drill 
uses 1 byte per row to indicate a null, while Arrow uses 1 bit. In addition, 
all Arrow columns are currently nullable. Apart from those basic differences, 
the two use the same memory representation for the different data types. 
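
To illustrate the null-representation gap described above, here is a minimal 
sketch of packing a byte-per-row null indicator into a bit-per-row bitmap. It 
assumes that a non-zero byte means "value is set" on the Drill side and that 
the Arrow validity bitmap is least-significant-bit first with 1 meaning 
"valid". The class and method names are hypothetical, and it works on plain 
heap arrays rather than on the direct memory buffers a real integration would 
touch.

```java
// Hypothetical helper, not part of Drill or Arrow.
final class ValidityBitmap {

    // Packs a byte-per-row indicator (non-zero = set, 0 = null) into a
    // bit-per-row bitmap, least-significant bit first.
    static byte[] packBytesToBits(byte[] isSet) {
        byte[] bitmap = new byte[(isSet.length + 7) / 8];
        for (int i = 0; i < isSet.length; i++) {
            if (isSet[i] != 0) {
                bitmap[i >> 3] |= (byte) (1 << (i & 7));
            }
        }
        return bitmap;
    }

    // Reads the validity bit for one row from the packed bitmap.
    static boolean isValid(byte[] bitmap, int index) {
        return (bitmap[index >> 3] & (1 << (index & 7))) != 0;
    }
}
```

With this layout, nine rows need two bitmap bytes instead of nine indicator 
bytes, and any unused trailing bits in the last byte stay zero.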

The integration strategy is to invoke the native methods of Arrow's 
JniWrapper directly, passing the memory addresses of the ValueVectors. 

I have implemented this in our own Drill version by integrating Gandiva 
into Drill's project operator. Measurements show a performance gain of 
nearly 1x in expression computation.

So if there are no objections, I will submit a PR to contribute this 
feature. This issue also depends on the related Arrow issue [ARROW-4819].

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
