Jinfeng Ni created DRILL-5585:
---------------------------------

             Summary: UnionAll operator generates run-time code for every 
incoming batch
                 Key: DRILL-5585
                 URL: https://issues.apache.org/jira/browse/DRILL-5585
             Project: Apache Drill
          Issue Type: Bug
            Reporter: Jinfeng Ni
            Assignee: Jinfeng Ni


In Drill's execution framework, each operator may generate run-time code for 
various purposes. Code generation and compilation should happen only when an 
incoming batch arrives with a new schema ({{OK_NEW_SCHEMA}}). For any follow-up 
batch with the same schema ({{OK}}), the operator should not regenerate the 
run-time code, since it is already available. 

However, in the current implementation of UnionAll, regardless of whether the 
incoming batch returns {{OK_NEW_SCHEMA}} or {{OK}}, the operator always calls 
doWork(), which essentially would 1) generate code and possibly compile it, 
2) doSetup, 3) doEvaluation. Regenerating the code is unnecessary when the 
schema has not changed, and doing it for every batch would significantly hurt 
the operator's performance and slow down query execution. 

{code}
        case OK_NEW_SCHEMA:
          outputFields = unionAllInput.getOutputFields();
        case OK:
          IterOutcome workOutcome = doWork();
{code}
When the code is generated repeatedly, compilation can be skipped unless there 
is a miss in the code cache. However, the current logic is still problematic, 
since it still has to use {{ClassGenerator}} to generate the run-time source 
code for every batch. 
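
The intended behavior described above could be sketched roughly as follows. 
This is a minimal, self-contained illustration, not Drill's actual UnionAll 
code: the class {{CodegenGatingSketch}}, its methods, and the counters are 
hypothetical stand-ins, and {{generateCode()}} only simulates the expensive 
{{ClassGenerator}} step. The point is that generation happens only in the 
{{OK_NEW_SCHEMA}} branch, and the {{OK}} branch reuses what was generated.

```java
/** Hypothetical stand-in for an operator; NOT Drill's actual UnionAll code. */
final class CodegenGatingSketch {

  /** Mirrors the two outcomes named in the issue. */
  enum IterOutcome { OK_NEW_SCHEMA, OK }

  private int codegenCount = 0;      // how many times code was "generated"
  private int batchesProcessed = 0;  // how many batches were evaluated
  private Runnable generatedWorker;  // stands in for the generated class instance

  /** Simulates the expensive code generation (and possible compilation) step. */
  private Runnable generateCode() {
    codegenCount++;
    return () -> batchesProcessed++;
  }

  /** Handles one incoming batch, regenerating code only on a schema change. */
  void onBatch(IterOutcome outcome) {
    switch (outcome) {
      case OK_NEW_SCHEMA:
        // New schema: this is the only place code is (re)generated.
        generatedWorker = generateCode();
        // fall through to evaluate the batch with the fresh code
      case OK:
        if (generatedWorker == null) {
          throw new IllegalStateException("OK received before OK_NEW_SCHEMA");
        }
        // Same schema: reuse the previously generated code.
        generatedWorker.run();
        break;
      default:
        throw new IllegalArgumentException("Unexpected outcome: " + outcome);
    }
  }

  int codegenCount() { return codegenCount; }

  int batchesProcessed() { return batchesProcessed; }

  public static void main(String[] args) {
    CodegenGatingSketch op = new CodegenGatingSketch();
    IterOutcome[] outcomes = {
      IterOutcome.OK_NEW_SCHEMA, IterOutcome.OK, IterOutcome.OK,
      IterOutcome.OK_NEW_SCHEMA, IterOutcome.OK
    };
    for (IterOutcome outcome : outcomes) {
      op.onBatch(outcome);
    }
    // prints "batches=5 codegen=2"
    System.out.println("batches=" + op.batchesProcessed()
        + " codegen=" + op.codegenCount());
  }
}
```

With five batches and two schema changes, the sketch evaluates all five 
batches but generates code only twice, whereas the pattern quoted in the 
issue would generate code five times. 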




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
