Jinfeng Ni created DRILL-5585:
---------------------------------
Summary: UnionAll operator generates run-time code for every
incoming batch
Key: DRILL-5585
URL: https://issues.apache.org/jira/browse/DRILL-5585
Project: Apache Drill
Issue Type: Bug
Reporter: Jinfeng Ni
Assignee: Jinfeng Ni
In Drill's execution framework, each operator may generate run-time code for
various purposes. Code generation and compilation should happen only when
the incoming batch carries a new schema ({{OK_NEW_SCHEMA}}). For any follow-up
batch with the same schema ({{OK}}), the operator should not generate the
run-time code again, since it is already available.
However, in the current implementation of UnionAll, regardless of whether the
incoming batch returns {{OK_NEW_SCHEMA}} or {{OK}}, it always calls doWork(),
which essentially will 1) generate code and possibly compile it, 2) doSetup,
3) doEvaluation. The code generation step is unnecessary for {{OK}} batches,
and repeating it for each batch significantly hurts the operator's performance
and slows down query execution.
{code}
case OK_NEW_SCHEMA:
  outputFields = unionAllInput.getOutputFields();
  // no break here: falls through to the OK case
case OK:
  IterOutcome workOutcome = doWork();
{code}
When the run-time code is generated repeatedly, compilation can be skipped
unless there is a miss in the code cache. However, the current logic is still
problematic, since it still has to use {{ClassGenerator}} to generate the
run-time source code for every batch.
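
The intended behavior can be sketched roughly as below. This is only an
illustration, not Drill's actual code: the class UnionAllSketch, the method
generateAndSetup(), the counters, and the two-value IterOutcome enum are all
hypothetical stand-ins. The point is that code generation and setup run only
on {{OK_NEW_SCHEMA}}, while evaluation runs for every batch.

```java
// Hypothetical sketch: none of these names are Drill's real classes or methods.
class UnionAllSketch {
    // Stand-in for the batch outcome; only the two values discussed in this issue.
    enum IterOutcome { OK_NEW_SCHEMA, OK }

    private int codeGenCount = 0;  // times run-time code was (re)generated
    private int evalCount = 0;     // times a batch was evaluated
    private Runnable generated;    // placeholder for the generated, set-up code

    // Stand-in for "generate code, possibly compile, then doSetup".
    private void generateAndSetup() {
        codeGenCount++;
        generated = () -> evalCount++;
    }

    void onBatch(IterOutcome outcome) {
        switch (outcome) {
            case OK_NEW_SCHEMA:
                // The schema changed, so the generated code must be rebuilt.
                generateAndSetup();
                // fall through: the batch still has to be evaluated
            case OK:
                if (generated == null) {
                    throw new IllegalStateException("OK received before OK_NEW_SCHEMA");
                }
                // Same schema: reuse the existing code, evaluate only.
                generated.run();
                break;
        }
    }

    int codeGenCount() { return codeGenCount; }
    int evalCount() { return evalCount; }

    public static void main(String[] args) {
        UnionAllSketch op = new UnionAllSketch();
        op.onBatch(IterOutcome.OK_NEW_SCHEMA);
        op.onBatch(IterOutcome.OK);
        op.onBatch(IterOutcome.OK);
        // One generation for three evaluated batches.
        System.out.println(op.codeGenCount() + " generation(s), " + op.evalCount() + " evaluation(s)");
    }
}
```

With this split, the cost of running the code generator is paid once per
schema change instead of once per batch, independent of whether the code cache
would have avoided the compilation.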