Hello Bharath Vissapragada, Philip Zeyliger, Tim Armstrong, Csaba Ringhofer, 
Impala Public Jenkins, Vuk Ercegovac,

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/11760

to look at the new patch set (#6).

Change subject: IMPALA-7655: Rewrite if, isnull, coalesce to use CASE
......................................................................

IMPALA-7655: Rewrite if, isnull, coalesce to use CASE

See IMPALA-7655 for backgound. Tim found that the current interpreted
forms of if, isnull and coalesce are slow compared to the
code-generated CASE statement.

This patch rewrites the above functions into the equivalent CASE
structure.

The rewrite engine has many bugs that are beyond the scope of
this change to fix. This change codes around those bugs. The
result is that the conditional rewrite happens some of the time,
and sometimes produces less-than perfect optimizations.

Conditionals in the top-level ORDER BY clause are not rewritten
(IMPALA-7753), but those one or more levels down are. Some expressions
involving NULL are not simplified (IMPALA-7769). Several hacks were
used to work around the fact that the rewrite engine ignores unanalyzed
expressions, yet the rewrite engine does not, in general, re-analyze
the expressions it produces, causing simplifications to be skipped
(IMPALA-7754). And so on.

As a result, the BE retains the original interpreted forms that
are still used in two cases: 1) top-leel conditions in the
ORDER BY clause, and 2) if the user disables rewrites.

Further, code generation does not occur for CASE statements in the
SELECT clause when it is in the root fragment (the most common case
in simple tests.) This is another known bug (IMPALA-4356).

One possible performance regression is that the new form of the code
evaluates some expressions twice, where the original interpreted
code evaluated the argument once. E.g. coalesce(id, 10) is rewritten
to CASE WHEN id IS NULL THEN id ELSE 10 END. Here, id is evaluated
twice. If the "id" were replaced by a complex sub-expression, the
gain from compilation could be offset by doing work twice.
(IMPALA-7737)

Still, the fix provides most of what the JIRA ticket requested
within the limitations of the existing code.

Conditional function rewrites are moved into a new class,
RewriteConditionalsRule in order to keep things simple.

Most functions use the simplest possible rewrite, relying on the
existing rewrite rules for further simplification.  The one exception
is coalesce(): the existing code relies on the semantics of the
function and so was retained and slightly improved. The code was
extended to produce a CASE statement directly, retaining existing
simplifications.

Tests for conditional functions were in one large function along with
other rewrite tests. Moved them into a new file, then broke up the
tests by function to allow much easier debugging of each function
one-by-one.  This required moving the common test mechanims into a
new common base class.

Existing tests focus on one or two rules at a time. The conditional
function rewrite, however, relies on the entire set of rules being
applied repeatedly. So, added a new FullRewriteTest case to verify this
behavior. This class contains several commented-out tests that cannot
pass due to existing rewrite bugs noted above.

Changing the rewrite cause the PlannerTest to produce different plans
than previously. Changed the expected results file to match the new
rewrite rules.

Change-Id: I526654d8546e17b2545c42cc59dab66d9fe1b163
---
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
A fe/src/main/java/org/apache/impala/rewrite/RewriteConditionalFnsRule.java
M fe/src/main/java/org/apache/impala/rewrite/SimplifyConditionalsRule.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeExprsTest.java
A fe/src/test/java/org/apache/impala/analysis/BaseRewriteRulesTest.java
M fe/src/test/java/org/apache/impala/analysis/ExprRewriteRulesTest.java
A fe/src/test/java/org/apache/impala/analysis/FullRewriteTest.java
A fe/src/test/java/org/apache/impala/analysis/RewriteConditionalFnsRuleTest.java
M testdata/workloads/functional-planner/queries/PlannerTest/aggregation.test
M testdata/workloads/functional-planner/queries/PlannerTest/analytic-fns.test
M testdata/workloads/functional-planner/queries/PlannerTest/implicit-joins.test
M testdata/workloads/functional-planner/queries/PlannerTest/joins.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/nested-collections.test
M testdata/workloads/functional-planner/queries/PlannerTest/outer-joins.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/predicate-propagation.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/runtime-filter-propagation.test
16 files changed, 836 insertions(+), 309 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/60/11760/6
--
To view, visit http://gerrit.cloudera.org:8080/11760
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I526654d8546e17b2545c42cc59dab66d9fe1b163
Gerrit-Change-Number: 11760
Gerrit-PatchSet: 6
Gerrit-Owner: Paul Rogers <par0...@yahoo.com>
Gerrit-Reviewer: Bharath Vissapragada <bhara...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <csringho...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Paul Rogers <par0...@yahoo.com>
Gerrit-Reviewer: Philip Zeyliger <phi...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com>
Gerrit-Reviewer: Vuk Ercegovac <vercego...@cloudera.com>

Reply via email to