[Impala-ASF-CR] IMPALA-7655: Rewrite if, isnull, coalesce to use CASE

2018-10-31 Thread Paul Rogers (Code Review)
Paul Rogers has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11760 )

Change subject: IMPALA-7655: Rewrite if, isnull, coalesce to use CASE
..


Patch Set 10:

(10 comments)

http://gerrit.cloudera.org:8080/#/c/11760/8//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/11760/8//COMMIT_MSG@43
PS8, Line 43: (IMPALA-7737)
> are there examples of these fns in our benchmarks to quantify the regressio
Done


http://gerrit.cloudera.org:8080/#/c/11760/8//COMMIT_MSG@45
PS8, Line 45: Still, the fix provides most of what the JIRA ticket requested
> I'd skip these next three paragraphs.
Done


http://gerrit.cloudera.org:8080/#/c/11760/8//COMMIT_MSG@57
PS8, Line 57:
> pls make a section for this called "Testing" so its easier to jump to. also
Done


http://gerrit.cloudera.org:8080/#/c/11760/8/be/src/exprs/conditional-functions.h
File be/src/exprs/conditional-functions.h:

http://gerrit.cloudera.org:8080/#/c/11760/8/be/src/exprs/conditional-functions.h@76
PS8, Line 76: since
: /// various bugs mean that this implementation is still sometimes 
used. But
: /// the goal is to remove these classes at some point.
> simpler: "until their use is eliminated by the frontend".
Done


http://gerrit.cloudera.org:8080/#/c/11760/8/fe/src/main/java/org/apache/impala/rewrite/RewriteConditionalFnsRule.java
File fe/src/main/java/org/apache/impala/rewrite/RewriteConditionalFnsRule.java:

http://gerrit.cloudera.org:8080/#/c/11760/8/fe/src/main/java/org/apache/impala/rewrite/RewriteConditionalFnsRule.java@37
PS8, Line 37: vanish
> is this accurate given the comments in the commit message about order by?
Yes, they vanish after the rewrite. If the rewrite does not occur (due to bugs 
or options), then the functions do not vanish.


http://gerrit.cloudera.org:8080/#/c/11760/8/fe/src/main/java/org/apache/impala/rewrite/RewriteConditionalFnsRule.java@48
PS8, Line 48: planner runs the rule to simplify CASE
:  * after this rule. Where that other rule can perform 
simplifications,
:  * those simplifications are omitted here
> simplify and use the specific rule name for concreteness.
Rewrote. Since many rules apply, did not seem to add value to try to enumerate 
them.


http://gerrit.cloudera.org:8080/#/c/11760/8/fe/src/main/java/org/apache/impala/rewrite/RewriteConditionalFnsRule.java@106
PS8, Line 106:return rewriteCoalesceFn(e
> clarify whether you think this happens after the rewrite or before. If its
Done


http://gerrit.cloudera.org:8080/#/c/11760/8/fe/src/main/java/org/apache/impala/rewrite/RewriteConditionalFnsRule.java@130
PS8, Line 130:Lists.ne
> isn't this all that's done here (the most general case) and we'll depend on
Rewrote comment


http://gerrit.cloudera.org:8080/#/c/11760/8/fe/src/main/java/org/apache/impala/rewrite/RewriteConditionalFnsRule.java@172
PS8, Line 172: ESCE(a, literal) when literal
> The simplest rewrite here would be to not look at the child exprs for the v
I found that I could strip this function down as you suggest, but the results 
were sub-optimal due to weaknesses in other parts of the code.

coalesce(id) --> CASE ELSE id END (expected: id)
coalesce(id, null) --> CASE WHEN id IS NOT NULL THEN ID ELSE NULL END 
(expected: id)
coalesce(null, id) --> same as above
coalesce(null is null, id) --> 1 (expected: TRUE)
coalesce(10 + null, id) --> CASE WHEN NULL IS NOT NULL THEN NULL ELSE id END 
(expected: id)
coalesce(42, min(distinct tinyint_col)) --> CASE WHEN FALSE THEN NULL WHEN TRUE 
THEN 42 ELSE min(distinct tinyint_col, 42)) END (Expected: CASE WHEN 
min(distinct tinyint_col) IS NOT NULL THEN min(distinct tinyint_col) ELSE 42 
END)

And so on.

One solution is to restore the original coalesce() function rewrite which 
produces somewhat better results. Even that function has holes which this 
version fixed. So, we'd lose those optimizations.

As it turns out, all this is moot because the BE CASE implementation has a bug 
so we can't even use CASE to replace coalesce()...

For now, I did rework the function to the simple form requested, then disabled 
it. We can tackle any other concerns if/when it is a priority.


http://gerrit.cloudera.org:8080/#/c/11760/8/testdata/workloads/functional-planner/queries/PlannerTest/predicate-propagation.test
File 
testdata/workloads/functional-planner/queries/PlannerTest/predicate-propagation.test:

http://gerrit.cloudera.org:8080/#/c/11760/8/testdata/workloads/functional-planner/queries/PlannerTest/predicate-propagation.test@1137
PS8, Line 1137: |  other predicates: functional.alltypestiny.tinyint_col + 
functional.alltypestiny.smallint_col + functional.alltypestiny.int_col > 10, 
CASE WHEN functional.alltypestiny.tinyint_col + 
functional.alltypestiny.bigint_col IS NULL THEN 1 ELSE 
functional.alltypestiny.tinyint_col + functional.alltypestiny.bigint_col END = 1
> so this is the example of the performance regression (same 

[Impala-ASF-CR] IMPALA-7655: Rewrite if, isnull, coalesce to use CASE

2018-10-30 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11760 )

Change subject: IMPALA-7655: Rewrite if, isnull, coalesce to use CASE
..


Patch Set 10:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/1225/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/11760
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I526654d8546e17b2545c42cc59dab66d9fe1b163
Gerrit-Change-Number: 11760
Gerrit-PatchSet: 10
Gerrit-Owner: Paul Rogers 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Wed, 31 Oct 2018 02:53:28 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7655: Rewrite if, isnull, coalesce to use CASE

2018-10-30 Thread Paul Rogers (Code Review)
Paul Rogers has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11760 )

Change subject: IMPALA-7655: Rewrite if, isnull, coalesce to use CASE
..


Patch Set 7:

Turns out that there is a BE bug that means CASE is not equivalent to 
coalesce(). Disabled all coalesce() rewrites and tests, restoring original 
behavior from master. Coalesce() can be revisited when IMPALA-7793 is fixed.


--
To view, visit http://gerrit.cloudera.org:8080/11760
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I526654d8546e17b2545c42cc59dab66d9fe1b163
Gerrit-Change-Number: 11760
Gerrit-PatchSet: 7
Gerrit-Owner: Paul Rogers 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Wed, 31 Oct 2018 02:24:15 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7655: Rewrite if, isnull, coalesce to use CASE

2018-10-30 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11760 )

Change subject: IMPALA-7655: Rewrite if, isnull, coalesce to use CASE
..


Patch Set 10:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/11760/10/fe/src/main/java/org/apache/impala/rewrite/SimplifyConditionalsRule.java
File fe/src/main/java/org/apache/impala/rewrite/SimplifyConditionalsRule.java:

http://gerrit.cloudera.org:8080/#/c/11760/10/fe/src/main/java/org/apache/impala/rewrite/SimplifyConditionalsRule.java@213
PS10, Line 213:* Simplify COALESCE by skipping leading nulls and applying 
the following transformations:
line too long (92 > 90)


http://gerrit.cloudera.org:8080/#/c/11760/10/fe/src/main/java/org/apache/impala/rewrite/SimplifyConditionalsRule.java@234
PS10, Line 234: List newChildren = 
Lists.newArrayList(expr.getChildren().subList(i, numChildren));
line too long (96 > 90)



--
To view, visit http://gerrit.cloudera.org:8080/11760
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I526654d8546e17b2545c42cc59dab66d9fe1b163
Gerrit-Change-Number: 11760
Gerrit-PatchSet: 10
Gerrit-Owner: Paul Rogers 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Wed, 31 Oct 2018 02:22:52 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7655: Rewrite if, isnull, coalesce to use CASE

2018-10-30 Thread Paul Rogers (Code Review)
Hello Bharath Vissapragada, Philip Zeyliger, Tim Armstrong, Csaba Ringhofer, 
Impala Public Jenkins, Vuk Ercegovac,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/11760

to look at the new patch set (#10).

Change subject: IMPALA-7655: Rewrite if, isnull, coalesce to use CASE
..

IMPALA-7655: Rewrite if, isnull, coalesce to use CASE

See IMPALA-7655 for backgound. Tim found that the current interpreted
forms of if, isnull and coalesce are slow compared to the
code-generated CASE statement.

This patch rewrites the above functions into the equivalent CASE
structure.

The rewrite engine has many bugs that are beyond the scope of
this change to fix. This change codes around those bugs. The
result is that the conditional rewrite happens some of the time,
and sometimes produces less-than perfect optimizations.

Conditionals in the top-level ORDER BY clause are not rewritten
(IMPALA-7753), but those one or more levels down are. Some expressions
involving NULL are not simplified (IMPALA-7769). Several hacks were
used to work around the fact that the rewrite engine ignores unanalyzed
expressions, yet the rewrite engine does not, in general, re-analyze
the expressions it produces, causing simplifications to be skipped
(IMPALA-7754). And so on.

As a result, the BE retains the original interpreted forms that
are still used in two cases: 1) top-level conditions in the
ORDER BY clause, and 2) if the user disables rewrites.

Further, code generation does not occur for CASE statements in the
SELECT clause when it is in the root fragment (the most common case
in simple tests.) This is another known bug (IMPALA-4356).

One possible performance regression is that the new form of the code
evaluates some expressions twice, where the original interpreted
code evaluated the argument once. E.g. coalesce(id, 10) is rewritten
to CASE WHEN id IS NULL THEN id ELSE 10 END. Here, id is evaluated
twice. If the "id" were replaced by a complex sub-expression, the
gain from compilation could be offset by doing work twice.
(IMPALA-7737)

Still, the fix provides most of what the JIRA ticket requested
within the limitations of the existing code.

Conditional function rewrites are moved into a new class,
RewriteConditionalsRule in order to keep things simple.

Most functions use the simplest possible rewrite, relying on the
existing rewrite rules for further simplification.  The one exception
is coalesce(): the existing code relies on the semantics of the
function and so was retained and slightly improved. The code was
extended to produce a CASE statement directly, retaining existing
simplifications.

Tests for conditional functions were in one large function along with
other rewrite tests. Moved them into a new file, then broke up the
tests by function to allow much easier debugging of each function
one-by-one.  This required moving the common test mechanims into a
new common base class.

Existing tests focus on one or two rules at a time. The conditional
function rewrite, however, relies on the entire set of rules being
applied repeatedly. So, added a new FullRewriteTest case to verify this
behavior. This class contains several commented-out tests that cannot
pass due to existing rewrite bugs noted above.

Changing the rewrite cause the PlannerTest to produce different plans
than previously. Changed the expected results file to match the new
rewrite rules.

Change-Id: I526654d8546e17b2545c42cc59dab66d9fe1b163
---
M be/src/exprs/conditional-functions.h
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java
A fe/src/main/java/org/apache/impala/rewrite/RewriteConditionalFnsRule.java
M fe/src/main/java/org/apache/impala/rewrite/SimplifyConditionalsRule.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeExprsTest.java
A fe/src/test/java/org/apache/impala/analysis/BaseRewriteRulesTest.java
M fe/src/test/java/org/apache/impala/analysis/ExprRewriteRulesTest.java
A fe/src/test/java/org/apache/impala/analysis/FullRewriteTest.java
A fe/src/test/java/org/apache/impala/analysis/RewriteConditionalFnsRuleTest.java
M testdata/workloads/functional-planner/queries/PlannerTest/aggregation.test
M testdata/workloads/functional-planner/queries/PlannerTest/analytic-fns.test
M testdata/workloads/functional-planner/queries/PlannerTest/implicit-joins.test
M testdata/workloads/functional-planner/queries/PlannerTest/joins.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/nested-collections.test
M testdata/workloads/functional-planner/queries/PlannerTest/outer-joins.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/predicate-propagation.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/runtime-filter-propagation.test
18 files changed, 1,014 insertions(+), 282 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF 

[Impala-ASF-CR] IMPALA-7655: Rewrite if, isnull, coalesce to use CASE

2018-10-30 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11760 )

Change subject: IMPALA-7655: Rewrite if, isnull, coalesce to use CASE
..


Patch Set 9:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/1224/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/11760
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I526654d8546e17b2545c42cc59dab66d9fe1b163
Gerrit-Change-Number: 11760
Gerrit-PatchSet: 9
Gerrit-Owner: Paul Rogers 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Wed, 31 Oct 2018 01:10:09 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7655: Rewrite if, isnull, coalesce to use CASE

2018-10-30 Thread Vuk Ercegovac (Code Review)
Vuk Ercegovac has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11760 )

Change subject: IMPALA-7655: Rewrite if, isnull, coalesce to use CASE
..


Patch Set 9:

(10 comments)

still reviewing the tests.

http://gerrit.cloudera.org:8080/#/c/11760/8//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/11760/8//COMMIT_MSG@43
PS8, Line 43: (IMPALA-7737)
are there examples of these fns in our benchmarks to quantify the regression? 
if so, would be useful to see the effect.


http://gerrit.cloudera.org:8080/#/c/11760/8//COMMIT_MSG@45
PS8, Line 45: Still, the fix provides most of what the JIRA ticket requested
I'd skip these next three paragraphs.


http://gerrit.cloudera.org:8080/#/c/11760/8//COMMIT_MSG@57
PS8, Line 57:
pls make a section for this called "Testing" so its easier to jump to. also, 
pls condense these so that they're easier to skim. for example:
- split up tests for conditional functions to make them easier to test
- added unit tests for end-to-end rewrite rule interactions
- updated existing planner tests due to rewrites


http://gerrit.cloudera.org:8080/#/c/11760/8/be/src/exprs/conditional-functions.h
File be/src/exprs/conditional-functions.h:

http://gerrit.cloudera.org:8080/#/c/11760/8/be/src/exprs/conditional-functions.h@76
PS8, Line 76: since
: /// various bugs mean that this implementation is still sometimes 
used. But
: /// the goal is to remove these classes at some point.
simpler: "until their use is eliminated by the frontend".


http://gerrit.cloudera.org:8080/#/c/11760/8/fe/src/main/java/org/apache/impala/rewrite/RewriteConditionalFnsRule.java
File fe/src/main/java/org/apache/impala/rewrite/RewriteConditionalFnsRule.java:

http://gerrit.cloudera.org:8080/#/c/11760/8/fe/src/main/java/org/apache/impala/rewrite/RewriteConditionalFnsRule.java@37
PS8, Line 37: vanish
is this accurate given the comments in the commit message about order by?


http://gerrit.cloudera.org:8080/#/c/11760/8/fe/src/main/java/org/apache/impala/rewrite/RewriteConditionalFnsRule.java@48
PS8, Line 48: planner runs the rule to simplify CASE
:  * after this rule. Where that other rule can perform 
simplifications,
:  * those simplifications are omitted here
simplify and use the specific rule name for concreteness.


http://gerrit.cloudera.org:8080/#/c/11760/8/fe/src/main/java/org/apache/impala/rewrite/RewriteConditionalFnsRule.java@106
PS8, Line 106:  return rewriteIfNullFn(expr)
clarify whether you think this happens after the rewrite or before. If its 
after, then I expect the example on L109,110 to be in terms of CASE. I'm also 
fine with omitting the example since its assumed that these rules compose.


http://gerrit.cloudera.org:8080/#/c/11760/8/fe/src/main/java/org/apache/impala/rewrite/RewriteConditionalFnsRule.java@130
PS8, Line 130:expr.get
isn't this all that's done here (the most general case) and we'll depend on 
other rewrites for further simplifications?


http://gerrit.cloudera.org:8080/#/c/11760/8/fe/src/main/java/org/apache/impala/rewrite/RewriteConditionalFnsRule.java@172
PS8, Line 172:
The simplest rewrite here would be to not look at the child exprs for the 
various scenarios listed above and instead simply translate naively to a case 
statement. From there, we'd get constant folding and case simplification which 
will find the first when clause that evals to true. preceding when clauses that 
remain unknown will be retained, but this transform will need to retain them as 
well. Aggregate handling will result in a brute-force roll-back of the rewrite 
in case simplification, which will result in falling back to the case rewrite 
here. Might want to handle that situation by retaining the coalesce for now. So 
besides that issue, what else do we miss by doing the simple thing and rely on 
case simplification?


http://gerrit.cloudera.org:8080/#/c/11760/8/testdata/workloads/functional-planner/queries/PlannerTest/predicate-propagation.test
File 
testdata/workloads/functional-planner/queries/PlannerTest/predicate-propagation.test:

http://gerrit.cloudera.org:8080/#/c/11760/8/testdata/workloads/functional-planner/queries/PlannerTest/predicate-propagation.test@1137
PS8, Line 1137: |  other predicates: functional.alltypestiny.tinyint_col + 
functional.alltypestiny.smallint_col + functional.alltypestiny.int_col > 10, 
CASE WHEN functional.alltypestiny.tinyint_col + 
functional.alltypestiny.bigint_col IS NULL THEN 1 ELSE 
functional.alltypestiny.tinyint_col + functional.alltypestiny.bigint_col END = 1
so this is the example of the performance regression (same work on multiple 
when clauses)?



--
To view, visit http://gerrit.cloudera.org:8080/11760
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I526654d8546e17b2545c42cc59dab66d9fe1b163

[Impala-ASF-CR] IMPALA-7655: Rewrite if, isnull, coalesce to use CASE

2018-10-30 Thread Paul Rogers (Code Review)
Hello Bharath Vissapragada, Philip Zeyliger, Tim Armstrong, Csaba Ringhofer, 
Impala Public Jenkins, Vuk Ercegovac,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/11760

to look at the new patch set (#9).

Change subject: IMPALA-7655: Rewrite if, isnull, coalesce to use CASE
..

IMPALA-7655: Rewrite if, isnull, coalesce to use CASE

See IMPALA-7655 for backgound. Tim found that the current interpreted
forms of if, isnull and coalesce are slow compared to the
code-generated CASE statement.

This patch rewrites the above functions into the equivalent CASE
structure.

The rewrite engine has many bugs that are beyond the scope of
this change to fix. This change codes around those bugs. The
result is that the conditional rewrite happens some of the time,
and sometimes produces less-than perfect optimizations.

Conditionals in the top-level ORDER BY clause are not rewritten
(IMPALA-7753), but those one or more levels down are. Some expressions
involving NULL are not simplified (IMPALA-7769). Several hacks were
used to work around the fact that the rewrite engine ignores unanalyzed
expressions, yet the rewrite engine does not, in general, re-analyze
the expressions it produces, causing simplifications to be skipped
(IMPALA-7754). And so on.

As a result, the BE retains the original interpreted forms that
are still used in two cases: 1) top-level conditions in the
ORDER BY clause, and 2) if the user disables rewrites.

Further, code generation does not occur for CASE statements in the
SELECT clause when it is in the root fragment (the most common case
in simple tests.) This is another known bug (IMPALA-4356).

One possible performance regression is that the new form of the code
evaluates some expressions twice, where the original interpreted
code evaluated the argument once. E.g. coalesce(id, 10) is rewritten
to CASE WHEN id IS NULL THEN id ELSE 10 END. Here, id is evaluated
twice. If the "id" were replaced by a complex sub-expression, the
gain from compilation could be offset by doing work twice.
(IMPALA-7737)

Still, the fix provides most of what the JIRA ticket requested
within the limitations of the existing code.

Conditional function rewrites are moved into a new class,
RewriteConditionalsRule in order to keep things simple.

Most functions use the simplest possible rewrite, relying on the
existing rewrite rules for further simplification.  The one exception
is coalesce(): the existing code relies on the semantics of the
function and so was retained and slightly improved. The code was
extended to produce a CASE statement directly, retaining existing
simplifications.

Tests for conditional functions were in one large function along with
other rewrite tests. Moved them into a new file, then broke up the
tests by function to allow much easier debugging of each function
one-by-one.  This required moving the common test mechanims into a
new common base class.

Existing tests focus on one or two rules at a time. The conditional
function rewrite, however, relies on the entire set of rules being
applied repeatedly. So, added a new FullRewriteTest case to verify this
behavior. This class contains several commented-out tests that cannot
pass due to existing rewrite bugs noted above.

Changing the rewrite cause the PlannerTest to produce different plans
than previously. Changed the expected results file to match the new
rewrite rules.

Change-Id: I526654d8546e17b2545c42cc59dab66d9fe1b163
---
M be/src/exprs/conditional-functions.h
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java
A fe/src/main/java/org/apache/impala/rewrite/RewriteConditionalFnsRule.java
M fe/src/main/java/org/apache/impala/rewrite/SimplifyConditionalsRule.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeExprsTest.java
A fe/src/test/java/org/apache/impala/analysis/BaseRewriteRulesTest.java
M fe/src/test/java/org/apache/impala/analysis/ExprRewriteRulesTest.java
A fe/src/test/java/org/apache/impala/analysis/FullRewriteTest.java
A fe/src/test/java/org/apache/impala/analysis/RewriteConditionalFnsRuleTest.java
M testdata/workloads/functional-planner/queries/PlannerTest/aggregation.test
M testdata/workloads/functional-planner/queries/PlannerTest/analytic-fns.test
M testdata/workloads/functional-planner/queries/PlannerTest/implicit-joins.test
M testdata/workloads/functional-planner/queries/PlannerTest/joins.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/nested-collections.test
M testdata/workloads/functional-planner/queries/PlannerTest/outer-joins.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/predicate-propagation.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/runtime-filter-propagation.test
18 files changed, 946 insertions(+), 313 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF 

[Impala-ASF-CR] IMPALA-7655: Rewrite if, isnull, coalesce to use CASE

2018-10-29 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11760 )

Change subject: IMPALA-7655: Rewrite if, isnull, coalesce to use CASE
..


Patch Set 8:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/1209/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/11760
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I526654d8546e17b2545c42cc59dab66d9fe1b163
Gerrit-Change-Number: 11760
Gerrit-PatchSet: 8
Gerrit-Owner: Paul Rogers 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Tue, 30 Oct 2018 00:45:52 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7655: Rewrite if, isnull, coalesce to use CASE

2018-10-29 Thread Paul Rogers (Code Review)
Hello Bharath Vissapragada, Philip Zeyliger, Tim Armstrong, Csaba Ringhofer, 
Impala Public Jenkins, Vuk Ercegovac,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/11760

to look at the new patch set (#8).

Change subject: IMPALA-7655: Rewrite if, isnull, coalesce to use CASE
..

IMPALA-7655: Rewrite if, isnull, coalesce to use CASE

See IMPALA-7655 for backgound. Tim found that the current interpreted
forms of if, isnull and coalesce are slow compared to the
code-generated CASE statement.

This patch rewrites the above functions into the equivalent CASE
structure.

The rewrite engine has many bugs that are beyond the scope of
this change to fix. This change codes around those bugs. The
result is that the conditional rewrite happens some of the time,
and sometimes produces less-than perfect optimizations.

Conditionals in the top-level ORDER BY clause are not rewritten
(IMPALA-7753), but those one or more levels down are. Some expressions
involving NULL are not simplified (IMPALA-7769). Several hacks were
used to work around the fact that the rewrite engine ignores unanalyzed
expressions, yet the rewrite engine does not, in general, re-analyze
the expressions it produces, causing simplifications to be skipped
(IMPALA-7754). And so on.

As a result, the BE retains the original interpreted forms that
are still used in two cases: 1) top-level conditions in the
ORDER BY clause, and 2) if the user disables rewrites.

Further, code generation does not occur for CASE statements in the
SELECT clause when it is in the root fragment (the most common case
in simple tests.) This is another known bug (IMPALA-4356).

One possible performance regression is that the new form of the code
evaluates some expressions twice, where the original interpreted
code evaluated the argument once. E.g. coalesce(id, 10) is rewritten
to CASE WHEN id IS NULL THEN id ELSE 10 END. Here, id is evaluated
twice. If the "id" were replaced by a complex sub-expression, the
gain from compilation could be offset by doing work twice.
(IMPALA-7737)

Still, the fix provides most of what the JIRA ticket requested
within the limitations of the existing code.

Conditional function rewrites are moved into a new class,
RewriteConditionalsRule in order to keep things simple.

Most functions use the simplest possible rewrite, relying on the
existing rewrite rules for further simplification.  The one exception
is coalesce(): the existing code relies on the semantics of the
function and so was retained and slightly improved. The code was
extended to produce a CASE statement directly, retaining existing
simplifications.

Tests for conditional functions were in one large function along with
other rewrite tests. Moved them into a new file, then broke up the
tests by function to allow much easier debugging of each function
one-by-one.  This required moving the common test mechanims into a
new common base class.

Existing tests focus on one or two rules at a time. The conditional
function rewrite, however, relies on the entire set of rules being
applied repeatedly. So, added a new FullRewriteTest case to verify this
behavior. This class contains several commented-out tests that cannot
pass due to existing rewrite bugs noted above.

Changing the rewrite cause the PlannerTest to produce different plans
than previously. Changed the expected results file to match the new
rewrite rules.

Change-Id: I526654d8546e17b2545c42cc59dab66d9fe1b163
---
M be/src/exprs/conditional-functions.h
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
A fe/src/main/java/org/apache/impala/rewrite/RewriteConditionalFnsRule.java
M fe/src/main/java/org/apache/impala/rewrite/SimplifyConditionalsRule.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeExprsTest.java
A fe/src/test/java/org/apache/impala/analysis/BaseRewriteRulesTest.java
M fe/src/test/java/org/apache/impala/analysis/ExprRewriteRulesTest.java
A fe/src/test/java/org/apache/impala/analysis/FullRewriteTest.java
A fe/src/test/java/org/apache/impala/analysis/RewriteConditionalFnsRuleTest.java
M testdata/workloads/functional-planner/queries/PlannerTest/aggregation.test
M testdata/workloads/functional-planner/queries/PlannerTest/analytic-fns.test
M testdata/workloads/functional-planner/queries/PlannerTest/implicit-joins.test
M testdata/workloads/functional-planner/queries/PlannerTest/joins.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/nested-collections.test
M testdata/workloads/functional-planner/queries/PlannerTest/outer-joins.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/predicate-propagation.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/runtime-filter-propagation.test
17 files changed, 870 insertions(+), 313 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/60/11760/8
--
To view, visit http://gerrit.cloudera.org:8080/11760

[Impala-ASF-CR] IMPALA-7655: Rewrite if, isnull, coalesce to use CASE

2018-10-29 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11822 )

Change subject: IMPALA-7655: Rewrite if, isnull, coalesce to use CASE
..


Patch Set 1:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/1207/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/11822
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Idfe275032dc13aee64bfd60d448138b5b77ab96a
Gerrit-Change-Number: 11822
Gerrit-PatchSet: 1
Gerrit-Owner: Paul Rogers 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Tue, 30 Oct 2018 00:19:15 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7655: Rewrite if, isnull, coalesce to use CASE

2018-10-29 Thread Paul Rogers (Code Review)
Paul Rogers has abandoned this change. ( http://gerrit.cloudera.org:8080/11822 )

Change subject: IMPALA-7655: Rewrite if, isnull, coalesce to use CASE
..


Abandoned

Duplicate
--
To view, visit http://gerrit.cloudera.org:8080/11822
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: abandon
Gerrit-Change-Id: Idfe275032dc13aee64bfd60d448138b5b77ab96a
Gerrit-Change-Number: 11822
Gerrit-PatchSet: 1
Gerrit-Owner: Paul Rogers 
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR] IMPALA-7655: Rewrite if, isnull, coalesce to use CASE

2018-10-29 Thread Paul Rogers (Code Review)
Paul Rogers has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11760 )

Change subject: IMPALA-7655: Rewrite if, isnull, coalesce to use CASE
..


Patch Set 7:

(14 comments)

Thanks, Phil, for the code review. Addressed all but one comment.

http://gerrit.cloudera.org:8080/#/c/11760/7//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/11760/7//COMMIT_MSG@29
PS7, Line 29: As a result, the BE retains the original interpreted forms that
> Let's mention this in the backend code in comments, so that folks know it's
Done


http://gerrit.cloudera.org:8080/#/c/11760/7//COMMIT_MSG@30
PS7, Line 30: are still used in two cases: 1) top-leel conditions in the
> nit: level
Done


http://gerrit.cloudera.org:8080/#/c/11760/7//COMMIT_MSG@45
PS7, Line 45: Still, the fix provides most of what the JIRA ticket requested
> Does the specific "compute stats" query that's underlying the JIRA get fast
Haven't gotten that far yet; still trying to make sure that things actually 
work. Will provide an update once I can do the required performance testing.


http://gerrit.cloudera.org:8080/#/c/11760/7/fe/src/main/java/org/apache/impala/rewrite/RewriteConditionalFnsRule.java
File fe/src/main/java/org/apache/impala/rewrite/RewriteConditionalFnsRule.java:

http://gerrit.cloudera.org:8080/#/c/11760/7/fe/src/main/java/org/apache/impala/rewrite/RewriteConditionalFnsRule.java@54
PS7, Line 54:
> nit: it's weird that there are two spaces there.
Done


http://gerrit.cloudera.org:8080/#/c/11760/7/fe/src/main/java/org/apache/impala/rewrite/RewriteConditionalFnsRule.java@66
PS7, Line 66: // then want to allow CASE to do any simplification 
(reverting to
> nit: The parenthetical here isn't closed.
Done


http://gerrit.cloudera.org:8080/#/c/11760/7/fe/src/main/java/org/apache/impala/rewrite/RewriteConditionalFnsRule.java@70
PS7, Line 70:   Expr revised =  rewriteConditionalFn((FunctionCallExpr) 
expr);
> nit: should only have one space after =
Done


http://gerrit.cloudera.org:8080/#/c/11760/7/fe/src/main/java/org/apache/impala/rewrite/RewriteConditionalFnsRule.java@72
PS7, Line 72:   // Workaround for IMALA-TBD
> Is TBD resolved at this point?
Done


http://gerrit.cloudera.org:8080/#/c/11760/7/fe/src/main/java/org/apache/impala/rewrite/RewriteConditionalFnsRule.java@132
PS7, Line 132:* IFNULL(a, x) --> 
> If we're not using HTML, is the  here noise?
Done


http://gerrit.cloudera.org:8080/#/c/11760/7/fe/src/main/java/org/apache/impala/rewrite/RewriteConditionalFnsRule.java@179
PS7, Line 179:   if (childExpr.isAggregate()) { sawAgg = true; }
> The javadoc for "expr.contains()" suggests that hasAgg has already checked
True.

This check adjusts the non-null literal case. Ignoring aggregates, we can stop 
adding arguments (ignoring all remaining arguments) if we hit a non-null 
literal. (This is the ! hasAgg case.)

Or, if we have aggregates, we can stop if we've already added at least one 
aggregate. Otherwise, we have to keep going to add at least one aggregate -- 
even though the aggregate will never actually be evaluated. (This is the sawAgg 
case.)

But, that I had to explain this means that using dual flags is the wrong 
approach. Rewrote to use a state machine. Please let me know if that is clearer.


http://gerrit.cloudera.org:8080/#/c/11760/7/fe/src/main/java/org/apache/impala/rewrite/SimplifyConditionalsRule.java
File fe/src/main/java/org/apache/impala/rewrite/SimplifyConditionalsRule.java:

http://gerrit.cloudera.org:8080/#/c/11760/7/fe/src/main/java/org/apache/impala/rewrite/SimplifyConditionalsRule.java@50
PS7, Line 50:  * We can't eliminate aggregates as this may change the meaning 
of the
> This is the same as line 70-72. Is that intentional?
Done


http://gerrit.cloudera.org:8080/#/c/11760/7/fe/src/test/java/org/apache/impala/analysis/FullRewriteTest.java
File fe/src/test/java/org/apache/impala/analysis/FullRewriteTest.java:

http://gerrit.cloudera.org:8080/#/c/11760/7/fe/src/test/java/org/apache/impala/analysis/FullRewriteTest.java@42
PS7, Line 42: System.out.println(stmt);
> remove
Done


http://gerrit.cloudera.org:8080/#/c/11760/7/fe/src/test/java/org/apache/impala/analysis/FullRewriteTest.java@97
PS7, Line 97:   @Test
:   public void testSqlRecovery() {
: ParseNode node = analyzeExpr("true and false");
: System.out.println(node.toSql());
: System.out.println(((SelectStmt) node).toSql(true));
: // SELECT TRUE AND FALSE FROM functional.alltypessmall
:   }
> There's no assertion here...
Ad-hoc test. Removed.


http://gerrit.cloudera.org:8080/#/c/11760/7/fe/src/test/java/org/apache/impala/analysis/FullRewriteTest.java@117
PS7, Line 117: // TODO: IMPALA-7766
> nit: including the subject of the JIRA here may make it obvious why...
Done



[Impala-ASF-CR] IMPALA-7655: Rewrite if, isnull, coalesce to use CASE

2018-10-29 Thread Paul Rogers (Code Review)
Paul Rogers has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/11822


Change subject: IMPALA-7655: Rewrite if, isnull, coalesce to use CASE
..

IMPALA-7655: Rewrite if, isnull, coalesce to use CASE

See IMPALA-7655 for backgound. Tim found that the current interpreted
forms of if, isnull and coalesce are slow compared to the
code-generated CASE statement.

This patch rewrites the above functions into the equivalent CASE
structure.

The rewrite engine has many bugs that are beyond the scope of
this change to fix. This change codes around those bugs. The
result is that the conditional rewrite happens some of the time,
and sometimes produces less-than perfect optimizations.

Conditionals in the top-level ORDER BY clause are not rewritten
(IMPALA-7753), but those one or more levels down are. Some expressions
involving NULL are not simplified (IMPALA-7769). Several hacks were
used to work around the fact that the rewrite engine ignores unanalyzed
expressions, yet the rewrite engine does not, in general, re-analyze
the expressions it produces, causing simplifications to be skipped
(IMPALA-7754). And so on.

As a result, the BE retains the original interpreted forms that
are still used in two cases: 1) top-level conditions in the
ORDER BY clause, and 2) if the user disables rewrites.

Further, code generation does not occur for CASE statements in the
SELECT clause when it is in the root fragment (the most common case
in simple tests.) This is another known bug (IMPALA-4356).

One possible performance regression is that the new form of the code
evaluates some expressions twice, where the original interpreted
code evaluated the argument once. E.g. coalesce(id, 10) is rewritten
to CASE WHEN id IS NULL THEN id ELSE 10 END. Here, id is evaluated
twice. If the "id" were replaced by a complex sub-expression, the
gain from compilation could be offset by doing work twice.
(IMPALA-7737)

Still, the fix provides most of what the JIRA ticket requested
within the limitations of the existing code.

Conditional function rewrites are moved into a new class,
RewriteConditionalsRule in order to keep things simple.

Most functions use the simplest possible rewrite, relying on the
existing rewrite rules for further simplification.  The one exception
is coalesce(): the existing code relies on the semantics of the
function and so was retained and slightly improved. The code was
extended to produce a CASE statement directly, retaining existing
simplifications.

Tests for conditional functions were in one large function along with
other rewrite tests. Moved them into a new file, then broke up the
tests by function to allow much easier debugging of each function
one-by-one.  This required moving the common test mechanims into a
new common base class.

Existing tests focus on one or two rules at a time. The conditional
function rewrite, however, relies on the entire set of rules being
applied repeatedly. So, added a new FullRewriteTest case to verify this
behavior. This class contains several commented-out tests that cannot
pass due to existing rewrite bugs noted above.

Changing the rewrite cause the PlannerTest to produce different plans
than previously. Changed the expected results file to match the new
rewrite rules.

Change-Id: I526654d8546e17b2545c42cc59dab66d9fe1b163

Fixes

Change-Id: Idfe275032dc13aee64bfd60d448138b5b77ab96a
---
M be/src/exprs/conditional-functions.h
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
A fe/src/main/java/org/apache/impala/rewrite/RewriteConditionalFnsRule.java
M fe/src/main/java/org/apache/impala/rewrite/SimplifyConditionalsRule.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeExprsTest.java
A fe/src/test/java/org/apache/impala/analysis/BaseRewriteRulesTest.java
M fe/src/test/java/org/apache/impala/analysis/ExprRewriteRulesTest.java
A fe/src/test/java/org/apache/impala/analysis/FullRewriteTest.java
A fe/src/test/java/org/apache/impala/analysis/RewriteConditionalFnsRuleTest.java
M testdata/workloads/functional-planner/queries/PlannerTest/aggregation.test
M testdata/workloads/functional-planner/queries/PlannerTest/analytic-fns.test
M testdata/workloads/functional-planner/queries/PlannerTest/implicit-joins.test
M testdata/workloads/functional-planner/queries/PlannerTest/joins.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/nested-collections.test
M testdata/workloads/functional-planner/queries/PlannerTest/outer-joins.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/predicate-propagation.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/runtime-filter-propagation.test
17 files changed, 870 insertions(+), 313 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/22/11822/1
--
To view, visit http://gerrit.cloudera.org:8080/11822
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF

[Impala-ASF-CR] IMPALA-7655: Rewrite if, isnull, coalesce to use CASE

2018-10-29 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11760 )

Change subject: IMPALA-7655: Rewrite if, isnull, coalesce to use CASE
..


Patch Set 7:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/1201/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/11760
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I526654d8546e17b2545c42cc59dab66d9fe1b163
Gerrit-Change-Number: 11760
Gerrit-PatchSet: 7
Gerrit-Owner: Paul Rogers 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Mon, 29 Oct 2018 21:48:31 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7655: Rewrite if, isnull, coalesce to use CASE

2018-10-29 Thread Philip Zeyliger (Code Review)
Philip Zeyliger has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11760 )

Change subject: IMPALA-7655: Rewrite if, isnull, coalesce to use CASE
..


Patch Set 7:

(14 comments)

Thanks for the updates!

http://gerrit.cloudera.org:8080/#/c/11760/7//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/11760/7//COMMIT_MSG@29
PS7, Line 29: As a result, the BE retains the original interpreted forms that
Let's mention this in the backend code in comments, so that folks know it's not 
far away from being dead code.


http://gerrit.cloudera.org:8080/#/c/11760/7//COMMIT_MSG@30
PS7, Line 30: are still used in two cases: 1) top-leel conditions in the
nit: level


http://gerrit.cloudera.org:8080/#/c/11760/7//COMMIT_MSG@45
PS7, Line 45: Still, the fix provides most of what the JIRA ticket requested
Does the specific "compute stats" query that's underlying the JIRA get faster 
in your testing?


http://gerrit.cloudera.org:8080/#/c/11760/7/fe/src/main/java/org/apache/impala/rewrite/RewriteConditionalFnsRule.java
File fe/src/main/java/org/apache/impala/rewrite/RewriteConditionalFnsRule.java:

http://gerrit.cloudera.org:8080/#/c/11760/7/fe/src/main/java/org/apache/impala/rewrite/RewriteConditionalFnsRule.java@54
PS7, Line 54:
nit: it's weird that there are two spaces there.


http://gerrit.cloudera.org:8080/#/c/11760/7/fe/src/main/java/org/apache/impala/rewrite/RewriteConditionalFnsRule.java@66
PS7, Line 66: // then want to allow CASE to do any simplification 
(reverting to
nit: The parenthetical here isn't closed.


http://gerrit.cloudera.org:8080/#/c/11760/7/fe/src/main/java/org/apache/impala/rewrite/RewriteConditionalFnsRule.java@70
PS7, Line 70:   Expr revised =  rewriteConditionalFn((FunctionCallExpr) 
expr);
nit: should only have one space after =


http://gerrit.cloudera.org:8080/#/c/11760/7/fe/src/main/java/org/apache/impala/rewrite/RewriteConditionalFnsRule.java@72
PS7, Line 72:   // Workaround for IMALA-TBD
Is TBD resolved at this point?


http://gerrit.cloudera.org:8080/#/c/11760/7/fe/src/main/java/org/apache/impala/rewrite/RewriteConditionalFnsRule.java@132
PS7, Line 132:* IFNULL(a, x) --> 
If we're not using HTML, is the  here noise?


http://gerrit.cloudera.org:8080/#/c/11760/7/fe/src/main/java/org/apache/impala/rewrite/RewriteConditionalFnsRule.java@179
PS7, Line 179:   if (childExpr.isAggregate()) { sawAgg = true; }
The javadoc for "expr.contains()" suggests that hasAgg has already checked all 
the children, so it shouldn't be possible that hasAgg = false but sawAgg = true.


http://gerrit.cloudera.org:8080/#/c/11760/7/fe/src/main/java/org/apache/impala/rewrite/SimplifyConditionalsRule.java
File fe/src/main/java/org/apache/impala/rewrite/SimplifyConditionalsRule.java:

http://gerrit.cloudera.org:8080/#/c/11760/7/fe/src/main/java/org/apache/impala/rewrite/SimplifyConditionalsRule.java@50
PS7, Line 50:  * We can't eliminate aggregates as this may change the meaning 
of the
This is the same as line 70-72. Is that intentional?


http://gerrit.cloudera.org:8080/#/c/11760/7/fe/src/test/java/org/apache/impala/analysis/FullRewriteTest.java
File fe/src/test/java/org/apache/impala/analysis/FullRewriteTest.java:

http://gerrit.cloudera.org:8080/#/c/11760/7/fe/src/test/java/org/apache/impala/analysis/FullRewriteTest.java@42
PS7, Line 42: System.out.println(stmt);
remove


http://gerrit.cloudera.org:8080/#/c/11760/7/fe/src/test/java/org/apache/impala/analysis/FullRewriteTest.java@97
PS7, Line 97:   @Test
:   public void testSqlRecovery() {
: ParseNode node = analyzeExpr("true and false");
: System.out.println(node.toSql());
: System.out.println(((SelectStmt) node).toSql(true));
: // SELECT TRUE AND FALSE FROM functional.alltypessmall
:   }
There's no assertion here...


http://gerrit.cloudera.org:8080/#/c/11760/7/fe/src/test/java/org/apache/impala/analysis/FullRewriteTest.java@117
PS7, Line 117: // TODO: IMPALA-7766
nit: including the subject of the JIRA here may make it obvious why...


http://gerrit.cloudera.org:8080/#/c/11760/7/fe/src/test/java/org/apache/impala/analysis/FullRewriteTest.java@128
PS7, Line 128:   public void testIf() throws ImpalaException {
Should we be testing the boring case:

if(a, b, c) => case when a then b else c

for columns a, b, c. (As opposed to aggregations or constants.)

Update: I think this is covered in the next test file.



--
To view, visit http://gerrit.cloudera.org:8080/11760
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I526654d8546e17b2545c42cc59dab66d9fe1b163
Gerrit-Change-Number: 11760
Gerrit-PatchSet: 7
Gerrit-Owner: Paul Rogers 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 

[Impala-ASF-CR] IMPALA-7655: Rewrite if, isnull, coalesce to use CASE

2018-10-29 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11760 )

Change subject: IMPALA-7655: Rewrite if, isnull, coalesce to use CASE
..


Patch Set 6:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/1200/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/11760
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I526654d8546e17b2545c42cc59dab66d9fe1b163
Gerrit-Change-Number: 11760
Gerrit-PatchSet: 6
Gerrit-Owner: Paul Rogers 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Mon, 29 Oct 2018 21:36:21 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7655: Rewrite if, isnull, coalesce to use CASE

2018-10-29 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11760 )

Change subject: IMPALA-7655: Rewrite if, isnull, coalesce to use CASE
..


Patch Set 5:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/1198/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/11760
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I526654d8546e17b2545c42cc59dab66d9fe1b163
Gerrit-Change-Number: 11760
Gerrit-PatchSet: 5
Gerrit-Owner: Paul Rogers 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Mon, 29 Oct 2018 21:19:54 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7655: Rewrite if, isnull, coalesce to use CASE

2018-10-29 Thread Paul Rogers (Code Review)
Hello Bharath Vissapragada, Philip Zeyliger, Tim Armstrong, Csaba Ringhofer, 
Impala Public Jenkins, Vuk Ercegovac,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/11760

to look at the new patch set (#7).

Change subject: IMPALA-7655: Rewrite if, isnull, coalesce to use CASE
..

IMPALA-7655: Rewrite if, isnull, coalesce to use CASE

See IMPALA-7655 for backgound. Tim found that the current interpreted
forms of if, isnull and coalesce are slow compared to the
code-generated CASE statement.

This patch rewrites the above functions into the equivalent CASE
structure.

The rewrite engine has many bugs that are beyond the scope of
this change to fix. This change codes around those bugs. The
result is that the conditional rewrite happens some of the time,
and sometimes produces less-than perfect optimizations.

Conditionals in the top-level ORDER BY clause are not rewritten
(IMPALA-7753), but those one or more levels down are. Some expressions
involving NULL are not simplified (IMPALA-7769). Several hacks were
used to work around the fact that the rewrite engine ignores unanalyzed
expressions, yet the rewrite engine does not, in general, re-analyze
the expressions it produces, causing simplifications to be skipped
(IMPALA-7754). And so on.

As a result, the BE retains the original interpreted forms that
are still used in two cases: 1) top-leel conditions in the
ORDER BY clause, and 2) if the user disables rewrites.

Further, code generation does not occur for CASE statements in the
SELECT clause when it is in the root fragment (the most common case
in simple tests.) This is another known bug (IMPALA-4356).

One possible performance regression is that the new form of the code
evaluates some expressions twice, where the original interpreted
code evaluated the argument once. E.g. coalesce(id, 10) is rewritten
to CASE WHEN id IS NULL THEN id ELSE 10 END. Here, id is evaluated
twice. If the "id" were replaced by a complex sub-expression, the
gain from compilation could be offset by doing work twice.
(IMPALA-7737)

Still, the fix provides most of what the JIRA ticket requested
within the limitations of the existing code.

Conditional function rewrites are moved into a new class,
RewriteConditionalsRule in order to keep things simple.

Most functions use the simplest possible rewrite, relying on the
existing rewrite rules for further simplification.  The one exception
is coalesce(): the existing code relies on the semantics of the
function and so was retained and slightly improved. The code was
extended to produce a CASE statement directly, retaining existing
simplifications.

Tests for conditional functions were in one large function along with
other rewrite tests. Moved them into a new file, then broke up the
tests by function to allow much easier debugging of each function
one-by-one.  This required moving the common test mechanims into a
new common base class.

Existing tests focus on one or two rules at a time. The conditional
function rewrite, however, relies on the entire set of rules being
applied repeatedly. So, added a new FullRewriteTest case to verify this
behavior. This class contains several commented-out tests that cannot
pass due to existing rewrite bugs noted above.

Changing the rewrite cause the PlannerTest to produce different plans
than previously. Changed the expected results file to match the new
rewrite rules.

Change-Id: I526654d8546e17b2545c42cc59dab66d9fe1b163
---
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
A fe/src/main/java/org/apache/impala/rewrite/RewriteConditionalFnsRule.java
M fe/src/main/java/org/apache/impala/rewrite/SimplifyConditionalsRule.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeExprsTest.java
A fe/src/test/java/org/apache/impala/analysis/BaseRewriteRulesTest.java
M fe/src/test/java/org/apache/impala/analysis/ExprRewriteRulesTest.java
A fe/src/test/java/org/apache/impala/analysis/FullRewriteTest.java
A fe/src/test/java/org/apache/impala/analysis/RewriteConditionalFnsRuleTest.java
M testdata/workloads/functional-planner/queries/PlannerTest/aggregation.test
M testdata/workloads/functional-planner/queries/PlannerTest/analytic-fns.test
M testdata/workloads/functional-planner/queries/PlannerTest/implicit-joins.test
M testdata/workloads/functional-planner/queries/PlannerTest/joins.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/nested-collections.test
M testdata/workloads/functional-planner/queries/PlannerTest/outer-joins.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/predicate-propagation.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/runtime-filter-propagation.test
16 files changed, 837 insertions(+), 309 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/60/11760/7
--
To view, visit http://gerrit.cloudera.org:8080/11760
To unsubscribe, visit 

[Impala-ASF-CR] IMPALA-7655: Rewrite if, isnull, coalesce to use CASE

2018-10-29 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11760 )

Change subject: IMPALA-7655: Rewrite if, isnull, coalesce to use CASE
..


Patch Set 6:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/11760/6/fe/src/test/java/org/apache/impala/analysis/BaseRewriteRulesTest.java
File fe/src/test/java/org/apache/impala/analysis/BaseRewriteRulesTest.java:

http://gerrit.cloudera.org:8080/#/c/11760/6/fe/src/test/java/org/apache/impala/analysis/BaseRewriteRulesTest.java@93
PS6, Line 93: return RewritesOkWhereExpr("functional.alltypessmall", 
exprStr, rule, expectedExprStr);
line too long (91 > 90)



--
To view, visit http://gerrit.cloudera.org:8080/11760
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I526654d8546e17b2545c42cc59dab66d9fe1b163
Gerrit-Change-Number: 11760
Gerrit-PatchSet: 6
Gerrit-Owner: Paul Rogers 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Mon, 29 Oct 2018 20:56:17 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7655: Rewrite if, isnull, coalesce to use CASE

2018-10-29 Thread Paul Rogers (Code Review)
Hello Bharath Vissapragada, Philip Zeyliger, Tim Armstrong, Csaba Ringhofer, 
Impala Public Jenkins, Vuk Ercegovac,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/11760

to look at the new patch set (#6).

Change subject: IMPALA-7655: Rewrite if, isnull, coalesce to use CASE
..

IMPALA-7655: Rewrite if, isnull, coalesce to use CASE

See IMPALA-7655 for backgound. Tim found that the current interpreted
forms of if, isnull and coalesce are slow compared to the
code-generated CASE statement.

This patch rewrites the above functions into the equivalent CASE
structure.

The rewrite engine has many bugs that are beyond the scope of
this change to fix. This change codes around those bugs. The
result is that the conditional rewrite happens some of the time,
and sometimes produces less-than perfect optimizations.

Conditionals in the top-level ORDER BY clause are not rewritten
(IMPALA-7753), but those one or more levels down are. Some expressions
involving NULL are not simplified (IMPALA-7769). Several hacks were
used to work around the fact that the rewrite engine ignores unanalyzed
expressions, yet the rewrite engine does not, in general, re-analyze
the expressions it produces, causing simplifications to be skipped
(IMPALA-7754). And so on.

As a result, the BE retains the original interpreted forms that
are still used in two cases: 1) top-leel conditions in the
ORDER BY clause, and 2) if the user disables rewrites.

Further, code generation does not occur for CASE statements in the
SELECT clause when it is in the root fragment (the most common case
in simple tests.) This is another known bug (IMPALA-4356).

One possible performance regression is that the new form of the code
evaluates some expressions twice, where the original interpreted
code evaluated the argument once. E.g. coalesce(id, 10) is rewritten
to CASE WHEN id IS NULL THEN id ELSE 10 END. Here, id is evaluated
twice. If the "id" were replaced by a complex sub-expression, the
gain from compilation could be offset by doing work twice.
(IMPALA-7737)

Still, the fix provides most of what the JIRA ticket requested
within the limitations of the existing code.

Conditional function rewrites are moved into a new class,
RewriteConditionalsRule in order to keep things simple.

Most functions use the simplest possible rewrite, relying on the
existing rewrite rules for further simplification.  The one exception
is coalesce(): the existing code relies on the semantics of the
function and so was retained and slightly improved. The code was
extended to produce a CASE statement directly, retaining existing
simplifications.

Tests for conditional functions were in one large function along with
other rewrite tests. Moved them into a new file, then broke up the
tests by function to allow much easier debugging of each function
one-by-one.  This required moving the common test mechanims into a
new common base class.

Existing tests focus on one or two rules at a time. The conditional
function rewrite, however, relies on the entire set of rules being
applied repeatedly. So, added a new FullRewriteTest case to verify this
behavior. This class contains several commented-out tests that cannot
pass due to existing rewrite bugs noted above.

Changing the rewrite cause the PlannerTest to produce different plans
than previously. Changed the expected results file to match the new
rewrite rules.

Change-Id: I526654d8546e17b2545c42cc59dab66d9fe1b163
---
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
A fe/src/main/java/org/apache/impala/rewrite/RewriteConditionalFnsRule.java
M fe/src/main/java/org/apache/impala/rewrite/SimplifyConditionalsRule.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeExprsTest.java
A fe/src/test/java/org/apache/impala/analysis/BaseRewriteRulesTest.java
M fe/src/test/java/org/apache/impala/analysis/ExprRewriteRulesTest.java
A fe/src/test/java/org/apache/impala/analysis/FullRewriteTest.java
A fe/src/test/java/org/apache/impala/analysis/RewriteConditionalFnsRuleTest.java
M testdata/workloads/functional-planner/queries/PlannerTest/aggregation.test
M testdata/workloads/functional-planner/queries/PlannerTest/analytic-fns.test
M testdata/workloads/functional-planner/queries/PlannerTest/implicit-joins.test
M testdata/workloads/functional-planner/queries/PlannerTest/joins.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/nested-collections.test
M testdata/workloads/functional-planner/queries/PlannerTest/outer-joins.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/predicate-propagation.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/runtime-filter-propagation.test
16 files changed, 836 insertions(+), 309 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/60/11760/6
--
To view, visit http://gerrit.cloudera.org:8080/11760
To unsubscribe, visit 

[Impala-ASF-CR] IMPALA-7655: Rewrite if, isnull, coalesce to use CASE

2018-10-29 Thread Paul Rogers (Code Review)
Paul Rogers has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11760 )

Change subject: IMPALA-7655: Rewrite if, isnull, coalesce to use CASE
..


Patch Set 5:

Thanks everyone for your patient reviews of this change. Per PhilZ, removed the 
per-function optimizations, relying on existing rewrite rules (bugs and all). 
Added tests to verify that the "downstream" rules work (when they do.)

Per Vuk, reduced the scope to only rewrite the requested functions, and to 
retain the BE implementation; living with the bugs that prevent the rewrites 
from being done some of the time.

Added more tests, including tests of the full rewrite process to ensure that 
the conditional functions are, in fact, first rewritten to use CASE, then CASE 
is further simplified. Tests include commented-out cases to record where we 
decided to live with existing bugs.

The result is the minimal change that accomplishes the request in the JIRA 
without breaking anything.


--
To view, visit http://gerrit.cloudera.org:8080/11760
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I526654d8546e17b2545c42cc59dab66d9fe1b163
Gerrit-Change-Number: 11760
Gerrit-PatchSet: 5
Gerrit-Owner: Paul Rogers 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Mon, 29 Oct 2018 20:51:02 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7655: Rewrite if, isnull, coalesce to use CASE

2018-10-29 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11760 )

Change subject: IMPALA-7655: Rewrite if, isnull, coalesce to use CASE
..


Patch Set 5:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/11760/5/fe/src/test/java/org/apache/impala/analysis/BaseRewriteRulesTest.java
File fe/src/test/java/org/apache/impala/analysis/BaseRewriteRulesTest.java:

http://gerrit.cloudera.org:8080/#/c/11760/5/fe/src/test/java/org/apache/impala/analysis/BaseRewriteRulesTest.java@66
PS5, Line 66:   public Expr RewritesOk(String tableName, String exprStr, 
ExprRewriteRule rule, String expectedExprStr)
line too long (104 > 90)


http://gerrit.cloudera.org:8080/#/c/11760/5/fe/src/test/java/org/apache/impala/analysis/BaseRewriteRulesTest.java@71
PS5, Line 71:   public Expr RewritesOk(String exprStr, List 
rules, String expectedExprStr)
line too long (93 > 90)


http://gerrit.cloudera.org:8080/#/c/11760/5/fe/src/test/java/org/apache/impala/analysis/BaseRewriteRulesTest.java@88
PS5, Line 88:   public Expr RewritesOkWhereExpr(String exprStr, ExprRewriteRule 
rule, String expectedExprStr)
line too long (95 > 90)


http://gerrit.cloudera.org:8080/#/c/11760/5/fe/src/test/java/org/apache/impala/analysis/BaseRewriteRulesTest.java@90
PS5, Line 90: return RewritesOkWhereExpr("functional.alltypessmall", 
exprStr, rule, expectedExprStr);
line too long (91 > 90)


http://gerrit.cloudera.org:8080/#/c/11760/5/fe/src/test/java/org/apache/impala/analysis/BaseRewriteRulesTest.java@93
PS5, Line 93:   public Expr RewritesOkWhereExpr(String tableName, String 
exprStr, ExprRewriteRule rule, String expectedExprStr)
line too long (113 > 90)


http://gerrit.cloudera.org:8080/#/c/11760/5/fe/src/test/java/org/apache/impala/analysis/BaseRewriteRulesTest.java@95
PS5, Line 95: return RewritesOkWhereExpr(tableName, exprStr, 
Lists.newArrayList(rule), expectedExprStr);
line too long (94 > 90)


http://gerrit.cloudera.org:8080/#/c/11760/5/fe/src/test/java/org/apache/impala/analysis/BaseRewriteRulesTest.java@98
PS5, Line 98:   public Expr RewritesOkWhereExpr(String tableName, String 
exprStr, List rules,
line too long (96 > 90)



--
To view, visit http://gerrit.cloudera.org:8080/11760
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I526654d8546e17b2545c42cc59dab66d9fe1b163
Gerrit-Change-Number: 11760
Gerrit-PatchSet: 5
Gerrit-Owner: Paul Rogers 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Mon, 29 Oct 2018 20:46:35 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7655: Rewrite if, isnull, coalesce to use CASE

2018-10-29 Thread Paul Rogers (Code Review)
Hello Bharath Vissapragada, Philip Zeyliger, Tim Armstrong, Csaba Ringhofer, 
Impala Public Jenkins, Vuk Ercegovac,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/11760

to look at the new patch set (#5).

Change subject: IMPALA-7655: Rewrite if, isnull, coalesce to use CASE
..

IMPALA-7655: Rewrite if, isnull, coalesce to use CASE

See IMPALA-7655 for backgound. Tim found that the current interpreted
forms of if, isnull and coalesce are slow compared to the
code-generated CASE statement.

This patch rewrites the above functions into the equivalent CASE
structure.

The rewrite engine has many bugs that are beyond the scope of
this change to fix. This change codes around those bugs. The
result is that the conditional rewrite happens some of the time,
and sometimes produces less-than perfect optimizations.

Conditionals in the top-level ORDER BY clause are not rewritten
(IMPALA-7753), but those one or more levels down are. Some expressions
involving NULL are not simplified (IMPALA-7769). Several hacks were
used to work around the fact that the rewrite engine ignores unanalyzed
expressions, yet the rewrite engine does not, in general, re-analyze
the expressions it produces, causing simplifications to be skipped
(IMPALA-7754). And so on.

As a result, the BE retains the original interpreted forms that
are still used in two cases: 1) top-leel conditions in the
ORDER BY clause, and 2) if the user disables rewrites.

Further, code generation does not occur for CASE statements in the
SELECT clause when it is in the root fragment (the most common case
in simple tests.) This is another known bug (IMPALA-4356).

One possible performance regression is that the new form of the code
evaluates some expressions twice, where the original interpreted
code evaluated the argument once. E.g. coalesce(id, 10) is rewritten
to CASE WHEN id IS NULL THEN id ELSE 10 END. Here, id is evaluated
twice. If the "id" were replaced by a complex sub-expression, the
gain from compilation could be offset by doing work twice.
(IMPALA-7737)

Still, the fix provides most of what the JIRA ticket requested
within the limitations of the existing code.

Conditional function rewrites are moved into a new class,
RewriteConditionalsRule in order to keep things simple.

Most functions use the simplest possible rewrite, relying on the
existing rewrite rules for further simplification.  The one exception
is coalesce(): the existing code relies on the semantics of the
function and so was retained and slightly improved. The code was
extended to produce a CASE statement directly, retaining existing
simplifications.

Tests for conditional functions were in one large function along with
other rewrite tests. Moved them into a new file, then broke up the
tests by function to allow much easier debugging of each function
one-by-one.  This required moving the common test mechanims into a
new common base class.

Existing tests focus on one or two rules at a time. The conditional
function rewrite, however, relies on the entire set of rules being
applied repeatedly. So, added a new FullRewriteTest case to verify this
behavior. This class contains several commented-out tests that cannot
pass due to existing rewrite bugs noted above.

Changing the rewrite cause the PlannerTest to produce different plans
than previously. Changed the expected results file to match the new
rewrite rules.

Change-Id: I526654d8546e17b2545c42cc59dab66d9fe1b163
---
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
A fe/src/main/java/org/apache/impala/rewrite/RewriteConditionalFnsRule.java
M fe/src/main/java/org/apache/impala/rewrite/SimplifyConditionalsRule.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeExprsTest.java
A fe/src/test/java/org/apache/impala/analysis/BaseRewriteRulesTest.java
M fe/src/test/java/org/apache/impala/analysis/ExprRewriteRulesTest.java
A fe/src/test/java/org/apache/impala/analysis/FullRewriteTest.java
A fe/src/test/java/org/apache/impala/analysis/RewriteConditionalFnsRuleTest.java
M testdata/workloads/functional-planner/queries/PlannerTest/aggregation.test
M testdata/workloads/functional-planner/queries/PlannerTest/analytic-fns.test
M testdata/workloads/functional-planner/queries/PlannerTest/implicit-joins.test
M testdata/workloads/functional-planner/queries/PlannerTest/joins.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/nested-collections.test
M testdata/workloads/functional-planner/queries/PlannerTest/outer-joins.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/predicate-propagation.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/runtime-filter-propagation.test
16 files changed, 831 insertions(+), 309 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/60/11760/5
--
To view, visit http://gerrit.cloudera.org:8080/11760
To unsubscribe, visit 

[Impala-ASF-CR] IMPALA-7655: Rewrite if, isnull, coalesce to use CASE

2018-10-26 Thread Bharath Vissapragada (Code Review)
Bharath Vissapragada has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11760 )

Change subject: IMPALA-7655: Rewrite if, isnull, coalesce to use CASE
..


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/11760/4/be/src/exprs/conditional-functions.h
File be/src/exprs/conditional-functions.h:

http://gerrit.cloudera.org:8080/#/c/11760/4/be/src/exprs/conditional-functions.h@a75
PS4, Line 75:
> High level comment (still digging into the code).
Just figured Vuk commented something similar. Should've read the other comments 
before posting this. I think the second half of the comment still holds.



--
To view, visit http://gerrit.cloudera.org:8080/11760
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I526654d8546e17b2545c42cc59dab66d9fe1b163
Gerrit-Change-Number: 11760
Gerrit-PatchSet: 4
Gerrit-Owner: Paul Rogers 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Fri, 26 Oct 2018 18:45:16 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7655: Rewrite if, isnull, coalesce to use CASE

2018-10-26 Thread Bharath Vissapragada (Code Review)
Bharath Vissapragada has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11760 )

Change subject: IMPALA-7655: Rewrite if, isnull, coalesce to use CASE
..


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/11760/4/be/src/exprs/conditional-functions.h
File be/src/exprs/conditional-functions.h:

http://gerrit.cloudera.org:8080/#/c/11760/4/be/src/exprs/conditional-functions.h@a75
PS4, Line 75:
High level comment (still digging into the code).

All these rewrites are hidden behind a session option (ENABLE_EXPR_REWRITES). 
Removing these backend implementations essentially means that when someone sets 
 ENABLE_EXPR_REWRITES=false, the conditional functions won't run (obvious since 
they aren't rewritten to the corresponding CASE form).

If we fix that though, one thing I'm concerned here is that if something goes 
wrong with the rewrite rules (yes this happened in quite a few cases in the 
past), there doesn't seem to be any other workaround and all the queries with 
the problematic conditionals would fail. Wondering if we should still retain 
these be implementations until we are sure that this optimization works pretty 
well?



--
To view, visit http://gerrit.cloudera.org:8080/11760
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I526654d8546e17b2545c42cc59dab66d9fe1b163
Gerrit-Change-Number: 11760
Gerrit-PatchSet: 4
Gerrit-Owner: Paul Rogers 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Fri, 26 Oct 2018 18:23:12 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7655: Rewrite if, isnull, coalesce to use CASE

2018-10-25 Thread Paul Rogers (Code Review)
Paul Rogers has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11760 )

Change subject: IMPALA-7655: Rewrite if, isnull, coalesce to use CASE
..


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/11760/4/common/function-registry/impala_functions.py
File common/function-registry/impala_functions.py:

http://gerrit.cloudera.org:8080/#/c/11760/4/common/function-registry/impala_functions.py@531
PS4, Line 531:   # Rewritten away in the FE
> See IMPALA-7741. nvl2() currently does not appear in the list of built-in f
Actually, this is a bit more subtle. Today, we rewrite nvl2() and nullif() at 
parse time. The goal is to rewrite them in the rewrite rules, along with other 
conditionals. Doing so allows us to take advantage of analyzed arguments to 
implement optimizations.

To do that, however, we need the function to be defined in this function table. 
We need it as a FE-only function (which also allows it to appear in the table 
of built-in functions) so it passes analysis, even though there is no BE 
implementation.



--
To view, visit http://gerrit.cloudera.org:8080/11760
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I526654d8546e17b2545c42cc59dab66d9fe1b163
Gerrit-Change-Number: 11760
Gerrit-PatchSet: 4
Gerrit-Owner: Paul Rogers 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Thu, 25 Oct 2018 18:52:03 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7655: Rewrite if, isnull, coalesce to use CASE

2018-10-25 Thread Paul Rogers (Code Review)
Paul Rogers has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11760 )

Change subject: IMPALA-7655: Rewrite if, isnull, coalesce to use CASE
..


Patch Set 4:

(12 comments)

Thanks everyone for the comments.

PhilZ's note to use existing optimizations is a good one. A number of changes 
are required to make that happen. Researching that turned up quite a few bugs 
listed in IMPALA-7655. I plan to push changes for those bugs one by one to ease 
reviews. Once those fixes are in master, I'll push a new change set here 
rebased on the revised master.

http://gerrit.cloudera.org:8080/#/c/11760/4/common/function-registry/impala_functions.py
File common/function-registry/impala_functions.py:

http://gerrit.cloudera.org:8080/#/c/11760/4/common/function-registry/impala_functions.py@531
PS4, Line 531:   # Rewritten away in the FE
> can you explain why this is needed in this change?
See IMPALA-7741. nvl2() currently does not appear in the list of built-in 
functions. However, since this improvement is orthogonal to the topic of this 
bug, I'll drop this change from here and do a separate change set for 
IMPALA-7741.


http://gerrit.cloudera.org:8080/#/c/11760/4/fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java
File fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java:

http://gerrit.cloudera.org:8080/#/c/11760/4/fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java@a92
PS4, Line 92:
> rewrites can be disabled via a query option so cannot be assumed to run. it
Right. Turns out that I had to alter the rule application flow to always 
rewrite the conditional functions.

The other choice is to leave the current interpreted implementations and have 
two paths based on turning on optimizations or not. In the spirit of "simpler 
is better", seems to make sense to have only one implementation, but the 
two-implementation solution is also possible.

Though, note, in the original code, nvl2() was always rewritten, there never 
was a BE implementation, so that aspect is not really changing.


http://gerrit.cloudera.org:8080/#/c/11760/2/fe/src/main/java/org/apache/impala/rewrite/SimplifyConditionalsRule.java
File fe/src/main/java/org/apache/impala/rewrite/SimplifyConditionalsRule.java:

http://gerrit.cloudera.org:8080/#/c/11760/2/fe/src/main/java/org/apache/impala/rewrite/SimplifyConditionalsRule.java@89
PS2, Line 89:* code gen for , avoiding the need for ad-hoc 
implementations
:* of each function.
> The following code looks to me like it'll do re-writes until we're "stuck".
Turns out this is a difference between the simplified test framework (which 
applies selected rules once) and the production code (which replies rules 
repeatedly.) Easy enough to create a separate test that will apply rules in the 
production way so that we take multiple passes to verify all rewrites.

Making this change uncovered quite a few obscure bugs which are recorded in the 
JIRA and will be submitted as separate changes for review. Once those are in, 
I'll rebase this change onto that code, and implement the chained rewrite test 
and logic.


http://gerrit.cloudera.org:8080/#/c/11760/2/fe/src/main/java/org/apache/impala/rewrite/SimplifyConditionalsRule.java@122
PS2, Line 122:   private Expr rewriteIfFn(FunctionCallExpr expr) {
> Is the issue that "null is null" is not getting constant optimization? In m
See earlier note.


http://gerrit.cloudera.org:8080/#/c/11760/4/fe/src/main/java/org/apache/impala/rewrite/SimplifyConditionalsRule.java
File fe/src/main/java/org/apache/impala/rewrite/SimplifyConditionalsRule.java:

http://gerrit.cloudera.org:8080/#/c/11760/4/fe/src/main/java/org/apache/impala/rewrite/SimplifyConditionalsRule.java@112
PS4, Line 112: 
> to be consistent with the style in the frontend Java code, pls remove these
Really? We create Javadoc comments, but don't use the resulting formatting? 
Eclipse, for example, will display the formatting in hovers. Without proper 
HTML formatting, the result is a bit of a mess.

Still, if our policy is to use Javadoc comments, but without formatting, I can 
revert this stuff.


http://gerrit.cloudera.org:8080/#/c/11760/4/fe/src/main/java/org/apache/impala/rewrite/SimplifyConditionalsRule.java@144
PS4, Line 144: instanceof NullLiteral)
> I would prefer .isNullLiteral() as other functions use that.
This is existing code that was moved. However, I agree with your point. I've 
gone though and cleaned up places that use this older style to use the newer 
predicates, including in cases where we test for true and false.


http://gerrit.cloudera.org:8080/#/c/11760/4/fe/src/main/java/org/apache/impala/rewrite/SimplifyConditionalsRule.java@183
PS4, Line 183: revised.size() - 1;
> Can you add a comment about not adding the last child here? It took me some
Done



[Impala-ASF-CR] IMPALA-7655: Rewrite if, isnull, coalesce to use CASE

2018-10-24 Thread Vuk Ercegovac (Code Review)
Vuk Ercegovac has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11760 )

Change subject: IMPALA-7655: Rewrite if, isnull, coalesce to use CASE
..


Patch Set 4:

(3 comments)

main question from my end is to consider what happens when rewrites are 
disabled.

http://gerrit.cloudera.org:8080/#/c/11760/4/common/function-registry/impala_functions.py
File common/function-registry/impala_functions.py:

http://gerrit.cloudera.org:8080/#/c/11760/4/common/function-registry/impala_functions.py@531
PS4, Line 531:   # Rewritten away in the FE
can you explain why this is needed in this change?
do we assert anywhere that these fns are never seen by the backend?


http://gerrit.cloudera.org:8080/#/c/11760/4/fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java
File fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java:

http://gerrit.cloudera.org:8080/#/c/11760/4/fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java@a92
PS4, Line 92:
rewrites can be disabled via a query option so cannot be assumed to run. it 
cannot be assumed to transform an invalid/unknown stmt to a valid one. its 
purpose is only to transform valid statements to other equivalent valid 
statements.
what happens when you do:
set enable_expr_rewrites = false ?


http://gerrit.cloudera.org:8080/#/c/11760/4/fe/src/main/java/org/apache/impala/rewrite/SimplifyConditionalsRule.java
File fe/src/main/java/org/apache/impala/rewrite/SimplifyConditionalsRule.java:

http://gerrit.cloudera.org:8080/#/c/11760/4/fe/src/main/java/org/apache/impala/rewrite/SimplifyConditionalsRule.java@112
PS4, Line 112: 
to be consistent with the style in the frontend Java code, pls remove these 
javadoc markups. I think the idea is that the code is read through a variety of 
editors and since the javadocs are not used, simplify writing/reading the 
comments by omitting the markup.



--
To view, visit http://gerrit.cloudera.org:8080/11760
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I526654d8546e17b2545c42cc59dab66d9fe1b163
Gerrit-Change-Number: 11760
Gerrit-PatchSet: 4
Gerrit-Owner: Paul Rogers 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Wed, 24 Oct 2018 21:13:20 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7655: Rewrite if, isnull, coalesce to use CASE

2018-10-24 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11760 )

Change subject: IMPALA-7655: Rewrite if, isnull, coalesce to use CASE
..


Patch Set 4:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/11760/4/fe/src/test/java/org/apache/impala/analysis/BaseRewriteRulesTest.java
File fe/src/test/java/org/apache/impala/analysis/BaseRewriteRulesTest.java:

http://gerrit.cloudera.org:8080/#/c/11760/4/fe/src/test/java/org/apache/impala/analysis/BaseRewriteRulesTest.java@66
PS4, Line 66:   public Expr RewritesOk(String tableName, String exprStr, 
ExprRewriteRule rule, String expectedExprStr)
line too long (104 > 90)


http://gerrit.cloudera.org:8080/#/c/11760/4/fe/src/test/java/org/apache/impala/analysis/BaseRewriteRulesTest.java@71
PS4, Line 71:   public Expr RewritesOk(String exprStr, List 
rules, String expectedExprStr)
line too long (93 > 90)


http://gerrit.cloudera.org:8080/#/c/11760/4/fe/src/test/java/org/apache/impala/analysis/BaseRewriteRulesTest.java@88
PS4, Line 88:   public Expr RewritesOkWhereExpr(String exprStr, ExprRewriteRule 
rule, String expectedExprStr)
line too long (95 > 90)


http://gerrit.cloudera.org:8080/#/c/11760/4/fe/src/test/java/org/apache/impala/analysis/BaseRewriteRulesTest.java@90
PS4, Line 90: return RewritesOkWhereExpr("functional.alltypessmall", 
exprStr, rule, expectedExprStr);
line too long (91 > 90)


http://gerrit.cloudera.org:8080/#/c/11760/4/fe/src/test/java/org/apache/impala/analysis/BaseRewriteRulesTest.java@93
PS4, Line 93:   public Expr RewritesOkWhereExpr(String tableName, String 
exprStr, ExprRewriteRule rule, String expectedExprStr)
line too long (113 > 90)


http://gerrit.cloudera.org:8080/#/c/11760/4/fe/src/test/java/org/apache/impala/analysis/BaseRewriteRulesTest.java@95
PS4, Line 95: return RewritesOkWhereExpr(tableName, exprStr, 
Lists.newArrayList(rule), expectedExprStr);
line too long (94 > 90)


http://gerrit.cloudera.org:8080/#/c/11760/4/fe/src/test/java/org/apache/impala/analysis/BaseRewriteRulesTest.java@98
PS4, Line 98:   public Expr RewritesOkWhereExpr(String tableName, String 
exprStr, List rules,
line too long (96 > 90)



--
To view, visit http://gerrit.cloudera.org:8080/11760
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I526654d8546e17b2545c42cc59dab66d9fe1b163
Gerrit-Change-Number: 11760
Gerrit-PatchSet: 4
Gerrit-Owner: Paul Rogers 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 24 Oct 2018 19:48:22 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7655: Rewrite if, isnull, coalesce to use CASE

2018-10-24 Thread Philip Zeyliger (Code Review)
Philip Zeyliger has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11760 )

Change subject: IMPALA-7655: Rewrite if, isnull, coalesce to use CASE
..


Patch Set 2:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/11760/2/fe/src/main/java/org/apache/impala/rewrite/SimplifyConditionalsRule.java
File fe/src/main/java/org/apache/impala/rewrite/SimplifyConditionalsRule.java:

http://gerrit.cloudera.org:8080/#/c/11760/2/fe/src/main/java/org/apache/impala/rewrite/SimplifyConditionalsRule.java@89
PS2, Line 89:* IF(TRUE, then, else)  
then
:* IF(FALSE|NULL, then, else)  
else
> Further testing revealed complexities. First, the rewrite engine does not a
The following code looks to me like it'll do re-writes until we're "stuck". It 
may be that "does not apply the same rule multiple times" is a test artifact, 
in that in the tests, we explicitly specify which rules to apply.


  public Expr rewrite(Expr expr, Analyzer analyzer) throws AnalysisException {
// Keep applying the rule list until no rule has made any changes.
int oldNumChanges;
Expr rewrittenExpr = expr;
do {
  oldNumChanges = numChanges_;
  for (ExprRewriteRule rule: rules_) {
rewrittenExpr = applyRuleRepeatedly(rewrittenExpr, rule, analyzer);
  }
} while (oldNumChanges != numChanges_);
return rewrittenExpr;
  }


http://gerrit.cloudera.org:8080/#/c/11760/2/fe/src/main/java/org/apache/impala/rewrite/SimplifyConditionalsRule.java@122
PS2, Line 122:* Rewrites IFNULL(a, x):
> Tried removing the optimizations. Turns out that the CASE simplification di
Is the issue that "null is null" is not getting constant optimization? In my 
quick test, it looks like it does. Note the "predicates" in the explain plan 
below. Again, the question is whether or not the test case is taking advantage 
of constant folding.

[localhost:21000] default> explain select * from tpch.orders where o_orderkey 
is null limit 1;
Query: explain select * from tpch.orders where o_orderkey is null limit 1
++
| Explain String |
++
| Max Per-Host Resource Reservation: Memory=8.00MB Threads=3 |
| Per-Host Resource Estimates: Memory=176MB  |
||
| PLAN-ROOT SINK |
| |  |
| 01:EXCHANGE [UNPARTITIONED]|
| |  limit: 1|
| |  |
| 00:SCAN HDFS [tpch.orders] |
|partitions=1/1 files=1 size=162.56MB|
|predicates: o_orderkey IS NULL  |
|limit: 1|
++
Fetched 12 row(s) in 0.03s
[localhost:21000] default> explain select * from tpch.orders where null is null 
limit 1;
Query: explain select * from tpch.orders where null is null limit 1
++
| Explain String |
++
| Max Per-Host Resource Reservation: Memory=8.00MB Threads=2 |
| Per-Host Resource Estimates: Memory=88MB   |
| Codegen disabled by planner|
||
| PLAN-ROOT SINK |
| |  |
| 00:SCAN HDFS [tpch.orders] |
|partitions=1/1 files=1 size=162.56MB|
|limit: 1|
++
Fetched 9 row(s) in 0.01s



--
To view, visit http://gerrit.cloudera.org:8080/11760
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I526654d8546e17b2545c42cc59dab66d9fe1b163
Gerrit-Change-Number: 11760
Gerrit-PatchSet: 2
Gerrit-Owner: Paul Rogers 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 24 Oct 2018 18:09:51 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7655: Rewrite if, isnull, coalesce to use CASE

2018-10-24 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11760 )

Change subject: IMPALA-7655: Rewrite if, isnull, coalesce to use CASE
..


Patch Set 3:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/11760/3/fe/src/test/java/org/apache/impala/analysis/BaseRewriteRulesTest.java
File fe/src/test/java/org/apache/impala/analysis/BaseRewriteRulesTest.java:

http://gerrit.cloudera.org:8080/#/c/11760/3/fe/src/test/java/org/apache/impala/analysis/BaseRewriteRulesTest.java@49
PS3, Line 49:   public Expr RewritesOk(String tableName, String exprStr, 
ExprRewriteRule rule, String expectedExprStr)
line too long (104 > 90)


http://gerrit.cloudera.org:8080/#/c/11760/3/fe/src/test/java/org/apache/impala/analysis/BaseRewriteRulesTest.java@54
PS3, Line 54:   public Expr RewritesOk(String exprStr, List 
rules, String expectedExprStr)
line too long (93 > 90)


http://gerrit.cloudera.org:8080/#/c/11760/3/fe/src/test/java/org/apache/impala/analysis/BaseRewriteRulesTest.java@71
PS3, Line 71:   public Expr RewritesOkWhereExpr(String exprStr, ExprRewriteRule 
rule, String expectedExprStr)
line too long (95 > 90)


http://gerrit.cloudera.org:8080/#/c/11760/3/fe/src/test/java/org/apache/impala/analysis/BaseRewriteRulesTest.java@73
PS3, Line 73: return RewritesOkWhereExpr("functional.alltypessmall", 
exprStr, rule, expectedExprStr);
line too long (91 > 90)


http://gerrit.cloudera.org:8080/#/c/11760/3/fe/src/test/java/org/apache/impala/analysis/BaseRewriteRulesTest.java@76
PS3, Line 76:   public Expr RewritesOkWhereExpr(String tableName, String 
exprStr, ExprRewriteRule rule, String expectedExprStr)
line too long (113 > 90)


http://gerrit.cloudera.org:8080/#/c/11760/3/fe/src/test/java/org/apache/impala/analysis/BaseRewriteRulesTest.java@78
PS3, Line 78: return RewritesOkWhereExpr(tableName, exprStr, 
Lists.newArrayList(rule), expectedExprStr);
line too long (94 > 90)


http://gerrit.cloudera.org:8080/#/c/11760/3/fe/src/test/java/org/apache/impala/analysis/BaseRewriteRulesTest.java@81
PS3, Line 81:   public Expr RewritesOkWhereExpr(String tableName, String 
exprStr, List rules,
line too long (96 > 90)



--
To view, visit http://gerrit.cloudera.org:8080/11760
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I526654d8546e17b2545c42cc59dab66d9fe1b163
Gerrit-Change-Number: 11760
Gerrit-PatchSet: 3
Gerrit-Owner: Paul Rogers 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 24 Oct 2018 18:02:21 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7655: Rewrite if, isnull, coalesce to use CASE

2018-10-24 Thread Paul Rogers (Code Review)
Paul Rogers has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11760 )

Change subject: IMPALA-7655: Rewrite if, isnull, coalesce to use CASE
..


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/11760/2/fe/src/main/java/org/apache/impala/rewrite/SimplifyConditionalsRule.java
File fe/src/main/java/org/apache/impala/rewrite/SimplifyConditionalsRule.java:

http://gerrit.cloudera.org:8080/#/c/11760/2/fe/src/main/java/org/apache/impala/rewrite/SimplifyConditionalsRule.java@89
PS2, Line 89:* code gen for , avoiding the need for ad-hoc 
implementations
:* of each function.
> Removing optimizations worked here.
Further testing revealed complexities. First, the rewrite engine does not apply 
the same rule multiple times (rewrite function, then rewrite resulting case.)

If the code is changed to apply CASE rewrite to the result, additional problems 
result. One of which is that the CASE rewrites will revert to the original 
expression if all aggregates are removed. But, we can't do that here: we no 
longer have function implementations. So, we would need the aggregate fallback 
to fallback to the CASE rewritten from the function.

The resulting code gets pretty complex and looks hard to maintain. So, I'm 
thinking to revert to the code in the first patch, with the added subtlety of 
handling aggregates in the function rewrites.

Stay tuned for a revised version.



--
To view, visit http://gerrit.cloudera.org:8080/11760
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I526654d8546e17b2545c42cc59dab66d9fe1b163
Gerrit-Change-Number: 11760
Gerrit-PatchSet: 4
Gerrit-Owner: Paul Rogers 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 24 Oct 2018 16:28:25 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7655: Rewrite if, isnull, coalesce to use CASE

2018-10-24 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11760 )

Change subject: IMPALA-7655: Rewrite if, isnull, coalesce to use CASE
..


Patch Set 4:

(8 comments)

http://gerrit.cloudera.org:8080/#/c/11760/4//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/11760/4//COMMIT_MSG@20
PS4, Line 20: k
nit: please wrap at 72 chars


http://gerrit.cloudera.org:8080/#/c/11760/4/fe/src/main/java/org/apache/impala/rewrite/SimplifyConditionalsRule.java
File fe/src/main/java/org/apache/impala/rewrite/SimplifyConditionalsRule.java:

http://gerrit.cloudera.org:8080/#/c/11760/4/fe/src/main/java/org/apache/impala/rewrite/SimplifyConditionalsRule.java@144
PS4, Line 144: instanceof NullLiteral)
I would prefer .isNullLiteral() as other functions use that.


http://gerrit.cloudera.org:8080/#/c/11760/4/fe/src/main/java/org/apache/impala/rewrite/SimplifyConditionalsRule.java@183
PS4, Line 183: revised.size() - 1;
Can you add a comment about not adding the last child here? It took me some 
time until I understood that it only added in CaseExpr's default branch.


http://gerrit.cloudera.org:8080/#/c/11760/4/fe/src/main/java/org/apache/impala/rewrite/SimplifyConditionalsRule.java@328
PS4, Line 328: isNull
typo: ifNull


http://gerrit.cloudera.org:8080/#/c/11760/4/fe/src/main/java/org/apache/impala/rewrite/SimplifyConditionalsRule.java@335
PS4, Line 335:* nullif(x, null)  
NULL
Is this really the way it should behave?
Currently select nullif(1,NULL) returns 1.


http://gerrit.cloudera.org:8080/#/c/11760/4/fe/src/main/java/org/apache/impala/rewrite/SimplifyConditionalsRule.java@345
PS4, Line 345: tail
I think that we should return head in this case (if head is not null, then head 
and tail are distinct, so head should be returned).


http://gerrit.cloudera.org:8080/#/c/11760/4/fe/src/test/java/org/apache/impala/analysis/SimplifyConditionalRulesTest.java
File 
fe/src/test/java/org/apache/impala/analysis/SimplifyConditionalRulesTest.java:

http://gerrit.cloudera.org:8080/#/c/11760/4/fe/src/test/java/org/apache/impala/analysis/SimplifyConditionalRulesTest.java@252
PS4, Line 252: // If non-leading parameter is a non-NULL constant, drop 
other terms
nit: missing .


http://gerrit.cloudera.org:8080/#/c/11760/4/fe/src/test/java/org/apache/impala/analysis/SimplifyConditionalRulesTest.java@291
PS4, Line 291: NULL
This should be 'id', see my comment at SimplifyConditionalsRule.java:335

expr-test should catch this - be/src/exprs/expr-test.cc contains:
TestValue("nullif(TRUE, NULL)", TYPE_BOOLEAN, true);

Can you verify that it indeed catches this issue?



--
To view, visit http://gerrit.cloudera.org:8080/11760
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I526654d8546e17b2545c42cc59dab66d9fe1b163
Gerrit-Change-Number: 11760
Gerrit-PatchSet: 4
Gerrit-Owner: Paul Rogers 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 24 Oct 2018 13:27:42 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7655: Rewrite if, isnull, coalesce to use CASE

2018-10-24 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11760 )

Change subject: IMPALA-7655: Rewrite if, isnull, coalesce to use CASE
..


Patch Set 2:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/11760/2/fe/src/test/java/org/apache/impala/analysis/BaseRewriteRulesTest.java
File fe/src/test/java/org/apache/impala/analysis/BaseRewriteRulesTest.java:

http://gerrit.cloudera.org:8080/#/c/11760/2/fe/src/test/java/org/apache/impala/analysis/BaseRewriteRulesTest.java@49
PS2, Line 49:   public Expr RewritesOk(String tableName, String exprStr, 
ExprRewriteRule rule, String expectedExprStr)
line too long (104 > 90)


http://gerrit.cloudera.org:8080/#/c/11760/2/fe/src/test/java/org/apache/impala/analysis/BaseRewriteRulesTest.java@54
PS2, Line 54:   public Expr RewritesOk(String exprStr, List 
rules, String expectedExprStr)
line too long (93 > 90)


http://gerrit.cloudera.org:8080/#/c/11760/2/fe/src/test/java/org/apache/impala/analysis/BaseRewriteRulesTest.java@71
PS2, Line 71:   public Expr RewritesOkWhereExpr(String exprStr, ExprRewriteRule 
rule, String expectedExprStr)
line too long (95 > 90)


http://gerrit.cloudera.org:8080/#/c/11760/2/fe/src/test/java/org/apache/impala/analysis/BaseRewriteRulesTest.java@73
PS2, Line 73: return RewritesOkWhereExpr("functional.alltypessmall", 
exprStr, rule, expectedExprStr);
line too long (91 > 90)


http://gerrit.cloudera.org:8080/#/c/11760/2/fe/src/test/java/org/apache/impala/analysis/BaseRewriteRulesTest.java@76
PS2, Line 76:   public Expr RewritesOkWhereExpr(String tableName, String 
exprStr, ExprRewriteRule rule, String expectedExprStr)
line too long (113 > 90)


http://gerrit.cloudera.org:8080/#/c/11760/2/fe/src/test/java/org/apache/impala/analysis/BaseRewriteRulesTest.java@78
PS2, Line 78: return RewritesOkWhereExpr(tableName, exprStr, 
Lists.newArrayList(rule), expectedExprStr);
line too long (94 > 90)


http://gerrit.cloudera.org:8080/#/c/11760/2/fe/src/test/java/org/apache/impala/analysis/BaseRewriteRulesTest.java@81
PS2, Line 81:   public Expr RewritesOkWhereExpr(String tableName, String 
exprStr, List rules,
line too long (96 > 90)



--
To view, visit http://gerrit.cloudera.org:8080/11760
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I526654d8546e17b2545c42cc59dab66d9fe1b163
Gerrit-Change-Number: 11760
Gerrit-PatchSet: 2
Gerrit-Owner: Paul Rogers 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 24 Oct 2018 12:49:00 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7655: Rewrite if, isnull, coalesce to use CASE

2018-10-23 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11760 )

Change subject: IMPALA-7655: Rewrite if, isnull, coalesce to use CASE
..


Patch Set 4:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/1142/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/11760
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I526654d8546e17b2545c42cc59dab66d9fe1b163
Gerrit-Change-Number: 11760
Gerrit-PatchSet: 4
Gerrit-Owner: Paul Rogers 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 24 Oct 2018 02:23:05 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7655: Rewrite if, isnull, coalesce to use CASE

2018-10-23 Thread Paul Rogers (Code Review)
Hello Philip Zeyliger, Tim Armstrong, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/11760

to look at the new patch set (#4).

Change subject: IMPALA-7655: Rewrite if, isnull, coalesce to use CASE
..

IMPALA-7655: Rewrite if, isnull, coalesce to use CASE

See IMPALA-7655 for backgound. Tim found that the current interpreted
forms of if, isnull and coalesce are slow compared to the code-generated
CASE statement.

This patch rewrites the above functions into the equivalent CASE
structure.

The functions ifnull and nvl are aliases for isnull, so revised those to
directly rewrite them to CASE rather than first converting them to if().

Moved rewriting into our existing set of conditional rewrite rules
into SimplifyConditionalRules instead of in the parser. To make this work
(and to solve the issue in IMPALA-7741, added ifnull and nvl to the list
of builtin Impala functions.

Found a few additional simplifications for coalesce, ifnull and nvl. See
the Javadoc and unit tests for details. (See IMPALA-7750 for why the
rules are needed; solve IMPALA-7750 and the rules can be removed as the
rewrite for CASE will handle them.)

Tests for conditional functions were in one huge function along with
other rewrite tests. Moved them into a new file, then broke up the tests
by function to allow much easier debugging of each function one-by-one.
This required moving the common test mechanims into a new common base
class.

Once tests verified that the rewrites worked, we no longer have need of
the specialized C++ implementations, so removed those.

Also cleaned up a few typos, unnecessary imports and JavaDocs found
during this work.

Change-Id: I526654d8546e17b2545c42cc59dab66d9fe1b163
---
M be/src/exprs/conditional-functions-ir.cc
M be/src/exprs/conditional-functions.cc
M be/src/exprs/conditional-functions.h
M be/src/exprs/scalar-expr.cc
M common/function-registry/impala_functions.py
M fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java
M fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java
M fe/src/main/java/org/apache/impala/rewrite/SimplifyConditionalsRule.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeExprsTest.java
A fe/src/test/java/org/apache/impala/analysis/BaseRewriteRulesTest.java
M fe/src/test/java/org/apache/impala/analysis/ExprRewriteRulesTest.java
A fe/src/test/java/org/apache/impala/analysis/SimplifyConditionalRulesTest.java
12 files changed, 617 insertions(+), 493 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/60/11760/4
--
To view, visit http://gerrit.cloudera.org:8080/11760
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I526654d8546e17b2545c42cc59dab66d9fe1b163
Gerrit-Change-Number: 11760
Gerrit-PatchSet: 4
Gerrit-Owner: Paul Rogers 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-7655: Rewrite if, isnull, coalesce to use CASE

2018-10-23 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11760 )

Change subject: IMPALA-7655: Rewrite if, isnull, coalesce to use CASE
..


Patch Set 3:

Build Failed

https://jenkins.impala.io/job/gerrit-code-review-checks/1141/ : Initial code 
review checks failed. See linked job for details on the failure.


--
To view, visit http://gerrit.cloudera.org:8080/11760
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I526654d8546e17b2545c42cc59dab66d9fe1b163
Gerrit-Change-Number: 11760
Gerrit-PatchSet: 3
Gerrit-Owner: Paul Rogers 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 24 Oct 2018 01:40:50 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7655: Rewrite if, isnull, coalesce to use CASE

2018-10-23 Thread Paul Rogers (Code Review)
Hello Philip Zeyliger, Tim Armstrong, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/11760

to look at the new patch set (#3).

Change subject: IMPALA-7655: Rewrite if, isnull, coalesce to use CASE
..

IMPALA-7655: Rewrite if, isnull, coalesce to use CASE

See IMPALA-7655 for backgound. Tim found that the current interpreted
forms of if, isnull and coalesce are slow compared to the code-generated
CASE statement.

This patch rewrites the above functions into the equivalent CASE
structure.

The functions ifnull and nvl are aliases for isnull, so revised those to
directly rewrite them to CASE rather than first converting them to if().

Moved rewriting into our existing set of conditional rewrite rules
into SimplifyConditionalRules instead of in the parser. To make this work
(and to solve the issue in IMPALA-7741, added ifnull and nvl to the list
of builtin Impala functions.

Found a few additional simplifications for coalesce, ifnull and nvl. See
the Javadoc and unit tests for details. (See IMPALA-7750 for why the
rules are needed; solve IMPALA-7750 and the rules can be removed as the
rewrite for CASE will handle them.)

Tests for conditional functions were in one huge function along with
other rewrite tests. Moved them into a new file, then broke up the tests
by function to allow much easier debugging of each function one-by-one.
This required moving the common test mechanims into a new common base
class.

Once tests verified that the rewrites worked, we no longer have need of
the specialized C++ implementations, so removed those.

Also cleaned up a few typos, unnecessary imports and JavaDocs found
during this work.

Change-Id: I526654d8546e17b2545c42cc59dab66d9fe1b163
---
M be/src/exprs/conditional-functions-ir.cc
M be/src/exprs/conditional-functions.cc
M be/src/exprs/conditional-functions.h
M be/src/exprs/scalar-expr.cc
M common/function-registry/impala_functions.py
M fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java
M fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java
M fe/src/main/java/org/apache/impala/rewrite/SimplifyConditionalsRule.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeExprsTest.java
A fe/src/test/java/org/apache/impala/analysis/BaseRewriteRulesTest.java
M fe/src/test/java/org/apache/impala/analysis/ExprRewriteRulesTest.java
A fe/src/test/java/org/apache/impala/analysis/SimplifyConditionalRulesTest.java
12 files changed, 600 insertions(+), 493 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/60/11760/3
--
To view, visit http://gerrit.cloudera.org:8080/11760
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I526654d8546e17b2545c42cc59dab66d9fe1b163
Gerrit-Change-Number: 11760
Gerrit-PatchSet: 3
Gerrit-Owner: Paul Rogers 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-7655: Rewrite if, isnull, coalesce to use CASE

2018-10-23 Thread Paul Rogers (Code Review)
Paul Rogers has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11760 )

Change subject: IMPALA-7655: Rewrite if, isnull, coalesce to use CASE
..


Patch Set 2:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/11760/2/be/src/exprs/scalar-expr.h
File be/src/exprs/scalar-expr.h:

http://gerrit.cloudera.org:8080/#/c/11760/2/be/src/exprs/scalar-expr.h@84
PS2, Line 84: /// necessary on the arguments to generate the result.These 
compute functions have
> Period after space; this change seems accidental.
Done


http://gerrit.cloudera.org:8080/#/c/11760/2/fe/src/main/java/org/apache/impala/rewrite/SimplifyConditionalsRule.java
File fe/src/main/java/org/apache/impala/rewrite/SimplifyConditionalsRule.java:

http://gerrit.cloudera.org:8080/#/c/11760/2/fe/src/main/java/org/apache/impala/rewrite/SimplifyConditionalsRule.java@89
PS2, Line 89:* IF(TRUE, then, else)  
then
:* IF(FALSE|NULL, then, else)  
else
> Do we need to do this? The general case of simplifyCaseExpr() should take c
Removing optimizations worked here.


http://gerrit.cloudera.org:8080/#/c/11760/2/fe/src/main/java/org/apache/impala/rewrite/SimplifyConditionalsRule.java@122
PS2, Line 122:* Rewrites IFNULL(a, x):
> Same comment: can we just always make this a case, and then other simplific
Tried removing the optimizations. Turns out that the CASE simplification did 
not discover these optimizations. See IMPALA-7750.


http://gerrit.cloudera.org:8080/#/c/11760/2/fe/src/main/java/org/apache/impala/rewrite/SimplifyConditionalsRule.java@175
PS2, Line 175:   // TODO: Clone child expr here?
> Do you know what the answer is here?
Done


http://gerrit.cloudera.org:8080/#/c/11760/2/fe/src/test/java/org/apache/impala/analysis/BaseRewriteRulesTest.java
File fe/src/test/java/org/apache/impala/analysis/BaseRewriteRulesTest.java:

http://gerrit.cloudera.org:8080/#/c/11760/2/fe/src/test/java/org/apache/impala/analysis/BaseRewriteRulesTest.java@1
PS2, Line 1: package org.apache.impala.analysis;
> Please copy over the regular license stuff here.
Done


http://gerrit.cloudera.org:8080/#/c/11760/2/fe/src/test/java/org/apache/impala/analysis/SimplifyConditionalRulesTest.java
File 
fe/src/test/java/org/apache/impala/analysis/SimplifyConditionalRulesTest.java:

http://gerrit.cloudera.org:8080/#/c/11760/2/fe/src/test/java/org/apache/impala/analysis/SimplifyConditionalRulesTest.java@35
PS2, Line 35: "CASE WHEN id = 0 THEN TRUE ELSE FALSE END");
> Should this be simplified further to "id = 0"? Or do we think it doesn't ma
Subtle issue. The expression, as written, is equivalent to istrue(id = 0). By 
itself, id = 0 can be true, false or null. So, we could not eliminate the CASE 
operator, though we could rewrite it to istrue(id = 0).

Of course, new could also rewrite isTrue(x) to be CASE x THEN true ELSE FALSE 
END ...



--
To view, visit http://gerrit.cloudera.org:8080/11760
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I526654d8546e17b2545c42cc59dab66d9fe1b163
Gerrit-Change-Number: 11760
Gerrit-PatchSet: 2
Gerrit-Owner: Paul Rogers 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 24 Oct 2018 01:17:20 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7655: Rewrite if, isnull, coalesce to use CASE

2018-10-23 Thread Philip Zeyliger (Code Review)
Philip Zeyliger has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11760 )

Change subject: IMPALA-7655: Rewrite if, isnull, coalesce to use CASE
..


Patch Set 2:

(6 comments)

Thanks! The outline here looks good to me. I think you've net decreased the 
number of lines of code, which is great.

Do you have any initial sense whether IF() is now faster relative to what it 
used to be?

For the simplifying expressions, see if we can just let case handle it as 
opposed to some of the complexity you have here. It's not a huge deal but I 
think we can write less code there.

If you're interested, tests/comparison/func.py and friends have a framework to 
generate query strings and run them against postgresql and Impala. It's kind of 
a zoo to run and doesn't run, for example, as part of GVO. There's a natural 
question as to whether we get coverage of IF, CASE, and friends in there. I 
don't know if it's worth it.

I assume you've run the tests with this?

There's a relatively pleasant way to look at Java coverage (mvn test 
-DcodeCoverage; git grep jacoco) and you might want to look at coverage for the 
java code. I imagine everything looks pretty good, but it's an easy thing to do.

http://gerrit.cloudera.org:8080/#/c/11760/2/be/src/exprs/scalar-expr.h
File be/src/exprs/scalar-expr.h:

http://gerrit.cloudera.org:8080/#/c/11760/2/be/src/exprs/scalar-expr.h@84
PS2, Line 84: /// necessary on the arguments to generate the result.These 
compute functions have
Period after space; this change seems accidental.


http://gerrit.cloudera.org:8080/#/c/11760/2/fe/src/main/java/org/apache/impala/rewrite/SimplifyConditionalsRule.java
File fe/src/main/java/org/apache/impala/rewrite/SimplifyConditionalsRule.java:

http://gerrit.cloudera.org:8080/#/c/11760/2/fe/src/main/java/org/apache/impala/rewrite/SimplifyConditionalsRule.java@89
PS2, Line 89:* IF(TRUE, then, else)  
then
:* IF(FALSE|NULL, then, else)  
else
Do we need to do this? The general case of simplifyCaseExpr() should take care 
of it, eh?


http://gerrit.cloudera.org:8080/#/c/11760/2/fe/src/main/java/org/apache/impala/rewrite/SimplifyConditionalsRule.java@122
PS2, Line 122:* Rewrites IFNULL(a, x):
Same comment: can we just always make this a case, and then other 
simplifications kick in?


http://gerrit.cloudera.org:8080/#/c/11760/2/fe/src/main/java/org/apache/impala/rewrite/SimplifyConditionalsRule.java@175
PS2, Line 175:   // TODO: Clone child expr here?
Do you know what the answer is here?


http://gerrit.cloudera.org:8080/#/c/11760/2/fe/src/test/java/org/apache/impala/analysis/BaseRewriteRulesTest.java
File fe/src/test/java/org/apache/impala/analysis/BaseRewriteRulesTest.java:

http://gerrit.cloudera.org:8080/#/c/11760/2/fe/src/test/java/org/apache/impala/analysis/BaseRewriteRulesTest.java@1
PS2, Line 1: package org.apache.impala.analysis;
Please copy over the regular license stuff here.


http://gerrit.cloudera.org:8080/#/c/11760/2/fe/src/test/java/org/apache/impala/analysis/SimplifyConditionalRulesTest.java
File 
fe/src/test/java/org/apache/impala/analysis/SimplifyConditionalRulesTest.java:

http://gerrit.cloudera.org:8080/#/c/11760/2/fe/src/test/java/org/apache/impala/analysis/SimplifyConditionalRulesTest.java@35
PS2, Line 35: "CASE WHEN id = 0 THEN TRUE ELSE FALSE END");
Should this be simplified further to "id = 0"? Or do we think it doesn't matter 
because LLVM will do the optimization just fine?



--
To view, visit http://gerrit.cloudera.org:8080/11760
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I526654d8546e17b2545c42cc59dab66d9fe1b163
Gerrit-Change-Number: 11760
Gerrit-PatchSet: 2
Gerrit-Owner: Paul Rogers 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Philip Zeyliger 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 24 Oct 2018 00:00:24 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7655: Rewrite if, isnull, coalesce to use CASE

2018-10-23 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11760 )

Change subject: IMPALA-7655: Rewrite if, isnull, coalesce to use CASE
..


Patch Set 2:

Build Failed

https://jenkins.impala.io/job/gerrit-code-review-checks/1138/ : Initial code 
review checks failed. See linked job for details on the failure.


--
To view, visit http://gerrit.cloudera.org:8080/11760
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I526654d8546e17b2545c42cc59dab66d9fe1b163
Gerrit-Change-Number: 11760
Gerrit-PatchSet: 2
Gerrit-Owner: Paul Rogers 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Tue, 23 Oct 2018 23:41:37 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7655: Rewrite if, isnull, coalesce to use CASE

2018-10-23 Thread Paul Rogers (Code Review)
Paul Rogers has uploaded a new patch set (#2). ( 
http://gerrit.cloudera.org:8080/11760 )

Change subject: IMPALA-7655: Rewrite if, isnull, coalesce to use CASE
..

IMPALA-7655: Rewrite if, isnull, coalesce to use CASE

See IMPALA-7655 for backgound. Tim found that the current interpreted
forms of if, isnull and coalesce are slow compared to the code-generated
CASE statement.

This patch rewrites the above functions into the equivalent CASE
structure.

The functions ifnull and nvl are aliases for isnull, so revised those to
directly rewrite them to CASE rather than first converting them to if().

Moved rewriting into our existing set of conditional rewrite rules
instead of in the parser. To make this work (and to solve the issue in
IMPALA-7741, added ifnull and nvl to the list of builtin Impala
functions.

Found a few additional simplifications for coalesce, ifnull and nvl. See
the Javadoc and unit tests for details.

Tests for conditional functions were in one huge function along with
other rewrite tests. Moved them into a new file, then broke up the tests
by function to allow much easier debugging of each function one-by-one.
This required moving the common test mechanims into a new common base
class.

Once tests verified that the rewrites worked, we no longer have need of
the specialized C++ implementations, so removed those.

Also cleaned up a few typos, unnecessary imports and JavaDocs found
during this work.

Change-Id: I526654d8546e17b2545c42cc59dab66d9fe1b163
---
M be/src/exprs/conditional-functions-ir.cc
M be/src/exprs/conditional-functions.cc
M be/src/exprs/conditional-functions.h
M be/src/exprs/scalar-expr.cc
M be/src/exprs/scalar-expr.h
M common/function-registry/impala_functions.py
M fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java
M fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java
M fe/src/main/java/org/apache/impala/rewrite/SimplifyConditionalsRule.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeExprsTest.java
A fe/src/test/java/org/apache/impala/analysis/BaseRewriteRulesTest.java
M fe/src/test/java/org/apache/impala/analysis/ExprRewriteRulesTest.java
A fe/src/test/java/org/apache/impala/analysis/SimplifyConditionalRulesTest.java
13 files changed, 586 insertions(+), 485 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/60/11760/2
--
To view, visit http://gerrit.cloudera.org:8080/11760
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I526654d8546e17b2545c42cc59dab66d9fe1b163
Gerrit-Change-Number: 11760
Gerrit-PatchSet: 2
Gerrit-Owner: Paul Rogers