[ 
https://issues.apache.org/jira/browse/IMPALA-14525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18054405#comment-18054405
 ] 

ASF subversion and git services commented on IMPALA-14525:
----------------------------------------------------------

Commit 2360a06e4ac0ab27d6f3dcec1c366a3ba4904089 in impala's branch 
refs/heads/master from Steve Carlin
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=2360a06e4 ]

IMPALA-14525: Calcite planner: Add support for RexSimplify

RexSimplify is a Calcite class that rewrites expressions into simpler,
equivalent forms. It was disabled until now because it converts IN
clauses into Calcite's internal SEARCH operator, which Impala does not
support directly.

This commit brings back the RexSimplify class. The SEARCH
operator is now converted into an IN operator when RexNode
objects are changed into Expr objects.
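
As a rough illustration of the SEARCH-to-IN conversion described above,
the following is a toy model in plain Java. It does not use Calcite or
Impala classes: Range, expand, and the generated SQL text are all
hypothetical stand-ins, and the real conversion works on RexNode and Expr
objects rather than strings.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class SearchToInSketch {
    /** Hypothetical stand-in for one closed range of a SEARCH argument. */
    static final class Range {
        final int lower;
        final int upper;

        Range(int lower, int upper) {
            this.lower = lower;
            this.upper = upper;
        }

        /** A range whose bounds are equal describes a single value. */
        boolean isPoint() {
            return lower == upper;
        }
    }

    /** Renders SEARCH(column, ranges) as SQL text without the SEARCH operator. */
    static String expand(String column, List<Range> ranges) {
        List<String> points = new ArrayList<>();
        List<String> disjuncts = new ArrayList<>();
        for (Range r : ranges) {
            if (r.isPoint()) {
                points.add(Integer.toString(r.lower));
            } else {
                disjuncts.add("(" + column + " >= " + r.lower
                    + " AND " + column + " <= " + r.upper + ")");
            }
        }
        // All single values collapse into one equality or one IN list.
        if (points.size() == 1) {
            disjuncts.add(0, column + " = " + points.get(0));
        } else if (points.size() > 1) {
            disjuncts.add(0, column + " IN (" + String.join(", ", points) + ")");
        }
        return disjuncts.isEmpty() ? "FALSE" : String.join(" OR ", disjuncts);
    }

    public static void main(String[] args) {
        List<Range> pointsOnly =
            Arrays.asList(new Range(1, 1), new Range(3, 3), new Range(5, 5));
        List<Range> mixed = Arrays.asList(new Range(1, 1), new Range(10, 20));
        System.out.println(expand("my_col", pointsOnly));
        System.out.println(expand("my_col", mixed));
    }
}
```

In this sketch a SEARCH made only of single values becomes one IN list,
while true ranges stay as ordinary comparisons joined with OR.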

Some notes about the changes that had to be made:

- Some small refactoring was needed in the Impala Expr objects.

- RexSimplify is strict about operator nullability: it asserts on it
  when certain operators are checked. CoerceOperandShuttle now contains
  logic that ensures the nullability is set correctly.

- Some duplicated logic at line 148 in CoerceOperandShuttle was removed
  (the logic already exists in getReturnType).

- The AnalyzedInPredicate subclass was created to avoid analysis done in
  InPredicate.

- Removed ImpalaRexBuilder logic which avoided creation of the SEARCH op.

- Created ImpalaRexSimplify, which extends RexSimplify. RexSimplify causes
  regressions when a Double in a comparison is NaN. For instance,
  "where not(my_col > 30)" is rewritten to "where my_col <= 30". The first
  expression returns true when my_col is NaN, while the second expression
  returns false. ImpalaRexSimplify therefore looks for any binary
  comparison operator involving a Double and skips the simplification.

- Added ImpalaRexUtil which copies the RexUtil.expandSearch() method that
  converts the SEARCH operator into non-search operators. The version here
  handles the conversion to the custom Impala IN operator.

- Created an ImpalaCoreRules class. Even though RexSimplify is now
  supported, it is important that simplification runs through
  ImpalaRexSimplify. RexSimplify is therefore disabled for the
  SqlToRelNode converter and for all rules provided by Calcite.
  ImpalaCoreRules also has the benefit of being the one place where all
  the rules used by Impala can be found.

- Created simplify rules for the filter condition and for the projects in
  the project object.

- Changed the FilterSelectivityEstimator to get the selectivity for the
  SEARCH operator.

- Added a couple of rules in the optimizer for a bug that was exposed by
  enabling the SEARCH operator. PROJECT_JOIN_TRANSPOSE was removed
  because it served no purpose, as we transpose JOIN_PROJECT in the
  join phase. Other rules, such as JOIN_DERIVE_IS_NOT_NULL_FILTER and
  JOIN_PUSH_EXPRESSIONS, were added to help with predicate pushdown. The
  Simplifier rules have also been added.

- Some of the new rules caused many changes in the cardinality and memory
  estimates. The one noticeable change was using IsNullPredicate for the
  IS_NULL and IS_NOT_NULL operators. Previously, these functions used
  FunctionCallExpr, and the cardinality estimation was far off.

- Fixed a small bug in RexLiteralConverter where a string literal was treated
  as a VARCHAR.  A string literal should always be treated as a STRING.
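
The NaN regression noted above for ImpalaRexSimplify can be reproduced
with plain Java doubles, which follow IEEE 754 comparison rules. The
sketch below also shows the general shape of a guard that skips
simplification when a comparison has a Double operand. The Node class,
the operator list, and hasDoubleComparison are hypothetical and are not
the classes or names used in the commit.

```java
import java.util.Arrays;
import java.util.List;

final class NanSimplifyDemo {
    /** The predicate as written: NOT (my_col > 30). */
    static boolean original(double myCol) {
        return !(myCol > 30);
    }

    /** The predicate after the rewrite: my_col <= 30. */
    static boolean simplified(double myCol) {
        return myCol <= 30;
    }

    /** Hypothetical expression node: operator, result type, operands. */
    static final class Node {
        final String op;
        final String type;
        final List<Node> children;

        Node(String op, String type, Node... children) {
            this.op = op;
            this.type = type;
            this.children = Arrays.asList(children);
        }
    }

    /** Assumed set of binary comparison operators for this sketch. */
    static final List<String> COMPARISONS =
        Arrays.asList("=", "<>", "<", "<=", ">", ">=");

    /** Returns true if any comparison in the tree has a DOUBLE operand. */
    static boolean hasDoubleComparison(Node node) {
        if (COMPARISONS.contains(node.op)) {
            for (Node child : node.children) {
                if ("DOUBLE".equals(child.type)) {
                    return true;
                }
            }
        }
        for (Node child : node.children) {
            if (hasDoubleComparison(child)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        double[] samples = {10.0, 30.0, 45.0, Double.NaN};
        for (double v : samples) {
            System.out.printf("my_col=%s original=%b simplified=%b%n",
                v, original(v), simplified(v));
        }
        Node predicate = new Node("NOT", "BOOLEAN",
            new Node(">", "BOOLEAN",
                new Node("INPUT_REF", "DOUBLE"),
                new Node("LITERAL", "INT")));
        System.out.println("skip simplification: " + hasDoubleComparison(predicate));
    }
}
```

The two predicates agree for every ordinary value and differ only for
NaN, which is why a blanket rewrite is unsafe for Double comparisons.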

Change-Id: I44792688f361bf15affa565e5de5709f64dcf18c
Reviewed-on: http://gerrit.cloudera.org:8080/23679
Tested-by: Impala Public Jenkins <[email protected]>
Reviewed-by: Michael Smith <[email protected]>
Reviewed-by: Aman Sinha <[email protected]>


> Calcite Planner: Add support for RexSimplify
> --------------------------------------------
>
>                 Key: IMPALA-14525
>                 URL: https://issues.apache.org/jira/browse/IMPALA-14525
>             Project: IMPALA
>          Issue Type: Sub-task
>            Reporter: Steve Carlin
>            Assignee: Steve Carlin
>            Priority: Major
>
> RexSimplify does quite a few optimizations on the expressions that we should 
> be using.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

