Shant Hovsepian has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/16182


Change subject: IMPALA-9974: Join elimination based on referential integrity.
......................................................................

IMPALA-9974: Join elimination based on referential integrity.

Use defined referential dependency constraints during planning.

  1. For join ndv calculations involving a single column primary key we
     now use the table's row count instead of the column's ndv stats
     which are approximate and incorrect at times.
  2. Convert LEFT SEMI JOIN to INNER JOIN if the inner table's primary
     key is used as the join key.
  3. Convert LEFT OUTER JOIN to INNER JOIN if the join key uses a
     fk/pk relationship, the FK is not nullable.
  4. Prune INNER JOINS on fk/pk conditions where the join doesn't affect
     the output cardinality and there are no predicates or non-key
     columns in use on the inner side.
  5. For a confirmed fk/pk join we include a IS NOT NULL predicate on
     the foreign key columns.

Plan optimizations can be disabled with the query option 
DISABLE_INFORMATIONAL_CONSTRAINTS

Testing:
  - New constraint rewrite planner test
  - TPC-DS functional tests
  - TPC-DS Planner test changes

Some TPC-DS 30TB Performance Improvements:
  - Q72 74.3s -> 24.8s
  - Q82 13.5s -> 2.2s
  - Q84 19.33s -> 6.07s
  - Q85 29s -> 6.33s

Change-Id: Idd07d45e419175af7d62f6035ba51be841ebc074
---
M be/src/service/query-options.cc
M be/src/service/query-options.h
M common/thrift/ImpalaInternalService.thrift
M common/thrift/ImpalaService.thrift
M fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/Expr.java
M fe/src/main/java/org/apache/impala/analysis/ExprSubstitutionMap.java
M fe/src/main/java/org/apache/impala/analysis/SelectList.java
M fe/src/main/java/org/apache/impala/analysis/SelectListItem.java
M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java
M fe/src/main/java/org/apache/impala/analysis/SlotRef.java
M fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java
A fe/src/main/java/org/apache/impala/analysis/TableConstraintRewriter.java
M fe/src/main/java/org/apache/impala/catalog/Column.java
M fe/src/main/java/org/apache/impala/catalog/FeFsTable.java
M fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/catalog/SqlConstraints.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalFsTable.java
M fe/src/main/java/org/apache/impala/planner/JoinNode.java
M fe/src/test/java/org/apache/impala/catalog/CatalogTest.java
M fe/src/test/java/org/apache/impala/planner/PlannerTest.java
M testdata/bin/compute-table-stats.sh
M testdata/data/child_table.txt
M testdata/data/parent_table.txt
M testdata/data/parent_table_2.txt
M testdata/datasets/functional/functional_schema_template.sql
A 
testdata/workloads/functional-planner/queries/PlannerTest/constraint-rewrite.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q03.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q04.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q05.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q06.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q07.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q08.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q09.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q10a.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q11.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q13.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q14a.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q14b.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q17.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q18.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q19.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q22.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q23a.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q23b.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q24a.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q24b.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q25.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q27.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q28.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q29.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q31.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q33.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q34.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q35a.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q36.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q38.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q39a.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q39b.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q42.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q43.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q44.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q46.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q47.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q48.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q49.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q50.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q51.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q52.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q53.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q54.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q55.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q56.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q58.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q59.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q60.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q61.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q63.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q64.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q65.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q66.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q67.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q68.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q69.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q70.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q71.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q73.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q74.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q75.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q76.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q77.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q78.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q79.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q80.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q82.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q85.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q87.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q88.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q89.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q93.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q96.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q97.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q98.test
105 files changed, 6,809 insertions(+), 7,476 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/82/16182/4
--
To view, visit http://gerrit.cloudera.org:8080/16182
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Idd07d45e419175af7d62f6035ba51be841ebc074
Gerrit-Change-Number: 16182
Gerrit-PatchSet: 4
Gerrit-Owner: Shant Hovsepian <[email protected]>
Gerrit-Reviewer: Aman Sinha <[email protected]>
Gerrit-Reviewer: David Rorke <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Shant Hovsepian <[email protected]>
Gerrit-Reviewer: Tim Armstrong <[email protected]>

Reply via email to