Hello Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/11528

to look at the new patch set (#5).

Change subject: IMPALA-7310: All-null columns give wrong estimates in planner
......................................................................

IMPALA-7310: All-null columns give wrong estimates in planner

Modified the planner to handle low-value NDVs by adjusting them
upward by one to account for null values. Thus, even an all-null
column, which has an NDV of 0 in stats, will have an NDV of 1 in
the planner. (The planner already expects NDV to include nulls.)

Modified the front end to allow capturing the full plan for use in
a unit test. Added unit tests that verify estimated cardinality
for a plan as a way to verify that the fix solved the scenario
in IMPALA-7310.

Testing required a new table, similar to the existing nulltable,
but which has multiple rows and has stats calculated.

The change was limited to a very narrow range of cases:

* Table column (not an internal column such as COUNT(*))
* Column is nullable
* Column has stats
* Column does not provide a null count, or null count > 0
* Reported NDV <= 1

In this narrow case, we add one to NDV to account for nulls.
(Any larger adjustment throws off the TPC-H tests, which have
multiple columns, marked as nullable, with low NDV, but which
actually include no nulls.)
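
As a rough illustration of the rule above, the criteria can be read as a
single guard around the "+1" adjustment. This is only a sketch: the class
name, method name, parameters, and the use of -1 for "no null count
available" are assumptions made for the example and are not taken from
the patch itself.

```java
// Hypothetical sketch of the adjustment rule described in the commit message.
// None of these names come from the Impala code base.
public final class NdvAdjustSketch {
  /** Assumed marker meaning "stats do not provide a null count". */
  static final long UNKNOWN = -1;

  /**
   * Returns the NDV the planner would use under the rule described above:
   * add one for nulls only when every listed criterion holds.
   */
  static long adjustNdv(boolean isTableColumn, boolean isNullable,
      boolean hasStats, long numNulls, long ndv) {
    // Internal columns, non-nullable columns, and columns without stats
    // are left untouched.
    if (!isTableColumn || !isNullable || !hasStats) return ndv;
    // Adjust only if nulls are possible: null count unknown or positive.
    boolean mayHaveNulls = numNulls == UNKNOWN || numNulls > 0;
    if (!mayHaveNulls) return ndv;
    // Only low reported NDVs (<= 1) are adjusted.
    if (ndv > 1) return ndv;
    return ndv + 1;
  }

  public static void main(String[] args) {
    // All-null column: stats report NDV 0, planner sees 1.
    System.out.println(adjustNdv(true, true, true, UNKNOWN, 0));
    // Nullable column with NDV 1 and known nulls: planner sees 2.
    System.out.println(adjustNdv(true, true, true, 5, 1));
    // Null count known to be 0: NDV is left as reported.
    System.out.println(adjustNdv(true, true, true, 0, 1));
    // NDV above the threshold: NDV is left as reported.
    System.out.println(adjustNdv(true, true, true, UNKNOWN, 2));
  }
}
```

Keeping every criterion as an early return makes it easy to see that
columns outside the narrow case keep their reported NDV.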

The change minimizes the impact on PlannerTest, but some memory
numbers still needed adjusting for a test in which one column
met the criteria listed above and had its NDV adjusted.

Change-Id: Ife657a43c9cafc451bd12ddf857dcb7169e97459
---
M .gitignore
M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java
M fe/src/main/java/org/apache/impala/analysis/SlotDescriptor.java
M fe/src/main/java/org/apache/impala/analysis/SlotRef.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M fe/src/test/java/org/apache/impala/analysis/ExprNdvTest.java
A fe/src/test/java/org/apache/impala/planner/CardinalityTest.java
A testdata/NullTable/large_data.csv
M testdata/bin/compute-table-stats.sh
M testdata/datasets/functional/functional_schema_template.sql
M testdata/datasets/functional/schema_constraints.csv
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds-all.test
12 files changed, 449 insertions(+), 23 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/28/11528/5
--
To view, visit http://gerrit.cloudera.org:8080/11528
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ife657a43c9cafc451bd12ddf857dcb7169e97459
Gerrit-Change-Number: 11528
Gerrit-PatchSet: 5
Gerrit-Owner: Paul Rogers <par0...@yahoo.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
