Paul Rogers has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11528 )

Change subject: Use NDV=1 for a Column with all nulls
......................................................................


Patch Set 10:

(3 comments)

Thanks for the continued reviews. Cleaned up a few more items. Answered some 
questions.

http://gerrit.cloudera.org:8080/#/c/11528/8//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/11528/8//COMMIT_MSG@31
PS8, Line 31:
> what types of changes are these-- did they manage to change the structure o
Sorry. The note is trying to explain that 1) there is a planner change (that is 
the gist of the problem resolution), and 2) the change occurs only in the very 
narrow conditions spelled out in this note.

The line in question here explains why the adjustment is applied only to NDV of 
0-1 (and not, say, 0,-10). If we have NDV=10, and column is nullable, then we 
certainly might want to adjust NDV to 11. But, if we do, many tests fail due to 
changed cardinality, changed memory and so on. (The tests "fail" in the sense 
that the text plans have changed. The structure is the same, just the numbers 
differ.)

The original bug complained that, with NDV=0, the planner did misfire, did 
produce an incorrect plan. It did so because, in aggregation, we multiply 
cardinalities by 0, resulting in an overall cardinality of 0.

The new tests verify that, after the adjustments, cardinality estimates are no 
longer zeroed out in aggregate nodes, but instead bubble up. This will, 
presumably, fix the original problem. (The original bug did not give sufficient 
information to completely reproduce the particular query that failed.)


http://gerrit.cloudera.org:8080/#/c/11528/8/testdata/NullTable/large_data.csv
File testdata/NullTable/large_data.csv:

http://gerrit.cloudera.org:8080/#/c/11528/8/testdata/NullTable/large_data.csv@1
PS8, Line 1:
> Looks like theres several conventions under testdata... I see testdata/data
Created a new NullRows folder, moved this file there and renamed it to 
"data.csv".


http://gerrit.cloudera.org:8080/#/c/11528/8/testdata/datasets/functional/functional_schema_template.sql
File testdata/datasets/functional/functional_schema_template.sql:

http://gerrit.cloudera.org:8080/#/c/11528/8/testdata/datasets/functional/functional_schema_template.sql@1355
PS8, Line 1355: ====
> I have no idea, but I counted BASE_TABLE_NAME and see it shows up 105 times
Removed the Kudu portion.



--
To view, visit http://gerrit.cloudera.org:8080/11528
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ife657a43c9cafc451bd12ddf857dcb7169e97459
Gerrit-Change-Number: 11528
Gerrit-PatchSet: 10
Gerrit-Owner: Paul Rogers <par0...@yahoo.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Paul Rogers <par0...@yahoo.com>
Gerrit-Reviewer: Philip Zeyliger <phi...@cloudera.com>
Gerrit-Reviewer: Vuk Ercegovac <vercego...@cloudera.com>
Gerrit-Comment-Date: Tue, 02 Oct 2018 19:53:05 +0000
Gerrit-HasComments: Yes

Reply via email to