Qifan Chen has posted comments on this change. ( http://gerrit.cloudera.org:8080/16349 )
Change subject: IMPALA-7310: Partial fix for NDV cardinality with NULLs.
......................................................................

Patch Set 10:

(14 comments)

Looks good!

http://gerrit.cloudera.org:8080/#/c/16349/10/fe/src/main/java/org/apache/impala/analysis/SlotRef.java
File fe/src/main/java/org/apache/impala/analysis/SlotRef.java:

http://gerrit.cloudera.org:8080/#/c/16349/10/fe/src/main/java/org/apache/impala/analysis/SlotRef.java@98
PS10, Line 98: adjustNdv(
Maybe rename this to getNumDistinctValuesAdjusted().

http://gerrit.cloudera.org:8080/#/c/16349/10/fe/src/main/java/org/apache/impala/analysis/SlotRef.java@103
PS10, Line 103: // Adjust an ndv of zero to 1 if stats indicate there are null values.
When numDistinctValues > 0, this adjustment is not performed. I wonder whether making the adjustment unconditional would hurt anything.

http://gerrit.cloudera.org:8080/#/c/16349/10/fe/src/main/java/org/apache/impala/catalog/ColumnStats.java
File fe/src/main/java/org/apache/impala/catalog/ColumnStats.java:

http://gerrit.cloudera.org:8080/#/c/16349/10/fe/src/main/java/org/apache/impala/catalog/ColumnStats.java@191
PS10, Line 191: tNumDistinctValues() { return numDistinctValues_; }
nit: this looks like the method was only moved within the module.

http://gerrit.cloudera.org:8080/#/c/16349/10/fe/src/test/java/org/apache/impala/analysis/ExprCardinalityTest.java
File fe/src/test/java/org/apache/impala/analysis/ExprCardinalityTest.java:

http://gerrit.cloudera.org:8080/#/c/16349/10/fe/src/test/java/org/apache/impala/analysis/ExprCardinalityTest.java@211
PS10, Line 211: // Bug: NDV should be 1 to include nulls
This comment can be removed.
http://gerrit.cloudera.org:8080/#/c/16349/10/fe/src/test/java/org/apache/impala/analysis/ExprNdvTest.java
File fe/src/test/java/org/apache/impala/analysis/ExprNdvTest.java:

http://gerrit.cloudera.org:8080/#/c/16349/10/fe/src/test/java/org/apache/impala/analysis/ExprNdvTest.java@176
PS10, Line 176: a
id

http://gerrit.cloudera.org:8080/#/c/16349/10/fe/src/test/java/org/apache/impala/analysis/ExprNdvTest.java@178
PS10, Line 178: f
some_nulls

http://gerrit.cloudera.org:8080/#/c/16349/10/fe/src/test/java/org/apache/impala/analysis/ExprNdvTest.java@180
PS10, Line 180: c
null_str

http://gerrit.cloudera.org:8080/#/c/16349/10/fe/src/test/java/org/apache/impala/analysis/ExprNdvTest.java@182
PS10, Line 182: // NDV(b) = 1, add 1 for nulls
: // Bug: See IMPALA-7310, IMPALA-8094
: //verifyNdvStmt("SELECT blanks FROM functional.nullrows", 2);
Seems like these lines can be removed.

http://gerrit.cloudera.org:8080/#/c/16349/10/fe/src/test/java/org/apache/impala/planner/CardinalityTest.java
File fe/src/test/java/org/apache/impala/planner/CardinalityTest.java:

http://gerrit.cloudera.org:8080/#/c/16349/10/fe/src/test/java/org/apache/impala/planner/CardinalityTest.java@132
PS10, Line 132: f has NDV=3
This comment is not accurate.

http://gerrit.cloudera.org:8080/#/c/16349/10/fe/src/test/java/org/apache/impala/planner/CardinalityTest.java@136
PS10, Line 136: c is all nulls
Seems like the reference to c is not right here.

http://gerrit.cloudera.org:8080/#/c/16349/10/fe/src/test/java/org/apache/impala/planner/CardinalityTest.java@138
PS10, Line 138: a
same here.
http://gerrit.cloudera.org:8080/#/c/16349/10/fe/src/test/java/org/apache/impala/planner/CardinalityTest.java@140
PS10, Line 140: a
same here

http://gerrit.cloudera.org:8080/#/c/16349/10/fe/src/test/java/org/apache/impala/planner/CardinalityTest.java@140
PS10, Line 140: f)
same

http://gerrit.cloudera.org:8080/#/c/16349/10/fe/src/test/java/org/apache/impala/planner/CardinalityTest.java@182
PS10, Line 182: = 1
Maybe write this as: // NDV(id) = 26, ndv(null_str) = 1, NDV(id)*ndv(null_str) = 26.

--
To view, visit http://gerrit.cloudera.org:8080/16349
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iec967053b4991f8c67cde62adf003cbd3f429032
Gerrit-Change-Number: 16349
Gerrit-PatchSet: 10
Gerrit-Owner: Shant Hovsepian <[email protected]>
Gerrit-Reviewer: Aman Sinha <[email protected]>
Gerrit-Reviewer: David Rorke <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Qifan Chen <[email protected]>
Gerrit-Reviewer: Shant Hovsepian <[email protected]>
Gerrit-Reviewer: Tim Armstrong <[email protected]>
Gerrit-Comment-Date: Mon, 31 Aug 2020 14:28:02 +0000
Gerrit-HasComments: Yes
