Paul Rogers has posted comments on this change. ( http://gerrit.cloudera.org:8080/11565 )
Change subject: IMPALA-7659: Populate NULL count while computing column stats
......................................................................


Patch Set 7: Code-Review+1

(2 comments)

My vote is to get this in, then do three things:

1. Use IMPALA-7842 to add cardinality tests based on this feature.
2. Do some refresh metadata performance runs to measure the impact.
3. Tackle the NDV-does-or-does-not-include-nulls issue.

http://gerrit.cloudera.org:8080/#/c/11565/7//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/11565/7//COMMIT_MSG@16
PS7, Line 16: Tests: Updated the affected tests to include the null counts.
> Can we add a couple of tests that verify that cardinality estimates for out
FWIW, IMPALA-7842 provides a starter set of cardinality tests based on exposing the pre-Thrift plan tree. We can build on those if that patch goes in before this one.


http://gerrit.cloudera.org:8080/#/c/11565/6/fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java
File fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java:

http://gerrit.cloudera.org:8080/#/c/11565/6/fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java@251
PS6, Line 251:
> Was discussing this with Paul offline. We thought that adjusting the NDV be
Agreed. Let's get this in, then tackle the NDV=0 case as a separate issue.

I wonder, do we have any data on the original concern: does adding this extra calculation cause a measurable slowdown? If a table has many columns, and we add a null count for each, how much impact is there on refresh metadata performance?

--
To view, visit http://gerrit.cloudera.org:8080/11565
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic68f8b4c3756eb1980ce299a602a7d56db1e507a
Gerrit-Change-Number: 11565
Gerrit-PatchSet: 7
Gerrit-Owner: Anonymous Coward <piotr.findei...@gmail.com>
Gerrit-Reviewer: Anonymous Coward <piotr.findei...@gmail.com>
Gerrit-Reviewer: Bharath Vissapragada <bhara...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Paul Rogers <par0...@yahoo.com>
Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com>
Gerrit-Reviewer: Vuk Ercegovac <vercego...@cloudera.com>
Gerrit-Comment-Date: Wed, 05 Dec 2018 21:18:47 +0000
Gerrit-HasComments: Yes