Sahil Takiar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14666 )

Change subject: IMPALA-8205:  Support number of true and false statistics for 
boolean column
......................................................................


Patch Set 6:

(8 comments)

does this include computing stats for boolean columns in Kudu as well? (if not 
thats okay, just wondering)

http://gerrit.cloudera.org:8080/#/c/14666/6//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/14666/6//COMMIT_MSG@17
PS6, Line 17: incrementabl
nit: typo, should be "incremental"


http://gerrit.cloudera.org:8080/#/c/14666/6/be/src/exec/incr-stats-util.cc
File be/src/exec/incr-stats-util.cc:

http://gerrit.cloudera.org:8080/#/c/14666/6/be/src/exec/incr-stats-util.cc@132
PS6, Line 132:   int64_t num_trues;
             :
             :   int64_t num_falses;
nit: both fields should have comments


http://gerrit.cloudera.org:8080/#/c/14666/6/be/src/exec/incr-stats-util.cc@249
PS6, Line 249: update
nit: first letter should be uppercase, change to "Updated"


http://gerrit.cloudera.org:8080/#/c/14666/6/be/src/exec/incr-stats-util.cc@249
PS6, Line 249: VLOG
revise to: "Updated statistics for table=[table-name] and column=[column-name]; 
statistics={ndv=[ndv-value], num_rows=[num-rows], avg_width=[avg_width], ...}"


http://gerrit.cloudera.org:8080/#/c/14666/6/be/src/exec/incr-stats-util.cc@318
PS6, Line 318: VLOG
same comments as log statement above.


http://gerrit.cloudera.org:8080/#/c/14666/6/fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java
File fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java:

http://gerrit.cloudera.org:8080/#/c/14666/6/fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java@293
PS6, Line 293: "COUNT(CASE WHEN " + colRefSql + " = TRUE THEN 1 ELSE NULL 
END)");
why not just use "sum(" + colRefSql ")");? I'm not sure if one way is better 
than another, but it would be nice to try and see if one is faster than another.


http://gerrit.cloudera.org:8080/#/c/14666/6/fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java@294
PS6, Line 294: columnStatsSelectList.add(
             :             "COUNT(CASE WHEN " + colRefSql + " = FALSE THEN 1 
ELSE NULL END)");
can't this be derived by the total number of rows, and the number of true 
values?


http://gerrit.cloudera.org:8080/#/c/14666/6/fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java@297
PS6, Line 297:  columnStatsSelectList.add("NULL");
             :         columnStatsSelectList.add("NULL");
what is this for?



--
To view, visit http://gerrit.cloudera.org:8080/14666
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I991bee8e7fdc644d908289f5fe2ee8032cc2c431
Gerrit-Change-Number: 14666
Gerrit-PatchSet: 6
Gerrit-Owner: Anonymous Coward <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Quanlong Huang <[email protected]>
Gerrit-Reviewer: Sahil Takiar <[email protected]>
Gerrit-Comment-Date: Tue, 21 Jan 2020 15:53:45 +0000
Gerrit-HasComments: Yes

Reply via email to