Sahil Takiar has posted comments on this change. ( http://gerrit.cloudera.org:8080/14666 )
Change subject: IMPALA-8205: Support number of true and false statistics for boolean column ...................................................................... Patch Set 6: (8 comments) does this include computing stats for boolean columns in Kudu as well? (if not thats okay, just wondering) http://gerrit.cloudera.org:8080/#/c/14666/6//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/14666/6//COMMIT_MSG@17 PS6, Line 17: incrementabl nit: typo, should be "incremental" http://gerrit.cloudera.org:8080/#/c/14666/6/be/src/exec/incr-stats-util.cc File be/src/exec/incr-stats-util.cc: http://gerrit.cloudera.org:8080/#/c/14666/6/be/src/exec/incr-stats-util.cc@132 PS6, Line 132: int64_t num_trues; : : int64_t num_falses; nit: both fields should have comments http://gerrit.cloudera.org:8080/#/c/14666/6/be/src/exec/incr-stats-util.cc@249 PS6, Line 249: update nit: first letter should be uppercase, change to "Updated" http://gerrit.cloudera.org:8080/#/c/14666/6/be/src/exec/incr-stats-util.cc@249 PS6, Line 249: VLOG revise to: "Updated statistics for table=[table-name] and column=[column-name]; statistics={ndv=[ndv-value], num_rows=[num-rows], avg_width=[avg_width], ...}" http://gerrit.cloudera.org:8080/#/c/14666/6/be/src/exec/incr-stats-util.cc@318 PS6, Line 318: VLOG same comments as log statement above. http://gerrit.cloudera.org:8080/#/c/14666/6/fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java File fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java: http://gerrit.cloudera.org:8080/#/c/14666/6/fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java@293 PS6, Line 293: "COUNT(CASE WHEN " + colRefSql + " = TRUE THEN 1 ELSE NULL END)"); why not just use "sum(" + colRefSql ")");? I'm not sure if one way is better than another, but it would be nice to try and see if one is faster than another. http://gerrit.cloudera.org:8080/#/c/14666/6/fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java@294 PS6, Line 294: columnStatsSelectList.add( : "COUNT(CASE WHEN " + colRefSql + " = FALSE THEN 1 ELSE NULL END)"); can't this be derived by the total number of rows, and the number of true values? http://gerrit.cloudera.org:8080/#/c/14666/6/fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java@297 PS6, Line 297: columnStatsSelectList.add("NULL"); : columnStatsSelectList.add("NULL"); what is this for? -- To view, visit http://gerrit.cloudera.org:8080/14666 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I991bee8e7fdc644d908289f5fe2ee8032cc2c431 Gerrit-Change-Number: 14666 Gerrit-PatchSet: 6 Gerrit-Owner: Anonymous Coward <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Quanlong Huang <[email protected]> Gerrit-Reviewer: Sahil Takiar <[email protected]> Gerrit-Comment-Date: Tue, 21 Jan 2020 15:53:45 +0000 Gerrit-HasComments: Yes
