Alex Behm has posted comments on this change. ( http://gerrit.cloudera.org:8080/9133 )
Change subject: IMPALA-3562: support column restriction for compute stats ...................................................................... Patch Set 2: (9 comments) http://gerrit.cloudera.org:8080/#/c/9133/2/fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java File fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java: http://gerrit.cloudera.org:8080/#/c/9133/2/fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java@344 PS2, Line 344: if (!isIncremental_ && columnWhitelist_ != null) { remove the incremental part, that's handled via c'tor precondition http://gerrit.cloudera.org:8080/#/c/9133/2/fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java@349 PS2, Line 349: throw new AnalysisException("Column: " + colName + " not found in table: " + Sentence reads strangely with two colons, how about removing the first one? http://gerrit.cloudera.org:8080/#/c/9133/2/fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java@354 PS2, Line 354: "column:" + col.getName() + " of HDFS table."); remove colon for readability? http://gerrit.cloudera.org:8080/#/c/9133/2/fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java@357 PS2, Line 357: throw new AnalysisException("COMPUTE STATS not supported for type: " + COMPUTE STATS not supported for column <colName> of complex type: <typeSql> http://gerrit.cloudera.org:8080/#/c/9133/2/fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java@726 PS2, Line 726: if (sampleParams_ != null) tblsmpl = " " + sampleParams_.toSql(); nit: move this below L732 to have SQL order http://gerrit.cloudera.org:8080/#/c/9133/2/fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java File fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java: http://gerrit.cloudera.org:8080/#/c/9133/2/fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java@1214 PS2, Line 1214: checkComputeStatsStmt("compute stats functional.alltypes (int_col, double_col)", add test with duplicate columns in the whitelist (we expect dedup) http://gerrit.cloudera.org:8080/#/c/9133/2/fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java@1217 PS2, Line 1217: checkComputeStatsStmt("compute stats functional.alltypes ()", I have a feeling an empty columns list might analyze but not execute. It might be simpler to forbid this case. http://gerrit.cloudera.org:8080/#/c/9133/2/fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java@1270 PS2, Line 1270: checkComputeStatsStmt( add test with column whitelist http://gerrit.cloudera.org:8080/#/c/9133/2/testdata/workloads/functional-query/queries/QueryTest/compute-stats.test File testdata/workloads/functional-query/queries/QueryTest/compute-stats.test: http://gerrit.cloudera.org:8080/#/c/9133/2/testdata/workloads/functional-query/queries/QueryTest/compute-stats.test@211 PS2, Line 211: # Restricts stats to a subset of columns. Would be good to add an E2E test with TABLESAMPLE in tests/custom_cluster/test_stats_extrapolation.py -- To view, visit http://gerrit.cloudera.org:8080/9133 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If8b25dd248e578dc7ddd35468125cca12d1b9f27 Gerrit-Change-Number: 9133 Gerrit-PatchSet: 2 Gerrit-Owner: Vuk Ercegovac <[email protected]> Gerrit-Reviewer: Alex Behm <[email protected]> Gerrit-Comment-Date: Tue, 30 Jan 2018 00:30:13 +0000 Gerrit-HasComments: Yes
