Alex Behm has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9133 )

Change subject: IMPALA-3562: support column restriction for compute stats
......................................................................


Patch Set 2:

(9 comments)

http://gerrit.cloudera.org:8080/#/c/9133/2/fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java
File fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java:

http://gerrit.cloudera.org:8080/#/c/9133/2/fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java@344
PS2, Line 344:     if (!isIncremental_ && columnWhitelist_ != null) {
remove the incremental part, that's handled via c'tor precondition


http://gerrit.cloudera.org:8080/#/c/9133/2/fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java@349
PS2, Line 349:           throw new AnalysisException("Column: " + colName + " 
not found in table: " +
Sentence reads strangely with two colons, how about removing the first one?


http://gerrit.cloudera.org:8080/#/c/9133/2/fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java@354
PS2, Line 354:               "column:" + col.getName() + " of HDFS table.");
remove colon for readability?


http://gerrit.cloudera.org:8080/#/c/9133/2/fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java@357
PS2, Line 357:           throw new AnalysisException("COMPUTE STATS not 
supported for type: " +
COMPUTE STATS not supported for column <colName> of complex type: <typeSql>


http://gerrit.cloudera.org:8080/#/c/9133/2/fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java@726
PS2, Line 726:       if (sampleParams_ != null) tblsmpl = " " + 
sampleParams_.toSql();
nit: move this below L732 to have SQL order


http://gerrit.cloudera.org:8080/#/c/9133/2/fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java
File fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java:

http://gerrit.cloudera.org:8080/#/c/9133/2/fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java@1214
PS2, Line 1214:     checkComputeStatsStmt("compute stats functional.alltypes 
(int_col, double_col)",
add test with duplicate columns in the whitelist (we expect dedup)


http://gerrit.cloudera.org:8080/#/c/9133/2/fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java@1217
PS2, Line 1217:     checkComputeStatsStmt("compute stats functional.alltypes 
()",
I have a feeling an empty columns list might analyze but not execute. It might 
be simpler to forbid this case.


http://gerrit.cloudera.org:8080/#/c/9133/2/fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java@1270
PS2, Line 1270:       checkComputeStatsStmt(
add test with column whitelist


http://gerrit.cloudera.org:8080/#/c/9133/2/testdata/workloads/functional-query/queries/QueryTest/compute-stats.test
File testdata/workloads/functional-query/queries/QueryTest/compute-stats.test:

http://gerrit.cloudera.org:8080/#/c/9133/2/testdata/workloads/functional-query/queries/QueryTest/compute-stats.test@211
PS2, Line 211: # Restricts stats to a subset of columns.
Would be good to add an E2E test with TABLESAMPLE in 
tests/custom_cluster/test_stats_extrapolation.py



--
To view, visit http://gerrit.cloudera.org:8080/9133
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If8b25dd248e578dc7ddd35468125cca12d1b9f27
Gerrit-Change-Number: 9133
Gerrit-PatchSet: 2
Gerrit-Owner: Vuk Ercegovac <[email protected]>
Gerrit-Reviewer: Alex Behm <[email protected]>
Gerrit-Comment-Date: Tue, 30 Jan 2018 00:30:13 +0000
Gerrit-HasComments: Yes

Reply via email to