Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11193 )

Change subject: IMPALA-7425: Change incremental stats to pull from catalogd.
......................................................................


Patch Set 7:

(9 comments)

http://gerrit.cloudera.org:8080/#/c/11193/4/common/thrift/CatalogService.thrift
File common/thrift/CatalogService.thrift:

http://gerrit.cloudera.org:8080/#/c/11193/4/common/thrift/CatalogService.thrift@380
PS4, Line 380: // identified by their id. The caller's table_version_number is 
used by the catalog
> several things here :-)
right, but I think the question was: if we could specify the ones to exclude, 
and fetch the remaining ones, it would actually be possible for the catalogd to 
merge them all (ie do the HLL merge step) before transmitting them. That would 
reduce the size on the wire substantially (to O(num columns)) at the cost of 
doing more work on the catalogd side.


http://gerrit.cloudera.org:8080/#/c/11193/7/fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java
File fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java:

http://gerrit.cloudera.org:8080/#/c/11193/7/fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java@452
PS7, Line 452: excludedPartitions
nit: the format for this should be /*excludedPartitions=*/  so that ErrorProne 
can understand it: 
https://github.com/google/error-prone/blob/master/docs/bugpattern/argumentselectiondefects/NamedParameters.md

errorprone isn't on by default at the moment but probably still good to work 
towards that


http://gerrit.cloudera.org:8080/#/c/11193/7/fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java@454
PS7, Line 454: null;
             :           partStats =
nit: combine


http://gerrit.cloudera.org:8080/#/c/11193/7/fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java@610
PS7, Line 610: expNumStats
nit: i think it's worth writing 'expected' instead of 'exp' since my mind went 
to 'exponential' instead of 'expected' for this abbreviation


http://gerrit.cloudera.org:8080/#/c/11193/7/fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java@644
PS7, Line 644:     Map<Long, TPartitionStats> ps = Collections.emptyMap();
it seems like the scoping in this function could be tightened up a bit by 
getting rid of the 'ps' variable entirely.
- just return Collections.emptyMap directly on L645
- on line 654 add an "else" which returns emptyMap

I think that makes the control flow a bit clearer since you don't have to track 
the reassignment of this local in and out of the block


http://gerrit.cloudera.org:8080/#/c/11193/7/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java:

http://gerrit.cloudera.org:8080/#/c/11193/7/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@410
PS7, Line 410: format
missing the '%s' here. Side note: not sure why Impala likes to use 
String.format so much since it's so error-prone vs just adding the strings 
together


http://gerrit.cloudera.org:8080/#/c/11193/7/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

http://gerrit.cloudera.org:8080/#/c/11193/7/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@1742
PS7, Line 1742: configured
this javadoc is a bit incomplete since it also depends how many partitions are 
being sent


http://gerrit.cloudera.org:8080/#/c/11193/7/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@1744
PS7, Line 1744: check
nit: "shouldSend..." so that it's a question rather than a verb. To me, 
"checkFoo" is a function that checks some condition and probably throws an 
exception if it fails


http://gerrit.cloudera.org:8080/#/c/11193/7/fe/src/main/java/org/apache/impala/catalog/local/LocalCatalog.java
File fe/src/main/java/org/apache/impala/catalog/local/LocalCatalog.java:

http://gerrit.cloudera.org:8080/#/c/11193/7/fe/src/main/java/org/apache/impala/catalog/local/LocalCatalog.java@211
PS7, Line 211: add validation to ensure that both pulling incremental
             :     // statistics and a local catalog are specified.
did you mean the opposite of this?



--
To view, visit http://gerrit.cloudera.org:8080/11193
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9d564808ca5157afe4e091909ca6cdac76e60d6e
Gerrit-Change-Number: 11193
Gerrit-PatchSet: 7
Gerrit-Owner: Vuk Ercegovac <[email protected]>
Gerrit-Reviewer: Bharath Vissapragada <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Tianyi Wang <[email protected]>
Gerrit-Reviewer: Todd Lipcon <[email protected]>
Gerrit-Reviewer: Vuk Ercegovac <[email protected]>
Gerrit-Comment-Date: Wed, 29 Aug 2018 00:27:17 +0000
Gerrit-HasComments: Yes

Reply via email to