Todd Lipcon has posted comments on this change. ( http://gerrit.cloudera.org:8080/11193 )
Change subject: IMPALA-7425: Change incremental stats to pull from catalogd. ...................................................................... Patch Set 7: (9 comments) http://gerrit.cloudera.org:8080/#/c/11193/4/common/thrift/CatalogService.thrift File common/thrift/CatalogService.thrift: http://gerrit.cloudera.org:8080/#/c/11193/4/common/thrift/CatalogService.thrift@380 PS4, Line 380: // identified by their id. The caller's table_version_number is used by the catalog > several things here :-) right, but I think the question was: if we could specify the ones to exclude, and fetch the remaining ones, it would actually be possible for the catalogd to merge them all (ie do the HLL merge step) before transmitting them. That would reduce the size on the wire substantially (to O(num columns)) at the cost of doing more work on the catalogd side. http://gerrit.cloudera.org:8080/#/c/11193/7/fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java File fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java: http://gerrit.cloudera.org:8080/#/c/11193/7/fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java@452 PS7, Line 452: excludedPartitions nit: the format for this should be /*excludedPartitions=*/ so that ErrorProne can understand it: https://github.com/google/error-prone/blob/master/docs/bugpattern/argumentselectiondefects/NamedParameters.md errorprone isn't on by default at the moment but probably still good to work towards that http://gerrit.cloudera.org:8080/#/c/11193/7/fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java@454 PS7, Line 454: null; : partStats = nit: combine http://gerrit.cloudera.org:8080/#/c/11193/7/fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java@610 PS7, Line 610: expNumStats nit: i think it's worth writing 'expected' instead of 'exp' since my mind went to 'exponential' instead of 'expected' for this abbreviation http://gerrit.cloudera.org:8080/#/c/11193/7/fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java@644 PS7, Line 644: Map<Long, TPartitionStats> ps = Collections.emptyMap(); it seems like the scoping in this function could be tightened up a bit by getting rid of the 'ps' variable entirely. - just return Collections.emptyMap directly on L645 - on line 654 add an "else" which returns emptyMap I think that makes the control flow a bit clearer since you don't have to track the reassignment of this local in and out of the block http://gerrit.cloudera.org:8080/#/c/11193/7/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java: http://gerrit.cloudera.org:8080/#/c/11193/7/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@410 PS7, Line 410: format missing the '%s' here. Side note: not sure why Impala likes to use String.format so much since it's so error-prone vs just adding the strings together http://gerrit.cloudera.org:8080/#/c/11193/7/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java: http://gerrit.cloudera.org:8080/#/c/11193/7/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@1742 PS7, Line 1742: configured this javadoc is a bit incomplete since it also depends how many partitions are being sent http://gerrit.cloudera.org:8080/#/c/11193/7/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@1744 PS7, Line 1744: check nit: "shouldSend..." so that it's a question rather than a verb. To me, "checkFoo" is a function that checks some condition and probably throws an exception if it fails http://gerrit.cloudera.org:8080/#/c/11193/7/fe/src/main/java/org/apache/impala/catalog/local/LocalCatalog.java File fe/src/main/java/org/apache/impala/catalog/local/LocalCatalog.java: http://gerrit.cloudera.org:8080/#/c/11193/7/fe/src/main/java/org/apache/impala/catalog/local/LocalCatalog.java@211 PS7, Line 211: add validation to ensure that both pulling incremental : // statistics and a local catalog are specified. did you mean the opposite of this? -- To view, visit http://gerrit.cloudera.org:8080/11193 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9d564808ca5157afe4e091909ca6cdac76e60d6e Gerrit-Change-Number: 11193 Gerrit-PatchSet: 7 Gerrit-Owner: Vuk Ercegovac <[email protected]> Gerrit-Reviewer: Bharath Vissapragada <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Tianyi Wang <[email protected]> Gerrit-Reviewer: Todd Lipcon <[email protected]> Gerrit-Reviewer: Vuk Ercegovac <[email protected]> Gerrit-Comment-Date: Wed, 29 Aug 2018 00:27:17 +0000 Gerrit-HasComments: Yes
