Vuk Ercegovac has posted comments on this change. ( http://gerrit.cloudera.org:8080/11193 )
Change subject: IMPALA-7425: Change incremental stats to pull from catalogd. ...................................................................... Patch Set 11: (14 comments) http://gerrit.cloudera.org:8080/#/c/11193/11/common/thrift/CatalogService.thrift File common/thrift/CatalogService.thrift: http://gerrit.cloudera.org:8080/#/c/11193/11/common/thrift/CatalogService.thrift@402 PS11, Line 402: // Partitions are identified by their partitioning column values > out of date Done http://gerrit.cloudera.org:8080/#/c/11193/11/common/thrift/CatalogService.thrift@403 PS11, Line 403: defalte > typo Done http://gerrit.cloudera.org:8080/#/c/11193/11/fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java File fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java: http://gerrit.cloudera.org:8080/#/c/11193/11/fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java@469 PS11, Line 469: if (partStats != null) { > isn't this guaranteed as part of the condition on line 459? Done http://gerrit.cloudera.org:8080/#/c/11193/11/fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java@498 PS11, Line 498: if (partitionStats != null) > this always returns non-null now, right? hmm... this shifted around a fair bit. I've checked that the fetch prunes those TPartitionStats that don't have incremental stats. That lets me remove the complicated hasStats stmt on L459 and this null-check. tableIsMissingColStats is specific for the case where partitions are not specified. When they are specified, going by the comment on L479, we only stick to those partitions that were specified. http://gerrit.cloudera.org:8080/#/c/11193/11/fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java@614 PS11, Line 614: id > nit: inline p.getId() here Done http://gerrit.cloudera.org:8080/#/c/11193/11/fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java@634 PS11, Line 634: partitionIds > nit: needs update Done http://gerrit.cloudera.org:8080/#/c/11193/11/fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java@668 PS11, Line 668: isSetIntermediate_col_stats > how come here we only return those that have this isset, but above, we retu good catch, I updated the non-fetch case around L624. http://gerrit.cloudera.org:8080/#/c/11193/11/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java: http://gerrit.cloudera.org:8080/#/c/11193/11/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@417 PS11, Line 417: {} > annoyingly, Preconditions expects '%s' formatting, whereas slf4j expects '{ Done http://gerrit.cloudera.org:8080/#/c/11193/11/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@427 PS11, Line 427: checkArgument > i think this is more of a state than an argument Done http://gerrit.cloudera.org:8080/#/c/11193/11/fe/src/main/java/org/apache/impala/catalog/PartitionStatsUtil.java File fe/src/main/java/org/apache/impala/catalog/PartitionStatsUtil.java: http://gerrit.cloudera.org:8080/#/c/11193/11/fe/src/main/java/org/apache/impala/catalog/PartitionStatsUtil.java@56 PS11, Line 56: deserialization/decompression > what cases does this happen? it's sort of weird that decompression might re this is a refactor that's exposing existing impl. the deflator can return null and deserialization currently throws. http://gerrit.cloudera.org:8080/#/c/11193/11/fe/src/main/java/org/apache/impala/catalog/PartitionStatsUtil.java@67 PS11, Line 67: LOG.warn("Error decompressing partition stats."); > can you take the partition name or some other kind of string identifier as Done http://gerrit.cloudera.org:8080/#/c/11193/11/fe/src/main/java/org/apache/impala/catalog/local/LocalCatalog.java File fe/src/main/java/org/apache/impala/catalog/local/LocalCatalog.java: http://gerrit.cloudera.org:8080/#/c/11193/11/fe/src/main/java/org/apache/impala/catalog/local/LocalCatalog.java@210 PS11, Line 210: // TODO(vercegovac): add validation to ensure that both pulling incremental : // statistics and a local catalog are not specified. > instead, could we just have it ignore the configuration in that case, consi I'd prefer to leave it as-is for now to avoid introducing odd cases where it sort of works when misconfigured. Ideally, this misconfiguration is caught early. http://gerrit.cloudera.org:8080/#/c/11193/11/fe/src/main/java/org/apache/impala/service/JniCatalog.java File fe/src/main/java/org/apache/impala/service/JniCatalog.java: http://gerrit.cloudera.org:8080/#/c/11193/11/fe/src/main/java/org/apache/impala/service/JniCatalog.java@32 PS11, Line 32: import org.apache.commons.collections.iterators.EmptyListIterator; > unused? Done http://gerrit.cloudera.org:8080/#/c/11193/11/fe/src/main/java/org/apache/impala/service/JniCatalog.java@278 PS11, Line 278: response.setStatus(new TStatus(TErrorCode.OK, Collections.<String>emptyList())); > not necessary now that it's optional? Done -- To view, visit http://gerrit.cloudera.org:8080/11193 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9d564808ca5157afe4e091909ca6cdac76e60d6e Gerrit-Change-Number: 11193 Gerrit-PatchSet: 11 Gerrit-Owner: Vuk Ercegovac <[email protected]> Gerrit-Reviewer: Bharath Vissapragada <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Tianyi Wang <[email protected]> Gerrit-Reviewer: Todd Lipcon <[email protected]> Gerrit-Reviewer: Vuk Ercegovac <[email protected]> Gerrit-Comment-Date: Wed, 05 Sep 2018 02:24:57 +0000 Gerrit-HasComments: Yes
