Vuk Ercegovac has posted comments on this change. ( http://gerrit.cloudera.org:8080/11193 )
Change subject: IMPALA-7425: Change incremental stats to pull from catalogd. ...................................................................... Patch Set 4: (11 comments) main change here is to use names for partitions instead of the ids. http://gerrit.cloudera.org:8080/#/c/11193/4/be/src/catalog/catalog-server.cc File be/src/catalog/catalog-server.cc: http://gerrit.cloudera.org:8080/#/c/11193/4/be/src/catalog/catalog-server.cc@149 PS4, Line 149: ; > Can you log the request too? Contains table info etc the java fn logs this info http://gerrit.cloudera.org:8080/#/c/11193/4/be/src/common/global-flags.cc File be/src/common/global-flags.cc: http://gerrit.cloudera.org:8080/#/c/11193/4/be/src/common/global-flags.cc@210 PS4, Line 210: If used, the flag must " : "be set on both catalogd and all impalad coordinators."); > Since this is for v1 metadata, I think it would be even cleaner if it was s backing off on this change to keep the scope of the change down. added a todo. http://gerrit.cloudera.org:8080/#/c/11193/4/common/thrift/CatalogService.thrift File common/thrift/CatalogService.thrift: http://gerrit.cloudera.org:8080/#/c/11193/4/common/thrift/CatalogService.thrift@380 PS4, Line 380: // identified by their id. The caller's table_version_number is used by the catalog yes, I considered doing the work on the catalogd side (shift the merge from coordinator to catalog) but opted to stick with a dataflow that's closer to what's currently there (e.g., primary diff is to fetch remote instead of local... all the other plumbing remains the same and we don't run the c++ code for merging in catalog). updated revision goes with partitioning values as more stable partition names. the scheme is for catalogd to send everything then let the impalad remove whats needed, which is expected to be small (so we're usually sending most partition stats). the issue with versions is avoided, but on the impalad, we do detect if partitions have been added/removed and fail if we see we're out-of-sync in this regard. http://gerrit.cloudera.org:8080/#/c/11193/7/fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java File fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java: http://gerrit.cloudera.org:8080/#/c/11193/7/fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java@452 PS7, Line 452: ionStats != null) > nit: the format for this should be /*excludedPartitions=*/ so that ErrorPr Done http://gerrit.cloudera.org:8080/#/c/11193/7/fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java@454 PS7, Line 454: te_col_stats.isEmpty() && !tableIsMissingColStats; : if (!hasStat > nit: combine Done http://gerrit.cloudera.org:8080/#/c/11193/7/fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java@610 PS7, Line 610: if (exclude > nit: i think it's worth writing 'expected' instead of 'exp' since my mind w Done http://gerrit.cloudera.org:8080/#/c/11193/7/fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java@644 PS7, Line 644: } catch (Exception e) { > it seems like the scoping in this function could be tightened up a bit by g Done http://gerrit.cloudera.org:8080/#/c/11193/7/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java: http://gerrit.cloudera.org:8080/#/c/11193/7/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@410 PS7, Line 410: able_v > missing the '%s' here. Side note: not sure why Impala likes to use String.f Done. not sure. perhaps it was a habit formed when concat was considered slower than alternatives. when I use this I'm primarily sticking to what's in the current file. if we want to change that, I'd prefer to do change all uses per file, but would prefer to avoid that for now to avoid conflicts. http://gerrit.cloudera.org:8080/#/c/11193/7/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java: http://gerrit.cloudera.org:8080/#/c/11193/7/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@1742 PS7, Line 1742: ponding to > this javadoc is a bit incomplete since it also depends how many partitions Done http://gerrit.cloudera.org:8080/#/c/11193/7/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@1744 PS7, Line 1744: erwis > nit: "shouldSend..." so that it's a question rather than a verb. To me, "ch Done http://gerrit.cloudera.org:8080/#/c/11193/7/fe/src/main/java/org/apache/impala/catalog/local/LocalCatalog.java File fe/src/main/java/org/apache/impala/catalog/local/LocalCatalog.java: http://gerrit.cloudera.org:8080/#/c/11193/7/fe/src/main/java/org/apache/impala/catalog/local/LocalCatalog.java@211 PS7, Line 211: OperationException("N/A"); : } > did you mean the opposite of this? whoops, missed a "not" -- To view, visit http://gerrit.cloudera.org:8080/11193 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9d564808ca5157afe4e091909ca6cdac76e60d6e Gerrit-Change-Number: 11193 Gerrit-PatchSet: 4 Gerrit-Owner: Vuk Ercegovac <[email protected]> Gerrit-Reviewer: Bharath Vissapragada <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Tianyi Wang <[email protected]> Gerrit-Reviewer: Todd Lipcon <[email protected]> Gerrit-Reviewer: Vuk Ercegovac <[email protected]> Gerrit-Comment-Date: Thu, 30 Aug 2018 03:01:56 +0000 Gerrit-HasComments: Yes
