Vuk Ercegovac has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11193 )

Change subject: IMPALA-7425: Change incremental stats to pull from catalogd.
......................................................................


Patch Set 4:

(11 comments)

main change here is to use names for partitions instead of the ids.

http://gerrit.cloudera.org:8080/#/c/11193/4/be/src/catalog/catalog-server.cc
File be/src/catalog/catalog-server.cc:

http://gerrit.cloudera.org:8080/#/c/11193/4/be/src/catalog/catalog-server.cc@149
PS4, Line 149: ;
> Can you log the request too? Contains table info etc
the java fn logs this info


http://gerrit.cloudera.org:8080/#/c/11193/4/be/src/common/global-flags.cc
File be/src/common/global-flags.cc:

http://gerrit.cloudera.org:8080/#/c/11193/4/be/src/common/global-flags.cc@210
PS4, Line 210: If used, the flag must "
             :     "be set on both catalogd and all impalad coordinators.");
> Since this is for v1 metadata, I think it would be even cleaner if it was s
backing off on this change to keep the scope of the change down. added a todo.


http://gerrit.cloudera.org:8080/#/c/11193/4/common/thrift/CatalogService.thrift
File common/thrift/CatalogService.thrift:

http://gerrit.cloudera.org:8080/#/c/11193/4/common/thrift/CatalogService.thrift@380
PS4, Line 380: // identified by their id. The caller's table_version_number is 
used by the catalog
yes, I considered doing the work on the catalogd side (shift the merge from 
coordinator to catalog) but opted to stick with a dataflow that's closer to 
what's currently there (e.g., primary diff is to fetch remote instead of 
local... all the other plumbing remains the same and we don't run the c++ code 
for merging in catalog).

updated revision goes with partitioning values as more stable partition names. 
the scheme is for catalogd to send everything then let the impalad remove whats 
needed, which is expected to be small (so we're usually sending most partition 
stats). the issue with versions is avoided, but on the impalad, we do detect if 
partitions have been added/removed and fail if we see we're out-of-sync in this 
regard.


http://gerrit.cloudera.org:8080/#/c/11193/7/fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java
File fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java:

http://gerrit.cloudera.org:8080/#/c/11193/7/fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java@452
PS7, Line 452: ionStats != null)
> nit: the format for this should be /*excludedPartitions=*/  so that ErrorPr
Done


http://gerrit.cloudera.org:8080/#/c/11193/7/fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java@454
PS7, Line 454: te_col_stats.isEmpty() && !tableIsMissingColStats;
             :           if (!hasStat
> nit: combine
Done


http://gerrit.cloudera.org:8080/#/c/11193/7/fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java@610
PS7, Line 610: if (exclude
> nit: i think it's worth writing 'expected' instead of 'exp' since my mind w
Done


http://gerrit.cloudera.org:8080/#/c/11193/7/fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java@644
PS7, Line 644:     } catch (Exception e) {
> it seems like the scoping in this function could be tightened up a bit by g
Done


http://gerrit.cloudera.org:8080/#/c/11193/7/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java:

http://gerrit.cloudera.org:8080/#/c/11193/7/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@410
PS7, Line 410: able_v
> missing the '%s' here. Side note: not sure why Impala likes to use String.f
Done. not sure. perhaps it was a habit formed when concat was considered slower 
than alternatives. when I use this I'm primarily sticking to what's in the 
current file. if we want to change that, I'd prefer to do change all uses per 
file, but would prefer to avoid that for now to avoid conflicts.


http://gerrit.cloudera.org:8080/#/c/11193/7/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

http://gerrit.cloudera.org:8080/#/c/11193/7/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@1742
PS7, Line 1742: ponding to
> this javadoc is a bit incomplete since it also depends how many partitions
Done


http://gerrit.cloudera.org:8080/#/c/11193/7/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@1744
PS7, Line 1744: erwis
> nit: "shouldSend..." so that it's a question rather than a verb. To me, "ch
Done


http://gerrit.cloudera.org:8080/#/c/11193/7/fe/src/main/java/org/apache/impala/catalog/local/LocalCatalog.java
File fe/src/main/java/org/apache/impala/catalog/local/LocalCatalog.java:

http://gerrit.cloudera.org:8080/#/c/11193/7/fe/src/main/java/org/apache/impala/catalog/local/LocalCatalog.java@211
PS7, Line 211: OperationException("N/A");
             :   }
> did you mean the opposite of this?
whoops, missed a "not"



--
To view, visit http://gerrit.cloudera.org:8080/11193
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9d564808ca5157afe4e091909ca6cdac76e60d6e
Gerrit-Change-Number: 11193
Gerrit-PatchSet: 4
Gerrit-Owner: Vuk Ercegovac <[email protected]>
Gerrit-Reviewer: Bharath Vissapragada <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Tianyi Wang <[email protected]>
Gerrit-Reviewer: Todd Lipcon <[email protected]>
Gerrit-Reviewer: Vuk Ercegovac <[email protected]>
Gerrit-Comment-Date: Thu, 30 Aug 2018 03:01:56 +0000
Gerrit-HasComments: Yes

Reply via email to