Dimitris Tsirogiannis has posted comments on this change. ( http://gerrit.cloudera.org:8080/8529 )

Change subject: [PREVIEW] IMPALA-4886: Expose table metrics in the catalog web UI.
......................................................................


Patch Set 1:

(19 comments)

http://gerrit.cloudera.org:8080/#/c/8529/1/be/src/catalog/catalog-server.cc
File be/src/catalog/catalog-server.cc:

http://gerrit.cloudera.org:8080/#/c/8529/1/be/src/catalog/catalog-server.cc@391
PS1, Line 391: DCHECK_EQ(catalog_usage_result.large_tables.size(),
             : catalog_usage_result.memory_estimates.size());
             : DCHECK_EQ(catalog_usage_result.frequent_tables.size(),
             : catalog_usage_result.num_metadata_operations.size());
> can be removed if using a struct instead of multiple arrays
Done


http://gerrit.cloudera.org:8080/#/c/8529/1/common/thrift/JniCatalog.thrift
File common/thrift/JniCatalog.thrift:

http://gerrit.cloudera.org:8080/#/c/8529/1/common/thrift/JniCatalog.thrift@602
PS1, Line 602: 1: required list<string> large_tables
> Is the idea that len(large_tables)==len(memory_estimates), and likewise len
Done


http://gerrit.cloudera.org:8080/#/c/8529/1/fe/pom.xml
File fe/pom.xml:

http://gerrit.cloudera.org:8080/#/c/8529/1/fe/pom.xml@365
PS1, Line 365: <artifactId>metrics-core</artifactId>
> Thanks. This is the right thing to use in my experience.
Good to hear :)


http://gerrit.cloudera.org:8080/#/c/8529/1/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java:

http://gerrit.cloudera.org:8080/#/c/8529/1/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@1091
PS1, Line 1091: REFRESH
> why not sync this name to RELOAD?
Good point. However, I wanted the name to correspond to the high-level operation being performed, which is REFRESH. I agree that the naming may not be very accurate in this function.


http://gerrit.cloudera.org:8080/#/c/8529/1/fe/src/main/java/org/apache/impala/catalog/CatalogUsageMonitor.java
File fe/src/main/java/org/apache/impala/catalog/CatalogUsageMonitor.java:

http://gerrit.cloudera.org:8080/#/c/8529/1/fe/src/main/java/org/apache/impala/catalog/CatalogUsageMonitor.java@27
PS1, Line 27: Sigleton
> nit(spelling): Singleton
Done


http://gerrit.cloudera.org:8080/#/c/8529/1/fe/src/main/java/org/apache/impala/catalog/CatalogUsageMonitor.java@36
PS1, Line 36: // TODO: Consider making it a configurable parameter.
> A somewhat cheap way to do this is to use:
I like this idea. Done


http://gerrit.cloudera.org:8080/#/c/8529/1/fe/src/main/java/org/apache/impala/catalog/CatalogUsageMonitor.java@48
PS1, Line 48: true
> why always evict for this one?
I just wanted a cheap way to make sure that tables that were frequently accessed in the past don't prevent newly accessed tables from ever being inserted in the cache. A more elaborate scheme could be time-based rank reduction or something like that.


http://gerrit.cloudera.org:8080/#/c/8529/1/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
File fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java:

http://gerrit.cloudera.org:8080/#/c/8529/1/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java@923
PS1, Line 923: HdfsTable.StorageStats stats,
             : Reference<Boolean> hasIncrementalStats
> this is a surprising api since it modifies the last two args and does a bit
Done


http://gerrit.cloudera.org:8080/#/c/8529/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

http://gerrit.cloudera.org:8080/#/c/8529/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@197
PS1, Line 197: al stats
> this seems to only be written, not read. can it be removed or will it be us
It's read in L2248.

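
To make the always-evict reply for CatalogUsageMonitor.java line 48 concrete, here is a minimal Java sketch of a bounded top-N structure with that policy. It is not the TopNCache from this patch: the class name TopNSketch, the method names, and the rank-function parameter are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;
import java.util.function.ToLongFunction;

/** Hypothetical sketch of a bounded top-N structure; names are illustrative only. */
final class TopNSketch<T> {
  private final int capacity;  // must be >= 1
  private final boolean alwaysEvictAtCapacity;
  private final ToLongFunction<T> rank;
  // Min-heap on rank: the lowest-ranked tracked item sits at the head.
  private final PriorityQueue<T> heap;

  TopNSketch(ToLongFunction<T> rank, int capacity, boolean alwaysEvictAtCapacity) {
    this.rank = rank;
    this.capacity = capacity;
    this.alwaysEvictAtCapacity = alwaysEvictAtCapacity;
    this.heap = new PriorityQueue<>(capacity, Comparator.comparingLong(rank));
  }

  synchronized void putOrUpdate(T item) {
    // Already tracked: remove and re-add it.
    if (heap.remove(item)) {
      heap.add(item);
      return;
    }
    // Room left: admit unconditionally.
    if (heap.size() < capacity) {
      heap.add(item);
      return;
    }
    // Full. Without always-evict, a newcomer that does not outrank the
    // current minimum is dropped.
    if (!alwaysEvictAtCapacity
        && rank.applyAsLong(item) <= rank.applyAsLong(heap.peek())) {
      return;
    }
    // Evict the lowest-ranked entry and admit the newcomer.
    heap.poll();
    heap.add(item);
  }

  /** Returns a snapshot of the tracked items in no particular order. */
  synchronized List<T> listEntries() {
    return new ArrayList<>(heap);
  }
}
```

In this sketch, with the flag set, a newly seen item always displaces the current lowest-ranked entry once the structure is full, so entries that ranked highly in the past cannot permanently keep newcomers out. With the flag unset, a newcomer is admitted only if it outranks the current minimum.
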

http://gerrit.cloudera.org:8080/#/c/8529/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@220
PS1, Line 220: table or partition
             : // level.
> unclear what this is trying to say. is this: "aggregated table wide at the
It means that an instance of this class can be used for stats aggregated at the table or partition level. Reworked the comment a bit. Let me know if it's clear now.


http://gerrit.cloudera.org:8080/#/c/8529/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@256
PS1, Line 256: from
> nit: by
Done


http://gerrit.cloudera.org:8080/#/c/8529/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@1735
PS1, Line 1735: hasIncrementalStats.getRef()
> there's a lot going on here-- surprising to see that its tested given that
Code changed a bit. Let me know if it's better now.


http://gerrit.cloudera.org:8080/#/c/8529/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@1737
PS1, Line 1737: hasIncrementalStats_
> so incremental stats is a table-wide property and stored per partition? so
Yes, the 'hasIncrementalStats_' field simply indicates that some (not necessarily all) partitions have incremental stats. The answer to the last question is yes. Say you create a table, add 10 partitions and run incremental stats. Then you add 10 more partitions. Until you run incremental stats again, those last 10 partitions will not have incremental stats stored.


http://gerrit.cloudera.org:8080/#/c/8529/1/fe/src/main/java/org/apache/impala/util/TopNCache.java
File fe/src/main/java/org/apache/impala/util/TopNCache.java:

http://gerrit.cloudera.org:8080/#/c/8529/1/fe/src/main/java/org/apache/impala/util/TopNCache.java@35
PS1, Line 35: Always evict policy
> why is this needed? is it to reflect some sort of recency?
Yeah, see the comment in CatalogUsageMonitor.


http://gerrit.cloudera.org:8080/#/c/8529/1/fe/src/main/java/org/apache/impala/util/TopNCache.java@94
PS1, Line 94: .
> do you want to enforce a sort order here or at the caller? might be simpler
I don't think there is a need to enforce a sort order. The UI allows you to sort on any column (table name, metric value, etc.).


http://gerrit.cloudera.org:8080/#/c/8529/1/www/catalog.tmpl
File www/catalog.tmpl:

http://gerrit.cloudera.org:8080/#/c/8529/1/www/catalog.tmpl@26
PS1, Line 26: Top-25
> if parameterizing N, this would need to change as well. perhaps omit the N?
Done


http://gerrit.cloudera.org:8080/#/c/8529/1/www/catalog.tmpl@34
PS1, Line 34: 
> several ws issues here.
Done


http://gerrit.cloudera.org:8080/#/c/8529/1/www/catalog.tmpl@65
PS1, Line 65: Top-25
> same here
Done


http://gerrit.cloudera.org:8080/#/c/8529/1/www/catalog.tmpl@78
PS1, Line 78: {{#frequent_tables}}
> are these sorted (desc) by num operations? the screenshot for this one is n
No, they are not. Using the UI you can sort based on that metric or the table name.


-- 
To view, visit http://gerrit.cloudera.org:8080/8529
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I37d407979e6d3b1a444b6b6265900b148facde9e
Gerrit-Change-Number: 8529
Gerrit-PatchSet: 1
Gerrit-Owner: Dimitris Tsirogiannis
Gerrit-Reviewer: Dimitris Tsirogiannis
Gerrit-Reviewer: Philip Zeyliger
Gerrit-Reviewer: Vuk Ercegovac
Gerrit-Comment-Date: Thu, 30 Nov 2017 20:43:48 +0000
Gerrit-HasComments: Yes
