Dimitris Tsirogiannis has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8529 )

Change subject: [PREVIEW] IMPALA-4886: Expose table metrics in the catalog web 
UI.
......................................................................


Patch Set 1:

(19 comments)

http://gerrit.cloudera.org:8080/#/c/8529/1/be/src/catalog/catalog-server.cc
File be/src/catalog/catalog-server.cc:

http://gerrit.cloudera.org:8080/#/c/8529/1/be/src/catalog/catalog-server.cc@391
PS1, Line 391: DCHECK_EQ(catalog_usage_result.large_tables.size(),
             :       catalog_usage_result.memory_estimates.size());
             :   DCHECK_EQ(catalog_usage_result.frequent_tables.size(),
             :       catalog_usage_result.num_metadata_operations.size());
> can be removed if using a struct instead of multiple arrays
Done
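
For illustration, the struct-based layout suggested above might look like the following minimal sketch. It is not the code from the patch: the class name TableUsageSketch, the entry type TableMemoryUsage, and its fields are hypothetical. The point is that one entry carries both the table name and its metric, so there is no second list whose size has to be checked.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration; not the Thrift-generated types from the patch.
final class TableUsageSketch {
    // One entry pairs a table name with its metric, replacing two parallel lists.
    static final class TableMemoryUsage {
        final String tableName;
        final long memoryEstimateBytes;

        TableMemoryUsage(String tableName, long memoryEstimateBytes) {
            this.tableName = tableName;
            this.memoryEstimateBytes = memoryEstimateBytes;
        }
    }

    // Builds a small sample list; the names and sizes are made up.
    static List<TableMemoryUsage> sample() {
        List<TableMemoryUsage> largeTables = new ArrayList<>();
        largeTables.add(new TableMemoryUsage("db.t1", 1024L));
        largeTables.add(new TableMemoryUsage("db.t2", 2048L));
        return largeTables;
    }
}
```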


http://gerrit.cloudera.org:8080/#/c/8529/1/common/thrift/JniCatalog.thrift
File common/thrift/JniCatalog.thrift:

http://gerrit.cloudera.org:8080/#/c/8529/1/common/thrift/JniCatalog.thrift@602
PS1, Line 602:   1: required list<string> large_tables
> Is the idea that len(large_tables)==len(memory_estimates), and likewise len
Done


http://gerrit.cloudera.org:8080/#/c/8529/1/fe/pom.xml
File fe/pom.xml:

http://gerrit.cloudera.org:8080/#/c/8529/1/fe/pom.xml@365
PS1, Line 365:       <artifactId>metrics-core</artifactId>
> Thanks. This is the right thing to use in my experience.
Good to hear :)


http://gerrit.cloudera.org:8080/#/c/8529/1/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java:

http://gerrit.cloudera.org:8080/#/c/8529/1/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@1091
PS1, Line 1091: REFRESH
> why not sync this name to RELOAD?
Good point. However, I wanted the name to correspond to the high-level 
operation being performed, which is REFRESH. I agree that the name may not be 
very accurate in this function.


http://gerrit.cloudera.org:8080/#/c/8529/1/fe/src/main/java/org/apache/impala/catalog/CatalogUsageMonitor.java
File fe/src/main/java/org/apache/impala/catalog/CatalogUsageMonitor.java:

http://gerrit.cloudera.org:8080/#/c/8529/1/fe/src/main/java/org/apache/impala/catalog/CatalogUsageMonitor.java@27
PS1, Line 27: Sigleton
> nit(spelling): Singleton
Done


http://gerrit.cloudera.org:8080/#/c/8529/1/fe/src/main/java/org/apache/impala/catalog/CatalogUsageMonitor.java@36
PS1, Line 36:   // TODO: Consider making it a configurable parameter.
> A somewhat cheap way to do this is to use:
I like this idea. Done


http://gerrit.cloudera.org:8080/#/c/8529/1/fe/src/main/java/org/apache/impala/catalog/CatalogUsageMonitor.java@48
PS1, Line 48: true
> why always evict for this one?
I just wanted a cheap way to make sure that frequently accessed tables from the 
past didn't prevent newly accessed tables from ever being inserted in the 
cache. A more elaborate scheme could be time-based rank reduction or something 
like that.
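
To make the always-evict idea concrete, here is a minimal sketch. It is not the TopNCache code from this patch: the class name TopNSketch, its Entry type, and the put/names methods are hypothetical, and the real class may rank and evict differently. With alwaysEvict set, a new item replaces the lowest-ranked entry once the cache is full, even if the new item ranks lower.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical sketch; not the TopNCache class from the patch.
final class TopNSketch {
    // Hypothetical entry: a table name and its rank (e.g. an operation count).
    static final class Entry {
        final String name;
        final long rank;

        Entry(String name, long rank) {
            this.name = name;
            this.rank = rank;
        }
    }

    private final int capacity;
    private final boolean alwaysEvict;
    // Min-heap on rank, so the lowest-ranked entry sits at the head.
    private final PriorityQueue<Entry> heap =
        new PriorityQueue<>(Comparator.comparingLong((Entry e) -> e.rank));

    TopNSketch(int capacity, boolean alwaysEvict) {
        if (capacity <= 0) throw new IllegalArgumentException("capacity must be > 0");
        this.capacity = capacity;
        this.alwaysEvict = alwaysEvict;
    }

    void put(String name, long rank) {
        // Drop any stale entry for the same name before re-inserting.
        heap.removeIf(e -> e.name.equals(name));
        if (heap.size() < capacity) {
            heap.add(new Entry(name, rank));
            return;
        }
        Entry lowest = heap.peek();
        // Always-evict: replace the lowest-ranked entry unconditionally.
        // Otherwise: replace it only if the new item ranks higher.
        if (alwaysEvict || rank > lowest.rank) {
            heap.poll();
            heap.add(new Entry(name, rank));
        }
    }

    // Returns the cached names in no particular order.
    List<String> names() {
        List<String> out = new ArrayList<>();
        for (Entry e : heap) out.add(e.name);
        return out;
    }
}
```

In this sketch, a cache of size 2 that already holds two heavily used tables still admits a newly accessed table with a low count when alwaysEvict is true, and rejects it when alwaysEvict is false.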


http://gerrit.cloudera.org:8080/#/c/8529/1/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
File fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java:

http://gerrit.cloudera.org:8080/#/c/8529/1/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java@923
PS1, Line 923: HdfsTable.StorageStats stats,
             :       Reference<Boolean> hasIncrementalStats
> this is a surprising api since it modifies the last two args and does a bit
Done


http://gerrit.cloudera.org:8080/#/c/8529/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

http://gerrit.cloudera.org:8080/#/c/8529/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@197
PS1, Line 197: al stats
> this seems to only be written, not read. can it be removed or will it be us
It's read in L2248.


http://gerrit.cloudera.org:8080/#/c/8529/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@220
PS1, Line 220: table or partition
             :   // level.
> unclear what this is trying to say. is this: "aggregated table wide at the
It means that an instance of this class can be used for stats aggregated at the 
table or partition level. Reworked the comment a bit. Let me know if it's clear 
now.


http://gerrit.cloudera.org:8080/#/c/8529/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@256
PS1, Line 256: from
> nit: by
Done


http://gerrit.cloudera.org:8080/#/c/8529/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@1735
PS1, Line 1735: hasIncrementalStats.getRef()
> there's a lot going on here-- surprising to see that its tested given that
Code changed a bit. Let me know if it's better now.


http://gerrit.cloudera.org:8080/#/c/8529/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@1737
PS1, Line 1737: hasIncrementalStats_
> so incremental stats is a table-wide property and stored per partition? so
Yes, so the 'hasIncrementalStats_' field simply indicates that some (not 
necessarily all) partitions have incremental stats. The answer to the last 
question is yes. Say you create a table, add 10 partitions and run incremental 
stats. Then you add 10 more partitions. Until you run incremental stats again, 
those last partitions will not have incremental stats stored.
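
The "some, not necessarily all" semantics can be sketched as follows. This is only an illustration under assumed names: IncrementalStatsSketch, Partition, and the helper methods are hypothetical and do not mirror HdfsTable or HdfsPartition. The table-level flag is true as soon as any partition has incremental stats, while individual partitions may still lack them.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; not the HdfsTable/HdfsPartition code from the patch.
final class IncrementalStatsSketch {
    static final class Partition {
        final String name;
        final boolean hasIncrementalStats;

        Partition(String name, boolean hasIncrementalStats) {
            this.name = name;
            this.hasIncrementalStats = hasIncrementalStats;
        }
    }

    // Table-level flag: true if at least one partition has incremental stats.
    static boolean anyHasIncrementalStats(List<Partition> partitions) {
        for (Partition p : partitions) {
            if (p.hasIncrementalStats) return true;
        }
        return false;
    }

    // Number of partitions that still lack incremental stats.
    static int countMissing(List<Partition> partitions) {
        int missing = 0;
        for (Partition p : partitions) {
            if (!p.hasIncrementalStats) missing++;
        }
        return missing;
    }

    // Mirrors the scenario above: 10 partitions with stats, then 10 added later without.
    static List<Partition> example() {
        List<Partition> partitions = new ArrayList<>();
        for (int i = 0; i < 10; i++) partitions.add(new Partition("p" + i, true));
        for (int i = 10; i < 20; i++) partitions.add(new Partition("p" + i, false));
        return partitions;
    }
}
```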


http://gerrit.cloudera.org:8080/#/c/8529/1/fe/src/main/java/org/apache/impala/util/TopNCache.java
File fe/src/main/java/org/apache/impala/util/TopNCache.java:

http://gerrit.cloudera.org:8080/#/c/8529/1/fe/src/main/java/org/apache/impala/util/TopNCache.java@35
PS1, Line 35: Always evict policy
> why is this needed? is it to reflect some sort of recency?
Yeah, see the comment in CatalogUsageMonitor.


http://gerrit.cloudera.org:8080/#/c/8529/1/fe/src/main/java/org/apache/impala/util/TopNCache.java@94
PS1, Line 94: .
> do you want to enforce a sort order here or at the caller? might be simpler
I don't think there is a need to enforce a sort order. The UI allows you to 
sort on any column (table name, metric value, etc.).


http://gerrit.cloudera.org:8080/#/c/8529/1/www/catalog.tmpl
File www/catalog.tmpl:

http://gerrit.cloudera.org:8080/#/c/8529/1/www/catalog.tmpl@26
PS1, Line 26: Top-25
> if parameterizing N, this would need to change as well. perhaps omit the N?
Done


http://gerrit.cloudera.org:8080/#/c/8529/1/www/catalog.tmpl@34
PS1, Line 34:
> several ws issues here.
Done


http://gerrit.cloudera.org:8080/#/c/8529/1/www/catalog.tmpl@65
PS1, Line 65: Top-25
> same here
Done


http://gerrit.cloudera.org:8080/#/c/8529/1/www/catalog.tmpl@78
PS1, Line 78: {{#frequent_tables}}
> are these sorted (desc) by num operations? the screenshot for this one is n
No, they are not. Using the UI, you can sort based on that metric or the table 
name.



--
To view, visit http://gerrit.cloudera.org:8080/8529
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I37d407979e6d3b1a444b6b6265900b148facde9e
Gerrit-Change-Number: 8529
Gerrit-PatchSet: 1
Gerrit-Owner: Dimitris Tsirogiannis <[email protected]>
Gerrit-Reviewer: Dimitris Tsirogiannis <[email protected]>
Gerrit-Reviewer: Philip Zeyliger <[email protected]>
Gerrit-Reviewer: Vuk Ercegovac <[email protected]>
Gerrit-Comment-Date: Thu, 30 Nov 2017 20:43:48 +0000
Gerrit-HasComments: Yes
