Quanlong Huang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14611 )

Change subject: IMPALA-9110: Add table loading time break-down metrics for 
HdfsTable
......................................................................


Patch Set 1:

(3 comments)

I think we missed the loading time of msDb and msTbl objects. We use 
msClient.getHiveClient().getTable(dbName, tblName) in many places to fetch the 
msTbl object from HMS. We also care about time spent in those places and it's 
the real schema-loading-time. May need some refactor to measure this time since 
it's here and there in our code base.

http://gerrit.cloudera.org:8080/#/c/14611/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/14611/1//COMMIT_MSG@13
PS1, Line 13: We added "table-schema-load-duration", "all-column-stats-load
            : -duration", "all-column-stats-load-duration"
Can we also measure the queue time for loading a table? E.g. in 
TableLoadingMgr.loadAsync, we also want to know how long it takes to start 
loading the table after we created the tableLoadTask. This is the code path 
when catalogd launchs or receives a prioritizeLoad request from coordinators.


http://gerrit.cloudera.org:8080/#/c/14611/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

http://gerrit.cloudera.org:8080/#/c/14611/1/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@1229
PS1, Line 1229:   private void 
loadSchema(org.apache.hadoop.hive.metastore.api.Table msTbl)
I think this function does not invoke any HMS RPCs so the time will be very 
short. What we care more is the time to get the msTbl object, e.g. 
https://github.com/apache/impala/blob/50e0fbf889465eb288a5ef69477ccbc627681105/fe/src/main/java/org/apache/impala/catalog/TableLoader.java#L68

Table schema is the required fields of msTbl object (inside StorageDescriptor). 
So we already loaded it after we get the msTbl object. This function just 
extracts the schema. The schema loading time should be the time fetching the 
msTbl object.


http://gerrit.cloudera.org:8080/#/c/14611/1/fe/src/main/java/org/apache/impala/catalog/Table.java
File fe/src/main/java/org/apache/impala/catalog/Table.java:

http://gerrit.cloudera.org:8080/#/c/14611/1/fe/src/main/java/org/apache/impala/catalog/Table.java@30
PS1, Line 30: import com.codahale.metrics.Timer;
nit: looks like we import this in below section just before "import 
com.google.common.xxx" in other files.



--
To view, visit http://gerrit.cloudera.org:8080/14611
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5381f9316df588b2004876c6cd9fb7e674085b10
Gerrit-Change-Number: 14611
Gerrit-PatchSet: 1
Gerrit-Owner: Jiawei Wang <jiawei.w...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <huangquanl...@gmail.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vih...@cloudera.com>
Gerrit-Comment-Date: Mon, 04 Nov 2019 03:53:37 +0000
Gerrit-HasComments: Yes

Reply via email to