[
https://issues.apache.org/jira/browse/IMPALA-7322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16931781#comment-16931781
]
ASF subversion and git services commented on IMPALA-7322:
---------------------------------------------------------
Commit 7136e8b965bd0df974dccd1419ea65d42c494c06 in impala's branch
refs/heads/master from Yongzhi Chen
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=7136e8b ]
IMPALA-7322: Add storage wait time to profile
Add metrics that record storage wait time for operations that
load metadata in the catalog for HDFS, Kudu, and HBase tables.
Pass the storage wait time from the catalog to the frontend (fe)
through Thrift and log the total storage load time in the query profile.
Storage-load-time is the amount of time spent loading metadata
from the underlying storage layer (e.g. S3, HDFS, Kudu, HBase),
which does not include the amount of time spent loading data
from HMS.
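The distinction above (time in storage calls counted separately from the rest of the metadata load) can be illustrated with a small sketch. This is not Impala's implementation, which lives in the Java catalog code; the class name `StorageLoadTimer`, the `storage_call` method, and the injectable `clock` parameter are all hypothetical names chosen for illustration.

```python
import time
from contextlib import contextmanager


class StorageLoadTimer:
    """Hypothetical helper: accumulates time spent in storage calls only."""

    def __init__(self, clock=time.monotonic):
        # The clock is injectable so the timer can be tested deterministically.
        self._clock = clock
        self.storage_seconds = 0.0

    @contextmanager
    def storage_call(self):
        """Wrap a call to the storage layer (e.g. S3, HDFS, Kudu, HBase)."""
        start = self._clock()
        try:
            yield
        finally:
            # Only time spent inside this block is added to the total, so
            # work done outside it (such as HMS calls) is not counted.
            self.storage_seconds += self._clock() - start
```

Under these assumptions, a loader would wrap each storage access in `with timer.storage_call():` and leave HMS calls outside the block, so `timer.storage_seconds` reflects storage wait time alone.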
Testing:
* Ran queries that trigger metadata loading for all, none, or
some of the related tables.
* Checked the query profile for each query.
* Checked the catalog metrics for each table.
* Added unit tests to test_observability.py
* Ran all core tests.
Sample output:
Profile for Catalog V1 (storage-load-time is the added property; it
is part of the Metadata load event in Query Compilation).
After running an HBase query (the "Metadata load finished" line is split
across several lines because of commit message length limits):
Query Compilation: 4s401ms
- Metadata load started: 661.084us (661.084us)
- Metadata load finished. loaded-tables=1/1
load-requests=1 catalog-updates=3
storage-load-time=233ms: 3s819ms (3s819ms)
- Analysis finished: 3s820ms (763.979us)
- Value transfer graph computed: 3s820ms (63.193us)
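As a rough sketch of how the new property could be read back out of a Catalog V1 profile, the following helper extracts `storage-load-time` and converts it to milliseconds. The function names `duration_to_ms` and `storage_load_time_ms` are hypothetical, and the set of duration units is an assumption based on the values shown in the sample above.

```python
import re

# Duration units in milliseconds. The unit set is an assumption based on
# values such as "3s819ms" and "661.084us" in the sample profile.
_UNIT_MS = {"h": 3600000.0, "m": 60000.0, "s": 1000.0,
            "ms": 1.0, "us": 0.001, "ns": 0.000001}


def duration_to_ms(text):
    """Convert a duration such as '3s819ms' or '661.084us' to milliseconds."""
    total = 0.0
    # Two-letter units are listed first so "ms" is not read as "m" + "s".
    for value, unit in re.findall(r"(\d+(?:\.\d+)?)(ms|us|ns|h|m|s)", text):
        total += float(value) * _UNIT_MS[unit]
    return total


def storage_load_time_ms(profile_text):
    """Return storage-load-time in ms from profile text, or None if absent."""
    match = re.search(r"storage-load-time=([0-9a-z.]+)", profile_text)
    return duration_to_ms(match.group(1)) if match else None


sample = ("- Metadata load finished. loaded-tables=1/1 load-requests=1 "
          "catalog-updates=3 storage-load-time=233ms: 3s819ms (3s819ms)")
print(storage_load_time_ms(sample))  # 233.0
```

The sample string joins the wrapped lines from the commit message back into one line, which is how the event is described as appearing in an actual profile.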
Profile for Catalog V2 (StorageLoad.Time is the added property; it
is listed under CatalogFetch):
Frontend:
- CatalogFetch.ColumnStats.Misses: 1
- CatalogFetch.ColumnStats.Requests: 1
- CatalogFetch.ColumnStats.Time: 0
- CatalogFetch.Config.Misses: 1
- CatalogFetch.Config.Requests: 1
- CatalogFetch.Config.Time: 3ms
- CatalogFetch.DatabaseList.Hits: 1
- CatalogFetch.DatabaseList.Requests: 1
- CatalogFetch.DatabaseList.Time: 0
- CatalogFetch.PartitionLists.Misses: 1
- CatalogFetch.PartitionLists.Requests: 1
- CatalogFetch.PartitionLists.Time: 4ms
- CatalogFetch.Partitions.Hits: 2
- CatalogFetch.Partitions.Misses: 1
- CatalogFetch.Partitions.Requests: 3
- CatalogFetch.Partitions.Time: 1ms
- CatalogFetch.RPCs.Bytes: 1.01 KB (1036)
- CatalogFetch.RPCs.Requests: 4
- CatalogFetch.RPCs.Time: 93ms
- CatalogFetch.StorageLoad.Time: 68ms
- CatalogFetch.TableNames.Hits: 2
- CatalogFetch.TableNames.Requests: 2
- CatalogFetch.TableNames.Time: 0
- CatalogFetch.Tables.Misses: 1
- CatalogFetch.Tables.Requests: 1
- CatalogFetch.Tables.Time: 91ms
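For the Catalog V2 layout, a minimal sketch like the one below could collect the `CatalogFetch.*` counters into a dictionary so that `CatalogFetch.StorageLoad.Time` can be looked up directly. The function name `catalog_fetch_counters` is hypothetical, and the line format is assumed to match the sample above (`- <name>: <value>`).

```python
import re


def catalog_fetch_counters(profile_text):
    """Map 'CatalogFetch.<name>' counter names to their raw string values."""
    counters = {}
    for line in profile_text.splitlines():
        # Expected shape (assumed from the sample): "- CatalogFetch.X.Y: value"
        match = re.match(r"\s*-\s*(CatalogFetch\.[\w.]+):\s*(.+?)\s*$", line)
        if match:
            counters[match.group(1)] = match.group(2)
    return counters


sample = """Frontend:
 - CatalogFetch.RPCs.Bytes: 1.01 KB (1036)
 - CatalogFetch.RPCs.Time: 93ms
 - CatalogFetch.StorageLoad.Time: 68ms
"""
counters = catalog_fetch_counters(sample)
print(counters["CatalogFetch.StorageLoad.Time"])  # 68ms
```

Values are kept as raw strings because the counters mix durations, counts, and byte sizes; converting them is left to the caller.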
Catalog metrics (this sample is from an HDFS table):
storage-metadata-load-duration:
Count: 1
Mean rate: 0.0085
1 min. rate: 0.032
5 min. rate: 0.1386
15 min. rate: 0.177
Min (msec): 111
Max (msec): 111
Mean (msec): 111.1802
Median (msec): 111.1802
75th-% (msec): 111.1802
95th-% (msec): 111.1802
99th-% (msec): 111.1802
Change-Id: I7447f8c8e7e50eb71d18643859d2e3de865368d2
Reviewed-on: http://gerrit.cloudera.org:8080/13786
Tested-by: Impala Public Jenkins <[email protected]>
Reviewed-by: Sahil Takiar <[email protected]>
> Add storage wait time to profile for operations with metadata load
> ------------------------------------------------------------------
>
> Key: IMPALA-7322
> URL: https://issues.apache.org/jira/browse/IMPALA-7322
> Project: IMPALA
> Issue Type: Improvement
> Components: Frontend
> Affects Versions: Impala 3.0, Impala 2.12.0
> Reporter: Balazs Jeszenszky
> Assignee: Yongzhi Chen
> Priority: Major
>
> The profile of a REFRESH or of the query triggering metadata load should
> point out how much time was spent waiting for source systems.