[ 
https://issues.apache.org/jira/browse/IMPALA-7322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16931781#comment-16931781
 ] 

ASF subversion and git services commented on IMPALA-7322:
---------------------------------------------------------

Commit 7136e8b965bd0df974dccd1419ea65d42c494c06 in impala's branch 
refs/heads/master from Yongzhi Chen
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=7136e8b ]

IMPALA-7322: Add storage wait time to profile

Add metrics to record storage wait time for operations that load
metadata in the catalog for HDFS, Kudu and HBase tables.
Pass the storage wait time from the catalog to the frontend through
Thrift and log the total storage load time in the query profile.
Storage-load-time is the amount of time spent loading metadata
from the underlying storage layer (e.g. S3, HDFS, Kudu, HBase);
it does not include the time spent loading data from HMS.

Testing:
* Ran queries that trigger loading of all, none, or some of the
  related tables.
* Checked the query profile for each query.
* Checked the catalog metrics for each table.
* Added unit tests to test_observability.py.
* Ran all core tests.

Sample output:

Profile for Catalog V1 (storage-load-time is the added property and
it is part of Metadata load in Query Compilation).
Sample taken after running an HBase query (the "Metadata load finished"
event is split across several lines because of the commit message
line-length limit):

Query Compilation: 4s401ms
  - Metadata load started: 661.084us (661.084us)
  - Metadata load finished. loaded-tables=1/1
      load-requests=1 catalog-updates=3
      storage-load-time=233ms: 3s819ms (3s819ms)
  - Analysis finished: 3s820ms (763.979us)
  - Value transfer graph computed: 3s820ms (63.193us)
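
A minimal sketch, in Python, of how the storage-load-time value could be
read back out of a Catalog V1 timeline like the sample above. The helper
names (parse_duration_ms, storage_load_time_ms) are hypothetical and are
not part of Impala or its test suite; the duration grammar is an
assumption inferred from the sample values (e.g. 3s819ms, 661.084us).

```python
import re

# Unit suffixes assumed from the sample profile, mapped to milliseconds.
_UNIT_TO_MS = {
    "h": 3600000.0,
    "m": 60000.0,
    "s": 1000.0,
    "ms": 1.0,
    "us": 0.001,
    "ns": 0.000001,
}

# "ms", "us" and "ns" are listed before "m" and "s" so that the longer
# suffix wins when both could match.
_UNIT = r"(?:ms|us|ns|h|m|s)"
_DURATION_PART = re.compile(r"(\d+(?:\.\d+)?)(" + _UNIT[3:-1] + r")")
_DURATION_FULL = re.compile(r"(?:\d+(?:\.\d+)?" + _UNIT + r")+")


def parse_duration_ms(text):
    """Convert a pretty-printed duration such as '3s819ms' to milliseconds."""
    if not _DURATION_FULL.fullmatch(text):
        raise ValueError("unrecognized duration: %r" % text)
    return sum(float(value) * _UNIT_TO_MS[unit]
               for value, unit in _DURATION_PART.findall(text))


def storage_load_time_ms(profile_text):
    """Return storage-load-time in milliseconds, or None if it is absent."""
    match = re.search(r"storage-load-time=([0-9.a-z]+)", profile_text)
    if match is None:
        return None
    return parse_duration_ms(match.group(1))


# Timeline copied from the sample output above.
SAMPLE_TIMELINE = """\
Query Compilation: 4s401ms
  - Metadata load started: 661.084us (661.084us)
  - Metadata load finished. loaded-tables=1/1
      load-requests=1 catalog-updates=3
      storage-load-time=233ms: 3s819ms (3s819ms)
  - Analysis finished: 3s820ms (763.979us)
"""

if __name__ == "__main__":
    print(storage_load_time_ms(SAMPLE_TIMELINE))
```

Returning None when the property is missing lets a caller distinguish
"no storage load reported" from a load that took zero time.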

Profile for Catalog V2: (StorageLoad.Time is the added property and it
is in CatalogFetch):

    Frontend:
       - CatalogFetch.ColumnStats.Misses: 1
       - CatalogFetch.ColumnStats.Requests: 1
       - CatalogFetch.ColumnStats.Time: 0
       - CatalogFetch.Config.Misses: 1
       - CatalogFetch.Config.Requests: 1
       - CatalogFetch.Config.Time: 3ms
       - CatalogFetch.DatabaseList.Hits: 1
       - CatalogFetch.DatabaseList.Requests: 1
       - CatalogFetch.DatabaseList.Time: 0
       - CatalogFetch.PartitionLists.Misses: 1
       - CatalogFetch.PartitionLists.Requests: 1
       - CatalogFetch.PartitionLists.Time: 4ms
       - CatalogFetch.Partitions.Hits: 2
       - CatalogFetch.Partitions.Misses: 1
       - CatalogFetch.Partitions.Requests: 3
       - CatalogFetch.Partitions.Time: 1ms
       - CatalogFetch.RPCs.Bytes: 1.01 KB (1036)
       - CatalogFetch.RPCs.Requests: 4
       - CatalogFetch.RPCs.Time: 93ms
       - CatalogFetch.StorageLoad.Time: 68ms
       - CatalogFetch.TableNames.Hits: 2
       - CatalogFetch.TableNames.Requests: 2
       - CatalogFetch.TableNames.Time: 0
       - CatalogFetch.Tables.Misses: 1
       - CatalogFetch.Tables.Requests: 1
       - CatalogFetch.Tables.Time: 91ms
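
For Catalog V2, the counters above could be collected into a dictionary
so that CatalogFetch.StorageLoad.Time can be checked next to the other
CatalogFetch values. This is only an illustrative sketch: the function
name parse_catalog_fetch_counters is hypothetical, and the line layout
("- name: value") is assumed from the sample shown here.

```python
import re

# Matches lines of the form "   - CatalogFetch.Some.Name: value".
_COUNTER_LINE = re.compile(r"^\s*-\s*(CatalogFetch\.[\w.]+):\s*(.+?)\s*$")


def parse_catalog_fetch_counters(profile_text):
    """Map each CatalogFetch counter name to its raw string value."""
    counters = {}
    for line in profile_text.splitlines():
        match = _COUNTER_LINE.match(line)
        if match:
            counters[match.group(1)] = match.group(2)
    return counters


# A few lines copied from the sample output above.
SAMPLE_FRONTEND = """\
    Frontend:
       - CatalogFetch.RPCs.Bytes: 1.01 KB (1036)
       - CatalogFetch.RPCs.Requests: 4
       - CatalogFetch.RPCs.Time: 93ms
       - CatalogFetch.StorageLoad.Time: 68ms
       - CatalogFetch.Tables.Time: 91ms
"""

if __name__ == "__main__":
    counters = parse_catalog_fetch_counters(SAMPLE_FRONTEND)
    print(counters["CatalogFetch.StorageLoad.Time"])
```

Values are kept as raw strings because the counters mix units (byte
sizes, counts and durations).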

Catalog metrics (this sample is from an HDFS table):

    storage-metadata-load-duration:
       Count: 1
       Mean rate: 0.0085
       1 min. rate: 0.032
       5 min. rate: 0.1386
       15 min. rate: 0.177
       Min (msec): 111
       Max (msec): 111
       Mean (msec): 111.1802
       Median (msec): 111.1802
       75th-% (msec): 111.1802
       95th-% (msec): 111.1802
       99th-% (msec): 111.1802

Change-Id: I7447f8c8e7e50eb71d18643859d2e3de865368d2
Reviewed-on: http://gerrit.cloudera.org:8080/13786
Tested-by: Impala Public Jenkins <[email protected]>
Reviewed-by: Sahil Takiar <[email protected]>


> Add storage wait time to profile for operations with metadata load
> ------------------------------------------------------------------
>
>                 Key: IMPALA-7322
>                 URL: https://issues.apache.org/jira/browse/IMPALA-7322
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Frontend
>    Affects Versions: Impala 3.0, Impala 2.12.0
>            Reporter: Balazs Jeszenszky
>            Assignee: Yongzhi Chen
>            Priority: Major
>
> The profile of a REFRESH or of the query triggering metadata load should 
> point out how much time was spent waiting for source systems.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)

