[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile

2019-09-17 Thread Sahil Takiar (Code Review)
Sahil Takiar has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/13786 )

Change subject: IMPALA-7322: Add storage wait time to profile
..

IMPALA-7322: Add storage wait time to profile

Add metrics to record storage wait time for operations with
metadata load in catalog for hdfs, kudu and hbase tables.
Pass storage wait time from catalog to fe through thrift
and log total storage load time in query profile.
Storage-load-time is the amount of time spent loading metadata
from the underlying storage layer (e.g. S3, HDFS, Kudu, HBase),
which does not include the amount of time spending loading data
from HMS.

Testing:
* Ran queries that can trigger all of, none of or
  some of the related tables loading.
* Check query profile for each query.
* Check catalog metrics for each table.
* Add unit tests to test_observability.py
* Ran all core tests.

Sample output:

Profile for Catalog V1: (storage-load-time is the added property and
it is part of Metadata load in Query Compilation):
After ran a hbase query (Metadata load finished is divided into
several lines because of limitation of commit message):

Query Compilation: 4s401ms
  - Metadata load started: 661.084us (661.084us)
  - Metadata load finished. loaded-tables=1/1
  load-requests=1 catalog-updates=3
  storage-load-time=233ms: 3s819ms (3s819ms)
  - Analysis finished: 3s820ms (763.979us)
  - Value transfer graph computed: 3s820ms (63.193us)

Profile for Catalog V2: (StorageLoad.Time is the added property and it
is in CatalogFetch):

Frontend:
   - CatalogFetch.ColumnStats.Misses: 1
   - CatalogFetch.ColumnStats.Requests: 1
   - CatalogFetch.ColumnStats.Time: 0
   - CatalogFetch.Config.Misses: 1
   - CatalogFetch.Config.Requests: 1
   - CatalogFetch.Config.Time: 3ms
   - CatalogFetch.DatabaseList.Hits: 1
   - CatalogFetch.DatabaseList.Requests: 1
   - CatalogFetch.DatabaseList.Time: 0
   - CatalogFetch.PartitionLists.Misses: 1
   - CatalogFetch.PartitionLists.Requests: 1
   - CatalogFetch.PartitionLists.Time: 4ms
   - CatalogFetch.Partitions.Hits: 2
   - CatalogFetch.Partitions.Misses: 1
   - CatalogFetch.Partitions.Requests: 3
   - CatalogFetch.Partitions.Time: 1ms
   - CatalogFetch.RPCs.Bytes: 1.01 KB (1036)
   - CatalogFetch.RPCs.Requests: 4
   - CatalogFetch.RPCs.Time: 93ms
   - CatalogFetch.StorageLoad.Time: 68ms
   - CatalogFetch.TableNames.Hits: 2
   - CatalogFetch.TableNames.Requests: 2
   - CatalogFetch.TableNames.Time: 0
   - CatalogFetch.Tables.Misses: 1
   - CatalogFetch.Tables.Requests: 1
   - CatalogFetch.Tables.Time: 91ms

Catalog metrics(this sample is from a hdfs table):

storage-metadata-load-duration:
   Count: 1
   Mean rate: 0.0085
   1 min. rate: 0.032
   5 min. rate: 0.1386
   15 min. rate: 0.177
   Min (msec): 111
   Max (msec): 111
   Mean (msec): 111.1802
   Median (msec): 111.1802
   75th-% (msec): 111.1802
   95th-% (msec): 111.1802
   99th-% (msec): 111.1802

Change-Id: I7447f8c8e7e50eb71d18643859d2e3de865368d2
Reviewed-on: http://gerrit.cloudera.org:8080/13786
Tested-by: Impala Public Jenkins 
Reviewed-by: Sahil Takiar 
---
M common/thrift/CatalogObjects.thrift
M common/thrift/CatalogService.thrift
M fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java
M fe/src/main/java/org/apache/impala/catalog/HBaseTable.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/catalog/KuduTable.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
M fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java
M tests/query_test/test_observability.py
9 files changed, 170 insertions(+), 30 deletions(-)

Approvals:
  Impala Public Jenkins: Verified
  Sahil Takiar: Looks good to me, approved

--
To view, visit http://gerrit.cloudera.org:8080/13786
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I7447f8c8e7e50eb71d18643859d2e3de865368d2
Gerrit-Change-Number: 13786
Gerrit-PatchSet: 8
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Yongzhi Chen 


[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile

2019-09-17 Thread Sahil Takiar (Code Review)
Sahil Takiar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13786 )

Change subject: IMPALA-7322: Add storage wait time to profile
..


Patch Set 7: Code-Review+2

Carrying +2


--
To view, visit http://gerrit.cloudera.org:8080/13786
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7447f8c8e7e50eb71d18643859d2e3de865368d2
Gerrit-Change-Number: 13786
Gerrit-PatchSet: 7
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Comment-Date: Tue, 17 Sep 2019 20:08:08 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile

2019-09-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13786 )

Change subject: IMPALA-7322: Add storage wait time to profile
..


Patch Set 7: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/13786
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7447f8c8e7e50eb71d18643859d2e3de865368d2
Gerrit-Change-Number: 13786
Gerrit-PatchSet: 7
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Comment-Date: Tue, 17 Sep 2019 19:27:13 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile

2019-09-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13786 )

Change subject: IMPALA-7322: Add storage wait time to profile
..


Patch Set 7:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/4956/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/13786
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7447f8c8e7e50eb71d18643859d2e3de865368d2
Gerrit-Change-Number: 13786
Gerrit-PatchSet: 7
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Comment-Date: Tue, 17 Sep 2019 15:21:00 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile

2019-09-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13786 )

Change subject: IMPALA-7322: Add storage wait time to profile
..


Patch Set 6:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/4575/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/13786
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7447f8c8e7e50eb71d18643859d2e3de865368d2
Gerrit-Change-Number: 13786
Gerrit-PatchSet: 6
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Comment-Date: Tue, 17 Sep 2019 14:23:20 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile

2019-09-17 Thread Yongzhi Chen (Code Review)
Hello Bharath Vissapragada, Sahil Takiar, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/13786

to look at the new patch set (#6).

Change subject: IMPALA-7322: Add storage wait time to profile
..

IMPALA-7322: Add storage wait time to profile

Add metrics to record storage wait time for operations with
metadata load in catalog for hdfs, kudu and hbase tables.
Pass storage wait time from catalog to fe through thrift
and log total storage load time in query profile.
Storage-load-time is the amount of time spent loading metadata
from the underlying storage layer (e.g. S3, HDFS, Kudu, HBase),
which does not include the amount of time spending loading data
from HMS.

Testing:
* Ran queries that can trigger all of, none of or
  some of the related tables loading.
* Check query profile for each query.
* Check catalog metrics for each table.
* Add unit tests to test_observability.py
* Ran all core tests.

Sample output:

Profile for Catalog V1: (storage-load-time is the added property and
it is part of Metadata load in Query Compilation):
After ran a hbase query (Metadata load finished is divided into
several lines because of limitation of commit message):

Query Compilation: 4s401ms
  - Metadata load started: 661.084us (661.084us)
  - Metadata load finished. loaded-tables=1/1
  load-requests=1 catalog-updates=3
  storage-load-time=233ms: 3s819ms (3s819ms)
  - Analysis finished: 3s820ms (763.979us)
  - Value transfer graph computed: 3s820ms (63.193us)

Profile for Catalog V2: (StorageLoad.Time is the added property and it
is in CatalogFetch):

Frontend:
   - CatalogFetch.ColumnStats.Misses: 1
   - CatalogFetch.ColumnStats.Requests: 1
   - CatalogFetch.ColumnStats.Time: 0
   - CatalogFetch.Config.Misses: 1
   - CatalogFetch.Config.Requests: 1
   - CatalogFetch.Config.Time: 3ms
   - CatalogFetch.DatabaseList.Hits: 1
   - CatalogFetch.DatabaseList.Requests: 1
   - CatalogFetch.DatabaseList.Time: 0
   - CatalogFetch.PartitionLists.Misses: 1
   - CatalogFetch.PartitionLists.Requests: 1
   - CatalogFetch.PartitionLists.Time: 4ms
   - CatalogFetch.Partitions.Hits: 2
   - CatalogFetch.Partitions.Misses: 1
   - CatalogFetch.Partitions.Requests: 3
   - CatalogFetch.Partitions.Time: 1ms
   - CatalogFetch.RPCs.Bytes: 1.01 KB (1036)
   - CatalogFetch.RPCs.Requests: 4
   - CatalogFetch.RPCs.Time: 93ms
   - CatalogFetch.StorageLoad.Time: 68ms
   - CatalogFetch.TableNames.Hits: 2
   - CatalogFetch.TableNames.Requests: 2
   - CatalogFetch.TableNames.Time: 0
   - CatalogFetch.Tables.Misses: 1
   - CatalogFetch.Tables.Requests: 1
   - CatalogFetch.Tables.Time: 91ms

Catalog metrics(this sample is from a hdfs table):

storage-metadata-load-duration:
   Count: 1
   Mean rate: 0.0085
   1 min. rate: 0.032
   5 min. rate: 0.1386
   15 min. rate: 0.177
   Min (msec): 111
   Max (msec): 111
   Mean (msec): 111.1802
   Median (msec): 111.1802
   75th-% (msec): 111.1802
   95th-% (msec): 111.1802
   99th-% (msec): 111.1802

Change-Id: I7447f8c8e7e50eb71d18643859d2e3de865368d2
---
M common/thrift/CatalogObjects.thrift
M common/thrift/CatalogService.thrift
M fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java
M fe/src/main/java/org/apache/impala/catalog/HBaseTable.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/catalog/KuduTable.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
M fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java
M tests/query_test/test_observability.py
9 files changed, 170 insertions(+), 30 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/86/13786/6
--
To view, visit http://gerrit.cloudera.org:8080/13786
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7447f8c8e7e50eb71d18643859d2e3de865368d2
Gerrit-Change-Number: 13786
Gerrit-PatchSet: 6
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Yongzhi Chen 


[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile

2019-09-17 Thread Yongzhi Chen (Code Review)
Yongzhi Chen has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13786 )

Change subject: IMPALA-7322: Add storage wait time to profile
..


Patch Set 6:

patch 6 fix the test: because of the name change of the property, an old test 
was failed.


--
To view, visit http://gerrit.cloudera.org:8080/13786
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7447f8c8e7e50eb71d18643859d2e3de865368d2
Gerrit-Change-Number: 13786
Gerrit-PatchSet: 6
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Comment-Date: Tue, 17 Sep 2019 13:44:41 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile

2019-09-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13786 )

Change subject: IMPALA-7322: Add storage wait time to profile
..


Patch Set 5: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/4952/


--
To view, visit http://gerrit.cloudera.org:8080/13786
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7447f8c8e7e50eb71d18643859d2e3de865368d2
Gerrit-Change-Number: 13786
Gerrit-PatchSet: 5
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Comment-Date: Mon, 16 Sep 2019 21:51:10 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile

2019-09-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13786 )

Change subject: IMPALA-7322: Add storage wait time to profile
..


Patch Set 5:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/4952/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/13786
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7447f8c8e7e50eb71d18643859d2e3de865368d2
Gerrit-Change-Number: 13786
Gerrit-PatchSet: 5
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Comment-Date: Mon, 16 Sep 2019 17:39:52 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile

2019-09-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13786 )

Change subject: IMPALA-7322: Add storage wait time to profile
..


Patch Set 5: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/13786
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7447f8c8e7e50eb71d18643859d2e3de865368d2
Gerrit-Change-Number: 13786
Gerrit-PatchSet: 5
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Comment-Date: Mon, 16 Sep 2019 17:39:51 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile

2019-09-16 Thread Bharath Vissapragada (Code Review)
Bharath Vissapragada has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13786 )

Change subject: IMPALA-7322: Add storage wait time to profile
..


Patch Set 4: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/13786
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7447f8c8e7e50eb71d18643859d2e3de865368d2
Gerrit-Change-Number: 13786
Gerrit-PatchSet: 4
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Comment-Date: Mon, 16 Sep 2019 17:39:26 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile

2019-09-11 Thread Yongzhi Chen (Code Review)
Yongzhi Chen has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13786 )

Change subject: IMPALA-7322: Add storage wait time to profile
..


Patch Set 3:

Re-format the commit message.


--
To view, visit http://gerrit.cloudera.org:8080/13786
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7447f8c8e7e50eb71d18643859d2e3de865368d2
Gerrit-Change-Number: 13786
Gerrit-PatchSet: 3
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Comment-Date: Wed, 11 Sep 2019 14:03:15 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile

2019-09-11 Thread Yongzhi Chen (Code Review)
Hello Bharath Vissapragada, Sahil Takiar, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/13786

to look at the new patch set (#4).

Change subject: IMPALA-7322: Add storage wait time to profile
..

IMPALA-7322: Add storage wait time to profile

Add metrics to record storage wait time for operations with
metadata load in catalog for hdfs, kudu and hbase tables.
Pass storage wait time from catalog to fe through thrift
and log total storage load time in query profile.
Storage-load-time is the amount of time spent loading metadata
from the underlying storage layer (e.g. S3, HDFS, Kudu, HBase),
which does not include the amount of time spending loading data
from HMS.

Testing:
* Ran queries that can trigger all of, none of or
  some of the related tables loading.
* Check query profile for each query.
* Check catalog metrics for each table.
* Add unit tests to test_observability.py
* Ran all core tests.

Sample output:

Profile for Catalog V1: (storage-load-time is the added property and
it is part of Metadata load in Query Compilation):
After ran a hbase query (Metadata load finished is divided into
several lines because of limitation of commit message):

Query Compilation: 4s401ms
  - Metadata load started: 661.084us (661.084us)
  - Metadata load finished. loaded-tables=1/1
  load-requests=1 catalog-updates=3
  storage-load-time=233ms: 3s819ms (3s819ms)
  - Analysis finished: 3s820ms (763.979us)
  - Value transfer graph computed: 3s820ms (63.193us)

Profile for Catalog V2: (StorageLoad.Time is the added property and it
is in CatalogFetch):

Frontend:
   - CatalogFetch.ColumnStats.Misses: 1
   - CatalogFetch.ColumnStats.Requests: 1
   - CatalogFetch.ColumnStats.Time: 0
   - CatalogFetch.Config.Misses: 1
   - CatalogFetch.Config.Requests: 1
   - CatalogFetch.Config.Time: 3ms
   - CatalogFetch.DatabaseList.Hits: 1
   - CatalogFetch.DatabaseList.Requests: 1
   - CatalogFetch.DatabaseList.Time: 0
   - CatalogFetch.PartitionLists.Misses: 1
   - CatalogFetch.PartitionLists.Requests: 1
   - CatalogFetch.PartitionLists.Time: 4ms
   - CatalogFetch.Partitions.Hits: 2
   - CatalogFetch.Partitions.Misses: 1
   - CatalogFetch.Partitions.Requests: 3
   - CatalogFetch.Partitions.Time: 1ms
   - CatalogFetch.RPCs.Bytes: 1.01 KB (1036)
   - CatalogFetch.RPCs.Requests: 4
   - CatalogFetch.RPCs.Time: 93ms
   - CatalogFetch.StorageLoad.Time: 68ms
   - CatalogFetch.TableNames.Hits: 2
   - CatalogFetch.TableNames.Requests: 2
   - CatalogFetch.TableNames.Time: 0
   - CatalogFetch.Tables.Misses: 1
   - CatalogFetch.Tables.Requests: 1
   - CatalogFetch.Tables.Time: 91ms

Catalog metrics(this sample is from a hdfs table):

storage-metadata-load-duration:
   Count: 1
   Mean rate: 0.0085
   1 min. rate: 0.032
   5 min. rate: 0.1386
   15 min. rate: 0.177
   Min (msec): 111
   Max (msec): 111
   Mean (msec): 111.1802
   Median (msec): 111.1802
   75th-% (msec): 111.1802
   95th-% (msec): 111.1802
   99th-% (msec): 111.1802

Change-Id: I7447f8c8e7e50eb71d18643859d2e3de865368d2
---
M common/thrift/CatalogObjects.thrift
M common/thrift/CatalogService.thrift
M fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java
M fe/src/main/java/org/apache/impala/catalog/HBaseTable.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/catalog/KuduTable.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
M fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java
M tests/query_test/test_observability.py
9 files changed, 169 insertions(+), 30 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/86/13786/4
--
To view, visit http://gerrit.cloudera.org:8080/13786
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7447f8c8e7e50eb71d18643859d2e3de865368d2
Gerrit-Change-Number: 13786
Gerrit-PatchSet: 4
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Yongzhi Chen 


[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile

2019-09-10 Thread Sahil Takiar (Code Review)
Sahil Takiar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13786 )

Change subject: IMPALA-7322: Add storage wait time to profile
..


Patch Set 3: Code-Review+1

(1 comment)

Just a comment on the commit message.

I'm not super familiar with this code, so would be good if @Bharath could take 
a look as well

http://gerrit.cloudera.org:8080/#/c/13786/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/13786/3//COMMIT_MSG@26
PS3, Line 26: Sample output:
: Profile for Catalog V1: (storage-load-time is the added property 
and
: it is part of Metadata load in Query Compilation):
: After ran a hbase query (Metadata load finished is divided into
: several lines because of limitation of commit message):
: Query Compilation: 4s401ms
:   - Metadata load started: 661.084us (661.084us)
:   - Metadata load finished. loaded-tables=1/1
:   load-requests=1 catalog-updates=3
:   storage-load-time=233ms: 3s819ms (3s819ms)
:   - Analysis finished: 3s820ms (763.979us)
:   - Value transfer graph computed: 3s820ms (63.193us)
: Profile for Catalog V2: (StorageLoad.Time is the added property 
and it
: is in CatalogFetch):
: Frontend:
:- CatalogFetch.ColumnStats.Misses: 1
:- CatalogFetch.ColumnStats.Requests: 1
:- CatalogFetch.ColumnStats.Time: 0
:- CatalogFetch.Config.Misses: 1
:- CatalogFetch.Config.Requests: 1
:- CatalogFetch.Config.Time: 3ms
:- CatalogFetch.DatabaseList.Hits: 1
:- CatalogFetch.DatabaseList.Requests: 1
:- CatalogFetch.DatabaseList.Time: 0
:- CatalogFetch.PartitionLists.Misses: 1
:- CatalogFetch.PartitionLists.Requests: 1
:- CatalogFetch.PartitionLists.Time: 4ms
:- CatalogFetch.Partitions.Hits: 2
:- CatalogFetch.Partitions.Misses: 1
:- CatalogFetch.Partitions.Requests: 3
:- CatalogFetch.Partitions.Time: 1ms
:- CatalogFetch.RPCs.Bytes: 1.01 KB (1036)
:- CatalogFetch.RPCs.Requests: 4
:- CatalogFetch.RPCs.Time: 93ms
:- CatalogFetch.StorageLoad.Time: 68ms
:- CatalogFetch.TableNames.Hits: 2
:- CatalogFetch.TableNames.Requests: 2
:- CatalogFetch.TableNames.Time: 0
:- CatalogFetch.Tables.Misses: 1
:- CatalogFetch.Tables.Requests: 1
:- CatalogFetch.Tables.Time: 91ms
: Catalog metrics(this sample is from a hdfs table):
: storage-metadata-load-duration:
:Count: 1
:Mean rate: 0.0085
:1 min. rate: 0.032
:5 min. rate: 0.1386
:15 min. rate: 0.177
:Min (msec): 111
:Max (msec): 111
:Mean (msec): 111.1802
:Median (msec): 111.1802
:75th-% (msec): 111.1802
:95th-% (msec): 111.1802
:99th-% (msec): 111.1802
I still find this hard to understand, would recommend re-formatting as follows:

 Sample Output:

 Profile for Catalog V1 ('storage-load-time' is the added
 property and it is part of the 'Metadata load finished' section of the
 profile under the 'Query Compilation' section. The example below was
 taken after running a HBase query. 'Metadata load finished' is divided
 into several lines because of limitation of commit message):

 Query Compilation: 4s401ms
   - Metadata load started: 661.084us (661.084us)
   - Metadata load finished. loaded-tables=1/1
   load-requests=1 catalog-updates=3
   storage-load-time=233ms: 3s819ms (3s819ms)
   - Analysis finished: 3s820ms (763.979us)
   - Value transfer graph computed: 3s820ms (63.193us)

 Profile for Catalog V2 ('StorageLoad.Time' is the added property and it
 is part of the 'CatalogFetch' counters):

 Frontend:
- CatalogFetch.ColumnStats.Misses: 1
- CatalogFetch.ColumnStats.Requests: 1
- CatalogFetch.ColumnStats.Time: 0
- CatalogFetch.Config.Misses: 1
- CatalogFetch.Config.Requests: 1
- CatalogFetch.Config.Time: 3ms
- CatalogFetch.DatabaseList.Hits: 1
- CatalogFetch.DatabaseList.Requests: 1
- CatalogFetch.DatabaseList.Time: 0
- CatalogFetch.PartitionLists.Misses: 1
- CatalogFetch.PartitionLists.Requests: 1
- CatalogFetch.PartitionLists.Time: 4ms
- CatalogFetch.Partitions.Hits: 2
- 

[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile

2019-09-09 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13786 )

Change subject: IMPALA-7322: Add storage wait time to profile
..


Patch Set 3:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/4500/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/13786
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7447f8c8e7e50eb71d18643859d2e3de865368d2
Gerrit-Change-Number: 13786
Gerrit-PatchSet: 3
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Comment-Date: Mon, 09 Sep 2019 16:34:48 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile

2019-09-09 Thread Yongzhi Chen (Code Review)
Hello Bharath Vissapragada, Sahil Takiar, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/13786

to look at the new patch set (#3).

Change subject: IMPALA-7322: Add storage wait time to profile
..

IMPALA-7322: Add storage wait time to profile

Add metrics to record storage wait time for operations with
metadata load in catalog for hdfs, kudu and hbase tables.
Pass storage wait time from catalog to fe through thrift
and log total storage load time in query profile.
Storage-load-time is the amount of time spent loading metadata
from the underlying storage layer (e.g. S3, HDFS, Kudu, HBase),
which does not include the amount of time spending loading data
from HMS.

Testing:
* Ran queries that can trigger all of, none of or
  some of the related tables loading.
* Check query profile for each query.
* Check catalog metrics for each table.
* Add unit tests to test_observability.py
* Ran all core tests.

Sample output:
Profile for Catalog V1: (storage-load-time is the added property and
it is part of Metadata load in Query Compilation):
After ran a hbase query (Metadata load finished is divided into
several lines because of limitation of commit message):
Query Compilation: 4s401ms
  - Metadata load started: 661.084us (661.084us)
  - Metadata load finished. loaded-tables=1/1
  load-requests=1 catalog-updates=3
  storage-load-time=233ms: 3s819ms (3s819ms)
  - Analysis finished: 3s820ms (763.979us)
  - Value transfer graph computed: 3s820ms (63.193us)
Profile for Catalog V2: (StorageLoad.Time is the added property and it
is in CatalogFetch):
Frontend:
   - CatalogFetch.ColumnStats.Misses: 1
   - CatalogFetch.ColumnStats.Requests: 1
   - CatalogFetch.ColumnStats.Time: 0
   - CatalogFetch.Config.Misses: 1
   - CatalogFetch.Config.Requests: 1
   - CatalogFetch.Config.Time: 3ms
   - CatalogFetch.DatabaseList.Hits: 1
   - CatalogFetch.DatabaseList.Requests: 1
   - CatalogFetch.DatabaseList.Time: 0
   - CatalogFetch.PartitionLists.Misses: 1
   - CatalogFetch.PartitionLists.Requests: 1
   - CatalogFetch.PartitionLists.Time: 4ms
   - CatalogFetch.Partitions.Hits: 2
   - CatalogFetch.Partitions.Misses: 1
   - CatalogFetch.Partitions.Requests: 3
   - CatalogFetch.Partitions.Time: 1ms
   - CatalogFetch.RPCs.Bytes: 1.01 KB (1036)
   - CatalogFetch.RPCs.Requests: 4
   - CatalogFetch.RPCs.Time: 93ms
   - CatalogFetch.StorageLoad.Time: 68ms
   - CatalogFetch.TableNames.Hits: 2
   - CatalogFetch.TableNames.Requests: 2
   - CatalogFetch.TableNames.Time: 0
   - CatalogFetch.Tables.Misses: 1
   - CatalogFetch.Tables.Requests: 1
   - CatalogFetch.Tables.Time: 91ms
Catalog metrics(this sample is from a hdfs table):
storage-metadata-load-duration:
   Count: 1
   Mean rate: 0.0085
   1 min. rate: 0.032
   5 min. rate: 0.1386
   15 min. rate: 0.177
   Min (msec): 111
   Max (msec): 111
   Mean (msec): 111.1802
   Median (msec): 111.1802
   75th-% (msec): 111.1802
   95th-% (msec): 111.1802
   99th-% (msec): 111.1802

Change-Id: I7447f8c8e7e50eb71d18643859d2e3de865368d2
---
M common/thrift/CatalogObjects.thrift
M common/thrift/CatalogService.thrift
M fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java
M fe/src/main/java/org/apache/impala/catalog/HBaseTable.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/catalog/KuduTable.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
M fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java
M tests/query_test/test_observability.py
9 files changed, 169 insertions(+), 30 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/86/13786/3
--
To view, visit http://gerrit.cloudera.org:8080/13786
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7447f8c8e7e50eb71d18643859d2e3de865368d2
Gerrit-Change-Number: 13786
Gerrit-PatchSet: 3
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Yongzhi Chen 


[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile

2019-09-09 Thread Yongzhi Chen (Code Review)
Yongzhi Chen has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13786 )

Change subject: IMPALA-7322: Add storage wait time to profile
..


Patch Set 2:

(8 comments)

Submit patch 3 to address the code review

http://gerrit.cloudera.org:8080/#/c/13786/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/13786/2//COMMIT_MSG@18
PS2, Line 18: Testing: Ran queries that can trigger all of, none of or
: some of the related tables loading.
: Check query profile for each query.
: Check catalog metrics for each table.
: Add unit tests to test_observability.py
: Ran all core tests.
> nit: if these are a list of testing activities done, i generally like to fo
Done


http://gerrit.cloudera.org:8080/#/c/13786/2//COMMIT_MSG@27
PS2, Line 27: After ran a hbase query (Metadata load finished is divided into 
several lines
> nit: line too long
Done


http://gerrit.cloudera.org:8080/#/c/13786/2//COMMIT_MSG@28
PS2, Line 28: because of limitation of commit message): Query Compilation: 
4s401ms
> nit: the profile snippet should be on a newline
Done


http://gerrit.cloudera.org:8080/#/c/13786/2//COMMIT_MSG@35
PS2, Line 35: Profile for Catalog V2:
> would be good to explain how storage-load-time is represented in the v1 pro
Done


http://gerrit.cloudera.org:8080/#/c/13786/2//COMMIT_MSG@46
PS2, Line 46: storage-load-time
> nit: why is this formatted differently from everything else - e.g. would
I will change to CatalogFetch.StorageLoad.Time which consistent with V1 and has 
three levels.


http://gerrit.cloudera.org:8080/#/c/13786/2/common/thrift/CatalogObjects.thrift
File common/thrift/CatalogObjects.thrift:

http://gerrit.cloudera.org:8080/#/c/13786/2/common/thrift/CatalogObjects.thrift@496
PS2, Line 496:   15: optional i64 storage_metadata_load_time
> nit: change to storage_metadata_load_time_ns so we know it is in nanosecond
Done


http://gerrit.cloudera.org:8080/#/c/13786/2/tests/query_test/test_observability.py
File tests/query_test/test_observability.py:

http://gerrit.cloudera.org:8080/#/c/13786/2/tests/query_test/test_observability.py@23
PS2, Line 23: SkipIfCatalogV2
> is this needed?
Done


http://gerrit.cloudera.org:8080/#/c/13786/2/tests/query_test/test_observability.py@684
PS2, Line 684:   def test_query_profile_storage_load_time(self):
> nit: might be cleaner to split this into two tests, one for kudu and one fo
I think they can put together:
They are similar and do not belong to filesystem loading.



--
To view, visit http://gerrit.cloudera.org:8080/13786
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7447f8c8e7e50eb71d18643859d2e3de865368d2
Gerrit-Change-Number: 13786
Gerrit-PatchSet: 2
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Comment-Date: Mon, 09 Sep 2019 16:15:30 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile

2019-09-06 Thread Sahil Takiar (Code Review)
Sahil Takiar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13786 )

Change subject: IMPALA-7322: Add storage wait time to profile
..


Patch Set 2:

(8 comments)

http://gerrit.cloudera.org:8080/#/c/13786/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/13786/2//COMMIT_MSG@18
PS2, Line 18: Testing: Ran queries that can trigger all of, none of or
: some of the related tables loading.
: Check query profile for each query.
: Check catalog metrics for each table.
: Add unit tests to test_observability.py
: Ran all core tests.
nit: if these are a list of testing activities done, i generally like to format 
it as follows

 Testing:
 * Ran queries that can trigger all of, none of or
 some of the related tables loading.
 * Check query profile for each query.
 * Check catalog metrics for each table.
 * dd unit tests to test_observability.py.
 * Ran all core tests.


http://gerrit.cloudera.org:8080/#/c/13786/2//COMMIT_MSG@27
PS2, Line 27: After ran a hbase query (Metadata load finished is divided into 
several lines
nit: line too long


http://gerrit.cloudera.org:8080/#/c/13786/2//COMMIT_MSG@28
PS2, Line 28: because of limitation of commit message): Query Compilation: 
4s401ms
nit: the profile snippet should be on a newline


http://gerrit.cloudera.org:8080/#/c/13786/2//COMMIT_MSG@35
PS2, Line 35: Profile for Catalog V2:
would be good to explain how storage-load-time is represented in the v1 profile 
vs the v2 profile


http://gerrit.cloudera.org:8080/#/c/13786/2//COMMIT_MSG@46
PS2, Line 46: storage-load-time
nit: why is this formatted differently from everything else - e.g. would

 - CatalogFetch.Metadata.Storage.Time

Be more consistent? I'm assuming "CatalogFetch.ColumnStats.Time" is how long it 
takes to load column stats?


http://gerrit.cloudera.org:8080/#/c/13786/2/common/thrift/CatalogObjects.thrift
File common/thrift/CatalogObjects.thrift:

http://gerrit.cloudera.org:8080/#/c/13786/2/common/thrift/CatalogObjects.thrift@496
PS2, Line 496:   15: optional i64 storage_metadata_load_time
nit: change to storage_metadata_load_time_ns so we know it is in nanoseconds


http://gerrit.cloudera.org:8080/#/c/13786/2/tests/query_test/test_observability.py
File tests/query_test/test_observability.py:

http://gerrit.cloudera.org:8080/#/c/13786/2/tests/query_test/test_observability.py@23
PS2, Line 23: SkipIfCatalogV2
is this needed?


http://gerrit.cloudera.org:8080/#/c/13786/2/tests/query_test/test_observability.py@684
PS2, Line 684:   def test_query_profile_storage_load_time(self):
nit: might be cleaner to split this into two tests, one for kudu and one for 
hbase



--
To view, visit http://gerrit.cloudera.org:8080/13786
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7447f8c8e7e50eb71d18643859d2e3de865368d2
Gerrit-Change-Number: 13786
Gerrit-PatchSet: 2
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Comment-Date: Fri, 06 Sep 2019 16:51:06 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile

2019-09-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13786 )

Change subject: IMPALA-7322: Add storage wait time to profile
..


Patch Set 2:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/4485/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/13786
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7447f8c8e7e50eb71d18643859d2e3de865368d2
Gerrit-Change-Number: 13786
Gerrit-PatchSet: 2
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Comment-Date: Fri, 06 Sep 2019 13:25:06 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile

2019-09-06 Thread Yongzhi Chen (Code Review)
Yongzhi Chen has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13786 )

Change subject: IMPALA-7322: Add storage wait time to profile
..


Patch Set 2:

submit patch set 2 to include profile support for catalog V2
Ran exhaustive test: Only one unrelated failure:
query_test/test_result_spooling.py (test_full_queue_large_fetch)


--
To view, visit http://gerrit.cloudera.org:8080/13786
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7447f8c8e7e50eb71d18643859d2e3de865368d2
Gerrit-Change-Number: 13786
Gerrit-PatchSet: 2
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Comment-Date: Fri, 06 Sep 2019 12:47:33 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile

2019-09-06 Thread Yongzhi Chen (Code Review)
Hello Bharath Vissapragada, Sahil Takiar, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/13786

to look at the new patch set (#2).

Change subject: IMPALA-7322: Add storage wait time to profile
..

IMPALA-7322: Add storage wait time to profile

Add metrics to record storage wait time for operations with
metadata load in catalog for hdfs, kudu and hbase tables.
Pass storage wait time from catalog to fe through thrift
and log total storage load time in query profile.
Storage-load-time is the amount of time spent loading metadata
from the underlying storage layer (e.g. S3, HDFS, Kudu, HBase),
which does not include the amount of time spending loading data
from HMS.

Testing: Ran queries that can trigger all of, none of or
some of the related tables loading.
Check query profile for each query.
Check catalog metrics for each table.
Add unit tests to test_observability.py
Ran all core tests.

Sample output:
Profile for Catalog V1: (storage-load-time is the added property):
After ran a hbase query (Metadata load finished is divided into several lines
because of limitation of commit message): Query Compilation: 4s401ms
  - Metadata load started: 661.084us (661.084us)
  - Metadata load finished. loaded-tables=1/1
  load-requests=1 catalog-updates=3
  storage-load-time=233ms: 3s819ms (3s819ms)
  - Analysis finished: 3s820ms (763.979us)
  - Value transfer graph computed: 3s820ms (63.193us)
Profile for Catalog V2:
Frontend:
   - CatalogFetch.ColumnStats.Misses: 1
   - CatalogFetch.ColumnStats.Requests: 1
   - CatalogFetch.ColumnStats.Time: 16ms
   - CatalogFetch.Config.Misses: 1
   - CatalogFetch.Config.Requests: 1
   - CatalogFetch.Config.Time: 24ms
   - CatalogFetch.DatabaseList.Hits: 1
   - CatalogFetch.DatabaseList.Requests: 1
   - CatalogFetch.DatabaseList.Time: 0
   - CatalogFetch.Metadata.storage-load-time: 218ms
   - CatalogFetch.PartitionLists.Misses: 1
   - CatalogFetch.PartitionLists.Requests: 1
Catalog metrics(this sample is from a hdfs table):
storage-metadata-load-duration:
   Count: 1
   Mean rate: 0.0085
   1 min. rate: 0.032
   5 min. rate: 0.1386
   15 min. rate: 0.177
   Min (msec): 111
   Max (msec): 111
   Mean (msec): 111.1802
   Median (msec): 111.1802
   75th-% (msec): 111.1802
   95th-% (msec): 111.1802
   99th-% (msec): 111.1802

Change-Id: I7447f8c8e7e50eb71d18643859d2e3de865368d2
---
M common/thrift/CatalogObjects.thrift
M common/thrift/CatalogService.thrift
M fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java
M fe/src/main/java/org/apache/impala/catalog/HBaseTable.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/catalog/KuduTable.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
M fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java
M tests/query_test/test_observability.py
9 files changed, 160 insertions(+), 31 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/86/13786/2
--
To view, visit http://gerrit.cloudera.org:8080/13786
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7447f8c8e7e50eb71d18643859d2e3de865368d2
Gerrit-Change-Number: 13786
Gerrit-PatchSet: 2
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Yongzhi Chen 


[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile

2019-08-09 Thread Bharath Vissapragada (Code Review)
Bharath Vissapragada has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13786 )

Change subject: IMPALA-7322: Add storage wait time to profile
..


Patch Set 1:

Can you rebase? I can submit for a GVO.


--
To view, visit http://gerrit.cloudera.org:8080/13786
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7447f8c8e7e50eb71d18643859d2e3de865368d2
Gerrit-Change-Number: 13786
Gerrit-PatchSet: 1
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Comment-Date: Fri, 09 Aug 2019 17:14:47 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile

2019-08-01 Thread Sahil Takiar (Code Review)
Sahil Takiar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13786 )

Change subject: IMPALA-7322: Add storage wait time to profile
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/13786/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/13786/1//COMMIT_MSG@7
PS1, Line 7:
the formatting of this commit message looks off, is there a reason it is all 
indented?



--
To view, visit http://gerrit.cloudera.org:8080/13786
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7447f8c8e7e50eb71d18643859d2e3de865368d2
Gerrit-Change-Number: 13786
Gerrit-PatchSet: 1
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Comment-Date: Thu, 01 Aug 2019 20:05:01 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile

2019-08-01 Thread Sahil Takiar (Code Review)
Sahil Takiar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13786 )

Change subject: IMPALA-7322: Add storage wait time to profile
..


Patch Set 1: Code-Review+1

+1 pending rebase


--
To view, visit http://gerrit.cloudera.org:8080/13786
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7447f8c8e7e50eb71d18643859d2e3de865368d2
Gerrit-Change-Number: 13786
Gerrit-PatchSet: 1
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Comment-Date: Thu, 01 Aug 2019 20:04:16 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile

2019-08-01 Thread Yongzhi Chen (Code Review)
Yongzhi Chen has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13786 )

Change subject: IMPALA-7322: Add storage wait time to profile
..


Patch Set 1:

I created a jira for the local catalog issue:
https://issues.apache.org/jira/browse/IMPALA-8822


--
To view, visit http://gerrit.cloudera.org:8080/13786
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7447f8c8e7e50eb71d18643859d2e3de865368d2
Gerrit-Change-Number: 13786
Gerrit-PatchSet: 1
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Comment-Date: Thu, 01 Aug 2019 16:40:14 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile

2019-07-09 Thread Bharath Vissapragada (Code Review)
Bharath Vissapragada has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13786 )

Change subject: IMPALA-7322: Add storage wait time to profile
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/13786/1/tests/query_test/test_observability.py
File tests/query_test/test_observability.py:

http://gerrit.cloudera.org:8080/#/c/13786/1/tests/query_test/test_observability.py@589
PS1, Line 589:   @SkipIfCatalogV2.hms_event_polling_enabled()
> It works with Catalog V2, but when catalog V2 + hms event polling, running
- running a query triggers no metadata loading at all.

Do you know why? Agree with Sahil that we would permanently lose coverage if we 
skip it without understanding why.



--
To view, visit http://gerrit.cloudera.org:8080/13786
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7447f8c8e7e50eb71d18643859d2e3de865368d2
Gerrit-Change-Number: 13786
Gerrit-PatchSet: 1
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Comment-Date: Wed, 10 Jul 2019 00:24:45 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile

2019-07-09 Thread Yongzhi Chen (Code Review)
Yongzhi Chen has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13786 )

Change subject: IMPALA-7322: Add storage wait time to profile
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/13786/1/tests/query_test/test_observability.py
File tests/query_test/test_observability.py:

http://gerrit.cloudera.org:8080/#/c/13786/1/tests/query_test/test_observability.py@589
PS1, Line 589:   @SkipIfCatalogV2.hms_event_polling_enabled()
> we should really try to get this to work in Catalog V2, as that will eventu
It works with Catalog V2, but when catalog V2 + hms event polling, running a 
query triggers no metadata loading at all. Storage loading is part of the 
metastore loading, we can only verify it when metastore loading existing in 
profile. This test is just creating a test env to make sure this loading can 
really happen. The skip is to avoid flaky tests.



--
To view, visit http://gerrit.cloudera.org:8080/13786
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7447f8c8e7e50eb71d18643859d2e3de865368d2
Gerrit-Change-Number: 13786
Gerrit-PatchSet: 1
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Comment-Date: Tue, 09 Jul 2019 16:59:35 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile

2019-07-05 Thread Sahil Takiar (Code Review)
Sahil Takiar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13786 )

Change subject: IMPALA-7322: Add storage wait time to profile
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/13786/1/tests/query_test/test_observability.py
File tests/query_test/test_observability.py:

http://gerrit.cloudera.org:8080/#/c/13786/1/tests/query_test/test_observability.py@589
PS1, Line 589:   @SkipIfCatalogV2.hms_event_polling_enabled()
we should really try to get this to work in Catalog V2, as that will eventually 
be the default in our tests at some point. IMPALA-8627 / IMPALA-8715 / 
IMPALA-8660 is working to achieve that. Otherwise, this test will never run by 
default, which sort of defeats the purpose of adding it in the first place.



--
To view, visit http://gerrit.cloudera.org:8080/13786
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7447f8c8e7e50eb71d18643859d2e3de865368d2
Gerrit-Change-Number: 13786
Gerrit-PatchSet: 1
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Comment-Date: Fri, 05 Jul 2019 16:53:09 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile

2019-07-02 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13786 )

Change subject: IMPALA-7322: Add storage wait time to profile
..


Patch Set 1:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/3804/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/13786
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7447f8c8e7e50eb71d18643859d2e3de865368d2
Gerrit-Change-Number: 13786
Gerrit-PatchSet: 1
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Comment-Date: Tue, 02 Jul 2019 17:43:23 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile

2019-07-02 Thread Yongzhi Chen (Code Review)
Yongzhi Chen has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13786 )

Change subject: IMPALA-7322: Add storage wait time to profile
..


Patch Set 1:

The original review is in https://gerrit.cloudera.org/c/13738/


--
To view, visit http://gerrit.cloudera.org:8080/13786
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7447f8c8e7e50eb71d18643859d2e3de865368d2
Gerrit-Change-Number: 13786
Gerrit-PatchSet: 1
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Comment-Date: Tue, 02 Jul 2019 17:04:23 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile

2019-07-02 Thread Yongzhi Chen (Code Review)
Yongzhi Chen has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/13786


Change subject: IMPALA-7322: Add storage wait time to profile
..

IMPALA-7322: Add storage wait time to profile

Add metrics to record storage wait time for operations with
metadata load in catalog for hdfs, kudu and hbase tables.
Pass storage wait time from catalog to fe through thrift and log
total storage load time in query profile.
Storage-load-time is the amount of time spent loading
metadata from the underlying storage layer (e.g. S3, HDFS,
Kudu, HBase), which does not  include the amount of time
spending loading data from HMS.

Testing:
Ran queries that can trigger all of, none of or some of the related
tables loading. Check query profile for each query. Check catalog
metrics for each table.
Add unit tests to test_observability.py
Ran all core tests.

Sample output:
Profile:(storage-load-time is the added property):
After ran a hbase query (Metadata load finished is divided into
several lines because of limitation of commit message):
Query Compilation: 4s401ms
  - Metadata load started: 661.084us (661.084us)
  - Metadata load finished. loaded-tables=1/1
  load-requests=1 catalog-updates=3
  storage-load-time=233ms: 3s819ms (3s819ms)
 - Analysis finished: 3s820ms (763.979us)
 - Value transfer graph computed: 3s820ms (63.193us)
Catalog metrics(this sample is from a hdfs table):
storage-metadata-load-duration:
   Count: 1
   Mean rate: 0.0085
   1 min. rate: 0.032
   5 min. rate: 0.1386
   15 min. rate: 0.177
   Min (msec): 111
   Max (msec): 111
   Mean (msec): 111.1802
   Median (msec): 111.1802
   75th-% (msec): 111.1802
   95th-% (msec): 111.1802
   99th-% (msec): 111.1802

Change-Id: I7447f8c8e7e50eb71d18643859d2e3de865368d2
---
M common/thrift/CatalogObjects.thrift
M fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java
M fe/src/main/java/org/apache/impala/catalog/HBaseTable.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/catalog/KuduTable.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
M tests/common/skip.py
M tests/query_test/test_observability.py
8 files changed, 144 insertions(+), 30 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/86/13786/1
--
To view, visit http://gerrit.cloudera.org:8080/13786
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I7447f8c8e7e50eb71d18643859d2e3de865368d2
Gerrit-Change-Number: 13786
Gerrit-PatchSet: 1
Gerrit-Owner: Yongzhi Chen 


[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile

2019-06-26 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12940 )

Change subject: IMPALA-7322: Add storage wait time to profile
..


Patch Set 15:

Actually the test is also flaky with metadata v1 as far as I can see, since it 
can run in parallel with other tests that touch functional.alltypes. I'm going 
to revert this and my change.


--
To view, visit http://gerrit.cloudera.org:8080/12940
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6dde7e394b7c1c396d835ef6aa0a55930c0a8660
Gerrit-Change-Number: 12940
Gerrit-PatchSet: 15
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Comment-Date: Wed, 26 Jun 2019 17:06:20 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile

2019-06-26 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12940 )

Change subject: IMPALA-7322: Add storage wait time to profile
..


Patch Set 15:

It fails reliably with the local catalog, so raced with my change to enable 
local catalog in precommit.


--
To view, visit http://gerrit.cloudera.org:8080/12940
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6dde7e394b7c1c396d835ef6aa0a55930c0a8660
Gerrit-Change-Number: 12940
Gerrit-PatchSet: 15
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Comment-Date: Wed, 26 Jun 2019 17:04:35 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile

2019-06-26 Thread Bharath Vissapragada (Code Review)
Bharath Vissapragada has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12940 )

Change subject: IMPALA-7322: Add storage wait time to profile
..


Patch Set 15:

It looks like the tests added here are a little flaky.

https://jenkins.impala.io/job/ubuntu-16.04-dockerised-tests/630/testReport/junit/query_test.test_observability/TestObservability/test_query_profile_storage_load_time_filesystem/
https://jenkins.impala.io/job/ubuntu-16.04-dockerised-tests/630/testReport/junit/query_test.test_observability/TestObservability/test_query_profile_storage_load_time/

Probably a matter of time before other builds run into it, just a heads up.


--
To view, visit http://gerrit.cloudera.org:8080/12940
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6dde7e394b7c1c396d835ef6aa0a55930c0a8660
Gerrit-Change-Number: 12940
Gerrit-PatchSet: 15
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Comment-Date: Wed, 26 Jun 2019 15:40:25 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile

2019-06-25 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/12940 )

Change subject: IMPALA-7322: Add storage wait time to profile
..

IMPALA-7322: Add storage wait time to profile

Add metrics to record storage wait time for operations with
metadata load in catalog for hdfs, kudu and hbase tables.
Pass storage wait time from catalog to fe through thrift and log
total storage load time in query profile.
Storage-load-time is the amount of time spent loading
metadata from the underlying storage layer (e.g. S3, HDFS,
Kudu, HBase), which does not  include the amount of time
spending loading data from HMS.

Testing:
Ran queries that can trigger all of, none of or some of the related
tables loading. Check query profile for each query. Check catalog
metrics for each table.
Add unit tests to test_observability.py
Ran all core tests.

Sample output:
Profile:(storage-load-time is the added property):
After ran a hbase query (Metadata load finished is divided into
several lines because of limitation of commit message):
Query Compilation: 4s401ms
  - Metadata load started: 661.084us (661.084us)
  - Metadata load finished. loaded-tables=1/1
  load-requests=1 catalog-updates=3
  storage-load-time=233ms: 3s819ms (3s819ms)
 - Analysis finished: 3s820ms (763.979us)
 - Value transfer graph computed: 3s820ms (63.193us)
Catalog metrics(this sample is from a hdfs table):
storage-metadata-load-duration:
   Count: 1
   Mean rate: 0.0085
   1 min. rate: 0.032
   5 min. rate: 0.1386
   15 min. rate: 0.177
   Min (msec): 111
   Max (msec): 111
   Mean (msec): 111.1802
   Median (msec): 111.1802
   75th-% (msec): 111.1802
   95th-% (msec): 111.1802
   99th-% (msec): 111.1802
Change-Id: I6dde7e394b7c1c396d835ef6aa0a55930c0a8660
Reviewed-on: http://gerrit.cloudera.org:8080/12940
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 
---
M common/thrift/CatalogObjects.thrift
M fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java
M fe/src/main/java/org/apache/impala/catalog/HBaseTable.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/catalog/KuduTable.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
M tests/query_test/test_observability.py
7 files changed, 131 insertions(+), 29 deletions(-)

Approvals:
  Impala Public Jenkins: Looks good to me, approved; Verified

--
To view, visit http://gerrit.cloudera.org:8080/12940
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I6dde7e394b7c1c396d835ef6aa0a55930c0a8660
Gerrit-Change-Number: 12940
Gerrit-PatchSet: 15
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yongzhi Chen 


[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile

2019-06-25 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12940 )

Change subject: IMPALA-7322: Add storage wait time to profile
..


Patch Set 14: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/12940
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6dde7e394b7c1c396d835ef6aa0a55930c0a8660
Gerrit-Change-Number: 12940
Gerrit-PatchSet: 14
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Comment-Date: Tue, 25 Jun 2019 22:02:06 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile

2019-06-25 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12940 )

Change subject: IMPALA-7322: Add storage wait time to profile
..


Patch Set 14:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/4537/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/12940
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6dde7e394b7c1c396d835ef6aa0a55930c0a8660
Gerrit-Change-Number: 12940
Gerrit-PatchSet: 14
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Comment-Date: Tue, 25 Jun 2019 16:30:57 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile

2019-06-25 Thread Bharath Vissapragada (Code Review)
Bharath Vissapragada has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12940 )

Change subject: IMPALA-7322: Add storage wait time to profile
..


Patch Set 13: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/12940
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6dde7e394b7c1c396d835ef6aa0a55930c0a8660
Gerrit-Change-Number: 12940
Gerrit-PatchSet: 13
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Comment-Date: Tue, 25 Jun 2019 16:30:44 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile

2019-06-25 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12940 )

Change subject: IMPALA-7322: Add storage wait time to profile
..


Patch Set 14: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/12940
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6dde7e394b7c1c396d835ef6aa0a55930c0a8660
Gerrit-Change-Number: 12940
Gerrit-PatchSet: 14
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Comment-Date: Tue, 25 Jun 2019 16:30:56 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile

2019-06-24 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12940 )

Change subject: IMPALA-7322: Add storage wait time to profile
..


Patch Set 13:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/3725/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/12940
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6dde7e394b7c1c396d835ef6aa0a55930c0a8660
Gerrit-Change-Number: 12940
Gerrit-PatchSet: 13
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Comment-Date: Mon, 24 Jun 2019 21:52:50 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile

2019-06-24 Thread Yongzhi Chen (Code Review)
Yongzhi Chen has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12940 )

Change subject: IMPALA-7322: Add storage wait time to profile
..


Patch Set 13:

(2 comments)

The new patch with the fixes

http://gerrit.cloudera.org:8080/#/c/12940/12/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

http://gerrit.cloudera.org:8080/#/c/12940/12/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@509
PS12, Line 509:
> nit: mention the unit too.. ns ms...?
Done


http://gerrit.cloudera.org:8080/#/c/12940/12/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@959
PS12, Line 959:if (loadParitionFileMetadata) {
  :   storageMetadataLoadTime_ += 
updateUnpartitionedTableFileMd();
  : }
  :   } else {
  : storageMetadataLoadTime_ += updatePartitionsFromHms(
  : client, partitionsToUpdate, 
loadParitionFileMetadata);
  :   }
  :   LOG.info("Incrementally loaded table metadata for: " 
+ getFullName());
  : } else {
  :   LOG.info("Fetching partition metadata from the 
Metastore: " + getFullName());
  :   // Load all partitions from Hive Metastore, including 
file metadata.
  :   List 
msPartitions =
  :   MetaStoreUtil.fetchAllPartitions(
  :   client, db_.getName(), name_, 
NUM_PARTITION_FETCH_RETRIES);
  :   LOG.info("Fetched partition metadata from the 
Metastore: " + getFullName());
  :   storageMetadataLoadTime_ = 
loadAllPartitions(msPartitions, msTbl);
  : }
  : if (loadTableSchema) setAvroSchema(client, msTbl);
  : setTableStats(msTbl);
  : fileMetadataStats_.unset();
  : refreshLastUsedTime();
  :   } catch (TableLoadingException e) {
  : throw e;
  :   } catch (Exception e) {
  : thr
> I think you could use a single try finally for
Done



--
To view, visit http://gerrit.cloudera.org:8080/12940
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6dde7e394b7c1c396d835ef6aa0a55930c0a8660
Gerrit-Change-Number: 12940
Gerrit-PatchSet: 13
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Comment-Date: Mon, 24 Jun 2019 21:13:53 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile

2019-06-24 Thread Yongzhi Chen (Code Review)
Hello Bharath Vissapragada, Anurag Mantripragada, Vihang Karajgaonkar, Sahil 
Takiar, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/12940

to look at the new patch set (#13).

Change subject: IMPALA-7322: Add storage wait time to profile
..

IMPALA-7322: Add storage wait time to profile

Add metrics to record storage wait time for operations with
metadata load in catalog for hdfs, kudu and hbase tables.
Pass storage wait time from catalog to fe through thrift and log
total storage load time in query profile.
Storage-load-time is the amount of time spent loading
metadata from the underlying storage layer (e.g. S3, HDFS,
Kudu, HBase), which does not  include the amount of time
spending loading data from HMS.

Testing:
Ran queries that can trigger all of, none of or some of the related
tables loading. Check query profile for each query. Check catalog
metrics for each table.
Add unit tests to test_observability.py
Ran all core tests.

Sample output:
Profile:(storage-load-time is the added property):
After ran a hbase query (Metadata load finished is divided into
several lines because of limitation of commit message):
Query Compilation: 4s401ms
  - Metadata load started: 661.084us (661.084us)
  - Metadata load finished. loaded-tables=1/1
  load-requests=1 catalog-updates=3
  storage-load-time=233ms: 3s819ms (3s819ms)
 - Analysis finished: 3s820ms (763.979us)
 - Value transfer graph computed: 3s820ms (63.193us)
Catalog metrics(this sample is from a hdfs table):
storage-metadata-load-duration:
   Count: 1
   Mean rate: 0.0085
   1 min. rate: 0.032
   5 min. rate: 0.1386
   15 min. rate: 0.177
   Min (msec): 111
   Max (msec): 111
   Mean (msec): 111.1802
   Median (msec): 111.1802
   75th-% (msec): 111.1802
   95th-% (msec): 111.1802
   99th-% (msec): 111.1802
Change-Id: I6dde7e394b7c1c396d835ef6aa0a55930c0a8660
---
M common/thrift/CatalogObjects.thrift
M fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java
M fe/src/main/java/org/apache/impala/catalog/HBaseTable.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/catalog/KuduTable.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
M tests/query_test/test_observability.py
7 files changed, 131 insertions(+), 29 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/40/12940/13
--
To view, visit http://gerrit.cloudera.org:8080/12940
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I6dde7e394b7c1c396d835ef6aa0a55930c0a8660
Gerrit-Change-Number: 12940
Gerrit-PatchSet: 13
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yongzhi Chen 


[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile

2019-06-24 Thread Bharath Vissapragada (Code Review)
Bharath Vissapragada has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12940 )

Change subject: IMPALA-7322: Add storage wait time to profile
..


Patch Set 12: Code-Review+2

(3 comments)

http://gerrit.cloudera.org:8080/#/c/12940/12/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

http://gerrit.cloudera.org:8080/#/c/12940/12/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@509
PS12, Line 509:
nit: mention the unit too.. ns ms...?


http://gerrit.cloudera.org:8080/#/c/12940/12/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@959
PS12, Line 959:  try {
  : storageMetadataLoadTime_ += 
updateMdFromHmsTable(msTbl);
  : if (msTbl.getPartitionKeysSize() == 0) {
  :   if (loadParitionFileMetadata) {
  : storageMetadataLoadTime_ += 
updateUnpartitionedTableFileMd();
  :   }
  : } else {
  :   storageMetadataLoadTime_ += 
updatePartitionsFromHms(
  :   client, partitionsToUpdate, 
loadParitionFileMetadata);
  : }
  :   } finally {
  : storageLdTimer.update(storageMetadataLoadTime_, 
TimeUnit.NANOSECONDS);
  :   }
  :   LOG.info("Incrementally loaded table metadata for: " 
+ getFullName());
  : } else {
  :   LOG.info("Fetching partition metadata from the 
Metastore: " + getFullName());
  :   // Load all partitions from Hive Metastore, including 
file metadata.
  :   List 
msPartitions =
  :   MetaStoreUtil.fetchAllPartitions(
  :   client, db_.getName(), name_, 
NUM_PARTITION_FETCH_RETRIES);
  :   LOG.info("Fetched partition metadata from the 
Metastore: " + getFullName());
  :   try {
  : storageMetadataLoadTime_ = 
loadAllPartitions(msPartitions, msTbl);
  :   } finally {
  :
I think you could use a single try finally for

 storageLdTimer.update(storageMetadataLoadTime_, TimeUnit.NANOSECONDS);

Avoids unnecessary nesting.


http://gerrit.cloudera.org:8080/#/c/12940/11/tests/query_test/test_observability.py
File tests/query_test/test_observability.py:

http://gerrit.cloudera.org:8080/#/c/12940/11/tests/query_test/test_observability.py@572
PS11, Line 572: end_time = tree.nodes[1].info_strings["End Time"]
  : assert len(end_time) == 0, end_time
  : # Save the last operation to be able to retrieve the 
profile after closing the query
  : last_op = handle.get_handle()._last_operation
  : self.hs2_client.close_query(handle)
  : tree = last_op.get_profile(TRuntimeProfileFormat.THRIFT)
  : end_time = tree.nodes[1].info_strings["End Time"]
  : assert end_time is not None
  :
  :   def 
test_query_profile_contains_number_of_fragment_instance(self):
  : """Test that the expected section for number of fragment 
instance in
  : a query profile."""
  : event_regexes = [r'Per Host Number of Fragment Instances']
  : query = "select count (*) from functional.alltypes"
  : runtime_profile = self.execute_query(query).runtime_profile
  : self.__verify_profile_event_sequence(event_regexes, 
runtime_profile)
  :
  :   def test_query_profile_storage_load_time_filesystem(self):
> The test_query_profile_storage_load_time2 only tests kudu and hbase. For hb
Ah sorry, I missed the S3 part, my bad.



--
To view, visit http://gerrit.cloudera.org:8080/12940
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6dde7e394b7c1c396d835ef6aa0a55930c0a8660
Gerrit-Change-Number: 12940
Gerrit-PatchSet: 12
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Comment-Date: Mon, 24 Jun 2019 18:29:34 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile

2019-06-21 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12940 )

Change subject: IMPALA-7322: Add storage wait time to profile
..


Patch Set 12:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/3711/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/12940
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6dde7e394b7c1c396d835ef6aa0a55930c0a8660
Gerrit-Change-Number: 12940
Gerrit-PatchSet: 12
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Comment-Date: Fri, 21 Jun 2019 17:11:56 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile

2019-06-21 Thread Yongzhi Chen (Code Review)
Yongzhi Chen has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12940 )

Change subject: IMPALA-7322: Add storage wait time to profile
..


Patch Set 12:

(3 comments)

Uploaded patch 12 to address the issues. Run all core tests for this patch

http://gerrit.cloudera.org:8080/#/c/12940/11/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

http://gerrit.cloudera.org:8080/#/c/12940/11/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@945
PS11, Line 945: //TODO writeIDs may also be loaded in other code 
paths.
  : loadValidWriteIdList(client);
  : }
  :
  : final Timer storageLdTimer =
  : 
getMetrics().getTimer(Table.STORAGE_METADATA_LOAD_DURATION_METRIC);
  : storageMetadataLoadTime_ = 0;
  :
  : // Load partition and file metadata
  : if (reuseMetadata) {
  :   // Incrementally update this table's partitions and 
file metadata
  :   Preconditions.checkState(
  :   partitionsToUpdate == null || 
loadParitionFileMetadata);
  :
  :   try {
  : storageMetadataLoadTime_ += 
updateMdFromHmsTable(msTbl);
  : if (msTbl.getPartitionKeysSize() == 0) {
  :   if (loadParitionFileMetadata) {
  : storageMetadataLoadTime_ += 
updateUnpartitionedTableFileMd();
  :   }
  : } else {
  :   storageMetadataLoadTime_ += 
updatePartitionsFromHms(
  :   client, partitionsToUpdate, 
loadParitionFileMetadata);
  : }
  :   } finally {
  : storageLdTimer.update(storageMetadataLoadTime_, 
TimeUnit.NANOSECONDS);
  :   }
  :   LOG.info("Incrementally loaded table metadata for: " 
+ getFullName());
  : } else {
> Can we clean this up a bit by making all the update/load methods return the
Done


http://gerrit.cloudera.org:8080/#/c/12940/11/fe/src/main/java/org/apache/impala/catalog/Table.java
File fe/src/main/java/org/apache/impala/catalog/Table.java:

http://gerrit.cloudera.org:8080/#/c/12940/11/fe/src/main/java/org/apache/impala/catalog/Table.java@118
PS11, Line 118: s
> Mention that it is updated on every reload of the table?
Done


http://gerrit.cloudera.org:8080/#/c/12940/11/tests/query_test/test_observability.py
File tests/query_test/test_observability.py:

http://gerrit.cloudera.org:8080/#/c/12940/11/tests/query_test/test_observability.py@572
PS11, Line 572: end_time = tree.nodes[1].info_strings["End Time"]
  : assert len(end_time) == 0, end_time
  : # Save the last operation to be able to retrieve the 
profile after closing the query
  : last_op = handle.get_handle()._last_operation
  : self.hs2_client.close_query(handle)
  : tree = last_op.get_profile(TRuntimeProfileFormat.THRIFT)
  : end_time = tree.nodes[1].info_strings["End Time"]
  : assert end_time is not None
  :
  :   def 
test_query_profile_contains_number_of_fragment_instance(self):
  : """Test that the expected section for number of fragment 
instance in
  : a query profile."""
  : event_regexes = [r'Per Host Number of Fragment Instances']
  : query = "select count (*) from functional.alltypes"
  : runtime_profile = self.execute_query(query).runtime_profile
  : self.__verify_profile_event_sequence(event_regexes, 
runtime_profile)
  :
  :   def test_query_profile_storage_load_time_filesystem(self):
> you could just do something like 
The test_query_profile_storage_load_time2 only tests kudu and hbase. For hbase 
and kudu, we only use hdfs, other filessystems are skipped. In order to skip 
the filesystems such as S3, you need to tell a reason, so .hbase is added (But 
both kudu and hbase tests will be skipped). For the funational tests, all the 
filesystem will be tested, it does not skip any tests.  So we need two tests,  
I will change the two test names.



--
To view, visit http://gerrit.cloudera.org:8080/12940
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6dde7e394b7c1c396d835ef6aa0a55930c0a8660
Gerrit-Change-Number: 12940
Gerrit-PatchSet: 12
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Bharath 

[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile

2019-06-21 Thread Yongzhi Chen (Code Review)
Hello Bharath Vissapragada, Anurag Mantripragada, Vihang Karajgaonkar, Sahil 
Takiar, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/12940

to look at the new patch set (#12).

Change subject: IMPALA-7322: Add storage wait time to profile
..

IMPALA-7322: Add storage wait time to profile

Add metrics to record storage wait time for operations with
metadata load in catalog for hdfs, kudu and hbase tables.
Pass storage wait time from catalog to fe through thrift and log
total storage load time in query profile.
Storage-load-time is the amount of time spent loading
metadata from the underlying storage layer (e.g. S3, HDFS,
Kudu, HBase), which does not  include the amount of time
spending loading data from HMS.

Testing:
Ran queries that can trigger all of, none of or some of the related
tables loading. Check query profile for each query. Check catalog
metrics for each table.
Add unit tests to test_observability.py
Ran all core tests.

Sample output:
Profile:(storage-load-time is the added property):
After ran a hbase query (Metadata load finished is divided into
several lines because of limitation of commit message):
Query Compilation: 4s401ms
  - Metadata load started: 661.084us (661.084us)
  - Metadata load finished. loaded-tables=1/1
  load-requests=1 catalog-updates=3
  storage-load-time=233ms: 3s819ms (3s819ms)
 - Analysis finished: 3s820ms (763.979us)
 - Value transfer graph computed: 3s820ms (63.193us)
Catalog metrics(this sample is from a hdfs table):
storage-metadata-load-duration:
   Count: 1
   Mean rate: 0.0085
   1 min. rate: 0.032
   5 min. rate: 0.1386
   15 min. rate: 0.177
   Min (msec): 111
   Max (msec): 111
   Mean (msec): 111.1802
   Median (msec): 111.1802
   75th-% (msec): 111.1802
   95th-% (msec): 111.1802
   99th-% (msec): 111.1802
Change-Id: I6dde7e394b7c1c396d835ef6aa0a55930c0a8660
---
M common/thrift/CatalogObjects.thrift
M fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java
M fe/src/main/java/org/apache/impala/catalog/HBaseTable.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/catalog/KuduTable.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
M tests/query_test/test_observability.py
7 files changed, 142 insertions(+), 31 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/40/12940/12
--
To view, visit http://gerrit.cloudera.org:8080/12940
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I6dde7e394b7c1c396d835ef6aa0a55930c0a8660
Gerrit-Change-Number: 12940
Gerrit-PatchSet: 12
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yongzhi Chen 


[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile

2019-06-20 Thread Bharath Vissapragada (Code Review)
Bharath Vissapragada has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12940 )

Change subject: IMPALA-7322: Add storage wait time to profile
..


Patch Set 11:

(3 comments)

lgtm, a few suggestions for cleanup, can +2 after that.

http://gerrit.cloudera.org:8080/#/c/12940/11/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

http://gerrit.cloudera.org:8080/#/c/12940/11/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@945
PS11, Line 945:
  : final Timer storageLdTimer =
  : 
getMetrics().getTimer(Table.STORAGE_METADATA_LOAD_DURATION_METRIC);
  : final Clock clock = Clock.defaultClock();
  : long startTime;
  : storageMetadataLoadTime_ = 0;
  :
  : // Load partition and file metadata
  : if (reuseMetadata) {
  :   // Incrementally update this table's partitions and 
file metadata
  :   Preconditions.checkState(
  :   partitionsToUpdate == null || 
loadParitionFileMetadata);
  :
  :   try {
  : startTime = clock.getTick();
  : updateMdFromHmsTable(msTbl);
  : // Time spent on getting file system and check 
access levels.
  : storageMetadataLoadTime_ += clock.getTick() - 
startTime;
  : if (msTbl.getPartitionKeysSize() == 0) {
  :   startTime = clock.getTick();
  :   if (loadParitionFileMetadata) 
updateUnpartitionedTableFileMd();
  :   storageMetadataLoadTime_ += clock.getTick() - 
startTime;
  : } else {
  :   long partitionfileLoadTime = 
updatePartitionsFromHms(
  :   client, partitionsToUpdate, 
loadParitionFileMetadata);
  :   storageMetadataLoadTime_ += partitionfileLoadTime;
  : }
  :   } finally {
  : storageLdTim
Can we clean this up a bit by making all the update/load methods return the 
time spent in the loading, sum up all of them and finally update the metric?

I think that will be cleaner.


http://gerrit.cloudera.org:8080/#/c/12940/11/fe/src/main/java/org/apache/impala/catalog/Table.java
File fe/src/main/java/org/apache/impala/catalog/Table.java:

http://gerrit.cloudera.org:8080/#/c/12940/11/fe/src/main/java/org/apache/impala/catalog/Table.java@118
PS11, Line 118: .
Mention that it is updated on every reload of the table?


http://gerrit.cloudera.org:8080/#/c/12940/11/tests/query_test/test_observability.py
File tests/query_test/test_observability.py:

http://gerrit.cloudera.org:8080/#/c/12940/11/tests/query_test/test_observability.py@572
PS11, Line 572:   def test_query_profile_storage_load_time1(self):
  : """Test that when a query needs load metadata for table(s), 
the
  : storage load time should be in the profile. Tests file 
systems."""
  : self.__check_query_profile_storage_load_time("functional")
  :
  :   @SkipIfS3.hbase
  :   @SkipIfLocal.hbase
  :   @SkipIfIsilon.hbase
  :   @SkipIfABFS.hbase
  :   @SkipIfADLS.hbase
  :   def test_query_profile_storage_load_time2(self):
  : """Test that when a query needs load metadata for table(s), 
the
  : storage load time should be in the profile. Tests kudu and 
hbase."""
  : # KUDU table
  : 
self.__check_query_profile_storage_load_time("functional_kudu")
  :
  : # HBASE table
  : 
self.__check_query_profile_storage_load_time("functional_hbase")
you could just do something like

for suffix in ["", "_kudu", "_hbase"]:
  check_query_...("functional" + siffix)

Please avoid test names that are hard to follow.

Btw, why are we skipping hbase?



--
To view, visit http://gerrit.cloudera.org:8080/12940
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6dde7e394b7c1c396d835ef6aa0a55930c0a8660
Gerrit-Change-Number: 12940
Gerrit-PatchSet: 11
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Comment-Date: Thu, 20 Jun 2019 18:13:32 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile

2019-06-20 Thread Sahil Takiar (Code Review)
Sahil Takiar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12940 )

Change subject: IMPALA-7322: Add storage wait time to profile
..


Patch Set 11: Code-Review+1

Any rebase conflicts that need to be reviewed?

Otherwise, carrying +1


--
To view, visit http://gerrit.cloudera.org:8080/12940
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6dde7e394b7c1c396d835ef6aa0a55930c0a8660
Gerrit-Change-Number: 12940
Gerrit-PatchSet: 11
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Comment-Date: Thu, 20 Jun 2019 17:42:15 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile

2019-06-20 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12940 )

Change subject: IMPALA-7322: Add storage wait time to profile
..


Patch Set 11:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/3693/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/12940
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6dde7e394b7c1c396d835ef6aa0a55930c0a8660
Gerrit-Change-Number: 12940
Gerrit-PatchSet: 11
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Comment-Date: Thu, 20 Jun 2019 15:31:25 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile

2019-06-20 Thread Yongzhi Chen (Code Review)
Yongzhi Chen has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12940 )

Change subject: IMPALA-7322: Add storage wait time to profile
..


Patch Set 11:

Submit patch 11 for the rebase, thanks for the review.


--
To view, visit http://gerrit.cloudera.org:8080/12940
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6dde7e394b7c1c396d835ef6aa0a55930c0a8660
Gerrit-Change-Number: 12940
Gerrit-PatchSet: 11
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Comment-Date: Thu, 20 Jun 2019 14:51:45 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile

2019-06-20 Thread Yongzhi Chen (Code Review)
Hello Bharath Vissapragada, Anurag Mantripragada, Vihang Karajgaonkar, Sahil 
Takiar, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/12940

to look at the new patch set (#11).

Change subject: IMPALA-7322: Add storage wait time to profile
..

IMPALA-7322: Add storage wait time to profile

Add metrics to record storage wait time for operations with
metadata load in catalog for hdfs, kudu and hbase tables.
Pass storage wait time from catalog to fe through thrift and log
total storage load time in query profile.
Storage-load-time is the amount of time spent loading
metadata from the underlying storage layer (e.g. S3, HDFS,
Kudu, HBase), which does not  include the amount of time
spending loading data from HMS.

Testing:
Ran queries that can trigger all of, none of or some of the related
tables loading. Check query profile for each query. Check catalog
metrics for each table.
Add unit tests to test_observability.py
Ran all core tests.

Sample output:
Profile:(storage-load-time is the added property):
After ran a hbase query (Metadata load finished is divided into
several lines because of limitation of commit message):
Query Compilation: 4s401ms
  - Metadata load started: 661.084us (661.084us)
  - Metadata load finished. loaded-tables=1/1
  load-requests=1 catalog-updates=3
  storage-load-time=233ms: 3s819ms (3s819ms)
 - Analysis finished: 3s820ms (763.979us)
 - Value transfer graph computed: 3s820ms (63.193us)
Catalog metrics(this sample is from a hdfs table):
storage-metadata-load-duration:
   Count: 1
   Mean rate: 0.0085
   1 min. rate: 0.032
   5 min. rate: 0.1386
   15 min. rate: 0.177
   Min (msec): 111
   Max (msec): 111
   Mean (msec): 111.1802
   Median (msec): 111.1802
   75th-% (msec): 111.1802
   95th-% (msec): 111.1802
   99th-% (msec): 111.1802
Change-Id: I6dde7e394b7c1c396d835ef6aa0a55930c0a8660
---
M common/thrift/CatalogObjects.thrift
M fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java
M fe/src/main/java/org/apache/impala/catalog/HBaseTable.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/catalog/KuduTable.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
M tests/query_test/test_observability.py
7 files changed, 131 insertions(+), 25 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/40/12940/11
--
To view, visit http://gerrit.cloudera.org:8080/12940
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I6dde7e394b7c1c396d835ef6aa0a55930c0a8660
Gerrit-Change-Number: 12940
Gerrit-PatchSet: 11
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yongzhi Chen 


[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile

2019-06-18 Thread Bharath Vissapragada (Code Review)
Bharath Vissapragada has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12940 )

Change subject: IMPALA-7322: Add storage wait time to profile
..


Patch Set 10:

Yongzhi, feel free to rebase when you get a chance.


--
To view, visit http://gerrit.cloudera.org:8080/12940
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6dde7e394b7c1c396d835ef6aa0a55930c0a8660
Gerrit-Change-Number: 12940
Gerrit-PatchSet: 10
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Comment-Date: Tue, 18 Jun 2019 21:33:22 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile

2019-06-18 Thread Sahil Takiar (Code Review)
Sahil Takiar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12940 )

Change subject: IMPALA-7322: Add storage wait time to profile
..


Patch Set 10: Code-Review+1

Looks like this needs to be rebased, but overall LGTM.


--
To view, visit http://gerrit.cloudera.org:8080/12940
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6dde7e394b7c1c396d835ef6aa0a55930c0a8660
Gerrit-Change-Number: 12940
Gerrit-PatchSet: 10
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Comment-Date: Tue, 18 Jun 2019 20:55:39 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile

2019-05-30 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12940 )

Change subject: IMPALA-7322: Add storage wait time to profile
..


Patch Set 10:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/3448/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/12940
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6dde7e394b7c1c396d835ef6aa0a55930c0a8660
Gerrit-Change-Number: 12940
Gerrit-PatchSet: 10
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Comment-Date: Thu, 30 May 2019 21:14:48 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile

2019-05-30 Thread Yongzhi Chen (Code Review)
Yongzhi Chen has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12940 )

Change subject: IMPALA-7322: Add storage wait time to profile
..


Patch Set 10:

(2 comments)

Submit a new patch

http://gerrit.cloudera.org:8080/#/c/12940/9/fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java
File fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java:

http://gerrit.cloudera.org:8080/#/c/12940/9/fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java@239
PS9, Line 239: .filter(Table.class::isInstance)
 :   .map(Table.class::cast)
> do u need a space between ".class" and "::"
I did not do that, it was changed by the reformatting tool(recommended by a 
ramp-up doc):
git diff -U0 --no-color HEAD^ | 
"${IMPALA_TOOLCHAIN}/llvm-${IMPALA_LLVM_VERSION}/share/clang/clang-format-diff.py"
 -binary "${IMPALA_TOOLCHAIN}/llvm-${IMPALA_LLVM_VERSION}/bin/clang-format" -p1 
-i
I will remove the space.


http://gerrit.cloudera.org:8080/#/c/12940/7/fe/src/main/java/org/apache/impala/catalog/Table.java
File fe/src/main/java/org/apache/impala/catalog/Table.java:

http://gerrit.cloudera.org:8080/#/c/12940/7/fe/src/main/java/org/apache/impala/catalog/Table.java@198
PS7, Line 198: metrics_.addTimer(REFRESH_DURATION_METRIC);
> Then we should add a snippet of the updated metrics to the commit message (
Will add.



--
To view, visit http://gerrit.cloudera.org:8080/12940
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6dde7e394b7c1c396d835ef6aa0a55930c0a8660
Gerrit-Change-Number: 12940
Gerrit-PatchSet: 10
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Comment-Date: Thu, 30 May 2019 20:48:31 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile

2019-05-30 Thread Yongzhi Chen (Code Review)
Hello Bharath Vissapragada, Anurag Mantripragada, Vihang Karajgaonkar, Sahil 
Takiar, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/12940

to look at the new patch set (#10).

Change subject: IMPALA-7322: Add storage wait time to profile
..

IMPALA-7322: Add storage wait time to profile

Add metrics to record storage wait time for operations with
metadata load in catalog for hdfs, kudu and hbase tables.
Pass storage wait time from catalog to fe through thrift and log
total storage load time in query profile.
Storage-load-time is the amount of time spent loading
metadata from the underlying storage layer (e.g. S3, HDFS,
Kudu, HBase), which does not  include the amount of time
spending loading data from HMS.

Testing:
Ran queries that can trigger all of, none of or some of the related
tables loading. Check query profile for each query. Check catalog
metrics for each table.
Add unit tests to test_observability.py
Ran all core tests.

Sample output:
Profile:(storage-load-time is the added property):
After ran a hbase query (Metadata load finished is divided into
several lines because of limitation of commit message):
Query Compilation: 4s401ms
  - Metadata load started: 661.084us (661.084us)
  - Metadata load finished. loaded-tables=1/1
  load-requests=1 catalog-updates=3
  storage-load-time=233ms: 3s819ms (3s819ms)
 - Analysis finished: 3s820ms (763.979us)
 - Value transfer graph computed: 3s820ms (63.193us)
Catalog metrics(this sample is from a hdfs table):
storage-metadata-load-duration:
   Count: 1
   Mean rate: 0.0085
   1 min. rate: 0.032
   5 min. rate: 0.1386
   15 min. rate: 0.177
   Min (msec): 111
   Max (msec): 111
   Mean (msec): 111.1802
   Median (msec): 111.1802
   75th-% (msec): 111.1802
   95th-% (msec): 111.1802
   99th-% (msec): 111.1802
Change-Id: I6dde7e394b7c1c396d835ef6aa0a55930c0a8660
---
M common/thrift/CatalogObjects.thrift
M fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java
M fe/src/main/java/org/apache/impala/catalog/HBaseTable.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/catalog/KuduTable.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
M tests/query_test/test_observability.py
7 files changed, 131 insertions(+), 25 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/40/12940/10
--
To view, visit http://gerrit.cloudera.org:8080/12940
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I6dde7e394b7c1c396d835ef6aa0a55930c0a8660
Gerrit-Change-Number: 12940
Gerrit-PatchSet: 10
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yongzhi Chen 


[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile

2019-05-29 Thread Sahil Takiar (Code Review)
Sahil Takiar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12940 )

Change subject: IMPALA-7322: Add storage wait time to profile
..


Patch Set 7:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/12940/7/fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java
File fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java:

http://gerrit.cloudera.org:8080/#/c/12940/7/fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java@235
PS7, Line 235:   for (FeTable loadedTbl: loadedTbls_.values()) {
> I cannot see many advantages of using stream here, but change according to
IMO it's easier to understand.


http://gerrit.cloudera.org:8080/#/c/12940/9/fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java
File fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java:

http://gerrit.cloudera.org:8080/#/c/12940/9/fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java@239
PS9, Line 239:
 :   }
do u need a space between ".class" and "::"


http://gerrit.cloudera.org:8080/#/c/12940/7/fe/src/main/java/org/apache/impala/catalog/KuduTable.java
File fe/src/main/java/org/apache/impala/catalog/KuduTable.java:

http://gerrit.cloudera.org:8080/#/c/12940/7/fe/src/main/java/org/apache/impala/catalog/KuduTable.java@202
PS7, Line 202:   loadAllColumnStats(msClient);
> protected void loadAllColumnStats(IMetaStoreClient client)
hmm originally though ImpalaRuntimeException was a RuntimeException (as the 
name would imply), but I guess its not


http://gerrit.cloudera.org:8080/#/c/12940/7/fe/src/main/java/org/apache/impala/catalog/Table.java
File fe/src/main/java/org/apache/impala/catalog/Table.java:

http://gerrit.cloudera.org:8080/#/c/12940/7/fe/src/main/java/org/apache/impala/catalog/Table.java@198
PS7, Line 198: metrics_.addTimer(STORAGE_MD_LOAD_DURATION_METRIC);
> Yes, it is intentional. It is in the commit message: Add metrics to record
Then we should add a snippet of the updated metrics to the commit message 
(similar to what was done for the runtime profile).



--
To view, visit http://gerrit.cloudera.org:8080/12940
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6dde7e394b7c1c396d835ef6aa0a55930c0a8660
Gerrit-Change-Number: 12940
Gerrit-PatchSet: 7
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Comment-Date: Wed, 29 May 2019 16:40:45 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile

2019-05-28 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12940 )

Change subject: IMPALA-7322: Add storage wait time to profile
..


Patch Set 9:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/3416/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/12940
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6dde7e394b7c1c396d835ef6aa0a55930c0a8660
Gerrit-Change-Number: 12940
Gerrit-PatchSet: 9
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Comment-Date: Tue, 28 May 2019 23:09:09 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile

2019-05-28 Thread Yongzhi Chen (Code Review)
Yongzhi Chen has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12940 )

Change subject: IMPALA-7322: Add storage wait time to profile
..


Patch Set 9:

(9 comments)

Patch 9 should address most issues.

http://gerrit.cloudera.org:8080/#/c/12940/8//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/12940/8//COMMIT_MSG@9
PS8, Line 9: Add metrics to record storage wait time for operations with
> I think we need a clearer definition of "storage-load-time" spelled out in
Done


http://gerrit.cloudera.org:8080/#/c/12940/7/common/thrift/CatalogObjects.thrift
File common/thrift/CatalogObjects.thrift:

http://gerrit.cloudera.org:8080/#/c/12940/7/common/thrift/CatalogObjects.thrift@487
PS7, Line 487:   15: optional i64 storage_metadata_load_time
> nit: this patch seems to be interchanging "filesystem" and "storage" throug
Done


http://gerrit.cloudera.org:8080/#/c/12940/7/fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java
File fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java:

http://gerrit.cloudera.org:8080/#/c/12940/7/fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java@235
PS7, Line 235:   // the tables already loaded before the query was called).
> You can replace this for loop with a Java stream. Should look something lik
I cannot see many advantages of using stream here, but change according to the 
recommendations.


http://gerrit.cloudera.org:8080/#/c/12940/7/fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java@246
PS7, Line 246:   + "storage-load-time=%dms",
> nit: use TimeUnit to convert from nanoseconds to milliseconds:
Done


http://gerrit.cloudera.org:8080/#/c/12940/7/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

http://gerrit.cloudera.org:8080/#/c/12940/7/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@1045
PS7, Line 1045:* DROP + CREATE. It removes from this table partitions that 
no longer exist
> nit: return --> Returns
Done


http://gerrit.cloudera.org:8080/#/c/12940/7/fe/src/main/java/org/apache/impala/catalog/KuduTable.java
File fe/src/main/java/org/apache/impala/catalog/KuduTable.java:

http://gerrit.cloudera.org:8080/#/c/12940/7/fe/src/main/java/org/apache/impala/catalog/KuduTable.java@202
PS7, Line 202:   try {
> this changes the exception handling of `loadAllColumnStats`, if this method
protected void loadAllColumnStats(IMetaStoreClient client)
This method does not throw Exceptions.


http://gerrit.cloudera.org:8080/#/c/12940/7/fe/src/main/java/org/apache/impala/catalog/Table.java
File fe/src/main/java/org/apache/impala/catalog/Table.java:

http://gerrit.cloudera.org:8080/#/c/12940/7/fe/src/main/java/org/apache/impala/catalog/Table.java@117
PS7, Line 117:   // Time spent in the source systems loading the fs metadata.
> nit: change to storageMetadataLoadTime
Done


http://gerrit.cloudera.org:8080/#/c/12940/7/fe/src/main/java/org/apache/impala/catalog/Table.java@140
PS7, Line 140:   public static final String REFRESH_DURATION_METRIC = 
"refresh-duration";
> nit: change to STORAGE_METADATA_LOAD_DURATION_METRIC; I think abbreviating
Done


http://gerrit.cloudera.org:8080/#/c/12940/7/fe/src/main/java/org/apache/impala/catalog/Table.java@198
PS7, Line 198: metrics_.addTimer(REFRESH_DURATION_METRIC);
> I'm not sure if this is intentional, but this approach also adds this metri
Yes, it is intentional. It is in the commit message: Add metrics to record 
storage wait time for operations with
metadata load in catalog



--
To view, visit http://gerrit.cloudera.org:8080/12940
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6dde7e394b7c1c396d835ef6aa0a55930c0a8660
Gerrit-Change-Number: 12940
Gerrit-PatchSet: 9
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Comment-Date: Tue, 28 May 2019 22:31:02 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile

2019-05-28 Thread Yongzhi Chen (Code Review)
Hello Bharath Vissapragada, Anurag Mantripragada, Vihang Karajgaonkar, Sahil 
Takiar, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/12940

to look at the new patch set (#9).

Change subject: IMPALA-7322: Add storage wait time to profile
..

IMPALA-7322: Add storage wait time to profile

Add metrics to record storage wait time for operations with
metadata load in catalog for hdfs, kudu and hbase tables.
Pass storage wait time from catalog to fe through thrift and log
total storage load time in query profile.
Storage-load-time is the amount of time spent loading
metadata from the underlying storage layer (e.g. S3, HDFS,
Kudu, HBase), which does not  include the amount of time
spending loading data from HMS.

Testing:
Ran queries that can trigger all of, none of or some of the related
tables loading. Check query profile for each query. Check catalog
metrics for each table.
Add unit tests to test_observability.py
Ran all core tests.

Sample output(storage-load-time is the added property):
After run a hbase query (Metadata load finished is divided into
several lines because of limitation of commit message):
Query Compilation: 4s401ms
  - Metadata load started: 661.084us (661.084us)
  - Metadata load finished. loaded-tables=1/1
  load-requests=1 catalog-updates=3
  storage-load-time=233ms: 3s819ms (3s819ms)
 - Analysis finished: 3s820ms (763.979us)
 - Value transfer graph computed: 3s820ms (63.193us)
Change-Id: I6dde7e394b7c1c396d835ef6aa0a55930c0a8660
---
M common/thrift/CatalogObjects.thrift
M fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java
M fe/src/main/java/org/apache/impala/catalog/HBaseTable.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/catalog/KuduTable.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
M tests/query_test/test_observability.py
7 files changed, 131 insertions(+), 25 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/40/12940/9
--
To view, visit http://gerrit.cloudera.org:8080/12940
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I6dde7e394b7c1c396d835ef6aa0a55930c0a8660
Gerrit-Change-Number: 12940
Gerrit-PatchSet: 9
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yongzhi Chen 


[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile

2019-05-28 Thread Sahil Takiar (Code Review)
Sahil Takiar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12940 )

Change subject: IMPALA-7322: Add storage wait time to profile
..


Patch Set 3:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/12940/8//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/12940/8//COMMIT_MSG@9
PS8, Line 9: Add metrics to record storage loading time in catalog for hdfs, 
kudu
I think we need a clearer definition of "storage-load-time" spelled out in the 
commit message. We should be clear that it refers to the amount of time spent 
waiting loading metadata from the underlying storage layer (e.g. S3, HDFS, 
Kudu, HBase) and *not* does not include the amount of time spending loading 
data from HMS.


http://gerrit.cloudera.org:8080/#/c/12940/3/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

http://gerrit.cloudera.org:8080/#/c/12940/3/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@1237
PS3, Line 1237:   final Timer.Context ctxStg =
> Do not understand your comment.
I think Bharath means that this code is measuring the amount of time it takes 
to load metadata from HMS, but what we want to measure is the amount of time it 
takes to load metadata from HDFS (e.g. from the NameNode).

https://gerrit.cloudera.org/#/c/12950/ looks like it re-factors the code so all 
the filesystem loading is abstracted into a single method (I think 
loadFileMetadataForPartitions).



--
To view, visit http://gerrit.cloudera.org:8080/12940
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6dde7e394b7c1c396d835ef6aa0a55930c0a8660
Gerrit-Change-Number: 12940
Gerrit-PatchSet: 3
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Comment-Date: Tue, 28 May 2019 18:54:06 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile

2019-05-24 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12940 )

Change subject: IMPALA-7322: Add storage wait time to profile
..


Patch Set 8:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/3387/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/12940
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6dde7e394b7c1c396d835ef6aa0a55930c0a8660
Gerrit-Change-Number: 12940
Gerrit-PatchSet: 8
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Comment-Date: Fri, 24 May 2019 20:51:57 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile

2019-05-24 Thread Yongzhi Chen (Code Review)
Hello Bharath Vissapragada, Anurag Mantripragada, Vihang Karajgaonkar, Sahil 
Takiar, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/12940

to look at the new patch set (#8).

Change subject: IMPALA-7322: Add storage wait time to profile
..

IMPALA-7322: Add storage wait time to profile

Add metrics to record storage wait time for operations with
metadata load in catalog for hdfs, kudu and hbase tables.
Pass storage wait time from catalog to fe through thrift and log
total storage load time in query profile.

Testing:
Ran queries that can trigger all of, none of or some of the related
tables loading. Check query profile for each query. Check catalog
metrics for each table.
Add unit tests to test_observability.py
Ran all core tests.

Sample output(storage-load-time is the added property):
After run a hbase query (Metadata load finished is divided into
several lines because of limitation of commit message):
Query Compilation: 4s401ms
  - Metadata load started: 661.084us (661.084us)
  - Metadata load finished. loaded-tables=1/1
  load-requests=1 catalog-updates=3
  storage-load-time=233ms: 3s819ms (3s819ms)
 - Analysis finished: 3s820ms (763.979us)
 - Value transfer graph computed: 3s820ms (63.193us)
Change-Id: I6dde7e394b7c1c396d835ef6aa0a55930c0a8660
---
M common/thrift/CatalogObjects.thrift
M fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java
M fe/src/main/java/org/apache/impala/catalog/HBaseTable.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/catalog/KuduTable.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
M tests/query_test/test_observability.py
7 files changed, 126 insertions(+), 25 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/40/12940/8
--
To view, visit http://gerrit.cloudera.org:8080/12940
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I6dde7e394b7c1c396d835ef6aa0a55930c0a8660
Gerrit-Change-Number: 12940
Gerrit-PatchSet: 8
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yongzhi Chen 


[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile

2019-05-24 Thread Sahil Takiar (Code Review)
Sahil Takiar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12940 )

Change subject: IMPALA-7322: Add storage wait time to profile
..


Patch Set 7:

(9 comments)

Haven't gone through the code in HdfsTable.java yet, but here are several more 
comments.

http://gerrit.cloudera.org:8080/#/c/12940/7/common/thrift/CatalogObjects.thrift
File common/thrift/CatalogObjects.thrift:

http://gerrit.cloudera.org:8080/#/c/12940/7/common/thrift/CatalogObjects.thrift@487
PS7, Line 487:   15: optional i64 filesystem_metadata_load_time
nit: this patch seems to be interchanging "filesystem" and "storage" throughout 
the variable names. I would suggest just sticking to the more generic "storage" 
rather than "filesystem" throughout this patch, mainly because we typically use 
"filesystem" to refer to HDFS, whereas HBase / Kudu are more storage systems, 
since they really provide typical filesystem semantics


http://gerrit.cloudera.org:8080/#/c/12940/7/fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java
File fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java:

http://gerrit.cloudera.org:8080/#/c/12940/7/fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java@235
PS7, Line 235:   for (FeTable loadedTbl: loadedTbls_.values()) {
You can replace this for loop with a Java stream. Should look something like:

 long storageLoadTimeNano = 
loadedTbls_.values().stream().filter(Table.class::isInstance).map(Table.class::cast).mapToLong(Table::getStorageLoadTime()).sum();

Syntax might not be exactly correct, but that should be the general idea.


http://gerrit.cloudera.org:8080/#/c/12940/7/fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java@246
PS7, Line 246:   storageLoadTimeNano/100));
nit: use TimeUnit to convert from nanoseconds to milliseconds:

 TimeUnit.MILLISECONDS.convert(storageLoadTimeNano, TimeUnit.NANOSECONDS);


http://gerrit.cloudera.org:8080/#/c/12940/7/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

http://gerrit.cloudera.org:8080/#/c/12940/7/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@1045
PS7, Line 1045:* the table partitions if 'partitionsToUpdate' is null. 
return the time spent
nit: return --> Returns


http://gerrit.cloudera.org:8080/#/c/12940/7/fe/src/main/java/org/apache/impala/catalog/KuduTable.java
File fe/src/main/java/org/apache/impala/catalog/KuduTable.java:

http://gerrit.cloudera.org:8080/#/c/12940/7/fe/src/main/java/org/apache/impala/catalog/KuduTable.java@202
PS7, Line 202:   loadAllColumnStats(msClient);
this changes the exception handling of `loadAllColumnStats`, if this method 
throws an `ImpalaRuntimeException` it is no longer caught and re-thrown.

I would suggest moving this back to its original location, and just using the 
timer to surround the call to `loadSchemaFromKudu`


http://gerrit.cloudera.org:8080/#/c/12940/7/fe/src/main/java/org/apache/impala/catalog/Table.java
File fe/src/main/java/org/apache/impala/catalog/Table.java:

http://gerrit.cloudera.org:8080/#/c/12940/7/fe/src/main/java/org/apache/impala/catalog/Table.java@117
PS7, Line 117:   protected long fileSystemMdLoadTime_=0;
nit: change to storageMetadataLoadTime


http://gerrit.cloudera.org:8080/#/c/12940/7/fe/src/main/java/org/apache/impala/catalog/Table.java@140
PS7, Line 140:   public static final String STORAGE_MD_LOAD_DURATION_METRIC =
nit: change to STORAGE_METADATA_LOAD_DURATION_METRIC; I think abbreviating 
METADATA as MD is a little unclear

would be good to add some javadocs specific to this metric explaining what 
exactly it measures


http://gerrit.cloudera.org:8080/#/c/12940/7/fe/src/main/java/org/apache/impala/catalog/Table.java@198
PS7, Line 198: metrics_.addTimer(STORAGE_MD_LOAD_DURATION_METRIC);
I'm not sure if this is intentional, but this approach also adds this metric to 
the catalogd table metrics. so if you look at:

 http://[hostname-running-catalogd]:25020/table_metrics?name=default.web_returns

this metric will show up there as well. I actually think its useful to have 
this in the catalogd UI web page as well, but we should add this to the commit 
message as well.


http://gerrit.cloudera.org:8080/#/c/12940/7/fe/src/main/java/org/apache/impala/catalog/Table.java@203
PS7, Line 203:   public long getStorageLoadTime() { return 
fileSystemMdLoadTime_; }
nit: since this is a public method, it should have some code comments



--
To view, visit http://gerrit.cloudera.org:8080/12940
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6dde7e394b7c1c396d835ef6aa0a55930c0a8660
Gerrit-Change-Number: 12940
Gerrit-PatchSet: 7
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: 

[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile

2019-05-24 Thread Sahil Takiar (Code Review)
Sahil Takiar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12940 )

Change subject: IMPALA-7322: Add storage wait time to profile
..


Patch Set 7:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/12940/6/tests/query_test/test_observability.py
File tests/query_test/test_observability.py:

http://gerrit.cloudera.org:8080/#/c/12940/6/tests/query_test/test_observability.py@567
PS6, Line 567:   def test_query_profile_storage_load_time(self):
> The skips are all related to hbase. It is the same as other hbase related t
If you include the annotation @SkipIfS3 or @SkipIfABFS then it won't run during 
our automated Impala-S3 / Impala-ABFS tests. The ".hbase" suffix just displays 
the reason why it was disabled.

I think this test needs to be split up so that we separate the HBase / Kudu 
specific portion of this test from the HDFS portion.

It's a bit confusing, but HDFS specific tests are capable of running against S3 
/ ADLS, because instead of storing the data on HDFS, it is stored on S3 or 
ADLS. Whereas for HBase / Kudu, there is no equivalent of running a HBase / 
Kudu specific test against S3.

The reason is that HDFS provides a FileSystem interface that allows clients to 
interact with any storage system, notably S3 and ADLS. Whereas clients of HBase 
/ Kudu don't use the FileSystem interface.



--
To view, visit http://gerrit.cloudera.org:8080/12940
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6dde7e394b7c1c396d835ef6aa0a55930c0a8660
Gerrit-Change-Number: 12940
Gerrit-PatchSet: 7
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Comment-Date: Fri, 24 May 2019 17:23:23 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile

2019-05-24 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12940 )

Change subject: IMPALA-7322: Add storage wait time to profile
..


Patch Set 7:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/3381/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/12940
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6dde7e394b7c1c396d835ef6aa0a55930c0a8660
Gerrit-Change-Number: 12940
Gerrit-PatchSet: 7
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Comment-Date: Fri, 24 May 2019 16:36:28 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile

2019-05-24 Thread Yongzhi Chen (Code Review)
Hello Bharath Vissapragada, Anurag Mantripragada, Vihang Karajgaonkar, Sahil 
Takiar, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/12940

to look at the new patch set (#7).

Change subject: IMPALA-7322: Add storage wait time to profile
..

IMPALA-7322: Add storage wait time to profile

Add metrics to record storage wait time for operations with
metadata load in catalog for hdfs, kudu and hbase tables.
Pass storage wait time from catalog to fe through thrift and log
total storage load time in query profile.

Testing:
Ran queries that can trigger all of, none of or some of the related
tables loading. Check query profile for each query. Check catalog
metrics for each table.
Add unit tests to test_observability.py
Ran all core tests.

Sample output(storage-load-time is the added property):
After run a hbase query (Metadata load finished is divided into
several lines because of limitation of commit message):
Query Compilation: 4s401ms
  - Metadata load started: 661.084us (661.084us)
  - Metadata load finished. loaded-tables=1/1
  load-requests=1 catalog-updates=3
  storage-load-time=233ms: 3s819ms (3s819ms)
 - Analysis finished: 3s820ms (763.979us)
 - Value transfer graph computed: 3s820ms (63.193us)
Change-Id: I6dde7e394b7c1c396d835ef6aa0a55930c0a8660
---
M common/thrift/CatalogObjects.thrift
M fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java
M fe/src/main/java/org/apache/impala/catalog/HBaseTable.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/catalog/KuduTable.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
M testdata/cluster/node_templates/common/etc/kudu/master.conf.tmpl
M testdata/cluster/node_templates/common/etc/kudu/tserver.conf.tmpl
M tests/query_test/test_observability.py
9 files changed, 132 insertions(+), 31 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/40/12940/7
--
To view, visit http://gerrit.cloudera.org:8080/12940
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I6dde7e394b7c1c396d835ef6aa0a55930c0a8660
Gerrit-Change-Number: 12940
Gerrit-PatchSet: 7
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yongzhi Chen 


[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile

2019-05-24 Thread Yongzhi Chen (Code Review)
Yongzhi Chen has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12940 )

Change subject: IMPALA-7322: Add storage wait time to profile
..


Patch Set 6:

(13 comments)

Will submit patch set 7

http://gerrit.cloudera.org:8080/#/c/12940/6//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/12940/6//COMMIT_MSG@26
PS6, Line 26:   - Metadata load finished. loaded-tables=1/1
> It's not clear to me what part of this profile existing before the patch, a
Done


http://gerrit.cloudera.org:8080/#/c/12940/1/common/thrift/CatalogObjects.thrift
File common/thrift/CatalogObjects.thrift:

http://gerrit.cloudera.org:8080/#/c/12940/1/common/thrift/CatalogObjects.thrift@474
PS1, Line 474: 12: optional TDataSourceTable data_source_table
> Maybe say "Set if this table needs storage access" ?
Done


http://gerrit.cloudera.org:8080/#/c/12940/3/common/thrift/CatalogObjects.thrift
File common/thrift/CatalogObjects.thrift:

http://gerrit.cloudera.org:8080/#/c/12940/3/common/thrift/CatalogObjects.thrift@476
PS3, Line 476:  kudu table
> Try to use meaningful variable names whenever possible (for readability). s
Done


http://gerrit.cloudera.org:8080/#/c/12940/3/fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java
File fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java:

http://gerrit.cloudera.org:8080/#/c/12940/3/fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java@229
PS3, Line 229:
> meaningful variable names (for readability)
Done


http://gerrit.cloudera.org:8080/#/c/12940/6/fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java
File fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java:

http://gerrit.cloudera.org:8080/#/c/12940/6/fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java@232
PS6, Line 232:   long storageLdTmNano = 0;
> nit: I think storageLoadTimeNano is clearer
Done


http://gerrit.cloudera.org:8080/#/c/12940/6/fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java@234
PS6, Line 234:   // the tables already loaded before the query called).
> nit: before the the query *was* called
Done


http://gerrit.cloudera.org:8080/#/c/12940/6/fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java@235
PS6, Line 235:   for (FeTable ldTbl: loadedTbls_.values()) {
> nit: ldTbl --> loadedTbl
Done


http://gerrit.cloudera.org:8080/#/c/12940/6/fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java@259
PS6, Line 259:
> nit: is this necessary?
Done


http://gerrit.cloudera.org:8080/#/c/12940/6/fe/src/main/java/org/apache/impala/catalog/HBaseTable.java
File fe/src/main/java/org/apache/impala/catalog/HBaseTable.java:

http://gerrit.cloudera.org:8080/#/c/12940/6/fe/src/main/java/org/apache/impala/catalog/HBaseTable.java@105
PS6, Line 105:   final Timer.Context ctxStorageLdTm =
> nit: I think storageLoadTimer is a better variable name; in general I would
Done


http://gerrit.cloudera.org:8080/#/c/12940/6/fe/src/main/java/org/apache/impala/catalog/HBaseTable.java@120
PS6, Line 120:
> nit: is this necessary
Done


http://gerrit.cloudera.org:8080/#/c/12940/1/tests/query_test/test_observability.py
File tests/query_test/test_observability.py:

http://gerrit.cloudera.org:8080/#/c/12940/1/tests/query_test/test_observability.py@573
PS1, Line 573: runtime_profile = self.execute_query(query).runtime_profile
> How about having the method signature as test_query_profile_storage_load_ti
Done


http://gerrit.cloudera.org:8080/#/c/12940/6/tests/query_test/test_observability.py
File tests/query_test/test_observability.py:

http://gerrit.cloudera.org:8080/#/c/12940/6/tests/query_test/test_observability.py@567
PS6, Line 567:   def test_query_profile_storge_load_time(self):
> the setup of this test seems a little off. right now this gets skipped for
The skips are all related to hbase. It is the same as other hbase related tests 
in the file. I thought Hbase may not work well with these system. I did not 
block tests for s3 against HDFS. And I thought if we do not skip S3, it will be 
run by some kind of tests, right? I will make code change related to the 
duplicate codes.


http://gerrit.cloudera.org:8080/#/c/12940/6/tests/query_test/test_observability.py@575
PS6, Line 575: # Call the second time, no metastore loading needed.
> is this a HDFS specific feature? do we cache metadata for Kudu and HBase ta
We call Kudu and HBase during metadata loading. For example
loadSchemaFromKudu() and warming up for HBase (first connect to HBase)



--
To view, visit http://gerrit.cloudera.org:8080/12940
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6dde7e394b7c1c396d835ef6aa0a55930c0a8660
Gerrit-Change-Number: 12940
Gerrit-PatchSet: 6
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Anurag Mantripragada 

[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile

2019-05-23 Thread Sahil Takiar (Code Review)
Sahil Takiar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12940 )

Change subject: IMPALA-7322: Add storage wait time to profile
..


Patch Set 3:

(9 comments)

a few comments / questions; still working on understanding the internals of 
this code, so will add some more comments later;

http://gerrit.cloudera.org:8080/#/c/12940/6//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/12940/6//COMMIT_MSG@26
PS6, Line 26:   load-requests=1 catalog-updates=3
It's not clear to me what part of this profile existing before the patch, and 
what was added by this patch. Can you specifically call out the parts that 
changed? I think it makes the commit message more informative.


http://gerrit.cloudera.org:8080/#/c/12940/6/fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java
File fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java:

http://gerrit.cloudera.org:8080/#/c/12940/6/fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java@232
PS6, Line 232:   for (FeTable ldTbl: loadedTbls_.values()) {
nit: I think storageLoadTimeNano is clearer


http://gerrit.cloudera.org:8080/#/c/12940/6/fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java@234
PS6, Line 234: 
requestedTbls.contains(((Table)ldTbl).getTableName())) {
nit: before the the query *was* called


http://gerrit.cloudera.org:8080/#/c/12940/6/fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java@235
PS6, Line 235:   strgTmNano += ((Table)ldTbl).getStorageLoadTime();
nit: ldTbl --> loadedTbl


http://gerrit.cloudera.org:8080/#/c/12940/6/fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java@259
PS6, Line 259: Set viewTbls = new HashSet<>();
nit: is this necessary?


http://gerrit.cloudera.org:8080/#/c/12940/6/fe/src/main/java/org/apache/impala/catalog/HBaseTable.java
File fe/src/main/java/org/apache/impala/catalog/HBaseTable.java:

http://gerrit.cloudera.org:8080/#/c/12940/6/fe/src/main/java/org/apache/impala/catalog/HBaseTable.java@105
PS6, Line 105:   final Timer.Context ctxStg =
nit: I think storageLoadTimer is a better variable name; in general I wouldn't 
abbreviate words inside variables names unless really necessary (e.g. if it 
would make the variable name way to long)


http://gerrit.cloudera.org:8080/#/c/12940/6/fe/src/main/java/org/apache/impala/catalog/HBaseTable.java@120
PS6, Line 120:   // Set table stats.
nit: is this necessary


http://gerrit.cloudera.org:8080/#/c/12940/6/tests/query_test/test_observability.py
File tests/query_test/test_observability.py:

http://gerrit.cloudera.org:8080/#/c/12940/6/tests/query_test/test_observability.py@567
PS6, Line 567:
the setup of this test seems a little off. right now this gets skipped for S3, 
Local, Isilon, ABFS, and ADLS tests; but I see no reason why this feature 
shouldn't work for all the files stores as well right? especially for S3 where 
storage load time could be long because S3 metadata operations can take a long 
time.

I would suggest re-organizing this test as follows:

* There is a lot of duplicate code here, I would suggest creating an internal 
helper method that runs the invalidate query, runs the select count(*) against 
a configurable database, checks for the storage-load-time, re-runs the query, 
and checks for storage-load-time
* you could either use the Python annotations to setup the correct test matrix, 
or use the IS_* flags from tests.util.filesystem_utils

I think the correct test matrix would be:

* When running regular tests (e.g. against HDFS), run the queries against HDFS, 
HBase, and Kudu
* When running S3 tests, only run against S3 (S3 tests load all HDFS tables 
onto S3, so you can use functional.alltypes; unlike HBase and Kudu, there is no 
S3 specific table)
* The setup for S3 should apply to Local, Isilon, ABFS, ADLS as well


http://gerrit.cloudera.org:8080/#/c/12940/6/tests/query_test/test_observability.py@575
PS6, Line 575: storage load time should be in the profile."""
is this a HDFS specific feature? do we cache metadata for Kudu and HBase tables 
too?



--
To view, visit http://gerrit.cloudera.org:8080/12940
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6dde7e394b7c1c396d835ef6aa0a55930c0a8660
Gerrit-Change-Number: 12940
Gerrit-PatchSet: 3
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Comment-Date: Thu, 23 May 2019 20:18:53 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile

2019-05-22 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12940 )

Change subject: IMPALA-7322: Add storage wait time to profile
..


Patch Set 6:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/3337/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/12940
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6dde7e394b7c1c396d835ef6aa0a55930c0a8660
Gerrit-Change-Number: 12940
Gerrit-PatchSet: 6
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Comment-Date: Wed, 22 May 2019 20:51:14 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile

2019-05-22 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12940 )

Change subject: IMPALA-7322: Add storage wait time to profile
..


Patch Set 5:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/3335/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/12940
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6dde7e394b7c1c396d835ef6aa0a55930c0a8660
Gerrit-Change-Number: 12940
Gerrit-PatchSet: 5
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Comment-Date: Wed, 22 May 2019 20:47:33 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile

2019-05-22 Thread Yongzhi Chen (Code Review)
Hello Bharath Vissapragada, Anurag Mantripragada, Vihang Karajgaonkar, Sahil 
Takiar, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/12940

to look at the new patch set (#6).

Change subject: IMPALA-7322: Add storage wait time to profile
..

IMPALA-7322: Add storage wait time to profile

Add metrics to record storage wait time for operations with
metadata load in catalog for hdfs, kudu and hbase tables.
Pass storage wait time from catalog to fe through thrift and log
total storage load time in query profile.

Testing:
Ran queries that can trigger all of, none of or some of the related
tables loading. Check query profile for each query. Check catalog
metrics for each table.
Add unit tests to test_observability.py
Ran all core tests.

Sample output:
After run a hbase query (Metadata load finished is divided into
several lines because of limitation of commit message):
Query Compilation: 4s401ms
  - Metadata load started: 661.084us (661.084us)
  - Metadata load finished. loaded-tables=1/1
  load-requests=1 catalog-updates=3
  storage-load-time=233ms: 3s819ms (3s819ms)
 - Analysis finished: 3s820ms (763.979us)
 - Value transfer graph computed: 3s820ms (63.193us)
Change-Id: I6dde7e394b7c1c396d835ef6aa0a55930c0a8660
---
M common/thrift/CatalogObjects.thrift
M fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java
M fe/src/main/java/org/apache/impala/catalog/HBaseTable.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/catalog/KuduTable.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
M tests/query_test/test_observability.py
7 files changed, 129 insertions(+), 25 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/40/12940/6
--
To view, visit http://gerrit.cloudera.org:8080/12940
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I6dde7e394b7c1c396d835ef6aa0a55930c0a8660
Gerrit-Change-Number: 12940
Gerrit-PatchSet: 6
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yongzhi Chen 


[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile

2019-05-22 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12940 )

Change subject: IMPALA-7322: Add storage wait time to profile
..


Patch Set 5:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/12940/5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

http://gerrit.cloudera.org:8080/#/c/12940/5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@930
PS5, Line 930: final Timer storageLdTimer = 
getMetrics().getTimer(Table.STORAGE_MD_LOAD_DURATION_METRIC);
line too long (98 > 90)


http://gerrit.cloudera.org:8080/#/c/12940/5/fe/src/main/java/org/apache/impala/catalog/Table.java
File fe/src/main/java/org/apache/impala/catalog/Table.java:

http://gerrit.cloudera.org:8080/#/c/12940/5/fe/src/main/java/org/apache/impala/catalog/Table.java@140
PS5, Line 140:   public static final String STORAGE_MD_LOAD_DURATION_METRIC = 
"storage-metadata-load-duration";
line too long (96 > 90)



--
To view, visit http://gerrit.cloudera.org:8080/12940
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6dde7e394b7c1c396d835ef6aa0a55930c0a8660
Gerrit-Change-Number: 12940
Gerrit-PatchSet: 5
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Comment-Date: Wed, 22 May 2019 19:46:55 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile

2019-05-22 Thread Yongzhi Chen (Code Review)
Hello Bharath Vissapragada, Anurag Mantripragada, Vihang Karajgaonkar, Sahil 
Takiar, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/12940

to look at the new patch set (#5).

Change subject: IMPALA-7322: Add storage wait time to profile
..

IMPALA-7322: Add storage wait time to profile

Add metrics to record storage wait time for operations with
metadata load in catalog for hdfs, kudu and hbase tables.
Pass storage wait time from catalog to fe through thrift and log
total storage load time in query profile.

Testing:
Ran queries that can trigger all of, none of or some of the related
tables loading. Check query profile for each query. Check catalog
metrics for each table.
Add unit tests to test_observability.py
Ran all core tests.

Sample output:
After run a hbase query (Metadata load finished is divided into
several lines because of limitation of commit message):
Query Compilation: 4s401ms
  - Metadata load started: 661.084us (661.084us)
  - Metadata load finished. loaded-tables=1/1
  load-requests=1 catalog-updates=3
  storage-load-time=233ms: 3s819ms (3s819ms)
 - Analysis finished: 3s820ms (763.979us)
 - Value transfer graph computed: 3s820ms (63.193us)
Change-Id: I6dde7e394b7c1c396d835ef6aa0a55930c0a8660
---
M common/thrift/CatalogObjects.thrift
M fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java
M fe/src/main/java/org/apache/impala/catalog/HBaseTable.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/catalog/KuduTable.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
M tests/query_test/test_observability.py
7 files changed, 127 insertions(+), 25 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/40/12940/5
--
To view, visit http://gerrit.cloudera.org:8080/12940
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I6dde7e394b7c1c396d835ef6aa0a55930c0a8660
Gerrit-Change-Number: 12940
Gerrit-PatchSet: 5
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yongzhi Chen 


[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile

2019-04-11 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12940 )

Change subject: IMPALA-7322: Add storage wait time to profile
..


Patch Set 4:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/2739/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/12940
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6dde7e394b7c1c396d835ef6aa0a55930c0a8660
Gerrit-Change-Number: 12940
Gerrit-PatchSet: 4
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Comment-Date: Thu, 11 Apr 2019 17:15:20 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile

2019-04-11 Thread Yongzhi Chen (Code Review)
Hello Bharath Vissapragada, Anurag Mantripragada, Vihang Karajgaonkar, Impala 
Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/12940

to look at the new patch set (#4).

Change subject: IMPALA-7322: Add storage wait time to profile
..

IMPALA-7322: Add storage wait time to profile

Add metrics to record storage loading time in catalog for hdfs, kudu
and hbase tables.
Pass storage wait time from catalog to fe through thrift and log
total storage loading time in query profile.

Testing:
Run queries that can trigger all of, none of or some of the related
tables loading. Check query profile for each query. Check catalog
metrics for each table.
Add unit test to test_observability.py

Sample output:
After run a hbase query(Metadata load finished is divided into several
lines because of limitation of commit message):
Query Compilation: 4s401ms
  - Metadata load started: 661.084us (661.084us)
  - Metadata load finished. loaded-tables=1/1
  load-requests=1 catalog-updates=3
  storage-load-time=233ms: 3s819ms (3s819ms)
 - Analysis finished: 3s820ms (763.979us)
 - Value transfer graph computed: 3s820ms (63.193us)
Change-Id: I6dde7e394b7c1c396d835ef6aa0a55930c0a8660
---
M common/thrift/CatalogObjects.thrift
M fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java
M fe/src/main/java/org/apache/impala/catalog/HBaseTable.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/catalog/KuduTable.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
M tests/query_test/test_observability.py
7 files changed, 103 insertions(+), 17 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/40/12940/4
--
To view, visit http://gerrit.cloudera.org:8080/12940
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I6dde7e394b7c1c396d835ef6aa0a55930c0a8660
Gerrit-Change-Number: 12940
Gerrit-PatchSet: 4
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yongzhi Chen 


[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile

2019-04-09 Thread Yongzhi Chen (Code Review)
Yongzhi Chen has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12940 )

Change subject: IMPALA-7322: Add storage wait time to profile
..


Patch Set 3:

(4 comments)

Will fix the names, please see my responses for other comments.

http://gerrit.cloudera.org:8080/#/c/12940/3/fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java
File fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java:

http://gerrit.cloudera.org:8080/#/c/12940/3/fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java@232
PS3, Line 232: loadedTbls_
> How do you distinguish between tables loaded now vs tables already loaded.
See line 234:
requestedTbls.contains(((Table)ldTbl).getTableName()))


http://gerrit.cloudera.org:8080/#/c/12940/3/fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java@233
PS3, Line 233: ldTbl instanceof Table
> This is kinda obvious, force a downcast?
FeTable is a interface, Table just implement it, so not all FeTable is Table. 
So I think it is necessary.
private final Map loadedTbls_ = new HashMap<>();


http://gerrit.cloudera.org:8080/#/c/12940/3/fe/src/main/java/org/apache/impala/catalog/HBaseTable.java
File fe/src/main/java/org/apache/impala/catalog/HBaseTable.java:

http://gerrit.cloudera.org:8080/#/c/12940/3/fe/src/main/java/org/apache/impala/catalog/HBaseTable.java@109
PS3, Line 109:  hbaseTableName_ = Util.getHBaseTableName(getMetaStoreTable());
 : // Warm up the connection and verify the table exists.
 : Util.getHBaseTable(hbaseTableName_).close();
 : columnFamilies_ = null;
 : cols = Util.loadColumns(msTable_);
 :   } finally {
 : storageLdTime_ = ctxStg.stop();
> I don't think any of this calls into HBase
Util.getHBaseTable(hbaseTableName_).close();
 is the only call in load method that calls hbase.


http://gerrit.cloudera.org:8080/#/c/12940/3/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

http://gerrit.cloudera.org:8080/#/c/12940/3/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@1237
PS3, Line 1237:   final Timer.Context ctxStg =
> I think the scopes for metrics are incorrect. I'd suggest to do this on top
Do not understand your comment.



--
To view, visit http://gerrit.cloudera.org:8080/12940
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6dde7e394b7c1c396d835ef6aa0a55930c0a8660
Gerrit-Change-Number: 12940
Gerrit-PatchSet: 3
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Yongzhi Chen 
Gerrit-Comment-Date: Tue, 09 Apr 2019 14:09:31 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile

2019-04-08 Thread Bharath Vissapragada (Code Review)
Bharath Vissapragada has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12940 )

Change subject: IMPALA-7322: Add storage wait time to profile
..


Patch Set 3:

(14 comments)

http://gerrit.cloudera.org:8080/#/c/12940/3/common/thrift/CatalogObjects.thrift
File common/thrift/CatalogObjects.thrift:

http://gerrit.cloudera.org:8080/#/c/12940/3/common/thrift/CatalogObjects.thrift@474
PS3, Line 474: Set iff this is a table needs storage access.
I think this is vague. Better description?


http://gerrit.cloudera.org:8080/#/c/12940/3/common/thrift/CatalogObjects.thrift@476
PS3, Line 476: storage_ld_time
Try to use meaningful variable names whenever possible (for readability). say 
something like filesystem_metadata_load_time or something along those lines?


http://gerrit.cloudera.org:8080/#/c/12940/3/common/thrift/CatalogObjects.thrift@476
PS3, Line 476: optional
Why is it optional?


http://gerrit.cloudera.org:8080/#/c/12940/3/fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java
File fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java:

http://gerrit.cloudera.org:8080/#/c/12940/3/fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java@229
PS3, Line 229: strgTmNano
meaningful variable names (for readability)


http://gerrit.cloudera.org:8080/#/c/12940/3/fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java@232
PS3, Line 232: ldTbl
Better variable names


http://gerrit.cloudera.org:8080/#/c/12940/3/fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java@232
PS3, Line 232: loadedTbls_
How do you distinguish between tables loaded now vs tables already loaded.

For ex: if 3 of 5 tables need loading (meaning 2 already cached), this loop 
seems to sum the metric for all the 5 tables.


http://gerrit.cloudera.org:8080/#/c/12940/3/fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java@233
PS3, Line 233: ldTbl instanceof Table
This is kinda obvious, force a downcast?


http://gerrit.cloudera.org:8080/#/c/12940/3/fe/src/main/java/org/apache/impala/catalog/HBaseTable.java
File fe/src/main/java/org/apache/impala/catalog/HBaseTable.java:

http://gerrit.cloudera.org:8080/#/c/12940/3/fe/src/main/java/org/apache/impala/catalog/HBaseTable.java@105
PS3, Line 105: ctxStg
better variable names


http://gerrit.cloudera.org:8080/#/c/12940/3/fe/src/main/java/org/apache/impala/catalog/HBaseTable.java@109
PS3, Line 109:  hbaseTableName_ = Util.getHBaseTableName(getMetaStoreTable());
 : // Warm up the connection and verify the table exists.
 : Util.getHBaseTable(hbaseTableName_).close();
 : columnFamilies_ = null;
 : cols = Util.loadColumns(msTable_);
 :   } finally {
 : storageLdTime_ = ctxStg.stop();
I don't think any of this calls into HBase


http://gerrit.cloudera.org:8080/#/c/12940/3/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

http://gerrit.cloudera.org:8080/#/c/12940/3/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@1215
PS3, Line 1215: storageLdTm
Better variable names


http://gerrit.cloudera.org:8080/#/c/12940/3/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@1237
PS3, Line 1237:   final Timer.Context ctxStg =
I think the scopes for metrics are incorrect. I'd suggest to do this on top of 
https://gerrit.cloudera.org/#/c/12950/ which refactors the code nicely.


http://gerrit.cloudera.org:8080/#/c/12940/3/fe/src/main/java/org/apache/impala/catalog/Table.java
File fe/src/main/java/org/apache/impala/catalog/Table.java:

http://gerrit.cloudera.org:8080/#/c/12940/3/fe/src/main/java/org/apache/impala/catalog/Table.java@115
PS3, Line 115:   // Storage load time for the table
Time spent in the source systems loading the fs metadata... (or something 
similar?)


http://gerrit.cloudera.org:8080/#/c/12940/3/fe/src/main/java/org/apache/impala/catalog/Table.java@116
PS3, Line 116: storageLdTime_
better name? fileSystemMdLoadTime or something along those lines?


http://gerrit.cloudera.org:8080/#/c/12940/3/fe/src/main/java/org/apache/impala/catalog/Table.java@193
PS3, Line 193: STORAGE_LOAD_DURATION_METRIC
better name



--
To view, visit http://gerrit.cloudera.org:8080/12940
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6dde7e394b7c1c396d835ef6aa0a55930c0a8660
Gerrit-Change-Number: 12940
Gerrit-PatchSet: 3
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Mon, 08 Apr 2019 23:16:19 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile

2019-04-05 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12940 )

Change subject: IMPALA-7322: Add storage wait time to profile
..


Patch Set 3:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/2663/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/12940
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6dde7e394b7c1c396d835ef6aa0a55930c0a8660
Gerrit-Change-Number: 12940
Gerrit-PatchSet: 3
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Fri, 05 Apr 2019 22:28:33 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile

2019-04-05 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12940 )

Change subject: IMPALA-7322: Add storage wait time to profile
..


Patch Set 2:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/2662/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/12940
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6dde7e394b7c1c396d835ef6aa0a55930c0a8660
Gerrit-Change-Number: 12940
Gerrit-PatchSet: 2
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Fri, 05 Apr 2019 22:18:46 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile

2019-04-05 Thread Yongzhi Chen (Code Review)
Hello Bharath Vissapragada, Anurag Mantripragada, Vihang Karajgaonkar, Impala 
Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/12940

to look at the new patch set (#3).

Change subject: IMPALA-7322: Add storage wait time to profile
..

IMPALA-7322: Add storage wait time to profile

Add metrics to record storage loading time in catalog for hdfs, kudu
and hbase tables.
Pass storage wait time from catalog to fe through thrift and log
total storage loading time in query profile.

Testing:
Run queries that can trigger all of, none of or some of the related
tables loading. Check query profile for each query. Check catalog
metrics for each table.
Add unit test to test_observability.py

Sample output:
After run a hbase query(Metadata load finished is divided into several
lines because of limitation of commit message):
Query Compilation: 4s401ms
  - Metadata load started: 661.084us (661.084us)
  - Metadata load finished. loaded-tables=1/1
  load-requests=1 catalog-updates=3
  storage-load-time=233ms: 3s819ms (3s819ms)
 - Analysis finished: 3s820ms (763.979us)
 - Value transfer graph computed: 3s820ms (63.193us)
Change-Id: I6dde7e394b7c1c396d835ef6aa0a55930c0a8660
---
M common/thrift/CatalogObjects.thrift
M fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java
M fe/src/main/java/org/apache/impala/catalog/HBaseTable.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/catalog/KuduTable.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
M tests/query_test/test_observability.py
7 files changed, 102 insertions(+), 17 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/40/12940/3
--
To view, visit http://gerrit.cloudera.org:8080/12940
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I6dde7e394b7c1c396d835ef6aa0a55930c0a8660
Gerrit-Change-Number: 12940
Gerrit-PatchSet: 3
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Vihang Karajgaonkar 


[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile

2019-04-05 Thread Yongzhi Chen (Code Review)
Hello Bharath Vissapragada, Anurag Mantripragada, Vihang Karajgaonkar, Impala 
Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/12940

to look at the new patch set (#2).

Change subject: IMPALA-7322: Add storage wait time to profile
..

IMPALA-7322: Add storage wait time to profile

Add metrics to record storage loading time in catalog for hdfs, kudu
and hbase tables.
Pass storage wait time from catalog to fe through thrift and log
total storage loading time in query profile.

Testing:
Run queries that can trigger all of, none of or some of the related
tables loading. Check query profile for each query. Check catalog
metrics for each table.
Add unit test to test_observability.py

Sample output:
After run a hbase query(Metadata load finished is divided into several
lines because of limitation of commit message):
Query Compilation: 4s401ms
  - Metadata load started: 661.084us (661.084us)
  - Metadata load finished. loaded-tables=1/1
  load-requests=1 catalog-updates=3
  storage-load-time=233ms: 3s819ms (3s819ms)
 - Analysis finished: 3s820ms (763.979us)
 - Value transfer graph computed: 3s820ms (63.193us)
Change-Id: I6dde7e394b7c1c396d835ef6aa0a55930c0a8660
---
M common/thrift/CatalogObjects.thrift
M fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java
M fe/src/main/java/org/apache/impala/catalog/HBaseTable.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/catalog/KuduTable.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
M tests/query_test/test_observability.py
7 files changed, 102 insertions(+), 17 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/40/12940/2
--
To view, visit http://gerrit.cloudera.org:8080/12940
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I6dde7e394b7c1c396d835ef6aa0a55930c0a8660
Gerrit-Change-Number: 12940
Gerrit-PatchSet: 2
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Vihang Karajgaonkar 


[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile

2019-04-05 Thread Anurag Mantripragada (Code Review)
Anurag Mantripragada has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12940 )

Change subject: IMPALA-7322: Add storage wait time to profile
..


Patch Set 1: Code-Review+1

(2 comments)

Apart from nits, LGTM. Will let others take a look.

http://gerrit.cloudera.org:8080/#/c/12940/1/common/thrift/CatalogObjects.thrift
File common/thrift/CatalogObjects.thrift:

http://gerrit.cloudera.org:8080/#/c/12940/1/common/thrift/CatalogObjects.thrift@474
PS1, Line 474: // Set iff this is a table needs access storage.
Maybe say "Set if this table needs storage access" ?


http://gerrit.cloudera.org:8080/#/c/12940/1/tests/query_test/test_observability.py
File tests/query_test/test_observability.py:

http://gerrit.cloudera.org:8080/#/c/12940/1/tests/query_test/test_observability.py@573
PS1, Line 573:   def test_query_profile_storge_load(self):
How about having the method signature as 
test_query_profile_storage_load_time(self) ?



--
To view, visit http://gerrit.cloudera.org:8080/12940
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6dde7e394b7c1c396d835ef6aa0a55930c0a8660
Gerrit-Change-Number: 12940
Gerrit-PatchSet: 1
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Fri, 05 Apr 2019 21:19:30 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile

2019-04-05 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12940 )

Change subject: IMPALA-7322: Add storage wait time to profile
..


Patch Set 1:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/2661/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/12940
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6dde7e394b7c1c396d835ef6aa0a55930c0a8660
Gerrit-Change-Number: 12940
Gerrit-PatchSet: 1
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Fri, 05 Apr 2019 18:11:16 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile

2019-04-05 Thread Yongzhi Chen (Code Review)
Yongzhi Chen has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/12940


Change subject: IMPALA-7322: Add storage wait time to profile
..

IMPALA-7322: Add storage wait time to profile

Add metrics to record storage loading time in catalog for hdfs, kudu
and hbase tables.
Pass storage wait time from catalog to fe through thrift and log
total storage loading time in query profile.

Testing:
Run queries that can trigger all of, none of or some of the related
tables loading. Check query profile for each query. Check catalog
metrics for each table.
Add unit test to test_observability.py

Sample output:
After run a hbase query(Metadata load finished is divided into several
lines because of limitation of commit message):
Query Compilation: 4s401ms
  - Metadata load started: 661.084us (661.084us)
  - Metadata load finished. loaded-tables=1/1
  load-requests=1 catalog-updates=3
  storage-load-time=233ms: 3s819ms (3s819ms)
 - Analysis finished: 3s820ms (763.979us)
 - Value transfer graph computed: 3s820ms (63.193us)
Change-Id: I6dde7e394b7c1c396d835ef6aa0a55930c0a8660
---
M common/thrift/CatalogObjects.thrift
M fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java
M fe/src/main/java/org/apache/impala/catalog/HBaseTable.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/catalog/KuduTable.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
M tests/query_test/test_observability.py
7 files changed, 102 insertions(+), 17 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/40/12940/1
--
To view, visit http://gerrit.cloudera.org:8080/12940
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I6dde7e394b7c1c396d835ef6aa0a55930c0a8660
Gerrit-Change-Number: 12940
Gerrit-PatchSet: 1
Gerrit-Owner: Yongzhi Chen