[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile
Sahil Takiar has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/13786 ) Change subject: IMPALA-7322: Add storage wait time to profile .. IMPALA-7322: Add storage wait time to profile Add metrics to record storage wait time for operations with metadata load in catalog for hdfs, kudu and hbase tables. Pass storage wait time from catalog to fe through thrift and log total storage load time in query profile. Storage-load-time is the amount of time spent loading metadata from the underlying storage layer (e.g. S3, HDFS, Kudu, HBase), which does not include the amount of time spending loading data from HMS. Testing: * Ran queries that can trigger all of, none of or some of the related tables loading. * Check query profile for each query. * Check catalog metrics for each table. * Add unit tests to test_observability.py * Ran all core tests. Sample output: Profile for Catalog V1: (storage-load-time is the added property and it is part of Metadata load in Query Compilation): After ran a hbase query (Metadata load finished is divided into several lines because of limitation of commit message): Query Compilation: 4s401ms - Metadata load started: 661.084us (661.084us) - Metadata load finished. loaded-tables=1/1 load-requests=1 catalog-updates=3 storage-load-time=233ms: 3s819ms (3s819ms) - Analysis finished: 3s820ms (763.979us) - Value transfer graph computed: 3s820ms (63.193us) Profile for Catalog V2: (StorageLoad.Time is the added property and it is in CatalogFetch): Frontend: - CatalogFetch.ColumnStats.Misses: 1 - CatalogFetch.ColumnStats.Requests: 1 - CatalogFetch.ColumnStats.Time: 0 - CatalogFetch.Config.Misses: 1 - CatalogFetch.Config.Requests: 1 - CatalogFetch.Config.Time: 3ms - CatalogFetch.DatabaseList.Hits: 1 - CatalogFetch.DatabaseList.Requests: 1 - CatalogFetch.DatabaseList.Time: 0 - CatalogFetch.PartitionLists.Misses: 1 - CatalogFetch.PartitionLists.Requests: 1 - CatalogFetch.PartitionLists.Time: 4ms - CatalogFetch.Partitions.Hits: 2 - CatalogFetch.Partitions.Misses: 1 - CatalogFetch.Partitions.Requests: 3 - CatalogFetch.Partitions.Time: 1ms - CatalogFetch.RPCs.Bytes: 1.01 KB (1036) - CatalogFetch.RPCs.Requests: 4 - CatalogFetch.RPCs.Time: 93ms - CatalogFetch.StorageLoad.Time: 68ms - CatalogFetch.TableNames.Hits: 2 - CatalogFetch.TableNames.Requests: 2 - CatalogFetch.TableNames.Time: 0 - CatalogFetch.Tables.Misses: 1 - CatalogFetch.Tables.Requests: 1 - CatalogFetch.Tables.Time: 91ms Catalog metrics(this sample is from a hdfs table): storage-metadata-load-duration: Count: 1 Mean rate: 0.0085 1 min. rate: 0.032 5 min. rate: 0.1386 15 min. rate: 0.177 Min (msec): 111 Max (msec): 111 Mean (msec): 111.1802 Median (msec): 111.1802 75th-% (msec): 111.1802 95th-% (msec): 111.1802 99th-% (msec): 111.1802 Change-Id: I7447f8c8e7e50eb71d18643859d2e3de865368d2 Reviewed-on: http://gerrit.cloudera.org:8080/13786 Tested-by: Impala Public Jenkins Reviewed-by: Sahil Takiar --- M common/thrift/CatalogObjects.thrift M common/thrift/CatalogService.thrift M fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java M fe/src/main/java/org/apache/impala/catalog/HBaseTable.java M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/main/java/org/apache/impala/catalog/KuduTable.java M fe/src/main/java/org/apache/impala/catalog/Table.java M fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java M tests/query_test/test_observability.py 9 files changed, 170 insertions(+), 30 deletions(-) Approvals: Impala Public Jenkins: Verified Sahil Takiar: Looks good to me, approved -- To view, visit http://gerrit.cloudera.org:8080/13786 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I7447f8c8e7e50eb71d18643859d2e3de865368d2 Gerrit-Change-Number: 13786 Gerrit-PatchSet: 8 Gerrit-Owner: Yongzhi Chen Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Yongzhi Chen
[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile
Sahil Takiar has posted comments on this change. ( http://gerrit.cloudera.org:8080/13786 ) Change subject: IMPALA-7322: Add storage wait time to profile .. Patch Set 7: Code-Review+2 Carrying +2 -- To view, visit http://gerrit.cloudera.org:8080/13786 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7447f8c8e7e50eb71d18643859d2e3de865368d2 Gerrit-Change-Number: 13786 Gerrit-PatchSet: 7 Gerrit-Owner: Yongzhi Chen Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Yongzhi Chen Gerrit-Comment-Date: Tue, 17 Sep 2019 20:08:08 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13786 ) Change subject: IMPALA-7322: Add storage wait time to profile .. Patch Set 7: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/13786 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7447f8c8e7e50eb71d18643859d2e3de865368d2 Gerrit-Change-Number: 13786 Gerrit-PatchSet: 7 Gerrit-Owner: Yongzhi Chen Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Yongzhi Chen Gerrit-Comment-Date: Tue, 17 Sep 2019 19:27:13 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13786 ) Change subject: IMPALA-7322: Add storage wait time to profile .. Patch Set 7: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/4956/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/13786 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7447f8c8e7e50eb71d18643859d2e3de865368d2 Gerrit-Change-Number: 13786 Gerrit-PatchSet: 7 Gerrit-Owner: Yongzhi Chen Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Yongzhi Chen Gerrit-Comment-Date: Tue, 17 Sep 2019 15:21:00 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13786 ) Change subject: IMPALA-7322: Add storage wait time to profile .. Patch Set 6: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/4575/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/13786 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7447f8c8e7e50eb71d18643859d2e3de865368d2 Gerrit-Change-Number: 13786 Gerrit-PatchSet: 6 Gerrit-Owner: Yongzhi Chen Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Yongzhi Chen Gerrit-Comment-Date: Tue, 17 Sep 2019 14:23:20 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile
Hello Bharath Vissapragada, Sahil Takiar, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/13786 to look at the new patch set (#6). Change subject: IMPALA-7322: Add storage wait time to profile .. IMPALA-7322: Add storage wait time to profile Add metrics to record storage wait time for operations with metadata load in catalog for hdfs, kudu and hbase tables. Pass storage wait time from catalog to fe through thrift and log total storage load time in query profile. Storage-load-time is the amount of time spent loading metadata from the underlying storage layer (e.g. S3, HDFS, Kudu, HBase), which does not include the amount of time spending loading data from HMS. Testing: * Ran queries that can trigger all of, none of or some of the related tables loading. * Check query profile for each query. * Check catalog metrics for each table. * Add unit tests to test_observability.py * Ran all core tests. Sample output: Profile for Catalog V1: (storage-load-time is the added property and it is part of Metadata load in Query Compilation): After ran a hbase query (Metadata load finished is divided into several lines because of limitation of commit message): Query Compilation: 4s401ms - Metadata load started: 661.084us (661.084us) - Metadata load finished. loaded-tables=1/1 load-requests=1 catalog-updates=3 storage-load-time=233ms: 3s819ms (3s819ms) - Analysis finished: 3s820ms (763.979us) - Value transfer graph computed: 3s820ms (63.193us) Profile for Catalog V2: (StorageLoad.Time is the added property and it is in CatalogFetch): Frontend: - CatalogFetch.ColumnStats.Misses: 1 - CatalogFetch.ColumnStats.Requests: 1 - CatalogFetch.ColumnStats.Time: 0 - CatalogFetch.Config.Misses: 1 - CatalogFetch.Config.Requests: 1 - CatalogFetch.Config.Time: 3ms - CatalogFetch.DatabaseList.Hits: 1 - CatalogFetch.DatabaseList.Requests: 1 - CatalogFetch.DatabaseList.Time: 0 - CatalogFetch.PartitionLists.Misses: 1 - CatalogFetch.PartitionLists.Requests: 1 - CatalogFetch.PartitionLists.Time: 4ms - CatalogFetch.Partitions.Hits: 2 - CatalogFetch.Partitions.Misses: 1 - CatalogFetch.Partitions.Requests: 3 - CatalogFetch.Partitions.Time: 1ms - CatalogFetch.RPCs.Bytes: 1.01 KB (1036) - CatalogFetch.RPCs.Requests: 4 - CatalogFetch.RPCs.Time: 93ms - CatalogFetch.StorageLoad.Time: 68ms - CatalogFetch.TableNames.Hits: 2 - CatalogFetch.TableNames.Requests: 2 - CatalogFetch.TableNames.Time: 0 - CatalogFetch.Tables.Misses: 1 - CatalogFetch.Tables.Requests: 1 - CatalogFetch.Tables.Time: 91ms Catalog metrics(this sample is from a hdfs table): storage-metadata-load-duration: Count: 1 Mean rate: 0.0085 1 min. rate: 0.032 5 min. rate: 0.1386 15 min. rate: 0.177 Min (msec): 111 Max (msec): 111 Mean (msec): 111.1802 Median (msec): 111.1802 75th-% (msec): 111.1802 95th-% (msec): 111.1802 99th-% (msec): 111.1802 Change-Id: I7447f8c8e7e50eb71d18643859d2e3de865368d2 --- M common/thrift/CatalogObjects.thrift M common/thrift/CatalogService.thrift M fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java M fe/src/main/java/org/apache/impala/catalog/HBaseTable.java M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/main/java/org/apache/impala/catalog/KuduTable.java M fe/src/main/java/org/apache/impala/catalog/Table.java M fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java M tests/query_test/test_observability.py 9 files changed, 170 insertions(+), 30 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/86/13786/6 -- To view, visit http://gerrit.cloudera.org:8080/13786 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I7447f8c8e7e50eb71d18643859d2e3de865368d2 Gerrit-Change-Number: 13786 Gerrit-PatchSet: 6 Gerrit-Owner: Yongzhi Chen Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Yongzhi Chen
[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile
Yongzhi Chen has posted comments on this change. ( http://gerrit.cloudera.org:8080/13786 ) Change subject: IMPALA-7322: Add storage wait time to profile .. Patch Set 6: patch 6 fix the test: because of the name change of the property, an old test was failed. -- To view, visit http://gerrit.cloudera.org:8080/13786 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7447f8c8e7e50eb71d18643859d2e3de865368d2 Gerrit-Change-Number: 13786 Gerrit-PatchSet: 6 Gerrit-Owner: Yongzhi Chen Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Yongzhi Chen Gerrit-Comment-Date: Tue, 17 Sep 2019 13:44:41 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13786 ) Change subject: IMPALA-7322: Add storage wait time to profile .. Patch Set 5: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/4952/ -- To view, visit http://gerrit.cloudera.org:8080/13786 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7447f8c8e7e50eb71d18643859d2e3de865368d2 Gerrit-Change-Number: 13786 Gerrit-PatchSet: 5 Gerrit-Owner: Yongzhi Chen Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Yongzhi Chen Gerrit-Comment-Date: Mon, 16 Sep 2019 21:51:10 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13786 ) Change subject: IMPALA-7322: Add storage wait time to profile .. Patch Set 5: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/4952/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/13786 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7447f8c8e7e50eb71d18643859d2e3de865368d2 Gerrit-Change-Number: 13786 Gerrit-PatchSet: 5 Gerrit-Owner: Yongzhi Chen Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Yongzhi Chen Gerrit-Comment-Date: Mon, 16 Sep 2019 17:39:52 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13786 ) Change subject: IMPALA-7322: Add storage wait time to profile .. Patch Set 5: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/13786 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7447f8c8e7e50eb71d18643859d2e3de865368d2 Gerrit-Change-Number: 13786 Gerrit-PatchSet: 5 Gerrit-Owner: Yongzhi Chen Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Yongzhi Chen Gerrit-Comment-Date: Mon, 16 Sep 2019 17:39:51 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile
Bharath Vissapragada has posted comments on this change. ( http://gerrit.cloudera.org:8080/13786 ) Change subject: IMPALA-7322: Add storage wait time to profile .. Patch Set 4: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/13786 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7447f8c8e7e50eb71d18643859d2e3de865368d2 Gerrit-Change-Number: 13786 Gerrit-PatchSet: 4 Gerrit-Owner: Yongzhi Chen Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Yongzhi Chen Gerrit-Comment-Date: Mon, 16 Sep 2019 17:39:26 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile
Yongzhi Chen has posted comments on this change. ( http://gerrit.cloudera.org:8080/13786 ) Change subject: IMPALA-7322: Add storage wait time to profile .. Patch Set 3: Re-format the commit message. -- To view, visit http://gerrit.cloudera.org:8080/13786 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7447f8c8e7e50eb71d18643859d2e3de865368d2 Gerrit-Change-Number: 13786 Gerrit-PatchSet: 3 Gerrit-Owner: Yongzhi Chen Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Yongzhi Chen Gerrit-Comment-Date: Wed, 11 Sep 2019 14:03:15 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile
Hello Bharath Vissapragada, Sahil Takiar, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/13786 to look at the new patch set (#4). Change subject: IMPALA-7322: Add storage wait time to profile .. IMPALA-7322: Add storage wait time to profile Add metrics to record storage wait time for operations with metadata load in catalog for hdfs, kudu and hbase tables. Pass storage wait time from catalog to fe through thrift and log total storage load time in query profile. Storage-load-time is the amount of time spent loading metadata from the underlying storage layer (e.g. S3, HDFS, Kudu, HBase), which does not include the amount of time spending loading data from HMS. Testing: * Ran queries that can trigger all of, none of or some of the related tables loading. * Check query profile for each query. * Check catalog metrics for each table. * Add unit tests to test_observability.py * Ran all core tests. Sample output: Profile for Catalog V1: (storage-load-time is the added property and it is part of Metadata load in Query Compilation): After ran a hbase query (Metadata load finished is divided into several lines because of limitation of commit message): Query Compilation: 4s401ms - Metadata load started: 661.084us (661.084us) - Metadata load finished. loaded-tables=1/1 load-requests=1 catalog-updates=3 storage-load-time=233ms: 3s819ms (3s819ms) - Analysis finished: 3s820ms (763.979us) - Value transfer graph computed: 3s820ms (63.193us) Profile for Catalog V2: (StorageLoad.Time is the added property and it is in CatalogFetch): Frontend: - CatalogFetch.ColumnStats.Misses: 1 - CatalogFetch.ColumnStats.Requests: 1 - CatalogFetch.ColumnStats.Time: 0 - CatalogFetch.Config.Misses: 1 - CatalogFetch.Config.Requests: 1 - CatalogFetch.Config.Time: 3ms - CatalogFetch.DatabaseList.Hits: 1 - CatalogFetch.DatabaseList.Requests: 1 - CatalogFetch.DatabaseList.Time: 0 - CatalogFetch.PartitionLists.Misses: 1 - CatalogFetch.PartitionLists.Requests: 1 - CatalogFetch.PartitionLists.Time: 4ms - CatalogFetch.Partitions.Hits: 2 - CatalogFetch.Partitions.Misses: 1 - CatalogFetch.Partitions.Requests: 3 - CatalogFetch.Partitions.Time: 1ms - CatalogFetch.RPCs.Bytes: 1.01 KB (1036) - CatalogFetch.RPCs.Requests: 4 - CatalogFetch.RPCs.Time: 93ms - CatalogFetch.StorageLoad.Time: 68ms - CatalogFetch.TableNames.Hits: 2 - CatalogFetch.TableNames.Requests: 2 - CatalogFetch.TableNames.Time: 0 - CatalogFetch.Tables.Misses: 1 - CatalogFetch.Tables.Requests: 1 - CatalogFetch.Tables.Time: 91ms Catalog metrics(this sample is from a hdfs table): storage-metadata-load-duration: Count: 1 Mean rate: 0.0085 1 min. rate: 0.032 5 min. rate: 0.1386 15 min. rate: 0.177 Min (msec): 111 Max (msec): 111 Mean (msec): 111.1802 Median (msec): 111.1802 75th-% (msec): 111.1802 95th-% (msec): 111.1802 99th-% (msec): 111.1802 Change-Id: I7447f8c8e7e50eb71d18643859d2e3de865368d2 --- M common/thrift/CatalogObjects.thrift M common/thrift/CatalogService.thrift M fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java M fe/src/main/java/org/apache/impala/catalog/HBaseTable.java M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/main/java/org/apache/impala/catalog/KuduTable.java M fe/src/main/java/org/apache/impala/catalog/Table.java M fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java M tests/query_test/test_observability.py 9 files changed, 169 insertions(+), 30 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/86/13786/4 -- To view, visit http://gerrit.cloudera.org:8080/13786 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I7447f8c8e7e50eb71d18643859d2e3de865368d2 Gerrit-Change-Number: 13786 Gerrit-PatchSet: 4 Gerrit-Owner: Yongzhi Chen Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Yongzhi Chen
[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile
Sahil Takiar has posted comments on this change. ( http://gerrit.cloudera.org:8080/13786 ) Change subject: IMPALA-7322: Add storage wait time to profile .. Patch Set 3: Code-Review+1 (1 comment) Just a comment on the commit message. I'm not super familiar with this code, so would be good if @Bharath could take a look as well http://gerrit.cloudera.org:8080/#/c/13786/3//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/13786/3//COMMIT_MSG@26 PS3, Line 26: Sample output: : Profile for Catalog V1: (storage-load-time is the added property and : it is part of Metadata load in Query Compilation): : After ran a hbase query (Metadata load finished is divided into : several lines because of limitation of commit message): : Query Compilation: 4s401ms : - Metadata load started: 661.084us (661.084us) : - Metadata load finished. loaded-tables=1/1 : load-requests=1 catalog-updates=3 : storage-load-time=233ms: 3s819ms (3s819ms) : - Analysis finished: 3s820ms (763.979us) : - Value transfer graph computed: 3s820ms (63.193us) : Profile for Catalog V2: (StorageLoad.Time is the added property and it : is in CatalogFetch): : Frontend: :- CatalogFetch.ColumnStats.Misses: 1 :- CatalogFetch.ColumnStats.Requests: 1 :- CatalogFetch.ColumnStats.Time: 0 :- CatalogFetch.Config.Misses: 1 :- CatalogFetch.Config.Requests: 1 :- CatalogFetch.Config.Time: 3ms :- CatalogFetch.DatabaseList.Hits: 1 :- CatalogFetch.DatabaseList.Requests: 1 :- CatalogFetch.DatabaseList.Time: 0 :- CatalogFetch.PartitionLists.Misses: 1 :- CatalogFetch.PartitionLists.Requests: 1 :- CatalogFetch.PartitionLists.Time: 4ms :- CatalogFetch.Partitions.Hits: 2 :- CatalogFetch.Partitions.Misses: 1 :- CatalogFetch.Partitions.Requests: 3 :- CatalogFetch.Partitions.Time: 1ms :- CatalogFetch.RPCs.Bytes: 1.01 KB (1036) :- CatalogFetch.RPCs.Requests: 4 :- CatalogFetch.RPCs.Time: 93ms :- CatalogFetch.StorageLoad.Time: 68ms :- CatalogFetch.TableNames.Hits: 2 :- CatalogFetch.TableNames.Requests: 2 :- CatalogFetch.TableNames.Time: 0 :- CatalogFetch.Tables.Misses: 1 :- CatalogFetch.Tables.Requests: 1 :- CatalogFetch.Tables.Time: 91ms : Catalog metrics(this sample is from a hdfs table): : storage-metadata-load-duration: :Count: 1 :Mean rate: 0.0085 :1 min. rate: 0.032 :5 min. rate: 0.1386 :15 min. rate: 0.177 :Min (msec): 111 :Max (msec): 111 :Mean (msec): 111.1802 :Median (msec): 111.1802 :75th-% (msec): 111.1802 :95th-% (msec): 111.1802 :99th-% (msec): 111.1802 I still find this hard to understand, would recommend re-formatting as follows: Sample Output: Profile for Catalog V1 ('storage-load-time' is the added property and it is part of the 'Metadata load finished' section of the profile under the 'Query Compilation' section. The example below was taken after running a HBase query. 'Metadata load finished' is divided into several lines because of limitation of commit message): Query Compilation: 4s401ms - Metadata load started: 661.084us (661.084us) - Metadata load finished. loaded-tables=1/1 load-requests=1 catalog-updates=3 storage-load-time=233ms: 3s819ms (3s819ms) - Analysis finished: 3s820ms (763.979us) - Value transfer graph computed: 3s820ms (63.193us) Profile for Catalog V2 ('StorageLoad.Time' is the added property and it is part of the 'CatalogFetch' counters): Frontend: - CatalogFetch.ColumnStats.Misses: 1 - CatalogFetch.ColumnStats.Requests: 1 - CatalogFetch.ColumnStats.Time: 0 - CatalogFetch.Config.Misses: 1 - CatalogFetch.Config.Requests: 1 - CatalogFetch.Config.Time: 3ms - CatalogFetch.DatabaseList.Hits: 1 - CatalogFetch.DatabaseList.Requests: 1 - CatalogFetch.DatabaseList.Time: 0 - CatalogFetch.PartitionLists.Misses: 1 - CatalogFetch.PartitionLists.Requests: 1 - CatalogFetch.PartitionLists.Time: 4ms - CatalogFetch.Partitions.Hits: 2 -
[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13786 ) Change subject: IMPALA-7322: Add storage wait time to profile .. Patch Set 3: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/4500/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/13786 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7447f8c8e7e50eb71d18643859d2e3de865368d2 Gerrit-Change-Number: 13786 Gerrit-PatchSet: 3 Gerrit-Owner: Yongzhi Chen Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Yongzhi Chen Gerrit-Comment-Date: Mon, 09 Sep 2019 16:34:48 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile
Hello Bharath Vissapragada, Sahil Takiar, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/13786 to look at the new patch set (#3). Change subject: IMPALA-7322: Add storage wait time to profile .. IMPALA-7322: Add storage wait time to profile Add metrics to record storage wait time for operations with metadata load in catalog for hdfs, kudu and hbase tables. Pass storage wait time from catalog to fe through thrift and log total storage load time in query profile. Storage-load-time is the amount of time spent loading metadata from the underlying storage layer (e.g. S3, HDFS, Kudu, HBase), which does not include the amount of time spending loading data from HMS. Testing: * Ran queries that can trigger all of, none of or some of the related tables loading. * Check query profile for each query. * Check catalog metrics for each table. * Add unit tests to test_observability.py * Ran all core tests. Sample output: Profile for Catalog V1: (storage-load-time is the added property and it is part of Metadata load in Query Compilation): After ran a hbase query (Metadata load finished is divided into several lines because of limitation of commit message): Query Compilation: 4s401ms - Metadata load started: 661.084us (661.084us) - Metadata load finished. loaded-tables=1/1 load-requests=1 catalog-updates=3 storage-load-time=233ms: 3s819ms (3s819ms) - Analysis finished: 3s820ms (763.979us) - Value transfer graph computed: 3s820ms (63.193us) Profile for Catalog V2: (StorageLoad.Time is the added property and it is in CatalogFetch): Frontend: - CatalogFetch.ColumnStats.Misses: 1 - CatalogFetch.ColumnStats.Requests: 1 - CatalogFetch.ColumnStats.Time: 0 - CatalogFetch.Config.Misses: 1 - CatalogFetch.Config.Requests: 1 - CatalogFetch.Config.Time: 3ms - CatalogFetch.DatabaseList.Hits: 1 - CatalogFetch.DatabaseList.Requests: 1 - CatalogFetch.DatabaseList.Time: 0 - CatalogFetch.PartitionLists.Misses: 1 - CatalogFetch.PartitionLists.Requests: 1 - CatalogFetch.PartitionLists.Time: 4ms - CatalogFetch.Partitions.Hits: 2 - CatalogFetch.Partitions.Misses: 1 - CatalogFetch.Partitions.Requests: 3 - CatalogFetch.Partitions.Time: 1ms - CatalogFetch.RPCs.Bytes: 1.01 KB (1036) - CatalogFetch.RPCs.Requests: 4 - CatalogFetch.RPCs.Time: 93ms - CatalogFetch.StorageLoad.Time: 68ms - CatalogFetch.TableNames.Hits: 2 - CatalogFetch.TableNames.Requests: 2 - CatalogFetch.TableNames.Time: 0 - CatalogFetch.Tables.Misses: 1 - CatalogFetch.Tables.Requests: 1 - CatalogFetch.Tables.Time: 91ms Catalog metrics(this sample is from a hdfs table): storage-metadata-load-duration: Count: 1 Mean rate: 0.0085 1 min. rate: 0.032 5 min. rate: 0.1386 15 min. rate: 0.177 Min (msec): 111 Max (msec): 111 Mean (msec): 111.1802 Median (msec): 111.1802 75th-% (msec): 111.1802 95th-% (msec): 111.1802 99th-% (msec): 111.1802 Change-Id: I7447f8c8e7e50eb71d18643859d2e3de865368d2 --- M common/thrift/CatalogObjects.thrift M common/thrift/CatalogService.thrift M fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java M fe/src/main/java/org/apache/impala/catalog/HBaseTable.java M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/main/java/org/apache/impala/catalog/KuduTable.java M fe/src/main/java/org/apache/impala/catalog/Table.java M fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java M tests/query_test/test_observability.py 9 files changed, 169 insertions(+), 30 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/86/13786/3 -- To view, visit http://gerrit.cloudera.org:8080/13786 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I7447f8c8e7e50eb71d18643859d2e3de865368d2 Gerrit-Change-Number: 13786 Gerrit-PatchSet: 3 Gerrit-Owner: Yongzhi Chen Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Yongzhi Chen
[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile
Yongzhi Chen has posted comments on this change. ( http://gerrit.cloudera.org:8080/13786 ) Change subject: IMPALA-7322: Add storage wait time to profile .. Patch Set 2: (8 comments) Submit patch 3 to address the code review http://gerrit.cloudera.org:8080/#/c/13786/2//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/13786/2//COMMIT_MSG@18 PS2, Line 18: Testing: Ran queries that can trigger all of, none of or : some of the related tables loading. : Check query profile for each query. : Check catalog metrics for each table. : Add unit tests to test_observability.py : Ran all core tests. > nit: if these are a list of testing activities done, i generally like to fo Done http://gerrit.cloudera.org:8080/#/c/13786/2//COMMIT_MSG@27 PS2, Line 27: After ran a hbase query (Metadata load finished is divided into several lines > nit: line too long Done http://gerrit.cloudera.org:8080/#/c/13786/2//COMMIT_MSG@28 PS2, Line 28: because of limitation of commit message): Query Compilation: 4s401ms > nit: the profile snippet should be on a newline Done http://gerrit.cloudera.org:8080/#/c/13786/2//COMMIT_MSG@35 PS2, Line 35: Profile for Catalog V2: > would be good to explain how storage-load-time is represented in the v1 pro Done http://gerrit.cloudera.org:8080/#/c/13786/2//COMMIT_MSG@46 PS2, Line 46: storage-load-time > nit: why is this formatted differently from everything else - e.g. would I will change to CatalogFetch.StorageLoad.Time which consistent with V1 and has three levels. http://gerrit.cloudera.org:8080/#/c/13786/2/common/thrift/CatalogObjects.thrift File common/thrift/CatalogObjects.thrift: http://gerrit.cloudera.org:8080/#/c/13786/2/common/thrift/CatalogObjects.thrift@496 PS2, Line 496: 15: optional i64 storage_metadata_load_time > nit: change to storage_metadata_load_time_ns so we know it is in nanosecond Done http://gerrit.cloudera.org:8080/#/c/13786/2/tests/query_test/test_observability.py File tests/query_test/test_observability.py: http://gerrit.cloudera.org:8080/#/c/13786/2/tests/query_test/test_observability.py@23 PS2, Line 23: SkipIfCatalogV2 > is this needed? Done http://gerrit.cloudera.org:8080/#/c/13786/2/tests/query_test/test_observability.py@684 PS2, Line 684: def test_query_profile_storage_load_time(self): > nit: might be cleaner to split this into two tests, one for kudu and one fo I think they can put together: They are similar and do not belong to filesystem loading. -- To view, visit http://gerrit.cloudera.org:8080/13786 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7447f8c8e7e50eb71d18643859d2e3de865368d2 Gerrit-Change-Number: 13786 Gerrit-PatchSet: 2 Gerrit-Owner: Yongzhi Chen Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Yongzhi Chen Gerrit-Comment-Date: Mon, 09 Sep 2019 16:15:30 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile
Sahil Takiar has posted comments on this change. ( http://gerrit.cloudera.org:8080/13786 ) Change subject: IMPALA-7322: Add storage wait time to profile .. Patch Set 2: (8 comments) http://gerrit.cloudera.org:8080/#/c/13786/2//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/13786/2//COMMIT_MSG@18 PS2, Line 18: Testing: Ran queries that can trigger all of, none of or : some of the related tables loading. : Check query profile for each query. : Check catalog metrics for each table. : Add unit tests to test_observability.py : Ran all core tests. nit: if these are a list of testing activities done, i generally like to format it as follows Testing: * Ran queries that can trigger all of, none of or some of the related tables loading. * Check query profile for each query. * Check catalog metrics for each table. * dd unit tests to test_observability.py. * Ran all core tests. http://gerrit.cloudera.org:8080/#/c/13786/2//COMMIT_MSG@27 PS2, Line 27: After ran a hbase query (Metadata load finished is divided into several lines nit: line too long http://gerrit.cloudera.org:8080/#/c/13786/2//COMMIT_MSG@28 PS2, Line 28: because of limitation of commit message): Query Compilation: 4s401ms nit: the profile snippet should be on a newline http://gerrit.cloudera.org:8080/#/c/13786/2//COMMIT_MSG@35 PS2, Line 35: Profile for Catalog V2: would be good to explain how storage-load-time is represented in the v1 profile vs the v2 profile http://gerrit.cloudera.org:8080/#/c/13786/2//COMMIT_MSG@46 PS2, Line 46: storage-load-time nit: why is this formatted differently from everything else - e.g. would - CatalogFetch.Metadata.Storage.Time Be more consistent? I'm assuming "CatalogFetch.ColumnStats.Time" is how long it takes to load column stats? http://gerrit.cloudera.org:8080/#/c/13786/2/common/thrift/CatalogObjects.thrift File common/thrift/CatalogObjects.thrift: http://gerrit.cloudera.org:8080/#/c/13786/2/common/thrift/CatalogObjects.thrift@496 PS2, Line 496: 15: optional i64 storage_metadata_load_time nit: change to storage_metadata_load_time_ns so we know it is in nanoseconds http://gerrit.cloudera.org:8080/#/c/13786/2/tests/query_test/test_observability.py File tests/query_test/test_observability.py: http://gerrit.cloudera.org:8080/#/c/13786/2/tests/query_test/test_observability.py@23 PS2, Line 23: SkipIfCatalogV2 is this needed? http://gerrit.cloudera.org:8080/#/c/13786/2/tests/query_test/test_observability.py@684 PS2, Line 684: def test_query_profile_storage_load_time(self): nit: might be cleaner to split this into two tests, one for kudu and one for hbase -- To view, visit http://gerrit.cloudera.org:8080/13786 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7447f8c8e7e50eb71d18643859d2e3de865368d2 Gerrit-Change-Number: 13786 Gerrit-PatchSet: 2 Gerrit-Owner: Yongzhi Chen Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Yongzhi Chen Gerrit-Comment-Date: Fri, 06 Sep 2019 16:51:06 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13786 ) Change subject: IMPALA-7322: Add storage wait time to profile .. Patch Set 2: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/4485/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/13786 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7447f8c8e7e50eb71d18643859d2e3de865368d2 Gerrit-Change-Number: 13786 Gerrit-PatchSet: 2 Gerrit-Owner: Yongzhi Chen Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Yongzhi Chen Gerrit-Comment-Date: Fri, 06 Sep 2019 13:25:06 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile
Yongzhi Chen has posted comments on this change. ( http://gerrit.cloudera.org:8080/13786 ) Change subject: IMPALA-7322: Add storage wait time to profile .. Patch Set 2: submit patch set 2 to include profile support for catalog V2 Ran exhaustive test: Only one unrelated failure: query_test/test_result_spooling.py (test_full_queue_large_fetch) -- To view, visit http://gerrit.cloudera.org:8080/13786 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7447f8c8e7e50eb71d18643859d2e3de865368d2 Gerrit-Change-Number: 13786 Gerrit-PatchSet: 2 Gerrit-Owner: Yongzhi Chen Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Yongzhi Chen Gerrit-Comment-Date: Fri, 06 Sep 2019 12:47:33 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile
Hello Bharath Vissapragada, Sahil Takiar, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/13786 to look at the new patch set (#2). Change subject: IMPALA-7322: Add storage wait time to profile .. IMPALA-7322: Add storage wait time to profile Add metrics to record storage wait time for operations with metadata load in catalog for hdfs, kudu and hbase tables. Pass storage wait time from catalog to fe through thrift and log total storage load time in query profile. Storage-load-time is the amount of time spent loading metadata from the underlying storage layer (e.g. S3, HDFS, Kudu, HBase), which does not include the amount of time spending loading data from HMS. Testing: Ran queries that can trigger all of, none of or some of the related tables loading. Check query profile for each query. Check catalog metrics for each table. Add unit tests to test_observability.py Ran all core tests. Sample output: Profile for Catalog V1: (storage-load-time is the added property): After ran a hbase query (Metadata load finished is divided into several lines because of limitation of commit message): Query Compilation: 4s401ms - Metadata load started: 661.084us (661.084us) - Metadata load finished. loaded-tables=1/1 load-requests=1 catalog-updates=3 storage-load-time=233ms: 3s819ms (3s819ms) - Analysis finished: 3s820ms (763.979us) - Value transfer graph computed: 3s820ms (63.193us) Profile for Catalog V2: Frontend: - CatalogFetch.ColumnStats.Misses: 1 - CatalogFetch.ColumnStats.Requests: 1 - CatalogFetch.ColumnStats.Time: 16ms - CatalogFetch.Config.Misses: 1 - CatalogFetch.Config.Requests: 1 - CatalogFetch.Config.Time: 24ms - CatalogFetch.DatabaseList.Hits: 1 - CatalogFetch.DatabaseList.Requests: 1 - CatalogFetch.DatabaseList.Time: 0 - CatalogFetch.Metadata.storage-load-time: 218ms - CatalogFetch.PartitionLists.Misses: 1 - CatalogFetch.PartitionLists.Requests: 1 Catalog metrics(this sample is from a hdfs table): storage-metadata-load-duration: Count: 1 Mean rate: 0.0085 1 min. rate: 0.032 5 min. rate: 0.1386 15 min. rate: 0.177 Min (msec): 111 Max (msec): 111 Mean (msec): 111.1802 Median (msec): 111.1802 75th-% (msec): 111.1802 95th-% (msec): 111.1802 99th-% (msec): 111.1802 Change-Id: I7447f8c8e7e50eb71d18643859d2e3de865368d2 --- M common/thrift/CatalogObjects.thrift M common/thrift/CatalogService.thrift M fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java M fe/src/main/java/org/apache/impala/catalog/HBaseTable.java M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/main/java/org/apache/impala/catalog/KuduTable.java M fe/src/main/java/org/apache/impala/catalog/Table.java M fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java M tests/query_test/test_observability.py 9 files changed, 160 insertions(+), 31 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/86/13786/2 -- To view, visit http://gerrit.cloudera.org:8080/13786 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I7447f8c8e7e50eb71d18643859d2e3de865368d2 Gerrit-Change-Number: 13786 Gerrit-PatchSet: 2 Gerrit-Owner: Yongzhi Chen Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Yongzhi Chen
[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile
Bharath Vissapragada has posted comments on this change. ( http://gerrit.cloudera.org:8080/13786 ) Change subject: IMPALA-7322: Add storage wait time to profile .. Patch Set 1: Can you rebase? I can submit for a GVO. -- To view, visit http://gerrit.cloudera.org:8080/13786 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7447f8c8e7e50eb71d18643859d2e3de865368d2 Gerrit-Change-Number: 13786 Gerrit-PatchSet: 1 Gerrit-Owner: Yongzhi Chen Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Yongzhi Chen Gerrit-Comment-Date: Fri, 09 Aug 2019 17:14:47 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile
Sahil Takiar has posted comments on this change. ( http://gerrit.cloudera.org:8080/13786 ) Change subject: IMPALA-7322: Add storage wait time to profile .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/13786/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/13786/1//COMMIT_MSG@7 PS1, Line 7: the formatting of this commit message looks off, is there a reason it is all indented? -- To view, visit http://gerrit.cloudera.org:8080/13786 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7447f8c8e7e50eb71d18643859d2e3de865368d2 Gerrit-Change-Number: 13786 Gerrit-PatchSet: 1 Gerrit-Owner: Yongzhi Chen Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Yongzhi Chen Gerrit-Comment-Date: Thu, 01 Aug 2019 20:05:01 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile
Sahil Takiar has posted comments on this change. ( http://gerrit.cloudera.org:8080/13786 ) Change subject: IMPALA-7322: Add storage wait time to profile .. Patch Set 1: Code-Review+1 +1 pending rebase -- To view, visit http://gerrit.cloudera.org:8080/13786 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7447f8c8e7e50eb71d18643859d2e3de865368d2 Gerrit-Change-Number: 13786 Gerrit-PatchSet: 1 Gerrit-Owner: Yongzhi Chen Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Yongzhi Chen Gerrit-Comment-Date: Thu, 01 Aug 2019 20:04:16 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile
Yongzhi Chen has posted comments on this change. ( http://gerrit.cloudera.org:8080/13786 ) Change subject: IMPALA-7322: Add storage wait time to profile .. Patch Set 1: I created a jira for the local catalog issue: https://issues.apache.org/jira/browse/IMPALA-8822 -- To view, visit http://gerrit.cloudera.org:8080/13786 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7447f8c8e7e50eb71d18643859d2e3de865368d2 Gerrit-Change-Number: 13786 Gerrit-PatchSet: 1 Gerrit-Owner: Yongzhi Chen Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Yongzhi Chen Gerrit-Comment-Date: Thu, 01 Aug 2019 16:40:14 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile
Bharath Vissapragada has posted comments on this change. ( http://gerrit.cloudera.org:8080/13786 ) Change subject: IMPALA-7322: Add storage wait time to profile .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/13786/1/tests/query_test/test_observability.py File tests/query_test/test_observability.py: http://gerrit.cloudera.org:8080/#/c/13786/1/tests/query_test/test_observability.py@589 PS1, Line 589: @SkipIfCatalogV2.hms_event_polling_enabled() > It works with Catalog V2, but when catalog V2 + hms event polling, running - running a query triggers no metadata loading at all. Do you know why? Agree with Sahil that we would permanently lose coverage if we skip it without understanding why. -- To view, visit http://gerrit.cloudera.org:8080/13786 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7447f8c8e7e50eb71d18643859d2e3de865368d2 Gerrit-Change-Number: 13786 Gerrit-PatchSet: 1 Gerrit-Owner: Yongzhi Chen Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Yongzhi Chen Gerrit-Comment-Date: Wed, 10 Jul 2019 00:24:45 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile
Yongzhi Chen has posted comments on this change. ( http://gerrit.cloudera.org:8080/13786 ) Change subject: IMPALA-7322: Add storage wait time to profile .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/13786/1/tests/query_test/test_observability.py File tests/query_test/test_observability.py: http://gerrit.cloudera.org:8080/#/c/13786/1/tests/query_test/test_observability.py@589 PS1, Line 589: @SkipIfCatalogV2.hms_event_polling_enabled() > we should really try to get this to work in Catalog V2, as that will eventu It works with Catalog V2, but when catalog V2 + hms event polling, running a query triggers no metadata loading at all. Storage loading is part of the metastore loading, we can only verify it when metastore loading existing in profile. This test is just creating a test env to make sure this loading can really happen. The skip is to avoid flaky tests. -- To view, visit http://gerrit.cloudera.org:8080/13786 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7447f8c8e7e50eb71d18643859d2e3de865368d2 Gerrit-Change-Number: 13786 Gerrit-PatchSet: 1 Gerrit-Owner: Yongzhi Chen Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Yongzhi Chen Gerrit-Comment-Date: Tue, 09 Jul 2019 16:59:35 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile
Sahil Takiar has posted comments on this change. ( http://gerrit.cloudera.org:8080/13786 ) Change subject: IMPALA-7322: Add storage wait time to profile .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/13786/1/tests/query_test/test_observability.py File tests/query_test/test_observability.py: http://gerrit.cloudera.org:8080/#/c/13786/1/tests/query_test/test_observability.py@589 PS1, Line 589: @SkipIfCatalogV2.hms_event_polling_enabled() we should really try to get this to work in Catalog V2, as that will eventually be the default in our tests at some point. IMPALA-8627 / IMPALA-8715 / IMPALA-8660 is working to achieve that. Otherwise, this test will never run by default, which sort of defeats the purpose of adding it in the first place. -- To view, visit http://gerrit.cloudera.org:8080/13786 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7447f8c8e7e50eb71d18643859d2e3de865368d2 Gerrit-Change-Number: 13786 Gerrit-PatchSet: 1 Gerrit-Owner: Yongzhi Chen Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Yongzhi Chen Gerrit-Comment-Date: Fri, 05 Jul 2019 16:53:09 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13786 ) Change subject: IMPALA-7322: Add storage wait time to profile .. Patch Set 1: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/3804/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/13786 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7447f8c8e7e50eb71d18643859d2e3de865368d2 Gerrit-Change-Number: 13786 Gerrit-PatchSet: 1 Gerrit-Owner: Yongzhi Chen Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Yongzhi Chen Gerrit-Comment-Date: Tue, 02 Jul 2019 17:43:23 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile
Yongzhi Chen has posted comments on this change. ( http://gerrit.cloudera.org:8080/13786 ) Change subject: IMPALA-7322: Add storage wait time to profile .. Patch Set 1: The original review is in https://gerrit.cloudera.org/c/13738/ -- To view, visit http://gerrit.cloudera.org:8080/13786 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7447f8c8e7e50eb71d18643859d2e3de865368d2 Gerrit-Change-Number: 13786 Gerrit-PatchSet: 1 Gerrit-Owner: Yongzhi Chen Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Yongzhi Chen Gerrit-Comment-Date: Tue, 02 Jul 2019 17:04:23 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile
Yongzhi Chen has uploaded this change for review. ( http://gerrit.cloudera.org:8080/13786 Change subject: IMPALA-7322: Add storage wait time to profile .. IMPALA-7322: Add storage wait time to profile Add metrics to record storage wait time for operations with metadata load in catalog for hdfs, kudu and hbase tables. Pass storage wait time from catalog to fe through thrift and log total storage load time in query profile. Storage-load-time is the amount of time spent loading metadata from the underlying storage layer (e.g. S3, HDFS, Kudu, HBase), which does not include the amount of time spending loading data from HMS. Testing: Ran queries that can trigger all of, none of or some of the related tables loading. Check query profile for each query. Check catalog metrics for each table. Add unit tests to test_observability.py Ran all core tests. Sample output: Profile:(storage-load-time is the added property): After ran a hbase query (Metadata load finished is divided into several lines because of limitation of commit message): Query Compilation: 4s401ms - Metadata load started: 661.084us (661.084us) - Metadata load finished. loaded-tables=1/1 load-requests=1 catalog-updates=3 storage-load-time=233ms: 3s819ms (3s819ms) - Analysis finished: 3s820ms (763.979us) - Value transfer graph computed: 3s820ms (63.193us) Catalog metrics(this sample is from a hdfs table): storage-metadata-load-duration: Count: 1 Mean rate: 0.0085 1 min. rate: 0.032 5 min. rate: 0.1386 15 min. rate: 0.177 Min (msec): 111 Max (msec): 111 Mean (msec): 111.1802 Median (msec): 111.1802 75th-% (msec): 111.1802 95th-% (msec): 111.1802 99th-% (msec): 111.1802 Change-Id: I7447f8c8e7e50eb71d18643859d2e3de865368d2 --- M common/thrift/CatalogObjects.thrift M fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java M fe/src/main/java/org/apache/impala/catalog/HBaseTable.java M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/main/java/org/apache/impala/catalog/KuduTable.java M fe/src/main/java/org/apache/impala/catalog/Table.java M tests/common/skip.py M tests/query_test/test_observability.py 8 files changed, 144 insertions(+), 30 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/86/13786/1 -- To view, visit http://gerrit.cloudera.org:8080/13786 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I7447f8c8e7e50eb71d18643859d2e3de865368d2 Gerrit-Change-Number: 13786 Gerrit-PatchSet: 1 Gerrit-Owner: Yongzhi Chen
[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/12940 ) Change subject: IMPALA-7322: Add storage wait time to profile .. Patch Set 15: Actually the test is also flaky with metadata v1 as far as I can see, since it can run in parallel with other tests that touch functional.alltypes. I'm going to revert this and my change. -- To view, visit http://gerrit.cloudera.org:8080/12940 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6dde7e394b7c1c396d835ef6aa0a55930c0a8660 Gerrit-Change-Number: 12940 Gerrit-PatchSet: 15 Gerrit-Owner: Yongzhi Chen Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yongzhi Chen Gerrit-Comment-Date: Wed, 26 Jun 2019 17:06:20 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/12940 ) Change subject: IMPALA-7322: Add storage wait time to profile .. Patch Set 15: It fails reliably with the local catalog, so raced with my change to enable local catalog in precommit. -- To view, visit http://gerrit.cloudera.org:8080/12940 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6dde7e394b7c1c396d835ef6aa0a55930c0a8660 Gerrit-Change-Number: 12940 Gerrit-PatchSet: 15 Gerrit-Owner: Yongzhi Chen Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yongzhi Chen Gerrit-Comment-Date: Wed, 26 Jun 2019 17:04:35 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile
Bharath Vissapragada has posted comments on this change. ( http://gerrit.cloudera.org:8080/12940 ) Change subject: IMPALA-7322: Add storage wait time to profile .. Patch Set 15: It looks like the tests added here are a little flaky. https://jenkins.impala.io/job/ubuntu-16.04-dockerised-tests/630/testReport/junit/query_test.test_observability/TestObservability/test_query_profile_storage_load_time_filesystem/ https://jenkins.impala.io/job/ubuntu-16.04-dockerised-tests/630/testReport/junit/query_test.test_observability/TestObservability/test_query_profile_storage_load_time/ Probably a matter of time before other builds run into it, just a heads up. -- To view, visit http://gerrit.cloudera.org:8080/12940 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6dde7e394b7c1c396d835ef6aa0a55930c0a8660 Gerrit-Change-Number: 12940 Gerrit-PatchSet: 15 Gerrit-Owner: Yongzhi Chen Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yongzhi Chen Gerrit-Comment-Date: Wed, 26 Jun 2019 15:40:25 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/12940 ) Change subject: IMPALA-7322: Add storage wait time to profile .. IMPALA-7322: Add storage wait time to profile Add metrics to record storage wait time for operations with metadata load in catalog for hdfs, kudu and hbase tables. Pass storage wait time from catalog to fe through thrift and log total storage load time in query profile. Storage-load-time is the amount of time spent loading metadata from the underlying storage layer (e.g. S3, HDFS, Kudu, HBase), which does not include the amount of time spending loading data from HMS. Testing: Ran queries that can trigger all of, none of or some of the related tables loading. Check query profile for each query. Check catalog metrics for each table. Add unit tests to test_observability.py Ran all core tests. Sample output: Profile:(storage-load-time is the added property): After ran a hbase query (Metadata load finished is divided into several lines because of limitation of commit message): Query Compilation: 4s401ms - Metadata load started: 661.084us (661.084us) - Metadata load finished. loaded-tables=1/1 load-requests=1 catalog-updates=3 storage-load-time=233ms: 3s819ms (3s819ms) - Analysis finished: 3s820ms (763.979us) - Value transfer graph computed: 3s820ms (63.193us) Catalog metrics(this sample is from a hdfs table): storage-metadata-load-duration: Count: 1 Mean rate: 0.0085 1 min. rate: 0.032 5 min. rate: 0.1386 15 min. rate: 0.177 Min (msec): 111 Max (msec): 111 Mean (msec): 111.1802 Median (msec): 111.1802 75th-% (msec): 111.1802 95th-% (msec): 111.1802 99th-% (msec): 111.1802 Change-Id: I6dde7e394b7c1c396d835ef6aa0a55930c0a8660 Reviewed-on: http://gerrit.cloudera.org:8080/12940 Reviewed-by: Impala Public Jenkins Tested-by: Impala Public Jenkins --- M common/thrift/CatalogObjects.thrift M fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java M fe/src/main/java/org/apache/impala/catalog/HBaseTable.java M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/main/java/org/apache/impala/catalog/KuduTable.java M fe/src/main/java/org/apache/impala/catalog/Table.java M tests/query_test/test_observability.py 7 files changed, 131 insertions(+), 29 deletions(-) Approvals: Impala Public Jenkins: Looks good to me, approved; Verified -- To view, visit http://gerrit.cloudera.org:8080/12940 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I6dde7e394b7c1c396d835ef6aa0a55930c0a8660 Gerrit-Change-Number: 12940 Gerrit-PatchSet: 15 Gerrit-Owner: Yongzhi Chen Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yongzhi Chen
[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12940 ) Change subject: IMPALA-7322: Add storage wait time to profile .. Patch Set 14: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/12940 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6dde7e394b7c1c396d835ef6aa0a55930c0a8660 Gerrit-Change-Number: 12940 Gerrit-PatchSet: 14 Gerrit-Owner: Yongzhi Chen Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yongzhi Chen Gerrit-Comment-Date: Tue, 25 Jun 2019 22:02:06 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12940 ) Change subject: IMPALA-7322: Add storage wait time to profile .. Patch Set 14: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/4537/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/12940 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6dde7e394b7c1c396d835ef6aa0a55930c0a8660 Gerrit-Change-Number: 12940 Gerrit-PatchSet: 14 Gerrit-Owner: Yongzhi Chen Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yongzhi Chen Gerrit-Comment-Date: Tue, 25 Jun 2019 16:30:57 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile
Bharath Vissapragada has posted comments on this change. ( http://gerrit.cloudera.org:8080/12940 ) Change subject: IMPALA-7322: Add storage wait time to profile .. Patch Set 13: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/12940 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6dde7e394b7c1c396d835ef6aa0a55930c0a8660 Gerrit-Change-Number: 12940 Gerrit-PatchSet: 13 Gerrit-Owner: Yongzhi Chen Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yongzhi Chen Gerrit-Comment-Date: Tue, 25 Jun 2019 16:30:44 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12940 ) Change subject: IMPALA-7322: Add storage wait time to profile .. Patch Set 14: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/12940 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6dde7e394b7c1c396d835ef6aa0a55930c0a8660 Gerrit-Change-Number: 12940 Gerrit-PatchSet: 14 Gerrit-Owner: Yongzhi Chen Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yongzhi Chen Gerrit-Comment-Date: Tue, 25 Jun 2019 16:30:56 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12940 ) Change subject: IMPALA-7322: Add storage wait time to profile .. Patch Set 13: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/3725/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/12940 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6dde7e394b7c1c396d835ef6aa0a55930c0a8660 Gerrit-Change-Number: 12940 Gerrit-PatchSet: 13 Gerrit-Owner: Yongzhi Chen Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yongzhi Chen Gerrit-Comment-Date: Mon, 24 Jun 2019 21:52:50 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile
Yongzhi Chen has posted comments on this change. ( http://gerrit.cloudera.org:8080/12940 ) Change subject: IMPALA-7322: Add storage wait time to profile .. Patch Set 13: (2 comments) The new patch with the fixes http://gerrit.cloudera.org:8080/#/c/12940/12/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java: http://gerrit.cloudera.org:8080/#/c/12940/12/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@509 PS12, Line 509: > nit: mention the unit too.. ns ms...? Done http://gerrit.cloudera.org:8080/#/c/12940/12/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@959 PS12, Line 959:if (loadParitionFileMetadata) { : storageMetadataLoadTime_ += updateUnpartitionedTableFileMd(); : } : } else { : storageMetadataLoadTime_ += updatePartitionsFromHms( : client, partitionsToUpdate, loadParitionFileMetadata); : } : LOG.info("Incrementally loaded table metadata for: " + getFullName()); : } else { : LOG.info("Fetching partition metadata from the Metastore: " + getFullName()); : // Load all partitions from Hive Metastore, including file metadata. : List msPartitions = : MetaStoreUtil.fetchAllPartitions( : client, db_.getName(), name_, NUM_PARTITION_FETCH_RETRIES); : LOG.info("Fetched partition metadata from the Metastore: " + getFullName()); : storageMetadataLoadTime_ = loadAllPartitions(msPartitions, msTbl); : } : if (loadTableSchema) setAvroSchema(client, msTbl); : setTableStats(msTbl); : fileMetadataStats_.unset(); : refreshLastUsedTime(); : } catch (TableLoadingException e) { : throw e; : } catch (Exception e) { : thr > I think you could use a single try finally for Done -- To view, visit http://gerrit.cloudera.org:8080/12940 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6dde7e394b7c1c396d835ef6aa0a55930c0a8660 Gerrit-Change-Number: 12940 Gerrit-PatchSet: 13 Gerrit-Owner: Yongzhi Chen Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yongzhi Chen Gerrit-Comment-Date: Mon, 24 Jun 2019 21:13:53 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile
Hello Bharath Vissapragada, Anurag Mantripragada, Vihang Karajgaonkar, Sahil Takiar, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/12940 to look at the new patch set (#13). Change subject: IMPALA-7322: Add storage wait time to profile .. IMPALA-7322: Add storage wait time to profile Add metrics to record storage wait time for operations with metadata load in catalog for hdfs, kudu and hbase tables. Pass storage wait time from catalog to fe through thrift and log total storage load time in query profile. Storage-load-time is the amount of time spent loading metadata from the underlying storage layer (e.g. S3, HDFS, Kudu, HBase), which does not include the amount of time spending loading data from HMS. Testing: Ran queries that can trigger all of, none of or some of the related tables loading. Check query profile for each query. Check catalog metrics for each table. Add unit tests to test_observability.py Ran all core tests. Sample output: Profile:(storage-load-time is the added property): After ran a hbase query (Metadata load finished is divided into several lines because of limitation of commit message): Query Compilation: 4s401ms - Metadata load started: 661.084us (661.084us) - Metadata load finished. loaded-tables=1/1 load-requests=1 catalog-updates=3 storage-load-time=233ms: 3s819ms (3s819ms) - Analysis finished: 3s820ms (763.979us) - Value transfer graph computed: 3s820ms (63.193us) Catalog metrics(this sample is from a hdfs table): storage-metadata-load-duration: Count: 1 Mean rate: 0.0085 1 min. rate: 0.032 5 min. rate: 0.1386 15 min. rate: 0.177 Min (msec): 111 Max (msec): 111 Mean (msec): 111.1802 Median (msec): 111.1802 75th-% (msec): 111.1802 95th-% (msec): 111.1802 99th-% (msec): 111.1802 Change-Id: I6dde7e394b7c1c396d835ef6aa0a55930c0a8660 --- M common/thrift/CatalogObjects.thrift M fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java M fe/src/main/java/org/apache/impala/catalog/HBaseTable.java M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/main/java/org/apache/impala/catalog/KuduTable.java M fe/src/main/java/org/apache/impala/catalog/Table.java M tests/query_test/test_observability.py 7 files changed, 131 insertions(+), 29 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/40/12940/13 -- To view, visit http://gerrit.cloudera.org:8080/12940 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I6dde7e394b7c1c396d835ef6aa0a55930c0a8660 Gerrit-Change-Number: 12940 Gerrit-PatchSet: 13 Gerrit-Owner: Yongzhi Chen Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yongzhi Chen
[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile
Bharath Vissapragada has posted comments on this change. ( http://gerrit.cloudera.org:8080/12940 ) Change subject: IMPALA-7322: Add storage wait time to profile .. Patch Set 12: Code-Review+2 (3 comments) http://gerrit.cloudera.org:8080/#/c/12940/12/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java: http://gerrit.cloudera.org:8080/#/c/12940/12/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@509 PS12, Line 509: nit: mention the unit too.. ns ms...? http://gerrit.cloudera.org:8080/#/c/12940/12/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@959 PS12, Line 959: try { : storageMetadataLoadTime_ += updateMdFromHmsTable(msTbl); : if (msTbl.getPartitionKeysSize() == 0) { : if (loadParitionFileMetadata) { : storageMetadataLoadTime_ += updateUnpartitionedTableFileMd(); : } : } else { : storageMetadataLoadTime_ += updatePartitionsFromHms( : client, partitionsToUpdate, loadParitionFileMetadata); : } : } finally { : storageLdTimer.update(storageMetadataLoadTime_, TimeUnit.NANOSECONDS); : } : LOG.info("Incrementally loaded table metadata for: " + getFullName()); : } else { : LOG.info("Fetching partition metadata from the Metastore: " + getFullName()); : // Load all partitions from Hive Metastore, including file metadata. : List msPartitions = : MetaStoreUtil.fetchAllPartitions( : client, db_.getName(), name_, NUM_PARTITION_FETCH_RETRIES); : LOG.info("Fetched partition metadata from the Metastore: " + getFullName()); : try { : storageMetadataLoadTime_ = loadAllPartitions(msPartitions, msTbl); : } finally { : I think you could use a single try finally for storageLdTimer.update(storageMetadataLoadTime_, TimeUnit.NANOSECONDS); Avoids unnecessary nesting. http://gerrit.cloudera.org:8080/#/c/12940/11/tests/query_test/test_observability.py File tests/query_test/test_observability.py: http://gerrit.cloudera.org:8080/#/c/12940/11/tests/query_test/test_observability.py@572 PS11, Line 572: end_time = tree.nodes[1].info_strings["End Time"] : assert len(end_time) == 0, end_time : # Save the last operation to be able to retrieve the profile after closing the query : last_op = handle.get_handle()._last_operation : self.hs2_client.close_query(handle) : tree = last_op.get_profile(TRuntimeProfileFormat.THRIFT) : end_time = tree.nodes[1].info_strings["End Time"] : assert end_time is not None : : def test_query_profile_contains_number_of_fragment_instance(self): : """Test that the expected section for number of fragment instance in : a query profile.""" : event_regexes = [r'Per Host Number of Fragment Instances'] : query = "select count (*) from functional.alltypes" : runtime_profile = self.execute_query(query).runtime_profile : self.__verify_profile_event_sequence(event_regexes, runtime_profile) : : def test_query_profile_storage_load_time_filesystem(self): > The test_query_profile_storage_load_time2 only tests kudu and hbase. For hb Ah sorry, I missed the S3 part, my bad. -- To view, visit http://gerrit.cloudera.org:8080/12940 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6dde7e394b7c1c396d835ef6aa0a55930c0a8660 Gerrit-Change-Number: 12940 Gerrit-PatchSet: 12 Gerrit-Owner: Yongzhi Chen Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yongzhi Chen Gerrit-Comment-Date: Mon, 24 Jun 2019 18:29:34 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12940 ) Change subject: IMPALA-7322: Add storage wait time to profile .. Patch Set 12: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/3711/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/12940 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6dde7e394b7c1c396d835ef6aa0a55930c0a8660 Gerrit-Change-Number: 12940 Gerrit-PatchSet: 12 Gerrit-Owner: Yongzhi Chen Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yongzhi Chen Gerrit-Comment-Date: Fri, 21 Jun 2019 17:11:56 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile
Yongzhi Chen has posted comments on this change. ( http://gerrit.cloudera.org:8080/12940 ) Change subject: IMPALA-7322: Add storage wait time to profile .. Patch Set 12: (3 comments) Uploaded patch 12 to address the issues. Run all core tests for this patch http://gerrit.cloudera.org:8080/#/c/12940/11/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java: http://gerrit.cloudera.org:8080/#/c/12940/11/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@945 PS11, Line 945: //TODO writeIDs may also be loaded in other code paths. : loadValidWriteIdList(client); : } : : final Timer storageLdTimer = : getMetrics().getTimer(Table.STORAGE_METADATA_LOAD_DURATION_METRIC); : storageMetadataLoadTime_ = 0; : : // Load partition and file metadata : if (reuseMetadata) { : // Incrementally update this table's partitions and file metadata : Preconditions.checkState( : partitionsToUpdate == null || loadParitionFileMetadata); : : try { : storageMetadataLoadTime_ += updateMdFromHmsTable(msTbl); : if (msTbl.getPartitionKeysSize() == 0) { : if (loadParitionFileMetadata) { : storageMetadataLoadTime_ += updateUnpartitionedTableFileMd(); : } : } else { : storageMetadataLoadTime_ += updatePartitionsFromHms( : client, partitionsToUpdate, loadParitionFileMetadata); : } : } finally { : storageLdTimer.update(storageMetadataLoadTime_, TimeUnit.NANOSECONDS); : } : LOG.info("Incrementally loaded table metadata for: " + getFullName()); : } else { > Can we clean this up a bit by making all the update/load methods return the Done http://gerrit.cloudera.org:8080/#/c/12940/11/fe/src/main/java/org/apache/impala/catalog/Table.java File fe/src/main/java/org/apache/impala/catalog/Table.java: http://gerrit.cloudera.org:8080/#/c/12940/11/fe/src/main/java/org/apache/impala/catalog/Table.java@118 PS11, Line 118: s > Mention that it is updated on every reload of the table? Done http://gerrit.cloudera.org:8080/#/c/12940/11/tests/query_test/test_observability.py File tests/query_test/test_observability.py: http://gerrit.cloudera.org:8080/#/c/12940/11/tests/query_test/test_observability.py@572 PS11, Line 572: end_time = tree.nodes[1].info_strings["End Time"] : assert len(end_time) == 0, end_time : # Save the last operation to be able to retrieve the profile after closing the query : last_op = handle.get_handle()._last_operation : self.hs2_client.close_query(handle) : tree = last_op.get_profile(TRuntimeProfileFormat.THRIFT) : end_time = tree.nodes[1].info_strings["End Time"] : assert end_time is not None : : def test_query_profile_contains_number_of_fragment_instance(self): : """Test that the expected section for number of fragment instance in : a query profile.""" : event_regexes = [r'Per Host Number of Fragment Instances'] : query = "select count (*) from functional.alltypes" : runtime_profile = self.execute_query(query).runtime_profile : self.__verify_profile_event_sequence(event_regexes, runtime_profile) : : def test_query_profile_storage_load_time_filesystem(self): > you could just do something like The test_query_profile_storage_load_time2 only tests kudu and hbase. For hbase and kudu, we only use hdfs, other filessystems are skipped. In order to skip the filesystems such as S3, you need to tell a reason, so .hbase is added (But both kudu and hbase tests will be skipped). For the funational tests, all the filesystem will be tested, it does not skip any tests. So we need two tests, I will change the two test names. -- To view, visit http://gerrit.cloudera.org:8080/12940 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6dde7e394b7c1c396d835ef6aa0a55930c0a8660 Gerrit-Change-Number: 12940 Gerrit-PatchSet: 12 Gerrit-Owner: Yongzhi Chen Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Bharath
[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile
Hello Bharath Vissapragada, Anurag Mantripragada, Vihang Karajgaonkar, Sahil Takiar, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/12940 to look at the new patch set (#12). Change subject: IMPALA-7322: Add storage wait time to profile .. IMPALA-7322: Add storage wait time to profile Add metrics to record storage wait time for operations with metadata load in catalog for hdfs, kudu and hbase tables. Pass storage wait time from catalog to fe through thrift and log total storage load time in query profile. Storage-load-time is the amount of time spent loading metadata from the underlying storage layer (e.g. S3, HDFS, Kudu, HBase), which does not include the amount of time spending loading data from HMS. Testing: Ran queries that can trigger all of, none of or some of the related tables loading. Check query profile for each query. Check catalog metrics for each table. Add unit tests to test_observability.py Ran all core tests. Sample output: Profile:(storage-load-time is the added property): After ran a hbase query (Metadata load finished is divided into several lines because of limitation of commit message): Query Compilation: 4s401ms - Metadata load started: 661.084us (661.084us) - Metadata load finished. loaded-tables=1/1 load-requests=1 catalog-updates=3 storage-load-time=233ms: 3s819ms (3s819ms) - Analysis finished: 3s820ms (763.979us) - Value transfer graph computed: 3s820ms (63.193us) Catalog metrics(this sample is from a hdfs table): storage-metadata-load-duration: Count: 1 Mean rate: 0.0085 1 min. rate: 0.032 5 min. rate: 0.1386 15 min. rate: 0.177 Min (msec): 111 Max (msec): 111 Mean (msec): 111.1802 Median (msec): 111.1802 75th-% (msec): 111.1802 95th-% (msec): 111.1802 99th-% (msec): 111.1802 Change-Id: I6dde7e394b7c1c396d835ef6aa0a55930c0a8660 --- M common/thrift/CatalogObjects.thrift M fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java M fe/src/main/java/org/apache/impala/catalog/HBaseTable.java M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/main/java/org/apache/impala/catalog/KuduTable.java M fe/src/main/java/org/apache/impala/catalog/Table.java M tests/query_test/test_observability.py 7 files changed, 142 insertions(+), 31 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/40/12940/12 -- To view, visit http://gerrit.cloudera.org:8080/12940 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I6dde7e394b7c1c396d835ef6aa0a55930c0a8660 Gerrit-Change-Number: 12940 Gerrit-PatchSet: 12 Gerrit-Owner: Yongzhi Chen Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yongzhi Chen
[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile
Bharath Vissapragada has posted comments on this change. ( http://gerrit.cloudera.org:8080/12940 ) Change subject: IMPALA-7322: Add storage wait time to profile .. Patch Set 11: (3 comments) lgtm, a few suggestions for cleanup, can +2 after that. http://gerrit.cloudera.org:8080/#/c/12940/11/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java: http://gerrit.cloudera.org:8080/#/c/12940/11/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@945 PS11, Line 945: : final Timer storageLdTimer = : getMetrics().getTimer(Table.STORAGE_METADATA_LOAD_DURATION_METRIC); : final Clock clock = Clock.defaultClock(); : long startTime; : storageMetadataLoadTime_ = 0; : : // Load partition and file metadata : if (reuseMetadata) { : // Incrementally update this table's partitions and file metadata : Preconditions.checkState( : partitionsToUpdate == null || loadParitionFileMetadata); : : try { : startTime = clock.getTick(); : updateMdFromHmsTable(msTbl); : // Time spent on getting file system and check access levels. : storageMetadataLoadTime_ += clock.getTick() - startTime; : if (msTbl.getPartitionKeysSize() == 0) { : startTime = clock.getTick(); : if (loadParitionFileMetadata) updateUnpartitionedTableFileMd(); : storageMetadataLoadTime_ += clock.getTick() - startTime; : } else { : long partitionfileLoadTime = updatePartitionsFromHms( : client, partitionsToUpdate, loadParitionFileMetadata); : storageMetadataLoadTime_ += partitionfileLoadTime; : } : } finally { : storageLdTim Can we clean this up a bit by making all the update/load methods return the time spent in the loading, sum up all of them and finally update the metric? I think that will be cleaner. http://gerrit.cloudera.org:8080/#/c/12940/11/fe/src/main/java/org/apache/impala/catalog/Table.java File fe/src/main/java/org/apache/impala/catalog/Table.java: http://gerrit.cloudera.org:8080/#/c/12940/11/fe/src/main/java/org/apache/impala/catalog/Table.java@118 PS11, Line 118: . Mention that it is updated on every reload of the table? http://gerrit.cloudera.org:8080/#/c/12940/11/tests/query_test/test_observability.py File tests/query_test/test_observability.py: http://gerrit.cloudera.org:8080/#/c/12940/11/tests/query_test/test_observability.py@572 PS11, Line 572: def test_query_profile_storage_load_time1(self): : """Test that when a query needs load metadata for table(s), the : storage load time should be in the profile. Tests file systems.""" : self.__check_query_profile_storage_load_time("functional") : : @SkipIfS3.hbase : @SkipIfLocal.hbase : @SkipIfIsilon.hbase : @SkipIfABFS.hbase : @SkipIfADLS.hbase : def test_query_profile_storage_load_time2(self): : """Test that when a query needs load metadata for table(s), the : storage load time should be in the profile. Tests kudu and hbase.""" : # KUDU table : self.__check_query_profile_storage_load_time("functional_kudu") : : # HBASE table : self.__check_query_profile_storage_load_time("functional_hbase") you could just do something like for suffix in ["", "_kudu", "_hbase"]: check_query_...("functional" + siffix) Please avoid test names that are hard to follow. Btw, why are we skipping hbase? -- To view, visit http://gerrit.cloudera.org:8080/12940 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6dde7e394b7c1c396d835ef6aa0a55930c0a8660 Gerrit-Change-Number: 12940 Gerrit-PatchSet: 11 Gerrit-Owner: Yongzhi Chen Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yongzhi Chen Gerrit-Comment-Date: Thu, 20 Jun 2019 18:13:32 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile
Sahil Takiar has posted comments on this change. ( http://gerrit.cloudera.org:8080/12940 ) Change subject: IMPALA-7322: Add storage wait time to profile .. Patch Set 11: Code-Review+1 Any rebase conflicts that need to be reviewed? Otherwise, carrying +1 -- To view, visit http://gerrit.cloudera.org:8080/12940 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6dde7e394b7c1c396d835ef6aa0a55930c0a8660 Gerrit-Change-Number: 12940 Gerrit-PatchSet: 11 Gerrit-Owner: Yongzhi Chen Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yongzhi Chen Gerrit-Comment-Date: Thu, 20 Jun 2019 17:42:15 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12940 ) Change subject: IMPALA-7322: Add storage wait time to profile .. Patch Set 11: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/3693/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/12940 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6dde7e394b7c1c396d835ef6aa0a55930c0a8660 Gerrit-Change-Number: 12940 Gerrit-PatchSet: 11 Gerrit-Owner: Yongzhi Chen Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yongzhi Chen Gerrit-Comment-Date: Thu, 20 Jun 2019 15:31:25 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile
Yongzhi Chen has posted comments on this change. ( http://gerrit.cloudera.org:8080/12940 ) Change subject: IMPALA-7322: Add storage wait time to profile .. Patch Set 11: Submit patch 11 for the rebase, thanks for the review. -- To view, visit http://gerrit.cloudera.org:8080/12940 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6dde7e394b7c1c396d835ef6aa0a55930c0a8660 Gerrit-Change-Number: 12940 Gerrit-PatchSet: 11 Gerrit-Owner: Yongzhi Chen Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yongzhi Chen Gerrit-Comment-Date: Thu, 20 Jun 2019 14:51:45 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile
Hello Bharath Vissapragada, Anurag Mantripragada, Vihang Karajgaonkar, Sahil Takiar, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/12940 to look at the new patch set (#11). Change subject: IMPALA-7322: Add storage wait time to profile .. IMPALA-7322: Add storage wait time to profile Add metrics to record storage wait time for operations with metadata load in catalog for hdfs, kudu and hbase tables. Pass storage wait time from catalog to fe through thrift and log total storage load time in query profile. Storage-load-time is the amount of time spent loading metadata from the underlying storage layer (e.g. S3, HDFS, Kudu, HBase), which does not include the amount of time spending loading data from HMS. Testing: Ran queries that can trigger all of, none of or some of the related tables loading. Check query profile for each query. Check catalog metrics for each table. Add unit tests to test_observability.py Ran all core tests. Sample output: Profile:(storage-load-time is the added property): After ran a hbase query (Metadata load finished is divided into several lines because of limitation of commit message): Query Compilation: 4s401ms - Metadata load started: 661.084us (661.084us) - Metadata load finished. loaded-tables=1/1 load-requests=1 catalog-updates=3 storage-load-time=233ms: 3s819ms (3s819ms) - Analysis finished: 3s820ms (763.979us) - Value transfer graph computed: 3s820ms (63.193us) Catalog metrics(this sample is from a hdfs table): storage-metadata-load-duration: Count: 1 Mean rate: 0.0085 1 min. rate: 0.032 5 min. rate: 0.1386 15 min. rate: 0.177 Min (msec): 111 Max (msec): 111 Mean (msec): 111.1802 Median (msec): 111.1802 75th-% (msec): 111.1802 95th-% (msec): 111.1802 99th-% (msec): 111.1802 Change-Id: I6dde7e394b7c1c396d835ef6aa0a55930c0a8660 --- M common/thrift/CatalogObjects.thrift M fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java M fe/src/main/java/org/apache/impala/catalog/HBaseTable.java M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/main/java/org/apache/impala/catalog/KuduTable.java M fe/src/main/java/org/apache/impala/catalog/Table.java M tests/query_test/test_observability.py 7 files changed, 131 insertions(+), 25 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/40/12940/11 -- To view, visit http://gerrit.cloudera.org:8080/12940 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I6dde7e394b7c1c396d835ef6aa0a55930c0a8660 Gerrit-Change-Number: 12940 Gerrit-PatchSet: 11 Gerrit-Owner: Yongzhi Chen Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yongzhi Chen
[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile
Bharath Vissapragada has posted comments on this change. ( http://gerrit.cloudera.org:8080/12940 ) Change subject: IMPALA-7322: Add storage wait time to profile .. Patch Set 10: Yongzhi, feel free to rebase when you get a chance. -- To view, visit http://gerrit.cloudera.org:8080/12940 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6dde7e394b7c1c396d835ef6aa0a55930c0a8660 Gerrit-Change-Number: 12940 Gerrit-PatchSet: 10 Gerrit-Owner: Yongzhi Chen Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yongzhi Chen Gerrit-Comment-Date: Tue, 18 Jun 2019 21:33:22 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile
Sahil Takiar has posted comments on this change. ( http://gerrit.cloudera.org:8080/12940 ) Change subject: IMPALA-7322: Add storage wait time to profile .. Patch Set 10: Code-Review+1 Looks like this needs to be rebased, but overall LGTM. -- To view, visit http://gerrit.cloudera.org:8080/12940 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6dde7e394b7c1c396d835ef6aa0a55930c0a8660 Gerrit-Change-Number: 12940 Gerrit-PatchSet: 10 Gerrit-Owner: Yongzhi Chen Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yongzhi Chen Gerrit-Comment-Date: Tue, 18 Jun 2019 20:55:39 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12940 ) Change subject: IMPALA-7322: Add storage wait time to profile .. Patch Set 10: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/3448/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/12940 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6dde7e394b7c1c396d835ef6aa0a55930c0a8660 Gerrit-Change-Number: 12940 Gerrit-PatchSet: 10 Gerrit-Owner: Yongzhi Chen Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yongzhi Chen Gerrit-Comment-Date: Thu, 30 May 2019 21:14:48 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile
Yongzhi Chen has posted comments on this change. ( http://gerrit.cloudera.org:8080/12940 ) Change subject: IMPALA-7322: Add storage wait time to profile .. Patch Set 10: (2 comments) Submit a new patch http://gerrit.cloudera.org:8080/#/c/12940/9/fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java File fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java: http://gerrit.cloudera.org:8080/#/c/12940/9/fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java@239 PS9, Line 239: .filter(Table.class::isInstance) : .map(Table.class::cast) > do u need a space between ".class" and "::" I did not do that, it was changed by the reformatting tool(recommended by a ramp-up doc): git diff -U0 --no-color HEAD^ | "${IMPALA_TOOLCHAIN}/llvm-${IMPALA_LLVM_VERSION}/share/clang/clang-format-diff.py" -binary "${IMPALA_TOOLCHAIN}/llvm-${IMPALA_LLVM_VERSION}/bin/clang-format" -p1 -i I will remove the space. http://gerrit.cloudera.org:8080/#/c/12940/7/fe/src/main/java/org/apache/impala/catalog/Table.java File fe/src/main/java/org/apache/impala/catalog/Table.java: http://gerrit.cloudera.org:8080/#/c/12940/7/fe/src/main/java/org/apache/impala/catalog/Table.java@198 PS7, Line 198: metrics_.addTimer(REFRESH_DURATION_METRIC); > Then we should add a snippet of the updated metrics to the commit message ( Will add. -- To view, visit http://gerrit.cloudera.org:8080/12940 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6dde7e394b7c1c396d835ef6aa0a55930c0a8660 Gerrit-Change-Number: 12940 Gerrit-PatchSet: 10 Gerrit-Owner: Yongzhi Chen Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yongzhi Chen Gerrit-Comment-Date: Thu, 30 May 2019 20:48:31 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile
Hello Bharath Vissapragada, Anurag Mantripragada, Vihang Karajgaonkar, Sahil Takiar, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/12940 to look at the new patch set (#10). Change subject: IMPALA-7322: Add storage wait time to profile .. IMPALA-7322: Add storage wait time to profile Add metrics to record storage wait time for operations with metadata load in catalog for hdfs, kudu and hbase tables. Pass storage wait time from catalog to fe through thrift and log total storage load time in query profile. Storage-load-time is the amount of time spent loading metadata from the underlying storage layer (e.g. S3, HDFS, Kudu, HBase), which does not include the amount of time spending loading data from HMS. Testing: Ran queries that can trigger all of, none of or some of the related tables loading. Check query profile for each query. Check catalog metrics for each table. Add unit tests to test_observability.py Ran all core tests. Sample output: Profile:(storage-load-time is the added property): After ran a hbase query (Metadata load finished is divided into several lines because of limitation of commit message): Query Compilation: 4s401ms - Metadata load started: 661.084us (661.084us) - Metadata load finished. loaded-tables=1/1 load-requests=1 catalog-updates=3 storage-load-time=233ms: 3s819ms (3s819ms) - Analysis finished: 3s820ms (763.979us) - Value transfer graph computed: 3s820ms (63.193us) Catalog metrics(this sample is from a hdfs table): storage-metadata-load-duration: Count: 1 Mean rate: 0.0085 1 min. rate: 0.032 5 min. rate: 0.1386 15 min. rate: 0.177 Min (msec): 111 Max (msec): 111 Mean (msec): 111.1802 Median (msec): 111.1802 75th-% (msec): 111.1802 95th-% (msec): 111.1802 99th-% (msec): 111.1802 Change-Id: I6dde7e394b7c1c396d835ef6aa0a55930c0a8660 --- M common/thrift/CatalogObjects.thrift M fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java M fe/src/main/java/org/apache/impala/catalog/HBaseTable.java M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/main/java/org/apache/impala/catalog/KuduTable.java M fe/src/main/java/org/apache/impala/catalog/Table.java M tests/query_test/test_observability.py 7 files changed, 131 insertions(+), 25 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/40/12940/10 -- To view, visit http://gerrit.cloudera.org:8080/12940 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I6dde7e394b7c1c396d835ef6aa0a55930c0a8660 Gerrit-Change-Number: 12940 Gerrit-PatchSet: 10 Gerrit-Owner: Yongzhi Chen Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yongzhi Chen
[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile
Sahil Takiar has posted comments on this change. ( http://gerrit.cloudera.org:8080/12940 ) Change subject: IMPALA-7322: Add storage wait time to profile .. Patch Set 7: (4 comments) http://gerrit.cloudera.org:8080/#/c/12940/7/fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java File fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java: http://gerrit.cloudera.org:8080/#/c/12940/7/fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java@235 PS7, Line 235: for (FeTable loadedTbl: loadedTbls_.values()) { > I cannot see many advantages of using stream here, but change according to IMO it's easier to understand. http://gerrit.cloudera.org:8080/#/c/12940/9/fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java File fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java: http://gerrit.cloudera.org:8080/#/c/12940/9/fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java@239 PS9, Line 239: : } do u need a space between ".class" and "::" http://gerrit.cloudera.org:8080/#/c/12940/7/fe/src/main/java/org/apache/impala/catalog/KuduTable.java File fe/src/main/java/org/apache/impala/catalog/KuduTable.java: http://gerrit.cloudera.org:8080/#/c/12940/7/fe/src/main/java/org/apache/impala/catalog/KuduTable.java@202 PS7, Line 202: loadAllColumnStats(msClient); > protected void loadAllColumnStats(IMetaStoreClient client) hmm originally though ImpalaRuntimeException was a RuntimeException (as the name would imply), but I guess its not http://gerrit.cloudera.org:8080/#/c/12940/7/fe/src/main/java/org/apache/impala/catalog/Table.java File fe/src/main/java/org/apache/impala/catalog/Table.java: http://gerrit.cloudera.org:8080/#/c/12940/7/fe/src/main/java/org/apache/impala/catalog/Table.java@198 PS7, Line 198: metrics_.addTimer(STORAGE_MD_LOAD_DURATION_METRIC); > Yes, it is intentional. It is in the commit message: Add metrics to record Then we should add a snippet of the updated metrics to the commit message (similar to what was done for the runtime profile). -- To view, visit http://gerrit.cloudera.org:8080/12940 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6dde7e394b7c1c396d835ef6aa0a55930c0a8660 Gerrit-Change-Number: 12940 Gerrit-PatchSet: 7 Gerrit-Owner: Yongzhi Chen Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yongzhi Chen Gerrit-Comment-Date: Wed, 29 May 2019 16:40:45 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12940 ) Change subject: IMPALA-7322: Add storage wait time to profile .. Patch Set 9: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/3416/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/12940 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6dde7e394b7c1c396d835ef6aa0a55930c0a8660 Gerrit-Change-Number: 12940 Gerrit-PatchSet: 9 Gerrit-Owner: Yongzhi Chen Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yongzhi Chen Gerrit-Comment-Date: Tue, 28 May 2019 23:09:09 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile
Yongzhi Chen has posted comments on this change. ( http://gerrit.cloudera.org:8080/12940 ) Change subject: IMPALA-7322: Add storage wait time to profile .. Patch Set 9: (9 comments) Patch 9 should address most issues. http://gerrit.cloudera.org:8080/#/c/12940/8//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/12940/8//COMMIT_MSG@9 PS8, Line 9: Add metrics to record storage wait time for operations with > I think we need a clearer definition of "storage-load-time" spelled out in Done http://gerrit.cloudera.org:8080/#/c/12940/7/common/thrift/CatalogObjects.thrift File common/thrift/CatalogObjects.thrift: http://gerrit.cloudera.org:8080/#/c/12940/7/common/thrift/CatalogObjects.thrift@487 PS7, Line 487: 15: optional i64 storage_metadata_load_time > nit: this patch seems to be interchanging "filesystem" and "storage" throug Done http://gerrit.cloudera.org:8080/#/c/12940/7/fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java File fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java: http://gerrit.cloudera.org:8080/#/c/12940/7/fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java@235 PS7, Line 235: // the tables already loaded before the query was called). > You can replace this for loop with a Java stream. Should look something lik I cannot see many advantages of using stream here, but change according to the recommendations. http://gerrit.cloudera.org:8080/#/c/12940/7/fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java@246 PS7, Line 246: + "storage-load-time=%dms", > nit: use TimeUnit to convert from nanoseconds to milliseconds: Done http://gerrit.cloudera.org:8080/#/c/12940/7/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java: http://gerrit.cloudera.org:8080/#/c/12940/7/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@1045 PS7, Line 1045:* DROP + CREATE. It removes from this table partitions that no longer exist > nit: return --> Returns Done http://gerrit.cloudera.org:8080/#/c/12940/7/fe/src/main/java/org/apache/impala/catalog/KuduTable.java File fe/src/main/java/org/apache/impala/catalog/KuduTable.java: http://gerrit.cloudera.org:8080/#/c/12940/7/fe/src/main/java/org/apache/impala/catalog/KuduTable.java@202 PS7, Line 202: try { > this changes the exception handling of `loadAllColumnStats`, if this method protected void loadAllColumnStats(IMetaStoreClient client) This method does not throw Exceptions. http://gerrit.cloudera.org:8080/#/c/12940/7/fe/src/main/java/org/apache/impala/catalog/Table.java File fe/src/main/java/org/apache/impala/catalog/Table.java: http://gerrit.cloudera.org:8080/#/c/12940/7/fe/src/main/java/org/apache/impala/catalog/Table.java@117 PS7, Line 117: // Time spent in the source systems loading the fs metadata. > nit: change to storageMetadataLoadTime Done http://gerrit.cloudera.org:8080/#/c/12940/7/fe/src/main/java/org/apache/impala/catalog/Table.java@140 PS7, Line 140: public static final String REFRESH_DURATION_METRIC = "refresh-duration"; > nit: change to STORAGE_METADATA_LOAD_DURATION_METRIC; I think abbreviating Done http://gerrit.cloudera.org:8080/#/c/12940/7/fe/src/main/java/org/apache/impala/catalog/Table.java@198 PS7, Line 198: metrics_.addTimer(REFRESH_DURATION_METRIC); > I'm not sure if this is intentional, but this approach also adds this metri Yes, it is intentional. It is in the commit message: Add metrics to record storage wait time for operations with metadata load in catalog -- To view, visit http://gerrit.cloudera.org:8080/12940 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6dde7e394b7c1c396d835ef6aa0a55930c0a8660 Gerrit-Change-Number: 12940 Gerrit-PatchSet: 9 Gerrit-Owner: Yongzhi Chen Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yongzhi Chen Gerrit-Comment-Date: Tue, 28 May 2019 22:31:02 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile
Hello Bharath Vissapragada, Anurag Mantripragada, Vihang Karajgaonkar, Sahil Takiar, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/12940 to look at the new patch set (#9). Change subject: IMPALA-7322: Add storage wait time to profile .. IMPALA-7322: Add storage wait time to profile Add metrics to record storage wait time for operations with metadata load in catalog for hdfs, kudu and hbase tables. Pass storage wait time from catalog to fe through thrift and log total storage load time in query profile. Storage-load-time is the amount of time spent loading metadata from the underlying storage layer (e.g. S3, HDFS, Kudu, HBase), which does not include the amount of time spending loading data from HMS. Testing: Ran queries that can trigger all of, none of or some of the related tables loading. Check query profile for each query. Check catalog metrics for each table. Add unit tests to test_observability.py Ran all core tests. Sample output(storage-load-time is the added property): After run a hbase query (Metadata load finished is divided into several lines because of limitation of commit message): Query Compilation: 4s401ms - Metadata load started: 661.084us (661.084us) - Metadata load finished. loaded-tables=1/1 load-requests=1 catalog-updates=3 storage-load-time=233ms: 3s819ms (3s819ms) - Analysis finished: 3s820ms (763.979us) - Value transfer graph computed: 3s820ms (63.193us) Change-Id: I6dde7e394b7c1c396d835ef6aa0a55930c0a8660 --- M common/thrift/CatalogObjects.thrift M fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java M fe/src/main/java/org/apache/impala/catalog/HBaseTable.java M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/main/java/org/apache/impala/catalog/KuduTable.java M fe/src/main/java/org/apache/impala/catalog/Table.java M tests/query_test/test_observability.py 7 files changed, 131 insertions(+), 25 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/40/12940/9 -- To view, visit http://gerrit.cloudera.org:8080/12940 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I6dde7e394b7c1c396d835ef6aa0a55930c0a8660 Gerrit-Change-Number: 12940 Gerrit-PatchSet: 9 Gerrit-Owner: Yongzhi Chen Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yongzhi Chen
[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile
Sahil Takiar has posted comments on this change. ( http://gerrit.cloudera.org:8080/12940 ) Change subject: IMPALA-7322: Add storage wait time to profile .. Patch Set 3: (2 comments) http://gerrit.cloudera.org:8080/#/c/12940/8//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/12940/8//COMMIT_MSG@9 PS8, Line 9: Add metrics to record storage loading time in catalog for hdfs, kudu I think we need a clearer definition of "storage-load-time" spelled out in the commit message. We should be clear that it refers to the amount of time spent waiting loading metadata from the underlying storage layer (e.g. S3, HDFS, Kudu, HBase) and *not* does not include the amount of time spending loading data from HMS. http://gerrit.cloudera.org:8080/#/c/12940/3/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java: http://gerrit.cloudera.org:8080/#/c/12940/3/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@1237 PS3, Line 1237: final Timer.Context ctxStg = > Do not understand your comment. I think Bharath means that this code is measuring the amount of time it takes to load metadata from HMS, but what we want to measure is the amount of time it takes to load metadata from HDFS (e.g. from the NameNode). https://gerrit.cloudera.org/#/c/12950/ looks like it re-factors the code so all the filesystem loading is abstracted into a single method (I think loadFileMetadataForPartitions). -- To view, visit http://gerrit.cloudera.org:8080/12940 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6dde7e394b7c1c396d835ef6aa0a55930c0a8660 Gerrit-Change-Number: 12940 Gerrit-PatchSet: 3 Gerrit-Owner: Yongzhi Chen Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yongzhi Chen Gerrit-Comment-Date: Tue, 28 May 2019 18:54:06 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12940 ) Change subject: IMPALA-7322: Add storage wait time to profile .. Patch Set 8: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/3387/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/12940 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6dde7e394b7c1c396d835ef6aa0a55930c0a8660 Gerrit-Change-Number: 12940 Gerrit-PatchSet: 8 Gerrit-Owner: Yongzhi Chen Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yongzhi Chen Gerrit-Comment-Date: Fri, 24 May 2019 20:51:57 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile
Hello Bharath Vissapragada, Anurag Mantripragada, Vihang Karajgaonkar, Sahil Takiar, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/12940 to look at the new patch set (#8). Change subject: IMPALA-7322: Add storage wait time to profile .. IMPALA-7322: Add storage wait time to profile Add metrics to record storage wait time for operations with metadata load in catalog for hdfs, kudu and hbase tables. Pass storage wait time from catalog to fe through thrift and log total storage load time in query profile. Testing: Ran queries that can trigger all of, none of or some of the related tables loading. Check query profile for each query. Check catalog metrics for each table. Add unit tests to test_observability.py Ran all core tests. Sample output(storage-load-time is the added property): After run a hbase query (Metadata load finished is divided into several lines because of limitation of commit message): Query Compilation: 4s401ms - Metadata load started: 661.084us (661.084us) - Metadata load finished. loaded-tables=1/1 load-requests=1 catalog-updates=3 storage-load-time=233ms: 3s819ms (3s819ms) - Analysis finished: 3s820ms (763.979us) - Value transfer graph computed: 3s820ms (63.193us) Change-Id: I6dde7e394b7c1c396d835ef6aa0a55930c0a8660 --- M common/thrift/CatalogObjects.thrift M fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java M fe/src/main/java/org/apache/impala/catalog/HBaseTable.java M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/main/java/org/apache/impala/catalog/KuduTable.java M fe/src/main/java/org/apache/impala/catalog/Table.java M tests/query_test/test_observability.py 7 files changed, 126 insertions(+), 25 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/40/12940/8 -- To view, visit http://gerrit.cloudera.org:8080/12940 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I6dde7e394b7c1c396d835ef6aa0a55930c0a8660 Gerrit-Change-Number: 12940 Gerrit-PatchSet: 8 Gerrit-Owner: Yongzhi Chen Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yongzhi Chen
[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile
Sahil Takiar has posted comments on this change. ( http://gerrit.cloudera.org:8080/12940 ) Change subject: IMPALA-7322: Add storage wait time to profile .. Patch Set 7: (9 comments) Haven't gone through the code in HdfsTable.java yet, but here are several more comments. http://gerrit.cloudera.org:8080/#/c/12940/7/common/thrift/CatalogObjects.thrift File common/thrift/CatalogObjects.thrift: http://gerrit.cloudera.org:8080/#/c/12940/7/common/thrift/CatalogObjects.thrift@487 PS7, Line 487: 15: optional i64 filesystem_metadata_load_time nit: this patch seems to be interchanging "filesystem" and "storage" throughout the variable names. I would suggest just sticking to the more generic "storage" rather than "filesystem" throughout this patch, mainly because we typically use "filesystem" to refer to HDFS, whereas HBase / Kudu are more storage systems, since they really provide typical filesystem semantics http://gerrit.cloudera.org:8080/#/c/12940/7/fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java File fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java: http://gerrit.cloudera.org:8080/#/c/12940/7/fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java@235 PS7, Line 235: for (FeTable loadedTbl: loadedTbls_.values()) { You can replace this for loop with a Java stream. Should look something like: long storageLoadTimeNano = loadedTbls_.values().stream().filter(Table.class::isInstance).map(Table.class::cast).mapToLong(Table::getStorageLoadTime()).sum(); Syntax might not be exactly correct, but that should be the general idea. http://gerrit.cloudera.org:8080/#/c/12940/7/fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java@246 PS7, Line 246: storageLoadTimeNano/100)); nit: use TimeUnit to convert from nanoseconds to milliseconds: TimeUnit.MILLISECONDS.convert(storageLoadTimeNano, TimeUnit.NANOSECONDS); http://gerrit.cloudera.org:8080/#/c/12940/7/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java: http://gerrit.cloudera.org:8080/#/c/12940/7/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@1045 PS7, Line 1045:* the table partitions if 'partitionsToUpdate' is null. return the time spent nit: return --> Returns http://gerrit.cloudera.org:8080/#/c/12940/7/fe/src/main/java/org/apache/impala/catalog/KuduTable.java File fe/src/main/java/org/apache/impala/catalog/KuduTable.java: http://gerrit.cloudera.org:8080/#/c/12940/7/fe/src/main/java/org/apache/impala/catalog/KuduTable.java@202 PS7, Line 202: loadAllColumnStats(msClient); this changes the exception handling of `loadAllColumnStats`, if this method throws an `ImpalaRuntimeException` it is no longer caught and re-thrown. I would suggest moving this back to its original location, and just using the timer to surround the call to `loadSchemaFromKudu` http://gerrit.cloudera.org:8080/#/c/12940/7/fe/src/main/java/org/apache/impala/catalog/Table.java File fe/src/main/java/org/apache/impala/catalog/Table.java: http://gerrit.cloudera.org:8080/#/c/12940/7/fe/src/main/java/org/apache/impala/catalog/Table.java@117 PS7, Line 117: protected long fileSystemMdLoadTime_=0; nit: change to storageMetadataLoadTime http://gerrit.cloudera.org:8080/#/c/12940/7/fe/src/main/java/org/apache/impala/catalog/Table.java@140 PS7, Line 140: public static final String STORAGE_MD_LOAD_DURATION_METRIC = nit: change to STORAGE_METADATA_LOAD_DURATION_METRIC; I think abbreviating METADATA as MD is a little unclear would be good to add some javadocs specific to this metric explaining what exactly it measures http://gerrit.cloudera.org:8080/#/c/12940/7/fe/src/main/java/org/apache/impala/catalog/Table.java@198 PS7, Line 198: metrics_.addTimer(STORAGE_MD_LOAD_DURATION_METRIC); I'm not sure if this is intentional, but this approach also adds this metric to the catalogd table metrics. so if you look at: http://[hostname-running-catalogd]:25020/table_metrics?name=default.web_returns this metric will show up there as well. I actually think its useful to have this in the catalogd UI web page as well, but we should add this to the commit message as well. http://gerrit.cloudera.org:8080/#/c/12940/7/fe/src/main/java/org/apache/impala/catalog/Table.java@203 PS7, Line 203: public long getStorageLoadTime() { return fileSystemMdLoadTime_; } nit: since this is a public method, it should have some code comments -- To view, visit http://gerrit.cloudera.org:8080/12940 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6dde7e394b7c1c396d835ef6aa0a55930c0a8660 Gerrit-Change-Number: 12940 Gerrit-PatchSet: 7 Gerrit-Owner: Yongzhi Chen Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer:
[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile
Sahil Takiar has posted comments on this change. ( http://gerrit.cloudera.org:8080/12940 ) Change subject: IMPALA-7322: Add storage wait time to profile .. Patch Set 7: (1 comment) http://gerrit.cloudera.org:8080/#/c/12940/6/tests/query_test/test_observability.py File tests/query_test/test_observability.py: http://gerrit.cloudera.org:8080/#/c/12940/6/tests/query_test/test_observability.py@567 PS6, Line 567: def test_query_profile_storage_load_time(self): > The skips are all related to hbase. It is the same as other hbase related t If you include the annotation @SkipIfS3 or @SkipIfABFS then it won't run during our automated Impala-S3 / Impala-ABFS tests. The ".hbase" suffix just displays the reason why it was disabled. I think this test needs to be split up so that we separate the HBase / Kudu specific portion of this test from the HDFS portion. It's a bit confusing, but HDFS specific tests are capable of running against S3 / ADLS, because instead of storing the data on HDFS, it is stored on S3 or ADLS. Whereas for HBase / Kudu, there is no equivalent of running a HBase / Kudu specific test against S3. The reason is that HDFS provides a FileSystem interface that allows clients to interact with any storage system, notably S3 and ADLS. Whereas clients of HBase / Kudu don't use the FileSystem interface. -- To view, visit http://gerrit.cloudera.org:8080/12940 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6dde7e394b7c1c396d835ef6aa0a55930c0a8660 Gerrit-Change-Number: 12940 Gerrit-PatchSet: 7 Gerrit-Owner: Yongzhi Chen Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yongzhi Chen Gerrit-Comment-Date: Fri, 24 May 2019 17:23:23 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12940 ) Change subject: IMPALA-7322: Add storage wait time to profile .. Patch Set 7: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/3381/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/12940 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6dde7e394b7c1c396d835ef6aa0a55930c0a8660 Gerrit-Change-Number: 12940 Gerrit-PatchSet: 7 Gerrit-Owner: Yongzhi Chen Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yongzhi Chen Gerrit-Comment-Date: Fri, 24 May 2019 16:36:28 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile
Hello Bharath Vissapragada, Anurag Mantripragada, Vihang Karajgaonkar, Sahil Takiar, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/12940 to look at the new patch set (#7). Change subject: IMPALA-7322: Add storage wait time to profile .. IMPALA-7322: Add storage wait time to profile Add metrics to record storage wait time for operations with metadata load in catalog for hdfs, kudu and hbase tables. Pass storage wait time from catalog to fe through thrift and log total storage load time in query profile. Testing: Ran queries that can trigger all of, none of or some of the related tables loading. Check query profile for each query. Check catalog metrics for each table. Add unit tests to test_observability.py Ran all core tests. Sample output(storage-load-time is the added property): After run a hbase query (Metadata load finished is divided into several lines because of limitation of commit message): Query Compilation: 4s401ms - Metadata load started: 661.084us (661.084us) - Metadata load finished. loaded-tables=1/1 load-requests=1 catalog-updates=3 storage-load-time=233ms: 3s819ms (3s819ms) - Analysis finished: 3s820ms (763.979us) - Value transfer graph computed: 3s820ms (63.193us) Change-Id: I6dde7e394b7c1c396d835ef6aa0a55930c0a8660 --- M common/thrift/CatalogObjects.thrift M fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java M fe/src/main/java/org/apache/impala/catalog/HBaseTable.java M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/main/java/org/apache/impala/catalog/KuduTable.java M fe/src/main/java/org/apache/impala/catalog/Table.java M testdata/cluster/node_templates/common/etc/kudu/master.conf.tmpl M testdata/cluster/node_templates/common/etc/kudu/tserver.conf.tmpl M tests/query_test/test_observability.py 9 files changed, 132 insertions(+), 31 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/40/12940/7 -- To view, visit http://gerrit.cloudera.org:8080/12940 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I6dde7e394b7c1c396d835ef6aa0a55930c0a8660 Gerrit-Change-Number: 12940 Gerrit-PatchSet: 7 Gerrit-Owner: Yongzhi Chen Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yongzhi Chen
[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile
Yongzhi Chen has posted comments on this change. ( http://gerrit.cloudera.org:8080/12940 ) Change subject: IMPALA-7322: Add storage wait time to profile .. Patch Set 6: (13 comments) Will submit patch set 7 http://gerrit.cloudera.org:8080/#/c/12940/6//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/12940/6//COMMIT_MSG@26 PS6, Line 26: - Metadata load finished. loaded-tables=1/1 > It's not clear to me what part of this profile existing before the patch, a Done http://gerrit.cloudera.org:8080/#/c/12940/1/common/thrift/CatalogObjects.thrift File common/thrift/CatalogObjects.thrift: http://gerrit.cloudera.org:8080/#/c/12940/1/common/thrift/CatalogObjects.thrift@474 PS1, Line 474: 12: optional TDataSourceTable data_source_table > Maybe say "Set if this table needs storage access" ? Done http://gerrit.cloudera.org:8080/#/c/12940/3/common/thrift/CatalogObjects.thrift File common/thrift/CatalogObjects.thrift: http://gerrit.cloudera.org:8080/#/c/12940/3/common/thrift/CatalogObjects.thrift@476 PS3, Line 476: kudu table > Try to use meaningful variable names whenever possible (for readability). s Done http://gerrit.cloudera.org:8080/#/c/12940/3/fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java File fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java: http://gerrit.cloudera.org:8080/#/c/12940/3/fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java@229 PS3, Line 229: > meaningful variable names (for readability) Done http://gerrit.cloudera.org:8080/#/c/12940/6/fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java File fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java: http://gerrit.cloudera.org:8080/#/c/12940/6/fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java@232 PS6, Line 232: long storageLdTmNano = 0; > nit: I think storageLoadTimeNano is clearer Done http://gerrit.cloudera.org:8080/#/c/12940/6/fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java@234 PS6, Line 234: // the tables already loaded before the query called). > nit: before the the query *was* called Done http://gerrit.cloudera.org:8080/#/c/12940/6/fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java@235 PS6, Line 235: for (FeTable ldTbl: loadedTbls_.values()) { > nit: ldTbl --> loadedTbl Done http://gerrit.cloudera.org:8080/#/c/12940/6/fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java@259 PS6, Line 259: > nit: is this necessary? Done http://gerrit.cloudera.org:8080/#/c/12940/6/fe/src/main/java/org/apache/impala/catalog/HBaseTable.java File fe/src/main/java/org/apache/impala/catalog/HBaseTable.java: http://gerrit.cloudera.org:8080/#/c/12940/6/fe/src/main/java/org/apache/impala/catalog/HBaseTable.java@105 PS6, Line 105: final Timer.Context ctxStorageLdTm = > nit: I think storageLoadTimer is a better variable name; in general I would Done http://gerrit.cloudera.org:8080/#/c/12940/6/fe/src/main/java/org/apache/impala/catalog/HBaseTable.java@120 PS6, Line 120: > nit: is this necessary Done http://gerrit.cloudera.org:8080/#/c/12940/1/tests/query_test/test_observability.py File tests/query_test/test_observability.py: http://gerrit.cloudera.org:8080/#/c/12940/1/tests/query_test/test_observability.py@573 PS1, Line 573: runtime_profile = self.execute_query(query).runtime_profile > How about having the method signature as test_query_profile_storage_load_ti Done http://gerrit.cloudera.org:8080/#/c/12940/6/tests/query_test/test_observability.py File tests/query_test/test_observability.py: http://gerrit.cloudera.org:8080/#/c/12940/6/tests/query_test/test_observability.py@567 PS6, Line 567: def test_query_profile_storge_load_time(self): > the setup of this test seems a little off. right now this gets skipped for The skips are all related to hbase. It is the same as other hbase related tests in the file. I thought Hbase may not work well with these system. I did not block tests for s3 against HDFS. And I thought if we do not skip S3, it will be run by some kind of tests, right? I will make code change related to the duplicate codes. http://gerrit.cloudera.org:8080/#/c/12940/6/tests/query_test/test_observability.py@575 PS6, Line 575: # Call the second time, no metastore loading needed. > is this a HDFS specific feature? do we cache metadata for Kudu and HBase ta We call Kudu and HBase during metadata loading. For example loadSchemaFromKudu() and warming up for HBase (first connect to HBase) -- To view, visit http://gerrit.cloudera.org:8080/12940 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6dde7e394b7c1c396d835ef6aa0a55930c0a8660 Gerrit-Change-Number: 12940 Gerrit-PatchSet: 6 Gerrit-Owner: Yongzhi Chen Gerrit-Reviewer: Anurag Mantripragada
[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile
Sahil Takiar has posted comments on this change. ( http://gerrit.cloudera.org:8080/12940 ) Change subject: IMPALA-7322: Add storage wait time to profile .. Patch Set 3: (9 comments) a few comments / questions; still working on understanding the internals of this code, so will add some more comments later; http://gerrit.cloudera.org:8080/#/c/12940/6//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/12940/6//COMMIT_MSG@26 PS6, Line 26: load-requests=1 catalog-updates=3 It's not clear to me what part of this profile existing before the patch, and what was added by this patch. Can you specifically call out the parts that changed? I think it makes the commit message more informative. http://gerrit.cloudera.org:8080/#/c/12940/6/fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java File fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java: http://gerrit.cloudera.org:8080/#/c/12940/6/fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java@232 PS6, Line 232: for (FeTable ldTbl: loadedTbls_.values()) { nit: I think storageLoadTimeNano is clearer http://gerrit.cloudera.org:8080/#/c/12940/6/fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java@234 PS6, Line 234: requestedTbls.contains(((Table)ldTbl).getTableName())) { nit: before the the query *was* called http://gerrit.cloudera.org:8080/#/c/12940/6/fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java@235 PS6, Line 235: strgTmNano += ((Table)ldTbl).getStorageLoadTime(); nit: ldTbl --> loadedTbl http://gerrit.cloudera.org:8080/#/c/12940/6/fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java@259 PS6, Line 259: Set viewTbls = new HashSet<>(); nit: is this necessary? http://gerrit.cloudera.org:8080/#/c/12940/6/fe/src/main/java/org/apache/impala/catalog/HBaseTable.java File fe/src/main/java/org/apache/impala/catalog/HBaseTable.java: http://gerrit.cloudera.org:8080/#/c/12940/6/fe/src/main/java/org/apache/impala/catalog/HBaseTable.java@105 PS6, Line 105: final Timer.Context ctxStg = nit: I think storageLoadTimer is a better variable name; in general I wouldn't abbreviate words inside variables names unless really necessary (e.g. if it would make the variable name way to long) http://gerrit.cloudera.org:8080/#/c/12940/6/fe/src/main/java/org/apache/impala/catalog/HBaseTable.java@120 PS6, Line 120: // Set table stats. nit: is this necessary http://gerrit.cloudera.org:8080/#/c/12940/6/tests/query_test/test_observability.py File tests/query_test/test_observability.py: http://gerrit.cloudera.org:8080/#/c/12940/6/tests/query_test/test_observability.py@567 PS6, Line 567: the setup of this test seems a little off. right now this gets skipped for S3, Local, Isilon, ABFS, and ADLS tests; but I see no reason why this feature shouldn't work for all the files stores as well right? especially for S3 where storage load time could be long because S3 metadata operations can take a long time. I would suggest re-organizing this test as follows: * There is a lot of duplicate code here, I would suggest creating an internal helper method that runs the invalidate query, runs the select count(*) against a configurable database, checks for the storage-load-time, re-runs the query, and checks for storage-load-time * you could either use the Python annotations to setup the correct test matrix, or use the IS_* flags from tests.util.filesystem_utils I think the correct test matrix would be: * When running regular tests (e.g. against HDFS), run the queries against HDFS, HBase, and Kudu * When running S3 tests, only run against S3 (S3 tests load all HDFS tables onto S3, so you can use functional.alltypes; unlike HBase and Kudu, there is no S3 specific table) * The setup for S3 should apply to Local, Isilon, ABFS, ADLS as well http://gerrit.cloudera.org:8080/#/c/12940/6/tests/query_test/test_observability.py@575 PS6, Line 575: storage load time should be in the profile.""" is this a HDFS specific feature? do we cache metadata for Kudu and HBase tables too? -- To view, visit http://gerrit.cloudera.org:8080/12940 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6dde7e394b7c1c396d835ef6aa0a55930c0a8660 Gerrit-Change-Number: 12940 Gerrit-PatchSet: 3 Gerrit-Owner: Yongzhi Chen Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yongzhi Chen Gerrit-Comment-Date: Thu, 23 May 2019 20:18:53 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12940 ) Change subject: IMPALA-7322: Add storage wait time to profile .. Patch Set 6: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/3337/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/12940 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6dde7e394b7c1c396d835ef6aa0a55930c0a8660 Gerrit-Change-Number: 12940 Gerrit-PatchSet: 6 Gerrit-Owner: Yongzhi Chen Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yongzhi Chen Gerrit-Comment-Date: Wed, 22 May 2019 20:51:14 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12940 ) Change subject: IMPALA-7322: Add storage wait time to profile .. Patch Set 5: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/3335/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/12940 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6dde7e394b7c1c396d835ef6aa0a55930c0a8660 Gerrit-Change-Number: 12940 Gerrit-PatchSet: 5 Gerrit-Owner: Yongzhi Chen Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yongzhi Chen Gerrit-Comment-Date: Wed, 22 May 2019 20:47:33 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile
Hello Bharath Vissapragada, Anurag Mantripragada, Vihang Karajgaonkar, Sahil Takiar, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/12940 to look at the new patch set (#6). Change subject: IMPALA-7322: Add storage wait time to profile .. IMPALA-7322: Add storage wait time to profile Add metrics to record storage wait time for operations with metadata load in catalog for hdfs, kudu and hbase tables. Pass storage wait time from catalog to fe through thrift and log total storage load time in query profile. Testing: Ran queries that can trigger all of, none of or some of the related tables loading. Check query profile for each query. Check catalog metrics for each table. Add unit tests to test_observability.py Ran all core tests. Sample output: After run a hbase query (Metadata load finished is divided into several lines because of limitation of commit message): Query Compilation: 4s401ms - Metadata load started: 661.084us (661.084us) - Metadata load finished. loaded-tables=1/1 load-requests=1 catalog-updates=3 storage-load-time=233ms: 3s819ms (3s819ms) - Analysis finished: 3s820ms (763.979us) - Value transfer graph computed: 3s820ms (63.193us) Change-Id: I6dde7e394b7c1c396d835ef6aa0a55930c0a8660 --- M common/thrift/CatalogObjects.thrift M fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java M fe/src/main/java/org/apache/impala/catalog/HBaseTable.java M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/main/java/org/apache/impala/catalog/KuduTable.java M fe/src/main/java/org/apache/impala/catalog/Table.java M tests/query_test/test_observability.py 7 files changed, 129 insertions(+), 25 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/40/12940/6 -- To view, visit http://gerrit.cloudera.org:8080/12940 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I6dde7e394b7c1c396d835ef6aa0a55930c0a8660 Gerrit-Change-Number: 12940 Gerrit-PatchSet: 6 Gerrit-Owner: Yongzhi Chen Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yongzhi Chen
[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12940 ) Change subject: IMPALA-7322: Add storage wait time to profile .. Patch Set 5: (2 comments) http://gerrit.cloudera.org:8080/#/c/12940/5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java: http://gerrit.cloudera.org:8080/#/c/12940/5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@930 PS5, Line 930: final Timer storageLdTimer = getMetrics().getTimer(Table.STORAGE_MD_LOAD_DURATION_METRIC); line too long (98 > 90) http://gerrit.cloudera.org:8080/#/c/12940/5/fe/src/main/java/org/apache/impala/catalog/Table.java File fe/src/main/java/org/apache/impala/catalog/Table.java: http://gerrit.cloudera.org:8080/#/c/12940/5/fe/src/main/java/org/apache/impala/catalog/Table.java@140 PS5, Line 140: public static final String STORAGE_MD_LOAD_DURATION_METRIC = "storage-metadata-load-duration"; line too long (96 > 90) -- To view, visit http://gerrit.cloudera.org:8080/12940 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6dde7e394b7c1c396d835ef6aa0a55930c0a8660 Gerrit-Change-Number: 12940 Gerrit-PatchSet: 5 Gerrit-Owner: Yongzhi Chen Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yongzhi Chen Gerrit-Comment-Date: Wed, 22 May 2019 19:46:55 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile
Hello Bharath Vissapragada, Anurag Mantripragada, Vihang Karajgaonkar, Sahil Takiar, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/12940 to look at the new patch set (#5). Change subject: IMPALA-7322: Add storage wait time to profile .. IMPALA-7322: Add storage wait time to profile Add metrics to record storage wait time for operations with metadata load in catalog for hdfs, kudu and hbase tables. Pass storage wait time from catalog to fe through thrift and log total storage load time in query profile. Testing: Ran queries that can trigger all of, none of or some of the related tables loading. Check query profile for each query. Check catalog metrics for each table. Add unit tests to test_observability.py Ran all core tests. Sample output: After run a hbase query (Metadata load finished is divided into several lines because of limitation of commit message): Query Compilation: 4s401ms - Metadata load started: 661.084us (661.084us) - Metadata load finished. loaded-tables=1/1 load-requests=1 catalog-updates=3 storage-load-time=233ms: 3s819ms (3s819ms) - Analysis finished: 3s820ms (763.979us) - Value transfer graph computed: 3s820ms (63.193us) Change-Id: I6dde7e394b7c1c396d835ef6aa0a55930c0a8660 --- M common/thrift/CatalogObjects.thrift M fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java M fe/src/main/java/org/apache/impala/catalog/HBaseTable.java M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/main/java/org/apache/impala/catalog/KuduTable.java M fe/src/main/java/org/apache/impala/catalog/Table.java M tests/query_test/test_observability.py 7 files changed, 127 insertions(+), 25 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/40/12940/5 -- To view, visit http://gerrit.cloudera.org:8080/12940 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I6dde7e394b7c1c396d835ef6aa0a55930c0a8660 Gerrit-Change-Number: 12940 Gerrit-PatchSet: 5 Gerrit-Owner: Yongzhi Chen Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yongzhi Chen
[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12940 ) Change subject: IMPALA-7322: Add storage wait time to profile .. Patch Set 4: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/2739/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/12940 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6dde7e394b7c1c396d835ef6aa0a55930c0a8660 Gerrit-Change-Number: 12940 Gerrit-PatchSet: 4 Gerrit-Owner: Yongzhi Chen Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yongzhi Chen Gerrit-Comment-Date: Thu, 11 Apr 2019 17:15:20 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile
Hello Bharath Vissapragada, Anurag Mantripragada, Vihang Karajgaonkar, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/12940 to look at the new patch set (#4). Change subject: IMPALA-7322: Add storage wait time to profile .. IMPALA-7322: Add storage wait time to profile Add metrics to record storage loading time in catalog for hdfs, kudu and hbase tables. Pass storage wait time from catalog to fe through thrift and log total storage loading time in query profile. Testing: Run queries that can trigger all of, none of or some of the related tables loading. Check query profile for each query. Check catalog metrics for each table. Add unit test to test_observability.py Sample output: After run a hbase query(Metadata load finished is divided into several lines because of limitation of commit message): Query Compilation: 4s401ms - Metadata load started: 661.084us (661.084us) - Metadata load finished. loaded-tables=1/1 load-requests=1 catalog-updates=3 storage-load-time=233ms: 3s819ms (3s819ms) - Analysis finished: 3s820ms (763.979us) - Value transfer graph computed: 3s820ms (63.193us) Change-Id: I6dde7e394b7c1c396d835ef6aa0a55930c0a8660 --- M common/thrift/CatalogObjects.thrift M fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java M fe/src/main/java/org/apache/impala/catalog/HBaseTable.java M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/main/java/org/apache/impala/catalog/KuduTable.java M fe/src/main/java/org/apache/impala/catalog/Table.java M tests/query_test/test_observability.py 7 files changed, 103 insertions(+), 17 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/40/12940/4 -- To view, visit http://gerrit.cloudera.org:8080/12940 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I6dde7e394b7c1c396d835ef6aa0a55930c0a8660 Gerrit-Change-Number: 12940 Gerrit-PatchSet: 4 Gerrit-Owner: Yongzhi Chen Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yongzhi Chen
[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile
Yongzhi Chen has posted comments on this change. ( http://gerrit.cloudera.org:8080/12940 ) Change subject: IMPALA-7322: Add storage wait time to profile .. Patch Set 3: (4 comments) Will fix the names, please see my responses for other comments. http://gerrit.cloudera.org:8080/#/c/12940/3/fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java File fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java: http://gerrit.cloudera.org:8080/#/c/12940/3/fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java@232 PS3, Line 232: loadedTbls_ > How do you distinguish between tables loaded now vs tables already loaded. See line 234: requestedTbls.contains(((Table)ldTbl).getTableName())) http://gerrit.cloudera.org:8080/#/c/12940/3/fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java@233 PS3, Line 233: ldTbl instanceof Table > This is kinda obvious, force a downcast? FeTable is a interface, Table just implement it, so not all FeTable is Table. So I think it is necessary. private final Map loadedTbls_ = new HashMap<>(); http://gerrit.cloudera.org:8080/#/c/12940/3/fe/src/main/java/org/apache/impala/catalog/HBaseTable.java File fe/src/main/java/org/apache/impala/catalog/HBaseTable.java: http://gerrit.cloudera.org:8080/#/c/12940/3/fe/src/main/java/org/apache/impala/catalog/HBaseTable.java@109 PS3, Line 109: hbaseTableName_ = Util.getHBaseTableName(getMetaStoreTable()); : // Warm up the connection and verify the table exists. : Util.getHBaseTable(hbaseTableName_).close(); : columnFamilies_ = null; : cols = Util.loadColumns(msTable_); : } finally { : storageLdTime_ = ctxStg.stop(); > I don't think any of this calls into HBase Util.getHBaseTable(hbaseTableName_).close(); is the only call in load method that calls hbase. http://gerrit.cloudera.org:8080/#/c/12940/3/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java: http://gerrit.cloudera.org:8080/#/c/12940/3/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@1237 PS3, Line 1237: final Timer.Context ctxStg = > I think the scopes for metrics are incorrect. I'd suggest to do this on top Do not understand your comment. -- To view, visit http://gerrit.cloudera.org:8080/12940 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6dde7e394b7c1c396d835ef6aa0a55930c0a8660 Gerrit-Change-Number: 12940 Gerrit-PatchSet: 3 Gerrit-Owner: Yongzhi Chen Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Reviewer: Yongzhi Chen Gerrit-Comment-Date: Tue, 09 Apr 2019 14:09:31 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile
Bharath Vissapragada has posted comments on this change. ( http://gerrit.cloudera.org:8080/12940 ) Change subject: IMPALA-7322: Add storage wait time to profile .. Patch Set 3: (14 comments) http://gerrit.cloudera.org:8080/#/c/12940/3/common/thrift/CatalogObjects.thrift File common/thrift/CatalogObjects.thrift: http://gerrit.cloudera.org:8080/#/c/12940/3/common/thrift/CatalogObjects.thrift@474 PS3, Line 474: Set iff this is a table needs storage access. I think this is vague. Better description? http://gerrit.cloudera.org:8080/#/c/12940/3/common/thrift/CatalogObjects.thrift@476 PS3, Line 476: storage_ld_time Try to use meaningful variable names whenever possible (for readability). say something like filesystem_metadata_load_time or something along those lines? http://gerrit.cloudera.org:8080/#/c/12940/3/common/thrift/CatalogObjects.thrift@476 PS3, Line 476: optional Why is it optional? http://gerrit.cloudera.org:8080/#/c/12940/3/fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java File fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java: http://gerrit.cloudera.org:8080/#/c/12940/3/fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java@229 PS3, Line 229: strgTmNano meaningful variable names (for readability) http://gerrit.cloudera.org:8080/#/c/12940/3/fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java@232 PS3, Line 232: ldTbl Better variable names http://gerrit.cloudera.org:8080/#/c/12940/3/fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java@232 PS3, Line 232: loadedTbls_ How do you distinguish between tables loaded now vs tables already loaded. For ex: if 3 of 5 tables need loading (meaning 2 already cached), this loop seems to sum the metric for all the 5 tables. http://gerrit.cloudera.org:8080/#/c/12940/3/fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java@233 PS3, Line 233: ldTbl instanceof Table This is kinda obvious, force a downcast? http://gerrit.cloudera.org:8080/#/c/12940/3/fe/src/main/java/org/apache/impala/catalog/HBaseTable.java File fe/src/main/java/org/apache/impala/catalog/HBaseTable.java: http://gerrit.cloudera.org:8080/#/c/12940/3/fe/src/main/java/org/apache/impala/catalog/HBaseTable.java@105 PS3, Line 105: ctxStg better variable names http://gerrit.cloudera.org:8080/#/c/12940/3/fe/src/main/java/org/apache/impala/catalog/HBaseTable.java@109 PS3, Line 109: hbaseTableName_ = Util.getHBaseTableName(getMetaStoreTable()); : // Warm up the connection and verify the table exists. : Util.getHBaseTable(hbaseTableName_).close(); : columnFamilies_ = null; : cols = Util.loadColumns(msTable_); : } finally { : storageLdTime_ = ctxStg.stop(); I don't think any of this calls into HBase http://gerrit.cloudera.org:8080/#/c/12940/3/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java: http://gerrit.cloudera.org:8080/#/c/12940/3/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@1215 PS3, Line 1215: storageLdTm Better variable names http://gerrit.cloudera.org:8080/#/c/12940/3/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@1237 PS3, Line 1237: final Timer.Context ctxStg = I think the scopes for metrics are incorrect. I'd suggest to do this on top of https://gerrit.cloudera.org/#/c/12950/ which refactors the code nicely. http://gerrit.cloudera.org:8080/#/c/12940/3/fe/src/main/java/org/apache/impala/catalog/Table.java File fe/src/main/java/org/apache/impala/catalog/Table.java: http://gerrit.cloudera.org:8080/#/c/12940/3/fe/src/main/java/org/apache/impala/catalog/Table.java@115 PS3, Line 115: // Storage load time for the table Time spent in the source systems loading the fs metadata... (or something similar?) http://gerrit.cloudera.org:8080/#/c/12940/3/fe/src/main/java/org/apache/impala/catalog/Table.java@116 PS3, Line 116: storageLdTime_ better name? fileSystemMdLoadTime or something along those lines? http://gerrit.cloudera.org:8080/#/c/12940/3/fe/src/main/java/org/apache/impala/catalog/Table.java@193 PS3, Line 193: STORAGE_LOAD_DURATION_METRIC better name -- To view, visit http://gerrit.cloudera.org:8080/12940 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6dde7e394b7c1c396d835ef6aa0a55930c0a8660 Gerrit-Change-Number: 12940 Gerrit-PatchSet: 3 Gerrit-Owner: Yongzhi Chen Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Mon, 08 Apr 2019 23:16:19 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12940 ) Change subject: IMPALA-7322: Add storage wait time to profile .. Patch Set 3: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/2663/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/12940 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6dde7e394b7c1c396d835ef6aa0a55930c0a8660 Gerrit-Change-Number: 12940 Gerrit-PatchSet: 3 Gerrit-Owner: Yongzhi Chen Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Fri, 05 Apr 2019 22:28:33 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12940 ) Change subject: IMPALA-7322: Add storage wait time to profile .. Patch Set 2: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/2662/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/12940 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6dde7e394b7c1c396d835ef6aa0a55930c0a8660 Gerrit-Change-Number: 12940 Gerrit-PatchSet: 2 Gerrit-Owner: Yongzhi Chen Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Fri, 05 Apr 2019 22:18:46 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile
Hello Bharath Vissapragada, Anurag Mantripragada, Vihang Karajgaonkar, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/12940 to look at the new patch set (#3). Change subject: IMPALA-7322: Add storage wait time to profile .. IMPALA-7322: Add storage wait time to profile Add metrics to record storage loading time in catalog for hdfs, kudu and hbase tables. Pass storage wait time from catalog to fe through thrift and log total storage loading time in query profile. Testing: Run queries that can trigger all of, none of or some of the related tables loading. Check query profile for each query. Check catalog metrics for each table. Add unit test to test_observability.py Sample output: After run a hbase query(Metadata load finished is divided into several lines because of limitation of commit message): Query Compilation: 4s401ms - Metadata load started: 661.084us (661.084us) - Metadata load finished. loaded-tables=1/1 load-requests=1 catalog-updates=3 storage-load-time=233ms: 3s819ms (3s819ms) - Analysis finished: 3s820ms (763.979us) - Value transfer graph computed: 3s820ms (63.193us) Change-Id: I6dde7e394b7c1c396d835ef6aa0a55930c0a8660 --- M common/thrift/CatalogObjects.thrift M fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java M fe/src/main/java/org/apache/impala/catalog/HBaseTable.java M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/main/java/org/apache/impala/catalog/KuduTable.java M fe/src/main/java/org/apache/impala/catalog/Table.java M tests/query_test/test_observability.py 7 files changed, 102 insertions(+), 17 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/40/12940/3 -- To view, visit http://gerrit.cloudera.org:8080/12940 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I6dde7e394b7c1c396d835ef6aa0a55930c0a8660 Gerrit-Change-Number: 12940 Gerrit-PatchSet: 3 Gerrit-Owner: Yongzhi Chen Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Vihang Karajgaonkar
[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile
Hello Bharath Vissapragada, Anurag Mantripragada, Vihang Karajgaonkar, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/12940 to look at the new patch set (#2). Change subject: IMPALA-7322: Add storage wait time to profile .. IMPALA-7322: Add storage wait time to profile Add metrics to record storage loading time in catalog for hdfs, kudu and hbase tables. Pass storage wait time from catalog to fe through thrift and log total storage loading time in query profile. Testing: Run queries that can trigger all of, none of or some of the related tables loading. Check query profile for each query. Check catalog metrics for each table. Add unit test to test_observability.py Sample output: After run a hbase query(Metadata load finished is divided into several lines because of limitation of commit message): Query Compilation: 4s401ms - Metadata load started: 661.084us (661.084us) - Metadata load finished. loaded-tables=1/1 load-requests=1 catalog-updates=3 storage-load-time=233ms: 3s819ms (3s819ms) - Analysis finished: 3s820ms (763.979us) - Value transfer graph computed: 3s820ms (63.193us) Change-Id: I6dde7e394b7c1c396d835ef6aa0a55930c0a8660 --- M common/thrift/CatalogObjects.thrift M fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java M fe/src/main/java/org/apache/impala/catalog/HBaseTable.java M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/main/java/org/apache/impala/catalog/KuduTable.java M fe/src/main/java/org/apache/impala/catalog/Table.java M tests/query_test/test_observability.py 7 files changed, 102 insertions(+), 17 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/40/12940/2 -- To view, visit http://gerrit.cloudera.org:8080/12940 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I6dde7e394b7c1c396d835ef6aa0a55930c0a8660 Gerrit-Change-Number: 12940 Gerrit-PatchSet: 2 Gerrit-Owner: Yongzhi Chen Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Vihang Karajgaonkar
[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile
Anurag Mantripragada has posted comments on this change. ( http://gerrit.cloudera.org:8080/12940 ) Change subject: IMPALA-7322: Add storage wait time to profile .. Patch Set 1: Code-Review+1 (2 comments) Apart from nits, LGTM. Will let others take a look. http://gerrit.cloudera.org:8080/#/c/12940/1/common/thrift/CatalogObjects.thrift File common/thrift/CatalogObjects.thrift: http://gerrit.cloudera.org:8080/#/c/12940/1/common/thrift/CatalogObjects.thrift@474 PS1, Line 474: // Set iff this is a table needs access storage. Maybe say "Set if this table needs storage access" ? http://gerrit.cloudera.org:8080/#/c/12940/1/tests/query_test/test_observability.py File tests/query_test/test_observability.py: http://gerrit.cloudera.org:8080/#/c/12940/1/tests/query_test/test_observability.py@573 PS1, Line 573: def test_query_profile_storge_load(self): How about having the method signature as test_query_profile_storage_load_time(self) ? -- To view, visit http://gerrit.cloudera.org:8080/12940 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6dde7e394b7c1c396d835ef6aa0a55930c0a8660 Gerrit-Change-Number: 12940 Gerrit-PatchSet: 1 Gerrit-Owner: Yongzhi Chen Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Fri, 05 Apr 2019 21:19:30 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12940 ) Change subject: IMPALA-7322: Add storage wait time to profile .. Patch Set 1: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/2661/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/12940 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6dde7e394b7c1c396d835ef6aa0a55930c0a8660 Gerrit-Change-Number: 12940 Gerrit-PatchSet: 1 Gerrit-Owner: Yongzhi Chen Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Fri, 05 Apr 2019 18:11:16 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7322: Add storage wait time to profile
Yongzhi Chen has uploaded this change for review. ( http://gerrit.cloudera.org:8080/12940 Change subject: IMPALA-7322: Add storage wait time to profile .. IMPALA-7322: Add storage wait time to profile Add metrics to record storage loading time in catalog for hdfs, kudu and hbase tables. Pass storage wait time from catalog to fe through thrift and log total storage loading time in query profile. Testing: Run queries that can trigger all of, none of or some of the related tables loading. Check query profile for each query. Check catalog metrics for each table. Add unit test to test_observability.py Sample output: After run a hbase query(Metadata load finished is divided into several lines because of limitation of commit message): Query Compilation: 4s401ms - Metadata load started: 661.084us (661.084us) - Metadata load finished. loaded-tables=1/1 load-requests=1 catalog-updates=3 storage-load-time=233ms: 3s819ms (3s819ms) - Analysis finished: 3s820ms (763.979us) - Value transfer graph computed: 3s820ms (63.193us) Change-Id: I6dde7e394b7c1c396d835ef6aa0a55930c0a8660 --- M common/thrift/CatalogObjects.thrift M fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java M fe/src/main/java/org/apache/impala/catalog/HBaseTable.java M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/main/java/org/apache/impala/catalog/KuduTable.java M fe/src/main/java/org/apache/impala/catalog/Table.java M tests/query_test/test_observability.py 7 files changed, 102 insertions(+), 17 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/40/12940/1 -- To view, visit http://gerrit.cloudera.org:8080/12940 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I6dde7e394b7c1c396d835ef6aa0a55930c0a8660 Gerrit-Change-Number: 12940 Gerrit-PatchSet: 1 Gerrit-Owner: Yongzhi Chen