[Impala-ASF-CR] IMPALA-9109: Add top-k metadata loading ranking on catalogd UI
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14600 ) Change subject: IMPALA-9109: Add top-k metadata loading ranking on catalogd UI .. Patch Set 7: Build Failed https://jenkins.impala.io/job/gerrit-code-review-checks/4942/ : Initial code review checks failed. See linked job for details on the failure. -- To view, visit http://gerrit.cloudera.org:8080/14600 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9305a867d7053cde9acc42dae6e47ee440f1a8bf Gerrit-Change-Number: 14600 Gerrit-PatchSet: 7 Gerrit-Owner: Jiawei Wang Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jiawei Wang Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Xiaomeng Zhang Gerrit-Reviewer: Yongzhi Chen Gerrit-Comment-Date: Tue, 05 Nov 2019 07:35:19 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9109: Add top-k metadata loading ranking on catalogd UI
Hello Quanlong Huang, Yongzhi Chen, Xiaomeng Zhang, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/14600 to look at the new patch set (#8). Change subject: IMPALA-9109: Add top-k metadata loading ranking on catalogd UI .. IMPALA-9109: Add top-k metadata loading ranking on catalogd UI Add functions in CatalogUsageMonitor to monitor and report the catalog usage of the tables that have the longest metadata loading time (including maximum, median, 75th-ile, 95th-ile, and 99th-ile time). Set the default capacity of the table loading metrics to 100. However, there might be a problem here because we only keep 100 entries. For example, a table with a higher median loading time but a lower maximum loading time may not make it into the top 100. For now, we will ignore cases like that because we are aiming to find the tables with the longest maximum loading time. Add the sorted table to the catalog server web UI. The table is sorted by the maximum from the load_duration metrics, but users can sort by other metrics in the catalogd debug UI. Testing: - Add an end-to-end test for the web page to verify that the label and text exist in the catalog debug page. Verify that all fields are in the JSON response. - Launch Impala and activate some tables to see the table loading time shown successfully on the catalog debug UI page.
Change-Id: I9305a867d7053cde9acc42dae6e47ee440f1a8bf --- M be/src/catalog/catalog-server.cc M be/src/catalog/catalog-server.h M common/thrift/JniCatalog.thrift M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/CatalogUsageMonitor.java M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/main/java/org/apache/impala/catalog/Table.java M tests/webserver/test_web_pages.py M www/catalog.tmpl M www/scripts/util.js 10 files changed, 297 insertions(+), 13 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/00/14600/8 -- To view, visit http://gerrit.cloudera.org:8080/14600 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I9305a867d7053cde9acc42dae6e47ee440f1a8bf Gerrit-Change-Number: 14600 Gerrit-PatchSet: 8 Gerrit-Owner: Jiawei Wang Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jiawei Wang Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Xiaomeng Zhang Gerrit-Reviewer: Yongzhi Chen
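The capacity caveat in the commit message above can be illustrated with a small sketch. It is not the CatalogUsageMonitor code (which is Java); the function name `top_k_by_max`, the table names, and the durations are all hypothetical, and the ranking is assumed to be a plain "largest maximum wins" selection.

```python
import heapq
from statistics import median


def top_k_by_max(load_times_ns, k):
    """Return the k table names with the largest maximum load time.

    load_times_ns maps a (hypothetical) table name to a list of observed
    load durations in nanoseconds. Ranking uses only the maximum.
    """
    return heapq.nlargest(
        k, load_times_ns, key=lambda name: max(load_times_ns[name])
    )


# Hypothetical samples: "b" is consistently slow but never has a large spike.
samples = {
    "a": [10, 10, 900],
    "b": [500, 500, 500],
    "c": [5, 5, 1000],
}

ranked = top_k_by_max(samples, 2)
# "b" has the highest median of the three tables, yet it is dropped because
# its maximum is the lowest and only k entries are retained.
dropped = [name for name in samples if name not in ranked]
```

With a capacity of 2, the table with the highest median is the one that falls out, which is the trade-off the commit message accepts when ranking by maximum.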
[Impala-ASF-CR] IMPALA-9109: Add top-k metadata loading ranking on catalogd UI
Jiawei Wang has uploaded a new patch set (#7). ( http://gerrit.cloudera.org:8080/14600 ) Change subject: IMPALA-9109: Add top-k metadata loading ranking on catalogd UI .. IMPALA-9109: Add top-k metadata loading ranking on catalogd UI Add functions in CatalogUsageMonitor to monitor and report the catalog usage of the tables that have the longest metadata loading time (including maximum, median, 75th-ile, 95th-ile, and 99th-ile time). Set the default capacity of the table loading metrics to 100. However, there might be a problem here because we only keep 100 entries. For example, a table with a higher median loading time but a lower maximum loading time may not make it into the top 100. For now, we will ignore cases like that because we are aiming to find the tables with the longest maximum loading time. Add the sorted table to the catalog server web UI. The table is sorted by the maximum from the load_duration metrics, but users can sort by other metrics in the catalogd debug UI. Testing: - Add an end-to-end test for the web page to verify that the label and text exist in the catalog debug page. Verify that all fields are in the JSON response. - Launch Impala and activate some tables to see the table loading time shown successfully on the catalog debug UI page.
Change-Id: I9305a867d7053cde9acc42dae6e47ee440f1a8bf --- M be/src/catalog/catalog-server.cc M be/src/catalog/catalog-server.h M common/thrift/JniCatalog.thrift M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/CatalogUsageMonitor.java M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/main/java/org/apache/impala/catalog/Table.java M tests/webserver/test_web_pages.py M www/catalog.tmpl M www/scripts/util.js 10 files changed, 297 insertions(+), 13 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/00/14600/7 -- To view, visit http://gerrit.cloudera.org:8080/14600 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I9305a867d7053cde9acc42dae6e47ee440f1a8bf Gerrit-Change-Number: 14600 Gerrit-PatchSet: 7 Gerrit-Owner: Jiawei Wang Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jiawei Wang Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Xiaomeng Zhang Gerrit-Reviewer: Yongzhi Chen
[Impala-ASF-CR] IMPALA-9027: planner fixes for mt_dop
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14522 ) Change subject: IMPALA-9027: planner fixes for mt_dop .. Patch Set 5: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/14522 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I23395c2dadf6be0e8be99706ca3ab5f4964cbcf9 Gerrit-Change-Number: 14522 Gerrit-PatchSet: 5 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 05 Nov 2019 03:56:19 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9027: planner fixes for mt_dop
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/14522 ) Change subject: IMPALA-9027: planner fixes for mt_dop .. IMPALA-9027: planner fixes for mt_dop * Update the distributed planner to reflect that broadcast join tables are replicated in all fragments. * Did a pass over the planner code looking at call sites of getNumNodes() to confirm that they shouldn't be replaced by getNumInstances() Testing: * Updated affected planner test where PARALLELPLANS had a different join strategy. * Added a targeted test to mem-limit-broadcast-join.test to show that mt_dop affects join mode. * Ran exhaustive tests. Change-Id: I23395c2dadf6be0e8be99706ca3ab5f4964cbcf9 Reviewed-on: http://gerrit.cloudera.org:8080/14522 Reviewed-by: Impala Public Jenkins Tested-by: Impala Public Jenkins --- M fe/src/main/java/org/apache/impala/planner/AnalyticPlanner.java M fe/src/main/java/org/apache/impala/planner/DistributedPlanner.java M fe/src/main/java/org/apache/impala/planner/HashJoinNode.java M fe/src/test/java/org/apache/impala/planner/PlannerTest.java M testdata/workloads/functional-planner/queries/PlannerTest/mem-limit-broadcast-join.test M testdata/workloads/functional-planner/queries/PlannerTest/resource-requirements.test M testdata/workloads/functional-planner/queries/PlannerTest/spillable-buffer-sizing.test M testdata/workloads/functional-planner/queries/PlannerTest/tpcds-all.test M testdata/workloads/functional-planner/queries/PlannerTest/tpch-all.test 9 files changed, 769 insertions(+), 667 deletions(-) Approvals: Impala Public Jenkins: Looks good to me, approved; Verified -- To view, visit http://gerrit.cloudera.org:8080/14522 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I23395c2dadf6be0e8be99706ca3ab5f4964cbcf9 Gerrit-Change-Number: 14522 Gerrit-PatchSet: 6 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-9109: Add top-k metadata loading ranking on catalogd UI
Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/14600 ) Change subject: IMPALA-9109: Add top-k metadata loading ranking on catalogd UI .. Patch Set 6: (2 comments) http://gerrit.cloudera.org:8080/#/c/14600/6/be/src/catalog/catalog-server.h File be/src/catalog/catalog-server.h: http://gerrit.cloudera.org:8080/#/c/14600/6/be/src/catalog/catalog-server.h@215 PS6, Line 215: long_75_loading_time I think names like "p75_loading_time_ns", "p95_loading_time_ns" are better. http://gerrit.cloudera.org:8080/#/c/14600/6/www/scripts/util.js File www/scripts/util.js: http://gerrit.cloudera.org:8080/#/c/14600/6/www/scripts/util.js@36 PS6, Line 36: nanoseconds % 1000 Should be "nanoseconds / 1000 % 1000" and should deal with adding leading 0s, otherwise getReadableTime(1) will result in "0.1ms" There's an example in backend: https://github.com/apache/impala/blob/288abf10f6142ae6cb02329604805a9a1dcc804f/be/src/util/pretty-printer.h#L249 Maybe we can do the same in JS. -- To view, visit http://gerrit.cloudera.org:8080/14600 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9305a867d7053cde9acc42dae6e47ee440f1a8bf Gerrit-Change-Number: 14600 Gerrit-PatchSet: 6 Gerrit-Owner: Jiawei Wang Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jiawei Wang Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Xiaomeng Zhang Gerrit-Reviewer: Yongzhi Chen Gerrit-Comment-Date: Tue, 05 Nov 2019 02:38:50 + Gerrit-HasComments: Yes
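The arithmetic in the util.js review comment above can be sketched as follows. This is a Python illustration rather than the JavaScript under review, and the function name `readable_ms` and the "<ms>.<microseconds>ms" output format are assumptions inferred from the "0.1ms" example in the comment, not the actual getReadableTime() behavior.

```python
def readable_ms(nanoseconds: int) -> str:
    """Format a nanosecond duration as '<ms>.<microseconds>ms'.

    Hypothetical helper; the output format is an assumption.
    """
    millis = nanoseconds // 1_000_000
    # Drop the sub-microsecond digits first, then keep the three
    # microsecond digits. Using `nanoseconds % 1000` here would instead
    # return the nanosecond remainder.
    micros = nanoseconds // 1_000 % 1_000
    # Zero-pad so that 5 microseconds renders as '.005' rather than '.5'.
    return f"{millis}.{micros:03d}ms"
```

Under these assumptions, an input of 1 ns formats as "0.000ms" instead of "0.1ms", and 1,005,000 ns formats as "1.005ms" instead of "1.5ms".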
[Impala-ASF-CR] IMPALA-9100: Handle duplicate occurrences of flags for tests/run-tests.py
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14629 ) Change subject: IMPALA-9100: Handle duplicate occurrences of flags for tests/run-tests.py .. Patch Set 1: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/4941/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/14629 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I60dc9a898f69804e2a53c05b5dfab2f948a22097 Gerrit-Change-Number: 14629 Gerrit-PatchSet: 1 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Tue, 05 Nov 2019 02:07:35 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9100: Handle duplicate occurrences of flags for tests/run-tests.py
Joe McDonnell has uploaded this change for review. ( http://gerrit.cloudera.org:8080/14629 ) Change subject: IMPALA-9100: Handle duplicate occurrences of flags for tests/run-tests.py .. IMPALA-9100: Handle duplicate occurrences of flags for tests/run-tests.py If someone passes --skip-stress multiple times to tests/run-tests.py, it currently only removes one of the occurrences from the arguments and allows the other one to pass through to pytest. This causes pytest to immediately error out. This behavior is seen on the docker-based tests, because test-with-docker.py specifies --skip-stress and bin/run-all-tests.sh adds another --skip-stress for core runs. This changes tests/run-tests.py to handle multiple occurrences of --skip-stress, --skip-parallel, and --skip-serial. Testing: - Tested manually with duplicate skip flags. Change-Id: I60dc9a898f69804e2a53c05b5dfab2f948a22097 --- M tests/run-tests.py 1 file changed, 14 insertions(+), 9 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/29/14629/1 -- To view, visit http://gerrit.cloudera.org:8080/14629 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I60dc9a898f69804e2a53c05b5dfab2f948a22097 Gerrit-Change-Number: 14629 Gerrit-PatchSet: 1 Gerrit-Owner: Joe McDonnell
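One way to handle repeated flags, as the change description above requires, is to filter the argument list instead of removing a single match. The sketch below is not the code from tests/run-tests.py; the helper name `split_skip_flags` is hypothetical, and only the three flag names come from the description.

```python
SKIP_FLAGS = ("--skip-stress", "--skip-parallel", "--skip-serial")


def split_skip_flags(args):
    """Separate the skip flags from the arguments meant for pytest.

    Returns (seen, remaining): the set of skip flags present at least once,
    and the argument list with every occurrence of those flags removed.
    """
    seen = {arg for arg in args if arg in SKIP_FLAGS}
    # A comprehension drops all occurrences. By contrast, list.remove()
    # deletes only the first match, which would let a duplicate through.
    remaining = [arg for arg in args if arg not in SKIP_FLAGS]
    return seen, remaining
```

For example, `split_skip_flags(["--skip-stress", "-k", "foo", "--skip-stress"])` yields `{"--skip-stress"}` and `["-k", "foo"]`, so nothing unexpected reaches pytest.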
[Impala-ASF-CR] IMPALA-9109: Add top-k metadata loading ranking on catalogd UI
Xiaomeng Zhang has posted comments on this change. ( http://gerrit.cloudera.org:8080/14600 ) Change subject: IMPALA-9109: Add top-k metadata loading ranking on catalogd UI .. Patch Set 6: (1 comment) http://gerrit.cloudera.org:8080/#/c/14600/6/be/src/catalog/catalog-server.h File be/src/catalog/catalog-server.h: http://gerrit.cloudera.org:8080/#/c/14600/6/be/src/catalog/catalog-server.h@215 PS6, Line 215: ///"long_75_loading_time": 12361844, Does it mean 75 percentage loading time? Maybe better to add word "percent" in string? Why adding "long" as prefix? -- To view, visit http://gerrit.cloudera.org:8080/14600 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9305a867d7053cde9acc42dae6e47ee440f1a8bf Gerrit-Change-Number: 14600 Gerrit-PatchSet: 6 Gerrit-Owner: Jiawei Wang Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jiawei Wang Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Xiaomeng Zhang Gerrit-Reviewer: Yongzhi Chen Gerrit-Comment-Date: Tue, 05 Nov 2019 01:21:31 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7984: Port runtime filter from Thrift RPC to KRPC
Thomas Tauber-Marshall has posted comments on this change. ( http://gerrit.cloudera.org:8080/13882 ) Change subject: IMPALA-7984: Port runtime filter from Thrift RPC to KRPC .. Patch Set 27: (3 comments) http://gerrit.cloudera.org:8080/#/c/13882/27/be/src/util/min-max-filter.cc File be/src/util/min-max-filter.cc: http://gerrit.cloudera.org:8080/#/c/13882/27/be/src/util/min-max-filter.cc@a696 PS27, Line 696: Don't remove this comment http://gerrit.cloudera.org:8080/#/c/13882/27/be/src/util/min-max-filter.cc@a701 PS27, Line 701: Don't remove this comment http://gerrit.cloudera.org:8080/#/c/13882/27/be/src/util/min-max-filter.cc@a704 PS27, Line 704: Don't remove this. I'm surprised this isn't causing any test failures, since we should be hitting the DCHECK below now. -- To view, visit http://gerrit.cloudera.org:8080/13882 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6b394796d250286510e157ae326882bfc01d387a Gerrit-Change-Number: 13882 Gerrit-PatchSet: 27 Gerrit-Owner: Fang-Yu Rao Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Comment-Date: Tue, 05 Nov 2019 01:20:37 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9085: [DOCS] Refactored impala_s3.xml
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14627 ) Change subject: IMPALA-9085: [DOCS] Refactored impala_s3.xml .. Patch Set 2: Verified+1 Build Successful https://jenkins.impala.io/job/gerrit-docs-auto-test/527/ : Doc tests passed. -- To view, visit http://gerrit.cloudera.org:8080/14627 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib274968a0412b4b8757f31ab674d4b82311de70a Gerrit-Change-Number: 14627 Gerrit-PatchSet: 2 Gerrit-Owner: Alex Rodoni Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Sahil Takiar Gerrit-Comment-Date: Tue, 05 Nov 2019 00:48:39 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9085: [DOCS] Refactored impala_s3.xml
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14627 ) Change subject: IMPALA-9085: [DOCS] Refactored impala_s3.xml .. Patch Set 2: Build Started https://jenkins.impala.io/job/gerrit-docs-auto-test/527/ Testing docs change - this change appears to modify docs/ and no code. This is experimental - please report any issues to tarmstr...@cloudera.com or on this JIRA: IMPALA-7317 -- To view, visit http://gerrit.cloudera.org:8080/14627 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib274968a0412b4b8757f31ab674d4b82311de70a Gerrit-Change-Number: 14627 Gerrit-PatchSet: 2 Gerrit-Owner: Alex Rodoni Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Sahil Takiar Gerrit-Comment-Date: Tue, 05 Nov 2019 00:25:29 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9085: [DOCS] Refactored impala_s3.xml
Hello Sahil Takiar, Joe McDonnell, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/14627 to look at the new patch set (#2). Change subject: IMPALA-9085: [DOCS] Refactored impala_s3.xml .. IMPALA-9085: [DOCS] Refactored impala_s3.xml Change-Id: Ib274968a0412b4b8757f31ab674d4b82311de70a --- M docs/shared/impala_common.xml M docs/topics/impala_s3.xml 2 files changed, 296 insertions(+), 471 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/27/14627/2 -- To view, visit http://gerrit.cloudera.org:8080/14627 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ib274968a0412b4b8757f31ab674d4b82311de70a Gerrit-Change-Number: 14627 Gerrit-PatchSet: 2 Gerrit-Owner: Alex Rodoni Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Sahil Takiar
[Impala-ASF-CR] IMPALA-9027: planner fixes for mt_dop
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/14522 ) Change subject: IMPALA-9027: planner fixes for mt_dop .. Patch Set 5: Filed https://issues.apache.org/jira/browse/IMPALA-9122 for the flaky test that does not seem to be related to the patch -- To view, visit http://gerrit.cloudera.org:8080/14522 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I23395c2dadf6be0e8be99706ca3ab5f4964cbcf9 Gerrit-Change-Number: 14522 Gerrit-PatchSet: 5 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 04 Nov 2019 23:31:15 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9109: Add top-k metadata loading ranking on catalogd UI
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14600 ) Change subject: IMPALA-9109: Add top-k metadata loading ranking on catalogd UI .. Patch Set 6: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/4940/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/14600 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9305a867d7053cde9acc42dae6e47ee440f1a8bf Gerrit-Change-Number: 14600 Gerrit-PatchSet: 6 Gerrit-Owner: Jiawei Wang Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jiawei Wang Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Yongzhi Chen Gerrit-Comment-Date: Mon, 04 Nov 2019 23:29:58 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9027: planner fixes for mt_dop
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14522 ) Change subject: IMPALA-9027: planner fixes for mt_dop .. Patch Set 5: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/5172/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/14522 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I23395c2dadf6be0e8be99706ca3ab5f4964cbcf9 Gerrit-Change-Number: 14522 Gerrit-PatchSet: 5 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Comment-Date: Mon, 04 Nov 2019 23:30:56 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9027: planner fixes for mt_dop
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14522 ) Change subject: IMPALA-9027: planner fixes for mt_dop .. Patch Set 5: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/5171/ -- To view, visit http://gerrit.cloudera.org:8080/14522 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I23395c2dadf6be0e8be99706ca3ab5f4964cbcf9 Gerrit-Change-Number: 14522 Gerrit-PatchSet: 5 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Comment-Date: Mon, 04 Nov 2019 23:27:25 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9109: Add top-k metadata loading ranking on catalogd UI
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14600 ) Change subject: IMPALA-9109: Add top-k metadata loading ranking on catalogd UI .. Patch Set 5: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/4939/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/14600 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9305a867d7053cde9acc42dae6e47ee440f1a8bf Gerrit-Change-Number: 14600 Gerrit-PatchSet: 5 Gerrit-Owner: Jiawei Wang Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jiawei Wang Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Yongzhi Chen Gerrit-Comment-Date: Mon, 04 Nov 2019 23:21:51 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9109: Add top-k metadata loading ranking on catalogd UI
Hello Quanlong Huang, Yongzhi Chen, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/14600 to look at the new patch set (#6). Change subject: IMPALA-9109: Add top-k metadata loading ranking on catalogd UI .. IMPALA-9109: Add top-k metadata loading ranking on catalogd UI Add functions in CatalogUsageMonitor to monitor and report the catalog usage of the tables that have the longest metadata loading time (including maximum, median, 75th-ile, 95th-ile, and 99th-ile time). Set the default capacity of the table loading metrics to 100. However, there might be a problem here because we only keep 100 entries. For example, a table with a higher median loading time but a lower maximum loading time may not make it into the top 100. For now, we will ignore cases like that because we are aiming to find the tables with the longest maximum loading time. Add the sorted table to the catalog server web UI. The table is sorted by the maximum from the load_duration metrics, but users can sort by other metrics in the catalogd debug UI. Testing: - Add an end-to-end test for the web page to verify that the label and text exist in the catalog debug page. Verify that all fields are in the JSON response. - Launch Impala and activate some tables to see the table loading time shown successfully on the catalog debug UI page.
Change-Id: I9305a867d7053cde9acc42dae6e47ee440f1a8bf --- M be/src/catalog/catalog-server.cc M be/src/catalog/catalog-server.h M common/thrift/JniCatalog.thrift M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/CatalogUsageMonitor.java M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/main/java/org/apache/impala/catalog/Table.java M tests/webserver/test_web_pages.py M www/catalog.tmpl M www/scripts/util.js 10 files changed, 269 insertions(+), 13 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/00/14600/6 -- To view, visit http://gerrit.cloudera.org:8080/14600 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I9305a867d7053cde9acc42dae6e47ee440f1a8bf Gerrit-Change-Number: 14600 Gerrit-PatchSet: 6 Gerrit-Owner: Jiawei Wang Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jiawei Wang Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Yongzhi Chen
[Impala-ASF-CR] IMPALA-9109: Add top-k metadata loading ranking on catalogd UI
Jiawei Wang has posted comments on this change. ( http://gerrit.cloudera.org:8080/14600 ) Change subject: IMPALA-9109: Add top-k metadata loading ranking on catalogd UI .. Patch Set 5: (3 comments) Thanks for the valid feedback! I added max, 75, 95, 99, count in the new loading time ranking table. The ranking is sorted by maximum loading time by default. However, there might be a problem here because we only keep the capacity size to 100. For example, there might be case like a table has higher median loading time but has lower Maximum loading time which cannot make itself to the Top-100. Users can sort them by other metrics in UI. Also, the more detailed metrics can be found in the metrics column. So I don't feel like we need to expose all of them. http://gerrit.cloudera.org:8080/#/c/14600/4/common/thrift/JniCatalog.thrift File common/thrift/JniCatalog.thrift: http://gerrit.cloudera.org:8080/#/c/14600/4/common/thrift/JniCatalog.thrift@724 PS4, Line 724: 5: optional i64 median_table_loading_ns > It'd be better to not just expose the median loading time. Since this is sh Done http://gerrit.cloudera.org:8080/#/c/14600/4/fe/src/main/java/org/apache/impala/catalog/Table.java File fe/src/main/java/org/apache/impala/catalog/Table.java: http://gerrit.cloudera.org:8080/#/c/14600/4/fe/src/main/java/org/apache/impala/catalog/Table.java@203 PS4, Line 203: > Can we add more metrics so in the future we don't need to touch this part a Done http://gerrit.cloudera.org:8080/#/c/14600/4/www/catalog.tmpl File www/catalog.tmpl: http://gerrit.cloudera.org:8080/#/c/14600/4/www/catalog.tmpl@169 PS4, Line 169: Median Loading Time > We can show human readable time in this column, i.e. 
in the forms of 11m25s Done -- To view, visit http://gerrit.cloudera.org:8080/14600 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9305a867d7053cde9acc42dae6e47ee440f1a8bf Gerrit-Change-Number: 14600 Gerrit-PatchSet: 5 Gerrit-Owner: Jiawei Wang Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jiawei Wang Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Yongzhi Chen Gerrit-Comment-Date: Mon, 04 Nov 2019 22:42:36 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9109: Add top-k metadata loading ranking on catalogd UI
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14600 ) Change subject: IMPALA-9109: Add top-k metadata loading ranking on catalogd UI .. Patch Set 5: (3 comments) http://gerrit.cloudera.org:8080/#/c/14600/5/fe/src/main/java/org/apache/impala/catalog/Table.java File fe/src/main/java/org/apache/impala/catalog/Table.java: http://gerrit.cloudera.org:8080/#/c/14600/5/fe/src/main/java/org/apache/impala/catalog/Table.java@189 PS5, Line 189: return (long)metrics_.getTimer(LOAD_DURATION_METRIC).getSnapshot().get75thPercentile(); line too long (91 > 90) http://gerrit.cloudera.org:8080/#/c/14600/5/fe/src/main/java/org/apache/impala/catalog/Table.java@192 PS5, Line 192: return (long)metrics_.getTimer(LOAD_DURATION_METRIC).getSnapshot().get95thPercentile(); line too long (91 > 90) http://gerrit.cloudera.org:8080/#/c/14600/5/fe/src/main/java/org/apache/impala/catalog/Table.java@195 PS5, Line 195: return (long)metrics_.getTimer(LOAD_DURATION_METRIC).getSnapshot().get99thPercentile(); line too long (91 > 90) -- To view, visit http://gerrit.cloudera.org:8080/14600 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9305a867d7053cde9acc42dae6e47ee440f1a8bf Gerrit-Change-Number: 14600 Gerrit-PatchSet: 5 Gerrit-Owner: Jiawei Wang Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jiawei Wang Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Yongzhi Chen Gerrit-Comment-Date: Mon, 04 Nov 2019 22:38:31 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9109: Add top-k metadata loading ranking on catalogd UI
Jiawei Wang has uploaded a new patch set (#5). ( http://gerrit.cloudera.org:8080/14600 ) Change subject: IMPALA-9109: Add top-k metadata loading ranking on catalogd UI .. IMPALA-9109: Add top-k metadata loading ranking on catalogd UI Add functions in CatalogUsageMonitor to monitor and report the catalog usage of the tables that have the longest metadata loading time (including maximum, median, 75th-ile, 95th-ile, and 99th-ile time). Set the default capacity of the table loading metrics to 100. Add the sorted table to the catalog server web UI. The table is sorted by the maximum from the load_duration metrics, and users can sort by other metrics in the catalogd debug UI. Testing: - Add an end-to-end test for the web page to verify that the label and text exist in the catalog debug page. Verify that all fields are in the JSON response. - Launch Impala and activate some tables to see the table loading time shown successfully on the catalog debug UI page. Change-Id: I9305a867d7053cde9acc42dae6e47ee440f1a8bf --- M be/src/catalog/catalog-server.cc M be/src/catalog/catalog-server.h M common/thrift/JniCatalog.thrift M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java M fe/src/main/java/org/apache/impala/catalog/CatalogUsageMonitor.java M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/main/java/org/apache/impala/catalog/Table.java M tests/webserver/test_web_pages.py M www/catalog.tmpl M www/scripts/util.js 10 files changed, 266 insertions(+), 13 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/00/14600/5 -- To view, visit http://gerrit.cloudera.org:8080/14600 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I9305a867d7053cde9acc42dae6e47ee440f1a8bf Gerrit-Change-Number: 14600 Gerrit-PatchSet: 5 Gerrit-Owner: Jiawei Wang Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jiawei Wang Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Yongzhi Chen
[Impala-ASF-CR] IMPALA-4400: aggregate runtime filters locally
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14538 ) Change subject: IMPALA-4400: aggregate runtime filters locally .. Patch Set 13: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/4938/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/14538 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iabeeab5eec869ff2197250ad41c1eb5551704acc Gerrit-Change-Number: 14538 Gerrit-PatchSet: 13 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 04 Nov 2019 21:52:01 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9085: [DOCS] Refactored impala_s3.xml
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14627 ) Change subject: IMPALA-9085: [DOCS] Refactored impala_s3.xml .. Patch Set 1: Verified+1 Build Successful https://jenkins.impala.io/job/gerrit-docs-auto-test/526/ : Doc tests passed. -- To view, visit http://gerrit.cloudera.org:8080/14627 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib274968a0412b4b8757f31ab674d4b82311de70a Gerrit-Change-Number: 14627 Gerrit-PatchSet: 1 Gerrit-Owner: Alex Rodoni Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Sahil Takiar Gerrit-Comment-Date: Mon, 04 Nov 2019 21:22:39 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4400: aggregate runtime filters locally
Tim Armstrong has uploaded a new patch set (#13). ( http://gerrit.cloudera.org:8080/14538 ) Change subject: IMPALA-4400: aggregate runtime filters locally .. IMPALA-4400: aggregate runtime filters locally Move RuntimeFilterBank to QueryState(). Implement fine-grained locking for each filter to mitigate any increased lock contention from the change. Make RuntimeFilterBank handle multiple producers of the same filter, e.g. multiple instances of a partitioned join. It computes the expected number of filters upfront, then sends the filter to the coordinator once all the local instances have been merged together. The merging can be done in parallel locally to improve the latency of filter propagation. Add Or() methods to MinMaxFilter and BloomFilter, since we now need to merge those, not just the thrift versions. Update coordinator filter routing to expect only one instance of a filter from each producer backend and to only send one instance to each consumer backend (instead of sending one per fragment). Update memory reservations and estimates to be lower to account for sharing of filters between fragment instances. mt_dop plans are modified to show these shared and non-shared resources separately. Enable waiting for runtime filters for the Kudu scanner with mt_dop. Make min/max filters const-correct. TODO: * Retest with KRPC runtime filter change Testing * Added unit tests for Or() methods. * Added some additional e2e test coverage for mt_dop queries * Updated planner tests with new estimates and reservation. * Ran a single node 3-impalad stress test with TPC-H kudu and TPC-DS parquet. * Ran exhaustive tests. * Ran core tests with ASAN. Perf * Did a single-node perf run on TPC-H with default settings. No perf change.
* Single-node perf run with mt_dop=8 showed significant speedups:

+----------+-----------------------+---------+------------+------------+----------------+
| Workload | File Format           | Avg (s) | Delta(Avg) | GeoMean(s) | Delta(GeoMean) |
+----------+-----------------------+---------+------------+------------+----------------+
| TPCH(30) | parquet / none / none | 10.07   | -5.96%     | 5.07       | -10.80%        |
+----------+-----------------------+---------+------------+------------+----------------+

+----------+----------+-----------------------+--------+-------------+------------+-----------+----------------+-------+----------------+---------+--------+
| Workload | Query    | File Format           | Avg(s) | Base Avg(s) | Delta(Avg) | StdDev(%) | Base StdDev(%) | Iters | Median Diff(%) | MW Zval | Tval   |
+----------+----------+-----------------------+--------+-------------+------------+-----------+----------------+-------+----------------+---------+--------+
| TPCH(30) | TPCH-Q7  | parquet / none / none | 37.49  | 36.33       | +3.18%     | 6.34%     | 4.85%          | 20    | +1.90%         | 3.96    | 1.75   |
| TPCH(30) | TPCH-Q15 | parquet / none / none | 3.77   | 3.75        | +0.61%     | 1.20%     | 1.03%          | 20    | +0.74%         | 1.50    | 1.72   |
| TPCH(30) | TPCH-Q22 | parquet / none / none | 2.32   | 2.32        | +0.05%     | 1.62%     | 2.14%          | 20    | -0.01%         | -0.04   | 0.09   |
| TPCH(30) | TPCH-Q19 | parquet / none / none | 5.17   | 5.18        | -0.20%     | 1.56%     | 1.63%          | 20    | -0.09%         | -0.89   | -0.39  |
| TPCH(30) | TPCH-Q1  | parquet / none / none | 4.27   | 4.28        | -0.29%     | 1.09%     | 1.80%          | 20    | -0.05%         | -0.74   | -0.61  |
| TPCH(30) | TPCH-Q6  | parquet / none / none | 1.24   | 1.25        | -0.35%     | 3.47%     | 2.95%          | 20    | -0.19%         | -0.92   | -0.35  |
| TPCH(30) | TPCH-Q13 | parquet / none / none | 9.73   | 9.87        | -1.38%     | 1.22%     | 1.05%          | 20    | -1.34%         | -3.26   | -3.87  |
| TPCH(30) | TPCH-Q16 | parquet / none / none | 2.49   | 2.54        | -1.97%     | 2.91%     | 2.41%          | 20    | -2.07%         | -2.09   | -2.36  |
| TPCH(30) | TPCH-Q2  | parquet / none / none | 1.97   | 2.01        | -1.91%     | 2.14%     | 2.57%          | 20    | -2.21%         | -2.76   | -2.58  |
| TPCH(30) | TPCH-Q9  | parquet / none / none | 80.59  | 82.48       | -2.29%     | 6.61%     | 3.34%          | 20    | -3.67%         | -3.17   | -1.41  |
| TPCH(30) | TPCH-Q10 | parquet / none / none | 5.12   | 5.43        | I -5.70%   | 0.82%     | 1.62%          | 20    | I -5.72%       | -5.27   | -14.22 |
| TPCH(30) | TPCH-Q21 | parquet / none / none | 24.50  | 26.20       | I -6.49%   | 0.47%     | 0.43%          | 20    | I -7.00%       | -5.27   | -47.60 |
| TPCH(30) | TPCH-Q18 | parquet / none / none | 8.77   | 9.48        | I -7.55%   | 0.83%     | 0.79%          | 20    | I -8.06%       | -5.27   | -30.59 |
| TPCH(30) | TPCH-Q3  | parquet / none / none | 6.05   | 6.61        | I -8.51%   | 0.79%     | 0.73%          | 20    |
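The local aggregation described in the IMPALA-4400 commit message (count the expected producers up front, merge each update with an Or(), and send once after the last local instance) can be sketched as follows. This is an illustration only, not the RuntimeFilterBank implementation: the class names, the callback, and the single per-filter lock are assumptions, and only a min/max filter is shown. A Bloom filter merge would instead be a bitwise OR of the two bit arrays.

```python
import threading


class MinMaxFilter:
    """Toy min/max filter; None bounds mean no values have been seen yet."""

    def __init__(self, lo=None, hi=None):
        self.lo, self.hi = lo, hi

    def or_(self, other):
        # Merging widens the range so that it covers both inputs.
        if other.lo is None:
            return
        if self.lo is None:
            self.lo, self.hi = other.lo, other.hi
            return
        self.lo = min(self.lo, other.lo)
        self.hi = max(self.hi, other.hi)


class LocalFilterAggregator:
    """Merges updates from local producers and sends one merged filter."""

    def __init__(self, expected_producers, send_to_coordinator):
        self._lock = threading.Lock()  # one lock per filter, not a global one
        self._pending = expected_producers
        self._merged = MinMaxFilter()
        self._send = send_to_coordinator  # hypothetical callback

    def publish(self, produced):
        with self._lock:
            self._merged.or_(produced)
            self._pending -= 1
            done = self._pending == 0
        if done:
            # Exactly one producer observes the count reaching zero.
            self._send(self._merged)
```

With three local join instances, the coordinator callback fires once, after the third publish() call, with a range covering all three inputs.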
[Impala-ASF-CR] IMPALA-9085: [DOCS] Refactored impala_s3.xml
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14627 ) Change subject: IMPALA-9085: [DOCS] Refactored impala_s3.xml .. Patch Set 1: Build Started https://jenkins.impala.io/job/gerrit-docs-auto-test/526/ Testing docs change - this change appears to modify docs/ and no code. This is experimental - please report any issues to tarmstr...@cloudera.com or on this JIRA: IMPALA-7317 -- To view, visit http://gerrit.cloudera.org:8080/14627 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib274968a0412b4b8757f31ab674d4b82311de70a Gerrit-Change-Number: 14627 Gerrit-PatchSet: 1 Gerrit-Owner: Alex Rodoni Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Mon, 04 Nov 2019 20:59:43 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9085: [DOCS] Refactored impala_s3.xml
Alex Rodoni has uploaded this change for review. ( http://gerrit.cloudera.org:8080/14627 ) Change subject: IMPALA-9085: [DOCS] Refactored impala_s3.xml .. IMPALA-9085: [DOCS] Refactored impala_s3.xml Change-Id: Ib274968a0412b4b8757f31ab674d4b82311de70a --- M docs/shared/impala_common.xml M docs/topics/impala_s3.xml 2 files changed, 286 insertions(+), 442 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/27/14627/1 -- To view, visit http://gerrit.cloudera.org:8080/14627 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: Ib274968a0412b4b8757f31ab674d4b82311de70a Gerrit-Change-Number: 14627 Gerrit-PatchSet: 1 Gerrit-Owner: Alex Rodoni
[Impala-ASF-CR] IMPALA-9027: planner fixes for mt_dop
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14522 ) Change subject: IMPALA-9027: planner fixes for mt_dop .. Patch Set 5: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/5171/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/14522 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I23395c2dadf6be0e8be99706ca3ab5f4964cbcf9 Gerrit-Change-Number: 14522 Gerrit-PatchSet: 5 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Comment-Date: Mon, 04 Nov 2019 19:07:00 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9027: planner fixes for mt_dop
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14522 ) Change subject: IMPALA-9027: planner fixes for mt_dop .. Patch Set 5: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/14522 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I23395c2dadf6be0e8be99706ca3ab5f4964cbcf9 Gerrit-Change-Number: 14522 Gerrit-PatchSet: 5 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Comment-Date: Mon, 04 Nov 2019 19:06:59 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9107: Add toolchain maven cache to speed up maven builds
Joe McDonnell has posted comments on this change. ( http://gerrit.cloudera.org:8080/14562 ) Change subject: IMPALA-9107: Add toolchain maven cache to speed up maven builds .. Patch Set 5: (1 comment) http://gerrit.cloudera.org:8080/#/c/14562/5//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/14562/5//COMMIT_MSG@27 PS5, Line 27: > How often should this cache be refreshed? Can we automate building a new ve I have a script that parses the mvn.log and prints how many artifacts were downloaded from various repositories. I will add that to this change. If we have a fixed tarball, it would slowly degrade and the hit rate would drop. If we run this script at the end of jobs, it can print the total number of downloads. It would be something we can manually check. To automate updates, we could continue to run all-build-options-ub1604 without this optimization and have it produce a tarball at the end. The job that runs on "master" every night could upload the tarball to a particular location. -- To view, visit http://gerrit.cloudera.org:8080/14562 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I043912f5fbc7cf24ee80b2855354656aa587ca9f Gerrit-Change-Number: 14562 Gerrit-PatchSet: 5 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 04 Nov 2019 18:08:59 + Gerrit-HasComments: Yes
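The script mentioned in this comment is not included in the message, so the following is only a guess at its shape: a sketch that counts downloaded artifacts per repository in a Maven log. It assumes the log uses the "Downloaded from <repo-id>: <url>" line format; the function name and the sample repository ids are hypothetical.

```python
import re
from collections import Counter

# Assumed log format: "[INFO] Downloaded from <repo-id>: <url> (<size> at <rate>)".
# "Downloading from ..." progress lines are deliberately not counted.
_DOWNLOADED = re.compile(r"Downloaded from (\S+?): (\S+)")


def count_downloads(log_lines):
    """Return a Counter mapping repository id to number of downloaded artifacts."""
    counts = Counter()
    for line in log_lines:
        match = _DOWNLOADED.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


def print_summary(counts):
    # A rising total over time would suggest the cached tarball is going stale.
    for repo, n in counts.most_common():
        print(f"{repo}: {n}")
    print(f"total: {sum(counts.values())}")
```

Run at the end of a job, for example as print_summary(count_downloads(open("mvn.log"))), this gives the manually checkable total that the comment describes.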
[Impala-ASF-CR] IMPALA-9027: planner fixes for mt_dop
Thomas Tauber-Marshall has posted comments on this change. ( http://gerrit.cloudera.org:8080/14522 ) Change subject: IMPALA-9027: planner fixes for mt_dop .. Patch Set 4: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/14522 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I23395c2dadf6be0e8be99706ca3ab5f4964cbcf9 Gerrit-Change-Number: 14522 Gerrit-PatchSet: 4 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Comment-Date: Mon, 04 Nov 2019 18:01:04 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9107: Add toolchain maven cache to speed up maven builds
Joe McDonnell has posted comments on this change. ( http://gerrit.cloudera.org:8080/14562 ) Change subject: IMPALA-9107: Add toolchain maven cache to speed up maven builds .. Patch Set 5: I'm getting cold feet about this being an actual repository. When someone populates their .m2 directory with artifacts from this cached repository, then if they switch to an older branch without this repo, it will have to reverify everything. Since this is mainly about speeding up the Jenkins jobs and I don't want to impact developer environments, I'm thinking I might switch to directly populating the .m2 directory. -- To view, visit http://gerrit.cloudera.org:8080/14562 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I043912f5fbc7cf24ee80b2855354656aa587ca9f Gerrit-Change-Number: 14562 Gerrit-PatchSet: 5 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 04 Nov 2019 17:55:53 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8704: ISO:SQL:2016 datetime patterns - Milestone 2
Attila Jeges has posted comments on this change. ( http://gerrit.cloudera.org:8080/14291 ) Change subject: IMPALA-8704: ISO:SQL:2016 datetime patterns - Milestone 2 .. Patch Set 14: Code-Review+1 (3 comments) One more round. I'll +2 it afterwards. http://gerrit.cloudera.org:8080/#/c/14291/14//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/14291/14//COMMIT_MSG@28 PS14, Line 28: This is the default behaviour What does "default behavior" mean in this context? Is it when no FX or FM modifier is used? Or FX is used but FM isn't? http://gerrit.cloudera.org:8080/#/c/14291/14/be/src/runtime/datetime-iso-sql-format-tokenizer.cc File be/src/runtime/datetime-iso-sql-format-tokenizer.cc: http://gerrit.cloudera.org:8080/#/c/14291/14/be/src/runtime/datetime-iso-sql-format-tokenizer.cc@115 PS14, Line 115: if (fm_modifier_active_) nit: not necessary, you can just set the flag to false no matter what. http://gerrit.cloudera.org:8080/#/c/14291/14/be/src/runtime/datetime-iso-sql-format-tokenizer.cc@237 PS14, Line 237: DCHECK(*current_pos == dt_ctx_->fmt); It would be more straightforward if ProcessFXModifier() did not take any parameters and returned a char* instead of void. No need to pass current_pos as a parameter if *current_pos is expected to be set to dt_ctx_->fmt. -- To view, visit http://gerrit.cloudera.org:8080/14291 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I30d2f6656054371476aaa8bd0d51f572b9369855 Gerrit-Change-Number: 14291 Gerrit-PatchSet: 14 Gerrit-Owner: Gabor Kaszab Gerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Mon, 04 Nov 2019 17:12:07 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8557: Add '.txt' to text files, remove '.' at end of filenames
Sahil Takiar has posted comments on this change. ( http://gerrit.cloudera.org:8080/14621 ) Change subject: IMPALA-8557: Add '.txt' to text files, remove '.' at end of filenames .. Patch Set 3: (1 comment) http://gerrit.cloudera.org:8080/#/c/14621/3//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/14621/3//COMMIT_MSG@21 PS3, Line 21: Impala writes all text files with a trailing dot due to some odd : behavior in hdfs-table-sink.cc > Can you mention that text tables created during dataload already used .txt Done. There are actually some tables (like alltypesinsert) that contain text data and are created via Impala during dataload. I guess we happen to not run any 'show files' tests against these tables. -- To view, visit http://gerrit.cloudera.org:8080/14621 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2a9adacd45855cde86724e10f8a131e17ebf46f8 Gerrit-Change-Number: 14621 Gerrit-PatchSet: 3 Gerrit-Owner: Sahil Takiar Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Sahil Takiar Gerrit-Comment-Date: Mon, 04 Nov 2019 16:25:56 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8557: Add '.txt' to text files, remove '.' at end of filenames
Hello Joe McDonnell, Csaba Ringhofer, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/14621 to look at the new patch set (#4). Change subject: IMPALA-8557: Add '.txt' to text files, remove '.' at end of filenames .. IMPALA-8557: Add '.txt' to text files, remove '.' at end of filenames Writes to text tables on ABFS are failing because HADOOP-15860 recently changed the ABFS behavior when writing files / folders that end with a '.'. ABFS explicitly does not allow files / folders that end with a dot. From the ABFS docs: "Avoid blob names that end with a dot (.), a forward slash (/), or a sequence or combination of the two." The behavior prior to HADOOP-15860 was to simply drop any trailing dots when writing files or folders, but that can lead to various issues because clients may try to read back a file that should exist on ABFS, but doesn't. HADOOP-15860 changed the behavior so that any attempt to write a file or folder with a trailing dot fails on ABFS. Impala writes all text files with a trailing dot due to some odd behavior in hdfs-table-sink.cc. The table sink writes files with a "file extension" which is dependent on the file type. For example, Parquet files have a file extension of ".parq". For some reason, text files had no file extension, so Impala would try to write text files of the following form: "244c5ee8ece6f759-8b1a1e3b_45513034_data.0.". Several tables created during dataload, such as alltypes, already use the '.txt' extension for their files. These tables are not created via Impala's INSERT code path; their files are copied into the table. However, there are several tables created during dataload, such as alltypesinsert, that are created via Impala. This patch will change the files in these tables so that they end in '.txt'.
This patch adds the ".txt" extension to all written text files and modifies the hdfs-table-sink.cc so that it doesn't add a trailing dot to a filename if there is no file extension. Testing: * Ran core tests * Re-ran affected ABFS tests * Added test to validate that the correct file extension is used for Parquet and text tables * Manually validated that without the addition of the '.txt' file extension, files are not written with a trailing dot Change-Id: I2a9adacd45855cde86724e10f8a131e17ebf46f8 --- M be/src/exec/hdfs-table-sink.cc M be/src/exec/hdfs-text-table-writer.cc M tests/query_test/test_insert.py 3 files changed, 35 insertions(+), 3 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/21/14621/4 -- To view, visit http://gerrit.cloudera.org:8080/14621 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I2a9adacd45855cde86724e10f8a131e17ebf46f8 Gerrit-Change-Number: 14621 Gerrit-PatchSet: 4 Gerrit-Owner: Sahil Takiar Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Sahil Takiar
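The naming rule this patch describes (append ".<extension>" only when an extension exists, so an empty extension never leaves a trailing dot) can be shown in a few lines. This is a Python illustration of the rule, not the C++ in hdfs-table-sink.cc; the function name and argument layout are assumptions, and the sample prefix is the one quoted in the commit message.

```python
def output_filename(unique_prefix, file_num, extension=""):
    """Build a data file name, adding a dot only when there is an extension."""
    base = f"{unique_prefix}_data.{file_num}"
    ext = extension.lstrip(".")  # accept both "txt" and ".txt"
    return f"{base}.{ext}" if ext else base


prefix = "244c5ee8ece6f759-8b1a1e3b_45513034"
print(output_filename(prefix, 0, "txt"))
print(output_filename(prefix, 0, ".parq"))
print(output_filename(prefix, 0))
```

The last call covers the case the patch guards against: with no extension the name ends in "_data.0" rather than "_data.0.", which ABFS would reject after HADOOP-15860.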
[Impala-ASF-CR] IMPALA-8557: Add '.txt' to text files, remove '.' at end of filenames
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/14621 ) Change subject: IMPALA-8557: Add '.txt' to text files, remove '.' at end of filenames .. Patch Set 3: Code-Review+1 (1 comment) http://gerrit.cloudera.org:8080/#/c/14621/3//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/14621/3//COMMIT_MSG@21 PS3, Line 21: Impala writes all text files with a trailing dot due to some odd : behavior in hdfs-table-sink.cc Can you mention that text tables created during dataload already used .txt extension? I was surprised at first that no existing test had to be changed (e.g. ones that use SHOW FILES), but then I realized that functional.alltypes and friends already used .txt extension as they are copied instead of created by INSERT. -- To view, visit http://gerrit.cloudera.org:8080/14621 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2a9adacd45855cde86724e10f8a131e17ebf46f8 Gerrit-Change-Number: 14621 Gerrit-PatchSet: 3 Gerrit-Owner: Sahil Takiar Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Sahil Takiar Gerrit-Comment-Date: Mon, 04 Nov 2019 15:26:43 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9109: Add top-k metadata loading ranking on catalogd UI
Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/14600 ) Change subject: IMPALA-9109: Add top-k metadata loading ranking on catalogd UI .. Patch Set 4: (5 comments) Thanks for doing this! It will be very helpful in practice! I left some comments hoping we can expose more details. http://gerrit.cloudera.org:8080/#/c/14600/4//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/14600/4//COMMIT_MSG@16 PS4, Line 16: exisit nit: typo? http://gerrit.cloudera.org:8080/#/c/14600/4//COMMIT_MSG@18 PS4, Line 18: showed nit: shown http://gerrit.cloudera.org:8080/#/c/14600/4/common/thrift/JniCatalog.thrift File common/thrift/JniCatalog.thrift: http://gerrit.cloudera.org:8080/#/c/14600/4/common/thrift/JniCatalog.thrift@724 PS4, Line 724: 5: optional i64 table_loading_ns It'd be better to not just expose the median loading time. Since this is shown in a detail page, we can show more, like the /rpcz page: Count, min / max, 75th_percentile, 95th_percentile, 98th_percentile, 99th_percentile, 999th_percentile. http://gerrit.cloudera.org:8080/#/c/14600/4/fe/src/main/java/org/apache/impala/catalog/Table.java File fe/src/main/java/org/apache/impala/catalog/Table.java: http://gerrit.cloudera.org:8080/#/c/14600/4/fe/src/main/java/org/apache/impala/catalog/Table.java@203 PS4, Line 203: getMedian() Can we add more metrics so in the future we don't need to touch this part again? I think the max time is also useful in sorting. For example, a partitioned table may take a long time in its first load and a short time in later incremental loads. Median time does not reflect this. Other metrics like the 95th/99th percentile may also be helpful. http://gerrit.cloudera.org:8080/#/c/14600/4/www/catalog.tmpl File www/catalog.tmpl: http://gerrit.cloudera.org:8080/#/c/14600/4/www/catalog.tmpl@169 PS4, Line 169: Metadata Loading Time (ms) We can show human readable time in this column, i.e. in the forms of 11m25s, 4s495ms, 15.178ms etc.
Just like the query duration shown in the /queries page. Catalogd should still pass values in ms so DataTable can sort correctly. We just need to add a render function for this column. Here are two examples: https://github.com/apache/impala/commit/ea4715fd76d6dba0c3777146989c2bf020efabdd https://github.com/apache/impala/commit/725a47b3f275aa76db6a65d4e320f8dbaf9d6b28 -- To view, visit http://gerrit.cloudera.org:8080/14600 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9305a867d7053cde9acc42dae6e47ee440f1a8bf Gerrit-Change-Number: 14600 Gerrit-PatchSet: 4 Gerrit-Owner: Jiawei Wang Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jiawei Wang Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Yongzhi Chen Gerrit-Comment-Date: Mon, 04 Nov 2019 09:45:21 + Gerrit-HasComments: Yes
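The rendering suggested in this review comment (keep the numeric millisecond value for sorting, display a human-readable string) can be sketched as below. The actual change would be a JavaScript render function for the DataTable column; this Python version only illustrates the formatting logic. The function name, the unit thresholds, and the hour format are assumptions chosen to reproduce the three example forms quoted in the comment.

```python
def pretty_duration(ms):
    """Format a duration in milliseconds as e.g. 11m25s, 4s495ms or 15.178ms."""
    if ms < 1000:
        # Sub-second values keep fractional milliseconds.
        return f"{ms:.3f}ms"
    total_ms = int(ms)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    seconds, millis = divmod(rem, 1000)
    if hours:
        return f"{hours}h{minutes}m"
    if minutes:
        return f"{minutes}m{seconds}s"
    return f"{seconds}s{millis:03d}ms"


# The table would still sort on the raw value and only display the string.
rows = [{"table": "t1", "load_ms": 4495}, {"table": "t2", "load_ms": 685000}]
for row in sorted(rows, key=lambda r: r["load_ms"], reverse=True):
    print(row["table"], pretty_duration(row["load_ms"]))
```

Sorting on the raw number and formatting only at display time avoids the string-sort problem where "4s495ms" would order after "11m25s".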
[Impala-ASF-CR] IMPALA-2112: Support primary key/foreign key constraints as part of create table in Impala.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14592 ) Change subject: IMPALA-2112: Support primary key/foreign key constraints as part of create table in Impala. .. Patch Set 3: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/14592 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id03d8d4d41a2ac1b15e7060e2a013e334d044ee7 Gerrit-Change-Number: 14592 Gerrit-PatchSet: 3 Gerrit-Owner: Anurag Mantripragada Gerrit-Reviewer: Anurag Mantripragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Mon, 04 Nov 2019 08:05:37 + Gerrit-HasComments: No