[Impala-ASF-CR] IMPALA-4835: Part 2: Allocate scan range buffers upfront
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/8707 ) Change subject: IMPALA-4835: Part 2: Allocate scan range buffers upfront .. Patch Set 24: (3 comments) http://gerrit.cloudera.org:8080/#/c/8707/24/be/src/runtime/io/disk-io-mgr.h File be/src/runtime/io/disk-io-mgr.h: http://gerrit.cloudera.org:8080/#/c/8707/24/be/src/runtime/io/disk-io-mgr.h@144 PS24, Line 144: recycle > is that the same as putting them back on the queue you talk about in the ne Yes it is. Reworded. http://gerrit.cloudera.org:8080/#/c/8707/24/be/src/runtime/io/disk-io-mgr.h@301 PS24, Line 301: /// and the state of 'range' is unmodified so that allocation can be retried. > so does this schedule the scan range (given that the previous two methods l It only makes sense to call after *needs_buffers because the calculation for # buffers to allocate only makes sense for that. I think it would work otherwise, but doesn't make sense. Updated comment to explain the cases when it should be called. http://gerrit.cloudera.org:8080/#/c/8707/24/be/src/runtime/io/disk-io-mgr.h@308 PS24, Line 308: int64_t max_bytes); > If we aren't able to reduce the number of ExternalBufferTag cases, I'm not It felt awkward to force the caller to specify the amount of memory to use for the range before it knew how much memory the ScanRange actually needed. With GetNextRange() in particular, the caller doesn't even know anything about range it will be processing so it can't make a smart decision about how much memory to use. We don't do this now so we could combine them, but it didn't feel particularly natural to me. I also liked that the buffer allocation was an explicit method call. Bundling it into another IoMgr method and controlling it by an argument seemed like it hid the memory allocation a bit. -- To view, visit http://gerrit.cloudera.org:8080/8707 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia243bf8b62feeea602b122e0503ea4ba7d3ee70f Gerrit-Change-Number: 8707 Gerrit-PatchSet: 24 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 13 Feb 2018 07:40:35 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4835: Part 2: Allocate scan range buffers upfront
Hello Michael Ho, Tianyi Wang, Bikramjeet Vig, Dan Hecht, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8707 to look at the new patch set (#25). Change subject: IMPALA-4835: Part 2: Allocate scan range buffers upfront .. IMPALA-4835: Part 2: Allocate scan range buffers upfront This change is a step towards reserving memory for buffers from the buffer pool and constraining per-scanner memory requirements. This change restructures the DiskIoMgr code so that each ScanRange operates with a fixed set of buffers that are allocated upfront and recycled as the I/O mgr works through the ScanRange. One major change is that ScanRanges get blocked when a buffer is not available and get unblocked when a client returns a buffer via ReturnBuffer(). I was able to remove the logic to maintain the blocked_ranges_ list by instead adding a separate set with all ranges that are active. I plan to merge this along with the actual buffer pool switch, but separated it out to allow review of the DiskIoMgr changes separate from other aspects of the buffer pool switchover. Testing: * Ran core and exhaustive tests. Change-Id: Ia243bf8b62feeea602b122e0503ea4ba7d3ee70f --- M be/src/exec/hdfs-parquet-scanner.cc M be/src/exec/hdfs-scan-node-mt.cc M be/src/exec/hdfs-scan-node.cc M be/src/exec/scanner-context.cc M be/src/runtime/bufferpool/buffer-pool.h M be/src/runtime/io/disk-io-mgr-stress-test.cc M be/src/runtime/io/disk-io-mgr-stress.cc M be/src/runtime/io/disk-io-mgr-test.cc M be/src/runtime/io/disk-io-mgr.cc M be/src/runtime/io/disk-io-mgr.h M be/src/runtime/io/request-context.cc M be/src/runtime/io/request-context.h M be/src/runtime/io/request-ranges.h M be/src/runtime/io/scan-range.cc M be/src/runtime/tmp-file-mgr.cc M be/src/runtime/tmp-file-mgr.h M be/src/util/bit-util-test.cc M be/src/util/bit-util.h M be/src/util/runtime-profile-counters.h 19 files changed, 787 insertions(+), 432 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/07/8707/25 -- To view, visit http://gerrit.cloudera.org:8080/8707 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ia243bf8b62feeea602b122e0503ea4ba7d3ee70f Gerrit-Change-Number: 8707 Gerrit-PatchSet: 25 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-6497: add "Last row fetched" and AC events
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/9271 ) Change subject: IMPALA-6497: add "Last row fetched" and AC events .. Patch Set 5: (3 comments) http://gerrit.cloudera.org:8080/#/c/9271/5/tests/query_test/test_observability.py File tests/query_test/test_observability.py: http://gerrit.cloudera.org:8080/#/c/9271/5/tests/query_test/test_observability.py@206 PS5, Line 206: event_regexes = [r'Fragment Instance Lifecycle Event Timeline', > nit: I think these work without the r in front since they don't contain any I figured it was a convention to use raw strings for regexes in python. I think it helps to indicate that they're intended to be interpreted as regexes. http://gerrit.cloudera.org:8080/#/c/9271/5/tests/query_test/test_observability.py@209 PS5, Line 209: r'First Batch Sent', > Going through this review I noticed that "Open Finished" is not printed whi Nice catch. http://gerrit.cloudera.org:8080/#/c/9271/5/tests/query_test/test_observability.py@211 PS5, Line 211: query = 'select count(*) from functional.alltypes' > nit: the rest of the file uses double quotes. This got caught in a find and replace, didn't mean to change it :) -- To view, visit http://gerrit.cloudera.org:8080/9271 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I32a707e4660061e75c86ad967f1fac6f6737da7e Gerrit-Change-Number: 9271 Gerrit-PatchSet: 5 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 13 Feb 2018 07:00:39 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6497: add "Last row fetched" and AC events
Hello Lars Volker, Bikramjeet Vig, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/9271 to look at the new patch set (#6). Change subject: IMPALA-6497: add "Last row fetched" and AC events .. IMPALA-6497: add "Last row fetched" and AC events This makes it more observable that all rows were returned to the client and also that resources were released for admission control. Testing: Manually inspected some query profiles. Added a basic observability test that ensures that the expected events appear in the profile. Ran it in a loop for a bit to make sure it wasn't flaky. Change-Id: I32a707e4660061e75c86ad967f1fac6f6737da7e --- M be/src/runtime/coordinator.cc M tests/query_test/test_observability.py 2 files changed, 49 insertions(+), 9 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/71/9271/6 -- To view, visit http://gerrit.cloudera.org:8080/9271 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I32a707e4660061e75c86ad967f1fac6f6737da7e Gerrit-Change-Number: 9271 Gerrit-PatchSet: 6 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-6511: Fix state machine in FIS::UpdateState()
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/9294 ) Change subject: IMPALA-6511: Fix state machine in FIS::UpdateState() .. Patch Set 1: Code-Review+2 Seems much simpler -- To view, visit http://gerrit.cloudera.org:8080/9294 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7304205d245289cc3d7ca2217e212c844ee75e7b Gerrit-Change-Number: 9294 Gerrit-PatchSet: 1 Gerrit-Owner: Lars VolkerGerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 13 Feb 2018 06:55:11 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6269: Cherry-pick dependency change for KRPC
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/9287 ) Change subject: IMPALA-6269: Cherry-pick dependency change for KRPC .. Patch Set 2: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/9287 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I8bda390ea92cceb0d696767402c978a83b386825 Gerrit-Change-Number: 9287 Gerrit-PatchSet: 2 Gerrit-Owner: Lars VolkerGerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Tue, 13 Feb 2018 05:41:16 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6269: Cherry-pick dependency change for KRPC
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/9287 ) Change subject: IMPALA-6269: Cherry-pick dependency change for KRPC .. IMPALA-6269: Cherry-pick dependency change for KRPC Expose RPC method info map and various metrics These changes are needed for IMPALA-6269. Change-Id: I8bda390ea92cceb0d696767402c978a83b386825 Reviewed-on: http://gerrit.cloudera.org:8080/9269 Tested-by: Kudu Jenkins Reviewed-by: Alexey SerbinReviewed-by: Todd Lipcon Reviewed-on: http://gerrit.cloudera.org:8080/9287 Reviewed-by: Lars Volker Tested-by: Impala Public Jenkins --- M be/src/kudu/rpc/acceptor_pool.cc M be/src/kudu/rpc/acceptor_pool.h M be/src/kudu/rpc/service_if.h M be/src/kudu/util/metrics.h 4 files changed, 17 insertions(+), 1 deletion(-) Approvals: Lars Volker: Looks good to me, approved Impala Public Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/9287 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I8bda390ea92cceb0d696767402c978a83b386825 Gerrit-Change-Number: 9287 Gerrit-PatchSet: 3 Gerrit-Owner: Lars Volker Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Todd Lipcon
[Impala-ASF-CR] IMPALA-6269: Expose KRPC metrics on debug webpage
Lars Volker has posted comments on this change. ( http://gerrit.cloudera.org:8080/9292 ) Change subject: IMPALA-6269: Expose KRPC metrics on debug webpage .. Patch Set 4: This is the change to expose the KRPC metrics. -- To view, visit http://gerrit.cloudera.org:8080/9292 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7af7c1a84a5be82c979ca4ef1edf35167493be3f Gerrit-Change-Number: 9292 Gerrit-PatchSet: 4 Gerrit-Owner: Lars VolkerGerrit-Reviewer: Lars Volker Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Tue, 13 Feb 2018 05:04:15 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6269: Expose KRPC metrics on debug webpage
Lars Volker has uploaded this change for review. ( http://gerrit.cloudera.org:8080/9292 Change subject: IMPALA-6269: Expose KRPC metrics on debug webpage .. IMPALA-6269: Expose KRPC metrics on debug webpage This change exposes KRPC metrics on the /rpcz debug web page. This change also fixes a bug in PrettyPrinter::GetByteUnit(), which previously did not work for unsigned values due to an implicit cast. This change is based on a change by Sailesh Mukil. TODO: testing, once IMPALA-6508 has been submitted. Change-Id: I7af7c1a84a5be82c979ca4ef1edf35167493be3f --- M be/src/rpc/impala-service-pool.cc M be/src/rpc/impala-service-pool.h M be/src/rpc/rpc-mgr.cc M be/src/rpc/rpc-mgr.h M be/src/rpc/rpc-trace.cc M be/src/rpc/rpc-trace.h M be/src/runtime/exec-env.cc M be/src/runtime/exec-env.h M be/src/service/impalad-main.cc M be/src/util/histogram-metric.h M be/src/util/pretty-printer.h M common/thrift/Metrics.thrift M tests/webserver/test_web_pages.py M www/rpcz.tmpl 14 files changed, 370 insertions(+), 100 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/92/9292/4 -- To view, visit http://gerrit.cloudera.org:8080/9292 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I7af7c1a84a5be82c979ca4ef1edf35167493be3f Gerrit-Change-Number: 9292 Gerrit-PatchSet: 4 Gerrit-Owner: Lars Volker
[Impala-ASF-CR] IMPALA-6509: [docs] Note for haproxy for Kerberized clusters
Philip Zeyliger has posted comments on this change. ( http://gerrit.cloudera.org:8080/9286 ) Change subject: IMPALA-6509: [docs] Note for haproxy for Kerberized clusters .. Patch Set 5: > Uploaded patch set 5: Patch Set 4 was rebased. For what it's worth, https://gerrit.cloudera.org/c/7241 seems to be touching the same area of the world. If IMPALA-2782 gets resolved, we may want to change this text further. -- To view, visit http://gerrit.cloudera.org:8080/9286 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1ce9930e3f658c52502a2ba95f93647e2706d58c Gerrit-Change-Number: 9286 Gerrit-PatchSet: 5 Gerrit-Owner: Alex RodoniGerrit-Reviewer: Alex Rodoni Gerrit-Reviewer: John Russell Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Tue, 13 Feb 2018 04:45:50 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5519: Allocate fragment's runtime filter memory from Buffer pool
Hello Dimitris Tsirogiannis, Tim Armstrong, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8971 to look at the new patch set (#13). Change subject: IMPALA-5519: Allocate fragment's runtime filter memory from Buffer pool .. IMPALA-5519: Allocate fragment's runtime filter memory from Buffer pool This patch adds changes to the planner to account for memory used by bloom filters at the fragment instance level. Also adds changes to allocate memory for those bloom filters from the buffer pool. Testing: - Modified Planner Tests and end to end tests to account for memory reservation for the runtime filters. - Modified backend tests and benchmarks to use the bufferpool for bloom filter allocation. - Add an end to end test. - Ran rest of the core tests. Change-Id: Iea2759665fb2e8bef9433014a8d42a7ebf99ce1f --- M be/src/benchmarks/bloom-filter-benchmark.cc M be/src/runtime/fragment-instance-state.cc M be/src/runtime/runtime-filter-bank.cc M be/src/runtime/runtime-filter-bank.h M be/src/runtime/runtime-filter.h M be/src/runtime/runtime-state.cc M be/src/runtime/runtime-state.h M be/src/service/fe-support.cc M be/src/service/query-options-test.cc M be/src/service/query-options.cc M be/src/util/backend-gflag-util.cc M be/src/util/bloom-filter-test.cc M be/src/util/bloom-filter.cc M be/src/util/bloom-filter.h M common/thrift/BackendGflags.thrift M common/thrift/ImpalaInternalService.thrift M common/thrift/PlanNodes.thrift M common/thrift/Planner.thrift M fe/src/main/java/org/apache/impala/planner/PlanFragment.java M fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java M fe/src/main/java/org/apache/impala/service/FeSupport.java M testdata/workloads/functional-planner/queries/PlannerTest/disable-codegen.test M testdata/workloads/functional-planner/queries/PlannerTest/fk-pk-join-detection.test M testdata/workloads/functional-planner/queries/PlannerTest/max-row-size.test M testdata/workloads/functional-planner/queries/PlannerTest/min-max-runtime-filters.test M testdata/workloads/functional-planner/queries/PlannerTest/resource-requirements.test M testdata/workloads/functional-planner/queries/PlannerTest/spillable-buffer-sizing.test M testdata/workloads/functional-planner/queries/PlannerTest/tablesample.test M testdata/workloads/functional-query/queries/QueryTest/admission-reject-min-reservation.test M testdata/workloads/functional-query/queries/QueryTest/bloom_filters.test M testdata/workloads/functional-query/queries/QueryTest/bloom_filters_wait.test M testdata/workloads/functional-query/queries/QueryTest/explain-level0.test M testdata/workloads/functional-query/queries/QueryTest/explain-level1.test M testdata/workloads/functional-query/queries/QueryTest/explain-level2.test M testdata/workloads/functional-query/queries/QueryTest/explain-level3.test M testdata/workloads/functional-query/queries/QueryTest/runtime_row_filters.test M testdata/workloads/functional-query/queries/QueryTest/spilling.test 38 files changed, 902 insertions(+), 550 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/71/8971/13 -- To view, visit http://gerrit.cloudera.org:8080/8971 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Iea2759665fb2e8bef9433014a8d42a7ebf99ce1f Gerrit-Change-Number: 8971 Gerrit-PatchSet: 13 Gerrit-Owner: Bikramjeet VigGerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-5440: Add planner tests with extreme statistics values
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/9065 ) Change subject: IMPALA-5440: Add planner tests with extreme statistics values .. IMPALA-5440: Add planner tests with extreme statistics values This commit address some of the issues in JIRA: tests against the cardinality overflowing from JOIN, UNION, CROSS JOIN, FULL OUTER JOIN, 0 row number and negative row number, as well as cardinality on Subplan node. Change-Id: I86dec47cf1438882cafaec53e97864ccfcdff3cb Reviewed-on: http://gerrit.cloudera.org:8080/9065 Reviewed-by: Alex BehmTested-by: Impala Public Jenkins --- M fe/src/test/java/org/apache/impala/planner/PlannerTest.java M fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java 2 files changed, 117 insertions(+), 0 deletions(-) Approvals: Alex Behm: Looks good to me, approved Impala Public Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/9065 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I86dec47cf1438882cafaec53e97864ccfcdff3cb Gerrit-Change-Number: 9065 Gerrit-PatchSet: 16 Gerrit-Owner: Xinran Tinney Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Vuk Ercegovac Gerrit-Reviewer: Xinran Tinney
[Impala-ASF-CR] IMPALA-5440: Add planner tests with extreme statistics values
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/9065 ) Change subject: IMPALA-5440: Add planner tests with extreme statistics values .. Patch Set 15: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/9065 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I86dec47cf1438882cafaec53e97864ccfcdff3cb Gerrit-Change-Number: 9065 Gerrit-PatchSet: 15 Gerrit-Owner: Xinran TinneyGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Vuk Ercegovac Gerrit-Reviewer: Xinran Tinney Gerrit-Comment-Date: Tue, 13 Feb 2018 04:26:32 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6511: Fix state machine in FIS::UpdateState()
Lars Volker has posted comments on this change. ( http://gerrit.cloudera.org:8080/9294 ) Change subject: IMPALA-6511: Fix state machine in FIS::UpdateState() .. Patch Set 1: Tim, do you have time for a quick look? -- To view, visit http://gerrit.cloudera.org:8080/9294 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7304205d245289cc3d7ca2217e212c844ee75e7b Gerrit-Change-Number: 9294 Gerrit-PatchSet: 1 Gerrit-Owner: Lars VolkerGerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 13 Feb 2018 04:09:07 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6508: add KRPC test flag
Lars Volker has posted comments on this change. ( http://gerrit.cloudera.org:8080/9291 ) Change subject: IMPALA-6508: add KRPC test flag .. Patch Set 4: (4 comments) Thanks for the review, please see my inline comments and PS5. http://gerrit.cloudera.org:8080/#/c/9291/4/bin/start-impala-cluster.py File bin/start-impala-cluster.py: http://gerrit.cloudera.org:8080/#/c/9291/4/bin/start-impala-cluster.py@57 PS4, Line 57: krpc > KRPC Done http://gerrit.cloudera.org:8080/#/c/9291/4/tests/common/custom_cluster_test_suite.py File tests/common/custom_cluster_test_suite.py: http://gerrit.cloudera.org:8080/#/c/9291/4/tests/common/custom_cluster_test_suite.py@138 PS4, Line 138: "--impalad_args=-use_krpc" > Not quite following what this translates to ? For now, I have been using -- Yes, -use_krpc is a boolean and as such its presence enables it. This page has all four variations that are supported for a boolean flag: http://gflags.github.io/gflags/ app_containing_foo --big_menu app_containing_foo --nobig_menu app_containing_foo --big_menu=true app_containing_foo --big_menu=false I picked the single-dash variant because I've seen this convention elsewhere I think. Let me know if you prefer two dashes. http://gerrit.cloudera.org:8080/#/c/9291/4/tests/common/skip.py File tests/common/skip.py: http://gerrit.cloudera.org:8080/#/c/9291/4/tests/common/skip.py@89 PS4, Line 89: krpc > nit: KRPC Done http://gerrit.cloudera.org:8080/#/c/9291/4/tests/common/test_skip.py File tests/common/test_skip.py: http://gerrit.cloudera.org:8080/#/c/9291/4/tests/common/test_skip.py@24 PS4, Line 24: class TestSkipIf(ImpalaTestSuite): > Is this test gonna be part of the patch or is it for testing only ? My idea was to keep it in this patch so that we actually have a test for the things I added. Once we add a proper (non-custom-cluster) test using the SkipIfs I plan to drop this test again. -- To view, visit http://gerrit.cloudera.org:8080/9291 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ie01a5de2afac4a0f43d5fceff283f6108ad6a3ab Gerrit-Change-Number: 9291 Gerrit-PatchSet: 4 Gerrit-Owner: Lars VolkerGerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Tue, 13 Feb 2018 04:07:55 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6508: add KRPC test flag
Hello Michael Ho, Michael Brown, Sailesh Mukil, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/9291 to look at the new patch set (#5). Change subject: IMPALA-6508: add KRPC test flag .. IMPALA-6508: add KRPC test flag This change adds a flag "--use_krpc" to start-impala-cluster.py. The flag is currently passed as an argument to the impalad daemon. In the future it will also enable KRPC for the catalogd and statestored daemons. This change also adds a flag "--test_krpc" to pytest. When running tests using "impala-py.test --test_krpc", the test cluster will be started by passing "--use_krpc" to start-impala-cluster.py (see above). This change also adds a SkipIf to skip tests based on whether the cluster was started with KRPC support or not. - SkipIf.not_krpc can be used to mark a test that depends on KRPC. - SkipIf.not_thrift can be used to mark a test that depends on Thrift RPC. This change adds a meta test to make sure that the new SkipIf decorators work correctly. The test should be removed as soon as real tests have been added with the new decorators. Change-Id: Ie01a5de2afac4a0f43d5fceff283f6108ad6a3ab --- M bin/run-all-tests.sh M bin/start-impala-cluster.py M tests/common/custom_cluster_test_suite.py M tests/common/skip.py A tests/common/test_skip.py M tests/conftest.py M tests/custom_cluster/test_krpc_mem_usage.py 7 files changed, 81 insertions(+), 5 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/91/9291/5 -- To view, visit http://gerrit.cloudera.org:8080/9291 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ie01a5de2afac4a0f43d5fceff283f6108ad6a3ab Gerrit-Change-Number: 9291 Gerrit-PatchSet: 5 Gerrit-Owner: Lars VolkerGerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil
[Impala-ASF-CR] IMPALA-6489: use correct template tuple size
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/9288 ) Change subject: IMPALA-6489: use correct template tuple size .. IMPALA-6489: use correct template tuple size The bug was that we mixed up the size of the top-level tuple with the size of the nested tuple. The upshot in this case was that the wrong amount of data was memcpy()ed over and we read past the bounds of the original allocation. Testing: TestParquetArrayEncodings.test_avro_primitive_in_list reliably reproduced the problem under ASAN. After the fix it not longer reproduces. Change-Id: I8193c04673f15e5057f457cc8a3a91a8fef64be2 Reviewed-on: http://gerrit.cloudera.org:8080/9288 Reviewed-by: Alex BehmReviewed-by: Dan Hecht Tested-by: Impala Public Jenkins --- M be/src/exec/hdfs-scanner.h 1 file changed, 4 insertions(+), 1 deletion(-) Approvals: Alex Behm: Looks good to me, but someone else must approve Dan Hecht: Looks good to me, approved Impala Public Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/9288 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I8193c04673f15e5057f457cc8a3a91a8fef64be2 Gerrit-Change-Number: 9288 Gerrit-PatchSet: 2 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins
[Impala-ASF-CR] [docs] Removed the obsolete Llama options files
Alex Rodoni has posted comments on this change. ( http://gerrit.cloudera.org:8080/9219 ) Change subject: [docs] Removed the obsolete Llama options files .. Patch Set 2: We remove the "removed" features from docs. -- To view, visit http://gerrit.cloudera.org:8080/9219 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If0c2b8160af9c95ec1e1b744b558d9537dd2550d Gerrit-Change-Number: 9219 Gerrit-PatchSet: 2 Gerrit-Owner: Alex RodoniGerrit-Reviewer: Alex Rodoni Gerrit-Reviewer: John Russell Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 13 Feb 2018 02:31:50 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6508: add krpc test flag
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/9291 ) Change subject: IMPALA-6508: add krpc test flag .. Patch Set 4: (4 comments) http://gerrit.cloudera.org:8080/#/c/9291/4/bin/start-impala-cluster.py File bin/start-impala-cluster.py: http://gerrit.cloudera.org:8080/#/c/9291/4/bin/start-impala-cluster.py@57 PS4, Line 57: krpc KRPC http://gerrit.cloudera.org:8080/#/c/9291/4/tests/common/custom_cluster_test_suite.py File tests/common/custom_cluster_test_suite.py: http://gerrit.cloudera.org:8080/#/c/9291/4/tests/common/custom_cluster_test_suite.py@138 PS4, Line 138: "--impalad_args=-use_krpc" Not quite following what this translates to ? For now, I have been using --impalad_args="--use_krpc=true". Does this line have similar effect ? http://gerrit.cloudera.org:8080/#/c/9291/4/tests/common/skip.py File tests/common/skip.py: http://gerrit.cloudera.org:8080/#/c/9291/4/tests/common/skip.py@89 PS4, Line 89: krpc nit: KRPC http://gerrit.cloudera.org:8080/#/c/9291/4/tests/common/test_skip.py File tests/common/test_skip.py: http://gerrit.cloudera.org:8080/#/c/9291/4/tests/common/test_skip.py@24 PS4, Line 24: class TestSkipIf(ImpalaTestSuite): Is this test gonna be part of the patch or is it for testing only ? -- To view, visit http://gerrit.cloudera.org:8080/9291 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ie01a5de2afac4a0f43d5fceff283f6108ad6a3ab Gerrit-Change-Number: 9291 Gerrit-PatchSet: 4 Gerrit-Owner: Lars VolkerGerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Tue, 13 Feb 2018 02:31:55 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6499: [docs] Fixed formatting errors in split part function
Alex Rodoni has uploaded this change for review. ( http://gerrit.cloudera.org:8080/9275 Change subject: IMPALA-6499: [docs] Fixed formatting errors in split_part function .. IMPALA-6499: [docs] Fixed formatting errors in split_part function Change-Id: I7623e32aaf31f21a3be4513f26deb0b789a56b1a --- M docs/topics/impala_string_functions.xml 1 file changed, 24 insertions(+), 24 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/75/9275/3 -- To view, visit http://gerrit.cloudera.org:8080/9275 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I7623e32aaf31f21a3be4513f26deb0b789a56b1a Gerrit-Change-Number: 9275 Gerrit-PatchSet: 3 Gerrit-Owner: Alex RodoniGerrit-Reviewer: John Russell Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-6269: Cherry-pick dependency change for KRPC
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/9287 ) Change subject: IMPALA-6269: Cherry-pick dependency change for KRPC .. Patch Set 2: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1926/ -- To view, visit http://gerrit.cloudera.org:8080/9287 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I8bda390ea92cceb0d696767402c978a83b386825 Gerrit-Change-Number: 9287 Gerrit-PatchSet: 2 Gerrit-Owner: Lars VolkerGerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Tue, 13 Feb 2018 02:02:02 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6269: Cherry-pick dependency change for KRPC
Lars Volker has posted comments on this change. ( http://gerrit.cloudera.org:8080/9287 ) Change subject: IMPALA-6269: Cherry-pick dependency change for KRPC .. Patch Set 2: Code-Review+2 (1 comment) Fixed comment from Michael, carrying his +2. http://gerrit.cloudera.org:8080/#/c/9287/1/be/src/kudu/rpc/acceptor_pool.h File be/src/kudu/rpc/acceptor_pool.h: http://gerrit.cloudera.org:8080/#/c/9287/1/be/src/kudu/rpc/acceptor_pool.h@20 PS1, Line 20: > Kudu uses include-what-you-use and it complains about the missing include f Done -- To view, visit http://gerrit.cloudera.org:8080/9287 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I8bda390ea92cceb0d696767402c978a83b386825 Gerrit-Change-Number: 9287 Gerrit-PatchSet: 2 Gerrit-Owner: Lars VolkerGerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Tue, 13 Feb 2018 02:01:48 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6269: Cherry-pick dependency change for KRPC
Hello Michael Ho, Alexey Serbin, Kudu Jenkins, Todd Lipcon, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/9287 to look at the new patch set (#2). Change subject: IMPALA-6269: Cherry-pick dependency change for KRPC .. IMPALA-6269: Cherry-pick dependency change for KRPC Expose RPC method info map and various metrics These changes are needed for IMPALA-6269. Change-Id: I8bda390ea92cceb0d696767402c978a83b386825 Reviewed-on: http://gerrit.cloudera.org:8080/9269 Tested-by: Kudu Jenkins Reviewed-by: Alexey SerbinReviewed-by: Todd Lipcon --- M be/src/kudu/rpc/acceptor_pool.cc M be/src/kudu/rpc/acceptor_pool.h M be/src/kudu/rpc/service_if.h M be/src/kudu/util/metrics.h 4 files changed, 17 insertions(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/87/9287/2 -- To view, visit http://gerrit.cloudera.org:8080/9287 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I8bda390ea92cceb0d696767402c978a83b386825 Gerrit-Change-Number: 9287 Gerrit-PatchSet: 2 Gerrit-Owner: Lars Volker Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Todd Lipcon
[Impala-ASF-CR] IMPALA-6508: add krpc test flag
Hello Michael Ho, Michael Brown, Sailesh Mukil, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/9291 to look at the new patch set (#4). Change subject: IMPALA-6508: add krpc test flag .. IMPALA-6508: add krpc test flag This change adds a flag "--use_krpc" to start-impala-cluster.py. The flag is currently passed as an argument to the impalad daemon. In the future it will also enable krpc for the catalogd and statestored daemons. This change also adds a flag "--test_krpc" to pytest. When running tests using "impala-py.test --test_krpc", the test cluster will be started by passing "--use_krpc" to start-impala-cluster.py (see above). This change also adds a SkipIf to skip tests based on whether the cluster was started with krpc support or not. - SkipIf.not_krpc can be used to mark a test that depends on krpc. - SkipIf.not_thrift can be used to mark a test that depends on Thrift RPC. This change adds a meta test to make sure that the new SkipIf decorators work correctly. The test should be removed as soon as real tests have been added with the new decorators. Change-Id: Ie01a5de2afac4a0f43d5fceff283f6108ad6a3ab --- M bin/run-all-tests.sh M bin/start-impala-cluster.py M tests/common/custom_cluster_test_suite.py M tests/common/skip.py A tests/common/test_skip.py M tests/conftest.py M tests/custom_cluster/test_krpc_mem_usage.py 7 files changed, 81 insertions(+), 5 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/91/9291/4 -- To view, visit http://gerrit.cloudera.org:8080/9291 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ie01a5de2afac4a0f43d5fceff283f6108ad6a3ab Gerrit-Change-Number: 9291 Gerrit-PatchSet: 4 Gerrit-Owner: Lars VolkerGerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil
[Impala-ASF-CR] IMPALA-6269: Cherry-pick dependency change for KRPC
Lars Volker has posted comments on this change. ( http://gerrit.cloudera.org:8080/9287 ) Change subject: IMPALA-6269: Cherry-pick dependency change for KRPC .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/9287/1/be/src/kudu/rpc/acceptor_pool.h File be/src/kudu/rpc/acceptor_pool.h: http://gerrit.cloudera.org:8080/#/c/9287/1/be/src/kudu/rpc/acceptor_pool.h@20 PS1, Line 20: > The original patch had #include here. Is it not necessary someho Kudu uses include-what-you-use and it complains about the missing include for int64_t. Should I add it here, too? -- To view, visit http://gerrit.cloudera.org:8080/9287 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I8bda390ea92cceb0d696767402c978a83b386825 Gerrit-Change-Number: 9287 Gerrit-PatchSet: 1 Gerrit-Owner: Lars VolkerGerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Tue, 13 Feb 2018 01:59:14 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6269: Cherry-pick dependency change for KRPC
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/9287 ) Change subject: IMPALA-6269: Cherry-pick dependency change for KRPC .. Patch Set 1: Code-Review+2 (1 comment) http://gerrit.cloudera.org:8080/#/c/9287/1/be/src/kudu/rpc/acceptor_pool.h File be/src/kudu/rpc/acceptor_pool.h: http://gerrit.cloudera.org:8080/#/c/9287/1/be/src/kudu/rpc/acceptor_pool.h@20 PS1, Line 20: The original patch had #include here. Is it not necessary somehow in this cherry-pick ? -- To view, visit http://gerrit.cloudera.org:8080/9287 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I8bda390ea92cceb0d696767402c978a83b386825 Gerrit-Change-Number: 9287 Gerrit-PatchSet: 1 Gerrit-Owner: Lars VolkerGerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Tue, 13 Feb 2018 01:57:51 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6497: add "Last row fetched" and AC events
Lars Volker has posted comments on this change. ( http://gerrit.cloudera.org:8080/9271 ) Change subject: IMPALA-6497: add "Last row fetched" and AC events .. Patch Set 5: (3 comments) I found a bug which is unrelated to this change and filed IMPALA-6511 for it. http://gerrit.cloudera.org:8080/#/c/9271/5/tests/query_test/test_observability.py File tests/query_test/test_observability.py: http://gerrit.cloudera.org:8080/#/c/9271/5/tests/query_test/test_observability.py@206 PS5, Line 206: event_regexes = [r'Fragment Instance Lifecycle Event Timeline', nit: I think these work without the r in front since they don't contain any special characters. http://gerrit.cloudera.org:8080/#/c/9271/5/tests/query_test/test_observability.py@209 PS5, Line 209: r'First Batch Sent', Going through this review I noticed that "Open Finished" is not printed which looks like a bug to me. I filed IMPALA-6511 to fix it. http://gerrit.cloudera.org:8080/#/c/9271/5/tests/query_test/test_observability.py@211 PS5, Line 211: query = 'select count(*) from functional.alltypes' nit: the rest of the file uses double quotes. -- To view, visit http://gerrit.cloudera.org:8080/9271 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I32a707e4660061e75c86ad967f1fac6f6737da7e Gerrit-Change-Number: 9271 Gerrit-PatchSet: 5 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 13 Feb 2018 01:43:56 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6269: Cherry-pick dependency change for KRPC
Todd Lipcon has posted comments on this change. ( http://gerrit.cloudera.org:8080/9287 ) Change subject: IMPALA-6269: Cherry-pick dependency change for KRPC .. Patch Set 1: Code-Review+1 -- To view, visit http://gerrit.cloudera.org:8080/9287 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I8bda390ea92cceb0d696767402c978a83b386825 Gerrit-Change-Number: 9287 Gerrit-PatchSet: 1 Gerrit-Owner: Lars VolkerGerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Tue, 13 Feb 2018 01:42:50 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4835: Part 2: Allocate scan range buffers upfront
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/8707 ) Change subject: IMPALA-4835: Part 2: Allocate scan range buffers upfront .. Patch Set 24: (3 comments) A few initial questions. http://gerrit.cloudera.org:8080/#/c/8707/24/be/src/runtime/io/disk-io-mgr.h File be/src/runtime/io/disk-io-mgr.h: http://gerrit.cloudera.org:8080/#/c/8707/24/be/src/runtime/io/disk-io-mgr.h@144 PS24, Line 144: recycle is that the same as putting them back on the queue you talk about in the next paragraph? If so, maybe say re-enqueue? http://gerrit.cloudera.org:8080/#/c/8707/24/be/src/runtime/io/disk-io-mgr.h@301 PS24, Line 301: /// and the state of 'range' is unmodified so that allocation can be retried. so does this schedule the scan range (given that the previous two methods left the range unscheduled when '*needs_buffers'? is it legal to call this only after '*needs_buffers' is returned true, or can it be called regardless? http://gerrit.cloudera.org:8080/#/c/8707/24/be/src/runtime/io/disk-io-mgr.h@308 PS24, Line 308: int64_t max_bytes); If we aren't able to reduce the number of ExternalBufferTag cases, I'm not sure I follow the benefit of this interface. Why not just pass the reservation through to AddScanRanges()/StartScanRange(), since the io_mgr seems to still be managing the buffer lifetimes anyway? -- To view, visit http://gerrit.cloudera.org:8080/8707 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia243bf8b62feeea602b122e0503ea4ba7d3ee70f Gerrit-Change-Number: 8707 Gerrit-PatchSet: 24 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 13 Feb 2018 01:37:11 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5152: Introduce metadata loading phase
Vuk Ercegovac has posted comments on this change. ( http://gerrit.cloudera.org:8080/8958 ) Change subject: IMPALA-5152: Introduce metadata loading phase .. Patch Set 3: (10 comments) http://gerrit.cloudera.org:8080/#/c/8958/3/fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java File fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java: http://gerrit.cloudera.org:8080/#/c/8958/3/fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java@460 PS3, Line 460: // Process statements for which column-level privilege requests may be registered : // except for DESCRIBE TABLE, REFRESH/INVALIDATE, USE or SHOW TABLES statem stale comment? http://gerrit.cloudera.org:8080/#/c/8958/3/fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java@462 PS3, Line 462: isResult_.isQueryStmt() || analysisResult_.isInsertStmt() || : analysisResult_.isUpdateStmt() || analysisResult_.isDeleteStmt() || : analysi worth grouping into a hasColumnPrivilege method? http://gerrit.cloudera.org:8080/#/c/8958/3/fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java@508 PS3, Line 508: sult_.isDescribeTableStmt() || : analysisResult_.isResetMetadataStmt() || : analysisResult_.isUseStmt() || : analysisResult_.isShowTablesStmt() || : analysisResult_ what's special about these? how would one know that altertable belongs here? http://gerrit.cloudera.org:8080/#/c/8958/3/fe/src/main/java/org/apache/impala/analysis/Analyzer.java File fe/src/main/java/org/apache/impala/analysis/Analyzer.java: http://gerrit.cloudera.org:8080/#/c/8958/3/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@309 PS3, Line 309: // the map was populated. is that TODO about other catalog objects still relevant? http://gerrit.cloudera.org:8080/#/c/8958/3/fe/src/main/java/org/apache/impala/analysis/LimitElement.java File fe/src/main/java/org/apache/impala/analysis/LimitElement.java: http://gerrit.cloudera.org:8080/#/c/8958/3/fe/src/main/java/org/apache/impala/analysis/LimitElement.java@124 PS3, Line 124: || expr.contains(Subquery.cl comment is stale, pls update. http://gerrit.cloudera.org:8080/#/c/8958/3/fe/src/main/java/org/apache/impala/analysis/ShowFilesStmt.java File fe/src/main/java/org/apache/impala/analysis/ShowFilesStmt.java: http://gerrit.cloudera.org:8080/#/c/8958/3/fe/src/main/java/org/apache/impala/analysis/ShowFilesStmt.java@62 PS3, Line 62: tableName_.toPath() add a precondition in the constructor to check that this is not null. http://gerrit.cloudera.org:8080/#/c/8958/3/fe/src/main/java/org/apache/impala/analysis/ShowStatsStmt.java File fe/src/main/java/org/apache/impala/analysis/ShowStatsStmt.java: http://gerrit.cloudera.org:8080/#/c/8958/3/fe/src/main/java/org/apache/impala/analysis/ShowStatsStmt.java@71 PS3, Line 71: tableName_.toPath() check not-null in constructor. http://gerrit.cloudera.org:8080/#/c/8958/3/fe/src/main/java/org/apache/impala/service/Frontend.java File fe/src/main/java/org/apache/impala/service/Frontend.java: http://gerrit.cloudera.org:8080/#/c/8958/3/fe/src/main/java/org/apache/impala/service/Frontend.java@961 PS3, Line 961: DatabaseNotFoundException fwict, this is the exception that we were previously using to give back an error message about a non-existent db whereas now we state that the table is missing. would it be reasonable to record this in the table cache (in a separate map)? http://gerrit.cloudera.org:8080/#/c/8958/3/fe/src/main/java/org/apache/impala/service/Frontend.java@969 PS3, Line 969: Tbls.put(tblName, t so we can have views that are marked as loaded, but their base tables may not be loaded? not sure what we do for renaming a base table of a view. http://gerrit.cloudera.org:8080/#/c/8958/1/fe/src/test/java/org/apache/impala/analysis/AnalyzeUpsertStmtTest.java File fe/src/test/java/org/apache/impala/analysis/AnalyzeUpsertStmtTest.java: http://gerrit.cloudera.org:8080/#/c/8958/1/fe/src/test/java/org/apache/impala/analysis/AnalyzeUpsertStmtTest.java@104 PS1, Line 104: Table does > Fair point. We are somewhat inconsistent about this today. Here's some back My guess is that the case you refer to is the common case. If the error is more specific, I think that's useful (e.g., db name typo). I saw in the frontend code where we get this info so if its not too cumbersome to carry around, would be useful. I'll note that the new error is technically correct so don't I feel too strongly about modifying the proposed behavior. -- To view, visit http://gerrit.cloudera.org:8080/8958 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I68d32d5acd4a6f6bc6cedb05e6cc5cf604d24a55 Gerrit-Change-Number: 8958 Gerrit-PatchSet:
[Impala-ASF-CR] DRAFT IMPALA-6508: add krpc test flag
Lars Volker has posted comments on this change. ( http://gerrit.cloudera.org:8080/9291 ) Change subject: DRAFT IMPALA-6508: add krpc test flag .. Patch Set 3: I'm currently running a private build to validate that the new flag works well with Jenkins. -- To view, visit http://gerrit.cloudera.org:8080/9291 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ie01a5de2afac4a0f43d5fceff283f6108ad6a3ab Gerrit-Change-Number: 9291 Gerrit-PatchSet: 3 Gerrit-Owner: Lars VolkerGerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Tue, 13 Feb 2018 01:05:23 + Gerrit-HasComments: No
[Impala-ASF-CR] DRAFT IMPALA-6508: add krpc test flag
Lars Volker has uploaded this change for review. ( http://gerrit.cloudera.org:8080/9291 Change subject: DRAFT IMPALA-6508: add krpc test flag .. DRAFT IMPALA-6508: add krpc test flag This change adds a flag "--use_krpc" to start-impala-cluster.py. The flag is currently passed as an argument to the impalad daemon. In the future it will also enable krpc for the catalogd and statestored daemons. This change also adds a flag "--test_krpc" to pytest. When running tests using "impala-py.test --test_krpc", the test cluster will be started by passing "--use_krpc" to start-impala-cluster.py (see above). This change also adds a SkipIf to skip tests based on whether the cluster was started with krpc support or not. - SkipIf.not_krpc can be used to mark a test that depends on krpc. - SkipIf.not_thrift can be used to mark a test that depends on Thrift RPC. This change adds a meta test to make sure that the new SkipIf decorators work correctly. The test should be removed as soon as real tests have been added with the new decorators. Change-Id: Ie01a5de2afac4a0f43d5fceff283f6108ad6a3ab --- M bin/run-all-tests.sh M bin/start-impala-cluster.py M tests/common/custom_cluster_test_suite.py M tests/common/skip.py A tests/common/test_skip.py M tests/conftest.py M tests/custom_cluster/test_krpc_mem_usage.py 7 files changed, 81 insertions(+), 5 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/91/9291/3 -- To view, visit http://gerrit.cloudera.org:8080/9291 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: Ie01a5de2afac4a0f43d5fceff283f6108ad6a3ab Gerrit-Change-Number: 9291 Gerrit-PatchSet: 3 Gerrit-Owner: Lars Volker
[Impala-ASF-CR] IMPALA-5519: Allocate fragment's runtime filter memory from Buffer pool
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/8971 ) Change subject: IMPALA-5519: Allocate fragment's runtime filter memory from Buffer pool .. Patch Set 12: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/1921/ -- To view, visit http://gerrit.cloudera.org:8080/8971 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iea2759665fb2e8bef9433014a8d42a7ebf99ce1f Gerrit-Change-Number: 8971 Gerrit-PatchSet: 12 Gerrit-Owner: Bikramjeet VigGerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 13 Feb 2018 01:04:56 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4456: Address scalability issues of qs map lock and client request state map lock
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/8363 ) Change subject: IMPALA-4456: Address scalability issues of qs_map_lock_ and client_request_state_map_lock_ .. Patch Set 9: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1925/ -- To view, visit http://gerrit.cloudera.org:8080/8363 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I61089090e1095da45a8a64ed3ccc78bd310807f1 Gerrit-Change-Number: 8363 Gerrit-PatchSet: 9 Gerrit-Owner: Sailesh MukilGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Tue, 13 Feb 2018 00:47:22 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4456: Address scalability issues of qs map lock and client request state map lock
Sailesh Mukil has posted comments on this change. ( http://gerrit.cloudera.org:8080/8363 ) Change subject: IMPALA-4456: Address scalability issues of qs_map_lock_ and client_request_state_map_lock_ .. Patch Set 9: Code-Review+2 (3 comments) Thanks for the review. Carry +2. http://gerrit.cloudera.org:8080/#/c/8363/8/be/src/runtime/query-exec-mgr.h File be/src/runtime/query-exec-mgr.h: http://gerrit.cloudera.org:8080/#/c/8363/8/be/src/runtime/query-exec-mgr.h@28 PS8, Line 28: > I don't see that used explicitly in this file. Can it be removed? Done http://gerrit.cloudera.org:8080/#/c/8363/8/be/src/service/impala-server.h File be/src/service/impala-server.h: http://gerrit.cloudera.org:8080/#/c/8363/8/be/src/service/impala-server.h@44 PS8, Line 44: #include "util/thread-pool > same Done http://gerrit.cloudera.org:8080/#/c/8363/8/be/src/util/sharded-query-map-util.h File be/src/util/sharded-query-map-util.h: http://gerrit.cloudera.org:8080/#/c/8363/8/be/src/util/sharded-query-map-util.h@95 PS8, Line 95: int qs_m > generally we just use the un-sized primitive type (which for impala codebas Done -- To view, visit http://gerrit.cloudera.org:8080/8363 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I61089090e1095da45a8a64ed3ccc78bd310807f1 Gerrit-Change-Number: 8363 Gerrit-PatchSet: 9 Gerrit-Owner: Sailesh MukilGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Tue, 13 Feb 2018 00:46:38 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4456: Address scalability issues of qs map lock and client request state map lock
Hello Philip Zeyliger, Dan Hecht, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8363 to look at the new patch set (#9). Change subject: IMPALA-4456: Address scalability issues of qs_map_lock_ and client_request_state_map_lock_ .. IMPALA-4456: Address scalability issues of qs_map_lock_ and client_request_state_map_lock_ The following 2 locks have shown to be frequent points of contention on recent perf runs: - qs_map_lock_ - client_request_state_map_lock_ Since these are process wide locks, any threads waiting on these locks potentially slow down the runtime of a query. I tried to address this previously by converting the client_request_state_map_lock_ to a reader-writer lock. This showed great perf improvements in the general case, however, there were edge cases with big regressions as well. In the general case, strict readers of the map got through so quickly that we were able to see a reduction in the number of client connections created, since this lock was contended for in the context of Thrift threads too. The bad case is when writers were starved trying to register a new query since there were so many readers. Changing the starve option resulted in worse read performance all round. Another approach which is relatively simpler is to shard the locks, which proves to be very effective with no regressions. The maps and locks are sharded to a default of 4 buckets initally. Query IDs are created by using boost::uuids::random_generator. We use the high bits of a query ID to assign queries to buckets. I verified that the distribution of the high bits of a query ID are even across buckets on my local machine: For 10,000 queries sharded across 4 buckets, the distribution was: bucket[0]: 2500 bucket[1]: 2489 bucket[2]: 2566 bucket[3]: 2445 A micro-benchmark is added to measure the improvement in performance. This benchmark creates multiple threads each of which creates a QueryState and accesses it multiple times. We can see improvements in the range 2x - 3.5x. BEFORE: --Benchmark 1: Create and access Query States. Total Time (#Queries: 5 #Accesses: 100) : 1ms Total Time (#Queries: 50 #Accesses: 100) : 8ms Total Time (#Queries: 50 #Accesses: 1000) : 54ms Total Time (#Queries: 500 #Accesses: 100) : 76ms Total Time (#Queries: 500 #Accesses: 1000) : 543ms AFTER: --Benchmark 1: Create and access Query States. Total Time (#Queries: 5 #Accesses: 100) : 2173.59K clock cycles Total Time (#Queries: 50 #Accesses: 100) : 4ms Total Time (#Queries: 50 #Accesses: 1000) : 15ms Total Time (#Queries: 500 #Accesses: 100) : 46ms Total Time (#Queries: 500 #Accesses: 1000) : 151ms This change introduces a ShardedQueryMap, which is used to replace the QueryExecMgr::qs_map_ and the ImpalaServer::client_request_state_map_, and their corresponding locks, thereby abstracting away the access to the maps locks. For operations that need to happen on every entry in the ShardedQueryMap maps, a new function ShardedQueryMap::DoFuncForAllEntries() is introduced which takes a user supplied lambda and passes it every individual map entry and executes it. NOTE: This microbenchmark has shown that SpinLock has better performance than boost::mutex for the qs_map_lock_'s, so that change has been made too. TODO: Add benchmark for client_request_state_map_lock_ too. The APIs around that are more complicated, so this patch only includes the benchmarking of qs_map_lock_. TODO 2: Consider adopting the ShardedQueryMapTemplate for the SessionStateMap. Change-Id: I61089090e1095da45a8a64ed3ccc78bd310807f1 --- M be/src/benchmarks/CMakeLists.txt A be/src/benchmarks/process-wide-locks-benchmark.cc M be/src/runtime/query-exec-mgr.cc M be/src/runtime/query-exec-mgr.h M be/src/service/impala-http-handler.cc M be/src/service/impala-server.cc M be/src/service/impala-server.h A be/src/util/sharded-query-map-util.h 8 files changed, 376 insertions(+), 57 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/63/8363/9 -- To view, visit http://gerrit.cloudera.org:8080/8363 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I61089090e1095da45a8a64ed3ccc78bd310807f1 Gerrit-Change-Number: 8363 Gerrit-PatchSet: 9 Gerrit-Owner: Sailesh MukilGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Sailesh Mukil
[Impala-ASF-CR] IMPALA-6269: Cherry-pick dependency change for KRPC
Lars Volker has posted comments on this change. ( http://gerrit.cloudera.org:8080/9287 ) Change subject: IMPALA-6269: Cherry-pick dependency change for KRPC .. Patch Set 1: No, I picked it by hand since the original change replaced a method "histogram_for_tests()" which our copy doesn't have (and doesn't use). -- To view, visit http://gerrit.cloudera.org:8080/9287 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I8bda390ea92cceb0d696767402c978a83b386825 Gerrit-Change-Number: 9287 Gerrit-PatchSet: 1 Gerrit-Owner: Lars VolkerGerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Tue, 13 Feb 2018 00:43:38 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5440: Add planner tests with extreme statistics values
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/9065 ) Change subject: IMPALA-5440: Add planner tests with extreme statistics values .. Patch Set 15: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1924/ -- To view, visit http://gerrit.cloudera.org:8080/9065 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I86dec47cf1438882cafaec53e97864ccfcdff3cb Gerrit-Change-Number: 9065 Gerrit-PatchSet: 15 Gerrit-Owner: Xinran TinneyGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Vuk Ercegovac Gerrit-Reviewer: Xinran Tinney Gerrit-Comment-Date: Tue, 13 Feb 2018 00:41:06 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5440: Add planner tests with extreme statistics values
Alex Behm has posted comments on this change. ( http://gerrit.cloudera.org:8080/9065 ) Change subject: IMPALA-5440: Add planner tests with extreme statistics values .. Patch Set 15: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/9065 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I86dec47cf1438882cafaec53e97864ccfcdff3cb Gerrit-Change-Number: 9065 Gerrit-PatchSet: 15 Gerrit-Owner: Xinran TinneyGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Vuk Ercegovac Gerrit-Reviewer: Xinran Tinney Gerrit-Comment-Date: Tue, 13 Feb 2018 00:40:46 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3193: Show table's comment on show tables
Kim Jin Chul has posted comments on this change. ( http://gerrit.cloudera.org:8080/8851 ) Change subject: IMPALA-3193: Show table's comment on show tables .. Patch Set 15: (1 comment) Applied the update. http://gerrit.cloudera.org:8080/#/c/8851/15/fe/src/test/java/org/apache/impala/testutil/ImpaladTestCatalog.java File fe/src/test/java/org/apache/impala/testutil/ImpaladTestCatalog.java: http://gerrit.cloudera.org:8080/#/c/8851/15/fe/src/test/java/org/apache/impala/testutil/ImpaladTestCatalog.java@81 PS15, Line 81: getLoadedTable > I think this function should be called loadAndAddTable. The current name im Done -- To view, visit http://gerrit.cloudera.org:8080/8851 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I44f814af05db6f3c027718ade9f474f8b8153bcd Gerrit-Change-Number: 8851 Gerrit-PatchSet: 15 Gerrit-Owner: Kim Jin ChulGerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kim Jin Chul Gerrit-Comment-Date: Tue, 13 Feb 2018 00:34:35 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-3193: Show table's comment on show tables
Hello Dimitris Tsirogiannis, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8851 to look at the new patch set (#16). Change subject: IMPALA-3193: Show table's comment on show tables .. IMPALA-3193: Show table's comment on show tables Currently SHOW TABLES command does not display given comment when creating a table. DESC shows column's comment and SHOW DATABASES also shows database's comment. On this change, table's comment is displayed if the table is loaded. If the table is not loaded, an empty comment is returned. Testing: Adds E2E test: TestShowTables.test_table_comment_visibility Change-Id: I44f814af05db6f3c027718ade9f474f8b8153bcd --- M be/src/catalog/catalog-server.cc M be/src/catalog/catalog.cc M be/src/catalog/catalog.h M be/src/service/client-request-state.cc M be/src/service/client-request-state.h M be/src/service/frontend.cc M be/src/service/frontend.h M be/src/service/impala-http-handler.cc M be/src/service/impala-server.cc M common/thrift/Frontend.thrift M fe/src/main/java/org/apache/impala/catalog/Catalog.java M fe/src/main/java/org/apache/impala/catalog/Table.java M fe/src/main/java/org/apache/impala/service/Frontend.java M fe/src/main/java/org/apache/impala/service/JniCatalog.java M fe/src/main/java/org/apache/impala/service/JniFrontend.java M fe/src/main/java/org/apache/impala/service/MetadataOp.java M fe/src/test/java/org/apache/impala/testutil/ImpaladTestCatalog.java M testdata/workloads/functional-query/queries/QueryTest/show.test M tests/metadata/test_metadata_query_statements.py M www/catalog.tmpl 20 files changed, 338 insertions(+), 218 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/51/8851/16 -- To view, visit http://gerrit.cloudera.org:8080/8851 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I44f814af05db6f3c027718ade9f474f8b8153bcd Gerrit-Change-Number: 8851 Gerrit-PatchSet: 16 Gerrit-Owner: Kim Jin ChulGerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kim Jin Chul
[Impala-ASF-CR] IMPALA-5440: Add planner tests with extreme statistics values
Xinran Tinney has uploaded a new patch set (#15). ( http://gerrit.cloudera.org:8080/9065 ) Change subject: IMPALA-5440: Add planner tests with extreme statistics values .. IMPALA-5440: Add planner tests with extreme statistics values This commit address some of the issues in JIRA: tests against the cardinality overflowing from JOIN, UNION, CROSS JOIN, FULL OUTER JOIN, 0 row number and negative row number, as well as cardinality on Subplan node. Change-Id: I86dec47cf1438882cafaec53e97864ccfcdff3cb --- M fe/src/test/java/org/apache/impala/planner/PlannerTest.java M fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java 2 files changed, 117 insertions(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/65/9065/15 -- To view, visit http://gerrit.cloudera.org:8080/9065 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I86dec47cf1438882cafaec53e97864ccfcdff3cb Gerrit-Change-Number: 9065 Gerrit-PatchSet: 15 Gerrit-Owner: Xinran TinneyGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Vuk Ercegovac Gerrit-Reviewer: Xinran Tinney
[Impala-ASF-CR] IMPALA-6269: Cherry-pick dependency change for KRPC
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/9287 ) Change subject: IMPALA-6269: Cherry-pick dependency change for KRPC .. Patch Set 1: Is this a clean cherry-pick ? -- To view, visit http://gerrit.cloudera.org:8080/9287 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I8bda390ea92cceb0d696767402c978a83b386825 Gerrit-Change-Number: 9287 Gerrit-PatchSet: 1 Gerrit-Owner: Lars VolkerGerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Tue, 13 Feb 2018 00:31:18 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5440: Add planner tests with extreme statistics values
Xinran Tinney has posted comments on this change. ( http://gerrit.cloudera.org:8080/9065 ) Change subject: IMPALA-5440: Add planner tests with extreme statistics values .. Patch Set 15: (3 comments) http://gerrit.cloudera.org:8080/#/c/9065/14/fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java File fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java: http://gerrit.cloudera.org:8080/#/c/9065/14/fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java@670 PS14, Line 670:* This function plans the given query and fails if the estimated cardinalities are > This function plans the given query and fails if the estimated cardinalitie Done http://gerrit.cloudera.org:8080/#/c/9065/14/fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java@695 PS14, Line 695: StringBuilder errorLog = new StringBuilder(); > That's fine. In that case we should remove the function comment that talks Done http://gerrit.cloudera.org:8080/#/c/9065/14/fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java@695 PS14, Line 695: StringBuilder errorLog = new StringBuilder(); > I changed the function testing -1 rows, min = -1, this way, it returns erro Done -- To view, visit http://gerrit.cloudera.org:8080/9065 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I86dec47cf1438882cafaec53e97864ccfcdff3cb Gerrit-Change-Number: 9065 Gerrit-PatchSet: 15 Gerrit-Owner: Xinran TinneyGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Vuk Ercegovac Gerrit-Reviewer: Xinran Tinney Gerrit-Comment-Date: Tue, 13 Feb 2018 00:31:23 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5440 Add planner tests with extreme statistics values
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/9065 ) Change subject: IMPALA-5440 Add planner tests with extreme statistics values .. Patch Set 14: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1923/ -- To view, visit http://gerrit.cloudera.org:8080/9065 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I86dec47cf1438882cafaec53e97864ccfcdff3cb Gerrit-Change-Number: 9065 Gerrit-PatchSet: 14 Gerrit-Owner: Xinran TinneyGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Vuk Ercegovac Gerrit-Reviewer: Xinran Tinney Gerrit-Comment-Date: Tue, 13 Feb 2018 00:29:09 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6269: Cherry-pick dependency change for KRPC
Lars Volker has removed Kudu Jenkins from this change. ( http://gerrit.cloudera.org:8080/9287 ) Change subject: IMPALA-6269: Cherry-pick dependency change for KRPC .. Removed reviewer Kudu Jenkins. -- To view, visit http://gerrit.cloudera.org:8080/9287 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: deleteReviewer Gerrit-Change-Id: I8bda390ea92cceb0d696767402c978a83b386825 Gerrit-Change-Number: 9287 Gerrit-PatchSet: 1 Gerrit-Owner: Lars VolkerGerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Todd Lipcon
[Impala-ASF-CR] IMPALA-5440 Add planner tests with extreme statistics values
Alex Behm has posted comments on this change. ( http://gerrit.cloudera.org:8080/9065 ) Change subject: IMPALA-5440 Add planner tests with extreme statistics values .. Patch Set 14: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/9065 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I86dec47cf1438882cafaec53e97864ccfcdff3cb Gerrit-Change-Number: 9065 Gerrit-PatchSet: 14 Gerrit-Owner: Xinran TinneyGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Vuk Ercegovac Gerrit-Reviewer: Xinran Tinney Gerrit-Comment-Date: Tue, 13 Feb 2018 00:28:33 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6269: Cherry-pick dependency change for KRPC
Hello Alexey Serbin, Kudu Jenkins, Todd Lipcon, I'd like you to do a code review. Please visit http://gerrit.cloudera.org:8080/9287 to review the following change. Change subject: IMPALA-6269: Cherry-pick dependency change for KRPC .. IMPALA-6269: Cherry-pick dependency change for KRPC Expose RPC method info map and various metrics These changes are needed for IMPALA-6269. Change-Id: I8bda390ea92cceb0d696767402c978a83b386825 Reviewed-on: http://gerrit.cloudera.org:8080/9269 Tested-by: Kudu Jenkins Reviewed-by: Alexey SerbinReviewed-by: Todd Lipcon --- M be/src/kudu/rpc/acceptor_pool.cc M be/src/kudu/rpc/acceptor_pool.h M be/src/kudu/rpc/service_if.h M be/src/kudu/util/metrics.h 4 files changed, 16 insertions(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/87/9287/1 -- To view, visit http://gerrit.cloudera.org:8080/9287 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I8bda390ea92cceb0d696767402c978a83b386825 Gerrit-Change-Number: 9287 Gerrit-PatchSet: 1 Gerrit-Owner: Lars Volker Gerrit-Reviewer: Alexey Serbin Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon
[Impala-ASF-CR] updated PR to include Michael's review comments
njanartha...@cloudera.com has abandoned this change. ( http://gerrit.cloudera.org:8080/9290 ) Change subject: updated PR to include Michael's review comments .. Abandoned fixup commit to another PR -- To view, visit http://gerrit.cloudera.org:8080/9290 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: abandon Gerrit-Change-Id: I38fb22955aa19992674da8071189c7b7ae0968a1 Gerrit-Change-Number: 9290 Gerrit-PatchSet: 1 Gerrit-Owner: njanartha...@cloudera.com
[Impala-ASF-CR] IMPALA-4456: Address scalability issues of qs map lock and client request state map lock
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/8363 ) Change subject: IMPALA-4456: Address scalability issues of qs_map_lock_ and client_request_state_map_lock_ .. Patch Set 8: Code-Review+2 (3 comments) http://gerrit.cloudera.org:8080/#/c/8363/8/be/src/runtime/query-exec-mgr.h File be/src/runtime/query-exec-mgr.h: http://gerrit.cloudera.org:8080/#/c/8363/8/be/src/runtime/query-exec-mgr.h@28 PS8, Line 28: #include "util/spinlock.h" I don't see that used explicitly in this file. Can it be removed? http://gerrit.cloudera.org:8080/#/c/8363/8/be/src/service/impala-server.h File be/src/service/impala-server.h: http://gerrit.cloudera.org:8080/#/c/8363/8/be/src/service/impala-server.h@44 PS8, Line 44: #include "util/spinlock.h" same http://gerrit.cloudera.org:8080/#/c/8363/8/be/src/util/sharded-query-map-util.h File be/src/util/sharded-query-map-util.h: http://gerrit.cloudera.org:8080/#/c/8363/8/be/src/util/sharded-query-map-util.h@95 PS8, Line 95: uint64_t generally we just use the un-sized primitive type (which for impala codebase we default to 'int'). -- To view, visit http://gerrit.cloudera.org:8080/8363 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I61089090e1095da45a8a64ed3ccc78bd310807f1 Gerrit-Change-Number: 8363 Gerrit-PatchSet: 8 Gerrit-Owner: Sailesh MukilGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Tue, 13 Feb 2018 00:10:18 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6497: add "Last row fetched" and AC events
Bikramjeet Vig has posted comments on this change. ( http://gerrit.cloudera.org:8080/9271 ) Change subject: IMPALA-6497: add "Last row fetched" and AC events .. Patch Set 5: Code-Review+1 -- To view, visit http://gerrit.cloudera.org:8080/9271 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I32a707e4660061e75c86ad967f1fac6f6737da7e Gerrit-Change-Number: 9271 Gerrit-PatchSet: 5 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 13 Feb 2018 00:07:52 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4456: Address scalability issues of qs map lock and client request state map lock
Sailesh Mukil has posted comments on this change. ( http://gerrit.cloudera.org:8080/8363 ) Change subject: IMPALA-4456: Address scalability issues of qs_map_lock_ and client_request_state_map_lock_ .. Patch Set 7: (4 comments) http://gerrit.cloudera.org:8080/#/c/8363/7/be/src/util/sharded-query-map-util.h File be/src/util/sharded-query-map-util.h: http://gerrit.cloudera.org:8080/#/c/8363/7/be/src/util/sharded-query-map-util.h@67 PS7, Line 67: struct MapShard { : std::unordered_mapmap_; : SpinLock map_lock_; : }; > you should run a benchmark to verify, but you may want to explicitly align I ran the benchmarks both with and without inheriting CacheLineAligned, and there wasn't a noticeable delta on my machine. But I inherited CacheLineAligned to be more explicit. http://gerrit.cloudera.org:8080/#/c/8363/7/be/src/util/sharded-query-map-util.h@94 PS7, Line 94: uint8_t > see comment below about uint8_t Done http://gerrit.cloudera.org:8080/#/c/8363/7/be/src/util/sharded-query-map-util.h@122 PS7, Line 122: uint8_t > why uint8_t? Since it doesn't live in memory, doesn't seem necessary to di I thought it wouldn't make sense to have more than 255 buckets, but better not to make my own assumptions. I changed it to uint64_t. Do let me know if you had something else in mind. http://gerrit.cloudera.org:8080/#/c/8363/7/be/src/util/sharded-query-map-util.h@129 PS7, Line 129: std::unordered_map * map_; : SpinLock* map_lock_; > This could be just a ShardedQueryMap::MapShard*, but up to you. Done -- To view, visit http://gerrit.cloudera.org:8080/8363 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I61089090e1095da45a8a64ed3ccc78bd310807f1 Gerrit-Change-Number: 8363 Gerrit-PatchSet: 7 Gerrit-Owner: Sailesh Mukil Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Mon, 12 Feb 2018 23:56:49 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4456: Address scalability issues of qs map lock and client request state map lock
Hello Philip Zeyliger, Dan Hecht, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8363 to look at the new patch set (#8). Change subject: IMPALA-4456: Address scalability issues of qs_map_lock_ and client_request_state_map_lock_ .. IMPALA-4456: Address scalability issues of qs_map_lock_ and client_request_state_map_lock_ The following 2 locks have shown to be frequent points of contention on recent perf runs: - qs_map_lock_ - client_request_state_map_lock_ Since these are process wide locks, any threads waiting on these locks potentially slow down the runtime of a query. I tried to address this previously by converting the client_request_state_map_lock_ to a reader-writer lock. This showed great perf improvements in the general case, however, there were edge cases with big regressions as well. In the general case, strict readers of the map got through so quickly that we were able to see a reduction in the number of client connections created, since this lock was contended for in the context of Thrift threads too. The bad case is when writers were starved trying to register a new query since there were so many readers. Changing the starve option resulted in worse read performance all round. Another approach which is relatively simpler is to shard the locks, which proves to be very effective with no regressions. The maps and locks are sharded to a default of 4 buckets initally. Query IDs are created by using boost::uuids::random_generator. We use the high bits of a query ID to assign queries to buckets. I verified that the distribution of the high bits of a query ID are even across buckets on my local machine: For 10,000 queries sharded across 4 buckets, the distribution was: bucket[0]: 2500 bucket[1]: 2489 bucket[2]: 2566 bucket[3]: 2445 A micro-benchmark is added to measure the improvement in performance. This benchmark creates multiple threads each of which creates a QueryState and accesses it multiple times. We can see improvements in the range 2x - 3.5x. BEFORE: --Benchmark 1: Create and access Query States. Total Time (#Queries: 5 #Accesses: 100) : 1ms Total Time (#Queries: 50 #Accesses: 100) : 8ms Total Time (#Queries: 50 #Accesses: 1000) : 54ms Total Time (#Queries: 500 #Accesses: 100) : 76ms Total Time (#Queries: 500 #Accesses: 1000) : 543ms AFTER: --Benchmark 1: Create and access Query States. Total Time (#Queries: 5 #Accesses: 100) : 2173.59K clock cycles Total Time (#Queries: 50 #Accesses: 100) : 4ms Total Time (#Queries: 50 #Accesses: 1000) : 15ms Total Time (#Queries: 500 #Accesses: 100) : 46ms Total Time (#Queries: 500 #Accesses: 1000) : 151ms This change introduces a ShardedQueryMap, which is used to replace the QueryExecMgr::qs_map_ and the ImpalaServer::client_request_state_map_, and their corresponding locks, thereby abstracting away the access to the maps locks. For operations that need to happen on every entry in the ShardedQueryMap maps, a new function ShardedQueryMap::DoFuncForAllEntries() is introduced which takes a user supplied lambda and passes it every individual map entry and executes it. NOTE: This microbenchmark has shown that SpinLock has better performance than boost::mutex for the qs_map_lock_'s, so that change has been made too. TODO: Add benchmark for client_request_state_map_lock_ too. The APIs around that are more complicated, so this patch only includes the benchmarking of qs_map_lock_. TODO 2: Consider adopting the ShardedQueryMapTemplate for the SessionStateMap. Change-Id: I61089090e1095da45a8a64ed3ccc78bd310807f1 --- M be/src/benchmarks/CMakeLists.txt A be/src/benchmarks/process-wide-locks-benchmark.cc M be/src/runtime/query-exec-mgr.cc M be/src/runtime/query-exec-mgr.h M be/src/service/impala-http-handler.cc M be/src/service/impala-server.cc M be/src/service/impala-server.h A be/src/util/sharded-query-map-util.h 8 files changed, 379 insertions(+), 56 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/63/8363/8 -- To view, visit http://gerrit.cloudera.org:8080/8363 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I61089090e1095da45a8a64ed3ccc78bd310807f1 Gerrit-Change-Number: 8363 Gerrit-PatchSet: 8 Gerrit-Owner: Sailesh MukilGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Sailesh Mukil
[Impala-ASF-CR] IMPALA-6116: Bound memory usage of DataStreamSevice's service queue
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/9282 ) Change subject: IMPALA-6116: Bound memory usage of DataStreamSevice's service queue .. Patch Set 1: (2 comments) http://gerrit.cloudera.org:8080/#/c/9282/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/9282/1//COMMIT_MSG@20 PS1, Line 20: this flag is left at 0, in which case it will be set to 20% of process > I was using the assumption of 32KB avg payload size, 500 nodes, 500 fragmen It seems too aggressive for small-to-medium deployments for sure :). It might be a reasonable value for very large deployments with high concurrency, but I think that would require adjusting buffer_pool_limit too. http://gerrit.cloudera.org:8080/#/c/9282/1/be/src/service/data-stream-service.cc File be/src/service/data-stream-service.cc: http://gerrit.cloudera.org:8080/#/c/9282/1/be/src/service/data-stream-service.cc@37 PS1, Line 37: DEFINE_int64(datastream_service_queue_mem_limit, 0, It would be good to use ParseUtil::ParseMemSpec() for this for consistency and convenience - see what is done for --buffer_pool_limit. This lets users specify it as a bytes amount with the usual suffixes or a percentage. -- To view, visit http://gerrit.cloudera.org:8080/9282 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Idea4262dfb0e0aa8d58ff6ea6a8248e880b9 Gerrit-Change-Number: 9282 Gerrit-PatchSet: 1 Gerrit-Owner: Michael HoGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 12 Feb 2018 23:44:57 + Gerrit-HasComments: Yes
[Impala-ASF-CR] updated PR to include Michael's review comments
njanartha...@cloudera.com has uploaded this change for review. ( http://gerrit.cloudera.org:8080/9290 Change subject: updated PR to include Michael's review comments .. updated PR to include Michael's review comments Change-Id: I38fb22955aa19992674da8071189c7b7ae0968a1 --- M bin/mvn-quiet.sh 1 file changed, 15 insertions(+), 8 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/90/9290/1 -- To view, visit http://gerrit.cloudera.org:8080/9290 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I38fb22955aa19992674da8071189c7b7ae0968a1 Gerrit-Change-Number: 9290 Gerrit-PatchSet: 1 Gerrit-Owner: njanartha...@cloudera.com
[Impala-ASF-CR] IMPALA-5440 Add planner tests with extreme statistics values
Xinran Tinney has posted comments on this change. ( http://gerrit.cloudera.org:8080/9065 ) Change subject: IMPALA-5440 Add planner tests with extreme statistics values .. Patch Set 14: (1 comment) http://gerrit.cloudera.org:8080/#/c/9065/14/fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java File fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java: http://gerrit.cloudera.org:8080/#/c/9065/14/fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java@695 PS14, Line 695: if (cardinality < min || cardinality > max){ > * should not return an error for -1 I changed the function testing -1 rows, min = -1, this way, it returns error only cardinality < -1. Is this covering -1 validation? -- To view, visit http://gerrit.cloudera.org:8080/9065 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I86dec47cf1438882cafaec53e97864ccfcdff3cb Gerrit-Change-Number: 9065 Gerrit-PatchSet: 14 Gerrit-Owner: Xinran TinneyGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Vuk Ercegovac Gerrit-Reviewer: Xinran Tinney Gerrit-Comment-Date: Mon, 12 Feb 2018 23:29:53 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6489: use correct template tuple size
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/9288 ) Change subject: IMPALA-6489: use correct template tuple size .. Patch Set 1: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1922/ -- To view, visit http://gerrit.cloudera.org:8080/9288 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I8193c04673f15e5057f457cc8a3a91a8fef64be2 Gerrit-Change-Number: 9288 Gerrit-PatchSet: 1 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Mon, 12 Feb 2018 23:29:38 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3193: Show table's comment on show tables
Dimitris Tsirogiannis has posted comments on this change. ( http://gerrit.cloudera.org:8080/8851 ) Change subject: IMPALA-3193: Show table's comment on show tables .. Patch Set 15: (2 comments) http://gerrit.cloudera.org:8080/#/c/8851/15/fe/src/test/java/org/apache/impala/testutil/ImpaladTestCatalog.java File fe/src/test/java/org/apache/impala/testutil/ImpaladTestCatalog.java: http://gerrit.cloudera.org:8080/#/c/8851/15/fe/src/test/java/org/apache/impala/testutil/ImpaladTestCatalog.java@81 PS15, Line 81: getLoadedTable I think this function should be called loadAndAddTable. The current name implies that you return a table that is already loaded but this function is the one doing the load. http://gerrit.cloudera.org:8080/#/c/8851/15/fe/src/test/java/org/apache/impala/testutil/ImpaladTestCatalog.java@107 PS15, Line 107: /** :* Overrides ImpaladCatalog.getTables to load the table metadata if it is missing. :*/ : @Override : public List getTables(String dbName, PatternMatcher matcher) : throws DatabaseNotFoundException { : List tables = super.getTables(dbName, matcher); : : // The table was not yet loaded. Load it in to the catalog and try getTable() : // again. : for (int i = 0; i < tables.size(); ++i) { : Table table = tables.get(i); : // Table doesn't exist or is already loaded. Just return it. : if (table != null && !table.isLoaded()) { : try { : tables.set(i, getLoadedTable(dbName, table.getName())); : } catch (CatalogException e) { : LOG.error(String.format("Error loading table: %s.%s", dbName, table.getName()), : e); : } : } : } : return tables; : } > During JUnit test, it should be called at "impaladCatalog_.get().getTables( Good point. Thanks for the explanation. -- To view, visit http://gerrit.cloudera.org:8080/8851 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I44f814af05db6f3c027718ade9f474f8b8153bcd Gerrit-Change-Number: 8851 Gerrit-PatchSet: 15 Gerrit-Owner: Kim Jin ChulGerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kim Jin Chul Gerrit-Comment-Date: Mon, 12 Feb 2018 23:26:09 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6489: use correct template tuple size
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/9288 ) Change subject: IMPALA-6489: use correct template tuple size .. Patch Set 1: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/9288 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I8193c04673f15e5057f457cc8a3a91a8fef64be2 Gerrit-Change-Number: 9288 Gerrit-PatchSet: 1 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Comment-Date: Mon, 12 Feb 2018 23:17:53 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3193: Show table's comment on show tables
Kim Jin Chul has posted comments on this change. ( http://gerrit.cloudera.org:8080/8851 ) Change subject: IMPALA-3193: Show table's comment on show tables .. Patch Set 15: (2 comments) http://gerrit.cloudera.org:8080/#/c/8851/15/fe/src/main/java/org/apache/impala/service/MetadataOp.java File fe/src/main/java/org/apache/impala/service/MetadataOp.java: http://gerrit.cloudera.org:8080/#/c/8851/15/fe/src/main/java/org/apache/impala/service/MetadataOp.java@279 PS15, Line 279: for (Table table: fe.getTables(db.getName(), tablePatternMatcher, user)) { : String tabName = table.getName(); > Are we certain that this code doesn't introduce the same infinite wait for I confirmed the problem should be solved with my change. The cause of the problem is JUnit test requires tables loading explicitly if they are not loaded. This is the reason why I introduce a new code at ImpaladTestCatalog.java. http://gerrit.cloudera.org:8080/#/c/8851/15/fe/src/test/java/org/apache/impala/testutil/ImpaladTestCatalog.java File fe/src/test/java/org/apache/impala/testutil/ImpaladTestCatalog.java: http://gerrit.cloudera.org:8080/#/c/8851/15/fe/src/test/java/org/apache/impala/testutil/ImpaladTestCatalog.java@107 PS15, Line 107: /** :* Overrides ImpaladCatalog.getTables to load the table metadata if it is missing. :*/ : @Override : public List getTables(String dbName, PatternMatcher matcher) : throws DatabaseNotFoundException { : List tables = super.getTables(dbName, matcher); : : // The table was not yet loaded. Load it in to the catalog and try getTable() : // again. : for (int i = 0; i < tables.size(); ++i) { : Table table = tables.get(i); : // Table doesn't exist or is already loaded. Just return it. : if (table != null && !table.isLoaded()) { : try { : tables.set(i, getLoadedTable(dbName, table.getName())); : } catch (CatalogException e) { : LOG.error(String.format("Error loading table: %s.%s", dbName, table.getName()), : e); : } : } : } : return tables; : } > Where is the ImpaladTestCatalog::getTables() called? During JUnit test, it should be called at "impaladCatalog_.get().getTables(dbName, matcher)". See https://gerrit.cloudera.org/#/c/8851/15/fe/src/main/java/org/apache/impala/service/Frontend.java at 625. -- To view, visit http://gerrit.cloudera.org:8080/8851 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I44f814af05db6f3c027718ade9f474f8b8153bcd Gerrit-Change-Number: 8851 Gerrit-PatchSet: 15 Gerrit-Owner: Kim Jin ChulGerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kim Jin Chul Gerrit-Comment-Date: Mon, 12 Feb 2018 23:17:02 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6497: add "Last row fetched" and AC events
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/9271 ) Change subject: IMPALA-6497: add "Last row fetched" and AC events .. Patch Set 4: (2 comments) http://gerrit.cloudera.org:8080/#/c/9271/4/tests/query_test/test_observability.py File tests/query_test/test_observability.py: http://gerrit.cloudera.org:8080/#/c/9271/4/tests/query_test/test_observability.py@220 PS4, Line 220: line > would it make sense to print the whole profile here? that way if we hit a p Done http://gerrit.cloudera.org:8080/#/c/9271/4/tests/query_test/test_observability.py@234 PS4, Line 234: assert event in runtime_profile > maybe print the profile here too I switched this test to use the new utility function too. -- To view, visit http://gerrit.cloudera.org:8080/9271 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I32a707e4660061e75c86ad967f1fac6f6737da7e Gerrit-Change-Number: 9271 Gerrit-PatchSet: 4 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 12 Feb 2018 23:12:12 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6116: Bound memory usage of DataStreamSevice's service queue
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/9282 ) Change subject: IMPALA-6116: Bound memory usage of DataStreamSevice's service queue .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/9282/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/9282/1//COMMIT_MSG@20 PS1, Line 20: this flag is left at 0, in which case it will be set to 20% of process > Devoting 20% of the process limit to this seems very high. The buffer pool I was using the assumption of 32KB avg payload size, 500 nodes, 500 fragments per host. Assuming each host has 64GB so process mem limit is 80% of it. The queue consumption translates to about 15% of process mem limit and I bumped it up to 20% just to be safe as broadcast exchange can have payload much larger than 32KB. The consequence of having a value too low is that RPCs may have to be retried multiple times. That said, it sounds like 20% may be too aggressive. Will trimming it down to 5% still too high or should we just hard code a value based on the assumption above ? -- To view, visit http://gerrit.cloudera.org:8080/9282 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Idea4262dfb0e0aa8d58ff6ea6a8248e880b9 Gerrit-Change-Number: 9282 Gerrit-PatchSet: 1 Gerrit-Owner: Michael HoGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 12 Feb 2018 23:03:57 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6489: use correct template tuple size
Alex Behm has posted comments on this change. ( http://gerrit.cloudera.org:8080/9288 ) Change subject: IMPALA-6489: use correct template tuple size .. Patch Set 1: Code-Review+1 -- To view, visit http://gerrit.cloudera.org:8080/9288 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I8193c04673f15e5057f457cc8a3a91a8fef64be2 Gerrit-Change-Number: 9288 Gerrit-PatchSet: 1 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Alex Behm Gerrit-Comment-Date: Mon, 12 Feb 2018 23:02:47 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6489: use correct template tuple size
Tim Armstrong has uploaded this change for review. ( http://gerrit.cloudera.org:8080/9288 Change subject: IMPALA-6489: use correct template tuple size .. IMPALA-6489: use correct template tuple size The bug was that we mixed up the size of the top-level tuple with the size of the nested tuple. The upshot in this case was that the wrong amount of data was memcpy()ed over and we read past the bounds of the original allocation. Testing: TestParquetArrayEncodings.test_avro_primitive_in_list reliably reproduced the problem under ASAN. After the fix it not longer reproduces. Change-Id: I8193c04673f15e5057f457cc8a3a91a8fef64be2 --- M be/src/exec/hdfs-scanner.h 1 file changed, 4 insertions(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/88/9288/1 -- To view, visit http://gerrit.cloudera.org:8080/9288 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I8193c04673f15e5057f457cc8a3a91a8fef64be2 Gerrit-Change-Number: 9288 Gerrit-PatchSet: 1 Gerrit-Owner: Tim Armstrong
[Impala-ASF-CR] IMPALA-6479: Update DESCRIBE to respect column privileges
Alex Behm has posted comments on this change. ( http://gerrit.cloudera.org:8080/9276 ) Change subject: IMPALA-6479: Update DESCRIBE to respect column privileges .. Patch Set 1: (22 comments) http://gerrit.cloudera.org:8080/#/c/9276/1/be/src/service/client-request-state.cc File be/src/service/client-request-state.cc: http://gerrit.cloudera.org:8080/#/c/9276/1/be/src/service/client-request-state.cc@381 PS1, Line 381: params->__isset.table_name ? &(params->table_name) : NULL; NULL -> nullptr http://gerrit.cloudera.org:8080/#/c/9276/1/be/src/service/frontend.h File be/src/service/frontend.h: http://gerrit.cloudera.org:8080/#/c/9276/1/be/src/service/frontend.h@133 PS1, Line 133: Status DescribeTable(const TDescribeOutputStyle::type output_style, Why do we need to change the signature so dramatically? Should it not be sufficient to have: Status DescribeTable(const TDescribeTableParams& params, const TSessionState& session, TDescribeResult* response); http://gerrit.cloudera.org:8080/#/c/9276/1/common/thrift/Frontend.thrift File common/thrift/Frontend.thrift: http://gerrit.cloudera.org:8080/#/c/9276/1/common/thrift/Frontend.thrift@173 PS1, Line 173: 4: optional ImpalaInternalService.TSessionState session I don't think this is needed. The session state is available in the BE during: ClientRequestState::ExecLocalCatalogOp() You can pass 'query_ctx_.session' to frontend_->DescribeTable() http://gerrit.cloudera.org:8080/#/c/9276/1/fe/src/main/java/org/apache/impala/analysis/DescribeTableStmt.java File fe/src/main/java/org/apache/impala/analysis/DescribeTableStmt.java: http://gerrit.cloudera.org:8080/#/c/9276/1/fe/src/main/java/org/apache/impala/analysis/DescribeTableStmt.java@122 PS1, Line 122: resultStruct_ = Path.getTypeAsStruct(path_.destType()); Maybe I'm missing something, but it seems like the following scenario is not authorized properly: create table default.t ( id int, a array> ) User has column-level privileges on 'id', but not on 'a'. DESCRIBE default.t.a That describe should fail, but I think it will succeed. This case needs a test. http://gerrit.cloudera.org:8080/#/c/9276/1/fe/src/main/java/org/apache/impala/catalog/Table.java File fe/src/main/java/org/apache/impala/catalog/Table.java: http://gerrit.cloudera.org:8080/#/c/9276/1/fe/src/main/java/org/apache/impala/catalog/Table.java@489 PS1, Line 489: public StructType getHiveColumnsAsStruct(List columns) { Seems weird to have this as a member, since it's totally non-specific to this Table. How about making this a static function in Column like columnsToStruct() http://gerrit.cloudera.org:8080/#/c/9276/1/fe/src/main/java/org/apache/impala/catalog/Table.java@490 PS1, Line 490: ArrayList fields = Lists.newArrayListWithCapacity(colsByPos_.size()); columns.size() http://gerrit.cloudera.org:8080/#/c/9276/1/fe/src/main/java/org/apache/impala/service/DescribeResultFactory.java File fe/src/main/java/org/apache/impala/service/DescribeResultFactory.java: http://gerrit.cloudera.org:8080/#/c/9276/1/fe/src/main/java/org/apache/impala/service/DescribeResultFactory.java@199 PS1, Line 199: List nonClustered = new ArrayList(table.getNonClusteringColumns()); Will this code work if 'nonClustered' or 'clustered' is empty? http://gerrit.cloudera.org:8080/#/c/9276/1/fe/src/main/java/org/apache/impala/service/DescribeResultFactory.java@200 PS1, Line 200: nonClustered.retainAll(filteredColumns); Concise but slow. I suggest this instead List nonClustered List clustered for (Column c: filteredColumns) { if (table.isClusteringColumn(c) { clustered.add } else { nonClustered.add } } http://gerrit.cloudera.org:8080/#/c/9276/1/fe/src/main/java/org/apache/impala/service/DescribeResultFactory.java@259 PS1, Line 259:* Builds a TDescribeResult for a table. update comment http://gerrit.cloudera.org:8080/#/c/9276/1/fe/src/main/java/org/apache/impala/service/DescribeResultFactory.java@261 PS1, Line 261: public static TDescribeResult buildDescribeMinimalResult(List columns) { buildKuduDescribeMinimalResult() http://gerrit.cloudera.org:8080/#/c/9276/1/fe/src/main/java/org/apache/impala/service/Frontend.java File fe/src/main/java/org/apache/impala/service/Frontend.java: http://gerrit.cloudera.org:8080/#/c/9276/1/fe/src/main/java/org/apache/impala/service/Frontend.java@796 PS1, Line 796: for (Column col: table.getColumnsInHiveOrder()) { Can we do a table-level check first? Checking all columns all the time is pretty expensive. http://gerrit.cloudera.org:8080/#/c/9276/1/fe/src/main/java/org/apache/impala/service/Frontend.java@806 PS1, Line 806: filteredColumns = table.getColumns(); Shouldn't this be getColumnsInHiveOrder()? http://gerrit.cloudera.org:8080/#/c/9276/1/fe/src/main/java/org/apache/impala/service/JniFrontend.java File
[Impala-ASF-CR] IMPALA-4456: Address scalability issues of qs map lock and client request state map lock
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/8363 ) Change subject: IMPALA-4456: Address scalability issues of qs_map_lock_ and client_request_state_map_lock_ .. Patch Set 7: (4 comments) http://gerrit.cloudera.org:8080/#/c/8363/7/be/src/util/sharded-query-map-util.h File be/src/util/sharded-query-map-util.h: http://gerrit.cloudera.org:8080/#/c/8363/7/be/src/util/sharded-query-map-util.h@67 PS7, Line 67: struct MapShard { : std::unordered_mapmap_; : SpinLock map_lock_; : }; you should run a benchmark to verify, but you may want to explicitly align that to a cache line (by inheriting CacheLineAligned). http://gerrit.cloudera.org:8080/#/c/8363/7/be/src/util/sharded-query-map-util.h@94 PS7, Line 94: uint8_t see comment below about uint8_t http://gerrit.cloudera.org:8080/#/c/8363/7/be/src/util/sharded-query-map-util.h@122 PS7, Line 122: uint8_t why uint8_t? Since it doesn't live in memory, doesn't seem necessary to dictate the size. http://gerrit.cloudera.org:8080/#/c/8363/7/be/src/util/sharded-query-map-util.h@129 PS7, Line 129: std::unordered_map * map_; : SpinLock* map_lock_; This could be just a ShardedQueryMap::MapShard*, but up to you. -- To view, visit http://gerrit.cloudera.org:8080/8363 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I61089090e1095da45a8a64ed3ccc78bd310807f1 Gerrit-Change-Number: 8363 Gerrit-PatchSet: 7 Gerrit-Owner: Sailesh Mukil Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Mon, 12 Feb 2018 22:41:53 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6497: add "Last row fetched" and AC events
Bikramjeet Vig has posted comments on this change. ( http://gerrit.cloudera.org:8080/9271 ) Change subject: IMPALA-6497: add "Last row fetched" and AC events .. Patch Set 4: (2 comments) http://gerrit.cloudera.org:8080/#/c/9271/4/tests/query_test/test_observability.py File tests/query_test/test_observability.py: http://gerrit.cloudera.org:8080/#/c/9271/4/tests/query_test/test_observability.py@220 PS4, Line 220: line would it make sense to print the whole profile here? that way if we hit a problem we can see which events actually got printed and in which order. same for the assert below this. http://gerrit.cloudera.org:8080/#/c/9271/4/tests/query_test/test_observability.py@234 PS4, Line 234: assert event in runtime_profile maybe print the profile here too -- To view, visit http://gerrit.cloudera.org:8080/9271 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I32a707e4660061e75c86ad967f1fac6f6737da7e Gerrit-Change-Number: 9271 Gerrit-PatchSet: 4 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 12 Feb 2018 21:29:15 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6497: add "Last row fetched" and AC events
Hello Bikramjeet Vig, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/9271 to look at the new patch set (#4). Change subject: IMPALA-6497: add "Last row fetched" and AC events .. IMPALA-6497: add "Last row fetched" and AC events This makes it more observable that all rows were returned to the client and also that resources were released for admission control. Testing: Manually inspected some query profiles. Added a basic observability test that ensures that the expected events appear in the profile. Ran it in a loop for a bit to make sure it wasn't flaky. Change-Id: I32a707e4660061e75c86ad967f1fac6f6737da7e --- M be/src/runtime/coordinator.cc M tests/query_test/test_observability.py 2 files changed, 42 insertions(+), 2 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/71/9271/4 -- To view, visit http://gerrit.cloudera.org:8080/9271 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I32a707e4660061e75c86ad967f1fac6f6737da7e Gerrit-Change-Number: 9271 Gerrit-PatchSet: 4 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-5139: Update mvn-quiet.sh to print execution content to log file
Michael Brown has posted comments on this change. ( http://gerrit.cloudera.org:8080/9273 ) Change subject: IMPALA-5139: Update mvn-quiet.sh to print execution content to log file .. Patch Set 1: I think this is a good start, but this patch be improved again. I concede too that the IMPALA-5139 description doesn't provide full context. Let me provide some now: - I think mvn-quiet.sh capturing of mvn INFO output should go into one or more files rooted in $IMPALA_LOG_DIR. This will let Jenkins jobs which tend to look in that place collect the mvn logs, too. - The working directory and args matter. You can see that's already captured L25-26 of this file, but I think we need that info alongside all the maven output and be written to the log. - More than one thing calls mvn-quiet.sh. You need a strategy for appending to a log file or writing multiple log files. tee -a might be of assistance here. -- To view, visit http://gerrit.cloudera.org:8080/9273 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I475b17a4dccfa624dda61402491b461c53473f8b Gerrit-Change-Number: 9273 Gerrit-PatchSet: 1 Gerrit-Owner: njanartha...@cloudera.com Gerrit-Reviewer: David KnuppGerrit-Reviewer: Michael Brown Gerrit-Comment-Date: Mon, 12 Feb 2018 20:33:53 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6497: add "Last row fetched" and AC events
Bikramjeet Vig has posted comments on this change. ( http://gerrit.cloudera.org:8080/9271 ) Change subject: IMPALA-6497: add "Last row fetched" and AC events .. Patch Set 3: Looks good! did you miss checking in the changes for the test that check the order of events? -- To view, visit http://gerrit.cloudera.org:8080/9271 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I32a707e4660061e75c86ad967f1fac6f6737da7e Gerrit-Change-Number: 9271 Gerrit-PatchSet: 3 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 12 Feb 2018 20:16:13 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6116: Bound memory usage of DataStreamSevice's service queue
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/9282 ) Change subject: IMPALA-6116: Bound memory usage of DataStreamSevice's service queue .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/9282/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/9282/1//COMMIT_MSG@20 PS1, Line 20: this flag is left at 0, in which case it will be set to 20% of process Devoting 20% of the process limit to this seems very high. The buffer pool is given 80% of the process limit, so this leaves no guaranteed headroom for anything else. I.e. if the service expands up to its limit, it will squeeze out any non-buffer-pool memory from queries and likely cause query failures. It seems to somewhat defeat the purpose of limiting the amount of memory consumed if we're not limiting it low enough to prevent query failures. I'd be interested in learning more about how the number was obtained and what the trade-offs are in setting it to a lower value -- To view, visit http://gerrit.cloudera.org:8080/9282 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Idea4262dfb0e0aa8d58ff6ea6a8248e880b9 Gerrit-Change-Number: 9282 Gerrit-PatchSet: 1 Gerrit-Owner: Michael HoGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 12 Feb 2018 19:59:09 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6258: Uninitialized tuple pointers in row batch for empty rows
Thomas Tauber-Marshall has posted comments on this change. ( http://gerrit.cloudera.org:8080/9239 ) Change subject: IMPALA-6258: Uninitialized tuple pointers in row batch for empty rows .. Patch Set 3: > I don't think there's a JIRA. Filed IMPALA-6501 Great, thanks. -- To view, visit http://gerrit.cloudera.org:8080/9239 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2981227e62eb5971508e0698e189519755de Gerrit-Change-Number: 9239 Gerrit-PatchSet: 3 Gerrit-Owner: Zoltan Borok-NagyGerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Mon, 12 Feb 2018 19:45:25 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6258: Uninitialized tuple pointers in row batch for empty rows
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/9239 ) Change subject: IMPALA-6258: Uninitialized tuple pointers in row batch for empty rows .. Patch Set 3: (4 comments) Logic of the patch looks good to me, did a final pass for stylistic nits. I'll +2 once those are fixed. http://gerrit.cloudera.org:8080/#/c/9239/3/be/src/exec/hdfs-scanner.cc File be/src/exec/hdfs-scanner.cc: http://gerrit.cloudera.org:8080/#/c/9239/3/be/src/exec/hdfs-scanner.cc@228 PS3, Line 228: // Let's not leave tuple ptrs uninitialized. Can you add the JIRA to this comment to better explain the consequences of not doing this? http://gerrit.cloudera.org:8080/#/c/9239/3/be/src/exec/kudu-scanner.cc File be/src/exec/kudu-scanner.cc: http://gerrit.cloudera.org:8080/#/c/9239/3/be/src/exec/kudu-scanner.cc@259 PS3, Line 259: // Initialize tuple ptrs to a dummy non-null value Can you add the JIRA there too? http://gerrit.cloudera.org:8080/#/c/9239/3/be/src/runtime/tuple.h File be/src/runtime/tuple.h: http://gerrit.cloudera.org:8080/#/c/9239/3/be/src/runtime/tuple.h@89 PS3, Line 89: Poison This should either be lower-case if we're treating it as a variable or upper-case if we're treating it as a constant. It's a weird edge case but it feels like it should be a constant to me. http://gerrit.cloudera.org:8080/#/c/9239/3/be/src/runtime/tuple.h@91 PS3, Line 91: void Init(int size) { Unnecessary formatting change here. -- To view, visit http://gerrit.cloudera.org:8080/9239 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2981227e62eb5971508e0698e189519755de Gerrit-Change-Number: 9239 Gerrit-PatchSet: 3 Gerrit-Owner: Zoltan Borok-NagyGerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Mon, 12 Feb 2018 19:48:17 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6416: extend Thread::Create to track instance id
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/9053 ) Change subject: IMPALA-6416: extend Thread::Create to track instance id .. Patch Set 4: (2 comments) http://gerrit.cloudera.org:8080/#/c/9053/4/be/src/common/thread-debug-info.h File be/src/common/thread-debug-info.h: http://gerrit.cloudera.org:8080/#/c/9053/4/be/src/common/thread-debug-info.h@115 PS4, Line 115: system_thread_id_ > I think it is possible that the parent TDI object gets destroyed by the tim Are you saying that thread_debug_info_ pointer may point to stray memory? If that's the case, maybe we shouldn't keep that pointer but instead have a way to traverse TDIs and find it if still exists? http://gerrit.cloudera.org:8080/#/c/9053/4/be/src/common/thread-debug-info.h@119 PS4, Line 119: boost::thread::id boost_thread_id_; : int64_t system_thread_id_ = 0; > boost::thread::id is less prone to being recycled, but I think this propert I'm not sure what you mean by "switch to the parent thread quickly". I don't think I fully understand the tradeoffs. Just seems confusing to have multiple IDs and I don't understand how each of these fields helps in debugging. Perhaps some comments attached to them explaining how one can use them for debugging would help? -- To view, visit http://gerrit.cloudera.org:8080/9053 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I27de2962cf0b224c17b685d77dcba3bf2e9db187 Gerrit-Change-Number: 9053 Gerrit-PatchSet: 4 Gerrit-Owner: Zoltan Borok-NagyGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Mon, 12 Feb 2018 19:45:40 + Gerrit-HasComments: Yes
[Impala-ASF-CR] Add KuduDataTypeToColumnType default case
Grant Henke has abandoned this change. ( http://gerrit.cloudera.org:8080/9283 ) Change subject: Add KuduDataTypeToColumnType default case .. Abandoned -- To view, visit http://gerrit.cloudera.org:8080/9283 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: abandon Gerrit-Change-Id: I8f7d9e16070b66274107a2bff52cbe8f3ae9b8f1 Gerrit-Change-Number: 9283 Gerrit-PatchSet: 1 Gerrit-Owner: Grant HenkeGerrit-Reviewer: Grant Henke Gerrit-Reviewer: Thomas Tauber-Marshall
[Impala-ASF-CR] IMPALA-6258: Uninitialized tuple pointers in row batch for empty rows
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/9239 ) Change subject: IMPALA-6258: Uninitialized tuple pointers in row batch for empty rows .. Patch Set 3: I don't think there's a JIRA. Filed IMPALA-6501 -- To view, visit http://gerrit.cloudera.org:8080/9239 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2981227e62eb5971508e0698e189519755de Gerrit-Change-Number: 9239 Gerrit-PatchSet: 3 Gerrit-Owner: Zoltan Borok-NagyGerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Mon, 12 Feb 2018 19:41:51 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6497: add "Last row fetched" and AC events
Hello Bikramjeet Vig, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/9271 to look at the new patch set (#3). Change subject: IMPALA-6497: add "Last row fetched" and AC events .. IMPALA-6497: add "Last row fetched" and AC events This makes it more observable that all rows were returned to the client and also that resources were released for admission control. Testing: Manually inspected some query profiles. Added a basic observability test that ensures that the expected events appear in the profile. Ran it in a loop for a bit to make sure it wasn't flaky. Change-Id: I32a707e4660061e75c86ad967f1fac6f6737da7e --- M be/src/runtime/coordinator.cc M tests/query_test/test_observability.py 2 files changed, 23 insertions(+), 2 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/71/9271/3 -- To view, visit http://gerrit.cloudera.org:8080/9271 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I32a707e4660061e75c86ad967f1fac6f6737da7e Gerrit-Change-Number: 9271 Gerrit-PatchSet: 3 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Bikramjeet Vig
[Impala-ASF-CR] Add KuduDataTypeToColumnType default case
Thomas Tauber-Marshall has posted comments on this change. ( http://gerrit.cloudera.org:8080/9283 ) Change subject: Add KuduDataTypeToColumnType default case .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/9283/1/be/src/exec/kudu-util.cc File be/src/exec/kudu-util.cc: http://gerrit.cloudera.org:8080/#/c/9283/1/be/src/exec/kudu-util.cc@196 PS1, Line 196: default: return ColumnType(PrimitiveType::INVALID_TYPE); > It didn't appear intentional given there was a "fall through" to a `return I believe the "return INVALID_TYPE" is required by at least some compilers (even though of course as you say it can never be executed). If not, we can remove it, or else leave a comment. When going the other direction, Impala type -> Kudu type, its genuinely non-exhaustive (i.e. there are types in Impala that have no corresponding type in Kudu, but all Kudu types have a corresponding Impala type), so the default makes more sense. If there are any other places where we're going Kudu type -> Impala type where a 'default' is used, I would argue we should remove those defaults. Its also a little different for the FE, because we don't have as much control over when maven starts pulling in the new Kudu client, since it happens automatically, but for the BE we do have control. -- To view, visit http://gerrit.cloudera.org:8080/9283 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I8f7d9e16070b66274107a2bff52cbe8f3ae9b8f1 Gerrit-Change-Number: 9283 Gerrit-PatchSet: 1 Gerrit-Owner: Grant HenkeGerrit-Reviewer: Grant Henke Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Comment-Date: Mon, 12 Feb 2018 19:32:23 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5152: Introduce metadata loading phase
Hello Bharath Vissapragada, Dimitris Tsirogiannis, Vuk Ercegovac, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8958 to look at the new patch set (#3). Change subject: IMPALA-5152: Introduce metadata loading phase .. IMPALA-5152: Introduce metadata loading phase Reworks the collection and loading of missing metadata when compiling a statement. Introduces a new metadata-loading phase between parsing and analysis. Summary of the new compilation flow: 1. Parse statement. 2. Collect all table references from the parsed statement and generate a list of tables that need to be loaded for analysis to succeed. 3. Request missing metadata and wait for it to arrive. As views become loaded we expand the set of required tables based on the view definitions. This step populates a statement-local table cache that contains all loaded tables relevant to the statement. 4. Create a new Analyzer with the table cache and analyze the statement. During analysis only the table cache is consulted for table metadata, the ImpaladCatalog is not used for that purpose anymore. 5. Authorize the statement. 6. Plan generation as usual. The intent of the existing code was to collect all tables missing metadata during analysis, load the metadata, and then re-analyze the statement (and repeat those steps until all metadata is loaded). Unfortunately, the relevant code was hard-to-follow, subtle and not well tested, and therefore it was broken in several ways over the course of time. For example, the introduction of path analysis for nested types subtly broke the intended behavior, and there are other similar examples. The serial table loading observed in the JIRA was caused by the following code in the resolution of table references: for (all path interpretations) { try { // Try to resolve the path; might call getTable() which // throws for nonexistent tables. } catch (AnalysisException e) { if (analyzer.hasMissingTbls()) throw e; } } The following example illustrates the problem: SELECT * FROM a.b, x.y When resolving the path "a.b" we consider that "a" could be a database or a table. Similarly, "b" could be a table or a nested collection. If the path resolution for "a.b" adds a missing table entry, then the path resolution for "x.y" could exit prematurely, without trying the other path interpretations that would lead to adding the expected missing table. So effectively, the tables end up being loaded one-by-one. Testing: - A core/hdfs run succeeded - No new tests were added because the existing functional tests provide good coverage of various metadata loading scenarios. - The issue reported in IMPALA-5152 is basically impossible now. Adding FE unit tests for that bug specifically would require ugly changes to the new code to enable such testing. Change-Id: I68d32d5acd4a6f6bc6cedb05e6cc5cf604d24a55 --- M fe/src/main/java/org/apache/impala/analysis/AlterTableStmt.java M fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java M fe/src/main/java/org/apache/impala/analysis/Analyzer.java M fe/src/main/java/org/apache/impala/analysis/AuthorizationStmt.java M fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java M fe/src/main/java/org/apache/impala/analysis/CreateDbStmt.java M fe/src/main/java/org/apache/impala/analysis/CreateFunctionStmtBase.java M fe/src/main/java/org/apache/impala/analysis/CreateOrAlterViewStmtBase.java M fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java M fe/src/main/java/org/apache/impala/analysis/CreateTableLikeStmt.java M fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java M fe/src/main/java/org/apache/impala/analysis/DescribeDbStmt.java M fe/src/main/java/org/apache/impala/analysis/DescribeTableStmt.java M fe/src/main/java/org/apache/impala/analysis/DropStatsStmt.java M fe/src/main/java/org/apache/impala/analysis/DropTableOrViewStmt.java M fe/src/main/java/org/apache/impala/analysis/FromClause.java M fe/src/main/java/org/apache/impala/analysis/GrantRevokePrivStmt.java M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java M fe/src/main/java/org/apache/impala/analysis/LimitElement.java M fe/src/main/java/org/apache/impala/analysis/LoadDataStmt.java M fe/src/main/java/org/apache/impala/analysis/ModifyStmt.java M fe/src/main/java/org/apache/impala/analysis/Path.java M fe/src/main/java/org/apache/impala/analysis/PrivilegeSpec.java M fe/src/main/java/org/apache/impala/analysis/QueryStmt.java M fe/src/main/java/org/apache/impala/analysis/ResetMetadataStmt.java M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java M fe/src/main/java/org/apache/impala/analysis/SetStmt.java M fe/src/main/java/org/apache/impala/analysis/ShowCreateFunctionStmt.java M fe/src/main/java/org/apache/impala/analysis/ShowCreateTableStmt.java M fe/src/main/java/org/apache/impala/analysis/ShowFilesStmt.java M
[Impala-ASF-CR] IMPALA-3193: Show table's comment on show tables
Dimitris Tsirogiannis has posted comments on this change. ( http://gerrit.cloudera.org:8080/8851 ) Change subject: IMPALA-3193: Show table's comment on show tables .. Patch Set 15: (2 comments) http://gerrit.cloudera.org:8080/#/c/8851/15/fe/src/main/java/org/apache/impala/service/MetadataOp.java File fe/src/main/java/org/apache/impala/service/MetadataOp.java: http://gerrit.cloudera.org:8080/#/c/8851/15/fe/src/main/java/org/apache/impala/service/MetadataOp.java@279 PS15, Line 279: for (Table table: fe.getTables(db.getName(), tablePatternMatcher, user)) { : String tabName = table.getName(); Are we certain that this code doesn't introduce the same infinite wait for metadata load problem you run into during testing? http://gerrit.cloudera.org:8080/#/c/8851/15/fe/src/test/java/org/apache/impala/testutil/ImpaladTestCatalog.java File fe/src/test/java/org/apache/impala/testutil/ImpaladTestCatalog.java: http://gerrit.cloudera.org:8080/#/c/8851/15/fe/src/test/java/org/apache/impala/testutil/ImpaladTestCatalog.java@107 PS15, Line 107: /** :* Overrides ImpaladCatalog.getTables to load the table metadata if it is missing. :*/ : @Override : public List getTables(String dbName, PatternMatcher matcher) : throws DatabaseNotFoundException { : List tables = super.getTables(dbName, matcher); : : // The table was not yet loaded. Load it in to the catalog and try getTable() : // again. : for (int i = 0; i < tables.size(); ++i) { : Table table = tables.get(i); : // Table doesn't exist or is already loaded. Just return it. : if (table != null && !table.isLoaded()) { : try { : tables.set(i, getLoadedTable(dbName, table.getName())); : } catch (CatalogException e) { : LOG.error(String.format("Error loading table: %s.%s", dbName, table.getName()), : e); : } : } : } : return tables; : } Where is the ImpaladTestCatalog::getTables() called? -- To view, visit http://gerrit.cloudera.org:8080/8851 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I44f814af05db6f3c027718ade9f474f8b8153bcd Gerrit-Change-Number: 8851 Gerrit-PatchSet: 15 Gerrit-Owner: Kim Jin ChulGerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kim Jin Chul Gerrit-Comment-Date: Mon, 12 Feb 2018 18:42:43 + Gerrit-HasComments: Yes
[Impala-ASF-CR] Add KuduDataTypeToColumnType default case
Thomas Tauber-Marshall has posted comments on this change. ( http://gerrit.cloudera.org:8080/9283 ) Change subject: Add KuduDataTypeToColumnType default case .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/9283/1/be/src/exec/kudu-util.cc File be/src/exec/kudu-util.cc: http://gerrit.cloudera.org:8080/#/c/9283/1/be/src/exec/kudu-util.cc@196 PS1, Line 196: default: return ColumnType(PrimitiveType::INVALID_TYPE); Excluding the default here was an intentional choice so that the compiler will tell us if we've missed something. I understand that without the 'default' we'll get a compiler error when we bump the toolchain version to pull in the new Kudu client, but I would prefer that we just include a 'case DECIMAL' in that same patch, rather than losing the compiler check here. -- To view, visit http://gerrit.cloudera.org:8080/9283 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I8f7d9e16070b66274107a2bff52cbe8f3ae9b8f1 Gerrit-Change-Number: 9283 Gerrit-PatchSet: 1 Gerrit-Owner: Grant HenkeGerrit-Reviewer: Grant Henke Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Comment-Date: Mon, 12 Feb 2018 18:38:46 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6077: remove Parquet BIT PACKED def level support
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/9241 ) Change subject: IMPALA-6077: remove Parquet BIT_PACKED def level support .. Patch Set 6: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1920/ -- To view, visit http://gerrit.cloudera.org:8080/9241 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I12c75b7f162dd7de8e26cf31be142b692e3624ae Gerrit-Change-Number: 9241 Gerrit-PatchSet: 6 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 12 Feb 2018 18:05:56 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6077: remove Parquet BIT PACKED def level support
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/9241 ) Change subject: IMPALA-6077: remove Parquet BIT_PACKED def level support .. Patch Set 6: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/9241 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I12c75b7f162dd7de8e26cf31be142b692e3624ae Gerrit-Change-Number: 9241 Gerrit-PatchSet: 6 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 12 Feb 2018 18:05:47 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6204: Remove external DataSource
Philip Zeyliger has abandoned this change. ( http://gerrit.cloudera.org:8080/9192 ) Change subject: IMPALA-6204: Remove external DataSource .. Abandoned Per conversation on user@, folks are interested in keeping this around. -- To view, visit http://gerrit.cloudera.org:8080/9192 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: abandon Gerrit-Change-Id: I02a3a6740466ed7372b71d948c705b30886dcfb6 Gerrit-Change-Number: 9192 Gerrit-PatchSet: 7 Gerrit-Owner: Philip ZeyligerGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Zach Amsden
[Impala-ASF-CR] IMPALA-6497: add "Last row fetched" and AC events
Bikramjeet Vig has posted comments on this change. ( http://gerrit.cloudera.org:8080/9271 ) Change subject: IMPALA-6497: add "Last row fetched" and AC events .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/9271/2/be/src/runtime/coordinator.cc File be/src/runtime/coordinator.cc: http://gerrit.cloudera.org:8080/#/c/9271/2/be/src/runtime/coordinator.cc@865 PS2, Line 865: returned_all_results_ = true; should'nt "Last row fetched" come directly after this? WaitForBackendCompletion can add unrelated time to this. -- To view, visit http://gerrit.cloudera.org:8080/9271 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I32a707e4660061e75c86ad967f1fac6f6737da7e Gerrit-Change-Number: 9271 Gerrit-PatchSet: 2 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Bikramjeet Vig Gerrit-Comment-Date: Mon, 12 Feb 2018 17:54:35 + Gerrit-HasComments: Yes
[Impala-ASF-CR] Add KuduDataTypeToColumnType default case
Grant Henke has posted comments on this change. ( http://gerrit.cloudera.org:8080/9283 ) Change subject: Add KuduDataTypeToColumnType default case .. Patch Set 1: Test job running here: https://jenkins.impala.io/job/pre-review-test/113/ -- To view, visit http://gerrit.cloudera.org:8080/9283 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I8f7d9e16070b66274107a2bff52cbe8f3ae9b8f1 Gerrit-Change-Number: 9283 Gerrit-PatchSet: 1 Gerrit-Owner: Grant HenkeGerrit-Reviewer: Grant Henke Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Comment-Date: Mon, 12 Feb 2018 17:37:16 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6258: Uninitialized tuple pointers in row batch for empty rows
Thomas Tauber-Marshall has posted comments on this change. ( http://gerrit.cloudera.org:8080/9239 ) Change subject: IMPALA-6258: Uninitialized tuple pointers in row batch for empty rows .. Patch Set 3: (1 comment) http://gerrit.cloudera.org:8080/#/c/9239/2//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/9239/2//COMMIT_MSG@57 PS2, Line 57: | 03:AGGREGATE | 1 | 129.01us | 129.01us | 1 | 1 | 28.00 KB | 10.00 MB | FINALIZE| > Yes, I measured it against a debug build... With the new numbers, I think that this seems okay. Is there a JIRA for the count(*) optimization for Kudu? (and if not, could you file one?) -- To view, visit http://gerrit.cloudera.org:8080/9239 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2981227e62eb5971508e0698e189519755de Gerrit-Change-Number: 9239 Gerrit-PatchSet: 3 Gerrit-Owner: Zoltan Borok-NagyGerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Mon, 12 Feb 2018 17:25:27 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5440 Add planner tests with extreme statistics values
Xinran Tinney has posted comments on this change. ( http://gerrit.cloudera.org:8080/9065 ) Change subject: IMPALA-5440 Add planner tests with extreme statistics values .. Patch Set 14: (16 comments) http://gerrit.cloudera.org:8080/#/c/9065/13/fe/src/test/java/org/apache/impala/planner/PlannerTest.java File fe/src/test/java/org/apache/impala/planner/PlannerTest.java: http://gerrit.cloudera.org:8080/#/c/9065/13/fe/src/test/java/org/apache/impala/planner/PlannerTest.java@509 PS13, Line 509: public void testCardinalityOverflow() throws ImpalaException { > Please clean up / condense the SQL to make it more readable. I pointing out Done http://gerrit.cloudera.org:8080/#/c/9065/13/fe/src/test/java/org/apache/impala/planner/PlannerTest.java@510 PS13, Line 510: String tblName = "tpch.cardinality_overflow"; > Would be great to have individual clauses as building blocks to make the CR Done http://gerrit.cloudera.org:8080/#/c/9065/13/fe/src/test/java/org/apache/impala/planner/PlannerTest.java@517 PS13, Line 517: + "l_comment STRING" > clause not needed Done http://gerrit.cloudera.org:8080/#/c/9065/13/fe/src/test/java/org/apache/impala/planner/PlannerTest.java@518 PS13, Line 518: + ")"; > clause not needed Done http://gerrit.cloudera.org:8080/#/c/9065/13/fe/src/test/java/org/apache/impala/planner/PlannerTest.java@519 PS13, Line 519: String tblLocation = "LOCATION " > clause not needed Done http://gerrit.cloudera.org:8080/#/c/9065/13/fe/src/test/java/org/apache/impala/planner/PlannerTest.java@521 PS13, Line 521: String tblPropsTemplate = "TBLPROPERTIES('numRows'='%s')"; > For TBLPROPERTIES, we only care about the 'numRows'. All other keys can be Done http://gerrit.cloudera.org:8080/#/c/9065/13/fe/src/test/java/org/apache/impala/planner/PlannerTest.java@524 PS13, Line 524: addTestTable(String.format("CREATE EXTERNAL TABLE %s %s %s %s;", > numFiles property not needed Done http://gerrit.cloudera.org:8080/#/c/9065/13/fe/src/test/java/org/apache/impala/planner/PlannerTest.java@530 PS13, Line 530: + "tpch.cardinality_overflow b, tpch.cardinality_overflow c"; > extra space at the end Done http://gerrit.cloudera.org:8080/#/c/9065/13/fe/src/test/java/org/apache/impala/planner/PlannerTest.java@541 PS13, Line 541: query = "select l_shipmode from tpch.cardinality_overflow " > use unqualified column references where possible: Done http://gerrit.cloudera.org:8080/#/c/9065/13/fe/src/test/java/org/apache/impala/planner/PlannerTest.java@550 PS13, Line 550: > fits in previous line? Done http://gerrit.cloudera.org:8080/#/c/9065/13/fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java File fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java: http://gerrit.cloudera.org:8080/#/c/9065/13/fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java@783 PS13, Line 783: ueryFileParser.parseFile(); : StringBuilder errorLog = new StringBuilder(); : for (TestCase testCase : queryFileParser.getTestCases()) { : actualOutput.append(testCas > easier: Plans 'query' and fails if estimated cardinalities are not within t Done http://gerrit.cloudera.org:8080/#/c/9065/13/fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java@788 PS13, Line 788: try { > Would be great to run this for all existing planner tests list like checkLi Done http://gerrit.cloudera.org:8080/#/c/9065/13/fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java@799 PS13, Line 799: try { > style: indent 2 Done http://gerrit.cloudera.org:8080/#/c/9065/13/fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java@805 PS13, Line 805: errorLog.append("Unable to create output file: " + e.getMessage()); > style: space after if Done http://gerrit.cloudera.org:8080/#/c/9065/13/fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java@805 PS13, Line 805: rLog.append("Unable to create output file: " + e.getMessage()); : } > nit: spacing Done http://gerrit.cloudera.org:8080/#/c/9065/13/fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java@809 PS13, Line 809: if (errorLog.length() != 0) { > * in general a cardinality of -1 means "unknown" and is also valid; I think Done -- To view, visit http://gerrit.cloudera.org:8080/9065 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I86dec47cf1438882cafaec53e97864ccfcdff3cb Gerrit-Change-Number: 9065 Gerrit-PatchSet: 14 Gerrit-Owner: Xinran TinneyGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Vuk Ercegovac Gerrit-Reviewer: Xinran Tinney
[Impala-ASF-CR] Add KuduDataTypeToColumnType default case
Grant Henke has uploaded this change for review. ( http://gerrit.cloudera.org:8080/9283 Change subject: Add KuduDataTypeToColumnType default case .. Add KuduDataTypeToColumnType default case Given the Impala build breaks when a non-exhaustive enum switch statement is used, there is no way the KuduDataTypeToColumnType switch statement could fall through to return `ColumnType(PrimitiveType::INVALID_TYPE);` and still compile. Instead, this should be moved to the default case in the switch statment. Because Kudu is adding a new Decimal type, this will prevent a compilation failure if/when the Kudu version is upgraded as well. Change-Id: I8f7d9e16070b66274107a2bff52cbe8f3ae9b8f1 --- M be/src/exec/kudu-util.cc 1 file changed, 1 insertion(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/83/9283/1 -- To view, visit http://gerrit.cloudera.org:8080/9283 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I8f7d9e16070b66274107a2bff52cbe8f3ae9b8f1 Gerrit-Change-Number: 9283 Gerrit-PatchSet: 1 Gerrit-Owner: Grant HenkeGerrit-Reviewer: Thomas Tauber-Marshall
[Impala-ASF-CR] IMPALA-6258: Uninitialized tuple pointers in row batch for empty rows
Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/9239 ) Change subject: IMPALA-6258: Uninitialized tuple pointers in row batch for empty rows .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/9239/2//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/9239/2//COMMIT_MSG@57 PS2, Line 57: 00:SCAN KUDU3 90.856ms 107.409ms 6.00M 6.00M 512.00 KB 0 tpch_kudu.lineitem > I tried to do a similar experiment with a larger Kudu scale factor (I creat Yes, I measured it against a debug build... Re-run the query against release versions of Impala and now the difference is much smaller. Updated the commit with the new data. -- To view, visit http://gerrit.cloudera.org:8080/9239 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2981227e62eb5971508e0698e189519755de Gerrit-Change-Number: 9239 Gerrit-PatchSet: 2 Gerrit-Owner: Zoltan Borok-NagyGerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Mon, 12 Feb 2018 16:26:03 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6258: Uninitialized tuple pointers in row batch for empty rows
Hello Thomas Tauber-Marshall, Tim Armstrong, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/9239 to look at the new patch set (#3). Change subject: IMPALA-6258: Uninitialized tuple pointers in row batch for empty rows .. IMPALA-6258: Uninitialized tuple pointers in row batch for empty rows Tuple pointers in the generated row batches may not be initialized if a tuple has byte size 0. There are some codes which compare these uninitialized pointers against nullptr so having them uninitialized may return wrong (and non-deterministic) results, e.g.: impala::TupleIsNullPredicate::GetBooleanVal() The following query produces non-deterministic results currently: SELECT count(v.x) FROM functional.alltypestiny t3 LEFT OUTER JOIN ( SELECT true AS x FROM functional.alltypestiny t1 LEFT OUTER JOIN functional.alltypestiny t2 ON (true)) v ON (v.x = t3.bool_col) WHERE t3.bool_col = true; The alltypestiny table has 8 records, 4 records of them has the true boolean value for bool_col. Therefore, the above query is a fancy way of saying "8 * 8 * 4", i.e. it should return 256. The solution is that scanners initialize tuple pointers to a non-null value when there are no materialized slots. This non-null value is provided by the new static method called Tuple::Poison(). I extended QueryTest/scanners.test with the above query. This test executes the query against all table formats. This change has the biggest performance impact on count(*) queries on large kudu tables. For my quick benchmark I copied tpch_kudu.lineitem and doubled its data. The resulting table has 12,002,430 rows. Without this patch 'select count(*) from biglineitem' runs for ~0.12s. With the patch applied, the overhead is around a dozens of ms. I measured the query on my desktop PC using a relase build of Impala. On debug builds, the execution time of the patched version is around 160% of the original version. Without this patch: +--++--+--+++---+---+-+ | Operator | #Hosts | Avg Time | Max Time | #Rows | Est. #Rows | Peak Mem | Est. Peak Mem | Detail | +--++--+--+++---+---+-+ | 03:AGGREGATE | 1 | 127.50us | 127.50us | 1 | 1 | 28.00 KB | 10.00 MB | FINALIZE| | 02:EXCHANGE | 1 | 22.32ms | 22.32ms | 3 | 1 | 0 B | 0 B | UNPARTITIONED | | 01:AGGREGATE | 3 | 1.78ms | 1.89ms | 3 | 1 | 16.00 KB | 10.00 MB | | | 00:SCAN KUDU | 3 | 8.00ms | 8.28ms | 12.00M | -1 | 512.00 KB | 0 B | default.biglineitem | +--++--+--+++---+---+-+ With this patch: +--++--+--+++---+---+-+ | Operator | #Hosts | Avg Time | Max Time | #Rows | Est. #Rows | Peak Mem | Est. Peak Mem | Detail | +--++--+--+++---+---+-+ | 03:AGGREGATE | 1 | 129.01us | 129.01us | 1 | 1 | 28.00 KB | 10.00 MB | FINALIZE| | 02:EXCHANGE | 1 | 33.00ms | 33.00ms | 3 | 1 | 0 B | 0 B | UNPARTITIONED | | 01:AGGREGATE | 3 | 1.99ms | 2.13ms | 3 | 1 | 16.00 KB | 10.00 MB | | | 00:SCAN KUDU | 3 | 13.13ms | 13.97ms | 12.00M | -1 | 512.00 KB | 0 B | default.biglineitem | +--++--+--+++---+---+-+ Change-Id: I2981227e62eb5971508e0698e189519755de --- M be/src/exec/hdfs-scanner.cc M be/src/exec/kudu-scanner.cc M be/src/runtime/tuple.cc M be/src/runtime/tuple.h M testdata/workloads/functional-query/queries/QueryTest/scanners.test 5 files changed, 30 insertions(+), 3 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/39/9239/3 -- To view, visit http://gerrit.cloudera.org:8080/9239 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I2981227e62eb5971508e0698e189519755de Gerrit-Change-Number: 9239 Gerrit-PatchSet: 3 Gerrit-Owner: Zoltan Borok-NagyGerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy
[Impala-ASF-CR] IMPALA-6416: extend Thread::Create to track instance id
Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/9053 ) Change subject: IMPALA-6416: extend Thread::Create to track instance id .. Patch Set 4: (2 comments) http://gerrit.cloudera.org:8080/#/c/9053/4/be/src/common/thread-debug-info.h File be/src/common/thread-debug-info.h: http://gerrit.cloudera.org:8080/#/c/9053/4/be/src/common/thread-debug-info.h@115 PS4, Line 115: system_thread_id_ > if that matches thread_debug_info_->system_thread_id_, why store it here al I think it is possible that the parent TDI object gets destroyed by the time the minidump is generated, but the parent thread is still running. Maybe the parent thread is a member of a thread pool, and it didn't wait for its child thread to finish and now it is already working on something else. From the system thread id at least we'll know which thread pool created the child thread. http://gerrit.cloudera.org:8080/#/c/9053/4/be/src/common/thread-debug-info.h@119 PS4, Line 119: boost::thread::id boost_thread_id_; : int64_t system_thread_id_ = 0; > are these related? are they both needed for debugging or is it possible to boost::thread::id is less prone to being recycled, but I think this property doesn't add much value currently. If we had stored the boost::thread::id in ParentInfo, then we could have some use of it. E.g. if the parent TDI is invalid we could use the system thread id to switch to the parent thread quickly (if it's still running), then with some luck we can check the boost thread id (maybe with the help of a new TDI), and from that we could tell if this thread is really the same thread that created the child. So, we have two options here, delete the use of boost::thread::id, or use it more extensively by adding it to ParentInfo as well. Do you have a preference? -- To view, visit http://gerrit.cloudera.org:8080/9053 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I27de2962cf0b224c17b685d77dcba3bf2e9db187 Gerrit-Change-Number: 9053 Gerrit-PatchSet: 4 Gerrit-Owner: Zoltan Borok-NagyGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Mon, 12 Feb 2018 13:35:21 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-3193: Show table's comment on show tables
Kim Jin Chul has posted comments on this change. ( http://gerrit.cloudera.org:8080/8851 ) Change subject: IMPALA-3193: Show table's comment on show tables .. Patch Set 14: (1 comment) Applied the update. I realized that table loading should be required if a table has not been loaded on JUnit test. In getTables, some of tables can be IncompleteTable. Thus, these tables should be loaded explicitly. Please see my change in ImpaladTestCatalog. http://gerrit.cloudera.org:8080/#/c/8851/14/fe/src/main/java/org/apache/impala/service/MetadataOp.java File fe/src/main/java/org/apache/impala/service/MetadataOp.java: http://gerrit.cloudera.org:8080/#/c/8851/14/fe/src/main/java/org/apache/impala/service/MetadataOp.java@283 PS14, Line 283: else { : continue; : } > Are you sure this is correct? If table is loaded and is not an instance of Sorry, I should have thought it deeply. -- To view, visit http://gerrit.cloudera.org:8080/8851 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I44f814af05db6f3c027718ade9f474f8b8153bcd Gerrit-Change-Number: 8851 Gerrit-PatchSet: 14 Gerrit-Owner: Kim Jin ChulGerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kim Jin Chul Gerrit-Comment-Date: Mon, 12 Feb 2018 12:42:54 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-3193: Show table's comment on show tables
Hello Dimitris Tsirogiannis, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8851 to look at the new patch set (#15). Change subject: IMPALA-3193: Show table's comment on show tables .. IMPALA-3193: Show table's comment on show tables Currently SHOW TABLES command does not display given comment when creating a table. DESC shows column's comment and SHOW DATABASES also shows database's comment. On this change, table's comment is displayed if the table is loaded. If the table is not loaded, an empty comment is returned. Testing: Adds E2E test: TestShowTables.test_table_comment_visibility Change-Id: I44f814af05db6f3c027718ade9f474f8b8153bcd --- M be/src/catalog/catalog-server.cc M be/src/catalog/catalog.cc M be/src/catalog/catalog.h M be/src/service/client-request-state.cc M be/src/service/client-request-state.h M be/src/service/frontend.cc M be/src/service/frontend.h M be/src/service/impala-http-handler.cc M be/src/service/impala-server.cc M common/thrift/Frontend.thrift M fe/src/main/java/org/apache/impala/catalog/Catalog.java M fe/src/main/java/org/apache/impala/catalog/Table.java M fe/src/main/java/org/apache/impala/service/Frontend.java M fe/src/main/java/org/apache/impala/service/JniCatalog.java M fe/src/main/java/org/apache/impala/service/JniFrontend.java M fe/src/main/java/org/apache/impala/service/MetadataOp.java M fe/src/test/java/org/apache/impala/testutil/ImpaladTestCatalog.java M testdata/workloads/functional-query/queries/QueryTest/show.test M tests/metadata/test_metadata_query_statements.py M www/catalog.tmpl 20 files changed, 338 insertions(+), 218 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/51/8851/15 -- To view, visit http://gerrit.cloudera.org:8080/8851 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I44f814af05db6f3c027718ade9f474f8b8153bcd Gerrit-Change-Number: 8851 Gerrit-PatchSet: 15 Gerrit-Owner: Kim Jin ChulGerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kim Jin Chul
[Impala-ASF-CR] IMPALA-5931: Generates scan ranges in planner for s3/adls
Vuk Ercegovac has posted comments on this change. ( http://gerrit.cloudera.org:8080/8523 ) Change subject: IMPALA-5931: Generates scan ranges in planner for s3/adls .. Patch Set 9: (14 comments) main changes: (1) minimized changes to the main scheduler method with an iterator, (2) added backend tests http://gerrit.cloudera.org:8080/#/c/8523/9//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/8523/9//COMMIT_MSG@30 PS9, Line 30: Testing: > Do we have a test that covers mixed queries? Have you considered adding a t thx for the suggestions. added tests to scheduler-test.cc. there are end-to-end tests for testing multiple file systems in the same table/query (see testdata/workloads/functional-query/queries/QueryTest/multiple-filesystems.test) http://gerrit.cloudera.org:8080/#/c/8523/9/be/src/scheduling/scheduler.h File be/src/scheduling/scheduler.h: http://gerrit.cloudera.org:8080/#/c/8523/9/be/src/scheduling/scheduler.h@450 PS9, Line 450: instances > nit: instance Done http://gerrit.cloudera.org:8080/#/c/8523/9/be/src/scheduling/scheduler.cc File be/src/scheduling/scheduler.cc: http://gerrit.cloudera.org:8080/#/c/8523/9/be/src/scheduling/scheduler.cc@257 PS9, Line 257: int num_ranges = 0; > Have you considered generating the scan ranges here and passing them into C I tried something like this, but it created two pases so I backed it out. After discussing, so that we avoid additional complexity to the main ComputeRangeAssignment method, I wrapped the original scan ranges in an iterator that generates new scan ranges or returns the original ones. http://gerrit.cloudera.org:8080/#/c/8523/9/be/src/scheduling/scheduler.cc@504 PS9, Line 504: // expanded_locations. 'spec' is assumed to *not* have block location information. > missing rename? removed http://gerrit.cloudera.org:8080/#/c/8523/9/be/src/scheduling/scheduler.cc@504 PS9, Line 504: 'spec' is assumed to *not* have block location information. > It doesn't have a member to do so. Is this a stale comment? removed. this refers to longer range plan where hdfs files (with block info) will also be handled. the todo in the thrift spec covers this. http://gerrit.cloudera.org:8080/#/c/8523/9/be/src/scheduling/scheduler.cc@505 PS9, Line 505: HDFS > This comment seems a bit misleading, since S3 etc are not HDFS filesystems. remvoed http://gerrit.cloudera.org:8080/#/c/8523/9/be/src/scheduling/scheduler.cc@516 PS9, Line 516: long scan_range_offset = 0; > Any reason to not use int64_t here? Done http://gerrit.cloudera.org:8080/#/c/8523/9/be/src/scheduling/scheduler.cc@518 PS9, Line 518: long scan_range_length = std::min(spec.max_block_size, fb_desc->length()); > I'm not sure it can happen, but if spec.max_block_size == 0, the loop below good point. added a check for this in FE as well as here. http://gerrit.cloudera.org:8080/#/c/8523/9/be/src/scheduling/scheduler.cc@582 PS9, Line 582: Defer > "Defer" read to me like if something is being done immediately. Can you cla Done http://gerrit.cloudera.org:8080/#/c/8523/9/be/src/scheduling/scheduler.cc@603 PS9, Line 603: *num_ranges += 1; > If you move this count to L676 right after the call to RecordScanRangeAssig got rid of this part of the api. the number of scan ranges is now accessed via the iterator. http://gerrit.cloudera.org:8080/#/c/8523/9/be/src/scheduling/scheduler.cc@681 PS9, Line 681: remote_scan_range_locations.push_back(); > nit: use vector<>::insert() instead of a loop here. removed http://gerrit.cloudera.org:8080/#/c/8523/9/be/src/scheduling/scheduler.cc@686 PS9, Line 686: // Allow exec_at_coord to take precedence over selecting a remote host. > This should only happen for generated scan ranges. Can you add a DCHECK to removed http://gerrit.cloudera.org:8080/#/c/8523/9/common/thrift/PlanNodes.thrift File common/thrift/PlanNodes.thrift: http://gerrit.cloudera.org:8080/#/c/8523/9/common/thrift/PlanNodes.thrift@204 PS9, Line 204: // -1 means not-splittable > nit: add newlines above comments. Done http://gerrit.cloudera.org:8080/#/c/8523/9/common/thrift/PlanNodes.thrift@205 PS9, Line 205: 2: required i64 max_block_size > Could this ever be infinity? I feel it might be clearer to have a boolean f Done -- To view, visit http://gerrit.cloudera.org:8080/8523 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I326065adbb2f7e632814113aae85cb51ca4779a5 Gerrit-Change-Number: 8523 Gerrit-PatchSet: 9 Gerrit-Owner: Vuk ErcegovacGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Mon, 12 Feb 2018
[Impala-ASF-CR] IMPALA-5931: Generates scan ranges in planner for s3/adls
Hello Lars Volker, Dimitris Tsirogiannis, Tim Armstrong, Dan Hecht, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8523 to look at the new patch set (#10). Change subject: IMPALA-5931: Generates scan ranges in planner for s3/adls .. IMPALA-5931: Generates scan ranges in planner for s3/adls Currently, for filesystems that do not include physical block information (e.g., block replica locations, caching), synthetic blocks are generated and stored in the catalog when metadata is loaded. Example file systems for which this is done includes S3, ADLS, and local fs. This change avoids generating these blocks when metadata is loaded. Instead, scan ranges are directly generated from such files by the backend coordinator. Previously, all scan ranges were produced by the planner in HDFSScanNode in the frontend. Now, those files without block information are sent to the coordinator represented by a split specification that determines how the coordinator will create scan ranges to send to executors. This change reduces the space needed in the catalog and reduces the scan range data structures that are passed from the frontend to the backend when planning and coordinating a query. In addition a bug is avoided where non-splittable files were being split anyways to support the query parameter that places a limit on scan ranges. Testing: - added backend scheduler tests - mixed-filesystems test covers tables/queries with multiple fs's. - local fs tests cover the code paths in this change - all core tests pass when configured with s3 - manually tried larger local filesystem tables (tpch) with multiple partitions and observed the same scan ranges. - TODO: adls testing Change-Id: I326065adbb2f7e632814113aae85cb51ca4779a5 --- M be/src/scheduling/query-schedule.h M be/src/scheduling/scheduler-test-util.cc M be/src/scheduling/scheduler-test-util.h M be/src/scheduling/scheduler-test.cc M be/src/scheduling/scheduler.cc M be/src/scheduling/scheduler.h M be/src/util/CMakeLists.txt A be/src/util/flat_buffer.cc A be/src/util/flat_buffer.h M common/thrift/PlanNodes.thrift M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java 13 files changed, 638 insertions(+), 191 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/23/8523/10 -- To view, visit http://gerrit.cloudera.org:8080/8523 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I326065adbb2f7e632814113aae85cb51ca4779a5 Gerrit-Change-Number: 8523 Gerrit-PatchSet: 10 Gerrit-Owner: Vuk ErcegovacGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Vuk Ercegovac