[Impala-ASF-CR] IMPALA-11120: Fix codec not set in generating ORC tables
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18228 ) Change subject: IMPALA-11120: Fix codec not set in generating ORC tables .. Patch Set 3: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/7888/ -- To view, visit http://gerrit.cloudera.org:8080/18228 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I02bd5d9400864145133ff019a3d076a6cab36fcc Gerrit-Change-Number: 18228 Gerrit-PatchSet: 3 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Tue, 01 Mar 2022 06:57:43 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-11120: Fix codec not set in generating ORC tables
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18228 ) Change subject: IMPALA-11120: Fix codec not set in generating ORC tables .. Patch Set 3: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/18228 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I02bd5d9400864145133ff019a3d076a6cab36fcc Gerrit-Change-Number: 18228 Gerrit-PatchSet: 3 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Tue, 01 Mar 2022 02:12:55 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-11120: Fix codec not set in generating ORC tables
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18228 ) Change subject: IMPALA-11120: Fix codec not set in generating ORC tables .. Patch Set 3: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7888/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/18228 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I02bd5d9400864145133ff019a3d076a6cab36fcc Gerrit-Change-Number: 18228 Gerrit-PatchSet: 3 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Tue, 01 Mar 2022 02:12:56 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-11120: Fix codec not set in generating ORC tables
Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/18228 ) Change subject: IMPALA-11120: Fix codec not set in generating ORC tables .. Patch Set 2: Thanks, Andrew! -- To view, visit http://gerrit.cloudera.org:8080/18228 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I02bd5d9400864145133ff019a3d076a6cab36fcc Gerrit-Change-Number: 18228 Gerrit-PatchSet: 2 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Tue, 01 Mar 2022 02:12:29 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-11120: Fix codec not set in generating ORC tables
Andrew Sherman has posted comments on this change. ( http://gerrit.cloudera.org:8080/18228 ) Change subject: IMPALA-11120: Fix codec not set in generating ORC tables .. Patch Set 2: Code-Review+2 LGTM -- To view, visit http://gerrit.cloudera.org:8080/18228 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I02bd5d9400864145133ff019a3d076a6cab36fcc Gerrit-Change-Number: 18228 Gerrit-PatchSet: 2 Gerrit-Owner: Quanlong Huang Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Tue, 01 Mar 2022 01:57:41 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10049: Include RPC call id in slow RPC logs
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/18243 )

Change subject: IMPALA-10049: Include RPC call_id in slow RPC logs
..

IMPALA-10049: Include RPC call_id in slow RPC logs

KRPC logs a slow RPC trace on the receiver side. The trace log has the call_id info that matches the sender's. However, our slow RPC logging on the sender side does not log this call_id, so it is hard to associate the slow RPC logs between sender and receiver. With the recent KRPC rebase in IMPALA-10931, we can now log the call_id on the sender side.

Testing:
I tested this with a low threshold and delays added (the same as we did in IMPALA-9128):

  start-impala-cluster.py \
    --impalad_args=--impala_slow_rpc_threshold_ms=1 \
    --impalad_args=--debug_actions=END_DATA_STREAM_DELAY:JITTER@3000@1.0

The following is what the logs look like on the sender and receiver sides:

impalad_node1.INFO (sender):
I0217 10:29:36.278754 6606 krpc-data-stream-sender.cc:394] Slow TransmitData RPC (request call id 414) to 127.0.0.1:27002 (fragment_instance_id=d8453c2785c38df4:3473e28b0041): took 343.279ms. Receiver time: 342.780ms Network time: 498.405us

impalad_node2.INFO (receiver):
I0217 10:29:36.278379 6775 rpcz_store.cc:269] Call impala.DataStreamService.TransmitData from 127.0.0.1:39702 (request call id 414) took 342ms. Trace:
I0217 10:29:36.278479 6775 rpcz_store.cc:270]
0217 10:29:35.935586 (+ 0us) impala-service-pool.cc:179] Inserting onto call queue
0217 10:29:36.277730 (+342144us) impala-service-pool.cc:278] Handling call
0217 10:29:36.277859 (+ 129us) krpc-data-stream-recvr.cc:397] Deserializing batch
0217 10:29:36.278330 (+ 471us) krpc-data-stream-recvr.cc:424] Enqueuing deserialized batch
0217 10:29:36.278369 (+39us) inbound_call.cc:171] Queueing success response
Metrics: {}

Change-Id: I7fb5746fa0be575745a8e168405d43115c425389
Reviewed-on: http://gerrit.cloudera.org:8080/18243
Reviewed-by: Wenzhe Zhou
Tested-by: Impala Public Jenkins
---
M be/src/runtime/krpc-data-stream-sender.cc
1 file changed, 2 insertions(+), 1 deletion(-)

Approvals:
  Wenzhe Zhou: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/18243
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I7fb5746fa0be575745a8e168405d43115c425389
Gerrit-Change-Number: 18243
Gerrit-PatchSet: 8
Gerrit-Owner: Riza Suminto
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Joe McDonnell
Gerrit-Reviewer: Riza Suminto
Gerrit-Reviewer: Wenzhe Zhou
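For readers who want to use the call id in practice, the following is a minimal sketch of how sender and receiver log lines could be paired up. It is not part of the change. The two regular expressions are derived only from the sample log lines quoted in this message, and the helper names (index_by_call_id, correlate) are hypothetical. It also assumes call ids are unique within the set of lines being compared, which may not hold across connections or restarts.

```python
import re

# Patterns inferred from the sample log lines quoted in the commit message.
# They are an assumption about the format, not an official log grammar.
SENDER_RE = re.compile(
    r"Slow (?P<rpc>\w+) RPC \(request call id (?P<call_id>\d+)\) "
    r"to (?P<dest>\S+) .*?took (?P<total>[\d.]+)ms")
RECEIVER_RE = re.compile(
    r"Call (?P<method>\S+) from (?P<src>\S+) "
    r"\(request call id (?P<call_id>\d+)\) took (?P<took>\d+)ms")


def index_by_call_id(lines, pattern):
    """Map call id -> named regex groups for every line the pattern matches."""
    found = {}
    for line in lines:
        match = pattern.search(line)
        if match:
            found[int(match.group("call_id"))] = match.groupdict()
    return found


def correlate(sender_lines, receiver_lines):
    """Pair sender and receiver entries that share a call id."""
    senders = index_by_call_id(sender_lines, SENDER_RE)
    receivers = index_by_call_id(receiver_lines, RECEIVER_RE)
    return {cid: (senders[cid], receivers[cid])
            for cid in senders.keys() & receivers.keys()}


sender_log = [
    "I0217 10:29:36.278754 6606 krpc-data-stream-sender.cc:394] Slow "
    "TransmitData RPC (request call id 414) to 127.0.0.1:27002 "
    "(fragment_instance_id=d8453c2785c38df4:3473e28b0041): took 343.279ms. "
    "Receiver time: 342.780ms Network time: 498.405us",
]
receiver_log = [
    "I0217 10:29:36.278379 6775 rpcz_store.cc:269] Call "
    "impala.DataStreamService.TransmitData from 127.0.0.1:39702 "
    "(request call id 414) took 342ms. Trace:",
]

pairs = correlate(sender_log, receiver_log)
for call_id, (sent, received) in pairs.items():
    print(call_id, sent["dest"], sent["total"], received["took"])
```

With the two sample lines, the sender entry (343.279ms total) and the receiver entry (342ms) are joined on call id 414, which is the association the change is meant to make possible.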
[Impala-ASF-CR] IMPALA-10049: Include RPC call id in slow RPC logs
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18243 ) Change subject: IMPALA-10049: Include RPC call_id in slow RPC logs .. Patch Set 7: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/18243 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7fb5746fa0be575745a8e168405d43115c425389 Gerrit-Change-Number: 18243 Gerrit-PatchSet: 7 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Wenzhe Zhou Gerrit-Comment-Date: Mon, 28 Feb 2022 23:07:14 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10992 Planner changes for estimate peak memory
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18178 ) Change subject: IMPALA-10992 Planner changes for estimate peak memory .. Patch Set 14: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/10237/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/18178 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I75cf17290be2c64fd4b732a5505bdac31869712a Gerrit-Change-Number: 18178 Gerrit-PatchSet: 14 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Wenzhe Zhou Gerrit-Comment-Date: Mon, 28 Feb 2022 21:17:27 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10992 Planner changes for estimate peak memory
Qifan Chen has posted comments on this change. ( http://gerrit.cloudera.org:8080/18178 ) Change subject: IMPALA-10992 Planner changes for estimate peak memory .. Patch Set 14: (16 comments) http://gerrit.cloudera.org:8080/#/c/18178/13//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/18178/13//COMMIT_MSG@9 PS13, Line 9: executor group > nit: multiple executor group sets. Done http://gerrit.cloudera.org:8080/#/c/18178/13//COMMIT_MSG@74 PS13, Line 74: Almost all FE and BE tests are now run in the artificial two : executor setup except a few where a specific cluster configuration : is desirable; > Please see my comment in Frontend.java about how we can ensure re-planning Done http://gerrit.cloudera.org:8080/#/c/18178/13/common/thrift/Frontend.thrift File common/thrift/Frontend.thrift: http://gerrit.cloudera.org:8080/#/c/18178/13/common/thrift/Frontend.thrift@729 PS13, Line 729: // The optional threshold to determine which executor group set t > nit: can you provide more context here as to what this threshold signifies, Done http://gerrit.cloudera.org:8080/#/c/18178/13/fe/src/main/java/org/apache/impala/service/Frontend.java File fe/src/main/java/org/apache/impala/service/Frontend.java: http://gerrit.cloudera.org:8080/#/c/18178/13/fe/src/main/java/org/apache/impala/service/Frontend.java@265 PS13, Line 265: // An inner class to capture the state of compilation for auto-scaling. : final class AutoScalingCompilationState { > nit: would it make sense to put this inside a separate Move all the new data members and function members into a new inner class called AutoScalingCompilationState. 
http://gerrit.cloudera.org:8080/#/c/18178/13/fe/src/main/java/org/apache/impala/service/Frontend.java@284 PS13, Line 284: // Set when the query is compiled against the 1st group set inside > nit: mention when it is set and when can it be reset Done http://gerrit.cloudera.org:8080/#/c/18178/13/fe/src/main/java/org/apache/impala/service/Frontend.java@335 PS13, Line 335: le) in next > nit: not sure auto-scaling is the right term here, since we are not scaling I used the name AutoScalingCompilationState to capture the data structures and methods. The nature of all of it is to facilitate auto-scaling in BE. So in this sense, the work is still about auto-scaling. http://gerrit.cloudera.org:8080/#/c/18178/13/fe/src/main/java/org/apache/impala/service/Frontend.java@1757 PS13, Line 1757: to the max_query_mem_limit from the pool :* service for the > nit: would be good to explain why a group can be classified as useless and Done http://gerrit.cloudera.org:8080/#/c/18178/13/fe/src/main/java/org/apache/impala/service/Frontend.java@1776 PS13, Line 1776: else if (test_replan) { : ExecutorMembershipSnapshot cluster = ExecutorMembershipSnapshot.getCluster(); : int num_nodes = cluster.numExecutors(); : // Form a two-executor group testing environment so that we can exercise : // auto-scaling logic (see getTExecRequest() in Frontend.java). : TExecutorGroupSet r = new TExecutorGroupSet(num_nodes, num_nodes, "small"); : r.setThreshold(64*MEGABYTE); : result.add(r); : TExecutorGroupSet l = new TExecutorGroupSet(e); : Preconditions. > if we want to emulate a 2 exec group set configuration, what if we only add Done http://gerrit.cloudera.org:8080/#/c/18178/13/fe/src/main/java/org/apache/impala/service/Frontend.java@1788 PS13, Line 1788: > nit: can just use 'e' here too. 
Done http://gerrit.cloudera.org:8080/#/c/18178/13/fe/src/main/java/org/apache/impala/service/Frontend.java@1799 PS13, Line 1799: f defined, r > if we use query_exec_request.query_ctx.request_pool instead of queryOptions query_exec_request.query_ctx.request_pool is set after we have decided the group set to use for the query. Here we establish a list of such group set candidates. http://gerrit.cloudera.org:8080/#/c/18178/13/fe/src/main/java/org/apache/impala/service/Frontend.java@1816 PS13, Line 1816: result.add(new_entry); : } > nit: how about: Request pool: does not map to any known executo Done http://gerrit.cloudera.org:8080/#/c/18178/13/fe/src/main/java/org/apache/impala/service/Frontend.java@1929 PS13, Line 1929: } : : // Find out the per host memory estimated from two possible sources. : per_host_mem_estimate = -1; : if (req.query_exec_request != null) { : > when would either case be used? as in, why would query_exec_request not be Done http://gerrit.cloudera.org:8080/#/c/18178/13/fe/src/main/java/org/apache/impala/service/Frontend.java@1937 PS13, Line
[Impala-ASF-CR] IMPALA-10992 Planner changes for estimate peak memory
Qifan Chen has uploaded a new patch set (#14). ( http://gerrit.cloudera.org:8080/18178 )

Change subject: IMPALA-10992 Planner changes for estimate peak memory
..

IMPALA-10992 Planner changes for estimate peak memory

This patch provides replan support for multiple executor group sets. Each executor group set is associated with a distinct number of nodes and a threshold for estimated memory per host in bytes that can be denoted as [<name>:<#nodes>, <threshold>].

In the patch, a query of type EXPLAIN, QUERY or DML can be compiled more than once. In each attempt, per host memory is estimated and compared with the threshold of an executor group set. If the estimated memory is no more than the threshold, the iteration process terminates and the final plan is determined. The executor group set with the threshold is selected to run the query.

A new query option 'enable_replan', which defaults to 1 (enabled), is added. It can be set to 0 to disable this patch and to generate the distributed plan for the default executor group.

To avoid long compilation time, the following enhancements are enabled. Note that 1) and 2) can be disabled when a relevant meta-data change is detected.

1. Authorization is performed only for the 1st compilation;
2. The needed meta-data is fetched into a StmtTableCache in the 1st compilation and reused in subsequent compilations;
3. openTransaction() is called for transactional queries in the 1st compilation and the saved transactional info is used in subsequent compilations. Similar logic is applied to Kudu transactional queries.

To facilitate testing, the patch imposes an artificial two executor group setup in FE as follows.

1. [regular:<#nodes>, 64MB]
2. [large:<#nodes>, 8PB]

This setup is enabled when a new query option 'test_replan' is set to 1 in backend tests, or RuntimeEnv.INSTANCE.isTestEnv() is true as in most frontend tests. This query option is set to 0 by default.

Compilation time increases when a query is compiled in several iterations, as shown below for several TPC-DS queries. The increase is mostly due to redundant work in either the single node plan creation or the recomputing value transfer graph phase. For small queries, the increase can be avoided if they can be compiled in a single iteration by properly setting the smallest threshold among all executor group sets. For example, for the set of queries listed below, the smallest threshold can be set to 320MB to catch both q15 and q21 in one compilation.

                           Compilation time (ms)
Queries  Estimated Memory  2-iterations  1-iteration  Percentage of increase
q1       408MB                    18.32        13.01                  40.81%
q11      1.37GB                  186.17        86.28                 115.77%
q10a     519MB                   108.27        53.58                 102.07%
q13      339MB                   118.03        82.43                  43.19%
q14a     3.56GB                  628.27       307.24                 104.49%
q14b     2.20GB                  518.79       239.05                 117.02%
q15      314MB                    13.12         4.51                 190.91%
q21      275MB                    11.04         6.34                  74.13%
q23a     1.34GB                  458.7        227.62                 101.52%
q23b     1.50GB                  471.29       224.75                 109.70%
q4       2.60GB                  206.34        98.64                 109.18%
q67      5.16GB                  691.45       336.31                 105.60%

Testing:
1. Almost all FE and BE tests are now run in the artificial two executor setup except a few where a specific cluster configuration is desirable;
2. Ran core tests successfully;
3. Added a new observability test and a test to explicitly ensure replan takes place among two group sets.
Change-Id: I75cf17290be2c64fd4b732a5505bdac31869712a --- M be/src/service/query-options.cc M be/src/service/query-options.h M common/thrift/Frontend.thrift M common/thrift/ImpalaService.thrift M common/thrift/Query.thrift M fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java M fe/src/main/java/org/apache/impala/analysis/Analyzer.java M fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java M fe/src/main/java/org/apache/impala/planner/KuduScanNode.java M fe/src/main/java/org/apache/impala/planner/ResourceProfileBuilder.java M fe/src/main/java/org/apache/impala/service/Frontend.java M fe/src/main/java/org/apache/impala/util/ClassUtil.java M fe/src/main/java/org/apache/impala/util/ExecutorMembershipSnapshot.java M fe/src/test/java/org/apache/impala/common/QueryFixture.java M fe/src/test/java/org/apache/impala/planner/ClusterSizeTest.java M tests/common/test_dimensions.py M tests/custom_cluster/test_admission_controller.py M tests/custom_cluster/test_coordinators.py M tests/custom_cluster/test_executor_groups.py M tests/query_test/test_observability.py 21 files changed, 533 insertions(+), 70
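The replan loop described in this commit message can be illustrated with a small sketch. This is not the code in Frontend.java. GroupSet, choose_group_set and the estimate callback are hypothetical names, the group sets are tried in ascending threshold order as an assumption, and returning None when no threshold is large enough is also an assumption since the message does not say what happens in that case. The node count of 3 is made up for the example.

```python
from typing import Callable, Iterable, NamedTuple, Optional, Tuple

MB = 1024 ** 2
PB = 1024 ** 5


class GroupSet(NamedTuple):
    """Hypothetical stand-in for an executor group set."""
    name: str
    num_nodes: int
    threshold_bytes: int


def choose_group_set(
    group_sets: Iterable[GroupSet],
    estimate_per_host_mem: Callable[[GroupSet], int],
) -> Tuple[Optional[GroupSet], int]:
    """Compile against each group set until the estimate fits its threshold.

    Returns the selected group set (or None if none fits, an assumption)
    together with the number of compilation attempts that were needed.
    """
    attempts = 0
    for group_set in sorted(group_sets, key=lambda g: g.threshold_bytes):
        attempts += 1
        # Each call stands for one compilation of the query.
        estimate = estimate_per_host_mem(group_set)
        if estimate <= group_set.threshold_bytes:
            return group_set, attempts
    return None, attempts


# The artificial test setup from the commit message, with 3 nodes assumed.
test_setup = [GroupSet("regular", 3, 64 * MB), GroupSet("large", 3, 8 * PB)]

# A query estimated at 314MB per host (the q15 figure) needs two attempts.
chosen, attempts = choose_group_set(test_setup, lambda gs: 314 * MB)
print(chosen.name, attempts)  # large 2

# Raising the smallest threshold to 320MB lets it finish in one attempt.
tuned = [GroupSet("regular", 3, 320 * MB), GroupSet("large", 3, 8 * PB)]
chosen_tuned, attempts_tuned = choose_group_set(tuned, lambda gs: 314 * MB)
print(chosen_tuned.name, attempts_tuned)  # regular 1
```

The second call shows why the commit message suggests tuning the smallest threshold: a query that fits the first group set avoids the extra compilation.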
[Impala-ASF-CR] IMPALA-10049: Include RPC call id in slow RPC logs
Wenzhe Zhou has posted comments on this change. ( http://gerrit.cloudera.org:8080/18243 ) Change subject: IMPALA-10049: Include RPC call_id in slow RPC logs .. Patch Set 7: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/18243 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7fb5746fa0be575745a8e168405d43115c425389 Gerrit-Change-Number: 18243 Gerrit-PatchSet: 7 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Wenzhe Zhou Gerrit-Comment-Date: Mon, 28 Feb 2022 18:21:59 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10049: Include RPC call id in slow RPC logs
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18243 ) Change subject: IMPALA-10049: Include RPC call_id in slow RPC logs .. Patch Set 7: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7887/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/18243 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7fb5746fa0be575745a8e168405d43115c425389 Gerrit-Change-Number: 18243 Gerrit-PatchSet: 7 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Wenzhe Zhou Gerrit-Comment-Date: Mon, 28 Feb 2022 18:23:06 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10049: Include RPC call id in slow RPC logs
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18243 ) Change subject: IMPALA-10049: Include RPC call_id in slow RPC logs .. Patch Set 7: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/10236/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/18243 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7fb5746fa0be575745a8e168405d43115c425389 Gerrit-Change-Number: 18243 Gerrit-PatchSet: 7 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Wenzhe Zhou Gerrit-Comment-Date: Mon, 28 Feb 2022 18:22:26 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10049: Include RPC call id in slow RPC logs
Hello Joe McDonnell, Wenzhe Zhou, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/18243

to look at the new patch set (#7).

Change subject: IMPALA-10049: Include RPC call_id in slow RPC logs
..

IMPALA-10049: Include RPC call_id in slow RPC logs

KRPC logs a slow RPC trace on the receiver side. The trace log has the call_id info that matches the sender's. However, our slow RPC logging on the sender side does not log this call_id, so it is hard to associate the slow RPC logs between sender and receiver. With the recent KRPC rebase in IMPALA-10931, we can now log the call_id on the sender side.

Testing:
I tested this with a low threshold and delays added (the same as we did in IMPALA-9128):

  start-impala-cluster.py \
    --impalad_args=--impala_slow_rpc_threshold_ms=1 \
    --impalad_args=--debug_actions=END_DATA_STREAM_DELAY:JITTER@3000@1.0

The following is what the logs look like on the sender and receiver sides:

impalad_node1.INFO (sender):
I0217 10:29:36.278754 6606 krpc-data-stream-sender.cc:394] Slow TransmitData RPC (request call id 414) to 127.0.0.1:27002 (fragment_instance_id=d8453c2785c38df4:3473e28b0041): took 343.279ms. Receiver time: 342.780ms Network time: 498.405us

impalad_node2.INFO (receiver):
I0217 10:29:36.278379 6775 rpcz_store.cc:269] Call impala.DataStreamService.TransmitData from 127.0.0.1:39702 (request call id 414) took 342ms. Trace:
I0217 10:29:36.278479 6775 rpcz_store.cc:270]
0217 10:29:35.935586 (+ 0us) impala-service-pool.cc:179] Inserting onto call queue
0217 10:29:36.277730 (+342144us) impala-service-pool.cc:278] Handling call
0217 10:29:36.277859 (+ 129us) krpc-data-stream-recvr.cc:397] Deserializing batch
0217 10:29:36.278330 (+ 471us) krpc-data-stream-recvr.cc:424] Enqueuing deserialized batch
0217 10:29:36.278369 (+39us) inbound_call.cc:171] Queueing success response
Metrics: {}

Change-Id: I7fb5746fa0be575745a8e168405d43115c425389
---
M be/src/runtime/krpc-data-stream-sender.cc
1 file changed, 2 insertions(+), 1 deletion(-)

  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/43/18243/7

--
To view, visit http://gerrit.cloudera.org:8080/18243
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7fb5746fa0be575745a8e168405d43115c425389
Gerrit-Change-Number: 18243
Gerrit-PatchSet: 7
Gerrit-Owner: Riza Suminto
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Joe McDonnell
Gerrit-Reviewer: Riza Suminto
Gerrit-Reviewer: Wenzhe Zhou
[Impala-ASF-CR] IMPALA-10049: Include RPC call id in slow RPC logs
Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/18243 ) Change subject: IMPALA-10049: Include RPC call_id in slow RPC logs .. Patch Set 6: > Patch Set 6: Verified-1 > > Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/7886/ This patch failed test_exchange_small_delay. The call_id() that is being logged is retrieved from KRPC response message. However, test_exchange_small_delay depends on receiver timing out and not sending any response message. We need to revert the logging at LogSlowFailedRpc(). -- To view, visit http://gerrit.cloudera.org:8080/18243 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7fb5746fa0be575745a8e168405d43115c425389 Gerrit-Change-Number: 18243 Gerrit-PatchSet: 6 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Wenzhe Zhou Gerrit-Comment-Date: Mon, 28 Feb 2022 18:00:33 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10992 Planner changes for estimate peak memory - v1
Qifan Chen has abandoned this change. ( http://gerrit.cloudera.org:8080/18143 ) Change subject: IMPALA-10992 Planner changes for estimate peak memory - v1 .. Abandoned This version is the draft version of https://gerrit.cloudera.org/#/c/18178/. -- To view, visit http://gerrit.cloudera.org:8080/18143 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: abandon Gerrit-Change-Id: Ibe71f905d6a8c1e42cf951b3a69ff33b81277c24 Gerrit-Change-Number: 18143 Gerrit-PatchSet: 29 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Wenzhe Zhou
[Impala-ASF-CR] IMPALA-11133 (Addendum): Encode a string in utf8 before printing it
Laszlo Gaal has posted comments on this change. ( http://gerrit.cloudera.org:8080/18270 ) Change subject: IMPALA-11133 (Addendum): Encode a string in utf8 before printing it .. Patch Set 2: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/18270 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iad9b1fb0a523e219bc9f40a57ff7335808be283f Gerrit-Change-Number: 18270 Gerrit-PatchSet: 2 Gerrit-Owner: Fang-Yu Rao Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Laszlo Gaal Gerrit-Reviewer: Quanlong Huang Gerrit-Comment-Date: Mon, 28 Feb 2022 16:34:34 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9433: Improved caching of HdfsFileHandles
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18191 ) Change subject: IMPALA-9433: Improved caching of HdfsFileHandles .. Patch Set 26: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/10235/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/18191 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6b5c5e9e2b5db2847ab88c41f667c9ca1b03d51a Gerrit-Change-Number: 18191 Gerrit-PatchSet: 26 Gerrit-Owner: Gergely Fürnstáhl Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Gergely Fürnstáhl Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Mon, 28 Feb 2022 14:34:10 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9433: Improved caching of HdfsFileHandles
Gergely Fürnstáhl has uploaded a new patch set (#26). ( http://gerrit.cloudera.org:8080/18191 )

Change subject: IMPALA-9433: Improved caching of HdfsFileHandles
..

IMPALA-9433: Improved caching of HdfsFileHandles

Separated the LRU caching functionality into a templated LruMultiCache class. Replaced std::multimap with std::unordered_map combined with std::list for O(1) lookups and less memory overhead, as it stores each key one time. Added boost::intrusive::list to handle LRU relations with less overhead. Added an O(1) release method, instead of O(n), with minimal memory overhead. Implemented an RAII Accessor to remove the responsibility of releasing the objects from the user.

Wrapped the cache accessor and related DiskIOManager metrics into a FileHandleCache::Accessor.

Removed the Release*() call trees from FileHandleCache and DiskIOManager, and removed the scoped exit from HdfsFileReader, as they are handled automatically.

Testing:

Implemented extensive unit testing of the class, including forced rehashes, collisions, capacity overshoot, explicit/automatic release and destroy.

Ran tests/custom_cluster/test_hdfs_fd_caching.py to verify FileHandleCache::Accessor behaviour through metrics.

Ran bin/single_node_perf_run.py with TPCH and TPC-DS on parquet tables, no visible change in performance:

TPCH scale=10 iterations=100: Delta(Avg)=-0.67% Delta(GeoMean)=-0.49%
TPC-DS scale=10 iterations= 50: Delta(Avg)=-0.02% Delta(GeoMean)= 0.00%

Tested some manual queries on functional_parquet.widetable_1000_cols with 64 threads but did not notice significant changes in scan times.
Change-Id: I6b5c5e9e2b5db2847ab88c41f667c9ca1b03d51a --- M be/src/runtime/io/disk-io-mgr.cc M be/src/runtime/io/disk-io-mgr.h M be/src/runtime/io/handle-cache.h M be/src/runtime/io/handle-cache.inline.h M be/src/runtime/io/hdfs-file-reader.cc M be/src/util/CMakeLists.txt A be/src/util/lru-multi-cache-test.cc A be/src/util/lru-multi-cache.h A be/src/util/lru-multi-cache.inline.h 9 files changed, 1,175 insertions(+), 274 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/91/18191/26 -- To view, visit http://gerrit.cloudera.org:8080/18191 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I6b5c5e9e2b5db2847ab88c41f667c9ca1b03d51a Gerrit-Change-Number: 18191 Gerrit-PatchSet: 26 Gerrit-Owner: Gergely Fürnstáhl Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Gergely Fürnstáhl Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Zoltan Borok-Nagy
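The cache design described in this commit message (several objects per key, LRU eviction of idle objects, O(1) release, automatic release through an accessor) can be sketched in a few lines. This is an illustration only and not the C++ in lru-multi-cache.h. An OrderedDict stands in for boost::intrusive::list, a context manager stands in for the RAII Accessor, and all method and variable names are hypothetical. The eviction policy shown, evicting idle entries whenever the total exceeds capacity and tolerating overshoot while everything is in use, is an assumption based on the "capacity overshoot" test mentioned in the message.

```python
from collections import OrderedDict, defaultdict
from contextlib import contextmanager


class LruMultiCache:
    """Sketch of a cache where one key can own several cached objects."""

    def __init__(self, capacity):
        self.capacity = capacity
        # Idle entries only: entry id -> (key, value), least recently used first.
        self._idle = OrderedDict()
        # key -> {entry id: None}; a dict is used as an ordered set so that
        # both "take the newest" and "delete by id" are O(1).
        self._idle_ids = defaultdict(dict)
        self._next_id = 0
        self._in_use = 0

    def __len__(self):
        return len(self._idle) + self._in_use

    def _evict(self):
        # Only idle entries can be evicted, so the cache may overshoot its
        # capacity while every object is in use.
        while len(self) > self.capacity and self._idle:
            entry_id, (key, _) = self._idle.popitem(last=False)
            del self._idle_ids[key][entry_id]

    def _acquire(self, key, factory):
        ids = self._idle_ids.get(key)
        if ids:
            entry_id, _ = ids.popitem()  # most recently released for this key
            _, value = self._idle.pop(entry_id)
        else:
            value = factory(key)
        self._in_use += 1
        self._evict()
        return value

    def _release(self, key, value):
        self._in_use -= 1
        entry_id = self._next_id
        self._next_id += 1
        self._idle[entry_id] = (key, value)  # most recently used end
        self._idle_ids[key][entry_id] = None
        self._evict()

    @contextmanager
    def accessor(self, key, factory):
        """Hand out an object and put it back automatically on exit."""
        value = self._acquire(key, factory)
        try:
            yield value
        finally:
            self._release(key, value)


created = []


def open_handle(name):
    """Hypothetical factory standing in for opening a file handle."""
    created.append(name)
    return [name]


cache = LruMultiCache(capacity=2)
with cache.accessor("a", open_handle) as h1:
    with cache.accessor("a", open_handle) as h2:
        pass  # two handles for the same key are in use at once
with cache.accessor("b", open_handle):
    pass  # creating "b" evicts the least recently released "a" handle
print(created, len(cache))
```

The caller never calls a release function: leaving the with block returns the object to the cache, which mirrors the motivation given for the RAII Accessor.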
[Impala-ASF-CR] IMPALA-9433: Improved caching of HdfsFileHandles
Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/18191 )

Change subject: IMPALA-9433: Improved caching of HdfsFileHandles
..

Patch Set 25: Code-Review+1

(5 comments)

The code looks awesome!

http://gerrit.cloudera.org:8080/#/c/18191/25/be/src/runtime/io/handle-cache.inline.h
File be/src/runtime/io/handle-cache.inline.h:

http://gerrit.cloudera.org:8080/#/c/18191/25/be/src/runtime/io/handle-cache.inline.h@80
PS25, Line 80: if (cache_accessor_.Get())
             : ImpaladMetrics::IO_MGR_NUM_FILE_HANDLES_OUTSTANDING->Increment(1L);
nit: multi-line if stmt needs braces

http://gerrit.cloudera.org:8080/#/c/18191/25/be/src/util/lru-multi-cache.h
File be/src/util/lru-multi-cache.h:

http://gerrit.cloudera.org:8080/#/c/18191/25/be/src/util/lru-multi-cache.h@53
PS25, Line 53: deigned
nit: designed

http://gerrit.cloudera.org:8080/#/c/18191/25/be/src/util/lru-multi-cache.h@61
PS25, Line 61:
nit: extra space

http://gerrit.cloudera.org:8080/#/c/18191/25/be/src/util/lru-multi-cache.h@74
PS25, Line 74: LRU order
Please mention that least recently used elements are at the front.

http://gerrit.cloudera.org:8080/#/c/18191/25/be/src/util/lru-multi-cache.inline.h
File be/src/util/lru-multi-cache.inline.h:

http://gerrit.cloudera.org:8080/#/c/18191/25/be/src/util/lru-multi-cache.inline.h@214
PS25, Line 214: in_use
Do we need 'in_use'? Can't we just use 'member_hook.is_linked()' instead?
Maybe we just need a member function InUse(): return !member_hook.is_linked();

--
To view, visit http://gerrit.cloudera.org:8080/18191
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6b5c5e9e2b5db2847ab88c41f667c9ca1b03d51a
Gerrit-Change-Number: 18191
Gerrit-PatchSet: 25
Gerrit-Owner: Gergely Fürnstáhl
Gerrit-Reviewer: Csaba Ringhofer
Gerrit-Reviewer: Gergely Fürnstáhl
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Joe McDonnell
Gerrit-Reviewer: Zoltan Borok-Nagy
Gerrit-Comment-Date: Mon, 28 Feb 2022 11:21:41 +
Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10049: Include RPC call id in slow RPC logs
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18243 ) Change subject: IMPALA-10049: Include RPC call_id in slow RPC logs .. Patch Set 6: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/7886/ -- To view, visit http://gerrit.cloudera.org:8080/18243 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7fb5746fa0be575745a8e168405d43115c425389 Gerrit-Change-Number: 18243 Gerrit-PatchSet: 6 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Wenzhe Zhou Gerrit-Comment-Date: Mon, 28 Feb 2022 09:46:52 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-11053: Impala should be able to read migrated partitioned Iceberg tables
Tamas Mate has posted comments on this change. ( http://gerrit.cloudera.org:8080/18240 ) Change subject: IMPALA-11053: Impala should be able to read migrated partitioned Iceberg tables .. Patch Set 4: Code-Review+1 Thank you for the update Zoltan, this change looks nice! LGTM! -- To view, visit http://gerrit.cloudera.org:8080/18240 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iac11a02de709d43532056f71359c49d20c1be2b8 Gerrit-Change-Number: 18240 Gerrit-PatchSet: 4 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Gergely Fürnstáhl Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tamas Mate Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Mon, 28 Feb 2022 08:32:01 + Gerrit-HasComments: No