[Impala-ASF-CR] IMPALA-6116: Bound memory usage of DataStreamSevice's service queue
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/9282 ) Change subject: IMPALA-6116: Bound memory usage of DataStreamSevice's service queue .. Patch Set 6: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1950/ -- To view, visit http://gerrit.cloudera.org:8080/9282 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Idea4262dfb0e0aa8d58ff6ea6a8248e880b9 Gerrit-Change-Number: 9282 Gerrit-PatchSet: 6 Gerrit-Owner: Michael HoGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Sat, 17 Feb 2018 07:19:52 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6116: Bound memory usage of DataStreamSevice's service queue
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/9282 ) Change subject: IMPALA-6116: Bound memory usage of DataStreamSevice's service queue .. Patch Set 6: Yes, we will do appropriate tuning for the queue memory limit to get reasonable performance before turning KRPC on by default. -- To view, visit http://gerrit.cloudera.org:8080/9282 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Idea4262dfb0e0aa8d58ff6ea6a8248e880b9 Gerrit-Change-Number: 9282 Gerrit-PatchSet: 6 Gerrit-Owner: Michael HoGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Sat, 17 Feb 2018 07:19:38 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6116: Bound memory usage of DataStreamSevice's service queue
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/9282 ) Change subject: IMPALA-6116: Bound memory usage of DataStreamSevice's service queue .. Patch Set 6: Code-Review+2 Carry Tim's +2 -- To view, visit http://gerrit.cloudera.org:8080/9282 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Idea4262dfb0e0aa8d58ff6ea6a8248e880b9 Gerrit-Change-Number: 9282 Gerrit-PatchSet: 6 Gerrit-Owner: Michael HoGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Sat, 17 Feb 2018 07:18:09 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6116: Bound memory usage of DataStreamSevice's service queue
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/9282 ) Change subject: IMPALA-6116: Bound memory usage of DataStreamSevice's service queue .. Patch Set 5: (1 comment) http://gerrit.cloudera.org:8080/#/c/9282/5/be/src/rpc/impala-service-pool.h File be/src/rpc/impala-service-pool.h: http://gerrit.cloudera.org:8080/#/c/9282/5/be/src/rpc/impala-service-pool.h@41 PS5, Line 41: service_queue_depth > service_queue_length? Done -- To view, visit http://gerrit.cloudera.org:8080/9282 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Idea4262dfb0e0aa8d58ff6ea6a8248e880b9 Gerrit-Change-Number: 9282 Gerrit-PatchSet: 5 Gerrit-Owner: Michael HoGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Sat, 17 Feb 2018 07:17:25 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6116: Bound memory usage of DataStreamSevice's service queue
Hello Lars Volker, Tim Armstrong, Dan Hecht, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/9282 to look at the new patch set (#6). Change subject: IMPALA-6116: Bound memory usage of DataStreamSevice's service queue .. IMPALA-6116: Bound memory usage of DataStreamSevice's service queue The fix for IMPALA-6193 added a memory tracker for the memory consumed by the payloads in the service queue of DataStreamService. This change extends it by introducing a bound on the memory usage for that service queue. In addition, it deprecates FLAGS_datastream_service_queue_depth and replaces it with FLAGS_datastream_service_queue_mem_limit. These flags only take effect when KRPC is in use and KRPC was never enabled in any previous releases so it seems safe to do this flag replacement. The new flag FLAGS_datastream_service_queue_mem_limit directly dictates the amount of memory which can be consumed by the service queue of DataStreamService. This allows a more direct control over the memory usage of the queue instead of inferring via the number of entries in the queue. The default value of this flag is left at 0, in which case it will be set to 5% of process memory limit. Testing done: exhaustive debug builds. Updated data-stream-test to exercise the case in which the payload is larger than the limit. Change-Id: Idea4262dfb0e0aa8d58ff6ea6a8248e880b9 --- M be/src/rpc/impala-service-pool.cc M be/src/rpc/impala-service-pool.h M be/src/rpc/rpc-mgr-test-base.h M be/src/rpc/rpc-mgr-test.cc M be/src/rpc/rpc-mgr.cc M be/src/rpc/rpc-mgr.h M be/src/runtime/data-stream-test.cc M be/src/runtime/exec-env.cc M be/src/runtime/exec-env.h M be/src/runtime/krpc-data-stream-mgr.cc M be/src/runtime/krpc-data-stream-mgr.h M be/src/runtime/mem-tracker.h M be/src/service/data-stream-service.cc M be/src/service/data-stream-service.h M tests/custom_cluster/test_krpc_mem_usage.py 15 files changed, 282 insertions(+), 162 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/82/9282/6 -- To view, visit http://gerrit.cloudera.org:8080/9282 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Idea4262dfb0e0aa8d58ff6ea6a8248e880b9 Gerrit-Change-Number: 9282 Gerrit-PatchSet: 6 Gerrit-Owner: Michael HoGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-4835: Part 2: Allocate scan range buffers upfront
Hello Michael Ho, Tianyi Wang, Bikramjeet Vig, Dan Hecht, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8707 to look at the new patch set (#28). Change subject: IMPALA-4835: Part 2: Allocate scan range buffers upfront .. IMPALA-4835: Part 2: Allocate scan range buffers upfront This change is a step towards reserving memory for buffers from the buffer pool and constraining per-scanner memory requirements. This change restructures the DiskIoMgr code so that each ScanRange operates with a fixed set of buffers that are allocated upfront and recycled as the I/O mgr works through the ScanRange. One major change is that ScanRanges get blocked when a buffer is not available and get unblocked when a client returns a buffer via ReturnBuffer(). I was able to remove the logic to maintain the blocked_ranges_ list by instead adding a separate set with all ranges that are active. There is also some miscellaneous cleanup included - e.g. reducing the amount of code devoted to maintaining counters and metrics. I plan to merge this along with the actual buffer pool switch, but separated it out to allow review of the DiskIoMgr changes separate from other aspects of the buffer pool switchover. Testing: * Ran core and exhaustive tests. Change-Id: Ia243bf8b62feeea602b122e0503ea4ba7d3ee70f --- M be/src/exec/hdfs-parquet-scanner.cc M be/src/exec/hdfs-scan-node-mt.cc M be/src/exec/hdfs-scan-node.cc M be/src/exec/scanner-context.cc M be/src/runtime/bufferpool/buffer-pool.h M be/src/runtime/io/disk-io-mgr-internal.h M be/src/runtime/io/disk-io-mgr-stress-test.cc M be/src/runtime/io/disk-io-mgr-stress.cc M be/src/runtime/io/disk-io-mgr-test.cc M be/src/runtime/io/disk-io-mgr.cc M be/src/runtime/io/disk-io-mgr.h M be/src/runtime/io/request-context.cc M be/src/runtime/io/request-context.h M be/src/runtime/io/request-ranges.h M be/src/runtime/io/scan-range.cc M be/src/runtime/tmp-file-mgr.cc M be/src/runtime/tmp-file-mgr.h M be/src/util/bit-util-test.cc M be/src/util/bit-util.h 19 files changed, 956 insertions(+), 546 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/07/8707/28 -- To view, visit http://gerrit.cloudera.org:8080/8707 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ia243bf8b62feeea602b122e0503ea4ba7d3ee70f Gerrit-Change-Number: 8707 Gerrit-PatchSet: 28 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-6526: Fix spilling test for running on local FS
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/9354 ) Change subject: IMPALA-6526: Fix spilling test for running on local FS .. IMPALA-6526: Fix spilling test for running on local FS One of the spilling test was failing because its minimum bufferpool mem requirement was more when ran on local FS as compared to when it is run on HDFS. The fix is to increase the bufferpool limit to a value just above the min limit so that it still forces spill to disk on both filesystems. Testing: Ran core tests with local FS as target file system. Made sure the failing test passed. Change-Id: I50648d7936007a26891cf64d6343c47d9d646596 Reviewed-on: http://gerrit.cloudera.org:8080/9354 Reviewed-by: Tim ArmstrongTested-by: Impala Public Jenkins --- M testdata/workloads/functional-query/queries/QueryTest/spilling.test 1 file changed, 1 insertion(+), 1 deletion(-) Approvals: Tim Armstrong: Looks good to me, approved Impala Public Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/9354 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I50648d7936007a26891cf64d6343c47d9d646596 Gerrit-Change-Number: 9354 Gerrit-PatchSet: 2 Gerrit-Owner: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-6526: Fix spilling test for running on local FS
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/9354 ) Change subject: IMPALA-6526: Fix spilling test for running on local FS .. Patch Set 1: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/9354 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I50648d7936007a26891cf64d6343c47d9d646596 Gerrit-Change-Number: 9354 Gerrit-PatchSet: 1 Gerrit-Owner: Bikramjeet VigGerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Sat, 17 Feb 2018 01:40:07 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6116: Bound memory usage of DataStreamSevice's service queue
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/9282 ) Change subject: IMPALA-6116: Bound memory usage of DataStreamSevice's service queue .. Patch Set 5: (I'm assuming that we'll validate that the mem limit works ok by doing additional concurrency testing before turning on KRPC by default). -- To view, visit http://gerrit.cloudera.org:8080/9282 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Idea4262dfb0e0aa8d58ff6ea6a8248e880b9 Gerrit-Change-Number: 9282 Gerrit-PatchSet: 5 Gerrit-Owner: Michael HoGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Sat, 17 Feb 2018 01:38:44 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4835: Part 2: Allocate scan range buffers upfront
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/8707 ) Change subject: IMPALA-4835: Part 2: Allocate scan range buffers upfront .. Patch Set 27: (1 comment) http://gerrit.cloudera.org:8080/#/c/8707/25/be/src/runtime/io/scan-range.cc File be/src/runtime/io/scan-range.cc: http://gerrit.cloudera.org:8080/#/c/8707/25/be/src/runtime/io/scan-range.cc@95 PS25, Line 95: reader_->num_used_buffers_.Add(-1); > It looks like it's only used for debugging (validation + debug strings). I Either now or later is okay with me, if you're confident they don't help (I agree they don't seem valuable). I think they are a fairly big distraction in already complicated code. -- To view, visit http://gerrit.cloudera.org:8080/8707 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia243bf8b62feeea602b122e0503ea4ba7d3ee70f Gerrit-Change-Number: 8707 Gerrit-PatchSet: 27 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Sat, 17 Feb 2018 01:22:35 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5518: Allocate KrpcDataStreamRecvr RowBatch tuples from BufferPool
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/9344 ) Change subject: IMPALA-5518: Allocate KrpcDataStreamRecvr RowBatch tuples from BufferPool .. Patch Set 6: (1 comment) http://gerrit.cloudera.org:8080/#/c/9344/6/be/src/runtime/row-batch.h File be/src/runtime/row-batch.h: http://gerrit.cloudera.org:8080/#/c/9344/6/be/src/runtime/row-batch.h@411 PS6, Line 411: /// Populate a row batch from the serialized row batch header. that comment doesn't look correct. This constructor just constructs an empty row batch, right? Actually, wouldn't it be better to not even pass in header here, but instead just "resize" the rowbatch in FromThrift when allocating the buffers? i.e make this a constructor that just creates a completely empty row batch. -- To view, visit http://gerrit.cloudera.org:8080/9344 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If4b1a45f68b9df0d3b539511e15aff15700246f2 Gerrit-Change-Number: 9344 Gerrit-PatchSet: 6 Gerrit-Owner: Michael HoGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Sat, 17 Feb 2018 01:11:52 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5518: Allocate KrpcDataStreamRecvr RowBatch tuples from BufferPool
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/9344 ) Change subject: IMPALA-5518: Allocate KrpcDataStreamRecvr RowBatch tuples from BufferPool .. Patch Set 6: (4 comments) The overall scope of the change looks good. Comments are just about the error paths and testing. http://gerrit.cloudera.org:8080/#/c/9344/6//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/9344/6//COMMIT_MSG@39 PS6, Line 39: Testing done: Debug core build. How will we be able to test the error paths? Are we relying on the stress tests? http://gerrit.cloudera.org:8080/#/c/9344/6/be/src/runtime/krpc-data-stream-recvr.cc File be/src/runtime/krpc-data-stream-recvr.cc: http://gerrit.cloudera.org:8080/#/c/9344/6/be/src/runtime/krpc-data-stream-recvr.cc@304 PS6, Line 304: recvr_->num_buffered_bytes_.Add(batch_size); Does this need to be decremented on failure? Or is it somehow ok because the query is going to fail anyway? Would be good to either fix or add a comment to explain why it's intentional. http://gerrit.cloudera.org:8080/#/c/9344/6/be/src/runtime/krpc-data-stream-recvr.cc@306 PS6, Line 306: ++num_pending_enqueue_; Same here. http://gerrit.cloudera.org:8080/#/c/9344/6/be/src/runtime/row-batch.cc File be/src/runtime/row-batch.cc: http://gerrit.cloudera.org:8080/#/c/9344/6/be/src/runtime/row-batch.cc@193 PS6, Line 193: row_batch->tuple_ptrs_info_->client = client; tuple_ptrs_info_ gets left in an intermediate state if the allocation fails. It's probably benign but it's hard to reason about. Would we be better off setting tuple_ptrs_info_ after the AllocateBuffer() call? -- To view, visit http://gerrit.cloudera.org:8080/9344 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If4b1a45f68b9df0d3b539511e15aff15700246f2 Gerrit-Change-Number: 9344 Gerrit-PatchSet: 6 Gerrit-Owner: Michael HoGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Sat, 17 Feb 2018 00:59:06 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6509: [DOCS] Note for haproxy for Kerberized clusters
Alex Rodoni has posted comments on this change. ( http://gerrit.cloudera.org:8080/9286 ) Change subject: IMPALA-6509: [DOCS] Note for haproxy for Kerberized clusters .. Patch Set 5: (3 comments) http://gerrit.cloudera.org:8080/#/c/9286/5//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/9286/5//COMMIT_MSG@7 PS5, Line 7: docs > Uppercase "DOCS" Done http://gerrit.cloudera.org:8080/#/c/9286/5/docs/topics/impala_proxy.xml File docs/topics/impala_proxy.xml: http://gerrit.cloudera.org:8080/#/c/9286/5/docs/topics/impala_proxy.xml@244 PS5, Line 244: k > Uppercase "Kerberized" is the convention we use elsewhere. You'll only see Done http://gerrit.cloudera.org:8080/#/c/9286/5/docs/topics/impala_proxy.xml@247 PS5, Line 247: : > Please put the note outside the paragraph, i.e. Done -- To view, visit http://gerrit.cloudera.org:8080/9286 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1ce9930e3f658c52502a2ba95f93647e2706d58c Gerrit-Change-Number: 9286 Gerrit-PatchSet: 5 Gerrit-Owner: Alex RodoniGerrit-Reviewer: Alex Rodoni Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: John Russell Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Sat, 17 Feb 2018 00:49:40 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6509: [DOCS] Note for haproxy for Kerberized clusters
Hello John Russell, Philip Zeyliger, Sailesh Mukil, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/9286 to look at the new patch set (#6). Change subject: IMPALA-6509: [DOCS] Note for haproxy for Kerberized clusters .. IMPALA-6509: [DOCS] Note for haproxy for Kerberized clusters Noted that after enabling ha-proxy in a kerberized cluster, users can no-longer connect to individual impala daemons directly from impala shell. Change-Id: I1ce9930e3f658c52502a2ba95f93647e2706d58c --- M docs/topics/impala_proxy.xml 1 file changed, 12 insertions(+), 4 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/86/9286/6 -- To view, visit http://gerrit.cloudera.org:8080/9286 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I1ce9930e3f658c52502a2ba95f93647e2706d58c Gerrit-Change-Number: 9286 Gerrit-PatchSet: 6 Gerrit-Owner: Alex RodoniGerrit-Reviewer: Alex Rodoni Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: John Russell Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Sailesh Mukil
[Impala-ASF-CR] IMPALA-4835: Part 2: Allocate scan range buffers upfront
Hello Michael Ho, Tianyi Wang, Bikramjeet Vig, Dan Hecht, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8707 to look at the new patch set (#27). Change subject: IMPALA-4835: Part 2: Allocate scan range buffers upfront .. IMPALA-4835: Part 2: Allocate scan range buffers upfront This change is a step towards reserving memory for buffers from the buffer pool and constraining per-scanner memory requirements. This change restructures the DiskIoMgr code so that each ScanRange operates with a fixed set of buffers that are allocated upfront and recycled as the I/O mgr works through the ScanRange. One major change is that ScanRanges get blocked when a buffer is not available and get unblocked when a client returns a buffer via ReturnBuffer(). I was able to remove the logic to maintain the blocked_ranges_ list by instead adding a separate set with all ranges that are active. I plan to merge this along with the actual buffer pool switch, but separated it out to allow review of the DiskIoMgr changes separate from other aspects of the buffer pool switchover. Testing: * Ran core and exhaustive tests. Change-Id: Ia243bf8b62feeea602b122e0503ea4ba7d3ee70f --- M be/src/exec/hdfs-parquet-scanner.cc M be/src/exec/hdfs-scan-node-mt.cc M be/src/exec/hdfs-scan-node.cc M be/src/exec/scanner-context.cc M be/src/runtime/bufferpool/buffer-pool.h M be/src/runtime/io/disk-io-mgr-internal.h M be/src/runtime/io/disk-io-mgr-stress-test.cc M be/src/runtime/io/disk-io-mgr-stress.cc M be/src/runtime/io/disk-io-mgr-test.cc M be/src/runtime/io/disk-io-mgr.cc M be/src/runtime/io/disk-io-mgr.h M be/src/runtime/io/request-context.cc M be/src/runtime/io/request-context.h M be/src/runtime/io/request-ranges.h M be/src/runtime/io/scan-range.cc M be/src/runtime/tmp-file-mgr.cc M be/src/runtime/tmp-file-mgr.h M be/src/util/bit-util-test.cc M be/src/util/bit-util.h 19 files changed, 964 insertions(+), 505 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/07/8707/27 -- To view, visit http://gerrit.cloudera.org:8080/8707 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ia243bf8b62feeea602b122e0503ea4ba7d3ee70f Gerrit-Change-Number: 8707 Gerrit-PatchSet: 27 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-4835: Part 2: Allocate scan range buffers upfront
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/8707 ) Change subject: IMPALA-4835: Part 2: Allocate scan range buffers upfront .. Patch Set 26: (4 comments) http://gerrit.cloudera.org:8080/#/c/8707/26/be/src/runtime/io/disk-io-mgr.h File be/src/runtime/io/disk-io-mgr.h: http://gerrit.cloudera.org:8080/#/c/8707/26/be/src/runtime/io/disk-io-mgr.h@113 PS26, Line 113: /// concurrently consumes and processes the data. For example, the logic in a scanner thread Long line http://gerrit.cloudera.org:8080/#/c/8707/26/be/src/runtime/io/disk-io-mgr.h@140 PS26, Line 140: /// when the scan range has no buffers to read data into causes a resource deadlock. I should add a note to explain that it's actually ok to hold onto multiple buffers. http://gerrit.cloudera.org:8080/#/c/8707/26/be/src/runtime/io/disk-io-mgr.h@281 PS26, Line 281: encounted typo http://gerrit.cloudera.org:8080/#/c/8707/26/be/src/runtime/io/disk-io-mgr.h@290 PS26, Line 290: GetNextRange I feel like this name is pretty vague - I wonder if we should call it GetNextUnstartedRange() to disambiguate? -- To view, visit http://gerrit.cloudera.org:8080/8707 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia243bf8b62feeea602b122e0503ea4ba7d3ee70f Gerrit-Change-Number: 8707 Gerrit-PatchSet: 26 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Sat, 17 Feb 2018 00:28:02 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4835: Part 2: Allocate scan range buffers upfront
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/8707 ) Change subject: IMPALA-4835: Part 2: Allocate scan range buffers upfront .. Patch Set 25: (35 comments) Thanks for the comments. Addressed most of your points but had a few minor follow-on questions. Still need to do a second pass over my changes for typos, but I thought it was worth pushing out now (it's also easier to see typos for me in gerrit :)) http://gerrit.cloudera.org:8080/#/c/8707/25/be/src/exec/hdfs-parquet-scanner.cc File be/src/exec/hdfs-parquet-scanner.cc: http://gerrit.cloudera.org:8080/#/c/8707/25/be/src/exec/hdfs-parquet-scanner.cc@1445 PS25, Line 1445: ; > << "Already provided a buffer" was nice Done http://gerrit.cloudera.org:8080/#/c/8707/25/be/src/runtime/io/disk-io-mgr.h File be/src/runtime/io/disk-io-mgr.h: http://gerrit.cloudera.org:8080/#/c/8707/25/be/src/runtime/io/disk-io-mgr.h@71 PS25, Line 71: /// TODO: We should map readers to queries. A reader is the unit of scheduling and queries Moved this to live with the other TODOs at the bottom. http://gerrit.cloudera.org:8080/#/c/8707/25/be/src/runtime/io/disk-io-mgr.h@84 PS25, Line 84: GetNextRange: returns to the caller the next scan range it should process. : /// This is based on disk load. This also begins reading the data in this scan : /// range. This is blocking. > that could probably use updating? or maybe rephrase to talk about states or Removed some of the detail here that fits better in the method comments. http://gerrit.cloudera.org:8080/#/c/8707/25/be/src/runtime/io/disk-io-mgr.h@109 PS25, Line 109: Memory usage in the IoMgr comes from queued read buffers. If we queue the minimum : /// (i.e. 1), then the disks are idle while we are processing the buffer. If we don't : /// limit the queue, then it possible we end up queueing the entire data set (i.e. CPU : /// is slower than disks) and run out of memory. > this seems a bit outdated Removed this part, I think it was irrelevant in light of later sections. http://gerrit.cloudera.org:8080/#/c/8707/25/be/src/runtime/io/disk-io-mgr.h@116 PS25, Line 116: In that case each query will need fewer queued buffers and : /// therefore have less memory usage. : // > and that Reworked this and the below paragraph. It's all a bit outdated. http://gerrit.cloudera.org:8080/#/c/8707/25/be/src/runtime/io/disk-io-mgr.h@136 PS25, Line 136: a client buffer provided by the caller : /// when constructing the scan range. > in that case, is it required that the single buffer can fit the entire rang Clarified. http://gerrit.cloudera.org:8080/#/c/8707/25/be/src/runtime/io/disk-io-mgr.h@147 PS25, Line 147: buffers are owned > how does a buffer become owned by the caller? I guess that means when it's I guess we don't actually provide a way for clients to determine how many buffers were allocated. It's easy to expose but we don't have a use case right now. I reworded this comment in terms of GetNext()/ReturnBuffer() to suggest the conservative behaviour for now. http://gerrit.cloudera.org:8080/#/c/8707/25/be/src/runtime/io/disk-io-mgr.h@149 PS25, Line 149: it can allocate at least three : /// max-sized buffers per scan range > the API below (AllocateBuffersForRange()) doesn't seem to give the caller ( Reworded slightly - agree it implied that. The AllocateBuffersForRange() comment does mention 3 * max_buffer_size() so that makes it a little explicit at least. http://gerrit.cloudera.org:8080/#/c/8707/25/be/src/runtime/io/disk-io-mgr.h@269 PS25, Line 269: , > "on return" Done http://gerrit.cloudera.org:8080/#/c/8707/25/be/src/runtime/io/disk-io-mgr.h@282 PS25, Line 282: range is : /// available > what does it mean for a range to be available (from an API perspective)? I think the bit about blocking was actually wrong (or at least extremely misleading) - it doesn't block indefinitely but rather gives up ASAP if there are no unstarted ranges. There's a .wait() call on a condition variable, but I think that only exists to deal with races with ranges being pulled of queues (I'm sure there's a potential simplification there). http://gerrit.cloudera.org:8080/#/c/8707/25/be/src/runtime/io/disk-io-mgr.h@288 PS25, Line 288: 'needs_buffers' is set to true > this returns true in '*needs_buffers' Done http://gerrit.cloudera.org:8080/#/c/8707/25/be/src/runtime/io/disk-io-mgr.h@300 PS25, Line 300: allocate > allocated Done http://gerrit.cloudera.org:8080/#/c/8707/25/be/src/runtime/io/disk-io-mgr.h@474 PS25, Line 474: /// Reads the specified scan range and calls HandleReadFinished() when done. > this should probably mention how it might not do anything if there's no buf Done http://gerrit.cloudera.org:8080/#/c/8707/25/be/src/runtime/io/disk-io-mgr.cc File
[Impala-ASF-CR] IMPALA-4835: Part 2: Allocate scan range buffers upfront
Hello Michael Ho, Tianyi Wang, Bikramjeet Vig, Dan Hecht, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8707 to look at the new patch set (#26). Change subject: IMPALA-4835: Part 2: Allocate scan range buffers upfront .. IMPALA-4835: Part 2: Allocate scan range buffers upfront This change is a step towards reserving memory for buffers from the buffer pool and constraining per-scanner memory requirements. This change restructures the DiskIoMgr code so that each ScanRange operates with a fixed set of buffers that are allocated upfront and recycled as the I/O mgr works through the ScanRange. One major change is that ScanRanges get blocked when a buffer is not available and get unblocked when a client returns a buffer via ReturnBuffer(). I was able to remove the logic to maintain the blocked_ranges_ list by instead adding a separate set with all ranges that are active. I plan to merge this along with the actual buffer pool switch, but separated it out to allow review of the DiskIoMgr changes separate from other aspects of the buffer pool switchover. Testing: * Ran core and exhaustive tests. Change-Id: Ia243bf8b62feeea602b122e0503ea4ba7d3ee70f --- M be/src/exec/hdfs-parquet-scanner.cc M be/src/exec/hdfs-scan-node-mt.cc M be/src/exec/hdfs-scan-node.cc M be/src/exec/scanner-context.cc M be/src/runtime/bufferpool/buffer-pool.h M be/src/runtime/io/disk-io-mgr-internal.h M be/src/runtime/io/disk-io-mgr-stress-test.cc M be/src/runtime/io/disk-io-mgr-stress.cc M be/src/runtime/io/disk-io-mgr-test.cc M be/src/runtime/io/disk-io-mgr.cc M be/src/runtime/io/disk-io-mgr.h M be/src/runtime/io/request-context.cc M be/src/runtime/io/request-context.h M be/src/runtime/io/request-ranges.h M be/src/runtime/io/scan-range.cc M be/src/runtime/tmp-file-mgr.cc M be/src/runtime/tmp-file-mgr.h M be/src/util/bit-util-test.cc M be/src/util/bit-util.h 19 files changed, 961 insertions(+), 505 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/07/8707/26 -- To view, visit http://gerrit.cloudera.org:8080/8707 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ia243bf8b62feeea602b122e0503ea4ba7d3ee70f Gerrit-Change-Number: 8707 Gerrit-PatchSet: 26 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] [DOCS] Removed the obsolete Llama options files
Alex Rodoni has posted comments on this change. ( http://gerrit.cloudera.org:8080/9219 ) Change subject: [DOCS] Removed the obsolete Llama options files .. Patch Set 4: Cherry-picks added -- To view, visit http://gerrit.cloudera.org:8080/9219 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If0c2b8160af9c95ec1e1b744b558d9537dd2550d Gerrit-Change-Number: 9219 Gerrit-PatchSet: 4 Gerrit-Owner: Alex RodoniGerrit-Reviewer: Alex Rodoni Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: John Russell Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Sat, 17 Feb 2018 00:05:01 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5518: Allocate KrpcDataStreamRecvr RowBatch tuples from BufferPool
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/9344 ) Change subject: IMPALA-5518: Allocate KrpcDataStreamRecvr RowBatch tuples from BufferPool .. Patch Set 3: (1 comment) http://gerrit.cloudera.org:8080/#/c/9344/3//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/9344/3//COMMIT_MSG@29 PS3, Line 29: Note that the proper reservation mechanism of the exchange node > Good point. As a comparison, suppose we allocate the memory from non-buffer Bumped the buffer pool limit to 85% by default in the latest PS. -- To view, visit http://gerrit.cloudera.org:8080/9344 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If4b1a45f68b9df0d3b539511e15aff15700246f2 Gerrit-Change-Number: 9344 Gerrit-PatchSet: 3 Gerrit-Owner: Michael HoGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Sat, 17 Feb 2018 00:04:08 + Gerrit-HasComments: Yes
[Impala-ASF-CR] [DOCS] Removed the obsolete Llama options files
Hello Michael Brown, John Russell, Tim Armstrong, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/9219 to look at the new patch set (#4). Change subject: [DOCS] Removed the obsolete Llama options files .. [DOCS] Removed the obsolete Llama options files Removed the Llama options file. Removed impala_sqlref.ditamap that is not used. Removed the reference to impala_sqlref.ditamap in README.md Change-Id: If0c2b8160af9c95ec1e1b744b558d9537dd2550d Cherry-picks: not for 2.x --- M docs/README.md M docs/impala.ditamap D docs/impala_sqlref.ditamap M docs/shared/impala_common.xml D docs/topics/impala_reservation_request_timeout.xml D docs/topics/impala_v_cpu_cores.xml 6 files changed, 0 insertions(+), 261 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/19/9219/4 -- To view, visit http://gerrit.cloudera.org:8080/9219 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: If0c2b8160af9c95ec1e1b744b558d9537dd2550d Gerrit-Change-Number: 9219 Gerrit-PatchSet: 4 Gerrit-Owner: Alex RodoniGerrit-Reviewer: Alex Rodoni Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: John Russell Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-6499: [docs] Fixed formatting errors in split part function
John Russell has posted comments on this change. ( http://gerrit.cloudera.org:8080/9275 ) Change subject: IMPALA-6499: [docs] Fixed formatting errors in split_part function .. Patch Set 3: (4 comments) http://gerrit.cloudera.org:8080/#/c/9275/3//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/9275/3//COMMIT_MSG@7 PS3, Line 7: docs Uppercase "DOCS" http://gerrit.cloudera.org:8080/#/c/9275/3/docs/topics/impala_string_functions.xml File docs/topics/impala_string_functions.xml: http://gerrit.cloudera.org:8080/#/c/9275/3/docs/topics/impala_string_functions.xml@1067 PS3, Line 1067: : Return type: This looks like the whole paragraph was reformatted by accident. The line shouldn't be wrapped at the end of another line containing text. Was there any change to the actual text in this paragraph? http://gerrit.cloudera.org:8080/#/c/9275/3/docs/topics/impala_string_functions.xml@1075 PS3, Line 1075: Remove trailing space. http://gerrit.cloudera.org:8080/#/c/9275/3/docs/topics/impala_string_functions.xml@1129 PS3, Line 1129:
[Impala-ASF-CR] [DOCS] Removed the obsolete Llama options files
Hello Michael Brown, John Russell, Tim Armstrong, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/9219 to look at the new patch set (#3). Change subject: [DOCS] Removed the obsolete Llama options files .. [DOCS] Removed the obsolete Llama options files Removed the Llama options file. Removed impala_sqlref.ditamap that is not used. Removed the reference to impala_sqlref.ditamap in README.md Change-Id: If0c2b8160af9c95ec1e1b744b558d9537dd2550d --- M docs/README.md M docs/impala.ditamap D docs/impala_sqlref.ditamap M docs/shared/impala_common.xml D docs/topics/impala_reservation_request_timeout.xml D docs/topics/impala_v_cpu_cores.xml 6 files changed, 0 insertions(+), 261 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/19/9219/3 -- To view, visit http://gerrit.cloudera.org:8080/9219 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: If0c2b8160af9c95ec1e1b744b558d9537dd2550d Gerrit-Change-Number: 9219 Gerrit-PatchSet: 3 Gerrit-Owner: Alex RodoniGerrit-Reviewer: Alex Rodoni Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: John Russell Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] [DOCS] Fix in REPLICA PREFERENCE numeric options
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/9341 ) Change subject: [DOCS] Fix in REPLICA_PREFERENCE numeric options .. Patch Set 3: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/9341 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia10e69ac38229e0969db11b7edbcf08c2444602b Gerrit-Change-Number: 9341 Gerrit-PatchSet: 3 Gerrit-Owner: Alex RodoniGerrit-Reviewer: Alex Rodoni Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: John Russell Gerrit-Reviewer: Juan Yu Gerrit-Reviewer: Lars Volker Gerrit-Comment-Date: Fri, 16 Feb 2018 23:56:54 + Gerrit-HasComments: No
[Impala-ASF-CR] [DOCS] Fix in REPLICA PREFERENCE numeric options
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/9341 ) Change subject: [DOCS] Fix in REPLICA_PREFERENCE numeric options .. [DOCS] Fix in REPLICA_PREFERENCE numeric options Change-Id: Ia10e69ac38229e0969db11b7edbcf08c2444602b Reviewed-on: http://gerrit.cloudera.org:8080/9341 Reviewed-by: John RussellTested-by: Impala Public Jenkins --- M docs/topics/impala_replica_preference.xml 1 file changed, 1 insertion(+), 1 deletion(-) Approvals: John Russell: Looks good to me, approved Impala Public Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/9341 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: Ia10e69ac38229e0969db11b7edbcf08c2444602b Gerrit-Change-Number: 9341 Gerrit-PatchSet: 4 Gerrit-Owner: Alex Rodoni Gerrit-Reviewer: Alex Rodoni Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: John Russell Gerrit-Reviewer: Juan Yu Gerrit-Reviewer: Lars Volker
[Impala-ASF-CR] IMPALA-6509: [docs] Note for haproxy for Kerberized clusters
John Russell has posted comments on this change. ( http://gerrit.cloudera.org:8080/9286 ) Change subject: IMPALA-6509: [docs] Note for haproxy for Kerberized clusters .. Patch Set 5: (3 comments) http://gerrit.cloudera.org:8080/#/c/9286/5//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/9286/5//COMMIT_MSG@7 PS5, Line 7: docs Uppercase "DOCS" http://gerrit.cloudera.org:8080/#/c/9286/5/docs/topics/impala_proxy.xml File docs/topics/impala_proxy.xml: http://gerrit.cloudera.org:8080/#/c/9286/5/docs/topics/impala_proxy.xml@244 PS5, Line 244: k Uppercase "Kerberized" is the convention we use elsewhere. You'll only see it lowercase in user-entered text like some JIRA titles. http://gerrit.cloudera.org:8080/#/c/9286/5/docs/topics/impala_proxy.xml@247 PS5, Line 247: : Please put the note outside the paragraph, i.e. ... ... It makes the output more predictable in terms of indentation and spacing, and is also friendlier from a maintenance perspective because there's less indentation within the actual source file (i.e. leaves more space on each line for actual text, leading to less line wrapping). -- To view, visit http://gerrit.cloudera.org:8080/9286 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1ce9930e3f658c52502a2ba95f93647e2706d58c Gerrit-Change-Number: 9286 Gerrit-PatchSet: 5 Gerrit-Owner: Alex RodoniGerrit-Reviewer: Alex Rodoni Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: John Russell Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Fri, 16 Feb 2018 23:56:06 + Gerrit-HasComments: Yes
[Impala-ASF-CR] [DOCS] Typos fixed in Impala Analytic Functions doc
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/9347 ) Change subject: [DOCS] Typos fixed in Impala Analytic Functions doc .. Patch Set 1: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/9347 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iec4a2822f5e066574e64bf025d300e4cde7a7d29 Gerrit-Change-Number: 9347 Gerrit-PatchSet: 1 Gerrit-Owner: Alex RodoniGerrit-Reviewer: Alex Rodoni Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: John Russell Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 16 Feb 2018 23:55:32 + Gerrit-HasComments: No
[Impala-ASF-CR] [DOCS] Typos fixed in Impala Analytic Functions doc
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/9347 ) Change subject: [DOCS] Typos fixed in Impala Analytic Functions doc .. [DOCS] Typos fixed in Impala Analytic Functions doc Change-Id: Iec4a2822f5e066574e64bf025d300e4cde7a7d29 Reviewed-on: http://gerrit.cloudera.org:8080/9347 Reviewed-by: Tim ArmstrongReviewed-by: John Russell Tested-by: Impala Public Jenkins --- M docs/shared/impala_common.xml M docs/topics/impala_analytic_functions.xml 2 files changed, 3 insertions(+), 3 deletions(-) Approvals: Tim Armstrong: Looks good to me, approved John Russell: Looks good to me, approved Impala Public Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/9347 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: Iec4a2822f5e066574e64bf025d300e4cde7a7d29 Gerrit-Change-Number: 9347 Gerrit-PatchSet: 2 Gerrit-Owner: Alex Rodoni Gerrit-Reviewer: Alex Rodoni Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: John Russell Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-5518: Allocate KrpcDataStreamRecvr RowBatch tuples from BufferPool
Hello Tim Armstrong, Dan Hecht, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/9344 to look at the new patch set (#6). Change subject: IMPALA-5518: Allocate KrpcDataStreamRecvr RowBatch tuples from BufferPool .. IMPALA-5518: Allocate KrpcDataStreamRecvr RowBatch tuples from BufferPool Previously, tuple pointers of a row batch are allocated from the heap via malloc() and tuple data is allocated from the MemPool associated with the RowBatch. This change converts the allocations of tuple pointers and tuple data to using BufferPool for row batches allocated from KrpcDataStreamRecvr. The primary motivation for this change is to take advantage of the fact that buffers allocated from BufferPool always go back to the per-core arena they came from when they are freed. This alleviates the TCMalloc imbalance between the RPC service threads and the fragment execution threads. As described in IMPALA-5518, row batches are always allocated from the service threads' TCMalloc cache and placed into the fragment execution threads' TCMalloc cache when they're freed. This leads to underflow and overflow in those threads' caches and high contention for the spinlock of the central free list. With BufferPool, the memory always went back to its originating arena so this kind of imbalance is less likely to occur. This also dovetails with the long term plan to put most allocations under BufferPool and have each operators in the plan reserved appropriate amount of memory before execution. Note that the proper reservation mechanism of the exchange node hasn't yet been implemented in this change so the buffer pool client handle used for allocating buffers has an ad-hoc set-up of no reservation limit and using root reservation tracker as parent. This needs to be fixed as part of IMPALA-6524. The default buffer pool limit is also bumped to 85% to account for the extra usage from the exchange nodes. The minimum buffer size is also lowered to 8KB to reduce amount of memory wastage as a row batch's tuple pointers / tuple data can sometimes be much smaller than 64KB. Testing done: Debug core build. Change-Id: If4b1a45f68b9df0d3b539511e15aff15700246f2 --- M be/src/common/global-flags.cc M be/src/exec/blocking-join-node.cc M be/src/exec/exchange-node.cc M be/src/exec/exchange-node.h M be/src/exec/partial-sort-node.cc M be/src/runtime/data-stream-mgr-base.h M be/src/runtime/data-stream-mgr.cc M be/src/runtime/data-stream-mgr.h M be/src/runtime/data-stream-test.cc M be/src/runtime/exec-env.h M be/src/runtime/krpc-data-stream-mgr.cc M be/src/runtime/krpc-data-stream-mgr.h M be/src/runtime/krpc-data-stream-recvr.cc M be/src/runtime/krpc-data-stream-recvr.h M be/src/runtime/row-batch.cc M be/src/runtime/row-batch.h M be/src/service/query-options-test.cc M testdata/workloads/functional-query/queries/QueryTest/bloom_filters.test 18 files changed, 217 insertions(+), 85 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/44/9344/6 -- To view, visit http://gerrit.cloudera.org:8080/9344 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: If4b1a45f68b9df0d3b539511e15aff15700246f2 Gerrit-Change-Number: 9344 Gerrit-PatchSet: 6 Gerrit-Owner: Michael HoGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-6515: [docs] HAproxy with sticky session requires the check option
John Russell has posted comments on this change. ( http://gerrit.cloudera.org:8080/9293 ) Change subject: IMPALA-6515: [docs] HAproxy with sticky session requires the check option .. Patch Set 3: Different people from various field organizations have requested / suggested all sorts of combinations of HAProxy settings. There has never been a consistent set that worked the best for everyone. I suggest tagging in Alan Choi since he was the first one who verified the HAProxy instructions. -- To view, visit http://gerrit.cloudera.org:8080/9293 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I47165df6624f958c2e0542e2627d4f5377789ab8 Gerrit-Change-Number: 9293 Gerrit-PatchSet: 3 Gerrit-Owner: Alex RodoniGerrit-Reviewer: John Russell Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Fri, 16 Feb 2018 23:51:29 + Gerrit-HasComments: No
[Impala-ASF-CR] [docs] Removed the obsolete Llama options files
Michael Brown has posted comments on this change. ( http://gerrit.cloudera.org:8080/9219 ) Change subject: [docs] Removed the obsolete Llama options files .. Patch Set 2: > (1 comment) > > Now the question is, how do we make sure on the doc side that this > change stays only in the 3.x (master?) branch and doesn't get > applied to Impala 2.x branches? If you don't want a commit to end up in the 2.x branch, your commit message needs to have this line: Cherry-picks: not for 2.x This granularity is at the commit level: if this entire commit doesn't belong in 2.x, just git commit --amend the commit message and push to Gerrit. -- To view, visit http://gerrit.cloudera.org:8080/9219 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If0c2b8160af9c95ec1e1b744b558d9537dd2550d Gerrit-Change-Number: 9219 Gerrit-PatchSet: 2 Gerrit-Owner: Alex RodoniGerrit-Reviewer: Alex Rodoni Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: John Russell Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 16 Feb 2018 23:50:42 + Gerrit-HasComments: No
[Impala-ASF-CR] [DOCS] Fix in REPLICA PREFERENCE numeric options
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/9341 ) Change subject: [DOCS] Fix in REPLICA_PREFERENCE numeric options .. Patch Set 3: Build started: https://jenkins.impala.io/job/gerrit-docs-submit/198/ -- To view, visit http://gerrit.cloudera.org:8080/9341 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia10e69ac38229e0969db11b7edbcf08c2444602b Gerrit-Change-Number: 9341 Gerrit-PatchSet: 3 Gerrit-Owner: Alex RodoniGerrit-Reviewer: Alex Rodoni Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: John Russell Gerrit-Reviewer: Juan Yu Gerrit-Reviewer: Lars Volker Gerrit-Comment-Date: Fri, 16 Feb 2018 23:48:50 + Gerrit-HasComments: No
[Impala-ASF-CR] [DOCS] Fix in REPLICA PREFERENCE numeric options
John Russell has posted comments on this change. ( http://gerrit.cloudera.org:8080/9341 ) Change subject: [DOCS] Fix in REPLICA_PREFERENCE numeric options .. Patch Set 3: I'm not familiar with whatever error condition is indicated in the console output from the verification job, i.e. https://jenkins.impala.io/job/gerrit-docs-submit/195/console I'm going to try the verification job again with the hope that it's a transient error. If not, something might be going wrong with the commit flow (incorrect rebase, or review created from master instead of a private branch); if that's the case, the solution would be to abandon and re-do. -- To view, visit http://gerrit.cloudera.org:8080/9341 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia10e69ac38229e0969db11b7edbcf08c2444602b Gerrit-Change-Number: 9341 Gerrit-PatchSet: 3 Gerrit-Owner: Alex RodoniGerrit-Reviewer: Alex Rodoni Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: John Russell Gerrit-Reviewer: Juan Yu Gerrit-Reviewer: Lars Volker Gerrit-Comment-Date: Fri, 16 Feb 2018 23:48:30 + Gerrit-HasComments: No
[Impala-ASF-CR] [DOCS] Typos fixed in Impala Analytic Functions doc
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/9347 ) Change subject: [DOCS] Typos fixed in Impala Analytic Functions doc .. Patch Set 1: Build started: https://jenkins.impala.io/job/gerrit-docs-submit/197/ -- To view, visit http://gerrit.cloudera.org:8080/9347 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iec4a2822f5e066574e64bf025d300e4cde7a7d29 Gerrit-Change-Number: 9347 Gerrit-PatchSet: 1 Gerrit-Owner: Alex RodoniGerrit-Reviewer: Alex Rodoni Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: John Russell Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 16 Feb 2018 23:44:41 + Gerrit-HasComments: No
[Impala-ASF-CR] [DOCS] Typos fixed in Impala Analytic Functions doc
John Russell has posted comments on this change. ( http://gerrit.cloudera.org:8080/9347 ) Change subject: [DOCS] Typos fixed in Impala Analytic Functions doc .. Patch Set 1: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/9347 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iec4a2822f5e066574e64bf025d300e4cde7a7d29 Gerrit-Change-Number: 9347 Gerrit-PatchSet: 1 Gerrit-Owner: Alex RodoniGerrit-Reviewer: Alex Rodoni Gerrit-Reviewer: John Russell Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 16 Feb 2018 23:43:58 + Gerrit-HasComments: No
[Impala-ASF-CR] [docs] Removed the obsolete Llama options files
John Russell has posted comments on this change. ( http://gerrit.cloudera.org:8080/9219 ) Change subject: [docs] Removed the obsolete Llama options files .. Patch Set 2: (1 comment) Now the question is, how do we make sure on the doc side that this change stays only in the 3.x (master?) branch and doesn't get applied to Impala 2.x branches? (I presume the stub files with their warnings should stay in place for any 2.12, 2.13, etc. releases that are done. http://gerrit.cloudera.org:8080/#/c/9219/2//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/9219/2//COMMIT_MSG@7 PS2, Line 7: docs Uppercase "DOCS" -- To view, visit http://gerrit.cloudera.org:8080/9219 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If0c2b8160af9c95ec1e1b744b558d9537dd2550d Gerrit-Change-Number: 9219 Gerrit-PatchSet: 2 Gerrit-Owner: Alex RodoniGerrit-Reviewer: Alex Rodoni Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: John Russell Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 16 Feb 2018 23:42:37 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5518: Allocate KrpcDataStreamRecvr RowBatch tuples from BufferPool
Hello Tim Armstrong, Dan Hecht, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/9344 to look at the new patch set (#5). Change subject: IMPALA-5518: Allocate KrpcDataStreamRecvr RowBatch tuples from BufferPool .. IMPALA-5518: Allocate KrpcDataStreamRecvr RowBatch tuples from BufferPool Previously, tuple pointers of a row batch are allocated from the heap via malloc() and tuple data is allocated from the MemPool associated with the RowBatch. This change converts the allocations of tuple pointers and tuple data to using BufferPool for row batches allocated from KrpcDataStreamRecvr. The primary motivation for this change is to take advantage of the fact that buffers allocated from BufferPool always go back to the per-core arena they came from when they are freed. This alleviates the TCMalloc imbalance between the RPC service threads and the fragment execution threads. As described in IMPALA-5518, row batches are always allocated from the service threads' TCMalloc cache and placed into the fragment execution threads' TCMalloc cache when they're freed. This leads to underflow and overflow in those threads' caches and high contention for the spinlock of the central free list. With BufferPool, the memory always went back to its originating arena so this kind of imbalance is less likely to occur. This also dovetails with the long term plan to put most allocations under BufferPool and have each operators in the plan reserved appropriate amount of memory before execution. Note that the proper reservation mechanism of the exchange node hasn't yet been implemented in this change so the buffer pool client handle used for allocating buffers has an ad-hoc set-up of no reservation limit and using root reservation tracker as parent. This needs to be fixed as part of IMPALA-6524. Part of this change also lowers the minimum buffer size to 32KB to reduce amount of memory wastage as a row batch's tuple pointers / tuple data can sometimes be smaller than 64KB. Testing done: Debug core build. Change-Id: If4b1a45f68b9df0d3b539511e15aff15700246f2 --- M be/src/common/global-flags.cc M be/src/exec/blocking-join-node.cc M be/src/exec/exchange-node.cc M be/src/exec/exchange-node.h M be/src/exec/partial-sort-node.cc M be/src/runtime/data-stream-mgr-base.h M be/src/runtime/data-stream-mgr.cc M be/src/runtime/data-stream-mgr.h M be/src/runtime/data-stream-test.cc M be/src/runtime/exec-env.h M be/src/runtime/krpc-data-stream-mgr.cc M be/src/runtime/krpc-data-stream-mgr.h M be/src/runtime/krpc-data-stream-recvr.cc M be/src/runtime/krpc-data-stream-recvr.h M be/src/runtime/row-batch.cc M be/src/runtime/row-batch.h M be/src/service/query-options-test.cc M testdata/workloads/functional-query/queries/QueryTest/bloom_filters.test 18 files changed, 216 insertions(+), 84 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/44/9344/5 -- To view, visit http://gerrit.cloudera.org:8080/9344 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: If4b1a45f68b9df0d3b539511e15aff15700246f2 Gerrit-Change-Number: 9344 Gerrit-PatchSet: 5 Gerrit-Owner: Michael HoGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-6116: Bound memory usage of DataStreamSevice's service queue
Hello Lars Volker, Tim Armstrong, Dan Hecht, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/9282 to look at the new patch set (#5). Change subject: IMPALA-6116: Bound memory usage of DataStreamSevice's service queue .. IMPALA-6116: Bound memory usage of DataStreamSevice's service queue The fix for IMPALA-6193 added a memory tracker for the memory consumed by the payloads in the service queue of DataStreamService. This change extends it by introducing a bound on the memory usage for that service queue. In addition, it deprecates FLAGS_datastream_service_queue_depth and replaces it with FLAGS_datastream_service_queue_mem_limit. These flags only take effect when KRPC is in use and KRPC was never enabled in any previous releases so it seems safe to do this flag replacement. The new flag FLAGS_datastream_service_queue_mem_limit directly dictates the amount of memory which can be consumed by the service queue of DataStreamService. This allows a more direct control over the memory usage of the queue instead of inferring via the number of entries in the queue. The default value of this flag is left at 0, in which case it will be set to 5% of process memory limit. Testing done: exhaustive debug builds. Updated data-stream-test to exercise the case in which the payload is larger than the limit. Change-Id: Idea4262dfb0e0aa8d58ff6ea6a8248e880b9 --- M be/src/rpc/impala-service-pool.cc M be/src/rpc/impala-service-pool.h M be/src/rpc/rpc-mgr-test-base.h M be/src/rpc/rpc-mgr-test.cc M be/src/rpc/rpc-mgr.cc M be/src/rpc/rpc-mgr.h M be/src/runtime/data-stream-test.cc M be/src/runtime/exec-env.cc M be/src/runtime/exec-env.h M be/src/runtime/krpc-data-stream-mgr.cc M be/src/runtime/krpc-data-stream-mgr.h M be/src/runtime/mem-tracker.h M be/src/service/data-stream-service.cc M be/src/service/data-stream-service.h M tests/custom_cluster/test_krpc_mem_usage.py 15 files changed, 282 insertions(+), 162 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/82/9282/5 -- To view, visit http://gerrit.cloudera.org:8080/9282 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Idea4262dfb0e0aa8d58ff6ea6a8248e880b9 Gerrit-Change-Number: 9282 Gerrit-PatchSet: 5 Gerrit-Owner: Michael HoGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-6269: Expose KRPC metrics on debug webpage
Hello Michael Ho, Sailesh Mukil, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/9292 to look at the new patch set (#11). Change subject: IMPALA-6269: Expose KRPC metrics on debug webpage .. IMPALA-6269: Expose KRPC metrics on debug webpage This change exposes KRPC metrics on the /rpcz debug web page. This change also exposes metrics for rejected RPCs on the /metrics debug web page. See here for an example: https://git.io/vAczm This change also fixes a bug in PrettyPrinter::GetByteUnit(), which previously did not work for unsigned values due to an implicit cast. This change contains tests to check that the metrics show up in /rpcz and /metrics and that they update as expected when executing queries. This change is based on a change by Sailesh Mukil. Change-Id: I7af7c1a84a5be82c979ca4ef1edf35167493be3f --- M be/src/rpc/impala-service-pool.cc M be/src/rpc/impala-service-pool.h M be/src/rpc/rpc-mgr.cc M be/src/rpc/rpc-mgr.h M be/src/rpc/rpc-trace.cc M be/src/rpc/rpc-trace.h M be/src/runtime/exec-env.cc M be/src/runtime/exec-env.h M be/src/service/impalad-main.cc M be/src/util/histogram-metric.h M be/src/util/pretty-printer.h M common/thrift/Metrics.thrift M common/thrift/metrics.json A tests/custom_cluster/test_krpc_metrics.py M tests/webserver/test_web_pages.py M www/rpcz.tmpl 16 files changed, 503 insertions(+), 106 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/92/9292/11 -- To view, visit http://gerrit.cloudera.org:8080/9292 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I7af7c1a84a5be82c979ca4ef1edf35167493be3f Gerrit-Change-Number: 9292 Gerrit-PatchSet: 11 Gerrit-Owner: Lars VolkerGerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil
[Impala-ASF-CR] IMPALA-6269: Expose KRPC metrics on debug webpage
Lars Volker has posted comments on this change. ( http://gerrit.cloudera.org:8080/9292 ) Change subject: IMPALA-6269: Expose KRPC metrics on debug webpage .. Patch Set 10: Please see the tests in PS10. -- To view, visit http://gerrit.cloudera.org:8080/9292 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7af7c1a84a5be82c979ca4ef1edf35167493be3f Gerrit-Change-Number: 9292 Gerrit-PatchSet: 10 Gerrit-Owner: Lars VolkerGerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Fri, 16 Feb 2018 22:05:45 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6269: Expose KRPC metrics on debug webpage
Hello Michael Ho, Sailesh Mukil, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/9292 to look at the new patch set (#10). Change subject: IMPALA-6269: Expose KRPC metrics on debug webpage .. IMPALA-6269: Expose KRPC metrics on debug webpage This change exposes KRPC metrics on the /rpcz debug web page. This change also exposes metrics for rejected RPCs on the /metrics debug web page. See here for an example: https://git.io/vAczm This change also fixes a bug in PrettyPrinter::GetByteUnit(), which previously did not work for unsigned values due to an implicit cast. This change is based on a change by Sailesh Mukil. TODO: testing, once IMPALA-6508 has been submitted. Change-Id: I7af7c1a84a5be82c979ca4ef1edf35167493be3f --- M be/src/rpc/impala-service-pool.cc M be/src/rpc/impala-service-pool.h M be/src/rpc/rpc-mgr.cc M be/src/rpc/rpc-mgr.h M be/src/rpc/rpc-trace.cc M be/src/rpc/rpc-trace.h M be/src/runtime/exec-env.cc M be/src/runtime/exec-env.h M be/src/service/impalad-main.cc M be/src/util/histogram-metric.h M be/src/util/pretty-printer.h M common/thrift/Metrics.thrift M common/thrift/metrics.json A tests/custom_cluster/test_krpc_metrics.py M tests/webserver/test_web_pages.py M www/rpcz.tmpl 16 files changed, 503 insertions(+), 106 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/92/9292/10 -- To view, visit http://gerrit.cloudera.org:8080/9292 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I7af7c1a84a5be82c979ca4ef1edf35167493be3f Gerrit-Change-Number: 9292 Gerrit-PatchSet: 10 Gerrit-Owner: Lars VolkerGerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil
[Impala-ASF-CR] IMPALA-6526: Fix spilling test for running on local FS
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/9354 ) Change subject: IMPALA-6526: Fix spilling test for running on local FS .. Patch Set 1: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1949/ -- To view, visit http://gerrit.cloudera.org:8080/9354 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I50648d7936007a26891cf64d6343c47d9d646596 Gerrit-Change-Number: 9354 Gerrit-PatchSet: 1 Gerrit-Owner: Bikramjeet VigGerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 16 Feb 2018 21:51:48 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6116: Bound memory usage of DataStreamSevice's service queue
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/9282 ) Change subject: IMPALA-6116: Bound memory usage of DataStreamSevice's service queue .. Patch Set 4: (1 comment) http://gerrit.cloudera.org:8080/#/c/9282/4/be/src/runtime/mem-tracker.h File be/src/runtime/mem-tracker.h: http://gerrit.cloudera.org:8080/#/c/9282/4/be/src/runtime/mem-tracker.h@252 PS4, Line 252: skip_gc = false > Should we change this so that we only GC if the process limit was exceeded? Actually, Tim pointed out that we only installed GC functions for process mem tracker so I was just misreading the code then. Sorry for the confusion. Will remove the flag. -- To view, visit http://gerrit.cloudera.org:8080/9282 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Idea4262dfb0e0aa8d58ff6ea6a8248e880b9 Gerrit-Change-Number: 9282 Gerrit-PatchSet: 4 Gerrit-Owner: Michael HoGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 16 Feb 2018 21:50:16 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6526: Fix spilling test for running on local FS
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/9354 ) Change subject: IMPALA-6526: Fix spilling test for running on local FS .. Patch Set 1: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/9354 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I50648d7936007a26891cf64d6343c47d9d646596 Gerrit-Change-Number: 9354 Gerrit-PatchSet: 1 Gerrit-Owner: Bikramjeet VigGerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 16 Feb 2018 21:47:46 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6526: Fix spilling test for running on local FS
Bikramjeet Vig has uploaded this change for review. ( http://gerrit.cloudera.org:8080/9354 Change subject: IMPALA-6526: Fix spilling test for running on local FS .. IMPALA-6526: Fix spilling test for running on local FS One of the spilling test was failing because its minimum bufferpool mem requirement was more when ran on local FS as compared to when it is run on HDFS. The fix is to increase the bufferpool limit to a value just above the min limit so that it still forces spill to disk on both filesystems. Testing: Ran core tests with local FS as target file system. Made sure the failing test passed. Change-Id: I50648d7936007a26891cf64d6343c47d9d646596 --- M testdata/workloads/functional-query/queries/QueryTest/spilling.test 1 file changed, 1 insertion(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/54/9354/1 -- To view, visit http://gerrit.cloudera.org:8080/9354 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I50648d7936007a26891cf64d6343c47d9d646596 Gerrit-Change-Number: 9354 Gerrit-PatchSet: 1 Gerrit-Owner: Bikramjeet Vig
[Impala-ASF-CR] IMPALA-5518: Allocate KrpcDataStreamRecvr RowBatch tuples from BufferPool
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/9344 ) Change subject: IMPALA-5518: Allocate KrpcDataStreamRecvr RowBatch tuples from BufferPool .. Patch Set 1: (2 comments) http://gerrit.cloudera.org:8080/#/c/9344/1/be/src/exec/exchange-node.cc File be/src/exec/exchange-node.cc: http://gerrit.cloudera.org:8080/#/c/9344/1/be/src/exec/exchange-node.cc@159 PS1, Line 159: Status ExchangeNode::GetNext(RuntimeState* state, RowBatch* output_batch, bool* eos) { > I thought we discussed this in the past and the conclusion was that we can Oops, you are right. http://gerrit.cloudera.org:8080/#/c/9344/1/be/src/runtime/row-batch.cc File be/src/runtime/row-batch.cc: http://gerrit.cloudera.org:8080/#/c/9344/1/be/src/runtime/row-batch.cc@123 PS1, Line 123: BufferPool::BufferHandle* tuple_ptrs_buffer = &(tuple_ptrs_info_->buffer); > Mostly 8KB I suspect if there are 1024 rows and 1 tuple (e.g. from scan nod Seems ok to me. -- To view, visit http://gerrit.cloudera.org:8080/9344 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If4b1a45f68b9df0d3b539511e15aff15700246f2 Gerrit-Change-Number: 9344 Gerrit-PatchSet: 1 Gerrit-Owner: Michael HoGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 16 Feb 2018 21:22:23 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5752: Add support for DECIMAL on Kudu tables
Hello Thomas Tauber-Marshall, Taras Bobrovytsky, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/9306 to look at the new patch set (#4). Change subject: IMPALA-5752: Add support for DECIMAL on Kudu tables .. IMPALA-5752: Add support for DECIMAL on Kudu tables Adds support for the Kudu DECIMAL type introduced in Kudu 1.7.0. Note: Adding support for Kudu decimal min/max filters is tracked in IMPALA-6533. Tests: * Added Kudu create with decimal test to AnalyzeDDLTest.java * Added Kudu table_format to test_decimal_queries.py ** Both decimal.test and decimal-exprs.test workloads * Added decimal queries to the following Kudu workloads: ** kudu_create.test ** kudu_delete.test ** kudu_insert.test ** kudu_update.test ** kudu_upsert.test Change-Id: I2f927fce25dc4fa9529a4e0b688825699f5c0ea6 --- M be/src/exec/kudu-scanner.cc M be/src/exec/kudu-table-sink.cc M be/src/exec/kudu-util.cc M be/src/exec/kudu-util.h M be/src/exprs/kudu-partition-expr.cc M fe/src/main/java/org/apache/impala/catalog/KuduColumn.java M fe/src/main/java/org/apache/impala/planner/KuduScanNode.java M fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java M fe/src/main/java/org/apache/impala/service/KuduCatalogOpExecutor.java M fe/src/main/java/org/apache/impala/util/KuduUtil.java M fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java M testdata/datasets/functional/schema_constraints.csv M testdata/workloads/functional-query/queries/QueryTest/kudu_create.test M testdata/workloads/functional-query/queries/QueryTest/kudu_delete.test M testdata/workloads/functional-query/queries/QueryTest/kudu_describe.test M testdata/workloads/functional-query/queries/QueryTest/kudu_insert.test M testdata/workloads/functional-query/queries/QueryTest/kudu_update.test M testdata/workloads/functional-query/queries/QueryTest/kudu_upsert.test M tests/query_test/test_decimal_queries.py M tests/query_test/test_kudu.py 20 files changed, 716 insertions(+), 603 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/06/9306/4 -- To view, visit http://gerrit.cloudera.org:8080/9306 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I2f927fce25dc4fa9529a4e0b688825699f5c0ea6 Gerrit-Change-Number: 9306 Gerrit-PatchSet: 4 Gerrit-Owner: Grant HenkeGerrit-Reviewer: Grant Henke Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Thomas Tauber-Marshall
[Impala-ASF-CR] IMPALA-6258: Uninitialized tuple pointers in row batch for empty rows
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/9239 ) Change subject: IMPALA-6258: Uninitialized tuple pointers in row batch for empty rows .. Patch Set 6: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/9239 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2981227e62eb5971508e0698e189519755de Gerrit-Change-Number: 9239 Gerrit-PatchSet: 6 Gerrit-Owner: Zoltan Borok-NagyGerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Fri, 16 Feb 2018 20:23:20 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6258: Uninitialized tuple pointers in row batch for empty rows
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/9239 ) Change subject: IMPALA-6258: Uninitialized tuple pointers in row batch for empty rows .. IMPALA-6258: Uninitialized tuple pointers in row batch for empty rows Tuple pointers in the generated row batches may not be initialized if a tuple has byte size 0. There are some codes which compare these uninitialized pointers against nullptr so having them uninitialized may return wrong (and non-deterministic) results, e.g.: impala::TupleIsNullPredicate::GetBooleanVal() The following query produces non-deterministic results currently: SELECT count(v.x) FROM functional.alltypestiny t3 LEFT OUTER JOIN ( SELECT true AS x FROM functional.alltypestiny t1 LEFT OUTER JOIN functional.alltypestiny t2 ON (true)) v ON (v.x = t3.bool_col) WHERE t3.bool_col = true; The alltypestiny table has 8 records, 4 records of them has the true boolean value for bool_col. Therefore, the above query is a fancy way of saying "8 * 8 * 4", i.e. it should return 256. The solution is that scanners initialize tuple pointers to a non-null value when there are no materialized slots. This non-null value is provided by the new static member Tuple::POISON. I extended QueryTest/scanners.test with the above query. This test executes the query against all table formats. This change has the biggest performance impact on count(*) queries on large kudu tables. For my quick benchmark I copied tpch_kudu.lineitem and doubled its data. The resulting table has 12,002,430 rows. Without this patch 'select count(*) from biglineitem' runs for ~0.12s. With the patch applied, the overhead is around a dozens of ms. I measured the query on my desktop PC using a relase build of Impala. On debug builds, the execution time of the patched version is around 160% of the original version. Without this patch: +--++--+--+++---+---+-+ | Operator | #Hosts | Avg Time | Max Time | #Rows | Est. #Rows | Peak Mem | Est. Peak Mem | Detail | +--++--+--+++---+---+-+ | 03:AGGREGATE | 1 | 127.50us | 127.50us | 1 | 1 | 28.00 KB | 10.00 MB | FINALIZE| | 02:EXCHANGE | 1 | 22.32ms | 22.32ms | 3 | 1 | 0 B | 0 B | UNPARTITIONED | | 01:AGGREGATE | 3 | 1.78ms | 1.89ms | 3 | 1 | 16.00 KB | 10.00 MB | | | 00:SCAN KUDU | 3 | 8.00ms | 8.28ms | 12.00M | -1 | 512.00 KB | 0 B | default.biglineitem | +--++--+--+++---+---+-+ With this patch: +--++--+--+++---+---+-+ | Operator | #Hosts | Avg Time | Max Time | #Rows | Est. #Rows | Peak Mem | Est. Peak Mem | Detail | +--++--+--+++---+---+-+ | 03:AGGREGATE | 1 | 129.01us | 129.01us | 1 | 1 | 28.00 KB | 10.00 MB | FINALIZE| | 02:EXCHANGE | 1 | 33.00ms | 33.00ms | 3 | 1 | 0 B | 0 B | UNPARTITIONED | | 01:AGGREGATE | 3 | 1.99ms | 2.13ms | 3 | 1 | 16.00 KB | 10.00 MB | | | 00:SCAN KUDU | 3 | 13.13ms | 13.97ms | 12.00M | -1 | 512.00 KB | 0 B | default.biglineitem | +--++--+--+++---+---+-+ Change-Id: I2981227e62eb5971508e0698e189519755de Reviewed-on: http://gerrit.cloudera.org:8080/9239 Reviewed-by: Tim ArmstrongTested-by: Impala Public Jenkins --- M be/src/exec/hdfs-scanner.cc M be/src/exec/kudu-scanner.cc M be/src/runtime/tuple.cc M be/src/runtime/tuple.h M testdata/workloads/functional-query/queries/QueryTest/scanners.test 5 files changed, 27 insertions(+), 2 deletions(-) Approvals: Tim Armstrong: Looks good to me, approved Impala Public Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/9239 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I2981227e62eb5971508e0698e189519755de Gerrit-Change-Number: 9239 Gerrit-PatchSet: 7 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy
[Impala-ASF-CR] IMPALA-5518: Allocate KrpcDataStreamRecvr RowBatch tuples from BufferPool
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/9344 ) Change subject: IMPALA-5518: Allocate KrpcDataStreamRecvr RowBatch tuples from BufferPool .. Patch Set 1: (2 comments) http://gerrit.cloudera.org:8080/#/c/9344/1/be/src/exec/exchange-node.cc File be/src/exec/exchange-node.cc: http://gerrit.cloudera.org:8080/#/c/9344/1/be/src/exec/exchange-node.cc@159 PS1, Line 159: Status ExchangeNode::GetNext(RuntimeState* state, RowBatch* output_batch, bool* eos) { > Forgot to check this. I think we should be setting the flush_resources_ bit I thought we discussed this in the past and the conclusion was that we can live with this wrong accounting for now if it means not breaking the ability to accumulate multiple row batches here in ExchangeNode::GetNext(). http://gerrit.cloudera.org:8080/#/c/9344/1/be/src/runtime/row-batch.cc File be/src/runtime/row-batch.cc: http://gerrit.cloudera.org:8080/#/c/9344/1/be/src/runtime/row-batch.cc@123 PS1, Line 123: BufferPool::BufferHandle* tuple_ptrs_buffer = &(tuple_ptrs_info_->buffer); > The common case is that the caller of ExchangeNode::GetNext() operates on a Mostly 8KB I suspect if there are 1024 rows and 1 tuple (e.g. from scan node). From join, it may be 16KB with 2 tuples. How about we lower the min buffer size to 8KB. Will the overhead be unbearable then ? -- To view, visit http://gerrit.cloudera.org:8080/9344 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If4b1a45f68b9df0d3b539511e15aff15700246f2 Gerrit-Change-Number: 9344 Gerrit-PatchSet: 1 Gerrit-Owner: Michael HoGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 16 Feb 2018 20:16:32 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6337: Fix infinite loop in Impala shell
Vuk Ercegovac has posted comments on this change. ( http://gerrit.cloudera.org:8080/9195 ) Change subject: IMPALA-6337: Fix infinite loop in Impala shell .. Patch Set 9: Code-Review+1 lgtm. would anyone with more experience with the shell/python care to have a look at this patch? -- To view, visit http://gerrit.cloudera.org:8080/9195 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9142f21a888189d351f00ce09baeba123bc0959b Gerrit-Change-Number: 9195 Gerrit-PatchSet: 9 Gerrit-Owner: Fredy WijayaGerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Fri, 16 Feb 2018 20:02:17 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6337: Fix infinite loop in Impala shell
Fredy Wijaya has uploaded a new patch set (#9). ( http://gerrit.cloudera.org:8080/9195 ) Change subject: IMPALA-6337: Fix infinite loop in Impala shell .. IMPALA-6337: Fix infinite loop in Impala shell The patch uses a workaround for a bug in the sqlparse.split() function by joining the statements that contain error tokens into a single statement Testing: - Ran end-to-end shell tests Change-Id: I9142f21a888189d351f00ce09baeba123bc0959b --- M shell/impala_shell.py M tests/shell/test_shell_interactive.py 2 files changed, 59 insertions(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/95/9195/9 -- To view, visit http://gerrit.cloudera.org:8080/9195 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I9142f21a888189d351f00ce09baeba123bc0959b Gerrit-Change-Number: 9195 Gerrit-PatchSet: 9 Gerrit-Owner: Fredy WijayaGerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Vuk Ercegovac
[Impala-ASF-CR] IMPALA-6337: Fix infinite loop in Impala shell
Vuk Ercegovac has posted comments on this change. ( http://gerrit.cloudera.org:8080/9195 ) Change subject: IMPALA-6337: Fix infinite loop in Impala shell .. Patch Set 8: (2 comments) http://gerrit.cloudera.org:8080/#/c/9195/8/shell/impala_shell.py File shell/impala_shell.py: http://gerrit.cloudera.org:8080/#/c/9195/8/shell/impala_shell.py@580 PS8, Line 580: select repeated 'select'? http://gerrit.cloudera.org:8080/#/c/9195/8/tests/shell/test_shell_interactive.py File tests/shell/test_shell_interactive.py: http://gerrit.cloudera.org:8080/#/c/9195/8/tests/shell/test_shell_interactive.py@482 PS8, Line 482: # IMPALA-6337: Fix infinite loop. add the simpler example from the jira -- To view, visit http://gerrit.cloudera.org:8080/9195 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9142f21a888189d351f00ce09baeba123bc0959b Gerrit-Change-Number: 9195 Gerrit-PatchSet: 8 Gerrit-Owner: Fredy WijayaGerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Fri, 16 Feb 2018 19:26:58 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6116: Bound memory usage of DataStreamSevice's service queue
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/9282 ) Change subject: IMPALA-6116: Bound memory usage of DataStreamSevice's service queue .. Patch Set 4: (1 comment) http://gerrit.cloudera.org:8080/#/c/9282/4/be/src/runtime/mem-tracker.h File be/src/runtime/mem-tracker.h: http://gerrit.cloudera.org:8080/#/c/9282/4/be/src/runtime/mem-tracker.h@252 PS4, Line 252: skip_gc = false > Yes, we can try to document it. Should we change this so that we only GC if the process limit was exceeded? And then we don't need this flag, right? -- To view, visit http://gerrit.cloudera.org:8080/9282 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Idea4262dfb0e0aa8d58ff6ea6a8248e880b9 Gerrit-Change-Number: 9282 Gerrit-PatchSet: 4 Gerrit-Owner: Michael HoGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 16 Feb 2018 19:18:27 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6116: Bound memory usage of DataStreamSevice's service queue
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/9282 ) Change subject: IMPALA-6116: Bound memory usage of DataStreamSevice's service queue .. Patch Set 4: (1 comment) http://gerrit.cloudera.org:8080/#/c/9282/4/be/src/runtime/mem-tracker.h File be/src/runtime/mem-tracker.h: http://gerrit.cloudera.org:8080/#/c/9282/4/be/src/runtime/mem-tracker.h@252 PS4, Line 252: skip_gc = false > I think it's going to be kind of hard for readers of the code and new calls Yes, we can try to document it. I guess the problem to me is that this function is unconditionally GC'ing when any MemTracker's limit is exceeded. However, IIUC, GC'ing should help if you exceeded the process mem limit. So, it's unclear to me if it's okay to add a check here to see if the mem_tracker exceeding the limit is the process_mem_tracker_ and only call GcMemory in that case. -- To view, visit http://gerrit.cloudera.org:8080/9282 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Idea4262dfb0e0aa8d58ff6ea6a8248e880b9 Gerrit-Change-Number: 9282 Gerrit-PatchSet: 4 Gerrit-Owner: Michael HoGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 16 Feb 2018 19:17:14 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5690: Part 1: Rename ostream operators for thrift types
Tianyi Wang has posted comments on this change. ( http://gerrit.cloudera.org:8080/9168 ) Change subject: IMPALA-5690: Part 1: Rename ostream operators for thrift types .. Patch Set 9: > Patch Set 9: Code-Review+1 > > Fyi, its nice for reviewers if you don't submit changes at the same time as a > rebase - if you look at the diff between patch 8 and 9 here it contains a > bunch of unrelated stuff, which makes it hard to see what changes you > actually made. > > I usually submit the review with the changes I made and then immediately > rebase and submit again. Thanks for the tip. -- To view, visit http://gerrit.cloudera.org:8080/9168 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9c303997411237e988ef960157f781776f6fcb60 Gerrit-Change-Number: 9168 Gerrit-PatchSet: 9 Gerrit-Owner: Tianyi WangGerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tianyi Wang Gerrit-Comment-Date: Fri, 16 Feb 2018 19:15:55 + Gerrit-HasComments: No
[Impala-ASF-CR] [DOCS] Typos fixed in Impala Analytic Functions doc
Alex Rodoni has posted comments on this change. ( http://gerrit.cloudera.org:8080/9347 ) Change subject: [DOCS] Typos fixed in Impala Analytic Functions doc .. Patch Set 1: Now it is public. -- To view, visit http://gerrit.cloudera.org:8080/9347 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iec4a2822f5e066574e64bf025d300e4cde7a7d29 Gerrit-Change-Number: 9347 Gerrit-PatchSet: 1 Gerrit-Owner: Alex RodoniGerrit-Reviewer: Alex Rodoni Gerrit-Reviewer: John Russell Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 16 Feb 2018 19:01:54 + Gerrit-HasComments: No
[Impala-ASF-CR] [DOCS] Typos fixed in Impala Analytic Functions doc
Alex Rodoni has uploaded this change for review. ( http://gerrit.cloudera.org:8080/9347 Change subject: [DOCS] Typos fixed in Impala Analytic Functions doc .. [DOCS] Typos fixed in Impala Analytic Functions doc Change-Id: Iec4a2822f5e066574e64bf025d300e4cde7a7d29 --- M docs/shared/impala_common.xml M docs/topics/impala_analytic_functions.xml 2 files changed, 3 insertions(+), 3 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/47/9347/1 -- To view, visit http://gerrit.cloudera.org:8080/9347 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: Iec4a2822f5e066574e64bf025d300e4cde7a7d29 Gerrit-Change-Number: 9347 Gerrit-PatchSet: 1 Gerrit-Owner: Alex RodoniGerrit-Reviewer: John Russell Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-5518: Allocate KrpcDataStreamRecvr RowBatch tuples from BufferPool
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/9344 ) Change subject: IMPALA-5518: Allocate KrpcDataStreamRecvr RowBatch tuples from BufferPool .. Patch Set 1: (2 comments) http://gerrit.cloudera.org:8080/#/c/9344/1/be/src/exec/exchange-node.cc File be/src/exec/exchange-node.cc: http://gerrit.cloudera.org:8080/#/c/9344/1/be/src/exec/exchange-node.cc@159 PS1, Line 159: Status ExchangeNode::GetNext(RuntimeState* state, RowBatch* output_batch, bool* eos) { Forgot to check this. I think we should be setting the flush_resources_ bit on RowBatches that are backed by buffers so that consumers don't accumulate memory, which will be accounted against this ExchangeNode. http://gerrit.cloudera.org:8080/#/c/9344/1/be/src/runtime/row-batch.cc File be/src/runtime/row-batch.cc: http://gerrit.cloudera.org:8080/#/c/9344/1/be/src/runtime/row-batch.cc@123 PS1, Line 123: BufferPool::BufferHandle* tuple_ptrs_buffer = &(tuple_ptrs_info_->buffer); > I'd prefer if we don't break the RowBatch abstraction further. i.e. if diff The common case is that the caller of ExchangeNode::GetNext() operates on a single batch at a time, whereas we can have many buffered row batches here, so I think that is not important to optimise for.. My biggest concern with this patch is memory consumption of the queued batches. It does depend on the numbers though - do you have a sense for what the typical size of tuple_ptrs_ is? -- To view, visit http://gerrit.cloudera.org:8080/9344 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If4b1a45f68b9df0d3b539511e15aff15700246f2 Gerrit-Change-Number: 9344 Gerrit-PatchSet: 1 Gerrit-Owner: Michael HoGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 16 Feb 2018 18:58:07 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5752: Add support for DECIMAL on Kudu tables
Grant Henke has posted comments on this change. ( http://gerrit.cloudera.org:8080/9306 ) Change subject: IMPALA-5752: Add support for DECIMAL on Kudu tables .. Patch Set 3: FYI. I am adding some primary key tests and working on running/validating the tests. -- To view, visit http://gerrit.cloudera.org:8080/9306 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2f927fce25dc4fa9529a4e0b688825699f5c0ea6 Gerrit-Change-Number: 9306 Gerrit-PatchSet: 3 Gerrit-Owner: Grant HenkeGerrit-Reviewer: Grant Henke Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Comment-Date: Fri, 16 Feb 2018 18:55:33 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5690: Part 1: Rename ostream operators for thrift types
Thomas Tauber-Marshall has posted comments on this change. ( http://gerrit.cloudera.org:8080/9168 ) Change subject: IMPALA-5690: Part 1: Rename ostream operators for thrift types .. Patch Set 9: Code-Review+1 Fyi, its nice for reviewers if you don't submit changes at the same time as a rebase - if you look at the diff between patch 8 and 9 here it contains a bunch of unrelated stuff, which makes it hard to see what changes you actually made. I usually submit the review with the changes I made and then immediately rebase and submit again. -- To view, visit http://gerrit.cloudera.org:8080/9168 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9c303997411237e988ef960157f781776f6fcb60 Gerrit-Change-Number: 9168 Gerrit-PatchSet: 9 Gerrit-Owner: Tianyi WangGerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tianyi Wang Gerrit-Comment-Date: Fri, 16 Feb 2018 18:50:38 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5518: Allocate KrpcDataStreamRecvr RowBatch tuples from BufferPool
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/9344 ) Change subject: IMPALA-5518: Allocate KrpcDataStreamRecvr RowBatch tuples from BufferPool .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/9344/1/be/src/runtime/row-batch.cc File be/src/runtime/row-batch.cc: http://gerrit.cloudera.org:8080/#/c/9344/1/be/src/runtime/row-batch.cc@123 PS1, Line 123: BufferPool::BufferHandle* tuple_ptrs_buffer = &(tuple_ptrs_info_->buffer); > On the other hand, that also means holding onto the memory of tuple_ptrs_ l I'd prefer if we don't break the RowBatch abstraction further. i.e. if different instances behave differently (w.r.t. Reset() for example), then the abstraction is probably not right. We already have this abstraction breakage for AcquireState(). Later, when figuring out the reservation for receiver/xchg node, I think we should revisit this. There are probably other improvements we can make and maybe RowBatch isn't the best way to do this buffering. -- To view, visit http://gerrit.cloudera.org:8080/9344 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If4b1a45f68b9df0d3b539511e15aff15700246f2 Gerrit-Change-Number: 9344 Gerrit-PatchSet: 1 Gerrit-Owner: Michael HoGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 16 Feb 2018 18:47:08 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6116: Bound memory usage of DataStreamSevice's service queue
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/9282 ) Change subject: IMPALA-6116: Bound memory usage of DataStreamSevice's service queue .. Patch Set 4: (1 comment) http://gerrit.cloudera.org:8080/#/c/9282/4/be/src/runtime/mem-tracker.h File be/src/runtime/mem-tracker.h: http://gerrit.cloudera.org:8080/#/c/9282/4/be/src/runtime/mem-tracker.h@252 PS4, Line 252: skip_gc = false I think it's going to be kind of hard for readers of the code and new callsites to know why/when to set skip_gc=true. -- To view, visit http://gerrit.cloudera.org:8080/9282 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Idea4262dfb0e0aa8d58ff6ea6a8248e880b9 Gerrit-Change-Number: 9282 Gerrit-PatchSet: 4 Gerrit-Owner: Michael HoGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 16 Feb 2018 18:42:56 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6269: Expose KRPC metrics on debug webpage
Hello Michael Ho, Sailesh Mukil, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/9292 to look at the new patch set (#8). Change subject: IMPALA-6269: Expose KRPC metrics on debug webpage .. IMPALA-6269: Expose KRPC metrics on debug webpage This change exposes KRPC metrics on the /rpcz debug web page. This change also exposes metrics for rejected RPCs on the /metrics debug web page. See here for an example: https://git.io/vAczm This change also fixes a bug in PrettyPrinter::GetByteUnit(), which previously did not work for unsigned values due to an implicit cast. This change is based on a change by Sailesh Mukil. TODO: testing, once IMPALA-6508 has been submitted. Change-Id: I7af7c1a84a5be82c979ca4ef1edf35167493be3f --- M be/src/rpc/impala-service-pool.cc M be/src/rpc/impala-service-pool.h M be/src/rpc/rpc-mgr.cc M be/src/rpc/rpc-mgr.h M be/src/rpc/rpc-trace.cc M be/src/rpc/rpc-trace.h M be/src/runtime/exec-env.cc M be/src/runtime/exec-env.h M be/src/service/impalad-main.cc M be/src/util/histogram-metric.h M be/src/util/pretty-printer.h M common/thrift/Metrics.thrift M common/thrift/metrics.json M tests/webserver/test_web_pages.py M www/rpcz.tmpl 15 files changed, 387 insertions(+), 106 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/92/9292/8 -- To view, visit http://gerrit.cloudera.org:8080/9292 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I7af7c1a84a5be82c979ca4ef1edf35167493be3f Gerrit-Change-Number: 9292 Gerrit-PatchSet: 8 Gerrit-Owner: Lars VolkerGerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil
[Impala-ASF-CR] IMPALA-6269: Expose KRPC metrics on debug webpage
Lars Volker has posted comments on this change. ( http://gerrit.cloudera.org:8080/9292 ) Change subject: IMPALA-6269: Expose KRPC metrics on debug webpage .. Patch Set 7: Thanks for the review, I address the comments in PS8. I will rebase the change next to pull in the --test_krpc flag. -- To view, visit http://gerrit.cloudera.org:8080/9292 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7af7c1a84a5be82c979ca4ef1edf35167493be3f Gerrit-Change-Number: 9292 Gerrit-PatchSet: 7 Gerrit-Owner: Lars VolkerGerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Fri, 16 Feb 2018 18:33:39 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5152: Introduce metadata loading phase
Alex Behm has posted comments on this change. ( http://gerrit.cloudera.org:8080/8958 ) Change subject: IMPALA-5152: Introduce metadata loading phase .. Patch Set 5: Code-Review+1 Carry +1 -- To view, visit http://gerrit.cloudera.org:8080/8958 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I68d32d5acd4a6f6bc6cedb05e6cc5cf604d24a55 Gerrit-Change-Number: 8958 Gerrit-PatchSet: 5 Gerrit-Owner: Alex BehmGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Fri, 16 Feb 2018 18:00:39 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5152: Introduce metadata loading phase
Alex Behm has posted comments on this change. ( http://gerrit.cloudera.org:8080/8958 ) Change subject: IMPALA-5152: Introduce metadata loading phase .. Patch Set 4: Code-Review+1 Carry +1 -- To view, visit http://gerrit.cloudera.org:8080/8958 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I68d32d5acd4a6f6bc6cedb05e6cc5cf604d24a55 Gerrit-Change-Number: 8958 Gerrit-PatchSet: 4 Gerrit-Owner: Alex BehmGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Fri, 16 Feb 2018 18:00:30 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5152: Introduce metadata loading phase
Hello Bharath Vissapragada, Philip Zeyliger, Dimitris Tsirogiannis, Vuk Ercegovac, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8958 to look at the new patch set (#5). Change subject: IMPALA-5152: Introduce metadata loading phase .. IMPALA-5152: Introduce metadata loading phase Reworks the collection and loading of missing metadata when compiling a statement. Introduces a new metadata-loading phase between parsing and analysis. Summary of the new compilation flow: 1. Parse statement. 2. Collect all table references from the parsed statement and generate a list of tables that need to be loaded for analysis to succeed. 3. Request missing metadata and wait for it to arrive. As views become loaded we expand the set of required tables based on the view definitions. This step populates a statement-local table cache that contains all loaded tables relevant to the statement. 4. Create a new Analyzer with the table cache and analyze the statement. During analysis only the table cache is consulted for table metadata, the ImpaladCatalog is not used for that purpose anymore. 5. Authorize the statement. 6. Plan generation as usual. The intent of the existing code was to collect all tables missing metadata during analysis, load the metadata, and then re-analyze the statement (and repeat those steps until all metadata is loaded). Unfortunately, the relevant code was hard-to-follow, subtle and not well tested, and therefore it was broken in several ways over the course of time. For example, the introduction of path analysis for nested types subtly broke the intended behavior, and there are other similar examples. The serial table loading observed in the JIRA was caused by the following code in the resolution of table references: for (all path interpretations) { try { // Try to resolve the path; might call getTable() which // throws for nonexistent tables. } catch (AnalysisException e) { if (analyzer.hasMissingTbls()) throw e; } } The following example illustrates the problem: SELECT * FROM a.b, x.y When resolving the path "a.b" we consider that "a" could be a database or a table. Similarly, "b" could be a table or a nested collection. If the path resolution for "a.b" adds a missing table entry, then the path resolution for "x.y" could exit prematurely, without trying the other path interpretations that would lead to adding the expected missing table. So effectively, the tables end up being loaded one-by-one. Testing: - A core/hdfs run succeeded - No new tests were added because the existing functional tests provide good coverage of various metadata loading scenarios. - The issue reported in IMPALA-5152 is basically impossible now. Adding FE unit tests for that bug specifically would require ugly changes to the new code to enable such testing. Change-Id: I68d32d5acd4a6f6bc6cedb05e6cc5cf604d24a55 --- M fe/src/main/java/org/apache/impala/analysis/AlterTableStmt.java M fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java M fe/src/main/java/org/apache/impala/analysis/Analyzer.java M fe/src/main/java/org/apache/impala/analysis/AuthorizationStmt.java M fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java M fe/src/main/java/org/apache/impala/analysis/CreateDbStmt.java M fe/src/main/java/org/apache/impala/analysis/CreateFunctionStmtBase.java M fe/src/main/java/org/apache/impala/analysis/CreateOrAlterViewStmtBase.java M fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java M fe/src/main/java/org/apache/impala/analysis/CreateTableLikeStmt.java M fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java M fe/src/main/java/org/apache/impala/analysis/DescribeDbStmt.java M fe/src/main/java/org/apache/impala/analysis/DescribeTableStmt.java M fe/src/main/java/org/apache/impala/analysis/DropStatsStmt.java M fe/src/main/java/org/apache/impala/analysis/DropTableOrViewStmt.java M fe/src/main/java/org/apache/impala/analysis/FromClause.java M fe/src/main/java/org/apache/impala/analysis/GrantRevokePrivStmt.java M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java M fe/src/main/java/org/apache/impala/analysis/LimitElement.java M fe/src/main/java/org/apache/impala/analysis/LoadDataStmt.java M fe/src/main/java/org/apache/impala/analysis/ModifyStmt.java M fe/src/main/java/org/apache/impala/analysis/Path.java M fe/src/main/java/org/apache/impala/analysis/PrivilegeSpec.java M fe/src/main/java/org/apache/impala/analysis/QueryStmt.java M fe/src/main/java/org/apache/impala/analysis/ResetMetadataStmt.java M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java M fe/src/main/java/org/apache/impala/analysis/SetStmt.java M fe/src/main/java/org/apache/impala/analysis/ShowCreateFunctionStmt.java M fe/src/main/java/org/apache/impala/analysis/ShowCreateTableStmt.java M
[Impala-ASF-CR] IMPALA-5152: Introduce metadata loading phase
Alex Behm has posted comments on this change. ( http://gerrit.cloudera.org:8080/8958 ) Change subject: IMPALA-5152: Introduce metadata loading phase .. Patch Set 4: (5 comments) http://gerrit.cloudera.org:8080/#/c/8958/4/fe/src/main/java/org/apache/impala/analysis/Analyzer.java File fe/src/main/java/org/apache/impala/analysis/Analyzer.java: http://gerrit.cloudera.org:8080/#/c/8958/4/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@302 PS4, Line 302: Map of all query-relevant tables that is populated before analysis in : // Frontend#requestTableLoadAndWait(). : // An entry in the map is guaranteed to point to a loaded table. This could mean : // the table was loaded successfully or a load was attempted but failed. : // The absence of an entry indicates that table was not in the Catalog at the time : // t > nit: is most of the detail for this comment better stated in the StmtTableC Condensed comment. http://gerrit.cloudera.org:8080/#/c/8958/4/fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java File fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java: http://gerrit.cloudera.org:8080/#/c/8958/4/fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java@43 PS4, Line 43: v > nit: sp Done http://gerrit.cloudera.org:8080/#/c/8958/4/fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java@46 PS4, Line 46: Frontend.class > use this class. Oops. Done. http://gerrit.cloudera.org:8080/#/c/8958/4/fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java@62 PS4, Line 62: numCatalogUpdates_ > unclear from the name what numCatalogUpdates_ tracks. how about: numLoadsRe In my mind these are two completely different things, so I'd prefer not to use the same "load" terminology for them. I modified the var names to distinguish between sent/received and added comments. http://gerrit.cloudera.org:8080/#/c/8958/4/fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java@115 PS4, Line 115:* Collects and loads all tables and views required to analyze the given statement > nit: period Done -- To view, visit http://gerrit.cloudera.org:8080/8958 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I68d32d5acd4a6f6bc6cedb05e6cc5cf604d24a55 Gerrit-Change-Number: 8958 Gerrit-PatchSet: 4 Gerrit-Owner: Alex BehmGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Fri, 16 Feb 2018 18:00:13 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5152: Introduce metadata loading phase
Vuk Ercegovac has posted comments on this change. ( http://gerrit.cloudera.org:8080/8958 ) Change subject: IMPALA-5152: Introduce metadata loading phase .. Patch Set 4: Code-Review+1 (5 comments) thanks for the refactor and db load handling. mostly small comments from my end. http://gerrit.cloudera.org:8080/#/c/8958/4/fe/src/main/java/org/apache/impala/analysis/Analyzer.java File fe/src/main/java/org/apache/impala/analysis/Analyzer.java: http://gerrit.cloudera.org:8080/#/c/8958/4/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@302 PS4, Line 302: Map of all query-relevant tables that is populated before analysis in : // Frontend#requestTableLoadAndWait(). : // An entry in the map is guaranteed to point to a loaded table. This could mean : // the table was loaded successfully or a load was attempted but failed. : // The absence of an entry indicates that table was not in the Catalog at the time : // t nit: is most of the detail for this comment better stated in the StmtTableCache class? http://gerrit.cloudera.org:8080/#/c/8958/4/fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java File fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java: http://gerrit.cloudera.org:8080/#/c/8958/4/fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java@43 PS4, Line 43: v nit: sp http://gerrit.cloudera.org:8080/#/c/8958/4/fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java@46 PS4, Line 46: Frontend.class use this class. http://gerrit.cloudera.org:8080/#/c/8958/4/fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java@62 PS4, Line 62: numCatalogUpdates_ unclear from the name what numCatalogUpdates_ tracks. how about: numLoadsReceived_ (and perhaps numLoadsRequested_)? http://gerrit.cloudera.org:8080/#/c/8958/4/fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java@115 PS4, Line 115:* Collects and loads all tables and views required to analyze the given statement nit: period -- To view, visit http://gerrit.cloudera.org:8080/8958 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I68d32d5acd4a6f6bc6cedb05e6cc5cf604d24a55 Gerrit-Change-Number: 8958 Gerrit-PatchSet: 4 Gerrit-Owner: Alex BehmGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Fri, 16 Feb 2018 17:29:46 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6258: Uninitialized tuple pointers in row batch for empty rows
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/9239 ) Change subject: IMPALA-6258: Uninitialized tuple pointers in row batch for empty rows .. Patch Set 6: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1948/ -- To view, visit http://gerrit.cloudera.org:8080/9239 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2981227e62eb5971508e0698e189519755de Gerrit-Change-Number: 9239 Gerrit-PatchSet: 6 Gerrit-Owner: Zoltan Borok-NagyGerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Fri, 16 Feb 2018 16:41:10 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6258: Uninitialized tuple pointers in row batch for empty rows
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/9239 ) Change subject: IMPALA-6258: Uninitialized tuple pointers in row batch for empty rows .. Patch Set 6: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/9239 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2981227e62eb5971508e0698e189519755de Gerrit-Change-Number: 9239 Gerrit-PatchSet: 6 Gerrit-Owner: Zoltan Borok-NagyGerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Fri, 16 Feb 2018 16:40:31 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4167: Support insert plan hints for CREATE TABLE AS SELECT
Hello Lars Volker, Gabor Kaszab, Alex Behm, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8400 to look at the new patch set (#14). Change subject: IMPALA-4167: Support insert plan hints for CREATE TABLE AS SELECT .. IMPALA-4167: Support insert plan hints for CREATE TABLE AS SELECT This change adds support for "clustered", "noclustered", "shuffle" and "noshuffle" hints in CTAS statement. Example: create /*+ clustered,noshuffle */ table t partitioned by (year, month) as select * from functional.alltypes The effect of these hints are the same as in insert statements: clustered: Sort locally by partition expression before insert to ensure that only one partition is written at a time. The goal is to reduce the number of files kept open / buffers kept in memory simultaneously. noclustered: Do not sort by primary key before insert to Kudu table. No effect on HDFS tables currently, as this is their default behavior. shuffle: Forces the planner to add an exchange node that repartitions by the partition expression of the output table. This means that a partition will be written only by a single node, which minimizes the global number of simultaneous writes. If only one partition is written (because all partitioning columns are constant or the target table is not partitioned), then the shuffle hint leads to a plan where all rows are merged at the coordinator where the table sink is executed. noshuffle: Do not add exchange node before insert to partitioned tables. The parser needed some modifications to be able to differentiate between CREATE statements that allow hints (CTAS), and CREATE statements that do not (every other type of CREATE statements). As a result, KW_CREATE was moved from tbl_def_without_col_defs to statement rules. The tests mainly mirror the analysis / planner / parser tests of INSERT. Query tests are not created, as the hints have no effect on the DDL part of CTAS, and the actual query ran is the same as in the insert case. Change-Id: I8d74bca999da8ae1bb89427c70841f33e3c56ab0 --- M fe/src/main/cup/sql-parser.cup M fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java M fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java M fe/src/test/java/org/apache/impala/analysis/ParserTest.java M fe/src/test/java/org/apache/impala/analysis/ToSqlTest.java M fe/src/test/java/org/apache/impala/common/FrontendTestBase.java M testdata/workloads/functional-planner/queries/PlannerTest/ddl.test M testdata/workloads/functional-planner/queries/PlannerTest/kudu.test 9 files changed, 333 insertions(+), 78 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/00/8400/14 -- To view, visit http://gerrit.cloudera.org:8080/8400 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I8d74bca999da8ae1bb89427c70841f33e3c56ab0 Gerrit-Change-Number: 8400 Gerrit-PatchSet: 14 Gerrit-Owner: Csaba RinghoferGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Lars Volker
[Impala-ASF-CR] IMPALA-5752: Add support for DECIMAL on Kudu tables
Grant Henke has posted comments on this change. ( http://gerrit.cloudera.org:8080/9306 ) Change subject: IMPALA-5752: Add support for DECIMAL on Kudu tables .. Patch Set 3: (2 comments) http://gerrit.cloudera.org:8080/#/c/9306/2/be/src/exec/kudu-util.cc File be/src/exec/kudu-util.cc: http://gerrit.cloudera.org:8080/#/c/9306/2/be/src/exec/kudu-util.cc@214 PS2, Line 214: > Yeah, I just took what the formatter suggested. Will update. Done http://gerrit.cloudera.org:8080/#/c/9306/2/fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java File fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java: http://gerrit.cloudera.org:8080/#/c/9306/2/fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java@625 PS2, Line 625: for (TupleId tupleId: targetSlotsByTid.keySet()) { > I filed IMPALA-6533. Can you reference that here? Done -- To view, visit http://gerrit.cloudera.org:8080/9306 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2f927fce25dc4fa9529a4e0b688825699f5c0ea6 Gerrit-Change-Number: 9306 Gerrit-PatchSet: 3 Gerrit-Owner: Grant HenkeGerrit-Reviewer: Grant Henke Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Comment-Date: Fri, 16 Feb 2018 16:22:45 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5752: Add support for DECIMAL on Kudu tables
Grant Henke has posted comments on this change. ( http://gerrit.cloudera.org:8080/9306 ) Change subject: IMPALA-5752: Add support for DECIMAL on Kudu tables .. Patch Set 3: (4 comments) > For example, you could add tests for insert/update/delete with decimal values > that are outside the range for the precision/scale of the column, creating a > table with a decimal primary key, etc. Do these tests exist for the decimal type in general? For other formats like text, parquet, and Avro? I looked in the decimal.test and decimal_exprs.test workloads but didn't see anything. I suspect that sort of "validation" occurs before calling down to the specific format. If not, reusing the existing tests would be great. http://gerrit.cloudera.org:8080/#/c/9306/2//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/9306/2//COMMIT_MSG@10 PS2, Line 10: > We generally include a "Testing" section here, eg. what tests were added, w Done http://gerrit.cloudera.org:8080/#/c/9306/1/be/src/exec/kudu-util.h File be/src/exec/kudu-util.h: http://gerrit.cloudera.org:8080/#/c/9306/1/be/src/exec/kudu-util.h@a69 PS1, Line 69: > Well its useful for debugging I filed KUDU-2303 to add a ToString method to KuduSchema itself. http://gerrit.cloudera.org:8080/#/c/9306/2/be/src/exec/kudu-util.cc File be/src/exec/kudu-util.cc: http://gerrit.cloudera.org:8080/#/c/9306/2/be/src/exec/kudu-util.cc@214 PS2, Line 214: > I imagine the formatter told you to do this. We don't follow it 100%, so yo Yeah, I just took what the formatter suggested. Will update. http://gerrit.cloudera.org:8080/#/c/9306/2/fe/src/main/java/org/apache/impala/util/KuduUtil.java File fe/src/main/java/org/apache/impala/util/KuduUtil.java: http://gerrit.cloudera.org:8080/#/c/9306/2/fe/src/main/java/org/apache/impala/util/KuduUtil.java@400 PS2, Line 400: Fall through below */ > Another place you can ignore the formatter. Done -- To view, visit http://gerrit.cloudera.org:8080/9306 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2f927fce25dc4fa9529a4e0b688825699f5c0ea6 Gerrit-Change-Number: 9306 Gerrit-PatchSet: 3 Gerrit-Owner: Grant HenkeGerrit-Reviewer: Grant Henke Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Comment-Date: Fri, 16 Feb 2018 16:21:58 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5752: Add support for DECIMAL on Kudu tables
Hello Thomas Tauber-Marshall, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/9306 to look at the new patch set (#3). Change subject: IMPALA-5752: Add support for DECIMAL on Kudu tables .. IMPALA-5752: Add support for DECIMAL on Kudu tables Adds support for the Kudu DECIMAL type introduced in Kudu 1.7.0. Note: Adding support for Kudu decimal min/max filters is tracked in IMPALA-6533. Tests: * Added Kudu create with decimal test to AnalyzeDDLTest.java * Added Kudu table_format to test_decimal_queries.py ** Both decimal.test and decimal-exprs.test workloads * Added decimal queries to the following Kudu workloads: ** kudu_create.test ** kudu_delete.test ** kudu_insert.test ** kudu_update.test ** kudu_upsert.test Change-Id: I2f927fce25dc4fa9529a4e0b688825699f5c0ea6 --- M be/src/exec/kudu-scanner.cc M be/src/exec/kudu-table-sink.cc M be/src/exec/kudu-util.cc M be/src/exec/kudu-util.h M be/src/exprs/kudu-partition-expr.cc M fe/src/main/java/org/apache/impala/catalog/KuduColumn.java M fe/src/main/java/org/apache/impala/planner/KuduScanNode.java M fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java M fe/src/main/java/org/apache/impala/service/KuduCatalogOpExecutor.java M fe/src/main/java/org/apache/impala/util/KuduUtil.java M fe/src/test/java/org/apache/impala/analysis/AnalyzeDDLTest.java M testdata/datasets/functional/schema_constraints.csv M testdata/workloads/functional-query/queries/QueryTest/kudu_create.test M testdata/workloads/functional-query/queries/QueryTest/kudu_delete.test M testdata/workloads/functional-query/queries/QueryTest/kudu_describe.test M testdata/workloads/functional-query/queries/QueryTest/kudu_insert.test M testdata/workloads/functional-query/queries/QueryTest/kudu_update.test M testdata/workloads/functional-query/queries/QueryTest/kudu_upsert.test M tests/query_test/test_decimal_queries.py M tests/query_test/test_kudu.py 20 files changed, 691 insertions(+), 603 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/06/9306/3 -- To view, visit http://gerrit.cloudera.org:8080/9306 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I2f927fce25dc4fa9529a4e0b688825699f5c0ea6 Gerrit-Change-Number: 9306 Gerrit-PatchSet: 3 Gerrit-Owner: Grant HenkeGerrit-Reviewer: Grant Henke Gerrit-Reviewer: Thomas Tauber-Marshall
[Impala-ASF-CR] IMPALA-6258: Uninitialized tuple pointers in row batch for empty rows
Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/9239 ) Change subject: IMPALA-6258: Uninitialized tuple pointers in row batch for empty rows .. Patch Set 5: Seems like it was hit by IMPALA-6532 -- To view, visit http://gerrit.cloudera.org:8080/9239 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2981227e62eb5971508e0698e189519755de Gerrit-Change-Number: 9239 Gerrit-PatchSet: 5 Gerrit-Owner: Zoltan Borok-NagyGerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Fri, 16 Feb 2018 14:25:55 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5801: Clean up codegen GetType() interface
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/9063 ) Change subject: IMPALA-5801: Clean up codegen GetType() interface .. Patch Set 8: I took the liberty to make the GetIntConstant cleanup in patch set 8. -- To view, visit http://gerrit.cloudera.org:8080/9063 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib146ca21af82023b0b460f813eae3435b4b2eb77 Gerrit-Change-Number: 9063 Gerrit-PatchSet: 8 Gerrit-Owner: Csaba RinghoferGerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 16 Feb 2018 14:20:47 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5801: Clean up codegen GetType() interface
Hello Lars Volker, Tim Armstrong, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/9063 to look at the new patch set (#8). Change subject: IMPALA-5801: Clean up codegen GetType() interface .. IMPALA-5801: Clean up codegen GetType() interface Several functions that return llvm::(Pointer)Type were renamed to make them shorter or indicate their roles more clearly. Some additional convenience functions were created to make some common codegen tasks simpler: - Get(Ptr)Type functions with string parameter are renamed to GetNamed(Ptr)Type - GetStruct(Ptr)Type template functions are created to make GetNamedType(MyStruct::LLVM_CLASS_NAME) calls simpler (some classes had LLVM_CLASS_NAME as private, these are changed to public) - integer type convenience functions are renamed to indicate bit width instead of matching SQL type (e.g. int_type->i32_type) - similar convenience functions were created for ptr to primitive types, int_ptr_type - Get(Ptr)Type functions with ColumnType parameter are renamed to GetSlot(Ptr)Type - GetIntConstant function is split to several functions depending on the type of the integer e.g. GetI32Constant The renamed functions can be found in llvm-codegen.h/cc. Changes in other files are mainly renamed calls with the same functionality. Testing: No new tests are necessary, as no functionality was changed. Change-Id: Ib146ca21af82023b0b460f813eae3435b4b2eb77 --- M be/src/benchmarks/hash-benchmark.cc M be/src/codegen/codegen-anyval.cc M be/src/codegen/llvm-codegen-test.cc M be/src/codegen/llvm-codegen.cc M be/src/codegen/llvm-codegen.h M be/src/exec/exec-node.cc M be/src/exec/filter-context.cc M be/src/exec/hash-table.cc M be/src/exec/hdfs-avro-scanner.cc M be/src/exec/hdfs-avro-scanner.h M be/src/exec/hdfs-parquet-scanner.cc M be/src/exec/hdfs-parquet-scanner.h M be/src/exec/hdfs-scanner.cc M be/src/exec/partitioned-aggregation-node.cc M be/src/exec/partitioned-hash-join-builder.cc M be/src/exec/partitioned-hash-join-builder.h M be/src/exec/partitioned-hash-join-node.cc M be/src/exec/text-converter.cc M be/src/exprs/compound-predicates.cc M be/src/exprs/scalar-expr.cc M be/src/exprs/scalar-fn-call.cc M be/src/exprs/slot-ref.cc M be/src/runtime/descriptors.cc M be/src/runtime/tuple.cc M be/src/runtime/types.cc M be/src/util/tuple-row-compare.cc 26 files changed, 357 insertions(+), 392 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/63/9063/8 -- To view, visit http://gerrit.cloudera.org:8080/9063 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ib146ca21af82023b0b460f813eae3435b4b2eb77 Gerrit-Change-Number: 9063 Gerrit-PatchSet: 8 Gerrit-Owner: Csaba RinghoferGerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-6258: Uninitialized tuple pointers in row batch for empty rows
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/9239 ) Change subject: IMPALA-6258: Uninitialized tuple pointers in row batch for empty rows .. Patch Set 5: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/1947/ -- To view, visit http://gerrit.cloudera.org:8080/9239 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2981227e62eb5971508e0698e189519755de Gerrit-Change-Number: 9239 Gerrit-PatchSet: 5 Gerrit-Owner: Zoltan Borok-NagyGerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Fri, 16 Feb 2018 09:45:53 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6508: add KRPC test flag
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/9291 ) Change subject: IMPALA-6508: add KRPC test flag .. Patch Set 10: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/9291 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ie01a5de2afac4a0f43d5fceff283f6108ad6a3ab Gerrit-Change-Number: 9291 Gerrit-PatchSet: 10 Gerrit-Owner: Lars VolkerGerrit-Reviewer: David Knupp Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Fri, 16 Feb 2018 09:26:00 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6508: add KRPC test flag
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/9291 ) Change subject: IMPALA-6508: add KRPC test flag .. IMPALA-6508: add KRPC test flag This change adds a flag "--use_krpc" to start-impala-cluster.py. The flag is currently passed as an argument to the impalad daemon. In the future it will also enable KRPC for the catalogd and statestored daemons. This change also adds a flag "--test_krpc" to pytest. When running tests using "impala-py.test --test_krpc", the test cluster will be started by passing "--use_krpc" to start-impala-cluster.py (see above). This change also adds a SkipIf to skip tests based on whether the cluster was started with KRPC support or not. - SkipIf.not_krpc can be used to mark a test that depends on KRPC. - SkipIf.not_thrift can be used to mark a test that depends on Thrift RPC. This change adds a meta test to make sure that the new SkipIf decorators work correctly. The test should be removed as soon as real tests have been added with the new decorators. Change-Id: Ie01a5de2afac4a0f43d5fceff283f6108ad6a3ab Reviewed-on: http://gerrit.cloudera.org:8080/9291 Reviewed-by: David KnuppTested-by: Impala Public Jenkins --- M bin/run-all-tests.sh M bin/start-impala-cluster.py M tests/common/custom_cluster_test_suite.py M tests/common/skip.py A tests/common/test_skip.py M tests/conftest.py M tests/custom_cluster/test_krpc_mem_usage.py M tests/custom_cluster/test_rpc_exception.py 8 files changed, 86 insertions(+), 7 deletions(-) Approvals: David Knupp: Looks good to me, approved Impala Public Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/9291 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: Ie01a5de2afac4a0f43d5fceff283f6108ad6a3ab Gerrit-Change-Number: 9291 Gerrit-PatchSet: 11 Gerrit-Owner: Lars Volker Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil
[Impala-ASF-CR] IMPALA-6269: Expose KRPC metrics on debug webpage
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/9292 ) Change subject: IMPALA-6269: Expose KRPC metrics on debug webpage .. Patch Set 7: (4 comments) http://gerrit.cloudera.org:8080/#/c/9292/7/be/src/rpc/impala-service-pool.cc File be/src/rpc/impala-service-pool.cc: http://gerrit.cloudera.org:8080/#/c/9292/7/be/src/rpc/impala-service-pool.cc@55 PS7, Line 55: RPC_QUEUE_OVERFLOW_METRIC_KEY = "rpc.$0.rpcs_queue_overflow"; Looks reasonable to me. http://gerrit.cloudera.org:8080/#/c/9292/7/be/src/rpc/impala-service-pool.cc@72 PS7, Line 72: service_->methods_by_name(); nit: indent http://gerrit.cloudera.org:8080/#/c/9292/7/be/src/rpc/impala-service-pool.cc@74 PS7, Line 74: nit: space http://gerrit.cloudera.org:8080/#/c/9292/7/be/src/rpc/impala-service-pool.cc@77 PS7, Line 77: nit: indent -- To view, visit http://gerrit.cloudera.org:8080/9292 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7af7c1a84a5be82c979ca4ef1edf35167493be3f Gerrit-Change-Number: 9292 Gerrit-PatchSet: 7 Gerrit-Owner: Lars VolkerGerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Fri, 16 Feb 2018 09:17:50 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5518: Allocate KrpcDataStreamRecvr RowBatch tuples from BufferPool
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/9344 ) Change subject: IMPALA-5518: Allocate KrpcDataStreamRecvr RowBatch tuples from BufferPool .. Patch Set 4: (7 comments) http://gerrit.cloudera.org:8080/#/c/9344/3//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/9344/3//COMMIT_MSG@29 PS3, Line 29: Note that the proper reservation mechanism of the exchange node > I was thinking through the implications of allocating this memory through t Good point. As a comparison, suppose we allocate the memory from non-bufferpool sources, does it mean the chance of allocation failure will be lower when we are that close to capacity ? If so, will bumping it to 85% get us to parity ? http://gerrit.cloudera.org:8080/#/c/9344/1/be/src/exec/blocking-join-node.cc File be/src/exec/blocking-join-node.cc: http://gerrit.cloudera.org:8080/#/c/9344/1/be/src/exec/blocking-join-node.cc@163 PS1, Line 163: // Release resources in 'build_batch_' before closing the children as some of the > I wonder if this was the root cause of IMPALA-5715. Could be. http://gerrit.cloudera.org:8080/#/c/9344/1/be/src/exec/exchange-node.cc File be/src/exec/exchange-node.cc: http://gerrit.cloudera.org:8080/#/c/9344/1/be/src/exec/exchange-node.cc@82 PS1, Line 82: ExecEnv::GetInsta > Use ExecEnv::GetInstance() for consistency? Done http://gerrit.cloudera.org:8080/#/c/9344/1/be/src/exec/exchange-node.cc@138 PS1, Line 138: ExecEnv::GetInstance()->buffer_pool()->DeregisterClient(_buffer_pool_client_); > This check isn't necessary, deregistering an unregistered client is a no-op Done http://gerrit.cloudera.org:8080/#/c/9344/1/be/src/runtime/row-batch.cc File be/src/runtime/row-batch.cc: http://gerrit.cloudera.org:8080/#/c/9344/1/be/src/runtime/row-batch.cc@123 PS1, Line 123: > Why can't we allocate one buffer large enough for the tuple pointers and th On the other hand, that also means holding onto the memory of tuple_ptrs_ longer than necessary as row batches are accumulated by the caller of ExchangeNode::GetNext(). The hope is that this tuple_ptrs_ will mostly be satisfied with min_buffer_size so they will be recycled quickly across row batches. While I agree that we don't necessarily need to preserve tuple_ptrs_ across Reset() for batch generated from KrpcDataStreamRecvr, it seems to complicate the RowBatch logic even further. In other words, one cannot easily tell whether tuple_ptrs_ will remain valid after Reset() if we follow this pattern, making the code slightly harder to reason about. So, in light of that, I try to keep the logic of Reset() similar across row batches created from the 3 different ctors. http://gerrit.cloudera.org:8080/#/c/9344/1/be/src/runtime/row-batch.cc@199 PS1, Line 199: const int64_t uncompressed_size = header.uncompressed_size(); > nit: don't need the std:: prefixes in .cc generally Done http://gerrit.cloudera.org:8080/#/c/9344/1/be/src/runtime/row-batch.cc@484 PS1, Line 484: // tuple_ptrs_ were allocated with malloc so can be swapped between batches. > We don't need to support AcquireState() for these batches do we? I don't th Good point. Replaced with a DCHECK. -- To view, visit http://gerrit.cloudera.org:8080/9344 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If4b1a45f68b9df0d3b539511e15aff15700246f2 Gerrit-Change-Number: 9344 Gerrit-PatchSet: 4 Gerrit-Owner: Michael HoGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 16 Feb 2018 09:06:30 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5518: Allocate KrpcDataStreamRecvr RowBatch tuples from BufferPool
Hello Tim Armstrong, Dan Hecht, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/9344 to look at the new patch set (#4). Change subject: IMPALA-5518: Allocate KrpcDataStreamRecvr RowBatch tuples from BufferPool .. IMPALA-5518: Allocate KrpcDataStreamRecvr RowBatch tuples from BufferPool Previously, tuple pointers of a row batch are allocated from the heap via malloc() and tuple data is allocated from the MemPool associated with the RowBatch. This change converts the allocations of tuple pointers and tuple data to using BufferPool for row batches allocated from KrpcDataStreamRecvr. The primary motivation for this change is to take advantage of the fact that buffers allocated from BufferPool always go back to the per-core arena they came from when they are freed. This alleviates the TCMalloc imbalance between the RPC service threads and the fragment execution threads. As described in IMPALA-5518, row batches are always allocated from the service threads' TCMalloc cache and placed into the fragment execution threads' TCMalloc cache when they're freed. This leads to underflow and overflow in those threads' caches and high contention for the spinlock of the central free list. With BufferPool, the memory always went back to its originating arena so this kind of imbalance is less likely to occur. This also dovetails with the long term plan to put most allocations under BufferPool and have each operators in the plan reserved appropriate amount of memory before execution. Note that the proper reservation mechanism of the exchange node hasn't yet been implemented in this change so the buffer pool client handle used for allocating buffers has an ad-hoc set-up of no reservation limit and using root reservation tracker as parent. This needs to be fixed as part of IMPALA-6524. Part of this change also lowers the minimum buffer size to 32KB to reduce amount of memory wastage as a row batch's tuple pointers / tuple data can sometimes be smaller than 64KB. Testing done: Debug core build. Change-Id: If4b1a45f68b9df0d3b539511e15aff15700246f2 --- M be/src/common/global-flags.cc M be/src/exec/blocking-join-node.cc M be/src/exec/exchange-node.cc M be/src/exec/exchange-node.h M be/src/exec/partial-sort-node.cc M be/src/runtime/data-stream-mgr-base.h M be/src/runtime/data-stream-mgr.cc M be/src/runtime/data-stream-mgr.h M be/src/runtime/data-stream-test.cc M be/src/runtime/exec-env.h M be/src/runtime/krpc-data-stream-mgr.cc M be/src/runtime/krpc-data-stream-mgr.h M be/src/runtime/krpc-data-stream-recvr.cc M be/src/runtime/krpc-data-stream-recvr.h M be/src/runtime/row-batch.cc M be/src/runtime/row-batch.h M be/src/service/query-options-test.cc M testdata/workloads/functional-query/queries/QueryTest/bloom_filters.test 18 files changed, 216 insertions(+), 84 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/44/9344/4 -- To view, visit http://gerrit.cloudera.org:8080/9344 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: If4b1a45f68b9df0d3b539511e15aff15700246f2 Gerrit-Change-Number: 9344 Gerrit-PatchSet: 4 Gerrit-Owner: Michael HoGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] Bump Kudu Java version to 1.7.0
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/9349 ) Change subject: Bump Kudu Java version to 1.7.0 .. Bump Kudu Java version to 1.7.0 Change-Id: I69a9bc9e36657a936591f723cf4110a1297973ab Reviewed-on: http://gerrit.cloudera.org:8080/9349 Reviewed-by: Grant HenkeReviewed-by: Philip Zeyliger Tested-by: Impala Public Jenkins --- M bin/impala-config.sh 1 file changed, 1 insertion(+), 1 deletion(-) Approvals: Grant Henke: Looks good to me, but someone else must approve Philip Zeyliger: Looks good to me, approved Impala Public Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/9349 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I69a9bc9e36657a936591f723cf4110a1297973ab Gerrit-Change-Number: 9349 Gerrit-PatchSet: 2 Gerrit-Owner: Thomas Tauber-Marshall Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Philip Zeyliger
[Impala-ASF-CR] IMPALA-6116: Bound memory usage of DataStreamSevice's service queue
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/9282 ) Change subject: IMPALA-6116: Bound memory usage of DataStreamSevice's service queue .. Patch Set 3: (7 comments) http://gerrit.cloudera.org:8080/#/c/9282/3//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/9282/3//COMMIT_MSG@20 PS3, Line 20: 20 > 5% Done http://gerrit.cloudera.org:8080/#/c/9282/2/be/src/rpc/impala-service-pool.cc File be/src/rpc/impala-service-pool.cc: http://gerrit.cloudera.org:8080/#/c/9282/2/be/src/rpc/impala-service-pool.cc@155 PS2, Line 155: CheckLimitExceeded > Why don't we call AnyLimitExceeded()? If we're over the process limit it se I thought about that option but I am concerned about calling Gc in the middle of the Rpc path. Updated the patch to make AnyLimitExceeded() to not GC if specified by caller. http://gerrit.cloudera.org:8080/#/c/9282/3/be/src/rpc/impala-service-pool.cc File be/src/rpc/impala-service-pool.cc: http://gerrit.cloudera.org:8080/#/c/9282/3/be/src/rpc/impala-service-pool.cc@159 PS3, Line 159: c->DiscardTransfer(); > Maybe drop the lock at this point to avoid calling free() while holding the Done http://gerrit.cloudera.org:8080/#/c/9282/3/be/src/rpc/rpc-mgr-test-base.h File be/src/rpc/rpc-mgr-test-base.h: http://gerrit.cloudera.org:8080/#/c/9282/3/be/src/rpc/rpc-mgr-test-base.h@135 PS3, Line 135: TakeoverService > nit: take over is two words, so TakeOverService Done http://gerrit.cloudera.org:8080/#/c/9282/3/be/src/rpc/rpc-mgr-test-base.h@234 PS3, Line 234: dynamic_cast(ping_impl)->mem_tracker())); > dynamic_cast<> here and below shouldn't be necessary - don't need to check Done http://gerrit.cloudera.org:8080/#/c/9282/3/be/src/rpc/rpc-mgr-test.cc File be/src/rpc/rpc-mgr-test.cc: http://gerrit.cloudera.org:8080/#/c/9282/3/be/src/rpc/rpc-mgr-test.cc@186 PS3, Line 186: dynamic_cast > static_cast should also work here and below. Done http://gerrit.cloudera.org:8080/#/c/9282/3/be/src/runtime/mem-tracker.h File be/src/runtime/mem-tracker.h: http://gerrit.cloudera.org:8080/#/c/9282/3/be/src/runtime/mem-tracker.h@281 PS3, Line 281: bool CheckLimitExceeded() const { return limit_ >= 0 && limit_ < consumption(); } > Why not use the existing public LimitExceeded() or AnyLimitExceeded() APIs. I am a bit concerned by the fact that exceeding the memory limit may lead to GcMemory being called in the middle of some critical path. May be an alternative would be to make LimitExceeded() take an argument to skip Gc and simply return true. Implemented this idea in the new patch. -- To view, visit http://gerrit.cloudera.org:8080/9282 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Idea4262dfb0e0aa8d58ff6ea6a8248e880b9 Gerrit-Change-Number: 9282 Gerrit-PatchSet: 3 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 16 Feb 2018 08:46:30 + Gerrit-HasComments: Yes
[Impala-ASF-CR] Bump Kudu Java version to 1.7.0
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/9349 ) Change subject: Bump Kudu Java version to 1.7.0 .. Patch Set 1: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/9349 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I69a9bc9e36657a936591f723cf4110a1297973ab Gerrit-Change-Number: 9349 Gerrit-PatchSet: 1 Gerrit-Owner: Thomas Tauber-MarshallGerrit-Reviewer: Grant Henke Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Philip Zeyliger Gerrit-Comment-Date: Fri, 16 Feb 2018 08:46:10 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6116: Bound memory usage of DataStreamSevice's service queue
Hello Lars Volker, Tim Armstrong, Dan Hecht, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/9282 to look at the new patch set (#4). Change subject: IMPALA-6116: Bound memory usage of DataStreamSevice's service queue .. IMPALA-6116: Bound memory usage of DataStreamSevice's service queue The fix for IMPALA-6193 added a memory tracker for the memory consumed by the payloads in the service queue of DataStreamService. This change extends it by introducing a bound on the memory usage for that service queue. In addition, it deprecates FLAGS_datastream_service_queue_depth and replaces it with FLAGS_datastream_service_queue_mem_limit. These flags only take effect when KRPC is in use and KRPC was never enabled in any previous releases so it seems safe to do this flag replacement. The new flag FLAGS_datastream_service_queue_mem_limit directly dictates the amount of memory which can be consumed by the service queue of DataStreamService. This allows a more direct control over the memory usage of the queue instead of inferring via the number of entries in the queue. The default value of this flag is left at 0, in which case it will be set to 5% of process memory limit. Testing done: exhaustive debug builds. Updated data-stream-test to exercise the case in which the payload is larger than the limit. Change-Id: Idea4262dfb0e0aa8d58ff6ea6a8248e880b9 --- M be/src/rpc/impala-service-pool.cc M be/src/rpc/impala-service-pool.h M be/src/rpc/rpc-mgr-test-base.h M be/src/rpc/rpc-mgr-test.cc M be/src/rpc/rpc-mgr.cc M be/src/rpc/rpc-mgr.h M be/src/runtime/data-stream-test.cc M be/src/runtime/exec-env.cc M be/src/runtime/exec-env.h M be/src/runtime/krpc-data-stream-mgr.cc M be/src/runtime/krpc-data-stream-mgr.h M be/src/runtime/mem-tracker.h M be/src/service/data-stream-service.cc M be/src/service/data-stream-service.h M tests/custom_cluster/test_krpc_mem_usage.py 15 files changed, 292 insertions(+), 170 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/82/9282/4 -- To view, visit http://gerrit.cloudera.org:8080/9282 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Idea4262dfb0e0aa8d58ff6ea6a8248e880b9 Gerrit-Change-Number: 9282 Gerrit-PatchSet: 4 Gerrit-Owner: Michael HoGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong