[Impala-ASF-CR] IMPALA-4674: Part 2: port backend exec to BufferPool
Tim Armstrong has uploaded a new patch set (#26). Change subject: IMPALA-4674: Part 2: port backend exec to BufferPool .. IMPALA-4674: Part 2: port backend exec to BufferPool Always create global BufferPool at startup using 80% of memory and limit reservations to 80% of query memory (same as BufferedBlockMgr). The query's initial reservation is computed in the planner, claimed centrally (managed by the InitialReservations class) and distributed to query operators from there. min_spillable_buffer_size and default_spillable_buffer_size query options control the buffer size that the planner selects for spilling operators. Port ExecNodes to use BufferPool: * Each ExecNode has to claim its reservation during Open() * Port Sorter to use BufferPool. * Switch from BufferedTupleStream to BufferedTupleStreamV2 * Port HashTable to use BufferPool via a Suballocator. This also makes PAGG memory consumption more efficient (avoid wasting buffers) and improve the spilling algorithm: * Allow preaggs to execute with 0 reservation - if streams and hash tables cannot be allocated, it will pass through rows. * Halve the buffer requirement for spilling aggs - avoid allocating buffers for aggregated and unaggregated streams simultaneously. * Rebuild spilled partitions instead of repartitioning (IMPALA-2708) TODO in follow-up patches: * Rename BufferedTupleStreamV2 to BufferedTupleStream * Implement max_row_size query option. Testing: * Updated tests to reflect new memory requirements Change-Id: I7fc7fe1c04e9dfb1a0c749fb56a5e0f2bf9c6c3e --- M be/src/codegen/gen_ir_descriptions.py M be/src/exec/analytic-eval-node.cc M be/src/exec/analytic-eval-node.h M be/src/exec/exec-node.cc M be/src/exec/exec-node.h M be/src/exec/hash-table-test.cc M be/src/exec/hash-table.cc M be/src/exec/hash-table.h M be/src/exec/hash-table.inline.h M be/src/exec/nested-loop-join-builder.cc M be/src/exec/partitioned-aggregation-node-ir.cc M be/src/exec/partitioned-aggregation-node.cc M be/src/exec/partitioned-aggregation-node.h M be/src/exec/partitioned-hash-join-builder-ir.cc M be/src/exec/partitioned-hash-join-builder.cc M be/src/exec/partitioned-hash-join-builder.h M be/src/exec/partitioned-hash-join-node-ir.cc M be/src/exec/partitioned-hash-join-node.cc M be/src/exec/partitioned-hash-join-node.h M be/src/exec/partitioned-hash-join-node.inline.h M be/src/exec/sort-node.cc M be/src/exec/sort-node.h M be/src/runtime/CMakeLists.txt D be/src/runtime/buffered-block-mgr-test.cc D be/src/runtime/buffered-block-mgr.cc D be/src/runtime/buffered-block-mgr.h D be/src/runtime/buffered-tuple-stream-test.cc D be/src/runtime/buffered-tuple-stream.cc D be/src/runtime/buffered-tuple-stream.h D be/src/runtime/buffered-tuple-stream.inline.h M be/src/runtime/bufferpool/buffer-pool.cc M be/src/runtime/bufferpool/buffer-pool.h M be/src/runtime/bufferpool/reservation-tracker.h M be/src/runtime/disk-io-mgr.cc M be/src/runtime/exec-env.cc M be/src/runtime/exec-env.h M be/src/runtime/fragment-instance-state.cc A be/src/runtime/initial-reservations.cc A be/src/runtime/initial-reservations.h M be/src/runtime/query-state.cc M be/src/runtime/query-state.h M be/src/runtime/row-batch.cc M be/src/runtime/row-batch.h M be/src/runtime/runtime-filter.h M be/src/runtime/runtime-state.cc M be/src/runtime/runtime-state.h M be/src/runtime/sorter.cc M be/src/runtime/sorter.h M be/src/runtime/test-env.cc M be/src/runtime/test-env.h M be/src/runtime/tmp-file-mgr-test.cc M be/src/runtime/tmp-file-mgr.h M be/src/service/client-request-state.cc M be/src/service/query-options.cc M be/src/service/query-options.h M be/src/util/bloom-filter.h M be/src/util/static-asserts.cc M common/thrift/Frontend.thrift M common/thrift/ImpalaInternalService.thrift M common/thrift/ImpalaService.thrift M common/thrift/PlanNodes.thrift M common/thrift/generate_error_codes.py M fe/src/main/java/org/apache/impala/common/RuntimeEnv.java M fe/src/main/java/org/apache/impala/planner/AggregationNode.java M fe/src/main/java/org/apache/impala/planner/AnalyticEvalNode.java M fe/src/main/java/org/apache/impala/planner/DataSourceScanNode.java M fe/src/main/java/org/apache/impala/planner/DataStreamSink.java M fe/src/main/java/org/apache/impala/planner/EmptySetNode.java M fe/src/main/java/org/apache/impala/planner/ExchangeNode.java M fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java M fe/src/main/java/org/apache/impala/planner/HBaseTableSink.java M fe/src/main/java/org/apache/impala/planner/HashJoinNode.java M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java M fe/src/main/java/org/apache/impala/planner/HdfsTableSink.java M fe/src/main/java/org/apache/impala/planner/JoinBuildSink.java M fe/src/main/java/org/apache/impala/planner/KuduScanNode.java M fe/src/main/java/org/apache/impala/planner/KuduTableSink.java M fe/src/main/java/org/apache/impala/planner/NestedLoopJoinNode.java M fe/src/main/java/org/apache/impala/planner/Pla
[Impala-ASF-CR] IMPALA-4674: Part 2: port backend exec to BufferPool
Dan Hecht has posted comments on this change. Change subject: IMPALA-4674: Part 2: port backend exec to BufferPool .. Patch Set 23: (4 comments) http://gerrit.cloudera.org:8080/#/c/5801/23/be/src/exec/exec-node.cc File be/src/exec/exec-node.cc: Line 209: &buffer_pool_client_, resource_profile_.min_reservation); > No. I'm not sure that it's a useful check either - if the ExecNode implemen What I mean is wouldn't it be incorrect to return min_reservation if the client doesn't have it? Is it legal to call ReservationTracker::TransferReservationTo() with bytes > current reservation? what happens in that case? http://gerrit.cloudera.org:8080/#/c/5801/23/be/src/exec/hash-table.h File be/src/exec/hash-table.h: PS23, Line 625: Status* status > I tried to do this a while back with BTS::AddRow() but it was a can of worm That's really unfortunate. Probably worth revisiting at some point, but not now. http://gerrit.cloudera.org:8080/#/c/5801/25/be/src/exec/hash-table.h File be/src/exec/hash-table.h: PS25, Line 532: f th double into http://gerrit.cloudera.org:8080/#/c/5801/23/be/src/runtime/runtime-state.h File be/src/runtime/runtime-state.h: PS23, Line 403: > In most cases it's probably a lot better, since it holds onto unclaimed ini I'm not sure why it's better in the case of cancelation, since the old code would release all buffered block mgr blocks at time of fragment instance cancel. I think we should basically do that but it should be more generally just an atomic count of executing fragments, and when the last fragment instance completes, ReleaseResources() is called. The current callsite of ReleaseResources() seems wrong -- the refcount for the QS should only keep the control structures around, not the resources. And regarding runtime filters, why should those resources live beyond the last running fragment instance? Assuming we could go forward with that, maybe best to separate this into a different commit that goes in first, to isolate it. -- To view, visit http://gerrit.cloudera.org:8080/5801 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I7fc7fe1c04e9dfb1a0c749fb56a5e0f2bf9c6c3e Gerrit-PatchSet: 23 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5643: Add total number of threads created per group to /threadz
Impala Public Jenkins has posted comments on this change. Change subject: IMPALA-5643: Add total number of threads created per group to /threadz .. Patch Set 1: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/7390 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I8b532220b6940aa3d6b88a0ebaa247f6914fef53 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Lars Volker Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Impala Public Jenkins Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5643: Add total number of threads created per group to /threadz
Impala Public Jenkins has submitted this change and it was merged. Change subject: IMPALA-5643: Add total number of threads created per group to /threadz .. IMPALA-5643: Add total number of threads created per group to /threadz Change-Id: I8b532220b6940aa3d6b88a0ebaa247f6914fef53 Reviewed-on: http://gerrit.cloudera.org:8080/7390 Reviewed-by: Henry Robinson Tested-by: Impala Public Jenkins --- M be/src/util/thread.cc M www/threadz.tmpl 2 files changed, 15 insertions(+), 7 deletions(-) Approvals: Impala Public Jenkins: Verified Henry Robinson: Looks good to me, approved -- To view, visit http://gerrit.cloudera.org:8080/7390 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: merged Gerrit-Change-Id: I8b532220b6940aa3d6b88a0ebaa247f6914fef53 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Lars Volker Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Impala Public Jenkins
[Impala-ASF-CR] IMPALA-5514: Throw an error when --ldap password cmd is used without LDAP auth
Impala Public Jenkins has posted comments on this change. Change subject: IMPALA-5514: Throw an error when --ldap_password_cmd is used without LDAP auth .. Patch Set 4: Build started: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/843/ -- To view, visit http://gerrit.cloudera.org:8080/7188 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I3711d8a0eca2fa8612e2943fa9121945db6b012e Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Donghui Xu Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Donghui Xu Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5514: Throw an error when --ldap password cmd is used without LDAP auth
Bharath Vissapragada has posted comments on this change. Change subject: IMPALA-5514: Throw an error when --ldap_password_cmd is used without LDAP auth .. Patch Set 4: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/7188 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I3711d8a0eca2fa8612e2943fa9121945db6b012e Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Donghui Xu Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Donghui Xu Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4674: Part 2: port backend exec to BufferPool
Tim Armstrong has uploaded a new patch set (#25). Change subject: IMPALA-4674: Part 2: port backend exec to BufferPool .. IMPALA-4674: Part 2: port backend exec to BufferPool Always create global BufferPool at startup using 80% of memory and limit reservations to 80% of query memory (same as BufferedBlockMgr). The query's initial reservation is computed in the planner, claimed centrally (managed by the InitialReservations class) and distributed to query operators from there. min_spillable_buffer_size and default_spillable_buffer_size query options control the buffer size that the planner selects for spilling operators. Port ExecNodes to use BufferPool: * Each ExecNode has to claim its reservation during Open() * Port Sorter to use BufferPool. * Switch from BufferedTupleStream to BufferedTupleStreamV2 * Port HashTable to use BufferPool via a Suballocator. This also makes PAGG memory consumption more efficient (avoid wasting buffers) and improve the spilling algorithm: * Allow preaggs to execute with 0 reservation - if streams and hash tables cannot be allocated, it will pass through rows. * Halve the buffer requirement for spilling aggs - avoid allocating buffers for aggregated and unaggregated streams simultaneously. * Rebuild spilled partitions instead of repartitioning (IMPALA-2708) TODO in follow-up patches: * Rename BufferedTupleStreamV2 to BufferedTupleStream * Implement max_row_size query option. Testing: * Updated tests to reflect new memory requirements Change-Id: I7fc7fe1c04e9dfb1a0c749fb56a5e0f2bf9c6c3e --- M be/src/codegen/gen_ir_descriptions.py M be/src/exec/analytic-eval-node.cc M be/src/exec/analytic-eval-node.h M be/src/exec/exec-node.cc M be/src/exec/exec-node.h M be/src/exec/hash-table-test.cc M be/src/exec/hash-table.cc M be/src/exec/hash-table.h M be/src/exec/hash-table.inline.h M be/src/exec/nested-loop-join-builder.cc M be/src/exec/partitioned-aggregation-node-ir.cc M be/src/exec/partitioned-aggregation-node.cc M be/src/exec/partitioned-aggregation-node.h M be/src/exec/partitioned-hash-join-builder-ir.cc M be/src/exec/partitioned-hash-join-builder.cc M be/src/exec/partitioned-hash-join-builder.h M be/src/exec/partitioned-hash-join-node-ir.cc M be/src/exec/partitioned-hash-join-node.cc M be/src/exec/partitioned-hash-join-node.h M be/src/exec/partitioned-hash-join-node.inline.h M be/src/exec/sort-node.cc M be/src/exec/sort-node.h M be/src/runtime/CMakeLists.txt D be/src/runtime/buffered-block-mgr-test.cc D be/src/runtime/buffered-block-mgr.cc D be/src/runtime/buffered-block-mgr.h D be/src/runtime/buffered-tuple-stream-test.cc D be/src/runtime/buffered-tuple-stream.cc D be/src/runtime/buffered-tuple-stream.h D be/src/runtime/buffered-tuple-stream.inline.h M be/src/runtime/bufferpool/buffer-pool.cc M be/src/runtime/bufferpool/buffer-pool.h M be/src/runtime/bufferpool/reservation-tracker.h M be/src/runtime/disk-io-mgr.cc M be/src/runtime/exec-env.cc M be/src/runtime/exec-env.h M be/src/runtime/fragment-instance-state.cc A be/src/runtime/initial-reservations.cc A be/src/runtime/initial-reservations.h M be/src/runtime/query-state.cc M be/src/runtime/query-state.h M be/src/runtime/row-batch.cc M be/src/runtime/row-batch.h M be/src/runtime/runtime-filter.h M be/src/runtime/runtime-state.cc M be/src/runtime/runtime-state.h M be/src/runtime/sorter.cc M be/src/runtime/sorter.h M be/src/runtime/test-env.cc M be/src/runtime/test-env.h M be/src/runtime/tmp-file-mgr-test.cc M be/src/runtime/tmp-file-mgr.h M be/src/service/client-request-state.cc M be/src/service/query-options.cc M be/src/service/query-options.h M be/src/util/bloom-filter.h M be/src/util/static-asserts.cc M common/thrift/Frontend.thrift M common/thrift/ImpalaInternalService.thrift M common/thrift/ImpalaService.thrift M common/thrift/PlanNodes.thrift M common/thrift/generate_error_codes.py M fe/src/main/java/org/apache/impala/common/RuntimeEnv.java M fe/src/main/java/org/apache/impala/planner/AggregationNode.java M fe/src/main/java/org/apache/impala/planner/AnalyticEvalNode.java M fe/src/main/java/org/apache/impala/planner/DataSourceScanNode.java M fe/src/main/java/org/apache/impala/planner/DataStreamSink.java M fe/src/main/java/org/apache/impala/planner/EmptySetNode.java M fe/src/main/java/org/apache/impala/planner/ExchangeNode.java M fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java M fe/src/main/java/org/apache/impala/planner/HBaseTableSink.java M fe/src/main/java/org/apache/impala/planner/HashJoinNode.java M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java M fe/src/main/java/org/apache/impala/planner/HdfsTableSink.java M fe/src/main/java/org/apache/impala/planner/JoinBuildSink.java M fe/src/main/java/org/apache/impala/planner/KuduScanNode.java M fe/src/main/java/org/apache/impala/planner/KuduTableSink.java M fe/src/main/java/org/apache/impala/planner/NestedLoopJoinNode.java M fe/src/main/java/org/apache/impala/planner/Pla
[Impala-ASF-CR] IMPALA-4674: Part 2: port backend exec to BufferPool
Tim Armstrong has posted comments on this change. Change subject: IMPALA-4674: Part 2: port backend exec to BufferPool .. Patch Set 23: (27 comments) http://gerrit.cloudera.org:8080/#/c/5801/23/be/src/exec/analytic-eval-node.cc File be/src/exec/analytic-eval-node.cc: PS23, Line 194: !buffer_pool_client_.is_registered() > I guess that's needed for subplans? Yeah exactly. It's a little ugly but I couldn't think of a cleaner way to deal with this. http://gerrit.cloudera.org:8080/#/c/5801/23/be/src/exec/analytic-eval-node.h File be/src/exec/analytic-eval-node.h: PS23, Line 336: reservation > the stream will try to grow the reservation first, right? Done http://gerrit.cloudera.org:8080/#/c/5801/23/be/src/exec/exec-node.cc File be/src/exec/exec-node.cc: Line 209: &buffer_pool_client_, resource_profile_.min_reservation); > do we somewhere inside there dcheck that the exec-node has at least min_res No. I'm not sure that it's a useful check either - if the ExecNode implementation wants to give up reservation earlier that seems OK. Line 226: resource_profile_.spillable_buffer_size < buffer_pool->min_buffer_len()) { > when can that happen? It's possible if the coordinator has a different configuration from the backends. It seems unlikely that anyone would configure it in that way but who knows. Line 233: // to get some reservation. > I'm not sure what that comment is saying. Yeah it was garbled. I don't think it's needed. PS23, Line 240: if (resource_profile_.min_reservation > 0) { > maybe delete that to exercise this path unconditionally Done http://gerrit.cloudera.org:8080/#/c/5801/23/be/src/exec/hash-table.h File be/src/exec/hash-table.h: PS23, Line 948: > is that solved or was it not a problem? This was IMPALA-3899, which is fixed. This comment just didn't get updated. PS23, Line 532: Either a pointer to the single tuple of this row > where does the tuple live in that case? Reworded the comment to be clearer (hopefully). PS23, Line 605: success > that's kind of confusing given the OK on status means success also. Maybe m Yeah got_memory is clearer. PS23, Line 617: > blank Done PS23, Line 625: Status* status > do we need to pass that indirectly for codegen/perf still? i thought we mad I tried to do this a while back with BTS::AddRow() but it was a can of worms. The problem was mainly codegen time, IIRC the runtime difference wasn't significant. In a lot of cases the compiler couldn't infer that the destructor for the automatic Status variable was dead code so the IR ends up with lots of unnecessary branches from the Status destructors. PS23, Line 688: success > same comment Done PS23, Line 854: Status* status > same question about status perf. otherwise, seems better to pass HtData** See above comment. PS23, Line 862: success > same comment changed to got_memory PS23, Line 879: status > same comment vs DuplicateNode** See above comment. PS23, Line 895: tatus* status) > this is opposite all the "bool* success" interfaces. reason for that? Same as the other ones - this gets inlined into the IR and the automatic Status causes problems. PS23, Line 923: stores_tuples_ > this could probably use a comment (the comment above seems to apply to the Yeah Tuple* is significantly faster at least until GetTupleRow() is codegen'd. Added a comment. http://gerrit.cloudera.org:8080/#/c/5801/23/be/src/exec/partitioned-aggregation-node-ir.cc File be/src/exec/partitioned-aggregation-node-ir.cc: Line 224: if (hash_tbl == nullptr) return false; // Hash table was not created - pass through. > why does that happen now? or is it a bug fix? This was required so that preaggs can execute with zero reservation. Before we "got away" with allocating small hash tables and streams because the hash table and small buffer memory wasn't counted against the reservations. http://gerrit.cloudera.org:8080/#/c/5801/23/be/src/runtime/exec-env.cc File be/src/runtime/exec-env.cc: PS23, Line 88: page > should we express that as min_buffer_size instead? maybe easy to understan Yeah that would be more consistent. There was already a min_buffer_size flag in disk-io-mgr so I combined the two - they'll be merged anyway when we switch the io-mgr over to the buffer pool. http://gerrit.cloudera.org:8080/#/c/5801/23/be/src/runtime/initial-reservations.cc File be/src/runtime/initial-reservations.cc: Line 69: DCHECK(success); > Explain why success is expected (planner computation). Done http://gerrit.cloudera.org:8080/#/c/5801/23/be/src/runtime/initial-reservations.h File be/src/runtime/initial-reservations.h: PS23, Line 40: This creates trackers for initial reservations under those. > This seems similar to a SubReservation, no? Right, they're both wrappers around ReservationTrackers but a SubReservation is associated with a BufferPool Client, and the operations are a bit different. PS23, Line 47: Claim >
[Impala-ASF-CR] IMPALA-5498: Support for partial sorts
Mostafa Mokhtar has posted comments on this change. Change subject: IMPALA-5498: Support for partial sorts .. Patch Set 4: (1 comment) http://gerrit.cloudera.org:8080/#/c/7267/4/fe/src/main/java/org/apache/impala/planner/Planner.java File fe/src/main/java/org/apache/impala/planner/Planner.java: Line 527: if (!insertStmt.hasNoClusteredHint() && !ctx_.isSingleNodeExec()) { Why is this limited to Kudu? Inserting into Partitioned HDFS tables is a very common use case. -- To view, visit http://gerrit.cloudera.org:8080/7267 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ieec2a15a0cc5240b1c13682067ab64670d1e0a38 Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Thomas Tauber-Marshall Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zach Amsden Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5514: Throw an error when --ldap password cmd is used without LDAP auth
Bharath Vissapragada has posted comments on this change. Change subject: IMPALA-5514: Throw an error when --ldap_password_cmd is used without LDAP auth .. Patch Set 3: Code-Review+2 (1 comment) I can run the pre-commit job once you fix the longline. +2'ing since the patch is trivial. http://gerrit.cloudera.org:8080/#/c/7188/3/shell/impala_shell.py File shell/impala_shell.py: Line 1333: print_to_stderr("Option --ldap_password_cmd requires using LDAP authentication mechanism (-l)") long line -- To view, visit http://gerrit.cloudera.org:8080/7188 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I3711d8a0eca2fa8612e2943fa9121945db6b012e Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Donghui Xu Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Donghui Xu Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4862: make resource profile consistent with backend behaviour
Tim Armstrong has posted comments on this change. Change subject: IMPALA-4862: make resource profile consistent with backend behaviour .. Patch Set 19: Code-Review+2 Rebase, fix test -- To view, visit http://gerrit.cloudera.org:8080/7223 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I492cf5052bb27e4e335395e2a8f8a3b07248ec9d Gerrit-PatchSet: 19 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4862: make resource profile consistent with backend behaviour
Impala Public Jenkins has posted comments on this change. Change subject: IMPALA-4862: make resource profile consistent with backend behaviour .. Patch Set 19: Build started: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/842/ -- To view, visit http://gerrit.cloudera.org:8080/7223 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I492cf5052bb27e4e335395e2a8f8a3b07248ec9d Gerrit-PatchSet: 19 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4862: make resource profile consistent with backend behaviour
Hello Impala Public Jenkins, Dan Hecht, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/7223 to look at the new patch set (#19). Change subject: IMPALA-4862: make resource profile consistent with backend behaviour .. IMPALA-4862: make resource profile consistent with backend behaviour This moves away from the PipelinedPlanNodeSet approach of enumerating sets of concurrently-executing nodes because unions would force creating many overlapping sets of nodes. The new approach computes the peak resources during Open() and the peak resources between Open() and Close() (i.e. while calling GetNext()) bottom-up for each plan node in a fragment. The fragment resources are then combined to produce the query resources. The basic assumptions for the new resource estimates are: * resources are acquired during or after the first call to Open() and released in Close(). * Blocking nodes call Open() on their child before acquiring their own resources (this required some backend changes). * Blocking nodes call Close() on their children before returning from Open(). * The peak resource consumption of the query is the sum of the independent fragments (except for the parallel join build plans where we can assume there will be synchronisation). This is conservative but we don't synchronise fragment Open() and Close() across exchanges so can't make stronger assumptions in general. Also compute the sum of minimum reservations. This will be useful in the backend to determine exactly when all of the initial reservations have been claimed from a shared pool of initial reservations. Testing: * Updated planner tests to reflect behavioural changes. * Added extra resource requirement planner tests for unions, subplans, pipelines of blocking operators, and bushy join plans. * Added single-node plans to resource-requirements tests. These have more complex plan trees inside a single fragment, which is useful for testing the peak resource requirement logic. Change-Id: I492cf5052bb27e4e335395e2a8f8a3b07248ec9d --- M be/src/exec/blocking-join-node.cc M be/src/exec/blocking-join-node.h M be/src/exec/exec-node.h M be/src/exec/hash-join-node.cc M be/src/exec/partitioned-aggregation-node.cc M be/src/exec/partitioned-hash-join-node.cc M be/src/exec/partitioned-hash-join-node.h M be/src/exec/sort-node.cc M be/src/util/backend-gflag-util.cc M common/thrift/BackendGflags.thrift M common/thrift/Frontend.thrift M fe/src/main/java/org/apache/impala/analysis/Analyzer.java M fe/src/main/java/org/apache/impala/catalog/HBaseTable.java M fe/src/main/java/org/apache/impala/common/TreeNode.java M fe/src/main/java/org/apache/impala/planner/AggregationNode.java M fe/src/main/java/org/apache/impala/planner/AnalyticEvalNode.java M fe/src/main/java/org/apache/impala/planner/DataSink.java M fe/src/main/java/org/apache/impala/planner/DataSourceScanNode.java M fe/src/main/java/org/apache/impala/planner/EmptySetNode.java M fe/src/main/java/org/apache/impala/planner/ExchangeNode.java M fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java M fe/src/main/java/org/apache/impala/planner/HashJoinNode.java M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java M fe/src/main/java/org/apache/impala/planner/JoinNode.java M fe/src/main/java/org/apache/impala/planner/KuduScanNode.java M fe/src/main/java/org/apache/impala/planner/NestedLoopJoinNode.java M fe/src/main/java/org/apache/impala/planner/ParallelPlanner.java D fe/src/main/java/org/apache/impala/planner/PipelinedPlanNodeSet.java M fe/src/main/java/org/apache/impala/planner/PlanFragment.java M fe/src/main/java/org/apache/impala/planner/PlanNode.java M fe/src/main/java/org/apache/impala/planner/Planner.java M fe/src/main/java/org/apache/impala/planner/ResourceProfile.java M fe/src/main/java/org/apache/impala/planner/SelectNode.java M fe/src/main/java/org/apache/impala/planner/SingularRowSrcNode.java M fe/src/main/java/org/apache/impala/planner/SortNode.java M fe/src/main/java/org/apache/impala/planner/SubplanNode.java M fe/src/main/java/org/apache/impala/planner/UnionNode.java M fe/src/main/java/org/apache/impala/planner/UnnestNode.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java M fe/src/main/java/org/apache/impala/service/Frontend.java M fe/src/main/java/org/apache/impala/util/BitUtil.java A fe/src/main/java/org/apache/impala/util/MathUtil.java A fe/src/test/java/org/apache/impala/util/BitUtilTest.java A fe/src/test/java/org/apache/impala/util/MathUtilTest.java M testdata/workloads/functional-planner/queries/PlannerTest/constant-folding.test M testdata/workloads/functional-planner/queries/PlannerTest/disable-codegen.test M testdata/workloads/functional-planner/queries/PlannerTest/fk-pk-join-detection.test M testdata/workloads/functional-planner/queries/PlannerTest/kudu-selectivity.test M testdata/workloads/functional-planner/queries/PlannerTest/mt-d
[Impala-ASF-CR] IMPALA-4862: make resource profile consistent with backend behaviour
Impala Public Jenkins has posted comments on this change. Change subject: IMPALA-4862: make resource profile consistent with backend behaviour .. Patch Set 18: Verified-1 Build failed: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/839/ -- To view, visit http://gerrit.cloudera.org:8080/7223 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I492cf5052bb27e4e335395e2a8f8a3b07248ec9d Gerrit-PatchSet: 18 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5629: avoid expensive list::size() call
Impala Public Jenkins has posted comments on this change. Change subject: IMPALA-5629: avoid expensive list::size() call .. Patch Set 2: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/7382 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Id83fcf68dcc3ea729df167885f999ff32b861e66 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5629: avoid expensive list::size() call
Impala Public Jenkins has submitted this change and it was merged. Change subject: IMPALA-5629: avoid expensive list::size() call .. IMPALA-5629: avoid expensive list::size() call As a workaround until we move to GCC5+, explicitly track the pages_ list size. This is not too bad in practice since it is only mutated in 3 places. Testing: Ran buffered-tuple-stream-v2-test (the only coverage of BufferedTupleStreamV2 currently). Reran the query with the perf issue, confirmed that it was no longer spending lots of time in BufferedTupleStreamV2::AdvanceWritePage(). Change-Id: Id83fcf68dcc3ea729df167885f999ff32b861e66 Reviewed-on: http://gerrit.cloudera.org:8080/7382 Reviewed-by: Dan Hecht Tested-by: Impala Public Jenkins --- M be/src/runtime/buffered-tuple-stream-v2.cc M be/src/runtime/buffered-tuple-stream-v2.h 2 files changed, 21 insertions(+), 13 deletions(-) Approvals: Impala Public Jenkins: Verified Dan Hecht: Looks good to me, approved -- To view, visit http://gerrit.cloudera.org:8080/7382 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: merged Gerrit-Change-Id: Id83fcf68dcc3ea729df167885f999ff32b861e66 Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-5164: Better benchmark heuristic
Zach Amsden has posted comments on this change. Change subject: IMPALA-5164: Better benchmark heuristic .. Patch Set 1: -Code-Review -- To view, visit http://gerrit.cloudera.org:8080/7389 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ie6fb1283716dbc591c10238d85b9ab436d02fb96 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Zach Amsden Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Zach Amsden Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5582: Null-aware anti-join can take a long time to cancel
anujphadke has uploaded a new change for review. http://gerrit.cloudera.org:8080/7393 Change subject: IMPALA-5582: Null-aware anti-join can take a long time to cancel .. IMPALA-5582: Null-aware anti-join can take a long time to cancel Queries with a null-aware anti-join joining on a large number of NULLs can take a long time to cancel if threads are stuck in PartitionedHashJoinNode::EvaluateNullProbe(). This change adds the RETURN_IF_CANCELLED macro to the function. Testing: Added logs to PartitionedHashJoinNode::EvaluateNullProbe and made sure that the function returns right away on cancellation. Change-Id: I0800754d4ad31cbadbdfadc630c640963f3f6053 --- M be/src/exec/partitioned-hash-join-node.cc M be/src/exec/partitioned-hash-join-node.h 2 files changed, 12 insertions(+), 7 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/93/7393/1 -- To view, visit http://gerrit.cloudera.org:8080/7393 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: I0800754d4ad31cbadbdfadc630c640963f3f6053 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: anujphadke
[Impala-ASF-CR] IMPALA-5623: Fix lag() on STRING cols to release UDF mem
Impala Public Jenkins has posted comments on this change. Change subject: IMPALA-5623: Fix lag() on STRING cols to release UDF mem .. Patch Set 2: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/7371 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Id2b69b4ccb9cac076abca19bed6f0b1dd11dfff3 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Matthew Jacobs Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Zach Amsden Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5623: Fix lag() on STRING cols to release UDF mem
Impala Public Jenkins has submitted this change and it was merged. Change subject: IMPALA-5623: Fix lag() on STRING cols to release UDF mem .. IMPALA-5623: Fix lag() on STRING cols to release UDF mem IMPALA-4120 fixed an issue where lead/lag was potentially operating on memory that the UDA didn't own, resulting in potentially wrong results. As part of that fix, lead and lag started allocating 'global' UDF memory (e.g. via Allocate() rather than AllocateLocal()) in Init() which needs to be freed in Serialize() or Finalize(), but only lead() was updated to free the memory. This memory is eventually freed when the fragment is torn down, but as a result of not freeing the memory in Serialize or Finalize, the memory may be allocated longer than necessary. Change-Id: Id2b69b4ccb9cac076abca19bed6f0b1dd11dfff3 Reviewed-on: http://gerrit.cloudera.org:8080/7371 Reviewed-by: Matthew Jacobs Reviewed-by: Dan Hecht Tested-by: Impala Public Jenkins --- M fe/src/main/java/org/apache/impala/catalog/BuiltinsDb.java M testdata/workloads/functional-query/queries/QueryTest/analytic-fns.test 2 files changed, 19 insertions(+), 1 deletion(-) Approvals: Impala Public Jenkins: Verified Matthew Jacobs: Looks good to me, but someone else must approve Dan Hecht: Looks good to me, approved -- To view, visit http://gerrit.cloudera.org:8080/7371 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: merged Gerrit-Change-Id: Id2b69b4ccb9cac076abca19bed6f0b1dd11dfff3 Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Matthew Jacobs Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Zach Amsden
[Impala-ASF-CR] IMPALA-5514: Throw an error when --ldap password cmd is used without LDAP auth
Hello Bharath Vissapragada, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/7188 to look at the new patch set (#4). Change subject: IMPALA-5514: Throw an error when --ldap_password_cmd is used without LDAP auth .. IMPALA-5514: Throw an error when --ldap_password_cmd is used without LDAP auth When only with ldap_password_cmd option, impala-shell runs successfully. I solved this problem by throwing an error when --ldap_password_cmd is used without LDAP auth, that is, ldap_password_cmd option will only take effect if ldap option presents. Change-Id: I3711d8a0eca2fa8612e2943fa9121945db6b012e --- M shell/impala_shell.py 1 file changed, 5 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/88/7188/4 -- To view, visit http://gerrit.cloudera.org:8080/7188 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I3711d8a0eca2fa8612e2943fa9121945db6b012e Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Donghui Xu Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Donghui Xu Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-5643: Add total number of threads created per group to /threadz
Impala Public Jenkins has posted comments on this change. Change subject: IMPALA-5643: Add total number of threads created per group to /threadz .. Patch Set 1: Build started: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/841/ -- To view, visit http://gerrit.cloudera.org:8080/7390 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I8b532220b6940aa3d6b88a0ebaa247f6914fef53 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Lars Volker Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Impala Public Jenkins Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5507: Add clear description to help information of KEYVAL option
Hello Impala Public Jenkins, Jim Apple, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/7179 to look at the new patch set (#11). Change subject: IMPALA-5507: Add clear description to help information of KEYVAL option .. IMPALA-5507: Add clear description to help information of KEYVAL option Help information of KEYVAL option in impala-shell is not clear enough. I fix this issue by adding clear description to help information of KEYVAL option. Change-Id: I68cfc16838c6c0e7813f03dd4296f9eb54ec4c63 --- M shell/option_parser.py 1 file changed, 6 insertions(+), 2 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/79/7179/11 -- To view, visit http://gerrit.cloudera.org:8080/7179 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I68cfc16838c6c0e7813f03dd4296f9eb54ec4c63 Gerrit-PatchSet: 11 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Donghui Xu Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Donghui Xu Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-5643: Add total number of threads created per group to /threadz
Henry Robinson has posted comments on this change. Change subject: IMPALA-5643: Add total number of threads created per group to /threadz .. Patch Set 1: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/7390 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I8b532220b6940aa3d6b88a0ebaa247f6914fef53 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Lars Volker Gerrit-Reviewer: Henry Robinson Gerrit-HasComments: No
[native-toolchain-CR] Prevent leaking AWS credentials to the log.
Matthew Jacobs has posted comments on this change. Change subject: Prevent leaking AWS credentials to the log. .. Patch Set 2: Code-Review+1 I think this makes sense (bash flags are weird). It'd be good to get someone more familiar with bash to check as well. -- To view, visit http://gerrit.cloudera.org:8080/7391 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I43ae8d6f347d0ce3d223bd6e86759bb33f2520e6 Gerrit-PatchSet: 2 Gerrit-Project: native-toolchain Gerrit-Branch: master Gerrit-Owner: Lars Volker Gerrit-Reviewer: Matthew Jacobs Gerrit-HasComments: No
[native-toolchain-CR] Prevent leaking AWS credentials to the log.
Lars Volker has posted comments on this change. Change subject: Prevent leaking AWS credentials to the log. .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/7391/2/functions.sh File functions.sh: Line 380: if set +o | grep -q "set -o xtrace"; then Should we parse "set -o" here instead? -- To view, visit http://gerrit.cloudera.org:8080/7391 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I43ae8d6f347d0ce3d223bd6e86759bb33f2520e6 Gerrit-PatchSet: 2 Gerrit-Project: native-toolchain Gerrit-Branch: master Gerrit-Owner: Lars Volker Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Matthew Jacobs Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5164: Bloom filter benchmark fails to complete
Impala Public Jenkins has submitted this change and it was merged. Change subject: IMPALA-5164: Bloom filter benchmark fails to complete .. IMPALA-5164: Bloom filter benchmark fails to complete Due to large allocations during benchmark, we spend a lot of time under memory pressure, which triggers the context switch detection. Not using micro-benchmark heuristics for these benchmarks fixes the problem. Change-Id: Ifad385a633c45af7d1dbcc87e56661c413bde7dc Reviewed-on: http://gerrit.cloudera.org:8080/7381 Reviewed-by: Jim Apple Tested-by: Impala Public Jenkins --- M be/src/benchmarks/bloom-filter-benchmark.cc 1 file changed, 2 insertions(+), 2 deletions(-) Approvals: Impala Public Jenkins: Verified Jim Apple: Looks good to me, approved -- To view, visit http://gerrit.cloudera.org:8080/7381 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: merged Gerrit-Change-Id: Ifad385a633c45af7d1dbcc87e56661c413bde7dc Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Zach Amsden Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Zach Amsden
[Impala-ASF-CR] IMPALA-5164: Bloom filter benchmark fails to complete
Impala Public Jenkins has posted comments on this change. Change subject: IMPALA-5164: Bloom filter benchmark fails to complete .. Patch Set 3: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/7381 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ifad385a633c45af7d1dbcc87e56661c413bde7dc Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Zach Amsden Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Zach Amsden Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3603 [DOCS] Document handling of NaN values
Impala Public Jenkins has submitted this change and it was merged. Change subject: IMPALA-3603 [DOCS] Document handling of NaN values .. IMPALA-3603 [DOCS] Document handling of NaN values Added information in the "DOUBLE Data Type" (impala_double.html) and the "FLOAT Data Type" (impala_float.html) topics about how Impala handles NaN values. Change-Id: Id9485b6790d58fafdae32332d2634cbe893d7fb0 Reviewed-on: http://gerrit.cloudera.org:8080/7098 Reviewed-by: Michael Brown Tested-by: Impala Public Jenkins --- M docs/shared/impala_common.xml M docs/topics/impala_double.xml M docs/topics/impala_float.xml 3 files changed, 22 insertions(+), 0 deletions(-) Approvals: Impala Public Jenkins: Verified Michael Brown: Looks good to me, approved -- To view, visit http://gerrit.cloudera.org:8080/7098 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: merged Gerrit-Change-Id: Id9485b6790d58fafdae32332d2634cbe893d7fb0 Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Laurel Hale Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: John Russell Gerrit-Reviewer: Laurel Hale Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Will Berkeley
[Impala-ASF-CR] IMPALA-3603 [DOCS] Document handling of NaN values
Impala Public Jenkins has posted comments on this change. Change subject: IMPALA-3603 [DOCS] Document handling of NaN values .. Patch Set 3: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/7098 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Id9485b6790d58fafdae32332d2634cbe893d7fb0 Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Laurel Hale Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: John Russell Gerrit-Reviewer: Laurel Hale Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Will Berkeley Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4107 [DOCS] APPX MEDIAN cuts string to 10 chars
Impala Public Jenkins has submitted this change and it was merged. Change subject: IMPALA-4107 [DOCS] APPX_MEDIAN cuts string to 10 chars .. IMPALA-4107 [DOCS] APPX_MEDIAN cuts string to 10 chars In the "Restrictions" section of the "APPX_MEDIAN Function" topic, added information about how the function truncates string values. Change-Id: I141787452e74ecdb491765cd7fd4c9a771c5bbc2 Reviewed-on: http://gerrit.cloudera.org:8080/7094 Reviewed-by: Michael Brown Tested-by: Impala Public Jenkins --- M docs/topics/impala_appx_median.xml 1 file changed, 6 insertions(+), 1 deletion(-) Approvals: Impala Public Jenkins: Verified Michael Brown: Looks good to me, approved -- To view, visit http://gerrit.cloudera.org:8080/7094 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: merged Gerrit-Change-Id: I141787452e74ecdb491765cd7fd4c9a771c5bbc2 Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Laurel Hale Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: John Russell Gerrit-Reviewer: Laurel Hale Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Peter Brejcak
[native-toolchain-CR] Prevent leaking AWS credentials to the log.
Lars Volker has submitted this change and it was merged. Change subject: Prevent leaking AWS credentials to the log. .. Prevent leaking AWS credentials to the log. Change-Id: I43ae8d6f347d0ce3d223bd6e86759bb33f2520e6 --- M functions.sh 1 file changed, 11 insertions(+), 0 deletions(-) Approvals: David Knupp: Looks good to me, approved Lars Volker: Verified Matthew Jacobs: Looks good to me, but someone else must approve Tim Wood: Looks good to me, but someone else must approve -- To view, visit http://gerrit.cloudera.org:8080/7391 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: merged Gerrit-Change-Id: I43ae8d6f347d0ce3d223bd6e86759bb33f2520e6 Gerrit-PatchSet: 2 Gerrit-Project: native-toolchain Gerrit-Branch: master Gerrit-Owner: Lars Volker Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Tim Wood
[Impala-ASF-CR] IMPALA-3603 [DOCS] Document handling of NaN values
Impala Public Jenkins has posted comments on this change. Change subject: IMPALA-3603 [DOCS] Document handling of NaN values .. Patch Set 3: Build started: http://jenkins.impala.io:8080/job/gerrit-docs-submit/143/ -- To view, visit http://gerrit.cloudera.org:8080/7098 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Id9485b6790d58fafdae32332d2634cbe893d7fb0 Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Laurel Hale Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: John Russell Gerrit-Reviewer: Laurel Hale Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Will Berkeley Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4107 [DOCS] APPX MEDIAN cuts string to 10 chars
Impala Public Jenkins has posted comments on this change. Change subject: IMPALA-4107 [DOCS] APPX_MEDIAN cuts string to 10 chars .. Patch Set 4: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/7094 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I141787452e74ecdb491765cd7fd4c9a771c5bbc2 Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Laurel Hale Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: John Russell Gerrit-Reviewer: Laurel Hale Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Peter Brejcak Gerrit-HasComments: No
[native-toolchain-CR] Prevent leaking AWS credentials to the log.
Lars Volker has posted comments on this change. Change subject: Prevent leaking AWS credentials to the log. .. Patch Set 2: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/7391 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I43ae8d6f347d0ce3d223bd6e86759bb33f2520e6 Gerrit-PatchSet: 2 Gerrit-Project: native-toolchain Gerrit-Branch: master Gerrit-Owner: Lars Volker Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Tim Wood Gerrit-HasComments: No
[native-toolchain-CR] Prevent leaking AWS credentials to the log.
Lars Volker has posted comments on this change. Change subject: Prevent leaking AWS credentials to the log. .. Patch Set 2: Thank you all for the quick reviews! -- To view, visit http://gerrit.cloudera.org:8080/7391 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I43ae8d6f347d0ce3d223bd6e86759bb33f2520e6 Gerrit-PatchSet: 2 Gerrit-Project: native-toolchain Gerrit-Branch: master Gerrit-Owner: Lars Volker Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Tim Wood Gerrit-HasComments: No
[native-toolchain-CR] Prevent leaking AWS credentials to the log.
David Knupp has posted comments on this change. Change subject: Prevent leaking AWS credentials to the log. .. Patch Set 2: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/7391 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I43ae8d6f347d0ce3d223bd6e86759bb33f2520e6 Gerrit-PatchSet: 2 Gerrit-Project: native-toolchain Gerrit-Branch: master Gerrit-Owner: Lars Volker Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Tim Wood Gerrit-HasComments: No
[native-toolchain-CR] Prevent leaking AWS credentials to the log.
Tim Wood has posted comments on this change. Change subject: Prevent leaking AWS credentials to the log. .. Patch Set 2: Code-Review+1 -- To view, visit http://gerrit.cloudera.org:8080/7391 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I43ae8d6f347d0ce3d223bd6e86759bb33f2520e6 Gerrit-PatchSet: 2 Gerrit-Project: native-toolchain Gerrit-Branch: master Gerrit-Owner: Lars Volker Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Tim Wood Gerrit-HasComments: No
[native-toolchain-CR] Prevent leaking AWS credentials to the log.
David Knupp has posted comments on this change. Change subject: Prevent leaking AWS credentials to the log. .. Patch Set 2: Code-Review+1 Lars asked me to look at this as well, and I feel the same way. Tim Wood is better at bash than me. I'll confer with him, then +2 if things check out. -- To view, visit http://gerrit.cloudera.org:8080/7391 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I43ae8d6f347d0ce3d223bd6e86759bb33f2520e6 Gerrit-PatchSet: 2 Gerrit-Project: native-toolchain Gerrit-Branch: master Gerrit-Owner: Lars Volker Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Matthew Jacobs Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4863/IMPALA-5311: Correctly account the file type and compression codec
anujphadke has posted comments on this change. Change subject: IMPALA-4863/IMPALA-5311: Correctly account the file type and compression codec .. Patch Set 3: (5 comments) http://gerrit.cloudera.org:8080/#/c/7245/2//COMMIT_MSG Commit Message: Line 7: IMPALA-4863/IMPALA-5311: Correctly account the file type and compression codec > It would be easier to review if we included the IMPALA-5311 fix in this, si Done http://gerrit.cloudera.org:8080/#/c/7245/2/be/src/exec/hdfs-scan-node-base.cc File be/src/exec/hdfs-scan-node-base.cc: PS2, Line 553: > It seems like this will remove useful information for file formats where we Done Line 771: > missing space - maybe run it it through clang-format before posting? Done http://gerrit.cloudera.org:8080/#/c/7245/2/be/src/exec/hdfs-scan-node-base.h File be/src/exec/hdfs-scan-node-base.h: PS2, Line 250: t > should have spaces around " = " Done PS2, Line 462: > don't need space between >> in C++11 Not needed anymore. Moved std::pair to a tuple -- To view, visit http://gerrit.cloudera.org:8080/7245 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I797916505f62e568f4159e07099481b8ff571da2 Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: anujphadke Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: anujphadke Gerrit-HasComments: Yes
[Impala-ASF-CR] [DOCS] Advise setting vm.overcommit memory=1 in various places
Impala Public Jenkins has posted comments on this change. Change subject: [DOCS] Advise setting vm.overcommit_memory=1 in various places .. Patch Set 2: Build started: http://jenkins.impala.io:8080/job/gerrit-docs-submit/141/ -- To view, visit http://gerrit.cloudera.org:8080/7349 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Icbfa755e2c9769a8458fd93362769856cf32e301 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: John Russell Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: John Russell Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Sailesh Mukil Gerrit-HasComments: No
[Impala-ASF-CR] single node perf run.py: clean up newly-added testdata
Impala Public Jenkins has submitted this change and it was merged. Change subject: single_node_perf_run.py: clean up newly-added testdata .. single_node_perf_run.py: clean up newly-added testdata In single_node_perf_run.py, restore_workloads() can make the tree "dirty", and when a tree is dirty, git won't let you switch branches in a way that clobbers the dirty file contents: $ cd $(mktemp -d) $ git init . Initialized empty Git repository in /tmp/tmp.H0NxzTXLUj/.git/ $ touch foo && git add foo && git commit -a -m "foo" [master (root-commit) 3776149] foo 1 file changed, 0 insertions(+), 0 deletions(-) create mode 100644 foo $ git checkout -b ok_foo && echo "ok" >> foo && git commit -a -m "foo is ok" Switched to a new branch 'ok_foo' [ok_foo 9fd5bde] foo is ok 1 file changed, 1 insertion(+) $ git checkout master && echo "not ok" >> foo Switched to branch 'master' $ git checkout ok_foo error: Your local changes to the following files would be overwritten by checkout: foo Please, commit your changes or stash them before you can switch branches. Aborting Discovered when testing single_node_perf_run with https://gerrit.cloudera.org/#/c/7153/; after this commit, that patch works with single_node_perf_run.py Change-Id: Id0220f3cd7a26d2627e40cd432c23815a6d65ea4 Reviewed-on: http://gerrit.cloudera.org:8080/7291 Reviewed-by: Jim Apple Tested-by: Impala Public Jenkins --- M bin/single_node_perf_run.py 1 file changed, 5 insertions(+), 0 deletions(-) Approvals: Impala Public Jenkins: Verified Jim Apple: Looks good to me, approved -- To view, visit http://gerrit.cloudera.org:8080/7291 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: merged Gerrit-Change-Id: Id0220f3cd7a26d2627e40cd432c23815a6d65ea4 Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Jim Apple Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Lars Volker
[Impala-ASF-CR] IMPALA-4107 [DOCS] APPX MEDIAN cuts string to 10 chars
Impala Public Jenkins has posted comments on this change. Change subject: IMPALA-4107 [DOCS] APPX_MEDIAN cuts string to 10 chars .. Patch Set 4: Build started: http://jenkins.impala.io:8080/job/gerrit-docs-submit/142/ -- To view, visit http://gerrit.cloudera.org:8080/7094 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I141787452e74ecdb491765cd7fd4c9a771c5bbc2 Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Laurel Hale Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: John Russell Gerrit-Reviewer: Laurel Hale Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Peter Brejcak Gerrit-HasComments: No
[Impala-ASF-CR] single node perf run.py: clean up newly-added testdata
Impala Public Jenkins has posted comments on this change. Change subject: single_node_perf_run.py: clean up newly-added testdata .. Patch Set 2: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/7291 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Id0220f3cd7a26d2627e40cd432c23815a6d65ea4 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Jim Apple Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Lars Volker Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5240: Allow config of number of disk I/O threads per disk type
Hello Matthew Jacobs, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/7232 to look at the new patch set (#10). Change subject: IMPALA-5240: Allow config of number of disk I/O threads per disk type .. IMPALA-5240: Allow config of number of disk I/O threads per disk type Currently Impala defaults to 8 threads per flash disk and 1 thread per rotational disk. This can be overridden with --num_threads_per_disk, but that sets the thread count independent of disk type. This would allow control of the number of disk I/O threads spawned independently for solid state disks using "--num_io_threads_per_solid_state_disk" and for rotational disks using "--num_io_threads_per_rotational_disk" as startup parameters. Testing: Added backend tests that verify if "num_threads_per_disk", "num_io_threads_per_solid_state_disk" and "num_io_threads_per_rotational_disk" are used to spawn threads appropriately. Existing tests have been updated to use the new DiskIoMgr test constructor, but retain the same behavior as before. Additionally made some changes to fix a bug, where impala would crash if num_disks are set more than the number of logical disks on system and num_threads_per_disk is not set. This bug also lets the user create more disk queues than the num of logical disks which would eventually never be used by the actual code. Moreover, after this fix, if num_disks is set to more than the number of logical disks then it will default to max available disks and log a warning. Change-Id: I094aff3747104104fe0465d73dcdbef5d9892f7c --- M be/src/runtime/disk-io-mgr-stress.cc M be/src/runtime/disk-io-mgr-test.cc M be/src/runtime/disk-io-mgr.cc M be/src/runtime/disk-io-mgr.h M be/src/util/disk-info.h M be/src/util/thread.cc M be/src/util/thread.h M fe/src/test/java/org/apache/impala/testutil/SentryServicePinger.java 8 files changed, 124 insertions(+), 50 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/32/7232/10 -- To view, visit http://gerrit.cloudera.org:8080/7232 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I094aff3747104104fe0465d73dcdbef5d9892f7c Gerrit-PatchSet: 10 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Bikramjeet Vig Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-4674: Part 2: port backend exec to BufferPool
Tim Armstrong has posted comments on this change. Change subject: IMPALA-4674: Part 2: port backend exec to BufferPool .. Patch Set 24: Rebased and fixed a bunch of tests. -- To view, visit http://gerrit.cloudera.org:8080/5801 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I7fc7fe1c04e9dfb1a0c749fb56a5e0f2bf9c6c3e Gerrit-PatchSet: 24 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[native-toolchain-CR] Prevent leaking AWS credentials to the log.
Lars Volker has uploaded a new change for review. http://gerrit.cloudera.org:8080/7391 Change subject: Prevent leaking AWS credentials to the log. .. Prevent leaking AWS credentials to the log. Change-Id: I43ae8d6f347d0ce3d223bd6e86759bb33f2520e6 --- M functions.sh 1 file changed, 11 insertions(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/native-toolchain refs/changes/91/7391/1 -- To view, visit http://gerrit.cloudera.org:8080/7391 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: I43ae8d6f347d0ce3d223bd6e86759bb33f2520e6 Gerrit-PatchSet: 1 Gerrit-Project: native-toolchain Gerrit-Branch: master Gerrit-Owner: Lars Volker
[native-toolchain-CR] Prevent leaking AWS credentials to the log.
Lars Volker has uploaded a new patch set (#2). Change subject: Prevent leaking AWS credentials to the log. .. Prevent leaking AWS credentials to the log. Change-Id: I43ae8d6f347d0ce3d223bd6e86759bb33f2520e6 --- M functions.sh 1 file changed, 11 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/native-toolchain refs/changes/91/7391/2 -- To view, visit http://gerrit.cloudera.org:8080/7391 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I43ae8d6f347d0ce3d223bd6e86759bb33f2520e6 Gerrit-PatchSet: 2 Gerrit-Project: native-toolchain Gerrit-Branch: master Gerrit-Owner: Lars Volker
[Impala-ASF-CR] IMPALA-4674: Part 2: port backend exec to BufferPool
Dan Hecht has posted comments on this change. Change subject: IMPALA-4674: Part 2: port backend exec to BufferPool .. Patch Set 23: (27 comments) First batch of comments. Will continue but figure we can pipeline somewhat. http://gerrit.cloudera.org:8080/#/c/5801/23/be/src/exec/analytic-eval-node.cc File be/src/exec/analytic-eval-node.cc: PS23, Line 194: !buffer_pool_client_.is_registered() I guess that's needed for subplans? http://gerrit.cloudera.org:8080/#/c/5801/23/be/src/exec/analytic-eval-node.h File be/src/exec/analytic-eval-node.h: PS23, Line 336: reservation the stream will try to grow the reservation first, right? Maybe reword to be more accurate: When the tuple stream cannot acquire additional reservation, input_stream_ is unpinned... or something. http://gerrit.cloudera.org:8080/#/c/5801/23/be/src/exec/exec-node.cc File be/src/exec/exec-node.cc: Line 209: &buffer_pool_client_, resource_profile_.min_reservation); do we somewhere inside there dcheck that the exec-node has at least min_reservation reservation still? Line 226: resource_profile_.spillable_buffer_size < buffer_pool->min_buffer_len()) { when can that happen? Line 233: // to get some reservation. I'm not sure what that comment is saying. PS23, Line 240: if (resource_profile_.min_reservation > 0) { maybe delete that to exercise this path unconditionally http://gerrit.cloudera.org:8080/#/c/5801/23/be/src/exec/hash-table.h File be/src/exec/hash-table.h: PS23, Line 948: is that solved or was it not a problem? PS23, Line 532: Either a pointer to the single tuple of this row where does the tuple live in that case? PS23, Line 605: success that's kind of confusing given the OK on status means success also. Maybe make that 'got_memory' or 'got_reservation' (whichever the case is)? PS23, Line 617: blank PS23, Line 625: Status* status do we need to pass that indirectly for codegen/perf still? i thought we made status faster on the success path. PS23, Line 688: success same comment PS23, Line 854: Status* status same question about status perf. otherwise, seems better to pass HtData** PS23, Line 862: success same comment PS23, Line 879: status same comment vs DuplicateNode** PS23, Line 895: tatus* status) this is opposite all the "bool* success" interfaces. reason for that? PS23, Line 923: stores_tuples_ this could probably use a comment (the comment above seems to apply to the set of variables since this one is not dependent on join vs agg, but instead the number of tuples, right?). Also a short motivation for this optimization would be helpful. I suppose it's still worthwhile even with the new BTS format. http://gerrit.cloudera.org:8080/#/c/5801/23/be/src/exec/partitioned-aggregation-node-ir.cc File be/src/exec/partitioned-aggregation-node-ir.cc: Line 224: if (hash_tbl == nullptr) return false; // Hash table was not created - pass through. why does that happen now? or is it a bug fix? http://gerrit.cloudera.org:8080/#/c/5801/23/be/src/runtime/exec-env.cc File be/src/runtime/exec-env.cc: PS23, Line 88: page should we express that as min_buffer_size instead? maybe easy to understand by users that way? http://gerrit.cloudera.org:8080/#/c/5801/23/be/src/runtime/initial-reservations.cc File be/src/runtime/initial-reservations.cc: Line 69: DCHECK(success); Explain why success is expected (planner computation). http://gerrit.cloudera.org:8080/#/c/5801/23/be/src/runtime/initial-reservations.h File be/src/runtime/initial-reservations.h: PS23, Line 40: This creates trackers for initial reservations under those. This seems similar to a SubReservation, no? PS23, Line 47: Claim claim it from where? How is this different than Claim() claiming? also what do we mean by "peak" here? PS23, Line 48: that be claimed garbled PS23, Line 48: but not : /// returned why not returned? PS23, Line 50: Status explain failure modes http://gerrit.cloudera.org:8080/#/c/5801/23/be/src/runtime/runtime-state.h File be/src/runtime/runtime-state.h: PS23, Line 403: in order to not regress w.r.t. IMPALA-1575, don't we need to release the initial reservations when the last fragment instance completes? i think this shared_ptr was effectively doing that in the BufferedBlockMgr world, right? http://gerrit.cloudera.org:8080/#/c/5801/23/fe/src/main/java/org/apache/impala/util/MathUtil.java File fe/src/main/java/org/apache/impala/util/MathUtil.java: Line 45: } look familar -- To view, visit http://gerrit.cloudera.org:8080/5801 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I7fc7fe1c04e9dfb1a0c749fb56a5e0f2bf9c6c3e Gerrit-PatchSet: 23 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5240: Allow config of number of disk I/O threads per disk type
Bikramjeet Vig has posted comments on this change. Change subject: IMPALA-5240: Allow config of number of disk I/O threads per disk type .. Patch Set 9: (3 comments) http://gerrit.cloudera.org:8080/#/c/7232/9/be/src/runtime/disk-io-mgr.cc File be/src/runtime/disk-io-mgr.cc: PS9, Line 53: string num_io_threads_per_rotational_disk_help_msg = Substitute("Number of I/O threads" : " per rotational disk. Has priority over num_threads_per_disk. If neither is set, " : "defaults to $0 thread(s) per rotational disk", THREADS_PER_ROTATIONAL_DISK); : DEFINE_int32(num_io_threads_per_rotational_disk, 0, : num_io_threads_per_rotational_disk_help_msg.c_str()); @Matthew: as discussed, switched to a static const string instead which works as intended. @Tim: I wanted to avoid using macro constants as they are also being used during member initialization in the constructor PS9, Line 271: deafult_val > typo: default Done http://gerrit.cloudera.org:8080/#/c/7232/9/be/src/util/disk-info.h File be/src/util/disk-info.h: Line 61: if (disk_id >= disks_.size()) return false; > It shouldn't be in practice, but this is necessary for the tests which test Done -- To view, visit http://gerrit.cloudera.org:8080/7232 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I094aff3747104104fe0465d73dcdbef5d9892f7c Gerrit-PatchSet: 9 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Bikramjeet Vig Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4674: Part 2: port backend exec to BufferPool
Tim Armstrong has uploaded a new patch set (#24). Change subject: IMPALA-4674: Part 2: port backend exec to BufferPool .. IMPALA-4674: Part 2: port backend exec to BufferPool Always create global BufferPool at startup using 80% of memory and limit reservations to 80% of query memory (same as BufferedBlockMgr). The query's initial reservation is computed in the planner, claimed centrally (managed by the InitialReservations class) and distributed to query operators from there. min_spillable_buffer_size and default_spillable_buffer_size query options control the buffer size that the planner selects for spilling operators. Port ExecNodes to use BufferPool: * Each ExecNode has to claim its reservation during Open() * Port Sorter to use BufferPool. * Switch from BufferedTupleStream to BufferedTupleStreamV2 * Port HashTable to use BufferPool via a Suballocator. This also makes PAGG memory consumption more efficient (avoid wasting buffers) and improve the spilling algorithm: * Allow preaggs to execute with 0 reservation - if streams and hash tables cannot be allocated, it will pass through rows. * Halve the buffer requirement for spilling aggs - avoid allocating buffers for aggregated and unaggregated streams simultaneously. * Rebuild spilled partitions instead of repartitioning (IMPALA-2708) TODO in follow-up patches: * Rename BufferedTupleStreamV2 to BufferedTupleStream * Implement max_row_size query option. Testing: * Updated tests to reflect new memory requirements Change-Id: I7fc7fe1c04e9dfb1a0c749fb56a5e0f2bf9c6c3e --- M be/src/codegen/gen_ir_descriptions.py M be/src/exec/analytic-eval-node.cc M be/src/exec/analytic-eval-node.h M be/src/exec/exec-node.cc M be/src/exec/exec-node.h M be/src/exec/hash-table-test.cc M be/src/exec/hash-table.cc M be/src/exec/hash-table.h M be/src/exec/hash-table.inline.h M be/src/exec/nested-loop-join-builder.cc M be/src/exec/partitioned-aggregation-node-ir.cc M be/src/exec/partitioned-aggregation-node.cc M be/src/exec/partitioned-aggregation-node.h M be/src/exec/partitioned-hash-join-builder-ir.cc M be/src/exec/partitioned-hash-join-builder.cc M be/src/exec/partitioned-hash-join-builder.h M be/src/exec/partitioned-hash-join-node-ir.cc M be/src/exec/partitioned-hash-join-node.cc M be/src/exec/partitioned-hash-join-node.h M be/src/exec/partitioned-hash-join-node.inline.h M be/src/exec/sort-node.cc M be/src/exec/sort-node.h M be/src/runtime/CMakeLists.txt D be/src/runtime/buffered-block-mgr-test.cc D be/src/runtime/buffered-block-mgr.cc D be/src/runtime/buffered-block-mgr.h D be/src/runtime/buffered-tuple-stream-test.cc D be/src/runtime/buffered-tuple-stream.cc D be/src/runtime/buffered-tuple-stream.h D be/src/runtime/buffered-tuple-stream.inline.h M be/src/runtime/bufferpool/buffer-pool.cc M be/src/runtime/bufferpool/buffer-pool.h M be/src/runtime/bufferpool/reservation-tracker.h M be/src/runtime/disk-io-mgr.cc M be/src/runtime/exec-env.cc M be/src/runtime/exec-env.h M be/src/runtime/fragment-instance-state.cc A be/src/runtime/initial-reservations.cc A be/src/runtime/initial-reservations.h M be/src/runtime/query-state.cc M be/src/runtime/query-state.h M be/src/runtime/row-batch.cc M be/src/runtime/row-batch.h M be/src/runtime/runtime-filter.h M be/src/runtime/runtime-state.cc M be/src/runtime/runtime-state.h M be/src/runtime/sorter.cc M be/src/runtime/sorter.h M be/src/runtime/test-env.cc M be/src/runtime/test-env.h M be/src/runtime/tmp-file-mgr-test.cc M be/src/runtime/tmp-file-mgr.h M be/src/service/client-request-state.cc M be/src/service/query-options.cc M be/src/service/query-options.h M be/src/util/bloom-filter.h M be/src/util/static-asserts.cc M common/thrift/Frontend.thrift M common/thrift/ImpalaInternalService.thrift M common/thrift/ImpalaService.thrift M common/thrift/PlanNodes.thrift M common/thrift/generate_error_codes.py M fe/src/main/java/org/apache/impala/common/RuntimeEnv.java M fe/src/main/java/org/apache/impala/planner/AggregationNode.java M fe/src/main/java/org/apache/impala/planner/AnalyticEvalNode.java M fe/src/main/java/org/apache/impala/planner/DataSourceScanNode.java M fe/src/main/java/org/apache/impala/planner/DataStreamSink.java M fe/src/main/java/org/apache/impala/planner/EmptySetNode.java M fe/src/main/java/org/apache/impala/planner/ExchangeNode.java M fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java M fe/src/main/java/org/apache/impala/planner/HBaseTableSink.java M fe/src/main/java/org/apache/impala/planner/HashJoinNode.java M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java M fe/src/main/java/org/apache/impala/planner/HdfsTableSink.java M fe/src/main/java/org/apache/impala/planner/JoinBuildSink.java M fe/src/main/java/org/apache/impala/planner/KuduScanNode.java M fe/src/main/java/org/apache/impala/planner/KuduTableSink.java M fe/src/main/java/org/apache/impala/planner/NestedLoopJoinNode.java M fe/src/main/java/org/apache/impala/planner/Pla
[Impala-ASF-CR] IMPALA-4978 / IMPALA-5631: [DOCS] Add FQDN known issue
Henry Robinson has posted comments on this change. Change subject: IMPALA-4978 / IMPALA-5631: [DOCS] Add FQDN known issue .. Patch Set 2: (1 comment) After some discussion, I think two issues are being conflated here. 1. If a complete KRB principal is not supplied via --principal, Impala tries to guess at the hostname part using gethostname(). This might not return the FQDN which is needed for Kerberos. The linked bug (IMPALA-4978) is really talking about this issue. 2. However, the more important issue is that IMPALA-5253 made it critical that for TLS-enabled deployments, every service thinks its hostname is the FQDN. That hostname is controlled by --hostname. If that is not set, Impala falls back to gethostname() - and this is where IMPALA-4978 comes in, because it should also use the getaddrinfo() call first. The workaround for this issue is to set --hostname correctly to the FQDN (or ensure that 'hostname' returns the FQDN by configuring name resolution correctly). We found this at Cloudera because in some situations our cluster management software would set --hostname to the shortname. http://gerrit.cloudera.org:8080/#/c/7388/2/docs/topics/impala_known_issues.xml File docs/topics/impala_known_issues.xml: PS2, Line 156: This problem might occur immediately after an upgrade, due to changes in cluster : management software that supplies the --hostname flag : automatically to the Impala-related daemons. It's a bit hard to understand *what* kind of changes would cause this, and without that knowledge this is a bit vague. In my opinion it makes sense to either link directly to the CM known issue (I think it's perfectly ok to point to third-party software that breaks Impala during upgrades), or to remove this entirely. -- To view, visit http://gerrit.cloudera.org:8080/7388 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ib039d0102878f1c05470371f581cb258287b9bc0 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: John Russell Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: John Russell Gerrit-Reviewer: Lars Volker Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5605: [DOCS] New known issue for upping thread resource limits
Impala Public Jenkins has posted comments on this change. Change subject: IMPALA-5605: [DOCS] New known issue for upping thread resource limits .. Patch Set 2: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/7348 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I4300fcb30c1bc0b1f3cd4eeeb25ad05ec4173fa6 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: John Russell Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: John Russell Gerrit-Reviewer: Matthew Mulder Gerrit-Reviewer: Mostafa Mokhtar Gerrit-HasComments: No
[Impala-ASF-CR] [DOCS] Advise setting vm.overcommit memory=1 in various places
Impala Public Jenkins has submitted this change and it was merged. Change subject: [DOCS] Advise setting vm.overcommit_memory=1 in various places .. [DOCS] Advise setting vm.overcommit_memory=1 in various places Reusing the same advice under "Known Issues", scalability considerations, and in the Impala + Kerberos section. Change-Id: Icbfa755e2c9769a8458fd93362769856cf32e301 Reviewed-on: http://gerrit.cloudera.org:8080/7349 Reviewed-by: Mostafa Mokhtar Tested-by: Impala Public Jenkins --- M docs/shared/impala_common.xml M docs/topics/impala_kerberos.xml M docs/topics/impala_known_issues.xml M docs/topics/impala_scalability.xml 4 files changed, 59 insertions(+), 0 deletions(-) Approvals: Impala Public Jenkins: Verified Mostafa Mokhtar: Looks good to me, approved -- To view, visit http://gerrit.cloudera.org:8080/7349 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: merged Gerrit-Change-Id: Icbfa755e2c9769a8458fd93362769856cf32e301 Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: John Russell Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: John Russell Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Sailesh Mukil
[Impala-ASF-CR] [DOCS] Advise setting vm.overcommit memory=1 in various places
Impala Public Jenkins has posted comments on this change. Change subject: [DOCS] Advise setting vm.overcommit_memory=1 in various places .. Patch Set 2: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/7349 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Icbfa755e2c9769a8458fd93362769856cf32e301 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: John Russell Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: John Russell Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Sailesh Mukil Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5498: Support for partial sorts
Tim Armstrong has posted comments on this change. Change subject: IMPALA-5498: Support for partial sorts .. Patch Set 3: (6 comments) http://gerrit.cloudera.org:8080/#/c/7267/4/be/src/exec/partial-sort-node.cc File be/src/exec/partial-sort-node.cc: Line 111: *eos = true; Do we need to update num_rows_returned_ here? Line 148: We can't have a partial sort in a subplan, right? It's probably best to just put a DCHECK(false) here since there's no way to test the code. PS4, Line 149: :Res Let's use nullptr consistently in the new code. http://gerrit.cloudera.org:8080/#/c/7267/3/be/src/runtime/sorter.h File be/src/runtime/sorter.h: Line 96: Sorter(const TupleRowComparator& compare_less_than, > This will be included in a follow up patch. WFM. If you want to wait for https://gerrit.cloudera.org/#/c/5801/ to land there will be a way to directly set a maximum on bytes of buffers in the planner, which would accomplish the same thing. http://gerrit.cloudera.org:8080/#/c/7267/4/be/src/runtime/sorter.h File be/src/runtime/sorter.h: PS4, Line 99: RuntimeProfile* profile, Run I feel like this is breaking some of the encapsulation of the sort implementation. This detail leaks out into the planner anyway so I guess it's not really encapsulated. How about we just document more clearly what value is expected. The class comment talks about the number of blocks per run, so it could be extended to talk about the minimum requirement for spilling and non-spilling sorts. http://gerrit.cloudera.org:8080/#/c/7267/3/common/thrift/PlanNodes.thrift File common/thrift/PlanNodes.thrift: PS3, Line 349: highest/lowest > My problem with that wording is that it could be interpreted as "take the f WFM -- To view, visit http://gerrit.cloudera.org:8080/7267 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ieec2a15a0cc5240b1c13682067ab64670d1e0a38 Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Thomas Tauber-Marshall Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zach Amsden Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5583: [DOCS] Document default join distribution mode query option
Impala Public Jenkins has submitted this change and it was merged. Change subject: IMPALA-5583: [DOCS] Document default_join_distribution_mode query option .. IMPALA-5583: [DOCS] Document default_join_distribution_mode query option New page for the query option. Change-Id: I4ec6213efc46bce0fe07c590841d51c009fb5c84 Reviewed-on: http://gerrit.cloudera.org:8080/7300 Reviewed-by: Mostafa Mokhtar Tested-by: Impala Public Jenkins --- M docs/impala.ditamap M docs/impala_keydefs.ditamap A docs/topics/impala_default_join_distribution_mode.xml 3 files changed, 136 insertions(+), 0 deletions(-) Approvals: Impala Public Jenkins: Verified Mostafa Mokhtar: Looks good to me, approved -- To view, visit http://gerrit.cloudera.org:8080/7300 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: merged Gerrit-Change-Id: I4ec6213efc46bce0fe07c590841d51c009fb5c84 Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: John Russell Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: John Russell Gerrit-Reviewer: Mostafa Mokhtar
[Impala-ASF-CR] IMPALA-5583: [DOCS] Document default join distribution mode query option
Impala Public Jenkins has posted comments on this change. Change subject: IMPALA-5583: [DOCS] Document default_join_distribution_mode query option .. Patch Set 4: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/7300 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I4ec6213efc46bce0fe07c590841d51c009fb5c84 Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: John Russell Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: John Russell Gerrit-Reviewer: Mostafa Mokhtar Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5605: [DOCS] New known issue for upping thread resource limits
Impala Public Jenkins has submitted this change and it was merged. Change subject: IMPALA-5605: [DOCS] New known issue for upping thread resource limits .. IMPALA-5605: [DOCS] New known issue for upping thread resource limits Change-Id: I4300fcb30c1bc0b1f3cd4eeeb25ad05ec4173fa6 Reviewed-on: http://gerrit.cloudera.org:8080/7348 Reviewed-by: Mostafa Mokhtar Tested-by: Impala Public Jenkins --- M docs/topics/impala_known_issues.xml 1 file changed, 33 insertions(+), 0 deletions(-) Approvals: Impala Public Jenkins: Verified Mostafa Mokhtar: Looks good to me, approved -- To view, visit http://gerrit.cloudera.org:8080/7348 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: merged Gerrit-Change-Id: I4300fcb30c1bc0b1f3cd4eeeb25ad05ec4173fa6 Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: John Russell Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: John Russell Gerrit-Reviewer: Matthew Mulder Gerrit-Reviewer: Mostafa Mokhtar
[Impala-ASF-CR] IMPALA-5583: [DOCS] Document default join distribution mode query option
Mostafa Mokhtar has posted comments on this change. Change subject: IMPALA-5583: [DOCS] Document default_join_distribution_mode query option .. Patch Set 4: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/7300 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I4ec6213efc46bce0fe07c590841d51c009fb5c84 Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: John Russell Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: John Russell Gerrit-Reviewer: Mostafa Mokhtar Gerrit-HasComments: No
[Impala-ASF-CR] [DOCS] Advise setting vm.overcommit memory=1 in various places
Mostafa Mokhtar has posted comments on this change. Change subject: [DOCS] Advise setting vm.overcommit_memory=1 in various places .. Patch Set 2: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/7349 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Icbfa755e2c9769a8458fd93362769856cf32e301 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: John Russell Gerrit-Reviewer: John Russell Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Sailesh Mukil Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5583: [DOCS] Document default join distribution mode query option
Impala Public Jenkins has posted comments on this change. Change subject: IMPALA-5583: [DOCS] Document default_join_distribution_mode query option .. Patch Set 4: Build started: http://jenkins.impala.io:8080/job/gerrit-docs-submit/140/ -- To view, visit http://gerrit.cloudera.org:8080/7300 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I4ec6213efc46bce0fe07c590841d51c009fb5c84 Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: John Russell Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: John Russell Gerrit-Reviewer: Mostafa Mokhtar Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5605: [DOCS] New known issue for upping thread resource limits
Impala Public Jenkins has posted comments on this change. Change subject: IMPALA-5605: [DOCS] New known issue for upping thread resource limits .. Patch Set 2: Build started: http://jenkins.impala.io:8080/job/gerrit-docs-submit/139/ -- To view, visit http://gerrit.cloudera.org:8080/7348 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I4300fcb30c1bc0b1f3cd4eeeb25ad05ec4173fa6 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: John Russell Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: John Russell Gerrit-Reviewer: Matthew Mulder Gerrit-Reviewer: Mostafa Mokhtar Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5605: [DOCS] New known issue for upping thread resource limits
Mostafa Mokhtar has posted comments on this change. Change subject: IMPALA-5605: [DOCS] New known issue for upping thread resource limits .. Patch Set 2: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/7348 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I4300fcb30c1bc0b1f3cd4eeeb25ad05ec4173fa6 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: John Russell Gerrit-Reviewer: John Russell Gerrit-Reviewer: Matthew Mulder Gerrit-Reviewer: Mostafa Mokhtar Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4674: Part 1: remove old aggs and joins
Tim Armstrong has posted comments on this change. Change subject: IMPALA-4674: Part 1: remove old aggs and joins .. Patch Set 8: Code-Review+2 rebase -- To view, visit http://gerrit.cloudera.org:8080/7102 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I5ce2236d37c0ced188a4a81f7e00d4b8ac98e7e9 Gerrit-PatchSet: 8 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5629: avoid expensive list::size() call
Impala Public Jenkins has posted comments on this change. Change subject: IMPALA-5629: avoid expensive list::size() call .. Patch Set 2: Build started: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/840/ -- To view, visit http://gerrit.cloudera.org:8080/7382 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Id83fcf68dcc3ea729df167885f999ff32b861e66 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4862: make resource profile consistent with backend behaviour
Impala Public Jenkins has posted comments on this change. Change subject: IMPALA-4862: make resource profile consistent with backend behaviour .. Patch Set 18: Build started: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/839/ -- To view, visit http://gerrit.cloudera.org:8080/7223 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I492cf5052bb27e4e335395e2a8f8a3b07248ec9d Gerrit-PatchSet: 18 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5498: Support for partial sorts
Thomas Tauber-Marshall has posted comments on this change. Change subject: IMPALA-5498: Support for partial sorts .. Patch Set 4: (11 comments) http://gerrit.cloudera.org:8080/#/c/7267/3/be/src/exec/partial-sort-node.cc File be/src/exec/partial-sort-node.cc: Line 32: sorter_(NULL), > This doesn't need to handle offset? Done Line 127: > Since this only requires a partial sort, you can exit early when the cumula As noted elsewhere, removing limit-related code. Line 140: *eos = input_eos_; > I guess it is not clear to me why allowing a limit without an offset is use Right, so 'offset' never really makes sense with partial-sort-node, since as you say there's not a total ordering and you're not guaranteed that one instance of partial-sort-node with an offset can pick up where another instance left off with a limit. A limit is potentially useful just for cutting off the number of rows returned in a single partial-sort-node, but with the initial use case for this there's no way to end up with a partial-sort-node with a limit, so in the interest of keeping things simple and not having untested code paths, I think I'll just remove the limit code. http://gerrit.cloudera.org:8080/#/c/7267/3/be/src/exec/partial-sort-node.h File be/src/exec/partial-sort-node.h: Line 58: virtual Status QueryMaintenance(RuntimeState* state); > Just for general niceness, can you also update be/src/exec/sort-node.h's pr Done http://gerrit.cloudera.org:8080/#/c/7267/3/be/src/runtime/sorter.cc File be/src/runtime/sorter.cc: Line 1384: expr_mem_pool, &sort_tuple_expr_evals_)); > Need to update this calculation for partial sorts. Done Line 1434: if (sorted_runs_.size() == 1) { > Unnecessary - this is unconditionally dereferenced Done http://gerrit.cloudera.org:8080/#/c/7267/3/be/src/runtime/sorter.h File be/src/runtime/sorter.h: Line 96: Sorter(const TupleRowComparator& compare_less_than, > Wasn't the plan to apply the partition sort memory limit to the sorter itse This will be included in a follow up patch. http://gerrit.cloudera.org:8080/#/c/7267/3/common/thrift/PlanNodes.thrift File common/thrift/PlanNodes.thrift: PS3, Line 349: e { > Should this be "first" since the order is specified elsewhere. My problem with that wording is that it could be interpreted as "take the first N elements of input and then sort them" Maybe: "Return the first N sorted elements"? Line 360: struct TSortNode { > Please ignore my earlier comments about offset handling (although a comment Added to partial-sort-node.h http://gerrit.cloudera.org:8080/#/c/7267/3/fe/src/main/java/org/apache/impala/planner/SortNode.java File fe/src/main/java/org/apache/impala/planner/SortNode.java: Line 58: private final long PARTIAL_SORT_MEM_LIMIT = 128 * 1024 * 1024; > It might be clearer to specify this in bytes, then just make sure that its Done Line 286: // The memory limit cannot be less than the size of the required blocks. > This looks good to me. Done -- To view, visit http://gerrit.cloudera.org:8080/7267 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ieec2a15a0cc5240b1c13682067ab64670d1e0a38 Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Thomas Tauber-Marshall Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zach Amsden Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5498: Support for partial sorts
Thomas Tauber-Marshall has uploaded a new patch set (#4). Change subject: IMPALA-5498: Support for partial sorts .. IMPALA-5498: Support for partial sorts Impala currently supports total sorts (the entire set of data is sorted) and top-n sorts (only the highest/lowest n elements are sorted). This patch adds the ability to do partial sorts, where the data is divided up into some number of subsets, each of which is sorted individually. It accomplishes this by adding a new exec node, PartialSortNode. When PartialSortNode::GetNext() is called, it retrieves input up to its memory limit, uses the existing Sorter class to sort it, and outputs it. This is faster than a total sort with SortNode as it avoids the need to spill if the input is larger than the memory limit. Future work will look into setting a more restrictive memory limit on the PartialSortNode. In the planner, the SortNode plan node is used, with an enum value indicating if it is a total or partial sort. As a first use case, partial sort is used where a total sort was used previously for inserts into Kudu. Testing: - E2E test with a large INSERT into a Kudu table with a mem limit. Checks that no spills occurred. - Updated planner tests. - Existing E2E tests and stress test verify correctness of INSERT. - Perf tests on the 10 node cluster: inserting tpch_100.lineitem into a Kudu table with mem_limit=3gb: Previously: 5 runs are spilled, sort took 7m33s Now: no spills, sort takes 6m19s, for ~18% speedup Change-Id: Ieec2a15a0cc5240b1c13682067ab64670d1e0a38 --- M be/src/exec/CMakeLists.txt M be/src/exec/exec-node.cc A be/src/exec/partial-sort-node.cc A be/src/exec/partial-sort-node.h M be/src/exec/sort-node.cc M be/src/exec/sort-node.h M be/src/runtime/sorter.cc M be/src/runtime/sorter.h M common/thrift/PlanNodes.thrift M fe/src/main/java/org/apache/impala/planner/AnalyticPlanner.java M fe/src/main/java/org/apache/impala/planner/Planner.java M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java M fe/src/main/java/org/apache/impala/planner/SortNode.java M testdata/workloads/functional-planner/queries/PlannerTest/kudu-upsert.test M testdata/workloads/functional-planner/queries/PlannerTest/kudu.test M testdata/workloads/functional-query/queries/QueryTest/kudu_insert.test 16 files changed, 458 insertions(+), 72 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/67/7267/4 -- To view, visit http://gerrit.cloudera.org:8080/7267 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ieec2a15a0cc5240b1c13682067ab64670d1e0a38 Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Thomas Tauber-Marshall Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zach Amsden
[Impala-ASF-CR] IMPALA-4862: make resource profile consistent with backend behaviour
Dan Hecht has posted comments on this change. Change subject: IMPALA-4862: make resource profile consistent with backend behaviour .. Patch Set 17: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/7223 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I492cf5052bb27e4e335395e2a8f8a3b07248ec9d Gerrit-PatchSet: 17 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5583: [DOCS] Document default join distribution mode query option
John Russell has posted comments on this change. Change subject: IMPALA-5583: [DOCS] Document default_join_distribution_mode query option .. Patch Set 3: (1 comment) http://gerrit.cloudera.org:8080/#/c/7300/3/docs/topics/impala_default_join_distribution_mode.xml File docs/topics/impala_default_join_distribution_mode.xml: Line 52: on the right-hand side of the join is broadcast. This behavior > I believe the word right-hand side is reserved to the position of the table I knew 'RHS' was going to get me in trouble. :-) -- To view, visit http://gerrit.cloudera.org:8080/7300 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I4ec6213efc46bce0fe07c590841d51c009fb5c84 Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: John Russell Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: John Russell Gerrit-Reviewer: Mostafa Mokhtar Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5643: Add total number of threads created per group to /threadz
Lars Volker has uploaded a new change for review. http://gerrit.cloudera.org:8080/7390 Change subject: IMPALA-5643: Add total number of threads created per group to /threadz .. IMPALA-5643: Add total number of threads created per group to /threadz Change-Id: I8b532220b6940aa3d6b88a0ebaa247f6914fef53 --- M be/src/util/thread.cc M www/threadz.tmpl 2 files changed, 15 insertions(+), 7 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/90/7390/1 -- To view, visit http://gerrit.cloudera.org:8080/7390 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: I8b532220b6940aa3d6b88a0ebaa247f6914fef53 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Lars Volker
[Impala-ASF-CR] IMPALA-5623: Fix lag() on STRING cols to release UDF mem
Impala Public Jenkins has posted comments on this change. Change subject: IMPALA-5623: Fix lag() on STRING cols to release UDF mem .. Patch Set 2: Build started: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/838/ -- To view, visit http://gerrit.cloudera.org:8080/7371 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Id2b69b4ccb9cac076abca19bed6f0b1dd11dfff3 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Matthew Jacobs Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Zach Amsden Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5031: Set DiskIoMgr::ScanRange::is cancelled before read
Tim Armstrong has posted comments on this change. Change subject: IMPALA-5031: Set DiskIoMgr::ScanRange::is_cancelled_ before read .. Patch Set 4: FWIW in practice it's not possible to re-use a ScanRange object because the lifecycle is a bit messed-up - IMPALA-4249 -- To view, visit http://gerrit.cloudera.org:8080/7294 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I22898ec96ac44c4902f8072f27453cfc58358cae Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Jim Apple Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4862: make resource profile consistent with backend behaviour
Tim Armstrong has posted comments on this change. Change subject: IMPALA-4862: make resource profile consistent with backend behaviour .. Patch Set 16: (10 comments) http://gerrit.cloudera.org:8080/#/c/7223/16/common/thrift/Frontend.thrift File common/thrift/Frontend.thrift: PS16, Line 394: total > total claims Done http://gerrit.cloudera.org:8080/#/c/7223/16/fe/src/main/java/org/apache/impala/planner/ParallelPlanner.java File fe/src/main/java/org/apache/impala/planner/ParallelPlanner.java: Line 89: // We already recursed on the join build fragment in createBuildPlan(). > is this a bug fix independent of this patch? this statement isn't obviously Yes, the problem was that createBuildPlan() was not correctly updating the fragment tree, and this function was implicitly depending on that bug. Before these fixes we got obviously incorrect output in the planner tests - the resource consumption of a query was greater than the sum of all operators because operators were double-counted. http://gerrit.cloudera.org:8080/#/c/7223/16/fe/src/main/java/org/apache/impala/planner/PlanFragment.java File fe/src/main/java/org/apache/impala/planner/PlanFragment.java: PS16, Line 210: Compute the resource profile of the fragment > I think this should say something about being host wide (i.e. for all insta Done Line 224: // finish. > where does that happen? JoinNode.computeTreeResourceProfiles(). Added a reference in the comment. Line 385: builder.append("Per-Host Resources: "); > did you and Alex already decide that this should be output regardless of ex Maybe I misunderstood the question, but we only output the fragment header at EXTENDED explain level and above anyway - this patch doesn't change that. Yeah Per-Host was meant to differentiate that. http://gerrit.cloudera.org:8080/#/c/7223/16/fe/src/main/java/org/apache/impala/planner/PlanNode.java File fe/src/main/java/org/apache/impala/planner/PlanNode.java: PS16, Line 617: resources > peak resources? Done http://gerrit.cloudera.org:8080/#/c/7223/16/fe/src/main/java/org/apache/impala/planner/Planner.java File fe/src/main/java/org/apache/impala/planner/Planner.java: PS16, Line 342: the given plans > obvious* I augmented the comment to explicitly mention fragment instances. PS16, Line 354: Sum of per-host minimum reservations over all plan nodes and sinks. > that's still the ambiguous way of describing it. How about rewording that t I thought it was useful to describe how it was computed. Reworded the comment a bit to be more concise and explain what it is before how it is computed. PS16, Line 366: so we are : // forced to > forced to what? Oops didn't finish writing this comment (or left it in an editor buffer). http://gerrit.cloudera.org:8080/#/c/7223/16/testdata/workloads/functional-planner/queries/PlannerTest/constant-folding.test File testdata/workloads/functional-planner/queries/PlannerTest/constant-folding.test: PS16, Line 141: mem-estimate=15.68KB mem-reservation=136.00MB > does it make sense for the estimate ever less than the reservation? i guess In the current patch, no, since the reservations aren't really enforced. With the switch-over to the buffer pool that would make sense. Filed IMPALA-5641 to keep track of it - it will require updating a bunch of planner tests so probably best to do it as a separate patch. -- To view, visit http://gerrit.cloudera.org:8080/7223 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I492cf5052bb27e4e335395e2a8f8a3b07248ec9d Gerrit-PatchSet: 16 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4862: make resource profile consistent with backend behaviour
Tim Armstrong has uploaded a new patch set (#17). Change subject: IMPALA-4862: make resource profile consistent with backend behaviour .. IMPALA-4862: make resource profile consistent with backend behaviour This moves away from the PipelinedPlanNodeSet approach of enumerating sets of concurrently-executing nodes because unions would force creating many overlapping sets of nodes. The new approach computes the peak resources during Open() and the peak resources between Open() and Close() (i.e. while calling GetNext()) bottom-up for each plan node in a fragment. The fragment resources are then combined to produce the query resources. The basic assumptions for the new resource estimates are: * resources are acquired during or after the first call to Open() and released in Close(). * Blocking nodes call Open() on their child before acquiring their own resources (this required some backend changes). * Blocking nodes call Close() on their children before returning from Open(). * The peak resource consumption of the query is the sum of the independent fragments (except for the parallel join build plans where we can assume there will be synchronisation). This is conservative but we don't synchronise fragment Open() and Close() across exchanges so can't make stronger assumptions in general. Also compute the sum of minimum reservations. This will be useful in the backend to determine exactly when all of the initial reservations have been claimed from a shared pool of initial reservations. Testing: * Updated planner tests to reflect behavioural changes. * Added extra resource requirement planner tests for unions, subplans, pipelines of blocking operators, and bushy join plans. * Added single-node plans to resource-requirements tests. These have more complex plan trees inside a single fragment, which is useful for testing the peak resource requirement logic. Change-Id: I492cf5052bb27e4e335395e2a8f8a3b07248ec9d --- M be/src/exec/blocking-join-node.cc M be/src/exec/blocking-join-node.h M be/src/exec/exec-node.h M be/src/exec/hash-join-node.cc M be/src/exec/partitioned-aggregation-node.cc M be/src/exec/partitioned-hash-join-node.cc M be/src/exec/partitioned-hash-join-node.h M be/src/exec/sort-node.cc M be/src/util/backend-gflag-util.cc M common/thrift/BackendGflags.thrift M common/thrift/Frontend.thrift M fe/src/main/java/org/apache/impala/analysis/Analyzer.java M fe/src/main/java/org/apache/impala/catalog/HBaseTable.java M fe/src/main/java/org/apache/impala/common/TreeNode.java M fe/src/main/java/org/apache/impala/planner/AggregationNode.java M fe/src/main/java/org/apache/impala/planner/AnalyticEvalNode.java M fe/src/main/java/org/apache/impala/planner/DataSink.java M fe/src/main/java/org/apache/impala/planner/DataSourceScanNode.java M fe/src/main/java/org/apache/impala/planner/EmptySetNode.java M fe/src/main/java/org/apache/impala/planner/ExchangeNode.java M fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java M fe/src/main/java/org/apache/impala/planner/HashJoinNode.java M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java M fe/src/main/java/org/apache/impala/planner/JoinNode.java M fe/src/main/java/org/apache/impala/planner/KuduScanNode.java M fe/src/main/java/org/apache/impala/planner/NestedLoopJoinNode.java M fe/src/main/java/org/apache/impala/planner/ParallelPlanner.java D fe/src/main/java/org/apache/impala/planner/PipelinedPlanNodeSet.java M fe/src/main/java/org/apache/impala/planner/PlanFragment.java M fe/src/main/java/org/apache/impala/planner/PlanNode.java M fe/src/main/java/org/apache/impala/planner/Planner.java M fe/src/main/java/org/apache/impala/planner/ResourceProfile.java M fe/src/main/java/org/apache/impala/planner/SelectNode.java M fe/src/main/java/org/apache/impala/planner/SingularRowSrcNode.java M fe/src/main/java/org/apache/impala/planner/SortNode.java M fe/src/main/java/org/apache/impala/planner/SubplanNode.java M fe/src/main/java/org/apache/impala/planner/UnionNode.java M fe/src/main/java/org/apache/impala/planner/UnnestNode.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java M fe/src/main/java/org/apache/impala/service/Frontend.java M fe/src/main/java/org/apache/impala/util/BitUtil.java A fe/src/main/java/org/apache/impala/util/MathUtil.java A fe/src/test/java/org/apache/impala/util/BitUtilTest.java A fe/src/test/java/org/apache/impala/util/MathUtilTest.java M testdata/workloads/functional-planner/queries/PlannerTest/constant-folding.test M testdata/workloads/functional-planner/queries/PlannerTest/disable-codegen.test M testdata/workloads/functional-planner/queries/PlannerTest/fk-pk-join-detection.test M testdata/workloads/functional-planner/queries/PlannerTest/kudu-selectivity.test M testdata/workloads/functional-planner/queries/PlannerTest/mt-dop-validation.test M testdata/workloads/functional-planner/queries/PlannerTest/parquet-filtering.test M testdata/workl
[Impala-ASF-CR] IMPALA-5164: Better benchmark heuristic
Zach Amsden has uploaded a new change for review. http://gerrit.cloudera.org:8080/7389 Change subject: IMPALA-5164: Better benchmark heuristic .. IMPALA-5164: Better benchmark heuristic If any benchmark took longer than 50ms to run, the test suite would bail. This doesn't seem reasonable for tests which have large data structures. Just increase the timeout in that case while still trying to measure context-switch free runs. Change-Id: Ie6fb1283716dbc591c10238d85b9ab436d02fb96 --- M be/src/benchmarks/bloom-filter-benchmark.cc M be/src/util/benchmark.cc 2 files changed, 9 insertions(+), 11 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/89/7389/1 -- To view, visit http://gerrit.cloudera.org:8080/7389 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: Ie6fb1283716dbc591c10238d85b9ab436d02fb96 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Zach Amsden
[Impala-ASF-CR] IMPALA-5031: Set DiskIoMgr::ScanRange::is cancelled before read
Jim Apple has posted comments on this change. Change subject: IMPALA-5031: Set DiskIoMgr::ScanRange::is_cancelled_ before read .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/7294/2/be/src/runtime/disk-io-mgr.h File be/src/runtime/disk-io-mgr.h: PS2, Line 584: is_cancelled_ = false; > Nevermind about my comment of Reset() being used only in test code. It's us Would you comment on IMPALA-5634 to that effect? -- To view, visit http://gerrit.cloudera.org:8080/7294 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I22898ec96ac44c4902f8072f27453cfc58358cae Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Jim Apple Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Michael Ho Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4862: make resource profile consistent with backend behaviour
Dan Hecht has posted comments on this change. Change subject: IMPALA-4862: make resource profile consistent with backend behaviour .. Patch Set 16: (1 comment) http://gerrit.cloudera.org:8080/#/c/7223/16/fe/src/main/java/org/apache/impala/planner/Planner.java File fe/src/main/java/org/apache/impala/planner/Planner.java: PS16, Line 342: the given plans > I don't think it's opposite that this incorporates DOP. obvious* -- To view, visit http://gerrit.cloudera.org:8080/7223 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I492cf5052bb27e4e335395e2a8f8a3b07248ec9d Gerrit-PatchSet: 16 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5031: Set DiskIoMgr::ScanRange::is cancelled before read
Michael Ho has posted comments on this change. Change subject: IMPALA-5031: Set DiskIoMgr::ScanRange::is_cancelled_ before read .. Patch Set 4: Code-Review+1 (1 comment) http://gerrit.cloudera.org:8080/#/c/7294/2/be/src/runtime/disk-io-mgr.h File be/src/runtime/disk-io-mgr.h: PS2, Line 584: is_cancelled_ = false; > Done: Nevermind about my comment of Reset() being used only in test code. It's used here too: https://github.com/apache/incubator-impala/blob/master/be/src/exec/hdfs-scan-node-base.cc#L614 Those bool flags are currently being initialized in InitInternal(). We should consider moving their initialization to Reset() instead. -- To view, visit http://gerrit.cloudera.org:8080/7294 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I22898ec96ac44c4902f8072f27453cfc58358cae Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Jim Apple Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Michael Ho Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5164: Better benchmark heuristic
Jim Apple has posted comments on this change. Change subject: IMPALA-5164: Better benchmark heuristic .. Patch Set 1: I won't have time to review this for a few days; you might find a quicker review from whomever reviewed your first patch that changed how benchmarking worked. Also, it's pretty rare to +anything one's own patch except on rebase. -- To view, visit http://gerrit.cloudera.org:8080/7389 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ie6fb1283716dbc591c10238d85b9ab436d02fb96 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Zach Amsden Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Zach Amsden Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5164: Better benchmark heuristic
Zach Amsden has posted comments on this change. Change subject: IMPALA-5164: Better benchmark heuristic .. Patch Set 1: Code-Review+1 Turns out this approach works even better and avoids any special changes to benchmarks. -- To view, visit http://gerrit.cloudera.org:8080/7389 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ie6fb1283716dbc591c10238d85b9ab436d02fb96 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Zach Amsden Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Zach Amsden Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4862: make resource profile consistent with backend behaviour
Dan Hecht has posted comments on this change. Change subject: IMPALA-4862: make resource profile consistent with backend behaviour .. Patch Set 16: (11 comments) I only did a quick look through the .test file changes. LMK if there's something specific there I should verify. http://gerrit.cloudera.org:8080/#/c/7223/16/common/thrift/Frontend.thrift File common/thrift/Frontend.thrift: PS16, Line 394: total total claims http://gerrit.cloudera.org:8080/#/c/7223/16/fe/src/main/java/org/apache/impala/planner/ParallelPlanner.java File fe/src/main/java/org/apache/impala/planner/ParallelPlanner.java: Line 89: // We already recursed on the join build fragment in createBuildPlan(). is this a bug fix independent of this patch? this statement isn't obviously true given the comment on collectJoins(). do we have sufficient test coverage here? http://gerrit.cloudera.org:8080/#/c/7223/16/fe/src/main/java/org/apache/impala/planner/PlanFragment.java File fe/src/main/java/org/apache/impala/planner/PlanFragment.java: PS16, Line 210: Compute the resource profile of the fragment I think this should say something about being host wide (i.e. for all instances not just for one instance of this plan fragment). Line 224: // finish. where does that happen? Line 385: builder.append("Per-Host Resources: "); did you and Alex already decide that this should be output regardless of explain level? Notes we only print at TExplainLevel.EXTENDED, right? It's a bit subtle that this includes all instances, whereas at the node level it will be for a single instance, but I guess the "Per-Host" is meant to differentiate that? http://gerrit.cloudera.org:8080/#/c/7223/15/fe/src/main/java/org/apache/impala/planner/PlanNode.java File fe/src/main/java/org/apache/impala/planner/PlanNode.java: Line 117: protected ResourceProfile nodeResourceProfile_ = ResourceProfile.invalid(); it seems a little weird that the this doesn't really have a single meaning. It kind of depends on the type of plan node and then is interpreted appropriately by computeTreeResourceProfiles(). But you don't have to revisit this now. http://gerrit.cloudera.org:8080/#/c/7223/16/fe/src/main/java/org/apache/impala/planner/PlanNode.java File fe/src/main/java/org/apache/impala/planner/PlanNode.java: PS16, Line 617: resources peak resources? http://gerrit.cloudera.org:8080/#/c/7223/16/fe/src/main/java/org/apache/impala/planner/Planner.java File fe/src/main/java/org/apache/impala/planner/Planner.java: PS16, Line 342: the given plans I don't think it's opposite that this incorporates DOP. PS16, Line 354: Sum of per-host minimum reservations over all plan nodes and sinks. that's still the ambiguous way of describing it. How about rewording that to be more similar to the thrift comment. Also say "initial" rather than "minimum"? PS16, Line 366: so we are : // forced to forced to what? http://gerrit.cloudera.org:8080/#/c/7223/16/testdata/workloads/functional-planner/queries/PlannerTest/constant-folding.test File testdata/workloads/functional-planner/queries/PlannerTest/constant-folding.test: PS16, Line 141: mem-estimate=15.68KB mem-reservation=136.00MB does it make sense for the estimate ever less than the reservation? i guess it's saying we don't expect a full buffer to be used, so useful to not clip it? -- To view, visit http://gerrit.cloudera.org:8080/7223 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I492cf5052bb27e4e335395e2a8f8a3b07248ec9d Gerrit-PatchSet: 16 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4863/IMPALA-5311: Correctly account the file type and compression codec
anujphadke has uploaded a new patch set (#3). Change subject: IMPALA-4863/IMPALA-5311: Correctly account the file type and compression codec .. IMPALA-4863/IMPALA-5311: Correctly account the file type and compression codec If a scan range is skipped at runtime the scan node skips reading the range and never figures out the underlying compression codec used to compress the files. In such a scenario we default the compression codec to NONE which can be misleading. This change marks these files as filtered in the scan node profile e.g. - File Formats: PARQUET/SNAPPY:364 PARQUET(Skipped):1460 Change-Id: I797916505f62e568f4159e07099481b8ff571da2 --- M be/src/exec/hdfs-parquet-scanner.cc M be/src/exec/hdfs-scan-node-base.cc M be/src/exec/hdfs-scan-node-base.h M be/src/exec/hdfs-scan-node.cc M be/src/exec/hdfs-scan-node.h M testdata/workloads/functional-query/queries/QueryTest/hdfs_scanner_profile.test 6 files changed, 42 insertions(+), 15 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/45/7245/3 -- To view, visit http://gerrit.cloudera.org:8080/7245 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I797916505f62e568f4159e07099481b8ff571da2 Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: anujphadke Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: anujphadke
[Impala-ASF-CR] IMPALA-5164: Bloom filter benchmark fails to complete
Jim Apple has posted comments on this change. Change subject: IMPALA-5164: Bloom filter benchmark fails to complete .. Patch Set 2: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/7381 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ifad385a633c45af7d1dbcc87e56661c413bde7dc Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Zach Amsden Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Zach Amsden Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5164: Bloom filter benchmark fails to complete
Jim Apple has posted comments on this change. Change subject: IMPALA-5164: Bloom filter benchmark fails to complete .. Patch Set 3: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/7381 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ifad385a633c45af7d1dbcc87e56661c413bde7dc Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Zach Amsden Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Zach Amsden Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5164: Bloom filter benchmark fails to complete
Impala Public Jenkins has posted comments on this change. Change subject: IMPALA-5164: Bloom filter benchmark fails to complete .. Patch Set 3: Build started: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/837/ -- To view, visit http://gerrit.cloudera.org:8080/7381 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ifad385a633c45af7d1dbcc87e56661c413bde7dc Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Zach Amsden Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Zach Amsden Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5164: Bloom filter benchmark fails to complete
Zach Amsden has posted comments on this change. Change subject: IMPALA-5164: Bloom filter benchmark fails to complete .. Patch Set 1: @jbapple, let's go ahead with this diff and if there are more failures, I'll attach more fixes. -- To view, visit http://gerrit.cloudera.org:8080/7381 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ifad385a633c45af7d1dbcc87e56661c413bde7dc Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Zach Amsden Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Zach Amsden Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5164: Bloom filter benchmark fails to complete
Hello Jim Apple, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/7381 to look at the new patch set (#2). Change subject: IMPALA-5164: Bloom filter benchmark fails to complete .. IMPALA-5164: Bloom filter benchmark fails to complete Due to large allocations during benchmark, we spend a lot of time under memory pressure, which triggers the context switch detection. Not using micro-benchmark heuristics for these benchmarks fixes the problem. Change-Id: Ifad385a633c45af7d1dbcc87e56661c413bde7dc --- M be/src/benchmarks/bloom-filter-benchmark.cc 1 file changed, 2 insertions(+), 2 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/81/7381/2 -- To view, visit http://gerrit.cloudera.org:8080/7381 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ifad385a633c45af7d1dbcc87e56661c413bde7dc Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Zach Amsden Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Zach Amsden
[Impala-ASF-CR] single node perf run.py: clean up newly-added testdata
Impala Public Jenkins has posted comments on this change. Change subject: single_node_perf_run.py: clean up newly-added testdata .. Patch Set 2: Build started: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/836/ -- To view, visit http://gerrit.cloudera.org:8080/7291 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Id0220f3cd7a26d2627e40cd432c23815a6d65ea4 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Jim Apple Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Lars Volker Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4862: make resource profile consistent with backend behaviour
Tim Armstrong has uploaded a new patch set (#16). Change subject: IMPALA-4862: make resource profile consistent with backend behaviour .. IMPALA-4862: make resource profile consistent with backend behaviour This moves away from the PipelinedPlanNodeSet approach of enumerating sets of concurrently-executing nodes because unions would force creating many overlapping sets of nodes. The new approach computes the peak resources during Open() and the peak resources between Open() and Close() (i.e. while calling GetNext()) bottom-up for each plan node in a fragment. The fragment resources are then combined to produce the query resources. The basic assumptions for the new resource estimates are: * resources are acquired during or after the first call to Open() and released in Close(). * Blocking nodes call Open() on their child before acquiring their own resources (this required some backend changes). * Blocking nodes call Close() on their children before returning from Open(). * The peak resource consumption of the query is the sum of the independent fragments (except for the parallel join build plans where we can assume there will be synchronisation). This is conservative but we don't synchronise fragment Open() and Close() across exchanges so can't make stronger assumptions in general. Also compute the sum of minimum reservations. This will be useful in the backend to determine exactly when all of the initial reservations have been claimed from a shared pool of initial reservations. Testing: * Updated planner tests to reflect behavioural changes. * Added extra resource requirement planner tests for unions, subplans, pipelines of blocking operators, and bushy join plans. * Added single-node plans to resource-requirements tests. These have more complex plan trees inside a single fragment, which is useful for testing the peak resource requirement logic. Change-Id: I492cf5052bb27e4e335395e2a8f8a3b07248ec9d --- M be/src/exec/blocking-join-node.cc M be/src/exec/blocking-join-node.h M be/src/exec/exec-node.h M be/src/exec/hash-join-node.cc M be/src/exec/partitioned-aggregation-node.cc M be/src/exec/partitioned-hash-join-node.cc M be/src/exec/partitioned-hash-join-node.h M be/src/exec/sort-node.cc M be/src/util/backend-gflag-util.cc M common/thrift/BackendGflags.thrift M common/thrift/Frontend.thrift M fe/src/main/java/org/apache/impala/analysis/Analyzer.java M fe/src/main/java/org/apache/impala/catalog/HBaseTable.java M fe/src/main/java/org/apache/impala/common/TreeNode.java M fe/src/main/java/org/apache/impala/planner/AggregationNode.java M fe/src/main/java/org/apache/impala/planner/AnalyticEvalNode.java M fe/src/main/java/org/apache/impala/planner/DataSink.java M fe/src/main/java/org/apache/impala/planner/DataSourceScanNode.java M fe/src/main/java/org/apache/impala/planner/EmptySetNode.java M fe/src/main/java/org/apache/impala/planner/ExchangeNode.java M fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java M fe/src/main/java/org/apache/impala/planner/HashJoinNode.java M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java M fe/src/main/java/org/apache/impala/planner/JoinNode.java M fe/src/main/java/org/apache/impala/planner/KuduScanNode.java M fe/src/main/java/org/apache/impala/planner/NestedLoopJoinNode.java M fe/src/main/java/org/apache/impala/planner/ParallelPlanner.java D fe/src/main/java/org/apache/impala/planner/PipelinedPlanNodeSet.java M fe/src/main/java/org/apache/impala/planner/PlanFragment.java M fe/src/main/java/org/apache/impala/planner/PlanNode.java M fe/src/main/java/org/apache/impala/planner/Planner.java M fe/src/main/java/org/apache/impala/planner/ResourceProfile.java M fe/src/main/java/org/apache/impala/planner/SelectNode.java M fe/src/main/java/org/apache/impala/planner/SingularRowSrcNode.java M fe/src/main/java/org/apache/impala/planner/SortNode.java M fe/src/main/java/org/apache/impala/planner/SubplanNode.java M fe/src/main/java/org/apache/impala/planner/UnionNode.java M fe/src/main/java/org/apache/impala/planner/UnnestNode.java M fe/src/main/java/org/apache/impala/service/BackendConfig.java M fe/src/main/java/org/apache/impala/service/Frontend.java M fe/src/main/java/org/apache/impala/util/BitUtil.java A fe/src/main/java/org/apache/impala/util/MathUtil.java A fe/src/test/java/org/apache/impala/util/BitUtilTest.java A fe/src/test/java/org/apache/impala/util/MathUtilTest.java M testdata/workloads/functional-planner/queries/PlannerTest/constant-folding.test M testdata/workloads/functional-planner/queries/PlannerTest/disable-codegen.test M testdata/workloads/functional-planner/queries/PlannerTest/fk-pk-join-detection.test M testdata/workloads/functional-planner/queries/PlannerTest/kudu-selectivity.test M testdata/workloads/functional-planner/queries/PlannerTest/mt-dop-validation.test M testdata/workloads/functional-planner/queries/PlannerTest/parquet-filtering.test M testdata/workl
[Impala-ASF-CR] IMPALA-4862: make resource profile consistent with backend behaviour
Tim Armstrong has posted comments on this change. Change subject: IMPALA-4862: make resource profile consistent with backend behaviour .. Patch Set 14: (1 comment) http://gerrit.cloudera.org:8080/#/c/7223/14/common/thrift/Frontend.thrift File common/thrift/Frontend.thrift: PS14, Line 401: node > Yes, that'd be more meaningful and make it clearer. Let's go with that. Done -- To view, visit http://gerrit.cloudera.org:8080/7223 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I492cf5052bb27e4e335395e2a8f8a3b07248ec9d Gerrit-PatchSet: 14 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4978 / IMPALA-5631: [DOCS] Add FQDN known issue
John Russell has uploaded a new patch set (#2). Change subject: IMPALA-4978 / IMPALA-5631: [DOCS] Add FQDN known issue .. IMPALA-4978 / IMPALA-5631: [DOCS] Add FQDN known issue Because there was no obvious subcategory of known issues to put this one in, I made a new subcategory 'startup issues'. Change-Id: Ib039d0102878f1c05470371f581cb258287b9bc0 --- M docs/topics/impala_known_issues.xml 1 file changed, 43 insertions(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/88/7388/2 -- To view, visit http://gerrit.cloudera.org:8080/7388 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ib039d0102878f1c05470371f581cb258287b9bc0 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: John Russell Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Lars Volker
[Impala-ASF-CR] [DOCS] Advise setting vm.overcommit memory=1 in various places
John Russell has posted comments on this change. Change subject: [DOCS] Advise setting vm.overcommit_memory=1 in various places .. Patch Set 2: (2 comments) http://gerrit.cloudera.org:8080/#/c/7349/1/docs/shared/impala_common.xml File docs/shared/impala_common.xml: Line 1335: On a kerberized cluster with high memory utilization, kinit commands executed after > On a kerberized cluster, kinit commands executed after every 'kerberos_rein Done. I included the extra detail from Sailesh's comment too. Line 1335: On a kerberized cluster with high memory utilization, kinit commands executed after > This issue doesn't happen during startup. Done -- To view, visit http://gerrit.cloudera.org:8080/7349 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Icbfa755e2c9769a8458fd93362769856cf32e301 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: John Russell Gerrit-Reviewer: John Russell Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Sailesh Mukil Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4978 / IMPALA-5631: [DOCS] Add FQDN known issue
John Russell has posted comments on this change. Change subject: IMPALA-4978 / IMPALA-5631: [DOCS] Add FQDN known issue .. Patch Set 1: (2 comments) http://gerrit.cloudera.org:8080/#/c/7388/1/docs/topics/impala_known_issues.xml File docs/topics/impala_known_issues.xml: PS1, Line 146: from hostname command > I think the precise issue is that Impala must have the --hostname flag set OK, trying an alternative wording to convey the symptom in the title in a way that somebody suffering from the problem will recognize. I think repeating the JIRA title wouldn't be clear in the context of "what issue did I encounter". PS1, Line 155: : Mention that the problem might occur after an upgrade on a system where it formerly worked. (Due AIUI to changes in how Cloudera Manager supplies --hostname. But find a way not to mention Cloudera Manager by name.) -- To view, visit http://gerrit.cloudera.org:8080/7388 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ib039d0102878f1c05470371f581cb258287b9bc0 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: John Russell Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: John Russell Gerrit-Reviewer: Lars Volker Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5583: [DOCS] Document default join distribution mode query option
John Russell has uploaded a new patch set (#4). Change subject: IMPALA-5583: [DOCS] Document default_join_distribution_mode query option .. IMPALA-5583: [DOCS] Document default_join_distribution_mode query option New page for the query option. Change-Id: I4ec6213efc46bce0fe07c590841d51c009fb5c84 --- M docs/impala.ditamap M docs/impala_keydefs.ditamap A docs/topics/impala_default_join_distribution_mode.xml 3 files changed, 136 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/00/7300/4 -- To view, visit http://gerrit.cloudera.org:8080/7300 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I4ec6213efc46bce0fe07c590841d51c009fb5c84 Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: John Russell Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: John Russell Gerrit-Reviewer: Mostafa Mokhtar
[Impala-ASF-CR] IMPALA-5629: avoid expensive list::size() call
Impala Public Jenkins has posted comments on this change. Change subject: IMPALA-5629: avoid expensive list::size() call .. Patch Set 2: Build started: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/835/ -- To view, visit http://gerrit.cloudera.org:8080/7382 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Id83fcf68dcc3ea729df167885f999ff32b861e66 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5240: Allow config of number of disk I/O threads per disk type
Matthew Jacobs has posted comments on this change. Change subject: IMPALA-5240: Allow config of number of disk I/O threads per disk type .. Patch Set 9: (1 comment) http://gerrit.cloudera.org:8080/#/c/7232/9/be/src/util/disk-info.h File be/src/util/disk-info.h: Line 61: if (disk_id >= disks_.size()) return false; > why is this the right thing to assume? why should disk_id ever be more than It shouldn't be in practice, but this is necessary for the tests which test having # disk queues greater than the number of physical disks. Bikram filed a separate JIRA to make the tests more realistic. We should put a TODO with that JIRA number here to switch this back to a DCHECK later. -- To view, visit http://gerrit.cloudera.org:8080/7232 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I094aff3747104104fe0465d73dcdbef5d9892f7c Gerrit-PatchSet: 9 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Bikramjeet Vig Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5240: Allow config of number of disk I/O threads per disk type
Dan Hecht has posted comments on this change. Change subject: IMPALA-5240: Allow config of number of disk I/O threads per disk type .. Patch Set 9: (2 comments) http://gerrit.cloudera.org:8080/#/c/7232/9/be/src/runtime/disk-io-mgr.h File be/src/runtime/disk-io-mgr.h: PS9, Line 631: This constructor is only used for testing. it's unfortunate that we have this at all. you don't have to fix it in this change, but I wonder why we can't just have the test set the FLAG_ variables, like in the real impalad environment. http://gerrit.cloudera.org:8080/#/c/7232/9/be/src/util/disk-info.h File be/src/util/disk-info.h: Line 61: if (disk_id >= disks_.size()) return false; why is this the right thing to assume? why should disk_id ever be more than the number of disks? -- To view, visit http://gerrit.cloudera.org:8080/7232 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I094aff3747104104fe0465d73dcdbef5d9892f7c Gerrit-PatchSet: 9 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Bikramjeet Vig Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] [DOCS] Advise setting vm.overcommit memory=1 in various places
John Russell has uploaded a new patch set (#2). Change subject: [DOCS] Advise setting vm.overcommit_memory=1 in various places .. [DOCS] Advise setting vm.overcommit_memory=1 in various places Reusing the same advice under "Known Issues", scalability considerations, and in the Impala + Kerberos section. Change-Id: Icbfa755e2c9769a8458fd93362769856cf32e301 --- M docs/shared/impala_common.xml M docs/topics/impala_kerberos.xml M docs/topics/impala_known_issues.xml M docs/topics/impala_scalability.xml 4 files changed, 59 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/49/7349/2 -- To view, visit http://gerrit.cloudera.org:8080/7349 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Icbfa755e2c9769a8458fd93362769856cf32e301 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: John Russell Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Sailesh Mukil