[Impala-ASF-CR] IMPALA-5754: Improve randomness of rand()/random()
Hello Jim Apple, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8355 to look at the new patch set (#4). Change subject: IMPALA-5754: Improve randomness of rand()/random() .. IMPALA-5754: Improve randomness of rand()/random() Currently implementation of rand/random built-in functions use rand_r of C library. We recognized its randomness was poor. std::mt19937 in the C++11 standard libarary shows better randomness than rand_r. Testing: rand-util-test is newly addded. It checks randomness, deterministic and range. Change-Id: Idafdd5fe7502ff242c76a91a815c565146108684 --- M be/src/exprs/expr-test.cc M be/src/exprs/math-functions-ir.cc M be/src/util/CMakeLists.txt A be/src/util/rand-util-test.cc A be/src/util/rand-util.cc A be/src/util/rand-util.h 6 files changed, 185 insertions(+), 29 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/55/8355/4 -- To view, visit http://gerrit.cloudera.org:8080/8355 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Idafdd5fe7502ff242c76a91a815c565146108684 Gerrit-Change-Number: 8355 Gerrit-PatchSet: 4 Gerrit-Owner: Kim Jin ChulGerrit-Reviewer: Jim Apple
[Impala-ASF-CR] IMPALA-2281: Replace FNV with FastHash in exchange nodes
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/8417 ) Change subject: IMPALA-2281: Replace FNV with FastHash in exchange nodes .. IMPALA-2281: Replace FNV with FastHash in exchange nodes FNV is not a good enough hash function. This patch introduces FastHash into the codebase and uses it in exchange nodes. Testing: Two test cases involving arbitrary ordering are changed. Single node performance benchmark shows no performance difference. Change-Id: I778317d982dcdb94173a369a65b39f32b4f7ded2 Reviewed-on: http://gerrit.cloudera.org:8080/8417 Reviewed-by: Jim AppleTested-by: Impala Public Jenkins --- M LICENSE.txt M be/src/benchmarks/hash-benchmark.cc M be/src/codegen/gen_ir_descriptions.py M be/src/codegen/llvm-codegen.cc M be/src/runtime/data-stream-sender.cc M be/src/runtime/data-stream-sender.h M be/src/runtime/data-stream-test.cc M be/src/runtime/raw-value-test.cc M be/src/runtime/raw-value.cc M be/src/runtime/raw-value.h M be/src/util/hash-util-ir.cc M be/src/util/hash-util.h M testdata/workloads/functional-query/queries/QueryTest/nested-types-runtime.test 13 files changed, 209 insertions(+), 103 deletions(-) Approvals: Jim Apple: Looks good to me, approved Impala Public Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/8417 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I778317d982dcdb94173a369a65b39f32b4f7ded2 Gerrit-Change-Number: 8417 Gerrit-PatchSet: 7 Gerrit-Owner: Tianyi Wang Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-2281: Replace FNV with FastHash in exchange nodes
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/8417 ) Change subject: IMPALA-2281: Replace FNV with FastHash in exchange nodes .. Patch Set 6: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/8417 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I778317d982dcdb94173a369a65b39f32b4f7ded2 Gerrit-Change-Number: 8417 Gerrit-PatchSet: 6 Gerrit-Owner: Tianyi WangGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 08 Nov 2017 07:39:00 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5307: Part 4: copy out uncompressed text and seq
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/8172 ) Change subject: IMPALA-5307: Part 4: copy out uncompressed text and seq .. Patch Set 16: Code-Review+2 Forgot that this needed a coordinated change with the Impala-lzo plugin. That has one callsite: context_->ReleaseCompletedResources(nullptr, false); So I just added a wrapper function with the old API that we can remove once it's not needed. -- To view, visit http://gerrit.cloudera.org:8080/8172 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I304fd002b61bfedf41c8b1405cd7eb7b492bb941 Gerrit-Change-Number: 8172 Gerrit-PatchSet: 16 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 08 Nov 2017 06:45:19 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4856: Port data stream service to KRPC
Hello Sailesh Mukil, Mostafa Mokhtar, Dan Hecht, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8023 to look at the new patch set (#10). Change subject: IMPALA-4856: Port data stream service to KRPC .. IMPALA-4856: Port data stream service to KRPC This patch implements a new data stream service which utilizes KRPC. Similar to the thrift RPC implementation, there are 3 major components to the data stream services: KrpcDataStreamSender serializes and sends row batches materialized by a fragment instance to a KrpcDataStreamRecvr. KrpcDataStreamMgr is responsible for routing an incoming row batch to the appropriate receiver. The data stream service runs on the port FLAGS_krpc_port which is 29000 by default. Unlike the implementation with thrift RPC, KRPC provides an asynchronous interface for invoking remote methods. As a result, KrpcDataStreamSender doesn't need to create a thread per connection. There is one connection between two Impalad nodes for each direction (i.e. client and server). Multiple queries can multi-plex on the same connection for transmitting row batches between two Impalad nodes. The asynchronous interface also prevents some issues with thrift RPC in which a thread may be stuck in an RPC call without being able to check for cancellation. A TransmitData() call with KRPC is in essence a trio of RpcController, a serialized protobuf request buffer and a protobuf response buffer. The call is invoked via a DataStreamService proxy object. The serialized tuple offsets and row batches are sent via "sidecars" in KRPC to avoid extra copy into the serialized request buffer. Each impalad node creates a singleton DataStreamService object at start-up time. All incoming calls are served by a service thread pool created as part of DataStreamService. By default, there are 64 service threads. The service threads are shared across all queries so the RPC handler should avoid blocking as much as possible. In thrift RPC implementation, we make a thrift thread handling a TransmitData() RPC to block for extended period of time when the receiver is not yet created when the call arrives. In KRPC implementation, we store TransmitData() or EndDataStream() requests which arrive before the receiver is ready in a per-receiver early sender list stored in KrpcDataStreamMgr. These RPC calls will be processed and responded to when the receiver is created or when timeout occurs. Similarly, there is limited space in the sender queues in KrpcDataStreamRecvr. If adding a row batch to a queue in KrpcDataStreamRecvr causes the buffer limit to exceed, the request will be stashed in a blocked_sender_ queue to be processed later. The stashed RPC request will not be responded to until it is processed so as to exert back pressure to the client. An alternative would be to reply with an error and the request / row batches need to be sent again. This may end up consuming more network bandwidth than the thrift RPC implementation. This change adopts the behavior of allowing one stashed request per sender. All rpc requests and responses are serialized using protobuf. The equivalent of TRowBatch would be ProtoRowBatch which contains a serialized header about the meta-data of the row batch and two Kudu Slice objects which contain pointers to the actual data (i.e. tuple offsets and tuple data). This patch is based on an abandoned patch by Henry Robinson. TESTING --- * Build passed with FLAGS_use_krpc=true. TO DO - * Port some BE tests to KRPC services. Change-Id: Ic0b8c1e50678da66ab1547d16530f88b323ed8c1 --- M be/src/common/status.cc M be/src/common/status.h M be/src/exec/data-sink.cc M be/src/exec/exchange-node.cc M be/src/exec/kudu-util.h M be/src/rpc/CMakeLists.txt M be/src/rpc/rpc-mgr.cc M be/src/rpc/rpc-mgr.h M be/src/runtime/CMakeLists.txt M be/src/runtime/data-stream-mgr-base.h M be/src/runtime/data-stream-mgr.h M be/src/runtime/data-stream-recvr.h M be/src/runtime/data-stream-sender.h M be/src/runtime/exec-env.cc M be/src/runtime/exec-env.h M be/src/runtime/krpc-data-stream-mgr.cc M be/src/runtime/krpc-data-stream-mgr.h M be/src/runtime/krpc-data-stream-recvr.cc M be/src/runtime/krpc-data-stream-recvr.h A be/src/runtime/krpc-data-stream-sender.cc A be/src/runtime/krpc-data-stream-sender.h M be/src/runtime/row-batch.cc M be/src/runtime/row-batch.h M be/src/service/CMakeLists.txt A be/src/service/data-stream-service.cc A be/src/service/data-stream-service.h M be/src/service/impala-server.cc M cmake_modules/FindProtobuf.cmake M common/protobuf/CMakeLists.txt A common/protobuf/common.proto A common/protobuf/data_stream_service.proto A common/protobuf/row_batch.proto M common/thrift/generate_error_codes.py 33 files changed, 3,159 insertions(+), 183 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/23/8023/10 -- To view, visit http://gerrit.cloudera.org:8080/8023 To unsubscribe,
[Impala-ASF-CR] IMPALA-4856: Port data stream service to KRPC
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/8023 ) Change subject: IMPALA-4856: Port data stream service to KRPC .. Patch Set 10: (6 comments) http://gerrit.cloudera.org:8080/#/c/8023/9/be/src/runtime/krpc-data-stream-mgr.cc File be/src/runtime/krpc-data-stream-mgr.cc: http://gerrit.cloudera.org:8080/#/c/8023/9/be/src/runtime/krpc-data-stream-mgr.cc@62 PS9, Line 62: FLAGS_datastream_service_num_deserialization_threads, 1, > why disallow the setting of this greater than num_cores? Given that it take Yes. The assumption is that the critical section is small and most threads won't block for too long so no point in pushing above number of cores. To be consistent with other knobs we have (e.g. # service threads, reactor threads), we shouldn't impose an implicit upper bound. http://gerrit.cloudera.org:8080/#/c/8023/9/be/src/runtime/krpc-data-stream-recvr.cc File be/src/runtime/krpc-data-stream-recvr.cc: http://gerrit.cloudera.org:8080/#/c/8023/9/be/src/runtime/krpc-data-stream-recvr.cc@67 PS9, Line 67: the > the resources Done http://gerrit.cloudera.org:8080/#/c/8023/9/be/src/runtime/krpc-data-stream-recvr.cc@195 PS9, Line 195: current_ba > current_batch_ Done http://gerrit.cloudera.org:8080/#/c/8023/9/be/src/runtime/krpc-data-stream-recvr.cc@225 PS9, Line 225: DCHECK > shoudl we also DCHECK that num_pending_enqueue_ == 0 (otherwise, we could h Done http://gerrit.cloudera.org:8080/#/c/8023/9/be/src/runtime/krpc-data-stream-recvr.cc@248 PS9, Line 248: current_batch_.reset(resu > this is part of the same operation as line 244 (both are adjusting the queu Done http://gerrit.cloudera.org:8080/#/c/8023/9/be/src/runtime/krpc-data-stream-recvr.cc@252 PS9, Line 252: // It's important that the dequeuing of 'deferred_rpcs_' is done after the entry > how about moving this to line 250 now that it's that scope that drops the l Done -- To view, visit http://gerrit.cloudera.org:8080/8023 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic0b8c1e50678da66ab1547d16530f88b323ed8c1 Gerrit-Change-Number: 8023 Gerrit-PatchSet: 10 Gerrit-Owner: Michael HoGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Wed, 08 Nov 2017 03:35:43 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-2281: Replace FNV with FastHash in exchange nodes
Jim Apple has posted comments on this change. ( http://gerrit.cloudera.org:8080/8417 ) Change subject: IMPALA-2281: Replace FNV with FastHash in exchange nodes .. Patch Set 6: Code-Review+2 (1 comment) http://gerrit.cloudera.org:8080/#/c/8417/5/be/src/util/hash-util.h File be/src/util/hash-util.h: http://gerrit.cloudera.org:8080/#/c/8417/5/be/src/util/hash-util.h@32 PS5, Line 32: public: > Done. I followed the practice in be/src/kudu/security/init.cc and assumed t I'm not 100% on that, since http://www.apache.org/legal/src-headers.html#3party says, "Do not modify or remove any copyright notices or licenses within third-party works," but if Kudu does it, it's probably the right thing to do. -- To view, visit http://gerrit.cloudera.org:8080/8417 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I778317d982dcdb94173a369a65b39f32b4f7ded2 Gerrit-Change-Number: 8417 Gerrit-PatchSet: 6 Gerrit-Owner: Tianyi WangGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 08 Nov 2017 04:17:11 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5307: Part 4: copy out uncompressed text and seq
Hello Michael Ho, Alex Behm, Dan Hecht, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8172 to look at the new patch set (#16). Change subject: IMPALA-5307: Part 4: copy out uncompressed text and seq .. IMPALA-5307: Part 4: copy out uncompressed text and seq This is the final patch for IMPALA-5307. Text and Seq scanners are converted to use the same approach as Avro. contains_tuple_data is now false so a bunch of dead code in ScannerContext can be removed. We also no longer attach I/O buffers to row batches so that logic can be removed. Testing: Ran core ASAN tests. Perf: I reran the same benchmarks as in Part 2. There was no measurable difference before and after - for both text and seq processing time is dominated by text parsing. Change-Id: I304fd002b61bfedf41c8b1405cd7eb7b492bb941 --- M be/src/codegen/gen_ir_descriptions.py M be/src/exec/base-sequence-scanner.cc M be/src/exec/exec-node.cc M be/src/exec/exec-node.h M be/src/exec/hdfs-avro-scanner-ir.cc M be/src/exec/hdfs-avro-scanner.cc M be/src/exec/hdfs-avro-scanner.h M be/src/exec/hdfs-parquet-scanner.cc M be/src/exec/hdfs-rcfile-scanner.cc M be/src/exec/hdfs-scan-node-base.h M be/src/exec/hdfs-scan-node-mt.cc M be/src/exec/hdfs-scan-node.cc M be/src/exec/hdfs-scanner-ir.cc M be/src/exec/hdfs-scanner.cc M be/src/exec/hdfs-scanner.h M be/src/exec/hdfs-sequence-scanner.cc M be/src/exec/hdfs-text-scanner.cc M be/src/exec/parquet-column-readers.cc M be/src/exec/scanner-context.cc M be/src/exec/scanner-context.h M be/src/exec/text-converter.cc M be/src/runtime/row-batch.cc M be/src/runtime/row-batch.h 23 files changed, 144 insertions(+), 212 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/72/8172/16 -- To view, visit http://gerrit.cloudera.org:8080/8172 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I304fd002b61bfedf41c8b1405cd7eb7b492bb941 Gerrit-Change-Number: 8172 Gerrit-PatchSet: 16 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-5307: Part 4: copy out uncompressed text and seq
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/8172 ) Change subject: IMPALA-5307: Part 4: copy out uncompressed text and seq .. Patch Set 16: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1447/ -- To view, visit http://gerrit.cloudera.org:8080/8172 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I304fd002b61bfedf41c8b1405cd7eb7b492bb941 Gerrit-Change-Number: 8172 Gerrit-PatchSet: 16 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 08 Nov 2017 06:45:48 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6084: Avoid using of global namespace for llvm
Kim Jin Chul has uploaded a new patch set (#3). ( http://gerrit.cloudera.org:8080/8489 ) Change subject: IMPALA-6084: Avoid using of global namespace for llvm .. IMPALA-6084: Avoid using of global namespace for llvm There are a large number of places in the backend where we import everything from the llvm namespace into the global namespace. (e.g. using namespace llvm;) Here are the reasons why we don't prefer this: * It could make symbol conflicts if a newly added code has same symbole name. * It makes code readability uncomfortable. We may not recognize a symbol came from. We adopt a sequence of namespace specifiers in each use case. Change-Id: I7098baff5f746bcc2965583ca99f6188997b733a --- M be/src/benchmarks/hash-benchmark.cc M be/src/codegen/codegen-anyval.cc M be/src/codegen/codegen-callgraph.cc M be/src/codegen/codegen-symbol-emitter.cc M be/src/codegen/codegen-util.cc M be/src/codegen/instruction-counter-test.cc M be/src/codegen/instruction-counter.cc M be/src/codegen/llvm-codegen-test.cc M be/src/codegen/llvm-codegen.cc M be/src/exec/blocking-join-node.cc M be/src/exec/exec-node.cc M be/src/exec/filter-context.cc M be/src/exec/hash-table.cc M be/src/exec/hdfs-avro-scanner.cc M be/src/exec/hdfs-parquet-scanner.cc M be/src/exec/hdfs-scan-node-base.cc M be/src/exec/hdfs-scanner.cc M be/src/exec/hdfs-sequence-scanner.cc M be/src/exec/hdfs-text-scanner.cc M be/src/exec/partitioned-aggregation-node.cc M be/src/exec/partitioned-hash-join-builder.cc M be/src/exec/partitioned-hash-join-node.cc M be/src/exec/select-node.cc M be/src/exec/text-converter.cc M be/src/exec/topn-node.cc M be/src/exec/union-node.cc M be/src/exprs/agg-fn-evaluator.cc M be/src/exprs/agg-fn.cc M be/src/exprs/case-expr.cc M be/src/exprs/compound-predicates.cc M be/src/exprs/expr-codegen-test.cc M be/src/exprs/expr-test.cc M be/src/exprs/literal.cc M be/src/exprs/null-literal.cc M be/src/exprs/scalar-expr.cc M be/src/exprs/scalar-fn-call.cc M be/src/exprs/slot-ref.cc M be/src/runtime/descriptors.cc M be/src/runtime/runtime-state.cc M be/src/runtime/types.cc M be/src/util/tuple-row-compare.cc 41 files changed, 1,452 insertions(+), 1,378 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/89/8489/3 -- To view, visit http://gerrit.cloudera.org:8080/8489 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I7098baff5f746bcc2965583ca99f6188997b733a Gerrit-Change-Number: 8489 Gerrit-PatchSet: 3 Gerrit-Owner: Kim Jin Chul
[Impala-ASF-CR] IMPALA-5999: Fix LLVM linkage errors due LibCache sync issues
Hello Tim Armstrong, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8487 to look at the new patch set (#2). Change subject: IMPALA-5999: Fix LLVM linkage errors due LibCache sync issues .. IMPALA-5999: Fix LLVM linkage errors due LibCache sync issues Some test runs are currently failing randomly in test_ir_functions due to LLVM linkage error. This happens when impala tries to link a function from a lib file on hdfs, but it somehow got removed from the lib cache before it could link. This results in a new file being cached but with a slightly different local filename, and since impala only uses local filenames to check for collision for linking of LLVM modules, it ends up linking the same file twice and hence encounters an error. This patch fixes this issue by using the hdfs file path to check for collision of lib files. Testing: Added a backend test that tries to link the same lib twice. Change-Id: Iefa23bbf47998fe7e84011e1edf8e794e94a1757 --- M be/src/codegen/llvm-codegen-test.cc M be/src/codegen/llvm-codegen.cc M be/src/codegen/llvm-codegen.h M be/src/util/filesystem-util.cc M be/src/util/filesystem-util.h 5 files changed, 39 insertions(+), 35 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/87/8487/2 -- To view, visit http://gerrit.cloudera.org:8080/8487 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Iefa23bbf47998fe7e84011e1edf8e794e94a1757 Gerrit-Change-Number: 8487 Gerrit-PatchSet: 2 Gerrit-Owner: Bikramjeet VigGerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-2248: Make idle session timeout a query option
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/8490 ) Change subject: IMPALA-2248: Make idle_session_timeout a query option .. Patch Set 3: (1 comment) http://gerrit.cloudera.org:8080/#/c/8490/3/fe/src/test/java/org/apache/impala/service/JdbcTest.java File fe/src/test/java/org/apache/impala/service/JdbcTest.java: http://gerrit.cloudera.org:8080/#/c/8490/3/fe/src/test/java/org/apache/impala/service/JdbcTest.java@594 PS3, Line 594: "0, 5, 10"'s and "timeout * 1.5"'s relation could be more explicit, for example: float timeout_tolerance = 0.5; ... Arrays.asList(0, 5, 5*(1 + 2*timeout_tolerance)) ... int sleepPeriod = (int)(timeout * (1 + timeout_tolerance)); -- To view, visit http://gerrit.cloudera.org:8080/8490 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I32e2775f80da387b0df4195fe2c5435b3f8e585e Gerrit-Change-Number: 8490 Gerrit-PatchSet: 3 Gerrit-Owner: Zoltan Borok-NagyGerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Laszlo Gaal Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 08 Nov 2017 00:21:14 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5999: Fix LLVM linkage errors due LibCache sync issues
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/8487 ) Change subject: IMPALA-5999: Fix LLVM linkage errors due LibCache sync issues .. Patch Set 2: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/8487 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iefa23bbf47998fe7e84011e1edf8e794e94a1757 Gerrit-Change-Number: 8487 Gerrit-PatchSet: 2 Gerrit-Owner: Bikramjeet VigGerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 08 Nov 2017 02:36:59 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-2281: Replace FNV with FastHash in exchange nodes
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/8417 ) Change subject: IMPALA-2281: Replace FNV with FastHash in exchange nodes .. Patch Set 6: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1446/ -- To view, visit http://gerrit.cloudera.org:8080/8417 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I778317d982dcdb94173a369a65b39f32b4f7ded2 Gerrit-Change-Number: 8417 Gerrit-PatchSet: 6 Gerrit-Owner: Tianyi WangGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 08 Nov 2017 04:17:29 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5307: Part 4: copy out uncompressed text and seq
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/8172 ) Change subject: IMPALA-5307: Part 4: copy out uncompressed text and seq .. Patch Set 15: Code-Review+2 carry -- To view, visit http://gerrit.cloudera.org:8080/8172 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I304fd002b61bfedf41c8b1405cd7eb7b492bb941 Gerrit-Change-Number: 8172 Gerrit-PatchSet: 15 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 08 Nov 2017 01:06:36 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5307: Part 4: copy out uncompressed text and seq
Hello Michael Ho, Alex Behm, Dan Hecht, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8172 to look at the new patch set (#14). Change subject: IMPALA-5307: Part 4: copy out uncompressed text and seq .. IMPALA-5307: Part 4: copy out uncompressed text and seq This is the final patch for IMPALA-5307. Text and Seq scanners are converted to use the same approach as Avro. contains_tuple_data is now false so a bunch of dead code in ScannerContext can be removed. We also no longer attach I/O buffers to row batches so that logic can be removed. Testing: Ran core ASAN tests. Perf: I reran the same benchmarks as in Part 2. There was no measurable difference before and after - for both text and seq processing time is dominated by text parsing. Change-Id: I304fd002b61bfedf41c8b1405cd7eb7b492bb941 --- M be/src/codegen/gen_ir_descriptions.py M be/src/exec/base-sequence-scanner.cc M be/src/exec/exec-node.cc M be/src/exec/exec-node.h M be/src/exec/hdfs-avro-scanner-ir.cc M be/src/exec/hdfs-avro-scanner.cc M be/src/exec/hdfs-avro-scanner.h M be/src/exec/hdfs-parquet-scanner.cc M be/src/exec/hdfs-rcfile-scanner.cc M be/src/exec/hdfs-scan-node-base.h M be/src/exec/hdfs-scan-node-mt.cc M be/src/exec/hdfs-scan-node.cc M be/src/exec/hdfs-scanner-ir.cc M be/src/exec/hdfs-scanner.cc M be/src/exec/hdfs-scanner.h M be/src/exec/hdfs-sequence-scanner.cc M be/src/exec/hdfs-text-scanner.cc M be/src/exec/parquet-column-readers.cc M be/src/exec/scanner-context.cc M be/src/exec/scanner-context.h M be/src/exec/text-converter.cc M be/src/runtime/row-batch.cc M be/src/runtime/row-batch.h 23 files changed, 137 insertions(+), 212 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/72/8172/14 -- To view, visit http://gerrit.cloudera.org:8080/8172 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I304fd002b61bfedf41c8b1405cd7eb7b492bb941 Gerrit-Change-Number: 8172 Gerrit-PatchSet: 14 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-2281: Replace FNV with FastHash in exchange nodes
Tianyi Wang has posted comments on this change. ( http://gerrit.cloudera.org:8080/8417 ) Change subject: IMPALA-2281: Replace FNV with FastHash in exchange nodes .. Patch Set 6: (3 comments) http://gerrit.cloudera.org:8080/#/c/8417/5/be/src/runtime/raw-value-test.cc File be/src/runtime/raw-value-test.cc: http://gerrit.cloudera.org:8080/#/c/8417/5/be/src/runtime/raw-value-test.cc@104 PS5, Line 104: FastHash > "FastHash" Done http://gerrit.cloudera.org:8080/#/c/8417/5/be/src/runtime/raw-value.h File be/src/runtime/raw-value.h: http://gerrit.cloudera.org:8080/#/c/8417/5/be/src/runtime/raw-value.h@74 PS5, Line 74: /// Get a 64-bit hash value using the FastHash function. > Please add a reference. https://www.google.com/search?q=fasthash doesn't di Done http://gerrit.cloudera.org:8080/#/c/8417/5/be/src/util/hash-util.h File be/src/util/hash-util.h: http://gerrit.cloudera.org:8080/#/c/8417/5/be/src/util/hash-util.h@32 PS5, Line 32: public: > Please add the name of this file in the appropriate place in LICENSE.txt. Y Done. I followed the practice in be/src/kudu/security/init.cc and assumed the license here should be removed. -- To view, visit http://gerrit.cloudera.org:8080/8417 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I778317d982dcdb94173a369a65b39f32b4f7ded2 Gerrit-Change-Number: 8417 Gerrit-PatchSet: 6 Gerrit-Owner: Tianyi WangGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 08 Nov 2017 01:14:14 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-2281: Replace FNV with FastHash in exchange nodes
Tianyi Wang has uploaded a new patch set (#6). ( http://gerrit.cloudera.org:8080/8417 ) Change subject: IMPALA-2281: Replace FNV with FastHash in exchange nodes .. IMPALA-2281: Replace FNV with FastHash in exchange nodes FNV is not a good enough hash function. This patch introduces FastHash into the codebase and uses it in exchange nodes. Testing: Two test cases involving arbitrary ordering are changed. Single node performance benchmark shows no performance difference. Change-Id: I778317d982dcdb94173a369a65b39f32b4f7ded2 --- M LICENSE.txt M be/src/benchmarks/hash-benchmark.cc M be/src/codegen/gen_ir_descriptions.py M be/src/codegen/llvm-codegen.cc M be/src/runtime/data-stream-sender.cc M be/src/runtime/data-stream-sender.h M be/src/runtime/data-stream-test.cc M be/src/runtime/raw-value-test.cc M be/src/runtime/raw-value.cc M be/src/runtime/raw-value.h M be/src/util/hash-util-ir.cc M be/src/util/hash-util.h M testdata/workloads/functional-query/queries/QueryTest/nested-types-runtime.test 13 files changed, 209 insertions(+), 103 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/17/8417/6 -- To view, visit http://gerrit.cloudera.org:8080/8417 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I778317d982dcdb94173a369a65b39f32b4f7ded2 Gerrit-Change-Number: 8417 Gerrit-PatchSet: 6 Gerrit-Owner: Tianyi WangGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-5999: Fix LLVM linkage errors due LibCache sync issues
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/8487 ) Change subject: IMPALA-5999: Fix LLVM linkage errors due LibCache sync issues .. IMPALA-5999: Fix LLVM linkage errors due LibCache sync issues Some test runs are currently failing randomly in test_ir_functions due to LLVM linkage error. This happens when impala tries to link a function from a lib file on hdfs, but it somehow got removed from the lib cache before it could link. This results in a new file being cached but with a slightly different local filename, and since impala only uses local filenames to check for collision for linking of LLVM modules, it ends up linking the same file twice and hence encounters an error. This patch fixes this issue by using the hdfs file path to check for collision of lib files. Testing: Added a backend test that tries to link the same lib twice. Change-Id: Iefa23bbf47998fe7e84011e1edf8e794e94a1757 Reviewed-on: http://gerrit.cloudera.org:8080/8487 Reviewed-by: Tim ArmstrongTested-by: Impala Public Jenkins --- M be/src/codegen/llvm-codegen-test.cc M be/src/codegen/llvm-codegen.cc M be/src/codegen/llvm-codegen.h M be/src/util/filesystem-util.cc M be/src/util/filesystem-util.h 5 files changed, 39 insertions(+), 35 deletions(-) Approvals: Tim Armstrong: Looks good to me, approved Impala Public Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/8487 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: Iefa23bbf47998fe7e84011e1edf8e794e94a1757 Gerrit-Change-Number: 8487 Gerrit-PatchSet: 3 Gerrit-Owner: Bikramjeet Vig Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-6134: Update code base to use impala::ConditionVariable
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/8428 ) Change subject: IMPALA-6134: Update code base to use impala::ConditionVariable .. IMPALA-6134: Update code base to use impala::ConditionVariable boost::condtion_variable supports thread interruption which has some overhead. In some places we already use impala::ConditionVariable which is a very thin layer around pthread, therefore it has less overhead. This commit substitues every boost::condition_variable in the codebase (except under kudu/) to impala::ConditionVariable. It also extends impala::ConditionVariable class to support waiting with a given timeout. The WaitFor function takes a duration as parameter. The WaitUntil function takes an absolute time as parameter. Change-Id: I3085c6dcb42350b61244df6e7f091a1e7db356c9 Reviewed-on: http://gerrit.cloudera.org:8080/8428 Reviewed-by: Tim ArmstrongTested-by: Impala Public Jenkins --- M be/src/catalog/catalog-server.cc M be/src/catalog/catalog-server.h M be/src/exec/plan-root-sink.cc M be/src/exec/plan-root-sink.h M be/src/rpc/thrift-server.cc M be/src/runtime/coordinator.cc M be/src/runtime/coordinator.h M be/src/runtime/data-stream-mgr.h M be/src/runtime/data-stream-recvr.cc M be/src/runtime/data-stream-sender.cc M be/src/runtime/disk-io-mgr-internal.h M be/src/runtime/disk-io-mgr-reader-context.cc M be/src/runtime/disk-io-mgr-scan-range.cc M be/src/runtime/disk-io-mgr-stress.cc M be/src/runtime/disk-io-mgr-stress.h M be/src/runtime/disk-io-mgr-test.cc M be/src/runtime/disk-io-mgr.cc M be/src/runtime/disk-io-mgr.h M be/src/runtime/fragment-instance-state.cc M be/src/runtime/fragment-instance-state.h M be/src/scheduling/admission-controller.cc M be/src/scheduling/admission-controller.h M be/src/service/client-request-state.cc M be/src/service/client-request-state.h M be/src/service/impala-server.cc M be/src/service/impala-server.h M be/src/statestore/statestore.h M be/src/util/blocking-queue.h M be/src/util/condition-variable.h M be/src/util/promise.h M be/src/util/thread-pool.h 31 files changed, 137 insertions(+), 122 deletions(-) Approvals: Tim Armstrong: Looks good to me, approved Impala Public Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/8428 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I3085c6dcb42350b61244df6e7f091a1e7db356c9 Gerrit-Change-Number: 8428 Gerrit-PatchSet: 5 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy
[Impala-ASF-CR] IMPALA-6134: Update code base to use impala::ConditionVariable
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/8428 ) Change subject: IMPALA-6134: Update code base to use impala::ConditionVariable .. Patch Set 4: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/8428 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I3085c6dcb42350b61244df6e7f091a1e7db356c9 Gerrit-Change-Number: 8428 Gerrit-PatchSet: 4 Gerrit-Owner: Zoltan Borok-NagyGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Wed, 08 Nov 2017 02:16:25 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5307: Part 4: copy out uncompressed text and seq
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/8172 ) Change subject: IMPALA-5307: Part 4: copy out uncompressed text and seq .. Patch Set 15: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/1445/ -- To view, visit http://gerrit.cloudera.org:8080/8172 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I304fd002b61bfedf41c8b1405cd7eb7b492bb941 Gerrit-Change-Number: 8172 Gerrit-PatchSet: 15 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 08 Nov 2017 02:12:34 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4856: Port data stream service to KRPC
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/8023 ) Change subject: IMPALA-4856: Port data stream service to KRPC .. Patch Set 9: (1 comment) http://gerrit.cloudera.org:8080/#/c/8023/9/be/src/runtime/krpc-data-stream-recvr.h File be/src/runtime/krpc-data-stream-recvr.h: http://gerrit.cloudera.org:8080/#/c/8023/9/be/src/runtime/krpc-data-stream-recvr.h@128 PS9, Line 128: Takes over the RPC payload of an early sender to 'deferred_rpcs_' queue of the Takes over the RPC state of an early sender for deferred processing and kicks... -- To view, visit http://gerrit.cloudera.org:8080/8023 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic0b8c1e50678da66ab1547d16530f88b323ed8c1 Gerrit-Change-Number: 8023 Gerrit-PatchSet: 9 Gerrit-Owner: Michael HoGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Wed, 08 Nov 2017 00:54:50 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4704: Turns on client connections when local catalog initialized.
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/8202 ) Change subject: IMPALA-4704: Turns on client connections when local catalog initialized. .. Patch Set 22: Code-Review+1 -- To view, visit http://gerrit.cloudera.org:8080/8202 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I52b881cba18a7e4533e21a78751c2e35c3d4c8a6 Gerrit-Change-Number: 8202 Gerrit-PatchSet: 22 Gerrit-Owner: Vuk ErcegovacGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Balazs Jeszenszky Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Juan Yu Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Wed, 08 Nov 2017 00:55:42 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5307: Part 4: copy out uncompressed text and seq
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/8172 ) Change subject: IMPALA-5307: Part 4: copy out uncompressed text and seq .. Patch Set 13: (1 comment) http://gerrit.cloudera.org:8080/#/c/8172/13/be/src/exec/hdfs-scanner.h File be/src/exec/hdfs-scanner.h: http://gerrit.cloudera.org:8080/#/c/8172/13/be/src/exec/hdfs-scanner.h@274 PS13, Line 274: // > nit: /// Done -- To view, visit http://gerrit.cloudera.org:8080/8172 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I304fd002b61bfedf41c8b1405cd7eb7b492bb941 Gerrit-Change-Number: 8172 Gerrit-PatchSet: 13 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 08 Nov 2017 01:06:03 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5307: Part 4: copy out uncompressed text and seq
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/8172 ) Change subject: IMPALA-5307: Part 4: copy out uncompressed text and seq .. Patch Set 15: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1445/ -- To view, visit http://gerrit.cloudera.org:8080/8172 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I304fd002b61bfedf41c8b1405cd7eb7b492bb941 Gerrit-Change-Number: 8172 Gerrit-PatchSet: 15 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 08 Nov 2017 01:07:47 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4856: Port data stream service to KRPC
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/8023 ) Change subject: IMPALA-4856: Port data stream service to KRPC .. Patch Set 9: (6 comments) http://gerrit.cloudera.org:8080/#/c/8023/9/be/src/runtime/krpc-data-stream-mgr.cc File be/src/runtime/krpc-data-stream-mgr.cc: http://gerrit.cloudera.org:8080/#/c/8023/9/be/src/runtime/krpc-data-stream-mgr.cc@62 PS9, Line 62: min(FLAGS_datastream_service_num_deserialization_threads, CpuInfo::num_cores()), why disallow the setting of this greater than num_cores? Given that it takes locks, there could be some benefit to doing so, right? http://gerrit.cloudera.org:8080/#/c/8023/9/be/src/runtime/krpc-data-stream-recvr.cc File be/src/runtime/krpc-data-stream-recvr.cc: http://gerrit.cloudera.org:8080/#/c/8023/9/be/src/runtime/krpc-data-stream-recvr.cc@67 PS9, Line 67: data the resources http://gerrit.cloudera.org:8080/#/c/8023/9/be/src/runtime/krpc-data-stream-recvr.cc@195 PS9, Line 195: cur_batch_ current_batch_ http://gerrit.cloudera.org:8080/#/c/8023/9/be/src/runtime/krpc-data-stream-recvr.cc@225 PS9, Line 225: DCHECK shoudl we also DCHECK that num_pending_enqueue_ == 0 (otherwise, we could have acquired the lock while it was dropped for deserialization and there is still a row batch)? http://gerrit.cloudera.org:8080/#/c/8023/9/be/src/runtime/krpc-data-stream-recvr.cc@248 PS9, Line 248: batch_queue_.pop_front(); this is part of the same operation as line 244 (both are adjusting the queue state), so mind moving it to be adjacent? http://gerrit.cloudera.org:8080/#/c/8023/9/be/src/runtime/krpc-data-stream-recvr.cc@252 PS9, Line 252: // Don't hold lock when calling EnqueueDeserializeTask() as it may block. how about moving this to line 250 now that it's that scope that drops the lock. Also, it's important that this happens after batch_queue_.pop_front(), right? maybe note that. -- To view, visit http://gerrit.cloudera.org:8080/8023 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic0b8c1e50678da66ab1547d16530f88b323ed8c1 Gerrit-Change-Number: 8023 Gerrit-PatchSet: 9 Gerrit-Owner: Michael HoGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Wed, 08 Nov 2017 01:32:58 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4856: Port data stream service to KRPC
Hello Sailesh Mukil, Mostafa Mokhtar, Dan Hecht, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8023 to look at the new patch set (#9). Change subject: IMPALA-4856: Port data stream service to KRPC .. IMPALA-4856: Port data stream service to KRPC This patch implements a new data stream service which utilizes KRPC. Similar to the thrift RPC implementation, there are 3 major components to the data stream services: KrpcDataStreamSender serializes and sends row batches materialized by a fragment instance to a KrpcDataStreamRecvr. KrpcDataStreamMgr is responsible for routing an incoming row batch to the appropriate receiver. The data stream service runs on the port FLAGS_krpc_port which is 29000 by default. Unlike the implementation with thrift RPC, KRPC provides an asynchronous interface for invoking remote methods. As a result, KrpcDataStreamSender doesn't need to create a thread per connection. There is one connection between two Impalad nodes for each direction (i.e. client and server). Multiple queries can multi-plex on the same connection for transmitting row batches between two Impalad nodes. The asynchronous interface also prevents some issues with thrift RPC in which a thread may be stuck in an RPC call without being able to check for cancellation. A TransmitData() call with KRPC is in essence a trio of RpcController, a serialized protobuf request buffer and a protobuf response buffer. The call is invoked via a DataStreamService proxy object. The serialized tuple offsets and row batches are sent via "sidecars" in KRPC to avoid extra copy into the serialized request buffer. Each impalad node creates a singleton DataStreamService object at start-up time. All incoming calls are served by a service thread pool created as part of DataStreamService. By default, there are 64 service threads. The service threads are shared across all queries so the RPC handler should avoid blocking as much as possible. In thrift RPC implementation, we make a thrift thread handling a TransmitData() RPC to block for extended period of time when the receiver is not yet created when the call arrives. In KRPC implementation, we store TransmitData() or EndDataStream() requests which arrive before the receiver is ready in a per-receiver early sender list stored in KrpcDataStreamMgr. These RPC calls will be processed and responded to when the receiver is created or when timeout occurs. Similarly, there is limited space in the sender queues in KrpcDataStreamRecvr. If adding a row batch to a queue in KrpcDataStreamRecvr causes the buffer limit to exceed, the request will be stashed in a blocked_sender_ queue to be processed later. The stashed RPC request will not be responded to until it is processed so as to exert back pressure to the client. An alternative would be to reply with an error and the request / row batches need to be sent again. This may end up consuming more network bandwidth than the thrift RPC implementation. This change adopts the behavior of allowing one stashed request per sender. All rpc requests and responses are serialized using protobuf. The equivalent of TRowBatch would be ProtoRowBatch which contains a serialized header about the meta-data of the row batch and two Kudu Slice objects which contain pointers to the actual data (i.e. tuple offsets and tuple data). This patch is based on an abandoned patch by Henry Robinson. TESTING --- * Build passed with FLAGS_use_krpc=true. TO DO - * Port some BE tests to KRPC services. Change-Id: Ic0b8c1e50678da66ab1547d16530f88b323ed8c1 --- M be/src/common/status.cc M be/src/common/status.h M be/src/exec/data-sink.cc M be/src/exec/exchange-node.cc M be/src/exec/kudu-util.h M be/src/rpc/CMakeLists.txt M be/src/rpc/rpc-mgr.cc M be/src/rpc/rpc-mgr.h M be/src/runtime/CMakeLists.txt M be/src/runtime/data-stream-mgr-base.h M be/src/runtime/data-stream-mgr.h M be/src/runtime/data-stream-recvr.h M be/src/runtime/data-stream-sender.h M be/src/runtime/exec-env.cc M be/src/runtime/exec-env.h M be/src/runtime/krpc-data-stream-mgr.cc M be/src/runtime/krpc-data-stream-mgr.h M be/src/runtime/krpc-data-stream-recvr.cc M be/src/runtime/krpc-data-stream-recvr.h A be/src/runtime/krpc-data-stream-sender.cc A be/src/runtime/krpc-data-stream-sender.h M be/src/runtime/row-batch.cc M be/src/runtime/row-batch.h M be/src/service/CMakeLists.txt A be/src/service/data-stream-service.cc A be/src/service/data-stream-service.h M be/src/service/impala-server.cc M cmake_modules/FindProtobuf.cmake M common/protobuf/CMakeLists.txt A common/protobuf/common.proto A common/protobuf/data_stream_service.proto A common/protobuf/row_batch.proto M common/thrift/generate_error_codes.py 33 files changed, 3,156 insertions(+), 183 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/23/8023/9 -- To view, visit http://gerrit.cloudera.org:8080/8023 To unsubscribe,
[Impala-ASF-CR] IMPALA-6155: Allow tests to pass when ORDER BY does not cover the query.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/8484 ) Change subject: IMPALA-6155: Allow tests to pass when ORDER BY does not cover the query. .. Patch Set 3: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/8484 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib42ba64ce6ac9b75b4a532f20cee0055aaed5a6c Gerrit-Change-Number: 8484 Gerrit-PatchSet: 3 Gerrit-Owner: Tim WoodGerrit-Reviewer: Greg Rahn Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Tim Wood Gerrit-Comment-Date: Wed, 08 Nov 2017 00:13:54 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5017: Error on decimal overflow
Taras Bobrovytsky has uploaded a new patch set (#2). ( http://gerrit.cloudera.org:8080/8404 ) Change subject: IMPALA-5017: Error on decimal overflow .. IMPALA-5017: Error on decimal overflow Before this patch, decimal operations would either silently overflow (in the case of sum() and avg()), or produce a warning. In this patch, the behaviour is changed so that an error is produced in the case of overflow when DECIMAL_v2 is enabled. Decimal v1 behaviour is unchanged. We introduce overflow checks when computing sum() and avg(). This results in a ~30% performance regression when we are in decimal v2 mode compared to decimal v1. Benchmarks: Query: select sum(dec_38_19) from decimal_tbl Decimal v1: 13.09s Decimal v2: 17.10s Query: select avg(dec_38_19) from decimal_tbl Decimal v1: 13.60s Decimal v2: 19.10s Query: select avg(dec_9_5) from decimal_tbl Decimal v1: 12.06s Decimal v2: 18.07s Testing: - Added several end to end tests. - Updated Expr tests to check for error in case of overflow. Change-Id: Id98a92c9a9469ec8cf14e518c741a2dab7053019 --- M be/src/exprs/aggregate-functions-ir.cc M be/src/exprs/decimal-operators-ir.cc M be/src/exprs/expr-codegen-test.cc M be/src/exprs/expr-test.cc M testdata/workloads/functional-query/queries/QueryTest/decimal-exprs.test M testdata/workloads/functional-query/queries/QueryTest/decimal.test 6 files changed, 214 insertions(+), 69 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/04/8404/2 -- To view, visit http://gerrit.cloudera.org:8080/8404 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Id98a92c9a9469ec8cf14e518c741a2dab7053019 Gerrit-Change-Number: 8404 Gerrit-PatchSet: 2 Gerrit-Owner: Taras BobrovytskyGerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: John Sherman Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zach Amsden
[Impala-ASF-CR] IMPALA-2181: Add query option levels for display
Gabor Kaszab has posted comments on this change. ( http://gerrit.cloudera.org:8080/8447 ) Change subject: IMPALA-2181: Add query option levels for display .. Patch Set 8: (1 comment) http://gerrit.cloudera.org:8080/#/c/8447/8/be/src/service/impala-server.cc File be/src/service/impala-server.cc: http://gerrit.cloudera.org:8080/#/c/8447/8/be/src/service/impala-server.cc@1215 PS8, Line 1215: string_map["support_start_over"] = "false"; > This may not be related to this change, but it would be good to think a bit I agree that support_start_over implementation is a bit weird. Note1, this change is only about introducing visibility levels on query options, not to refactor how they are implemented. Note2, currently this change only affects the Beeswax interface that is used by impala-shell, but doesn't change HS2 interface towards e.g. HUE. So this option won't be hidden from HUE. -- To view, visit http://gerrit.cloudera.org:8080/8447 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I75720d0d454527e1a0ed19bb43cf9e4f018ce1d1 Gerrit-Change-Number: 8447 Gerrit-PatchSet: 8 Gerrit-Owner: Gabor KaszabGerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Laszlo Gaal Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Wed, 08 Nov 2017 00:05:53 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5999: Fix LLVM linkage errors due LibCache sync issues
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/8487 ) Change subject: IMPALA-5999: Fix LLVM linkage errors due LibCache sync issues .. Patch Set 2: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1444/ -- To view, visit http://gerrit.cloudera.org:8080/8487 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iefa23bbf47998fe7e84011e1edf8e794e94a1757 Gerrit-Change-Number: 8487 Gerrit-PatchSet: 2 Gerrit-Owner: Bikramjeet VigGerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 07 Nov 2017 23:15:27 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4704: Turns on client connections when local catalog initialized.
Hello Juan Yu, Michael Brown, Philip Zeyliger, Balazs Jeszenszky, Dimitris Tsirogiannis, Alex Behm, Dan Hecht, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8202 to look at the new patch set (#22). Change subject: IMPALA-4704: Turns on client connections when local catalog initialized. .. IMPALA-4704: Turns on client connections when local catalog initialized. Currently, impalad starts beeswax and hs2 servers even if the catalog has not yet been initialized. As a result, client connections see an error message stating that the impalad is not yet ready. This patch changes the impalad startup sequence to wait until the catalog is received before opening beeswax and hs2 ports and starting their servers. Testing: - python e2e tests that start a cluster without a catalog and check that client connections are rejected as expected. Change-Id: I52b881cba18a7e4533e21a78751c2e35c3d4c8a6 --- M be/src/benchmarks/expr-benchmark.cc M be/src/common/global-flags.cc M be/src/runtime/exec-env.cc M be/src/runtime/exec-env.h M be/src/service/frontend.cc M be/src/service/frontend.h M be/src/service/impala-server.cc M be/src/service/impala-server.h M be/src/service/impalad-main.cc M be/src/testutil/in-process-servers.cc M be/src/testutil/in-process-servers.h M bin/start-impala-cluster.py M fe/src/main/java/org/apache/impala/service/Frontend.java M fe/src/main/java/org/apache/impala/service/JniFrontend.java M fe/src/test/java/org/apache/impala/service/FrontendTest.java M tests/common/custom_cluster_test_suite.py M tests/common/impala_cluster.py M tests/common/impala_service.py A tests/custom_cluster/test_catalog_wait.py M tests/custom_cluster/test_coordinators.py 20 files changed, 353 insertions(+), 216 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/02/8202/22 -- To view, visit http://gerrit.cloudera.org:8080/8202 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I52b881cba18a7e4533e21a78751c2e35c3d4c8a6 Gerrit-Change-Number: 8202 Gerrit-PatchSet: 22 Gerrit-Owner: Vuk ErcegovacGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Balazs Jeszenszky Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Juan Yu Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Vuk Ercegovac
[Impala-ASF-CR] IMPALA-6134: Update code base to use impala::ConditionVariable
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/8428 ) Change subject: IMPALA-6134: Update code base to use impala::ConditionVariable .. Patch Set 4: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1443/ -- To view, visit http://gerrit.cloudera.org:8080/8428 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I3085c6dcb42350b61244df6e7f091a1e7db356c9 Gerrit-Change-Number: 8428 Gerrit-PatchSet: 4 Gerrit-Owner: Zoltan Borok-NagyGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Tue, 07 Nov 2017 22:51:17 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6134: Update code base to use impala::ConditionVariable
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/8428 ) Change subject: IMPALA-6134: Update code base to use impala::ConditionVariable .. Patch Set 4: Looks like it hit IMPALA-6092 plus some network flakiness trying to send RPCs to itself. I don't see how they're likely to be related to the change. -- To view, visit http://gerrit.cloudera.org:8080/8428 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I3085c6dcb42350b61244df6e7f091a1e7db356c9 Gerrit-Change-Number: 8428 Gerrit-PatchSet: 4 Gerrit-Owner: Zoltan Borok-NagyGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Tue, 07 Nov 2017 22:50:19 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4704: Turns on client connections when local catalog initialized.
Hello Juan Yu, Michael Brown, Philip Zeyliger, Balazs Jeszenszky, Dimitris Tsirogiannis, Alex Behm, Dan Hecht, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8202 to look at the new patch set (#21). Change subject: IMPALA-4704: Turns on client connections when local catalog initialized. .. IMPALA-4704: Turns on client connections when local catalog initialized. Currently, impalad starts beeswax and hs2 servers even if the catalog has not yet been initialized. As a result, client connections see an error message stating that the impalad is not yet ready. This patch changes the impalad startup sequence to wait until the catalog is received before opening beeswax and hs2 ports and starting their servers. Testing: - python e2e tests that start a cluster without a catalog and check that client connections are rejected as expected. Change-Id: I52b881cba18a7e4533e21a78751c2e35c3d4c8a6 --- M be/src/benchmarks/expr-benchmark.cc M be/src/common/global-flags.cc M be/src/runtime/exec-env.cc M be/src/runtime/exec-env.h M be/src/service/frontend.cc M be/src/service/frontend.h M be/src/service/impala-server.cc M be/src/service/impala-server.h M be/src/service/impalad-main.cc M be/src/testutil/in-process-servers.cc M be/src/testutil/in-process-servers.h M bin/start-impala-cluster.py M fe/src/main/java/org/apache/impala/service/Frontend.java M fe/src/main/java/org/apache/impala/service/JniFrontend.java M fe/src/test/java/org/apache/impala/service/FrontendTest.java M tests/common/custom_cluster_test_suite.py M tests/common/impala_cluster.py M tests/common/impala_service.py A tests/custom_cluster/test_catalog_wait.py M tests/custom_cluster/test_coordinators.py 20 files changed, 355 insertions(+), 216 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/02/8202/21 -- To view, visit http://gerrit.cloudera.org:8080/8202 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I52b881cba18a7e4533e21a78751c2e35c3d4c8a6 Gerrit-Change-Number: 8202 Gerrit-PatchSet: 21 Gerrit-Owner: Vuk ErcegovacGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Balazs Jeszenszky Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Juan Yu Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Vuk Ercegovac
[Impala-ASF-CR] IMPALA-2181: Add query option levels for display
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/8447 ) Change subject: IMPALA-2181: Add query option levels for display .. Patch Set 8: (1 comment) http://gerrit.cloudera.org:8080/#/c/8447/8/be/src/service/impala-server.cc File be/src/service/impala-server.cc: http://gerrit.cloudera.org:8080/#/c/8447/8/be/src/service/impala-server.cc@1215 PS8, Line 1215: string_map["support_start_over"] = "false"; This may not be related to this change, but it would be good to think a bit about support_start_over option. Unlike other options, it is not an integer, and it should not be changed. It is read by Hue ccording to https://www.cloudera.com/documentation/enterprise/5-4-x/topics/impala_support_start_over.html I do not know how Hue accesses query options, but if via the output of "set;" hiding it by default may actually change the behavior of Hue. Even if this is not an issue, it would be probably better to separate this option clearly e.g. by giving it a separate level like "capability" and treat these query options as read only. ( this comment added support_start_over a long time ago: https://github.com/cloudera/Impala/commit/94a5ed487e04b7d2c9bf6bee399f0f464f289211b ) -- To view, visit http://gerrit.cloudera.org:8080/8447 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I75720d0d454527e1a0ed19bb43cf9e4f018ce1d1 Gerrit-Change-Number: 8447 Gerrit-PatchSet: 8 Gerrit-Owner: Gabor KaszabGerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Laszlo Gaal Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Tue, 07 Nov 2017 23:07:55 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5307: Part 4: copy out uncompressed text and seq
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/8172 ) Change subject: IMPALA-5307: Part 4: copy out uncompressed text and seq .. Patch Set 13: Code-Review+2 (1 comment) http://gerrit.cloudera.org:8080/#/c/8172/13/be/src/exec/hdfs-scanner.h File be/src/exec/hdfs-scanner.h: http://gerrit.cloudera.org:8080/#/c/8172/13/be/src/exec/hdfs-scanner.h@274 PS13, Line 274: // nit: /// -- To view, visit http://gerrit.cloudera.org:8080/8172 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I304fd002b61bfedf41c8b1405cd7eb7b492bb941 Gerrit-Change-Number: 8172 Gerrit-PatchSet: 13 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 07 Nov 2017 22:50:05 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4704: Turns on client connections when local catalog initialized.
Vuk Ercegovac has posted comments on this change. ( http://gerrit.cloudera.org:8080/8202 ) Change subject: IMPALA-4704: Turns on client connections when local catalog initialized. .. Patch Set 20: (2 comments) http://gerrit.cloudera.org:8080/#/c/8202/19/be/src/service/impala-server.h File be/src/service/impala-server.h: http://gerrit.cloudera.org:8080/#/c/8202/19/be/src/service/impala-server.h@89 PS19, Line 89: must call the following : /// methods: > That DCHECK: combined them. http://gerrit.cloudera.org:8080/#/c/8202/20/be/src/service/impala-server.cc File be/src/service/impala-server.cc: http://gerrit.cloudera.org:8080/#/c/8202/20/be/src/service/impala-server.cc@1936 PS20, Line 1936: > I think a quick comment is worth it to highlight the dependency between the Done -- To view, visit http://gerrit.cloudera.org:8080/8202 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I52b881cba18a7e4533e21a78751c2e35c3d4c8a6 Gerrit-Change-Number: 8202 Gerrit-PatchSet: 20 Gerrit-Owner: Vuk ErcegovacGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Balazs Jeszenszky Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Juan Yu Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Tue, 07 Nov 2017 23:08:19 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5473: [DOCS] Document TLS min version & cipher options
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/8401 ) Change subject: IMPALA-5473: [DOCS] Document TLS min version & cipher options .. Patch Set 3: Build started: https://jenkins.impala.io/job/gerrit-docs-submit/171/ -- To view, visit http://gerrit.cloudera.org:8080/8401 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia1705262f8c01e38c616541d1c48f5d0cad5498e Gerrit-Change-Number: 8401 Gerrit-PatchSet: 3 Gerrit-Owner: John RussellGerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: John Russell Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Tue, 07 Nov 2017 22:22:16 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5473: [DOCS] Document TLS min version & cipher options
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/8401 ) Change subject: IMPALA-5473: [DOCS] Document TLS min version & cipher options .. Patch Set 3: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/8401 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia1705262f8c01e38c616541d1c48f5d0cad5498e Gerrit-Change-Number: 8401 Gerrit-PatchSet: 3 Gerrit-Owner: John RussellGerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: John Russell Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Tue, 07 Nov 2017 22:29:29 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5473: [DOCS] Document TLS min version & cipher options
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/8401 ) Change subject: IMPALA-5473: [DOCS] Document TLS min version & cipher options .. IMPALA-5473: [DOCS] Document TLS min version & cipher options Under the doc JIRA IMPALA-6065. Change-Id: Ia1705262f8c01e38c616541d1c48f5d0cad5498e Reviewed-on: http://gerrit.cloudera.org:8080/8401 Reviewed-by: Michael BrownReviewed-by: Sailesh Mukil Tested-by: Impala Public Jenkins --- M docs/topics/impala_ssl.xml 1 file changed, 71 insertions(+), 0 deletions(-) Approvals: Michael Brown: Looks good to me, but someone else must approve Sailesh Mukil: Looks good to me, approved Impala Public Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/8401 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: Ia1705262f8c01e38c616541d1c48f5d0cad5498e Gerrit-Change-Number: 8401 Gerrit-PatchSet: 4 Gerrit-Owner: John Russell Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: John Russell Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Sailesh Mukil
[Impala-ASF-CR] IMPALA-4539: [DOCS] Add known issue for uncompressed Parquet correctness
Hello Greg Rahn, Tim Armstrong, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8418 to look at the new patch set (#3). Change subject: IMPALA-4539: [DOCS] Add known issue for uncompressed Parquet correctness .. IMPALA-4539: [DOCS] Add known issue for uncompressed Parquet correctness Change-Id: I731eb0e029dc3cc251f4df0c5a8ad281c81595cb --- M docs/topics/impala_known_issues.xml 1 file changed, 45 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/18/8418/3 -- To view, visit http://gerrit.cloudera.org:8080/8418 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I731eb0e029dc3cc251f4df0c5a8ad281c81595cb Gerrit-Change-Number: 8418 Gerrit-PatchSet: 3 Gerrit-Owner: John RussellGerrit-Reviewer: Greg Rahn Gerrit-Reviewer: John Russell Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-4539: [DOCS] Add known issue for uncompressed Parquet correctness
John Russell has posted comments on this change. ( http://gerrit.cloudera.org:8080/8418 ) Change subject: IMPALA-4539: [DOCS] Add known issue for uncompressed Parquet correctness .. Patch Set 2: (3 comments) http://gerrit.cloudera.org:8080/#/c/8418/2/docs/topics/impala_known_issues.xml File docs/topics/impala_known_issues.xml: http://gerrit.cloudera.org:8080/#/c/8418/2/docs/topics/impala_known_issues.xml@951 PS2, Line 951: Medium > I think TSBs probably use a different scale though. Most of the medium seve Sure! Changed. http://gerrit.cloudera.org:8080/#/c/8418/2/docs/topics/impala_known_issues.xml@951 PS2, Line 951: Medium > I agree it's a little strange. But Cloudera issued a technical service bull Done http://gerrit.cloudera.org:8080/#/c/8418/2/docs/topics/impala_known_issues.xml@951 PS2, Line 951: Medium > Missed this on the first pass - shouldn't this be high? Done -- To view, visit http://gerrit.cloudera.org:8080/8418 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I731eb0e029dc3cc251f4df0c5a8ad281c81595cb Gerrit-Change-Number: 8418 Gerrit-PatchSet: 2 Gerrit-Owner: John RussellGerrit-Reviewer: Greg Rahn Gerrit-Reviewer: John Russell Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 07 Nov 2017 22:21:48 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6134: Update code base to use impala::ConditionVariable
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/8428 ) Change subject: IMPALA-6134: Update code base to use impala::ConditionVariable .. Patch Set 4: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/1441/ -- To view, visit http://gerrit.cloudera.org:8080/8428 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I3085c6dcb42350b61244df6e7f091a1e7db356c9 Gerrit-Change-Number: 8428 Gerrit-PatchSet: 4 Gerrit-Owner: Zoltan Borok-NagyGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Tue, 07 Nov 2017 22:07:44 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4856: Port data stream service to KRPC
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/8023 ) Change subject: IMPALA-4856: Port data stream service to KRPC .. Patch Set 9: (21 comments) http://gerrit.cloudera.org:8080/#/c/8023/8/be/src/runtime/krpc-data-stream-mgr.h File be/src/runtime/krpc-data-stream-mgr.h: http://gerrit.cloudera.org:8080/#/c/8023/8/be/src/runtime/krpc-data-stream-mgr.h@435 PS8, Line 435: 'num_requ > 'num_request' requests Done http://gerrit.cloudera.org:8080/#/c/8023/8/be/src/runtime/krpc-data-stream-mgr.cc File be/src/runtime/krpc-data-stream-mgr.cc: http://gerrit.cloudera.org:8080/#/c/8023/8/be/src/runtime/krpc-data-stream-mgr.cc@62 PS8, Line 62: ds, C > how was that chosen? do we have a test case that causes this queue to fill Just a large enough number to reduce the chance of the queue filling up. Yes, need to come up with a test case to fill up this queue. http://gerrit.cloudera.org:8080/#/c/8023/8/be/src/runtime/krpc-data-stream-mgr.cc@109 PS8, Line 109: // Let the receiver take over the RPC payloads of early senders and process them : // asynchronously. : for (unique_ptr this comment seems out of place. this is more an implementation detail of t Done http://gerrit.cloudera.org:8080/#/c/8023/8/be/src/runtime/krpc-data-stream-recvr.h File be/src/runtime/krpc-data-stream-recvr.h: http://gerrit.cloudera.org:8080/#/c/8023/8/be/src/runtime/krpc-data-stream-recvr.h@129 PS8, Line 129: n > and start a deserialization task to process it asynchronously. Done http://gerrit.cloudera.org:8080/#/c/8023/8/be/src/runtime/krpc-data-stream-recvr.h@130 PS8, Line 130: ask to p > this "transfer" is in the oppose direction of how our "Transfer" methods us Renamed to TakeOverEarlySender(). http://gerrit.cloudera.org:8080/#/c/8023/8/be/src/runtime/krpc-data-stream-recvr.h@197 PS8, Line 197: eive > cpu time or wall time? wall-clock time. http://gerrit.cloudera.org:8080/#/c/8023/8/be/src/runtime/krpc-data-stream-recvr.h@200 PS8, Line 200: _; > same question wall-clock time. http://gerrit.cloudera.org:8080/#/c/8023/8/be/src/runtime/krpc-data-stream-recvr.cc File be/src/runtime/krpc-data-stream-recvr.cc: http://gerrit.cloudera.org:8080/#/c/8023/8/be/src/runtime/krpc-data-stream-recvr.cc@67 PS8, Line 67: data > the resources Done http://gerrit.cloudera.org:8080/#/c/8023/8/be/src/runtime/krpc-data-stream-recvr.cc@72 PS8, Line 72: imit, the > RPC state Done http://gerrit.cloudera.org:8080/#/c/8023/8/be/src/runtime/krpc-data-stream-recvr.cc@73 PS8, Line 73: nto 'deferred_ > we shouldn't normally refer to private fields in public class comments, but Done http://gerrit.cloudera.org:8080/#/c/8023/8/be/src/runtime/krpc-data-stream-recvr.cc@87 PS8, Line 87: void TakeOverEarlySender(std::unique_ptr ctx); > same comment as recvr header comment. Done http://gerrit.cloudera.org:8080/#/c/8023/8/be/src/runtime/krpc-data-stream-recvr.cc@195 PS8, Line 195: // cur_batch_ must be replaced with the > I don't think we need this loop. see other comments in this function. Done http://gerrit.cloudera.org:8080/#/c/8023/8/be/src/runtime/krpc-data-stream-recvr.cc@197 PS8, Line 197: atch = nullptr; > nit: consider swapping the order of these so that the fast case comes first Done http://gerrit.cloudera.org:8080/#/c/8023/8/be/src/runtime/krpc-data-stream-recvr.cc@210 PS8, Line 210: } > nit: i think we could do without some of the blank lines in this method to I leave a blank line between each loop exiting conditions. http://gerrit.cloudera.org:8080/#/c/8023/8/be/src/runtime/krpc-data-stream-recvr.cc@211 PS8, Line 211: > Actually, I think the loop exiting condition is not quite right, which led There is an invariant that EOS cannot be sent by a sender when there is outstanding TransmitData() RPC so we should be able to get by by just checking for the termination condition of: (num_remaining_senders_ == 0 && batch_queue_.empty()) http://gerrit.cloudera.org:8080/#/c/8023/8/be/src/runtime/krpc-data-stream-recvr.cc@218 PS8, Line 218: > given that we just checked the other two loop exit conditions, isn't this d Converted to DCHECK(). http://gerrit.cloudera.org:8080/#/c/8023/8/be/src/runtime/krpc-data-stream-recvr.cc@229 PS8, Line 229: d > to parallelize the CPU bound deserialization work. Done http://gerrit.cloudera.org:8080/#/c/8023/8/be/src/runtime/krpc-data-stream-recvr.cc@239 PS8, Line 239: } > once you get rid of the loop, I think you'll be able to eliminate this unlo Done http://gerrit.cloudera.org:8080/#/c/8023/8/be/src/runtime/krpc-data-stream-recvr.cc@389 PS8, Line 389: status.ToProto(ctx->response->mutable_status()); > at this point, 'ctx' effectively takes ownership, right? we should add a co Done http://gerrit.cloudera.org:8080/#/c/8023/8/be/src/runtime/krpc-data-stream-recvr.cc@406 PS8, Line 406: const RowBatchHeaderPB& header =
[Impala-ASF-CR] IMPALA-6155: Allow tests to pass when ORDER BY does not cover the query.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/8484 ) Change subject: IMPALA-6155: Allow tests to pass when ORDER BY does not cover the query. .. Patch Set 3: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1442/ -- To view, visit http://gerrit.cloudera.org:8080/8484 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib42ba64ce6ac9b75b4a532f20cee0055aaed5a6c Gerrit-Change-Number: 8484 Gerrit-PatchSet: 3 Gerrit-Owner: Tim WoodGerrit-Reviewer: Greg Rahn Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Tim Wood Gerrit-Comment-Date: Tue, 07 Nov 2017 20:55:27 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6155: Allow tests to pass when ORDER BY does not cover the query.
Michael Brown has posted comments on this change. ( http://gerrit.cloudera.org:8080/8484 ) Change subject: IMPALA-6155: Allow tests to pass when ORDER BY does not cover the query. .. Patch Set 3: Code-Review+1 I haven't dabbled enough with test files to know that VERIFY_IS_EQUAL_SORTED even exists. I agree with its use over changing the TPC-DS queries. -- To view, visit http://gerrit.cloudera.org:8080/8484 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib42ba64ce6ac9b75b4a532f20cee0055aaed5a6c Gerrit-Change-Number: 8484 Gerrit-PatchSet: 3 Gerrit-Owner: Tim WoodGerrit-Reviewer: Greg Rahn Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Tim Wood Gerrit-Comment-Date: Tue, 07 Nov 2017 20:53:25 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5017: Error on decimal overflow
Taras Bobrovytsky has posted comments on this change. ( http://gerrit.cloudera.org:8080/8404 ) Change subject: IMPALA-5017: Error on decimal overflow .. Patch Set 1: (4 comments) http://gerrit.cloudera.org:8080/#/c/8404/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/8404/1//COMMIT_MSG@12 PS1, Line 12: In this patch, the beharior is changed so that an error is produced in > I'll also accept behaviour ;) Done http://gerrit.cloudera.org:8080/#/c/8404/1//COMMIT_MSG@27 PS1, Line 27: select avg(dec_38_19) from decimal_tbl > I guess these regressions occur regardless of the width of the input decima Correct. Added another benchmark to illustrate this. http://gerrit.cloudera.org:8080/#/c/8404/1/be/src/exprs/aggregate-functions-ir.cc File be/src/exprs/aggregate-functions-ir.cc: http://gerrit.cloudera.org:8080/#/c/8404/1/be/src/exprs/aggregate-functions-ir.cc@413 PS1, Line 413: abs(avg->sum_val16) > DecimalUtil::MAX_UNSCALED_DECIMAL16 - abs(src.val4))) { > This isn't correct if sum_val16 and src.val* have opposite sign, is it? That's true. Fixed and added a test case. http://gerrit.cloudera.org:8080/#/c/8404/1/be/src/exprs/aggregate-functions-ir.cc@420 PS1, Line 420: abs(avg->sum_val16) > DecimalUtil::MAX_UNSCALED_DECIMAL16 - abs(src.val8))) { > What do you think about only checking for overflow of the integer type at t 1. Are you suggesting to check for something like abs(avg->sum_val16) > 2**128 - abs(src.val8))) ? I don't really see why it is faster to check against 2**128 instead of against MAX_UNSCALED_DECIMAL. 2. What do you mean by initial check? Where would this check be done? Something like (decimal_v2 && sum_val16 > 2**32 && abs(avg->sum_val16) > MAX_UNSCALED_DECIMAL - abs(src.val8)))? -- To view, visit http://gerrit.cloudera.org:8080/8404 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id98a92c9a9469ec8cf14e518c741a2dab7053019 Gerrit-Change-Number: 8404 Gerrit-PatchSet: 1 Gerrit-Owner: Taras BobrovytskyGerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: John Sherman Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zach Amsden Gerrit-Comment-Date: Tue, 07 Nov 2017 21:59:10 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-2181: Add query option levels for display
Hello Lars Volker, Laszlo Gaal, Zoltan Borok-Nagy, Philip Zeyliger, Attila Jeges, Tim Armstrong, Csaba Ringhofer, Dan Hecht, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8447 to look at the new patch set (#8). Change subject: IMPALA-2181: Add query option levels for display .. IMPALA-2181: Add query option levels for display Three display levels are introduced for each query option: REGULAR, ADVANCED and DEPRECATED. When the query options are diplayed in Impala shell using SET then only the REGULAR options are shown. A new command called SET ALL shows all the options grouped by their option levels. Change-Id: I75720d0d454527e1a0ed19bb43cf9e4f018ce1d1 --- M be/src/service/child-query.cc M be/src/service/impala-server.cc M be/src/service/impala-server.h M be/src/service/query-options.cc M be/src/service/query-options.h M common/thrift/ImpalaService.thrift M common/thrift/beeswax.thrift M shell/impala_client.py M shell/impala_shell.py M tests/shell/test_shell_interactive.py 10 files changed, 225 insertions(+), 91 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/47/8447/8 -- To view, visit http://gerrit.cloudera.org:8080/8447 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I75720d0d454527e1a0ed19bb43cf9e4f018ce1d1 Gerrit-Change-Number: 8447 Gerrit-PatchSet: 8 Gerrit-Owner: Gabor KaszabGerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Laszlo Gaal Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy
[Impala-ASF-CR] IMPALA-5999: Fix LLVM linkage errors due LibCache sync issues
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/8487 ) Change subject: IMPALA-5999: Fix LLVM linkage errors due LibCache sync issues .. Patch Set 2: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/8487 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iefa23bbf47998fe7e84011e1edf8e794e94a1757 Gerrit-Change-Number: 8487 Gerrit-PatchSet: 2 Gerrit-Owner: Bikramjeet VigGerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 07 Nov 2017 21:21:34 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6151: add query-level fragment/backend counters
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/8461 ) Change subject: IMPALA-6151: add query-level fragment/backend counters .. IMPALA-6151: add query-level fragment/backend counters This adds NumBackends, NumFragments and NumFragmentInstances counters to the query profile to make it easier to manually or programmatically analyse the query. Also add a num-queries-registered metric to track the number of queries that have been executed but are not yet unregistered. Testing: Ran "select count(*) from alltypessmall" and checked profile: Backend startup latencies: Count: 3, min / max: 1ms / 1ms, 25th %-ile: 1ms, 50th %-ile: 1ms, 75th %-ile: 1ms, 90th %-ile: 1ms, 95th %-ile: 1ms, 99.9th %-ile: 1ms Per Node Peak Memory Usage: tarmstrong-box:22000(1.10 MB) tarmstrong-box:22001(1.02 MB) tarmstrong-box:22002(1.02 MB) - FiltersReceived: 0 (0) - FinalizationTimer: 0.000ns - NumBackends: 3 (3) - NumFragmentInstances: 4 (4) - NumFragments: 2 (2) Ran some query tests (both beeswax and HS2) and manually checked the num-queries-registered metric on the /metrics page when the queries were running and after they finished. Added the metric to test_metrics_are_zero() to make sure that there are no accounting errors. Change-Id: I3df350414733e98d1ec28adc1c98f45bb0c4e3e9 Reviewed-on: http://gerrit.cloudera.org:8080/8461 Reviewed-by: Tim ArmstrongTested-by: Impala Public Jenkins --- M be/src/runtime/coordinator.cc M be/src/service/impala-server.cc M be/src/util/impalad-metrics.cc M be/src/util/impalad-metrics.h M common/thrift/metrics.json M tests/verifiers/metric_verifier.py 6 files changed, 37 insertions(+), 5 deletions(-) Approvals: Tim Armstrong: Looks good to me, approved Impala Public Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/8461 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I3df350414733e98d1ec28adc1c98f45bb0c4e3e9 Gerrit-Change-Number: 8461 Gerrit-PatchSet: 7 Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-6151: add query-level fragment/backend counters
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/8461 ) Change subject: IMPALA-6151: add query-level fragment/backend counters .. Patch Set 6: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/8461 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I3df350414733e98d1ec28adc1c98f45bb0c4e3e9 Gerrit-Change-Number: 8461 Gerrit-PatchSet: 6 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 07 Nov 2017 21:44:33 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5999: Fix LLVM linkage errors due LibCache sync issues
Bikramjeet Vig has posted comments on this change. ( http://gerrit.cloudera.org:8080/8487 ) Change subject: IMPALA-5999: Fix LLVM linkage errors due LibCache sync issues .. Patch Set 1: (2 comments) http://gerrit.cloudera.org:8080/#/c/8487/1/be/src/codegen/llvm-codegen-test.cc File be/src/codegen/llvm-codegen-test.cc: http://gerrit.cloudera.org:8080/#/c/8487/1/be/src/codegen/llvm-codegen-test.cc@485 PS1, Line 485: NULL > nullptr Done http://gerrit.cloudera.org:8080/#/c/8487/1/be/src/codegen/llvm-codegen.h File be/src/codegen/llvm-codegen.h: http://gerrit.cloudera.org:8080/#/c/8487/1/be/src/codegen/llvm-codegen.h@584 PS1, Line 584: LinkModule > Maybe LinkModuleFromLocalFs() or something like that to disambiguate it the Done -- To view, visit http://gerrit.cloudera.org:8080/8487 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iefa23bbf47998fe7e84011e1edf8e794e94a1757 Gerrit-Change-Number: 8487 Gerrit-PatchSet: 1 Gerrit-Owner: Bikramjeet VigGerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 07 Nov 2017 21:15:23 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6121: remove I/O mgr request context cache
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/8408 ) Change subject: IMPALA-6121: remove I/O mgr request context cache .. Patch Set 7: (2 comments) http://gerrit.cloudera.org:8080/#/c/8408/7/be/src/runtime/disk-io-mgr-reader-context.h File be/src/runtime/disk-io-mgr-reader-context.h: PS7: > I assume this was a wholesale move of the class without modifications, exce There was some minor cleanup - initialising member variables inline, adding a DCHECK in the destructor, etc. I created a diff here to see how the class definition changed: https://gist.github.com/timarmstrong/f640dd200260556e1167d8ca69a67f51/revisions?diff=split http://gerrit.cloudera.org:8080/#/c/8408/7/be/src/runtime/disk-io-mgr.h File be/src/runtime/disk-io-mgr.h: http://gerrit.cloudera.org:8080/#/c/8408/7/be/src/runtime/disk-io-mgr.h@662 PS7, Line 662: void UnregisterContext(DiskIoRequestContext* context); > Given that the context cannot be used after this call, should we move the u I don't feel strongly. I think your argument has merit but destroying the data structure goes (somewhat) against the idea that we shouldn't tear down control structures until the very end of the query. -- To view, visit http://gerrit.cloudera.org:8080/8408 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I91414eceaa4938fccd74686fe6bebede6ef36108 Gerrit-Change-Number: 8408 Gerrit-PatchSet: 7 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 07 Nov 2017 21:10:16 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6155: Allow tests to pass when ORDER BY does not cover the query.
Tim Wood has posted comments on this change. ( http://gerrit.cloudera.org:8080/8484 ) Change subject: IMPALA-6155: Allow tests to pass when ORDER BY does not cover the query. .. Patch Set 3: > Uploaded patch set 3. Applied Tim A.'s suggestion. -- To view, visit http://gerrit.cloudera.org:8080/8484 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib42ba64ce6ac9b75b4a532f20cee0055aaed5a6c Gerrit-Change-Number: 8484 Gerrit-PatchSet: 3 Gerrit-Owner: Tim WoodGerrit-Reviewer: Greg Rahn Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Tim Wood Gerrit-Comment-Date: Tue, 07 Nov 2017 20:42:10 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6155: Allow tests to pass when ORDER BY does not cover the query.
Hello Greg Rahn, Michael Brown, Tim Armstrong, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8484 to look at the new patch set (#3). Change subject: IMPALA-6155: Allow tests to pass when ORDER BY does not cover the query. .. IMPALA-6155: Allow tests to pass when ORDER BY does not cover the query. Use VERIFY_IS_EQUAL_SORTED tag on RESULTS section to allow low-order sort deviations to compare. Testing: - Passed local tests/run-tests.py ... - Changed order of expected rows by low-order column and verified test still passes. - Analyzed all TPC-DS query files for similar patterns, found no others. Change-Id: Ib42ba64ce6ac9b75b4a532f20cee0055aaed5a6c --- M testdata/workloads/tpcds/queries/tpcds-q77a.test 1 file changed, 1 insertion(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/84/8484/3 -- To view, visit http://gerrit.cloudera.org:8080/8484 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ib42ba64ce6ac9b75b4a532f20cee0055aaed5a6c Gerrit-Change-Number: 8484 Gerrit-PatchSet: 3 Gerrit-Owner: Tim WoodGerrit-Reviewer: Greg Rahn Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Tim Wood
[Impala-ASF-CR] IMPALA-6155: Allow tests to pass when ORDER BY does not cover the query.
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/8484 ) Change subject: IMPALA-6155: Allow tests to pass when ORDER BY does not cover the query. .. Patch Set 3: Code-Review+2 Thanks Tim W. -- To view, visit http://gerrit.cloudera.org:8080/8484 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib42ba64ce6ac9b75b4a532f20cee0055aaed5a6c Gerrit-Change-Number: 8484 Gerrit-PatchSet: 3 Gerrit-Owner: Tim WoodGerrit-Reviewer: Greg Rahn Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Tim Wood Gerrit-Comment-Date: Tue, 07 Nov 2017 20:54:37 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-2181: Add query option levels for display
Gabor Kaszab has posted comments on this change. ( http://gerrit.cloudera.org:8080/8447 ) Change subject: IMPALA-2181: Add query option levels for display .. Patch Set 8: (4 comments) http://gerrit.cloudera.org:8080/#/c/8447/7/be/src/service/impala-server.h File be/src/service/impala-server.h: http://gerrit.cloudera.org:8080/#/c/8447/7/be/src/service/impala-server.h@516 PS7, Line 516: /// Given a query option name this function gets the option level for it to use when : /// displaying from Impala shell and adds the level to the ConfigVariable parameter : /// in numeric format. For values see TQueryOptionLevel enum. > nit: For me, this comment is too complex and tells too much implementation Thx, that sounds better indeed. Done http://gerrit.cloudera.org:8080/#/c/8447/7/be/src/service/impala-server.cc File be/src/service/impala-server.cc: http://gerrit.cloudera.org:8080/#/c/8447/7/be/src/service/impala-server.cc@1228 PS7, Line 1228: const string& option_key > Why do we pass option_key here? It is already there in config, isn't it? In this case it is. But if I relied on that than it would introduce some risk when someone starts to rearrange the part where config is populated as this function can't guarantee that config contains the key every time. To be on the safe side I think it's reasonable to keep the key as a parameter. http://gerrit.cloudera.org:8080/#/c/8447/7/shell/impala_shell.py File shell/impala_shell.py: http://gerrit.cloudera.org:8080/#/c/8447/7/shell/impala_shell.py@240 PS7, Line 240: if print_mode == QueryOptionDisplayModes.ALL_OPTIONS: : if len(advanced_options) > 0: : print '\nAdvanced Query Options:' : self._print_option_group(advanced_options) : if len(deprecated_options) > 0: : print '\nDeprecated Query Options:' > nit: this can be simplified as: Done http://gerrit.cloudera.org:8080/#/c/8447/7/shell/impala_shell.py@620 PS7, Line 620: ESS : : # Remove any extra spaces surrounding the tokens. > These members are already available in _print_with_set through self. And as Good point! Done -- To view, visit http://gerrit.cloudera.org:8080/8447 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I75720d0d454527e1a0ed19bb43cf9e4f018ce1d1 Gerrit-Change-Number: 8447 Gerrit-PatchSet: 8 Gerrit-Owner: Gabor KaszabGerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Laszlo Gaal Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Tue, 07 Nov 2017 20:51:31 + Gerrit-HasComments: Yes
[Impala-ASF-CR] Bump Kudu version to 1520b39
Thomas Tauber-Marshall has uploaded this change for review. ( http://gerrit.cloudera.org:8080/8492 Change subject: Bump Kudu version to 1520b39 .. Bump Kudu version to 1520b39 Change-Id: Ib3d5d88bd4d3347447a64fac75ca3e0427b11ec6 --- M bin/impala-config.sh 1 file changed, 2 insertions(+), 2 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/92/8492/1 -- To view, visit http://gerrit.cloudera.org:8080/8492 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: Ib3d5d88bd4d3347447a64fac75ca3e0427b11ec6 Gerrit-Change-Number: 8492 Gerrit-PatchSet: 1 Gerrit-Owner: Thomas Tauber-Marshall
[Impala-ASF-CR] IMPALA-2281: Replace FNV with FastHash in exchange nodes
Tianyi Wang has uploaded a new patch set (#4). ( http://gerrit.cloudera.org:8080/8417 ) Change subject: IMPALA-2281: Replace FNV with FastHash in exchange nodes .. IMPALA-2281: Replace FNV with FastHash in exchange nodes FNV is not a good enough hash function. This patch introduces FastHash into the codebase and uses it in exchange nodes. Testing: Two test cases involving arbitrary ordering are changed. Single node performance benchmark shows no performance difference. Change-Id: I778317d982dcdb94173a369a65b39f32b4f7ded2 --- M be/src/benchmarks/hash-benchmark.cc M be/src/codegen/gen_ir_descriptions.py M be/src/codegen/llvm-codegen.cc M be/src/runtime/data-stream-sender.cc M be/src/runtime/data-stream-sender.h M be/src/runtime/data-stream-test.cc M be/src/runtime/raw-value-test.cc M be/src/runtime/raw-value.cc M be/src/runtime/raw-value.h M be/src/util/hash-util-ir.cc M be/src/util/hash-util.h M testdata/workloads/functional-query/queries/QueryTest/nested-types-runtime.test M testdata/workloads/tpcds/queries/tpcds-q77a.test 13 files changed, 201 insertions(+), 101 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/17/8417/4 -- To view, visit http://gerrit.cloudera.org:8080/8417 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I778317d982dcdb94173a369a65b39f32b4f7ded2 Gerrit-Change-Number: 8417 Gerrit-PatchSet: 4 Gerrit-Owner: Tianyi WangGerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-5307: Part 4: copy out uncompressed text and seq
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/8172 ) Change subject: IMPALA-5307: Part 4: copy out uncompressed text and seq .. Patch Set 11: (5 comments) http://gerrit.cloudera.org:8080/#/c/8172/11/be/src/exec/hdfs-scanner-ir.cc File be/src/exec/hdfs-scanner-ir.cc: http://gerrit.cloudera.org:8080/#/c/8172/11/be/src/exec/hdfs-scanner-ir.cc@61 PS11, Line 61: return 0; > Is there a reason why we don't treat this as parse error and returns -1 ? No that's a good point. I changed it. http://gerrit.cloudera.org:8080/#/c/8172/11/be/src/exec/hdfs-scanner.h File be/src/exec/hdfs-scanner.h: http://gerrit.cloudera.org:8080/#/c/8172/11/be/src/exec/hdfs-scanner.h@347 PS11, Line 347: /// Returns -1 if parsing should be aborted due to parse errors. > May help to also add "Returns 0 if copying strings into 'pool' failed'. Clarified that it could be a memory allocation error. Per the other comment, it probably makes most sense to just return -1. http://gerrit.cloudera.org:8080/#/c/8172/11/be/src/exec/hdfs-sequence-scanner.cc File be/src/exec/hdfs-sequence-scanner.cc: http://gerrit.cloudera.org:8080/#/c/8172/11/be/src/exec/hdfs-sequence-scanner.cc@a282 PS11, Line 282: > It may be trivial but does it make sense to add DCHECK(row_batch->num_tuple The assumption of 1 tuple per row for scans is pretty deeply baked into the scanners and planner, so it seemed silly to have this speculative generality here. I added a DCHECK to HdfsScanner to document that the assumption is valid for all HDFS scanners. http://gerrit.cloudera.org:8080/#/c/8172/11/be/src/exec/hdfs-text-scanner.cc File be/src/exec/hdfs-text-scanner.cc: http://gerrit.cloudera.org:8080/#/c/8172/11/be/src/exec/hdfs-text-scanner.cc@849 PS11, Line 849: tuples_returned > There is a subtle behavior here if Tuple::CopyStrings() failed in WriteAlig Yeah I think I just made a mistake returning 0. Returning -1 makes more sense. I manually tested with this query: [localhost:21000] > set num_nodes=1; set disable_codegen=1; set mem_limit=5m; select * from widetable_1000_cols; NUM_NODES set to 1 DISABLE_CODEGEN set to 1 MEM_LIMIT set to 5m Query: select * from widetable_1000_cols Query submitted at: 2017-11-07 11:52:01 (Coordinator: http://tarmstrong-box:25000) Query progress can be monitored at: http://tarmstrong-box:25000/query_plan?query_id=9948285e1aa8e172:f73fe971 ERROR: Memory limit exceeded: HdfsScanner::WriteAlignedTuples() failed to allocate 125 bytes for strings. HDFS_SCAN_NODE (id=0) could not allocate 125.00 B without exceeding limit. Error occurred on backend tarmstrong-box:22000 by fragment 9948285e1aa8e172:f73fe971 Memory left in process limit: 8.01 GB Memory left in query limit: 947.59 KB Query(9948285e1aa8e172:f73fe971): Limit=5.00 MB Reservation=0 ReservationLimit=0 OtherMemory=4.07 MB Total=4.07 MB Peak=4.07 MB Fragment 9948285e1aa8e172:f73fe971: Reservation=0 OtherMemory=4.07 MB Total=4.07 MB Peak=4.07 MB HDFS_SCAN_NODE (id=0): Total=4.07 MB Peak=4.07 MB PLAN_ROOT_SINK: Total=0 Peak=0 It would be nice to turn it into an end-to-end test but I cant see how to make it robustly fail in that place. http://gerrit.cloudera.org:8080/#/c/8172/11/be/src/exec/hdfs-text-scanner.cc@857 PS11, Line 857: num_tuples > Why is this not tuples_returned here ? Is there any guarantee max_added_tup I think num_fields is tracking the number of fields consumed from the input. -- To view, visit http://gerrit.cloudera.org:8080/8172 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I304fd002b61bfedf41c8b1405cd7eb7b492bb941 Gerrit-Change-Number: 8172 Gerrit-PatchSet: 11 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 07 Nov 2017 19:54:01 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5307: Part 4: copy out uncompressed text and seq
Hello Michael Ho, Alex Behm, Dan Hecht, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8172 to look at the new patch set (#13). Change subject: IMPALA-5307: Part 4: copy out uncompressed text and seq .. IMPALA-5307: Part 4: copy out uncompressed text and seq This is the final patch for IMPALA-5307. Text and Seq scanners are converted to use the same approach as Avro. contains_tuple_data is now false so a bunch of dead code in ScannerContext can be removed. We also no longer attach I/O buffers to row batches so that logic can be removed. Testing: Ran core ASAN tests. Perf: I reran the same benchmarks as in Part 2. There was no measurable difference before and after - for both text and seq processing time is dominated by text parsing. Change-Id: I304fd002b61bfedf41c8b1405cd7eb7b492bb941 --- M be/src/codegen/gen_ir_descriptions.py M be/src/exec/base-sequence-scanner.cc M be/src/exec/exec-node.cc M be/src/exec/exec-node.h M be/src/exec/hdfs-avro-scanner-ir.cc M be/src/exec/hdfs-avro-scanner.cc M be/src/exec/hdfs-avro-scanner.h M be/src/exec/hdfs-parquet-scanner.cc M be/src/exec/hdfs-rcfile-scanner.cc M be/src/exec/hdfs-scan-node-base.h M be/src/exec/hdfs-scan-node-mt.cc M be/src/exec/hdfs-scan-node.cc M be/src/exec/hdfs-scanner-ir.cc M be/src/exec/hdfs-scanner.cc M be/src/exec/hdfs-scanner.h M be/src/exec/hdfs-sequence-scanner.cc M be/src/exec/hdfs-text-scanner.cc M be/src/exec/parquet-column-readers.cc M be/src/exec/scanner-context.cc M be/src/exec/scanner-context.h M be/src/exec/text-converter.cc M be/src/runtime/row-batch.cc M be/src/runtime/row-batch.h 23 files changed, 137 insertions(+), 212 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/72/8172/13 -- To view, visit http://gerrit.cloudera.org:8080/8172 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I304fd002b61bfedf41c8b1405cd7eb7b492bb941 Gerrit-Change-Number: 8172 Gerrit-PatchSet: 13 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-4835 (prep only): create io subfolder and namespace
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/8424 ) Change subject: IMPALA-4835 (prep only): create io subfolder and namespace .. Patch Set 7: (1 comment) http://gerrit.cloudera.org:8080/#/c/8424/6/be/src/runtime/io/disk-io-mgr.h File be/src/runtime/io/disk-io-mgr.h: http://gerrit.cloudera.org:8080/#/c/8424/6/be/src/runtime/io/disk-io-mgr.h@184 PS6, Line 184: // > Can remove this TODO Done -- To view, visit http://gerrit.cloudera.org:8080/8424 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If807f93a47d8027a43e56dd80b1b535d0bb74e1b Gerrit-Change-Number: 8424 Gerrit-PatchSet: 7 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 07 Nov 2017 19:55:58 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4835 (prep only): create io subfolder and namespace
Hello Joe McDonnell, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8424 to look at the new patch set (#7). Change subject: IMPALA-4835 (prep only): create io subfolder and namespace .. IMPALA-4835 (prep only): create io subfolder and namespace Instead of using the DiskIoMgr class as a namespace, which prevents forward-declaration of inner classes, create an impala::io namespace and unnested the inner class. This is done in anticipation of DiskIoMgr depending on BufferPool. This helps avoid a circular dependency between DiskIoMgr, TmpFileMgr and BufferPool headers that could not be broken with forward declarations. Testing: Ran core tests. Change-Id: If807f93a47d8027a43e56dd80b1b535d0bb74e1b --- M be/CMakeLists.txt M be/src/exec/base-sequence-scanner.cc M be/src/exec/hdfs-parquet-scanner.cc M be/src/exec/hdfs-parquet-scanner.h M be/src/exec/hdfs-scan-node-base.cc M be/src/exec/hdfs-scan-node-base.h M be/src/exec/hdfs-scan-node-mt.h M be/src/exec/hdfs-scan-node.cc M be/src/exec/hdfs-scan-node.h M be/src/exec/hdfs-text-scanner.cc M be/src/exec/kudu-scan-node.cc M be/src/exec/scanner-context.cc M be/src/exec/scanner-context.h M be/src/runtime/CMakeLists.txt D be/src/runtime/disk-io-mgr.h M be/src/runtime/exec-env.cc M be/src/runtime/exec-env.h A be/src/runtime/io/CMakeLists.txt R be/src/runtime/io/disk-io-mgr-internal.h R be/src/runtime/io/disk-io-mgr-stress-test.cc R be/src/runtime/io/disk-io-mgr-stress.cc R be/src/runtime/io/disk-io-mgr-stress.h R be/src/runtime/io/disk-io-mgr-test.cc R be/src/runtime/io/disk-io-mgr.cc A be/src/runtime/io/disk-io-mgr.h R be/src/runtime/io/handle-cache.h R be/src/runtime/io/handle-cache.inline.h R be/src/runtime/io/request-context.cc R be/src/runtime/io/request-context.h A be/src/runtime/io/request-ranges.h R be/src/runtime/io/scan-range.cc M be/src/runtime/row-batch.h M be/src/runtime/runtime-state.cc M be/src/runtime/runtime-state.h M be/src/runtime/test-env.h M be/src/runtime/tmp-file-mgr-test.cc M be/src/runtime/tmp-file-mgr.cc M be/src/runtime/tmp-file-mgr.h 38 files changed, 1,416 insertions(+), 1,311 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/24/8424/7 -- To view, visit http://gerrit.cloudera.org:8080/8424 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: If807f93a47d8027a43e56dd80b1b535d0bb74e1b Gerrit-Change-Number: 8424 Gerrit-PatchSet: 7 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-4835 (prep only): create io subfolder and namespace
Joe McDonnell has posted comments on this change. ( http://gerrit.cloudera.org:8080/8424 ) Change subject: IMPALA-4835 (prep only): create io subfolder and namespace .. Patch Set 6: (1 comment) One small comment http://gerrit.cloudera.org:8080/#/c/8424/6/be/src/runtime/io/disk-io-mgr.h File be/src/runtime/io/disk-io-mgr.h: http://gerrit.cloudera.org:8080/#/c/8424/6/be/src/runtime/io/disk-io-mgr.h@184 PS6, Line 184: /// TODO: Break this up the .h/.cc into multiple files under an /io subdirectory. Can remove this TODO -- To view, visit http://gerrit.cloudera.org:8080/8424 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If807f93a47d8027a43e56dd80b1b535d0bb74e1b Gerrit-Change-Number: 8424 Gerrit-PatchSet: 6 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 07 Nov 2017 19:39:42 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4177,IMPALA-6039: batched bit reading and rle decoding
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/8267 ) Change subject: IMPALA-4177,IMPALA-6039: batched bit reading and rle decoding .. Patch Set 14: Code-Review+1 Rebased, carry Lars' +1. Lars are you comfortable with bringing this to a +2? -- To view, visit http://gerrit.cloudera.org:8080/8267 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I35de0cf80c86f501c4a39270afc8fb8111552ac6 Gerrit-Change-Number: 8267 Gerrit-PatchSet: 14 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 07 Nov 2017 19:12:03 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4177,IMPALA-6039: batched bit reading and rle decoding
Hello Lars Volker, Dan Hecht, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8267 to look at the new patch set (#14). Change subject: IMPALA-4177,IMPALA-6039: batched bit reading and rle decoding .. IMPALA-4177,IMPALA-6039: batched bit reading and rle decoding Switch the decoders to using more batch-oriented interfaces. As an intermediate step this doesn't make the interfaces of LevelDecoder or DictDecoder batch-oriented, only the lower-level utility classes. The next step would be to change those interfaces to be batch-oriented and make according optimisations in parquet. This could deliver much larger perf improvements than the current patch. The high-level changes are. * BitReader -> BatchedBitReader, which is built to unpack runs of 32 bit-packed values efficiently. * RleDecoder -> RleBatchDecoder, which exposes the repeated and literal runs to the caller and uses BatchedBitReader to unpack literal runs efficiently. * Dict decoding uses RleBatchDecoder to decode repeated runs efficiently and uses the BitPacking utilities to unpack and encode in a single step. Also removes an older benchmark that isn't too interesting (since the batch-oriented approach to encoding and decoding is so much faster than the value-by-value approach). Testing: * Ran core tests. * Updated unit tests to exercise new code. * Added test coverage for the deprecated bit-packed level encoding to that it still works (there was no coverage previously). Perf: Single-node benchmarks showed a few % performance gain. 16 node cluster benchmarks only showed a gain for TPC-H nested. Change-Id: I35de0cf80c86f501c4a39270afc8fb8111552ac6 --- M be/src/benchmarks/CMakeLists.txt M be/src/benchmarks/bit-packing-benchmark.cc D be/src/benchmarks/rle-benchmark.cc M be/src/exec/parquet-column-readers.cc M be/src/exec/parquet-column-readers.h D be/src/experiments/bit-stream-utils.8byte.h D be/src/experiments/bit-stream-utils.8byte.inline.h M be/src/util/bit-packing.h M be/src/util/bit-packing.inline.h M be/src/util/bit-stream-utils.h M be/src/util/bit-stream-utils.inline.h M be/src/util/dict-encoding.h M be/src/util/dict-test.cc M be/src/util/parquet-reader.cc M be/src/util/rle-encoding.h M be/src/util/rle-test.cc M testdata/data/README A testdata/data/alltypes_agg_bitpacked_def_levels.parquet A testdata/workloads/functional-query/queries/QueryTest/parquet-def-levels.test M tests/query_test/test_scanners.py 20 files changed, 1,251 insertions(+), 966 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/67/8267/14 -- To view, visit http://gerrit.cloudera.org:8080/8267 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I35de0cf80c86f501c4a39270afc8fb8111552ac6 Gerrit-Change-Number: 8267 Gerrit-PatchSet: 14 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-4177,IMPALA-6039: batched bit reading and rle decoding
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/8267 ) Change subject: IMPALA-4177,IMPALA-6039: batched bit reading and rle decoding .. Patch Set 12: (1 comment) http://gerrit.cloudera.org:8080/#/c/8267/12/be/src/exec/parquet-column-readers.h File be/src/exec/parquet-column-readers.h: http://gerrit.cloudera.org:8080/#/c/8267/12/be/src/exec/parquet-column-readers.h@96 PS12, Line 96: for different encodings. > what do we mean by that? Isn't this just for RLE? Done -- To view, visit http://gerrit.cloudera.org:8080/8267 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I35de0cf80c86f501c4a39270afc8fb8111552ac6 Gerrit-Change-Number: 8267 Gerrit-PatchSet: 12 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 07 Nov 2017 18:57:37 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6121: remove I/O mgr request context cache
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/8408 ) Change subject: IMPALA-6121: remove I/O mgr request context cache .. Patch Set 7: (2 comments) http://gerrit.cloudera.org:8080/#/c/8408/7/be/src/runtime/disk-io-mgr-reader-context.h File be/src/runtime/disk-io-mgr-reader-context.h: PS7: I assume this was a wholesale move of the class without modifications, except for the new methods defined in the .cc file? http://gerrit.cloudera.org:8080/#/c/8408/7/be/src/runtime/disk-io-mgr.h File be/src/runtime/disk-io-mgr.h: http://gerrit.cloudera.org:8080/#/c/8408/7/be/src/runtime/disk-io-mgr.h@662 PS7, Line 662: void UnregisterContext(DiskIoRequestContext* context); Given that the context cannot be used after this call, should we move the unique_ptr into this call and delete it then? -- To view, visit http://gerrit.cloudera.org:8080/8408 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I91414eceaa4938fccd74686fe6bebede6ef36108 Gerrit-Change-Number: 8408 Gerrit-PatchSet: 7 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 07 Nov 2017 18:56:26 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4177,IMPALA-6039: batched bit reading and rle decoding
Hello Lars Volker, Dan Hecht, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8267 to look at the new patch set (#13). Change subject: IMPALA-4177,IMPALA-6039: batched bit reading and rle decoding .. IMPALA-4177,IMPALA-6039: batched bit reading and rle decoding Switch the decoders to using more batch-oriented interfaces. As an intermediate step this doesn't make the interfaces of LevelDecoder or DictDecoder batch-oriented, only the lower-level utility classes. The next step would be to change those interfaces to be batch-oriented and make according optimisations in parquet. This could deliver much larger perf improvements than the current patch. The high-level changes are. * BitReader -> BatchedBitReader, which is built to unpack runs of 32 bit-packed values efficiently. * RleDecoder -> RleBatchDecoder, which exposes the repeated and literal runs to the caller and uses BatchedBitReader to unpack literal runs efficiently. * Dict decoding uses RleBatchDecoder to decode repeated runs efficiently and uses the BitPacking utilities to unpack and encode in a single step. Also removes an older benchmark that isn't too interesting (since the batch-oriented approach to encoding and decoding is so much faster than the value-by-value approach). Testing: * Ran core tests. * Updated unit tests to exercise new code. * Added test coverage for the deprecated bit-packed level encoding to that it still works (there was no coverage previously). Perf: Single-node benchmarks showed a few % performance gain. 16 node cluster benchmarks only showed a gain for TPC-H nested. Change-Id: I35de0cf80c86f501c4a39270afc8fb8111552ac6 --- M be/src/benchmarks/CMakeLists.txt M be/src/benchmarks/bit-packing-benchmark.cc D be/src/benchmarks/rle-benchmark.cc M be/src/exec/parquet-column-readers.cc M be/src/exec/parquet-column-readers.h D be/src/experiments/bit-stream-utils.8byte.h D be/src/experiments/bit-stream-utils.8byte.inline.h M be/src/util/bit-packing.h M be/src/util/bit-packing.inline.h M be/src/util/bit-stream-utils.h M be/src/util/bit-stream-utils.inline.h M be/src/util/dict-encoding.h M be/src/util/dict-test.cc M be/src/util/parquet-reader.cc M be/src/util/rle-encoding.h M be/src/util/rle-test.cc M testdata/data/README A testdata/data/alltypes_agg_bitpacked_def_levels.parquet A testdata/workloads/functional-query/queries/QueryTest/parquet-def-levels.test M tests/query_test/test_scanners.py 20 files changed, 1,252 insertions(+), 966 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/67/8267/13 -- To view, visit http://gerrit.cloudera.org:8080/8267 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I35de0cf80c86f501c4a39270afc8fb8111552ac6 Gerrit-Change-Number: 8267 Gerrit-PatchSet: 13 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-6134: Update code base to use impala::ConditionVariable
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/8428 ) Change subject: IMPALA-6134: Update code base to use impala::ConditionVariable .. Patch Set 4: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1441/ -- To view, visit http://gerrit.cloudera.org:8080/8428 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I3085c6dcb42350b61244df6e7f091a1e7db356c9 Gerrit-Change-Number: 8428 Gerrit-PatchSet: 4 Gerrit-Owner: Zoltan Borok-NagyGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Tue, 07 Nov 2017 18:49:37 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6134: Update code base to use impala::ConditionVariable
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/8428 ) Change subject: IMPALA-6134: Update code base to use impala::ConditionVariable .. Patch Set 4: Code-Review+2 Thanks Zoltan - I really appreciate the cleanup. -- To view, visit http://gerrit.cloudera.org:8080/8428 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I3085c6dcb42350b61244df6e7f091a1e7db356c9 Gerrit-Change-Number: 8428 Gerrit-PatchSet: 4 Gerrit-Owner: Zoltan Borok-NagyGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Tue, 07 Nov 2017 18:49:26 + Gerrit-HasComments: No
[native-toolchain-CR] Bump Kudu version to 1520b39
Thomas Tauber-Marshall has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/8491 ) Change subject: Bump Kudu version to 1520b39 .. Bump Kudu version to 1520b39 Change-Id: Ib5e0dd9db01972d518e4944cf2a476f1d9e7cf08 --- M buildall.sh 1 file changed, 1 insertion(+), 1 deletion(-) Approvals: Tim Armstrong: Looks good to me, approved Thomas Tauber-Marshall: Verified -- To view, visit http://gerrit.cloudera.org:8080/8491 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: native-toolchain Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: Ib5e0dd9db01972d518e4944cf2a476f1d9e7cf08 Gerrit-Change-Number: 8491 Gerrit-PatchSet: 1 Gerrit-Owner: Thomas Tauber-MarshallGerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong
[native-toolchain-CR] Bump Kudu version to 1520b39
Thomas Tauber-Marshall has posted comments on this change. ( http://gerrit.cloudera.org:8080/8491 ) Change subject: Bump Kudu version to 1520b39 .. Patch Set 1: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/8491 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: native-toolchain Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib5e0dd9db01972d518e4944cf2a476f1d9e7cf08 Gerrit-Change-Number: 8491 Gerrit-PatchSet: 1 Gerrit-Owner: Thomas Tauber-MarshallGerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 07 Nov 2017 18:38:30 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-2248: Make idle session timeout a query option
Gabor Kaszab has posted comments on this change. ( http://gerrit.cloudera.org:8080/8490 ) Change subject: IMPALA-2248: Make idle_session_timeout a query option .. Patch Set 3: (2 comments) http://gerrit.cloudera.org:8080/#/c/8490/3/be/src/service/client-request-state.cc File be/src/service/client-request-state.cc: http://gerrit.cloudera.org:8080/#/c/8490/3/be/src/service/client-request-state.cc@205 PS3, Line 205: int32_t requested_timeout = atoi(value.c_str()); Could you extract the body of the 'if' to a new function in order to keep this part more verbose? http://gerrit.cloudera.org:8080/#/c/8490/3/fe/src/test/java/org/apache/impala/util/Metrics.java File fe/src/test/java/org/apache/impala/util/Metrics.java: http://gerrit.cloudera.org:8080/#/c/8490/3/fe/src/test/java/org/apache/impala/util/Metrics.java@26 PS3, Line 26: public class Metrics { Could you please write a short comment to describe the purpose of this class? -- To view, visit http://gerrit.cloudera.org:8080/8490 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I32e2775f80da387b0df4195fe2c5435b3f8e585e Gerrit-Change-Number: 8490 Gerrit-PatchSet: 3 Gerrit-Owner: Zoltan Borok-NagyGerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Laszlo Gaal Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 07 Nov 2017 18:36:39 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-1575: part 2: yield admission control resources
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/8323 ) Change subject: IMPALA-1575: part 2: yield admission control resources .. Patch Set 8: > It's still technically accurate Yeah, I guess that's true. -- To view, visit http://gerrit.cloudera.org:8080/8323 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I80279eb2bda740d7f61420f52db3bfa42a6a51ac Gerrit-Change-Number: 8323 Gerrit-PatchSet: 8 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: anujphadke Gerrit-Comment-Date: Tue, 07 Nov 2017 18:34:06 + Gerrit-HasComments: No
[native-toolchain-CR] Bump Kudu version to 1520b39
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/8491 ) Change subject: Bump Kudu version to 1520b39 .. Patch Set 1: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/8491 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: native-toolchain Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib5e0dd9db01972d518e4944cf2a476f1d9e7cf08 Gerrit-Change-Number: 8491 Gerrit-PatchSet: 1 Gerrit-Owner: Thomas Tauber-MarshallGerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 07 Nov 2017 18:33:04 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-1575: part 2: yield admission control resources
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/8323 ) Change subject: IMPALA-1575: part 2: yield admission control resources .. Patch Set 8: It's still technically accurate in that the queries consume some amount of memory, especially if the result cache is used. Could we tweak the wording somewhat, e.g. "To free any remaining resources they are using..." -- To view, visit http://gerrit.cloudera.org:8080/8323 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I80279eb2bda740d7f61420f52db3bfa42a6a51ac Gerrit-Change-Number: 8323 Gerrit-PatchSet: 8 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: anujphadke Gerrit-Comment-Date: Tue, 07 Nov 2017 18:32:18 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6151: add query-level fragment/backend counters
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/8461 ) Change subject: IMPALA-6151: add query-level fragment/backend counters .. Patch Set 6: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1440/ -- To view, visit http://gerrit.cloudera.org:8080/8461 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I3df350414733e98d1ec28adc1c98f45bb0c4e3e9 Gerrit-Change-Number: 8461 Gerrit-PatchSet: 6 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 07 Nov 2017 18:24:40 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6151: add query-level fragment/backend counters
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/8461 ) Change subject: IMPALA-6151: add query-level fragment/backend counters .. Patch Set 6: Code-Review+2 Carry +2 -- To view, visit http://gerrit.cloudera.org:8080/8461 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I3df350414733e98d1ec28adc1c98f45bb0c4e3e9 Gerrit-Change-Number: 8461 Gerrit-PatchSet: 6 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 07 Nov 2017 18:24:19 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6151: add query-level fragment/backend counters
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/8461 ) Change subject: IMPALA-6151: add query-level fragment/backend counters .. Patch Set 5: (1 comment) http://gerrit.cloudera.org:8080/#/c/8461/5/be/src/service/impala-server.cc File be/src/service/impala-server.cc: http://gerrit.cloudera.org:8080/#/c/8461/5/be/src/service/impala-server.cc@953 PS5, Line 953: ImpaladMetrics::NUM_QUERIES_REGISTERED->Increment(1L); > On the /queries webpage, I think "num_in_flight_queries" is this same numbe Oh, I see, I didn't look at the queries json. The queries page seems internally inconsistent - num_in_flight_queries includes "queries in flight" and "queries waiting to be closed". But yeah, I deliberately avoided using the term "in flight" because it's unclear whether it should include queries that are waiting to be closed. I don't like inventing a new term but this does seem less ambiguous going forward. It would be nice to re-evaluate all the names but probably best to do that all at once instead of piecemeal, since it will require considering impact on other tools, like you mentioned. -- To view, visit http://gerrit.cloudera.org:8080/8461 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I3df350414733e98d1ec28adc1c98f45bb0c4e3e9 Gerrit-Change-Number: 8461 Gerrit-PatchSet: 5 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 07 Nov 2017 18:24:09 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4704: Turns on client connections when local catalog initialized.
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/8202 ) Change subject: IMPALA-4704: Turns on client connections when local catalog initialized. .. Patch Set 20: (2 comments) http://gerrit.cloudera.org:8080/#/c/8202/19/be/src/service/impala-server.h File be/src/service/impala-server.h: http://gerrit.cloudera.org:8080/#/c/8202/19/be/src/service/impala-server.h@89 PS19, Line 89: must call the following : /// methods: > good question-- I've kept it the same as before so not sure what the reason That DCHECK: DCHECK(exec_env.process_mem_tracker() != nullptr) << "ExecEnv::StartServices() must be called before starting RPC services"; looks stale since the process_mem_tracker() is set in exec_env_.Init() which is called earlier in main (not to mention StartServices() doesn't exist anymore). I also don't see what the specific dependency is that this is trying to point out (this DCHECK should have been closer to the code that requires this to be true). If you want to keep the dcheck, I see no reason it couldn't be moved before both Init and Start. Then we could combine them, since there doesn't appear to be any logic behind the separation of those two. But you can defer cleaning this up if you prefer (though the dcheck wording should be updated at least). http://gerrit.cloudera.org:8080/#/c/8202/20/be/src/service/impala-server.cc File be/src/service/impala-server.cc: http://gerrit.cloudera.org:8080/#/c/8202/20/be/src/service/impala-server.cc@1936 PS20, Line 1936: I think a quick comment is worth it to highlight the dependency between these two lines: // Subscribe to the catalog topic and then wait for the initial catalog update. (or whatever is accurate) -- To view, visit http://gerrit.cloudera.org:8080/8202 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I52b881cba18a7e4533e21a78751c2e35c3d4c8a6 Gerrit-Change-Number: 8202 Gerrit-PatchSet: 20 Gerrit-Owner: Vuk ErcegovacGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Balazs Jeszenszky Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Juan Yu Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Tue, 07 Nov 2017 18:04:28 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6151: add query-level fragment/backend counters
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/8461 ) Change subject: IMPALA-6151: add query-level fragment/backend counters .. Patch Set 5: Code-Review+2 (1 comment) http://gerrit.cloudera.org:8080/#/c/8461/5/be/src/service/impala-server.cc File be/src/service/impala-server.cc: http://gerrit.cloudera.org:8080/#/c/8461/5/be/src/service/impala-server.cc@953 PS5, Line 953: ImpaladMetrics::NUM_QUERIES_REGISTERED->Increment(1L); On the /queries webpage, I think "num_in_flight_queries" is this same number (the count of client_request_state_map_ entries). Should we make the names consistent? Probably "registered" is a better name than "in-flight" (since internally, we use in-flight to mean something slightly different and in-flight seems more ambiguous). But I'm not sure if we can change the webserver names without breaking something. -- To view, visit http://gerrit.cloudera.org:8080/8461 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I3df350414733e98d1ec28adc1c98f45bb0c4e3e9 Gerrit-Change-Number: 8461 Gerrit-PatchSet: 5 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 07 Nov 2017 17:32:27 + Gerrit-HasComments: Yes
[native-toolchain-CR] Bump Kudu version to 1520b39
Thomas Tauber-Marshall has posted comments on this change. ( http://gerrit.cloudera.org:8080/8491 ) Change subject: Bump Kudu version to 1520b39 .. Patch Set 1: http://unittest.jenkins.cloudera.com/job/verify-impala-toolchain-package-build/482/ -- To view, visit http://gerrit.cloudera.org:8080/8491 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: native-toolchain Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib5e0dd9db01972d518e4944cf2a476f1d9e7cf08 Gerrit-Change-Number: 8491 Gerrit-PatchSet: 1 Gerrit-Owner: Thomas Tauber-MarshallGerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Comment-Date: Tue, 07 Nov 2017 17:32:26 + Gerrit-HasComments: No
[native-toolchain-CR] Bump Kudu version to 1520b39
Thomas Tauber-Marshall has uploaded this change for review. ( http://gerrit.cloudera.org:8080/8491 Change subject: Bump Kudu version to 1520b39 .. Bump Kudu version to 1520b39 Change-Id: Ib5e0dd9db01972d518e4944cf2a476f1d9e7cf08 --- M buildall.sh 1 file changed, 1 insertion(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/native-toolchain refs/changes/91/8491/1 -- To view, visit http://gerrit.cloudera.org:8080/8491 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: native-toolchain Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: Ib5e0dd9db01972d518e4944cf2a476f1d9e7cf08 Gerrit-Change-Number: 8491 Gerrit-PatchSet: 1 Gerrit-Owner: Thomas Tauber-Marshall
[Impala-ASF-CR] IMPALA-6148: Specifying thirdparty deps as URLs
David Knupp has posted comments on this change. ( http://gerrit.cloudera.org:8080/8456 ) Change subject: IMPALA-6148: Specifying thirdparty deps as URLs .. Patch Set 3: (1 comment) http://gerrit.cloudera.org:8080/#/c/8456/2/bin/bootstrap_toolchain.py File bin/bootstrap_toolchain.py: http://gerrit.cloudera.org:8080/#/c/8456/2/bin/bootstrap_toolchain.py@112 PS2, Line 112: return v > Why not cache v instead? Also, would it be appropriate to cache the value as an attribute of Package() instead? -- To view, visit http://gerrit.cloudera.org:8080/8456 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I4628d86022d4bd8b762313f7056d76416a58b422 Gerrit-Change-Number: 8456 Gerrit-PatchSet: 3 Gerrit-Owner: Philip ZeyligerGerrit-Reviewer: David Knupp Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Zach Amsden Gerrit-Comment-Date: Tue, 07 Nov 2017 17:29:57 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-1575: part 2: yield admission control resources
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/8323 ) Change subject: IMPALA-1575: part 2: yield admission control resources .. Patch Set 8: Maybe this tooltip should be updated? document->AddMember("waiting-tooltip", "These queries are no longer executing, either " "because they encountered an error or because they have returned all of their " "results, but they are still active so that their results can be inspected. To " "free the resources they are using, they must be closed.", -- To view, visit http://gerrit.cloudera.org:8080/8323 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I80279eb2bda740d7f61420f52db3bfa42a6a51ac Gerrit-Change-Number: 8323 Gerrit-PatchSet: 8 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: anujphadke Gerrit-Comment-Date: Tue, 07 Nov 2017 17:28:10 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5473: [DOCS] Document TLS min version & cipher options
Sailesh Mukil has posted comments on this change. ( http://gerrit.cloudera.org:8080/8401 ) Change subject: IMPALA-5473: [DOCS] Document TLS min version & cipher options .. Patch Set 3: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/8401 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia1705262f8c01e38c616541d1c48f5d0cad5498e Gerrit-Change-Number: 8401 Gerrit-PatchSet: 3 Gerrit-Owner: John RussellGerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: John Russell Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Tue, 07 Nov 2017 17:05:28 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-2248: Make idle session timeout a query option
Thomas Tauber-Marshall has posted comments on this change. ( http://gerrit.cloudera.org:8080/8490 ) Change subject: IMPALA-2248: Make idle_session_timeout a query option .. Patch Set 3: (4 comments) http://gerrit.cloudera.org:8080/#/c/8490/3/be/src/service/client-request-state.cc File be/src/service/client-request-state.cc: http://gerrit.cloudera.org:8080/#/c/8490/3/be/src/service/client-request-state.cc@204 PS3, Line 204: if (iequals(key, "idle_session_timeout")) { Its unfortunate this is special cased here. Could you add a comment explaining why that's necessary. http://gerrit.cloudera.org:8080/#/c/8490/3/be/src/service/client-request-state.cc@213 PS3, Line 213: key, value, : _->set_query_options, single line http://gerrit.cloudera.org:8080/#/c/8490/3/be/src/service/impala-server.cc File be/src/service/impala-server.cc: http://gerrit.cloudera.org:8080/#/c/8490/3/be/src/service/impala-server.cc@189 PS3, Line 189: DEFINE_int32(idle_session_timeout, 0, "The time, in seconds, that a session may be idle" Note how this interacts with the query option. http://gerrit.cloudera.org:8080/#/c/8490/3/common/thrift/ImpalaService.thrift File common/thrift/ImpalaService.thrift: http://gerrit.cloudera.org:8080/#/c/8490/3/common/thrift/ImpalaService.thrift@292 PS3, Line 292: // The time, in seconds, that a session may be idle for before it is closed (and all Note how this interacts with the flag. -- To view, visit http://gerrit.cloudera.org:8080/8490 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I32e2775f80da387b0df4195fe2c5435b3f8e585e Gerrit-Change-Number: 8490 Gerrit-PatchSet: 3 Gerrit-Owner: Zoltan Borok-NagyGerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Laszlo Gaal Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 07 Nov 2017 17:03:55 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6148: Specifying thirdparty deps as URLs
Zach Amsden has posted comments on this change. ( http://gerrit.cloudera.org:8080/8456 ) Change subject: IMPALA-6148: Specifying thirdparty deps as URLs .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/8456/2/bin/bootstrap_toolchain.py File bin/bootstrap_toolchain.py: http://gerrit.cloudera.org:8080/#/c/8456/2/bin/bootstrap_toolchain.py@112 PS2, Line 112: return v Why not cache v instead? -- To view, visit http://gerrit.cloudera.org:8080/8456 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I4628d86022d4bd8b762313f7056d76416a58b422 Gerrit-Change-Number: 8456 Gerrit-PatchSet: 2 Gerrit-Owner: Philip ZeyligerGerrit-Reviewer: David Knupp Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Zach Amsden Gerrit-Comment-Date: Tue, 07 Nov 2017 16:45:30 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5473: [DOCS] Document TLS min version & cipher options
Michael Brown has posted comments on this change. ( http://gerrit.cloudera.org:8080/8401 ) Change subject: IMPALA-5473: [DOCS] Document TLS min version & cipher options .. Patch Set 3: Code-Review+1 Sailesh, if you are happy with this, want to +2 it? Thanks! -- To view, visit http://gerrit.cloudera.org:8080/8401 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia1705262f8c01e38c616541d1c48f5d0cad5498e Gerrit-Change-Number: 8401 Gerrit-PatchSet: 3 Gerrit-Owner: John RussellGerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: John Russell Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Tue, 07 Nov 2017 16:34:12 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-2181: Add query option levels for display
Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/8447 ) Change subject: IMPALA-2181: Add query option levels for display .. Patch Set 7: (3 comments) http://gerrit.cloudera.org:8080/#/c/8447/7/be/src/service/impala-server.h File be/src/service/impala-server.h: http://gerrit.cloudera.org:8080/#/c/8447/7/be/src/service/impala-server.h@516 PS7, Line 516: /// Given a query option name this function gets the option level for it to use when : /// displaying from Impala shell and adds the level to the ConfigVariable parameter : /// in numeric format. For values see TQueryOptionLevel enum. nit: For me, this comment is too complex and tells too much implementation details. How about something like this: This function sets the option level for parameter 'option' based on the mapping stored in 'query_option_levels_'. The option level is used by the Impala shell when it displays the options. http://gerrit.cloudera.org:8080/#/c/8447/7/be/src/service/impala-server.cc File be/src/service/impala-server.cc: http://gerrit.cloudera.org:8080/#/c/8447/7/be/src/service/impala-server.cc@1228 PS7, Line 1228: const string& option_key Why do we pass option_key here? It is already there in config, isn't it? http://gerrit.cloudera.org:8080/#/c/8447/7/shell/impala_shell.py File shell/impala_shell.py: http://gerrit.cloudera.org:8080/#/c/8447/7/shell/impala_shell.py@620 PS7, Line 620: self.imp_client.default_query_options, : self.set_query_options, : self.imp_client.query_option_levels These members are already available in _print_with_set through self. And as far as I can see the same members are passed on every callsites. I think it would be easier to read if _print_with_set didn't take these params. -- To view, visit http://gerrit.cloudera.org:8080/8447 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I75720d0d454527e1a0ed19bb43cf9e4f018ce1d1 Gerrit-Change-Number: 8447 Gerrit-PatchSet: 7 Gerrit-Owner: Gabor KaszabGerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Laszlo Gaal Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Tue, 07 Nov 2017 16:25:06 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6105: Clarify argument order in single node perf run
Jim Apple has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/8470 ) Change subject: IMPALA-6105: Clarify argument order in single_node_perf_run .. IMPALA-6105: Clarify argument order in single_node_perf_run single_node_perf_run.py uses git_hash_A vs. git_hash_B, distinguish them by their position in the command-line arguments. single_node_perf_run.py calls report_benchmark_results.py, which uses the "reference vs. input", distinguished by their command-line flags. The output of report_benchmark_results.py uses "{empty string} vs Base". In the long run, I think it would be better to fix all three to use the same terminology, but this comment hopefully adds clarity. Change-Id: Ib236ce7e83dc193ef1382f630ce58759a639 Reviewed-on: http://gerrit.cloudera.org:8080/8470 Tested-by: Impala Public Jenkins Reviewed-by: Jim Apple--- M bin/single_node_perf_run.py 1 file changed, 22 insertions(+), 2 deletions(-) Approvals: Impala Public Jenkins: Verified Jim Apple: Looks good to me, approved -- To view, visit http://gerrit.cloudera.org:8080/8470 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: Ib236ce7e83dc193ef1382f630ce58759a639 Gerrit-Change-Number: 8470 Gerrit-PatchSet: 3 Gerrit-Owner: Jim Apple Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-6105: Clarify argument order in single node perf run
Jim Apple has posted comments on this change. ( http://gerrit.cloudera.org:8080/8470 ) Change subject: IMPALA-6105: Clarify argument order in single_node_perf_run .. Patch Set 2: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/8470 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib236ce7e83dc193ef1382f630ce58759a639 Gerrit-Change-Number: 8470 Gerrit-PatchSet: 2 Gerrit-Owner: Jim AppleGerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 07 Nov 2017 16:16:07 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4856: Port data stream service to KRPC
Dan Hecht has posted comments on this change. ( http://gerrit.cloudera.org:8080/8023 ) Change subject: IMPALA-4856: Port data stream service to KRPC .. Patch Set 8: (1 comment) http://gerrit.cloudera.org:8080/#/c/8023/8/be/src/runtime/krpc-data-stream-recvr.cc File be/src/runtime/krpc-data-stream-recvr.cc: http://gerrit.cloudera.org:8080/#/c/8023/8/be/src/runtime/krpc-data-stream-recvr.cc@211 PS8, Line 211: deferred_rpcs_.empty() && batch_queue_.empty() > why not write this condition as: Actually, I think the loop exiting condition is not quite right, which led to this confusin conditional. The loop exiting condition for "we're done" should check that there are no more senders, and that there's nothing left to drain from the deferred_rpcs_ queue and that there's no pending insert into batch_queue_. So, the third wait loop condition should be something like: (num_remaining_senders > 0 || !deferred_rpcs_.empty() || num_pending_enqueue_ > 0) and then this if-stmt conditional can just be: if (batch_queue_.empty()) and then the DCHECK can be the negation of that third wait loop conditional: // Wait loop is exited with an empty batch_queue_ only when there will be no more batches. DCHECK(num_remaining_senders == 0 && deferred_rpcs_.empty() && num_pending_enqueue_ == 0); And then you can get rid of the outer loop. That outer loop should be removed since it's effectively a busy wait (and I think we could get into a busy wait state in the previous patchsets). -- To view, visit http://gerrit.cloudera.org:8080/8023 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic0b8c1e50678da66ab1547d16530f88b323ed8c1 Gerrit-Change-Number: 8023 Gerrit-PatchSet: 8 Gerrit-Owner: Michael HoGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Tue, 07 Nov 2017 15:58:21 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-2248: Make idle session timeout a query option
Hello Michael Ho, Laszlo Gaal, Gabor Kaszab, Attila Jeges, Tim Armstrong, Csaba Ringhofer, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8490 to look at the new patch set (#3). Change subject: IMPALA-2248: Make idle_session_timeout a query option .. IMPALA-2248: Make idle_session_timeout a query option This commit makes idle_session_timeout a query option. idle_session_timeout currently can be set as a command line option, which will be the default timeout for sessions. HS2 sessions can override it with a smaller value by setting it in the configuration overlay of HS2 OpenSession(). However, we can't override idle_session_timeout for JDBC/ODBC connections, because we cannot put this in the connection string. This commit is a workaround for this problem, it allows JDBC/ODBC connections to set the session timeout as a query option with the SET statement. I created an automated test case in JdbcTest.java based on test_hs2.py::test_concurrent_session_mixed_idle_timeout. Change-Id: I32e2775f80da387b0df4195fe2c5435b3f8e585e --- M be/src/service/client-request-state.cc M be/src/service/impala-hs2-server.cc M be/src/service/impala-server.cc M be/src/service/impala-server.h M be/src/service/query-options.cc M be/src/service/query-options.h M common/thrift/ImpalaInternalService.thrift M common/thrift/ImpalaService.thrift M fe/src/test/java/org/apache/impala/service/JdbcTest.java A fe/src/test/java/org/apache/impala/util/Metrics.java 10 files changed, 236 insertions(+), 23 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/90/8490/3 -- To view, visit http://gerrit.cloudera.org:8080/8490 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I32e2775f80da387b0df4195fe2c5435b3f8e585e Gerrit-Change-Number: 8490 Gerrit-PatchSet: 3 Gerrit-Owner: Zoltan Borok-NagyGerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Laszlo Gaal Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-4252: Min-max runtime filters for Kudu
Thomas Tauber-Marshall has posted comments on this change. ( http://gerrit.cloudera.org:8080/7793 ) Change subject: IMPALA-4252: Min-max runtime filters for Kudu .. Patch Set 11: (8 comments) http://gerrit.cloudera.org:8080/#/c/7793/9//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/7793/9//COMMIT_MSG@27 PS9, Line 27: > Thanks for explaining. https://docs.google.com/document/d/1G-SPZelateebNxzVb67urEVtjc5Itw-B_jjfS85bSCE/edit?usp=sharing http://gerrit.cloudera.org:8080/#/c/7793/11/be/src/exec/filter-context.h File be/src/exec/filter-context.h: http://gerrit.cloudera.org:8080/#/c/7793/11/be/src/exec/filter-context.h@118 PS11, Line 118: Materialize filter values. > what does that mean? Done http://gerrit.cloudera.org:8080/#/c/7793/11/be/src/runtime/runtime-filter-bank.h File be/src/runtime/runtime-filter-bank.h: http://gerrit.cloudera.org:8080/#/c/7793/11/be/src/runtime/runtime-filter-bank.h@78 PS11, Line 78: /// may both be NULL, representing a filter that allows all rows to pass. > is it the case that at most one of bloom_filter and min_max_filter should b Done http://gerrit.cloudera.org:8080/#/c/7793/11/be/src/util/min-max-filter.h File be/src/util/min-max-filter.h: http://gerrit.cloudera.org:8080/#/c/7793/11/be/src/util/min-max-filter.h@62 PS11, Line 62: Materialize filter values > what does that mean? Done http://gerrit.cloudera.org:8080/#/c/7793/9/common/thrift/PlanNodes.thrift File common/thrift/PlanNodes.thrift: http://gerrit.cloudera.org:8080/#/c/7793/9/common/thrift/PlanNodes.thrift@103 PS9, Line 103: 12: optional string kudu_col_name > case sensitive? Done http://gerrit.cloudera.org:8080/#/c/7793/11/fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java File fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java: http://gerrit.cloudera.org:8080/#/c/7793/11/fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java@359 PS11, Line 359: public Operator getJoinOp() { return exprCmpOp_; } > getExprCmpOp() Done http://gerrit.cloudera.org:8080/#/c/7793/11/fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java@602 PS11, Line 602: if (!(targetExpr instanceof SlotRef) > I think only explicit casts are problematic. Implicit casts should be ok, o Right, we should be able to support integer implicit casts. The complication is that if, say, the calculated max is outside of the range for the type of the targeted column, we can't just pass that value into Kudu as it will complain. In that case, we would need to convert the max we send to be the max for the type of the targeted column. I've added some code to the BE to do this conversion. http://gerrit.cloudera.org:8080/#/c/7793/11/testdata/workloads/functional-query/queries/QueryTest/runtime_filters.test File testdata/workloads/functional-query/queries/QueryTest/runtime_filters.test: http://gerrit.cloudera.org:8080/#/c/7793/11/testdata/workloads/functional-query/queries/QueryTest/runtime_filters.test@144 PS11, Line 144: on a.month = cast(b.month + 1 as int); > Was this your change? Why the change? This was necessary if we didn't support implicit casts on the target. Removed now. -- To view, visit http://gerrit.cloudera.org:8080/7793 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I02bad890f5b5f78388a3041bf38f89369b5e2f1c Gerrit-Change-Number: 7793 Gerrit-PatchSet: 11 Gerrit-Owner: Thomas Tauber-MarshallGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Anonymous Coward #345 Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Tue, 07 Nov 2017 15:06:55 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4252: Min-max runtime filters for Kudu
Hello Michael Ho, Lars Volker, Matthew Jacobs, Anonymous Coward #345, Tim Armstrong, Todd Lipcon, Mostafa Mokhtar, Alex Behm, Dan Hecht, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/7793 to look at the new patch set (#12). Change subject: IMPALA-4252: Min-max runtime filters for Kudu .. IMPALA-4252: Min-max runtime filters for Kudu This patch implements min-max filters for runtime filters. Each runtime filter generates a bloom filter or a min-max filter, depending on if it has HDFS or Kudu targets, respectively. In RuntimeFilterGenerator in the planner, each hash join node generates a bloom and min-max filter for each equi-join predicate, but only those filters that end up being assigned to a target make it into the final plan. Min-max filters are only assigned to Kudu scans if the target expr is a column, as Kudu doesn't support bounds on general exprs, and only if the join op is '=' and not 'is distinct from', as Kudu doesn't support returning NULLs if a bound is set. Min-max filters are inserted into by the PartitionedHashJoinBuilder. Codegen is used to eliminate branching on the type of filter. String min-max filters truncate their bounds at 1024 chars, so that the max amount of memory used by min-max filters is negligible. For now, min-max filters are only applied at the KuduScanner, which passes them into the Kudu client. Future work will address applying min-max filters at HDFS scan nodes and applying bloom filters at Kudu scan nodes. Functional Testing: - Added new planner tests and updated the old ones. (in old tests, a lot of runtime filters are renumbered as we always generate min-max filters even if they don't end up getting assigned and they take up some of the RF ids). - Updated existing runtime filter tests to work with Kudu. - Added e2e tests for min-max filter specific functionality. Perf Testing: - All tests run on Kudu stress cluster (10 nodes) and tpch_100_kudu, timings are averages of 3 runs. - Ran a contrived query with a filter that does not eliminate any rows (full self join of lineitem). The difference in running time was negligible - 24.46s with filters on, 24.15s with filters off for a ~1% slowdown. - Ran a contrived query with a filter that elimiates all rows (self join on lineitem with a join condition that never matches). The filters resulted in a significant speedup - 0.26s with filters on, 1.46s with filters off for a ~5.6x speedup. This query is added to targeted-perf. Change-Id: I02bad890f5b5f78388a3041bf38f89369b5e2f1c --- M be/src/codegen/gen_ir_descriptions.py M be/src/codegen/impala-ir.cc M be/src/exec/filter-context.cc M be/src/exec/filter-context.h M be/src/exec/hdfs-parquet-scanner-ir.cc M be/src/exec/hdfs-scan-node-base.cc M be/src/exec/kudu-scan-node-base.cc M be/src/exec/kudu-scan-node-mt.cc M be/src/exec/kudu-scan-node.cc M be/src/exec/kudu-scanner.cc M be/src/exec/kudu-scanner.h M be/src/exec/kudu-util.cc M be/src/exec/kudu-util.h M be/src/exec/partitioned-hash-join-builder-ir.cc M be/src/exec/partitioned-hash-join-builder.cc M be/src/exec/scan-node.cc M be/src/runtime/coordinator-filter-state.h M be/src/runtime/coordinator.cc M be/src/runtime/fragment-instance-state.cc M be/src/runtime/fragment-instance-state.h M be/src/runtime/query-state.cc M be/src/runtime/query-state.h M be/src/runtime/runtime-filter-bank.cc M be/src/runtime/runtime-filter-bank.h M be/src/runtime/runtime-filter-ir.cc M be/src/runtime/runtime-filter.cc M be/src/runtime/runtime-filter.h M be/src/runtime/runtime-filter.inline.h M be/src/runtime/timestamp-value.h M be/src/service/impala-internal-service.cc M be/src/util/CMakeLists.txt A be/src/util/min-max-filter-ir.cc A be/src/util/min-max-filter-test.cc A be/src/util/min-max-filter.cc A be/src/util/min-max-filter.h M common/thrift/Data.thrift M common/thrift/ImpalaInternalService.thrift M common/thrift/ImpalaService.thrift M common/thrift/PlanNodes.thrift M fe/src/main/java/org/apache/impala/planner/HashJoinNode.java M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java M fe/src/main/java/org/apache/impala/planner/KuduScanNode.java M fe/src/main/java/org/apache/impala/planner/PlanNode.java M fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java M fe/src/test/java/org/apache/impala/planner/PlannerTest.java M testdata/workloads/functional-planner/queries/PlannerTest/aggregation.test M testdata/workloads/functional-planner/queries/PlannerTest/fk-pk-join-detection.test M testdata/workloads/functional-planner/queries/PlannerTest/implicit-joins.test M testdata/workloads/functional-planner/queries/PlannerTest/inline-view-limit.test M testdata/workloads/functional-planner/queries/PlannerTest/inline-view.test M testdata/workloads/functional-planner/queries/PlannerTest/join-order.test M testdata/workloads/functional-planner/queries/PlannerTest/joins.test M
[Impala-ASF-CR] IMPALA-2181: Add query option levels for display
Attila Jeges has posted comments on this change. ( http://gerrit.cloudera.org:8080/8447 ) Change subject: IMPALA-2181: Add query option levels for display .. Patch Set 7: (1 comment) http://gerrit.cloudera.org:8080/#/c/8447/7/shell/impala_shell.py File shell/impala_shell.py: http://gerrit.cloudera.org:8080/#/c/8447/7/shell/impala_shell.py@240 PS7, Line 240: if len(advanced_options) > 0 and print_mode == QueryOptionDisplayModes.ALL_OPTIONS: : print '\nAdvanced Query Options:' : self._print_option_group(advanced_options, set_options) : if len(deprecated_options) > 0 and print_mode == QueryOptionDisplayModes.ALL_OPTIONS: : print '\nDeprecated Query Options:' : self._print_option_group(deprecated_options, set_options) nit: this can be simplified as: if print_mode == QueryOptionDisplayModes.ALL_OPTIONS: if advanced_options: print '\nAdvanced Query Options:' self._print_option_group(advanced_options, set_options) if deprecated_options: print '\nDeprecated Query Options:' self._print_option_group(deprecated_options, set_options) -- To view, visit http://gerrit.cloudera.org:8080/8447 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I75720d0d454527e1a0ed19bb43cf9e4f018ce1d1 Gerrit-Change-Number: 8447 Gerrit-PatchSet: 7 Gerrit-Owner: Gabor KaszabGerrit-Reviewer: Attila Jeges Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Gabor Kaszab Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Laszlo Gaal Gerrit-Reviewer: Philip Zeyliger Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Tue, 07 Nov 2017 15:01:23 + Gerrit-HasComments: Yes