[Impala-ASF-CR] IMPALA-{4670,4672,4784}: Add RpcMgr and port Statestore services to KRPC
Henry Robinson has uploaded a new patch set (#7). Change subject: IMPALA-{4670,4672,4784}: Add RpcMgr and port Statestore services to KRPC .. IMPALA-{4670,4672,4784}: Add RpcMgr and port Statestore services to KRPC This patch adds the core abstractions necessary for using Kudu RPC for Impala's backend services. The StatestoreService and StatestoreSubscriberService are both ported to KRPC to demonstrate how the new RPC framework operates. The main new class is RpcMgr, which is responsible for both services run by a daemon and for obtaining clients to remote services. The new Rpc class helps clients build rpc invocation objects, and use them to call remote methods. Also done: * Utility code to convert from a Thrift structure to a Protobuf wrapper. Thrift RPCs do not need to be ported to Protobuf to work. * Added more build support for Protobuf generation. All library targets now depend on the Protobuf generation step, just as they do Thrift. * thrift-server-test is rewritten to use Beeswax as a test service, since the old StatestoreService has been removed. * Remove InProcessStatestore that was only used for statestore-test and mini-impala-cluster. As far as we know, mini-impala-cluster is unused by anyone. (IMPALA-4784) TESTING: * Impala's test suite passes. * statestore-test has been rewritten to use new statestore code. It includes all test cases from the now-removed Python test_statestore.py * rpc-mgr-test unit-tests the RpcMgr and Rpc classes. TODO: * Enable SSL and Kerberos support (already in Kudu's imported libraries) Change-Id: I8dbf12b9ecd71d26d239d31c19b487175194c766 --- M be/generated-sources/gen-cpp/CMakeLists.txt M be/src/catalog/CMakeLists.txt M be/src/catalog/catalog-server.cc M be/src/catalog/catalog-server.h M be/src/codegen/CMakeLists.txt M be/src/common/CMakeLists.txt M be/src/exec/CMakeLists.txt M be/src/exec/catalog-op-executor.cc M be/src/exec/hdfs-avro-scanner-test.cc M be/src/exec/kudu-util.h M be/src/exec/parquet-column-stats.h M be/src/experiments/CMakeLists.txt M be/src/exprs/CMakeLists.txt M be/src/rpc/CMakeLists.txt A be/src/rpc/common.proto A be/src/rpc/rpc-mgr-test.cc A be/src/rpc/rpc-mgr.cc A be/src/rpc/rpc-mgr.h A be/src/rpc/rpc-mgr.inline.h A be/src/rpc/rpc.h A be/src/rpc/rpc_test.proto M be/src/rpc/thrift-server-test.cc M be/src/rpc/thrift-util.h M be/src/runtime/CMakeLists.txt M be/src/runtime/buffered-tuple-stream-test.cc M be/src/runtime/bufferpool/CMakeLists.txt M be/src/runtime/exec-env.cc M be/src/runtime/exec-env.h M be/src/scheduling/CMakeLists.txt M be/src/service/CMakeLists.txt M be/src/service/query-exec-state.cc M be/src/statestore/CMakeLists.txt M be/src/statestore/statestore-subscriber.cc M be/src/statestore/statestore-subscriber.h M be/src/statestore/statestore-test.cc M be/src/statestore/statestore.cc M be/src/statestore/statestore.h A be/src/statestore/statestore.proto M be/src/statestore/statestored-main.cc M be/src/testutil/CMakeLists.txt M be/src/testutil/in-process-servers.cc M be/src/testutil/in-process-servers.h D be/src/testutil/mini-impala-cluster.cc M be/src/transport/CMakeLists.txt M be/src/udf/CMakeLists.txt M be/src/udf_samples/CMakeLists.txt M be/src/util/CMakeLists.txt M be/src/util/collection-metrics.h M be/src/util/counting-barrier.h M bin/start-impala-cluster.py M bin/start-impalad.sh M common/thrift/CMakeLists.txt M common/thrift/StatestoreService.thrift D tests/statestore/test_statestore.py 54 files changed, 1,744 insertions(+), 1,087 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/20/5720/7 -- To view, visit http://gerrit.cloudera.org:8080/5720 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I8dbf12b9ecd71d26d239d31c19b487175194c766 Gerrit-PatchSet: 7 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Henry RobinsonGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Juan Yu Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Taras Bobrovytsky
[Impala-ASF-CR] IMPALA-{4670,4672,4784}: Add RpcMgr and port Statestore services to KRPC
Henry Robinson has posted comments on this change. Change subject: IMPALA-{4670,4672,4784}: Add RpcMgr and port Statestore services to KRPC .. Patch Set 7: (1 comment) http://gerrit.cloudera.org:8080/#/c/5720/7/be/src/rpc/rpc.h File be/src/rpc/rpc.h: Line 55: template > p is only used in Execute(). why not add p as a template parameter to that I think if we do that, we'd have to list all the template parameters for Execute(); i.e. if it can't infer one of them we have to list all of them. -- To view, visit http://gerrit.cloudera.org:8080/5720 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I8dbf12b9ecd71d26d239d31c19b487175194c766 Gerrit-PatchSet: 7 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Henry RobinsonGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Juan Yu Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Taras Bobrovytsky Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-{4670,4672,4784}: Add RpcMgr and port Statestore services to KRPC
Henry Robinson has posted comments on this change. Change subject: IMPALA-{4670,4672,4784}: Add RpcMgr and port Statestore services to KRPC .. Patch Set 5: (3 comments) The next patch after this will be a rebase, but will contain no functional differences. This patch set only adds a comment. http://gerrit.cloudera.org:8080/#/c/5720/4/be/src/rpc/rpc-mgr-test.cc File be/src/rpc/rpc-mgr-test.cc: Line 166: .Execute(::Ping, request, ); > is there a todo somewhere? No - I think this is part of the larger synchronisation effort between the two sets of utility libraries. http://gerrit.cloudera.org:8080/#/c/5720/5/be/src/rpc/rpc.h File be/src/rpc/rpc.h: Line 49: static constexpr int RPC_DEFAULT_MAX_ATTEMPTS = 3; > move them into the class so they're naturally qualified? I really couldn't make that work well with a templated class. I'll leave a TODO. http://gerrit.cloudera.org:8080/#/c/5720/4/be/src/statestore/statestore-test.cc File be/src/statestore/statestore-test.cc: Line 379: subscribers_[0]->WaitForFailure(statestore_.get()); > ? Done -- To view, visit http://gerrit.cloudera.org:8080/5720 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I8dbf12b9ecd71d26d239d31c19b487175194c766 Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Henry RobinsonGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Juan Yu Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Taras Bobrovytsky Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-{4670,4672,4784}: Add RpcMgr and port Statestore services to KRPC
Henry Robinson has posted comments on this change. Change subject: IMPALA-{4670,4672,4784}: Add RpcMgr and port Statestore services to KRPC .. Patch Set 7: This patch is a rebase. The only outstanding issue is the performance of name lookups in the RPC. -- To view, visit http://gerrit.cloudera.org:8080/5720 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I8dbf12b9ecd71d26d239d31c19b487175194c766 Gerrit-PatchSet: 7 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Henry RobinsonGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Juan Yu Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Taras Bobrovytsky Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-{4670,4672,4784}: Add RpcMgr and port Statestore services to KRPC
Henry Robinson has uploaded a new patch set (#6). Change subject: IMPALA-{4670,4672,4784}: Add RpcMgr and port Statestore services to KRPC .. IMPALA-{4670,4672,4784}: Add RpcMgr and port Statestore services to KRPC This patch adds the core abstractions necessary for using Kudu RPC for Impala's backend services. The StatestoreService and StatestoreSubscriberService are both ported to KRPC to demonstrate how the new RPC framework operates. The main new class is RpcMgr, which is responsible for both services run by a daemon and for obtaining clients to remote services. The new Rpc class helps clients build rpc invocation objects, and use them to call remote methods. Also done: * Utility code to convert from a Thrift structure to a Protobuf wrapper. Thrift RPCs do not need to be ported to Protobuf to work. * Added more build support for Protobuf generation. All library targets now depend on the Protobuf generation step, just as they do Thrift. * thrift-server-test is rewritten to use Beeswax as a test service, since the old StatestoreService has been removed. * Remove InProcessStatestore that was only used for statestore-test and mini-impala-cluster. As far as we know, mini-impala-cluster is unused by anyone. (IMPALA-4784) TESTING: * Impala's test suite passes. * statestore-test has been rewritten to use new statestore code. It includes all test cases from the now-removed Python test_statestore.py * rpc-mgr-test unit-tests the RpcMgr and Rpc classes. TODO: * Enable SSL and Kerberos support (already in Kudu's imported libraries) Change-Id: I8dbf12b9ecd71d26d239d31c19b487175194c766 --- M be/generated-sources/gen-cpp/CMakeLists.txt M be/src/catalog/CMakeLists.txt M be/src/catalog/catalog-server.cc M be/src/catalog/catalog-server.h M be/src/codegen/CMakeLists.txt M be/src/common/CMakeLists.txt M be/src/exec/CMakeLists.txt M be/src/exec/catalog-op-executor.cc M be/src/exec/hdfs-avro-scanner-test.cc M be/src/exec/kudu-util.h M be/src/experiments/CMakeLists.txt M be/src/exprs/CMakeLists.txt M be/src/rpc/CMakeLists.txt A be/src/rpc/common.proto A be/src/rpc/rpc-mgr-test.cc A be/src/rpc/rpc-mgr.cc A be/src/rpc/rpc-mgr.h A be/src/rpc/rpc-mgr.inline.h A be/src/rpc/rpc.h A be/src/rpc/rpc_test.proto M be/src/rpc/thrift-server-test.cc M be/src/rpc/thrift-util.h M be/src/runtime/CMakeLists.txt M be/src/runtime/buffered-tuple-stream-test.cc M be/src/runtime/bufferpool/CMakeLists.txt M be/src/runtime/exec-env.cc M be/src/runtime/exec-env.h M be/src/scheduling/CMakeLists.txt M be/src/service/CMakeLists.txt M be/src/service/query-exec-state.cc M be/src/statestore/CMakeLists.txt M be/src/statestore/statestore-subscriber.cc M be/src/statestore/statestore-subscriber.h M be/src/statestore/statestore-test.cc M be/src/statestore/statestore.cc M be/src/statestore/statestore.h A be/src/statestore/statestore.proto M be/src/statestore/statestored-main.cc M be/src/testutil/CMakeLists.txt M be/src/testutil/in-process-servers.cc M be/src/testutil/in-process-servers.h D be/src/testutil/mini-impala-cluster.cc M be/src/transport/CMakeLists.txt M be/src/udf/CMakeLists.txt M be/src/udf_samples/CMakeLists.txt M be/src/util/CMakeLists.txt M be/src/util/collection-metrics.h M be/src/util/counting-barrier.h M bin/start-impala-cluster.py M bin/start-impalad.sh M common/thrift/CMakeLists.txt M common/thrift/StatestoreService.thrift D tests/statestore/test_statestore.py 53 files changed, 1,740 insertions(+), 1,083 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/20/5720/6 -- To view, visit http://gerrit.cloudera.org:8080/5720 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I8dbf12b9ecd71d26d239d31c19b487175194c766 Gerrit-PatchSet: 6 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Henry RobinsonGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Juan Yu Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Taras Bobrovytsky
[Impala-ASF-CR] IMPALA-{4670,4672,4784}: Add RpcMgr and port Statestore services to KRPC
Henry Robinson has posted comments on this change. Change subject: IMPALA-{4670,4672,4784}: Add RpcMgr and port Statestore services to KRPC .. Patch Set 7: (1 comment) http://gerrit.cloudera.org:8080/#/c/5720/4/be/src/statestore/statestore-test.cc File be/src/statestore/statestore-test.cc: Line 379: subscribers_[0]->WaitForFailure(statestore_.get()); > what's done? i was wondering why the line disappeared. Sorry, moving fast. This was a bug: if the statestore could detect the failure of the beartbeat before it got around to sending an update, the update would never show up. There's no reason for this line - I think it was a bug in the Python version of this test but it never got triggered because I the heartbeat frequency was left at its normal levels. In this test, I decrease the hb frequency significantly (so the test runs faster!), and that triggered this bug. -- To view, visit http://gerrit.cloudera.org:8080/5720 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I8dbf12b9ecd71d26d239d31c19b487175194c766 Gerrit-PatchSet: 7 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Henry RobinsonGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Juan Yu Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Taras Bobrovytsky Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4856: Port ImpalaInternalService to KRPC
Henry Robinson has uploaded a new change for review. http://gerrit.cloudera.org:8080/5888 Change subject: IMPALA-4856: Port ImpalaInternalService to KRPC .. IMPALA-4856: Port ImpalaInternalService to KRPC This patch ports the ImpalaInternalService to KRPC. * ImpalaInternalService is split into two KRPC services. The first, ImpalaInternalService, deals with control messages for plan fragment instance execution, cancellation and reporting. The second, DataStreamService, handles large-payload RPCs for transmitting runtime filters and row batches between hosts. The separation allows us to dedicate resources to each service, rather than have them compete for the same thread pool and queue space. * In the DataStreamService, all RPCs use 'native' protobuf. ImpalaInternalService RPCs remain wrappers of Thrift data structures. * This patch adds support for asynchronous RPCs to the RpcMgr and Rpc classes. Previously, Impala used fixed size thread pools + synchronous RPCs to achieve some parallelism for 'broadcast' RPCs like filter propagation, or a dedicated per-sender+receiver pair thread on the sender side in the DataStreamSender case. In this patch, the PublishFilter(), CancelPlanFragment() and TransmitData() RPCs are all sent asynchronously using KRPC's thread pools. * As a result, DataStreamSender no longer creates a thread-per-connection on the sender side. In this patch, the receiver-side still blocks if the receiver is unable to process a new row batch. A follow-on patch will change that to send notifications to the sender asynchronously, without blocking in the receiver threads. * A large portion of this patch is the replacement of TRowBatch with its Protobuf equivalent, RowBatchPb. The replacement is a literal port of the data structure, and row-batch-test, row-batch-list-test and row-batch-serialize-benchmark continue to execute without logic changes. * This patch also begins a clean-up of how ImpalaServer instances are created (by removing CreateImpalaServer), and clarifying the relationship between ExecEnv and ImpalaServer. ImpalaServer now follows the standard construct->Init()->Start()->Join() lifecycle that we use for other services. TESTING --- * New tests added to rpc-mgr-test. TO DO - * Re-enable throughput and latency measurements per data-stream sender when that information is exposed from KRPC (KUDU-1738). Change-Id: I95229290566a8ccffd80ed2d74c1c57cf1479238 --- M .clang-format M be/generated-sources/gen-cpp/CMakeLists.txt M be/src/benchmarks/bloom-filter-benchmark.cc M be/src/benchmarks/expr-benchmark.cc M be/src/benchmarks/row-batch-serialize-benchmark.cc M be/src/common/status.cc M be/src/common/status.h M be/src/rpc/CMakeLists.txt M be/src/rpc/common.proto M be/src/rpc/rpc-mgr-test.cc M be/src/rpc/rpc-mgr.h M be/src/rpc/rpc.h M be/src/rpc/thrift-server-test.cc D be/src/runtime/backend-client.h M be/src/runtime/client-cache-types.h M be/src/runtime/client-cache.cc M be/src/runtime/coordinator.cc M be/src/runtime/coordinator.h M be/src/runtime/data-stream-mgr.cc M be/src/runtime/data-stream-mgr.h M be/src/runtime/data-stream-recvr.cc M be/src/runtime/data-stream-recvr.h M be/src/runtime/data-stream-sender.cc M be/src/runtime/data-stream-sender.h M be/src/runtime/data-stream-test.cc M be/src/runtime/descriptors.cc M be/src/runtime/descriptors.h M be/src/runtime/exec-env.cc M be/src/runtime/exec-env.h M be/src/runtime/fragment-instance-state.cc M be/src/runtime/fragment-instance-state.h M be/src/runtime/plan-fragment-executor.h M be/src/runtime/row-batch-serialize-test.cc M be/src/runtime/row-batch.cc M be/src/runtime/row-batch.h M be/src/runtime/runtime-filter-bank.cc M be/src/runtime/runtime-filter-bank.h M be/src/runtime/runtime-state.cc M be/src/runtime/runtime-state.h M be/src/scheduling/request-pool-service.h M be/src/scheduling/simple-scheduler-test-util.h M be/src/service/CMakeLists.txt M be/src/service/fe-support.cc M be/src/service/frontend.h M be/src/service/impala-internal-service.cc M be/src/service/impala-internal-service.h M be/src/service/impala-server.cc M be/src/service/impala-server.h A be/src/service/impala_internal_service.proto M be/src/service/impalad-main.cc M be/src/testutil/in-process-servers.cc M be/src/testutil/in-process-servers.h M be/src/util/bloom-filter-test.cc M be/src/util/bloom-filter.cc M be/src/util/bloom-filter.h M be/src/util/hdfs-util-test.cc M common/thrift/CMakeLists.txt M common/thrift/ImpalaInternalService.thrift M common/thrift/Results.thrift 59 files changed, 1,400 insertions(+), 1,264 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/88/5888/1 -- To view, visit http://gerrit.cloudera.org:8080/5888 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: I95229290566a8ccffd80ed2d74c1c57cf1479238 Gerrit-PatchSet: 1
[Impala-ASF-CR] IMPALA-4822: Implement dynamic log level changes
Henry Robinson has posted comments on this change. Change subject: IMPALA-4822: Implement dynamic log level changes .. Patch Set 7: (22 comments) Took a look, had some comments. http://gerrit.cloudera.org:8080/#/c/5792/7/be/src/service/impala-http-handler.cc File be/src/service/impala-http-handler.cc: Line 37 where has this gone? where does the definition for Webserver::UrlCallback come from now? It's better to include-what-you-use. PS7, Line 92: bind(LogLevelCallBack, _1, _2)) Everywhere else uses lambdas, please do the same here. http://gerrit.cloudera.org:8080/#/c/5792/7/be/src/util/logging-support.cc File be/src/util/logging-support.cc: PS7, Line 79: / only // for .cc files PS7, Line 78: static jclass log4j_logger_class_; : /// Jni method descriptors corresponding to getLogLevel() and setLogLevel() operations. : static jmethodID get_log_level_method; // GlogAppender.getLogLevel() : static jmethodID set_log_level_method; // GlogAppender.setLogLevel() : static jmethodID reset_log_levels_method; // GlogAppender.resetLogLevels() Do these need to be in the impala namespace? If not, move to anonymous namespace. PS7, Line 187: rapidjson remove PS7, Line 197: get_ don't need to prefix the parameters with 'get' PS7, Line 207: return why not SetErrorMsg() ? PS7, Line 207: NULL prefer nullptr now PS7, Line 224: if (result == NULL) return; how can result be nullptr if it's a stack-allocated string? This doesn't compile for me: std::string foo; if (foo == nullptr) return 1; PS7, Line 225: result.insert(0, "Effective log level: "); This kind of presentation logic probably belongs in the template, not here. Line 227: } else if (args.find("reset_java_log") != args.end()) { It might be easier to have several different callbacks: /reset_java_log /set_java_log?class=foo=2 to help break this method up. They can all still use the same template. Line 247: Status s("Bad glog level input. Valid inputs are integers in [0-3] range."); > in the [0-3] range Even better: in the range [0-3] PS7, Line 256: FLAGS_v gflags has a SetCommandLineOption() API. Consider using that here instead? PS7, Line 257: result = "Glog logging level reset to: " + std::to_string(FLAGS_v); use Substitute() PS7, Line 260: Value output(result.c_str(), document->GetAllocator()); : document->AddMember(display_member, output, document->GetAllocator()); I think it would be clearer to do this inline, and return rather than carry result and display_member through the method. Then you can get rid of the 'else's everywhere. http://gerrit.cloudera.org:8080/#/c/5792/7/be/src/util/logging-support.h File be/src/util/logging-support.h: PS7, Line 51: /// Helper method to set the log level of given Java class. It is a JNI wrapper around : /// GlogAppender.setLogLevel(). : Status SetJavaLogLevel(const TSetJavaLogLevelParams& params, string* response); : : /// Helper method to get the log level of given Java class. It is a JNI wrapper around : /// GlogAppender.getLogLevel(). : Status GetJavaLogLevel(const TGetJavaLogLevelParams& params, string* response); Why are these exported? There are no other consumers, right? Line 62: void LogLevelCallBack(const Webserver::ArgumentMap& args, rapidjson::Document* document); How about "RegisterLogLevelCallback(string url, Webserver*)" ? Then you don't need to worry about getting the rapidjson types declared. Line 66: void SetErrorMessage(const Status& status, const char* member, No need for this to be declared in the header. http://gerrit.cloudera.org:8080/#/c/5792/7/common/thrift/Logging.thrift File common/thrift/Logging.thrift: PS7, Line 45: TResetJavaLogLevelParams Reset implies 'return to defaults'. So it's confusing that this takes parameters - are those the defaults? http://gerrit.cloudera.org:8080/#/c/5792/7/tests/webserver/test_web_pages.py File tests/webserver/test_web_pages.py: PS7, Line 56: + it might be easier to read to wrap the string in parentheses: get_loglevel_url = ("http://localhost:25000/log_level?; "getclass=org.apache.impala.catalog.HdfsTable_java_log=...") http://gerrit.cloudera.org:8080/#/c/5792/7/www/log_level.tmpl File www/log_level.tmpl: PS7, Line 32: b use instead of PS7, Line 76: Is there any way not to use tables for layout in this file? They are hard to maintain. -- To view, visit http://gerrit.cloudera.org:8080/5792 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I588418e9bcb0b66d33138baf96207a5a35bfbd63 Gerrit-PatchSet: 7 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Bharath VissapragadaGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Bharath Vissapragada
[Impala-ASF-CR] IMPALA-3909 (follow-up): Properly qualify min() and max() in header
Henry Robinson has posted comments on this change. Change subject: IMPALA-3909 (follow-up): Properly qualify min() and max() in header .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/5916/2//COMMIT_MSG Commit Message: Line 14: I tried to make a static_assert method that complained if names.h was > clang-tidy can check this, IIRC: Can it check if a using statement is transitively included? It wasn't clear from those links. -- To view, visit http://gerrit.cloudera.org:8080/5916 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: If8381d5b1fa072e89ac1f7f54959c02bf8e43b4b Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Henry RobinsonGerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Jim Apple Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-3909 (follow-up): Properly qualify min() and max() in header
Henry Robinson has uploaded a new patch set (#2). Change subject: IMPALA-3909 (follow-up): Properly qualify min() and max() in header .. IMPALA-3909 (follow-up): Properly qualify min() and max() in header parquet-column-stats.h was using unqualified references to std::max and std::min, which we disallow in header files. The reason this compiled is that some headers included "names.h". This commit flushes those out as well, and fixes the resulting compilation errors. I tried to make a static_assert method that complained if names.h was included in a header file, but couldn't do so without needing an extra macro, e.g.: USE_IMPORTED_NAMES(); Which seemed too verbose for now. Change-Id: If8381d5b1fa072e89ac1f7f54959c02bf8e43b4b --- M be/src/exec/parquet-column-stats.h M be/src/exec/write-stream.inline.h M be/src/exprs/anyval-util.h M be/src/runtime/fragment-instance-state.cc M be/src/runtime/query-exec-mgr.cc M be/src/runtime/query-state.cc M be/src/service/impala-internal-service.cc M be/src/util/process-state-info.h M be/src/util/uid-util.h 9 files changed, 22 insertions(+), 20 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/16/5916/2 -- To view, visit http://gerrit.cloudera.org:8080/5916 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: If8381d5b1fa072e89ac1f7f54959c02bf8e43b4b Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Henry RobinsonGerrit-Reviewer: Jim Apple
[Impala-ASF-CR] IMPALA-3909 (follow-up): Properly qualify min() and max() in header
Henry Robinson has posted comments on this change. Change subject: IMPALA-3909 (follow-up): Properly qualify min() and max() in header .. Patch Set 1: (2 comments) http://gerrit.cloudera.org:8080/#/c/5916/1//COMMIT_MSG Commit Message: Line 7: IMPALA-3909 (follow-up): Properly qualify min() and max() in header > Why? because we require names to be fully qualified in headers. http://gerrit.cloudera.org:8080/#/c/5916/1/be/src/exec/parquet-column-stats.h File be/src/exec/parquet-column-stats.h: Line 106: min_value_ = std::min(min_value_, v); > Also need #include Done -- To view, visit http://gerrit.cloudera.org:8080/5916 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: If8381d5b1fa072e89ac1f7f54959c02bf8e43b4b Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Henry RobinsonGerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Jim Apple Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4822: Implement dynamic log level changes
Henry Robinson has posted comments on this change. Change subject: IMPALA-4822: Implement dynamic log level changes .. Patch Set 7: (1 comment) http://gerrit.cloudera.org:8080/#/c/5792/7/common/thrift/Logging.thrift File common/thrift/Logging.thrift: PS7, Line 45: TResetJavaLogLevelParams > Yes, I'll do it then. Just wanted to confirm that is the ask here. I would also settle for comments and / or a better method name if this is the 'right' place to have the parameters. -- To view, visit http://gerrit.cloudera.org:8080/5792 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I588418e9bcb0b66d33138baf96207a5a35bfbd63 Gerrit-PatchSet: 7 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Bharath VissapragadaGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Henry Robinson Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4874: Increase the maximum KRPC message size to 4GB
Henry Robinson has uploaded a new change for review. http://gerrit.cloudera.org:8080/5887 Change subject: IMPALA-4874: Increase the maximum KRPC message size to 4GB .. IMPALA-4874: Increase the maximum KRPC message size to 4GB KRPC restricts the maximum message size with --rpc_max_message_size. The Protobuf API used to deserialize messages also has a maximum size. Both can be raised, and should be as a stop-gap until Impala has mechanisms to prevent all its RPC payloads getting too large. Note that some statestore payloads may still be larger than the 4GB max. IMPALA-4875 will handle that for the statestore case particularly. TESTING --- * This allows test_very_large_string to pass with KRPC. * Tests added to statestore-test and rpc-mgr-test for large payloads. Change-Id: Idb97dd4604548c404278fe8094db2e419021a7c3 --- M be/src/kudu/rpc/serialization.cc M be/src/rpc/rpc-mgr-test.cc M be/src/rpc/rpc-mgr.cc M be/src/rpc/rpc-mgr.h M be/src/rpc/rpc_test.proto M be/src/statestore/statestore-test.cc 6 files changed, 88 insertions(+), 8 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/87/5887/1 -- To view, visit http://gerrit.cloudera.org:8080/5887 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: Idb97dd4604548c404278fe8094db2e419021a7c3 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Henry Robinson
[Impala-ASF-CR] IMPALA-3909 (follow-up): Properly qualify min() and max() in header
Henry Robinson has uploaded a new change for review. http://gerrit.cloudera.org:8080/5916 Change subject: IMPALA-3909 (follow-up): Properly qualify min() and max() in header .. IMPALA-3909 (follow-up): Properly qualify min() and max() in header Change-Id: If8381d5b1fa072e89ac1f7f54959c02bf8e43b4b --- M be/src/exec/parquet-column-stats.h 1 file changed, 4 insertions(+), 4 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/16/5916/1 -- To view, visit http://gerrit.cloudera.org:8080/5916 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: If8381d5b1fa072e89ac1f7f54959c02bf8e43b4b Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Henry Robinson
[Impala-ASF-CR] IMPALA-3785: Record query handle for invalid handle
Henry Robinson has posted comments on this change. Change subject: IMPALA-3785: Record query handle for invalid handle .. Patch Set 3: (1 comment) http://gerrit.cloudera.org:8080/#/c/5748/3/be/src/util/debug-util-test.cc File be/src/util/debug-util-test.cc: Line 81: TEST(DebugUtil, ErrMsgBenchmark) { If you want to keep this around, look in be/src/benchmarks - that's where we keep benchmark code. -- To view, visit http://gerrit.cloudera.org:8080/5748 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ibc113b3673e1b90f81e80e841740b8006bfd31ba Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Zach AmsdenGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Zach Amsden Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4670 / IMPALA-4672: Add RpcMgr and port Statestore services to KRPC
Henry Robinson has posted comments on this change. Change subject: IMPALA-4670 / IMPALA-4672: Add RpcMgr and port Statestore services to KRPC .. Patch Set 3: (4 comments) http://gerrit.cloudera.org:8080/#/c/5720/3/be/src/rpc/rpc-mgr.cc File be/src/rpc/rpc-mgr.cc: PS3, Line 52: 4 > Could be left as future work, but just wanted to bring it up. Do you think Yeah, this should be plumbed through, will do that. PS3, Line 64: service_pool->Init > Safer to Init() after calling messenger_->RegisterService() ? Hm, why do you think so? Seems more safe to get the threads running before the service is 'exposed' via RegisterService() (even though StartServices() is where requests are actually allowed to start arriving). PS3, Line 74: int32_t num_acceptor_threads > If this is already a user controlled flag, why pass it in as a parameter? Because the caller may not want to respect the flag's value (maybe for tests or something). http://gerrit.cloudera.org:8080/#/c/5720/3/be/src/rpc/rpc.h File be/src/rpc/rpc.h: PS3, Line 90: func > As I understand it currently, if 'func' is the asynchronous variant of the You simply can't call asynchronous functions with Execute() - there's no callback provided to the method invocation on line 108. ExecuteAsync() is in the next patch. There's no use cases for it in this one. -- To view, visit http://gerrit.cloudera.org:8080/5720 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I8dbf12b9ecd71d26d239d31c19b487175194c766 Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Henry RobinsonGerrit-Reviewer: David Knupp Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Juan Yu Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Taras Bobrovytsky Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-{4670,4672,4784}: Add RpcMgr and port Statestore services to KRPC
Henry Robinson has uploaded a new patch set (#4). Change subject: IMPALA-{4670,4672,4784}: Add RpcMgr and port Statestore services to KRPC .. IMPALA-{4670,4672,4784}: Add RpcMgr and port Statestore services to KRPC This patch adds the core abstractions necessary for using Kudu RPC for Impala's backend services. The StatestoreService and StatestoreSubscriberService are both ported to KRPC to demonstrate how the new RPC framework operates. The main new class is RpcMgr, which is responsible for both services run by a daemon and for obtaining clients to remote services. The new Rpc class helps clients build rpc invocation objects, and use them to call remote methods. Also done: * Utility code to convert from a Thrift structure to a Protobuf wrapper. Thrift RPCs do not need to be ported to Protobuf to work. * Added more build support for Protobuf generation. All library targets now depend on the Protobuf generation step, just as they do Thrift. * thrift-server-test is rewritten to use Beeswax as a test service, since the old StatestoreService has been removed. * Remove InProcessStatestore that was only used for statestore-test and mini-impala-cluster. As far as we know, mini-impala-cluster is unused by anyone. (IMPALA-4784) TESTING: * Impala's test suite passes. * statestore-test has been rewritten to use new statestore code. It includes all test cases from the now-removed Python test_statestore.py * rpc-mgr-test unit-tests the RpcMgr and Rpc classes. TODO: * Enable SSL and Kerberos support (already in Kudu's imported libraries) Change-Id: I8dbf12b9ecd71d26d239d31c19b487175194c766 --- M be/generated-sources/gen-cpp/CMakeLists.txt M be/src/catalog/CMakeLists.txt M be/src/catalog/catalog-server.cc M be/src/catalog/catalog-server.h M be/src/codegen/CMakeLists.txt M be/src/common/CMakeLists.txt M be/src/exec/CMakeLists.txt M be/src/exec/catalog-op-executor.cc M be/src/exec/hdfs-avro-scanner-test.cc M be/src/exec/kudu-util.h M be/src/experiments/CMakeLists.txt M be/src/exprs/CMakeLists.txt M be/src/rpc/CMakeLists.txt A be/src/rpc/common.proto A be/src/rpc/rpc-mgr-test.cc A be/src/rpc/rpc-mgr.cc A be/src/rpc/rpc-mgr.h A be/src/rpc/rpc-mgr.inline.h A be/src/rpc/rpc.h A be/src/rpc/rpc_test.proto M be/src/rpc/thrift-server-test.cc M be/src/rpc/thrift-util.h M be/src/runtime/CMakeLists.txt M be/src/runtime/buffered-tuple-stream-test.cc M be/src/runtime/bufferpool/CMakeLists.txt M be/src/runtime/exec-env.cc M be/src/runtime/exec-env.h M be/src/scheduling/CMakeLists.txt M be/src/service/CMakeLists.txt M be/src/service/query-exec-state.cc M be/src/statestore/CMakeLists.txt M be/src/statestore/statestore-subscriber.cc M be/src/statestore/statestore-subscriber.h M be/src/statestore/statestore-test.cc M be/src/statestore/statestore.cc M be/src/statestore/statestore.h A be/src/statestore/statestore.proto M be/src/statestore/statestored-main.cc M be/src/testutil/CMakeLists.txt M be/src/testutil/in-process-servers.cc M be/src/testutil/in-process-servers.h D be/src/testutil/mini-impala-cluster.cc M be/src/transport/CMakeLists.txt M be/src/udf/CMakeLists.txt M be/src/udf_samples/CMakeLists.txt M be/src/util/CMakeLists.txt M be/src/util/collection-metrics.h M be/src/util/counting-barrier.h M bin/start-impala-cluster.py M bin/start-impalad.sh M common/thrift/CMakeLists.txt M common/thrift/StatestoreService.thrift D tests/statestore/test_statestore.py 53 files changed, 1,727 insertions(+), 1,083 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/20/5720/4 -- To view, visit http://gerrit.cloudera.org:8080/5720 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I8dbf12b9ecd71d26d239d31c19b487175194c766 Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Henry RobinsonGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Juan Yu Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Taras Bobrovytsky
[Impala-ASF-CR] IMPALA-4822: Dynamic log level changes to Catalog and Frontend JVMs
Henry Robinson has posted comments on this change. Change subject: IMPALA-4822: Dynamic log level changes to Catalog and Frontend JVMs .. Patch Set 1: Could you also add support for changing log levels in C++ code? I think that's possible since gflags lets you change flags through an API. -- To view, visit http://gerrit.cloudera.org:8080/5792 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I588418e9bcb0b66d33138baf96207a5a35bfbd63 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Bharath VissapragadaGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Henry Robinson Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-{4670,4672,4784}: Add RpcMgr and port Statestore services to KRPC
Henry Robinson has posted comments on this change. Change subject: IMPALA-{4670,4672,4784}: Add RpcMgr and port Statestore services to KRPC .. Patch Set 3: (28 comments) Although it might look like a lot has changed in this new patch, there are only two relatively minor big changes: 1. Rewrite test_statestore.py in C++ (see statestore-test.cc) to make sure we don't lose any of that coverage. 2. Remove InProcessStatestore which was unused and causing pain to keep up-to-date. http://gerrit.cloudera.org:8080/#/c/5720/3/be/src/rpc/common.proto File be/src/rpc/common.proto: Line 27: message ThriftWrapperPB { required bytes thrift_struct = 1; } > a brief comment regarding the purpose would be good. Done http://gerrit.cloudera.org:8080/#/c/5720/3/be/src/rpc/rpc-mgr.cc File be/src/rpc/rpc-mgr.cc: Line 44: DEFINE_int32(num_acceptor_threads, 2, "Number of threads dedicated to accepting " > why so low? Acceptor threads don't do much (just create a socket and pass it to the reactor threads). This is unlike Thrift where the acceptor thread also did the SASL negotiation, which was expensive and harmed throughput. So really this is double the number of Thrift threads, and each will have less work to do per cnxn. We expect the number of connections to be much lower with KRPC as well. PS3, Line 52: 4 > Yeah, this should be plumbed through, will do that. Done Line 59: DCHECK(is_inited()) << "Must call Init() before RegisterService()"; > dcheck that services haven't been started yet, or does that not matter? Done Line 62: new ServicePool(move(scoped_ptr), messenger_->metric_entity(), service_queue_depth); > why not just pass service_ptr.release()? I can't quite do that because the gscoped_ptr(T*) c'tor is explicit. I can do: new ServicePool(gscoped_ptr(service_ptr.release()) PS3, Line 64: service_pool->Init > good point. if the init call fails, do we abort impalad startup? (i would a yes indeed (and statestore startup). PS3, Line 74: int32_t num_acceptor_threads > so then move the flag to where it's used? it's used in several places (anywhere that has an RpcMgr might want to use the flag). So I think the DEFINE(..) here, DECLARE(...) in modules that want to use it is the right pattern. http://gerrit.cloudera.org:8080/#/c/5720/3/be/src/rpc/rpc-mgr.h File be/src/rpc/rpc-mgr.h: PS3, Line 41: 0 > what does it mean to manage 0 services? It means that there are 0 services managed by the RpcMgr? Since the RpcMgr also helps get proxies, you don't have to register a service for it to be useful. PS3, Line 49: CLIENTS > why define new terminology? isn't this just "proxy" in KuduRPC speak? I've tried to standardize a bit better. 'clients' and 'servers' are pretty established nouns in this space though. PS3, Line 52: GetProxy > so I guess RpcMgr deals with both the server and client side of things? As See line 36: "Central manager for all RPC services and clients". A service is a network-exported set of methods that can be called using the KRPC protocol. Speaking precisely, a service 'definition' is the set of methods (defined in a protobuf), but a service is the instantiation of that service definition on a particular machine. It's a server-side only construct. A proxy allows clients to easily call methods on one service, and so they share the same set of method signatures (i.e. they refer to the same service definition). Typically a proxy and a service are used from different machines, and KRPC does not check that the remote service has been started before creating a proxy object. Connectivity / service availability checks are performed when the actual RPC method is invoked. Line 64: /// RpcMgr::RegisterService() before RpcMgr::StartServices() is called. > does init() go before register()? Done Line 111: /// Creates a new proxy for a remote service of type P at location 'address', and places > might want to point out here what P needs to be (auto-generated from a serv Done http://gerrit.cloudera.org:8080/#/c/5720/3/be/src/rpc/rpc-mgr.inline.h File be/src/rpc/rpc-mgr.inline.h: Line 39: kudu::HostPort(address.hostname, address.port).ResolveAddresses(), > how expensive is this? It's a DNS subsystem call. Without the nscd (nameservice caching daemon) running, the mean response time was ~5ms (although the 99.9th percentile was lower - there are larger timeout events that are expensive). However, nscd brings the mean response time down to about 10us per invocation. We (Cloudera) recommend that users have nscd running for CDH deployments; we should make sure that that is emphasised for Impala as well (it would make a difference no matter if KRPC was used or not). nscd also seems to reduce the frequency of outliers as well. Line 41: proxy->reset(new P(messenger_, addresses[0])); > dcheck that addresses isn't empty? Done
[Impala-ASF-CR] IMPALA-{4670,4672,4784}: Add RpcMgr and port Statestore services to KRPC
Henry Robinson has uploaded a new patch set (#5). Change subject: IMPALA-{4670,4672,4784}: Add RpcMgr and port Statestore services to KRPC .. IMPALA-{4670,4672,4784}: Add RpcMgr and port Statestore services to KRPC This patch adds the core abstractions necessary for using Kudu RPC for Impala's backend services. The StatestoreService and StatestoreSubscriberService are both ported to KRPC to demonstrate how the new RPC framework operates. The main new class is RpcMgr, which is responsible for both services run by a daemon and for obtaining clients to remote services. The new Rpc class helps clients build rpc invocation objects, and use them to call remote methods. Also done: * Utility code to convert from a Thrift structure to a Protobuf wrapper. Thrift RPCs do not need to be ported to Protobuf to work. * Added more build support for Protobuf generation. All library targets now depend on the Protobuf generation step, just as they do Thrift. * thrift-server-test is rewritten to use Beeswax as a test service, since the old StatestoreService has been removed. * Remove InProcessStatestore that was only used for statestore-test and mini-impala-cluster. As far as we know, mini-impala-cluster is unused by anyone. (IMPALA-4784) TESTING: * Impala's test suite passes. * statestore-test has been rewritten to use new statestore code. It includes all test cases from the now-removed Python test_statestore.py * rpc-mgr-test unit-tests the RpcMgr and Rpc classes. TODO: * Enable SSL and Kerberos support (already in Kudu's imported libraries) Change-Id: I8dbf12b9ecd71d26d239d31c19b487175194c766 --- M be/generated-sources/gen-cpp/CMakeLists.txt M be/src/catalog/CMakeLists.txt M be/src/catalog/catalog-server.cc M be/src/catalog/catalog-server.h M be/src/codegen/CMakeLists.txt M be/src/common/CMakeLists.txt M be/src/exec/CMakeLists.txt M be/src/exec/catalog-op-executor.cc M be/src/exec/hdfs-avro-scanner-test.cc M be/src/exec/kudu-util.h M be/src/experiments/CMakeLists.txt M be/src/exprs/CMakeLists.txt M be/src/rpc/CMakeLists.txt A be/src/rpc/common.proto A be/src/rpc/rpc-mgr-test.cc A be/src/rpc/rpc-mgr.cc A be/src/rpc/rpc-mgr.h A be/src/rpc/rpc-mgr.inline.h A be/src/rpc/rpc.h A be/src/rpc/rpc_test.proto M be/src/rpc/thrift-server-test.cc M be/src/rpc/thrift-util.h M be/src/runtime/CMakeLists.txt M be/src/runtime/buffered-tuple-stream-test.cc M be/src/runtime/bufferpool/CMakeLists.txt M be/src/runtime/exec-env.cc M be/src/runtime/exec-env.h M be/src/scheduling/CMakeLists.txt M be/src/service/CMakeLists.txt M be/src/service/query-exec-state.cc M be/src/statestore/CMakeLists.txt M be/src/statestore/statestore-subscriber.cc M be/src/statestore/statestore-subscriber.h M be/src/statestore/statestore-test.cc M be/src/statestore/statestore.cc M be/src/statestore/statestore.h A be/src/statestore/statestore.proto M be/src/statestore/statestored-main.cc M be/src/testutil/CMakeLists.txt M be/src/testutil/in-process-servers.cc M be/src/testutil/in-process-servers.h D be/src/testutil/mini-impala-cluster.cc M be/src/transport/CMakeLists.txt M be/src/udf/CMakeLists.txt M be/src/udf_samples/CMakeLists.txt M be/src/util/CMakeLists.txt M be/src/util/collection-metrics.h M be/src/util/counting-barrier.h M bin/start-impala-cluster.py M bin/start-impalad.sh M common/thrift/CMakeLists.txt M common/thrift/StatestoreService.thrift D tests/statestore/test_statestore.py 53 files changed, 1,738 insertions(+), 1,083 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/20/5720/5 -- To view, visit http://gerrit.cloudera.org:8080/5720 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I8dbf12b9ecd71d26d239d31c19b487175194c766 Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Henry RobinsonGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Juan Yu Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Taras Bobrovytsky
[Impala-ASF-CR] IMPALA-4670 / IMPALA-4672: Add RpcMgr and port Statestore services to KRPC
Henry Robinson has uploaded a new patch set (#3). Change subject: IMPALA-4670 / IMPALA-4672: Add RpcMgr and port Statestore services to KRPC .. IMPALA-4670 / IMPALA-4672: Add RpcMgr and port Statestore services to KRPC This patch adds the core abstractions necessary for using Kudu RPC for Impala's backend services. The StatestoreService and StatestoreSubscriberService are both ported to KRPC to demonstrate how the new RPC framework operates. The main new class is RpcMgr, which is responsible for both services run by a daemon and for obtaining clients to remote services. Also done: * Utility code to convert from a Thrift structure to a Protobuf wrapper. Thrift RPCs do not need to be ported to Protobuf to work. * Added more build support for Protobuf generation. All library targets now depend on the Protobuf generation step, just as they do Thrift. * thrift-server-test is rewritten to use Beeswax as a test service, since the old StatestoreService has been removed. TESTING: * Impala's test suite passes. * statestore-test has been rewritten to use new statestore code. * rpc-mgr-test unit-tests the RpcMgr and Rpc classes. TODO: * Enable SSL and Kerberos support (already in Kudu's imported libraries) * Rewrite test_statestore.py in C++. We don't have protobuf support in Python. Change-Id: I8dbf12b9ecd71d26d239d31c19b487175194c766 --- M be/generated-sources/gen-cpp/CMakeLists.txt M be/src/catalog/CMakeLists.txt M be/src/catalog/catalog-server.cc M be/src/catalog/catalog-server.h M be/src/codegen/CMakeLists.txt M be/src/common/CMakeLists.txt M be/src/exec/CMakeLists.txt M be/src/exec/catalog-op-executor.cc M be/src/exec/hdfs-avro-scanner-test.cc M be/src/exec/kudu-util.h M be/src/experiments/CMakeLists.txt M be/src/exprs/CMakeLists.txt M be/src/rpc/CMakeLists.txt A be/src/rpc/common.proto A be/src/rpc/rpc-mgr-test.cc A be/src/rpc/rpc-mgr.cc A be/src/rpc/rpc-mgr.h A be/src/rpc/rpc-mgr.inline.h A be/src/rpc/rpc.h A be/src/rpc/rpc_test.proto M be/src/rpc/thrift-server-test.cc M be/src/rpc/thrift-util.h M be/src/runtime/CMakeLists.txt M be/src/runtime/buffered-tuple-stream-test.cc M be/src/runtime/bufferpool/CMakeLists.txt M be/src/runtime/exec-env.cc M be/src/runtime/exec-env.h M be/src/scheduling/CMakeLists.txt M be/src/service/CMakeLists.txt M be/src/service/query-exec-state.cc M be/src/statestore/CMakeLists.txt M be/src/statestore/statestore-subscriber.cc M be/src/statestore/statestore-subscriber.h A be/src/statestore/statestore-test.cc M be/src/statestore/statestore.cc M be/src/statestore/statestore.h A be/src/statestore/statestore.proto M be/src/statestore/statestored-main.cc M be/src/testutil/CMakeLists.txt M be/src/transport/CMakeLists.txt M be/src/udf/CMakeLists.txt M be/src/udf_samples/CMakeLists.txt M be/src/util/CMakeLists.txt M be/src/util/collection-metrics.h M common/thrift/CMakeLists.txt M common/thrift/StatestoreService.thrift M tests/statestore/test_statestore.py 47 files changed, 1,771 insertions(+), 749 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/20/5720/3 -- To view, visit http://gerrit.cloudera.org:8080/5720 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I8dbf12b9ecd71d26d239d31c19b487175194c766 Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Henry RobinsonGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: David Knupp Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Juan Yu Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Taras Bobrovytsky
[Impala-ASF-CR] IMPALA-5748 Record query handle for invalid handle
Henry Robinson has posted comments on this change. Change subject: IMPALA-5748 Record query handle for invalid handle .. Patch Set 2: (2 comments) Do you think the perf difference is worth considering replacing Substitute()? I don't think that Substitute() appears (or should) on perf critical paths, and I gotta say I'm not in love with karma's syntax. http://gerrit.cloudera.org:8080/#/c/5748/2//COMMIT_MSG Commit Message: PS2, Line 7: 5748 I think this is the wrong jira (plus, nitpicking, add a : after the number for consistency). http://gerrit.cloudera.org:8080/#/c/5748/2/be/src/service/impala-beeswax-server.cc File be/src/service/impala-beeswax-server.cc: Line 38: #include We really try not to expand our boost footprint if at all possible. -- To view, visit http://gerrit.cloudera.org:8080/5748 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ibc113b3673e1b90f81e80e841740b8006bfd31ba Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Zach AmsdenGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Zach Amsden Gerrit-HasComments: Yes
[Impala-ASF-CR] Prepare for official 2.8 release by icnrementing version number
Henry Robinson has posted comments on this change. Change subject: Prepare for official 2.8 release by icnrementing version number .. Patch Set 1: (1 comment) one typo. http://gerrit.cloudera.org:8080/#/c/5763/1//COMMIT_MSG Commit Message: PS1, Line 7: icnrementing incrementing -- To view, visit http://gerrit.cloudera.org:8080/5763 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I2dd16c98c32f8bece3a886c6d5e97c86720806e4 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Jim AppleGerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Marcel Kornacker Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4926: Upgrade LZ4 to 1.7.5
Henry Robinson has uploaded a new change for review. http://gerrit.cloudera.org:8080/6030 Change subject: IMPALA-4926: Upgrade LZ4 to 1.7.5 .. IMPALA-4926: Upgrade LZ4 to 1.7.5 LZ4 has deprecated the method names LZ4_compress and LZ4_uncompress, so rename those to their new versions LZ4_compress_default() and LZ4_decompress_fast() respectively. Otherwise no changes are required for compatibility with LZ4 1.7.5. Change-Id: I10e4561d0e940fa15ca8248c8277acfc6dff3da3 --- M be/src/util/compress.cc M be/src/util/decompress.cc M bin/impala-config.sh 3 files changed, 6 insertions(+), 6 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/30/6030/1 -- To view, visit http://gerrit.cloudera.org:8080/6030 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: I10e4561d0e940fa15ca8248c8277acfc6dff3da3 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Henry Robinson
[Impala-ASF-CR] IMPALA-4904,IMPALA-4914: whitelist workload "targeted-stress" for exhaustive tests
Henry Robinson has posted comments on this change. Change subject: IMPALA-4904,IMPALA-4914: whitelist workload "targeted-stress" for exhaustive tests .. Patch Set 3: (1 comment) http://gerrit.cloudera.org:8080/#/c/6002/3//COMMIT_MSG Commit Message: Line 7: IMPALA-4904,IMPALA-4914: whitelist workload "targeted-stress" for exhaustive tests > I agree this loses some information, but I think it's the right tradeoff: Although I'm not keen on the strict 72-char limit (we have many other commit messages that go over this - what's the reason for enforcing it?) - note you can save some chars by compressing the list of fixed JIRAs: IMPALA-{4904, 4914}: etc. -- To view, visit http://gerrit.cloudera.org:8080/6002 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ie6bd5bbd380e636d680368e908519b042d79dfec Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Michael BrownGerrit-Reviewer: David Knupp Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Jim Apple Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4934: Disable Kudu OpenSSL initialization
Henry Robinson has posted comments on this change. Change subject: IMPALA-4934: Disable Kudu OpenSSL initialization .. Patch Set 3: (2 comments) http://gerrit.cloudera.org:8080/#/c/6056/3/be/src/rpc/authentication.cc File be/src/rpc/authentication.cc: PS3, Line 663: impala remove? http://gerrit.cloudera.org:8080/#/c/6056/3/bin/impala-config.sh File bin/impala-config.sh: PS3, Line 75: IMPALA_TOOLCHAIN_BUILD_ID Don't you need to change this as well? -- To view, visit http://gerrit.cloudera.org:8080/6056 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I3f13f3af512c6d771979638da593685524c73086 Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Matthew JacobsGerrit-Reviewer: Henry Robinson Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4934: Disable Kudu OpenSSL initialization
Henry Robinson has posted comments on this change. Change subject: IMPALA-4934: Disable Kudu OpenSSL initialization .. Patch Set 4: Code-Review+2 (1 comment) http://gerrit.cloudera.org:8080/#/c/6056/3/bin/impala-config.sh File bin/impala-config.sh: PS3, Line 75: IMPALA_TOOLCHAIN_BUILD_ID > I actually don't need to because I built this Kudu with a new jenkins job ( Sounds ok. If you're revving nightly, it might be better just to have a 'latest' symlink (just like a SNAPSHOT version) than to make a commit every night, but whatever is easiest. -- To view, visit http://gerrit.cloudera.org:8080/6056 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I3f13f3af512c6d771979638da593685524c73086 Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Matthew JacobsGerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Matthew Jacobs Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4885: Expose Jvm thread info in web UI
Henry Robinson has posted comments on this change. Change subject: IMPALA-4885: Expose Jvm thread info in web UI .. Patch Set 1: (5 comments) this is pretty cool. Got a couple of suggestions about organisation and presentation, let me know what you think. http://gerrit.cloudera.org:8080/#/c/6013/1/be/src/util/thread.cc File be/src/util/thread.cc: PS1, Line 209: RegisterUrlCallback Rather than adding another top-level link, consider making two tabs on the threadz.tmpl page - maybe thread overview and JVM threads for now. See query_detail_tabs.tmpl for how to add tabs to a page in bootstrap. PS1, Line 234: bool track_jvm_threads, how about just making this a member of ThreadMgr? Actually - I don't think ThreadMgr needs to know about JVM threads at all, right? Why not just have something like: void ThreadOverviewUrlCallback(...) { thread_mgr->GetThreadOverview(document); GetJvmThreadOverview(document); } and then you avoid having to pass track_jvm_threads to ThreadMgr at all, which is good because it doesn't really track or manage those threads. Line 260: << status.GetDetail(); return, then remove else { } PS1, Line 262: jvmThreadsVal C++ naming styles http://gerrit.cloudera.org:8080/#/c/6013/1/www/jvm-threadz.tmpl File www/jvm-threadz.tmpl: PS1, Line 35: Is n Native -- To view, visit http://gerrit.cloudera.org:8080/6013 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Id497043ab33dcf107a562f0b1ccd5c46095d397f Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Dimitris TsirogiannisGerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Henry Robinson Gerrit-HasComments: Yes
[Impala-ASF-CR] Three misc webpage changes
Henry Robinson has posted comments on this change. Change subject: Three misc webpage changes .. Patch Set 1: Anyone want to take a quick look at this? -- To view, visit http://gerrit.cloudera.org:8080/6045 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ica0578dabb7e27e6fd45ee4f31a1418ac3adc891 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Henry RobinsonGerrit-Reviewer: Henry Robinson Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4885: Expose Jvm thread info in web UI
Henry Robinson has posted comments on this change. Change subject: IMPALA-4885: Expose Jvm thread info in web UI .. Patch Set 2: (11 comments) Just minor things. http://gerrit.cloudera.org:8080/#/c/6013/2/be/src/util/thread.cc File be/src/util/thread.cc: PS2, Line 75: void JvmThreadsUrlCallback(const Webserver::ArgumentMap& args, Document* doc); : : void ThreadOverviewUrlCallback(bool track_jvm_threads, const Webserver::ArgumentMap& args, : Document* document); : : void RegisterUrlCallbacks(bool track_jvm_threads, Webserver* webserver); : : void GetJvmThreadOverview(Document* document); put all these in an anonymous namespace to avoid polluting the global symbol table. PS2, Line 216: NULL nullptr, here and everywhere PS2, Line 379: DCHECK(document != NULL); why would document be null? PS2, Line 384: GetJvmThreadOverview move this into ThreadOverviewUrlCallback PS2, Line 392: LOG(WARNING) << "Couldn't retrieve information about JVM threads: " : << status.GetDetail(); But - put this in the "error" member of document, and it should show on the webpage automatically. All the templates can print an error message if one exists. PS2, Line 393: << status.GetDetail(); formatting http://gerrit.cloudera.org:8080/#/c/6013/2/be/src/util/thread.h File be/src/util/thread.h: PS2, Line 194: track_jvm_threads include_jvm_threads ? http://gerrit.cloudera.org:8080/#/c/6013/2/common/thrift/Frontend.thrift File common/thrift/Frontend.thrift: Line 758: // Information about JVM threads comment when this might not be set? http://gerrit.cloudera.org:8080/#/c/6013/2/fe/src/main/java/org/apache/impala/common/JniUtil.java File fe/src/main/java/org/apache/impala/common/JniUtil.java: PS2, Line 206: threadCount just write response.setTotal_thread_count(threadBean.getThreadCount()) ? http://gerrit.cloudera.org:8080/#/c/6013/2/www/threadz.tmpl File www/threadz.tmpl: PS2, Line 37: # nit: is there only one jvm-threads entry? If so, this should be the ? operator (rather than the for-each operator). http://gerrit.cloudera.org:8080/#/c/6013/2/www/threadz_tabs.tmpl File www/threadz_tabs.tmpl: Line 27: nit: trailing whitespace -- To view, visit http://gerrit.cloudera.org:8080/6013 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Id497043ab33dcf107a562f0b1ccd5c46095d397f Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Dimitris TsirogiannisGerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Henry Robinson Gerrit-HasComments: Yes
[Impala-ASF-CR] Remove Scheduler abstract interface, rename SimpleScheduler -> Scheduler
Henry Robinson has posted comments on this change. Change subject: Remove Scheduler abstract interface, rename SimpleScheduler -> Scheduler .. Patch Set 1: Although this patch looks big, the files are moved + clang-formatted. -- To view, visit http://gerrit.cloudera.org:8080/6149 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I70e0b002fa56d3ba1c3b34f03ae05f4042ac309e Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Henry RobinsonGerrit-Reviewer: Henry Robinson Gerrit-HasComments: No
[Impala-ASF-CR] Remove Scheduler abstract interface, rename SimpleScheduler -> Scheduler
Henry Robinson has uploaded a new change for review. http://gerrit.cloudera.org:8080/6149 Change subject: Remove Scheduler abstract interface, rename SimpleScheduler -> Scheduler .. Remove Scheduler abstract interface, rename SimpleScheduler -> Scheduler For the last five years we've had the Scheduler superclass with exactly one implementation, SimpleScheduler. This patch collapses that hierarchy into just one concrete implementation called Scheduler. Also fixed up some includes in (the new) scheduler.h based on the include-what-you-use tool. Change-Id: I70e0b002fa56d3ba1c3b34f03ae05f4042ac309e --- M be/src/runtime/exec-env.cc M be/src/scheduling/CMakeLists.txt M be/src/scheduling/admission-controller.cc M be/src/scheduling/query-schedule.h R be/src/scheduling/scheduler-test-util.cc R be/src/scheduling/scheduler-test-util.h R be/src/scheduling/scheduler-test.cc R be/src/scheduling/scheduler.cc M be/src/scheduling/scheduler.h D be/src/scheduling/simple-scheduler.h M be/src/service/impala-server.cc M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java M fe/src/main/java/org/apache/impala/util/MembershipSnapshot.java 13 files changed, 630 insertions(+), 698 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/49/6149/1 -- To view, visit http://gerrit.cloudera.org:8080/6149 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: I70e0b002fa56d3ba1c3b34f03ae05f4042ac309e Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Henry Robinson
[Impala-ASF-CR] Three misc webpage changes
Henry Robinson has posted comments on this change. Change subject: Three misc webpage changes .. Patch Set 1: Thanks for the reviews! > seems fine but I don't really know jquery If it makes it easier, this is basically C code - see e.g. https://github.com/apache/incubator-impala/blob/master/www/sessions.tmpl for another example of a sortable table already in master. > Is there a plan to migrate other HTML tables to DataTables? (E.g., Queries, > Query Locations, ...) No particular plans - it's so easy to do I think anyone can do it when they see a need. -- To view, visit http://gerrit.cloudera.org:8080/6045 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ica0578dabb7e27e6fd45ee4f31a1418ac3adc891 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Henry RobinsonGerrit-Reviewer: David Knupp Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Matthew Jacobs Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4822: Implement dynamic log level changes
Henry Robinson has posted comments on this change. Change subject: IMPALA-4822: Implement dynamic log level changes .. Patch Set 12: > I'm not sure the extra complexity of the UI is worth it. Without the UI, we'd have to remember how to construct the URLs and what the arguments are. The only complexity the UI adds is log_level.tmpl which handles presentation of the returned data. Everything else would stay roughly the same if we had a command-line-only interface. -- To view, visit http://gerrit.cloudera.org:8080/5792 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I588418e9bcb0b66d33138baf96207a5a35bfbd63 Gerrit-PatchSet: 12 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Bharath VissapragadaGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Marcel Kornacker Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4905: Reduce coordinator lock contention in RPC handler
Henry Robinson has uploaded a new change for review. http://gerrit.cloudera.org:8080/5971 Change subject: IMPALA-4905: Reduce coordinator lock contention in RPC handler .. IMPALA-4905: Reduce coordinator lock contention in RPC handler Fragment instances call ReportExecStatus every few seconds, so there are often very many of those RPCs in flight. The handler in Coordinator::UpdateFragmentExecStatus() would try and take the shared coordinator lock twice, once to merge in any insert file move operations to be performed, and once to print all the other instance state's status. There was also a bug where the insert-merge branch would always be taken (and so would the lock) even if the query was not an INSERT statement. This patch refactors UpdateFragmentExecStatus() so that it never takes the coordinator lock. Instead, the insert updates are saved in the fragment instance state to be merged during finalization, and the logging is now done by the main coordinator thread which is signalled by the RPC handler. Change-Id: Id7599780785c4e9306711f535bf4726a247873e2 --- M be/src/runtime/coordinator.cc M be/src/runtime/coordinator.h M be/src/runtime/fragment-instance-state.cc 3 files changed, 77 insertions(+), 50 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/71/5971/1 -- To view, visit http://gerrit.cloudera.org:8080/5971 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: Id7599780785c4e9306711f535bf4726a247873e2 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Henry Robinson
[Impala-ASF-CR] IMPALA-4905: Reduce coordinator lock contention in RPC handler
Henry Robinson has uploaded a new patch set (#2). Change subject: IMPALA-4905: Reduce coordinator lock contention in RPC handler .. IMPALA-4905: Reduce coordinator lock contention in RPC handler Fragment instances call ReportExecStatus every few seconds, so there are often very many of those RPCs in flight. The handler in Coordinator::UpdateFragmentExecStatus() would try and take the shared coordinator lock twice, once to merge in any insert file move operations to be performed, and once to print all the other instance state's status. There was also a bug where the insert-merge branch would always be taken (and so would the lock) even if the query was not an INSERT statement. This patch refactors UpdateFragmentExecStatus() so that it never takes the coordinator lock. Instead, the insert updates are saved in the fragment instance state to be merged during finalization. The logging is now done without taking the coordinator lock unnecessarily, and also avoid taking the fragment instance state locks since done() can be read safely without synchronization on x86. Change-Id: Id7599780785c4e9306711f535bf4726a247873e2 --- M be/src/runtime/coordinator.cc M be/src/runtime/coordinator.h M be/src/runtime/fragment-instance-state.cc 3 files changed, 61 insertions(+), 44 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/71/5971/2 -- To view, visit http://gerrit.cloudera.org:8080/5971 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Id7599780785c4e9306711f535bf4726a247873e2 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Henry RobinsonGerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Sailesh Mukil
[Impala-ASF-CR] IMPALA-4905: Reduce coordinator lock contention in RPC handler
Henry Robinson has posted comments on this change. Change subject: IMPALA-4905: Reduce coordinator lock contention in RPC handler .. Patch Set 1: (1 comment) Changed how the logging works - it didn't quite work in the previous patch because WaitForAllInstance() isn't called until late on in query execution. http://gerrit.cloudera.org:8080/#/c/5971/1/be/src/runtime/coordinator.cc File be/src/runtime/coordinator.cc: PS1, Line 829: for (const InstanceState* state: fragment_instance_states_) { : files_to_move.insert( : state->insert_status().files_to_move.begin(), : state->insert_status().files_to_move.end()); : : for (const PartitionStatusMap::value_type& partition: : state->insert_status().per_partition_status) { : TInsertPartitionStatus* status = &(per_partition_status_[partition.first]); : status->__set_num_modified_rows( : status->num_modified_rows + partition.second.num_modified_rows); : status->__set_kudu_latest_observed_ts(std::max( : partition.second.kudu_latest_observed_ts, status->kudu_latest_observed_ts)); : status->__set_id(partition.second.id); : status->__set_partition_base_dir(partition.second.partition_base_dir); : : if (partition.second.__isset.stats) { : if (!status->__isset.stats) status->__set_stats(TInsertStats()); : DataSink::MergeDmlStats(partition.second.stats, >stats); : } : } : } > I expect that large INSERT queries will take a while longer to run because I don't think it's going to be excessive, even though this is an N^2 loop (the metadata update times usually dominate). Worth checking to see if it makes a difference though. -- To view, visit http://gerrit.cloudera.org:8080/5971 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Id7599780785c4e9306711f535bf4726a247873e2 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Henry RobinsonGerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Sailesh Mukil Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4885: Expose Jvm thread info in web UI
Henry Robinson has posted comments on this change. Change subject: IMPALA-4885: Expose Jvm thread info in web UI .. Patch Set 1: This is super useful. Could you post the screenshot somewhere that's publicly accessible? -- To view, visit http://gerrit.cloudera.org:8080/6013 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Id497043ab33dcf107a562f0b1ccd5c46095d397f Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Dimitris TsirogiannisGerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Henry Robinson Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4670 / IMPALA-4672: Add RpcMgr and port Statestore services to KRPC
Henry Robinson has posted comments on this change. Change subject: IMPALA-4670 / IMPALA-4672: Add RpcMgr and port Statestore services to KRPC .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/5720/2/tests/statestore/test_statestore.py File tests/statestore/test_statestore.py: Line 18: # TODO(KRPC): Re-enable. > Or theoretically (meaning I haven't tried this) you could add a pytest skip I expect this file to get removed completely as part of this commit. -- To view, visit http://gerrit.cloudera.org:8080/5720 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I8dbf12b9ecd71d26d239d31c19b487175194c766 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Henry RobinsonGerrit-Reviewer: David Knupp Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Taras Bobrovytsky Gerrit-HasComments: Yes
[Impala-ASF-CR] Add the query handle to error messages for Invalid Query Handle for beeswax interface.
Henry Robinson has posted comments on this change. Change subject: Add the query handle to error messages for Invalid Query Handle for beeswax interface. .. Patch Set 1: (1 comment) Any chance you want to do the same thing for HS2? See impala-hs2-server.cc. http://gerrit.cloudera.org:8080/#/c/5748/1/be/src/service/impala-beeswax-server.cc File be/src/service/impala-beeswax-server.cc: PS1, Line 193: "Invalid query handle: " + handle.id try and avoid operator+ for strings - sometimes it doesn't do what you might expect. We prefer Substitute("Invalid query handle: $0", handle.id) -- To view, visit http://gerrit.cloudera.org:8080/5748 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ibc113b3673e1b90f81e80e841740b8006bfd31ba Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Zach AmsdenGerrit-Reviewer: Henry Robinson Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4822: Implement dynamic log level changes
Henry Robinson has posted comments on this change. Change subject: IMPALA-4822: Implement dynamic log level changes .. Patch Set 13: (1 comment) http://gerrit.cloudera.org:8080/#/c/5792/13/be/src/util/logging-support.cc File be/src/util/logging-support.cc: Line 40 > Yes, Interestingly that has always been the behavior (even without this pat Right, but the idea is that if we want to turn off trace debugging, for example, we can use the usual controls to do so (by setting GLOG_v < 3). What happens with trace log messages now? Are they always printed? -- To view, visit http://gerrit.cloudera.org:8080/5792 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I588418e9bcb0b66d33138baf96207a5a35bfbd63 Gerrit-PatchSet: 13 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Bharath VissapragadaGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Marcel Kornacker Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4822: Implement dynamic log level changes
Henry Robinson has posted comments on this change. Change subject: IMPALA-4822: Implement dynamic log level changes .. Patch Set 13: (1 comment) http://gerrit.cloudera.org:8080/#/c/5792/13/be/src/util/logging-support.cc File be/src/util/logging-support.cc: Line 40 > My understanding is that it should be controlled by log4j on the JVM side r This mechanism exists so that we can configure Impala's logging in one place, and have the behaviour be consistent across FE and BE. I don't think we should introduce the burden of having them configured separately - is that what this change now implies? Is trace logging off by default? We should expect that 95% of deployments won't use this feature for fine-grained control, so the out-of-the-box behaviour has to be good enough for them. -- To view, visit http://gerrit.cloudera.org:8080/5792 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I588418e9bcb0b66d33138baf96207a5a35bfbd63 Gerrit-PatchSet: 13 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Bharath VissapragadaGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Marcel Kornacker Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4822: Implement dynamic log level changes
Henry Robinson has posted comments on this change. Change subject: IMPALA-4822: Implement dynamic log level changes .. Patch Set 13: (1 comment) http://gerrit.cloudera.org:8080/#/c/5792/13/be/src/util/logging-support.cc File be/src/util/logging-support.cc: Line 40 > These come in the way of dynamic log4j dynamic log level changes because th Does that mean that severities that get mapped to VLOG will always get printed? i.e. if we have LOG.debug("something") in the frontend, that will get printed at INFO level (per line 55)? -- To view, visit http://gerrit.cloudera.org:8080/5792 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I588418e9bcb0b66d33138baf96207a5a35bfbd63 Gerrit-PatchSet: 13 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Bharath VissapragadaGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Marcel Kornacker Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4822: Implement dynamic log level changes
Henry Robinson has posted comments on this change. Change subject: IMPALA-4822: Implement dynamic log level changes .. Patch Set 14: Code-Review+2 Discussed offline. I believe the out-of-the-box behaviour hasn't changed. -- To view, visit http://gerrit.cloudera.org:8080/5792 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I588418e9bcb0b66d33138baf96207a5a35bfbd63 Gerrit-PatchSet: 14 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Bharath VissapragadaGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Marcel Kornacker Gerrit-HasComments: No
[native-toolchain-CR] Bump breakpad upstream version
Henry Robinson has posted comments on this change. Change subject: Bump breakpad upstream version .. Patch Set 2: Code-Review+2 (2 comments) http://gerrit.cloudera.org:8080/#/c/6213/2//COMMIT_MSG Commit Message: PS2, Line 7: upstream remove, not sure where 'upstream' is in this context. http://gerrit.cloudera.org:8080/#/c/6213/2/source/breakpad/make-breakpad-src-archive.sh File source/breakpad/make-breakpad-src-archive.sh: Line 21: # releases. Mention where the original is? -- To view, visit http://gerrit.cloudera.org:8080/6213 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I8859ced5048a9f3098c9ec3f24e7331700a1 Gerrit-PatchSet: 2 Gerrit-Project: native-toolchain Gerrit-Branch: master Gerrit-Owner: Lars VolkerGerrit-Reviewer: Henry Robinson Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5005: Don't allow server to send SASL COMPLETE msg out of order
Henry Robinson has uploaded a new change for review. http://gerrit.cloudera.org:8080/6190 Change subject: IMPALA-5005: Don't allow server to send SASL COMPLETE msg out of order .. IMPALA-5005: Don't allow server to send SASL COMPLETE msg out of order Change-Id: I0c0d931d5d6ef3cbf50e2c36619cab2e0c72b629 --- M be/src/transport/TSaslTransport.cpp 1 file changed, 7 insertions(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/90/6190/1 -- To view, visit http://gerrit.cloudera.org:8080/6190 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: I0c0d931d5d6ef3cbf50e2c36619cab2e0c72b629 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Henry Robinson
[Impala-ASF-CR] IMPALA-4856: Port ImpalaInternalService to KRPC
Henry Robinson has posted comments on this change. Change subject: IMPALA-4856: Port ImpalaInternalService to KRPC .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/5888/1/be/src/service/impala-internal-service.cc File be/src/service/impala-internal-service.cc: Line 66: > sync callback ? AddData func may block some time that will cause many work You are correct, but those changes are to come in a subsequent patch which changes the protocol a bit. -- To view, visit http://gerrit.cloudera.org:8080/5888 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I95229290566a8ccffd80ed2d74c1c57cf1479238 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Henry RobinsonGerrit-Reviewer: Anonymous Coward #168 Gerrit-Reviewer: Henry Robinson Gerrit-HasComments: Yes
[native-toolchain-CR] IMPALA-4983: Compile LZ4 in release mode
Henry Robinson has posted comments on this change. Change subject: IMPALA-4983: Compile LZ4 in release mode .. Patch Set 1: Code-Review+2 Verified+1 Carry +2 from Toolchain project. Verified by complete build. -- To view, visit http://gerrit.cloudera.org:8080/6164 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I8bd113822dfc4df2d76c4393c4b3b3550066dd18 Gerrit-PatchSet: 1 Gerrit-Project: native-toolchain Gerrit-Branch: master Gerrit-Owner: Henry RobinsonGerrit-Reviewer: Henry Robinson Gerrit-HasComments: No
[Impala-ASF-CR] Remove Scheduler abstract interface, rename SimpleScheduler -> Scheduler
Henry Robinson has posted comments on this change. Change subject: Remove Scheduler abstract interface, rename SimpleScheduler -> Scheduler .. Patch Set 2: (1 comment) I've reverted the renames, temporarily, for easy review. I haven't reverted the CMakeFiles or includes so this patch won't compile. http://gerrit.cloudera.org:8080/#/c/6149/1/be/src/scheduling/scheduler.h File be/src/scheduling/scheduler.h: Line 18 > revert. Done -- To view, visit http://gerrit.cloudera.org:8080/6149 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I70e0b002fa56d3ba1c3b34f03ae05f4042ac309e Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Henry RobinsonGerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Marcel Kornacker Gerrit-HasComments: Yes
[native-toolchain-CR] IMPALA-4983: Compile LZ4 in release mode
Henry Robinson has submitted this change and it was merged. Change subject: IMPALA-4983: Compile LZ4 in release mode .. IMPALA-4983: Compile LZ4 in release mode LZ4 was not compiled (apparently ever?) with optimization enabled. In recent versions this lead to a regression in compression time that was noticeable vs previous LZ4 versions. With optimization enabled, the regression appears to vanish. Change-Id: I8bd113822dfc4df2d76c4393c4b3b3550066dd18 --- M source/lz4/build.sh 1 file changed, 1 insertion(+), 1 deletion(-) Approvals: Henry Robinson: Looks good to me, approved; Verified -- To view, visit http://gerrit.cloudera.org:8080/6164 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: merged Gerrit-Change-Id: I8bd113822dfc4df2d76c4393c4b3b3550066dd18 Gerrit-PatchSet: 1 Gerrit-Project: native-toolchain Gerrit-Branch: master Gerrit-Owner: Henry RobinsonGerrit-Reviewer: Henry Robinson
[Impala-ASF-CR] Remove Scheduler abstract interface, rename SimpleScheduler -> Scheduler
Henry Robinson has uploaded a new patch set (#2). Change subject: Remove Scheduler abstract interface, rename SimpleScheduler -> Scheduler .. Remove Scheduler abstract interface, rename SimpleScheduler -> Scheduler For the last five years we've had the Scheduler superclass with exactly one implementation, SimpleScheduler. This patch collapses that hierarchy into just one concrete implementation called Scheduler. Also fixed up some includes in (the new) scheduler.h based on the include-what-you-use tool. Change-Id: I70e0b002fa56d3ba1c3b34f03ae05f4042ac309e --- M be/src/runtime/exec-env.cc M be/src/scheduling/CMakeLists.txt M be/src/scheduling/admission-controller.cc M be/src/scheduling/query-schedule.h D be/src/scheduling/scheduler.h M be/src/scheduling/simple-scheduler-test-util.cc M be/src/scheduling/simple-scheduler-test-util.h M be/src/scheduling/simple-scheduler-test.cc M be/src/scheduling/simple-scheduler.cc M be/src/scheduling/simple-scheduler.h M be/src/service/impala-server.cc M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java M fe/src/main/java/org/apache/impala/util/MembershipSnapshot.java 13 files changed, 238 insertions(+), 301 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/49/6149/2 -- To view, visit http://gerrit.cloudera.org:8080/6149 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I70e0b002fa56d3ba1c3b34f03ae05f4042ac309e Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Henry RobinsonGerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Marcel Kornacker
[Impala-ASF-CR] IMPALA-4983: Set toolchain version to include LZ4 build flags
Henry Robinson has uploaded a new change for review. http://gerrit.cloudera.org:8080/6165 Change subject: IMPALA-4983: Set toolchain version to include LZ4 build flags .. IMPALA-4983: Set toolchain version to include LZ4 build flags This bumps the toolchain version to include a change that compiles LZ4 with -O3. Due to a limitation in the way the toolchain version dependencies, existing downloads of LZ4 1.7.5 will not be updated. Therefore to ensure you have the most recent binaries, do the following: rm -rf ${IMPALA_TOOLCHAIN}/lz4-1.7.5/ cd ${IMPALA_HOME} && . bin/impala-config.sh bin/bootstrap_toolchain.py Change-Id: Ie9244188d44523227bab487a58f23954c4727c71 --- M bin/impala-config.sh 1 file changed, 1 insertion(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/65/6165/1 -- To view, visit http://gerrit.cloudera.org:8080/6165 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: Ie9244188d44523227bab487a58f23954c4727c71 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Henry Robinson
[native-toolchain-CR] IMPALA-4983: Compile LZ4 in release mode
Henry Robinson has uploaded a new change for review. http://gerrit.cloudera.org:8080/6164 Change subject: IMPALA-4983: Compile LZ4 in release mode .. IMPALA-4983: Compile LZ4 in release mode LZ4 was not compiled (apparently ever?) with optimization enabled. In recent versions this lead to a regression in compression time that was noticeable vs previous LZ4 versions. With optimization enabled, the regression appears to vanish. Change-Id: I8bd113822dfc4df2d76c4393c4b3b3550066dd18 --- M source/lz4/build.sh 1 file changed, 1 insertion(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/native-toolchain refs/changes/64/6164/1 -- To view, visit http://gerrit.cloudera.org:8080/6164 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: I8bd113822dfc4df2d76c4393c4b3b3550066dd18 Gerrit-PatchSet: 1 Gerrit-Project: native-toolchain Gerrit-Branch: master Gerrit-Owner: Henry Robinson
[Impala-ASF-CR] IMPALA-5001: Redownload dependencies if toolchain ID changes
Henry Robinson has uploaded a new change for review. http://gerrit.cloudera.org:8080/6166 Change subject: IMPALA-5001: Redownload dependencies if toolchain ID changes .. IMPALA-5001: Redownload dependencies if toolchain ID changes To allow for the situation where a dependency's compile flags change (but otherwise nothing else does, including the version), this patch adds the toolchain build ID to the version information checked by bootstrap_toolchain.py when it decides whether to download a new dependency version. Also add a small amount of logging to make it clearer what the script is doing. Change-Id: I77fcfbeb4594b14da2048a9c31458f634600c1e6 --- M bin/bootstrap_toolchain.py 1 file changed, 26 insertions(+), 5 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/66/6166/1 -- To view, visit http://gerrit.cloudera.org:8080/6166 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: I77fcfbeb4594b14da2048a9c31458f634600c1e6 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Henry Robinson
[Impala-ASF-CR] Remove Scheduler abstract interface, rename SimpleScheduler -> Scheduler
Henry Robinson has posted comments on this change. Change subject: Remove Scheduler abstract interface, rename SimpleScheduler -> Scheduler .. Patch Set 2: (2 comments) http://gerrit.cloudera.org:8080/#/c/6149/2/be/src/scheduling/simple-scheduler.cc File be/src/scheduling/simple-scheduler.cc: Line 315: NULL; > could you please restore the old formatting here? I find the autoformatting Done Line 391: TUniqueId instance_id = fragment_params->is_coord_fragment ? > same here Done -- To view, visit http://gerrit.cloudera.org:8080/6149 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I70e0b002fa56d3ba1c3b34f03ae05f4042ac309e Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Henry RobinsonGerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Marcel Kornacker Gerrit-HasComments: Yes
[Impala-ASF-CR] Remove Scheduler abstract interface, rename SimpleScheduler -> Scheduler
Hello Marcel Kornacker, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/6149 to look at the new patch set (#3). Change subject: Remove Scheduler abstract interface, rename SimpleScheduler -> Scheduler .. Remove Scheduler abstract interface, rename SimpleScheduler -> Scheduler For the last five years we've had the Scheduler superclass with exactly one implementation, SimpleScheduler. This patch collapses that hierarchy into just one concrete implementation called Scheduler. Also fixed up some includes in (the new) scheduler.h based on the include-what-you-use tool. Change-Id: I70e0b002fa56d3ba1c3b34f03ae05f4042ac309e --- M be/src/runtime/exec-env.cc M be/src/scheduling/CMakeLists.txt M be/src/scheduling/admission-controller.cc M be/src/scheduling/query-schedule.h R be/src/scheduling/scheduler-test-util.cc R be/src/scheduling/scheduler-test-util.h R be/src/scheduling/scheduler-test.cc R be/src/scheduling/scheduler.cc M be/src/scheduling/scheduler.h D be/src/scheduling/simple-scheduler.h M be/src/service/impala-server.cc M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java M fe/src/main/java/org/apache/impala/util/MembershipSnapshot.java 13 files changed, 633 insertions(+), 695 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/49/6149/3 -- To view, visit http://gerrit.cloudera.org:8080/6149 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I70e0b002fa56d3ba1c3b34f03ae05f4042ac309e Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Henry RobinsonGerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Marcel Kornacker
[Impala-ASF-CR] Remove Scheduler abstract interface, rename SimpleScheduler -> Scheduler
Henry Robinson has posted comments on this change. Change subject: Remove Scheduler abstract interface, rename SimpleScheduler -> Scheduler .. Patch Set 3: Code-Review+2 Rebase, carry +2. -- To view, visit http://gerrit.cloudera.org:8080/6149 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I70e0b002fa56d3ba1c3b34f03ae05f4042ac309e Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Henry RobinsonGerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Marcel Kornacker Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5001: Redownload dependencies if toolchain ID changes
Henry Robinson has posted comments on this change. Change subject: IMPALA-5001: Redownload dependencies if toolchain ID changes .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/6166/1/bin/bootstrap_toolchain.py File bin/bootstrap_toolchain.py: Line 116: print("No update necessary\n") > add a space between the * and this 'No...' You mean a carriage return? That happens automatically. -- To view, visit http://gerrit.cloudera.org:8080/6166 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I77fcfbeb4594b14da2048a9c31458f634600c1e6 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Henry RobinsonGerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Matthew Jacobs Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4983: Set toolchain version to include LZ4 build flags
Henry Robinson has posted comments on this change. Change subject: IMPALA-4983: Set toolchain version to include LZ4 build flags .. Patch Set 1: > I assume this has passed tests with the new lz4 bits? Basic tests pass. The full suite will be run by GVO as normal. -- To view, visit http://gerrit.cloudera.org:8080/6165 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ie9244188d44523227bab487a58f23954c4727c71 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Henry RobinsonGerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Matthew Jacobs Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4983: Set toolchain version to include LZ4 build flags
Henry Robinson has posted comments on this change. Change subject: IMPALA-4983: Set toolchain version to include LZ4 build flags .. Patch Set 1: Confirmed that the perf regression is fixed. With IMPALA-5001 the steps aren't necessary (just a reminder to bootstrap the toolchain). -- To view, visit http://gerrit.cloudera.org:8080/6165 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ie9244188d44523227bab487a58f23954c4727c71 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Henry RobinsonGerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Matthew Jacobs Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4822: Implement dynamic log level changes
Henry Robinson has posted comments on this change. Change subject: IMPALA-4822: Implement dynamic log level changes .. Patch Set 13: Code-Review+2 (4 comments) Looks pretty good to me. http://gerrit.cloudera.org:8080/#/c/5792/13/be/src/util/logging-support.cc File be/src/util/logging-support.cc: Line 40 What's the effect of removing these? Where is the VLOG masking done now? http://gerrit.cloudera.org:8080/#/c/5792/13/tests/webserver/test_web_pages.py File tests/webserver/test_web_pages.py: PS13, Line 127: set_glog_url =\ : (self.SET_GLOG_LOGLEVEL_URL + "?glog=foo") nit: one line, no parentheses? http://gerrit.cloudera.org:8080/#/c/5792/13/www/log_level.tmpl File www/log_level.tmpl: PS13, Line 21: error It might make more sense to print the form if there was an error, so that the mistake can be rectified. PS13, Line 25: width: 600px; Why set the width? Does the page look bad if you let it figure out the width from the parent div? -- To view, visit http://gerrit.cloudera.org:8080/5792 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I588418e9bcb0b66d33138baf96207a5a35bfbd63 Gerrit-PatchSet: 13 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Bharath VissapragadaGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Marcel Kornacker Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5001: Redownload dependencies if toolchain ID changes
Henry Robinson has posted comments on this change. Change subject: IMPALA-5001: Redownload dependencies if toolchain ID changes .. Patch Set 1: (2 comments) http://gerrit.cloudera.org:8080/#/c/6166/1/bin/bootstrap_toolchain.py File bin/bootstrap_toolchain.py: Line 116: print("No update necessary\n") > ah right, thrown off by the explicit \n added. I thought this was intended Done Line 124: > nit: extra newline Done -- To view, visit http://gerrit.cloudera.org:8080/6166 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I77fcfbeb4594b14da2048a9c31458f634600c1e6 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Henry RobinsonGerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Matthew Jacobs Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5001: Redownload dependencies if toolchain ID changes
Hello Matthew Jacobs, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/6166 to look at the new patch set (#2). Change subject: IMPALA-5001: Redownload dependencies if toolchain ID changes .. IMPALA-5001: Redownload dependencies if toolchain ID changes To allow for the situation where a dependency's compile flags change (but otherwise nothing else does, including the version), this patch adds the toolchain build ID to the version information checked by bootstrap_toolchain.py when it decides whether to download a new dependency version. Also add a small amount of logging to make it clearer what the script is doing. Change-Id: I77fcfbeb4594b14da2048a9c31458f634600c1e6 --- M bin/bootstrap_toolchain.py 1 file changed, 25 insertions(+), 5 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/66/6166/2 -- To view, visit http://gerrit.cloudera.org:8080/6166 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I77fcfbeb4594b14da2048a9c31458f634600c1e6 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Henry RobinsonGerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Matthew Jacobs
[Impala-ASF-CR] IMPALA-4138: Fix AcquireState() for batches that have MarkCapacity() called
Henry Robinson has posted comments on this change. Change subject: IMPALA-4138: Fix AcquireState() for batches that have MarkCapacity() called .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/4428/1/be/src/runtime/row-batch.cc File be/src/runtime/row-batch.cc: Line 319: capacity_ = tuple_ptrs_size_ / (num_tuples_per_row_ * sizeof(Tuple*)); > We already know the initial capacity using this calculation - we should com What about just setting capacity_ = initial_capacity_ here? -- To view, visit http://gerrit.cloudera.org:8080/4428 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I6ceca53c406b05cd04b7d95a4f9f2eec7bc127f5 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Henry RobinsonGerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4138: Fix AcquireState() for batches that have MarkCapacity() called
Henry Robinson has uploaded a new patch set (#2). Change subject: IMPALA-4138: Fix AcquireState() for batches that have MarkCapacity() called .. IMPALA-4138: Fix AcquireState() for batches that have MarkCapacity() called If MarkCapacity() is called on a row batch, it is difficult to call AcquireState() on that batch because tuple_ptrs_size_ is not accessible to initialise the destination batch - this is usually calculated from capacity(), but that value is wrong for these purposes after MarkCapacity(). Add RowBatch::InitialCapacity() to return the initial capacity value of the batch. Add row-batch-test to add initial coverage of AcquireState() API. Change-Id: I6ceca53c406b05cd04b7d95a4f9f2eec7bc127f5 --- M be/src/runtime/CMakeLists.txt A be/src/runtime/row-batch-test.cc M be/src/runtime/row-batch.cc M be/src/runtime/row-batch.h 4 files changed, 77 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/28/4428/2 -- To view, visit http://gerrit.cloudera.org:8080/4428 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I6ceca53c406b05cd04b7d95a4f9f2eec7bc127f5 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Henry RobinsonGerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-4138: Fix AcquireState() for batches that have MarkCapacity() called
Henry Robinson has posted comments on this change. Change subject: IMPALA-4138: Fix AcquireState() for batches that have MarkCapacity() called .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/4428/2/be/src/runtime/row-batch.h File be/src/runtime/row-batch.h: PS2, Line 282: MarkCapacity > MarkAtCapacity I think Done -- To view, visit http://gerrit.cloudera.org:8080/4428 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I6ceca53c406b05cd04b7d95a4f9f2eec7bc127f5 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Henry RobinsonGerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4138: Fix AcquireState() for batches that have MarkCapacity() called
Hello Tim Armstrong, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/4428 to look at the new patch set (#4). Change subject: IMPALA-4138: Fix AcquireState() for batches that have MarkCapacity() called .. IMPALA-4138: Fix AcquireState() for batches that have MarkCapacity() called If MarkCapacity() is called on a row batch, it is difficult to call AcquireState() on that batch because tuple_ptrs_size_ is not accessible to initialise the destination batch - this is usually calculated from capacity(), but that value is wrong for these purposes after MarkCapacity(). Add RowBatch::InitialCapacity() to return the initial capacity value of the batch. Add row-batch-test to add initial coverage of AcquireState() API. Change-Id: I6ceca53c406b05cd04b7d95a4f9f2eec7bc127f5 --- M be/src/runtime/CMakeLists.txt A be/src/runtime/row-batch-test.cc M be/src/runtime/row-batch.cc M be/src/runtime/row-batch.h 4 files changed, 77 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/28/4428/4 -- To view, visit http://gerrit.cloudera.org:8080/4428 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I6ceca53c406b05cd04b7d95a4f9f2eec7bc127f5 Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Henry RobinsonGerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-4187: Switch RPC latency metrics to histograms
Hello Juan Yu, Matthew Jacobs, Dan Hecht, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/4516 to look at the new patch set (#4). Change subject: IMPALA-4187: Switch RPC latency metrics to histograms .. IMPALA-4187: Switch RPC latency metrics to histograms It's usually better to measure latency distributions with histograms, not avg / min / max. This change switches out the metrics that measure the latency of RPC processing time to histograms. Add HistogramMetric::Reset() to allow histogram to remove all entries. Added spinlock around histogram access. On a 8-core i7 @ 3.4GHz, the following throughputs were observed: 1 thread -> 25M updates/sec 4 threads -> 7M updates/sec 16 threads -> 5M updates/sec Each histogram takes ~108KB of storage for its buckets. This can be reduced by reducing the maximum value, currently 60 minutes. The new metrics have the following text representation: Count: 148, 25th %-ile: 0, 50th %-ile: 0, 75th %-ile: 0, 90th %-ile: 0, 95th %-ile: 0, 99.9th %-ile: 1ms Which changes from the old: count: 345, last: 6ms, min: 0, max: 12ms, mean: 1ms, stddev: 1ms Change-Id: I9ba6d4270dd5676eeeff35ad8d9dc5dcddd95e34 --- M be/src/rpc/rpc-trace.cc M be/src/rpc/rpc-trace.h M be/src/util/histogram-metric.h M common/thrift/metrics.json 4 files changed, 64 insertions(+), 37 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/16/4516/4 -- To view, visit http://gerrit.cloudera.org:8080/4516 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I9ba6d4270dd5676eeeff35ad8d9dc5dcddd95e34 Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Henry RobinsonGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Juan Yu Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Sailesh Mukil
[Impala-ASF-CR] IMPALA-3823: Add timer to measure Parquet footer reads
Henry Robinson has posted comments on this change. Change subject: IMPALA-3823: Add timer to measure Parquet footer reads .. Patch Set 6: (1 comment) http://gerrit.cloudera.org:8080/#/c/4371/6/be/src/util/runtime-profile-counters.h File be/src/util/runtime-profile-counters.h: Line 228: int32_t total_num_values() const { return total_num_values_; } > Since it's only a read, I figured even if the CPU reorders the read instruc Without reader synchronization you can probably have the following sequence: Previously, counter.min = counter.max = 0 Writer: counter.min = V counter.max = V Reader: mn = counter->min_value() // == V mx = counter->max_value() // == 0 DCHECK_LE(mn, mx) << "Fails" So you would need to synchronize with the writer to ensure that an update appears atomic. TSO helps you avoid certain kinds of ordering bugs - but remember that others are trying to port this to PPC. You have, in RuntimeProfile::SummaryStatsCounter::UpdateCounter(): ++total_num_values_; sum_ += new_value; If you did an unsynchronized read of sum and total_num_values on a non-TSO architecture, you could easily get total_num_values == 0 but sum > 0. -- To view, visit http://gerrit.cloudera.org:8080/4371 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Icf87bad90037dd0cea63b10c537382ec0f980cbf Gerrit-PatchSet: 6 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Sailesh MukilGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Sailesh Mukil Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4135: Thrift threaded server times-out connections during high load
Henry Robinson has posted comments on this change. Change subject: IMPALA-4135: Thrift threaded server times-out connections during high load .. Patch Set 3: (8 comments) Some comments before I head off to Strata. I would run clang-format over the new thrift files. That way they'll be more readable to Impala developers, while keeping the structure and idioms of the thrift code. http://gerrit.cloudera.org:8080/#/c/4519/3/be/src/rpc/TAcceptQueueServer.cpp File be/src/rpc/TAcceptQueueServer.cpp: PS3, Line 204: 1 this is an important enough parameter that I'd make a constant for it, and put your comments about only using thread on that. We don't want someone changing the number of threads without understanding the thread-safety implications. PS3, Line 216: acceptPool.Offer returned false. If this returns false it's because the queue is shut down. Better to say that clearly in the error message so users have an idea what this might mean. Also you can say that it's unexpected - we never shut down our servers so the queue should never be shut down. http://gerrit.cloudera.org:8080/#/c/4519/3/be/src/rpc/TAcceptQueueServer.h File be/src/rpc/TAcceptQueueServer.h: Line 46: * connections will time out while in the OS accept queue. add a reference to IMPALA-4135 here. PS3, Line 105: std::mutex big_lock_; Is this used anywhere? http://gerrit.cloudera.org:8080/#/c/4519/3/be/src/rpc/thrift-server-test.cc File be/src/rpc/thrift-server-test.cc: Line 174: ThreadPool pool("group", "test", 256, 1, [](int tid, const int64_t& item) { Add a comment about what you're testing and the associated JIRA. PS3, Line 178: ASSERT_OK(status); Does this work in a threaded context - does the test fail if the thread hits an ASSERT? PS3, Line 180: for (int i = 0; i < 1024 * 16; ++i) pool.Offer(i); I'm a bit concerned about failing with ulimit errors, particularly because both client and server should be in the same process. Have you ever seen this repro with fewer concurrent connections - say 8K? PS3, Line 173: TEST(ConcurrencyTest, ConnectTimeout) { : ThreadPool pool("group", "test", 256, 1, [](int tid, const int64_t& item) { : using Client = ThriftClient; : Client* client = new Client("127.0.0.1", 22000, "", NULL, false); : Status status = client->Open(); : ASSERT_OK(status); : }); : for (int i = 0; i < 1024 * 16; ++i) pool.Offer(i); : pool.DrainAndShutdown(); : } This relies on a running impala internal service - you probably had an Impala process running when you ran this? That won't be true in our builds. Use the same server code that all the other tests use to start a server in this process. -- To view, visit http://gerrit.cloudera.org:8080/4519 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ie50e728974ef31a9d49132a0b3f7cde2a4f3356d Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Thomas Tauber-MarshallGerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Juan Yu Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4135: Thrift threaded server times-out connections during high load
Henry Robinson has posted comments on this change. Change subject: IMPALA-4135: Thrift threaded server times-out connections during high load .. Patch Set 4: (7 comments) Looking pretty good. http://gerrit.cloudera.org:8080/#/c/4519/4/be/src/rpc/TAcceptQueueServer.cpp File be/src/rpc/TAcceptQueueServer.cpp: PS4, Line 136: doAccept doAccept() isn't a great name (I know it was from my patch!) because this isn't actually doing the work of accept, but instead establishing a connection. Perhaps call this SetupConnection() ? You can use Impala's convention for method capitalisation; again makes it easier to see what was added. PS4, Line 204: acceptPool connection_handler_pool, or something similar. PS4, Line 205: ACCEPT_THREAD_POOL_SIZE This can be defined in this method as a constexpr. http://gerrit.cloudera.org:8080/#/c/4519/4/be/src/rpc/TAcceptQueueServer.h File be/src/rpc/TAcceptQueueServer.h: PS4, Line 96: thread strange line breaks in this comment. PS4, Line 97: static const int ACCEPT_THREAD_POOL_SIZE = 1; move into .cc file http://gerrit.cloudera.org:8080/#/c/4519/4/be/src/rpc/thrift-server-test.cc File be/src/rpc/thrift-server-test.cc: PS4, Line 173: ConnectTimeout Call this ManyConcurrentConnection or something. Line 175: // waiting to be accepted.(IMPALA-4135) Mention that this does not always fail. How long does it take to run? -- To view, visit http://gerrit.cloudera.org:8080/4519 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ie50e728974ef31a9d49132a0b3f7cde2a4f3356d Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Thomas Tauber-MarshallGerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Juan Yu Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-HasComments: Yes
[Impala-ASF-CR] Remove spurious Boost warnings on compilation errors
Henry Robinson has uploaded a new change for review. http://gerrit.cloudera.org:8080/4564 Change subject: Remove spurious Boost warnings on compilation errors .. Remove spurious Boost warnings on compilation errors Compilation errors can spuriously print warnings from Boost where the filesystem module is used, like: ‘boost::system::posix_category’ defined but not used Defining BOOST_SYSTEM_NO_DEPRECATED removes those warnings, which arise from Boost maintaining deprecated names for error codes that have moved namespaces during the shift to C++11 (see http://www.boost.org/doc/libs/1_61_0/boost/system/error_code.hpp and http://www.boost.org/doc/libs/1_61_0/libs/system/doc/reference.html). We're not using the old names, so it's ok to remove them. Change-Id: Ib84d8a9958469fb22b0af4907958917a65e8290f --- M be/CMakeLists.txt 1 file changed, 1 insertion(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/64/4564/1 -- To view, visit http://gerrit.cloudera.org:8080/4564 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: Ib84d8a9958469fb22b0af4907958917a65e8290f Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Henry Robinson
[Impala-ASF-CR] IMPALA-4135: Thrift threaded server times-out connections during high load
Henry Robinson has posted comments on this change. Change subject: IMPALA-4135: Thrift threaded server times-out connections during high load .. Patch Set 6: Code-Review+2 (1 comment) http://gerrit.cloudera.org:8080/#/c/4519/6//COMMIT_MSG Commit Message: PS6, Line 25: awhile a while -- To view, visit http://gerrit.cloudera.org:8080/4519 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ie50e728974ef31a9d49132a0b3f7cde2a4f3356d Gerrit-PatchSet: 6 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Thomas Tauber-MarshallGerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Juan Yu Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4246: SleepForMs() utility function has undefined behavior for > 1s
Henry Robinson has posted comments on this change. Change subject: IMPALA-4246: SleepForMs() utility function has undefined behavior for > 1s .. Patch Set 1: Code-Review+2 (2 comments) http://gerrit.cloudera.org:8080/#/c/4622/1/be/src/util/time.cc File be/src/util/time.cc: PS1, Line 26: std:: remove std:: PS1, Line 26: std:: remove std:: -- To view, visit http://gerrit.cloudera.org:8080/4622 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I06c55b1be287b264e7601c9c89788ae5929571cf Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Sailesh MukilGerrit-Reviewer: Henry Robinson Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4135: Thrift threaded server times-out connections during high load
Henry Robinson has uploaded a new patch set (#8). Change subject: IMPALA-4135: Thrift threaded server times-out connections during high load .. IMPALA-4135: Thrift threaded server times-out connections during high load During times of high load, Thrift's TThreadedServer can't keep up with the rate of new socket connections, causing some to time out. This patch creates TAcceptQueueServer, which is a modified version of TThreadedServer that calls accept() and then hands the returned TTransport off to a thread pool to handle setting up the connection. This ensures that accept() is called as quickly as possible, preventing connections from timing out while waiting. It also adds a metric, connection-setup-queue-size, to monitor the number of accepted connections waiting to be processed. A flag, --accepted_cnxn_queue_depth, controls the size of the accepted connection buffer. Testing: - New test added to thrift-server-test. (Disabled by default, due to high ulimit -n requirement) - Locally with the repro shown in IMPALA-4135. - On the 16-node with a real repro query. - Ran the stress test for a while. Change-Id: Ie50e728974ef31a9d49132a0b3f7cde2a4f3356d --- M be/src/common/global-flags.cc M be/src/rpc/CMakeLists.txt A be/src/rpc/TAcceptQueueServer.cpp A be/src/rpc/TAcceptQueueServer.h M be/src/rpc/thrift-server-test.cc M be/src/rpc/thrift-server.cc M be/src/rpc/thrift-server.h M common/thrift/metrics.json 8 files changed, 545 insertions(+), 14 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/19/4519/8 -- To view, visit http://gerrit.cloudera.org:8080/4519 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ie50e728974ef31a9d49132a0b3f7cde2a4f3356d Gerrit-PatchSet: 8 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Thomas Tauber-MarshallGerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Juan Yu Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Thomas Tauber-Marshall
[Impala-ASF-CR] IMPALA-2905: Handle coordinator fragment lifecycle like all others
Henry Robinson has posted comments on this change. Change subject: IMPALA-2905: Handle coordinator fragment lifecycle like all others .. Patch Set 9: After some discussion, moved the result materialization into Send(). The logic comes out a bit cleaner, because there is no reason to worry about the lifetime of the row batch - when Send() returns, the row batch is finished with. -- To view, visit http://gerrit.cloudera.org:8080/4402 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ibb0064ec2f085fa3a5598ea80894fb489a01e4df Gerrit-PatchSet: 9 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Henry RobinsonGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4135: Thrift threaded server times-out connections during high load
Henry Robinson has posted comments on this change. Change subject: IMPALA-4135: Thrift threaded server times-out connections during high load .. Patch Set 8: Code-Review+2 Rebase (plus add a flag and disable the test), carry +2. -- To view, visit http://gerrit.cloudera.org:8080/4519 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ie50e728974ef31a9d49132a0b3f7cde2a4f3356d Gerrit-PatchSet: 8 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Thomas Tauber-MarshallGerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Juan Yu Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3853: squeasel is MIT (and dual copyright) not Apache
Henry Robinson has posted comments on this change. Change subject: IMPALA-3853: squeasel is MIT (and dual copyright) not Apache .. Patch Set 1: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/4646 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I9711ad60dbe00c3b8b1ce7b9ccc3ca1dd637b88c Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Jim AppleGerrit-Reviewer: Henry Robinson Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-2905: Handle coordinator fragment lifecycle like all others
Henry Robinson has uploaded a new patch set (#8). Change subject: IMPALA-2905: Handle coordinator fragment lifecycle like all others .. IMPALA-2905: Handle coordinator fragment lifecycle like all others The plan-root fragment instance that runs on the coordinator should be handled like all others: started via RPC and run asynchronously. Without this, the fragment requires special-case code throughout the coordinator, and does not show up in system metrics etc. This patch adds a new sink type, PlanRootSink, to the root fragment instance so that the coordinator can pull row batches that are pushed by the root instance. The coordinator signals completion to the fragment instance via closing the consumer side of the sink, whereupon the instance is free to complete. Since the root instance now runs asynchronously wrt to the coordinator, we add several coordination methods to allow the coordinator to wait for a point in the instance's execution to be hit - e.g. to wait until the instance has been opened. Done in this patch: * Add PlanRootSink * Add coordination to PFE to allow coordinator to observe lifecycle * Make FragmentMgr a singleton * Removed dead code from Coordinator::Wait() and elsewhere. * Moved result output exprs out of QES and into PlanRootSink. * Remove special-case limit-based teardown of coordinator fragment, and supporting functions in PlanFragmentExecutor. * Simplified lifecycle of PlanFragmentExecutor by separating Open() into OpenPlan() and Exec(), the latter of which drives the sink by reading rows from the plan tree. * Add child profile to PlanFragmentExecutor to measure time spent in each lifecycle phase. * Removed dependency between InitExecProfiles() and starting root fragment. Not yet done: * Fix planner tests to reflect new sink added at root. Change-Id: Ibb0064ec2f085fa3a5598ea80894fb489a01e4df --- M be/src/exec/CMakeLists.txt M be/src/exec/data-sink.cc A be/src/exec/plan-root-sink.cc A be/src/exec/plan-root-sink.h M be/src/runtime/coordinator.cc M be/src/runtime/coordinator.h M be/src/runtime/exec-env.cc M be/src/runtime/exec-env.h M be/src/runtime/plan-fragment-executor.cc M be/src/runtime/plan-fragment-executor.h M be/src/scheduling/query-schedule.cc M be/src/scheduling/query-schedule.h M be/src/scheduling/simple-scheduler.cc M be/src/service/fragment-exec-state.cc M be/src/service/fragment-exec-state.h M be/src/service/fragment-mgr.cc M be/src/service/impala-beeswax-server.cc M be/src/service/impala-hs2-server.cc M be/src/service/impala-internal-service.h M be/src/service/impala-server.cc M be/src/service/impala-server.h M be/src/service/query-exec-state.cc M be/src/service/query-exec-state.h M be/src/testutil/in-process-servers.cc M common/thrift/DataSinks.thrift M fe/src/main/java/org/apache/impala/analysis/QueryStmt.java A fe/src/main/java/org/apache/impala/planner/PlanRootSink.java M fe/src/main/java/org/apache/impala/planner/Planner.java M fe/src/main/java/org/apache/impala/planner/PlannerContext.java M tests/hs2/test_json_endpoints.py 30 files changed, 840 insertions(+), 709 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/02/4402/8 -- To view, visit http://gerrit.cloudera.org:8080/4402 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ibb0064ec2f085fa3a5598ea80894fb489a01e4df Gerrit-PatchSet: 8 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Henry RobinsonGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-3902: Scheduler improvements for running multiple fragment instances on a single backend
Henry Robinson has posted comments on this change. Change subject: IMPALA-3902: Scheduler improvements for running multiple fragment instances on a single backend .. Patch Set 18: Code-Review+1 (1 comment) Changes since PS16 look good to me. http://gerrit.cloudera.org:8080/#/c/4054/18/be/src/runtime/coordinator.h File be/src/runtime/coordinator.h: Line 242: /// fragment. All elements are non-nullptr. Owned by obj_pool(). Say when this is initialized. -- To view, visit http://gerrit.cloudera.org:8080/4054 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I240445199e22f009f4e72fdb8754eb8d77e3d680 Gerrit-PatchSet: 18 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Marcel KornackerGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Mostafa Mokhtar Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-2905: Handle coordinator fragment lifecycle like all others
Henry Robinson has posted comments on this change. Change subject: IMPALA-2905: Handle coordinator fragment lifecycle like all others .. Patch Set 7: (28 comments) This patch passes EE tests. I haven't completely addressed the issue of exec profiles in the coordinator - I intend to rebase on the MT coordinator patch which reworks that path. The major change is that the new sink hands off a single batch from the sender to the consumer at a time. Queuing and resource transfer are orthogonal concerns; the main benefit of this patch is the simplification of QES, coordinator and PFE paths. http://gerrit.cloudera.org:8080/#/c/4402/6/be/src/exec/push-pull-sink.cc File be/src/exec/push-pull-sink.cc: Line 31 > i think if you have the sink build a queue of QueryResultSet* (and change Q To make progress, I changed this so that the QES passes in a QueryResultSet via the coordinator, which is filled by GetNext(). We can look into constructing the QRS in this sink, and the memory and perf effects of queuing, in a separate patch. Line 54 > Will this thread be unblocked when the query is cancelled? I imagine a scen Implementation changed, not applicable. Line 58 > The row batches still sitting in the queue are Reset() implicitly when this Implementation changed, not applicable. http://gerrit.cloudera.org:8080/#/c/4402/6/be/src/exec/push-pull-sink.h File be/src/exec/push-pull-sink.h: Line 29 > QueueSink then? i couldn't quite figure out what PushPullSink was. Renamed to PlanRootSink. Line 30 > what's the size of the queue? Done Line 66 > why virtual? Done Line 77 > re should be movable: i don't particularly like this comment, since it seem Moot for now, since the queue is removed. http://gerrit.cloudera.org:8080/#/c/4402/6/be/src/runtime/coordinator.cc File be/src/runtime/coordinator.cc: Line 356: num_remaining_fragment_instances_(0), > Set in Exec() Done Line 471: > Can we get rid of rm_reserved_limit_ in MemTracker? Done Line 484: if (has_coordinator_fragment) { > spelling Done Line 486: root_fragment_instance_ = > leave todo to get that out of the qs. this is expensive to create. Fixed. Line 490: executor_->WaitForPrepare(); > const TPlanNode& ? Done Line 491: > my_row_desc -> root_row_desc Done Line 510: int num_hosts, int start_fragment_instance_idx) { > that's actually not true for the mt case. As discussed, we should make it true :) Line 1471: } > probably best to fix initialization as part of this change (it should prece Will defer to after rebasing on top of your changes which improve the profile creation path. Line 1472: > Is this always set, even in weird error scenarios? Same - going to defer until rebase on MT coord patch. Line 1637: } > that shouldn't be true anymore Will defer. http://gerrit.cloudera.org:8080/#/c/4402/6/be/src/runtime/coordinator.h File be/src/runtime/coordinator.h: Line 266: /// them to the client in GetNext(). > why do you need this and not just the queue sink? For row_desc() and runtime_state(). http://gerrit.cloudera.org:8080/#/c/4402/6/be/src/runtime/plan-fragment-executor.cc File be/src/runtime/plan-fragment-executor.cc: Line 339: return Status::OK(); > Could another thread wait for opened_promise_ indefinitely if this returns Yes, fixed by moving all the error-prone logic into OpenInternal() Line 359: while (report_thread_active_) { > it looks like we should break up Open() and move the row-producing logic in Good idea. Added Exec() which produces rows. Only the FES calls this API, which now runs: if (Prepare().ok()) { Open(); Exec(); } Close(); http://gerrit.cloudera.org:8080/#/c/4402/6/be/src/runtime/plan-fragment-executor.h File be/src/runtime/plan-fragment-executor.h: Line 49: /// PlanFragmentExecutor handles all aspects of the execution of a single plan fragment, > extend class description with new semantics. What are the new semantics? The GetNext() path is removed, but there's nothing new in its place as far as mechanisms in PFE go. http://gerrit.cloudera.org:8080/#/c/4402/6/be/src/service/fragment-mgr.cc File be/src/service/fragment-mgr.cc: Line 60 > don't all fragments now have a sink? Yes, but I think this is the wrong place to check it. The PFE already checks in Prepare(), so I think this check was mostly to be sure that we handed tried to remotely execute the coordinator fragment. Line 46: // Preparing and opening the fragment creates a thread and consumes a non-trivial > why does this need to happen up here? Reverted. http://gerrit.cloudera.org:8080/#/c/4402/6/be/src/service/impala-internal-service.h File be/src/service/impala-internal-service.h: Line 36: : impala_server_(impala_server) { > why even have that param if it's a singleton Done http://gerrit.cloudera.org:8080/#/c/4402/6/be/src/util/blocking-queue.h File be/src/util/blocking-queue.h: Line 35: > why is this
[Impala-ASF-CR] IMPALA-3983/IMPALA-3974: Delete function jar resources after load
Henry Robinson has posted comments on this change. Change subject: IMPALA-3983/IMPALA-3974: Delete function jar resources after load .. Patch Set 2: (1 comment) Any easy way to test this? Can you add some UDFs and then check the local directory for any leftover jar files? http://gerrit.cloudera.org:8080/#/c/4617/2/be/src/catalog/catalog.cc File be/src/catalog/catalog.cc: PS2, Line 48: String long line -- To view, visit http://gerrit.cloudera.org:8080/4617 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I5f9dedb5b342415380c83e61a72eb497371a8199 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Bharath VissapragadaGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Henry Robinson Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-3342: Adding new timer to accurately measure the TotalCpuTime
Henry Robinson has posted comments on this change. Change subject: IMPALA-3342: Adding new timer to accurately measure the TotalCpuTime .. Patch Set 1: (5 comments) http://gerrit.cloudera.org:8080/#/c/4633/1//COMMIT_MSG Commit Message: PS1, Line 10: correctly incorrectly http://gerrit.cloudera.org:8080/#/c/4633/1/be/src/util/stopwatch.h File be/src/util/stopwatch.h: Line 28: #include space after include - please run git-clang-format over your patches before submitting. PS1, Line 190: CpuTimeStopWatch Do you think we actually need a stopwatch here? What about a method that just returns the ns computed from getrusage()? Then the PFE can just get the time at the start of the computation, and then compute the delta when the fragment is complete. PS1, Line 190: class class needs a comment PS1, Line 225: end Why does this need to be a member? -- To view, visit http://gerrit.cloudera.org:8080/4633 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ifa88aa6f3371fa42d11ecc122f43c7d83623c300 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: anujphadkeGerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Yonghyun Hwang Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4135: Thrift threaded server times-out connections during high load
Henry Robinson has submitted this change and it was merged. Change subject: IMPALA-4135: Thrift threaded server times-out connections during high load .. IMPALA-4135: Thrift threaded server times-out connections during high load During times of high load, Thrift's TThreadedServer can't keep up with the rate of new socket connections, causing some to time out. This patch creates TAcceptQueueServer, which is a modified version of TThreadedServer that calls accept() and then hands the returned TTransport off to a thread pool to handle setting up the connection. This ensures that accept() is called as quickly as possible, preventing connections from timing out while waiting. It also adds a metric, connection-setup-queue-size, to monitor the number of accepted connections waiting to be processed. A flag, --accepted_cnxn_queue_depth, controls the size of the accepted connection buffer. Testing: - New test added to thrift-server-test. (Disabled by default, due to high ulimit -n requirement) - Locally with the repro shown in IMPALA-4135. - On the 16-node with a real repro query. - Ran the stress test for a while. Change-Id: Ie50e728974ef31a9d49132a0b3f7cde2a4f3356d Reviewed-on: http://gerrit.cloudera.org:8080/4519 Tested-by: Internal Jenkins Reviewed-by: Henry Robinson--- M be/src/common/global-flags.cc M be/src/rpc/CMakeLists.txt A be/src/rpc/TAcceptQueueServer.cpp A be/src/rpc/TAcceptQueueServer.h M be/src/rpc/thrift-server-test.cc M be/src/rpc/thrift-server.cc M be/src/rpc/thrift-server.h M common/thrift/metrics.json 8 files changed, 545 insertions(+), 14 deletions(-) Approvals: Henry Robinson: Looks good to me, approved Internal Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/4519 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: merged Gerrit-Change-Id: Ie50e728974ef31a9d49132a0b3f7cde2a4f3356d Gerrit-PatchSet: 9 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Thomas Tauber-Marshall Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Juan Yu Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Thomas Tauber-Marshall
[Impala-ASF-CR] IMPALA-2905: Handle coordinator fragment lifecycle like all others
Henry Robinson has uploaded a new patch set (#9). Change subject: IMPALA-2905: Handle coordinator fragment lifecycle like all others .. IMPALA-2905: Handle coordinator fragment lifecycle like all others The plan-root fragment instance that runs on the coordinator should be handled like all others: started via RPC and run asynchronously. Without this, the fragment requires special-case code throughout the coordinator, and does not show up in system metrics etc. This patch adds a new sink type, PlanRootSink, to the root fragment instance so that the coordinator can pull row batches that are pushed by the root instance. The coordinator signals completion to the fragment instance via closing the consumer side of the sink, whereupon the instance is free to complete. Since the root instance now runs asynchronously wrt to the coordinator, we add several coordination methods to allow the coordinator to wait for a point in the instance's execution to be hit - e.g. to wait until the instance has been opened. Done in this patch: * Add PlanRootSink * Add coordination to PFE to allow coordinator to observe lifecycle * Make FragmentMgr a singleton * Removed dead code from Coordinator::Wait() and elsewhere. * Moved result output exprs out of QES and into PlanRootSink. * Remove special-case limit-based teardown of coordinator fragment, and supporting functions in PlanFragmentExecutor. * Simplified lifecycle of PlanFragmentExecutor by separating Open() into OpenPlan() and Exec(), the latter of which drives the sink by reading rows from the plan tree. * Add child profile to PlanFragmentExecutor to measure time spent in each lifecycle phase. * Removed dependency between InitExecProfiles() and starting root fragment. Not yet done: * Fix planner tests to reflect new sink added at root. Change-Id: Ibb0064ec2f085fa3a5598ea80894fb489a01e4df --- M be/src/exec/CMakeLists.txt M be/src/exec/data-sink.cc A be/src/exec/plan-root-sink.cc A be/src/exec/plan-root-sink.h M be/src/runtime/coordinator.cc M be/src/runtime/coordinator.h M be/src/runtime/exec-env.cc M be/src/runtime/exec-env.h M be/src/runtime/plan-fragment-executor.cc M be/src/runtime/plan-fragment-executor.h M be/src/scheduling/query-schedule.cc M be/src/scheduling/query-schedule.h M be/src/scheduling/simple-scheduler.cc M be/src/service/fragment-exec-state.cc M be/src/service/fragment-exec-state.h M be/src/service/fragment-mgr.cc M be/src/service/impala-beeswax-server.cc M be/src/service/impala-hs2-server.cc M be/src/service/impala-internal-service.h M be/src/service/impala-server.cc M be/src/service/impala-server.h M be/src/service/query-exec-state.cc M be/src/service/query-exec-state.h M be/src/testutil/in-process-servers.cc M common/thrift/DataSinks.thrift M fe/src/main/java/org/apache/impala/analysis/QueryStmt.java A fe/src/main/java/org/apache/impala/planner/PlanRootSink.java M fe/src/main/java/org/apache/impala/planner/Planner.java M fe/src/main/java/org/apache/impala/planner/PlannerContext.java M tests/hs2/test_json_endpoints.py 30 files changed, 824 insertions(+), 709 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/02/4402/9 -- To view, visit http://gerrit.cloudera.org:8080/4402 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ibb0064ec2f085fa3a5598ea80894fb489a01e4df Gerrit-PatchSet: 9 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Henry RobinsonGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-2905: Handle coordinator fragment lifecycle like all others
Henry Robinson has posted comments on this change. Change subject: IMPALA-2905: Handle coordinator fragment lifecycle like all others .. Patch Set 10: EE tests pass with this new sink implementation as well. -- To view, visit http://gerrit.cloudera.org:8080/4402 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ibb0064ec2f085fa3a5598ea80894fb489a01e4df Gerrit-PatchSet: 10 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Henry RobinsonGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-2905: Handle coordinator fragment lifecycle like all others
Henry Robinson has posted comments on this change. Change subject: IMPALA-2905: Handle coordinator fragment lifecycle like all others .. Patch Set 6: (1 comment) http://gerrit.cloudera.org:8080/#/c/4402/6/be/src/exec/push-pull-sink.cc File be/src/exec/push-pull-sink.cc: Line 31: constexpr int32_t QUEUE_DEPTH = 16; > Thinking about this a little more, I think the accumulation of batches with Thanks for pointing that out! Is the 'memory-transfer' contract documented anywhere? It's pretty hard to know at what points we are allowed to do what with RowBatches. -- To view, visit http://gerrit.cloudera.org:8080/4402 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ibb0064ec2f085fa3a5598ea80894fb489a01e4df Gerrit-PatchSet: 6 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Henry RobinsonGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Marcel Kornacker Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4138: Fix AcquireState() for batches that have MarkCapacity() called
Henry Robinson has uploaded a new change for review. http://gerrit.cloudera.org:8080/4428 Change subject: IMPALA-4138: Fix AcquireState() for batches that have MarkCapacity() called .. IMPALA-4138: Fix AcquireState() for batches that have MarkCapacity() called If MarkCapacity() is called on a row batch, it is difficult to call AcquireState() on that batch because tuple_ptrs_size_ is not accessible to initialise the destination batch - this is usually calculated from capacity(), but that value is wrong for these purposes after MarkCapacity(). Add RowBatch::initial_capacity() to track the initial capacity value of the batch. Add row-batch-test to add initial coverage of AcquireState() API. Change-Id: I6ceca53c406b05cd04b7d95a4f9f2eec7bc127f5 --- M be/src/runtime/CMakeLists.txt A be/src/runtime/row-batch-test.cc M be/src/runtime/row-batch.cc M be/src/runtime/row-batch.h 4 files changed, 78 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/28/4428/1 -- To view, visit http://gerrit.cloudera.org:8080/4428 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: I6ceca53c406b05cd04b7d95a4f9f2eec7bc127f5 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Henry Robinson
[Impala-ASF-CR] IMPALA-4138: Fix AcquireState() for batches that have MarkCapacity() called
Henry Robinson has posted comments on this change. Change subject: IMPALA-4138: Fix AcquireState() for batches that have MarkCapacity() called .. Patch Set 1: (1 comment) Agree that this maybe isn't a fix so much as an interface improvement, but I think it helps RowBatch be self-contained if I can AcquireState() from a batch using only that batch's interface. http://gerrit.cloudera.org:8080/#/c/4428/1/be/src/runtime/row-batch-test.cc File be/src/runtime/row-batch-test.cc: Line 60: ASSERT_DEBUG_DEATH(bad_dest.AcquireState(), ""); Will change to IMPALA_ASSERT_DEBUG_DEATH when related patch lands. -- To view, visit http://gerrit.cloudera.org:8080/4428 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I6ceca53c406b05cd04b7d95a4f9f2eec7bc127f5 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Henry RobinsonGerrit-Reviewer: Henry Robinson Gerrit-HasComments: Yes
[Impala-ASF-CR] Remove Llama support.
Henry Robinson has uploaded a new change for review. http://gerrit.cloudera.org:8080/4445 Change subject: Remove Llama support. .. Remove Llama support. Alas, poor Llama! I knew him, Impala: a system of infinite jest, of most excellent fancy: we hath borne him on our back a thousand times; and now, how abhorred in my imagination it is! Done: * Removed QueryResourceMgr, ResourceBroker, CGroupsMgr * Removed untested 'offline' mode and NM failure detection from ImpalaServer * Removed all Llama-related Thrift files * Removed RM-related arguments to MemTracker constructors * Deprecated all RM-related flags, printing a warning if enable_rm is set * Removed expansion logic from MemTracker * Removed VCore logic from QuerySchedule * Removed all reservation-related logic from Scheduler * Removed RM metric descriptions * Various misc. small class changes Not done: * Remove RM flags (--enable_rm etc.) * Remove RM query options * Changes to RequestPoolService. Change-Id: Icfb14209e31f6608bb7b8a33789e00411a6447ef --- M be/CMakeLists.txt M be/generated-sources/gen-cpp/CMakeLists.txt M be/src/bufferpool/reservation-tracker-test.cc M be/src/exec/blocking-join-node.cc M be/src/exec/data-sink.cc M be/src/exec/exec-node.cc M be/src/exec/hash-join-node.cc M be/src/exec/hdfs-scan-node-base.cc M be/src/exec/hdfs-scan-node.cc M be/src/exec/kudu-scan-node-test.cc M be/src/exec/kudu-scan-node.cc M be/src/exec/kudu-table-sink-test.cc M be/src/exprs/expr-test.cc D be/src/resourcebroker/CMakeLists.txt D be/src/resourcebroker/resource-broker.cc D be/src/resourcebroker/resource-broker.h M be/src/runtime/buffered-block-mgr-test.cc M be/src/runtime/buffered-block-mgr.cc M be/src/runtime/collection-value-builder-test.cc M be/src/runtime/coordinator.cc M be/src/runtime/data-stream-recvr.cc M be/src/runtime/data-stream-test.cc M be/src/runtime/disk-io-mgr-test.cc M be/src/runtime/disk-io-mgr.cc M be/src/runtime/exec-env.cc M be/src/runtime/exec-env.h M be/src/runtime/mem-pool-test.cc M be/src/runtime/mem-tracker-test.cc M be/src/runtime/mem-tracker.cc M be/src/runtime/mem-tracker.h M be/src/runtime/plan-fragment-executor.cc M be/src/runtime/runtime-filter-bank.cc M be/src/runtime/runtime-state.cc M be/src/runtime/runtime-state.h M be/src/runtime/test-env.cc M be/src/scheduling/CMakeLists.txt D be/src/scheduling/query-resource-mgr.cc D be/src/scheduling/query-resource-mgr.h M be/src/scheduling/query-schedule.cc M be/src/scheduling/query-schedule.h M be/src/scheduling/request-pool-service.cc M be/src/scheduling/scheduler.h M be/src/scheduling/simple-scheduler-test.cc M be/src/scheduling/simple-scheduler.cc M be/src/scheduling/simple-scheduler.h M be/src/service/impala-server.cc M be/src/service/impala-server.h M be/src/service/impalad-main.cc M be/src/service/query-exec-state.cc M be/src/util/CMakeLists.txt D be/src/util/cgroups-mgr.cc D be/src/util/cgroups-mgr.h M be/src/util/debug-util.h D be/src/util/llama-util.cc D be/src/util/llama-util.h M be/src/util/thread-pool.h M be/src/util/thread.cc M be/src/util/thread.h M be/src/util/uid-util.h M bin/bootstrap_toolchain.py M bin/generate_minidump_collection_testdata.py M bin/start-impala-cluster.py M common/thrift/CMakeLists.txt M common/thrift/ImpalaInternalService.thrift D common/thrift/Llama.thrift D common/thrift/ResourceBrokerService.thrift M common/thrift/metrics.json M testdata/cluster/admin M testdata/cluster/node_templates/cdh5/etc/hadoop/conf/yarn-site.xml.tmpl D testdata/cluster/node_templates/cdh5/etc/init.d/llama-application D testdata/cluster/node_templates/cdh5/etc/llama/conf/llama-log4j.properties.tmpl D testdata/cluster/node_templates/cdh5/etc/llama/conf/llama-site.xml.tmpl M testdata/cluster/node_templates/common/etc/hadoop/conf/core-site.xml.tmpl 73 files changed, 117 insertions(+), 4,296 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/45/4445/1 -- To view, visit http://gerrit.cloudera.org:8080/4445 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: Icfb14209e31f6608bb7b8a33789e00411a6447ef Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Henry Robinson
[Impala-ASF-CR] Remove Llama support.
Henry Robinson has posted comments on this change. Change subject: Remove Llama support. .. Patch Set 1: Passed a core test run. -- To view, visit http://gerrit.cloudera.org:8080/4445 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Icfb14209e31f6608bb7b8a33789e00411a6447ef Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Henry RobinsonGerrit-Reviewer: Henry Robinson Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4014: HEADERS ONLY: Introduce query-wide execution state.
Henry Robinson has posted comments on this change. Change subject: IMPALA-4014: HEADERS ONLY: Introduce query-wide execution state. .. Patch Set 1: (17 comments) http://gerrit.cloudera.org:8080/#/c/4418/1/be/src/runtime/fragment-instance-state.h File be/src/runtime/fragment-instance-state.h: PS1, Line 32: setting : /// that flag to 0 disables periodic reporting altogether. Possibly a good time to retire that functionality which I don't think is ever used. PS1, Line 43: TODO: : /// - move tear-down logic out of d'tor and into TearDown() function Why can't we do that in this patch (given it's header only...)? PS1, Line 56: QueryState* query_state() { return query_state_; } When would we want to get a mutable query_state() from its FIS? That circumstance would suggest the caller violated the top-down pattern for getting at the FIS in the first place. PS1, Line 69: /// Set the execution thread, taking ownership of the object. : void set_exec_thread(Thread* exec_thread) { exec_thread_.reset(exec_thread); } I do find this API weird, always have. Why doesn't this object start its own thread? PS1, Line 116: /// Returns true if this query has a limit and it has been reached. : bool ReachedLimit(); FYI, this and GetNext() are likely to be removed with my patch for IMPALA-2905 (subject to review). Might be worth anticipating that here. http://gerrit.cloudera.org:8080/#/c/4418/1/be/src/runtime/query-exec-mgr.h File be/src/runtime/query-exec-mgr.h: PS1, Line 24: an entry point for : /// execution-related rpcs (ExecPlanFragment/CancelPlanFragment) I don't think this is relevant - we shouldn't comment on what might call this class. PS1, Line 33: Also creates a QueryState for this query, if none exists, and sets its refcount : /// to 1. If a QueryState already exists for this query, increments the refcount. Not clear from this who owns that reference, and who is therefore responsible for releasing it. PS1, Line 43: /// Cancels a particular fragment instance. : Status CancelFInstance(const TUniqueId& finstance_id); Why is this here, and not accessible through GetQueryState(query_id)->Cancel(...) ? PS1, Line 55: PublishFilter Again, don't see that this needs to be in this class. http://gerrit.cloudera.org:8080/#/c/4418/1/be/src/runtime/query-state.h File be/src/runtime/query-state.h: PS1, Line 31: threads too specific: clients of this class must obtain a reference. PS1, Line 41: class Guard { comment on usage pattern, which is going to be common. Particularly that callers must check if query_state() is not null, and if it is not null then it will live as long as the guard. FWIW, I prefer ScopedRef as a name. The 'guard' in lock_guard<> means - I think - that the object guards a critical section by preventing concurrent access. PS1, Line 44: (ExecEnv::GetQueryExecMgr() shouldn't there be corresponding changes to ExecEnv then? Or are you leaving those out to make this self-contained? Either way - prefer to do ExecEnv::GetInstance()->query_exec_mgr(). Under what circumstances would this ever be null? PS1, Line 49: DCHECK(ExecEnv::GetQueryExecMgr() != nullptr); I don't think we need this, but presumably it's redundant here if the constructor checked it exists. PS1, Line 60: query nit picking here, but it might be helpful to distinguish the query lifetime from the lifetime of the set of fragment instances; since the former can be significantly longer than the latter. Not sure what a good alternative is without introducing a new phrase for a group of fragment instances. PS1, Line 65: Delete all query state in this class or accessible through this class. Would just say it releases all resources owned by this class ("accessible through this class" sounds like it could reach out and delete other structures' data). PS1, Line 69: /// Registers a new FInstanceState. Either the comment or the method name should change. I would change the comment - the fact that an FInstanceState is registered seems like an implementation detail. PS1, Line 73: Status Under what conditions would this return an error? -- To view, visit http://gerrit.cloudera.org:8080/4418 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I962ae6b7cb7dc0d07fbb8f70317aeb01d88d400b Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Marcel KornackerGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Matthew Jacobs Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4138: Fix AcquireState() for batches that change capacity
Hello Tim Armstrong, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/4428 to look at the new patch set (#5). Change subject: IMPALA-4138: Fix AcquireState() for batches that change capacity .. IMPALA-4138: Fix AcquireState() for batches that change capacity If MarkAtCapacity() is called on a row batch, it is difficult to call AcquireState() on that batch because tuple_ptrs_size_ is not accessible to initialise the destination batch - this is usually calculated from capacity(), but that value is wrong for these purposes after MarkAtCapacity(). Add RowBatch::InitialCapacity() to return the initial capacity value of the batch. Add row-batch-test to add initial coverage of AcquireState() API. Change-Id: I6ceca53c406b05cd04b7d95a4f9f2eec7bc127f5 --- M be/src/runtime/CMakeLists.txt A be/src/runtime/row-batch-test.cc M be/src/runtime/row-batch.cc M be/src/runtime/row-batch.h 4 files changed, 78 insertions(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/28/4428/5 -- To view, visit http://gerrit.cloudera.org:8080/4428 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I6ceca53c406b05cd04b7d95a4f9f2eec7bc127f5 Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Henry RobinsonGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-4138: Fix AcquireState() for batches that change capacity
Henry Robinson has posted comments on this change. Change subject: IMPALA-4138: Fix AcquireState() for batches that change capacity .. Patch Set 4: (2 comments) Rebased to include IMPALA_ASSERT_DEBUG_DEATH macro from trunk. http://gerrit.cloudera.org:8080/#/c/4428/4//COMMIT_MSG Commit Message: PS4, Line 11: this is usually calculated from : capacity() > are there places that incorrectly use capacity() that should be changed to I have one in my upcoming patch for IMPALA-2905. Independently, this seems like a shortcoming in the API - there's no way to do AcquireState() after capacity changes without sharing knowledge of the initial capacity through a separate channel. Otherwise there are a couple of AcquireState() calls, but they use row batches that have been initialized with RuntimeState::batch_size() rows. http://gerrit.cloudera.org:8080/#/c/4428/4/be/src/runtime/row-batch.h File be/src/runtime/row-batch.h: PS4, Line 360: capacity_ > not due to your change, but this comment should be fixed. Done -- To view, visit http://gerrit.cloudera.org:8080/4428 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I6ceca53c406b05cd04b7d95a4f9f2eec7bc127f5 Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Henry RobinsonGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4610: Remove Llama support.
Henry Robinson has posted comments on this change. Change subject: IMPALA-4610: Remove Llama support. .. Patch Set 1: (4 comments) Matt's going to weigh in on what parts of the memory estimation path we can remove, depending on what admission control depends on. http://gerrit.cloudera.org:8080/#/c/4445/1/be/src/scheduling/query-schedule.h File be/src/scheduling/query-schedule.h: Line 86: int16_t GetPerHostVCores() const; > what about this? This is an unused method declaration. As discussed, we still use VCores to generate estimates in the plan / profile, and since those are user-facing we should phase them out on a different schedule. Left a TODO. http://gerrit.cloudera.org:8080/#/c/4445/1/be/src/scheduling/request-pool-service.cc File be/src/scheduling/request-pool-service.cc: Line 47: DEFINE_string(llama_site_path, "", "Path to the Llama configuration file " > good to know. please leave todo with explanatory comment. Done http://gerrit.cloudera.org:8080/#/c/4445/1/be/src/scheduling/simple-scheduler.cc File be/src/scheduling/simple-scheduler.cc: Line 292 > what's going on here? The 'offline' mode was used only when the local NodeManager went offline. Since that can't happen, this is an unused and untested code path. We'll likely want to redo this anyhow when we implement decommissioning. http://gerrit.cloudera.org:8080/#/c/4445/1/be/src/service/impala-server.h File be/src/service/impala-server.h: Line 243 > unclear how this is related to llama See other comment wrt when this was used. -- To view, visit http://gerrit.cloudera.org:8080/4445 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Icfb14209e31f6608bb7b8a33789e00411a6447ef Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Henry RobinsonGerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Matthew Jacobs Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4610: Remove Llama support.
Henry Robinson has uploaded a new patch set (#2). Change subject: IMPALA-4610: Remove Llama support. .. IMPALA-4610: Remove Llama support. Alas, poor Llama! I knew him, Impala: a system of infinite jest, of most excellent fancy: we hath borne him on our back a thousand times; and now, how abhorred in my imagination it is! Done: * Removed QueryResourceMgr, ResourceBroker, CGroupsMgr * Removed untested 'offline' mode and NM failure detection from ImpalaServer * Removed all Llama-related Thrift files * Removed RM-related arguments to MemTracker constructors * Deprecated all RM-related flags, printing a warning if enable_rm is set * Removed expansion logic from MemTracker * Removed VCore logic from QuerySchedule * Removed all reservation-related logic from Scheduler * Removed RM metric descriptions * Various misc. small class changes Not done: * Remove RM flags (--enable_rm etc.) * Remove RM query options * Changes to RequestPoolService. * Remove estimates of VCores / memory from plan Change-Id: Icfb14209e31f6608bb7b8a33789e00411a6447ef --- M be/CMakeLists.txt M be/generated-sources/gen-cpp/CMakeLists.txt M be/src/bufferpool/reservation-tracker-test.cc M be/src/exec/blocking-join-node.cc M be/src/exec/data-sink.cc M be/src/exec/exec-node.cc M be/src/exec/hash-join-node.cc M be/src/exec/hdfs-scan-node-base.cc M be/src/exec/hdfs-scan-node.cc M be/src/exec/kudu-scan-node-test.cc M be/src/exec/kudu-scan-node.cc M be/src/exec/kudu-table-sink-test.cc M be/src/exprs/expr-test.cc D be/src/resourcebroker/CMakeLists.txt D be/src/resourcebroker/resource-broker.cc D be/src/resourcebroker/resource-broker.h M be/src/runtime/buffered-block-mgr-test.cc M be/src/runtime/buffered-block-mgr.cc M be/src/runtime/collection-value-builder-test.cc M be/src/runtime/coordinator.cc M be/src/runtime/data-stream-recvr.cc M be/src/runtime/data-stream-test.cc M be/src/runtime/disk-io-mgr-test.cc M be/src/runtime/disk-io-mgr.cc M be/src/runtime/exec-env.cc M be/src/runtime/exec-env.h M be/src/runtime/mem-pool-test.cc M be/src/runtime/mem-tracker-test.cc M be/src/runtime/mem-tracker.cc M be/src/runtime/mem-tracker.h M be/src/runtime/plan-fragment-executor.cc M be/src/runtime/runtime-filter-bank.cc M be/src/runtime/runtime-state.cc M be/src/runtime/runtime-state.h M be/src/runtime/test-env.cc M be/src/scheduling/CMakeLists.txt D be/src/scheduling/query-resource-mgr.cc D be/src/scheduling/query-resource-mgr.h M be/src/scheduling/query-schedule.cc M be/src/scheduling/query-schedule.h M be/src/scheduling/request-pool-service.cc M be/src/scheduling/scheduler.h M be/src/scheduling/simple-scheduler-test.cc M be/src/scheduling/simple-scheduler.cc M be/src/scheduling/simple-scheduler.h M be/src/service/impala-server.cc M be/src/service/impala-server.h M be/src/service/impalad-main.cc M be/src/service/query-exec-state.cc M be/src/util/CMakeLists.txt D be/src/util/cgroups-mgr.cc D be/src/util/cgroups-mgr.h M be/src/util/debug-util.h D be/src/util/llama-util.cc D be/src/util/llama-util.h M be/src/util/thread-pool.h M be/src/util/thread.cc M be/src/util/thread.h M be/src/util/uid-util.h M bin/bootstrap_toolchain.py M bin/generate_minidump_collection_testdata.py M bin/start-impala-cluster.py M common/thrift/CMakeLists.txt M common/thrift/Frontend.thrift M common/thrift/ImpalaInternalService.thrift D common/thrift/Llama.thrift D common/thrift/ResourceBrokerService.thrift M common/thrift/metrics.json M fe/src/main/java/com/cloudera/impala/planner/Planner.java M testdata/cluster/admin M testdata/cluster/node_templates/cdh5/etc/hadoop/conf/yarn-site.xml.tmpl D testdata/cluster/node_templates/cdh5/etc/init.d/llama-application D testdata/cluster/node_templates/cdh5/etc/llama/conf/llama-log4j.properties.tmpl D testdata/cluster/node_templates/cdh5/etc/llama/conf/llama-site.xml.tmpl M testdata/cluster/node_templates/common/etc/hadoop/conf/core-site.xml.tmpl 75 files changed, 124 insertions(+), 4,301 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/45/4445/2 -- To view, visit http://gerrit.cloudera.org:8080/4445 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Icfb14209e31f6608bb7b8a33789e00411a6447ef Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Henry RobinsonGerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Matthew Jacobs
[Impala-ASF-CR] IMPALA-4160: Remove Llama support.
Henry Robinson has posted comments on this change. Change subject: IMPALA-4160: Remove Llama support. .. Patch Set 1: (13 comments) http://gerrit.cloudera.org:8080/#/c/4445/1//COMMIT_MSG Commit Message: Line 7: Remove Llama support. > You beat me to it in patch set #2 Done PS1, Line 31: * Remove RM flags (--enable_rm etc.) : * Remove RM query options : * Changes to RequestPoolService. > reference https://issues.cloudera.org/browse/IMPALA-4159 Done http://gerrit.cloudera.org:8080/#/c/4445/1/be/src/runtime/mem-tracker.cc File be/src/runtime/mem-tracker.cc: PS1, Line 69: RuntimeProfile* profile, int64_t byte_limit, : const std::string& label, MemTracker* parent) > 1 line? I will clang-format this patch separately. PS1, Line 87: MemTracker::MemTracker(UIntGauge* consumption_metric, : int64_t byte_limit, const string& label) > 1 line? See above. http://gerrit.cloudera.org:8080/#/c/4445/1/be/src/runtime/mem-tracker.h File be/src/runtime/mem-tracker.h: PS1, Line 174: // TODO: This may not be right if more than one tracker can actually change its : // RM reservation limit. > remove Done PS1, Line 179: // TODO: this doesn't roll it back completely since the max values for : // the updated trackers aren't decremented. The max values are only used : // for error reporting so this is probably okay. Rolling those back is : // pretty hard; we'd need something like 2PC. > I think this was talking about the RM case so this can be removed. Done PS1, Line 190: new limit=" << limit << " prev=" << limit; > new/prev isn't interesting anymore, this is enough: Done http://gerrit.cloudera.org:8080/#/c/4445/1/be/src/scheduling/query-schedule.cc File be/src/scheduling/query-schedule.cc: PS1, Line 112: per_host_mem = query_options_.rm_initial_mem; > I'd be fine with taking out this case and removing the query option- I don' I don't think we can or should remove the query option now, as that would break some clients I think. I'll leave a TODO to remove this logic if admission control depends on this. I could see someone using this option to force admission control behavior by capping the mem estimate. PS1, Line 120: FLAGS_rm_default_memory > same as above. Done PS1, Line 125: // TODO: Get this limit from Llama (Yarn sets it). > remove Done http://gerrit.cloudera.org:8080/#/c/4445/2/be/src/scheduling/query-schedule.h File be/src/scheduling/query-schedule.h: PS2, Line 88: int64_t GetClusterMemoryEstimate() const; > I think this can be removed Done http://gerrit.cloudera.org:8080/#/c/4445/2/fe/src/main/java/com/cloudera/impala/planner/Planner.java File fe/src/main/java/com/cloudera/impala/planner/Planner.java: PS2, Line 295: > We can remove the VCores stuff now, though some users may be relying on the Decided against this - users might rely on this for capacity estimations etc. MT might completely change the logic, in which case we can remove then. http://gerrit.cloudera.org:8080/#/c/4445/2/testdata/cluster/node_templates/cdh5/etc/hadoop/conf/yarn-site.xml.tmpl File testdata/cluster/node_templates/cdh5/etc/hadoop/conf/yarn-site.xml.tmpl: PS2, Line 35: : yarn.scheduler.fair.continuous-scheduling-enabled : true : : : : yarn.scheduler.fair.assignmultiple : true : > I think a lot of this is stuff that was needed for Llama. Can you leave a T Done. I think we are very close to being able to remove this file. -- To view, visit http://gerrit.cloudera.org:8080/4445 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Icfb14209e31f6608bb7b8a33789e00411a6447ef Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Henry RobinsonGerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Taras Bobrovytsky Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4160: Remove Llama support.
Hello Marcel Kornacker, Matthew Jacobs, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/4445 to look at the new patch set (#5). Change subject: IMPALA-4160: Remove Llama support. .. IMPALA-4160: Remove Llama support. Alas, poor Llama! I knew him, Impala: a system of infinite jest, of most excellent fancy: we hath borne him on our back a thousand times; and now, how abhorred in my imagination it is! Done: * Removed QueryResourceMgr, ResourceBroker, CGroupsMgr * Removed untested 'offline' mode and NM failure detection from ImpalaServer * Removed all Llama-related Thrift files * Removed RM-related arguments to MemTracker constructors * Deprecated all RM-related flags, printing a warning if enable_rm is set * Removed expansion logic from MemTracker * Removed VCore logic from QuerySchedule * Removed all reservation-related logic from Scheduler * Removed RM metric descriptions * Various misc. small class changes Not done: * Remove RM flags (--enable_rm etc.) * Remove RM query options * Changes to RequestPoolService (see IMPALA-4159) * Remove estimates of VCores / memory from plan Change-Id: Icfb14209e31f6608bb7b8a33789e00411a6447ef --- M be/CMakeLists.txt M be/generated-sources/gen-cpp/CMakeLists.txt M be/src/bufferpool/reservation-tracker-test.cc M be/src/exec/blocking-join-node.cc M be/src/exec/data-sink.cc M be/src/exec/exec-node.cc M be/src/exec/hash-join-node.cc M be/src/exec/hdfs-scan-node-base.cc M be/src/exec/hdfs-scan-node.cc M be/src/exec/kudu-scan-node-test.cc M be/src/exec/kudu-scan-node.cc M be/src/exec/kudu-table-sink-test.cc M be/src/exprs/expr-test.cc D be/src/resourcebroker/CMakeLists.txt D be/src/resourcebroker/resource-broker.cc D be/src/resourcebroker/resource-broker.h M be/src/runtime/buffered-block-mgr-test.cc M be/src/runtime/buffered-block-mgr.cc M be/src/runtime/collection-value-builder-test.cc M be/src/runtime/coordinator.cc M be/src/runtime/data-stream-recvr.cc M be/src/runtime/data-stream-test.cc M be/src/runtime/disk-io-mgr-test.cc M be/src/runtime/disk-io-mgr.cc M be/src/runtime/exec-env.cc M be/src/runtime/exec-env.h M be/src/runtime/mem-pool-test.cc M be/src/runtime/mem-tracker-test.cc M be/src/runtime/mem-tracker.cc M be/src/runtime/mem-tracker.h M be/src/runtime/plan-fragment-executor.cc M be/src/runtime/runtime-filter-bank.cc M be/src/runtime/runtime-state.cc M be/src/runtime/runtime-state.h M be/src/runtime/test-env.cc M be/src/scheduling/CMakeLists.txt D be/src/scheduling/query-resource-mgr.cc D be/src/scheduling/query-resource-mgr.h M be/src/scheduling/query-schedule.cc M be/src/scheduling/query-schedule.h M be/src/scheduling/request-pool-service.cc M be/src/scheduling/scheduler.h M be/src/scheduling/simple-scheduler-test.cc M be/src/scheduling/simple-scheduler.cc M be/src/scheduling/simple-scheduler.h M be/src/service/impala-server.cc M be/src/service/impala-server.h M be/src/service/impalad-main.cc M be/src/service/query-exec-state.cc M be/src/util/CMakeLists.txt D be/src/util/cgroups-mgr.cc D be/src/util/cgroups-mgr.h M be/src/util/debug-util.h D be/src/util/llama-util.cc D be/src/util/llama-util.h M be/src/util/thread-pool.h M be/src/util/thread.cc M be/src/util/thread.h M be/src/util/uid-util.h M bin/bootstrap_toolchain.py M bin/create-test-configuration.sh M bin/generate_minidump_collection_testdata.py M bin/start-impala-cluster.py M common/thrift/CMakeLists.txt M common/thrift/Frontend.thrift M common/thrift/ImpalaInternalService.thrift D common/thrift/Llama.thrift D common/thrift/ResourceBrokerService.thrift M common/thrift/metrics.json M fe/src/main/java/com/cloudera/impala/planner/Planner.java M testdata/cluster/admin M testdata/cluster/node_templates/cdh5/etc/hadoop/conf/yarn-site.xml.tmpl D testdata/cluster/node_templates/cdh5/etc/init.d/llama-application D testdata/cluster/node_templates/cdh5/etc/llama/conf/llama-log4j.properties.tmpl D testdata/cluster/node_templates/cdh5/etc/llama/conf/llama-site.xml.tmpl M testdata/cluster/node_templates/common/etc/hadoop/conf/core-site.xml.tmpl 76 files changed, 181 insertions(+), 4,392 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/45/4445/5 -- To view, visit http://gerrit.cloudera.org:8080/4445 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: Icfb14209e31f6608bb7b8a33789e00411a6447ef Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Henry RobinsonGerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Taras Bobrovytsky
[Impala-ASF-CR] IMPALA-4160: Remove Llama support.
Henry Robinson has posted comments on this change. Change subject: IMPALA-4160: Remove Llama support. .. Patch Set 4: (2 comments) http://gerrit.cloudera.org:8080/#/c/4445/1/be/src/scheduling/query-schedule.cc File be/src/scheduling/query-schedule.cc: PS1, Line 112: bool ignored; > can we print "deprecated" somewhere? There's not an obvious place to do that; the only warning mechanism I can see is the RuntimeState's log which we don't have access to here. Printing a warning here to the process log wouldn't really help (users don't look at logs, operations staff do). Returning a bad status from parsing the query options would abort everything. Perhaps the best thing to do is to somehow hide the query options from clients (so they can still be set, but not shown). http://gerrit.cloudera.org:8080/#/c/4445/4/common/thrift/Frontend.thrift File common/thrift/Frontend.thrift: Line 390: // TODO: Remove this and associated code in Planner. > also add a todo in Planner.java, it's too easy to overlook here Done -- To view, visit http://gerrit.cloudera.org:8080/4445 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Icfb14209e31f6608bb7b8a33789e00411a6447ef Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Henry RobinsonGerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Taras Bobrovytsky Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4011: Remove / reword messages when statestore messages are late
Henry Robinson has posted comments on this change. Change subject: IMPALA-4011: Remove / reword messages when statestore messages are late .. Patch Set 2: Code-Review+2 I'm going to submit this as is, then someone can take on warning when updates are too late over a fixed period. -- To view, visit http://gerrit.cloudera.org:8080/4500 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I09c7fa4a94065965e5cb83a3b183b2175f8b45fc Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Henry RobinsonGerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Juan Yu Gerrit-Reviewer: Matthew Jacobs Gerrit-HasComments: No