[Impala-ASF-CR] IMPALA-3902: Scheduler improvements for running multiple fragment instances on a single backend
Internal Jenkins has posted comments on this change. Change subject: IMPALA-3902: Scheduler improvements for running multiple fragment instances on a single backend .. Patch Set 21: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/4054 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I240445199e22f009f4e72fdb8754eb8d77e3d680 Gerrit-PatchSet: 21 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Marcel KornackerGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Mostafa Mokhtar Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3902: Scheduler improvements for running multiple fragment instances on a single backend
Marcel Kornacker has posted comments on this change. Change subject: IMPALA-3902: Scheduler improvements for running multiple fragment instances on a single backend .. Patch Set 21: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/4054 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I240445199e22f009f4e72fdb8754eb8d77e3d680 Gerrit-PatchSet: 21 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Marcel KornackerGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Mostafa Mokhtar Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3902: Scheduler improvements for running multiple fragment instances on a single backend
Henry Robinson has posted comments on this change. Change subject: IMPALA-3902: Scheduler improvements for running multiple fragment instances on a single backend .. Patch Set 18: Code-Review+1 (1 comment) Changes since PS16 look good to me. http://gerrit.cloudera.org:8080/#/c/4054/18/be/src/runtime/coordinator.h File be/src/runtime/coordinator.h: Line 242: /// fragment. All elements are non-nullptr. Owned by obj_pool(). Say when this is initialized. -- To view, visit http://gerrit.cloudera.org:8080/4054 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I240445199e22f009f4e72fdb8754eb8d77e3d680 Gerrit-PatchSet: 18 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Marcel KornackerGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Mostafa Mokhtar Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-3902: Scheduler improvements for running multiple fragment instances on a single backend
Marcel Kornacker has posted comments on this change. Change subject: IMPALA-3902: Scheduler improvements for running multiple fragment instances on a single backend .. Patch Set 18: (1 comment) http://gerrit.cloudera.org:8080/#/c/4054/17/be/src/runtime/coordinator.cc File be/src/runtime/coordinator.cc: PS17, Line 483: // initialize progress updater : const string& str = Substitute("Query $0", PrintId(query_id_)); : progress_.Init(str, schedule_.num_scan_ranges()); > why was this wrong before? If there are no remote fragments I'd imagine Get the problem is if there are remote fragments, but starting the coordinator instance fails in prepare. i added a comment to the .h file which says that no elements of fragment_instance_states_ will ever be nullptr. if you resize here you end up with that scenario (exposed through a test). -- To view, visit http://gerrit.cloudera.org:8080/4054 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I240445199e22f009f4e72fdb8754eb8d77e3d680 Gerrit-PatchSet: 18 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Marcel KornackerGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Mostafa Mokhtar Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-3902: Scheduler improvements for running multiple fragment instances on a single backend
Matthew Jacobs has posted comments on this change. Change subject: IMPALA-3902: Scheduler improvements for running multiple fragment instances on a single backend .. Patch Set 18: (2 comments) http://gerrit.cloudera.org:8080/#/c/4054/17/be/src/runtime/coordinator.cc File be/src/runtime/coordinator.cc: PS17, Line 483: // initialize progress updater : const string& str = Substitute("Query $0", PrintId(query_id_)); : progress_.Init(str, schedule_.num_scan_ranges()); why was this wrong before? If there are no remote fragments I'd imagine GetTotalFInstances would return 1, and in PrepareCoordFragment() we should be setting the state at idx 0. http://gerrit.cloudera.org:8080/#/c/4054/18/be/src/runtime/coordinator.cc File be/src/runtime/coordinator.cc: PS18, Line 643: fragment_instance_states_.push_back(coord_state); DCHECK that this has size 1 after this -- To view, visit http://gerrit.cloudera.org:8080/4054 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I240445199e22f009f4e72fdb8754eb8d77e3d680 Gerrit-PatchSet: 18 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Marcel KornackerGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Mostafa Mokhtar Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-3902: Scheduler improvements for running multiple fragment instances on a single backend
Marcel Kornacker has posted comments on this change. Change subject: IMPALA-3902: Scheduler improvements for running multiple fragment instances on a single backend .. Patch Set 18: the latest patch contained another bug fix. this now passes the tests. please take another look. -- To view, visit http://gerrit.cloudera.org:8080/4054 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I240445199e22f009f4e72fdb8754eb8d77e3d680 Gerrit-PatchSet: 18 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Marcel KornackerGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Mostafa Mokhtar Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3902: Scheduler improvements for running multiple fragment instances on a single backend
Hello Matthew Jacobs, Internal Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/4054 to look at the new patch set (#17). Change subject: IMPALA-3902: Scheduler improvements for running multiple fragment instances on a single backend .. IMPALA-3902: Scheduler improvements for running multiple fragment instances on a single backend This is an extension of the scheduler and coordinator for multi-threaded execution. It mainly removes the assumption of having one instance per fragment per host. The approach taken here is to create parallel data structures and control flow functions, where necessary, and otherwise to leave the existing single-instance logic in place. The parallel structures' and functions' names are prefixed with "Mt" to facilitate the enventual clean-up. Not much of an attempt was made to factor out common functionality between the Mt- and the single-threaded version, because the single-threaded version will disappear in a follow-on patch and refactoring the existing code to fit into two parallel functions from which it's being called might end up obscuring the code more than helping it. Also, this code is relatively stable and having two parallel paths won't cause much extra work (in terms of having to apply the same changes/fixes twice) in the medium term. Changes to data structures: - QuerySchedule: per-instance and per-fragment structs with complete execution parameters (instead of partially relying on TQueryExecRequest); the per-instance execution parameter struct is a child of the per-fragment parameter struct - explicit fragment id, with range 0..#fragments-1 (instead of relying on an index into an array in TQueryExecRequest) Excluded: - runtime filter handling - anything related to RM Change-Id: I240445199e22f009f4e72fdb8754eb8d77e3d680 --- M be/src/common/global-types.h M be/src/exec/exec-node.cc M be/src/exec/union-node.cc M be/src/runtime/coordinator.cc M be/src/runtime/coordinator.h M be/src/runtime/runtime-filter-bank.cc M be/src/scheduling/query-schedule.cc M be/src/scheduling/query-schedule.h M be/src/scheduling/scheduler.h M be/src/scheduling/simple-scheduler.cc M be/src/scheduling/simple-scheduler.h M be/src/service/fragment-exec-state.cc M be/src/service/fragment-mgr.cc M be/src/service/impala-server.cc M be/src/service/query-exec-state.cc M be/src/service/query-options.cc M be/src/service/query-options.h M be/src/util/container-util.h M be/src/util/uid-util-test.cc M be/src/util/uid-util.h M common/thrift/ExecStats.thrift M common/thrift/Frontend.thrift M common/thrift/ImpalaInternalService.thrift M common/thrift/ImpalaService.thrift M common/thrift/Planner.thrift M common/thrift/Types.thrift M fe/src/main/java/com/cloudera/impala/common/TreeNode.java M fe/src/main/java/com/cloudera/impala/planner/Planner.java M fe/src/main/java/com/cloudera/impala/service/Frontend.java M fe/src/test/java/com/cloudera/impala/planner/PlannerTestBase.java 30 files changed, 1,547 insertions(+), 556 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/54/4054/17 -- To view, visit http://gerrit.cloudera.org:8080/4054 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I240445199e22f009f4e72fdb8754eb8d77e3d680 Gerrit-PatchSet: 17 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Marcel KornackerGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Mostafa Mokhtar
[Impala-ASF-CR] IMPALA-3902: Scheduler improvements for running multiple fragment instances on a single backend
Hello Matthew Jacobs, Internal Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/4054 to look at the new patch set (#15). Change subject: IMPALA-3902: Scheduler improvements for running multiple fragment instances on a single backend .. IMPALA-3902: Scheduler improvements for running multiple fragment instances on a single backend This is an extension of the scheduler and coordinator for multi-threaded execution. It mainly removes the assumption of having one instance per fragment per host. The approach taken here is to create parallel data structures and control flow functions, where necessary, and otherwise to leave the existing single-instance logic in place. The parallel structures' and functions' names are prefixed with "Mt" to facilitate the enventual clean-up. Not much of an attempt was made to factor out common functionality between the Mt- and the single-threaded version, because the single-threaded version will disappear in a follow-on patch and refactoring the existing code to fit into two parallel functions from which it's being called might end up obscuring the code more than helping it. Also, this code is relatively stable and having two parallel paths won't cause much extra work (in terms of having to apply the same changes/fixes twice) in the medium term. Changes to data structures: - QuerySchedule: per-instance and per-fragment structs with complete execution parameters (instead of partially relying on TQueryExecRequest); the per-instance execution parameter struct is a child of the per-fragment parameter struct - explicit fragment id, with range 0..#fragments-1 (instead of relying on an index into an array in TQueryExecRequest) Excluded: - runtime filter handling - anything related to RM Change-Id: I240445199e22f009f4e72fdb8754eb8d77e3d680 --- M be/src/common/global-types.h M be/src/exec/exec-node.cc M be/src/exec/union-node.cc M be/src/runtime/coordinator.cc M be/src/runtime/coordinator.h M be/src/runtime/runtime-filter-bank.cc A be/src/scheduling/query-resource-mgr.cc A be/src/scheduling/query-resource-mgr.h M be/src/scheduling/query-schedule.cc M be/src/scheduling/query-schedule.h M be/src/scheduling/scheduler.h M be/src/scheduling/simple-scheduler.cc M be/src/scheduling/simple-scheduler.h M be/src/service/fragment-exec-state.cc M be/src/service/fragment-mgr.cc M be/src/service/impala-server.cc M be/src/service/query-exec-state.cc M be/src/service/query-options.cc M be/src/service/query-options.h M be/src/util/container-util.h M be/src/util/uid-util-test.cc M be/src/util/uid-util.h M common/thrift/ExecStats.thrift M common/thrift/Frontend.thrift M common/thrift/ImpalaInternalService.thrift M common/thrift/ImpalaService.thrift M common/thrift/Planner.thrift M common/thrift/Types.thrift M fe/src/main/java/com/cloudera/impala/common/TreeNode.java M fe/src/main/java/com/cloudera/impala/planner/Planner.java M fe/src/main/java/com/cloudera/impala/service/Frontend.java M fe/src/test/java/com/cloudera/impala/planner/PlannerTestBase.java 32 files changed, 2,038 insertions(+), 553 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/54/4054/15 -- To view, visit http://gerrit.cloudera.org:8080/4054 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I240445199e22f009f4e72fdb8754eb8d77e3d680 Gerrit-PatchSet: 15 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Marcel KornackerGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Mostafa Mokhtar
[Impala-ASF-CR] IMPALA-3902: Scheduler improvements for running multiple fragment instances on a single backend
Henry Robinson has posted comments on this change. Change subject: IMPALA-3902: Scheduler improvements for running multiple fragment instances on a single backend .. Patch Set 14: (15 comments) I think it would be good to have Alex review Frontend.java again - he said it looked good overall in PS3 so hopefully there won't be many comments. http://gerrit.cloudera.org:8080/#/c/4054/12/be/src/scheduling/query-resource-mgr.cc File be/src/scheduling/query-resource-mgr.cc: Line 1: // Licensed to the Apache Software Foundation (ASF) under one Remove this file. http://gerrit.cloudera.org:8080/#/c/4054/12/be/src/scheduling/query-resource-mgr.h File be/src/scheduling/query-resource-mgr.h: Line 1: // Licensed to the Apache Software Foundation (ASF) under one Did your rebase go wrong? This file should be removed. http://gerrit.cloudera.org:8080/#/c/4054/12/be/src/scheduling/query-schedule.cc File be/src/scheduling/query-schedule.cc: PS12, Line 68: if (plan_node_to_fragment_idx_.size() < node.node_id + 1) { : plan_node_to_fragment_idx_.resize(node.node_id + 1); : plan_node_to_plan_node_idx_.resize(node.node_id + 1); : } I think you can find the max node ID in InitMtFragmentExecParams if you also loop over the plan nodes there. That seems better than all this conditional resizing, which is confusing to read. PS12, Line 105: std remove std:: http://gerrit.cloudera.org:8080/#/c/4054/12/be/src/scheduling/query-schedule.h File be/src/scheduling/query-schedule.h: PS12, Line 184: Returns next instance id. Instance ids are consecutive numbers generated from : /// the query id. shouldn't that be the opposite? Start at 0 if there's a coordinator fragment. Might be better reworded as "The coordinator fragment instance, if there is one, has ID 0, all other fragment instances start from 1." PS12, Line 278: what's 'static info'? http://gerrit.cloudera.org:8080/#/c/4054/14/be/src/service/fragment-mgr.cc File be/src/service/fragment-mgr.cc: PS14, Line 23: #include not needed (see below), remove PS14, Line 42: << hex << exec_params.fragment_instance_ctx.fragment_instance_id << dec use PrintId(). http://gerrit.cloudera.org:8080/#/c/4054/14/be/src/service/impala-server.cc File be/src/service/impala-server.cc: Line 848: random_generator uuid_generator; remove this line PS14, Line 1080: << hex << params.fragment_instance_id Use PrintId() here instead of hex / dec. http://gerrit.cloudera.org:8080/#/c/4054/14/be/src/service/query-exec-state.cc File be/src/service/query-exec-state.cc: Line 443: Status status = exec_env_->scheduler()->Schedule(schedule_.get()); Leave a TODO to simplify this as discussed. http://gerrit.cloudera.org:8080/#/c/4054/14/be/src/service/query-options.cc File be/src/service/query-options.cc: Line 398: case TImpalaQueryOptions::MT_DOP: { When the patch that added MT_NUM_CORES was committed, I asked to consider a mechanism for hiding query options. DId that come to anything? It's unfortunate that we're going to commit a query option that (AFAICT) will break clients if it's turned on. http://gerrit.cloudera.org:8080/#/c/4054/14/be/src/util/container-util.h File be/src/util/container-util.h: PS14, Line 79: V* FindOrInsert(std::unordered_map* m, const K& key, const V& default_val) { : typename std::unordered_map ::iterator it = m->find(key); : if (it == m->end()) { : it = m->insert(std::make_pair(key, default_val)).first; : } : return >second; : } : : template : V* FindOrInsert(boost::unordered_map * m, const K& key, const V& default_val) { : typename boost::unordered_map ::iterator it = m->find(key); : if (it == m->end()) { : it = m->insert(std::make_pair(key, default_val)).first; : } : return >second; : } Just add the map type as a template parameter and remove the other versions of this. http://gerrit.cloudera.org:8080/#/c/4054/14/common/thrift/ImpalaInternalService.thrift File common/thrift/ImpalaInternalService.thrift: PS14, Line 185: backend so this is independent of the number of fragment instances for that query on that backend? PS14, Line 333: nd the id of the first fragment instance (the coordinator instance) : // are identical. isn't this true now only if there is a coordinator fragment? Reword to make that clearer - looks like first fragment instance is always 0 regardless of whether there is coord fragment. -- To view, visit http://gerrit.cloudera.org:8080/4054 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I240445199e22f009f4e72fdb8754eb8d77e3d680 Gerrit-PatchSet: 14
[Impala-ASF-CR] IMPALA-3902: Scheduler improvements for running multiple fragment instances on a single backend
Internal Jenkins has posted comments on this change. Change subject: IMPALA-3902: Scheduler improvements for running multiple fragment instances on a single backend .. Patch Set 14: Verified-1 Build failed: http://sandbox.jenkins.cloudera.com/job/impala-external-gerrit-verify-merge-ASF/233/ -- To view, visit http://gerrit.cloudera.org:8080/4054 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I240445199e22f009f4e72fdb8754eb8d77e3d680 Gerrit-PatchSet: 14 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Marcel KornackerGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Mostafa Mokhtar Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3902: Scheduler improvements for running multiple fragment instances on a single backend
Hello Matthew Jacobs, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/4054 to look at the new patch set (#14). Change subject: IMPALA-3902: Scheduler improvements for running multiple fragment instances on a single backend .. IMPALA-3902: Scheduler improvements for running multiple fragment instances on a single backend This is an extension of the scheduler and coordinator for multi-threaded execution. It mainly removes the assumption of having one instance per fragment per host. The approach taken here is to create parallel data structures and control flow functions, where necessary, and otherwise to leave the existing single-instance logic in place. The parallel structures' and functions' names are prefixed with "Mt" to facilitate the enventual clean-up. Not much of an attempt was made to factor out common functionality between the Mt- and the single-threaded version, because the single-threaded version will disappear in a follow-on patch and refactoring the existing code to fit into two parallel functions from which it's being called might end up obscuring the code more than helping it. Also, this code is relatively stable and having two parallel paths won't cause much extra work (in terms of having to apply the same changes/fixes twice) in the medium term. Changes to data structures: - QuerySchedule: per-instance and per-fragment structs with complete execution parameters (instead of partially relying on TQueryExecRequest); the per-instance execution parameter struct is a child of the per-fragment parameter struct - explicit fragment id, with range 0..#fragments-1 (instead of relying on an index into an array in TQueryExecRequest) Excluded: - runtime filter handling - anything related to RM Change-Id: I240445199e22f009f4e72fdb8754eb8d77e3d680 --- M be/src/common/global-types.h M be/src/exec/exec-node.cc M be/src/exec/union-node.cc M be/src/runtime/coordinator.cc M be/src/runtime/coordinator.h M be/src/runtime/runtime-filter-bank.cc A be/src/scheduling/query-resource-mgr.cc A be/src/scheduling/query-resource-mgr.h M be/src/scheduling/query-schedule.cc M be/src/scheduling/query-schedule.h M be/src/scheduling/scheduler.h M be/src/scheduling/simple-scheduler.cc M be/src/scheduling/simple-scheduler.h M be/src/service/fragment-exec-state.cc M be/src/service/fragment-mgr.cc M be/src/service/impala-server.cc M be/src/service/query-exec-state.cc M be/src/service/query-options.cc M be/src/service/query-options.h M be/src/util/container-util.h M be/src/util/uid-util-test.cc M be/src/util/uid-util.h M common/thrift/ExecStats.thrift M common/thrift/Frontend.thrift M common/thrift/ImpalaInternalService.thrift M common/thrift/ImpalaService.thrift M common/thrift/Planner.thrift M common/thrift/Types.thrift M fe/src/main/java/com/cloudera/impala/common/TreeNode.java M fe/src/main/java/com/cloudera/impala/planner/Planner.java M fe/src/main/java/com/cloudera/impala/service/Frontend.java M fe/src/test/java/com/cloudera/impala/planner/PlannerTestBase.java 32 files changed, 2,034 insertions(+), 553 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/54/4054/14 -- To view, visit http://gerrit.cloudera.org:8080/4054 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I240445199e22f009f4e72fdb8754eb8d77e3d680 Gerrit-PatchSet: 14 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Marcel KornackerGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Mostafa Mokhtar
[Impala-ASF-CR] IMPALA-3902: Scheduler improvements for running multiple fragment instances on a single backend
Matthew Jacobs has posted comments on this change. Change subject: IMPALA-3902: Scheduler improvements for running multiple fragment instances on a single backend .. Patch Set 13: Code-Review+1 -- To view, visit http://gerrit.cloudera.org:8080/4054 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I240445199e22f009f4e72fdb8754eb8d77e3d680 Gerrit-PatchSet: 13 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Marcel KornackerGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Mostafa Mokhtar Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-3902: Scheduler improvements for running multiple fragment instances on a single backend
Marcel Kornacker has posted comments on this change. Change subject: IMPALA-3902: Scheduler improvements for running multiple fragment instances on a single backend .. Patch Set 10: (4 comments) http://gerrit.cloudera.org:8080/#/c/4054/10/be/src/runtime/coordinator.cc File be/src/runtime/coordinator.cc: PS10, Line 2083: // TODO: use FragmentInstanceState::error_log_ instead > can you mention that? It's hard to keep track of all the TODOs, it'll make Done http://gerrit.cloudera.org:8080/#/c/4054/10/be/src/scheduling/query-schedule.h File be/src/scheduling/query-schedule.h: PS10, Line 90: shared across fragment instances > This owns the vector of fragment instance exec params, so I think it's conf it is effectively shared between FInstanceExecParams (through field fragment_exec_params). it's a circular structure. http://gerrit.cloudera.org:8080/#/c/4054/12/be/src/scheduling/query-schedule.h File be/src/scheduling/query-schedule.h: Line 192: const TPlanFragment& fragment = GetContainingFragment(id); > TODO-MT: remove, only needed on the ST path Done http://gerrit.cloudera.org:8080/#/c/4054/10/common/thrift/Planner.thrift File common/thrift/Planner.thrift: PS10, Line 35: // TODO: should this be called idx, to distinguish more clearly from a : // globally unique id? > let's think about this some more tomorrow. Done -- To view, visit http://gerrit.cloudera.org:8080/4054 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I240445199e22f009f4e72fdb8754eb8d77e3d680 Gerrit-PatchSet: 10 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Marcel KornackerGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Mostafa Mokhtar Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-3902: Scheduler improvements for running multiple fragment instances on a single backend
Dan Hecht has posted comments on this change. Change subject: IMPALA-3902: Scheduler improvements for running multiple fragment instances on a single backend .. Patch Set 12: (1 comment) http://gerrit.cloudera.org:8080/#/c/4054/12/be/src/scheduling/query-schedule.cc File be/src/scheduling/query-schedule.cc: PS12, Line 207: request_.query_ctx.request.query_options.mt_dop > 0 is this condition always equal to the one at line 196? -- To view, visit http://gerrit.cloudera.org:8080/4054 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I240445199e22f009f4e72fdb8754eb8d77e3d680 Gerrit-PatchSet: 12 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Marcel KornackerGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Mostafa Mokhtar Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-3902: Scheduler improvements for running multiple fragment instances on a single backend
Marcel Kornacker has uploaded a new patch set (#12). Change subject: IMPALA-3902: Scheduler improvements for running multiple fragment instances on a single backend .. IMPALA-3902: Scheduler improvements for running multiple fragment instances on a single backend This is an extension of the scheduler and coordinator for multi-threaded execution. It mainly removes the assumption of having one instance per fragment per host. The approach taken here is to create parallel data structures and control flow functions, where necessary, and otherwise to leave the existing single-instance logic in place. The parallel structures' and functions' names are prefixed with "Mt" to facilitate the enventual clean-up. Not much of an attempt was made to factor out common functionality between the Mt- and the single-threaded version, because the single-threaded version will disappear in a follow-on patch and refactoring the existing code to fit into two parallel functions from which it's being called might end up obscuring the code more than helping it. Also, this code is relatively stable and having two parallel paths won't cause much extra work (in terms of having to apply the same changes/fixes twice) in the medium term. Changes to data structures: - QuerySchedule: per-instance and per-fragment structs with complete execution parameters (instead of partially relying on TQueryExecRequest); the per-instance execution parameter struct is a child of the per-fragment parameter struct - explicit fragment id, with range 0..#fragments-1 (instead of relying on an index into an array in TQueryExecRequest) Excluded: - runtime filter handling - anything related to RM Change-Id: I240445199e22f009f4e72fdb8754eb8d77e3d680 --- M be/src/common/global-types.h M be/src/exec/exec-node.cc M be/src/exec/union-node.cc M be/src/runtime/coordinator.cc M be/src/runtime/coordinator.h M be/src/runtime/runtime-filter-bank.cc A be/src/scheduling/query-resource-mgr.cc A be/src/scheduling/query-resource-mgr.h M be/src/scheduling/query-schedule.cc M be/src/scheduling/query-schedule.h M be/src/scheduling/scheduler.h M be/src/scheduling/simple-scheduler.cc M be/src/scheduling/simple-scheduler.h M be/src/service/fragment-exec-state.cc M be/src/service/fragment-mgr.cc M be/src/service/impala-server.cc M be/src/service/query-exec-state.cc M be/src/service/query-options.cc M be/src/service/query-options.h M be/src/util/container-util.h M be/src/util/uid-util-test.cc M be/src/util/uid-util.h M common/thrift/ExecStats.thrift M common/thrift/Frontend.thrift M common/thrift/ImpalaInternalService.thrift M common/thrift/ImpalaService.thrift M common/thrift/Planner.thrift M common/thrift/Types.thrift M fe/src/main/java/com/cloudera/impala/common/TreeNode.java M fe/src/main/java/com/cloudera/impala/planner/Planner.java M fe/src/main/java/com/cloudera/impala/service/Frontend.java M fe/src/test/java/com/cloudera/impala/planner/PlannerTestBase.java 32 files changed, 2,035 insertions(+), 559 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/54/4054/12 -- To view, visit http://gerrit.cloudera.org:8080/4054 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I240445199e22f009f4e72fdb8754eb8d77e3d680 Gerrit-PatchSet: 12 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Marcel KornackerGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Mostafa Mokhtar
[Impala-ASF-CR] IMPALA-3902: Scheduler improvements for running multiple fragment instances on a single backend
Marcel Kornacker has posted comments on this change. Change subject: IMPALA-3902: Scheduler improvements for running multiple fragment instances on a single backend .. Patch Set 11: (26 comments) regarding testing: all tests passed http://gerrit.cloudera.org:8080/#/c/4054/10/be/src/runtime/coordinator.cc File be/src/runtime/coordinator.cc: PS10, Line 294: boost > Is there any reason for the boost sets? a complete overhaul of coordinator.cc is out of scope. i have tried to switch to std:: and away from boost:: elsewhere, and it often requires very sweeping renaming, which i don't want to attempt here PS10, Line 750: std::cref(insta > why doesn't this params need a cref, but the one in StartRemoteFragments() it probably called the default copy c'tor. fixed. PS10, Line 1971: TODO-MT: remove > please mention this only happens for ST because in the MT case it is set up Done Line 1975: fragment_profiles_[instance_state.fragment_id()].num_instances); > in the else case you can DCHECK that exec_summary.exec_stats is the right s Done PS10, Line 2083: rorLogMap merged; > After the coord is treated as a remote FragmentInstanceState? yes PS10, Line 2253: : namespace { : : // Make a PublishFilter rpc to 'impalad' for given fragment_instan > nit wrap to 90chars Done http://gerrit.cloudera.org:8080/#/c/4054/10/be/src/runtime/coordinator.h File be/src/runtime/coordinator.h: PS10, Line 259: /// 1. lock_ : /// 2. FragmentInstanceState::lock_ : boost::mutex > this seems to contradict the wording in FragmentInstanceState::lock_ Done http://gerrit.cloudera.org:8080/#/c/4054/10/be/src/runtime/runtime-filter-bank.cc File be/src/runtime/runtime-filter-bank.cc: PS10, Line 84: QUERY > will this (and below) be noisy? this happens once for each filter. vlog_file seems too high a log level. http://gerrit.cloudera.org:8080/#/c/4054/10/be/src/scheduling/query-schedule.h File be/src/scheduling/query-schedule.h: PS10, Line 90: shared across fragment instances > for all fragment instances not sure what you mean. Line 106: std::vector instance_exec_params; > remove extra line Done Line 153: > when coordinator is handled as a remote fragment? Done Line 157: /// (in effect the number of remote instances) > when coordinator is handled as a remote fragment? Done PS10, Line 143: /// The following 4 functions need to be replaced once we stop special-casing : /// the coordinator instance in the coordinator. : /// The replacement is a single function int GetNumFInstances() (which includes : /// the coordinator instance). : : /// TODO-MT: remove; this is actually only the number of remote instances : /// (from the coordinator's perspective) : void set_num_fragment_instances(int64_t num_fragment_instances) { : num_fragment_instances_ = num_fragment_instances; : } : : /// Returns the number of fragment instances registered with this schedule. : /// MT: total number of fragment instances : /// ST: value set with set_num_fragment_instances(); excludes coord instance : /// (in effect the number of remote instances) : /// TODO-MT: get rid of special-casing of coordinator instance and always return the : /// total : int GetNumFragmentInstances() const; : : /// Returns the total number of fra > This is all still pretty confusing. Done PS10, Line 172: const TPlanFragment* GetCoordFragment() const; : : /// Return all fragments belonging to exec request in 'fragments'. : void GetTPlanFragments(std::vector* fragments) const; : > this is very confusing. I think you can get rid of the first one if you cha you can't use GetContainingFragment() there, because it references mt_fragment_exec_params_, but GetFragmentId() works. PS10, Line 187: : const TPlanFragment& GetContainingFragment(PlanNodeId node_id) const { : return mt_fragment_exec_params_[GetFragmentId(node_id)].fragment; > I think you can remove this- there's only 1 usage I can see in SimpleSchedu Done PS10, Line 210: : const FInstanceExecParams& GetCoordInstanceExecParams() const { : const TPlanFragment& coord_fragment = request_.mt_plan_exec_info[0].fragments[0]; : DCHECK_EQ(coord_fragment.partition.type, TPartitionType::UNPARTITIONED); : > I only see this used by the fn below, and I only see the fn below used once Done PS10, Line 244: : /// Maps from plan node id to its index in plan.nodes.
[Impala-ASF-CR] IMPALA-3902: Scheduler improvements for running multiple fragment instances on a single backend
Marcel Kornacker has posted comments on this change. Change subject: IMPALA-3902: Scheduler improvements for running multiple fragment instances on a single backend .. Patch Set 10: (29 comments) http://gerrit.cloudera.org:8080/#/c/4054/3/be/src/runtime/coordinator.cc File be/src/runtime/coordinator.cc: PS3, Line 713: > ? Done PS3, Line 1654: (!rpc_status.ok()) { > Not sure what that means - you've added code to explicitly skip the coordin the coordinator previously didn't have an associated FragmentInstanceState. i added that, and am now skipping it in this path, since it's not a remote fragment. http://gerrit.cloudera.org:8080/#/c/4054/9/be/src/runtime/coordinator.cc File be/src/runtime/coordinator.cc: PS9, Line 203: const > I think what the original comment was trying to get at was that the returne Done PS9, Line 227: int per_fragment_instance_idx_; : > not used Done PS9, Line 485: > inconsistent use of int/int32_t Done PS9, Line 486: > nit: space Done PS9, Line 488: progress_.Init(str, schedule_.num_scan_ranges()); > move this above near other usage of num_remote_instances Done PS9, Line 494: : bool has_coordinator_fragment = schedule_.GetCoordFragment() != NULL; : if (has_coordinator_fragment) { > I don't understand why this comment belongs here. Am I missing something or it's a comment on the lock acquisition. added a blank line. PS9, Line 542: ken on the coordin > Seems weird that we wouldn't have to call a function called 'FinishQuerySta renamed PS9, Line 637: } : executor_.reset(new PlanFragmentExecutor( : exec_env_, PlanFragmentExecutor::ReportStatusCallback())); > Why leave this as a comment? I'd imagine it will be called for the coord fr added todo, but i don't want to remove it, just to make sure we don't forget. PS9, Line 645: s before optimizing the LLVM module. The other expr > this and other cases where we index into fragment_instance_states_ would be Done Line 680: fragment_idx < request.fragments.size(); ++fragment_idx) { > comment where they are created. Done PS9, Line 711: > why cref(params)? Won't the reference (not the value) be copied by bind(), without the cref it's trying to copy the parameters and then complains about a missing copy c'tor (i'm glad there wasn't one). PS9, Line 712: > several of these can be omitted if schedule_ is a member variable. Again, d Done PS9, Line 735: for (const MtFragmentExecParams& fragment_params: s > why is this a comment? Done PS9, Line 746: gOptions* instance > same thing about using state_idx_ Done PS9, Line 1453: u > please make sure to run git-clang-format over this patch when you commit it Done PS9, Line 1555: SetExecPlanFragmentParams(exec_pa > It'd be nice to avoid passing these through the fn. I don't see any reason these are shared among all instances to which they apply, so i'd rather not create copies everywhere. they're also only used during instance startup, so i don't see why passing it as a parameter is inappropriate. PS9, Line 1564: rpc_params.fragment_instance_ctx.per_node_scan_ranges); : VLOG_FILE << "making rpc: ExecPlanFragment" > same thing about the state_idx_ parameter Done PS9, Line 1934: _profiles_[fragment_id]; > was removing the bounds checks intentional? Here and below. Done http://gerrit.cloudera.org:8080/#/c/4054/6/be/src/runtime/coordinator.h File be/src/runtime/coordinator.h: Line 101: class Coordinator { > nit: extra line Done http://gerrit.cloudera.org:8080/#/c/4054/9/be/src/runtime/coordinator.h File be/src/runtime/coordinator.h: PS9, Line 547: e. : void MtStartRemoteFInstances(); > Looks to me like it sets up the state for all non-coordinator fragment inst Done http://gerrit.cloudera.org:8080/#/c/4054/9/be/src/scheduling/query-resource-mgr.cc File be/src/scheduling/query-resource-mgr.cc: PS9, Line 53: void ResourceResolver::GetResourceHostport(const TNetworkAddress& src, : TNetworkAddress* dest) const { : if (ExecEnv::GetInstance()->is_pseudo_distributed_llama()) { : auto entry = impalad_to_dn_.find(src); : if (entry != impalad_to_dn_.end()) { : *dest = entry->second; : } else { : *dest = TNetworkAddress(); : } : } else { : dest->hostname = src.hostname; : dest->port = 0; : } > I take it you'll rebase on the llama removal patch? i was going to. http://gerrit.cloudera.org:8080/#/c/4054/3/be/src/scheduling/query-schedule.h File be/src/scheduling/query-schedule.h: Line 184: const TPlanFragment& GetContainingFragment(PlanNodeId node_id) const { > What's 'it' - the QuerySchedule? It doesn't get created in the scheduler, b i'm not sure i find that particularly
[Impala-ASF-CR] IMPALA-3902: Scheduler improvements for running multiple fragment instances on a single backend
Marcel Kornacker has uploaded a new patch set (#10). Change subject: IMPALA-3902: Scheduler improvements for running multiple fragment instances on a single backend .. IMPALA-3902: Scheduler improvements for running multiple fragment instances on a single backend This is an extension of the scheduler and coordinator for multi-threaded execution. It mainly removes the assumption of having one instance per fragment per host. The approach taken here is to create parallel data structures and control flow functions, where necessary, and otherwise to leave the existing single-instance logic in place. The parallel structures' and functions' names are prefixed with "Mt" to facilitate the enventual clean-up. Not much of an attempt was made to factor out common functionality between the Mt- and the single-threaded version, because the single-threaded version will eventually disappear (once multi-threaded execution is the default) and refactoring the existing code to fit into two parallel functions from which it's being called might end up obscuring the code more than helping it. Also, this code is relatively stable and having two parallel paths won't cause much extra work (in terms of having to apply the same changes/fixes twice) in the medium term. Changes to data structures: - QuerySchedule: per-instance and per-fragment structs with complete execution parameters (instead of partially relying on TQueryExecRequest); the per-instance execution parameter struct is a child of the per-fragment parameter struct - explicit fragment id, with range 0..#fragments-1 (instead of relying on an index into an array in TQueryExecRequest) Excluded: - runtime filter handling - anything related to RM Change-Id: I240445199e22f009f4e72fdb8754eb8d77e3d680 --- M be/src/common/global-types.h M be/src/exec/exec-node.cc M be/src/exec/union-node.cc M be/src/runtime/coordinator.cc M be/src/runtime/coordinator.h M be/src/runtime/runtime-filter-bank.cc A be/src/scheduling/query-resource-mgr.cc A be/src/scheduling/query-resource-mgr.h M be/src/scheduling/query-schedule.cc M be/src/scheduling/query-schedule.h M be/src/scheduling/scheduler.h M be/src/scheduling/simple-scheduler.cc M be/src/scheduling/simple-scheduler.h M be/src/service/fragment-exec-state.cc M be/src/service/fragment-mgr.cc M be/src/service/impala-server.cc M be/src/service/query-exec-state.cc M be/src/service/query-options.cc M be/src/service/query-options.h M be/src/util/container-util.h M be/src/util/uid-util-test.cc M be/src/util/uid-util.h M common/thrift/ExecStats.thrift M common/thrift/Frontend.thrift M common/thrift/ImpalaInternalService.thrift M common/thrift/ImpalaService.thrift M common/thrift/Planner.thrift M common/thrift/Types.thrift M fe/src/main/java/com/cloudera/impala/common/TreeNode.java M fe/src/main/java/com/cloudera/impala/planner/Planner.java M fe/src/main/java/com/cloudera/impala/service/Frontend.java M fe/src/test/java/com/cloudera/impala/planner/PlannerTestBase.java 32 files changed, 2,026 insertions(+), 540 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/54/4054/10 -- To view, visit http://gerrit.cloudera.org:8080/4054 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I240445199e22f009f4e72fdb8754eb8d77e3d680 Gerrit-PatchSet: 10 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Marcel KornackerGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Mostafa Mokhtar
[Impala-ASF-CR] IMPALA-3902: Scheduler improvements for running multiple fragment instances on a single backend
Marcel Kornacker has uploaded a new patch set (#9). Change subject: IMPALA-3902: Scheduler improvements for running multiple fragment instances on a single backend .. IMPALA-3902: Scheduler improvements for running multiple fragment instances on a single backend This is an extension of the scheduler and coordinator for multi-threaded execution. It mainly removes the assumption of having one instance per fragment per host. The approach taken here is to create parallel data structures and control flow functions, where necessary, and otherwise to leave the existing single-instance logic in place. The parallel structures' and functions' names are prefixed with "Mt" to facilitate the enventual clean-up. Not much of an attempt was made to factor out common functionality between the Mt- and the single-threaded version, because the single-threaded version will eventually disappear (once multi-threaded execution is the default) and refactoring the existing code to fit into two parallel functions from which it's being called might end up obscuring the code more than helping it. Also, this code is relatively stable and having two parallel paths won't cause much extra work (in terms of having to apply the same changes/fixes twice) in the medium term. Changes to data structures: - QuerySchedule: per-instance and per-fragment structs with complete execution parameters (instead of partially relying on TQueryExecRequest); the per-instance execution parameter struct is a child of the per-fragment parameter struct - explicit fragment id, with range 0..#fragments-1 (instead of relying on an index into an array in TQueryExecRequest) Excluded: - runtime filter handling - anything related to RM Change-Id: I240445199e22f009f4e72fdb8754eb8d77e3d680 --- M be/src/common/global-types.h M be/src/exec/exec-node.cc M be/src/exec/union-node.cc M be/src/runtime/coordinator.cc M be/src/runtime/coordinator.h M be/src/runtime/runtime-filter-bank.cc M be/src/scheduling/query-resource-mgr.cc M be/src/scheduling/query-resource-mgr.h M be/src/scheduling/query-schedule.cc M be/src/scheduling/query-schedule.h M be/src/scheduling/simple-scheduler.cc M be/src/scheduling/simple-scheduler.h M be/src/service/fragment-exec-state.cc M be/src/service/fragment-mgr.cc M be/src/service/impala-server.cc M be/src/service/query-exec-state.cc M be/src/service/query-options.cc M be/src/service/query-options.h M be/src/util/uid-util-test.cc M be/src/util/uid-util.h M common/thrift/ExecStats.thrift M common/thrift/Frontend.thrift M common/thrift/ImpalaInternalService.thrift M common/thrift/ImpalaService.thrift M common/thrift/Planner.thrift M common/thrift/Types.thrift M fe/src/main/java/com/cloudera/impala/common/TreeNode.java M fe/src/main/java/com/cloudera/impala/planner/Planner.java M fe/src/main/java/com/cloudera/impala/service/Frontend.java M fe/src/test/java/com/cloudera/impala/planner/PlannerTestBase.java 30 files changed, 1,504 insertions(+), 547 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/54/4054/9 -- To view, visit http://gerrit.cloudera.org:8080/4054 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I240445199e22f009f4e72fdb8754eb8d77e3d680 Gerrit-PatchSet: 9 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Marcel KornackerGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Mostafa Mokhtar
[Impala-ASF-CR] IMPALA-3902: Scheduler improvements for running multiple fragment instances on a single backend
Marcel Kornacker has uploaded a new patch set (#8). Change subject: IMPALA-3902: Scheduler improvements for running multiple fragment instances on a single backend .. IMPALA-3902: Scheduler improvements for running multiple fragment instances on a single backend This is an extension of the scheduler and coordinator for multi-threaded execution. It mainly removes the assumption of having one instance per fragment per host. The approach taken here is to create parallel data structures and control flow functions, where necessary, and otherwise to leave the existing single-instance logic in place. The parallel structures' and functions' names are prefixed with "Mt" to facilitate the enventual clean-up. Not much of an attempt was made to factor out common functionality between the Mt- and the single-threaded version, because the single-threaded version will eventually disappear (once multi-threaded execution is the default) and refactoring the existing code to fit into two parallel functions from which it's being called might end up obscuring the code more than helping it. Also, this code is relatively stable and having two parallel paths won't cause much extra work (in terms of having to apply the same changes/fixes twice) in the medium term. Changes to data structures: - QuerySchedule: per-instance and per-fragment structs with complete execution parameters (instead of partially relying on TQueryExecRequest); the per-instance execution parameter struct is a child of the per-fragment parameter struct - explicit fragment id, with range 0..#fragments-1 (instead of relying on an index into an array in TQueryExecRequest) Excluded: - runtime filter handling - anything related to RM Change-Id: I240445199e22f009f4e72fdb8754eb8d77e3d680 --- M be/src/common/global-types.h M be/src/exec/exec-node.cc M be/src/exec/union-node.cc M be/src/runtime/coordinator.cc M be/src/runtime/coordinator.h M be/src/runtime/runtime-filter-bank.cc M be/src/scheduling/query-resource-mgr.cc M be/src/scheduling/query-resource-mgr.h M be/src/scheduling/query-schedule.cc M be/src/scheduling/query-schedule.h M be/src/scheduling/simple-scheduler.cc M be/src/scheduling/simple-scheduler.h M be/src/service/fragment-exec-state.cc M be/src/service/fragment-mgr.cc M be/src/service/impala-server.cc M be/src/service/query-exec-state.cc M be/src/service/query-options.cc M be/src/service/query-options.h M be/src/util/uid-util-test.cc M be/src/util/uid-util.h M common/thrift/ExecStats.thrift M common/thrift/Frontend.thrift M common/thrift/ImpalaInternalService.thrift M common/thrift/ImpalaService.thrift M common/thrift/Planner.thrift M common/thrift/Types.thrift M fe/src/main/java/com/cloudera/impala/common/TreeNode.java M fe/src/main/java/com/cloudera/impala/planner/Planner.java M fe/src/main/java/com/cloudera/impala/service/Frontend.java M fe/src/test/java/com/cloudera/impala/planner/PlannerTestBase.java 30 files changed, 1,505 insertions(+), 548 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/54/4054/8 -- To view, visit http://gerrit.cloudera.org:8080/4054 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I240445199e22f009f4e72fdb8754eb8d77e3d680 Gerrit-PatchSet: 8 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Marcel KornackerGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Mostafa Mokhtar
[Impala-ASF-CR] IMPALA-3902: Scheduler improvements for running multiple fragment instances on a single backend
Mostafa Mokhtar has posted comments on this change. Change subject: IMPALA-3902: Scheduler improvements for running multiple fragment instances on a single backend .. Patch Set 3: (3 comments) http://gerrit.cloudera.org:8080/#/c/4054/3/be/src/runtime/coordinator.cc File be/src/runtime/coordinator.cc: Line 377: bool is_mt_exec = request.query_ctx.request.query_options.mt_dop != 1; Possibly make is_mt_exec a member of Coordinator as query_options.mt_dop is checked 5 times. http://gerrit.cloudera.org:8080/#/c/4054/3/be/src/runtime/coordinator.h File be/src/runtime/coordinator.h: PS3, Line 467: Mt Is the expectation to remove the "Mt" prefix once all operators support multi threading? http://gerrit.cloudera.org:8080/#/c/4054/6/fe/src/main/java/com/cloudera/impala/planner/Planner.java File fe/src/main/java/com/cloudera/impala/planner/Planner.java: Line 218: "statistics. Drop and re-compute statistics to resolve this problem.\n" + I believe this is printing too much information by default. -- To view, visit http://gerrit.cloudera.org:8080/4054 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I240445199e22f009f4e72fdb8754eb8d77e3d680 Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Marcel KornackerGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Mostafa Mokhtar Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-3902: Scheduler improvements for running multiple fragment instances on a single backend
Marcel Kornacker has posted comments on this change. Change subject: IMPALA-3902: Scheduler improvements for running multiple fragment instances on a single backend .. Patch Set 4: (27 comments) http://gerrit.cloudera.org:8080/#/c/4054/4/be/src/runtime/coordinator.h File be/src/runtime/coordinator.h: PS4, Line 218: const TNetworkAddress local_addr_; > Can you use ExecEnv::backend_address()? Or find some way to avoid the dupli Done PS4, Line 297: Wait > nit: Wait() Done PS4, Line 388: fragment_instance_state_idxs > Is this referencing the "fragment id (TPlanFragment.id)"? If so, please say it's not, it's an index into fragment_instance_states_. i thought that would be clear from the name. added comment. http://gerrit.cloudera.org:8080/#/c/4054/4/be/src/scheduling/query-schedule.cc File be/src/scheduling/query-schedule.cc: PS4, Line 121: std::sort(fragments.begin(), fragments.end(), : [](const TPlanFragment* a, const TPlanFragment* b) { return a->id < b->id; }); > shouldn't these come in an expected order? this adds an extra level of robustness, otherwise i have to make this ordering guarantee part of the interface. i don't expect this to have any performance impact. PS4, Line 346: NULL > nullptr Done http://gerrit.cloudera.org:8080/#/c/4054/4/be/src/scheduling/query-schedule.h File be/src/scheduling/query-schedule.h: Line 97: std::mapper_exch_num_senders; > any benefit from using map over unordered_map here? it makes it easier to assign to thrift structs, and this isn't performance-critical. PS4, Line 173: NULL > nullptr Done Line 174: const TPlanFragment* GetCoordFragment() const; > nit: trailing whitespace Done Line 213: MtFragmentExecParams* GetFragmentExecParams(FragmentId id) { > If you need non-const access, isn't it more idiomatic to return a non-const we typically return pointers for mutable structures (and const refs for immutable structures). PS4, Line 219: NULL > nullptr Done Line 307: > comment, e.g. what state in MtFragmentExecParams gets initialized here vs. Done http://gerrit.cloudera.org:8080/#/c/4054/4/be/src/scheduling/simple-scheduler.cc File be/src/scheduling/simple-scheduler.cc: PS4, Line 384: auto > auto only for iterators this is a map iterator PS4, Line 414: depth-first traversal > isn't this an implementation detail of the other MtComputeFragmentExecParam Done PS4, Line 425: f > can you change f indexes here to match the code? it's explained as if it st moved comment PS4, Line 433: const TDataStreamSink& sink = src_fragment.output_sink.stream_sink; : DCHECK( : sink.output_partition.type == TPartitionType::UNPARTITIONED : || sink.output_partition.type == TPartitionType::HASH_PARTITIONED : || sink.output_partition.type == TPartitionType::RANDOM); > move this closer to where it's used on l453 Done PS4, Line 470: MtComputeFragmentExecParams > Can we be sure that recursion depth will not be a problem here? yes Line 524: // try to load-balance scan ranges by assigning just beyond the > nit: wrap at 90 characters Done PS4, Line 541: while (params_idx < params_list.size() : && total_assigned_bytes < threshold_total_bytes) { > It's not obvious to me why this will always assign all params. If the last i'm not sure what to add. for the last fragment, threshold_total_bytes == total_size, so this won't get exceeded. if total_assigned_bytes == total_size, by definition there can't be any params left. let me know how i can clarify. Line 695: for (int j = 0; j < params.hosts.size(); ++j, ++num_fragment_instances) { > you could call reserve() before this line i'll leave the old code as-is. PS4, Line 933: if (!FLAGS_disable_admission_control) { : RETURN_IF_ERROR(admission_controller_->AdmitQuery(schedule)); : } : if (!FLAGS_enable_rm) return Status::OK(); : : string user = GetEffectiveUser(schedule->request().query_ctx.session); : if (user.empty()) user = "default"; : schedule->PrepareReservationRequest(resolved_pool, user); : const TResourceBrokerReservationRequest& reservation_request = : schedule->reservation_request(); : if (!reservation_request.resources.empty()) { : Status status = resource_broker_->Reserve( : reservation_request, schedule->reservation()); : if (!status.ok()) { : // Warn about missing table and/or column stats if necessary. : const TQueryCtx& query_ctx = schedule->request().query_ctx; : if(!query_ctx.__isset.parent_query_id && :
[Impala-ASF-CR] IMPALA-3902: Scheduler improvements for running multiple fragment instances on a single backend
Marcel Kornacker has uploaded a new patch set (#5). Change subject: IMPALA-3902: Scheduler improvements for running multiple fragment instances on a single backend .. IMPALA-3902: Scheduler improvements for running multiple fragment instances on a single backend This is an extension of the scheduler and coordinator for multi-threaded execution. It mainly removes the assumption of having one instance per fragment per host. The approach taken here is to create parallel data structures and control flow functions, where necessary, and otherwise to leave the existing single-instance logic in place. The parallel structures' and functions' names are prefixed with "Mt" to facilitate the enventual clean-up. Not much of an attempt was made to factor out common functionality between the Mt- and the single-threaded version, because the single-threaded version will eventually disappear (once multi-threaded execution is the default) and refactoring the existing code to fit into two parallel functions from which it's being called might end up obscuring the code more than helping it. Also, this code is relatively stable and having two parallel paths won't cause much extra work (in terms of having to apply the same changes/fixes twice) in the medium term. Changes to data structures: - QuerySchedule: per-instance and per-fragment structs with complete execution parameters (instead of partially relying on TQueryExecRequest); the per-instance execution parameter struct is a child of the per-fragment parameter struct - explicit fragment id, with range 0..#fragments-1 (instead of relying on an index into an array in TQueryExecRequest) Excluded: - runtime filter handling - anything related to RM Change-Id: I240445199e22f009f4e72fdb8754eb8d77e3d680 --- M be/CMakeLists.txt M be/src/common/global-types.h M be/src/runtime/coordinator.cc M be/src/runtime/coordinator.h M be/src/scheduling/query-resource-mgr.cc M be/src/scheduling/query-resource-mgr.h M be/src/scheduling/query-schedule.cc M be/src/scheduling/query-schedule.h M be/src/scheduling/simple-scheduler.cc M be/src/scheduling/simple-scheduler.h M be/src/service/fragment-exec-state.cc M be/src/service/fragment-mgr.cc M be/src/service/impala-server.cc M be/src/service/query-exec-state.cc M be/src/service/query-options.cc M be/src/service/query-options.h M be/src/util/CMakeLists.txt M be/src/util/uid-util-test.cc M be/src/util/uid-util.h M common/thrift/ExecStats.thrift M common/thrift/Frontend.thrift M common/thrift/ImpalaInternalService.thrift M common/thrift/ImpalaService.thrift M common/thrift/Planner.thrift M common/thrift/Types.thrift M fe/src/main/java/com/cloudera/impala/common/TreeNode.java M fe/src/main/java/com/cloudera/impala/planner/Planner.java M fe/src/main/java/com/cloudera/impala/service/Frontend.java M fe/src/test/java/com/cloudera/impala/planner/PlannerTestBase.java 29 files changed, 1,490 insertions(+), 546 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/54/4054/5 -- To view, visit http://gerrit.cloudera.org:8080/4054 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I240445199e22f009f4e72fdb8754eb8d77e3d680 Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Marcel KornackerGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Matthew Jacobs