Thomas Tauber-Marshall has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15961 )

Change subject: IMPALA-9692 (part 3): Model QuerySchedule as a protobuf
......................................................................


Patch Set 5:

(28 comments)

http://gerrit.cloudera.org:8080/#/c/15961/4//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/15961/4//COMMIT_MSG@31
PS4, Line 31: In order to keep payloads reasonable for the evenaul RPC 
interface,
            :   AdmissionController::ReleaseQuery() and ReleaseQueryBacken
> the reason we do this is because ReleaseQuery() and ReleaseQueryBackend() w
Right.


http://gerrit.cloudera.org:8080/#/c/15961/4//COMMIT_MSG@43
PS4, Line 43: Para
> nit: typo?
Done


http://gerrit.cloudera.org:8080/#/c/15961/4/be/src/runtime/coordinator-backend-state.cc
File be/src/runtime/coordinator-backend-state.cc:

http://gerrit.cloudera.org:8080/#/c/15961/4/be/src/runtime/coordinator-backend-state.cc@a95
PS4, Line 95:
            :
> intentionally removed?
Yeah, we no longer include the host with the FInstanceExecParams that we send 
back (because we don't want a duplicate copy of the hostname per finstance), so 
there's nothing to DCHECK for here anymore.


http://gerrit.cloudera.org:8080/#/c/15961/4/be/src/runtime/exec-params.h
File be/src/runtime/exec-params.h:

http://gerrit.cloudera.org:8080/#/c/15961/4/be/src/runtime/exec-params.h@51
PS4, Line 51: Return a reference
> nit: typo
Done


http://gerrit.cloudera.org:8080/#/c/15961/4/be/src/runtime/exec-params.h@73
PS4, Line 73:   std::unordered_map<int32_t, const TPlanFragment&> fragments_;
> Nit: In the coordinator, we treat this like a vector, relying on the fragme
I guess the reason I did it this way is that the fragments don't come to us in 
sorted order by plan id, so we would have to insert them into the vector and 
then sort it, which is O(n*log(n)) in the number of fragments whereas just 
inserting into the hashmap is O(n)

I don't have a strong preference, though. There are of course arguments for and 
against both approaches.


http://gerrit.cloudera.org:8080/#/c/15961/4/be/src/runtime/exec-params.cc
File be/src/runtime/exec-params.cc:

http://gerrit.cloudera.org:8080/#/c/15961/4/be/src/runtime/exec-params.cc@32
PS4, Line 32: 'exec_reques
> nit: typo
Done


http://gerrit.cloudera.org:8080/#/c/15961/4/be/src/runtime/exec-params.cc@32
PS4, Line 32: reference
> nit: typo
Done


http://gerrit.cloudera.org:8080/#/c/15961/4/be/src/runtime/exec-params.cc@48
PS4, Line 48: 1);
> is it worth adding some DCHECKs to make sure both of these vectors have at
Done


http://gerrit.cloudera.org:8080/#/c/15961/4/be/src/scheduling/admission-controller.h
File be/src/scheduling/admission-controller.h:

http://gerrit.cloudera.org:8080/#/c/15961/4/be/src/scheduling/admission-controller.h@341
PS4, Line 341: std::unique_ptr<QuerySchedulePB>* schedule_result
> might be nice to put this in a struct called AdmissionResponse? even if it
Definitely agree that the eventual AdmitQuery RPC will take an AdmitQueryReq 
and return an AdmitQueryResp.

I was going to leave defining those for the eventual patch where I define the 
whole AdmissionControlService RPC interface since a few other things will 
change that will impact it (eg. SubmitForAdmission() is blocking but the 
eventual AdmitQuery RPC will be non-blocking, so the possible 'admit_outcome' 
values are different) that I haven't fully fleshed out yet.

I can define a AdmitQueryResp that just contains a QuerySchedulePB for now or 
something like that if you want.


http://gerrit.cloudera.org:8080/#/c/15961/4/be/src/scheduling/admission-controller.cc
File be/src/scheduling/admission-controller.cc:

http://gerrit.cloudera.org:8080/#/c/15961/4/be/src/scheduling/admission-controller.cc@899
PS4, Line 899: move(
> nit: we omit the std namespace in .cc files, right?
I fixed the instances in this file.

Fwiw, there are some instances of "std::unordered_map" in scheduler.cc that are 
necessary to disambiguate with boost::unordered_map, which is used in 
scheduler.h. Ideally, we would remove all the uses of boost::unordered_map from 
Impala, but I figured this patch was big enough as it is so I left them for now.


http://gerrit.cloudera.org:8080/#/c/15961/4/be/src/scheduling/admission-controller.cc@1607
PS4, Line 1607: GetMemToAdmit(*schedule, entry.second);
              :   }
              : }
> Nit: You could use GetMemToAdmit() here
Done


http://gerrit.cloudera.org:8080/#/c/15961/4/be/src/scheduling/cluster-membership-mgr.cc
File be/src/scheduling/cluster-membership-mgr.cc:

http://gerrit.cloudera.org:8080/#/c/15961/4/be/src/scheduling/cluster-membership-mgr.cc@344
PS4, Line 344: const UniqueIdPB& backend_id
> why was this changed to use a backend_id instead of a be_desc? just curious
Previously, Coordinator::BackendState had access to its BackendDescriptorPB via 
its BackendExecParams. That's no longer the case since we don't include the 
BackendDescriptorPB with the BackendExecParamsPB (since the coordinator already 
has the BackendDescriptorPBs for all of the backends from the statestore and it 
would be a bunch of unnecessary data getting transferred to send them for each 
backend in each QuerySchedulePB).

Instead, we just send the BackendIds for each backend, and then anytime the 
coordinator needs the BackendDescriptorPBs, as it does here, it can look them 
up in the current membership.


http://gerrit.cloudera.org:8080/#/c/15961/4/be/src/scheduling/cluster-membership-mgr.cc@352
PS4, Line 352: INFO
> should this ever happen? is there a reason it can't be DFATAL?
Yeah, I think this could happen if:

- a query gets scheduled on a backend just before the statestore decides that 
its no longer responding and removes it from the membership
- the coordinator then tries to start up execution just after receiving the 
updated membership but before CancelFromThreadPool gets around to cancelling 
the query
- Exec() to the backend fails so we try to blacklist it, but its no longer in 
the cluster membership

Probably pretty unlikely, but possible.


http://gerrit.cloudera.org:8080/#/c/15961/4/be/src/scheduling/query-schedule.h
File be/src/scheduling/query-schedule.h:

http://gerrit.cloudera.org:8080/#/c/15961/4/be/src/scheduling/query-schedule.h@56
PS4, Line 56:  protobuf struct c
> do you mean backend_exec_params?
No, the point I'm making is that this is a pointer to something that's a child 
of QuerySchedule::query_schedule_pb_. I reworded the comment to hopefully make 
it clearer.


http://gerrit.cloudera.org:8080/#/c/15961/4/be/src/scheduling/query-schedule.h@58
PS4, Line 58: er
> 'pb' is a bit of an add name for this, why not just 'backend_exec_params_pb
That's fair. Of course I was trying to avoid getting too verbose/repetitive - 
it would be unfortunate to have a bunch of instances of 
'backend_exec_params.backend_exec_params_pb->..' all over Scheduler.

Combined with your suggestion to rename QuerySchedule/QuerySchedulePB, here's 
what I'm thinking:

'QuerySchedule' is arguably made up of two parts: the actual schedule 
(QuerySchedulePB) and some intermediate state about the query for the 
convenience of Scheduler/AdmissionController, so maybe the best thing would be 
to rename 'QuerySchedule' to something like 'ScheduleState'?

Then, maybe I rename the rest of the classes here to match, i.e. 
BackendScheduleState/FragmentScheduleState/etc, so they also don't clash with 
their corresponding BackendExecParamsPB/etc.

Then, I can just name this 'exec_params', and using it will look like 
'backend_schedule_state.exec_params->...'

I'm gonna save actually doing this until we have agreement (or maybe even 
submit it in a followup to keep from making this patch even messier).


http://gerrit.cloudera.org:8080/#/c/15961/4/be/src/scheduling/query-schedule.h@82
PS4, Line 82: Scheduler::Comput
> Nit: BackendExecParamsPB
Done


http://gerrit.cloudera.org:8080/#/c/15961/4/be/src/scheduling/query-schedule.h@108
PS4, Line 108: iven backend will
> swapped to 'BackendExecParamsPB' right?
Done


http://gerrit.cloudera.org:8080/#/c/15961/4/be/src/scheduling/query-schedule.h@114
PS4, Line 114: _'
> same comment as above, name could be revised.
Done


http://gerrit.cloudera.org:8080/#/c/15961/4/be/src/scheduling/query-schedule.h@124
PS4, Line 124: l
> so we shouldn't use this class (or any of the related structs) outside the
Done


http://gerrit.cloudera.org:8080/#/c/15961/4/be/src/scheduling/query-schedule.h@130
PS4, Line 130: FInstanceExec
> it would be nice if this class or QuerySchedulePB was renamed. QuerySchedul
Addressed in another comment.


http://gerrit.cloudera.org:8080/#/c/15961/4/be/src/scheduling/query-schedule.h@193
PS4, Line 193:   const TPlanFragment& GetContainingFragment(PlanNodeId node_id) 
const {
> why is non-const access to per_backend_exec_params_ and fragment_exec_param
We used to use a pattern where Scheduler creates a local PerBackendExecParams, 
fills it in, does a couple more loops over it to set some additional 
parameters, and then gives it to QuerySchedule via 
set_per_backend_exec_params(), after which its no longer modified.

With this change, we have QuerySchedule manage creating the individual 
BackendExecParams in GetOrCreateBackendExecParams() so that it can create the 
BackendExecParamsPBs, and so when we do the additional loops over it we need to 
be able to access it in a non-const way.

Of course, now that this class is only used by Scheduler/AdmissionController, 
it matters a little less cause we're sort of trusting them to mutate the 
schedule correctly (i.e. we already expose non-const access to the 
FragmentExecParams below)


http://gerrit.cloudera.org:8080/#/c/15961/4/be/src/scheduling/query-schedule.h@285
PS4, Line 285:
> mention that ownership is transferred to the coordinator after submission h
Done


http://gerrit.cloudera.org:8080/#/c/15961/4/be/src/scheduling/query-schedule.h@320
PS4, Line 320: 
             :   /// Used to generate consecutive fragment instance ids.
> Nit: Same thought I mentioned on ExecParams, the fragment indexes are expec
Addressed on your other comment.


http://gerrit.cloudera.org:8080/#/c/15961/4/be/src/scheduling/query-schedule.cc
File be/src/scheduling/query-schedule.cc:

http://gerrit.cloudera.org:8080/#/c/15961/4/be/src/scheduling/query-schedule.cc@86
PS4, Line 86: for (int i = 0; i < fragments_.size(); ++i)
> can u still use the 'for ([var] : fragments)' pattern?
No, we rely on 'fragment_exec_params_' being in order of fragment idx, but 
since fragments_ is an unordered_map the order of iteration over it isn't 
guaranteed

(that is, unless I go with Joe's suggestion to make 'fragments_' a vector, see 
other comments for details)


http://gerrit.cloudera.org:8080/#/c/15961/4/be/src/scheduling/query-schedule.cc@153
PS4, Line 153:   for (const auto& entry : per_backend_exec_params_) {
> is this just general cleanup? I guess per_backend_exec_params_ is smaller t
Validate() is called after the FInstanceExecParamsPB have been swapped to the 
BackendExecParamsPB, so we have to iterate over per_backend_exec_params_ to 
have access to them.

Added some comments about this.


http://gerrit.cloudera.org:8080/#/c/15961/4/be/src/scheduling/query-schedule.cc@201
PS4, Line 201: GetOrCreateBackendEx
> seems more like 'GetOrCreateBackendExecParams' or 'LoadBackendExecParams' m
Done


http://gerrit.cloudera.org:8080/#/c/15961/4/be/src/scheduling/scheduler.cc
File be/src/scheduling/scheduler.cc:

http://gerrit.cloudera.org:8080/#/c/15961/4/be/src/scheduling/scheduler.cc@300
PS4, Line 300:         *scan_ranges.mutable_scan_ranges() = 
{entry.second.begin(), entry.second.end()};
             :       }
> why do u have to do this lookup? how was it working before?
The structure has changed, because thrift allows maps with value types that are 
lists but protobuf doesn't (see eg. the comment on ScanRangesPB definition)

So previously 'first_entry->second' and 'instance_params.per_node_scan_ranges' 
where both of the form 'map<int, list<ScanRangeParamsPB>>' and we could just 
copy one straight into the other but now we're copying into a protobuf struct 
of the form 'map<int, ScanRangesPB>' where ScanRangesPB just contains a 
'list<ScanRangeParamsPB>', so we have to iterate over and copy each element 
individually to do the transform.


http://gerrit.cloudera.org:8080/#/c/15961/4/common/protobuf/admission_control_service.proto
File common/protobuf/admission_control_service.proto:

http://gerrit.cloudera.org:8080/#/c/15961/4/common/protobuf/admission_control_service.proto@39
PS4, Line 39: n i
> typo?
No, its saying that this is a fragment-level value not a query-level value. 
I'll reword this to hopefully make it clearer.



--
To view, visit http://gerrit.cloudera.org:8080/15961
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1db64e72f84604b1d8ac24e0bdd4ad6bedd6bcd9
Gerrit-Change-Number: 15961
Gerrit-PatchSet: 5
Gerrit-Owner: Thomas Tauber-Marshall <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Joe McDonnell <[email protected]>
Gerrit-Reviewer: Sahil Takiar <[email protected]>
Gerrit-Reviewer: Thomas Tauber-Marshall <[email protected]>
Gerrit-Comment-Date: Fri, 19 Jun 2020 00:50:36 +0000
Gerrit-HasComments: Yes

Reply via email to