Lars Volker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13550 )

Change subject: WIP IMPALA-8484: Run queries on disjoint executor groups
......................................................................


Patch Set 7:

(15 comments)

http://gerrit.cloudera.org:8080/#/c/13550/3/be/src/scheduling/admission-controller.h
File be/src/scheduling/admission-controller.h:

http://gerrit.cloudera.org:8080/#/c/13550/3/be/src/scheduling/admission-controller.h@347
PS3, Line 347:   boost::mutex admission_ctrl_lock_;
> For the  #running queries metrics, one thing to keep in mind is that the ne
As we discussed in person, I removed group stats altogether and added the 
number of admitted queries to the /backends debug page.


http://gerrit.cloudera.org:8080/#/c/13550/3/be/src/scheduling/admission-controller.h@578
PS3, Line 578:
> This data structure some some significance for the policy, since the iterat
Changed it to an ordered map and added a comment at the typedef.


http://gerrit.cloudera.org:8080/#/c/13550/3/be/src/scheduling/admission-controller.cc
File be/src/scheduling/admission-controller.cc:

http://gerrit.cloudera.org:8080/#/c/13550/3/be/src/scheduling/admission-controller.cc@470
PS3, Line 470:        << " needs=" << PrintBytes(p
> pool_max_mem is the max mem across the cluster, so we should use pool_stats
Done


http://gerrit.cloudera.org:8080/#/c/13550/3/be/src/scheduling/admission-controller.cc@530
PS3, Line 530: xRequestsForPool(pool
> pool_cfg.max_requests would also scale with the cluster, so if we decide to
I switched this to a slot based model that allows pool_cfg.max_requests 
concurrent queries per executor.


http://gerrit.cloudera.org:8080/#/c/13550/3/be/src/scheduling/admission-controller.cc@672
PS3, Line 672:         schedule.per_backend_exec_params().size(), 
cluster_thread_reservation,
             :         query_opts.thread_reservation_aggregate_limit);
             :     return true;
             :   }
> I think we should be logging all reasons, otherwise we wont know whats hold
I change the relevant code to log every reason (the profiles still only contain 
the last one). Let me know if you'd like us to find a way to include all 
non-admissions in the profile, too.


http://gerrit.cloudera.org:8080/#/c/13550/3/be/src/scheduling/admission-controller.cc@689
PS3, Line 689: int64_t cluster_mem_to_admit = 
schedule.GetClusterMemoryToAdmit();
> this should be outside this scope, that is, before the lock. otherwise it w
Done


http://gerrit.cloudera.org:8080/#/c/13550/3/be/src/scheduling/admission-controller.cc@692
PS3, Line 692:           Substitute(REASON_REQ_OVER_POOL_MEM, 
PrintBytes(cluster_mem_to_admit),
> If we iterate over the groups, will we tend to admit to the first group? So
Good point, made it an ordered map and added a comment to the header.


http://gerrit.cloudera.org:8080/#/c/13550/3/be/src/scheduling/admission-controller.cc@712
PS3, Line 712: ics_.pool_max_queued->SetValue(pool_cfg.max_
> I would also like to avoid adding a flag. Agree there's a good chance we ca
I split up the checks in per-cluster and per-group rejection tests, but since 
the cluster size can change between admission attempts, it still seems 
beneficial to run both checks for every admission attempt. I added code to only 
run the tests when the cluster membership changes but we don't have 
versioning/notification for changes to the pool configs so that approach might 
not work (see new comment in DequeueLoop()).


http://gerrit.cloudera.org:8080/#/c/13550/3/be/src/scheduling/admission-controller.cc@999
PS3, Line 999:     DCHECK_GE(remote_pool_stats.num_admitted_running, 0);
> Nit: I think this method should have a different name as the Scheduler stil
Done


http://gerrit.cloudera.org:8080/#/c/13550/3/be/src/scheduling/admission-controller.cc@1009
PS3, Line 1009:
> nit: maybe add a dcheck to make sure this in-consistency ( < ) never happen
Done


http://gerrit.cloudera.org:8080/#/c/13550/3/be/src/scheduling/admission-controller.cc@1177
PS3, Line 1177: _GE(stats->agg_num_queued(), stats->local_stats().num_queued);
              :
> this is fine, its what our current behavior is. If we let queries after thi
Yeah, this TODO carried over from old code. Should we just remove it for now 
and file a jira instead?


http://gerrit.cloudera.org:8080/#/c/13550/6/be/src/scheduling/cluster-membership-mgr.h
File be/src/scheduling/cluster-membership-mgr.h:

http://gerrit.cloudera.org:8080/#/c/13550/6/be/src/scheduling/cluster-membership-mgr.h@98
PS6, Line 98:     /// The version of this Snapshot. It is incremented every 
time the cluster membership
This probably needs at least some smoke testing


http://gerrit.cloudera.org:8080/#/c/13550/3/be/src/scheduling/cluster-membership-mgr.h
File be/src/scheduling/cluster-membership-mgr.h:

http://gerrit.cloudera.org:8080/#/c/13550/3/be/src/scheduling/cluster-membership-mgr.h@97
PS3, Line 97:
> Nit: add a description
Done


http://gerrit.cloudera.org:8080/#/c/13550/3/be/src/service/impala-server.cc
File be/src/service/impala-server.cc:

http://gerrit.cloudera.org:8080/#/c/13550/3/be/src/service/impala-server.cc@234
PS3, Line 234: DEFINE_int32(status_report_cancellation_padding, 20, "(Advanced) 
The coordinator will "
> s/comma/commas/
Currently nothing, I added a DCHECK that makes sure that only a single group is 
allowed and updated the help string. A few weeks ago we talked about eventually 
supporting a hierarchy of executor groups, which we would model by executors 
being in multiple groups. The flag already uses the plural form so that we 
don't have to change the flag name in the future. Please let me know if you'd 
like us to go with the singular form for now or if we should chat about it more.


http://gerrit.cloudera.org:8080/#/c/13550/3/be/src/service/impala-server.cc@1053
PS3, Line 1053:     // result_metadata are atomic.
              :     lock_guard<mutex> l(*(*request_state)->lock());
              :
              :     // register exec state as early as possible so that queries 
that
              :     // take a long time to plan show up, and to handle incoming 
status
              :     // reports before execution starts.
              :     RETURN_IF_ERROR(RegisterQuery(session_state, 
*request_state));
              :     *registered_request_state = true;
              :
              : #ifndef NDEBUG
              :     /
> we still need to keep track of this in order to cancel running queries, may
Done, moved it to the ClientRequestState.



--
To view, visit http://gerrit.cloudera.org:8080/13550
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8a1d0900f2a82bd2fc0a906cc094e442cffa189b
Gerrit-Change-Number: 13550
Gerrit-PatchSet: 7
Gerrit-Owner: Lars Volker <[email protected]>
Gerrit-Reviewer: Andrew Sherman <[email protected]>
Gerrit-Reviewer: Bikramjeet Vig <[email protected]>
Gerrit-Reviewer: Lars Volker <[email protected]>
Gerrit-Reviewer: Tim Armstrong <[email protected]>
Gerrit-Comment-Date: Thu, 27 Jun 2019 21:25:14 +0000
Gerrit-HasComments: Yes

Reply via email to