Thomas Tauber-Marshall has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16215 )

Change subject: IMPALA-5746: Cancel all queries scheduled by failed coordinators
......................................................................


Patch Set 3:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/16215/3/be/src/runtime/exec-env.cc
File be/src/runtime/exec-env.cc:

http://gerrit.cloudera.org:8080/#/c/16215/3/be/src/runtime/exec-env.cc@554
PS3, Line 554:           
server->CancelQueriesOnFailedBackends(current_backend_set);
> I think this might be unnecessary on executor-only nodes?
Yeah for this and your other comment below, probably the cleanest thing is two 
just have two distinct callbacks, one that is registered if is_executor and 
calls CancelQueriesForFailedCoordinators, and one for is_coordinator that call 
CancelQueriesOnFailedBackends.

If both is_executor and is_coordinator is true you'll just register both 
callbacks, but ClusterMembershipMgr can handle that.


http://gerrit.cloudera.org:8080/#/c/16215/3/be/src/runtime/query-exec-mgr.cc
File be/src/runtime/query-exec-mgr.cc:

http://gerrit.cloudera.org:8080/#/c/16215/3/be/src/runtime/query-exec-mgr.cc@222
PS3, Line 222:   // TODO: create cancellation task queue and working thread to 
run cancellation tasks
             :   // on a separate thread. If the queue is full, ignore the 
cancellations since we'll
             :   // be able to process them on the next heartbeat instead.
             :
             :   for (auto& qs : to_cancel) {
             :     VLOG(1) << "CancelQueriesForFailedCoordinators(): cancel 
query " << qs->query_id();
             :     qs->Cancel();
             :     qs->is_coord_active_.Store(false);
             :     ReleaseQueryState(qs);
             :   }
> if you move this into impala-server.cc you can use the ImpalaServer::Cancel
I had suggested not to put this in ImpalaServer because its primarily the 
client interface (eg. beeswax and hs2) and is therefore really a coordinator 
component.

It's true that executor-only impalads will still have an ImpalaServer instance, 
but we might someday want to change that as we clean up the separation between 
executors and coordinators. On the other hand, ImpalaServer is probably so 
intertwined with everything else, and the benefit would be small enough, that 
we'll never actually do that.

It also doesn't seem like that much code duplication to put it here.

I don't really feel strongly about it, though.


http://gerrit.cloudera.org:8080/#/c/16215/3/tests/custom_cluster/test_restart_services.py
File tests/custom_cluster/test_restart_services.py:

http://gerrit.cloudera.org:8080/#/c/16215/3/tests/custom_cluster/test_restart_services.py@241
PS3, Line 241:   def test_kill_coordinator(self):
This is an okay place for this test (obviously we have a lot of different test 
files and the distinction between them isn't always clear), but maybe instead 
put it test_process_failures.py instead?



--
To view, visit http://gerrit.cloudera.org:8080/16215
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I918fcc27649d5d2bbe8b6ef47fbd9810ae5f57bd
Gerrit-Change-Number: 16215
Gerrit-PatchSet: 3
Gerrit-Owner: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Sahil Takiar <stak...@cloudera.com>
Gerrit-Reviewer: Thomas Tauber-Marshall <tmarsh...@cloudera.com>
Gerrit-Reviewer: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Comment-Date: Tue, 21 Jul 2020 19:25:00 +0000
Gerrit-HasComments: Yes

Reply via email to