Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10744 )

Change subject: IMPALA-1760: Implement shutdown command
......................................................................


Patch Set 11:

(10 comments)

http://gerrit.cloudera.org:8080/#/c/10744/11//COMMIT_MSG
Commit Message:

PS11:
RE: "I have a question that should we call CheckNotShuttingDown() in 
ImpalaInternalService::ExecQueryFInstances? If somehow the quiesce period is 
too short that coordinators still schedule fragment instances to the shutting 
down node, the queries can be failed fast."

I kind of like the suggestion but I'd be concerned about adding another code 
path to test - for the moment it seems simpler to just have one code path for 
failing queries that run past the quiesce period.


http://gerrit.cloudera.org:8080/#/c/10744/11//COMMIT_MSG@26
PS11, Line 26: e.g. statestore down
> Could you add tests for this? Could you explain more about how the shutdown
I added a basic test that kills the statestore. This property is pretty trivial 
to verify by looking at the code - the shutdown code path doesn't communicate 
with the statestore or refer to the cluster membership or anything like that.


http://gerrit.cloudera.org:8080/#/c/10744/11//COMMIT_MSG@29
PS11, Line 29: * If shutting down, a banner is shown on the root debug page.
> does this get exposed programatically somehow as well? I would think that c
You can get it in JSON form from the root debug page, i.e. 
http://host:port/?json=true, although that might not be the "right" interface. 
I thought about adding more functions to query status, etc., but decided not to 
do that in this patch because it felt a little speculative and the patch is 
already large.


http://gerrit.cloudera.org:8080/#/c/10744/11//COMMIT_MSG@32
PS11, Line 32: 1. (if a coordinator) clients are prevented from submitting
             :   queries to this coordinator via some out-of-band mechanism,
             :   e.g. load balancer
> should shutting-down coordinators reject new queries or new sessions after
The current patch starts rejecting new queries and sessions immediately once a 
coordinator begins shutting down. Clarified here.


http://gerrit.cloudera.org:8080/#/c/10744/11/be/src/service/client-request-state.cc
File be/src/service/client-request-state.cc:

http://gerrit.cloudera.org:8080/#/c/10744/11/be/src/service/client-request-state.cc@628
PS11, Line 628:   for (int i = 0; i < 3; ++i) {
> What about sleep several seconds before the next retry like this?
Seems like a good idea to consider, but I don't want to make the change in this 
patch (we want to keep this consistent with the backend exec RPC). I filed 
IMPALA-7283 to track this.

To that end though, I removed the code duplication with the other place that 
does retry to make it easier to make such changes in the future.
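
As a rough illustration of the idea (not the code in this patch, which is C++), a shared retry helper with an optional pause between attempts could look like the sketch below. The name `retry_with_delay` and its parameters are hypothetical; only the three-attempt count comes from the loop quoted above.

```python
import time


def retry_with_delay(op, max_attempts=3, delay_s=0.0, sleep=time.sleep):
    """Call op() up to max_attempts times, pausing delay_s between failures.

    Returns op()'s result on the first success and re-raises the last
    exception if every attempt fails. `sleep` is injectable so callers
    (and tests) can avoid real waiting.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    last_error = None
    for attempt in range(max_attempts):
        try:
            return op()
        except Exception as e:  # hypothetical: real code would check RPC status
            last_error = e
            # Only pause if another attempt will follow.
            if attempt + 1 < max_attempts:
                sleep(delay_s)
    raise last_error
```

With a single helper like this, switching to a several-second delay would be a one-argument change at each call site.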


http://gerrit.cloudera.org:8080/#/c/10744/11/be/src/util/default-path-handlers.cc
File be/src/util/default-path-handlers.cc:

http://gerrit.cloudera.org:8080/#/c/10744/11/be/src/util/default-path-handlers.cc@228
PS11, Line 228: bool is_quiescing = impala_server->IsShuttingDown();
> I think for now we should just stick to is_shutting_down since this stateme
I think that the Impala daemon can be quiescing even after the period has 
elapsed - it's only successfully quiesced once nothing is running on it. So 
that really means that the quiesce period is the minimum quiesce period. 
Updated some comments accordingly.
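
A minimal sketch of that distinction, assuming the daemon counts running fragment instances and registered queries (the two counters named in the status string elsewhere in this review). The function name and parameters are hypothetical, and the separate shutdown deadline is deliberately ignored here.

```python
def is_quiesced(elapsed_ms, quiesce_period_ms,
                fragment_instances, queries_registered):
    """True only once the minimum quiesce period has elapsed AND nothing
    is running on the daemon. Until then the daemon is still quiescing."""
    period_elapsed = elapsed_ms >= quiesce_period_ms
    idle = fragment_instances == 0 and queries_registered == 0
    return period_elapsed and idle
```

So a daemon that is 15s into a 10s quiesce period but still has a running fragment instance is quiescing, not quiesced.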


http://gerrit.cloudera.org:8080/#/c/10744/11/common/thrift/StatestoreService.thrift
File common/thrift/StatestoreService.thrift:

http://gerrit.cloudera.org:8080/#/c/10744/11/common/thrift/StatestoreService.thrift@79
PS11, Line 79: it
> nit: typo
Done


http://gerrit.cloudera.org:8080/#/c/10744/11/fe/src/main/java/org/apache/impala/analysis/AdminFnStmt.java
File fe/src/main/java/org/apache/impala/analysis/AdminFnStmt.java:

http://gerrit.cloudera.org:8080/#/c/10744/11/fe/src/main/java/org/apache/impala/analysis/AdminFnStmt.java@95
PS11, Line 95:    * Supports optionally specifying the backend and the deadline: either shutdown(),
             :    * shutdown('host:port'), shutdown(deadline), shutdown('host:port', deadline).
> Cool! It'd be better to mention these in the commit message:
Updated the commit message.
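
For illustration only, the four call forms can be told apart by argument count and type. The sketch below is Python rather than the Java in AdminFnStmt.java, `parse_shutdown_args` is a hypothetical name, and it assumes the deadline is an integer (the comment above does not state its type or units).

```python
def _is_int(value):
    # bool is a subclass of int in Python; exclude it explicitly.
    return isinstance(value, int) and not isinstance(value, bool)


def parse_shutdown_args(*args):
    """Map the four shutdown(...) forms to a (backend, deadline) pair.

    None means the argument was not given, so the caller would fall
    back to whatever defaults apply.
    """
    if len(args) == 0:
        return (None, None)                    # shutdown()
    if len(args) == 1 and isinstance(args[0], str):
        return (args[0], None)                 # shutdown('host:port')
    if len(args) == 1 and _is_int(args[0]):
        return (None, args[0])                 # shutdown(deadline)
    if len(args) == 2 and isinstance(args[0], str) and _is_int(args[1]):
        return (args[0], args[1])              # shutdown('host:port', deadline)
    raise ValueError("unsupported shutdown() arguments: %r" % (args,))
```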


http://gerrit.cloudera.org:8080/#/c/10744/11/tests/custom_cluster/test_restart_services.py
File tests/custom_cluster/test_restart_services.py:

http://gerrit.cloudera.org:8080/#/c/10744/11/tests/custom_cluster/test_restart_services.py@94
PS11, Line 94: r'quiesce period left: ([0-9ms]*), deadline left: ([0-9ms]*), ' +
             :       r'fragment instances: ([0-9]*), queries registered: ([0-9]*)'
> I'd be great if these can be shown in the web page dynamically.
This appears in a banner on the front page of the web UI - is that what you 
were asking for?
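
For reference, a small sketch of how the quoted pattern pulls the four fields out of a status string. The regex is the one from the test; the helper name, the dictionary keys, and any sample input are made up for illustration.

```python
import re

# Pattern copied from the quoted test code.
STATUS_RE = re.compile(
    r'quiesce period left: ([0-9ms]*), deadline left: ([0-9ms]*), ' +
    r'fragment instances: ([0-9]*), queries registered: ([0-9]*)')


def parse_shutdown_status(text):
    """Return the four captured fields as a dict, or None if no match."""
    m = STATUS_RE.search(text)
    if m is None:
        return None

    def to_int(s):
        # [0-9]* can match an empty string, so guard before converting.
        return int(s) if s else None

    return {
        'quiesce_period_left': m.group(1),
        'deadline_left': m.group(2),
        'fragment_instances': to_int(m.group(3)),
        'queries_registered': to_int(m.group(4)),
    }
```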


http://gerrit.cloudera.org:8080/#/c/10744/11/tests/custom_cluster/test_restart_services.py@212
PS11, Line 212:     # Test that we can reduce the deadline after setting it to a high value.
> maybe test that we cannot increase the deadline too
Done
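
The property under test can be sketched as a deadline that only ever moves earlier. This is an assumption-laden illustration: `update_deadline` is a hypothetical name, and whether the real code silently ignores a later deadline or reports an error is not stated here; the sketch simply keeps the earlier value.

```python
def update_deadline(current_deadline_ms, requested_deadline_ms):
    """Return the effective deadline after a new shutdown request.

    A request can reduce the deadline but never increase it.
    """
    return min(current_deadline_ms, requested_deadline_ms)
```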



--
To view, visit http://gerrit.cloudera.org:8080/10744
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4d5606ccfec84db4482c1e7f0f198103aad141a0
Gerrit-Change-Number: 10744
Gerrit-PatchSet: 11
Gerrit-Owner: Tim Armstrong <[email protected]>
Gerrit-Reviewer: Bikramjeet Vig <[email protected]>
Gerrit-Reviewer: Dan Hecht <[email protected]>
Gerrit-Reviewer: Fredy Wijaya <[email protected]>
Gerrit-Reviewer: Lars Volker <[email protected]>
Gerrit-Reviewer: Michael Ho <[email protected]>
Gerrit-Reviewer: Pranay Singh
Gerrit-Reviewer: Quanlong Huang <[email protected]>
Gerrit-Reviewer: Tim Armstrong <[email protected]>
Gerrit-Reviewer: Todd Lipcon <[email protected]>
Gerrit-Comment-Date: Thu, 12 Jul 2018 00:20:29 +0000
Gerrit-HasComments: Yes
