Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14104 )

Change subject: IMPALA-8803: Coordinator should release admitted memory 
per-backend
......................................................................


Patch Set 2:

(10 comments)

I actually had another round of comments, nothing too major.

http://gerrit.cloudera.org:8080/#/c/14104/1/be/src/runtime/coordinator-backend-state-test.cc
File be/src/runtime/coordinator-backend-state-test.cc:

http://gerrit.cloudera.org:8080/#/c/14104/1/be/src/runtime/coordinator-backend-state-test.cc@40
PS1, Line 40: TUniqueId());
nit: should be a ** according to style guide


http://gerrit.cloudera.org:8080/#/c/14104/1/be/src/runtime/coordinator-backend-state-test.cc@52
PS1, Line 52:   void MakeBackendStates(int num_states, QuerySchedule* 
query_schedule,
nit: ++i according to google style guide


http://gerrit.cloudera.org:8080/#/c/14104/2/be/src/runtime/coordinator-backend-state-test.cc
File be/src/runtime/coordinator-backend-state-test.cc:

http://gerrit.cloudera.org:8080/#/c/14104/2/be/src/runtime/coordinator-backend-state-test.cc@113
PS2, Line 113:   backend_resource_state->Close(&unreleased_backend_states);
Calling Close() multiple times feels a bit weird to me since it doesn't really 
mirror the real usage pattern of BackendResourceStat. I guess maybe if we 
rename the method it will be a bit more obvious.

Maybe exposing backend_resource_states_ directly, or having a test-only method 
to get the count of backends in different stats would end up making the intent 
of the test clearer.


http://gerrit.cloudera.org:8080/#/c/14104/2/be/src/runtime/coordinator-backend-state.h
File be/src/runtime/coordinator-backend-state.h:

http://gerrit.cloudera.org:8080/#/c/14104/2/be/src/runtime/coordinator-backend-state.h@472
PS2, Line 472:   /// Return a vector of UNRELEASED or PENDING BackendStates.
Can you document that it's idempotent, and also if the contract is that the 
caller is expected to release resources for the returned backends.


http://gerrit.cloudera.org:8080/#/c/14104/2/be/src/runtime/coordinator-backend-state.h@477
PS2, Line 477:   static const int64_t RELEASE_BACKEND_STATES_DELAY_MS = 1000;
I think we should consider making this and BATCHED_RELEASE_DECAY_FACTOR hidden 
flags, just as a safety valve in case the heuristic turns out to not work well 
in production.

Hopefully this is the good kind of safety valve where it's easy to add and we 
never actually need to use it in practice because we picked good defaults.


http://gerrit.cloudera.org:8080/#/c/14104/2/be/src/runtime/coordinator-backend-state.h@512
PS2, Line 512: int
maybe int64_t? I don't think this is likely to overflow but would be nice to 
have more of a buffer


http://gerrit.cloudera.org:8080/#/c/14104/1/be/src/runtime/coordinator-backend-state.h
File be/src/runtime/coordinator-backend-state.h:

http://gerrit.cloudera.org:8080/#/c/14104/1/be/src/runtime/coordinator-backend-state.h@477
PS1, Line 477: c con
> nit: how about call this CloseAndGetUnreleasedBackends() then return the un
+1 to returning the vector instead of requiring the caller pass it in, I think 
it simplifies the calling code.


http://gerrit.cloudera.org:8080/#/c/14104/2/be/src/runtime/coordinator-backend-state.cc
File be/src/runtime/coordinator-backend-state.cc:

http://gerrit.cloudera.org:8080/#/c/14104/2/be/src/runtime/coordinator-backend-state.cc@807
PS2, Line 807:   for (auto backend_state : backend_states_) {
I think we can merge this into the constructor, since there's nothing that can 
fail and Init() is called right after the constructor. Usually the Init() 
pattern is used when we need to return a Status.


http://gerrit.cloudera.org:8080/#/c/14104/2/be/src/runtime/coordinator-backend-state.cc@824
PS2, Line 824:     // If the coordinator backend has not been released, but all 
other have, then the only
What would happen if we enabled this when query result spooling was disabled 
also?

If we can release resources early in that case too it may be beneficial. I 
think for a lot of queries there's enough buffering in the coordinator fragment 
to allow other fragments to be released (e.g. if it's a non-grouping 
aggregation, or if there are a small number of rows returned).


http://gerrit.cloudera.org:8080/#/c/14104/2/be/src/scheduling/admission-controller.cc
File be/src/scheduling/admission-controller.cc:

http://gerrit.cloudera.org:8080/#/c/14104/2/be/src/scheduling/admission-controller.cc@992
PS2, Line 992:       num_released_backends_[schedule.query_id()] -= 
host_addrs.size();
nit: We could use the released_backends iterator instead of looking it up again 
in the map.



--
To view, visit http://gerrit.cloudera.org:8080/14104
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I88bb11e0ede7574568020e0277dd8ac8d2586dc9
Gerrit-Change-Number: 14104
Gerrit-PatchSet: 2
Gerrit-Owner: Sahil Takiar <[email protected]>
Gerrit-Reviewer: Bikramjeet Vig <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Michael Ho <[email protected]>
Gerrit-Reviewer: Sahil Takiar <[email protected]>
Gerrit-Reviewer: Tim Armstrong <[email protected]>
Gerrit-Comment-Date: Thu, 22 Aug 2019 19:52:38 +0000
Gerrit-HasComments: Yes

Reply via email to