Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13207 )

Change subject: IMPALA-8460: Simplify cluster membership management
......................................................................


Patch Set 8:

(12 comments)

http://gerrit.cloudera.org:8080/#/c/13207/8/be/src/scheduling/cluster-membership-mgr.h
File be/src/scheduling/cluster-membership-mgr.h:

http://gerrit.cloudera.org:8080/#/c/13207/8/be/src/scheduling/cluster-membership-mgr.h@67
PS8, Line 67:   typedef std::shared_ptr<const TBackendDescriptor> 
BackendDescriptorPtr;
I think I would prefer if there was something in the name of this and 
SnapshotPtr that indicated that it was a shared_ptr, so that it was more 
obvious in other code that it wasn't just a raw pointer. It's probably a matter 
of taste so I'll leave it up to you.


http://gerrit.cloudera.org:8080/#/c/13207/8/be/src/scheduling/cluster-membership-mgr.h@70
PS8, Line 70:   typedef std::unordered_map<string, TBackendDescriptor> 
BackendIdMap;
nit: std::string in headers here and elsewhere. I guess a "using" is in a 
header somewhere.


http://gerrit.cloudera.org:8080/#/c/13207/8/be/src/scheduling/cluster-membership-mgr.h@79
PS8, Line 79:   struct Snapshot {
Probably worth documenting that we expect this to be copy-constructed.


http://gerrit.cloudera.org:8080/#/c/13207/8/be/src/scheduling/cluster-membership-mgr.h@121
PS8, Line 121:   /// Registers a callback to provide the local backend 
descriptor.
Can you document thread safety for these functions and the basics of the 
calling sequence (e.g. can be called anytime before or after Init()).

I think it would be good to document invariants in more detail and add DCHECKs 
as necessary. Otherwise it gets confusing about whether the implementation 
needs to support various cases and adds cognitive load to people working on the 
implementation.

I.e.
* Is it valid to call multiple times with different functions. Are there any 
guarantees that the old callback won't be called after this function returns?
* Is it valid to call with an empty function to deregister a callback.
* Etc.

It looks like in practice it's only ever called once with a non-NULL callback 
and I don't think we're likely to need to generalise it. So I'd suggested 
documenting that invariant and adding checks that the function is non-empty as 
appropriate 
(https://en.cppreference.com/w/cpp/utility/functional/function/operator_bool)


http://gerrit.cloudera.org:8080/#/c/13207/8/be/src/scheduling/cluster-membership-mgr.h@133
PS8, Line 133:   /// Returns a read only snapshot of the current cluster 
membership state.
Valid to call any time before Init()?


http://gerrit.cloudera.org:8080/#/c/13207/8/be/src/scheduling/cluster-membership-mgr.cc
File be/src/scheduling/cluster-membership-mgr.cc:

http://gerrit.cloudera.org:8080/#/c/13207/8/be/src/scheduling/cluster-membership-mgr.cc@39
PS8, Line 39:   if (statestore_subscriber_ == nullptr) return Status::OK();
Consider DCHECK(TestInfo::is_test());


http://gerrit.cloudera.org:8080/#/c/13207/8/be/src/scheduling/cluster-membership-mgr.cc@188
PS8, Line 188: RemoveExecutor
> Annoyingly, this call to RemoveExecutor() (and the others below) are wrong.
Yeah I like 1) since it makes the comparison more explicit. We should add a 
DCHECK too (either making it an invariant that the executor must be present 
when RemoveExecutor() is called, or returning a bool and DCHECKing here).

I suggested in a comment elsewhere that we could disable auto-generation of the 
== operator since for most of our thrift types it doesn't make sense to use.


http://gerrit.cloudera.org:8080/#/c/13207/8/be/src/scheduling/cluster-membership-mgr.cc@235
PS8, Line 235: std::
nit: unnecessary std::


http://gerrit.cloudera.org:8080/#/c/13207/8/be/src/scheduling/cluster-membership-mgr.cc@313
PS8, Line 313:   for (const auto& group_it : executor_groups) {
Would this consistency check have detected the bug you found? Do we need to 
either enhance this check or the testing?


http://gerrit.cloudera.org:8080/#/c/13207/8/be/src/scheduling/executor-group-test.cc
File be/src/scheduling/executor-group-test.cc:

http://gerrit.cloudera.org:8080/#/c/13207/8/be/src/scheduling/executor-group-test.cc@79
PS8, Line 79:
Might be good to have a randomised test for this or ClusterMembershipMgr that 
adds/removes executors randomly and checks consistency.


http://gerrit.cloudera.org:8080/#/c/13207/8/be/src/scheduling/executor-group.h
File be/src/scheduling/executor-group.h:

http://gerrit.cloudera.org:8080/#/c/13207/8/be/src/scheduling/executor-group.h@41
PS8, Line 41: class ExecutorGroup {
Worth making explicit that this can be copy-constructed. Either by 
documentation or by declaring the default copy constructor.


http://gerrit.cloudera.org:8080/#/c/13207/8/be/src/scheduling/executor-group.cc
File be/src/scheduling/executor-group.cc:

http://gerrit.cloudera.org:8080/#/c/13207/8/be/src/scheduling/executor-group.cc@63
PS8, Line 63:   if (find(be_descs.begin(), be_descs.end(), be_desc) == 
be_descs.end()) {
It would be nice if we could avoid accidentally using the equality operator for 
TBackendDescriptor. It looks like there's an options for thrift cpp to not 
generate the equality operators:

  diff --git a/common/thrift/CMakeLists.txt b/common/thrift/CMakeLists.txt
  index f297292..d7bc014 100644
  --- a/common/thrift/CMakeLists.txt
  +++ b/common/thrift/CMakeLists.txt
  @@ -58,7 +58,7 @@ function(THRIFT_GEN VAR)
       # It depends on hive_meta_store, which in turn depends on fb303.
       # The java dependency is handled by maven.
       # We need to generate C++ src file for the parent dependencies using the 
"-r" option.
  -    set(CPP_ARGS ${THRIFT_INCLUDE_DIR_OPTION} --gen cpp:moveable_types -o
  +    set(CPP_ARGS ${THRIFT_INCLUDE_DIR_OPTION} --gen   
cpp:moveable_types,no_default_operators -o
         ${BE_OUTPUT_DIR})
       IF (THRIFT_FILE STREQUAL "beeswax.thrift")
         set(CPP_ARGS -r ${CPP_ARGS})


I tried it and there's fallout around TUniqueId and TNetworkAddress - maybe we 
could disable the codegen option and define custom overloads of the required 
operators for those two types.



--
To view, visit http://gerrit.cloudera.org:8080/13207
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib3cf9a8bb060d0c6e9ec8868b7b21ce01f8740a3
Gerrit-Change-Number: 13207
Gerrit-PatchSet: 8
Gerrit-Owner: Lars Volker <[email protected]>
Gerrit-Reviewer: Bikramjeet Vig <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Lars Volker <[email protected]>
Gerrit-Reviewer: Michael Ho <[email protected]>
Gerrit-Reviewer: Thomas Marshall <[email protected]>
Gerrit-Reviewer: Tim Armstrong <[email protected]>
Gerrit-Comment-Date: Thu, 09 May 2019 18:38:27 +0000
Gerrit-HasComments: Yes

Reply via email to