----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/70168/#review213774 -----------------------------------------------------------
Since this RR adds code which was/is removed in https://reviews.apache.org/r/70169/ I believe they should be a single refactoring patch (at least that's how I reviewed them both). Ideally we would have one patch moving code out of the SLRP, and one or more other patches to do cleanups, if only to make it easier to confirm that we do not introduce new bugs. This RR does way too many things to rule that out with much confidence. src/csi/metrics.cpp Lines 44 (patched) <https://reviews.apache.org/r/70168/#comment299798> Let's call out here that we fall through intentionally. src/csi/service_manager.cpp Lines 113 (patched) <https://reviews.apache.org/r/70168/#comment299771> Separate code blocks by empty lines from surrounding text. src/csi/service_manager.cpp Lines 123-125 (patched) <https://reviews.apache.org/r/70168/#comment299800> This docstring doesn't reflect the actual return type. I find boolean values are always tricky as they can require a lot of non-local context to interpret. How about we just use the previous return type `[(ContainerID, Option(ContainerStatus)]`? src/csi/service_manager.cpp Lines 133 (patched) <https://reviews.apache.org/r/70168/#comment299804> The _if necessary_ part isn't clear to me, and I am not sure a user of this function needs to know. Maybe just something along the lines of _waits for the endpoint socket to be ready_ and that some timeout is involved. We should also document that `endpoint` is a unix URI. ``` // Returns a `Future` ... ``` src/csi/service_manager.cpp Lines 137 (patched) <https://reviews.apache.org/r/70168/#comment299805> ``` // Returns a `Future` of the URI(?) of the ... ``` src/csi/service_manager.cpp Lines 139 (patched) <https://reviews.apache.org/r/70168/#comment299809> Do we need to mention the daemon here? Maybe just ``` If the container is not already running, this method will start a new container. ``` src/csi/service_manager.cpp Lines 155-156 (patched) <https://reviews.apache.org/r/70168/#comment299817> Since `Owned` can be shared, let's make this class explicitly non-copyable to prevent lifetime issues, e.g., ``` // Since this class contains `Owned` members which should not but can // be copied, explicitly make this class non-copyable. // // TODO(chhsia0): Remove this once MESOS-5122 is fixed. ServiceManagerProcess(const ServiceManagerProcess&) = delete; ServiceManagerProcess& operator=(const ServiceManagerProcess&) = delete; ``` src/csi/service_manager.cpp Lines 184-185 (patched) <https://reviews.apache.org/r/70168/#comment299818> In Mesos we typically expand types unless they are already mentioned somewhere in the statement or are nested types (e.g., iterators). ```` foreach (const CSIPluginContainerInfo::Service& service, services) { foreach (const CSIPluginContainerInfo& container, info.containers()) { ... } } ```` src/csi/service_manager.cpp Lines 186-191 (patched) <https://reviews.apache.org/r/70168/#comment299819> Let's either add a comment here why it is always okay to only ever store the first found service, or maybe even better separate input validation from setting up our own state. src/csi/service_manager.cpp Lines 235-236 (patched) <https://reviews.apache.org/r/70168/#comment299823> Let's keep the previous code from SLRP process `recover` and the previous `getContainers` return type. src/csi/service_manager.cpp Lines 260 (patched) <https://reviews.apache.org/r/70168/#comment299824> If we rewrap the comment we can `s/despite if/even though/`. Ideally we'd do that in a trivial cleanup patch. src/csi/service_manager.cpp Lines 264 (patched) <https://reviews.apache.org/r/70168/#comment299825> This shows how a unnamed boolean is much harder to understand than variable that carries clear semantics like the previous `isRunningContainer`. src/csi/service_manager.cpp Lines 316 (patched) <https://reviews.apache.org/r/70168/#comment299772> Let's just use the `static` function like we had previously, ``` static ContainerID getContainerId( const CSIPluginInfo& info, const CSIPluginContainerInfo& container); ``` src/csi/service_manager.cpp Lines 332-338 (patched) <https://reviews.apache.org/r/70168/#comment299773> Maybe the following instead? ```` auto it = std::find_if( info.containers().begin(), info.containers().end(), [this, containerId](const CSIPluginContainerInfo& info) { return getContainerId(info) == containerId; }); return it != info.containers().end() ? *it : Option<CSIPluginContainerInfo>::none(); ```` src/csi/service_manager.cpp Lines 354 (patched) <https://reviews.apache.org/r/70168/#comment299799> Let's not hide that we capture `this` here as such lambdas can be tricky. src/csi/service_manager.cpp Lines 375 (patched) <https://reviews.apache.org/r/70168/#comment299801> Let's add a comment here that and how we namespace our containers. src/csi/service_manager.cpp Lines 380-386 (patched) <https://reviews.apache.org/r/70168/#comment299802> Once we change the return type of this lambda we can get rid of this comment and all the coupling it needs to call out. src/csi/service_manager.cpp Lines 389 (patched) <https://reviews.apache.org/r/70168/#comment299803> As `result` has a different type than the function return value we could `move` here. src/csi/service_manager.cpp Lines 460 (patched) <https://reviews.apache.org/r/70168/#comment299806> Seems like we don't need to defer to self? src/csi/service_manager.cpp Lines 485 (patched) <https://reviews.apache.org/r/70168/#comment299807> `s/[=]/[this]/` src/csi/service_manager.cpp Lines 496-506 (patched) <https://reviews.apache.org/r/70168/#comment299808> I find the data flow in this way of chaining confusing (e.g., not immediately clear who generated the argument to this continuation). Can we first handle the `then` case and only then the `onAny`? src/csi/service_manager.cpp Lines 514-517 (patched) <https://reviews.apache.org/r/70168/#comment299810> This feels weird. Why can't we just do ``` if (endpoints.contains(containerId)) { return endpoints.at(containerId)->future(); } CHECK(!daemons.contains(containerId)); ``` src/csi/service_manager.cpp Lines 519-520 (patched) <https://reviews.apache.org/r/70168/#comment299811> How about avoiding the hard assertion here and instead returning a `Failure`? Since this was in the existing code, so maybe let's do any cleanup in a separate patch. src/csi/service_manager.cpp Lines 551 (patched) <https://reviews.apache.org/r/70168/#comment299812> We don't really use such abbreviated names, maybe `s/endpointVar/endpoint_/` or even `endpoint` if we inline the temporary? ``` Environment::Variable* endpoint = commandInfo.mutable_environment()->add_variables(); endpoint->set_name("CSI_ENDPOINT"); endpoint->set_value("unix://" + endpointPath.get()); ``` src/csi/service_manager.cpp Lines 614 (patched) <https://reviews.apache.org/r/70168/#comment299814> Should we have a metric for the start action as well? src/csi/service_manager.cpp Lines 620-621 (patched) <https://reviews.apache.org/r/70168/#comment299813> From this log line it isn't really clear that this is the teardown action for above `Connecting to endpoint ...`. Maybe instead `Disconnected from endpoint ...`? src/csi/service_manager.cpp Lines 644-645 (patched) <https://reviews.apache.org/r/70168/#comment299815> Not added here, but this assumes than an `Owned` can be shared. Let's clean that up in a follow-up. src/csi/service_manager.cpp Lines 646 (patched) <https://reviews.apache.org/r/70168/#comment299816> Looks good, not a simple code move though (also, `recover` additionally handles abandoned futures) ;) Might want to make the capture list explicit (`[this, containerId]`). - Benjamin Bannier On March 12, 2019, 7:59 p.m., Chun-Hung Hsiao wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/70168/ > ----------------------------------------------------------- > > (Updated March 12, 2019, 7:59 p.m.) > > > Review request for mesos, Benjamin Bannier, Jie Yu, and Jan Schlicht. > > > Bugs: MESOS-9632 > https://issues.apache.org/jira/browse/MESOS-9632 > > > Repository: mesos > > > Description > ------- > > The `ServiceManager` is agnostic to CSI versions, so can be used to > manage plugin containers for both CSI v0 and v1 plugins. Most of its > logic are adapted from the SLRP code. > > We also separate the CSI plugin metrics from SLRP metrics object so the > metrics can be accessed by `ServiceManager`. However, we do not make > `ServiceManager` own the CSI plugin metrics object because eventually we > would like to decouple metrics lifecycles from SLRP lifecycles. > > > Diffs > ----- > > src/CMakeLists.txt 3397c3b1d4e8a7900b2e5f870679cc7aa30b4be2 > src/Makefile.am d451d7cabe3bf5d4f5039cfac5de1b03ef891d07 > src/csi/metrics.hpp PRE-CREATION > src/csi/metrics.cpp PRE-CREATION > src/csi/service_manager.hpp PRE-CREATION > src/csi/service_manager.cpp PRE-CREATION > > > Diff: https://reviews.apache.org/r/70168/diff/6/ > > > Testing > ------- > > Testing done later in chain. > > > Thanks, > > Chun-Hung Hsiao > >
