> On March 4, 2016, 11:30 p.m., Avinash sridharan wrote: > > src/slave/containerizer/mesos/isolators/network/cni.cpp, line 134 > > <https://reviews.apache.org/r/44269/diff/4/?file=1280944#file1280944line134> > > > > I don't understand this comment. We just made sure the plugin does not > > exist? So what does the comment imply "it can > > // still be valid as long as operator puts the CNI plugin binary > > // that it uses under '--network_cni_plugins_dir'." ? > > > > I think at this point we should return an error. If can't find an > > executable for a named network, the behavior will become undefined. We > > should bail at this point. > > Qian Zhang wrote: > My point is, if we can not find a plugin for a named network during > initilization, log a warning message to let operator know this issue, and > afterward operator can put the plugin in the plugin directory without > restarting agent, then the named network can still work. > > Avinash sridharan wrote: > Lets not rely on the operator heeding WARNING messages and fixing the > problem. My concern is that this is a `FATAL` error since before the operator > can rectify the error if containers are launched the behavior becomes > undefined. > > Qian Zhang wrote: > Agree, let's return an error :-) > > Qian Zhang wrote: > After more thinking, I think in this case, it makes more sense to log a > warning message and ignore the network config file rather than bail at this > point, because there can be other valid network config files. If in the end > there is no any valid network config files, we should definitely bail. > > Avinash sridharan wrote: > I think we should not allow any errors in the configs/plugins passed by > the operator . Reason being that frameworks are going to learn about networks > out-of-band, and if there are config/plugin errors we will have to throw > errors during task launch. Why should we allow the system to proceed knowing > that this is going to lead to erroneous situation? The only way the operator > can fix this error is by restarting the slave (and fixing the config), so > might as well bail out sooner rather than later.
Can you please let me know how this can lead to erroneous situation? If a network config file is invalid for whatever reason, "network/isolator" will NOT load it and just ignore it, so how can framework launches a task to join an invalid network which is not loaded by the isolator? I do not think framework user has such knowledge, or you think framework user will know all the network config files (valid or invalid) under "--network_cni_config_dir" in some way? > On March 4, 2016, 11:30 p.m., Avinash sridharan wrote: > > src/slave/containerizer/mesos/isolators/network/cni.cpp, lines 108-115 > > <https://reviews.apache.org/r/44269/diff/4/?file=1280944#file1280944line108> > > > > isn't this the same as !nameValue.isSome() ? > > > > Also LOG(WARNING) should be LOG(ERROR) > > Qian Zhang wrote: > Please see > https://github.com/apache/mesos/blob/0.27.1/src/docker/docker.cpp#L255:L260 > for the existing way to handle the return value of `json.get().find()`. > > Avinash sridharan wrote: > Well... !isSome implies isNone or isError is true which is what you are > aiming for ? Will simplify things I guess. > > Qian Zhang wrote: > Let's follow the existing way, I think it is better to not introduce > inconsistence :-) > > Avinash sridharan wrote: > Inconsistence in terms of coding style is bad. But copying logic just for > the sake of consistency doesn't make sense. Just because a logic is > implemented a certain way doesn't mean we can't improve on it. I will leave > it up to you, but would prefer if we can shorten the code if possible. Can you please elaborate how to shorten the code? Do you mean instead of calling `isNone()` and `isError()` we can simply call `isSome()`? I do not think we can do it, because `isNone()` and `isError()` have different purposes, we need `isNone()` to handle the case that there is no "name" in the JSON, and we need `isError()` to handle the case that there is an error to find "name" in the JSON. That's why we return different error for `isNone()` and `isError()` respectively. - Qian ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/44269/#review122077 ----------------------------------------------------------- On March 7, 2016, 12:03 a.m., Qian Zhang wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/44269/ > ----------------------------------------------------------- > > (Updated March 7, 2016, 12:03 a.m.) > > > Review request for mesos, Avinash sridharan, Gilbert Song, and Jie Yu. > > > Bugs: MESOS-4759 > https://issues.apache.org/jira/browse/MESOS-4759 > > > Repository: mesos > > > Description > ------- > > Added the framework of 'network/cni' isolator. > > > Diffs > ----- > > src/CMakeLists.txt 8f57a5701073bf1eaaa223383e928cf5db8f8ae4 > src/Makefile.am a41e95ddeb838fdebf4ced953c4a29181916e261 > src/slave/containerizer/mesos/isolators/network/cni.hpp PRE-CREATION > src/slave/containerizer/mesos/isolators/network/cni.cpp PRE-CREATION > > Diff: https://reviews.apache.org/r/44269/diff/ > > > Testing > ------- > > make check > > > Thanks, > > Qian Zhang > >