> On April 23, 2018, 10:43 p.m., Greg Mann wrote: > > src/master/master.cpp > > Line 4512 (original), 4515-4527 (patched) > > <https://reviews.apache.org/r/66568/diff/5/?file=2010510#file2010510line4515> > > > > I was just wondering - perhaps this code belongs in the validation > > function for the ACCEPT call? Since this is something that we can easily > > confirm synchronously in the top-level scheduler API handler and return > > 400, rather than calling into the accept handlers. > > Zhitao Li wrote: > I'm fine in either case as long as it provides better consistency and > less code duplication (which I think is a problem). I guess the reason there > is a split is due to that AuthZ is async right now. > > I actualy wonder what happens if we move the slave connectivity checks to > `Master::accept()`: since master to agent connection through network can be > broken at any moment, I do not see checking that in `_accept()` is 100% safe > anyway, and `send()` calls to agent need to handle failures anyway. > > Alternatively, we can consider moving authorization checks as early as we > can, and condense all other validations after AuthZ returned (assuming AuthZ > do not depends on these validations). > > Zhitao Li wrote: > (replied too fast) I feel both work can reduce the code duplicate, and > makes it less ambiguous for future development. > > Chun-Hung Hsiao wrote: > IMO in general it's better to have validations before authN/authZ, > because validations is usually more light-weight than authN/authZ. Also, > validations should be synchronous, meaning that if the call is invalid, we > will get rid of it faster from the memory (rather than waiting for the > authN/authZ to finish). > > Chun-Hung Hsiao wrote: > But in this case since this is temporary, I'm fine with putting the logic > here.
Question: Are these operations usable with this restriction? It seems to me that with this restriction, the framework will need to maintain one more extra state: 1. reserve enough CPUs, mem and free disk (otherwise it is possible that the framework cannot get enough CPUs and mem after growing the volume); 2. grow the persistent volume; 3. use the grown volume. If we implement a proper validation right now, 1 and 2 can be combined, and there will be no need to rewrite the framework in the future. Can you explain again what the concern of implementing the correct validation logic again? - Chun-Hung ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/66568/#review201785 ----------------------------------------------------------- On April 23, 2018, 8:07 p.m., Zhitao Li wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/66568/ > ----------------------------------------------------------- > > (Updated April 23, 2018, 8:07 p.m.) > > > Review request for mesos, Chun-Hung Hsiao and Greg Mann. > > > Bugs: MESOS-4965 > https://issues.apache.org/jira/browse/MESOS-4965 > > > Repository: mesos > > > Description > ------- > > These two operations are intended to be non-speculative eventually but > are implemented as speculative right now. To avoid frameworks opt-in to > dangerous behavior, we require that accept can only contain one > `GROW_VOLUME` or `SHRINK_VOLUME` and no other operations. > > This is implemented by reuse existing code which already drops `LAUNCH` > or `LAUNCH_GROUP` with proper reason and message. > > > Diffs > ----- > > src/master/master.cpp 767ad8cfe142b47ef07172bcb2a4fb49fc3e833a > > > Diff: https://reviews.apache.org/r/66568/diff/5/ > > > Testing > ------- > > Behavior tested in https://reviews.apache.org/r/66569/. > > > Thanks, > > Zhitao Li > >
