> On March 27, 2018, 6:58 p.m., Greg Mann wrote: > > src/master/master.cpp > > Lines 11881-11887 (original), 11885-11890 (patched) > > <https://reviews.apache.org/r/66051/diff/7/?file=1988982#file1988982line11889> > > > > Currently, it looks like `Slave::usedResources` is the same as the > > "allocated resources" from that agent. > > > > When a GROW or SHRINK operation is initiated by an operator, its > > consumed resource is in a unique state in the master while that operation > > is pending: the volume is not allocated, but it is "used" in some sense. > > > > One place where we use `Slave::usedResources` is when validating > > DESTROY operations: > > https://github.com/apache/mesos/blob/21e4b45b388d0b272236b1c58313569f8a1d1fc8/src/master/master.cpp#L4841 > > > > This means if the master receives a DESTROY call for a persistent > > volume while there is a pending operator-initiated GROW_VOLUME call for > > that same volume (i.e. the operation was forwarded to the agent for > > processing but the master hasn't heard back yet), we would accept the > > DESTROY operation and forward it to the agent. Is this what we want? > > > > This brings something else to mind: would it be possible for the > > allocator to send out an offer containing the consumed resource of a > > GROW_VOLUME or SHRINK_VOLUME operation, while the operation is pending on > > the agent? This seems bad, but I believe it's possible. I think this may be > > the first time we have a set of resources which we don't want the allocator > > to offer for a period of time, but which are not currently allocated to > > some framework/role. I wonder if we need some new tools in the allocator to > > handle this? > > Chun-Hung Hsiao wrote: > Yeah this sounds bad. Would it help if we remove the consumed resources > from the allocator and put it back after receiving the terminal status update? > > Zhitao Li wrote: > I agree this is problematic. > > It seems like we need to make sure several things: > 1. `consumed` resources of a non-speculative operation can not offered to > any framework while pending; > 2. `consumed` resources should still be visible for various validation; > > Therefore, I'm thinking about introducing a new field in `Master::Slave` > struct which represents `pendingConsumedResources` (or a function in `Slave` > which can extract this info from pending operations). We can then use it to > validation of `DESTROY` call. > > Another issue I found was in `Master::apply()`, which we directly call > `allocator->updateAvailable(slave->id, {operation})`, but semantically we > should be calling `allocator->updateAvailable(slave->id, consumed, empty)` so > that allocator would not see the `converted` resources until operation > confirmations comes back.
I'm separating the actual logic change of supporting non-speculative operator API into future task and patch. - Zhitao ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/66051/#review200086 ----------------------------------------------------------- On April 9, 2018, 9:32 p.m., Zhitao Li wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/66051/ > ----------------------------------------------------------- > > (Updated April 9, 2018, 9:32 p.m.) > > > Review request for mesos, Chun-Hung Hsiao, Gaston Kleiman, and Greg Mann. > > > Bugs: MESOS-4965 > https://issues.apache.org/jira/browse/MESOS-4965 > > > Repository: mesos > > > Description > ------- > > These operator APIs is implemented as speculative for now, but > we plan to convert them to non-speculative in the future. > > > Diffs > ----- > > src/master/http.cpp 34c9023906eca94965acc994f20e888c1f47b962 > src/master/master.hpp 0d9620dd0c232dc1df83477e838eeb7313bf8828 > src/master/validation.cpp ac2e1bb8771841ec59b3bdcdeffb6c6230680d4d > > > Diff: https://reviews.apache.org/r/66051/diff/10/ > > > Testing > ------- > > > Thanks, > > Zhitao Li > >
