-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63094/#review188465
-----------------------------------------------------------




include/mesos/resource_provider/resource_provider.proto
Lines 56-66 (patched)
<https://reviews.apache.org/r/63094/#comment265477>

    I would prefer if we would not assume that this number changes for _every_ 
resource change, but just for failures.
    * * *
    
    For one we do not want users to assume that this number has certain values 
so we can recognize failures. Imagine a user applied an operation to the 
resource provider resources (`sequence_id=0`), and the agent speculated that 
the new `sequence_id` would be `1`. Now the operation fails in the RP and its 
`sequence_id` would be incremented to `1`. We cannot distinguish a case where 
the agent assumed the operation succeeded from a case where it failed based on 
the clock value alone in this case. Would we just increase this number in case 
of failures it would be clear that any operation assuming `sequence_id=0` would 
be invalid after the failure since the agent would be unlikely to assume 
failures on a "normal" execution path. 
    
    * * *
    
    Another reason would be that it would be great to be able to use similar 
notions of `sequence_id` for the agent's view of its RPs, and the master's view 
of its agents. We plan to implement speculative application of "legacy" offer 
operations like e.g., `RESERVE`, `CREATE` in the master, and incrementing this 
number even in successful cases would require too much knowledge about how the 
agent applies these operations in the master.
    
    Imagine an agent with resource provider resources `disk(*):1` and 
`disk(*):1` (each `disk` from a different resource provider). The agent will 
expose `disk(*):2` to the master. A framework now reserves, 
`disk(*):2->disk(A):2`. The agent would inform the resource providers about 
that operation. Let's say the first of these operations succeeds, but the 
second on fails (the argument works as well for the other order, or even 
concurrent application). Having the master increment this 
`resource_sequence_id` for speculated operations would require it to know that 
the single `RESERVE` operation did map onto two state changes, i.e., it should 
have incremented its agent `resource_sequence_id` by two instead of just by one.
    
    This seems not only leaky, but also inflexible since it becomes hard to 
decompose an agent's resources without affecting the master's assumptions.
    
    * * *
    
    Having `sequence_id` just increase in case of failures would eliminate a 
large class of cases where users would need to make assumption about how it 
does change. Instead we would more clearly move into a direction where it is 
only changed by pushes from lower layers (where an operation either succeeded 
or failed) to user layers above. This should also simplify how we would project 
multiple `sequence_id`s onto a single `sequence_id` (multiple resource provider 
`sequence_id`s onto agent `sequence_id`); in that case we could just take the 
maximum of all RP IDs and the agent ID (e.g., last failure of an operation on 
any RP is equivalent to last failure of any operation on the agent). Counting 
all operations makes this harder, especially when thinking of master-speculated 
operations.


- Benjamin Bannier


On Oct. 18, 2017, 1:12 a.m., Jie Yu wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/63094/
> -----------------------------------------------------------
> 
> (Updated Oct. 18, 2017, 1:12 a.m.)
> 
> 
> Review request for mesos, Benjamin Bannier, Benjamin Mahler, Chun-Hung Hsiao, 
> Gaston Kleiman, Greg Mann, Joseph Wu, Jan Schlicht, and Vinod Kone.
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> This sequence id is used to establish ordering between the operation
> and the resources that the operation is operating on.  Each agent (or
> external resource provider) will keep a sequence id, and increments it
> when the resources on the agent (or the external resource provider)
> has changed. The master will keep track of the last known agent (or
> external resource provider) sequence id, and attach this sequence id
> in each operation it sends out. The agent (or the external resource
> provider) should reject operations that have smaller sequence id than
> its own sequence id, because this means the operation is operating on
> resources that might have already been invalidated.
> 
> 
> Diffs
> -----
> 
>   include/mesos/resource_provider/resource_provider.proto 
> f5a9073075327019fd133bd51265f695ef464845 
>   include/mesos/v1/resource_provider/resource_provider.proto 
> e5cbede5b6e57a8641fca1ebfee5454f292cc24c 
>   src/messages/messages.proto 0a32b3457e9143a7d48670610ca3e56dd516136f 
> 
> 
> Diff: https://reviews.apache.org/r/63094/diff/1/
> 
> 
> Testing
> -------
> 
> make check
> 
> 
> Thanks,
> 
> Jie Yu
> 
>

Reply via email to