Alexander Rukletsov created MESOS-8082:
------------------------------------------

             Summary: updateAvailable races with a periodic allocation and 
leads to flaky tests.
                 Key: MESOS-8082
                 URL: https://issues.apache.org/jira/browse/MESOS-8082
             Project: Mesos
          Issue Type: Bug
          Components: test
            Reporter: Alexander Rukletsov
            Assignee: Alexander Rukletsov


When an operator requests a resource modification (reserving resources, 
creating a persistent volume, and so on), the corresponding endpoint handler 
can request an allocator state modification twice: once to recover resources 
from rescinded offers and once to update the available resources with the 
applied operation (updateAvailable). These two requests should happen 
atomically, i.e., no other allocator change should happen in between. This is, 
however, not the case: a periodic allocation can kick in between them. 
Possible solutions to this race are moving offer management into the 
allocator, coupling the two operations in the allocator, or pausing the 
allocator.
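
The interleaving described above can be illustrated with a minimal sketch. 
This is a toy model under stated assumptions, not the Mesos allocator 
interface: ToyAllocator, recoverAndUpdate, racyUpdate and coupledUpdate are 
hypothetical names, resources are reduced to a single integer count, and the 
interleaving is replayed sequentially rather than with real concurrency. Only 
the step names recoverResources, updateAvailable and allocate mirror the 
operations mentioned in this report.

```cpp
#include <cassert>

// Hypothetical toy model; none of this is the real Mesos allocator API.
struct ToyAllocator {
  int available = 0;  // Units that are free to be offered or modified.
  int offered = 0;    // Units currently out in offers to frameworks.

  // Step 1: take back resources from a rescinded offer.
  void recoverResources(int units) {
    assert(units <= offered);
    offered -= units;
    available += units;
  }

  // Step 2: apply the operator's operation; it needs `units` to be free.
  bool updateAvailable(int units) const {
    return available >= units;
  }

  // Periodic allocation: offers out everything that is currently free.
  void allocate() {
    offered += available;
    available = 0;
  }

  // "Coupled" variant (assumption): both steps run as one call, so no
  // allocation can be scheduled between them.
  bool recoverAndUpdate(int units) {
    recoverResources(units);
    return updateAvailable(units);
  }
};

// Replays the problematic order: allocation runs between the two steps.
bool racyUpdate() {
  ToyAllocator allocator;
  allocator.offered = 4;
  allocator.recoverResources(4);
  allocator.allocate();  // Periodic allocation kicks in here.
  return allocator.updateAvailable(4);  // Resources were offered out again.
}

// Replays the coupled order: allocation can only run before or after.
bool coupledUpdate() {
  ToyAllocator allocator;
  allocator.offered = 4;
  bool applied = allocator.recoverAndUpdate(4);
  allocator.allocate();
  return applied;
}
```

In this sketch racyUpdate() fails because the recovered resources are offered 
out again before the operation is applied, while coupledUpdate() succeeds. 
Whether coupling would take this exact shape in the real allocator is an open 
design question in this ticket.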

While this race does not necessarily lead to bugs in production (as long as 
operators and tooling can handle failures and retry), it makes some tests that 
use resource modification flaky, because tests do not plan for failures and 
retries.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
