[ 
https://issues.apache.org/jira/browse/MESOS-3157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Mahler updated MESOS-3157:
-----------------------------------
    Description: 
Our deployment environments have a lot of churn, with many short-lived 
frameworks that often revive offers. Running the allocator takes a long time 
(from seconds up to minutes).

In this situation, event-triggered allocation causes the event queue in the 
allocator process to get very long, and the allocator effectively becomes 
unresponsive (e.g. a revive offers message takes too long to come to the head of 
the queue).

We have been running a patch to remove all the event-triggered allocations and 
only allocate periodically on the allocation interval. This works great and 
really improves responsiveness.

  was:
Our deployment environments have a lot of churn, with many short-lived 
frameworks that often revive offers. Running the allocator takes a long time 
(from seconds up to minutes).

In this situation, event-triggered allocation causes the event queue in the 
allocator process to get very long, and the allocator effectively becomes 
unresponsive (e.g. a revive offers message takes too long to come to the head of 
the queue).

We have been running a patch to remove all the event-triggered allocations and 
only allocate from the batch task {{HierarchicalAllocatorProcess::batch}}. This 
works great and really improves responsiveness.


> Only perform periodic resource allocations.
> -------------------------------------------
>
>                 Key: MESOS-3157
>                 URL: https://issues.apache.org/jira/browse/MESOS-3157
>             Project: Mesos
>          Issue Type: Bug
>          Components: allocation
>            Reporter: James Peach
>            Assignee: Jacob Janco
>
> Our deployment environments have a lot of churn, with many short-lived 
> frameworks that often revive offers. Running the allocator takes a long time 
> (from seconds up to minutes).
> In this situation, event-triggered allocation causes the event queue in the 
> allocator process to get very long, and the allocator effectively becomes 
> unresponsive (e.g. a revive offers message takes too long to come to the head 
> of the queue).
> We have been running a patch to remove all the event-triggered allocations 
> and only allocate periodically on the allocation interval. This works great 
> and really improves responsiveness.
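
The queueing effect described above can be illustrated with a toy model. The
sketch below is not Mesos code: `simulate`, `Mode`, and `Result` are
hypothetical names. It assumes a single-threaded actor that handles queued
events in arrival order, a fixed cost per allocation run, and, as a
simplification, one full allocation run after every event in event-triggered
mode. Both modes also run the allocator on a periodic timer.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical model, not the Mesos allocator: one actor, one FIFO queue.
enum class Mode { EventTriggered, PeriodicOnly };

struct Result {
  int allocations;            // full allocation runs performed
  std::int64_t worstDelayMs;  // longest wait of an event before it is handled
};

Result simulate(Mode mode,
                const std::vector<std::int64_t>& arrivalsMs,  // sorted, ascending
                std::int64_t allocationCostMs,
                std::int64_t intervalMs,
                std::int64_t horizonMs) {
  assert(allocationCostMs >= 0 && intervalMs > 0);

  Result result{0, 0};
  std::int64_t freeAt = 0;             // time at which the actor is next idle
  std::int64_t nextTick = intervalMs;  // next periodic allocation timer
  std::size_t i = 0;

  // Merge events and timer ticks by arrival time (events win ties).
  while (i < arrivalsMs.size() || nextTick <= horizonMs) {
    const bool takeEvent =
        i < arrivalsMs.size() &&
        (nextTick > horizonMs || arrivalsMs[i] <= nextTick);
    const std::int64_t arrival = takeEvent ? arrivalsMs[i] : nextTick;
    const std::int64_t start = std::max(freeAt, arrival);

    bool allocate = true;  // timer ticks always trigger an allocation run
    if (takeEvent) {
      result.worstDelayMs = std::max(result.worstDelayMs, start - arrival);
      allocate = (mode == Mode::EventTriggered);
      ++i;
    } else {
      nextTick += intervalMs;
    }

    // Handling the event itself is treated as free; only allocation costs time.
    freeAt = start;
    if (allocate) {
      freeAt += allocationCostMs;
      ++result.allocations;
    }
  }
  return result;
}
```

Under these assumptions, with 20 events arriving 100 ms apart, a 500 ms
allocation cost, a 1000 ms interval, and a 3000 ms horizon, the event-triggered
mode performs 23 allocation runs with a worst queueing delay of 8100 ms, while
the periodic-only mode performs 3 runs with a worst delay of 400 ms. The numbers
are only as meaningful as the model, but they show how the backlog grows when
each event costs one allocation run.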



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
