[ https://issues.apache.org/jira/browse/MESOS-6904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15817026#comment-15817026 ]
Jacob Janco commented on MESOS-6904:
------------------------------------

Reviews currently in progress:
https://reviews.apache.org/r/51027/
https://reviews.apache.org/r/51028/
https://reviews.apache.org/r/52534/
WIP from [~gyliu]
https://reviews.apache.org/r/51621/

> Track resource allocation candidates and batch allocation work
> --------------------------------------------------------------
>
>                 Key: MESOS-6904
>                 URL: https://issues.apache.org/jira/browse/MESOS-6904
>             Project: Mesos
>          Issue Type: Bug
>          Components: allocation
>            Reporter: Jacob Janco
>            Assignee: Jacob Janco
>              Labels: allocator
>
> "Our deployment environments have a lot of churn, with many short-lived
> frameworks that often revive offers. Running the allocator takes a long time
> (from seconds up to minutes).
> In this situation, event-triggered allocation causes the event queue in the
> allocator process to get very long, and the allocator effectively becomes
> unresponsive (e.g. a revive offers message takes too long to come to the head
> of the queue)." - MESOS-3157
> To remedy the above scenario, the proposal is to track allocation candidates
> and dispatch allocation work only if no allocation is already pending in the
> allocator queue. When an enqueued allocation is processed, the tracked set of
> candidates is cleared.
> The current behavior triggers allocation work on cluster events (e.g.
> `addSlave()`, `addFramework()`, etc.) as well as during the periodic batched
> allocation that runs at a defined time interval.
> This ticket tracks the new direction the work has taken since the discussion
> in MESOS-3157, where a previous solution by [~jamespeach] introduced batched
> allocation only (which we currently run), as well as an approach to reduce
> redundant work in the queue.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
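
For illustration, the proposal in the ticket description (track allocation candidates, dispatch allocation work only when none is already pending, and clear the tracked set when the enqueued allocation runs) could be sketched roughly as follows. This is a minimal single-threaded C++ sketch under stated assumptions, not the Mesos allocator or the code in the linked reviews: `SketchAllocator`, `requestAllocation()`, `drain()`, `queued()`, `batches()` and the string candidate IDs are hypothetical names, the `addSlave()` and `addFramework()` signatures are simplified, and a plain `std::deque` stands in for the allocator process's event queue.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the allocator process; NOT the Mesos class.
class SketchAllocator {
public:
  // Cluster events only record a candidate and request an allocation.
  // (Simplified, assumed signatures; the real methods take more arguments.)
  void addSlave(const std::string& id) { requestAllocation(id); }
  void addFramework(const std::string& id) { requestAllocation(id); }

  // Runs queued work in FIFO order, as a single-threaded actor would.
  void drain() {
    while (!queue.empty()) {
      std::function<void()> work = std::move(queue.front());
      queue.pop_front();
      work();
    }
  }

  // Number of work items currently waiting in the queue.
  std::size_t queued() const { return queue.size(); }

  // Candidate sets seen by each allocation run, in order.
  const std::vector<std::set<std::string>>& batches() const {
    return processed;
  }

private:
  // Track the candidate; enqueue allocation work only if none is pending.
  void requestAllocation(const std::string& id) {
    candidates.insert(id);
    if (pending) {
      return;  // An enqueued allocation will pick this candidate up.
    }
    pending = true;
    queue.push_back([this]() { allocate(); });
  }

  // Processes every tracked candidate in one batch, then clears the set.
  void allocate() {
    processed.push_back(candidates);
    candidates.clear();
    pending = false;
  }

  std::deque<std::function<void()>> queue;
  std::set<std::string> candidates;
  std::vector<std::set<std::string>> processed;
  bool pending = false;
};
```

In this sketch, several events arriving before the queue is drained leave exactly one allocation in the queue, and that allocation sees all distinct candidates recorded so far. An event arriving after the allocation has run enqueues a new one.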