[
https://issues.apache.org/jira/browse/MESOS-2930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14736795#comment-14736795
]
Bartek Plotka commented on MESOS-2930:
--------------------------------------
1. Do we need that? What is the worst-case scenario? With a proper QoS
Controller, new BE tasks launched from out-dated revocable offers will simply
be killed the moment they start. And we have to use a proper QoS Controller
anyway so that the Resource Estimator is aware of the slave's over-allocation
state.
2. Some ideas to mitigate the above issue:
* What about rescinding/removing out-dated oversubscription offers? As far as I
know, rescinding only happens when a slave is deactivated or disconnected, and
the allocator has no handler for that. Only the master has one, so we could try
to trigger it in the Master code. That would be really nice, because we would
be able to out-date that offer. Basically, an offer with revocable resources
can be treated as a _best-effort_ offer as well, right? (: Is there any other
solution?
* The second issue is returning negative resources:
** We could create a new "SignedResources" class and use it in the
ResourceEstimator and the whole flow up to the Master. We could create some
converters to normal Resources as well. (I created something like that some
time ago while writing my own allocation module.) I'm not sure whether there is
currently any other use case for such a special Resources class, so...
** We could just add an (optional) flag to the Estimator API that marks the
slack resources as virtually negative.
Or maybe it's already designed? (:
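
To make the "SignedResources" idea a bit more concrete, here is a rough sketch in
Python (not Mesos C++, and not an existing Mesos API). Everything in it is
hypothetical: the class name comes from the suggestion above, while
`available()`, `deficit()`, `state()` and `estimate()` are illustrative names
only. It assumes plain scalar resources and shows how a signed estimate could be
converted back to ordinary non-negative resources while still exposing the
over-allocated amount.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass(frozen=True)
class SignedResources:
    """Hypothetical scalar resources whose quantities may be negative."""
    quantities: Dict[str, float] = field(default_factory=dict)

    def available(self) -> Dict[str, float]:
        # Converter to "normal" resources: keep only strictly positive amounts.
        return {name: q for name, q in self.quantities.items() if q > 0}

    def deficit(self) -> Dict[str, float]:
        # Over-allocated amounts, reported as positive numbers.
        return {name: -q for name, q in self.quantities.items() if q < 0}

    def state(self) -> str:
        # Any negative quantity counts as over-allocation, even if other
        # resources still have slack.
        if self.deficit():
            return "over-allocated"
        if self.available():
            return "under-allocated"
        return "fully-allocated"


def estimate(desired_pool: Dict[str, float],
             allocated: Dict[str, float]) -> SignedResources:
    # Hypothetical estimator: desired revocable pool minus what is already
    # allocated, per resource name. The result may be negative.
    names = set(desired_pool) | set(allocated)
    return SignedResources({
        name: desired_pool.get(name, 0.0) - allocated.get(name, 0.0)
        for name in names
    })
```

For example, `estimate({"cpus": 4.0}, {"cpus": 6.0})` yields a deficit of
2 cpus, which the master or allocator could use to avoid re-offering that
amount once it is recovered.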
> Allow the Resource Estimator to express over-allocation of revocable
> resources.
> -------------------------------------------------------------------------------
>
> Key: MESOS-2930
> URL: https://issues.apache.org/jira/browse/MESOS-2930
> Project: Mesos
> Issue Type: Improvement
> Components: slave
> Reporter: Benjamin Mahler
> Assignee: Klaus Ma
>
> Currently the resource estimator returns the amount of oversubscription
> resources that are available. Since resources cannot be negative, this allows
> the resource estimator to express the following:
> (1) Return empty resources: We are fully allocated for oversubscription
> resources.
> (2) Return non-empty resources: We are under-allocated for oversubscription
> resources. In other words, some are available.
> However, there is an additional situation that we cannot express:
> (3) Analogous to returning non-empty "negative" resources: We are
> over-allocated for oversubscription resources. Do not re-offer any of the
> over-allocated oversubscription resources that are recovered.
> Without (3), the slave can only shrink the total pool of oversubscription
> resources by returning (1) as resources are recovered, until the pool is
> shrunk to the desired size. However, this approach is only best-effort: it's
> possible for a framework to launch more tasks in the window of time (15
> seconds by default) between the slave's polls of the estimator.