Hi Qian,
Yes; frameworks will have to:
1) Register with the REVOCABLE_RESOURCES framework capability set. The
important bit here is that frameworks running on oversubscribed resources
will have to cope with frequent preemptions of their tasks. We wanted to
gain experience with this (experimental) feature and therefore let
frameworks opt out of running on oversubscribed resources by not setting
the capability.
2) Handle offers that contain revocable resources. Such offers look
different, and most frameworks will have to rework their offer accept logic
a bit: an offer can include both regular (non-revocable) and revocable
resources. For example: cpus: 0.5; cpus{REV}: 1.89; mem: 512.
Here, a scheduler that only looks at the first cpus resource (0.5) would
usually conclude there is too little CPU and decline the offer.
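
To make point 2 concrete, here is a minimal sketch of offer accept logic
that keeps regular and revocable resources apart instead of reading only the
first cpus entry. It is not Mesos API code: split_offer and pick_cpu_pool are
hypothetical helper names, and the input is just the display format used in
the example above. A real scheduler would inspect the Resource messages in
the offer instead of parsing a string.

```python
import re

# Hypothetical helpers for illustration; the text format mirrors the example
# "cpus: 0.5; cpus{REV}: 1.89; mem: 512" and is not a Mesos wire format.
_RESOURCE = re.compile(r"^\s*(\w+)\s*(\{REV\})?\s*:\s*([0-9.]+)\s*$")


def split_offer(offer_text):
    """Return (regular, revocable) dicts of scalar totals keyed by name."""
    regular, revocable = {}, {}
    for part in offer_text.split(";"):
        if not part.strip():
            continue  # tolerate a trailing ';'
        match = _RESOURCE.match(part)
        if match is None:
            raise ValueError("unrecognised resource: %r" % part)
        name, rev, amount = match.group(1), match.group(2), float(match.group(3))
        bucket = revocable if rev else regular
        bucket[name] = bucket.get(name, 0.0) + amount
    return regular, revocable


def pick_cpu_pool(offer_text, need_cpus, allow_revocable):
    """Return "regular", "revocable", or None (decline) for a cpu request."""
    regular, revocable = split_offer(offer_text)
    if regular.get("cpus", 0.0) >= need_cpus:
        return "regular"
    # Only consider revocable cpus if the framework opted in to preemption.
    if allow_revocable and revocable.get("cpus", 0.0) >= need_cpus:
        return "revocable"
    return None


if __name__ == "__main__":
    offer = "cpus: 0.5; cpus{REV}: 1.89; mem: 512"
    print(pick_cpu_pool(offer, 1.0, allow_revocable=False))
    print(pick_cpu_pool(offer, 1.0, allow_revocable=True))
```

With the example offer, a task needing 1 cpu is declined when only regular
cpus are considered, and fits once the revocable pool is taken into account.
Whether a task should mix both pools is a policy decision this sketch leaves
open.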
See https://github.com/apache/mesos/blob/master/docs/oversubscription.md
for more information and feel free to reach out if you need help/assistance
running with oversubscription.
Cheers,
Niklas
On 15 August 2015 at 08:26, Qian AZ Zhang <[email protected]> wrote:
>
>
> Hi,
>
> When I tried the Mesos oversubscription feature, I found that the revocable
> resources returned by the resource estimator are tracked by the allocator
> separately from regular resources. For example, I started my slave with
> this command:
> ./bin/mesos-slave.sh --master=192.168.122.171:5050
> --resource_estimator="org_apache_mesos_FixedResourceEstimator"
> --modules=/home/stack/mesos/build/slave_modules
>
> The content of /home/stack/mesos/build/slave_modules is:
> {
>   "libraries": [
>     {
>       "file": "/home/stack/mesos/build/src/.libs/libfixed_resource_estimator.so",
>       "modules": [
>         {
>           "name": "org_apache_mesos_FixedResourceEstimator",
>           "parameters": [
>             {
>               "key": "resources",
>               "value": "cpus:2;mem:500"
>             }
>           ]
>         }
>       ]
>     }
>   ]
> }
>
> Then I saw the following message in the master's output:
> I0815 23:16:34.065404 23218 hierarchical.hpp:600] Slave
> 20150815-231543-2876942528-5050-23204-S0 (mesos) updated with
> oversubscribed resources cpus(*){REV}:2; mem(*){REV}:500 (total: cpus(*):4;
> mem(*):2929; disk(*):36813; ports(*):[31000-32000]; cpus(*){REV}:2; mem
> (*){REV}:500, allocated: )
>
> So as you can see, the slave's total resources are: cpus(*):4; mem(*):2929;
> disk(*):36813; ports(*):[31000-32000]; cpus(*){REV}:2; mem(*){REV}:500. The
> revocable resources (cpus(*){REV}:2; mem(*){REV}:500) are kept separately
> from the regular resources (cpus(*):4; mem(*):2929; disk(*):36813;
> ports(*):[31000-32000]), which are auto-detected when the slave starts up.
> That means frameworks can use more than the auto-detected resources, which
> I think should be the slave's total resources. This seems a bit strange to
> me; I think the allocator should mark part of the auto-detected resources
> as revocable based on what the resource estimator returns.
>
>
> Regards,
> Qian Zhang