On 16 August 2015 at 02:56, Qian AZ Zhang <[email protected]> wrote:
> Thanks Niklas and Christos.
>
> To Niklas, we'd like to try the oversubscription feature with our framework.
> However, I do not quite understand your example below:
> >> For example: cpus: 0.5; cpus{REV}: 1.89; mem: 512.
> >> Here, schedulers would usually only look at the first cpu resource
> >> and decline the offer.
> Can you please let me know why a framework scheduler would usually only look
> at the first cpu resource and decline the offer? I understand when
> launching a task, for one type of resource (e.g., cpus), the task can only
> use either revocable or non-revocable, not both. So for your example, I
> think scheduler can either pick up "cpus: 0.5; mem: 512" or "cpus{REV}:
> 1.89; mem: 512" to launch its task, right?
>
Correct; the offer will contain both, and the scheduler needs to decide
whether to use regular or revocable resources.
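
A minimal sketch of that decision, using the offer from my earlier example
(cpus: 0.5; cpus{REV}: 1.89; mem: 512). The Resource struct and the sum() and
choose() helpers are hypothetical stand-ins for illustration, not the Mesos
protobuf types or scheduler API:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for one resource entry in an offer.
struct Resource {
  std::string name;  // e.g. "cpus" or "mem"
  double amount;
  bool revocable;    // true for entries shown as cpus{REV}
};

// Sum the amount of `name` in the offer, counting only entries whose
// revocable flag matches `revocable`.
double sum(const std::vector<Resource>& offer,
           const std::string& name,
           bool revocable) {
  double total = 0.0;
  for (const Resource& r : offer) {
    if (r.name == name && r.revocable == revocable) {
      total += r.amount;
    }
  }
  return total;
}

enum class Choice { REGULAR, REVOCABLE, DECLINE };

// Pick one kind of cpus for a task that needs `cpus`. Best effort tasks
// try revocable cpus first; other tasks only consider regular cpus.
Choice choose(const std::vector<Resource>& offer, double cpus, bool bestEffort) {
  assert(cpus > 0.0);
  const double regular = sum(offer, "cpus", false);
  const double revocable = sum(offer, "cpus", true);
  if (bestEffort && revocable >= cpus) return Choice::REVOCABLE;
  if (regular >= cpus) return Choice::REGULAR;
  return Choice::DECLINE;
}
```

With that offer, a task needing 1 cpu is declined by a scheduler that only
considers the regular 0.5 cpus, while a best effort task of the same size
could run on the 1.89 revocable cpus.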
>
>
> To Christos, that's exactly what our framework wants: "allow frameworks
> to slack resources (allocated but not used) for revocable best effort
> work.". Our use case is, framework1 is allocated with 500MB memory for
> launching its tasks, but actually after its tasks are launched, they only
> use 300MB, i.e., there are 200MB memory allocated to framework1 but not
> used by its tasks. We'd like those 200MB memory reported by resource
> estimator as revocable resources, and let framework2 which has the
> revocable resources capability set use them. And once framework1 launches
> more tasks (i.e., needs more memory), the tasks of framework2 can be killed
> by QoS controller so that framework1 can take those 200MB memory back to
> launch more tasks.
>
We are only oversubscribing compressible resources (cpu shares, I/O and
network bandwidth) for now (or at least encouraging users to do so). You need
to be careful when oversubscribing disk and memory, but we would love to help
if you want to try it out.
>
>
> However, after reading Mesos's code, I found that the revocable resources
> reported by the resource estimator are actually tracked by the allocator
> separately from regular resources:
> HierarchicalAllocatorProcess<RoleSorter, FrameworkSorter>::updateSlave(
>     const SlaveID& slaveId,
>     const Resources& oversubscribed)
> {
>   ...
>   slaves[slaveId].total += oversubscribed;
>   ...
> }
> As you can see in the above code, oversubscribed resources are added on
> top of the slave's total resources; that means the slave's original total
> resources are enlarged with these *extra* resources. So when framework2
> launches its tasks, what its tasks use is actually these extra resources
> rather than framework1's unused resources. This is not what we expect; we'd
> like framework2 to use framework1's unused resources. That's why I said in
> my first mail that the allocator needs to mark part of the total resources as
> revocable based on what the resource estimator returns rather than add
> revocable resources on top of total resources.
>
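
A toy model of the accounting with your numbers (500MB allocated, 300MB used),
in case it helps the discussion. It is only a sketch: the AgentMem struct and
its helpers are hypothetical, the slave total of 500MB is an assumption, and it
assumes the resource estimator reports exactly the slack (allocated but not
used). It is not the allocator's actual code:

```cpp
#include <cassert>

// Toy model of one slave's memory accounting, in MB. Hypothetical names.
struct AgentMem {
  double total;      // regular capacity detected on the slave (assumed)
  double allocated;  // regular memory allocated to framework1
  double used;       // memory framework1's tasks actually use
};

// Slack: allocated but not used. In this sketch it is what the resource
// estimator would report as revocable.
double slack(const AgentMem& a) {
  assert(a.used <= a.allocated && a.allocated <= a.total);
  return a.allocated - a.used;
}

// The allocator's bookkeeping total after `total += oversubscribed`.
double bookkeepingTotal(const AgentMem& a) { return a.total + slack(a); }

// Worst-case physical usage if revocable tasks consume all of the slack.
double physicalUsage(const AgentMem& a) { return a.used + slack(a); }
```

Under those assumptions the bookkeeping total grows to 700MB, but the memory
physically in use is at most 300MB + 200MB = 500MB, so in this model the
"extra" revocable 200MB is backed by framework1's unused allocation.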
>
> Regards,
> Qian Zhang
>
>
> From: Christos Kozyrakis <[email protected]>
> To: [email protected]
> Date: 08/16/2015 02:26
> Subject: Re: Why does allocator keep track of the revocable resources
> separately from regular resources
> ------------------------------
>
>
>
>
> > On Aug 15, 2015, at 8:26 AM, Qian AZ Zhang <[email protected]> wrote:
> >
> > That
> > means frameworks can use more than the auto detected resources which I
> > think should be slave's total resources. This seems a bit strange to me, I
> > think allocator needs to mark part of the auto detected resources as
> > revocable based on what resource estimator returns.
>
> That’s the whole idea of oversubscription, Qian: to carefully understand
> the difference between allocated and actually used and allow frameworks to
> slack resources (allocated but not used) for revocable best effort work.
>
> Revocable resources are clearly marked in the offer. It is up to your
> framework to use them or ignore them. You can also opt out as Niklas mentioned.
> Note that if a task uses some regular resources and some revocable
> resources at the same time, it is essentially a best effort task. So
> proceed carefully with your scheduler.
>
>
>