Re: Updated agent resources with every offer.

Arkal Arjun Rao Tue, 16 Feb 2016 19:14:36 -0800

I've thought about this carefully and come up with a workaround. It
doesn't work yet, but it might if people can chime in.


I have a daemon thread that starts when the executor registers with the
framework. The daemon thread probes the free disk every n seconds and sends
an update with a task ID (that is guaranteed to be unique) to the framework
if something has changed. The framework parses statuses received with the
unique task ID and then uses dynamic reservation to achieve the result I
want.

executor.py contains this
```
def demonThread():
    # Assumption: previousDiskUpdate is module-level state. Without this
    # declaration the assignment below would make it a local and raise
    # UnboundLocalError on the comparison.
    global previousDiskUpdate
    < probe disk state >
    if remainingMesosUsableDisk != previousDiskUpdate:
        log.debug("Disk usage changed. Sending disk update.")
        previousDiskUpdate = remainingMesosUsableDisk
        # Send the status of the disk usage via a status update
        status = mesos_pb2.TaskStatus()
        status.task_id.value = '-1'
        status.message = str(remainingMesosUsableDisk)
        status.state = mesos_pb2.TASK_RUNNING
        driver.sendStatusUpdate(status)
```
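
The snippet above shows a single probe; the "every n seconds" loop around it could look roughly like the following. This is a minimal Python 3 sketch, not the code from executor.py: `watch_free_disk`, `start_disk_watcher`, the `probe` hook and the `on_change` callback are hypothetical names, and `on_change` is where the `driver.sendStatusUpdate(status)` call from the snippet above would go.

```python
import shutil
import threading


def watch_free_disk(path, interval, on_change, stop_event, probe=None):
    """Call on_change(free_bytes) whenever the free space on `path` changes.

    Polls every `interval` seconds until `stop_event` is set.
    """
    if probe is None:
        # Default probe: free bytes on the filesystem that holds `path`
        probe = lambda: shutil.disk_usage(path).free
    previous = None
    while not stop_event.is_set():
        current = probe()
        if current != previous:
            previous = current
            on_change(current)
        # Event.wait() doubles as an interruptible sleep
        stop_event.wait(interval)


def start_disk_watcher(path, interval, on_change, probe=None):
    """Run the watcher as a daemon thread; returns (thread, stop_event)."""
    stop_event = threading.Event()
    thread = threading.Thread(
        target=watch_free_disk,
        args=(path, interval, on_change, stop_event, probe),
        daemon=True,
    )
    thread.start()
    return thread, stop_event
```

Setting the returned `stop_event` ends the loop at the next wake-up, and because the thread is a daemon it will not keep the executor process alive on exit.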

And the framework.py contains this
```
def statusUpdate(self, driver, update):
    taskID = int(update.task_id.value)
    if taskID == -1:
        nodeFreeDisk = int(update.message)
        self.offerUpdateReqd[update.slave_id.value] = nodeFreeDisk
        if not self.implicitAcknowledgements:
            driver.acknowledgeStatusUpdate(update)
        return None

def resourceOffers(self, driver, offers):
    for offer in offers:
        # .get() avoids a KeyError for agents that have not reported yet
        freeDisk = self.offerUpdateReqd.get(offer.slave_id.value)
        if freeDisk is not None:
            operation = mesos_pb2.Offer.Operation()
            operation.type = mesos_pb2.Offer.Operation.RESERVE
            disk = operation.reserve.resources.add()
            disk.name = "disk"
            disk.type = mesos_pb2.Value.SCALAR
            # Convert to MB, the unit Mesos uses for disk
            disk.scalar.value = freeDisk/1024/1024
            disk.role = "*"
            # Accept the offer with the reservation operation and continue
            driver.acceptOffers([offer.id], [operation])
```

So for example, the agent has 5 GB free when the executor registers with the
framework. The first offer says there is 5 GB free.
The first job writes 1 GB to the disk and the daemon thread sends the
framework a message saying there is only 4 GB left on the system. The next
offer that comes from the slave is accepted with a reserve operation
specifying disk.scalar.value = 4 GB. According to the readme, if the
reservation was successful, the next offer should have disk = 4 GB.

However, the next offer shows the same 5 GB as the original. This either
means that the reservation did not go through, or I'm missing something
here.
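
Independent of whether the reservation goes through, the accounting can also be kept entirely on the framework side, along the lines Vinod suggested earlier in the thread for jobs that write outside the sandbox. The following is a minimal Python sketch under that assumption: `DiskLedger` and its methods are hypothetical names, not part of framework.py or the Mesos API, and the reported value is assumed to be in bytes (Mesos expresses disk in MB).

```python
class DiskLedger:
    """Tracks the last free-disk figure reported by each agent, in MB."""

    def __init__(self):
        self._reported_mb = {}

    def record_report(self, agent_id, free_bytes):
        # Convert the probed byte count to MB, the unit used in offers
        self._reported_mb[agent_id] = free_bytes / (1024 * 1024)

    def usable_disk_mb(self, agent_id, offered_mb):
        """Disk the framework should treat as available for this offer."""
        reported = self._reported_mb.get(agent_id)
        if reported is None:
            # No report from this agent yet, so trust the offer as-is
            return offered_mb
        # Never assume more disk than either the offer or the probe shows
        return min(offered_mb, reported)
```

In `statusUpdate` the framework would call `record_report(update.slave_id.value, int(update.message))`, and in `resourceOffers` it would size tasks against `usable_disk_mb(...)` instead of the raw disk value in the offer.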

Does anyone have any thoughts about this?

Arjun

On Fri, Feb 12, 2016 at 6:02 PM, Vinod Kone <vinodk...@gmail.com> wrote:

> Say your task asks for 1cpu and 1gb disk. After the task terminates, mesos
> immediately offers back 1cpu and 1gb disk. It makes sense for cpu but not
> so much for disk.
>
> Mesos slave overcommits the disk in that sense. Mainly to allow task
> owners access to sandbox data after task termination. The asynchronous gc
> thread garbage collects the sandbox if there is disk space pressure on the
> host.
>
>
> @vinodkone
>
> On Feb 12, 2016, at 5:26 PM, Arkal Arjun Rao <aa...@ucsc.edu> wrote:
>
> That can be modified with the right values for gc_delay.
>
> I'm running a very basic test where I accept a request, write a file
> to the sandbox, sleep for 100s, then exit. After exit, I probe the next
> offer.
>
> Having not specified any value for disk_watch_interval and assuming it is
> the default 60s, the new offer should have disk = (Original value - size of
> file I wrote to sandbox), right? Am I missing something here?
>
> Arjun
>
> On Fri, Feb 12, 2016 at 5:05 PM, Chong Chen <chong.ch...@huawei.com>
> wrote:
>
>> Hi,
>>
>> I think the garbage collector of Mesos agent will remove the directory of
>> the finished task.
>>
>> Thanks!
>>
>>
>>
>> *From:* Arkal Arjun Rao [mailto:aa...@ucsc.edu]
>> *Sent:* Friday, February 12, 2016 4:22 PM
>> *To:* user@mesos.apache.org
>> *Subject:* Re: Updated agent resources with every offer.
>>
>>
>>
>> Hi Vinod,
>>
>>
>>
>> Thanks for the reply. I think I understand what you mean. Could you
>> clarify these follow-up questions?
>>
>>
>>
>> 1. So if I did write to the sandbox, mesos would know and send the
>> correct offer?
>>
>> 2. And if so, and this might be hacky, if I bind-mounted my docker folder
>> (where all cached images are stored) into a sandbox directory, do you think
>> Mesos will register the correct state of the disk in the offer? (Suppose I
>> were to spawn a possibly persistent job that requests 0 cores, 0 memory and
>> 0gb and use its sandbox)
>>
>>
>>
>> Thanks again,
>>
>> Arjun
>>
>>
>>
>> On Fri, Feb 12, 2016 at 4:08 PM, Vinod Kone <vinodk...@apache.org> wrote:
>>
>> If your job is writing stuff outside the sandbox it is up to your
>> framework to do that resource accounting. It is really tricky for Mesos to
>> do that. For example, the second job might be launched even before the
>> first one finishes.
>>
>>
>>
>> On Fri, Feb 12, 2016 at 3:46 PM, Arkal Arjun Rao <aa...@ucsc.edu> wrote:
>>
>> Hi All,
>>
>>
>>
>> I'm new to Mesos and I'm working on a framework that strongly considers
>> the disk value in an offer before making a decision. My jobs don't run in
>> the agent's sandbox and may use docker to pull images from my dockerhub and
>> run containers on input data downloaded from S3.
>>
>>
>>
>> My jobs clean up after themselves but do not delete the cached docker
>> images after they complete so a later job can use them directly without the
>> delay of downloading the image again. I cannot predict how much a job will
>> leave behind.
>>
>>
>>
>> Leaving behind files after the job means that the disk space available
>> for the next job is less than the disk value the current job had when it
>> started. However, the master does not appear to update the disk
>> parameter before making the new offer. Is there any way to get the
>> executor driver to update the value passed in the disk field of resource
>> offers?
>>
>>
>>
>> Here's a Stack Overflow question with more details:
>> http://stackoverflow.com/questions/35354841/setup-mesos-to-provide-up-to-date-disk-in-offers
>>
>>
>>
>> Thanks in advance,
>>
>> Arjun Arkal Rao
>>
>>
>>
>> PhD Candidate,
>>
>> Haussler Lab,
>>
>> UC Santa Cruz,
>>
>> USA
>>
>


-- 
Arjun Arkal Rao

PhD Student,
Haussler Lab,
UC Santa Cruz,
USA

aa...@ucsc.edu
