Re: Setting Rate of Resource Offers

Christopher Ketchum Fri, 19 Jun 2015 14:04:55 -0700

Hi,

I had tried adjusting the allocation_interval flag earlier, but I guess I was 
misusing it - it does exactly what I was looking for.
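
For anyone finding this thread later, here is a rough model of why the flag helps. It assumes, as Vinod describes further down, that recovered resources are only re-offered on the next periodic allocation tick. The function and parameter names are made up for illustration and are not Mesos code; on the master the interval itself is set with something like --allocation_interval=100ms (the value is only an example).

```python
import math


def wait_for_next_allocation(recovered_at, interval, first_tick=0.0):
    """Seconds from `recovered_at` until the next periodic allocation tick.

    Assumes ticks fire at first_tick, first_tick + interval, and so on,
    and that a tick firing exactly at `recovered_at` has already been missed.
    """
    ticks_done = math.floor((recovered_at - first_tick) / interval)
    next_tick = first_tick + (ticks_done + 1) * interval
    return next_tick - recovered_at


# A task finishing 0.25 s after a tick waits most of a second with a 1 s
# interval, but only a fraction of that with a shorter interval.
for interval in (1.0, 0.125):
    wait = wait_for_next_allocation(recovered_at=0.25, interval=interval)
    print(f"interval={interval}s -> wait={wait}s")
```

The wait is always between zero and one full interval, so with many short tasks the average pause per task should be roughly half the interval.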


Thanks for the help!
Christopher
> On Jun 18, 2015, at 7:57 AM, Alex Rukletsov <[email protected]> wrote:
> 
> Christopher,
> 
> have you tried to adjust the master allocation_interval flag?
> 
> On Thu, Jun 18, 2015 at 12:20 AM, Christopher Ketchum <[email protected]> 
> wrote:
> Hi,
> 
> I think those logs were misleading, sorry. I am running the tests in PyCharm, 
> which aggregates all the logs onto one console, so I had selected only the 
> Mesos messages that explicitly said they were from the master. Here are the 
> logs without my editing. Again, the last two messages are almost a second apart. 
> The resources are recovered very quickly, but not offered up for another 
> second. Is this delay to try to increase the offer size? Is that delay 
> adjustable?
> 
> I0617 11:34:08.581996 184418304 master.cpp:4623] Updating the latest state of 
> task 1 of framework 20150617-113405-16777343-5050-6614-0000 to TASK_FINISHED
> 
> I0617 11:34:08.582051 188174336 hierarchical.hpp:648] Recovered cpus(*):3.9 
> (total allocatable: mem(*):7136; disk(*):109424; ports(*):[31000-32000]; 
> cpus(*):3.9) on slave 20150617-113405-16777343-5050-6614-S0 from framework 
> 20150617-113405-16777343-5050-6614-0000
> 
> I0617 11:34:08.582778 185491456 master.cpp:4690] Removing task 1 with 
> resources cpus(*):3.9 of framework 20150617-113405-16777343-5050-6614-0000 on 
> slave 20150617-113405-16777343-5050-6614-S0 at slave(1)@127.0.0.1:5051 
> (localhost)
> 
> I0617 11:34:08.582839 185491456 master.cpp:2787] Forwarding status update 
> acknowledgement 3fec98f2-9f50-4968-9708-c7663f36b62d for task 1 of framework 
> 20150617-113405-16777343-5050-6614-0000 (Test Framework (Python)) at 
> [email protected]:54818 to slave 
> 20150617-113405-16777343-5050-6614-S0 at slave(1)@127.0.0.1:5051 (localhost)
> 
> I0617 11:34:08.583075 256425984 status_update_manager.cpp:389] Received 
> status update acknowledgement (UUID: 3fec98f2-9f50-4968-9708-c7663f36b62d) 
> for task 1 of framework 20150617-113405-16777343-5050-6614-0000
> 
> I0617 11:34:09.446701 184954880 master.cpp:3760] Sending 1 offers to 
> framework 20150617-113405-16777343-5050-6614-0000 (Test Framework (Python)) 
> at [email protected]:54818
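
The gap can be read straight off the log timestamps. A small sketch, assuming glog-style lines where the second whitespace-separated field is the time of day and both lines fall on the same day; the helper name is hypothetical and the two lines are shortened copies of the ones above:

```python
from datetime import datetime


def glog_time(line):
    """Parse the time-of-day field from a line like 'I0617 11:34:08.582051 ...'."""
    fields = line.split()
    return datetime.strptime(fields[1], "%H:%M:%S.%f")


recovered = "I0617 11:34:08.582051 188174336 hierarchical.hpp:648] Recovered cpus(*):3.9"
offered = "I0617 11:34:09.446701 184954880 master.cpp:3760] Sending 1 offers to"

# The date part is ignored, so this only works for lines from the same day.
gap = (glog_time(offered) - glog_time(recovered)).total_seconds()
print(f"{gap:.6f} s between recovery and the next offer")
```

For these two lines the gap comes out at about 0.86 s.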
> 
> Thanks,
> Christopher
>> On Jun 17, 2015, at 1:54 PM, Vinod Kone <[email protected]> wrote:
>> 
>> Looks like the hierarchical allocator doesn't trigger an allocation when 
>> resources are recovered from a finished task (likely a bug; can you file a 
>> ticket?). Instead it depends on the periodic allocation interval (default 
>> 1s, configurable via flags.allocation_interval) for the next allocation. In 
>> the meantime, you can reduce the default allocation interval via the flag 
>> to speed it up.
>> 
>> On Wed, Jun 17, 2015 at 11:59 AM, Christopher Ketchum <[email protected]> 
>> wrote:
>> You can see there is about a second delay between the last two messages. It's 
>> not a huge amount of time, but it is noticeable, especially when testing with 
>> many short tasks. 
>> 
>> I0617 11:34:08.582778 185491456 master.cpp:4690] Removing task 1 with 
>> resources cpus(*):3.9 of framework 20150617-113405-16777343-5050-6614-0000 
>> on slave 20150617-113405-16777343-5050-6614-S0 at slave(1)@127.0.0.1:5051 
>> (localhost)
>> 
>> I0617 11:34:08.582839 185491456 master.cpp:2787] Forwarding status update 
>> acknowledgement 3fec98f2-9f50-4968-9708-c7663f36b62d for task 1 of framework 
>> 20150617-113405-16777343-5050-6614-0000 (Test Framework (Python)) at 
>> [email protected]:54818 to slave 
>> 20150617-113405-16777343-5050-6614-S0 at slave(1)@127.0.0.1:5051 (localhost)
>> 
>> I0617 11:34:09.446701 184954880 master.cpp:3760] Sending 1 offers to 
>> framework 20150617-113405-16777343-5050-6614-0000 (Test Framework (Python)) 
>> at [email protected]:54818
>> 
>>> On Jun 17, 2015, at 10:18 AM, Vinod Kone <[email protected]> wrote:
>>> 
>>> Can you paste the master logs for when the task is finished and the next 
>>> offer is sent?
>>> 
>>> On Wed, Jun 17, 2015 at 9:11 AM, Christopher Ketchum <[email protected]> 
>>> wrote:
>>> Hi everyone,
>>> 
>>> Thanks for the responses. To clarify, I’m only running one framework with a 
>>> single slave for testing purposes, and it is the re-offers that I am trying 
>>> to adjust. When I watch the program run I see tasks updating to 
>>> TASK_FINISHED, but there is a noticeable delay where my framework has the 
>>> next task queued but the master has not yet reoffered those resources, so 
>>> the program pauses until it gets the next offer. 
>>> 
>>> I am mainly concerned that I haven’t configured something properly, and 
>>> when I scale up the delays will compound. Of course, it is also possible 
>>> that with multiple slaves able to offer resources these delays will 
>>> disappear.
>>> 
>>> Thanks again,
>>> Christopher
>>> 
>>>> On Jun 14, 2015, at 8:11 AM, Alex Gaudio <[email protected]> wrote:
>>>> 
>>>> Hi Christopher,
>>>> 
>>>> To let a particular mesos framework receive more offers than other 
>>>> frameworks, we assign our frameworks weights.  The higher the weight, the 
>>>> more frequently the framework will receive an offer.  See the '--weights' 
>>>> and '--roles' options in the config: 
>>>> http://mesos.apache.org/documentation/latest/configuration/.  Basically, a 
>>>> weight greater than 1 means more offers get sent to your framework.  The 
>>>> Mesos source code for how weighting works is shown here:   
>>>> https://github.com/apache/mesos/blob/9e7b890a917fcf0ac4cd1738f060ba97af847b65/src/master/allocator/sorter/drf/sorter.cpp#L306
>>>>  and 
>>>> https://github.com/apache/mesos/blob/9e7b890a917fcf0ac4cd1738f060ba97af847b65/src/master/allocator/sorter/drf/sorter.cpp#L41.
>>>> 
>>>> What you may want to do is create a "role" called "development_mode" and 
>>>> then assign the role a high weight (like 40).  You would then assign your 
>>>> framework to the "development_mode" role.  What we've actually done is 
>>>> created roles named the numbers 1,2,3,4,5,10,20,30,40, where each role 
>>>> maps to a weight of that number ... and we then allow frameworks to 
>>>> choose which role they start up as.  At MesosCon, I will be speaking about 
>>>> why we do this and how we are solving some general issues with the DRF 
>>>> algorithm, if you're interested!
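
A rough sketch of how a weight can change which role is served next. It assumes that the sorter divides each role's dominant share by its weight and serves the smallest result first; that is one reading of the linked sorter code, not a copy of it. The role names, resource totals, and function names below are all made up for illustration.

```python
def dominant_share(allocated, total):
    """Largest fraction of any single resource that a role currently holds."""
    return max(allocated.get(name, 0.0) / total[name] for name in total)


def next_role(allocations, weights, total):
    """Pick the role with the smallest weighted dominant share (weight defaults to 1)."""
    return min(
        allocations,
        key=lambda role: dominant_share(allocations[role], total) / weights.get(role, 1.0),
    )


total = {"cpus": 4.0, "mem": 8192.0}
allocations = {
    "development_mode": {"cpus": 2.0, "mem": 1024.0},
    "default": {"cpus": 1.0, "mem": 1024.0},
}
weights = {"development_mode": 40.0, "default": 1.0}

# With the weight applied, development_mode is served first even though it
# already holds more CPU; with no weights, the other role would go first.
print(next_role(allocations, weights, total))
print(next_role(allocations, {}, total))
```

Note that with a single framework, as in Christopher's test setup, the ordering has nothing to choose between, so weights alone would not shorten the pause between offers.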
>>>> 
>>>> Alex 
>>>> 
>>>> 
>>>> 
>>>> On Sun, Jun 14, 2015 at 5:58 AM Alex Rukletsov <[email protected]> wrote:
>>>> Christopher,
>>>> 
>>>> try adjusting the master allocation_interval flag. It specifies how often 
>>>> the allocator performs batch allocations to frameworks. As Ondrej pointed 
>>>> out, if your framework explicitly declines offers, it won't be re-offered 
>>>> the same resources for some period of time.
>>>> 
>>>> On Sat, Jun 13, 2015 at 8:30 PM, Ondrej Smola <[email protected]> 
>>>> wrote:
>>>> Hi Christopher,
>>>> 
>>>> I don't know of any way to speed up the first resource offer -
>>>> in my experience new offers arrive almost immediately after framework
>>>> registration. It depends on the infrastructure you are testing your
>>>> framework on - are there any other frameworks running? As discussed
>>>> in another thread, offers should be sent to multiple frameworks at
>>>> once. There may be a small delay due to initial registration and
>>>> network latency. If you are talking about "reoffers" - reoffering
>>>> declined offers - there should be a parameter to set the reoffer
>>>> interval. For example, in Go you can decline an offer this way (it is
>>>> also important to decline every unused offer):
>>>> 
>>>> driver.DeclineOffer(offer.Id, &mesos.Filters{RefuseSeconds: 
>>>> proto.Float64(5)})
>>>> 
>>>> Look at the Mesos UI - it should show you which offers are being
>>>> offered to which frameworks; the Mesos master logs also give you this
>>>> information.
>>>> 
>>>> 
>>>> 2015-06-13 18:23 GMT+02:00 Christopher Ketchum <[email protected]>:
>>>> > Hi,
>>>> >
>>>> > I was wondering if there was any way to adjust the rate of resource 
>>>> > offers to the framework. I am writing a mesos framework, and when I am 
>>>> > testing it I am noticing a slight pause where the framework seems to be 
>>>> > waiting for another resource offer. I would like to know if there is any 
>>>> > way to speed these offers up, just to make testing a little faster.
>>>> >
>>>> > Thanks,
>>>> > Chris
>>>> 
>>> 
>>> 
>> 
>> 
> 
>
