Re: Setting Rate of Resource Offers

Christopher Ketchum Wed, 17 Jun 2015 15:21:18 -0700

Hi,

I think those logs were misleading, sorry. I am running the tests in PyCharm, 
which aggregates all the logs onto one console, so I had selected only the Mesos 
messages that explicitly said they were from the master. Here are those logs 
without my editing. Again, the last two messages are almost a second apart. The 
resources are recovered very quickly, but not offered again for another second. 
Is this delay meant to increase the offer size? Is it adjustable?


I0617 11:34:08.581996 184418304 master.cpp:4623] Updating the latest state of 
task 1 of framework 20150617-113405-16777343-5050-6614-0000 to TASK_FINISHED

I0617 11:34:08.582051 188174336 hierarchical.hpp:648] Recovered cpus(*):3.9 
(total allocatable: mem(*):7136; disk(*):109424; ports(*):[31000-32000]; 
cpus(*):3.9) on slave 20150617-113405-16777343-5050-6614-S0 from framework 
20150617-113405-16777343-5050-6614-0000

I0617 11:34:08.582778 185491456 master.cpp:4690] Removing task 1 with resources 
cpus(*):3.9 of framework 20150617-113405-16777343-5050-6614-0000 on slave 
20150617-113405-16777343-5050-6614-S0 at slave(1)@127.0.0.1:5051 (localhost)

I0617 11:34:08.582839 185491456 master.cpp:2787] Forwarding status update 
acknowledgement 3fec98f2-9f50-4968-9708-c7663f36b62d for task 1 of framework 
20150617-113405-16777343-5050-6614-0000 (Test Framework (Python)) at 
[email protected]:54818 to slave 
20150617-113405-16777343-5050-6614-S0 at slave(1)@127.0.0.1:5051 (localhost)

I0617 11:34:08.583075 256425984 status_update_manager.cpp:389] Received status 
update acknowledgement (UUID: 3fec98f2-9f50-4968-9708-c7663f36b62d) for task 1 
of framework 20150617-113405-16777343-5050-6614-0000

I0617 11:34:09.446701 184954880 master.cpp:3760] Sending 1 offers to framework 
20150617-113405-16777343-5050-6614-0000 (Test Framework (Python)) at 
[email protected]:54818
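
For reference, the gap between the "Recovered" line and the "Sending 1 offers" 
line can be worked out from the timestamps. This is only a minimal Python 
sketch: it assumes glog-style lines whose second field is the time of day, and 
that both lines were logged on the same day. The helper name glog_time is made 
up for illustration and is not part of Mesos.

```python
from datetime import datetime


def glog_time(line):
    """Return the time-of-day field of a glog-style log line as a datetime.

    Assumption: field 0 is severity + MMDD and field 1 is HH:MM:SS.microseconds.
    The date part is ignored, so only same-day lines can be compared.
    """
    return datetime.strptime(line.split()[1], "%H:%M:%S.%f")


# Prefixes of the two master log lines quoted above.
recovered = "I0617 11:34:08.582051 188174336 hierarchical.hpp:648] Recovered cpus(*):3.9"
offered = "I0617 11:34:09.446701 184954880 master.cpp:3760] Sending 1 offers to framework"

gap = (glog_time(offered) - glog_time(recovered)).total_seconds()
print(f"recovered -> next offer: {gap:.6f} s")
```

For these two lines the gap comes out a little under 0.9 seconds, which matches 
the "almost a second" I am seeing.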

Thanks,
Christopher
> On Jun 17, 2015, at 1:54 PM, Vinod Kone <[email protected]> wrote:
> 
> It looks like the hierarchical allocator doesn't trigger an allocation when 
> resources are recovered from a finished task (likely a bug; can you file a 
> ticket?). Instead, it depends on the periodic allocation interval (default 1s, 
> configurable via flags.allocation_interval) for the next allocation. In the 
> meantime, you can reduce the allocation interval via the flag to 
> speed it up.
> 
> On Wed, Jun 17, 2015 at 11:59 AM, Christopher Ketchum <[email protected]> wrote:
> You can see there is about a second delay between the last two messages. It's 
> not a huge amount of time, but it is noticeable, especially when testing with 
> many short tasks. 
> 
> I0617 11:34:08.582778 185491456 master.cpp:4690] Removing task 1 with 
> resources cpus(*):3.9 of framework 20150617-113405-16777343-5050-6614-0000 on 
> slave 20150617-113405-16777343-5050-6614-S0 at slave(1)@127.0.0.1:5051 
> (localhost)
> 
> I0617 11:34:08.582839 185491456 master.cpp:2787] Forwarding status update 
> acknowledgement 3fec98f2-9f50-4968-9708-c7663f36b62d for task 1 of framework 
> 20150617-113405-16777343-5050-6614-0000 (Test Framework (Python)) at 
> [email protected]:54818 to 
> slave 20150617-113405-16777343-5050-6614-S0 at slave(1)@127.0.0.1:5051 
> (localhost)
> 
> I0617 11:34:09.446701 184954880 master.cpp:3760] Sending 1 offers to 
> framework 20150617-113405-16777343-5050-6614-0000 (Test Framework (Python)) 
> at [email protected]:54818
> 
>> On Jun 17, 2015, at 10:18 AM, Vinod Kone <[email protected]> wrote:
>> 
>> Can you paste the master logs for when the task is finished and the next 
>> offer is sent?
>> 
>> On Wed, Jun 17, 2015 at 9:11 AM, Christopher Ketchum <[email protected]> wrote:
>> Hi everyone,
>> 
>> Thanks for the responses. To clarify, I’m only running one framework with a 
>> single slave for testing purposes, and it is the re-offers that I am trying 
>> to adjust. When I watch the program run I see tasks updating to 
>> TASK_FINISHED, but there is a noticeable delay where my framework has the 
>> next task queued but the master has not yet reoffered those resources, so 
>> the program pauses until it gets the next offer. 
>> 
>> I am mainly concerned that I haven’t configured something properly, and when 
>> I scale up the delays will compound. Of course, it is also possible that 
>> with multiple slaves able to offer resources these delays will disappear.
>> 
>> Thanks again,
>> Christopher
>> 
>>> On Jun 14, 2015, at 8:11 AM, Alex Gaudio <[email protected]> wrote:
>>> 
>>> Hi Christopher,
>>> 
>>> To let a particular mesos framework receive more offers than other 
>>> frameworks, we assign our frameworks weights.  The higher the weight, the 
>>> more frequently the framework will receive an offer.  See the '--weights' 
>>> and '--roles' options in the config: 
>>> http://mesos.apache.org/documentation/latest/configuration/.  Basically, 
>>> a higher weight > 1 means more offers get sent to your framework.  The 
>>> mesos source code for how weighting works is shown here:   
>>> https://github.com/apache/mesos/blob/9e7b890a917fcf0ac4cd1738f060ba97af847b65/src/master/allocator/sorter/drf/sorter.cpp#L306
>>> and 
>>> https://github.com/apache/mesos/blob/9e7b890a917fcf0ac4cd1738f060ba97af847b65/src/master/allocator/sorter/drf/sorter.cpp#L41.
>>> 
>>> What you may want to do is create a "role" called "development_mode" and 
>>> then assign the role a high weight (like 40).  You would then assign your 
>>> framework to the "development_mode" role.  What we've actually done is 
>>> create roles named with the numbers 1,2,3,4,5,10,20,30,40, where each role 
>>> maps to a weight of that number ... and we then allow frameworks to choose 
>>> which role they start up as.  At MesosCon, I will be speaking about why we 
>>> do this and how we are solving some general issues with the DRF algorithm, 
>>> if you're interested!
>>> 
>>> Alex 
>>> 
>>> 
>>> 
>>> On Sun, Jun 14, 2015 at 5:58 AM Alex Rukletsov <[email protected]> wrote:
>>> Christopher,
>>> 
>>> Try adjusting the master's allocation_interval flag. It specifies how often 
>>> the allocator performs batch allocations to frameworks. As Ondrej pointed out, 
>>> if your framework explicitly declines offers, it won't be re-offered the 
>>> same resources for some period of time.
>>> 
>>> On Sat, Jun 13, 2015 at 8:30 PM, Ondrej Smola <[email protected]> wrote:
>>> Hi Christopher,
>>> 
>>> I don't know of any way to speed up the first resource offer -
>>> in my experience new offers arrive almost immediately after framework
>>> registration. It depends on the infrastructure you are testing your
>>> framework on - are there any
>>> other frameworks running? As discussed in another thread, offers
>>> should be sent to multiple frameworks at once. There may be a small
>>> delay due to initial registration and network latency. If you are talking
>>> about "reoffers" - reoffering
>>> declined offers - there should be a parameter to set the reoffer interval. For
>>> example, in Go you can decline an offer this way (it is also important to
>>> decline every unused offer):
>>> 
>>> driver.DeclineOffer(offer.Id, &mesos.Filters{RefuseSeconds: 
>>> proto.Float64(5)})
>>> 
>>> Look at the Mesos UI - it should give you information about which offers are
>>> offered to which frameworks; the Mesos master logs also give you this
>>> information.
>>> 
>>> 
>>> 2015-06-13 18:23 GMT+02:00 Christopher Ketchum <[email protected]>:
>>> > Hi,
>>> >
>>> > I was wondering if there is any way to adjust the rate of resource 
>>> > offers to the framework. I am writing a Mesos framework, and when I am 
>>> > testing it I notice a slight pause where the framework seems to be 
>>> > waiting for another resource offer. I would like to know if there is any 
>>> > way to speed these offers up, just to make testing a little faster.
>>> >
>>> > Thanks,
>>> > Chris
>>> 
>> 
>> 
> 
>
