Thanks. In testing my new setup using roles, I’m having a problem with Marathon 
not being offered any resources. I’ve posted a question on the Marathon forums:

https://groups.google.com/forum/?hl=en#!topic/marathon-framework/bQ2pO5Dk2MA

but am not getting any replies, so I was wondering whether there is guidance for 
understanding, from a Mesos perspective, why a framework (Marathon here) is not 
getting resource offers. (Btw, my own custom framework does get offers – just 
not Marathon.) Is there a set of command-line options that will expose the 
master’s resource offer process in enough detail to reveal the problem? 
Or is there a troubleshooting guide that explains why a framework might not be 
getting resource offers?

Sorry to be so non-specific – I’m a few days into this and only starting to 
grasp how it works.

Thanks,
Mike

From: Vinod Kone [mailto:[email protected]]
Sent: Friday, August 14, 2015 11:37 AM
To: [email protected]
Subject: Re: resources not offered to framework

If the currently running tasks do not have checkpointing turned on, they cannot 
reconnect to a restarted slave no matter what.

And yes, currently you can't change the slave's resource roles without wiping 
its metadata.

@vinodkone

On Aug 14, 2015, at 6:14 AM, Mike Barborak 
<[email protected]<mailto:[email protected]>> wrote:
I’ve made the changes to my frameworks and Marathon to use roles. My question 
is, is there a way to change a slave’s role without restarting it? I ask 
because the slave I want to reconfigure is running frameworks that scheduled 
tasks that take a very long time to complete their work. These frameworks do 
not have checkpointing turned on. (I’ve changed the code so that they will in 
the future.) My understanding and experience tell me that to change the slave’s 
configuration, I have to restart the slave, and that when I do I will get a 
log message saying I have to rm -f /tmp/mesos/meta/slaves/latest. After I do 
that and restart, I believe the running frameworks will not reconnect with the 
slave (does that sound right?) and will time out and shut down along with the 
tasks they scheduled, which is what I’m trying to avoid.

Thanks,
Mike

From: Mike B
Sent: Tuesday, July 14, 2015 5:33 PM
To: [email protected]<mailto:[email protected]>
Subject: RE: resources not offered to framework

I didn’t understand the difference between roles and attributes. That sounds 
like what I am looking for. Thanks for your help.

-Mike

From: Vinod Kone [mailto:[email protected]]
Sent: Tuesday, July 14, 2015 4:37 PM
To: [email protected]<mailto:[email protected]>
Subject: Re: resources not offered to framework


On Tue, Jul 14, 2015 at 4:36 AM, Mike B 
<[email protected]<mailto:[email protected]>> wrote:
I could see the master processing ACCEPT calls for offers and I could see the 
resources associated with the new slave being recovered because none of the 
frameworks they were offered to wanted them. What I never saw was these new  
resources being offered to the framework that could have used them. Ideally, I 
would have liked these new resources to have been offered to that framework. 
(One note, another instance of the same framework was launched after seeing 
this problem and it was offered these new resources.)


I imagine this could happen with the built-in allocator if the framework 
(say F) that needed the "worker" resources had a high DRF share and the other 
frameworks had a low DRF share. If the frameworks that do not need "worker" 
resources do not filter them for long enough (that is, refuse_seconds is 
small), they might repeatedly become candidates for allocation, starving out F.

Couple of options here.
--> You can have frameworks that are not interested in "worker" resources 
decline offers (with "worker" resources) with a very long interval (say 1 year).

--> Instead of attributes, use roles (role: worker, role: workstation etc) and 
have framework F register with role "worker".
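
To illustrate the first option, here is a rough toy model of the starvation 
effect. It is not the Mesos allocator: the framework names A and B, the share 
values, and the one-allocation-per-second loop are all assumptions made up for 
the sketch. Only F and refuse_seconds come from the discussion above.

```python
def simulate(refuse_seconds, cycles=10):
    """Toy model of offer starvation; NOT the real Mesos allocator.

    Each cycle stands for one allocation pass (assumed 1 second apart).
    The single free "worker" resource is offered to the unfiltered
    framework with the lowest DRF share.
    """
    # Hypothetical DRF shares: F is busy, A and B are mostly idle.
    shares = {"A": 0.1, "B": 0.2, "F": 0.8}
    wants_worker = {"F"}      # only F can use "worker" resources
    filtered_until = {}       # framework -> time its decline filter expires
    offered = []              # who received the offer in each cycle

    for now in range(cycles):
        # A framework is a candidate once its decline filter has expired.
        candidates = [f for f in shares if filtered_until.get(f, 0) <= now]
        if not candidates:
            offered.append(None)
            continue
        target = min(candidates, key=lambda f: shares[f])
        offered.append(target)
        if target in wants_worker:
            break             # F accepts the offer; stop the simulation
        # Uninterested framework declines and filters for refuse_seconds.
        filtered_until[target] = now + refuse_seconds
    return offered


if __name__ == "__main__":
    print("short filter:", simulate(refuse_seconds=0))
    print("long filter: ", simulate(refuse_seconds=365 * 24 * 3600))
```

In this model a filter that expires before the next pass means the lowest-share 
framework is offered the resource every time and F never sees it, while a 
year-long filter takes A and B out of the running after one decline each, so F 
gets the offer on the third pass.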
