I’ve made the changes to my frameworks and Marathon to use roles. My question 
is, is there a way to change a slave’s role without restarting it? I ask 
because the slave I want to reconfigure is running frameworks that scheduled 
tasks that take a very long time to complete their work. These frameworks do 
not have checkpointing turned on. (I’ve changed the code so that they will in 
the future.) My understanding and experience tell me that to change the slave’s 
configuration I have to restart the slave, and that when I do, I will get a 
log message telling me to run rm -f /tmp/mesos/meta/slaves/latest. After I do 
that and restart, I believe the running frameworks will not reconnect with the 
slave (does that sound right?) and will time out and shut down along with the 
tasks they scheduled. That is what I’m trying to avoid.

Thanks,
Mike

From: Mike B
Sent: Tuesday, July 14, 2015 5:33 PM
To: [email protected]
Subject: RE: resources not offered to framework

I didn’t understand the difference between roles and attributes. That sounds 
like what I am looking for. Thanks for your help.

-Mike

From: Vinod Kone [mailto:[email protected]]
Sent: Tuesday, July 14, 2015 4:37 PM
To: [email protected]<mailto:[email protected]>
Subject: Re: resources not offered to framework


On Tue, Jul 14, 2015 at 4:36 AM, Mike B 
<[email protected]<mailto:[email protected]>> wrote:
I could see the master processing ACCEPT calls for offers and I could see the 
resources associated with the new slave being recovered because none of the 
frameworks they were offered to wanted them. What I never saw was these new  
resources being offered to the framework that could have used them. Ideally, I 
would have liked these new resources to have been offered to that framework. 
(One note, another instance of the same framework was launched after seeing 
this problem and it was offered these new resources.)


I imagine this could happen with the built-in allocator if the framework 
(say F) that needed the "worker" resources had a high DRF share and the other 
frameworks had a low DRF share. If the frameworks that do not need "worker" 
resources do not filter them for long enough (i.e., refuse_seconds is small), 
they might repeatedly become candidates for allocation, starving out F.
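
The starvation scenario above can be sketched as a toy simulation. This is not
the Mesos allocator: the function name first_offer_to_f, the fixed shares, the
one-second allocation cycle, and the single slave are all assumptions made for
illustration. Each cycle, the slave is offered to the lowest-share framework
that is not currently filtering it; an uninterested framework declines and
filters the slave for refuse_seconds.

```python
def first_offer_to_f(shares, wants, refuse_seconds, ticks):
    """Return the cycle at which a framework in `wants` is first offered
    the slave, or None if that never happens within `ticks` cycles.

    Hypothetical toy model: shares stay fixed and there is only one slave.
    """
    filters = {}  # framework name -> time until which it refuses the slave
    for now in range(ticks):
        # DRF-style ordering: the lowest share is considered first.
        for name in sorted(shares, key=shares.get):
            if filters.get(name, 0) > now:
                continue  # this framework is still filtering the slave
            if name in wants:
                return now  # the interested framework finally gets the offer
            # Uninterested framework declines; resources are recovered and
            # offered again on the next cycle.
            filters[name] = now + refuse_seconds
            break
    return None


# Assumed shares: A and B are low-share and uninterested, F is high-share.
shares = {"A": 0.1, "B": 0.2, "F": 0.8}
print(first_offer_to_f(shares, {"F"}, refuse_seconds=1, ticks=60))
print(first_offer_to_f(shares, {"F"}, refuse_seconds=5, ticks=60))
```

In this model, with two uninterested low-share frameworks, a refuse_seconds of
1 means F is never offered the slave within the simulated window, while any
interval longer than two cycles lets the allocator reach F.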

A couple of options here:
--> You can have frameworks that are not interested in "worker" resources 
decline offers containing "worker" resources with a very long refuse interval 
(say 1 year).

--> Instead of attributes, use roles (role: worker, role: workstation, etc.) 
and have framework F register with role "worker".
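
The effect of the roles option can be sketched as a simple eligibility check.
This is only an illustration under stated assumptions, not Mesos code: the
function eligible_frameworks and the framework names are hypothetical. It
assumes that resources reserved for a role are offered only to frameworks
registered with that role, and that unreserved ("*") resources can go to any
framework.

```python
def eligible_frameworks(resource_role, framework_roles):
    """Names of frameworks that may be offered a resource with this role.

    Hypothetical helper: `framework_roles` maps framework name -> the role
    it registered with ("*" meaning the default role).
    """
    return sorted(
        name
        for name, role in framework_roles.items()
        if resource_role == "*" or role == resource_role
    )


# Assumed setup: only F registers with role "worker".
framework_roles = {"A": "*", "B": "workstation", "F": "worker"}
print(eligible_frameworks("worker", framework_roles))
print(eligible_frameworks("*", framework_roles))
```

With this arrangement the other frameworks never see the "worker" resources,
so they cannot keep declining them ahead of F.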
