I’ve changed my frameworks and Marathon to use roles. My question is: is there a way to change a slave’s role without restarting it? I ask because the slave I want to reconfigure is running frameworks whose scheduled tasks take a very long time to complete. These frameworks do not have checkpointing turned on. (I’ve changed the code so that they will in the future.) My understanding and experience tell me that to change the slave’s configuration I have to restart the slave, and that when I do, I will get a log message saying I have to rm -f /tmp/mesos/meta/slaves/latest. After I do that and restart, I believe the running frameworks will not reconnect with the slave (does that sound right?) and will time out and shut down along with the tasks they scheduled, which is what I’m trying to avoid.
Thanks,
Mike

From: Mike B
Sent: Tuesday, July 14, 2015 5:33 PM
To: [email protected]
Subject: RE: resources not offered to framework

I didn’t understand the difference between roles and attributes. That sounds like what I am looking for. Thanks for your help.

-Mike

From: Vinod Kone [mailto:[email protected]]
Sent: Tuesday, July 14, 2015 4:37 PM
To: [email protected]
Subject: Re: resources not offered to framework

On Tue, Jul 14, 2015 at 4:36 AM, Mike B <[email protected]> wrote:

> I could see the master processing ACCEPT calls for offers and I could see the resources associated with the new slave being recovered because none of the frameworks they were offered to wanted them. What I never saw was these new resources being offered to the framework that could have used them. Ideally, I would have liked these new resources to have been offered to that framework. (One note: another instance of the same framework was launched after seeing this problem, and it was offered these new resources.)

I imagine this could be possible with the built-in allocator if the framework (say F) that needed the "worker" resources had a high DRF share and other frameworks had a low DRF share. If the frameworks that do not need "worker" resources do not filter them for long enough (i.e., their refuse_seconds is small), they might repeatedly become candidates for allocation, starving out F.

A couple of options here.
--> You can have frameworks that are not interested in "worker" resources decline offers (with "worker" resources) with a very long interval (say 1 year).
--> Instead of attributes, use roles (role: worker, role: workstation, etc.) and have framework F register with role "worker".
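
The starvation scenario described in the reply above can be illustrated with a toy model. This is only a sketch under simplifying assumptions, not Mesos's allocator: one resource is offered once per allocation round to the unfiltered framework with the lowest share, every framework except the target declines, and a decline filters the resource for refuse_seconds. The function name, the share values, and the one-second interval are all hypothetical.

```python
def rounds_until_offered(target, shares, refuse_seconds, interval=1.0, max_rounds=100):
    """Return the round in which `target` first gets the offer, or None if it never does.

    Toy model (assumption, not the real allocator): each round the single
    resource goes to the lowest-share framework that has no active filter.
    """
    filtered_until = {name: 0.0 for name in shares}
    order = sorted(shares, key=shares.get)  # lowest share first
    for round_no in range(1, max_rounds + 1):
        now = round_no * interval
        for name in order:
            if filtered_until[name] > now:
                continue  # this framework's decline filter is still active
            if name == target:
                return round_no
            # Every other framework declines and filters the resource.
            filtered_until[name] = now + refuse_seconds
            break
    return None


# Hypothetical shares: F needs the "worker" resources but has the highest share.
shares = {"A": 0.1, "B": 0.2, "F": 0.9}

short = rounds_until_offered("F", shares, refuse_seconds=1.0)
long_ = rounds_until_offered("F", shares, refuse_seconds=365 * 24 * 3600.0)
print(short, long_)
```

In this model a short filter expires before the allocator ever reaches F, so the low-share frameworks keep receiving and declining the offer. With a very long refuse interval, each of them is filtered out after one decline and F gets the offer as soon as the others have been tried.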

