If the currently running tasks do not have checkpointing turned on, they cannot 
reconnect to a restarted slave no matter what. 

And yes, currently you can't change a slave's resource roles without wiping 
its metadata. 

@vinodkone

> On Aug 14, 2015, at 6:14 AM, Mike Barborak <[email protected]> wrote:
> 
> I’ve made the changes to my frameworks and Marathon to use roles. My question 
> is, is there a way to change a slave’s role without restarting it? I ask 
> because the slave I want to reconfigure is running frameworks that scheduled 
> tasks that take a very long time to complete their work. These frameworks do 
> not have checkpointing turned on. (I’ve changed the code so that they will in 
> the future.) My understanding and experience tell me that to change the 
> slave’s configuration, I have to restart the slave and that when I do that I 
> will get a log message saying I have to rm -f /tmp/mesos/meta/slaves/latest. 
> After I do that and restart, I believe the running frameworks will not 
> reconnect with the slave (does that sound right?) and will timeout and shut 
> down along with the tasks they scheduled and that is what I’m trying to avoid.
>  
> Thanks,
> Mike
>  
> From: Mike B 
> Sent: Tuesday, July 14, 2015 5:33 PM
> To: [email protected]
> Subject: RE: resources not offered to framework
>  
> I didn’t understand the difference between roles and attributes. That sounds 
> like what I am looking for. Thanks for your help.
>  
> -Mike
>  
> From: Vinod Kone [mailto:[email protected]] 
> Sent: Tuesday, July 14, 2015 4:37 PM
> To: [email protected]
> Subject: Re: resources not offered to framework
>  
>  
> On Tue, Jul 14, 2015 at 4:36 AM, Mike B <[email protected]> wrote:
> I could see the master processing ACCEPT calls for offers and I could see the 
> resources associated with the new slave being recovered because none of the 
> frameworks they were offered to wanted them. What I never saw was these new  
> resources being offered to the framework that could have used them. Ideally, 
> I would have liked these new resources to have been offered to that 
> framework. (One note, another instance of the same framework was launched 
> after seeing this problem and it was offered these new resources.)
>  
> 
> I imagine this could be possible with the built-in allocator if the framework 
> (say F) that needed the "worker" resources had a high DRF share and other 
> frameworks had a low DRF share. If the frameworks that do not need "worker" 
> resources do not filter them for long enough (their refuse_seconds is small), 
> they might repeatedly become candidates for allocation, starving out F.
>  
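To make the starvation scenario concrete, here is a toy model in Python. It is 
only a sketch and not the Mesos allocator: the framework names A and B, the 
share values, and the one-bundle-per-cycle rule are all assumptions made for 
illustration. Each cycle the free "worker" bundle goes to the framework with 
the lowest dominant share that has no active filter.

```python
def simulate(refuse_seconds, cycles=10, interval=1.0):
    """Return the cycle at which F gets the offer, or None if it never does.

    Toy model: one free "worker" resource bundle is offered once per
    allocation cycle to the lowest-share framework without an active filter.
    """
    # Hypothetical dominant shares: F is busy, A and B are mostly idle.
    shares = {"A": 0.1, "B": 0.2, "F": 0.7}
    wants_worker = {"A": False, "B": False, "F": True}
    filtered_until = {name: 0.0 for name in shares}

    for cycle in range(cycles):
        now = cycle * interval
        # Lowest dominant share first, skipping frameworks still filtering.
        candidates = [
            name
            for name in sorted(shares, key=shares.get)
            if filtered_until[name] <= now
        ]
        if not candidates:
            continue
        chosen = candidates[0]
        if wants_worker[chosen]:
            return cycle
        # A decline installs a filter that lasts refuse_seconds.
        filtered_until[chosen] = now + refuse_seconds
    return None


if __name__ == "__main__":
    print("short filter:", simulate(refuse_seconds=0.5))
    print("one-year filter:", simulate(refuse_seconds=365 * 24 * 3600))
```

With a filter shorter than the allocation interval, A's filter has always 
expired by the next cycle, so A is picked again and F is never reached. With a 
very long filter, A and B drop out after one decline each and F gets the offer.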
> A couple of options here:
> --> You can have frameworks that are not interested in "worker" resources 
> decline offers (with "worker" resources) with a very long interval (say 1 
> year).
>  
> --> Instead of attributes, use roles (role: worker, role: workstation etc) 
> and have framework F register with role "worker".
>  
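The two options above could look roughly like this from the framework side. 
This is a sketch in Python that only builds dictionaries; the field names 
follow my understanding of the scheduler Call and FrameworkInfo messages, so 
check them against the Mesos version you run. The helper names, the user, and 
the IDs are made up for illustration.

```python
import json

ONE_YEAR_SECONDS = 365 * 24 * 60 * 60


def build_decline_call(framework_id, offer_ids, refuse_seconds=ONE_YEAR_SECONDS):
    """Option 1: decline unwanted "worker" offers with a very long filter."""
    return {
        "framework_id": {"value": framework_id},
        "type": "DECLINE",
        "decline": {
            "offer_ids": [{"value": offer_id} for offer_id in offer_ids],
            # Ask not to be re-offered these resources for refuse_seconds.
            "filters": {"refuse_seconds": float(refuse_seconds)},
        },
    }


def build_framework_info(name, role, user="mesos"):
    """Option 2: have framework F register with the role "worker"."""
    return {"user": user, "name": name, "role": role}


if __name__ == "__main__":
    # Hypothetical IDs and names, for illustration only.
    print(json.dumps(build_decline_call("framework-0001", ["offer-42"]), indent=2))
    print(json.dumps(build_framework_info("F", "worker"), indent=2))
```

With option 2 the slave also has to assign its resources to the "worker" role, 
which is the configuration change that currently requires wiping the metadata.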
