> On Sept. 11, 2018, 10:35 a.m., Benno Evers wrote:
> > src/slave/slave.cpp
> > Lines 1607 (patched)
> > <https://reviews.apache.org/r/67022/diff/1/?file=2018278#file2018278line1607>
> >
> >     If I understand the slave code correctly, I think this would checkpoint 
> > the new state every time we're losing connection to a master (e.g. when the 
> > leader changes).
> >     
> >     I'd suggest moving it into `Slave::recover()`, i.e. the place where we 
> > set `requiredMasterCapabilities.agentUpdate` to true in the first place.

The condition to set `requiredMasterCapabilities.agentUpdate` is:

```
     // Check for SlaveInfo compatibility.                                      
                                                                                
                      
     Try<Nothing> _compatible =                                                 
                                                                                
                      
       compatible(slaveState->info.get(), info);                                
                                                                                
                      
                                                                                
                                                                                
                      
     if (_compatible.isSome()) {                                                
                                                                                
                      
       // Permitted change, so we reuse the recovered agent id and reconnect    
                                                                                
                      
       // to running executors.                                                 
                                                                                
                      
                                                                                
                                                                                
                      
       // Prior to Mesos 1.5, the master expected that an agent would never     
                                                                                
                      
       // change its `SlaveInfo` and keep the same slave id, and therefore 
would                                                                           
                           
       // not update it's internal data structures on agent re-registration.    
                                                                                
                      
       if (!(slaveState->info.get() == info)) {                                 
                                                                                
                      
         requiredMasterCapabilities.agentUpdate = true;                         
                                                                                
                      
       }       
```

So if slave info did not change (i.e, a simple leader change), 
`requiredMasterCapabilities.agentUpdate` should be `false` and we would not 
update the checkpointed `SlaveInfo`.

I confirmed that in our agent log, but we can also create a new unit test to 
cover that case. wdyt?


- Zhitao


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/67022/#review208520
-----------------------------------------------------------


On May 8, 2018, 5:40 p.m., Zhitao Li wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/67022/
> -----------------------------------------------------------
> 
> (Updated May 8, 2018, 5:40 p.m.)
> 
> 
> Review request for mesos, Eric Chung and Jason Lai.
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Refreshed checkpointed SlaveInfo if `--reconfiguration_policy==any`.
> 
> 
> Diffs
> -----
> 
>   src/slave/flags.cpp a319b5ea633c41fd8a252c5e1617ac52d1480ba5 
>   src/slave/slave.cpp c6d9152d9de4184f9107bb8242b41d468d76e018 
> 
> 
> Diff: https://reviews.apache.org/r/67022/diff/2/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Zhitao Li
> 
>

Reply via email to