----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/57836/#review169821 -----------------------------------------------------------
Thanks Ilya! Michael and I took a look. The fix looks good, just a suggestion below on how to make the check a bit clearer. Also I was confused a bit by the description. How about the following? ``` The commit 4912f341dce12f10b43fe82db20086032391e95a introduced a new `framework_info` field in the `UpdateFrameworkMessage` sent from the master to the agent. Unfortunately, the agent code unconditionally uses the new field, which results in a default-intialized `FrameworkInfo` when a new agent is receiving the message from an old master. This leads to an agent abort when it subsequently tries to launch a new task for the updated framework. This fixes the issue by ignoring the field if unset. ``` src/slave/slave.cpp Lines 2912-2913 (original), 2912-2919 (patched) <https://reviews.apache.org/r/57836/#comment242469> To make this clearer, we can modify updateFramework to take the message as its argument: ``` void Slave::updateFramework( const UpdateFrameworkMessage& message) { ... // The framework info was added in 1.3, so it will // not be set if from a master older than 1.3. if (message.has_framework_info()) { ... } ... } ``` ``` install<UpdateFrameworkMessage>( &Slave::updateFramework); ``` - Benjamin Mahler On March 22, 2017, 12:38 p.m., Ilya Pronin wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/57836/ > ----------------------------------------------------------- > > (Updated March 22, 2017, 12:38 p.m.) > > > Review request for mesos, Benjamin Mahler and Michael Park. > > > Bugs: MESOS-7281 > https://issues.apache.org/jira/browse/MESOS-7281 > > > Repository: mesos > > > Description > ------- > > Patch in r/57108 introduced framework info updates. When a patched agent > is used with not patched master it will accept a default-intialized > framework info with an empty framework ID. This will cause an agent > abort when it tries to launch a new task for the updated framework. This > change fixes it. > > > Diffs > ----- > > src/slave/slave.cpp a4f4a9ca80b726de8e07571fd6d93120947c278b > > > Diff: https://reviews.apache.org/r/57836/diff/1/ > > > Testing > ------- > > Ran `make check`. Verified manually by launching a task, forcing a framework > to failover to make the master send an `UpdateFrameworkMessage` and launching > more tasks. > > > Thanks, > > Ilya Pronin > >
