----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/59214/#review174850 -----------------------------------------------------------
Fix it, then Ship it! src/master/http.cpp Lines 1454-1456 (original), 1454-1456 (patched) <https://reviews.apache.org/r/59214/#comment248064> Could we do the following? ``` writer->field( "unregistered_frameworks", [](JSON::ArrayWriter*) {}); ``` The use of vector<int> tripped me up here since it led me to think we were exposing integers before. src/master/http.cpp Lines 2874-2880 (original), 2825-2831 (patched) <https://reviews.apache.org/r/59214/#comment248065> Ditto for these w.r.t. to vector<int> being misleading to the reader. src/master/master.cpp Lines 6083-6084 (original), 6060-6061 (patched) <https://reviews.apache.org/r/59214/#comment248073> What does this null pointer case correspond to? Is it a framework that will be "recovered" later in the re-registration? A comment might help clarify this for the reader. src/master/master.cpp Lines 6180-6229 (original), 6157-6197 (patched) <https://reviews.apache.org/r/59214/#comment248077> Hm.. do we still need `ids`? It seems like we could simplify this for the reader a bit by removing the `ids` and by consolidating the two cases into a single loop: ``` foreach (const FrameworkInfo& frameworkInfo, frameworks) { Framework* framework = getFramework(frameworkId); if (isCompletedFramework(frameworkId)) { // do nothing, __reregister already sent the shutdown message } else if (framework != nullptr)) { // send update framework message } else { // recover the framework } } ``` With the framework cases together I found it easier to read. Looking at this code, I found the breakdown of responsibility a little confusing, in that it wasn't clear to me why `__reregisterSlave` sends the shutdown but `___reregisterSlave` deals with the update and recovery of the framework (I assume it's due to the call-site requirements). Something to think about separately from this change. src/master/master.cpp Lines 6779-6780 (original), 6735-6736 (patched) <https://reviews.apache.org/r/59214/#comment248079> If you want to send a separate cleanup patch, looks like this case and other PARTITION_AWARE cases were missed in the sweep to use `framework->capabilities`: ``` if (!framework->capabilities.partitionAware) { newTaskState = TASK_LOST; } ``` - Benjamin Mahler On May 11, 2017, 11:31 p.m., Neil Conway wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/59214/ > ----------------------------------------------------------- > > (Updated May 11, 2017, 11:31 p.m.) > > > Review request for mesos and Benjamin Mahler. > > > Bugs: MESOS-6977 > https://issues.apache.org/jira/browse/MESOS-6977 > > > Repository: mesos > > > Description > ------- > > In previous versions of Mesos, the master might know about a task > running on an agent, but might not have a `FrameworkInfo` for that > task's framework. This might occur if the master fails over, the agent > re-registers, but the framework has not (yet) re-registered. > > In Mesos 1.0, the agent will now supply the FrameworkInfo for all tasks > it is running when it re-registers with the master. Since we no longer > support 0.x agents, we can remove the old backward compatibility code > for handling a running task with unknown FrameworkInfo. > > Note that this means that "orphan tasks", "orphan executors", and > "unregistered frameworks" are no longer possible. > > > Diffs > ----- > > src/master/http.cpp e2590a17044ac019b24a24629428d4ec8adc0c31 > src/master/master.cpp 4e7a161f431624bd78d3e9032eb8587687149cad > > > Diff: https://reviews.apache.org/r/59214/diff/1/ > > > Testing > ------- > > `make check` > > > Thanks, > > Neil Conway > >
