> On Nov. 22, 2013, 8:04 p.m., Niklas Nielsen wrote:
> > Did we get to a conclusion regarding case 1)? and could we write a test
> > which exercises the new scenarios?

If I get some time, I'll write a test. I've been testing it in production for a few days though.

Not sure about consensus. Would like to hear from the others.


- Brenden


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15745/#review29305
-----------------------------------------------------------


On Nov. 22, 2013, 12:30 a.m., Brenden Matthews wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/15745/
> -----------------------------------------------------------
>
> (Updated Nov. 22, 2013, 12:30 a.m.)
>
>
> Review request for mesos and Niklas Nielsen.
>
>
> Repository: mesos-git
>
>
> Description
> -------
>
> Fixed some task reconciliation cases.
>
> Case 1:
>
> If a slave is known but the task cannot be found, we should assume that
> the task has been lost. It's possible that the following events
> occurred:
>
> 1) Framework disconnected from master
> 2) Master terminated framework's tasks
> 3) Framework reconnects to master, and (incorrectly) assumes tasks are
> still running
>
> Case 2:
>
> If a framework loses track of running tasks, the master should inform
> the framework of which tasks it knows to be running, in addition to any
> which have had a state change.
>
> Review: https://reviews.apache.org/r/15745
>
>
> Diffs
> -----
>
> src/master/master.cpp a08d01208ff7bbb878b2d50d8406efee4de86171
>
> Diff: https://reviews.apache.org/r/15745/diff/
>
>
> Testing
> -------
>
> `make check` & tested in staging cluster.
>
>
> Thanks,
>
> Brenden Matthews
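
For anyone following the thread, here is a rough, standalone sketch of the two reconciliation cases from the description. It is not the code in src/master/master.cpp and does not use the Mesos API: the types `TaskState`, `TaskStatus`, `MasterView` and the function `reconcile` are all hypothetical names made up for illustration, and it assumes a single framework. It also assumes that a status for an unknown slave is simply skipped, which the description does not address.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical, simplified task states (not the Mesos enum).
enum class TaskState { RUNNING, LOST };

// Hypothetical status record, used both for what the framework reports
// and for what the master sends back.
struct TaskStatus {
  std::string taskId;
  std::string slaveId;
  TaskState state;
};

// Hypothetical view of the master's bookkeeping for one framework:
// slave id -> (task id -> state).
using MasterView = std::map<std::string, std::map<std::string, TaskState>>;

// Returns the status updates the master would send back to the framework.
std::vector<TaskStatus> reconcile(
    const MasterView& master,
    const std::vector<TaskStatus>& reported)
{
  std::vector<TaskStatus> updates;
  std::set<std::string> seen;  // Task ids the framework asked about.

  for (const TaskStatus& status : reported) {
    auto slave = master.find(status.slaveId);
    if (slave == master.end()) {
      continue;  // Unknown slave: outside the two cases, skipped here.
    }

    seen.insert(status.taskId);

    auto task = slave->second.find(status.taskId);
    if (task == slave->second.end()) {
      // Case 1: the slave is known but the task is not, so assume lost.
      updates.push_back({status.taskId, status.slaveId, TaskState::LOST});
    } else if (task->second != status.state) {
      // The state changed since the framework last heard about it.
      updates.push_back({status.taskId, status.slaveId, task->second});
    }
  }

  // Case 2: also report running tasks the framework did not mention.
  for (const auto& slave : master) {
    for (const auto& task : slave.second) {
      if (task.second == TaskState::RUNNING && seen.count(task.first) == 0) {
        updates.push_back({task.first, slave.first, TaskState::RUNNING});
      }
    }
  }

  return updates;
}
```

With a master that knows tasks t1 and t2 running on slave s1, a framework that reports only an unknown task t3 on s1 would get back LOST for t3 (case 1), followed by RUNNING for t1 and t2 (case 2).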