> On Jan. 5, 2016, 1:45 a.m., Guangya Liu wrote:
> > docs/high-availability-framework-guide.md, lines 117-118
> > <https://reviews.apache.org/r/41896/diff/1/?file=1181077#file1181077line117>
> >
> >     What will be the final state of this task? Does the framework need to
> >     re-launch this task even though it might have already finished on the
> >     agent? Can you please add some best practices for this case?
>
> Neil Conway wrote:
>     I'm not sure what else we can say here: the best practice we're
>     recommending is to avoid this situation entirely by ensuring that when a new
>     framework leader is elected, it knows about (a superset of) all the tasks the
>     previous leader might have launched.
>
> Guangya Liu wrote:
>     Can we clarify that the framework may need to re-launch the task in such
>     a case?

Well, if the framework instance doesn't know about the task (because it hasn't persisted state correctly before failing over), it isn't clear what it should do in general -- maybe kill the unknown task, maybe let it run and page an admin. This is why we don't recommend that this situation be allowed in the first place :)


- Neil


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/41896/#review112704
-----------------------------------------------------------


On Jan. 5, 2016, 3:14 a.m., Neil Conway wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/41896/
> -----------------------------------------------------------
>
> (Updated Jan. 5, 2016, 3:14 a.m.)
>
>
> Review request for mesos, Benjamin Hindman, Ben Mahler, and Joris Van
> Remoortere.
>
>
> Bugs: MESOS-3936
>     https://issues.apache.org/jira/browse/MESOS-3936
>
>
> Repository: mesos
>
>
> Description
> -------
>
> Added guide to writing highly available Mesos frameworks.
>
>
> Diffs
> -----
>
>   docs/app-framework-development-guide.md 4a43a93d080bdac37b8aee91748fea7552a1cc67
>   docs/high-availability-framework-guide.md PRE-CREATION
>   docs/high-availability.md 31aa66220617a3f8606b185ef247c11f00735227
>   docs/home.md 6f0f4b9cb9d0da1f9960ebe7f36ce186c1317535
>
> Diff: https://reviews.apache.org/r/41896/diff/
>
>
> Testing
> -------
>
> Previewed via site-docker.
>
> Note that there's a lot more that could be said here; also, at some point we
> should probably unify the "reconciliation" page with this page, and perhaps
> move some of the content in the "high-availability" page here (leaving the
> "high-availability" page for the operator-centric parts of configuring Mesos
> to run in HA mode).
>
>
> Thanks,
>
> Neil Conway
>
>
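
For readers of this thread, the recommendation discussed above (a new leader should know about a superset of the tasks the previous leader might have launched) can be illustrated with a small sketch. It is not taken from the guide under review and uses no Mesos API: `TaskStore`, `launch_task`, and `classify_after_failover` are hypothetical names, and the in-memory dictionary only stands in for whatever durable, replicated storage a real framework would use. The assumption being shown is that the intent to launch is persisted before the launch is attempted.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, List, Set, Tuple


@dataclass
class TaskStore:
    """Hypothetical stand-in for durable, replicated storage (not a Mesos API)."""
    tasks: Dict[str, str] = field(default_factory=dict)

    def record_intent(self, task_id: str) -> None:
        # A real store would have to make this write durable before returning.
        self.tasks[task_id] = "LAUNCH_INTENDED"

    def known_ids(self) -> Set[str]:
        return set(self.tasks)


def launch_task(store: TaskStore, task_id: str,
                do_launch: Callable[[str], None]) -> None:
    """Persist the intent to launch first, then attempt the launch."""
    store.record_intent(task_id)  # step 1: write-ahead record
    do_launch(task_id)            # step 2: may fail, or never happen on a crash


def classify_after_failover(
    store: TaskStore, reported_ids: Iterable[str]
) -> Tuple[List[str], List[str]]:
    """Compare persisted task IDs with the IDs the cluster reports.

    Returns (maybe_never_launched, unknown):
      maybe_never_launched: persisted but not reported; the old leader may
        have failed between step 1 and step 2.
      unknown: reported but never persisted; the situation the thread
        recommends avoiding, since no general policy applies to it.
    """
    known = store.known_ids()
    reported = set(reported_ids)
    return sorted(known - reported), sorted(reported - known)
```

Because the record is written before the launch, a crash between the two steps leaves the new leader with an extra entry rather than a missing one, so the second list should stay empty; what to do with the first list (retry, or drop the record) is a framework-specific choice.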