----- Original Message -----
> From: "Yan Gao" <y...@suse.com>
> To: pacemaker@oss.clusterlabs.org
> Sent: Thursday, December 6, 2012 12:28:06 PM
> Subject: Re: [Pacemaker] Enable remote monitoring
>
> Hi,
>
> On 12/06/12 19:42, Lars Marowsky-Bree wrote:
> > On 2012-12-06T22:25:40, Andrew Beekhof <and...@beekhof.net> wrote:
> >
> >> But any failures of the nagios agents would count against the VM's
> >> migration-threshold.
> >> So if moving were the right thing to do, it would have done it
> >> already.
> >
> > OK. I think this was due to me still being stuck on the workings of an
> > order constraint, but of course if the failures are instead attributed
> > to the container, this would happen automatically already. True.
> >
> > (Incidentally, I like "attribute", "ascribe" better than "delegate"
> > because to me, they better fit what's going on, if we stuck with
> > "delegate-failures". Just saying. ;-)
> >
> >>> We already have on-fail settings. How would these play together?
> >> Good question. My initial thought was that it would be up to on-fail
> >> settings in the VM.
> >
> > I'd prefer to keep that separate (as proposed below). Because if an
> > action of the *VM* really fails, I may want an admin to look into it
> > (why could the bloody hypervisor not start/stop it?), which is different
> > from restarting the VM if one of the resources within it needs that.
> >
> >>> Would it even make sense to have on-fail="restart-container"? (Or a
> >>> nicer wording.)
> >>>
> >>> Hmmm. That might work. We allow a "container" to be specified as a
> >>> meta attribute.
> >>>
> >>> If set, on-fail would default to restart container for most actions.
> >>> But admins could actually modify it - say, they might want to set
> >>> monitor on-fail="ignore" to just get notified. And when we move
> >>> forward to whiteboxes, we could have start/monitor/promote/demote
> >>> on-fail="restart" (like now) and stop
> >>> on-fail="restart-container".
> >>>
> >>> That appears reasonably neat?
> >> It does actually.
> >> I wasn't originally thinking it was necessary but it makes sense now
> >> that you point it out.
> >
> > Yes, I think I like this too now.
> I like it too. Here is the draft code:
> https://github.com/gao-yan/pacemaker/commit/4f7b80baa42f3801c1fb8186aef076877f34dfea
>
> It works in my simple test. Although failures of resources aren't yet
> counted against the container's migration-threshold, it shows you the
> basic idea. I'd appreciate it if you could take a look first. It's very
> likely I'm really on the right track this time. ;-)

+1, I like where this is going :)

-- Vossel

> >
> > Uhm. Would "container" imply ordering + colocation, or would we still
> > need them grouped (resource_set'ed, whatever)?
> >
> > My, design is hard. ;-)
> :-)
>
>
> Regards,
> Gao,Yan
> --
> Gao,Yan <y...@suse.com>
> Software Engineer
> China Server Team, SUSE.
>
> _______________________________________________
> Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>
> Project Home: http://www.clusterlabs.org
> Getting started:
> http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org
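
For anyone skimming the thread, the on-fail defaulting proposed in the quoted discussion could be sketched roughly like this. It is only an illustration in Python, not the code from the linked commit and not Pacemaker's API: the function name `effective_on_fail` and the dict-based representation of meta attributes and operation settings are hypothetical, and the fallback to a plain "restart" is a simplification (the thread only says "most actions" would default to restarting the container).

```python
# Hypothetical sketch of the on-fail defaulting discussed in the thread.
# Nothing here is Pacemaker's real API; all names are made up for illustration.

def effective_on_fail(action, meta, op_settings):
    """Return the on-fail policy for `action` on a resource.

    meta        -- resource meta attributes, e.g. {"container": "vm1"}
    op_settings -- per-operation settings chosen by the admin,
                   e.g. {"monitor": {"on-fail": "ignore"}}
    """
    # An explicit admin setting always wins over any default.
    explicit = op_settings.get(action, {}).get("on-fail")
    if explicit is not None:
        return explicit
    # With a "container" meta attribute set, the proposal is to default
    # to restarting the container (assumed here for every action).
    if meta.get("container"):
        return "restart-container"
    # Simplified fallback; real per-action defaults are not modelled.
    return "restart"


if __name__ == "__main__":
    in_vm = {"container": "vm1"}
    print(effective_on_fail("monitor", in_vm, {}))
    print(effective_on_fail("monitor", in_vm, {"monitor": {"on-fail": "ignore"}}))
    print(effective_on_fail("start", {}, {}))
```

The middle call mirrors the example from the thread where an admin sets monitor on-fail="ignore" just to get notified instead of having the VM restarted.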