> On 5 Dec 2015, at 3:32 AM, Jan Pokorný <[email protected]> wrote:
>
> On 04/12/15 12:33 +1100, Andrew Beekhof wrote:
>>> On 4 Dec 2015, at 2:45 AM, Jan Pokorný <[email protected]> wrote:
>>> On 02/12/15 17:23 -0600, Ken Gaillot wrote:
>>>> This will be of interest to cluster front-end developers and anyone
>>>> who needs event notifications ...
>>>>
>>>> One of the new features in Pacemaker 1.1.14 will be built-in
>>>> notifications of cluster events, as described by Andrew Beekhof on
>>>> the That Cluster Guy blog:
>>>> http://blog.clusterlabs.org/blog/2015/reliable-notifications/
>>>>
>>>> For a future version, we're considering extending that to allow
>>>> multiple notification scripts, each with multiple recipients. This
>>>> would require a significant change in the CIB. Instead of a simple
>>>> cluster property, our current idea is a new configuration section
>>>> in the CIB, probably along these lines:
>>>>
>>>> <configuration>
>>>>   <!-- usual crm_config etc. here -->
>>>>
>>>>   <!-- this is the new section -->
>>>>   <notifications>
>>>>
>>>>     <!-- each script would be in a notify element -->
>>>>     <notify id="notify-1" path="/my/script.sh" timeout="30s">
>>>>
>>>>       <recipient id="recipient-1" value="[email protected]" />
>>>>       <!-- etc. for multiple recipients -->
>>>>
>>>>     </notify>
>>>>
>>>>     <!-- etc. for multiple scripts -->
>>>>
>>>>   </notifications>
>>>> </configuration>
>>>>
>>>> The recipient values would be passed to the script as command-line
>>>> arguments (ex. "/my/script.sh [email protected]").
>>>
>>> Just thinking out loud: Pacemaker is well adapted to cope with
>>> asymmetric/heterogeneous nodes (incl. user-assisted optimizations,
>>> such as the non-default "resource-discovery" property of a location
>>> constraint, for instance).
>>>
>>> Setting notifications universally for all nodes may be desired
>>> in some scenarios, but may not be optimal if nodes may diverge,
>>
>> Correct always wins over optimal.
>>
>> I’d not be optimising around scripts that only apply to specific
>> resources that also don’t run everywhere - at most you waste a few
>> cycles. If that ever becomes a real issue, we can add a filter to
>> the notify block.
>>
>> Far worse is if a service can run somewhere new and you forgot to
>> copy the script across… The knowledge doesn’t exist to report that
>> as a problem.
>>
>> The common scenario will be feeding fencing events into things like
>> galera or nova, and sending them via different transports, like SNMP,
>> SMS, and email. Particularly sending SNMP alerts into a fully fledged
>> monitoring and alerting system that finds duplicates and does
>> advanced filtering. We do not, and should not, be trying to
>> reimplement that.
>>
>>> or will for sure:
>>>
>>> (1) the script may not be distributed across all the nodes
>>
>> That’s a bug, not a feature.
>
> see below
>
>>> - or (1b) it is located on shared storage that will become
>>>   available later during the cluster life cycle, because it is
>>>   subject to cluster service management as well
>>
>> How will that script send a notification that the shared storage is
>> no longer available?
>
> This was mostly based on the (made up, yes) assumption that the
> notification script is only checked once for existence. On the other
> hand, if not, a periodic recheck won't be drastically different in
> complexity from a periodic dir rescan (and optimizations for that do
> exist on some systems).
A rescan isn’t going to help you send the “I’ve just stopped the shared
storage” notification. There is nothing there to rescan.

>
>>> (2) one intentionally wants to run the notification mechanism
>>>     on a subset of nodes
>>
>> Can you explain to me when that would be a good idea?
>
> I have no idea about the nifty details of how it all should work, but
> it may be desired to, e.g., decide whether the notification agent
> should also run in the pacemaker_remote case or not.

They don’t. The alerts come from the node it's connected to.

> Or you want to run backup SMS notifications only on the nodes with a
> GSM module installed.

Apart from sounding like a lot of work to avoid

    if [ ! -e /bin/sometool ]; then exit 0; fi

it doesn’t make sense that receiving alerts would be so crucial that
they’d configure redundant paths, but only do so on a subset of the
nodes. That would be like only configuring fencing for some of the
nodes.

>
>> Particularly when those nodes are the only remaining survivors
>> (which you can’t know isn’t the case).
>> If we don’t care about the services on those nodes, why did we make
>> them HA?
>
> You can achieve a good-enough HA notification mechanism by combining
> several non-HA notification methods, just as you do with fencing
> topologies, or just as an HA cluster is built from nodes that are not
> HA by themselves.
>
>>> Note also that once you have the responsibility to distribute the
>>> script on your own, you can use the same distribution mechanism to
>>> share your configuration for this script, as an alternative to using
>>> the "value" attribute in the above proposal
>>
>> So instead of using a standard pool of agents and pcs to set a
>> value, I get to maintain two sets of files on every node in the
>> cluster?
>> And this is supposed to be a feature?
>
> Just wanted to remind you that the CIB solves just a subset of
> orchestration problems.

The CIB is a dumb data store; why are we talking about orchestration?
All we’re trying to do is get information out of pacemaker and into
people’s alerting frameworks. We’re not trying to re-invent those
frameworks.

> Tools like pcs add only a tiny fraction to this subset.
>
> A standard pool of agents + (mostly) single-value customization via a
> central place (the CIB) sounds good; I'm not discounting that at all.
>
>>> (and again, this way, you
>>> are free to have an asymmetric configuration). There are tons
>>> of cases like that, and one has to deal with them already (some RAs,
>>> the file with the secret for Corosync, ...).
>>>
>>> What I am up to is a proposal of an alternative/parallel mechanism
>>> that better fits the asymmetric (and, from the cluster life-cycle
>>> POV, asynchronous) use cases: good old drop-in files. There would
>>> simply be a dedicated directory (say /usr/share/pacemaker/notify.d)
>>> where software interested in notifications would drop its own
>>> listener script (or a symlink thereof); the script is then
>>> discovered by Pacemaker upon a subsequent dir rescan or inotify
>>> event, done.
>>>
>>> --> no configuration needed (or it is external to the CIB, or is
>>>     interspersed there in a non-invasive way), install and go
>>>
>>> --> it has local-only effect, just as the installation of the
>>>     respective software utilizing notifications is local
>>>     (and as is the local handling of the notifications!)
>>
>> Still not a feature.
>
> I am soliciting feedback to learn more about the usefulness,
> if you define feature := something useful.

Yes, I’m saying it’s neither.

>
> --
> Jan (Poki)
> _______________________________________________
> Developers mailing list
> [email protected]
> http://clusterlabs.org/mailman/listinfo/developers
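[Editorial aside: the drop-in directory mechanism proposed in the thread could be dispatched with a loop along the lines below. The directory path and the pass-through argument convention are taken from the proposal itself; nothing here is an implemented Pacemaker interface.]

```shell
# Hypothetical dispatcher for the drop-in proposal: execute every
# executable entry in a notify.d-style directory, passing recipient
# values straight through as command-line arguments. The path and
# calling convention are assumptions from the thread, not a real
# Pacemaker API.
run_dropins() {
    dir="$1"
    shift
    for script in "$dir"/*; do
        # Skip non-executable entries (and the literal glob when the
        # directory is empty).
        [ -x "$script" ] || continue
        "$script" "$@"
    done
}
```

A caller would invoke it as `run_dropins /usr/share/pacemaker/notify.d "$recipient"`; as Andrew notes above, this still cannot report a script that was never installed on a node.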
