They all look sane to me. Please proceed with a pull request :-)

We should probably start thinking about .13 (or .14 for the superstitious); quite a few important patches have arrived since .12 was released.

> On 10 Dec 2014, at 1:33 am, Lars Ellenberg <lars.ellenb...@linbit.com> wrote:
>
>
> Andrew,
> All,
>
> Please have a look at the patches I queued up here:
> https://github.com/lge/pacemaker/commits/for-beekhof
>
> Most (not all) are specific for the heartbeat cluster stack.
>
> Thanks,
> Lars
>
> A few comments here:
>
> -----
>
> This effectively changes crm_mon output,
> but also changes logging where this method is invoked:
>
>     Low: native_print: report target-role as well
>
>     This is for the "Why does my resource not start?" guys who
>     forgot to remove the limiting target-role setting.
>
>     Report target role (unless "Started", which is the default anyways),
>     if it limits our abilities (Slave, Stopped),
>     or if it differs from the current status.
>
> -----
>
> Heartbeat specific:
>
>     Low: allow heartbeat to spawn the pengine itself, and tell crmd about it
>
>     Heartbeat 3.0.6 now may spawn the pengine directly, and will announce
>     this in the environment -- I introduced the setting "crmd_spawns_pengine".
>
>     This improves shutdown behavior. Otherwise I regularly find an orphaned
>     pengine process after pacemaker shutdown.
>
> -----
>
> Heartbeat specific, as consequence of the fix below:
>
>     Low: add debugging aid to help spot missing set_msg_callback()s on heartbeat
>
>     In ha_msg_dispatch(), change from rcvmsg() to readmsg().
>     rcvmsg() is internally simply a wrapper around readmsg(),
>     which silently deletes messages without matching callback.
>
>     Use readmsg() directly here. It will only return unprocessed (by
>     callbacks) messages, so log a warning, notice or debug message
>     depending on message header information, and ha_msg_del() it ourselves.
>
> -----
>
> Heartbeat specific bug fix:
>
>     High: fix stonith ignoring its own messages on heartbeat
>
>     Since the introduction of the additional F_TYPE messages
>     T_STONITH_NOTIFY and T_STONITH_TIMEOUT_VALUE, and their use as message
>     types in global heartbeat cluster messages, stonith-ng was broken on the
>     heartbeat cluster stack.
>
>     When delegation was made the default, and the result could only be
>     reaped by listening for the T_STONITH_NOTIFY message, no-one (but
>     stonithd itself) would ever notice successful completion,
>     and stonith would be re-issued forever.
>
>     Registering callbacks for these F_TYPE fixes these hung stonith and
>     stonith_admin operations on the heartbeat cluster stack.
>
> -----
>
> Heartbeat specific:
>
>     Medium: fix tracking of peer client process status on heartbeat
>
>     Don't optimistically assume that peer client processes are alive,
>     or that a node that can talk to us is in fact member of the same
>     ccm partition.
>
>     Whenever ccm tells us about a new membership, *ask* for peer client
>     process status.
>
> -----
>
> This oneliner may well be relevant for corosync CPG as well,
> possibly one of the reasons the pcmk_cpg_membership() has this funny
> "appears to be online even though we think it is dead" block?
>
>     fix crm_update_peer_proc to NOT ignore flags if partially set
>
>     The "set_bit()" function used here actually deals with masks, not bit
>     numbers.
>     The "flag" argument should in fact be plural: flags.
>
>     These proc flag bits are not always set one at a time,
>     but for example as "crm_proc_crmd | crm_proc_cpg",
>     and not necessarily cleared with the same combination.
>
>     Ignoring to-be-set flags just because *some* of the flag bits are
>     already set is clearly a bug, and may be the reason for stale process
>     cache information.
>
> -----
>
> Heartbeat specific:
>
>     Medium: map heartbeat JOIN/LEAVE status to ONLINE/OFFLINE
>
>     The rest of the code deals in "online" and "offline",
>     not "join" and "leave".
>     Need to map these states,
>     or the rest of the code won't work properly.
>
> -----
>
> Generic: if shutdown is requested before the stonith connection was ever
> established (due to other problems), insisting on retrying the stonith
> connection confused the shutdown.
>
>     Medium: don't trigger a stonith_reconnect if no longer required
>
>     Get rid of some spurious error messages, and speed up shutdown,
>     even if the connection to the stonith daemon failed.
>
> -----
>
> Non-functional change, just for readability:
>
>     Low: use CRM_NODE_MEMBER, not CRM_NODE_ACTIVE
>
>     ACTIVE is defined to be MEMBER anyways:
>     include/crm/cluster.h:#define CRM_NODE_ACTIVE CRM_NODE_MEMBER
>
>     Don't confuse the reader of the code
>     by implying it was something different.
>
> -----
>
> Heartbeat specific, packaging only:
>
>     Low: heartbeat 3.0.6 knows to finds the daemons; drop compat symlinks
>
>
> _______________________________________________
> Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org
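
On the crm_update_peer_proc one-liner quoted above, a minimal sketch of the difference between "skip if some bits are already set" and "skip only if all bits are already set", assuming the process flags are plain bit masks. The flag values and both function names below are made up for illustration; they are not Pacemaker's definitions, and the actual change is the one in the for-beekhof branch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag values, for illustration only. */
enum { PROC_CRMD = 0x1, PROC_CPG = 0x2 };

/* Problematic check as described in the commit message (assumed form):
 * the update is skipped as soon as ANY requested bit is already set. */
static bool set_flags_if_none_set(uint32_t *procs, uint32_t flags)
{
    if ((*procs & flags) == 0) {
        *procs |= flags;
        return true;   /* something changed */
    }
    return false;      /* requested bits silently ignored */
}

/* Mask-aware check: the update is skipped only if ALL requested
 * bits are already set. */
static bool set_flags_unless_all_set(uint32_t *procs, uint32_t flags)
{
    if ((*procs & flags) != flags) {
        *procs |= flags;
        return true;   /* at least one bit was newly set */
    }
    return false;      /* nothing to do */
}
```

With only PROC_CPG set beforehand, a request for "PROC_CRMD | PROC_CPG" leaves the first variant's cache without PROC_CRMD, while the second variant sets it.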