On Wed, May 26, 2010 at 6:05 PM, Sam Reidland <[email protected]> wrote:
> I have been working on a simple 2 node 2 resource cluster using
> Pacemaker 1.0.7 and heartbeat 3.0.2. The two resources are IPaddr and
> our application. When our application was started, the box would reboot
> (actually a clean restart). After a lot of searching I found that if I
> didn't initialize net-SNMP, everything started perfectly. The build of
> net-SNMP we use spits 2 or 3 lines to stderr when it starts and I
> noticed that the reboot occurred after the first line to stderr was
> printed and no other output was seen after that. My OCF script started
> our app with the following command '/BACKHAUL/bhApplication >/dev/null
> &'. I changed the command to '/BACKHAUL/bhApplication &>/dev/null &' and
> everything works as it should. So the question is, why does the HA
> software cause the box to reboot when something is sent to stderr? I'm
> not even sure what part caused the box to reboot.

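For reference, here is a minimal sketch of what differs between the two redirections quoted above. It says nothing about why the box rebooted; it only shows that '>/dev/null' leaves stderr attached to whatever descriptor the caller handed over, while redirecting both streams does not. The noisy function is a hypothetical stand-in for the application, not anything from the real OCF script. Note that '&>' is a bash extension; '>/dev/null 2>&1' is the portable spelling used here.

```shell
#!/bin/sh
# Hypothetical stand-in for the application (an assumption for this
# sketch): one line to stdout, one line to stderr.
noisy() {
    echo "to stdout"
    echo "to stderr" >&2
}

# '>/dev/null' redirects stdout only. The braces route the group's
# stderr into the command substitution so we can see what escapes.
leaked=$( { noisy >/dev/null; } 2>&1 )

# '>/dev/null 2>&1' discards both streams. This is the portable
# equivalent of bash's '&>/dev/null'.
silent=$( { noisy >/dev/null 2>&1; } 2>&1 )

echo "stdout-only redirect leaked: '$leaked'"
echo "both redirected leaked:      '$silent'"
```

The first line of output shows 'to stderr' getting through and the second shows nothing. If the resource agent runs under a plain POSIX /bin/sh rather than bash, 'cmd &>/dev/null &' is parsed as 'cmd &' followed by a separate redirection, so the '2>&1' form is the safer one to use in an OCF script.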
Looks like it's causing the lrmd to crash for some reason.

Jan 1 00:15:45 bh130 daemon.crit crmd: [1148]: CRIT: lrm_connection_destroy: LRM Connection failed
Jan 1 00:15:45 bh130 daemon.info crmd: [1148]: info: lrm_connection_destroy: LRM Connection disconnected

Dejan?

_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
