On Wed, May 26, 2010 at 6:05 PM, Sam Reidland <[email protected]> wrote:
> I have been working on a simple 2 node 2 resource cluster using
> Pacemaker 1.0.7 and heartbeat 3.0.2. The two resources are IPaddr and
> our application. When our application was started, the box would reboot
> (actually a clean restart). After a lot of searching I found that if I
> didn't initialize net-SNMP, everything started perfectly. The build of
> net-SNMP we use spits 2 or 3 lines to stderr when it starts and I
> noticed that the reboot occurred after the first line to stderr was
> printed and no other output was seen after that. My OCF script started
> our app with the following command '/BACKHAUL/bhApplication >/dev/null
> &'. I changed the command to '/BACKHAUL/bhApplication &>/dev/null &' and
> everything works as it should. So the question is, why does the HA
> software cause the box to reboot when something is sent to stderr? I'm
> not even sure what part caused the box to reboot.

Looks like it's causing the lrmd to crash for some reason.

Jan  1 00:15:45 bh130 daemon.crit crmd: [1148]: CRIT: lrm_connection_destroy: LRM Connection failed
Jan  1 00:15:45 bh130 daemon.info crmd: [1148]: info: lrm_connection_destroy: LRM Connection disconnected
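
For reference, here is a small sketch of how the two redirections you tried differ. It only illustrates where stderr ends up; it does not explain why the lrmd goes away. The noisy function is a hypothetical stand-in for your application, not anything from your OCF script. Note also that '&>' is a bash extension: under a plain POSIX /bin/sh it is parsed as "background the command, then redirect", so the portable spelling '>/dev/null 2>&1' is used below.

```shell
#!/bin/sh
# Hypothetical stand-in for the application:
# one line to stdout, one line to stderr.
noisy() {
    echo "to stdout"
    echo "to stderr" >&2
}

# Only stdout is discarded. stderr still goes to whatever the caller
# inherited; here it is captured so that it can be inspected.
leaked=$( { noisy >/dev/null; } 2>&1 )

# Both streams are discarded. This is the portable POSIX equivalent of
# the bash-only '&>/dev/null'.
silent=$( { noisy >/dev/null 2>&1; } 2>&1 )

echo "stdout-only redirect leaked: '$leaked'"
echo "full redirect leaked: '$silent'"
```

With the first form, anything the application writes to stderr is still delivered to the file descriptor the resource agent inherited from its parent, so that is the only path by which the net-SNMP startup messages could reach the cluster software.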

Dejan?
_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
