RE: [Linux-HA] Heartbeat Reboot - Why?

Mike Sweetser - Adhost Wed, 31 Dec 2008 11:03:47 -0800

> The reason for reboot was that the crmd encountered an
> unrecoverable condition and exited. It is not clear what happened
> to crmd. It could be a communication problem, though there's
> nothing in the logs from the lower layer (heartbeat). BTW, you
> can prevent reboots by replacing "crm yes" with "crm respawn" in
> ha.cf, though that probably won't help.

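As I understand it, the suggested change is a one-line edit. This is only a sketch, assuming the file is at the usual /etc/ha.d/ha.cf location and that the rest of the configuration stays untouched:

```
# /etc/ha.d/ha.cf (path assumed; only the crm line is shown)
# Before:
#   crm yes
# After, as suggested in the quoted reply:
crm respawn
```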

Can I do this without stopping/starting Heartbeat?

> A reboot shouldn't leave your system in an unstable state. What
> do you mean by "unstable"? Why was there no failover?

That is an excellent question, and one that I'd really like to have the
answer to.  It was almost as if the secondary node did not detect the
outage.

> Can you
> please produce a hb_report report. That would include the
> configuration and logs from both nodes and all other relevant
> information.

I've attached the hb_report - let me know if there's another way I
should do this.

By the way, hb_report would not properly detect the log until I
commented out some of the detection code in
/usr/share/heartbeat/utillib.sh:

123,125c123,125
<       #if [ "$HA_SYSLOGMSGFMT" -o "$HA_LOGFACILITY" ]; then
<       #       awk '{print $1,$2,$3}'
<       #else
---
>       if [ "$HA_SYSLOGMSGFMT" -o "$HA_LOGFACILITY" ]; then
>               awk '{print $1,$2,$3}'
>       else
127c127
<       #fi
---
>       fi
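
For context, the awk command in that branch keeps only the first three whitespace-separated fields of each line, which on a traditional syslog-format line are the month, day, and time. Here is a minimal sketch of that behaviour. The function name syslog_timestamp and the sample log line are made up for illustration and are not taken from utillib.sh:

```shell
# Hypothetical wrapper (not part of utillib.sh): print the first three
# whitespace-separated fields of each input line. On a traditional
# syslog-format line these are month, day, and time.
syslog_timestamp() {
    awk '{print $1,$2,$3}'
}

# The sample line below is invented for illustration only.
echo "Dec 31 11:03:47 node1 heartbeat: [1234]: info: example message" | syslog_timestamp
```

If the log is not in that format, the same three fields would not form a timestamp, which may be related to why the detection did not work for me.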

Thank You,
Mike Sweetser

_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
