Hi,

On Mon, Jan 14, 2008 at 11:58:17AM -0500, Gary Schlachter wrote:
> Dejan,
>
> I started there. However, the problem I had was that I could not
> install 2.1.3 on Fedora Core 1 since it needed later versions of other
> RPMs. I can make 2.1.3 on FC1 but when I try to package heartbeat, I get
> missing libnet-devel, openhpi-devel, gnutls-devel, OpenIPMI-devel. Is
> there a way around this?

You'd have to find those packages for your distribution. BTW,
isn't Fedora Core 1 a bit old too?

Thanks,

Dejan

>
> Gary
>
> Dejan Muhamedagic wrote:
>> Hi,
>>
>> On Fri, Jan 11, 2008 at 10:22:48AM -0500, Gary Schlachter wrote:
>>
>>> I have a problem with heartbeat dying. I have a 3 node cluster
>>> running HA 2.0.8 on Fedora Core 1. They are providing a single IP
>>> address resource. They are using eth0 as the heartbeat mechanism. If I
>>> disconnect the eth0 cable from the node which is providing the IP
>>> address, one of the other nodes correctly begins providing it. However,
>>> shortly after disconnecting the eth0 cable, the heartbeat process (and
>>> others) die. The
>>
>> This has been fixed a few months ago. The fix is in the 2.1.3
>> release. Could you please use the new release.
>>
>> Thanks,
>>
>> Dejan
>>
>>
>>> key area in the ha-debug log looks like the following:
>>>
>>> pengine[4293]: 2008/01/11_09:50:22 info: determine_online_status: Node
>>> loneranger.us.big.net is online
>>> pengine[4293]: 2008/01/11_09:50:22 info: native_print: SharedIP
>>> (heartbeat::ocf:IPaddr): Started loneranger.us.big.net
>>> pengine[4293]: 2008/01/11_09:50:22 notice: StopRsc:
>>> loneranger.us.big.net Stop SharedIP
>>> crmd[9543]: 2008/01/11_09:50:22 info: do_state_transition:
>>> loneranger.us.big.net: State transition S_POLICY_ENGINE
>>> ->S_TRANSITION_ENGINE [input=I_PE_SUCCESS cause=C_IPC_MESSAGE
>>> origin=route_message ]
>>> pengine[4293]: 2008/01/11_09:50:22 info: process_pe_message: Transition
>>> 0: PEngine Input stored in: /var/lib/heartbeat/pengine/pe-input-137.bz2
>>> tengine[4292]: 2008/01/11_09:50:22 info: unpack_graph: Unpacked
>>> transition 0: 1 actions in 1 synapses
>>> tengine[4292]: 2008/01/11_09:50:22 info: send_rsc_command: Initiating
>>> action 3: SharedIP_stop_0 on loneranger.us.big.net
>>> crmd[9543]: 2008/01/11_09:50:22 info: do_lrm_rsc_op: Performing
>>> op=SharedIP_stop_0 key=3:0:994066a9-4cae-49a4-abad-37f3e0b84b3e)
>>> IPaddr[4300]: 2008/01/11_09:50:22 INFO: /sbin/ifconfig eth0:0
>>> 10.1.2.50 down
>>> lrmd[9540]: 2008/01/11_09:50:22 info: RA output: (SharedIP:stop:stderr)
>>> SIOCDELRT: No such process
>>>
>>> crmd[9543]: 2008/01/11_09:50:22 info: process_lrm_event: LRM operation
>>> SharedIP_stop_0 (call=4, rc=0) complete
>>> cib[9539]: 2008/01/11_09:50:22 info: cib_diff_notify: Update (client:
>>> 9543, call:32): 0.30.317 -> 0.30.318 (ok)
>>> cib[4315]: 2008/01/11_09:50:22 info: write_cib_contents: Wrote version
>>> 0.30.318 of the CIB to disk (digest: ad7329b3cddc6a9bbd96deb332a3d08f)
>>> tengine[4292]: 2008/01/11_09:50:22 info: te_update_diff: Processing diff
>>> (cib_update): 0.30.317 -> 0.30.318
>>> tengine[4292]: 2008/01/11_09:50:22 info: match_graph_event: Action
>>> SharedIP_stop_0 (3) confirmed on c8608d41-66b2-4115-9043-4a8423b0d562
>>> tengine[4292]: 2008/01/11_09:50:22 info: run_graph: Transition 0:
>>> (Complete=1, Pending=0, Fired=0, Skipped=0, Incomplete=0)
>>> tengine[4292]: 2008/01/11_09:50:22 info: notify_crmd: Transition 0
>>> status: te_complete - <null>
>>> crmd[9543]: 2008/01/11_09:50:22 info: do_state_transition:
>>> loneranger.us.big.net: State transition S_TRANSITION_ENGINE -> S_IDLE [
>>> input=I_TE_SUCCESS cause=C_IPC_MESSAGE origin=route_message ]
>>> heartbeat[9527]: 2008/01/11_09:54:27 ERROR: Cannot write to media pipe 0:
>>> Resource temporarily unavailable
>>> heartbeat[9527]: 2008/01/11_09:54:27 ERROR: Shutting down.
>>> heartbeat[9527]: 2008/01/11_09:54:27 ERROR: Cannot write to media pipe 0:
>>> Resource temporarily unavailable
>>> heartbeat[9527]: 2008/01/11_09:54:27 ERROR: Shutting down.
>>> heartbeat[9527]: 2008/01/11_09:54:27 ERROR: Cannot write to media pipe 0:
>>> Resource temporarily unavailable
>>> heartbeat[9527]: 2008/01/11_09:54:27 ERROR: Shutting down.
>>>
>>> The last messages repeat for a very long time then most daemons
>>> eventually stop.
_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
