Hi,

On Mon, Jan 14, 2008 at 11:58:17AM -0500, Gary Schlachter wrote:
> Dejan,
>
>       I started there. However, the problem I had was that I could not 
> install 2.1.3 on Fedora Core 1 since it needed later versions of other 
> RPMs.  I can make 2.1.3 on FC1 but when I try to package heartbeat, I get 
> missing libnet-devel, openhpi-devel, gnutls-devel, OpenIPMI-devel.  Is 
> there a way around this?

You'd have to find those packages for your distribution. BTW,
isn't Fedora Core 1 a bit old too?
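
To see which of the build dependencies are actually absent before
hunting for them, something like the following might help. It is only
a minimal sketch: it assumes Python 3 and the rpm command are available
on the build host (probably not the case on a stock FC1 install), and
the names BUILD_DEPS, is_installed and missing_packages are made up
for illustration, not part of heartbeat or rpm.

```python
import subprocess

# Packages the heartbeat packaging step reported as missing (from the message above).
BUILD_DEPS = ["libnet-devel", "openhpi-devel", "gnutls-devel", "OpenIPMI-devel"]


def is_installed(package):
    """Return True if `rpm -q` exits with status 0 for the package."""
    result = subprocess.run(
        ["rpm", "-q", package],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def missing_packages(packages, check=is_installed):
    """Return the packages for which `check` reports False, in input order.

    `check` is injectable so the logic can be exercised without rpm.
    """
    return [pkg for pkg in packages if not check(pkg)]
```

Calling missing_packages(BUILD_DEPS) on the build host lists whatever
still has to be found or rebuilt for the distribution.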

Thanks,

Dejan

>
> Gary
>
> Dejan Muhamedagic wrote:
>> Hi,
>>
>> On Fri, Jan 11, 2008 at 10:22:48AM -0500, Gary Schlachter wrote:
>>   
>>>    I have a problem with heartbeat dying.  I have a 3 node cluster 
>>> running HA 2.0.8 on Fedora Core 1.  They are providing a single IP 
>>> address resource.  They are using eth0 as the heartbeat mechanism.  If I 
>>> disconnect the eth0 cable from the node which is providing the IP 
>>> address, one of the other nodes correctly begins providing it.  However, 
>>> shortly after disconnecting the eth0 cable, the heartbeat process (and 
>>> others) die.  The     
>>
>> This was fixed a few months ago. The fix is in the 2.1.3
>> release. Could you please use the new release?
>>
>> Thanks,
>>
>> Dejan
>>
>>   
>>> key area in the ha-debug log looks like the following:
>>>
>>> pengine[4293]: 2008/01/11_09:50:22 info: determine_online_status: Node 
>>> loneranger.us.big.net is online
>>> pengine[4293]: 2008/01/11_09:50:22 info: native_print: SharedIP    
>>> (heartbeat::ocf:IPaddr):    Started loneranger.us.big.net
>>> pengine[4293]: 2008/01/11_09:50:22 notice: StopRsc:   
>>> loneranger.us.big.net    Stop SharedIP
>>> crmd[9543]: 2008/01/11_09:50:22 info: do_state_transition: 
>>> loneranger.us.big.net: State transition S_POLICY_ENGINE 
>>> ->S_TRANSITION_ENGINE [input=I_PE_SUCCESS cause=C_IPC_MESSAGE 
>>> origin=route_message ]
>>> pengine[4293]: 2008/01/11_09:50:22 info: process_pe_message: Transition 
>>> 0: PEngine Input stored in: /var/lib/heartbeat/pengine/pe-input-137.bz2
>>> tengine[4292]: 2008/01/11_09:50:22 info: unpack_graph: Unpacked 
>>> transition 0: 1 actions in 1 synapses
>>> tengine[4292]: 2008/01/11_09:50:22 info: send_rsc_command: Initiating 
>>> action 3: SharedIP_stop_0 on loneranger.us.big.net
>>> crmd[9543]: 2008/01/11_09:50:22 info: do_lrm_rsc_op: Performing 
>>> op=SharedIP_stop_0 key=3:0:994066a9-4cae-49a4-abad-37f3e0b84b3e)
>>> IPaddr[4300]:    2008/01/11_09:50:22 INFO: /sbin/ifconfig eth0:0 
>>> 10.1.2.50 down
>>> lrmd[9540]: 2008/01/11_09:50:22 info: RA output: (SharedIP:stop:stderr) 
>>> SIOCDELRT: No such process
>>>
>>> crmd[9543]: 2008/01/11_09:50:22 info: process_lrm_event: LRM operation 
>>> SharedIP_stop_0 (call=4, rc=0) complete
>>> cib[9539]: 2008/01/11_09:50:22 info: cib_diff_notify: Update (client: 
>>> 9543, call:32): 0.30.317 -> 0.30.318 (ok)
>>> cib[4315]: 2008/01/11_09:50:22 info: write_cib_contents: Wrote version 
>>> 0.30.318 of the CIB to disk (digest: ad7329b3cddc6a9bbd96deb332a3d08f)
>>> tengine[4292]: 2008/01/11_09:50:22 info: te_update_diff: Processing diff 
>>> (cib_update): 0.30.317 -> 0.30.318
>>> tengine[4292]: 2008/01/11_09:50:22 info: match_graph_event: Action 
>>> SharedIP_stop_0 (3) confirmed on c8608d41-66b2-4115-9043-4a8423b0d562
>>> tengine[4292]: 2008/01/11_09:50:22 info: run_graph: Transition 0: 
>>> (Complete=1, Pending=0, Fired=0, Skipped=0, Incomplete=0)
>>> tengine[4292]: 2008/01/11_09:50:22 info: notify_crmd: Transition 0 
>>> status: te_complete - <null>
>>> crmd[9543]: 2008/01/11_09:50:22 info: do_state_transition: 
>>> loneranger.us.big.net: State transition S_TRANSITION_ENGINE -> S_IDLE [ 
>>> input=I_TE_SUCCESS cause=C_IPC_MESSAGE origin=route_message ]
>>> heartbeat[9527]: 2008/01/11_09:54:27 ERROR: Cannot write to media pipe 0: 
>>> Resource temporarily unavailable
>>> heartbeat[9527]: 2008/01/11_09:54:27 ERROR: Shutting down.
>>> heartbeat[9527]: 2008/01/11_09:54:27 ERROR: Cannot write to media pipe 0: 
>>> Resource temporarily unavailable
>>> heartbeat[9527]: 2008/01/11_09:54:27 ERROR: Shutting down.
>>> heartbeat[9527]: 2008/01/11_09:54:27 ERROR: Cannot write to media pipe 0: 
>>> Resource temporarily unavailable
>>> heartbeat[9527]: 2008/01/11_09:54:27 ERROR: Shutting down.
>>>
>>> The last messages repeat for a very long time then most daemons 
>>> eventually stop.
>>>
>>>
>>> _______________________________________________
>>> Linux-HA mailing list
>>> [email protected]
>>> http://lists.linux-ha.org/mailman/listinfo/linux-ha
>>> See also: http://linux-ha.org/ReportingProblems
>>>     
>>   
>
