On Feb 5, 2008, at 12:30 AM, Amos Shapira wrote:

On Feb 4, 2008 7:32 PM, Andrew Beekhof <[EMAIL PROTECTED]> wrote:

Crashing?
What was the subject?  I don't recall this.


I couldn't make CentOS 5's 2.1.3 talk to another node when configuring with
the version 2 style CRM. At some stage I learned that not all programs
manage to start and stay up. Later I also found (I think) that "stonith -h" or
something like it always bombs on some interrupt.

I don't remember all the details but the thread where I asked about this is
archived in
http://lists.community.tummy.com/pipermail/linux-ha/2007-November/029068.html


Sorry, I must have missed this thread.

I eventually switched to using the old-style haresources config file and
things seem to work OK with that.

24 heartbeat[17482]: 2007/11/29_07:12:41 info: Status update for node drbd01.test.spammatters.local: status up
25 heartbeat[17482]: 2007/11/29_07:13:45 info: all clients are now paused

Line 25 is sure to be part of the problem, but I also don't see any evidence that heartbeat even tried to start the crm processes.

This is also interesting:
13 heartbeat[17482]: 2007/11/29_07:12:40 info: glib: ucast: write socket priority set to IPTOS_LOWDELAY on eth0
14 heartbeat[17482]: 2007/11/29_07:12:40 info: glib: ucast: bound send socket to device: eth0
15 heartbeat[17482]: 2007/11/29_07:12:40 info: glib: ucast: bound receive socket to device: eth0
16 heartbeat[17482]: 2007/11/29_07:12:40 info: glib: ucast: started on port 695 interface eth0 to 192.168.0.248
17 heartbeat[17482]: 2007/11/29_07:12:40 info: glib: ucast: write socket priority set to IPTOS_LOWDELAY on eth0
18 heartbeat[17482]: 2007/11/29_07:12:40 info: glib: ucast: bound send socket to device: eth0
19 heartbeat[17482]: 2007/11/29_07:12:40 info: glib: ucast: bound receive socket to device: eth0
20 heartbeat[17482]: 2007/11/29_07:12:40 info: glib: ucast: started on port 695 interface eth0 to 192.168.0.249

I wonder if the fact that there are two IPs on eth0 could have been causing problems.
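
If it helps to check this against a longer log, here is a rough Python sketch that groups the "ucast: started" lines by interface and port. It assumes the log format matches the excerpt quoted above; the function name ucast_peers and the sample text are only for illustration and are not part of heartbeat.

```python
import re
from collections import defaultdict

# Two lines copied from the log excerpt above (line-number prefixes dropped).
LOG = """\
heartbeat[17482]: 2007/11/29_07:12:40 info: glib: ucast: started on port 695 interface eth0 to 192.168.0.248
heartbeat[17482]: 2007/11/29_07:12:40 info: glib: ucast: started on port 695 interface eth0 to 192.168.0.249
"""

# Matches the "started on port N interface IF to ADDR" part of a ucast line.
UCAST_RE = re.compile(r"ucast: started on port (\d+) interface (\S+) to (\S+)")


def ucast_peers(log_text):
    """Return {(interface, port): [destination addresses]} for ucast links."""
    peers = defaultdict(list)
    for line in log_text.splitlines():
        match = UCAST_RE.search(line)
        if match:
            port, iface, addr = match.groups()
            peers[(iface, int(port))].append(addr)
    return dict(peers)


if __name__ == "__main__":
    for (iface, port), addrs in ucast_peers(LOG).items():
        print(iface, port, addrs)
```

For the excerpt above, both 192.168.0.248 and 192.168.0.249 end up under eth0 on port 695, which is the situation I am wondering about.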

Oh, and the reason crm_mon was taking so long is related to your choice of deadtime, which was quite high.



Thanks,

--Amos
_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems

