Hi.
Please see
http://www.gossamer-threads.com/lists/linuxha/users/68406?do=post_view_threaded#68406
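
As a side note on the error text itself: "Message too long" is the standard
message for EMSGSIZE, which a UDP socket returns when a single datagram is
larger than the protocol allows. The sketch below only reproduces that errno
on a plain socket; it is not heartbeat code and not a diagnosis of this
cluster. The helper name try_send, the loopback address, and the payload
sizes are assumptions made for illustration.

```python
import errno
import socket


def try_send(payload_size, addr=("127.0.0.1", 694)):
    """Send one UDP datagram of payload_size bytes.

    Returns None on success, or the errno reported by the OS on failure.
    The address is a placeholder (loopback, heartbeat's default UDP port).
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.sendto(b"x" * payload_size, addr)
        return None
    except OSError as exc:
        return exc.errno
    finally:
        sock.close()


if __name__ == "__main__":
    # 70000 bytes exceeds the 65507-byte limit for a UDP payload over IPv4.
    for size in (1024, 70000):
        err = try_send(size)
        print(size, "ok" if err is None else errno.errorcode.get(err, err))
```

If the larger send reports EMSGSIZE, that is the same errno the heartbeat
log lines are printing.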

2011/9/22 Claus Wimmer <[email protected]>

> Hello,
>
> I have tried to build a four-node cluster with heartbeat and pacemaker.
> Everything works as long as the cluster consists of 2 nodes. With 3 or
> 4 nodes, error messages suddenly appear during a configuration change:
>
> Sep 06 08:16:56 secomat4 heartbeat: [15956]: ERROR: glib: Unable to send [-1] ucast packet: Message too long
> Sep 06 08:16:56 secomat4 heartbeat: [15954]: ERROR: glib: Unable to send [-1] ucast packet: Message too long
> Sep 06 08:16:56 secomat4 heartbeat: [15952]: ERROR: glib: Unable to send [-1] ucast packet: Message too long
> Sep 06 08:16:56 secomat4 heartbeat: [15952]: ERROR: write_child: write failure on ucast hb1.: Message too long
> Sep 06 08:16:56 secomat4 heartbeat: [15954]: ERROR: write_child: write failure on ucast hb1.: Message too long
> Sep 06 08:16:56 secomat4 heartbeat: [15956]: ERROR: write_child: write failure on ucast hb1.: Message too long
> Sep 06 08:16:56 secomat4 heartbeat: [15958]: ERROR: glib: Unable to send [-1] ucast packet: Message too long
> Sep 06 08:16:56 secomat4 heartbeat: [15958]: ERROR: write_child: write failure on ucast hb1.: Message too long
> Sep 06 08:16:56 secomat4 heartbeat: [15960]: ERROR: glib: Unable to send [-1] ucast packet: Message too long
> Sep 06 08:16:56 secomat4 heartbeat: [15960]: ERROR: write_child: write failure on ucast hb2.: Message too long
> Sep 06 08:16:56 secomat4 heartbeat: [15966]: ERROR: glib: Unable to send [-1] ucast packet: Message too long
> Sep 06 08:16:56 secomat4 heartbeat: [15962]: ERROR: glib: Unable to send [-1] ucast packet: Message too long
> Sep 06 08:16:56 secomat4 heartbeat: [15964]: ERROR: glib: Unable to send [-1] ucast packet: Message too long
> Sep 06 08:16:56 secomat4 heartbeat: [15962]: ERROR: write_child: write failure on ucast hb2.: Message too long
> Sep 06 08:16:56 secomat4 heartbeat: [15966]: ERROR: write_child: write failure on ucast hb2.: Message too long
> Sep 06 08:16:56 secomat4 heartbeat: [15964]: ERROR: write_child: write failure on ucast hb2.: Message too long
> ...
>
> As a result, the nodes lose their intra-cluster connection. This seems to be
> independent of the cluster communication protocol. The example error messages
> are from ucast (I am currently using bcast again, see ha.cf). I have seen
> the same error messages with bcast (bcast with compression didn't work
> either). With mcast it was slightly different: I couldn't get a single node up
> and running, so I had to switch back to bcast.
>
>
> Some information about the setup:
>
> ha.cf:
> use_logd on
> udpport 694
> keepalive 2
> warntime 10
> deadtime 15
> initdead 90
> bcast hb1 hb2
> autojoin none
> node secomat1 secomat2 secomat3 secomat4
> # debug 1
> crm yes
> apiauth stonith-ng      uid=root
>
>
> crm:
> pacemaker
>
>
> RPMs:
> secomat4:~ # rpm -qi heartbeat
> Name        : heartbeat                    Relocations: (not relocatable)
> Version     : 3.0.3                             Vendor: (none)
> Release     : 2.18                          Build Date: Wed Sep 29 18:06:13
> 2010
> Install Date: Wed Apr 13 15:25:12 2011         Build Host: f13.beekhof.net
> Group       : Productivity/Clustering/HA    Source RPM:
> heartbeat-3.0.3-2.18.src.rpm
> Size        : 10221126                         License: GPL v2 only; LGPL
> v2.1 or later
> Signature   : (none)
> URL         : http://linux-ha.org/
> Summary     : Messaging and membership subsystem for High-Availability
> Linux
> Description : ...
>
>
> secomat4:~ # rpm -qi pacemaker
> Name        : pacemaker                    Relocations: (not relocatable)
> Version     : 1.1.5                             Vendor: (none)
> Release     : 1.1                           Build Date: Mon Feb 14 17:34:14
> 2011
> Install Date: Wed Apr 13 15:25:15 2011         Build Host: f13.beekhof.net
> Group       : Productivity/Clustering/HA    Source RPM:
> pacemaker-1.1.5-1.1.src.rpm
> Size        : 13626153                         License: GPLv2+ and LGPLv2+
> Signature   : (none)
> URL         : http://www.clusterlabs.org
> Summary     : Scalable High-Availability cluster resource manager
> Description : ...
>
>
> Hardware:
> 2  Intel(R) Xeon(R) CPU           X5650  @ 2.67GHz, 2660 MHz
> 6  cores
> 24 processors
>
>
> OS:
> secomat4:~ # uname -a
> Linux secomat4 2.6.34.10-0.2-default #1 SMP 2011-07-20 18:48:56 +0200
> x86_64 x86_64 x86_64 GNU/Linux
>
>
> Best regards
> Claus
> _______________________________________________
> Linux-HA mailing list
> [email protected]
> http://lists.linux-ha.org/mailman/listinfo/linux-ha
> See also: http://linux-ha.org/ReportingProblems