Hi. Please see http://www.gossamer-threads.com/lists/linuxha/users/68406?do=post_view_threaded#68406
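For context on the log text itself: "Message too long" is the standard description of the EMSGSIZE errno, which a send call returns when a single datagram is larger than the socket can transmit. The sketch below only illustrates that errno with a plain UDP socket; it is not heartbeat's code and says nothing about why heartbeat builds packets that large. The function name try_send, the loopback address, port 9, and the 70000-byte payload are all assumptions picked for the demonstration.

```python
import errno
import socket


def try_send(payload_size: int) -> str:
    """Send one UDP datagram of payload_size bytes; return 'ok' or the errno name."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        try:
            # 127.0.0.1:9 is an arbitrary target (assumption); nothing needs to listen there.
            sock.sendto(b"x" * payload_size, ("127.0.0.1", 9))
        except OSError as exc:
            # Map the numeric errno to its symbolic name, e.g. EMSGSIZE.
            return errno.errorcode.get(exc.errno, str(exc.errno))
    return "ok"


if __name__ == "__main__":
    # 70000 bytes exceeds the 65507-byte maximum payload of a UDP/IPv4 datagram.
    for size in (1000, 70000):
        print(size, try_send(size))
```

On Linux the 70000-byte send should be rejected with EMSGSIZE before anything reaches the wire, which is the same condition the heartbeat log lines report.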
2011/9/22 Claus Wimmer <[email protected]>

> Hello,
>
> I have tried to build a four-node cluster with heartbeat and pacemaker.
> Everything works as long as the cluster consists of 2 nodes. With 3 or
> 4 nodes, error messages suddenly appear during a configuration change:
>
> Sep 06 08:16:56 secomat4 heartbeat: [15956]: ERROR: glib: Unable to send [-1] ucast packet: Message too long
> Sep 06 08:16:56 secomat4 heartbeat: [15954]: ERROR: glib: Unable to send [-1] ucast packet: Message too long
> Sep 06 08:16:56 secomat4 heartbeat: [15952]: ERROR: glib: Unable to send [-1] ucast packet: Message too long
> Sep 06 08:16:56 secomat4 heartbeat: [15952]: ERROR: write_child: write failure on ucast hb1.: Message too long
> Sep 06 08:16:56 secomat4 heartbeat: [15954]: ERROR: write_child: write failure on ucast hb1.: Message too long
> Sep 06 08:16:56 secomat4 heartbeat: [15956]: ERROR: write_child: write failure on ucast hb1.: Message too long
> Sep 06 08:16:56 secomat4 heartbeat: [15958]: ERROR: glib: Unable to send [-1] ucast packet: Message too long
> Sep 06 08:16:56 secomat4 heartbeat: [15958]: ERROR: write_child: write failure on ucast hb1.: Message too long
> Sep 06 08:16:56 secomat4 heartbeat: [15960]: ERROR: glib: Unable to send [-1] ucast packet: Message too long
> Sep 06 08:16:56 secomat4 heartbeat: [15960]: ERROR: write_child: write failure on ucast hb2.: Message too long
> Sep 06 08:16:56 secomat4 heartbeat: [15966]: ERROR: glib: Unable to send [-1] ucast packet: Message too long
> Sep 06 08:16:56 secomat4 heartbeat: [15962]: ERROR: glib: Unable to send [-1] ucast packet: Message too long
> Sep 06 08:16:56 secomat4 heartbeat: [15964]: ERROR: glib: Unable to send [-1] ucast packet: Message too long
> Sep 06 08:16:56 secomat4 heartbeat: [15962]: ERROR: write_child: write failure on ucast hb2.: Message too long
> Sep 06 08:16:56 secomat4 heartbeat: [15966]: ERROR: write_child: write failure on ucast hb2.: Message too long
> Sep 06 08:16:56 secomat4 heartbeat: [15964]: ERROR: write_child: write failure on ucast hb2.: Message too long
> ...
>
> The result is that the nodes lose their intra-cluster connection. This
> seems to be independent of the cluster communication protocol. The example
> error messages show up with ucast (currently I use bcast again, see ha.cf).
> I have seen the same error messages with bcast (bcast with compression
> didn't work either). With mcast it was slightly different: I couldn't get
> a single node up and running, so I had to switch back to bcast.
>
>
> Some information about the setup:
>
> ha.cf:
> use_logd on
> udpport 694
> keepalive 2
> warntime 10
> deadtime 15
> initdead 90
> bcast hb1 hb2
> autojoin none
> node secomat1 secomat2 secomat3 secomat4
> # debug 1
> crm yes
> apiauth stonith-ng uid=root
>
>
> crm:
> pacemaker
>
>
> RPMs:
> secomat4:~ # rpm -qi heartbeat
> Name        : heartbeat                     Relocations: (not relocatable)
> Version     : 3.0.3                         Vendor: (none)
> Release     : 2.18                          Build Date: Wed Sep 29 18:06:13 2010
> Install Date: Wed Apr 13 15:25:12 2011      Build Host: f13.beekhof.net
> Group       : Productivity/Clustering/HA    Source RPM: heartbeat-3.0.3-2.18.src.rpm
> Size        : 10221126                      License: GPL v2 only; LGPL v2.1 or later
> Signature   : (none)
> URL         : http://linux-ha.org/
> Summary     : Messaging and membership subsystem for High-Availability Linux
> Description : ...
>
>
> secomat4:~ # rpm -qi pacemaker
> Name        : pacemaker                     Relocations: (not relocatable)
> Version     : 1.1.5                         Vendor: (none)
> Release     : 1.1                           Build Date: Mon Feb 14 17:34:14 2011
> Install Date: Wed Apr 13 15:25:15 2011      Build Host: f13.beekhof.net
> Group       : Productivity/Clustering/HA    Source RPM: pacemaker-1.1.5-1.1.src.rpm
> Size        : 13626153                      License: GPLv2+ and LGPLv2+
> Signature   : (none)
> URL         : http://www.clusterlabs.org
> Summary     : Scalable High-Availability cluster resource manager
> Description : ...
>
>
> Hardware:
> 2 Intel(R) Xeon(R) CPU X5650 @ 2.67GHz, 2660 MHz
> 6 cores
> 24 processors
>
>
> OS:
> secomat4:~ # uname -a
> Linux secomat4 2.6.34.10-0.2-default #1 SMP 2011-07-20 18:48:56 +0200 x86_64 x86_64 x86_64 GNU/Linux
>
>
> Best regards
> Claus
> _______________________________________________
> Linux-HA mailing list
> [email protected]
> http://lists.linux-ha.org/mailman/listinfo/linux-ha
> See also: http://linux-ha.org/ReportingProblems
