On 29/05/2013, at 2:37 AM, Greg Woods <[email protected]> wrote:

> I have two clusters that are both running CentOS 5.6 and
> heartbeat-3.0.3-2.3.el5 (from the clusterlabs repo). They are running
> slightly different pacemaker versions (pacemaker-1.0.9.1-1.15.el5 on the
> first one and pacemaker-1.0.12-1.el5 on the other). They both have
> identical ha.cf files except that the bcast device names are different
> (and they are correct for each case, I checked), like this:
> 
> udpport 694
> bcast eth2
> bcast eth1
> use_logd off
> logfile /var/log/halog
> debugfile /var/log/hadebug
> debug 1
> keepalive 2
> deadtime 15
> initdead 60
> node vmd1.ucar.edu
> node vmd2.ucar.edu
> auto_failback off
> respawn hacluster /usr/lib64/heartbeat/ipfail
> crm respawn

I don't know about the rest, but definitely do not use both ipfail and crm.
Pick one :)
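
To spot the overlap quickly, here is a minimal sketch of a check, not a
definitive tool. The function name conflicting_directives is made up for
illustration, and treating any crm value other than off/no/false/0 as
"enabled" is an assumption, not something taken from the heartbeat
documentation:

```python
# Hypothetical helper: flags an ha.cf that both respawns ipfail and enables crm.
SAMPLE = """\
udpport 694
bcast eth2
bcast eth1
auto_failback off
respawn hacluster /usr/lib64/heartbeat/ipfail
crm respawn
"""

# Assumption: these values mean "crm disabled"; everything else means enabled.
CRM_DISABLED = {"off", "no", "false", "0"}


def conflicting_directives(ha_cf_text):
    """Return (uses_ipfail, uses_crm) for the given ha.cf contents."""
    uses_ipfail = False
    uses_crm = False
    for raw in ha_cf_text.splitlines():
        # Drop comments and surrounding whitespace.
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        directive = fields[0].lower()
        if directive == "respawn" and fields[-1].endswith("/ipfail"):
            uses_ipfail = True
        elif directive == "crm" and len(fields) > 1:
            if fields[1].lower() not in CRM_DISABLED:
                uses_crm = True
    return uses_ipfail, uses_crm


uses_ipfail, uses_crm = conflicting_directives(SAMPLE)
if uses_ipfail and uses_crm:
    print("ha.cf enables both ipfail and crm; keep only one of them")
```

To run it against the real file, pass in
open("/etc/ha.d/ha.cf").read() (adjust the path if your ha.cf lives
elsewhere).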

> 
> On one of them (which may or may not coincidentally be having some
> problems), I get these messages logged about every 2 seconds
> in /var/log/halog; on the other I don't see them:
> 
> May 25 15:59:17 vmx1.ucar.edu heartbeat: [5689]: ERROR: MSG: Dumping
> message with 10 fields
> May 25 15:59:17 vmx1.ucar.edu heartbeat: [5689]: ERROR: MSG[0] :
> [t=NS_ackmsg]
> May 25 15:59:17 vmx1.ucar.edu heartbeat: [5689]: ERROR: MSG[1] :
> [dest=vmx2.ucar.edu]
> May 25 15:59:17 vmx1.ucar.edu heartbeat: [5689]: ERROR: MSG[2] :
> [ackseq=3a0]
> May 25 15:59:17 vmx1.ucar.edu heartbeat: [5689]: ERROR: MSG[3] :
> [(1)destuuid=0x5ceb280(37 28)]
> May 25 15:59:17 vmx1.ucar.edu heartbeat: [5689]: ERROR: MSG[4] :
> [src=vmx1.ucar.edu]
> May 25 15:59:17 vmx1.ucar.edu heartbeat: [5689]: ERROR: MSG[5] :
> [(1)srcuuid=0x5ceb390(36 27)]
> May 25 15:59:17 vmx1.ucar.edu heartbeat: [5689]: ERROR: MSG[6] :
> [hg=4c97c17a]
> May 25 15:59:17 vmx1.ucar.edu heartbeat: [5689]: ERROR: MSG[7] :
> [ts=51a13435]
> May 25 15:59:17 vmx1.ucar.edu heartbeat: [5689]: ERROR: MSG[8] : [ttl=3]
> May 25 15:59:17 vmx1.ucar.edu heartbeat: [5689]: ERROR: MSG[9] : [auth=1
> 23b556bcb61a08abecf87cb6411c62e62cf99f0d]
> May 25 15:59:17 vmx1.ucar.edu heartbeat: [5689]: ERROR: MSG: Dumping
> message with 12 fields
> May 25 15:59:17 vmx1.ucar.edu heartbeat: [5689]: ERROR: MSG[0] :
> [t=status]
> May 25 15:59:17 vmx1.ucar.edu heartbeat: [5689]: ERROR: MSG[1] :
> [st=active]
> May 25 15:59:17 vmx1.ucar.edu heartbeat: [5689]: ERROR: MSG[2] :
> [dt=3a98]
> May 25 15:59:17 vmx1.ucar.edu heartbeat: [5689]: ERROR: MSG[3] :
> [protocol=1]
> May 25 15:59:17 vmx1.ucar.edu heartbeat: [5689]: ERROR: MSG[4] :
> [src=vmx1.ucar.edu]
> May 25 15:59:17 vmx1.ucar.edu heartbeat: [5689]: ERROR: MSG[5] :
> [(1)srcuuid=0x5ceb390(36 27)]
> May 25 15:59:17 vmx1.ucar.edu heartbeat: [5689]: ERROR: MSG[6] :
> [seq=17b]
> May 25 15:59:17 vmx1.ucar.edu heartbeat: [5689]: ERROR: MSG[7] :
> [hg=4c97c17a]
> May 25 15:59:17 vmx1.ucar.edu heartbeat: [5689]: ERROR: MSG[8] :
> [ts=51a13435]
> May 25 15:59:17 vmx1.ucar.edu heartbeat: [5689]: ERROR: MSG[9] :
> [ld=0.27 0.41 0.26 1/315 19183]
> May 25 15:59:17 vmx1.ucar.edu heartbeat: [5689]: ERROR: MSG[10] :
> [ttl=3]
> May 25 15:59:17 vmx1.ucar.edu heartbeat: [5689]: ERROR: MSG[11] :
> [auth=1 3d3da4df831636f7c274395041ffb49bbf215170]
> 
> The questions are what do these messages actually mean, why is one
> cluster logging them and not the other, and is this something I should
> be worried about?
> 
> Thanks for any info,
> --Greg
> 
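
This does not explain why heartbeat is dumping the messages, but some of
the fields can be read with a rough sketch like the one below. It assumes
that ts, dt, seq and ackseq are hexadecimal, that ts is a Unix timestamp,
and that dt is the deadtime in milliseconds. Those are guesses from the
values themselves: ts=51a13435 lands on the same second as the log line
(in UTC), and dt=3a98 is 15000, which matches "deadtime 15". The function
name decode_fields is made up for illustration:

```python
from datetime import datetime, timezone


def decode_fields(fields):
    """Decode a few hex-looking fields from a dumped heartbeat message.

    All interpretations here are assumptions inferred from the sample
    values, not taken from the heartbeat source or documentation.
    """
    decoded = {}
    if "ts" in fields:
        # Assumed: hexadecimal Unix timestamp.
        seconds = int(fields["ts"], 16)
        decoded["ts_utc"] = datetime.fromtimestamp(
            seconds, tz=timezone.utc
        ).isoformat()
    if "dt" in fields:
        # Assumed: deadtime in milliseconds, hexadecimal.
        decoded["dt_ms"] = int(fields["dt"], 16)
    for key in ("seq", "ackseq"):
        if key in fields:
            # Assumed: hexadecimal sequence counters.
            decoded[key] = int(fields[key], 16)
    return decoded


# Values copied from the two dumped messages above.
sample = {"ts": "51a13435", "dt": "3a98", "seq": "17b", "ackseq": "3a0"}
print(decode_fields(sample))
```
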
> 
> _______________________________________________
> Linux-HA mailing list
> [email protected]
> http://lists.linux-ha.org/mailman/listinfo/linux-ha
> See also: http://linux-ha.org/ReportingProblems
