[Linux-HA] ERROR: Message hist queue is filling up

Scott Mann Sun, 25 Nov 2007 11:50:22 -0800

Hi,

I started getting this message on one system in a two-node heartbeat cluster
after installing 2.1.2 from the FC8 rpms (yum install heartbeat*, so both
heartbeat and heartbeat-devel). I installed the rpms on two freshly installed
FC8 systems, along with libnet and glib-devel. I did basically the same thing
a few weeks ago when these systems were running FC7 (but got hb 2.0.8 via the
rpms).


I found an earlier email from Alan R regarding this error and 2.0.5, but could
find no resolution. I'm certainly a newbie with this product and it may be
something I'm doing. I've written an app against the heartbeat API that seems
to be working on 2.0.8. It uses "azClient" as its "signon" name. The problem
didn't appear on wiley-coyote until after I'd started the app (although it
could be that I simply did not see the messages until after the app started).
The problem DID NOT and still does not appear on the other node, beauregard. I
ran the app there as well, and it signed on properly, etc.

Having said all that, here are the messages in the log file when heartbeat
starts:

Nov 25 12:31:59 wiley-coyote heartbeat: [26165]: info: Version 2 support: no
Nov 25 12:31:59 wiley-coyote heartbeat: [26165]: WARN: Logging daemon is disabled --enabling logging daemon is recommended
Nov 25 12:31:59 wiley-coyote heartbeat: [26165]: info: **************************
Nov 25 12:31:59 wiley-coyote heartbeat: [26165]: info: Configuration validated. Starting heartbeat 2.1.2
Nov 25 12:31:59 wiley-coyote heartbeat: [26166]: info: heartbeat: version 2.1.2
Nov 25 12:31:59 wiley-coyote heartbeat: [26166]: info: Heartbeat generation: 1196015782
Nov 25 12:31:59 wiley-coyote heartbeat: [26166]: info: G_main_add_TriggerHandler: Added signal manual handler
Nov 25 12:31:59 wiley-coyote heartbeat: [26166]: info: G_main_add_TriggerHandler: Added signal manual handler
Nov 25 12:31:59 wiley-coyote heartbeat: [26166]: info: Removing /var/run/heartbeat/rsctmp failed, recreating.
Nov 25 12:31:59 wiley-coyote heartbeat: [26166]: info: glib: ucast: write socket priority set to IPTOS_LOWDELAY on eth0
Nov 25 12:31:59 wiley-coyote heartbeat: [26166]: info: glib: ucast: bound send socket to device: eth0
Nov 25 12:31:59 wiley-coyote heartbeat: [26166]: info: glib: ucast: bound receive socket to device: eth0
Nov 25 12:31:59 wiley-coyote heartbeat: [26166]: info: glib: ucast: started on port 694 interface eth0 to 192.168.0.11
Nov 25 12:31:59 wiley-coyote heartbeat: [26166]: info: G_main_add_SignalHandler: Added signal handler for signal 17
Nov 25 12:31:59 wiley-coyote heartbeat: [26166]: info: Local status now set to: 'up'
Nov 25 12:32:00 wiley-coyote heartbeat: [26166]: info: Link beauregard:eth0 up.
Nov 25 12:32:00 wiley-coyote heartbeat: [26166]: info: Status update for node beauregard: status active
Nov 25 12:32:00 wiley-coyote harc[26173]: info: Running /etc/ha.d/rc.d/status status
Nov 25 12:33:04 wiley-coyote heartbeat: [26166]: info: all clients are now paused
Nov 25 12:33:37 wiley-coyote heartbeat: [26166]: ERROR: Message hist queue is filling up (151 messages in queue)
<above ERROR message continues to repeat>
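
In case it helps anyone see how fast the queue grows, here is a minimal Python
sketch that pulls the reported queue depth out of log text like the above. It
is not part of heartbeat; the names QUEUE_RE and queue_depths are hypothetical,
and the pattern is written only from the one error line quoted here.

```python
import re

# Pattern built from the single ERROR line quoted in this post (an assumption
# about the general message shape). \s+ tolerates mail-wrapped log lines.
QUEUE_RE = re.compile(
    r"Message\s+hist\s+queue\s+is\s+filling\s+up\s+"
    r"\((\d+)\s+messages\s+in\s+queue\)"
)


def queue_depths(log_text):
    """Return the queue depth reported by each matching error, in order."""
    return [int(n) for n in QUEUE_RE.findall(log_text)]


# The error as it appears in this mail, including the line wrap.
sample = (
    "Nov 25 12:33:37 wiley-coyote heartbeat: [26166]: ERROR: Message hist queue is \n"
    "filling up (151 messages in queue)\n"
)
print(queue_depths(sample))
```

Feeding it the whole log file would give one number per repeat of the error,
which should show whether the queue keeps growing or levels off.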

It is also worth noting that when I execute "cl_status nodestatus wiley-coyote" 
on wiley-coyote I get:

cl_status[26192]: 2007/11/25_12:33:22 ERROR: Cannot signon with heartbeat
cl_status[26192]: 2007/11/25_12:33:22 ERROR: REASON: hb_api_signon: Can't initiate connection  to heartbeat

which seems to indicate a problem with the socket? Or pipe? BTW, this command 
works correctly on beauregard, returning "alive" for beauregard and "dead" for 
wiley-coyote.

Anyway, please point me to whatever you think appropriate for me to look at 
(especially source as I'd like to learn more). My config file is simple and is 
below (comments mostly removed). Also, the only resource I'm managing is an IP 
address. I'm not using CRM, so I've got an haresources file which contains 
exactly:

wiley-coyote    192.168.0.98/24/eth0
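
To spell out how I read that line (preferred node, then the address as
IP/netmask bits/interface), here is a small Python sketch that splits it into
fields. It assumes exactly this three-part form; parse_haresources_line is a
hypothetical helper, not anything shipped with heartbeat.

```python
import ipaddress


def parse_haresources_line(line):
    """Split a 'node IP/prefix/interface' line into its fields.

    Assumes the simple form used in this post; other resource syntaxes
    are not handled.
    """
    node, resource = line.split()
    ip, prefix, iface = resource.split("/")
    # ip_interface() raises ValueError if the address or prefix is malformed.
    addr = ipaddress.ip_interface(f"{ip}/{prefix}")
    return {
        "node": node,
        "ip": str(addr.ip),
        "network": str(addr.network),
        "interface": iface,
    }


entry = parse_haresources_line("wiley-coyote    192.168.0.98/24/eth0")
print(entry)
```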


Any help would be greatly appreciated!
TIA

Scott Mann
Sr Software Engineer
Aztek Networks

ha.cf (identical on both systems except for the change in ucast)
----------------------------------------------------------------
#       Facility to use for syslog()/logger 
#
logfacility     local0
#
#
keepalive 2
#
#
deadtime 30
#
#
warntime 10
#
#
initdead 120
#
#
udpport 694
#
# beauregard
ucast eth0 192.168.0.11
# wiley-coyote
#ucast eth0 192.168.0.31
#
#
#auto_failback on

auto_failback off

#

node wiley-coyote
node beauregard
#
#apiauth client-name gid=gidlist uid=uidlist
#apiauth ipfail gid=haclient uid=hacluster
apiauth azClient uid=root,smann

#
#compression_threshold 2
crm no


<end>
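
As a sanity check on the timing values above, here is a minimal Python sketch
that reads the active directives and compares them. The helper names are
hypothetical, the values are assumed to be whole seconds, and the two checks
(keepalive < warntime < deadtime, initdead at least twice deadtime) are rules
of thumb as I understand them, not something heartbeat enforces.

```python
# Active (uncommented) directives copied from the ha.cf in this post.
SAMPLE = """\
logfacility     local0
keepalive 2
deadtime 30
warntime 10
initdead 120
udpport 694
ucast eth0 192.168.0.11
auto_failback off
node wiley-coyote
node beauregard
apiauth azClient uid=root,smann
crm no
"""


def parse_ha_cf(text):
    """Collect directives into a dict of lists; repeated ones keep every value."""
    config = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = line.split(None, 1)
        value = parts[1] if len(parts) > 1 else ""
        config.setdefault(parts[0], []).append(value)
    return config


def timing_warnings(config):
    """Return rule-of-thumb warnings; assumes plain integer seconds."""
    t = {k: int(config[k][0])
         for k in ("keepalive", "warntime", "deadtime", "initdead")}
    warnings = []
    if not t["keepalive"] < t["warntime"] < t["deadtime"]:
        warnings.append("expected keepalive < warntime < deadtime")
    if t["initdead"] < 2 * t["deadtime"]:
        warnings.append("expected initdead >= 2 * deadtime")
    return warnings


config = parse_ha_cf(SAMPLE)
print(config["node"], timing_warnings(config))
```

With the values in my file both checks pass, so the timings themselves do not
look like the obvious culprit.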

_______________________________________________
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
