Hi,

I started getting the error message shown below on one system in a two-node heartbeat cluster after installing 2.1.2 via the FC8 rpms (yum install heartbeat*, so both heartbeat and heartbeat-devel). I installed the rpms on two freshly installed FC8 systems. Also installed: libnet and glib-devel. I did basically the same thing a few weeks ago when these systems were FC7 (but got heartbeat 2.0.8 via the rpms).
I found an earlier email from Alan R regarding this and 2.0.5, but could find no resolution. I'm certainly a newbie with this product and it may be something I'm doing. I've written an app to the API that seems to be working on 2.0.8. It uses "azClient" as its "signon" name. The problem didn't appear on wiley-coyote until after I'd started the app (although it could be that I simply did not see the messages until after the app started). The problem DID NOT and still does not appear on the other node, beauregard. I ran the app on it also, and it signed on properly, etc.

Having said all that, when starting heartbeat, here are the messages in the log file:

Nov 25 12:31:59 wiley-coyote heartbeat: [26165]: info: Version 2 support: no
Nov 25 12:31:59 wiley-coyote heartbeat: [26165]: WARN: Logging daemon is disabled --enabling logging daemon is recommended
Nov 25 12:31:59 wiley-coyote heartbeat: [26165]: info: **************************
Nov 25 12:31:59 wiley-coyote heartbeat: [26165]: info: Configuration validated. Starting heartbeat 2.1.2
Nov 25 12:31:59 wiley-coyote heartbeat: [26166]: info: heartbeat: version 2.1.2
Nov 25 12:31:59 wiley-coyote heartbeat: [26166]: info: Heartbeat generation: 1196015782
Nov 25 12:31:59 wiley-coyote heartbeat: [26166]: info: G_main_add_TriggerHandler: Added signal manual handler
Nov 25 12:31:59 wiley-coyote heartbeat: [26166]: info: G_main_add_TriggerHandler: Added signal manual handler
Nov 25 12:31:59 wiley-coyote heartbeat: [26166]: info: Removing /var/run/heartbeat/rsctmp failed, recreating.
Nov 25 12:31:59 wiley-coyote heartbeat: [26166]: info: glib: ucast: write socket priority set to IPTOS_LOWDELAY on eth0
Nov 25 12:31:59 wiley-coyote heartbeat: [26166]: info: glib: ucast: bound send socket to device: eth0
Nov 25 12:31:59 wiley-coyote heartbeat: [26166]: info: glib: ucast: bound receive socket to device: eth0
Nov 25 12:31:59 wiley-coyote heartbeat: [26166]: info: glib: ucast: started on port 694 interface eth0 to 192.168.0.11
Nov 25 12:31:59 wiley-coyote heartbeat: [26166]: info: G_main_add_SignalHandler: Added signal handler for signal 17
Nov 25 12:31:59 wiley-coyote heartbeat: [26166]: info: Local status now set to: 'up'
Nov 25 12:32:00 wiley-coyote heartbeat: [26166]: info: Link beauregard:eth0 up.
Nov 25 12:32:00 wiley-coyote heartbeat: [26166]: info: Status update for node beauregard: status active
Nov 25 12:32:00 wiley-coyote harc[26173]: info: Running /etc/ha.d/rc.d/status status
Nov 25 12:33:04 wiley-coyote heartbeat: [26166]: info: all clients are now paused
Nov 25 12:33:37 wiley-coyote heartbeat: [26166]: ERROR: Message hist queue is filling up (151 messages in queue)
<above ERROR message continues to repeat>

It is also worth noting that when I execute "cl_status nodestatus wiley-coyote" on wiley-coyote I get:

cl_status[26192]: 2007/11/25_12:33:22 ERROR: Cannot signon with heartbeat
cl_status[26192]: 2007/11/25_12:33:22 ERROR: REASON: hb_api_signon: Can't initiate connection to heartbeat

which seems to indicate a problem with the socket? Or pipe? BTW, this command works correctly on beauregard, returning "alive" for beauregard and "dead" for wiley-coyote.

Anyway, please point me to whatever you think appropriate for me to look at (especially source as I'd like to learn more). My config file is simple and is below (comments mostly removed). Also, the only resource I'm managing is an IP address. I'm not using CRM, so I've got an haresources file which contains exactly:

wiley-coyote 192.168.0.98/24/eth0

Any help would be greatly appreciated!
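For anyone who wants to skim a log like the one above, here is a minimal sketch in Python that counts messages by severity and pulls out the queue depth from each "Message hist queue is filling up" line. It is not part of heartbeat; the names `LINE_RE`, `QUEUE_RE`, `summarize`, and `SAMPLE` are made up for illustration, and the regular expression only assumes the two line shapes seen in this post (`heartbeat: [pid]:` and `harc[pid]:`).

```python
import re
from collections import Counter

# Assumed line shape, taken from the log excerpt in this post:
#   "Nov 25 12:33:37 host heartbeat: [26166]: ERROR: message"
#   "Nov 25 12:32:00 host harc[26173]: info: message"
LINE_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s+\d\d:\d\d:\d\d)\s+"
    r"(?P<host>\S+)\s+"
    r"(?P<prog>[^\s:\[]+)(?::\s*)?\[(?P<pid>\d+)\]:\s+"
    r"(?P<level>[A-Za-z]+):\s+(?P<msg>.*)$"
)
QUEUE_RE = re.compile(r"Message hist queue is filling up \((\d+) messages in queue\)")


def summarize(lines):
    """Return (severity counts, list of (timestamp, queue depth))."""
    levels = Counter()
    queue_depths = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m is None:
            continue  # skip anything that is not in the assumed format
        levels[m.group("level").upper()] += 1
        q = QUEUE_RE.search(m.group("msg"))
        if q:
            queue_depths.append((m.group("ts"), int(q.group(1))))
    return levels, queue_depths


# A few lines copied from the log in this post.
SAMPLE = [
    "Nov 25 12:31:59 wiley-coyote heartbeat: [26166]: info: Local status now set to: 'up'",
    "Nov 25 12:32:00 wiley-coyote harc[26173]: info: Running /etc/ha.d/rc.d/status status",
    "Nov 25 12:33:04 wiley-coyote heartbeat: [26166]: info: all clients are now paused",
    "Nov 25 12:33:37 wiley-coyote heartbeat: [26166]: ERROR: Message hist queue is filling up (151 messages in queue)",
]

levels, depths = summarize(SAMPLE)
print(dict(levels))
print(depths)
```

Run against the full log file, the list of queue depths would show whether the queue keeps growing after the "all clients are now paused" message or levels off.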
TIA

Scott Mann
Sr Software Engineer
Aztek Networks

ha.cf (identical on both systems except for the change in ucast)
----------------------------------------------------------------
# Facility to use for syslog()/logger
#
logfacility local0
#
#
keepalive 2
#
#
deadtime 30
#
#
warntime 10
#
#
initdead 120
#
#
udpport 694
#
# beauregard
ucast eth0 192.168.0.11
# wiley-coyote
#ucast eth0 192.168.0.31
#
#
#auto_failback on
auto_failback off
#
node wiley-coyote
node beauregard
#
#apiauth client-name gid=gidlist uid=uidlist
#apiauth ipfail gid=haclient uid=hacluster
apiauth azClient uid=root,smann
#
#compression_threshold 2
crm no
<end>
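Since the two ha.cf files are supposed to be identical apart from the ucast line, one quick sanity check is to diff only the active directives. Below is a minimal sketch in Python of such a check. It is not a heartbeat tool: `parse_ha_cf` and `diff_configs` are hypothetical helpers, the parser assumes that everything from `#` to the end of a line is a comment, and the `BEAUREGARD` text is an assumption (wiley-coyote's file with the commented and uncommented ucast lines swapped), since only wiley-coyote's file appears in the post.

```python
def parse_ha_cf(text):
    """Return the active (directive, value) pairs from ha.cf-style text.

    Assumption: '#' starts a comment that runs to the end of the line.
    """
    directives = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue  # blank or comment-only line
        parts = line.split(None, 1)
        directives.append((parts[0], parts[1] if len(parts) > 1 else ""))
    return directives


def diff_configs(a, b):
    """Return (directives only in a, directives only in b), each sorted."""
    set_a, set_b = set(parse_ha_cf(a)), set(parse_ha_cf(b))
    return sorted(set_a - set_b), sorted(set_b - set_a)


# Active lines from the ha.cf posted above (wiley-coyote).
WILEY = """\
logfacility local0
keepalive 2
deadtime 30
warntime 10
initdead 120
udpport 694
# beauregard
ucast eth0 192.168.0.11
# wiley-coyote
#ucast eth0 192.168.0.31
auto_failback off
node wiley-coyote
node beauregard
apiauth azClient uid=root,smann
crm no
"""

# Assumed contents of beauregard's file: same text, ucast lines swapped.
BEAUREGARD = WILEY.replace(
    "\nucast eth0 192.168.0.11", "\n#ucast eth0 192.168.0.11"
).replace("#ucast eth0 192.168.0.31", "ucast eth0 192.168.0.31")

only_wiley, only_beauregard = diff_configs(WILEY, BEAUREGARD)
print("only on wiley-coyote:", only_wiley)
print("only on beauregard:  ", only_beauregard)
```

If anything other than the ucast directive shows up in either list (for example a different apiauth line), that difference would be the first thing to look at, given that only one node shows the problem.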