Hi all,

I'm installing a new cluster based on RHEL5. After compiling the source RPMs and installing heartbeat, setup was quite easy and I soon had a cluster of 3 nodes up and running.
However, when restarting a server, heartbeat did not come up properly. It looks like heartbeat starts up, xen then reconfigures the network devices (the xen init scripts run after heartbeat), and heartbeat is lost:

Apr 16 17:16:14 mailin1 heartbeat: [2460]: ERROR: glib: Unable to send [-1] ucast packet: No such device
Apr 16 17:16:14 mailin1 heartbeat: [2460]: ERROR: write failure on ucast eth0.: No such device

I'm not sure if this is actually a real heartbeat problem. Other daemons like sshd are started before heartbeat, so they should (in theory) suffer from the same problem, but they seem to be completely unaffected. Even if this is not considered to be a heartbeat problem, I thought I should mention it here because I expect others to hit the same issue.

I could have tried to shuffle the init scripts around to get heartbeat running properly with xen, but have chosen to run my servers with non-xen kernels instead. I haven't seen any problems in non-xen mode yet. The current kernel is 2.6.18-8.1.1.el5; it was 2.6.18-8.el5xen before.

Apr 16 17:16:09 mailin1 logd: [2427]: info: logd started with default configuration.
Apr 16 17:16:09 mailin1 logd: [2433]: info: G_main_add_SignalHandler: Added signal handler for signal 15
Apr 16 17:16:09 mailin1 logd: [2427]: info: G_main_add_SignalHandler: Added signal handler for signal 15
Apr 16 17:16:09 mailin1 heartbeat: [2448]: info: No log entry found in ha.cf -- use logd
Apr 16 17:16:09 mailin1 heartbeat: [2448]: info: Enabling logging daemon
Apr 16 17:16:09 mailin1 heartbeat: [2448]: info: logfile and debug file are those specified in logd config file (default /etc/logd.cf)
Apr 16 17:16:09 mailin1 heartbeat: [2448]: info: **************************
Apr 16 17:16:09 mailin1 heartbeat: [2448]: info: Configuration validated. Starting heartbeat 2.0.8
Apr 16 17:16:09 mailin1 heartbeat: [2449]: info: heartbeat: version 2.0.8
Apr 16 17:16:09 mailin1 heartbeat: [2449]: info: Heartbeat generation: 10
Apr 16 17:16:09 mailin1 heartbeat: [2449]: info: G_main_add_TriggerHandler: Added signal manual handler
Apr 16 17:16:09 mailin1 heartbeat: [2449]: info: G_main_add_TriggerHandler: Added signal manual handler
Apr 16 17:16:09 mailin1 heartbeat: [2449]: info: Removing /var/run/heartbeat/rsctmp failed, recreating.
Apr 16 17:16:09 mailin1 heartbeat: [2449]: info: glib: ucast: write socket priority set to IPTOS_LOWDELAY on eth0
Apr 16 17:16:09 mailin1 heartbeat: [2449]: info: glib: ucast: bound send socket to device: eth0
Apr 16 17:16:09 mailin1 heartbeat: [2449]: info: glib: ucast: bound receive socket to device: eth0
Apr 16 17:16:09 mailin1 heartbeat: [2449]: info: glib: ucast: started on port 694 interface eth0 to 129.13.185.82
Apr 16 17:16:09 mailin1 heartbeat: [2449]: info: glib: ucast: write socket priority set to IPTOS_LOWDELAY on eth0
Apr 16 17:16:09 mailin1 heartbeat: [2449]: info: glib: ucast: bound send socket to device: eth0
Apr 16 17:16:09 mailin1 heartbeat: [2449]: info: glib: ucast: bound receive socket to device: eth0
Apr 16 17:16:09 mailin1 heartbeat: [2449]: info: glib: ucast: started on port 694 interface eth0 to 129.13.185.83
Apr 16 17:16:09 mailin1 heartbeat: [2449]: info: G_main_add_SignalHandler: Added signal handler for signal 17
Apr 16 17:16:09 mailin1 heartbeat: [2449]: info: Local status now set to: 'up'
Apr 16 17:16:09 mailin1 gpm[2472]: *** info [startup.c(95)]:
Apr 16 17:16:09 mailin1 gpm[2472]: Started gpm successfully. Entered daemon mode.
Apr 16 17:16:09 mailin1 rhnsd[2574]: Red Hat Network Services Daemon starting up.
Apr 16 17:16:10 mailin1 heartbeat: [2449]: info: Link mailin2:eth0 up.
Apr 16 17:16:10 mailin1 heartbeat: [2449]: info: Status update for node mailin2: status active
Apr 16 17:16:10 mailin1 heartbeat: [2449]: info: Link mailin3:eth0 up.
Apr 16 17:16:10 mailin1 heartbeat: [2449]: info: Status update for node mailin3: status active
Apr 16 17:16:10 mailin1 heartbeat: [2449]: info: Comm_now_up(): updating status to active
Apr 16 17:16:10 mailin1 heartbeat: [2449]: info: Local status now set to: 'active'
[...]
Apr 16 17:16:13 mailin1 xenstored: Checking store ...
Apr 16 17:16:13 mailin1 xenstored: Checking store complete.
Apr 16 17:16:13 mailin1 xenstored: Checking store ...
Apr 16 17:16:13 mailin1 xenstored: Checking store complete.
Apr 16 17:16:14 mailin1 kernel: Bridge firewalling registered
Apr 16 17:16:14 mailin1 cib: [2592]: WARN: init_start: CCM Activation failed
Apr 16 17:16:14 mailin1 cib: [2592]: WARN: init_start: CCM Connection failed 4 times (30 max)
Apr 16 17:16:14 mailin1 kernel: device vif0.0 entered promiscuous mode
Apr 16 17:16:14 mailin1 kernel: audit(1176736574.278:3): dev=vif0.0 prom=256 old_prom=0 auid=4294967295
Apr 16 17:16:14 mailin1 kernel: xenbr0: port 1(vif0.0) entering learning state
Apr 16 17:16:14 mailin1 kernel: xenbr0: topology change detected, propagating
Apr 16 17:16:14 mailin1 kernel: xenbr0: port 1(vif0.0) entering forwarding state
Apr 16 17:16:14 mailin1 heartbeat: [2460]: ERROR: glib: Unable to send [-1] ucast packet: No such device
Apr 16 17:16:14 mailin1 heartbeat: [2460]: ERROR: write failure on ucast eth0.: No such device
Apr 16 17:16:14 mailin1 heartbeat: [2460]: ERROR: glib: Unable to send [-1] ucast packet: No such device
Apr 16 17:16:14 mailin1 heartbeat: [2460]: ERROR: write failure on ucast eth0.: No such device

--
CU, Patrick.
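
For anyone who would rather try reordering the init scripts than switch kernels: here is a minimal Python sketch for checking which of two services starts first in a SysV runlevel directory. The directory /etc/rc.d/rc3.d and the script names "heartbeat" and "xend" are assumptions, not taken from the logs above, so adjust them to the local setup. The tie-break by name only roughly mirrors the order in which the rc script walks the S* links.

```python
import os
import re

# Start links in a SysV rc directory look like "S75heartbeat":
# "S", a two-digit priority, then the service name.
# Lower priorities are started first.
START_LINK = re.compile(r"^S(\d{2})(.+)$")


def start_order(link_names):
    """Return {service: priority} for the start (S) links in link_names."""
    order = {}
    for name in link_names:
        match = START_LINK.match(name)
        if match:
            order[match.group(2)] = int(match.group(1))
    return order


def starts_before(link_names, first, second):
    """Return True if `first` is started before `second`.

    Services are compared by (priority, name). Returns None if either
    service has no start link in link_names.
    """
    order = start_order(link_names)
    if first not in order or second not in order:
        return None
    return (order[first], first) < (order[second], second)


if __name__ == "__main__":
    rc_dir = "/etc/rc.d/rc3.d"  # assumption: runlevel 3, RHEL-style layout
    if os.path.isdir(rc_dir):
        # "heartbeat" and "xend" are assumed init-script names
        print(starts_before(os.listdir(rc_dir), "heartbeat", "xend"))
```

If this reports True for heartbeat and xend, heartbeat is started before xen sets up its bridge, which would match the sequence in the log above.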
