Hi all,

I'm installing a new cluster based on RHEL5. After compiling the
source RPMs and installing heartbeat, setup was quite easy and I soon had
a cluster of 3 nodes up and running.

However, when restarting a server, heartbeat did not come up properly.

It looks like heartbeat starts up, then xen reconfigures the
network devices (the xen init scripts run after heartbeat), and the
heartbeat link is lost:
Apr 16 17:16:14 mailin1 heartbeat: [2460]: ERROR: glib: Unable to send
[-1] ucast packet: No such device
Apr 16 17:16:14 mailin1 heartbeat: [2460]: ERROR: write failure on ucast
eth0.: No such device

I'm not sure if this is actually a real heartbeat problem. Other daemons
like sshd are started before heartbeat, so they should (in theory) suffer
the same problems, but seem to be completely unaffected. Even if this is
not considered to be a heartbeat problem, I thought I should mention it
here because I expect others to hit the same issue.

I could have tried to shuffle the init scripts around so that heartbeat
runs properly with xen, but have chosen to run my servers with
non-xen kernels instead.
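
For anyone who wants to check the start order before shuffling anything, here is a minimal Python sketch. It assumes SysV-style start links named S<NN><service> in a runlevel directory such as /etc/rc.d/rc3.d, and it assumes the services are called "heartbeat" and "xend"; both the path and the names are guesses that may differ on your system. The helper names start_links and starts_before are made up for this example.

```python
import os
import re

# SysV start links look like "S75heartbeat": "S", a two-digit priority,
# then the service name. Kill links ("K...") are ignored.
START_LINK = re.compile(r"^S(\d{2})(.+)$")


def start_links(rc_dir):
    """Return {service_name: link_name} for every start link in rc_dir."""
    links = {}
    for entry in os.listdir(rc_dir):
        match = START_LINK.match(entry)
        if match:
            links[match.group(2)] = entry
    return links


def starts_before(rc_dir, first, second):
    """True if the start link of `first` sorts before that of `second`.

    Link names are compared as strings, which mirrors the sorted order in
    which the start links are run. Returns None when either service has
    no start link in rc_dir.
    """
    links = start_links(rc_dir)
    if first not in links or second not in links:
        return None
    return links[first] < links[second]
```

Calling starts_before("/etc/rc.d/rc3.d", "heartbeat", "xend") on an affected node should return True if heartbeat really is started ahead of the xen scripts, as the log suggests.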

I haven't seen any problems in non-xen mode yet.
Current kernel is 2.6.18-8.1.1.el5, was 2.6.18-8.el5xen before.

Apr 16 17:16:09 mailin1 logd: [2427]: info: logd started with default
configuration.
Apr 16 17:16:09 mailin1 logd: [2433]: info: G_main_add_SignalHandler:
Added signal handler for signal 15
Apr 16 17:16:09 mailin1 logd: [2427]: info: G_main_add_SignalHandler:
Added signal handler for signal 15
Apr 16 17:16:09 mailin1 heartbeat: [2448]: info: No log entry found in
ha.cf -- use logd
Apr 16 17:16:09 mailin1 heartbeat: [2448]: info: Enabling logging
daemon 
Apr 16 17:16:09 mailin1 heartbeat: [2448]: info: logfile and debug file
are those specified in logd config file (default /etc/logd.cf)
Apr 16 17:16:09 mailin1 heartbeat: [2448]: info:
**************************
Apr 16 17:16:09 mailin1 heartbeat: [2448]: info: Configuration
validated. Starting heartbeat 2.0.8
Apr 16 17:16:09 mailin1 heartbeat: [2449]: info: heartbeat: version
2.0.8
Apr 16 17:16:09 mailin1 heartbeat: [2449]: info: Heartbeat generation:
10
Apr 16 17:16:09 mailin1 heartbeat: [2449]: info:
G_main_add_TriggerHandler: Added signal manual handler
Apr 16 17:16:09 mailin1 heartbeat: [2449]: info:
G_main_add_TriggerHandler: Added signal manual handler
Apr 16 17:16:09 mailin1 heartbeat: [2449]: info:
Removing /var/run/heartbeat/rsctmp failed, recreating.
Apr 16 17:16:09 mailin1 heartbeat: [2449]: info: glib: ucast: write
socket priority set to IPTOS_LOWDELAY on eth0
Apr 16 17:16:09 mailin1 heartbeat: [2449]: info: glib: ucast: bound send
socket to device: eth0
Apr 16 17:16:09 mailin1 heartbeat: [2449]: info: glib: ucast: bound
receive socket to device: eth0
Apr 16 17:16:09 mailin1 heartbeat: [2449]: info: glib: ucast: started on
port 694 interface eth0 to 129.13.185.82
Apr 16 17:16:09 mailin1 heartbeat: [2449]: info: glib: ucast: write
socket priority set to IPTOS_LOWDELAY on eth0
Apr 16 17:16:09 mailin1 heartbeat: [2449]: info: glib: ucast: bound send
socket to device: eth0
Apr 16 17:16:09 mailin1 heartbeat: [2449]: info: glib: ucast: bound
receive socket to device: eth0
Apr 16 17:16:09 mailin1 heartbeat: [2449]: info: glib: ucast: started on
port 694 interface eth0 to 129.13.185.83
Apr 16 17:16:09 mailin1 heartbeat: [2449]: info:
G_main_add_SignalHandler: Added signal handler for signal 17
Apr 16 17:16:09 mailin1 heartbeat: [2449]: info: Local status now set
to: 'up'
Apr 16 17:16:09 mailin1 gpm[2472]: *** info [startup.c(95)]: 
Apr 16 17:16:09 mailin1 gpm[2472]: Started gpm successfully. Entered
daemon mode.
Apr 16 17:16:09 mailin1 rhnsd[2574]: Red Hat Network Services Daemon
starting up.
Apr 16 17:16:10 mailin1 heartbeat: [2449]: info: Link mailin2:eth0 up.
Apr 16 17:16:10 mailin1 heartbeat: [2449]: info: Status update for node
mailin2: status active
Apr 16 17:16:10 mailin1 heartbeat: [2449]: info: Link mailin3:eth0 up.
Apr 16 17:16:10 mailin1 heartbeat: [2449]: info: Status update for node
mailin3: status active
Apr 16 17:16:10 mailin1 heartbeat: [2449]: info: Comm_now_up(): updating
status to active
Apr 16 17:16:10 mailin1 heartbeat: [2449]: info: Local status now set
to: 'active'
[...]
Apr 16 17:16:13 mailin1 xenstored: Checking store ...
Apr 16 17:16:13 mailin1 xenstored: Checking store complete.
Apr 16 17:16:13 mailin1 xenstored: Checking store ...
Apr 16 17:16:13 mailin1 xenstored: Checking store complete.
Apr 16 17:16:14 mailin1 kernel: Bridge firewalling registered
Apr 16 17:16:14 mailin1 cib: [2592]: WARN: init_start: CCM Activation
failed
Apr 16 17:16:14 mailin1 cib: [2592]: WARN: init_start: CCM Connection
failed 4 times (30 max)
Apr 16 17:16:14 mailin1 kernel: device vif0.0 entered promiscuous mode
Apr 16 17:16:14 mailin1 kernel: audit(1176736574.278:3): dev=vif0.0
prom=256 old_prom=0 auid=4294967295
Apr 16 17:16:14 mailin1 kernel: xenbr0: port 1(vif0.0) entering learning
state
Apr 16 17:16:14 mailin1 kernel: xenbr0: topology change detected,
propagating
Apr 16 17:16:14 mailin1 kernel: xenbr0: port 1(vif0.0) entering
forwarding state
Apr 16 17:16:14 mailin1 heartbeat: [2460]: ERROR: glib: Unable to send
[-1] ucast packet: No such device
Apr 16 17:16:14 mailin1 heartbeat: [2460]: ERROR: write failure on ucast
eth0.: No such device
Apr 16 17:16:14 mailin1 heartbeat: [2460]: ERROR: glib: Unable to send
[-1] ucast packet: No such device
Apr 16 17:16:14 mailin1 heartbeat: [2460]: ERROR: write failure on ucast
eth0.: No such device
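
To see what happened just before the first error in a mail-wrapped excerpt like the one above, a small Python sketch can help. It only assumes that every syslog entry begins with a timestamp of the form "Apr 16 17:16:14" and that any other non-empty line is a continuation of the previous entry; the function names unwrap and context_before_first_error are made up for this example.

```python
import re

# A syslog entry starts with a timestamp such as "Apr 16 17:16:14 ".
ENTRY_START = re.compile(r"^[A-Z][a-z]{2} [ \d]\d \d{2}:\d{2}:\d{2} ")


def unwrap(lines):
    """Join mail-wrapped continuation lines back onto their syslog entry."""
    entries = []
    for line in lines:
        line = line.rstrip()
        if not line:
            continue  # skip blank lines between excerpts
        if ENTRY_START.match(line) or not entries:
            entries.append(line)
        else:
            # Not a timestamp: treat it as the wrapped tail of the last entry.
            entries[-1] += " " + line.lstrip()
    return entries


def context_before_first_error(entries, count=5):
    """Return up to `count` entries that precede the first ERROR entry."""
    for index, entry in enumerate(entries):
        if "ERROR:" in entry:
            return entries[max(0, index - count):index]
    return []
```

Run over the excerpt above, the entries returned ahead of the first heartbeat ERROR are the kernel messages about vif0.0 and xenbr0, which is what points at the xen network setup.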

-- 
CU,
   Patrick.

_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems