Hello list,

I have a problem with a two-node active/passive setup running heartbeat 2.0.7 on Novell SuSE 10.1. The cluster runs in an R1-style configuration, with a drbd disk, IPaddress, IPsrcaddr, nmb, smb, nfs and one proprietary application as highly available services.

As heartbeat media the nodes use both network interfaces, eth0 (via a switch) and eth1 (direct link), in unicast mode, plus a direct serial connection. Both interfaces run at 1 Gbit/s full duplex; the serial line runs at 57600 baud. In addition, I have declared three IP addresses as ping nodes.
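For readers who want to picture the media setup described above, here is a rough sketch of the corresponding part of an ha.cf. It is an illustration under assumptions, not the actual file from these nodes: the node names, all IP addresses (taken from the documentation range 192.0.2.0/24), the serial device and the timing values are placeholders, and the ipfail line is only assumed because ping nodes are normally used together with it.

```
# /etc/ha.d/ha.cf on the first node (sketch; every value is a placeholder)

# Timing values are assumptions, not the values used on the real cluster.
keepalive 2
warntime 10
deadtime 30

# Serial heartbeat link at the speed mentioned above.
# /dev/ttyS0 is an assumed device name.
baud 57600
serial /dev/ttyS0

# Unicast heartbeats to the peer: eth0 via the switch, eth1 via the direct link.
# The addresses are the peer's addresses on each interface (placeholders).
ucast eth0 192.0.2.12
ucast eth1 192.0.2.22

# Three ping nodes (placeholder addresses). With "ping", each address is
# tracked separately; "ping_group <name> <ip> ..." would treat them as one.
ping 192.0.2.1 192.0.2.2 192.0.2.3

# Assumed: ipfail reacts to loss of the ping nodes. The path may differ.
respawn hacluster /usr/lib/heartbeat/ipfail

# Cluster members (placeholder host names, must match "uname -n").
node nodea
node nodeb
```

On the second node the ucast lines would point at the first node's addresses instead.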
Starting the cluster and running the resources works fine, and manual switching between nodes with hb_takeover and hb_standby works as expected, with all resources being started or stopped as they should.

However, the cluster shows some rather odd runtime behaviour. At regular intervals, both nodes report their eth0 network interface as down and therefore report their ping group as dead. Because this does not happen at exactly the same moment on both nodes, it sometimes triggers a resource failover, depending on which node declared itself dead first. The state lasts for about three seconds; after that, both nodes bring their eth0 interfaces back up and resume working as if nothing had happened.

The network interfaces use Marvell hardware and are driven by the sk98lin driver.

I hope someone has an idea about this.

Regards,
Ronald

Ronald Unterreiner
Project Management, Information Systems
BBR Verkehrstechnik GmbH
