----- On Aug 13, 2019, at 3:14 PM, Ulrich Windl ulrich.wi...@rz.uni-regensburg.de wrote:
> You said you booted the hosts sequentially. From the logs they were starting
> in parallel.
>

No. last says:

ha-idg-1:
reboot   system boot  4.12.14-95.29-de Fri Aug  9 17:42 - 15:56 (3+22:14)

ha-idg-2:
reboot   system boot  4.12.14-95.29-de Fri Aug  9 18:08 - 15:58 (3+21:49)
root     pts/0        10.35.34.70      Fri Aug  9 17:24 - crash  (00:44)
(unknown :0           :0               Fri Aug  9 17:24 - crash  (00:44)
reboot   system boot  4.12.14-95.29-de Fri Aug  9 17:23 - 15:58 (3+22:34)

>> This is the initialization of the bond1 on ha-idg-1 during boot.
>> 3 seconds later bond1 is fine:
>>
>> 2019-08-09T17:42:19.299886+02:00 ha-idg-2 kernel: [ 1232.117470] tg3 0000:03:04.0 eth2: Link is up at 1000 Mbps, full duplex
>> 2019-08-09T17:42:19.299908+02:00 ha-idg-2 kernel: [ 1232.117482] tg3 0000:03:04.0 eth2: Flow control is on for TX and on for RX
>> 2019-08-09T17:42:19.315756+02:00 ha-idg-2 kernel: [ 1232.131565] tg3 0000:03:04.1 eth3: Link is up at 1000 Mbps, full duplex
>> 2019-08-09T17:42:19.315767+02:00 ha-idg-2 kernel: [ 1232.131568] tg3 0000:03:04.1 eth3: Flow control is on for TX and on for RX
>> 2019-08-09T17:42:19.351781+02:00 ha-idg-2 kernel: [ 1232.169386] bond1: link status definitely up for interface eth2, 1000 Mbps full duplex
>> 2019-08-09T17:42:19.351792+02:00 ha-idg-2 kernel: [ 1232.169390] bond1: making interface eth2 the new active one
>> 2019-08-09T17:42:19.352521+02:00 ha-idg-2 kernel: [ 1232.169473] bond1: first active interface up!
>> 2019-08-09T17:42:19.352532+02:00 ha-idg-2 kernel: [ 1232.169480] bond1: link status definitely up for interface eth3, 1000 Mbps full duplex
>>
>> also on ha-idg-1:
>>
>> 2019-08-09T17:42:19.168035+02:00 ha-idg-1 kernel: [ 110.164250] tg3 0000:02:00.3 eth3: Link is up at 1000 Mbps, full duplex
>> 2019-08-09T17:42:19.168050+02:00 ha-idg-1 kernel: [ 110.164252] tg3 0000:02:00.3 eth3: Flow control is on for TX and on for RX
>> 2019-08-09T17:42:19.168052+02:00 ha-idg-1 kernel: [ 110.164254] tg3 0000:02:00.3 eth3: EEE is disabled
>> 2019-08-09T17:42:19.172020+02:00 ha-idg-1 kernel: [ 110.171378] tg3 0000:02:00.2 eth2: Link is up at 1000 Mbps, full duplex
>> 2019-08-09T17:42:19.172028+02:00 ha-idg-1 kernel: [ 110.171380] tg3 0000:02:00.2 eth2: Flow control is on for TX and on for RX
>> 2019-08-09T17:42:19.172029+02:00 ha-idg-1 kernel: [ 110.171382] tg3 0000:02:00.2 eth2: EEE is disabled
>> ...
>> 2019-08-09T17:42:19.244066+02:00 ha-idg-1 kernel: [ 110.240310] bond1: link status definitely up for interface eth2, 1000 Mbps full duplex
>> 2019-08-09T17:42:19.244083+02:00 ha-idg-1 kernel: [ 110.240311] bond1: making interface eth2 the new active one
>> 2019-08-09T17:42:19.244085+02:00 ha-idg-1 kernel: [ 110.240353] bond1: first active interface up!
>> 2019-08-09T17:42:19.244087+02:00 ha-idg-1 kernel: [ 110.240356] bond1: link status definitely up for interface eth3, 1000 Mbps full duplex
>>
>> And the cluster is started afterwards on ha-idg-1 at 17:43:04. I don't find
>> further entries for problems with bond1. So i think it's not related.
>>

Time is synchronized by ntp.
The two bonding devices (bond1) are connected directly (point-to-point).
So if eth2 or eth3, the ones for the bonding, go online on one host the other
host sees it directly.

Bernd

Helmholtz Zentrum Muenchen
Deutsches Forschungszentrum fuer Gesundheit und Umwelt (GmbH)
Ingolstaedter Landstr. 1
85764 Neuherberg
www.helmholtz-muenchen.de
Aufsichtsratsvorsitzende: MinDir'in Prof. Dr. Veronika von Messling
Geschaeftsfuehrung: Prof. Dr. med. Dr. h.c. Matthias Tschoep, Heinrich Bassler, Kerstin Guenther
Registergericht: Amtsgericht Muenchen HRB 6466
USt-IdNr: DE 129521671
_______________________________________________
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/
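
As a cross-check on the boot order: each kernel log line quoted above carries a wall-clock timestamp and, in brackets, the seconds since that host booted, so subtracting one from the other gives an approximate boot time per host. A minimal sketch in Python, assuming the syslog line format shown in the quoted logs; the function name `estimate_boot_time` and the regular expression are illustrative only and not part of any cluster tooling:

```python
import re
from datetime import datetime, timedelta

# Matches "<ISO timestamp> <host> kernel: [ <seconds since boot>] ..."
# (format assumed from the log lines quoted in this thread).
LINE_RE = re.compile(
    r"^(?P<ts>\S+)\s+(?P<host>\S+)\s+kernel:\s+\[\s*(?P<up>\d+\.\d+)\]"
)


def estimate_boot_time(line):
    """Return (host, approximate boot time) for one kernel log line."""
    m = LINE_RE.match(line)
    if m is None:
        raise ValueError("not a kernel log line: %r" % line)
    stamp = datetime.fromisoformat(m.group("ts"))
    uptime = timedelta(seconds=float(m.group("up")))
    # Wall-clock time of the message minus time since boot.
    return m.group("host"), stamp - uptime


# One line per host, copied from the quoted logs.
samples = [
    "2019-08-09T17:42:19.168035+02:00 ha-idg-1 kernel: [ 110.164250] "
    "tg3 0000:02:00.3 eth3: Link is up at 1000 Mbps, full duplex",
    "2019-08-09T17:42:19.299886+02:00 ha-idg-2 kernel: [ 1232.117470] "
    "tg3 0000:03:04.0 eth2: Link is up at 1000 Mbps, full duplex",
]

for line in samples:
    host, boot = estimate_boot_time(line)
    print(host, boot.isoformat(timespec="seconds"))
```

For the two sample lines this gives roughly 17:40:29 for ha-idg-1 and 17:21:47 for ha-idg-2, i.e. about 19 minutes apart. The estimate is only as good as the clock synchronization between the hosts, and it will not match the `last` boot records to the second, since those are written somewhat later during boot.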