** Description changed:
[impact]
- ip addresses managed by keepalived are lost across networkd restarts
+ - ALL related HA software has a small problem if interfaces are being
+ managed by systemd-networkd: NIC restarts/reconfigurations always wipe
+ all interface aliases when the HA software is not expecting it (there
+ is no coordination between them).
+
+ - keepalived, smb ctdb, pacemaker, all suffer from this. Pacemaker is
+ smarter in this case because it has a service monitor that will restart
+ the virtual IP resource, on the affected node & NIC, before considering
+ it a real failure, but other HA services might report a real failure
+ when there is none.
[test case]
- see original description below
+ - comment #14 is a full test case: set up a 3-node pacemaker cluster,
+ as in that example, and restart the networkd service: this triggers a
+ failure for the virtual IP resource monitor.
+
+ - another example is given in the original description for keepalived.
+ Both suffer from the same issue (as does other HA software).
[regression potential]
- this backports KeepConfiguration parameter, which adds some significant
- complexity to networkd's configuration and behavior, which could lead to
- regressions in correctly configuring the network at networkd start, or
- incorrectly maintaining configuration at networkd restart, or losing
- network state at networkd stop. Any regressions are most likely to
- occur during networkd start, restart, or stop, and most likely to
- involve missing or incorrect ip address(es).
+ - this backports KeepConfiguration parameter, which adds some
+ significant complexity to networkd's configuration and behavior, which
+ could lead to regressions in correctly configuring the network at
+ networkd start, or incorrectly maintaining configuration at networkd
+ restart, or losing network state at networkd stop.
+
+ - Any regressions are most likely to occur during networkd start,
+ restart, or stop, and most likely to involve missing or incorrect ip
+ address(es).
+
+ - the change is based on upstream patches adding the exact feature we
+ need to fix this issue & it will be integrated with a netplan change
+ that adds the needed stanza (KeepConfiguration=) to the systemd NIC
+ configuration file
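
To check whether a given networkd configuration actually carries the new stanza, something like the following could be used. This is only a minimal sketch: it assumes the setting lives in the [Network] section as documented in systemd.network(5), the helper name keep_configuration is hypothetical, and the file contents in the usage note are made up for illustration (they are not what netplan generates).

```python
import configparser


def keep_configuration(network_file_text):
    """Return the KeepConfiguration= value from a .network file, or None.

    Assumption: the text uses systemd's INI-like syntax and the key, if
    present, is in the [Network] section.
    """
    # strict=False tolerates repeated keys such as multiple Address= lines;
    # interpolation=None keeps any '%' characters literal.
    parser = configparser.ConfigParser(strict=False, interpolation=None)
    # Preserve key case; configparser lowercases option names by default.
    parser.optionxform = str
    parser.read_string(network_file_text)
    return parser.get("Network", "KeepConfiguration", fallback=None)
```

For example, passing the text "[Match]\nName=eth2\n\n[Network]\nAddress=12.13.14.18/29\nKeepConfiguration=static\n" returns "static", while a file without the key returns None.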
[other info]
original description:
---
Configure netplan for interfaces, for example (a working config with IP
addresses obfuscated)
network:
  ethernets:
    eth0:
      addresses: [192.168.0.5/24]
      dhcp4: false
      nameservers:
        search: [blah.com, other.blah.com, hq.blah.com, cust.blah.com,
          phone.blah.com]
        addresses: [10.22.11.1]
    eth2:
      addresses:
      - 12.13.14.18/29
      - 12.13.14.19/29
      gateway4: 12.13.14.17
      dhcp4: false
      nameservers:
        search: [blah.com, other.blah.com, hq.blah.com, cust.blah.com,
          phone.blah.com]
        addresses: [10.22.11.1]
    eth3:
      addresses: [10.22.11.6/24]
      dhcp4: false
      nameservers:
        search: [blah.com, other.blah.com, hq.blah.com, cust.blah.com,
          phone.blah.com]
        addresses: [10.22.11.1]
    eth4:
      addresses: [10.22.14.6/24]
      dhcp4: false
      nameservers:
        search: [blah.com, other.blah.com, hq.blah.com, cust.blah.com,
          phone.blah.com]
        addresses: [10.22.11.1]
    eth7:
      addresses: [9.5.17.34/29]
      dhcp4: false
      optional: true
      nameservers:
        search: [blah.com, other.blah.com, hq.blah.com, cust.blah.com,
          phone.blah.com]
        addresses: [10.22.11.1]
  version: 2
Configure keepalived (again, a working config with IP addresses
obfuscated)
global_defs # Block id
{
notification_email {
[email protected]
}
notification_email_from [email protected]
smtp_server 10.22.11.7 # IP
smtp_connect_timeout 30 # integer, seconds
router_id system3 # string identifying the machine,
# (doesn't have to be hostname).
vrrp_mcast_group4 224.0.0.18 # optional, default 224.0.0.18
vrrp_mcast_group6 ff02::12 # optional, default ff02::12
enable_traps # enable SNMP traps
}
vrrp_sync_group collection {
group {
wan
lan
phone
}
}
vrrp_instance wan {
state MASTER
interface eth2
virtual_router_id 77
priority 150
advert_int 1
smtp_alert
authentication {
auth_type PASS
auth_pass BlahBlah
}
virtual_ipaddress {
12.13.14.20
}
}
vrrp_instance lan {
state MASTER
interface eth3
virtual_router_id 78
priority 150
advert_int 1
smtp_alert
authentication {
auth_type PASS
auth_pass MoreBlah
}
virtual_ipaddress {
10.22.11.13/24
}
}
vrrp_instance phone {
state MASTER
interface eth4
virtual_router_id 79
priority 150
advert_int 1
smtp_alert
authentication {
auth_type PASS
auth_pass MostBlah
}
virtual_ipaddress {
10.22.14.3/24
}
}
At boot the affected interfaces have:
5: eth4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group
default qlen 1000
link/ether ab:cd:ef:90:c0:e3 brd ff:ff:ff:ff:ff:ff
inet 10.22.14.6/24 brd 10.22.14.255 scope global eth4
valid_lft forever preferred_lft forever
inet 10.22.14.3/24 scope global secondary eth4
valid_lft forever preferred_lft forever
inet6 fe80::ae1f:6bff:fe90:c0e3/64 scope link
valid_lft forever preferred_lft forever
7: eth3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group
default qlen 1000
link/ether ab:cd:ef:b0:26:29 brd ff:ff:ff:ff:ff:ff
inet 10.22.11.6/24 brd 10.22.11.255 scope global eth3
valid_lft forever preferred_lft forever
inet 10.22.11.13/24 scope global secondary eth3
valid_lft forever preferred_lft forever
inet6 fe80::ae1f:6bff:feb0:2629/64 scope link
valid_lft forever preferred_lft forever
9: eth2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group
default qlen 1000
link/ether ab:cd:ef:b0:26:2b brd ff:ff:ff:ff:ff:ff
inet 12.13.14.18/29 brd 12.13.14.23 scope global eth2
valid_lft forever preferred_lft forever
inet 12.13.14.20/32 scope global eth2
valid_lft forever preferred_lft forever
inet 12.33.89.19/29 brd 12.13.14.23 scope global secondary eth2
valid_lft forever preferred_lft forever
inet6 fe80::ae1f:6bff:feb0:262b/64 scope link
valid_lft forever preferred_lft forever
Run 'netplan try' (without even making any changes to the configuration) and
the keepalived addresses disappear, never to return. The affected interfaces
then have:
5: eth4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group
default qlen 1000
link/ether ab:cd:ef:90:c0:e3 brd ff:ff:ff:ff:ff:ff
inet 10.22.14.6/24 brd 10.22.14.255 scope global eth4
valid_lft forever preferred_lft forever
inet6 fe80::ae1f:6bff:fe90:c0e3/64 scope link
valid_lft forever preferred_lft forever
7: eth3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group
default qlen 1000
link/ether ab:cd:ef:b0:26:29 brd ff:ff:ff:ff:ff:ff
inet 10.22.11.6/24 brd 10.22.11.255 scope global eth3
valid_lft forever preferred_lft forever
inet6 fe80::ae1f:6bff:feb0:2629/64 scope link
valid_lft forever preferred_lft forever
9: eth2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group
default qlen 1000
link/ether ab:cd:ef:b0:26:2b brd ff:ff:ff:ff:ff:ff
inet 12.13.14.18/29 brd 12.13.14.23 scope global eth2
valid_lft forever preferred_lft forever
inet 12.33.89.19/29 brd 12.13.14.23 scope global secondary eth2
valid_lft forever preferred_lft forever
inet6 fe80::ae1f:6bff:feb0:262b/64 scope link
valid_lft forever preferred_lft forever
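
The before/after comparison above can be automated. The following is a rough sketch, not part of the original report: it assumes the text format printed by 'ip addr' as shown above, only looks at IPv4 ("inet") lines, and the function names are hypothetical.

```python
import re

# Interface header lines, e.g. "5: eth4: <BROADCAST,MULTICAST,...".
_IFACE = re.compile(r"^\d+:\s+([^:@\s]+)")
# IPv4 address lines, e.g. "inet 10.22.14.3/24 scope global secondary eth4".
# "inet6" lines do not match because whitespace must follow "inet".
_INET = re.compile(r"^\s*inet\s+(\S+)")


def addresses_by_interface(ip_addr_output):
    """Map interface name -> set of IPv4 addresses found in `ip addr` text."""
    result = {}
    current = None
    for line in ip_addr_output.splitlines():
        header = _IFACE.match(line)
        if header:
            current = header.group(1)
            result.setdefault(current, set())
            continue
        inet = _INET.match(line)
        if inet and current is not None:
            result[current].add(inet.group(1))
    return result


def lost_addresses(before, after):
    """Return {interface: addresses present in `before` but not in `after`}."""
    old = addresses_by_interface(before)
    new = addresses_by_interface(after)
    lost = {}
    for iface, addrs in old.items():
        missing = addrs - new.get(iface, set())
        if missing:
            lost[iface] = missing
    return lost
```

Feeding it the 'ip addr' output captured at boot and again after 'netplan try' should list, per interface, the addresses that were dropped; an empty result means nothing was lost.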
--
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1815101
Title:
[master] Restarting systemd-networkd breaks keepalived, heartbeat,
corosync, pacemaker (interface aliases are restarted)