date:20200213

[Ubuntu-ha] [Bug 1863174] Re: Keepalived Loses VIP on DHCP Renewal

2020-02-13 Thread Matthew Roark

Note: this is not a duplicate of
https://bugs.launchpad.net/netplan/+bug/1815101 considering that's only
relevant for Ubuntu 18.04 (Bionic) and later, and this report is
regarding Ubuntu 16.04.

The same change as outlined in
https://bugs.launchpad.net/netplan/+bug/1815101/comments/4 _may_ still
be relevant to also address this manifestation of the issue, though.

-- 
You received this bug notification because you are a member of Ubuntu
High Availability Team, which is subscribed to keepalived in Ubuntu.
https://bugs.launchpad.net/bugs/1863174

Title:
  Keepalived Loses VIP on DHCP Renewal

Status in keepalived package in Ubuntu:
  New

Bug description:
  Ubuntu Release: 16.04.6 LTS
  Keepalived Package Version: 1.2.24-1ubuntu0.16.04.2

  -- /etc/keepalived/keepalived.conf --
  vrrp_script chk_apiserver {
      script "curl https://127.0.0.1:443/healthz --cacert ca.crt --key request.key --cert request.crt --fail > /dev/null 2>&1"
      interval 10
      fall 6
      rise 2
  }

  vrrp_instance K8S_APISERVER {
      interface ens3
      state BACKUP
      virtual_router_id 118
      nopreempt
      dont_track_primary

      authentication {
          auth_type AH
          auth_pass **REDACTED**
      }

      virtual_ipaddress {
          10.128.233.23
      }
      track_script {
          chk_apiserver
      }
  }

  Expected Behavior: Upon DHCP renewal, Keepalived would maintain its
  VIP on the designated interface. In the case that the VIP is lost
  (outside of its control), it should failover to another VRRP instance.

  Actual Behavior: VIP disappeared from the designated interface, and
  did not failover to any other VRRP instance until Keepalived was
  restarted.

  2: ens3:  mtu 1450 qdisc pfifo_fast state UP group default qlen 1000
      link/ether fa:16:3e:8c:d2:e1 brd ff:ff:ff:ff:ff:ff
      inet 172.20.34.50/27 brd 172.20.34.63 scope global ens3
         valid_lft forever preferred_lft forever
      inet6 fe80::f816:3eff:fe8c:d2e1/64 scope link
         valid_lft forever preferred_lft forever

  -- /var/log/syslog --
  Feb  5 21:13:17 dhclient[839]: DHCPDISCOVER on ens3 to 255.255.255.255 port 67 interval 3 (xid=0x4cfc595e)
  Feb  5 21:13:17 dhclient[839]: DHCPREQUEST of 172.20.34.50 on ens3 to 255.255.255.255 port 67 (xid=0x5e59fc4c)
  Feb  5 21:13:17 dhclient[839]: DHCPOFFER of 172.20.34.50 from 172.20.34.36
  Feb  5 21:13:17 dhclient[839]: DHCPACK of 172.20.34.50 from 172.20.34.36
  Feb  5 21:13:17 dhclient[839]: bound to 172.20.34.50 -- renewal in 40846 seconds.
  Feb  5 21:13:19 ntpd[19295]: Deleting interface #34 ens3, 172.20.34.40#123, interface stats: received=0, sent=0, dropped=0, active_time=150821 secs

  
  -- /tmp/keepalived.stats --
  VRRP Instance: K8S_APISERVER
    Advertisements:
      Received: 10722
      Sent: 153463
    Became master: 1
    Released master: 0
    Packet Errors:
      Length: 0
      TTL: 0
      Invalid Type: 0
      Advertisement Interval: 0
      Address List: 0
    Authentication Errors:
      Invalid Type: 0
      Type Mismatch: 0
      Failure: 15
    Priority Zero:
      Received: 0
      Sent: 0

  Note: the networking.service was *not* restarted during this
  timeframe; however, I have been able to reproduce the issue in that
  manner.

  Additionally, I've not been able to reproduce this issue by hand, i.e.
  'dhclient -v -r ens3'. It seemingly only occurs when the lease has
  expired; it is the only time in which we can observe ntpd detecting
  that the interface has disappeared (thus the 'Deleting interface'
  message) at least.
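
A minimal sketch of how the missing VIP could be detected from outside Keepalived, assuming the one-line output format of `ip -o -4 addr show dev IFACE` (index, interface name, address family, then address/prefix in the fourth field). The function name `has_vip` is hypothetical; the interface and addresses are the ones from this report.

```shell
#!/bin/sh
# has_vip VIP
# Reads `ip -o -4 addr show dev IFACE` output on stdin and succeeds
# only if VIP is one of the listed addresses (prefix length is ignored).
has_vip() {
    awk -v vip="$1" '
        { split($4, a, "/"); if (a[1] == vip) found = 1 }
        END { exit(found ? 0 : 1) }
    '
}

# Example wiring (not run here): log when the VIP has vanished.
# ip -o -4 addr show dev ens3 | has_vip 10.128.233.23 \
#     || logger "VIP 10.128.233.23 missing on ens3"
```

A check like this only reports the symptom; it does not make Keepalived fail over or re-add the address.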

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/keepalived/+bug/1863174/+subscriptions

___
Mailing list: https://launchpad.net/~ubuntu-ha
Post to : ubuntu-ha@lists.launchpad.net
Unsubscribe : https://launchpad.net/~ubuntu-ha
More help   : https://help.launchpad.net/ListHelp

[Ubuntu-ha] [Bug 1815101] Re: [master] Restarting systemd-networkd breaks keepalived, heartbeat, corosync, pacemaker (interface aliases are restarted)

2020-02-13 Thread Rafael David Tinoco

** Also affects: heartbeat (Ubuntu Xenial)
   Importance: Undecided
   Status: New

** Also affects: keepalived (Ubuntu Xenial)
   Importance: Undecided
   Status: New

** Also affects: systemd (Ubuntu Xenial)
   Importance: Undecided
   Status: New

** Changed in: keepalived (Ubuntu Xenial)
   Importance: Undecided => Medium

** Changed in: keepalived (Ubuntu Xenial)
   Status: New => Confirmed

** Changed in: keepalived (Ubuntu Xenial)
 Assignee: (unassigned) => Rafael David Tinoco (rafaeldtinoco)

** No longer affects: heartbeat (Ubuntu Xenial)

** Changed in: systemd (Ubuntu Xenial)
   Importance: Undecided => Medium

** Changed in: systemd (Ubuntu Xenial)
   Status: New => Confirmed

** Changed in: systemd (Ubuntu Xenial)
 Assignee: (unassigned) => Rafael David Tinoco (rafaeldtinoco)

-- 
You received this bug notification because you are a member of Ubuntu
High Availability Team, which is subscribed to keepalived in Ubuntu.
https://bugs.launchpad.net/bugs/1815101

Title:
  [master] Restarting systemd-networkd breaks keepalived, heartbeat,
  corosync, pacemaker (interface aliases are restarted)

Status in Keepalived Charm:
  New
Status in netplan:
  Confirmed
Status in heartbeat package in Ubuntu:
  Won't Fix
Status in keepalived package in Ubuntu:
  In Progress
Status in systemd package in Ubuntu:
  In Progress
Status in keepalived source package in Xenial:
  Confirmed
Status in systemd source package in Xenial:
  Confirmed
Status in keepalived source package in Bionic:
  Confirmed
Status in systemd source package in Bionic:
  Confirmed
Status in keepalived source package in Disco:
  Won't Fix
Status in systemd source package in Disco:
  Won't Fix
Status in keepalived source package in Eoan:
  In Progress
Status in systemd source package in Eoan:
  Fix Released

Bug description:
  [impact]

  - ALL related HA software has a small problem if interfaces are being
  managed by systemd-networkd: NIC restarts/reconfigs are always going
  to wipe all interface aliases when the HA software is not expecting it
  (there is no coordination between them).

  - keepalived, smb ctdb, pacemaker, all suffer from this. Pacemaker is
  smarter in this case because it has a service monitor that will
  restart the virtual IP resource, in the affected node & NIC, before
  considering a real failure, but other HA services might consider it a
  real failure when it is not.

  [test case]

  - comment #14 is a full test case: set up a 3-node pacemaker cluster,
  as in that example, and cause a networkd service restart; it will
  trigger a failure for the virtual IP resource monitor.

  - another example is given in the original description for keepalived.
  Both suffer from the same issue (as does other HA software).
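
As a rough illustration of the test case above, the sketch below compares address snapshots taken before and after a networkd restart and prints whatever disappeared. The helper names `snapshot_addrs` and `lost_addrs` and the `/tmp` file names are hypothetical; it assumes the one-line format of `ip -o addr show`, where the second field is the interface and the fourth is address/prefix.

```shell
#!/bin/sh
# snapshot_addrs: print "IFACE ADDRESS/PREFIX" for every configured address.
snapshot_addrs() {
    ip -o addr show | awk '{ print $2, $4 }'
}

# lost_addrs BEFORE AFTER
# Prints the lines found in file BEFORE that are absent from file AFTER.
lost_addrs() {
    sort -u "$2" > "$2.sorted"
    sort -u "$1" | comm -23 - "$2.sorted"
    rm -f "$2.sorted"
}

# Example wiring (not run here):
# snapshot_addrs > /tmp/addrs.before
# systemctl restart systemd-networkd
# snapshot_addrs > /tmp/addrs.after
# lost_addrs /tmp/addrs.before /tmp/addrs.after
```

Any line printed by `lost_addrs` is an address that was present before the restart and gone afterwards, which is how a wiped alias or VIP would show up.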

  [regression potential]

  - this backports the KeepConfiguration parameter, which adds some
  significant complexity to networkd's configuration and behavior, which
  could lead to regressions in correctly configuring the network at
  networkd start, or incorrectly maintaining configuration at networkd
  restart, or losing network state at networkd stop.

  - Any regressions are most likely to occur during networkd start,
  restart, or stop, and most likely to involve missing or incorrect IP
  address(es).

  - the change is based on upstream patches adding the exact feature we
  needed to fix this issue & it will be integrated with a netplan change
  to add the needed stanza to the systemd NIC configuration file
  (KeepConfiguration=)

  [other info]

  original description:
  ---

  Configure netplan for interfaces, for example (a working config with
  IP addresses obfuscated)

  network:
      ethernets:
          eth0:
              addresses: [192.168.0.5/24]
              dhcp4: false
              nameservers:
                search: [blah.com, other.blah.com, hq.blah.com, cust.blah.com, phone.blah.com]
                addresses: [10.22.11.1]
          eth2:
              addresses:
                - 12.13.14.18/29
                - 12.13.14.19/29
              gateway4: 12.13.14.17
              dhcp4: false
              nameservers:
                search: [blah.com, other.blah.com, hq.blah.com, cust.blah.com, phone.blah.com]
                addresses: [10.22.11.1]
          eth3:
              addresses: [10.22.11.6/24]
              dhcp4: false
              nameservers:
                search: [blah.com, other.blah.com, hq.blah.com, cust.blah.com, phone.blah.com]
                addresses: [10.22.11.1]
          eth4:
              addresses: [10.22.14.6/24]
              dhcp4: false
              nameservers:
                search: [blah.com, other.blah.com, hq.blah.com, cust.blah.com, phone.blah.com]
                addresses: [10.22.11.1]
          eth7:
              addresses: [9.5.17.34/29]
              dhcp4: false
              optional: true
              nameservers:
                search: [blah.com, other.blah.com, hq.blah.com, cust.blah.com, phone.blah.com]

[Ubuntu-ha] [Bug 1863174] Re: Keepalived Loses VIP on DHCP Renewal

2020-02-13 Thread Rafael David Tinoco

*** This bug is a duplicate of bug 1815101 ***
https://bugs.launchpad.net/bugs/1815101

Hello Matthew, actually this might be a duplicate of bug 1815101 because
that fix was not backported to Xenial. I can include Xenial in that list
and study the effort needed to backport the systemd-networkd fix to
Xenial (checking if it's a go or no-go).

I'm marking it as a duplicate AND adding Xenial to the list of
systemd-networkd affected versions. Let me know if you would like to add
anything. I'll provide comments about the Xenial backport there.

** This bug has been marked a duplicate of bug 1815101
   [master] Restarting systemd-networkd breaks keepalived, heartbeat, corosync, pacemaker (interface aliases are restarted)


Re: [Ubuntu-ha] [Bug 1815101] Re: [master] Restarting systemd-networkd breaks keepalived, heartbeat, corosync, pacemaker (interface aliases are restarted)

2020-02-13 Thread Rafael David Tinoco

Balint, based on your input...

> thanks for the fixes in Eoan. Unfortunately we have a product based on
> disco and cannot move forward at this time. Being a networking shop,
> this issue has a serious effect on us and we would like to avoid moving
> to something like ifupdown2 within our stable branch.

So, Disco is EOL as it is not an LTS version, which is why it did not
get a fix (as the fix is very close to the one done in Eoan). Since
it's unsupported by the community, it's up to you to backport the Eoan
fixes to Disco if you'd like... you can even create a PPA for your
product and distribute it alongside.

> For our users the real impact of the bug is not that the interface
> that we are currently reconfiguring is suffering a downtime, but the
> fact that _all_ interfaces have their aliases removed if networkd is
> restarted. The proposed KeepConfiguration solution kind of beats the
> purpose of reconfiguring the interfaces, as old addresses are kept and
> need to be handled manually. Also it interferes with how DHCP works. I
> believe this might be an issue for others as well.

We are following systemd-networkd upstream decisions here. The option
"dhcp" only exists for CERTAIN scenarios (when root disk depends on
that connection, for iSCSI and/or NFS/ROOT for example). It is
explicitly said in the documentation:

"""
Takes a boolean or one of "static", "dhcp-on-stop", "dhcp". When
"static", systemd-networkd will not drop static addresses and routes
on starting up process. When set to "dhcp-on-stop", systemd-networkd
will not drop addresses and routes on stopping the daemon. When
"dhcp", the addresses and routes provided by a DHCP server will never
be dropped even if the DHCP lease expires. This is contrary to the
DHCP specification, but may be the best choice if, e.g., the root
filesystem relies on this connection. The setting "dhcp" implies
"dhcp-on-stop", and "yes" implies "dhcp" and "static". Defaults to
"no".
"""

and it is a question of choice: to have a window of opportunity for
duplicate IPs - in cases where there is no dynamic IP mapping to that
MAC address - but possibly maintain the connection instead of causing
uninterruptible I/Os while trying to shut down a machine, for example. I
particularly don't like this option but it is not the default one and
was meant for a specific purpose.
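
For illustration only, a minimal sketch of setting the option through a drop-in for one .network file, assuming a systemd-networkd that supports KeepConfiguration= in the [Network] section. The helper name `write_keepconf`, the drop-in file name and the example path are hypothetical; the actual .network file name depends on how netplan generated it.

```shell
#!/bin/sh
# write_keepconf DIR [VALUE]
# Writes a drop-in that sets KeepConfiguration= in the [Network] section.
# DIR is the drop-in directory of one .network file; VALUE defaults to
# "static" (one of the values listed in the documentation quoted above).
write_keepconf() {
    dir="$1"
    value="${2:-static}"
    mkdir -p "$dir" || return 1
    cat > "$dir/keep-configuration.conf" <<EOF
[Network]
KeepConfiguration=$value
EOF
}

# Example (assumed path, not run here):
# write_keepconf /etc/systemd/network/10-netplan-eth0.network.d static
```

The sketch defaults to "static" rather than "dhcp" because, as the quoted documentation says, "dhcp" keeps leased addresses even after the lease expires.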

>
> From our point of view the ideal solution would be a combination of the
> keepalived patch that detects VIP removal and systemd version 244 that
> already supports "networkctl reconfigure" and "networkctl reload".

networkctl reconfigure/reload is a new functionality and won't be
added to previous already released versions as this is against SRU
guidelines. Systemd 244.2-1ubuntu1 is being included in 20.04, our
NEXT LTS version.
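
A small sketch of how a script might guard on this, assuming the first line of `systemctl --version` has the form "systemd NNN ..." and taking 244 as the first version with "networkctl reload", as stated in this thread. The function name `has_networkctl_reload` is hypothetical.

```shell
#!/bin/sh
# has_networkctl_reload VERSION_LINE
# VERSION_LINE is the first line of `systemctl --version`,
# e.g. "systemd 244 (244.2-1ubuntu1)". Succeeds for version 244 or later.
has_networkctl_reload() {
    ver=$(printf '%s\n' "$1" | awk '{ print $2 }')
    ver=${ver%%[!0-9]*}          # keep only the leading digits
    [ "${ver:-0}" -ge 244 ]
}

# Example wiring (not run here):
# if has_networkctl_reload "$(systemctl --version | head -n 1)"; then
#     networkctl reload
# else
#     echo "networkctl reload unavailable on this systemd" >&2
# fi
```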

As I said before, you can try backporting systemd 244 to disco, or
bionic, if you are willing to support it on your own as it was already
EOL for community support. You should follow:
https://packaging.ubuntu.com/html/backports.html if you would like to
do that.

For the keepalived patches, they could be backported to Eoan, maybe
Bionic and Xenial depending on the amount of work. But then I would
need a practical example of why the systemd-networkd fix is no good in
the most common scenarios.

> Is there any chance that v244 is backported to bionic? It is already
> included in focal and debian stable backports, but unfortunately I am
> not familiar enough with systemd development to tell what the impact of
> this would be.

The problem with backports is that they are unsupported even on supported
releases. I wouldn't be able to guarantee functionality or fix it on
a regular basis. You can do it on your own and have it in a PPA of
your product, for example.

As systemd nowadays includes networkd, udev management, sysV runtime
generators, tmpfiles creation, socket creation, cgroups integration
for the process slices, etc., it is very risky to backport
systemd to have "just" those 2 functionalities.

>
> As for keepalived, in bug #1819074 there was an ongoing investigation on
> the patch, that implements the keepalived transition on removing the
> VIP. We have traced back this functionality to this patch:
>
> https://github.com/acassen/keepalived/commit/0b1528c76d3fe8d1c5765841df86c59570a036da
>
> It was born before v1.3.6 was released, so we hope that it is self-
> contained enough for a backport if v2.0 of keepalived is not included in
> bionic-backports.

Let me check the keepalived fix more closely and see what can be done for
the previous releases. As we are close to the freeze date for our next LTS
release, it is unlikely that I will do it before 2 weeks from now (as our
focus is entirely on the development version and I still need to fix
netplan to support the networkd KeepConfiguration functionality).

Let's keep talking. I'll first patch netplan and then go back to the other
releases to check what can be done for them.

For now I would *strongly* recommend that in previous releases,
whoever wants to

[Ubuntu-ha] [Bug 1855140] Re: How to handle tmpfiles.d in non-systemd environments

2020-02-13 Thread Claudio Kuenzler

Hi Christian

According to https://www.debian.org/vote/2019/vote_002#outcome the
decision was to use "Systemd but we support exploring alternatives".
From the proposal:

"Packages should include service units or init scripts to start daemons
and services. Packages may use any systemd facility at the package
maintainer's discretion, provided that this is consistent with other
Policy requirements and the normal expectation that packages shouldn't
depend on experimental or unsupported (in Debian) features of other
packages. Packages may include support for alternate init systems
besides systemd and may include alternatives for any systemd-specific
interfaces they use. Maintainers use their normal procedures for
deciding which patches to include."

From my understanding this means that supporting an alternative init
system is optional ("Packages may include support for alternate init
systems besides systemd"). So basically this is up to the package
maintainer whether or not the package should support an alternative init
system.

Given the (still rising) popularity of Docker/Kubernetes containers, it
would indeed be very nice if the affected packages (such as haproxy)
adjusted for a non-systemd environment.

-- 
You received this bug notification because you are a member of Ubuntu
High Availability Team, which is subscribed to haproxy in Ubuntu.
https://bugs.launchpad.net/bugs/1855140

Title:
  How to handle tmpfiles.d in non-systemd environments

Status in haproxy package in Ubuntu:
  Opinion
Status in systemd package in Ubuntu:
  New

Bug description:
  This is a general issue about systemd features like tmpfiles.d which
  won't run in some environments like docker containers.

  Packages more and more rely on that with haproxy being the example
  that opened the bug, but clearly not the only one.

  I wanted to add tasks for all affected packages, but a quick check
  showed that there are almost too many.

  $ apt-file search tmpfiles.d | cut -d':' -f 1 | sort | uniq
  129 at the moment and probably increasing.

  List of affected as of Dec 2019:
  acmetool anytun apt-cacher-ng bacula-common bind9 binkd bley bzflag-server 
ceph-common certmonger cockpit-ws colord connman courier-authdaemon 
courier-imap courier-ldap courier-mlm courier-mta courier-pop cryptsetup-bin 
cyrus-common dbus dhcpcanon diaspora-common dnssec-trigger ejabberd fail2ban 
firebird3.0-server freeipa-client freeipa-server glusterfs-server gvfs-common 
haproxy hddemux heartbeat htcondor i2pd inn inspircd iodine knot knot-resolver 
krb5-otp laptop-mode-tools lemonldap-ng-fastcgi-server libreswan lighttpd lirc 
lvm2 mailman mailman3 mailman3-web man-db mandos memcached mon mpd munge 
munin-common myproxy-server nagios-nrpe-server ngircd nrpe-ng nscd nsd 
nullmailer nut-client nut-server opencryptoki opendkim opendmarc 
opendnssec-enforcer opendnssec-signer opennebula opennebula-sunstone opensips 
open-vm-tools-desktop openvpn passwd pesign php7.2-fpm pidentd ploop 
postgresql-common prads prelude-correlator prelude-lml prelude-manager puppet 
pushpin resource-agents rkt rpcbind rsyslog samba-common-bin screen 
shairport-sync shibboleth-sp2-utils slurmctld slurmd slurmdbd sogo 
spice-vdagent sqwebmail sslh sudo sudo-ldap systemd systemd-container tcpcryptd 
tinyproxy tuned ulogd2 uptimed vrfydmn vsftpd w1retap-doc wdm 
wesnoth-1.12-server x2goserver-common xpra yadifa zabbix-agent 
zabbix-java-gateway zabbix-proxy-mysql zabbix-proxy-pgsql zabbix-proxy-sqlite3 
zabbix-server-mysql zabbix-server-pgsql

  Handling of these heavily depends on the recent Debian GR [1].

  I'd suggest we wait to see how that turns out and then consider how
  (if?) to handle it in a central place, probably systemd or a
  derivative tool, as has started to be discussed in [2].

  If possible I'd avoid fixes in individual packages as it encourages
  growth of various workarounds for a problem that needs a general
  solution.
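
Purely as an illustration of what a central helper could do in a container without systemd, here is a minimal sketch that creates the directories declared by "d" lines of a tmpfiles.d-style file (columns: type, path, mode, user, group, age). The function name `apply_tmpfiles_dirs` is hypothetical, ownership and all other line types are deliberately ignored, and the haproxy line in the usage comment is an assumed example, not necessarily what the package ships.

```shell
#!/bin/sh
# apply_tmpfiles_dirs CONF [ROOT]
# Creates the directories declared by "d" lines in a tmpfiles.d-style file.
# Only the type, path and mode columns are honoured; user, group, age and
# every other line type are ignored. ROOT is an optional path prefix.
apply_tmpfiles_dirs() {
    conf="$1"
    root="${2:-}"
    while read -r type path mode _rest; do
        case "$type" in
            d) ;;
            *) continue ;;      # skips comments, blank lines, other types
        esac
        mkdir -p "$root$path" || return 1
        case "$mode" in
            ''|-) ;;            # no explicit mode: keep mkdir's default
            *) chmod "$mode" "$root$path" || return 1 ;;
        esac
    done < "$conf"
}

# Example (assumed file content "d /run/haproxy 2775 haproxy haproxy -"):
# apply_tmpfiles_dirs /usr/lib/tmpfiles.d/haproxy.conf
```

This is far from a replacement for systemd-tmpfiles; it only shows that the directory-creation part of the problem is small enough to be solved once rather than in every package.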

  [1]: https://www.debian.org/vote/2019/vote_002
  [2]: https://lists.debian.org/debian-devel/2019/12/msg00060.html

  --- Original report below ---

  When installing the haproxy package from the current Ubuntu 18.04
  Bionic repos, the package does not install the directory /run/haproxy.
  This directory is mentioned in the default config file
  /etc/haproxy/haproxy.cfg:

   stats socket /run/haproxy/admin.sock mode 660 level admin expose-fd listeners

  Starting HAProxy manually will show the following error:

  # /usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg
  [ALERT] 337/154339 (24) : Starting frontend GLOBAL: cannot bind UNIX socket [/run/haproxy/admin.sock]

  After manual creation of the directory, the start works:

  # mkdir /run/haproxy

  # /usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg

  # ps auxf
  USER   PID %CPU %MEM   VSZ   RSS TTY   STAT START   TIME COMMAND
  root    10  0.1  0.0 18616  3416 pts/0 Ss   15:42   0:00 /bin/bash
  root    32  0.0  0.0 34400  2900 pts/0 R+   15:45   0:00  \_ ps auxf
  root     1

[Ubuntu-ha] [Bug 1863174] [NEW] Keepalived Loses VIP on DHCP Renewal

2020-02-13 Thread Matthew Roark

Public bug reported:

Ubuntu Release: 16.04.6 LTS
Keepalived Package Version: 1.2.24-1ubuntu0.16.04.2

-- /etc/keepalived/keepalived.conf --
vrrp_script chk_apiserver {
    script "curl https://127.0.0.1:443/healthz --cacert ca.crt --key request.key --cert request.crt --fail > /dev/null 2>&1"
    interval 10
    fall 6
    rise 2
}

vrrp_instance K8S_APISERVER {
    interface ens3
    state BACKUP
    virtual_router_id 118
    nopreempt
    dont_track_primary

    authentication {
        auth_type AH
        auth_pass **REDACTED**
    }

    virtual_ipaddress {
        10.128.233.23
    }
    track_script {
        chk_apiserver
    }
}

Expected Behavior: Upon DHCP renewal, Keepalived would maintain its VIP
on the designated interface. In the case that the VIP is lost (outside
of its control), it should failover to another VRRP instance.

Actual Behavior: VIP disappeared from the designated interface, and did
not failover to any other VRRP instance until Keepalived was restarted.

2: ens3:  mtu 1450 qdisc pfifo_fast state UP group default qlen 1000
    link/ether fa:16:3e:8c:d2:e1 brd ff:ff:ff:ff:ff:ff
    inet 172.20.34.50/27 brd 172.20.34.63 scope global ens3
       valid_lft forever preferred_lft forever
    inet6 fe80::f816:3eff:fe8c:d2e1/64 scope link
       valid_lft forever preferred_lft forever

-- /var/log/syslog --
Feb  5 21:13:17 dhclient[839]: DHCPDISCOVER on ens3 to 255.255.255.255 port 67 interval 3 (xid=0x4cfc595e)
Feb  5 21:13:17 dhclient[839]: DHCPREQUEST of 172.20.34.50 on ens3 to 255.255.255.255 port 67 (xid=0x5e59fc4c)
Feb  5 21:13:17 dhclient[839]: DHCPOFFER of 172.20.34.50 from 172.20.34.36
Feb  5 21:13:17 dhclient[839]: DHCPACK of 172.20.34.50 from 172.20.34.36
Feb  5 21:13:17 dhclient[839]: bound to 172.20.34.50 -- renewal in 40846 seconds.
Feb  5 21:13:19 ntpd[19295]: Deleting interface #34 ens3, 172.20.34.40#123, interface stats: received=0, sent=0, dropped=0, active_time=150821 secs


-- /tmp/keepalived.stats --
VRRP Instance: K8S_APISERVER
  Advertisements:
    Received: 10722
    Sent: 153463
  Became master: 1
  Released master: 0
  Packet Errors:
    Length: 0
    TTL: 0
    Invalid Type: 0
    Advertisement Interval: 0
    Address List: 0
  Authentication Errors:
    Invalid Type: 0
    Type Mismatch: 0
    Failure: 15
  Priority Zero:
    Received: 0
    Sent: 0

Note: the networking.service was *not* restarted during this timeframe;
however, I have been able to reproduce the issue in that manner.

Additionally, I've not been able to reproduce this issue by hand, i.e.
'dhclient -v -r ens3'. It seemingly only occurs when the lease has
expired; it is the only time in which we can observe ntpd detecting that
the interface has disappeared (thus the 'Deleting interface' message) at
least.

** Affects: keepalived (Ubuntu)
 Importance: Undecided
 Status: New

