We've been investigating a similar issue in Ubuntu 20.04 (and now 22.04)
on Azure, where Running PPS re-use fails to perform DHCP for 5 minutes.
dhclient is invoked by cloud-init but sees no DHCPOFFER.  The failure
rate varies for unknown reasons, but it has affected ~0.3-2% of
deployments in this scenario over time.

We instrumented our images to capture network traffic and, sure enough,
DHCP offers are coming through to the guest but dhclient doesn't see
them.  We also instrumented dhclient, and the "got_one()" callback is
never invoked in these failures.
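
For anyone checking a capture the same way, here is a minimal sketch of how a
UDP payload can be classified as a DHCP OFFER.  It is not the tooling we used:
the function name is made up for illustration, and it assumes you have already
extracted the UDP payload (ports 67/68) from the pcap by other means.  It only
relies on the standard BOOTP layout (236-byte fixed header, magic cookie, then
options) and option 53, the DHCP message type.

```python
DHCP_MAGIC_COOKIE = b"\x63\x82\x53\x63"  # marks the start of DHCP options
BOOTP_FIXED_LEN = 236                    # fixed BOOTP header size in bytes
OPT_PAD, OPT_END, OPT_MSG_TYPE = 0, 255, 53

MSG_TYPES = {
    1: "DISCOVER", 2: "OFFER", 3: "REQUEST", 4: "DECLINE",
    5: "ACK", 6: "NAK", 7: "RELEASE", 8: "INFORM",
}


def dhcp_message_type(payload: bytes):
    """Return the DHCP message type name for a UDP payload, or None.

    Hypothetical helper for illustration; None means the payload is not
    DHCP or carries no message-type option.
    """
    start = BOOTP_FIXED_LEN + len(DHCP_MAGIC_COOKIE)
    if len(payload) < start:
        return None
    if payload[BOOTP_FIXED_LEN:start] != DHCP_MAGIC_COOKIE:
        return None

    i = start
    while i < len(payload):
        code = payload[i]
        if code == OPT_PAD:          # single-byte padding, no length field
            i += 1
            continue
        if code == OPT_END:          # end of options
            break
        if i + 1 >= len(payload):    # truncated option header
            break
        length = payload[i + 1]
        value = payload[i + 2:i + 2 + length]
        if code == OPT_MSG_TYPE and len(value) == 1:
            return MSG_TYPES.get(value[0], f"UNKNOWN({value[0]})")
        i += 2 + length
    return None
```

In a failing run you would expect to see payloads classified as DISCOVER
leaving the guest and OFFER arriving, even though dhclient never acts on the
latter.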

18.04 does not have this issue.

This behavior can be reproduced in multiple ways:
- Reproduce a test environment similar to the above scenario using cloud-init
(switch the hyperv nic to a different vnet while waiting for the link status to
reset, then perform dhcp).  This test case will reproduce in ~1,500 runs, though
it varies and requires a more complex setup.
- Repeatedly run dhclient in a loop until it fails (see test-sequential.sh).
It may take a while, but even this simple test will reproduce this behavior in
~50k runs for me in an LXD VM.
- Simply launch instances of dhclient in parallel (see test-parallel.sh). There
is an excellent chance at least one of those dhclients will fail this way.
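
To give a rough idea of the shape of the sequential and parallel reproductions
above, here is a minimal sketch.  It is not the attached test-sequential.sh or
test-parallel.sh; the function names, the timeout handling and the idea of
passing the command on the command line are all assumptions for illustration.
A run counts as failed if the command exits non-zero or does not finish within
the timeout.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run_once(cmd, timeout):
    """Run cmd once; True if it exits 0 within `timeout` seconds."""
    try:
        result = subprocess.run(
            cmd,
            timeout=timeout,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising on timeout.
        return False
    return result.returncode == 0


def run_sequential(cmd, max_runs, timeout):
    """Run cmd repeatedly; return the 1-based index of the first failure,
    or None if all `max_runs` attempts succeeded."""
    for attempt in range(1, max_runs + 1):
        if not run_once(cmd, timeout):
            return attempt
    return None


def run_parallel(cmd, count, timeout):
    """Launch `count` copies of cmd concurrently; return how many failed."""
    with ThreadPoolExecutor(max_workers=count) as pool:
        results = list(pool.map(lambda _: run_once(cmd, timeout), range(count)))
    return results.count(False)


if __name__ == "__main__":
    # Hypothetical usage: the command to test is given as arguments.
    command = sys.argv[1:]
    if not command:
        sys.exit("usage: repro.py COMMAND [ARGS...]")
    failed_at = run_sequential(command, max_runs=50000, timeout=60)
    print("no failure" if failed_at is None else f"failed on run {failed_at}")
```

The command passed in would be something that obtains a lease and cleans up
after itself, for example a small wrapper around `dhclient -1` on a test
interface; that invocation is my assumption and the exact arguments depend on
your setup.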

I noticed the uprev of bind9 libs in focal:
focal (net): 1:9.11.16+dfsg-3~build1
focal-updates (net): 1:9.11.16+dfsg-3~ubuntu1
impish (net): 1:9.11.19+dfsg-2.1ubuntu1
jammy (net): 1:9.11.19+dfsg-2.1ubuntu3
kinetic (net): 1:9.11.19+dfsg-2.1ubuntu3

I couldn't find any related issue on the isc-dhcp tracker, etc.  I did
build dhclient from the Debian master branch
(https://salsa.debian.org/debian/isc-dhcp/-/commits/master/debian) which
uses the in-tree bind libs and that seems to have addressed the issue
for all scenarios.  Not that it helps much to bisect this just yet.

** Attachment added: "parallel test"
   https://bugs.launchpad.net/ubuntu/+source/isc-dhcp/+bug/1926139/+attachment/5593045/+files/test-parallel.sh

https://bugs.launchpad.net/bugs/1926139

Title:
  dhclient doesn't receive dhcp offer from kernel

Status in isc-dhcp package in Ubuntu:
  New

Bug description:
  Platform: Qemu/libvirt on AMD64
  Ubuntu version: 20.04
  isc-dhcp-client version: 4.4.1-2.1ubuntu5
  Problem: When dhclient is used during boot, every few reboots the DHCP
  OFFER packets aren't pushed from the kernel to dhclient. The DISCOVER
  packets can be seen in strace and tcpdump. The OFFER packets can be seen
  in tcpdump, but no read event is triggered.
  Ubuntu 18.04 doesn't have the problem, and neither does Debian 10.
  Building these dhclient versions on Ubuntu 20.04 alleviates the problem a
  little, but it still occurs. So this issue might also be kernel related.

  Attached diff shows a strace of all threads and a pcap showing the
  tcpdump output.

  Edit:
  - Sometimes the dhclient command does receive the OFFER packet and the
    connection is restored.
  - In my testing, running dhclient manually from the terminal when the
    OFFERs aren't received results in a new dhclient session which does
    receive the OFFER packet, and the connection is restored.
