Hi fellas,

Apologies for the brevity of the initial bug report.  I filed it with the
reportbug tool directly from the console of the VM I was working on, which
has a small resolution.  Allow me to elaborate...

We initially discovered this bug while testing our storage product.  We had
a Debian 10 VM running in a typical ESXi 6.7 environment with iSCSI-backed
storage.  The VM's disk was a VMDK file on a VMFS datastore volume.  While
the VM was running, we purposely removed the storage initiators from ESXi
to simulate a storage outage for an unrelated test.  After a couple of
minutes without its disk, the OS goes into R/O mode, and at that point
dhclient starts rapidly requesting IPs from our ISC DHCP server.  dhclient
takes an IP, consumes it from the DHCP pool, and then requests another.
Over time this depletes the DHCP pool; it takes several hours to days,
depending on the scope's size.  The problem can also be reproduced by
deleting the hard disk from a running VM in a virtual environment.

When I look at the dhclient service with systemctl, I can see an error,
"can't create /var/lib/dhcp/dhclient.intname.leases Read Only file system",
and then the DHCPREQUEST > DHCPACK > DHCPDECLINE sequence repeats every few
seconds.  Occasionally the service also shows "RTNETLINK answers: File
Exists".

My guess from the error is that dhclient, unable to read or write the
client leases file, declines the IP and requests another, but silently
holds on to the IP it just declined.

The DHCP server logs show a final DHCPDECLINE after the ACK, and the server
marks the address as abandoned.  The VM still holds the address, however.
After a while, VMware's guest tools show all the consumed IPs as belonging
to that MAC address and virtual interface.  The ARP tables on our network
gear show the IPs belonging to the same MAC as well.
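
In case it helps anyone check their own DHCP server for the same pattern,
here is a rough Python sketch that counts how many distinct addresses each
MAC has declined.  It is only an illustration: the regex assumes log lines
shaped like "DHCPDECLINE of <ip> from <mac> ...", which may not match every
dhcpd version or syslog prefix, and the sample lines, MAC address and
threshold are made up for the example rather than taken from our logs.

```python
import ipaddress
import re
from collections import defaultdict

# Assumed line shape; adjust to whatever your DHCP server actually logs.
DECLINE_RE = re.compile(
    r"DHCPDECLINE of (?P<ip>\d{1,3}(?:\.\d{1,3}){3}) "
    r"from (?P<mac>[0-9a-fA-F]{2}(?::[0-9a-fA-F]{2}){5})"
)


def declined_by_mac(lines):
    """Map each client MAC to the set of addresses it declined."""
    declined = defaultdict(set)
    for line in lines:
        match = DECLINE_RE.search(line)
        if match:
            declined[match.group("mac").lower()].add(match.group("ip"))
    return dict(declined)


def suspects(lines, threshold=3):
    """Return MACs that declined at least `threshold` distinct addresses."""
    return {
        mac: sorted(ips, key=ipaddress.ip_address)
        for mac, ips in declined_by_mac(lines).items()
        if len(ips) >= threshold
    }


# Made-up sample lines (documentation address range, hypothetical MAC).
SAMPLE = [
    "DHCPREQUEST for 192.0.2.10 from 00:50:56:aa:bb:cc via eth0",
    "DHCPACK on 192.0.2.10 to 00:50:56:aa:bb:cc via eth0",
    "DHCPDECLINE of 192.0.2.10 from 00:50:56:aa:bb:cc via eth0: abandoned",
    "DHCPDECLINE of 192.0.2.11 from 00:50:56:aa:bb:cc via eth0: abandoned",
    "DHCPDECLINE of 192.0.2.9 from 00:50:56:AA:BB:CC via eth0: abandoned",
    "DHCPDECLINE of 192.0.2.50 from 00:50:56:11:22:33 via eth0: abandoned",
]

if __name__ == "__main__":
    for mac, ips in suspects(SAMPLE).items():
        print(mac, "declined", len(ips), "addresses:", ", ".join(ips))
```

To try it against a real log, something like suspects(open(path)) should
work, where path is wherever your dhcpd messages end up.  A single MAC
declining many distinct addresses in a short window would be consistent
with the behaviour described above.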

We've consistently reproduced this bug in our lab, and ran the same test
simultaneously against Debian 9, CentOS and Ubuntu 16 instances to make
sure it wasn't some kind of NetworkManager thing or a broader Linux issue.

I see that someone reported a similar bug back in 2018 as well; I think
they may be the same thing.
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=888209

Thanks, just let me know if you have any questions.
