[Kernel-packages] [Bug 1652348] Re: initrd dhcp fails / ignores valid response
I have instrumented ipconfig and determined that the ultimate source of the problem is that, when there are multiple interfaces, ipconfig depends on the order in which the kernel probes the network interfaces. For whatever reason, the -31 kernel probes the network devices in one order (e.g., ens3 then ens4), and the -57 kernel in the other order (ens4 first, then ens3).

The probe order of network devices (and PCI devices in general) is explicitly not defined, so this is not a bug in the kernel itself; ipconfig is failing because it depends on a specific enumeration order.

The issue in ipconfig is that it uses a single packet socket to multiplex packet traffic on multiple interfaces. Presuming that ens3 will answer DHCP and ens4 will not, in the case that works, the order ends up being something like:

  send DHCP request on ens3
  send DHCP request on ens4
  [ system gets DHCP response via ens3 ]
  try to receive DHCP reply sent by peer for ens3; this matches, and all is happy

In the case that fails, the sequence is roughly:

  send DHCP request on ens4
  send DHCP request on ens3
  [ system gets DHCP response via ens3 ]
  try to receive DHCP reply sent by peer for ens4; the reply is actually for ens3, so ipconfig throws it away (as the XID, et al., don't match what is expected for the ens4 DHCP request)

This repeats until ipconfig gives up.

As I said above, the issue is that ipconfig is trying to multiplex traffic for two interfaces on one packet socket. This is fine for sending, but when receiving on an unbound packet socket, there is no way to ask for a packet that arrived on a specific interface. Packets are delivered to recvfrom/recvmsg in the order received.

I note that ipconfig sets sll.sll_ifindex on the msghdr provided to the recvfrom and recvmsg system calls; perhaps the author believed that this limits received packets to only those received on that ifindex. This is not the case: the sll_ifindex passed to recvfrom/recvmsg is ignored.
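The two sequences above can be modelled without any real sockets. The following is only a toy simulation of the behaviour described in this comment, not ipconfig's code: the names Reply, shared_socket and per_interface_sockets, and the XID values, are made up for illustration. It contrasts one shared receive queue with one queue per interface.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Reply:
    ifname: str  # interface the reply actually arrived on
    xid: int     # DHCP transaction ID echoed back by the server


def shared_socket(probe_order, xids, replies, max_rounds=3):
    """Model of a single unbound socket: one FIFO shared by all interfaces.

    In each round, every interface (in probe order) takes the next queued
    reply and keeps it only if the XID matches its own request. A reply
    that does not match is thrown away, as described above.
    """
    for _ in range(max_rounds):
        queue = deque(replies)  # the server answers every retry again
        for ifname in probe_order:
            if not queue:
                break
            reply = queue.popleft()
            if reply.xid == xids[ifname]:
                return ifname
            # mismatch: the reply is discarded and lost for everyone
    return None


def per_interface_sockets(probe_order, xids, replies):
    """Model of one socket per interface: each has its own queue."""
    queues = {ifname: deque() for ifname in probe_order}
    for reply in replies:
        queues[reply.ifname].append(reply)
    for ifname in probe_order:
        while queues[ifname]:
            if queues[ifname].popleft().xid == xids[ifname]:
                return ifname
    return None


# Hypothetical transaction IDs; only ens3 gets an answer.
xids = {"ens3": 0x1111, "ens4": 0x2222}
replies = [Reply("ens3", 0x1111)]

print(shared_socket(["ens3", "ens4"], xids, replies))          # ens3
print(shared_socket(["ens4", "ens3"], xids, replies))          # None
print(per_interface_sockets(["ens4", "ens3"], xids, replies))  # ens3
```

With the shared queue the outcome flips with the probe order, while the per-interface model is order-independent. On Linux, packet(7) documents bind() with a sockaddr_ll (sll_protocol and sll_ifindex) as the way to restrict a packet socket to one interface, which is what a per-interface design would rely on.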
I'm looking into whether or not there is a simple fix for this that will let ipconfig function without major rework to utilize one packet socket per interface.

** Tags removed: kernel-key

** Package changed: linux (Ubuntu) => klibc (Ubuntu)

** Changed in: klibc (Ubuntu)
       Status: Triaged => Confirmed

** Changed in: klibc (Ubuntu)
     Assignee: (unassigned) => Jay Vosburgh (jvosburgh)

-- 
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1652348

Title:
  initrd dhcp fails / ignores valid response

Status in klibc package in Ubuntu:
  Confirmed
Status in klibc source package in Xenial:
  Triaged

Bug description:
  Between kernel versions 4.4.0-53 and 4.4.0-57 a bug has been (re?)introduced that is breaking dhcp booting in the initrd environment. This is stopping instances that use iscsi storage from being able to connect.

  Over serial console it outputs:

  IP-Config: no response after 2 secs - giving up
  IP-Config: ens2f0 hardware address 90:e2:ba:d1:36:38 mtu 1500 DHCP RARP
  IP-Config: ens2f1 hardware address 90:e2:ba:d1:36:39 mtu 1500 DHCP RARP
  IP-Config: no response after 3 secs - giving up

  with increasing delays until it fails. At which point a simple ipconfig -t dhcp -d "ens2f0" works. The console output is slightly garbled but should give you an idea:

  (initramfs) ipconfig -t dhcp -[ 728.379793] ixgbe :13:00.0 ens2f0: changing MTU from 1500 to 9000
  d "ens2f0"
  IP-Config: ens2f0 hardware address 90:e2:ba:d1:36:38 mtu 1500 DHCP RARP
  IP-Config: ens2f0 guessed broadcast address 10.0.1.255
  IP-Config: ens2f0 complete (dhcp from 169.254.169.254):
   addres[ 728.980448] ixgbe :13:00.0 ens2f0: detected SFP+: 3
  s: 10.0.1.56broadcast: 10.0.1.255 netmask: 255.255.255.0
   gateway: 10.0.1.1 [ 729.148410] ixgbe :13:00.0 ens2f0: NIC Link is Up 10 Gbps, Flow Control: RX/TX
   dns0 : 169.254.169.254 dns1 : 0.0.0.0
   rootserver: 169.254.169.254 rootpath:
   filename : /ipxe.efi

  tcpdumps show that dhcp requests are being received from the host, and responses sent, but not accepted by the host. When the ipconfig command is issued manually, an identical dhcp request and response happens, only this time it is accepted. It doesn't appear to be that the messages are being sent and received incorrectly, just silently ignored by ipconfig.

  I was seeing this behaviour earlier this year, which I was able to fix by specifying "ip=dhcp" as a kernel parameter. About a month ago that was identified as causing us other problems (long story) and we dropped it, at which point we discovered the original bug was no longer an issue. Putting "ip=dhcp" back on with this kernel no longer fixes the problem.

  I've compared the two initrds and effectively the only thing that has changed between the two is the kernel components.

  Ubuntu kernel bisect offending commit:
  # first bad commit: [fd4b5fa6e3487d15ede746f92601af008b2abbc0] mnt: Add a per mount namespace limit on the number of mounts

  Ubuntu kernel bisect offending commit submission:
  https://lkml.org/lkml/2016/10/5/308
[Kernel-packages] [Bug 1652348] Re: initrd dhcp fails / ignores valid response
I have reproduced the described issue locally using the instructions from comment 35; will start looking into the cause.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1652348/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp
[Kernel-packages] [Bug 1652348] Re: initrd dhcp fails / ignores valid response
** Changed in: linux (Ubuntu)
       Status: Incomplete => Triaged

** Also affects: linux (Ubuntu Xenial)
   Importance: Undecided
       Status: New

** Changed in: linux (Ubuntu Xenial)
   Importance: Undecided => High

** Changed in: linux (Ubuntu Xenial)
       Status: New => Triaged
[Kernel-packages] [Bug 1652348] Re: initrd dhcp fails / ignores valid response
Just a note that I'm setting up to try the reproduction instructions from comment #35.
[Kernel-packages] [Bug 1652348] Re: initrd dhcp fails / ignores valid response
** Tags removed: kernel-da-key
** Tags added: kernel-key
[Kernel-packages] [Bug 1652348] Re: initrd dhcp fails / ignores valid response
Paul, thank you for the recreate instructions. This will help the support team immensely.
[Kernel-packages] [Bug 1652348] Re: initrd dhcp fails / ignores valid response
I took a step back from bisecting and focussed on creating a replication scenario, which I've done successfully. ipconfig is struggling to handle things when two interfaces are present and sending out DHCP requests, even if one interface doesn't get a response.

Here's what I've done:

Using virt-manager I created a bridge, bridge1, with no IP range associated with it (I want dnsmasq on a host to handle IP). I created a second, bridge2, likewise with no IP range associated with it, ready for later use.

$$$

I created an instance, named primary, with two NICs: one doing the usual NAT stuff so it has internet access, and one hooked up to bridge1. I gave it two storage devices: one (sda) 15 GB in size to act as local storage, and one (sdb) 40 GB in size to be hosted over iSCSI (in hindsight, no reason for it not to be 15 GB too).

Install Ubuntu 16.04.1 LTS on the primary instance, pretty much following through with defaults, but leaving the second hard drive unused.

Reboot and bring up the instance. In my case I end up with ens3 being the NATing interface and ens9 being hooked up to the bridge interface.

##
sudo apt update
sudo apt upgrade
##

Add to /etc/network/interfaces:

auto ens9
iface ens9 inet static
    address 192.168.0.1/24

##

Then:

sudo apt install open-iscsi targetcli dnsmasq

##

dnsmasq config:

log-queries
log-dhcp
interface=ens9
dhcp-range=192.168.0.50,192.168.0.150,12h
dhcp-boot=script.ipxe
enable-tftp
tftp-root=/tftpd
tftp-no-fail

##

Then run targetcli and do the following commands:

backstores/iblock create uefi /dev/sdb
/iscsi create iqn.2015-02.oracle.boot:uefi
cd iqn.2015-02.oracle.boot:uefi/tpg1
luns/ create /backstores/block/uefi
portals/ create 0.0.0.0
set attribute authentication=0 demo_mode_write_protect=0 generate_node_acls=1 cache_dynamic_acls=1
exit

##
sudo mkdir /tftpd
sudo chown dnsmasq: /tftpd
##

/tftpd/script.ipxe:

#!ipxe
set initiator-iqn iqn.2015-02.oracle.boot:uefi
sanboot iscsi:192.168.0.1::::iqn.2015-02.oracle.boot:uefi

##

This gets the host pretty much ready to act as an iSCSI target for another host. The host has been patched etc., so reboot. You may want to set up IP forwarding etc. on this instance.

$$$

Second host: No storage. Attach the Ubuntu 16.04.1 LTS ISO to the instance to boot from initially. Two NICs: the first attached to bridge1, the second attached to bridge2.

Go through the installation procedure, logging in to the iSCSI endpoint on 192.168.0.1 using the details above (no username/password necessary with this configuration), and install to the iSCSI target.

At the end, detach the CD-ROM and ensure everything is set up to network boot. On start-up you should see it network boot happily; everything is awesome.

Do a "sudo apt update" and "sudo apt upgrade". Then reboot.

On start-up you should see the bug happening. ipconfig is sending out DHCP requests on both interfaces and failing to accept any responses it is being sent ("journalctl -xef -u dnsmasq" on primary shows it is sending them).

If you remove that second NIC, you'll see that the instance is able to boot happily.
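For anyone adapting script.ipxe to a different target: iPXE documents its iSCSI SAN URI as iscsi:<servername>:<protocol>:<port>:<LUN>:<targetname>, where the middle fields may be left empty to take defaults. A minimal sketch of assembling such a URI follows; the helper name ipxe_iscsi_uri is made up for illustration and is not part of iPXE or any library.

```python
def ipxe_iscsi_uri(server, target, protocol="", port="", lun=""):
    """Assemble iscsi:<servername>:<protocol>:<port>:<LUN>:<targetname>.

    Empty protocol/port/LUN fields are kept as empty strings between the
    colons, which leaves the choice of defaults to iPXE.
    """
    return f"iscsi:{server}:{protocol}:{port}:{lun}:{target}"


# Values taken from the reproduction steps above.
uri = ipxe_iscsi_uri("192.168.0.1", "iqn.2015-02.oracle.boot:uefi")
print(f"sanboot {uri}")
# prints: sanboot iscsi:192.168.0.1::::iqn.2015-02.oracle.boot:uefi
```

The target IQN itself contains a colon, so it has to stay in the last field.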
[Kernel-packages] [Bug 1652348] Re: initrd dhcp fails / ignores valid response
** Tags added: kernel-da-key
[Kernel-packages] [Bug 1652348] Re: initrd dhcp fails / ignores valid response
I'm continuing to bisect the mainline Linux kernel, and also trying to see if I can create a straightforward reproducible example.

The first focus of bisecting was between 4.5 and 4.6, to figure out what changed to suddenly have ipconfig working. I've tracked it down to this using bisect, and validated it afterwards:

commit 689de1d6ca95b3b5bd8ee446863bf81a4883ea25
Author: Linus Torvalds
Date:   Mon May 2 12:46:42 2016 -0700

    Minimal fix-up of bad hashing behavior of hash_64()

    This is a fairly minimal fixup to the horribly bad behavior of hash_64() with certain input patterns.

    In particular, because the multiplicative value used for the 64-bit hash was intentionally bit-sparse (so that the multiply could be done with shifts and adds on architectures without hardware multipliers), some bits did not get spread out very much. In particular, certain fairly common bit ranges in the input (roughly bits 12-20: commonly with the most information in them when you hash things like byte offsets in files or memory that have block factors that mean that the low bits are often zero) would not necessarily show up much in the result.

    There's a bigger patch-series brewing to fix up things more completely, but this is the fairly minimal fix for the 64-bit hashing problem. It simply picks a much better constant multiplier, spreading the bits out a lot better.

    NOTE! For 32-bit architectures, the bad old hash_64() remains the same for now, since 64-bit multiplies are expensive. The bigger hashing cleanup will replace the 32-bit case with something better.

    The new constants were picked by George Spelvin who wrote that bigger cleanup series. I just picked out the constants and part of the comment from that series.

    Cc: sta...@vger.kernel.org
    Cc: George Spelvin
    Cc: Thomas Gleixner
    Signed-off-by: Linus Torvalds

Next up is tracking down what changed between 4.7 and 4.8.
https://bugs.launchpad.net/bugs/1652348

Title:
  initrd dhcp fails / ignores valid response

Status in linux package in Ubuntu:
  Incomplete

Bug description:
  Between kernel versions 4.4.0-53 and 4.4.0-57 a bug has been (re?)introduced that is breaking dhcp booting in the initrd environment. This is stopping instances that use iscsi storage from being able to connect.

  Over serial console it outputs:

  IP-Config: no response after 2 secs - giving up
  IP-Config: ens2f0 hardware address 90:e2:ba:d1:36:38 mtu 1500 DHCP RARP
  IP-Config: ens2f1 hardware address 90:e2:ba:d1:36:39 mtu 1500 DHCP RARP
  IP-Config: no response after 3 secs - giving up

  with increasing delays until it fails. At which point a simple ipconfig -t dhcp -d "ens2f0" works. The console output is slightly garbled but should give you an idea:

  (initramfs) ipconfig -t dhcp -[ 728.379793] ixgbe :13:00.0 ens2f0: changing MTU from 1500 to 9000 d "ens2f0"
  IP-Config: ens2f0 hardware address 90:e2:ba:d1:36:38 mtu 1500 DHCP RARP
  IP-Config: ens2f0 guessed broadcast address 10.0.1.255
  IP-Config: ens2f0 complete (dhcp from 169.254.169.254):
   addres[ 728.980448] ixgbe :13:00.0 ens2f0: detected SFP+: 3
  s: 10.0.1.56broadcast: 10.0.1.255 netmask: 255.255.255.0
   gateway: 10.0.1.1 [ 729.148410] ixgbe :13:00.0 ens2f0: NIC Link is Up 10 Gbps, Flow Control: RX/TX
   dns0 : 169.254.169.254 dns1 : 0.0.0.0
   rootserver: 169.254.169.254 rootpath:
   filename : /ipxe.efi

  tcpdumps show that dhcp requests are being received from the host, and responses sent, but not accepted by the host. When the ipconfig command is issued manually, an identical dhcp request and response happens, only this time it is accepted. It doesn't appear to be that the messages are being sent and received incorrectly, just silently ignored by ipconfig.

  I was seeing this behaviour earlier this year, which I was able to fix by specifying "ip=dhcp" as a kernel parameter. About a month ago that was identified as causing us other problems (long story) and we dropped it, at which point we discovered the original bug was no longer an issue.

  Putting "ip=dhcp" back on with this kernel no longer fixes the problem. I've compared the two initrds and effectively the only thing that has changed between the two is the kernel components.

  Ubuntu kernel bisect offending commit:
  # first bad commit: [fd4b5fa6e3487d15ede746f92601af008b2abbc0] mnt: Add a per mount namespace limit on the number of mounts

  Ubuntu kernel bisect offending commit submission:
  https://lkml.org/lkml/2016/10/5/308

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1652348/+subscriptions
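For readers unfamiliar with the hash_64() commit quoted in the comment above, the following is a small Python model of a multiplicative hash (multiply by a 64-bit constant, keep the top bits). It is only a sketch and not kernel code: the two multipliers are written from memory of include/linux/hash.h before and after that commit and should be treated as assumptions to check against the kernel tree, and the helper names are made up for illustration. It feeds in 256 inputs that differ only in bits 12-19, the range the commit message calls out, and counts how many distinct 8-bit buckets come out.

```python
MASK64 = (1 << 64) - 1

# Assumed constants, quoted from memory of include/linux/hash.h.
# Verify them against the kernel source before relying on them.
OLD_MULTIPLIER = 0x9E37FFFFFFFC0001  # 2^63 + 2^61 - 2^57 + 2^54 - 2^51 - 2^18 + 1
NEW_MULTIPLIER = 0x61C8864680B583EB  # denser golden-ratio-based constant


def hash_64(value: int, bits: int, multiplier: int) -> int:
    """Multiplicative hash: multiply modulo 2**64, keep the top `bits` bits."""
    return ((value * multiplier) & MASK64) >> (64 - bits)


def distinct_buckets(multiplier: int, bits: int = 8) -> int:
    """Hash 256 inputs that differ only in bits 12-19; count buckets used."""
    inputs = [i << 12 for i in range(256)]
    return len({hash_64(v, bits, multiplier) for v in inputs})


if __name__ == "__main__":
    for name, mult in (("old", OLD_MULTIPLIER), ("new", NEW_MULTIPLIER)):
        print(f"{name} multiplier: {distinct_buckets(mult)} of 256 buckets used")
```

With the old value as written here, every term of the multiplier except -2^63, -2^18 and +1 is shifted out of the 64-bit product for these inputs, so the 256 inputs collapse into a handful of buckets. This only illustrates the hashing weakness the commit describes. It says nothing about why that change affects ipconfig.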
[Kernel-packages] [Bug 1652348] Re: initrd dhcp fails / ignores valid response
I've tried every version in the v4 series, and a few in v3. None up to and including v4.0 will boot, and none of them output anything on the screen to give me a clue why they're not booting.

So far:

v4.0 = won't boot
v4.1 = ipconfig bug
v4.2 = ipconfig bug
v4.3 = ipconfig bug
v4.4 = ipconfig bug
v4.5 = ipconfig bug
v4.6 = Boots
v4.7 = Boots
v4.8 = ipconfig bug
v4.9 = ipconfig bug
v4.10 = ipconfig bug

I'm getting seriously concerned that "working" is actually the aberration: it works in just two of the ten releases that boot.

I do have two things I should probably bisect there: 1) what changed between 4.5 and 4.6, and 2) what changed between 4.7 and 4.8.

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp
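As a sanity check on the two bisect ranges named in the comment above, here is a short Python sketch that derives them from the version matrix. The data is copied from that comment. The function name and the "fix"/"regression" labels are made up for illustration.

```python
# Results copied from the comment above. "noboot" kernels give no signal
# about ipconfig, so they are ignored when looking for transitions.
RESULTS = [
    ("v4.0", "noboot"),
    ("v4.1", "bug"), ("v4.2", "bug"), ("v4.3", "bug"),
    ("v4.4", "bug"), ("v4.5", "bug"),
    ("v4.6", "ok"), ("v4.7", "ok"),
    ("v4.8", "bug"), ("v4.9", "bug"), ("v4.10", "bug"),
]


def bisect_ranges(results):
    """Return (older, newer, kind) for each adjacent pair whose result flips."""
    testable = [(ver, res) for ver, res in results if res != "noboot"]
    ranges = []
    for (old_ver, old_res), (new_ver, new_res) in zip(testable, testable[1:]):
        if old_res != new_res:
            kind = "fix" if new_res == "ok" else "regression"
            ranges.append((old_ver, new_ver, kind))
    return ranges


if __name__ == "__main__":
    for older, newer, kind in bisect_ranges(RESULTS):
        print(f"bisect {older}..{newer} to find the {kind}")
```

One practical note: the v4.5..v4.6 range is a hunt for the commit that made things work, which is the reverse of git bisect's default good/bad meaning. Recent git versions accept custom terms, for example `git bisect start --term-old=broken --term-new=fixed`, which may be less error-prone than mentally swapping "good" and "bad".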
[Kernel-packages] [Bug 1652348] Re: initrd dhcp fails / ignores valid response
Paul Graydon, thanks for the clarification.

Paraphrasing Linus, "We don't break userspace!" So kernel changes that cause userspace issues would be considered, at least for now, a kernel issue.

Despite this, the Ubuntu kernel commit bisect results are helpful here on Launchpad. However, in order to keep this relevant to upstream, you would want to bisect the mainline kernel as a brand new bisect and see what the results are there.

Once the mainline kernel commit bisect is done, someone from upstream can give their perspective on whether this is root caused to the kernel, user space, or both.

** Tags removed: bisect-done
** Tags added: downstream-bisect-done needs-upstream-bisect

** Description changed:

- Offending commit:
+ Ubuntu kernel bisect offending commit:
  # first bad commit: [fd4b5fa6e3487d15ede746f92601af008b2abbc0] mnt: Add a per mount namespace limit on the number of mounts

- The offending commit submission:
+ Ubuntu kernel bisect offending commit submission:
  https://lkml.org/lkml/2016/10/5/308
[Kernel-packages] [Bug 1652348] Re: initrd dhcp fails / ignores valid response
The more I look at this, the more I'm convinced *most* of the real problem lies in the ipconfig tool. Yes, various kernel changes seem to make it alternate between working and not working (which is bizarre), but unless something is specifically interfering with the inter-process communication, ipconfig appears to be ignoring valid DHCP responses depending only on whether you tell it "all" interfaces or a specific interface.

A small modification could be made to initramfs-tools to have it iterate over the interfaces in the system one at a time. It would marginally slow down the boot if the relevant interface is not the first, but it would get rid of this bug entirely.

Alternatively, the initrd environment could be modified to use dhclient instead of ipconfig. dhclient appears to be in the initrd and works perfectly fine when called in a generic fashion, though the other initramfs-tools scripts seem aware that ipconfig didn't complete successfully, which I haven't looked into.
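The "one interface at a time" idea from the comment above can be sketched as follows. This is Python purely to show the control flow. The real change would live in the initramfs-tools shell scripts, which are not reproduced here. The function names are hypothetical, and the `ipconfig -t <secs> -c dhcp -d <iface>` invocation and its exit-status convention are assumptions about klibc ipconfig that should be checked before use.

```python
import os
import subprocess


def candidate_interfaces(sys_class_net="/sys/class/net"):
    """List network interface names from sysfs, skipping loopback."""
    return sorted(name for name in os.listdir(sys_class_net) if name != "lo")


def run_klibc_ipconfig(iface, timeout):
    """Run ipconfig against a single interface.

    Assumed command line and assumed meaning of exit status 0 (success);
    neither is confirmed by this thread.
    """
    cmd = ["ipconfig", "-t", str(timeout), "-c", "dhcp", "-d", iface]
    return subprocess.run(cmd).returncode == 0


def dhcp_one_at_a_time(interfaces, run_ipconfig=run_klibc_ipconfig, timeout=5):
    """Try DHCP on each interface separately; return the first that succeeds.

    Because each call handles exactly one interface, a reply can never be
    matched against a different interface's request.
    """
    for iface in interfaces:
        if run_ipconfig(iface, timeout):
            return iface
    return None
```

A caller would use something like `dhcp_one_at_a_time(candidate_interfaces())`. The cost is the one noted above: if the interface that answers DHCP is not tried first, boot is delayed by one timeout per non-answering interface.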
[Kernel-packages] [Bug 1652348] Re: initrd dhcp fails / ignores valid response
My apologies for any lack of clarity. I tested against the head of ubuntu-xenial, reverting just that commit, and that fixed it. I tested against the head of the mainline kernel and the revert didn't fix it. (Last night I tried the 4.9, 4.8, 4.5, 4.4 and 4.2 tags of the mainline kernel, and in every one of them I find the general bug in effect.)

I'll try some larger leaps and see if I can track it down elsewhere.
[Kernel-packages] [Bug 1652348] Re: initrd dhcp fails / ignores valid response
Paul Graydon, you advised in https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1652348/comments/26 that reverting the commit worked consistently, but now in https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1652348/comments/28 you are saying the opposite. Could you please clarify?

** Changed in: linux (Ubuntu)
       Status: Triaged => Incomplete
[Kernel-packages] [Bug 1652348] Re: initrd dhcp fails / ignores valid response
I tried reverting that specific commit from upstream, but that didn't resolve the issue. Time for a new round of bisecting the kernel, this time using mainline.
[Kernel-packages] [Bug 1652348] Re: initrd dhcp fails / ignores valid response
Paul Graydon, I wouldn't get too hung up on what appears to be an unrelated code change affecting DHCP. Honestly, this result isn't surprising, given that kernel code is more inter-related than meets the eye.

Despite this, the issue you are reporting is an upstream one. Could you please report this problem, following the instructions verbatim at https://wiki.ubuntu.com/Bugs/Upstream/kernel, to the appropriate mailing list (TO Linus Torvalds and Eric W. Biederman, CC linux-kernel)?

Please provide a direct URL to your post to the mailing list when it becomes available so that it may be tracked.

Thank you for your help.

** Changed in: linux (Ubuntu)
   Importance: Low => High

** Changed in: linux (Ubuntu)
       Status: Incomplete => Triaged

** Description changed:

- I'm going to try and track back through kernel versions to see if I can
- find which version the fix happened in to maybe provide some additional
- context. I'll also attach copies of the initrds, packet captures etc.
+ Offending commit:
+ # first bad commit: [fd4b5fa6e3487d15ede746f92601af008b2abbc0] mnt: Add a per mount namespace limit on the number of mounts
+
+ The offending commit submission:
+ https://lkml.org/lkml/2016/10/5/308

** Tags added: xenial
[Kernel-packages] [Bug 1652348] Re: initrd dhcp fails / ignores valid response
** Tags removed: needs-reverse-bisect
** Tags added: reverse-bisect-done

** Tags added: cherry-pick

** Tags removed: cherry-pick kernel-fixed-upstream kernel-fixed-upstream-4.10-rc1 reverse-bisect-done
** Tags added: bisect-done kernel-bug-exists-upstream kernel-bug-exists-upstream-4.10-rc1
[Kernel-packages] [Bug 1652348] Re: initrd dhcp fails / ignores valid response
This makes no sense to me, as a layman anyway. I checked out the 4.4.0-58.79 tag, reverted that one commit, and confirmed I have a booting 4.4.0-58-generic that'll happily DHCP in the initrd environment on multiple boots.

It really does seem like, somehow, that commit is the source of the problems.
[Kernel-packages] [Bug 1652348] Re: initrd dhcp fails / ignores valid response
I bisected again, and again it came back to that mount point change. This seems so bizarre.

$ git bisect log
# bad: [6d4f0a79e5a307b6fd3ee3cc5bbb2fcb701b09db] UBUNTU: Ubuntu-4.4.0-57.78
# good: [db5f146d309e70067dae57798c9ea679af835aa7] UBUNTU: Ubuntu-4.4.0-53.74
git bisect start 'Ubuntu-4.4.0-57.78' 'Ubuntu-4.4.0-53.74'
# bad: [02bf412367b827aa5be05a315088ef5fdcf267ca] dmaengine: at_xdmac: fix spurious flag status for mem2mem transfers
git bisect bad 02bf412367b827aa5be05a315088ef5fdcf267ca
# bad: [1e089050b800ba7d6ba1bf5814827e6cca301ad5] smc91x: avoid self-comparison warning
git bisect bad 1e089050b800ba7d6ba1bf5814827e6cca301ad5
# bad: [d7632bdaba3dd143eac3c80bb7e2b0f62259583d] xhci: use default USB_RESUME_TIMEOUT when resuming ports.
git bisect bad d7632bdaba3dd143eac3c80bb7e2b0f62259583d
# bad: [7942010de9a2fe39e72b84e628867f4ff29a70f2] libxfs: clean up _calc_dquots_per_chunk
git bisect bad 7942010de9a2fe39e72b84e628867f4ff29a70f2
# good: [9d2524b0bdeb57f80d0279f6695a833606ad0597] UBUNTU: SAUCE: Bluetooth: decrease refcount after use
git bisect good 9d2524b0bdeb57f80d0279f6695a833606ad0597
# bad: [fd4b5fa6e3487d15ede746f92601af008b2abbc0] mnt: Add a per mount namespace limit on the number of mounts
git bisect bad fd4b5fa6e3487d15ede746f92601af008b2abbc0
# good: [f2109fe47ceb77647ef7d4f545efeba43d06fb64] videobuf2-v4l2: Verify planes array in buffer dequeueing
git bisect good f2109fe47ceb77647ef7d4f545efeba43d06fb64
# good: [d5d9494d2092a7e571dee635ca254075912355c1] thinkpad_acpi: Add support for HKEY version 0x200
git bisect good d5d9494d2092a7e571dee635ca254075912355c1
# first bad commit: [fd4b5fa6e3487d15ede746f92601af008b2abbc0] mnt: Add a per mount namespace limit on the number of mounts
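Since both bisect runs in this thread end at the same commit, one cheap sanity check is to compare the verdicts recorded in the two `git bisect log` outputs and look for any commit marked good in one run and bad in the other. What follows is only a minimal Python sketch of that comparison: the function names `verdicts` and `conflicts` are hypothetical, and the sample logs are short excerpts of the logs posted in this thread, not the full output.

```python
import re

# Matches the comment lines "git bisect log" writes for each verdict, e.g.
#   # bad: [fd4b5fa6...] mnt: Add a per mount namespace limit ...
# The "# first bad commit:" summary line does not match and is skipped.
VERDICT_LINE = re.compile(r"^# (good|bad): \[([0-9a-f]+)\]")


def verdicts(log_text):
    """Return {commit_hash: 'good' | 'bad'} parsed from bisect log text."""
    result = {}
    for line in log_text.splitlines():
        match = VERDICT_LINE.match(line.strip())
        if match:
            result[match.group(2)] = match.group(1)
    return result


def conflicts(log_a, log_b):
    """Return commits that the two runs judged differently, sorted."""
    a, b = verdicts(log_a), verdicts(log_b)
    return sorted(h for h in a.keys() & b.keys() if a[h] != b[h])


# Excerpts only (assumption: the full logs would be pasted in here).
RUN_FROM_53 = """\
# bad: [6d4f0a79e5a307b6fd3ee3cc5bbb2fcb701b09db] UBUNTU: Ubuntu-4.4.0-57.78
# good: [d5d9494d2092a7e571dee635ca254075912355c1] thinkpad_acpi: Add support for HKEY version 0x200
# bad: [fd4b5fa6e3487d15ede746f92601af008b2abbc0] mnt: Add a per mount namespace limit on the number of mounts
"""

RUN_FROM_51 = """\
# bad: [6d4f0a79e5a307b6fd3ee3cc5bbb2fcb701b09db] UBUNTU: Ubuntu-4.4.0-57.78
# good: [d5d9494d2092a7e571dee635ca254075912355c1] thinkpad_acpi: Add support for HKEY version 0x200
# bad: [a6e674fa25854a7dafc59555d508855ea8fe3eaa] i2c: xgene: Avoid dma_buffer overrun
# bad: [fd4b5fa6e3487d15ede746f92601af008b2abbc0] mnt: Add a per mount namespace limit on the number of mounts
"""

print(conflicts(RUN_FROM_53, RUN_FROM_51))
```

An empty result only says the two runs never contradicted each other on a commit they both tested; it does not rule out a verdict that was wrong in both runs, which is a real possibility if the failure turns out to be intermittent.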
[Kernel-packages] [Bug 1652348] Re: initrd dhcp fails / ignores valid response
I see where I messed up.. I'll try the bisect again.
[Kernel-packages] [Bug 1652348] Re: initrd dhcp fails / ignores valid response
Okay... I can't help but think I made a mistake somewhere in the bisecting process, but it seems to have isolated fd4b5fa6e3487d15ede746f92601af008b2abbc0 as the bad commit

$ git bisect log
# bad: [6d4f0a79e5a307b6fd3ee3cc5bbb2fcb701b09db] UBUNTU: Ubuntu-4.4.0-57.78
# good: [40a98f0e91bcc062babd017732cbf7cb20cf39fd] UBUNTU: Ubuntu-4.4.0-51.72
git bisect start 'Ubuntu-4.4.0-57.78' 'Ubuntu-4.4.0-51.72'
# bad: [cd29d2303e86529c089b1c292480c05e7a24bd16] drm/i915: Respect alternate_ddc_pin for all DDI ports
git bisect bad cd29d2303e86529c089b1c292480c05e7a24bd16
# bad: [617dec606ff9e43e64a06daef83e17da0035340a] drm/exynos: fix error handling in exynos_drm_subdrv_open
git bisect bad 617dec606ff9e43e64a06daef83e17da0035340a
# bad: [0dbd2050197ea4dd59f8957b72981cb7d2cfab1c] usb: gadget: function: u_ether: don't starve tx request queue
git bisect bad 0dbd2050197ea4dd59f8957b72981cb7d2cfab1c
# bad: [f3f9de1bd9a63b633946226ba23392ad44e2badf] i2c: core: fix NULL pointer dereference under race condition
git bisect bad f3f9de1bd9a63b633946226ba23392ad44e2badf
# good: [a0678a6643bf688bccce3c298a4a110af10988fc] ipv6: correctly add local routes when lo goes up
git bisect good a0678a6643bf688bccce3c298a4a110af10988fc
# good: [a0ae41d8ee0549161174a39d60f7316b67a87cae] Bluetooth: btusb: Add support for 0cf3:e009
git bisect good a0ae41d8ee0549161174a39d60f7316b67a87cae
# good: [d5d9494d2092a7e571dee635ca254075912355c1] thinkpad_acpi: Add support for HKEY version 0x200
git bisect good d5d9494d2092a7e571dee635ca254075912355c1
# bad: [a6e674fa25854a7dafc59555d508855ea8fe3eaa] i2c: xgene: Avoid dma_buffer overrun
git bisect bad a6e674fa25854a7dafc59555d508855ea8fe3eaa
# bad: [fd4b5fa6e3487d15ede746f92601af008b2abbc0] mnt: Add a per mount namespace limit on the number of mounts
git bisect bad fd4b5fa6e3487d15ede746f92601af008b2abbc0
# first bad commit: [fd4b5fa6e3487d15ede746f92601af008b2abbc0] mnt: Add a per mount namespace limit on the number of mounts

From a layman perspective, it doesn't seem like that could possibly cause the bug. I guess one quick way forward, rather than repeat the whole bisecting process, is to completely reset the repository, bring it up to date, verify the bug still exists, and then revert this specific commit and see if the bug goes away.
[Kernel-packages] [Bug 1652348] Re: initrd dhcp fails / ignores valid response
I'll take a fresh look in the morning, but ran into this:

make[1]: Leaving directory '/home/ubuntu/storage/ubuntu-xenial/debian/build/build-generic/zfs/module'
Debug: module-check-generic
install -d /home/ubuntu/storage/ubuntu-xenial/debian.master/abi/4.4.0-54.76/amd64
find /home/ubuntu/storage/ubuntu-xenial/debian/build/build-generic/ -name \*.ko | \
  sed -e 's/.*\/\([^\/]*\)\.ko/\1/' | sort > /home/ubuntu/storage/ubuntu-xenial/debian.master/abi/4.4.0-54.76/amd64/generic.modules
II: Checking modules for generic...previous or current modules file missing!
  /home/ubuntu/storage/ubuntu-xenial/debian.master/abi/4.4.0-54.76/amd64/generic.modules
  /home/ubuntu/storage/ubuntu-xenial/debian.master/abi/4.4.0-54.75/amd64/generic.modules
debian/rules.d/4-checks.mk:12: recipe for target 'module-check-generic' failed
make: *** [module-check-generic] Error 1
[Kernel-packages] [Bug 1652348] Re: initrd dhcp fails / ignores valid response
I can give that a shot, following the instructions here: https://wiki.ubuntu.com/Kernel/KernelBisection#Bisecting_Ubuntu_kernel_versions
[Kernel-packages] [Bug 1652348] Re: initrd dhcp fails / ignores valid response
Paul Graydon, if the bug is reproducible at any interval, then perhaps a standard bisect between:

linux-image-4.4.0-53-generic - Fine
linux-image-4.4.0-57-generic - Affected

would be more appropriate to understand which commit introduced the regression.
[Kernel-packages] [Bug 1652348] Re: initrd dhcp fails / ignores valid response
I should clarify, I know for certain that 4.4.0-51 is stable and reliable (and doesn't exhibit the bug). As part of our attempt to verify everything was correct with the installation we had a system run from Wednesday before Thanksgiving, all the way through to the following Monday, during which time it had an rc.local triggered reboot (so it had to be fully booted).
[Kernel-packages] [Bug 1652348] Re: initrd dhcp fails / ignores valid response
Okay.. this is interesting. It seems like the Ubuntu dev version of 4.10 is actually intermittently failing (?!).

I guess the next thing to do here is keep rebooting on this version of the kernel and see how often the bug occurs vs doesn't occur, so I can get a feel for a reasonable number of times to reboot with each test kernel once I actually start bisecting.

From the dhcp server side I can't see anything different. The requests look the same.
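For an intermittent failure like this, the "reasonable number of reboots" can be estimated with a little arithmetic. If an affected kernel fails a given boot with probability p, then it survives n boots in a row with probability (1 - p)^n, and that is the chance of wrongly marking it good during a bisect. A rough Python sketch follows, assuming boots are independent of each other; the function name `boots_needed` is hypothetical and the failure rates shown are made-up placeholders, since no measured rate has been reported in this thread.

```python
import math


def boots_needed(fail_rate, max_miss=0.05):
    """Smallest n such that (1 - fail_rate) ** n <= max_miss.

    fail_rate: assumed per-boot probability that an affected kernel
               shows the bug (has to be measured first; not known here).
    max_miss:  acceptable chance of calling an affected kernel "good"
               after n clean boots in a row.
    """
    if not 0.0 < fail_rate < 1.0:
        raise ValueError("fail_rate must be strictly between 0 and 1")
    if not 0.0 < max_miss < 1.0:
        raise ValueError("max_miss must be strictly between 0 and 1")
    return math.ceil(math.log(max_miss) / math.log(1.0 - fail_rate))


# Placeholder failure rates, purely for illustration.
for rate in (0.5, 0.25, 0.1):
    print(f"fail rate {rate:.2f}: {boots_needed(rate)} clean boots before 'good'")
```

A single failed boot is enough to mark a kernel bad, so the count only matters for the "good" verdicts. The rarer the failure, the more clean boots each "good" needs, which is worth knowing before committing to a long bisect.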
[Kernel-packages] [Bug 1652348] Re: initrd dhcp fails / ignores valid response
Paul Graydon, to advise, I have updated the article to move the mainline-build-one section out of the way, as it has been distracting here, and for other folks. Feel free to ignore it, as it is for those who build kernels all the time (i.e. N/A here). Also, I don't maintain it, so I won't be able to advise on using it.
[Kernel-packages] [Bug 1652348] Re: initrd dhcp fails / ignores valid response
Rolling that command against master fails too:

ubuntu@Beta:~/linux$ mainline-build-one afd2ff9b7e1b367172f18ba7f693dfb62bdcb2dc xenial
*** BUILDING: commit:afd2ff9b7e1b367172f18ba7f693dfb62bdcb2dc series:xenial abinum: ...
full_version<4.4.0> version<4.4.0> long abinum<040400>
fatal: 'xenial' does not appear to be a git repository
fatal: Could not read from remote repository.

Please make sure you have the correct access rights
and the repository exists.
error: pathspec 'xenial/master' did not match any file(s) known to git.
Deleted branch BUILD.040400 (was 794249c).
Checking out files: 100% (33279/33279), done.
Switched to a new branch 'BUILD.040400'
vvv - build head
commit afd2ff9b7e1b367172f18ba7f693dfb62bdcb2dc
Author: Linus Torvalds
Date: Sun Jan 10 15:01:32 2016 -0800

    Linux 4.4
^^^ - build head
fatal: invalid reference: xenial/master
fatal: invalid reference: xenial/master-next
fatal: invalid reference: xenial/master
fatal: invalid reference: xenial/master-next
On branch BUILD.040400
nothing to commit, working directory clean
*** checking /home/ubuntu/kteam-tools/mainline-build/adhoc/0001-DISABLE-comedi.patch (drivers/staging/comedi/drivers/das08_cs.c 47a4f33c4733880faa50f0e64a6e5c8f 79236ea0358db3c7a7a8a5f081c320b4) ...
md5sum: drivers/staging/ti-st/st_kim.c: No such file or directory
*** checking /home/ubuntu/kteam-tools/mainline-build/adhoc/0002-DISABLE-ti-st.patch (drivers/staging/ti-st/st_kim.c b41944e0c30683bdedb6a66e11098892 ) ...
md5sum: drivers/staging/hv/hv_mouse.c: No such file or directory
*** checking /home/ubuntu/kteam-tools/mainline-build/adhoc/0003-DISABLE-hyperv.patch (drivers/staging/hv/hv_mouse.c afd5524c29871a8293518f0be50a7474 ) ...
*** checking /home/ubuntu/kteam-tools/mainline-build/adhoc/0004-DISABLE-olpc.patch (drivers/staging/olpc_dcon/olpc_dcon_xo_1.c 13b325ae1aeee7f8602759057ed0d1f9 9d099e35d45e22f96c4d77694a5e6c58) ...
*** checking /home/ubuntu/kteam-tools/mainline-build/adhoc/0005-UBUNTU-olpc_dcon_xo_1-needs-delay.h.patch (drivers/staging/olpc_dcon/olpc_dcon_xo_1.c 6a0ae9f73f4878052202473bb952d6e4 9d099e35d45e22f96c4d77694a5e6c58) ...
*** checking /home/ubuntu/kteam-tools/mainline-build/adhoc/0006-UBUNTU-olpc_dcon_xo_1_5-needs-delay.h.patch (drivers/staging/olpc_dcon/olpc_dcon_xo_1_5.c 55c01b13d520fa0cdde88d8d3034f21c 37460a6a542aa92444e9114105621f18) ...
*** checking /home/ubuntu/kteam-tools/mainline-build/adhoc/0007-x86-idle-APM-requires-pm_idle-always-when-it-is-a-mo.patch (arch/x86/kernel/process.c 1ded15dd3a3cb622df182d60160ff826 73538a1ff57235e73e0342d9efa681f5) ...
md5sum: debian/rules.d/2-binary-arch.mk: No such file or directory
*** checking /home/ubuntu/kteam-tools/mainline-build/adhoc/0008-UBUNTU-packaging-do-not-fail-secure-copy-on-older-ke.patch (debian/rules.d/2-binary-arch.mk 647c141b53e037781844f0c04234526e ) ...
md5sum: arch/arm/mach-highbank/clock.c: No such file or directory
*** checking /home/ubuntu/kteam-tools/mainline-build/adhoc/0009-UBUNTU-SAUCE-highbank-export-clock-functions-for-mod.patch (arch/arm/mach-highbank/clock.c 119a926bf04eae5024a3002b626ef8bc ) ...
*** applying /home/ubuntu/kteam-tools/mainline-build/adhoc/any-0001-UBUNTU-SAUCE-add-vmlinux.strip-to-BOOT_TARGETS1-on-p.patch ...
Applying: UBUNTU: SAUCE: add vmlinux.strip to BOOT_TARGETS1 on powerpc
*** applying /home/ubuntu/kteam-tools/mainline-build/adhoc/any-0001-UBUNTU-SAUCE-tools-hv-lsvmbus-add-manual-page.patch ...
Applying: UBUNTU: SAUCE: tools/hv/lsvmbus -- add manual page
*** applying /home/ubuntu/kteam-tools/mainline-build/adhoc/yakkety-0001-disable-pie-when-gcc-has-it-enabled-by-default.patch ...
Applying: UBUNTU: SAUCE: (no-up) disable -pie when gcc has it enabled by default
fatal: Not a valid object name xenial/master-next:debian.master/changelog
dpkg-parsechangelog: warning:-(l0): found end of file where expected first heading
dpkg-parsechangelog: error: fatal error occurred while parsing -
fatal: Not a valid object name xenial/master:debian.master/changelog
dpkg-parsechangelog: warning:-(l0): found end of file where expected first heading
dpkg-parsechangelog: error: fatal error occurred while parsing -
/home/ubuntu/kteam-tools/mainline-build/mainline-build-one: line 291: debian/changelog.new: No such file or directory
mv: cannot stat 'debian/changelog.new': No such file or directory
On branch BUILD.040400
nothing to commit, working directory clean
*** using configs from Ubuntu-0 () ...
fatal: invalid reference: Ubuntu-0
fatal: invalid reference: xenial/
xenial-amd64: chroot not found (::,)
[Kernel-packages] [Bug 1652348] Re: initrd dhcp fails / ignores valid response
Gah.. okay https://wiki.ubuntu.com/KernelTeam/GitKernelBuild
[Kernel-packages] [Bug 1652348] Re: initrd dhcp fails / ignores valid response
Ahh, I see where the kteam tools stuff is supposed to come from. It's not clear if I'm supposed to go down that route and use the mainline-build-one script or not when trying to build the kernel in this case.

If I use the mainline-build-one tool:

$ mainline-build-one afd2ff9b7e1b367172f18ba7f693dfb62bdcb2dc xenial
*** BUILDING: commit:afd2ff9b7e1b367172f18ba7f693dfb62bdcb2dc series:xenial abinum: ...
full_version<4.4.0> version<4.4.0> long abinum<040400>
fatal: 'xenial' does not appear to be a git repository
fatal: Could not read from remote repository.

Please make sure you have the correct access rights
and the repository exists.
error: pathspec 'xenial/master' did not match any file(s) known to git.
error: Cannot delete the branch 'BUILD.040400' which you are currently on.
fatal: A branch named 'BUILD.040400' already exists.

The only way this tool works with that syntax is to switch to the master branch, and run it from there. I'm not sure how that's supposed to work with git bisect, given bisect is setting your checked out position.
[Kernel-packages] [Bug 1652348] Re: initrd dhcp fails / ignores valid response
I'll get started on it. This might take a while to do.

A couple of quick observations:

1) we haven't validated that mainline 4.4.0 actually works. I only know certain Ubuntu versions of the 4.4.0 kernel work. Given how much seems to be changing between Ubuntu releases of it, that seems a risky assumption to make. I'll start by proving that first.

2) On the wiki you linked to: "To do this, you can use the mainline-build-one script which can be found at ~kteam-tools/malinline-build/maineline-build-one ." A proper link would be useful. Where is ~kteam-tools?
[Kernel-packages] [Bug 1652348] Re: initrd dhcp fails / ignores valid response
Paul Graydon, the next step is to fully reverse commit bisect from kernel 4.4 to 4.10-rc1 in order to identify the last bad commit, followed immediately by the first good one. Once this good commit has been identified, it may be reviewed for backporting. Could you please do this following https://wiki.ubuntu.com/Kernel/KernelBisection#How_do_I_reverse_bisect_the_upstream_kernel.3F ?

Please note, finding adjacent kernel versions, or providing a commit from a kernel version bisect is not fully commit bisecting.

Also, the kernel release names are irrelevant for the purposes of bisecting.

It is most helpful that after the fix commit (not kernel version) has been identified, you then mark this report Status Confirmed.

Thank you for your help.

** Tags added: needs-reverse-bisect

** Tags added: regression-update

** Changed in: linux (Ubuntu)
       Status: Confirmed => Incomplete
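A reverse bisect, as requested above, looks for the first commit where the problem is gone rather than the first commit where it appears. The following is only a sketch of that idea, run against a throwaway repository instead of the kernel tree. The eight commits, the state.txt marker file, and the grep-based test are all made up for illustration. It uses git's custom bisect terms (--term-old / --term-new, available since git 2.7), which is not necessarily the procedure the linked wiki page prescribes. For the real bug, each step would mean building and booting the kernel at the checked-out commit and checking whether DHCP in the initrd succeeds.

```shell
#!/bin/sh
# Reverse-bisect demo on a scratch repository (nothing here touches a kernel tree).
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q .
# Identity used only inside this scratch repository.
git config user.email "bisect-demo@example.invalid"
git config user.name "Bisect Demo"

# Linear history of 8 commits; commit 6 carries the pretend fix.
expected=""
i=1
while [ "$i" -le 8 ]; do
    if [ "$i" -ge 6 ]; then
        echo "dhcp works" > state.txt
    else
        echo "dhcp broken" > state.txt
    fi
    echo "$i" > counter.txt
    git add state.txt counter.txt
    git commit -q -m "commit $i"
    if [ "$i" -eq 6 ]; then
        expected=$(git rev-parse HEAD)
    fi
    i=$((i + 1))
done

oldest=$(git rev-list --max-parents=0 HEAD)

# "old" state = broken, "new" state = fixed; we want the first fixed commit.
git bisect start --term-old=broken --term-new=fixed >/dev/null
git bisect fixed HEAD >/dev/null
git bisect broken "$oldest" >/dev/null

# For "bisect run", exit 0 means the old state (still broken) and
# exit 1 means the new state (fixed), hence the negated grep.
git bisect run sh -c '! grep -q "dhcp works" state.txt' >/dev/null

first_fixed=$(git rev-parse refs/bisect/fixed)
echo "first fixed commit: $(git log -1 --format=%s "$first_fixed")"

git bisect reset >/dev/null 2>&1
cd /
rm -rf "$repo"
```

Naming the states "broken" and "fixed" avoids having to remember that "bad" means "fixed" during a reverse bisect, which is an easy mistake to make over a long series of kernel builds.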
[Kernel-packages] [Bug 1652348] Re: initrd dhcp fails / ignores valid response
Tried and tested (the current up-to-date kernels at the time of posting):

http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.10-rc1/linux-headers-4.10.0-041000rc1-generic_4.10.0-041000rc1.201612252031_amd64.deb
http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.10-rc1/linux-image-4.10.0-041000rc1-generic_4.10.0-041000rc1.201612252031_amd64.deb

They do not appear to suffer from the bug, dhcp was able to complete happily via the startup scripts in the initrd environment, and the host booted successfully.

** Tags added: kernel-fixed-upstream

** Tags added: kernel-fixed-upstream-4.10-rc1

** Changed in: linux (Ubuntu)
       Status: Incomplete => Confirmed
[Kernel-packages] [Bug 1652348] Re: initrd dhcp fails / ignores valid response
Paul Graydon, thank you for reporting this and helping make Ubuntu better.

In order to allow additional upstream developers to examine the issue, at your earliest convenience, could you please test the latest upstream kernel available from http://kernel.ubuntu.com/~kernel-ppa/mainline/?C=N;O=D ? Please keep in mind the following:
1) The one to test is at the very top line at the top of the page (not the daily folder).
2) The release names are irrelevant.
3) The folder time stamps aren't indicative of when the kernel actually was released upstream.
4) Install instructions are available at https://wiki.ubuntu.com/Kernel/MainlineBuilds .

If testing on your main install would be inconvenient, one may:
1) Install Ubuntu to a different partition and then test this there.
2) Backup, or clone the primary install.

If the latest kernel did not allow you to test to the issue (ex. you couldn't boot into the OS) please make a comment in your report about this, and continue to test the next most recent kernel version until you can test to the issue. Once you've tested the upstream kernel, please comment on which kernel version specifically you tested. If this issue is fixed in the mainline kernel, please add the following tags by clicking on the yellow circle with a black pencil icon, next to the word Tags, located at the bottom of the report description:
kernel-fixed-upstream
kernel-fixed-upstream-X.Y-rcZ

Where X, and Y are the first two numbers of the kernel version, and Z is the release candidate number if it exists.

If the mainline kernel does not fix the issue, please add the following tags:
kernel-bug-exists-upstream
kernel-bug-exists-upstream-X.Y-rcZ

Please note, an error to install the kernel does not fit the criteria of kernel-bug-exists-upstream.

Also, you don't need to apport-collect further unless specifically requested to do so.

It is most helpful that after testing of the latest upstream kernel is complete, you mark this report Status Confirmed.

Lastly, to keep this issue relevant to upstream, please continue to test the latest mainline kernel as it becomes available.

Thank you for your help.

** Changed in: linux (Ubuntu)
   Importance: Undecided => Low

** Changed in: linux (Ubuntu)
       Status: Confirmed => Incomplete
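The tag naming rule described above (X.Y from the kernel version, plus -rcZ when there is a release candidate number) can be written down as a small helper. This is only an illustrative sketch: the function name upstream_tags and its "fixed" / "bug-exists" arguments are invented here and are not part of any Launchpad or Ubuntu tool.

```shell
#!/bin/sh
# upstream_tags RESULT VERSION  (hypothetical helper, for illustration only)
#   RESULT  : "fixed" or "bug-exists"
#   VERSION : mainline version such as "v4.10-rc1" or "4.9.3"
# Prints the two tags the triage message asks for, one per line.
upstream_tags() {
    result=$1
    version=${2#v}    # drop an optional leading "v"

    case $result in
        fixed)      prefix=kernel-fixed-upstream ;;
        bug-exists) prefix=kernel-bug-exists-upstream ;;
        *) echo "unknown result: $result" >&2; return 1 ;;
    esac

    # X.Y = the first two numbers of the version.
    xy=$(printf '%s\n' "$version" | sed -E 's/^([0-9]+\.[0-9]+).*$/\1/')
    # rcZ = the release candidate part, empty if there is none.
    rc=$(printf '%s\n' "$version" | sed -nE 's/^.*-(rc[0-9]+).*$/\1/p')

    printf '%s\n%s-%s%s\n' "$prefix" "$prefix" "$xy" "${rc:+-$rc}"
}

upstream_tags fixed v4.10-rc1
```

For v4.10-rc1 this yields kernel-fixed-upstream and kernel-fixed-upstream-4.10-rc1, the same pair of tags that appears in the "Tags added" lines elsewhere in this thread.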
[Kernel-packages] [Bug 1652348] Re: initrd dhcp fails / ignores valid response
I've also confirmed the bug is present all the way back in 4.4.0-21-generic, and is present in 4.8.0-34-generic from yakkety-proposed.
[Kernel-packages] [Bug 1652348] Re: initrd dhcp fails / ignores valid response
I've worked my way back through the kernels. The bug, as it was (avoided by ip=dhcp in the kernel command line), was in effect in version 4.4.0-38-generic. It was fixed in 4.4.0-42-generic.

This is the state of play so far with kernels I've tested:

linux-image-4.4.0-38-generic - Affected
linux-image-4.4.0-42-generic - Fine
linux-image-4.4.0-43-generic - Fine
linux-image-4.4.0-45-generic - Fine
linux-image-4.4.0-47-generic - Fine
linux-image-4.4.0-51-generic - Fine
linux-image-4.4.0-53-generic - Fine
linux-image-4.4.0-57-generic - Affected
linux-image-4.4.0-58-generic - Affected (kernel in proposed)

Status in linux package in Ubuntu: Confirmed
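Editor's note: the tested-kernel list in this message can be reduced to the points where behaviour changed. The following is a minimal sketch, not part of the original report; the `results` list simply transcribes the statuses given above, and the helper name `transitions` is hypothetical. It only narrows down where the behaviour flipped between tested builds; it says nothing about why.

```python
# Results transcribed from the message above: (ABI number of 4.4.0-N-generic, status).
results = [
    (38, "Affected"),
    (42, "Fine"),
    (43, "Fine"),
    (45, "Fine"),
    (47, "Fine"),
    (51, "Fine"),
    (53, "Fine"),
    (57, "Affected"),
    (58, "Affected"),
]


def transitions(results):
    """Return (last_old_abi, first_new_abi, new_status) for each status change.

    Hypothetical helper for illustration: it walks adjacent pairs of tested
    kernels in ABI order and records every point where the status differs.
    """
    ordered = sorted(results)
    changes = []
    for (prev_abi, prev_status), (abi, status) in zip(ordered, ordered[1:]):
        if status != prev_status:
            changes.append((prev_abi, abi, status))
    return changes


for prev_abi, abi, status in transitions(results):
    print(f"4.4.0-{prev_abi} -> 4.4.0-{abi}: became {status}")
```

Untested builds between each pair (for example between -53 and -57) would still need to be checked to pin down the exact version.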
[Kernel-packages] [Bug 1652348] Re: initrd dhcp fails / ignores valid response
apport-collect doesn't exist in initrd. I'm unable to supply the requested information.

** Changed in: linux (Ubuntu)
       Status: Incomplete => Confirmed
[Kernel-packages] [Bug 1652348] Re: initrd dhcp fails / ignores valid response
** Attachment added: "pcap from dhcp server side of 'ipconfig -t "dhcp" -d "ens2f0" '"
   https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1652348/+attachment/4795819/+files/worked.pcap

Status in linux package in Ubuntu: Incomplete
[Kernel-packages] [Bug 1652348] Re: initrd dhcp fails / ignores valid response
The invalid checksum flagged in the pcap is interesting, but it appears in both the failed and the successful captures, so I'm not sure it's relevant.

Status in linux package in Ubuntu: Incomplete
[Kernel-packages] [Bug 1652348] Re: initrd dhcp fails / ignores valid response
** Attachment added: "pcap from dhcp server side of inird startup doing dhcp"
   https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1652348/+attachment/4795820/+files/failed.pcap

Status in linux package in Ubuntu: Incomplete
[Kernel-packages] [Bug 1652348] Re: initrd dhcp fails / ignores valid response
** Attachment added: "Working 4.4.0-53 initrd"
   https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1652348/+attachment/4795794/+files/initrd.img-4.4.0-53-generic

Status in linux package in Ubuntu: New
[Kernel-packages] [Bug 1652348] Re: initrd dhcp fails / ignores valid response
** Attachment added: "4.4.0-57 "broken" initrd"
   https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1652348/+attachment/4795793/+files/initrd.img-4.4.0-57-generic

Status in linux package in Ubuntu: New
[Kernel-packages] [Bug 1652348] Re: initrd dhcp fails / ignores valid response
** Package changed: linux-meta (Ubuntu) => linux (Ubuntu)

Status in linux package in Ubuntu: New