I took a step back from bisecting and focussed on creating a
reproduction scenario, which I've now done successfully.

ipconfig struggles when two interfaces are present and both are sending
out DHCP requests, even when one of those interfaces never gets a
response.

Here's what I've done:

Using virt-manager I created a bridge, bridge1, with no IP range
associated with it (I want dnsmasq on a host to handle IP addressing).
I created a second bridge, bridge2, likewise with no IP range
associated with it, ready for later use.

$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$

I created an instance, named primary, with two NICs: one doing the usual
NAT stuff so it has internet access, the other hooked up to bridge1.  I
gave it two storage devices: sda, 15GB in size, to act as local storage,
and sdb, 40GB in size, to be hosted over iSCSI (in hindsight, there's no
reason for it not to be 15GB too).

Install Ubuntu 16.04.1 LTS on the primary instance, pretty much
following the defaults but leaving the second hard drive unused.
Reboot and bring up the instance.  In my case ens3 ends up as the NAT
interface and ens9 as the interface hooked up to bridge1.

##########################

sudo apt update
sudo apt upgrade

##########################

Add to /etc/network/interfaces:

auto ens9
iface ens9 inet static
  address 192.168.0.1/24

##########################

Then:

sudo apt install open-iscsi targetcli dnsmasq

##########################

dnsmasq config:

log-queries
log-dhcp
interface=ens9
dhcp-range=192.168.0.50,192.168.0.150,12h
dhcp-boot=script.ipxe
enable-tftp
tftp-root=/tftpd
tftp-no-fail

##########################

Then run targetcli and enter the following commands:

backstores/iblock create uefi /dev/sdb
/iscsi create iqn.2015-02.oracle.boot:uefi
cd iqn.2015-02.oracle.boot:uefi/tpg1
luns/ create /backstores/iblock/uefi
portals/ create 0.0.0.0
set attribute authentication=0 demo_mode_write_protect=0 generate_node_acls=1 cache_dynamic_acls=1
exit

##########################

sudo mkdir /tftpd
sudo chown dnsmasq: /tftpd

##########################

/tftpd/script.ipxe:

#!ipxe
set initiator-iqn iqn.2015-02.oracle.boot:uefi
sanboot iscsi:192.168.0.1::::iqn.2015-02.oracle.boot:uefi

##########################

This gets primary pretty much ready to act as an iSCSI target for
another host.  It has been patched by this point, so reboot it.

You may want to set up IP forwarding etc on this instance.
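
For what it's worth, a minimal sketch of what that forwarding setup
could look like.  The interface names (ens3 for NAT, ens9 for bridge1)
are the ones from my setup above; the iptables rules, the helper names
and the DRY_RUN switch are assumptions added for illustration, not
something I needed to reproduce the bug.

```shell
#!/bin/sh
# Sketch: let hosts on bridge1 (via ens9) reach the outside world
# through the NAT interface (ens3).  WAN_IF/LAN_IF, run() and
# enable_forwarding() are hypothetical names used for this example.
WAN_IF="${WAN_IF:-ens3}"
LAN_IF="${LAN_IF:-ens9}"

# Run a command, or just print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

enable_forwarding() {
    # Allow the kernel to route packets between interfaces.
    run sysctl -w net.ipv4.ip_forward=1
    # Masquerade traffic leaving through the NAT interface.
    run iptables -t nat -A POSTROUTING -o "$WAN_IF" -j MASQUERADE
    # Forward LAN traffic out, and only established traffic back in.
    run iptables -A FORWARD -i "$LAN_IF" -o "$WAN_IF" -j ACCEPT
    run iptables -A FORWARD -i "$WAN_IF" -o "$LAN_IF" \
        -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
}
```

Source the snippet, then call enable_forwarding as root on primary.
Setting DRY_RUN=1 first only prints the commands.  None of this
survives a reboot unless you persist it yourself.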


$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$

Second host:

No storage.  Attach the Ubuntu 16.04.1 LTS ISO to the instance to boot
from initially.  Two NICs, the first attached to bridge1, the second
attached to bridge2.

Go through the installation procedure, logging in to the iSCSI endpoint
on 192.168.0.1 using the details above (no username/password necessary
with this configuration), and install to the iSCSI target.  At the end,
detach the CD-ROM and ensure everything is set up to network boot.

On start-up you should see it network boot happily, everything is
awesome.  Do a "sudo apt update" and "sudo apt upgrade".  Then reboot.

On start-up you should see the bug happening.  ipconfig is sending out
DHCP requests on both interfaces and failing to accept any responses it
is being sent ("journalctl -xef -u dnsmasq" on primary shows it is
sending them).  If you remove that second NIC, you'll see that the
instance is able to boot happily.
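
To make the pattern easier to spot in the dnsmasq log, something like
the following rough sketch can tally the DHCP message types per
interface.  The function name is made up for this example, and it
assumes the usual dnsmasq log lines of the form
"DHCPDISCOVER(ens9) <mac>" / "DHCPOFFER(ens9) <ip> <mac>".

```shell
#!/bin/sh
# Sketch: summarise dnsmasq DHCP traffic read from stdin.
# Prints one line per "MESSAGE(interface)" pair followed by the number
# of times it was logged.  summarise_dhcp is a hypothetical helper.
summarise_dhcp() {
    grep -oE 'DHCP(DISCOVER|OFFER|REQUEST|ACK|NAK)\([^)]+\)' \
        | sort \
        | uniq -c \
        | awk '{ print $2, $1 }'
}

# Example (on primary):
#   journalctl -u dnsmasq --no-pager | summarise_dhcp
```

If the client is ignoring offers, I'd expect the DHCPDISCOVER and
DHCPOFFER counts to keep climbing with few or no DHCPREQUEST/DHCPACK
lines alongside them.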

https://bugs.launchpad.net/bugs/1652348

Title:
  initrd dhcp fails / ignores valid response

Status in linux package in Ubuntu:
  Incomplete

Bug description:
  Between kernel versions 4.4.0-53 and 4.4.0-57 a bug has been
  (re?)introduced that is breaking dhcp booting in the initrd
  environment.  This is stopping instances that use iscsi storage from
  being able to connect.

  Over serial console it outputs:

  IP-Config: no response after 2 secs - giving up
  IP-Config: ens2f0 hardware address 90:e2:ba:d1:36:38 mtu 1500 DHCP RARP
  IP-Config: ens2f1 hardware address 90:e2:ba:d1:36:39 mtu 1500 DHCP RARP
  IP-Config: no response after 3 secs - giving up

  with increasing delays until it fails.  At which point a simple
  ipconfig -t dhcp -d "ens2f0"  works.  The console output is slightly
  garbled but should give you an idea:

  (initramfs) ipconfig -t dhcp -[  728.379793] ixgbe 0000:13:00.0 ens2f0: 
changing MTU from 1500 to 9000
  d "ens2f0"
  IP-Config: ens2f0 hardware address 90:e2:ba:d1:36:38 mtu 1500 DHCP RARP
  IP-Config: ens2f0 guessed broadcast address 10.0.1.255
  IP-Config: ens2f0 complete (dhcp from 169.254.169.254):
   addres[  728.980448] ixgbe 0000:13:00.0 ens2f0: detected SFP+: 3
  s: 10.0.1.56        broadcast: 10.0.1.255       netmask: 255.255.255.0
   gateway: 10.0.1.1   [  729.148410] ixgbe 0000:13:00.0 ens2f0: NIC Link is Up 
10 Gbps, Flow Control: RX/TX
        dns0     : 169.254.169.254  dns1   : 0.0.0.0
   rootserver: 169.254.169.254 rootpath:
   filename  : /ipxe.efi

  tcpdumps show that dhcp requests are being received from the host, and
  responses sent, but not accepted by the host.  When the ipconfig
  command is issued manually, an identical dhcp request and response
  happens, only this time it is accepted.  It doesn't appear to be that
  the messages are being sent and received incorrectly, just silently
  ignored by ipconfig.

  I was seeing this behaviour earlier this year, which I was able to fix
  by specifying "ip=dhcp" as a kernel parameter.  About a month ago that
  was identified as causing us other problems (long story) and we
  dropped it, at which point we discovered the original bug was no
  longer an issue.

  Putting "ip=dhcp" back on with this kernel no longer fixes the
  problem.

  I've compared the two initrds and effectively the only thing that has
  changed between the two is the kernel components.

  Ubuntu kernel bisect offending commit:
  # first bad commit: [fd4b5fa6e3487d15ede746f92601af008b2abbc0] mnt: Add a per mount namespace limit on the number of mounts

  Ubuntu kernel bisect offending commit submission:
  https://lkml.org/lkml/2016/10/5/308
