I'm glad that the ipxe in the PPA seems to make it work. I have now read the discussions, and all the questions that came up for me while doing so had already been asked and clarified in later comments.
Therefore I just reviewed the proposed change and it looks good to me (other than the version string, but that was only for the PPA, so that is ok). Only one question, to be sure: I was wondering whether this might trigger any issues in iSCSI booting, since the change in src/net/netdevice.c adds the stripping to the generic net_poll. The (old) commit https://git.ipxe.org/ipxe.git/commit/7d64abbc5d0b5dfe4810883f372b905a359f2697 reads as though that stripping is required to be set there, so I wonder if there would be any regression in that regard. I remember iSCSI and MAAS being mentioned together, but I'm unsure whether the stack still uses iSCSI these days. When Vern confirmed that he could deploy with the modified ipxe, did that include an iSCSI boot? If not, could one of you double-check that iSCSI boot didn't regress due to this change?

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1805920

Title:
  iPXE ignores vlan 0 traffic

Status in MAAS:
  Invalid
Status in ipxe package in Ubuntu:
  Confirmed
Status in linux package in Ubuntu:
  Confirmed

Bug description:
  I have three MAAS rack/region nodes which are blades in a Cisco UCS
  chassis. This is an FCE deployment where MAAS has two DHCP servers:
  infra1 is the primary and infra3 is the secondary. The pod VMs on
  infra1 and infra3 PXE boot fine, but the pod VMs on infra2 fail to
  PXE boot. If I reconfigure the subnet to provide DHCP on infra2
  (either as primary or secondary), then the pod VMs on infra2 will
  PXE boot, but the pod VMs on the demoted infra node (the one that no
  longer serves DHCP) now fail to PXE boot.

  While commissioning a pod VM on infra2, I captured network traffic
  with tcpdump on the vnet interface.
  Here is the dump when PXE boot fails (no DHCP server on infra2):
  https://pastebin.canonical.com/p/THW2gTSv4S/
  Here is the dump when PXE boot succeeds (infra2 serving DHCP):
  https://pastebin.canonical.com/p/HH3XvZtTGG/

  The only difference I can see is that in the unsuccessful scenario
  the reply is an 802.1q packet: it carries a VLAN tag for vlan 0.
  Normally vlan 0 traffic is passed as if it were untagged, and indeed
  I can ping between the blades with no problem. Outgoing packets are
  untagged while incoming packets are tagged vlan 0, yet the ping
  works. It seems vlan 0 is used as part of 802.1p to set packet
  priority; this is separate from VLAN segregation, it just happens to
  use the same ethertype to do the priority tagging. Someone confirmed
  to me that the iPXE source drops all packets that are VLAN tagged.

  The customer is unable to figure out why the packets between blades
  are getting VLAN tagged, so we either need to figure out how to make
  iPXE accept vlan 0 or the customer will need to use different
  equipment for the MAAS nodes.

  I found a conversation on the ipxe-devel mailing list suggesting
  that a commit was submitted and signed off, but that was from 2016,
  so I'm not sure what became of it. Notable messages in the thread:
  http://lists.ipxe.org/pipermail/ipxe-devel/2016-April/004916.html
  http://lists.ipxe.org/pipermail/ipxe-devel/2016-July/005099.html

  Would it be possible to install a local patch as part of the FCE
  deployment? I suspect the patch(es) mentioned in the above thread
  would require some modification to apply properly.
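As background on why the tagged replies get dropped: in an 802.1Q header the lower 12 bits of the TCI field are the VLAN ID, and a VID of 0 means the tag carries only 802.1p priority bits, not VLAN membership. The following is a minimal sketch of the kind of stripping the proposed change performs; the function name and frame handling are hypothetical and this is not the actual iPXE code:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ETH_ALEN     6
#define ETH_P_8021Q  0x8100

/* Strip an 802.1Q priority-only tag (VID 0) from an Ethernet frame
 * in place and return the new frame length.  Hypothetical helper
 * illustrating the idea, not iPXE's implementation. */
static size_t strip_priority_tag(uint8_t *frame, size_t len) {
    /* Need at least dst MAC + src MAC + 802.1Q tag (4 bytes). */
    if (len < 2 * ETH_ALEN + 4)
        return len;

    uint16_t ethertype = (uint16_t)((frame[12] << 8) | frame[13]);
    if (ethertype != ETH_P_8021Q)
        return len;                 /* not 802.1Q tagged */

    uint16_t tci = (uint16_t)((frame[14] << 8) | frame[15]);
    uint16_t vid = tci & 0x0fff;    /* lower 12 bits: VLAN ID */
    if (vid != 0)
        return len;                 /* a real VLAN tag: leave it alone */

    /* VID 0 means priority tagging only: remove the 4-byte tag so
     * the frame looks untagged to the rest of the stack. */
    memmove(frame + 12, frame + 16, len - 16);
    return len - 4;
}
```

With this kind of stripping in the generic receive path, a DHCP reply tagged with VID 0 (as seen in the failing tcpdump) would be handed to the rest of the stack as an ordinary untagged frame.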