On 06/22/16 03:44, Laszlo Ersek wrote: > I wrote an iPXE patch for this, but the issue persists. So it's > something else.
I found the bug. Well, the first bug. It is so obscure that I'll have a very hard time describing it. The short version is that it is a buffer overflow in iPXE that confuses the HII machinery in edk2. Details: In the efi_snp_probe() function [src/interface/efi/efi_snp.c], a device path protocol interface is installed on the handle that otherwise also gets the SNP interface. This devpath is composed by copying the parent PCI device's devpath, and tacking on a MAC_ADDR_DEVICE_PATH node. In the latter node, the MAC address is filled in like this: > memcpy ( &macpath->MacAddress, netdev->ll_addr, > sizeof ( macpath->MacAddress ) ); A little bit higher up in the function there is a sanity check that makes sure that "netdev->ll_addr" will fit into "macpath->MacAddress": > /* Sanity check */ > if ( netdev->ll_protocol->ll_addr_len > sizeof ( EFI_MAC_ADDRESS ) ) { > DBGC ( snpdev, "SNPDEV %p cannot support link-layer address " > "length %d for %s\n", snpdev, > netdev->ll_protocol->ll_addr_len, netdev->name ); > rc = -ENOTSUP; > goto err_ll_addr_len; > } However, as you can see on the memcpy() function call quoted above, the size argument is wrong. Namely, macpath->MacAddress (i.e., MAC_ADDR_DEVICE_PATH.MacAddress) has type EFI_MAC_ADDRESS: > typedef struct { > EFI_DEVICE_PATH_PROTOCOL Header; > /// > /// The MAC address for a network interface padded with 0s. > /// > EFI_MAC_ADDRESS MacAddress; > /// > /// Network interface type(i.e. 802.3, FDDI). > /// > UINT8 IfType; > } MAC_ADDR_DEVICE_PATH; It is 32 bytes in size, and it is padded with zeros. > /// > /// 32-byte buffer containing a network Media Access Control address. > /// > typedef struct { > UINT8 Addr[32]; > } EFI_MAC_ADDRESS; > Whereas the "netdev->ll_addr" array (i.e., the "net_device.ll_addr" member) has the following type [src/include/ipxe/netdevice.h]: > uint8_t ll_addr[MAX_LL_ADDR_LEN]; Where MAX_LL_ADDR_LEN is: > /** Maximum length of a link-layer address > * > * The longest currently-supported link-layer address is for IPoIB. > */ > #define MAX_LL_ADDR_LEN 20 In other words, the memcpy() quoted at the top copies 32 bytes into a 32-byte buffer, from a 20-byte buffer. It is the *source* buffer that is overflowed. As a result, bytes 20..31 of MacAddress (inclusive) are filled with garbage. Now, *both* iPXE code *and* the edk2 code go on to create a number of child handles, with child device paths. Those device paths are (of course) derived by binarily copying the parent devpath, therefore the garbage lands in all child device paths too. The question is then, why does it cause issues? The answer boggles the mind a bit. I mentioned in the earlier emails that the new Device Manager uses a different type of IFR opcode for the sub-form selection. I said that the opcode was callback-less, but much more importantly, these opcodes (= EFI_IFR_REF4 or "goto" opcodes) store a *textual* rendering of the device path that is associated with the HII package list / form set that the individual driver installs. (For an example devpath, search my previous emails for [VlanConfigFormSet].) And, in the course of the navigation, this textually rendered devpath, from the "goto" opcode, gets translated back to binary, in the following call stack (excerpt): ProcessGotoOpCode() [MdeModulePkg/Universal/SetupBrowserDxe/Presentation.c] // // Goto another Hii Package list // ConvertTextToDevicePath() DevicePathToHiiHandle() [MdeModulePkg/Universal/SetupBrowserDxe/Presentation.c] gBS->LocateDevicePath() [MdeModulePkg/Core/Dxe/Hand/Locate.c] In order to reach the destination HII package, form set, and form, through the "goto" opcode, the LocateDevicePath() boot service has to be able to locate the devpath that has gone through the binary -> textual -> binary translation in the protocol database. And that's what fails: in the first binary -> textual devpath conversion, which is done by CreateDeviceManagerForm() [MdeModulePkg/Library/DeviceManagerUiLib/DeviceManager.c], see my first email for details, the garbage bytes, starting at offset 20, are *rendered away*. They are not captured in the textual format: in the DevPathToTextMacAddr() function [MdePkg/Library/UefiDevicePathLib/DevicePathToText.c], the (IfType == 0x01) field match from the MAC devpath node ensures that only the first six bytes are rendered to text. Then, when the ProcessGotoOpCode() function above performs the inverse translation, those bytes in the new MAC devpath node are filled with zeros (see DevPathFromTextMAC()). This means that the twice-translated MAC devpath node will never match the devpath node originally created by iPXE, nor will it match the same devpath node in all of those child device paths. Hence the LocateDevicePath() boot service will never find a *full* match, and the IsDevicePathEnd() check in DevicePathToHiiHandle() will fail. The target HII package list / form set / form etc cannot be located, and ProcessGotoOpCode() bails out with: > // > // If target Hii Handle not found, exit current formset. > // > FindParentFormSet(Selection); > return EFI_SUCCESS; In summary, the buffer overflow in iPXE *poisons* all child device paths created from the original device path too, including those created by the edk2 drivers as well. And when the *search* devpath is sanitized (via the binary-to-text-to-binary conversions that the new Device Manager and the display browser do), it never matches. It has not been a problem because these devpaths used not to be part of the HII navigation. The "goto" opcodes that expose the bug are new in the MdeModulePkg Device Manager. This is the fix for iPXE: > diff --git a/src/interface/efi/efi_snp.c b/src/interface/efi/efi_snp.c > index a6e94c00b5c4..7efb307ff8aa 100644 > --- a/src/interface/efi/efi_snp.c > +++ b/src/interface/efi/efi_snp.c > @@ -1651,7 +1651,7 @@ static int efi_snp_probe ( struct net_device *netdev ) { > macpath->Header.SubType = MSG_MAC_ADDR_DP; > macpath->Header.Length[0] = sizeof ( *macpath ); > memcpy ( &macpath->MacAddress, netdev->ll_addr, > - sizeof ( macpath->MacAddress ) ); > + sizeof ( netdev->ll_addr ) ); > macpath->IfType = ntohs ( netdev->ll_protocol->ll_proto ); > if ( ( tag = vlan_tag ( netdev ) ) ) { > vlanpath = ( ( ( void * ) macpath ) + sizeof ( *macpath ) ); (I plan to submit it later to ipxe-devel, but I'm too tired now.) If you apply this patch, you will be able to enter the sub-forms *beside* the iPXE one. The poisoning will be gone, and all sub-forms can be reached. You can also enter the iPXE sub-form too, but that one blows up with an assert: > ASSERT MdeModulePkg/Universal/SetupBrowserDxe/Presentation.c(393): > ParentStatement != ((void *) 0) But that's for another day. Laszlo _______________________________________________ edk2-devel mailing list edk2-devel@lists.01.org https://lists.01.org/mailman/listinfo/edk2-devel