On Tue, Apr 21, 2015 at 02:12:54PM +0800, Michael Chang wrote: > On Mon, Apr 20, 2015 at 02:30:00PM +0800, Michael Chang wrote: > > On Sun, Apr 19, 2015 at 11:01:11AM +0300, Andrei Borzenkov wrote: > > > EDK2 network stack is based on Managed Network Protocol which is layered > > > on top of Simple Management Protocol and does background polling. This > > > polling races with grub for received (and probably trasmitted) packets > > > which causes either serious slowdown or complete failure to load files. > > > > > > Additionaly PXE driver creates two child devices - IPv4 and IPv6 - with > > > bound SNP instance as well. This means we get three cards for every > > > physical adapter when enumerating. Not only is this confusing, this > > > may result in grub ignoring packets that come in via the "wrong" card. > > > > > > Example of device hierarchy is > > > > > > Ctrl[91] PciRoot(0x0)/Pci(0x3,0x0) > > > Ctrl[95] PciRoot(0x0)/Pci(0x3,0x0)/MAC(525400123456,0x1) > > > Ctrl[B4] > > > PciRoot(0x0)/Pci(0x3,0x0)/MAC(525400123456,0x1)/IPv4(0.0.0.0) > > > Ctrl[BC] > > > PciRoot(0x0)/Pci(0x3,0x0)/MAC(525400123456,0x1)/IPv6(0000:0000:0000:0000:0000:0000:0000:0000) > > > > > > Skip PXE created virtual devices and open SNP on base device exclusively. > > > This destroys all child MNP instances and stops background polling. We > > > cannot do it for virtual devices anyway as PXE would probably attempt > > > to destroy them when stopped and current OVMF crashes when we try to do > > > it. > > > > > > Exclusive open cannot be done when enumerating cards, as it would destroy > > > PXE information we need to autoconfigure interface; and it cannot be done > > > during autoconfiguration as we need to do it for non-PXE boot as well. So > > > move SNP open to card ->open method and add matching ->close to clean up. > > > > > > References: https://savannah.gnu.org/bugs/?41731 > > > --- > > > grub-core/net/drivers/efi/efinet.c | 147 > > > ++++++++++++++++++++++++++++--------- > > > 1 file changed, 112 insertions(+), 35 deletions(-) > > > > > > diff --git a/grub-core/net/drivers/efi/efinet.c > > > b/grub-core/net/drivers/efi/efinet.c > > > index f171f20..5777016 100644 > > > --- a/grub-core/net/drivers/efi/efinet.c > > > +++ b/grub-core/net/drivers/efi/efinet.c > > > @@ -142,9 +142,72 @@ get_card_packet (struct grub_net_card *dev) > > > return nb; > > > } > > > > > > +static grub_err_t > > > +open_card (struct grub_net_card *dev) > > > +{ > > > + grub_efi_simple_network_t *net; > > > > I'm not sure about adding null check for dev->efi_net here is necessary > > or not, as we don't close it in close_card so that reopening a closed > > interface would make open_protocol fail with EFI_ACCESS_DENIED and in > > turn makes the entire open_card funtion fail with GRUB_ERR_BAD_DEVICE. > > > > > + > > > + net = grub_efi_open_protocol (dev->efi_handle, &net_io_guid, > > > + GRUB_EFI_OPEN_PROTOCOL_BY_EXCLUSIVE); > > > + if (! net) > > > + /* This should not happen... Why? */ > > > + return GRUB_ERR_BAD_DEVICE; > > > + > > > + if (net->mode->state == GRUB_EFI_NETWORK_STOPPED > > > + && efi_call_1 (net->start, net) != GRUB_EFI_SUCCESS) > > > + return GRUB_ERR_BAD_DEVICE; > > > + > > > + if (net->mode->state == GRUB_EFI_NETWORK_STOPPED) > > > + return GRUB_ERR_BAD_DEVICE; > > > + > > > + if (net->mode->state == GRUB_EFI_NETWORK_STARTED > > > + && efi_call_3 (net->initialize, net, 0, 0) != GRUB_EFI_SUCCESS) > > > + return GRUB_ERR_BAD_DEVICE; > > > + > > > + dev->mtu = net->mode->max_packet_size; > > > + > > > + dev->txbufsize = ALIGN_UP (dev->mtu, 64) + 256; > > > + dev->txbuf = grub_zalloc (dev->txbufsize); > > > + if (!dev->txbuf) > > > + { > > > + grub_print_error (); > > > + return GRUB_ERR_BAD_DEVICE; > > > + } > > > + dev->txbusy = 0; > > > + > > > + dev->rcvbufsize = ALIGN_UP (dev->mtu, 64) + 256; > > > + > > > + grub_memcpy (dev->default_address.mac, > > > + net->mode->current_address, > > > + sizeof (dev->default_address.mac)); > > > + > > > + dev->efi_net = net; > > > + return GRUB_ERR_NONE; > > > +} > > > + > > > +static void > > > +close_card (struct grub_net_card *dev) > > > +{ > > > + if (dev->txbuf) > > > + { > > > + grub_free (dev->txbuf); > > > + dev->txbuf = NULL; > > > + } > > > + > > > + if (dev->rcvbuf) > > > + { > > > + grub_free (dev->rcvbuf); > > > + dev->rcvbuf = NULL; > > > + } > > > + > > > + /* FIXME do we need to close Simple Network Protocol here? */ > > > > The question from me is why not? :) > > > > If we don't close it, the consequence might prevent other application > > (for eg, the chainloaded one from grub) from using managed network > > protocol or related one ? > > > > The rest of the patch looks good to me and a lot better than my > > workaround patch. Thanks for you work on this. > > > > I can give this patch a test to see if it fixed the slowness issue I > > have also experienced in OVMF for IPv4 networking and also together with > > my net_bootp6 patch. > > The patch works well with IPv4 as expected, I verfied it to fix the > slowness problem. > > But with IPv6, there were some problems found. > > 1. It failed with "packet too big" in grub_net_send_ip6_packet. The > reason is the card not opened at that time so that the mtu is unset > (zero). That makes the packet size testing against mtu "too big". > > The call stack looks like. > > grub_net_link_layer_resolve grub_net_icmp6_send_request > grub_net_send_ip_packet grub_net_send_ip6_packet > > Meanwhile the IPv4 call stack: > > grub_net_link_layer_resolve grub_net_arp_send_request > send_ethernet_packet > > It calls send_ethernet_packet which do not have the mtu size check and > also the open the card device if not opened. > > 2. I tried to get it going by adding card open to > grub_net_send_ip6_packet (), but experienced another problem as the link > layer resolve timeout due to the network interface's hardware mac > address unset (inf->hwaddress.mac). After some debugging I realized > that's in my net_bootp6 patch, the network interface is added during > handling of cached dhcpv6 reply packets using the hardware mac provided > by the card instance, which is again not yet opened.:( > > The reason why IPv4 works is because it doesn't use card instance's mac > address but mac_addr from the bootp's ack packet. > > It's possble to find mac_addr from the DHCPv6 Reply packets, but it's > unreliable you will always have one. I think it's related to what DUID > type is using in the CLIENT_ID option. The spec defines > > DUID-LLT 1 DUID-EN 2 DUID-LL 3 > > With DUID-LLT, DUID-LL you can read the mac_addr (and type) but with > DUID-EN you are out of luck. The final round made me surrender is that > the cached reply packet in UEFI, the DUID seems to be undefined type > "4".. > > 3. Even I can add the card open earler before hadling the dhcpv6 > packets, it will freeze at grub_efi_open_protocol if the option in use > is GRUB_EFI_OPEN_PROTOCOL_BY_EXCLUSIVE. I was actually using > GRUB_EFI_OPEN_PROTOCOL_GET_PROTOCOL in the entire test and am running > out of idea how to deal with that diversity.
Hi Andrey, I checked other grub_net_card implementation (emucard, grub_pxe_card, ofnet). The mtu and default_adree are determined during the device enumeration but not open. I think it would be better to follow them as those properties are needed quite early and that may also fix the problem #1 and #2. For #3, I'm not trying to get the log from OVMF .. Thanks, Michael _______________________________________________ Grub-devel mailing list Grub-devel@gnu.org https://lists.gnu.org/mailman/listinfo/grub-devel