On 02/19/14 16:41, Peter Jones wrote:
> On Wed, Feb 19, 2014 at 04:10:15PM +0100, Laszlo Ersek wrote:
>> The phenomenon described in
>>
>> https://fedoraproject.org/wiki/Unified_Extensible_Firmware_Interface#You_have_failed_to_provide_all_existing_bootloader_entries_when_changing_the_boot_order
>>
>> is what I witness under OVMF too (and have been, for a long time).
>>
>> Apparently this should be a bug somewhere in
>> "IntelFrameworkModulePkg/Universal/BdsDxe", which we trigger very
>> heavily in OVMF, because OVMF always rewrites the BootOrder variable
>> and always drops at least some of the auto-generated options from it.
>> (This is done to serve the boot order specs on the qemu command
>> line.)
>>
>> I used to think this was some obscure problem related to OVMF that we
>> didn't find the time (or motivation) to track down. But Adam's
>> paragraph in the Fedora wiki article suggests that the bug is present
>> in UEFI firmwares shipping on real hardware, which to me implies the
>> bug is in IntelFrameworkModulePkg.
>
> Yeah, that sounds likely enough.  FWIW I added "--keep" to upstream
> efibootmgr last week to help people work around this problem.  Testing
> would be appreciated.

The problem we're seeing in OVMF is an OVMF bug. I'm nonetheless
following up in this thread because the analysis could be relevant to
firmware that runs on real hardware.

Actually, we analyzed this almost fully in last September:

http://thread.gmane.org/gmane.comp.bios.tianocore.devel/3688/focus=4037

when Michael reported it. It's several layers of hacks culminating as a
"boot variable leak".


(1) In an ideal world, the platform init code (PI) produces a Boot Mode
hand-off block that reflects reality. Then various "stuff" keys off the
boot mode.

The two boot modes we care about now are:
- BOOT_WITH_FULL_CONFIGURATION
- BOOT_ASSUMING_NO_CONFIGURATION_CHANGES

In an ideal world, BOOT_WITH_FULL_CONFIGURATION rightfully implies that
BDS should execute both of the following steps:
- BdsLibConnectAll() -- connect all drivers to all devices (roughly
  speaking)
- BdsLibEnumerateAllBootOption() -- produce a big bunch of default boot
  options, as new Boot#### NvVars.

Also in an ideal world, the BOOT_ASSUMING_NO_CONFIGURATION_CHANGES boot
mode implies that BDS will do none of the above two steps.

PI presumably distinguishes these two modes from each other based on
some kind of "hardware awareness", like "chassis has been opened" or
some such.


(2) The world is not ideal. On some platforms, PI is unable to
distinguish "full config" from "assume no changes". Hence it usually
reports "full config", because that's safer. (This is the case eg. in
OVMF, which only tells "full config" and "S3 resume" apart. (S3 doesn't
even reach DXE/BDS, so it's irrelevant now.))

Of course the incessant production of new boot variables is irritating,
therefore people have come up with the following "trick" -- this is from
"DuetPkg/Library/DuetBdsLib/BdsPlatform.c":

  ASSERT (BootMode == BOOT_WITH_FULL_CONFIGURATION);

  /* ... */

  //
  //BdsLibConnectAll ();
  //BdsLibEnumerateAllBootOption (BootOptionList);

  //
  // Please uncomment above ConnectAll and EnumerateAll code and remove 
following first boot
  // checking code in real production tip.
  //
  // In BOOT_WITH_FULL_CONFIGURATION boot mode, should always connect every 
device
  // and do enumerate all the default boot options. But in development system 
board, the boot mode
  // cannot be BOOT_ASSUMING_NO_CONFIGURATION_CHANGES because the machine box
  // is always open. So the following code only do the ConnectAll and 
EnumerateAll at first boot.
  //
  Status = BdsLibBuildOptionFromVar (BootOptionList, L"BootOrder");
  if (EFI_ERROR(Status)) {
    //
    // If cannot find "BootOrder" variable,  it may be first boot.
    // Try to connect all devices and enumerate all boot options here.
    //
    BdsLibConnectAll ();
    BdsLibEnumerateAllBootOption (BootOptionList);
  }

So, they were aware that their PI only ever exported a "full config"
boot mode HOB. In order to stop the continuous generation of new boot
options (which was otherwise *correct* for "full config", in the ideal
world), they
- commented out those two steps (connect all & enumerate all),
- came up with a hack based on the presence of the BootOrder variable,
- added a warning that, when (re-)entering the ideal world, the hack
  should be undone (those two steps should be uncommented, and the
  BootOrder trick removed.)


(3) This code has been copied into OVMF. But, big difference, the
*uncommenting* (ie. reverting to the ideal world) has been executed.
>From "OvmfPkg/Library/PlatformBdsLib/BdsPlatform.c":

  ASSERT (PrivateData->BootMode == BOOT_WITH_FULL_CONFIGURATION);

  /* ... */

  BdsLibConnectAll ();
  BdsLibEnumerateAllBootOption (BootOptionList);
  /* ^^^^^^^^^^^ here */

  //
  // Please uncomment above ConnectAll and EnumerateAll code and remove 
following first boot
  // checking code in real production tip.
  //
  // In BOOT_WITH_FULL_CONFIGURATION boot mode, should always connect every 
device
  // and do enumerate all the default boot options. But in development system 
board, the boot mode
  // cannot be BOOT_ASSUMING_NO_CONFIGURATION_CHANGES because the machine box
  // is always open. So the following code only do the ConnectAll and 
EnumerateAll at first boot.
  //
  Status = BdsLibBuildOptionFromVar (BootOptionList, L"BootOrder");
  if (EFI_ERROR(Status)) {
    //
    // If cannot find "BootOrder" variable,  it may be first boot.
    // Try to connect all devices and enumerate all boot options here.
    //
    BdsLibConnectAll ();
    BdsLibEnumerateAllBootOption (BootOptionList);
  }

I can find no explanation for this.
- First, because OVMF's PI indeed cannot distinguish "full config" from
  "assume no changes". So the uncommenting seems unjustified.
- Second, even if the uncommenting was done for a good reason, then the
  big warning comment, and the BootOrder based trick, should have been
  removed.
- Third, there's no history in the SCM why this was done. This code had
  come whole-sale (as opposed to incrementally) in SVN r8398, and
  there's no hint at DUET, or why the OVMF version differs in this
  aspect.
- Fourth, this unanswered question caused our discussion to peter out
  last September.


(4) Regardless of the reason, how do we fix it in OVMF?

The obvious choice would be to comment out those two steps again (like
DUET does), owning up to the fact that OVMF's PI can't tell "full
config" from "assume no changes".

However, this would break our integration with the qemu boot order
interface. Namely, now our code looks:

  ASSERT (BootMode == BOOT_WITH_FULL_CONFIGURATION);

  /* ... */

  BdsLibConnectAll ();
  BdsLibEnumerateAllBootOption (BootOptionList);

  SetBootOrderFromQemu (BootOptionList);
  /* ^^^^^^^^^^ note this */

  //
  // Please uncomment above ConnectAll and EnumerateAll code and remove 
following first boot
  // checking code in real production tip.

  /* etc etc etc */

SetBootOrderFromQemu() can only reorder and *filter out* existing boot
options. If we commented out BdsLibConnectAll() and
BdsLibEnumerateAllBootOption(), then the following would happen:
- user unticks CD/DVD in virt-manager,
- user boots the VM,
- CD/DVD option is filtered out,
- user shuts down the VM,
- user changes his/her mind and enables CD/DVD in virt-manager,
- user boots the VM,
- ... and the CD/DVD option won't reappear from thin air.

In other words, because SetBootOrderFromQemu() can only reorder, keep,
and remove, but not generate, it depends on
BdsLibEnumerateAllBootOption() to generate options for it to throw away
or let through. So we must keep those two function calls intact.

Therefore I'll have to post a patch that:
- less importantly, kills the historical warning comment, together with
  the historical BootOrder trick,
- more importantly, fixes SetBootOrderFromQemu() so that it deletes the
  Boot#### options themselves. Right now it modifies the BootOrder
  variable only, removing just *references* to unwanted Boot####
  variables. Apparently, given how we rely on
  BdsLibEnumerateAllBootOption() at each boot, this is a genuine leak.


(The relevance to real hardware is: the BDS could be correct. The
platform init code could produce a wrong boot mode HOB and mislead BDS.)

Thanks,
Laszlo

------------------------------------------------------------------------------
Managing the Performance of Cloud-Based Applications
Take advantage of what the Cloud has to offer - Avoid Common Pitfalls.
Read the Whitepaper.
http://pubads.g.doubleclick.net/gampad/clk?id=121054471&iu=/4140/ostg.clktrk
_______________________________________________
edk2-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/edk2-devel

Reply via email to