Re: [edk2] Corrupted EFI region

2013-09-20 Thread Matt Fleming
On Wed, 18 Sep, at 01:24:14PM, jerry.hoem...@hp.com wrote:
 Matt,
 
 I conducted the following experiments on a 3.11 kernel:
 
Jerry, could you paste your memory map from the kernel log?

-- 
Matt Fleming, Intel Open Source Technology Center
--
To unsubscribe from this list: send the line unsubscribe linux-efi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [edk2] Corrupted EFI region

2013-09-18 Thread jerry . hoemann
On Mon, Sep 16, 2013 at 11:59:20AM +0100, Matt Fleming wrote:
 On Fri, 13 Sep, at 02:38:12PM, jerry.hoem...@hp.com wrote:
  Matt,
  
  We have hit an issue on our new platform in development related to the
  call of efi_reserve_boot_services() from setup_arch().
  
  The reservation can interfere with allocation of the crash kernel.
  
 Jerry, thanks for bringing this up.
 
  In pre 3.9(?) kernels,  the crash kernel is required to be allocated from
  physically contiguous memory below 896 MB.
  
  Our new platforms are large in both the amount of memory and the amount
  of IO. This requires large crash kernels for kdump to work.  This is even
  after the work done for makedumpfile v 1.5 to allow it to work with a
  smaller foot print.
  
  
  One of the problems is that drivers will allocate memory as boot code and/or
  data in the region < 896 MB that effectively fragments this memory.
  With the reservation, we can't reuse the memory when needed for the
  crash kernels.   If we remove the reservation and allow the kernel
  to reuse the memory, the reservation of the crash kernel succeeds.
  
  This is definitely a problem for distros that are pre 3.9.  Probably less
  so for top of tree, but i haven't been focused there.
  
  So we are definitely interested in finding a mechanism to not
  do this reservation on platforms that don't have the issues described
  earlier in this thread.
 
 OK, in an ideal world we'd move the crash kernel reservation after
 efi_free_boot_services(), because at that point the boot regions are
 available again. But it seems that we reserve the boot regions really
 early during startup and release them relatively late. The reason is
 that the Boot Graphics Resource Table (BGRT) data, if present, is
 located in the Boot Services Data regions but we can't extract the
 address of the region from the ACPI tables until we've setup the ACPI
 subsystem, which happens quite late.
 
 I wonder whether performing the reservation of the crash kernel memory
 first, before efi_reserve_boot_services(), would help. That way we'd
 only need to reserve remaining regions in efi_reserve_boot_services().
 This scheme would rely on nothing writing into the crash kernel area
 before we've extracted the BGRT data, however.
 
 -- 
 Matt Fleming, Intel Open Source Technology Center


Matt,

I conducted the following experiments on a 3.11 kernel:

1)  Moved the call of reserve_crashkernel to after efi_free_boot_services.
    Booted with crashkernel=512M

    a)  when memory below 896M was *not* fragmented by BootCode segments
        reserve_crashkernel succeeded.

    b)  when memory below 896M *was* fragmented by BootCode segments
        reserve_crashkernel failed.

2)  Moved the call to reserve_crashkernel to before call to
    efi_reserve_boot_services.
    Booted with crashkernel=512M

    reserve_crashkernel succeeded irrespective of whether the memory below
    896M was fragmented by BootCode segments.


I haven't determined why reserve_crashkernel failed in 1b) above.

I don't see the memory reserved for the crash kernel being accessed
before call to efi_free_boot_services.
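
To illustrate why fragmentation matters in experiments 1b) and 2), here is a minimal sketch (not kernel code) of a search for one contiguous range below a limit when some ranges are already reserved. The type `struct range`, the function `find_contiguous_below`, and any region layout fed to it are hypothetical; the real allocation goes through memblock.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MB (1024ULL * 1024ULL)

/* A reserved physical range [start, end). Hypothetical type for illustration. */
struct range {
    uint64_t start;
    uint64_t end;
};

/*
 * Find the lowest base in [0, limit) where `size` bytes fit without
 * touching any reserved range. `reserved` must be sorted by start and
 * non-overlapping. Returns UINT64_MAX if no gap is large enough.
 */
static uint64_t find_contiguous_below(const struct range *reserved, size_t n,
                                      uint64_t limit, uint64_t size)
{
    uint64_t base = 0;

    assert(size > 0 && size <= limit);

    for (size_t i = 0; i < n && reserved[i].start < limit; i++) {
        if (reserved[i].start >= base && reserved[i].start - base >= size)
            return base;            /* the gap before this range is big enough */
        if (reserved[i].end > base)
            base = reserved[i].end; /* skip past the reserved range */
    }
    return (base <= limit && limit - base >= size) ? base : UINT64_MAX;
}
```

With nothing reserved yet (as in experiment 2), a 512 MB request below 896 MB fits at base 0. With two small reserved ranges at, say, 300 MB and 600 MB (made-up positions), no remaining gap reaches 512 MB and the search fails, even though most of the memory is free.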

CC'ing kexec list for their input as I may have missed something.


Jerry


-- 


Jerry Hoemann            Software Engineer      Hewlett-Packard/MODL

3404 E Harmony Rd. MS 57                        phone:  (970) 898-1022
Ft. Collins, CO 80528                           FAX:    (970) 898-
email:  jerry.hoem...@hp.com




Re: [edk2] Corrupted EFI region

2013-09-16 Thread Matt Fleming
On Fri, 13 Sep, at 02:38:12PM, jerry.hoem...@hp.com wrote:
 Matt,
 
 We have hit an issue on our new platform in development related to the
 call of efi_reserve_boot_services() from setup_arch().
 
 The reservation can interfere with allocation of the crash kernel.
 
Jerry, thanks for bringing this up.

 In pre 3.9(?) kernels,  the crash kernel is required to be allocated from
 physically contiguous memory below 896 MB.
 
 Our new platforms are large in both the amount of memory and the amount
 of IO. This requires large crash kernels for kdump to work.  This is even
 after the work done for makedumpfile v 1.5 to allow it to work with a
 smaller foot print.
 
 
 One of the problems is that drivers will allocate memory as boot code and/or
 data in the region < 896 MB that effectively fragments this memory.
 With the reservation, we can't reuse the memory when needed for the
 crash kernels.   If we remove the reservation and allow the kernel
 to reuse the memory, the reservation of the crash kernel succeeds.
 
 This is definitely a problem for distros that are pre 3.9.  Probably less
 so for top of tree, but i haven't been focused there.
 
 So we are definitely interested in finding a mechanism to not
 do this reservation on platforms that don't have the issues described
 earlier in this thread.

OK, in an ideal world we'd move the crash kernel reservation after
efi_free_boot_services(), because at that point the boot regions are
available again. But it seems that we reserve the boot regions really
early during startup and release them relatively late. The reason is
that the Boot Graphics Resource Table (BGRT) data, if present, is
located in the Boot Services Data regions but we can't extract the
address of the region from the ACPI tables until we've setup the ACPI
subsystem, which happens quite late.

I wonder whether performing the reservation of the crash kernel memory
first, before efi_reserve_boot_services(), would help. That way we'd
only need to reserve remaining regions in efi_reserve_boot_services().
This scheme would rely on nothing writing into the crash kernel area
before we've extracted the BGRT data, however.

-- 
Matt Fleming, Intel Open Source Technology Center


Re: [edk2] Corrupted EFI region

2013-09-16 Thread Laszlo Ersek
On 09/16/13 12:59, Matt Fleming wrote:
 On Fri, 13 Sep, at 02:38:12PM, jerry.hoem...@hp.com wrote:
 Matt,

 We have hit an issue on our new platform in development related to the
 call of efi_reserve_boot_services() from setup_arch().

 The reservation can interfere with allocation of the crash kernel.
  
 Jerry, thanks for bringing this up.
 
 In pre 3.9(?) kernels,  the crash kernel is required to be allocated from
 physically contiguous memory below 896 MB.

 Our new platforms are large in both the amount of memory and the amount
 of IO. This requires large crash kernels for kdump to work.  This is even
 after the work done for makedumpfile v 1.5 to allow it to work with a
 smaller foot print.


 One of the problems is that drivers will allocate memory as boot code and/or
 data in the region < 896 MB that effectively fragments this memory.
 With the reservation, we can't reuse the memory when needed for the
 crash kernels.   If we remove the reservation and allow the kernel
 to reuse the memory, the reservation of the crash kernel succeeds.

 This is definitely a problem for distros that are pre 3.9.  Probably less
 so for top of tree, but i haven't been focused there.

 So we are definitely interested in finding a mechanism to not
 do this reservation on platforms that don't have the issues described
 earlier in this thread.
 
 OK, in an ideal world we'd move the crash kernel reservation after
 efi_free_boot_services(), because at that point the boot regions are
 available again. But it seems that we reserve the boot regions really
 early during startup and release them relatively late. The reason is
 that the Boot Graphics Resource Table (BGRT) data, if present, is
 located in the Boot Services Data regions but we can't extract the
 address of the region from the ACPI tables until we've setup the ACPI
 subsystem, which happens quite late.

Why is BGRT allocated as Boot Services Data?

In file
MdeModulePkg/Universal/Acpi/BootGraphicsResourceTableDxe/BootGraphicsResourceTableDxe.c:

InstallBootGraphicsResourceTable()
  BgrtAllocateBsDataMemoryBelow4G()
    gBS->AllocatePages(... EfiBootServicesData ...)

From Table 25. Memory Type Usage before ExitBootServices():

  EfiBootServicesData  -- The data portions of a loaded Boot Services
  Driver, and the default data allocation type
  used by a Boot Services Driver to allocate
  pool memory.

  EfiACPIReclaimMemory -- Memory that holds the ACPI tables.

From Table 26. Memory Type Usage after ExitBootServices():

  EfiBootServicesData -- Memory available for general use.

  EfiACPIReclaimMemory -- This memory is to be preserved by the loader
  and OS until ACPI is enabled. Once ACPI is
  enabled, the memory in this range is available
  for general use.

I thought that anything referenced by a pointer in any ACPI table was
EfiACPIReclaimMemory or stricter. Specifically, the RSDT or XSDT points
to BGRT, so BGRT is EfiACPIReclaimMemory.  BGRT points to the image data
(with its Image Address field), hence the image data should be
EfiACPIReclaimMemory too.

Otherwise, the pointer (BGRT.ImageAddress) can outlive the pointed-to
storage (the image data).

The image data sounds to me like a textbook example for
EfiACPIReclaimMemory. This way the kernel could free Boot Services Data
early, perform the crash kernel reservation right after, and safely
access BGRT whenever the ACPI subsystem is brought up later.
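
The lifetime argument can be written down as a small lookup. This is a minimal sketch, not code from edk2 or the kernel: the memory type values follow the UEFI numbering, but `enum boot_phase` and `must_preserve()` are names invented here, and only the rows quoted from Table 26 above (plus EfiReservedMemoryType) are modeled.

```c
#include <assert.h>
#include <stdbool.h>

/* Subset of the UEFI memory types, with their spec-defined values. */
enum mem_type {
    EfiReservedMemoryType = 0,
    EfiBootServicesData   = 4,
    EfiACPIReclaimMemory  = 9,
};

/* Hypothetical phases of OS startup, used only for this illustration. */
enum boot_phase {
    BEFORE_EXIT_BOOT_SERVICES,
    AFTER_EXIT_BOOT_SERVICES,   /* ACPI not yet enabled */
    AFTER_ACPI_ENABLED,
};

/*
 * Returns true if the OS must still preserve memory of `type` in `phase`,
 * following the Table 26 rows quoted above.
 */
static bool must_preserve(enum mem_type type, enum boot_phase phase)
{
    switch (type) {
    case EfiBootServicesData:
        /* "Memory available for general use" after ExitBootServices(). */
        return phase == BEFORE_EXIT_BOOT_SERVICES;
    case EfiACPIReclaimMemory:
        /* Preserved until ACPI is enabled, then reclaimable. */
        return phase != AFTER_ACPI_ENABLED;
    default:
        /* Only the three types above are modeled in this sketch. */
        assert(type == EfiReservedMemoryType);
        return true;    /* never released for general use */
    }
}
```

Under this model, image data in EfiBootServicesData is already unprotected in the window between ExitBootServices() and ACPI bring-up, while EfiACPIReclaimMemory stays protected through exactly that window.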


The edk2 commit that flipped the memory type underneath the image data
from EfiReservedMemoryType to EfiBootServicesData is:

https://github.com/tianocore/edk2/commit/4c58575e

I think this commit is wrong. It's fine for OSPM to release the image
data at some point, but not right after ExitBootServices(), because
referencing pointers in ACPI tables survive strictly longer.

... Actually, the commit does follow the ACPI spec 5.0:

5.2.22.4 Image Address

The Image Address contains the location in memory where an
in-memory copy of the boot image can be found. The image should be
stored in EfiBootServicesData, allowing the system to reclaim
the memory when the image is no longer needed.

The ACPI spec 5.0 should recommend EfiACPIReclaimMemory here IMO. (I
take the current wording (should be stored) as a recommendation only.)

If that's in fact a recommendation (and not a hard requirement), then it
should be easy to change BgrtAllocateBsDataMemoryBelow4G() again.

Thanks,
Laszlo


Re: [edk2] Corrupted EFI region

2013-09-16 Thread Josh Triplett
On Mon, Sep 16, 2013 at 01:50:46PM +0200, Laszlo Ersek wrote:
 On 09/16/13 12:59, Matt Fleming wrote:
  On Fri, 13 Sep, at 02:38:12PM, jerry.hoem...@hp.com wrote:
  Matt,
 
  We have hit an issue on our new platform in development related to the
  call of efi_reserve_boot_services() from setup_arch().
 
  The reservation can interfere with allocation of the crash kernel.
   
  Jerry, thanks for bringing this up.
  
  In pre 3.9(?) kernels,  the crash kernel is required to be allocated from
  physically contiguous memory below 896 MB.
 
  Our new platforms are large in both the amount of memory and the amount
  of IO. This requires large crash kernels for kdump to work.  This is even
  after the work done for makedumpfile v 1.5 to allow it to work with a
  smaller foot print.
 
 
  One of the problems is that drivers will allocate memory as boot code and/or
  data in the region < 896 MB that effectively fragments this memory.
  With the reservation, we can't reuse the memory when needed for the
  crash kernels.   If we remove the reservation and allow the kernel
  to reuse the memory, the reservation of the crash kernel succeeds.
 
  This is definitely a problem for distros that are pre 3.9.  Probably less
  so for top of tree, but i haven't been focused there.
 
  So we are definitely interested in finding a mechanism to not
  do this reservation on platforms that don't have the issues described
  earlier in this thread.
  
  OK, in an ideal world we'd move the crash kernel reservation after
  efi_free_boot_services(), because at that point the boot regions are
  available again. But it seems that we reserve the boot regions really
  early during startup and release them relatively late. The reason is
  that the Boot Graphics Resource Table (BGRT) data, if present, is
  located in the Boot Services Data regions but we can't extract the
  address of the region from the ACPI tables until we've setup the ACPI
  subsystem, which happens quite late.
 
 Why is BGRT allocated as Boot Services Data?
 
 In file
 MdeModulePkg/Universal/Acpi/BootGraphicsResourceTableDxe/BootGraphicsResourceTableDxe.c:
 
 InstallBootGraphicsResourceTable()
   BgrtAllocateBsDataMemoryBelow4G()
     gBS->AllocatePages(... EfiBootServicesData ...)
 
 From Table 25. Memory Type Usage before ExitBootServices():
 
   EfiBootServicesData  -- The data portions of a loaded Boot Services
   Driver, and the default data allocation type
   used by a Boot Services Driver to allocate
   pool memory.
 
   EfiACPIReclaimMemory -- Memory that holds the ACPI tables.
 
 From Table 26. Memory Type Usage after ExitBootServices():
 
   EfiBootServicesData -- Memory available for general use.
 
   EfiACPIReclaimMemory -- This memory is to be preserved by the loader
   and OS until ACPI is enabled. Once ACPI is
   enabled, the memory in this range is available
   for general use.
 
 I thought that anything referenced by a pointer in any ACPI table was
 EfiACPIReclaimMemory or stricter. Specifically, the RSDT or XSDT points
 to BGRT, so BGRT is EfiACPIReclaimMemory.  BGRT points to the image data
 (with its Image Address field), hence the image data should be
 EfiACPIReclaimMemory too.
 
 Otherwise, the pointer (BGRT.ImageAddress) can outlive the pointed-to
 storage (the image data).
 
 The image data sounds to me like textbook example for
 EfiACPIReclaimMemory. This way the kernel could free Boot Services Data
 early, perform the crash kernel reservation right after, and safely
 access BGRT whenever the ACPI subsystem is brought up later.
 
 
 The edk2 commit that flipped the memory type underneath the image data
 from EfiReservedMemoryType to EfiBootServicesData is:
 
 https://github.com/tianocore/edk2/commit/4c58575e
 
 I think this commit is wrong. It's fine for OSPM to release the image
 data at some point, but not right after ExitBootServices(), because
 referencing pointers in ACPI tables survive strictly longer.
 
 ... Actually, the commit does follow the ACPI spec 5.0:
 
 5.2.22.4 Image Address
 
 The Image Address contains the location in memory where an
 in-memory copy of the boot image can be found. The image should be
 stored in EfiBootServicesData, allowing the system to reclaim
 the memory when the image is no longer needed.
 
 The ACPI spec 5.0 should recommend EfiACPIReclaimMemory here IMO. (I
 take the current wording (should be stored) as a recommendation only.)

I agree that UEFI *should* store the BGRT image in EfiACPIReclaimMemory, but
in practice the UEFI firmware I've seen with a BGRT follows the current
recommendation in the spec and stores it in EfiBootServicesData.  So, even
if the recommendation in the spec changed, the kernel would still have to
accommodate both possibilities.

- Josh Triplett

Re: [edk2] Corrupted EFI region

2013-09-16 Thread Laszlo Ersek
On 09/16/13 17:57, Josh Triplett wrote:

 The edk2 commit that flipped the memory type underneath the image data
 from EfiReservedMemoryType to EfiBootServicesData is:

 https://github.com/tianocore/edk2/commit/4c58575e

 I think this commit is wrong. It's fine for OSPM to release the image
 data at some point, but not right after ExitBootServices(), because
 referencing pointers in ACPI tables survive strictly longer.

 ... Actually, the commit does follow the ACPI spec 5.0:

 5.2.22.4 Image Address

 The Image Address contains the location in memory where an
 in-memory copy of the boot image can be found. The image should be
 stored in EfiBootServicesData, allowing the system to reclaim
 the memory when the image is no longer needed.

 The ACPI spec 5.0 should recommend EfiACPIReclaimMemory here IMO. (I
 take the current wording (should be stored) as a recommendation only.)
 
 I agree that UEFI *should* store the BGRT in EfiACPIReclaimMemory, but
 in practice the UEFI firmware I've seen with a BGRT does follow that
 recommendation and store it in EfiBootServicesData.  So, even if the
 recommendation in the spec changed, the kernel would still have to
 accommodate both possibilities.

Just for the theoretical debate:

The edk2 commit linked above is 5 days old. All UEFI firmware in the
wild (on released hardware) should be using EfiReservedMemoryType (the
pre-patch memory type), which is even stricter.

EfiReservedMemoryType can never be released or repurposed, so it should
make no difference for crash kernel allocation, shouldn't it?

- call efi_free_boot_services() -- doesn't touch the image data (which
  is in RAM of EfiReservedMemoryType),
- reserve crash kernel,
- access BGRT via ACPI.

BGRT had appeared in edk2 with

  https://github.com/tianocore/edk2/commit/0284e90c

and EfiReservedMemoryType used to be the allocation type until commit
4c58575e.

Or are you alluding to UEFI firmware that's not based on TianoCore?

Laszlo


Re: [edk2] Corrupted EFI region

2013-09-16 Thread Matthew Garrett
On Mon, Sep 16, 2013 at 06:25:22PM +0200, Laszlo Ersek wrote:

 Or are you alluding to UEFI firmware that's not based on TianoCore?

Most BGRT implementations are IBV specific rather than coming from 
Tiano. The ACPI spec says that the image should be stored in 
EfiBootServicesData, and most implementations follow that.

-- 
Matthew Garrett | mj...@srcf.ucam.org


Re: [edk2] Corrupted EFI region

2013-09-16 Thread Josh Triplett
On Mon, Sep 16, 2013 at 06:25:22PM +0200, Laszlo Ersek wrote:
 On 09/16/13 17:57, Josh Triplett wrote:
 
  The edk2 commit that flipped the memory type underneath the image data
  from EfiReservedMemoryType to EfiBootServicesData is:
 
  https://github.com/tianocore/edk2/commit/4c58575e
 
  I think this commit is wrong. It's fine for OSPM to release the image
  data at some point, but not right after ExitBootServices(), because
  referencing pointers in ACPI tables survive strictly longer.
 
  ... Actually, the commit does follow the ACPI spec 5.0:
 
  5.2.22.4 Image Address
 
  The Image Address contains the location in memory where an
  in-memory copy of the boot image can be found. The image should be
  stored in EfiBootServicesData, allowing the system to reclaim
  the memory when the image is no longer needed.
 
  The ACPI spec 5.0 should recommend EfiACPIReclaimMemory here IMO. (I
  take the current wording (should be stored) as a recommendation only.)
  
  I agree that UEFI *should* store the BGRT in EfiACPIReclaimMemory, but
  in practice the UEFI firmware I've seen with a BGRT does follow that
  recommendation and store it in EfiBootServicesData.  So, even if the
  recommendation in the spec changed, the kernel would still have to
  accommodate both possibilities.
 
 Just for the theoretical debate:
 
 The edk2 commit linked above is 5 days old. All UEFI firmware in the
 wild (on released hardware) should be using EfiReservedMemoryType (the
 pre-patch memory type), which is even stricter.
 
 EfiReservedMemoryType can never be released or repurposed, so it should
 make no difference for crash kernel allocation, shouldn't it?
 
 - call efi_free_boot_services() -- doesn't touch the image data (which
   is in RAM of EfiReservedMemoryType),
 - reserve crash kernel,
 - access BGRT via ACPI.
 
 BGRT had appeared in edk2 with
 
   https://github.com/tianocore/edk2/commit/0284e90c
 
 and EfiReservedMemoryType used to be the allocation type until commit
 4c58575e.
 
 Or are you alluding to UEFI firmware that's not based on TianoCore?

I'm saying, in practice, that the systems I tested BGRT support on and
submitted patches for stored the BGRT's image in EfiBootServicesData.

- Josh Triplett


Re: [edk2] Corrupted EFI region

2013-09-02 Thread Matt Fleming
On Thu, 08 Aug, at 06:46:02AM, Andrew Fish wrote:
 
 On Aug 8, 2013, at 3:17 AM, Matt Fleming m...@console-pimps.org wrote:
 
  On Wed, 07 Aug, at 02:10:28PM, Andrew Fish wrote:
  Well the issue I see is I don't think OS X or Windows are doing this.
  So I'm guessing there is some unique thing being done on the Linux
  side and we don't have good tests to catch bugs in the EFI
  implementations. If the Linux loader hides the bugs and we don't hit
  them with other operating systems they are never going to get fixed.
  It would be good if we could track down some of these issues and make
  a request for some tests that can help catch these issues. The tests
  would be part of UEFI.org, but since some of us play in both worlds we
  can forward the known issues to the UEFI test work group. 
  
  I'm all for helping to develop tests that catch these kind of bugs.
  What's the next step?
  
 
 I'll bring this up with UEFI.org.
 
For those attending the UEFI plugfest in New Orleans this would be a
good topic for discussion - figuring out a collaboration process to get
new tests in place.

-- 
Matt Fleming, Intel Open Source Technology Center


Re: [edk2] Corrupted EFI region

2013-08-18 Thread Jordan Justen
0001-OvmfPkg-allocate-the-EFI-memory-map-for-Linux-as-Loa.patch
was applied in r14555.

Thanks for the contribution.

And thanks for the bug report and testing, Boris.

On Wed, Aug 7, 2013 at 10:49 AM, Laszlo Ersek ler...@redhat.com wrote:
 On 08/07/13 17:19, Borislav Petkov wrote:
 On Tue, Aug 06, 2013 at 05:31:29PM +0200, Laszlo Ersek wrote:
 Can you capture the OVMF debug output? Do you see

   ConvertPages: Incompatible memory types

 there?

 Can you set the following bits too in the debug mask?

 #define DEBUG_POOL  0x0010  // Alloc & Free's
 #define DEBUG_PAGE  0x0020  // Alloc & Free's

 Ok, I got debug output; I have to be careful now of not missing
 anything. Ok, so here we go:

 First of all, I changed debugging mask to:

   gEfiMdePkgTokenSpaceGuid.PcdDebugPrintErrorLevel|0x8010007F

 (I just set all three bits you requested).

 Using the new OVMF.fd changed the addresses, of course, so we're looking
 at 0x7dc59XXX ones now.

 [0.00] memblock_reserve: [0x007dc59018-0x007dc59618] efi_memblock_x86_reserve_range+0x70/0x75

 So, I've attached an archive of the debug logs. The initial observation
 I could make is that the region still gets squashed to:

 [0.014041] efi: mem11: type=4, attr=0xf, range=[0x7dc59000-0x7dc59000) (0MB)

 from

 [0.00] efi: mem11: type=4, attr=0xf, range=[0x7dc59000-0x7e146000) (4MB)

 And the interesting stuff in the OVMF output is right at the end:

 ConvertRange: 7DC59000-7DC5AFFF to 4
 AddRange: 7DC59000-7DC5AFFF to 4
 AllocatePoolI: Type 4, Addr 7DC59018 (len 16F0) 26,735,072
 Jumping to kernel

 We get that same output no matter if I boot it with -enable-kvm or
 not.

 If the order of the debug messages is the same as the calls actually
 happen, we AllocatePoolI to address 7DC59018 which we already have added
 as a range. But I'm not going to pretend I even know the code so I'll
 let you comment instead :).

 I think this allows us to solve the bug :)

 First, forget everything I said :) I was completely lost.

 Remember this?

 01 efi_main()
 02   exit_boot()
 03     low_alloc()
 04     GetMemoryMap()
 05     ExitBootServices()
 06
 07 start_kernel()
 08   setup_arch()
 09     efi_memblock_x86_reserve_range()
 10     efi_reserve_boot_services()
 11   efi_enter_virtual_mode()
 12     SetVirtualAddressMap()

 Now, lines 01 to 05 *do not happen*.

 More precisely, they don't happen in the kernel. They happen in the firmware. 
 Specifically, OvmfPkg/Library/LoadLinuxLib/Linux.c.

 You're booting the kernel from the qemu command line. The kernel you run is 
 also an [o]ld kernel[] without EFI handover protocol. So what happens is, 
 OVMF downloads the kernel image from qemu over fw_cfg, figures it's an old 
 kernel...

 PlatformBdsPolicyBehavior() [OvmfPkg/Library/PlatformBdsLib/BdsPlatform.c]
   // Process QEMU's -kernel command line option:
   TryRunningQemuKernel()    [OvmfPkg/Library/PlatformBdsLib/QemuKernel.c]
     LoadLinux()             [OvmfPkg/Library/LoadLinuxLib/Linux.c]
       // Old kernels without EFI handover protocol
       SetupLinuxBootParams()
         SetupLinuxMemmap()
           AllocatePool() <-- !!!
           gBS->GetMemoryMap()
           gBS->ExitBootServices()
       prints "Jumping to kernel"
       JumpToKernel()

 Now pull up efi_memblock_x86_reserve_range(). It reserves 
 boot_params.efi_info->efi_memmap.

 I assumed this field would come from the exit_boot() kernel function. It 
 doesn't. It comes from SetupLinuxMemmap(). The former allocates the backing 
 store as EFI_LOADER_DATA. The latter, alas, marked with !!! above, as boot 
 services data. :)

 So, what you're seeing in the OVMF debug log:

 ConvertRange: 7DC59000-7DC5AFFF to 4
 AddRange: 7DC59000-7DC5AFFF to 4
 AllocatePoolI: Type 4, Addr 7DC59018 (len 16F0) 26,735,072

 This is self-consistent. It just documents that the AllocatePool() call 
 marked with !!! needs to grab two full pages first (two first lines), carve 
 them up into pool chunks, and then serve the request from them (third line).
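
The arithmetic behind those three log lines can be checked directly. This is a minimal sketch with hypothetical helper names (`pages_in_range`, `fits_in_range`); it assumes 4 KiB EFI pages and reads the ConvertRange bounds as inclusive, as the log prints them. That the 0x18-byte offset of the returned address is pool bookkeeping placed ahead of the data is an assumption, not something the log states.

```c
#include <assert.h>
#include <stdint.h>

#define EFI_PAGE_SIZE 4096ULL

/* Number of whole pages covered by the inclusive range [first, last]. */
static uint64_t pages_in_range(uint64_t first, uint64_t last)
{
    assert(last >= first);
    return (last - first + 1) / EFI_PAGE_SIZE;
}

/* True if [addr, addr + len) lies entirely inside inclusive [first, last]. */
static int fits_in_range(uint64_t addr, uint64_t len,
                         uint64_t first, uint64_t last)
{
    assert(len > 0);
    return addr >= first && addr + len - 1 <= last;
}
```

For the values in the log, `pages_in_range(0x7DC59000, 0x7DC5AFFF)` gives two pages, and the 0x16F0-byte allocation at 0x7DC59018 fits inside that range but would not fit inside the first page alone, which is consistent with the pool code having to grab two pages.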

 The address displayed here shows up in the linux dmesg later on because the 
 storage for the memory map itself is allocated, and populated, by OVMF, not 
 the EFI stub in the kernel.

 In one sentence, efi_memblock_x86_reserve_range() expects that 
 boot_params.efi_info->efi_memmap has been allocated as loader data (by 
 whomever), but SetupLinuxMemmap() violates this by allocating the storage as 
 boot services data.

 This leads to double reservation attempts between 
 efi_memblock_x86_reserve_range(), and efi_reserve_boot_services().
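
The conflict can be modeled in a few lines. This is a toy sketch, not the kernel's memblock: `struct reserved_list`, `reserve()` and `reserve_boot_region()` are hypothetical names, and the rule "if any part of the boot services region is already reserved, skip it and zero its page count" is an assumption about how efi_reserve_boot_services() behaves in kernels of this era.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_RESERVED 8

/* Toy stand-in for memblock's reserved list: half-open ranges [start, end). */
struct reserved_list {
    uint64_t start[MAX_RESERVED];
    uint64_t end[MAX_RESERVED];
    size_t count;
};

static bool is_region_reserved(const struct reserved_list *r,
                               uint64_t start, uint64_t size)
{
    for (size_t i = 0; i < r->count; i++)
        if (start < r->end[i] && r->start[i] < start + size)
            return true;    /* any overlap counts */
    return false;
}

static void reserve(struct reserved_list *r, uint64_t start, uint64_t size)
{
    assert(r->count < MAX_RESERVED);
    r->start[r->count] = start;
    r->end[r->count] = start + size;
    r->count++;
}

/*
 * Models the boot-services pass: reserve the region if nothing overlaps it,
 * otherwise give up on it and zero its page count, which is how a region
 * ends up printed as an empty range later.
 */
static void reserve_boot_region(struct reserved_list *r, uint64_t start,
                                uint64_t *num_pages)
{
    uint64_t size = *num_pages << 12;   /* 4 KiB pages */

    if (is_region_reserved(r, start, size))
        *num_pages = 0;
    else
        reserve(r, start, size);
}
```

If the memory map buffer at 0x7dc59018 is reserved first and the boot services region starting at 0x7dc59000 is then passed to `reserve_boot_region()`, the overlap check fires and the page count drops to zero, matching the region that shows up as [0x7dc59000-0x7dc59000) in the dmesg output. Had the buffer been allocated as loader data, it would sit outside any boot services region and the second reservation would go through.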

 The attached edk2 patch should fix it. Please confirm.

 Thanks,
 Laszlo



Re: [edk2] Corrupted EFI region

2013-08-07 Thread Andrew Fish

On Aug 7, 2013, at 8:19 AM, Borislav Petkov b...@alien8.de wrote:

 On Tue, Aug 06, 2013 at 05:31:29PM +0200, Laszlo Ersek wrote:
 Can you capture the OVMF debug output? Do you see
 
  ConvertPages: Incompatible memory types
 
 there?
 
 Can you set the following bits too in the debug mask?
 
 #define DEBUG_POOL  0x0010  // Alloc & Free's
 #define DEBUG_PAGE  0x0020  // Alloc & Free's
 
 Ok, I got debug output; I have to be careful now of not missing
 anything. Ok, so here we go:
 
 First of all, I changed debugging mask to:
 
  gEfiMdePkgTokenSpaceGuid.PcdDebugPrintErrorLevel|0x8010007F
 
 (I just set all three bits you requested).
 
 Using the new OVMF.fd changed the addresses, of course, so we're looking
 at 0x7dc59XXX ones now.
 
 [0.00] memblock_reserve: [0x007dc59018-0x007dc59618] efi_memblock_x86_reserve_range+0x70/0x75
 
 So, I've attached an archive of the debug logs. The initial observation
 I could make is that the region still gets squashed to:
 
 [0.014041] efi: mem11: type=4, attr=0xf, range=[0x7dc59000-0x7dc59000) (0MB)
 
 from
 
 [0.00] efi: mem11: type=4, attr=0xf, range=[0x7dc59000-0x7e146000) (4MB)
 

OK so I think I need some Cliff Notes here to help me understand what is going 
on...

type 4 is EfiBootServicesData and attr 0x0f is cache attributes with no request 
for a runtime mapping. This is not runtime memory so to the OS loader it is 
just memory EFI has used that will get freed back to the OS after 
ExitBootServices(), along with EfiBootServicesCode, EfiLoaderCode, and 
EfiLoaderData. The EfiLoaderCode and EfiLoaderData also get freed back to the 
OS and they just exist for the convenience of the OS loader. 
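
As a quick check of that reading of `type=4, attr=0xf`, here is a minimal sketch that decodes the attribute word. The bit values are the ones defined by the UEFI specification; `struct mem_desc` is a trimmed-down, hypothetical stand-in for a memory map entry, and the helper names are made up for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Attribute bits as defined by the UEFI specification. */
#define EFI_MEMORY_UC      0x0000000000000001ULL /* uncacheable */
#define EFI_MEMORY_WC      0x0000000000000002ULL /* write-combining */
#define EFI_MEMORY_WT      0x0000000000000004ULL /* write-through */
#define EFI_MEMORY_WB      0x0000000000000008ULL /* write-back */
#define EFI_MEMORY_RUNTIME 0x8000000000000000ULL /* needs a runtime mapping */

#define EFI_BOOT_SERVICES_DATA 4u

/* Hypothetical, trimmed-down view of one memory map entry. */
struct mem_desc {
    uint32_t type;
    uint64_t attribute;
};

/* True if the firmware asked for this region to be mapped at runtime. */
static bool wants_runtime_mapping(const struct mem_desc *md)
{
    assert(md != NULL);
    return (md->attribute & EFI_MEMORY_RUNTIME) != 0;
}

/* True if the attribute advertises all four basic cacheability modes. */
static bool all_cache_modes(const struct mem_desc *md)
{
    const uint64_t mask = EFI_MEMORY_UC | EFI_MEMORY_WC |
                          EFI_MEMORY_WT | EFI_MEMORY_WB;

    assert(md != NULL);
    return (md->attribute & mask) == mask;
}
```

For an entry with type 4 and attribute 0xf, `all_cache_modes()` is true and `wants_runtime_mapping()` is false: the low four bits are cacheability capabilities only, and the runtime bit is clear.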

So I can't figure out why this matters? Given:

typedef enum {
// Boot Services Memory
EfiLoaderCode = 1,
EfiLoaderData = 2,
EfiBootServicesCode = 3,
EfiBootServicesData = 4,
EfiConventionalMemory = 7,

// EFI Runtime Drivers
EfiRuntimeServicesCode = 5,
EfiRuntimeServicesData = 6,

// Stuff that may get mapped into Runtime
EfiReservedMemoryType = 0,
EfiACPIReclaimMemory = 9,
EfiACPIMemoryNVS = 10,
EfiMemoryMappedIO = 11,
EfiMemoryMappedIOPortSpace = 12,
EfiPalCode = 13,
   
EfiUnusableMemory = 8,
EfiMaxMemoryType = 14
} EFI_MEMORY_TYPE;

[0.005012] efi: efi_enter_virtual_mode
**[0.006004] efi: mem00: type=7, attr=0xf, range=[0x-0x0009f000) (0MB)
*[0.007004] efi: mem01: type=2, attr=0xf, range=[0x0009f000-0x000a) (0MB)

**[0.008004] efi: mem02: type=7, attr=0xf, range=[0x0010-0x0080) (7MB)
*[0.009004] efi: mem03: type=4, attr=0xf, range=[0x0080-0x0100) (8MB)
**[0.010004] efi: mem04: type=7, attr=0xf, range=[0x0100-0x0200) (16MB)
*[0.011004] efi: mem05: type=2, attr=0xf, range=[0x0200-0x036e5000) (22MB)
**[0.012004] efi: mem06: type=7, attr=0xf, range=[0x036e5000-0x3fffc000) (969MB)
*[0.013004] efi: mem07: type=2, attr=0xf, range=[0x3fffc000-0x4000) (0MB)
**[0.014004] efi: mem08: type=7, attr=0xf, range=[0x4000-0x7c00) (960MB)
*[0.015004] efi: mem09: type=4, attr=0xf, range=[0x7c00-0x7c02) (0MB)
**[0.016004] efi: mem10: type=7, attr=0xf, range=[0x7c02-0x7dc59000) (28MB)
*[0.017004] efi: mem11: type=4, attr=0xf, range=[0x7dc59000-0x7dc59000) (0MB)
*[0.018004] efi: mem12: type=3, attr=0xf, range=[0x7e146000-0x7e1c2000) (0MB)
*[0.019004] efi: mem13: type=4, attr=0xf, range=[0x7e1c2000-0x7e1ca000) (0MB)
*[0.020004] efi: mem14: type=3, attr=0xf, range=[0x7e1ca000-0x7e1d4000) (0MB)
*[0.021004] efi: mem15: type=4, attr=0xf, range=[0x7e1d4000-0x7e1d6000) (0MB)
*[0.022004] efi: mem16: type=3, attr=0xf, range=[0x7e1d6000-0x7e368000) (1MB)

[0.023004] efi: mem17: type=6, attr=0x800f, range=[0x7e368000-0x7e37d000) (0MB)

*[0.024004] efi: mem18: type=4, attr=0xf, range=[0x7e37d000-0x7e8c8000) (5MB)

[0.025004] efi: mem19: type=5, attr=0x800f, range=[0x7e8c8000-0x7e8cf000) (0MB)

*[0.026004] efi: mem20: type=4, attr=0xf, range=[0x7e8cf000-0x7e923000) (0MB)

[0.028010] efi: mem21: type=6, attr=0x800f, range=[0x7e923000-0x7e925000) (0MB)
[0.029004] efi: mem22: type=5, attr=0x800f, range=[0x7e925000-0x7e934000) (0MB)

*[0.031004] efi: mem23: type=4, attr=0xf, range=[0x7e934000-0x7f881000) (15MB)
*[0.032004] efi: mem24: type=3, attr=0xf, 

Re: [edk2] Corrupted EFI region

2013-08-07 Thread Matt Fleming
[ Readding Matthew Garrett to the Cc list, seeing as we both got removed
  for some unknown reason ]

On Wed, 07 Aug, at 10:23:56AM, Andrew Fish wrote:

 OK so I think I need some Cliff Notes here to help me understand what
 is going on...
 
 type 4 is EfiBootServicesData and attr 0x0f is cache attributes with
 no request for a runtime mapping. This is not runtime memory so to the
 OS loader it is just memory EFI has used that will get freed back to
 the OS after ExitBootServices(), along with EfiBootServicesCode,
 EfiLoaderCode, and EfiLoaderData. The EfiLoaderCode and EfiLoaderData
 also get freed back to the OS and they just exist for the convenience
 of the OS loader. 
 
 So I can't figure out why this matters? Given:

We've seen a bunch of systems that make calls into EfiBootServicesCode
after ExitBootServices(). There were some Apple machines in that list,
though I don't have the details but Matthew should.
 
So we map these regions unconditionally and in their original state,
otherwise the firmware will generate fatal page faults when trying to
access those memory regions.
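
For anyone following the type= numbers in these dumps, here is a minimal sketch of the mapping, written in Python purely for illustration. The numeric values follow the EFI_MEMORY_TYPE enumeration in the UEFI specification; the helper names and the RECLAIMABLE set are illustrative shorthand for the four types named in the quoted mail, not kernel code.

```python
# EFI_MEMORY_TYPE values as enumerated in the UEFI specification.
EFI_MEMORY_TYPES = {
    0: "EfiReservedMemoryType",
    1: "EfiLoaderCode",
    2: "EfiLoaderData",
    3: "EfiBootServicesCode",
    4: "EfiBootServicesData",
    5: "EfiRuntimeServicesCode",
    6: "EfiRuntimeServicesData",
    7: "EfiConventionalMemory",
    8: "EfiUnusableMemory",
    9: "EfiACPIReclaimMemory",
    10: "EfiACPIMemoryNVS",
    11: "EfiMemoryMappedIO",
    12: "EfiMemoryMappedIOPortSpace",
    13: "EfiPalCode",
}

# Illustrative set: the four types the quoted mail says are handed back
# to the OS once ExitBootServices() has returned.
RECLAIMABLE = {1, 2, 3, 4}


def type_name(t):
    """Return the spec name for a numeric memory type."""
    return EFI_MEMORY_TYPES.get(t, "unknown(%d)" % t)


def freed_after_exit_boot_services(t):
    """True if the spec treats this type as ordinary memory after exit."""
    return t in RECLAIMABLE


print(type_name(4), freed_after_exit_boot_services(4))
```

The catch described above is that some firmware still touches types 3 and 4 after ExitBootServices(), which is why the kernel keeps those regions mapped regardless of what this table suggests.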

-- 
Matt Fleming, Intel Open Source Technology Center


Re: [edk2] Corrupted EFI region

2013-08-05 Thread Laszlo Ersek
On 08/01/13 18:49, Borislav Petkov wrote:
 On Wed, Jul 31, 2013 at 10:55:27PM +0100, David Woodhouse wrote:
 On Wed, 2013-07-31 at 22:54 +0200, Borislav Petkov wrote:
 so I'm seeing this funny thing where an EFI region changes when we enter
 efi_enter_virtual_mode when booting with edk2 on kvm. Here's the diff:

 Perhaps the edk2-de...@lists.sourceforge.net list should be in Cc?
 
 Good idea and message repeated below.
 
 One more thing: I'm using a self-built OVMF with top commit from March:
 
 
 r14165 | sfu5 | 2013-03-06 02:42:04 +0100 (Wed, 06 Mar 2013) | 4 lines
 
 Fix a bug that IsSignatureFoundInDatabase() incorrectly computes CertCount.
 
 ---
 
 Hi guys,
 
 so I'm seeing this funny thing where an EFI region changes when we enter
 efi_enter_virtual_mode when booting with edk2 on kvm. Here's the diff:
 
 --- before  2013-07-31 22:20:52.316039492 +0200
 +++ after   2013-07-31 22:21:30.960731706 +0200
 @@ -9,7 +9,7 @@ efi: mem07: type=2, attr=0xf, range=[0x0
  efi: mem08: type=7, attr=0xf, range=[0x4000-0x7c00) 
 (960MB)
  efi: mem09: type=4, attr=0xf, range=[0x7c00-0x7c02) 
 (0MB)
  efi: mem10: type=7, attr=0xf, range=[0x7c02-0x7e0ad000) 
 (32MB)
 -efi: mem11: type=4, attr=0xf, range=[0x7e0ad000-0x7e0cc000) 
 (0MB)
 +efi: mem11: type=4, attr=0xf, range=[0x7e0ad000-0x7e0ad000) 
 (0MB)

(type 4 is EfiBootServicesData)

  efi: mem12: type=7, attr=0xf, range=[0x7e0cc000-0x7e0cd000) 
 (0MB)
  efi: mem13: type=4, attr=0xf, range=[0x7e0cd000-0x7e55d000) 
 (4MB)
  efi: mem14: type=3, attr=0xf, range=[0x7e55d000-0x7e59c000) 
 (0MB)
 
 That second boundary of region mem11 suddenly changes *before* we merge
 the regions. edk2 bug?

I take it you mean this change (ie. appearance of the zero-sized range)
occurs when you enable KVM acceleration in qemu?

If so, please locate gEfiMdePkgTokenSpaceGuid.PcdDebugPrintErrorLevel
in OvmfPkg/OvmfPkgX64.dsc, and set the following bit in its value:

  # DEBUG_GCD  0x0010 Global Coherency Database changes

Then please rebuild OVMF, and capture the debug port output of qemu
(-debugcon file:debug.log -global isa-debugcon.iobase=0x402) both with
and without KVM.

DEBUG_GCD should produce messages related to CoreAllocateSpace(), and
might help us find the spot the difference is introduced.
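
A small sketch of what "set the following bit" means in practice, in Python for illustration only. Both constants are assumptions: the mask printed in the archived message above may have lost digits, and to my knowledge edk2's DebugLib.h defines DEBUG_GCD as 0x00100000; the starting mask is a hypothetical value, so check both against your own tree before editing the .dsc file.

```python
# Assumed value of DEBUG_GCD (believed to match edk2's DebugLib.h).
DEBUG_GCD = 0x00100000

# Hypothetical existing value of PcdDebugPrintErrorLevel in the .dsc.
EXAMPLE_CURRENT_MASK = 0x8000004F


def with_debug_bit(mask, bit):
    """Return mask with the given DEBUG_* bit set, other bits untouched."""
    return mask | bit


new_mask = with_debug_bit(EXAMPLE_CURRENT_MASK, DEBUG_GCD)
print(hex(new_mask))
```

OR-ing the bit in, rather than replacing the value, keeps whatever debug categories were already enabled.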

BTW does this have anything to do with the NX bit report of yours, or
have you noticed this independently?

(I'm not subscribed to lkml so apologies if this email doesn't end up in
those archives / doesn't reach everyone.)

Thanks
Laszlo


Re: [edk2] Corrupted EFI region

2013-08-05 Thread Borislav Petkov
On Mon, Aug 05, 2013 at 01:27:16PM +0200, Laszlo Ersek wrote:
  --- before  2013-07-31 22:20:52.316039492 +0200
  +++ after   2013-07-31 22:21:30.960731706 +0200
  @@ -9,7 +9,7 @@ efi: mem07: type=2, attr=0xf, range=[0x0
   efi: mem08: type=7, attr=0xf, 
  range=[0x4000-0x7c00) (960MB)
   efi: mem09: type=4, attr=0xf, 
  range=[0x7c00-0x7c02) (0MB)
   efi: mem10: type=7, attr=0xf, 
  range=[0x7c02-0x7e0ad000) (32MB)
  -efi: mem11: type=4, attr=0xf, 
  range=[0x7e0ad000-0x7e0cc000) (0MB)
  +efi: mem11: type=4, attr=0xf, 
  range=[0x7e0ad000-0x7e0ad000) (0MB)
 
 (type 4 is EfiBootServicesData)

Yes.

   efi: mem12: type=7, attr=0xf, 
  range=[0x7e0cc000-0x7e0cd000) (0MB)
   efi: mem13: type=4, attr=0xf, 
  range=[0x7e0cd000-0x7e55d000) (4MB)
   efi: mem14: type=3, attr=0xf, 
  range=[0x7e55d000-0x7e59c000) (0MB)
  
  That second boundary of region mem11 suddenly changes *before* we merge
  the regions. edk2 bug?
 
 I take it you mean this change (ie. appearance of the zero-sized range)
 occurs when you enable KVM acceleration in qemu?

Right. And I'm booting with qemu -enable-kvm so KVM acceleration is
enabled?? Or do you mean something else.

 If so, please locate gEfiMdePkgTokenSpaceGuid.PcdDebugPrintErrorLevel
 in OvmfPkg/OvmfPkgX64.dsc, and set the following bit in its value:
 
   # DEBUG_GCD  0x0010 Global Coherency Database changes
 
 Then please rebuild OVMF, and capture the debug port output of qemu
 (-debugcon file:debug.log -global isa-debugcon.iobase=0x402) both with
 and without KVM.
 
 DEBUG_GCD should produce messages related to CoreAllocateSpace(), and
 might help us find the spot the difference is introduced.

Ok, I'll try to get this thing done before my vacation. If not, we'll
deal with it afterwards but I won't forget, I promise! :-)

 BTW does this have anything to do with the NX bit report of yours, or
 have you noticed this independently?

Independently, while testing my runtime services mapping patchset. I was
getting an empty region and was wondering whether to discard it from the
mapping or not and then I looked at why I get it in the first place.

Basically, I get this empty region which appears at some point. It is
already there when we enter efi_enter_virtual_mode in the kernel to set up
the runtime mappings:

[0.005012] efi: efi_enter_virtual_mode: enter
[0.006004] efi: mem00: type=7, attr=0xf, 
range=[0x-0x0009f000) (0MB)
[0.007004] efi: mem01: type=2, attr=0xf, 
range=[0x0009f000-0x000a) (0MB)
[0.008003] efi: mem02: type=7, attr=0xf, 
range=[0x0010-0x0080) (7MB)
[0.009004] efi: mem03: type=4, attr=0xf, 
range=[0x0080-0x0100) (8MB)
[0.010004] efi: mem04: type=7, attr=0xf, 
range=[0x0100-0x0200) (16MB)
[0.011004] efi: mem05: type=2, attr=0xf, 
range=[0x0200-0x036e3000) (22MB)
[0.012004] efi: mem06: type=7, attr=0xf, 
range=[0x036e3000-0x3fffb000) (969MB)
[0.013003] efi: mem07: type=2, attr=0xf, 
range=[0x3fffb000-0x4000) (0MB)
[0.014004] efi: mem08: type=7, attr=0xf, 
range=[0x4000-0x7c00) (960MB)
[0.015004] efi: mem09: type=4, attr=0xf, 
range=[0x7c00-0x7c02) (0MB)
[0.016004] efi: mem10: type=7, attr=0xf, 
range=[0x7c02-0x7e0ad000) (32MB)
[0.017004] efi: mem11: type=4, attr=0xf, 
range=[0x7e0ad000-0x7e0ad000) (0MB)

^^

[0.018003] efi: mem12: type=7, attr=0xf, 
range=[0x7e0cc000-0x7e0cd000) (0MB)

When we dump the EFI regions initially, it is ok.

[0.00] efi: mem10: type=7, attr=0xf, range=[0x7c02-0x7e0ad000) (32MB)
[0.00] efi: mem11: type=4, attr=0xf, range=[0x7e0ad000-0x7e0cc000) (0MB)
[0.00] efi: mem12: type=7, attr=0xf, range=[0x7e0cc000-0x7e0cd000) (0MB)

So what basically happens is the end boundary of the region becomes the
start, practically turning it into a 0-size one.
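
A rough way to spot such an entry mechanically, sketched in Python. The regular expression is modelled on the dump lines in this thread and tolerates the line wrap before range=; the function names and the sample text are made up for illustration. Entries whose addresses were mangled by the archive (no hex digits after 0x) simply do not match.

```python
import re

# One entry of the kernel's EFI memory map dump; \s* absorbs the line
# wrap that the archive inserts before "range=".
ENTRY_RE = re.compile(
    r"efi: mem(\d+): type=(\d+), attr=(0x[0-9a-fA-F]+),\s*"
    r"range=\[(0x[0-9a-fA-F]+)-(0x[0-9a-fA-F]+)\)"
)


def parse_entries(log_text):
    """Return a list of (index, type, start, end) tuples found in log_text."""
    entries = []
    for m in ENTRY_RE.finditer(log_text):
        idx, typ, _attr, start, end = m.groups()
        entries.append((int(idx), int(typ), int(start, 16), int(end, 16)))
    return entries


def zero_sized(entries):
    """Return the entries whose start and end addresses coincide."""
    return [e for e in entries if e[2] == e[3]]


# Two entries copied from the second dump in this mail.
SAMPLE = (
    "[0.017004] efi: mem11: type=4, attr=0xf, \n"
    "range=[0x7e0ad000-0x7e0ad000) (0MB)\n"
    "[0.018003] efi: mem12: type=7, attr=0xf, \n"
    "range=[0x7e0cc000-0x7e0cd000) (0MB)\n"
)

for idx, typ, start, end in zero_sized(parse_entries(SAMPLE)):
    print("mem%02d type=%d is empty at %#x" % (idx, typ, start))
```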

Thanks for looking into it.

-- 
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.
--


Re: [edk2] Corrupted EFI region

2013-08-05 Thread Laszlo Ersek
On 08/05/13 15:02, Borislav Petkov wrote:
 On Mon, Aug 05, 2013 at 01:27:16PM +0200, Laszlo Ersek wrote:
 --- before  2013-07-31 22:20:52.316039492 +0200
 +++ after   2013-07-31 22:21:30.960731706 +0200
 @@ -9,7 +9,7 @@ efi: mem07: type=2, attr=0xf, range=[0x0
  efi: mem08: type=7, attr=0xf, 
 range=[0x4000-0x7c00) (960MB)
  efi: mem09: type=4, attr=0xf, 
 range=[0x7c00-0x7c02) (0MB)
  efi: mem10: type=7, attr=0xf, 
 range=[0x7c02-0x7e0ad000) (32MB)
 -efi: mem11: type=4, attr=0xf, 
 range=[0x7e0ad000-0x7e0cc000) (0MB)
 +efi: mem11: type=4, attr=0xf, 
 range=[0x7e0ad000-0x7e0ad000) (0MB)

 (type 4 is EfiBootServicesData)
 
 Yes.
 
  efi: mem12: type=7, attr=0xf, 
 range=[0x7e0cc000-0x7e0cd000) (0MB)
  efi: mem13: type=4, attr=0xf, 
 range=[0x7e0cd000-0x7e55d000) (4MB)
  efi: mem14: type=3, attr=0xf, 
 range=[0x7e55d000-0x7e59c000) (0MB)

 That second boundary of region mem11 suddenly changes *before* we merge
 the regions. edk2 bug?

 I take it you mean this change (ie. appearance of the zero-sized range)
 occurs when you enable KVM acceleration in qemu?
 
 Right. And I'm booting with qemu -enable-kvm so KVM acceleration is
 enabled?? Or do you mean something else.

My question was: is my understanding correct that you only see this
problem with -enable-kvm? Because,

On 08/01/13 18:49, Borislav Petkov wrote:
 so I'm seeing this funny thing where an EFI region changes when we
 enter efi_enter_virtual_mode when booting with edk2 on kvm. Here's
 the diff:

You said "on kvm", and provided a diff. I think (hope) I understand the
environment you've denoted with "after", but what's your "before"? The
absence of -enable-kvm, or something else?

 
 If so, please locate gEfiMdePkgTokenSpaceGuid.PcdDebugPrintErrorLevel
 in OvmfPkg/OvmfPkgX64.dsc, and set the following bit in its value:

   # DEBUG_GCD  0x0010 Global Coherency Database changes

 Then please rebuild OVMF, and capture the debug port output of qemu
 (-debugcon file:debug.log -global isa-debugcon.iobase=0x402) both with
 and without KVM.

 DEBUG_GCD should produce messages related to CoreAllocateSpace(), and
 might help us find the spot the difference is introduced.
 
 Ok, I'll try to get this thing done before my vacation. If not, we'll
 deal with it afterwards but I won't forget, I promise! :-)
 
 BTW does this have anything to do with the NX bit report of yours, or
 have you noticed this independently?
 
 Independently, while testing my runtime services mapping patchset.

What's the purpose of that series? Can you please provide a link (if you
posted versions of it already)?

 I was
 getting an empty region and was wondering whether to discard it from the
 mapping or not and then I looked at why I get it in the first place.
 
 Basically, I get this empty region which appears at some point. It is
 there when we enter efi_enter_virtual_mode in the kernel to setup the
 runtime mappings:
 
 [0.005012] efi: efi_enter_virtual_mode: enter
 [0.006004] efi: mem00: type=7, attr=0xf, 
 range=[0x-0x0009f000) (0MB)
 [0.007004] efi: mem01: type=2, attr=0xf, 
 range=[0x0009f000-0x000a) (0MB)
 [0.008003] efi: mem02: type=7, attr=0xf, 
 range=[0x0010-0x0080) (7MB)
 [0.009004] efi: mem03: type=4, attr=0xf, 
 range=[0x0080-0x0100) (8MB)
 [0.010004] efi: mem04: type=7, attr=0xf, 
 range=[0x0100-0x0200) (16MB)
 [0.011004] efi: mem05: type=2, attr=0xf, 
 range=[0x0200-0x036e3000) (22MB)
 [0.012004] efi: mem06: type=7, attr=0xf, 
 range=[0x036e3000-0x3fffb000) (969MB)
 [0.013003] efi: mem07: type=2, attr=0xf, 
 range=[0x3fffb000-0x4000) (0MB)
 [0.014004] efi: mem08: type=7, attr=0xf, 
 range=[0x4000-0x7c00) (960MB)
 [0.015004] efi: mem09: type=4, attr=0xf, 
 range=[0x7c00-0x7c02) (0MB)
 [0.016004] efi: mem10: type=7, attr=0xf, 
 range=[0x7c02-0x7e0ad000) (32MB)
 [0.017004] efi: mem11: type=4, attr=0xf, 
 range=[0x7e0ad000-0x7e0ad000) (0MB)
   
 ^^
 
 [0.018003] efi: mem12: type=7, attr=0xf, 
 range=[0x7e0cc000-0x7e0cd000) (0MB)
 
 When we dump the EFI regions initially, it is ok.
 
 [0.00] efi: mem10: type=7, attr=0xf, 
 range=[0x7c02-0x7e0ad000) (32MB)
 [0.00] efi: mem11: type=4, attr=0xf, 
 range=[0x7e0ad000-0x7e0cc000) (0MB)
 [0.00] efi: mem12: type=7, attr=0xf, 
 range=[0x7e0cc000-0x7e0cd000) (0MB)
 
 So what basically happens is the end boundary of the region becomes the
 start, practically turning it into a 0-size one.

... and you 

Re: [edk2] Corrupted EFI region

2013-08-05 Thread Borislav Petkov
On Mon, Aug 05, 2013 at 03:39:31PM +0200, Laszlo Ersek wrote:
 My question was: is my understanding correct that you only see this
 problem with -enable-kvm? Because,
 
 On 08/01/13 18:49, Borislav Petkov wrote:
  so I'm seeing this funny thing where an EFI region changes when we
  enter efi_enter_virtual_mode when booting with edk2 on kvm. Here's
  the diff:
 
 You said on kvm, and provided a diff. I think (hope) I understand the
 environment you've denoted with after, but what's your before? The
 absence of -enable-kvm, or something else?

Ah, I see.

So 'before' is the initial dump of the EFI regions, very early during
boot:

[0.00] efi: EFI v2.31 by EDK II
[0.00] efi:  ACPI=0x7fb71000  ACPI 2.0=0x7fb71014 
[0.00] efi: mem00: type=7, attr=0xf, 
range=[0x-0x0009f000) (0MB)
[0.00] efi: mem01: type=2, attr=0xf, 
range=[0x0009f000-0x000a) (0MB)
[0.00] efi: mem02: type=7, attr=0xf, 
range=[0x0010-0x0080) (7MB)
[0.00] efi: mem03: type=4, attr=0xf, 
range=[0x0080-0x0100) (8MB)
[0.00] efi: mem04: type=7, attr=0xf, 
range=[0x0100-0x0200) (16MB)
[0.00] efi: mem05: type=2, attr=0xf, 
range=[0x0200-0x036e3000) (22MB)
[0.00] efi: mem06: type=7, attr=0xf, 
range=[0x036e3000-0x3fffb000) (969MB)
[0.00] efi: mem07: type=2, attr=0xf, 
range=[0x3fffb000-0x4000) (0MB)
[0.00] efi: mem08: type=7, attr=0xf, 
range=[0x4000-0x7c00) (960MB)
[0.00] efi: mem09: type=4, attr=0xf, 
range=[0x7c00-0x7c02) (0MB)
[0.00] efi: mem10: type=7, attr=0xf, 
range=[0x7c02-0x7e0ad000) (32MB)
[0.00] efi: mem11: type=4, attr=0xf, 
range=[0x7e0ad000-0x7e0cc000) (0MB)
[0.00] efi: mem12: type=7, attr=0xf, 
range=[0x7e0cc000-0x7e0cd000) (0MB)
[0.00] efi: mem13: type=4, attr=0xf, 
range=[0x7e0cd000-0x7e55d000) (4MB)
[0.00] efi: mem14: type=3, attr=0xf, 
range=[0x7e55d000-0x7e59c000) (0MB)
[0.00] efi: mem15: type=4, attr=0xf, 
range=[0x7e59c000-0x7e5a) (0MB)
[0.00] efi: mem16: type=3, attr=0xf, 
range=[0x7e5a-0x7e668000) (0MB)
[0.00] efi: mem17: type=5, attr=0x800f, 
range=[0x7e668000-0x7e67d000) (0MB)
[0.00] efi: mem18: type=6, attr=0x800f, 
range=[0x7e67d000-0x7e692000) (0MB)
[0.00] efi: mem19: type=4, attr=0xf, 
range=[0x7e692000-0x7f992000) (19MB)
[0.00] efi: mem20: type=7, attr=0xf, 
range=[0x7f992000-0x7f994000) (0MB)
[0.00] efi: mem21: type=3, attr=0xf, 
range=[0x7f994000-0x7fb12000) (1MB)
[0.00] efi: mem22: type=5, attr=0x800f, 
range=[0x7fb12000-0x7fb42000) (0MB)
[0.00] efi: mem23: type=6, attr=0x800f, 
range=[0x7fb42000-0x7fb66000) (0MB)
[0.00] efi: mem24: type=0, attr=0xf, 
range=[0x7fb66000-0x7fb6a000) (0MB)
[0.00] efi: mem25: type=9, attr=0xf, 
range=[0x7fb6a000-0x7fb72000) (0MB)
[0.00] efi: mem26: type=10, attr=0xf, 
range=[0x7fb72000-0x7fb76000) (0MB)
[0.00] efi: mem27: type=4, attr=0xf, 
range=[0x7fb76000-0x7ffe) (4MB)
[0.00] efi: mem28: type=6, attr=0x800f, 
range=[0x7ffe-0x8000) (0MB)

and with 'after' I've denoted the dump of the EFI regions a second time,
a bit later, when we enter efi_enter_virtual_mode():

[0.005012] efi: efi_enter_virtual_mode: enter
[0.006004] efi: mem00: type=7, attr=0xf, 
range=[0x-0x0009f000) (0MB)
[0.007004] efi: mem01: type=2, attr=0xf, 
range=[0x0009f000-0x000a) (0MB)
[0.008003] efi: mem02: type=7, attr=0xf, 
range=[0x0010-0x0080) (7MB)
[0.009004] efi: mem03: type=4, attr=0xf, 
range=[0x0080-0x0100) (8MB)
[0.010004] efi: mem04: type=7, attr=0xf, 
range=[0x0100-0x0200) (16MB)
[0.011004] efi: mem05: type=2, attr=0xf, 
range=[0x0200-0x036e3000) (22MB)
[0.012004] efi: mem06: type=7, attr=0xf, 
range=[0x036e3000-0x3fffb000) (969MB)
[0.013003] efi: mem07: type=2, attr=0xf, 
range=[0x3fffb000-0x4000) (0MB)
[0.014004] efi: mem08: type=7, attr=0xf, 
range=[0x4000-0x7c00) (960MB)
[0.015004] efi: mem09: type=4, attr=0xf, 
range=[0x7c00-0x7c02) (0MB)
[0.016004] efi: mem10: type=7, attr=0xf, 
range=[0x7c02-0x7e0ad000) (32MB)
[0.017004] efi: mem11: type=4, attr=0xf, 

Re: [edk2] Corrupted EFI region

2013-08-05 Thread Laszlo Ersek
On 08/05/13 16:03, Borislav Petkov wrote:
 On Mon, Aug 05, 2013 at 03:39:31PM +0200, Laszlo Ersek wrote:
 My question was: is my understanding correct that you only see this
 problem with -enable-kvm? Because,

 On 08/01/13 18:49, Borislav Petkov wrote:
 so I'm seeing this funny thing where an EFI region changes when we
 enter efi_enter_virtual_mode when booting with edk2 on kvm. Here's
 the diff:

 You said on kvm, and provided a diff. I think (hope) I understand the
 environment you've denoted with after, but what's your before? The
 absence of -enable-kvm, or something else?
 
 Ah, I see.
 
 So 'before' is the initial dump of the EFI regions, very early during
 boot:

snip

 and with 'after' I've denoted the dump of the EFI regions a second time,
 a bit later, when we enter efi_enter_virtual_mode():

snip

 
 during the *same* boot.
 
 So, it is one boot but two dumps of the EFI regions. And yes, I'm
 booting with the 'kvm' executable which has '-enable-kvm'

Okay. Thanks for clarifying it.

 
 What's the purpose of that series? Can you please provide a link (if
 you posted versions of it already)?
 
 Not yet posted but working on it.
 
 The idea is to map the runtime regions at stable addresses so that when
 we kexec a kernel, it can use runtime services too. And we have to do
 that because of the braindead design of SetVirtualAddressMap() being
 callable only once per boot.

I wouldn't call the design of SetVirtualAddressMap() braindead.

I'd rather call kexec unique and somewhat unexpected :)

 
 So what basically happens is the end boundary of the region becomes the
 start, practically turning it into a 0-size one.

 ... and you guys suspect that some firmware code is responsible, code
 that runs between the initial memory map dump, and efi_enter_virtual_mode():

 https://lkml.org/lkml/2013/7/31/550
 
 I wouldn't wonder if we f*cked it up again like the last time. I'll give
 it a long hard look.

Ah sorry, by "and you guys suspect" I didn't mean to imply anything
between the lines, I was simply trying to ascertain your working idea :)

Laszlo


Re: [edk2] Corrupted EFI region

2013-08-05 Thread Borislav Petkov
On Mon, Aug 05, 2013 at 04:27:44PM +0200, Laszlo Ersek wrote:
 I wouldn't call the design of SetVirtualAddressMap() braindead.

Ok, I've always wondered and you could probably shed some light on the
matter: why is SetVirtualAddressMap() a call-once only? Why can't I
simply call it again and update the mappings?

 I'd rather call kexec unique and somewhat unexpected :)

In all fairness, it was there before UEFI, AFAICT.

  I wouldn't wonder if we f*cked it up again like the last time. I'll give
  it a long hard look.
 
 Ah sorry, by and you guys suspect I didn't mean to imply anything
 between the lines, I was simply trying to ascertain your working idea :)

As long as we get to the bottom of this, we're all fine. And I'd
pretty much expect everyone who is dealing with EFI to have grown a
sufficiently thick skin before starting to do so, so don't worry.

:-)

-- 
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.
--


Re: [edk2] Corrupted EFI region

2013-08-05 Thread Laszlo Ersek
On 08/05/13 16:40, Borislav Petkov wrote:
 On Mon, Aug 05, 2013 at 04:27:44PM +0200, Laszlo Ersek wrote:
 I wouldn't call the design of SetVirtualAddressMap() braindead.
 
 Ok, I've always wondered and you could probably shed some light on the
 matter: why is SetVirtualAddressMap() a call-once only? Why can't I
 simply call it again and update the mappings?

The current implementation (how pointers are converted) probably doesn't
accommodate a second call.

Of course you want to know why SetVirtualAddressMap() was designed like
that... I didn't participate in the design so I don't know :)

But, as I said, a kernel directly executing another kernel is an
unexpected idea. IMHO the second kernel in question doesn't fit the UEFI
phases at all. The OS booted like that (ie. the OS whose kernel is the
2nd (=kexec) kernel) never goes through SEC, PEI, DXE, BDS.

SetVirtualAddressMap() is a firmware interface, but the kexec OS
(including its private boot loader and kernel) are not loaded by firmware.

 
 I'd rather call kexec unique and somewhat unexpected :)
 
 In all fairness, it was there before UEFI, AFAICT.

That doesn't matter as long as the UEFI designers aren't aware of it :)

(Who should have made whom aware, ie. Linux people approaching UEFI
people, or UEFI people exploring Linux, is a separate topic. As always
I'm apolitical about UEFI; I'm not arguing for it or against it. My
feeble efforts for improving OVMF and interfacing code are motivated by
my employer, not my world view, but as a side-effect of working with the
code I can't help but notice some nice things in edk2 and appreciate
them :))

 I wouldn't wonder if we f*cked it up again like the last time. I'll give
 it a long hard look.

 Ah sorry, by and you guys suspect I didn't mean to imply anything
 between the lines, I was simply trying to ascertain your working idea :)
 
 As long as we get to the bottom of this, we're all fine. And I'd
 pretty much expect everyone who is dealing with EFI to have grown a
 sufficiently thick skin before starting to do so, so don't worry.
 
 :-)

This is a unique opportunity for me to point out the following. (Unique
because it wasn't me bringing up the thick skin thing :)) My skin is
*very thin*. It's not even there, you could say. So, if I mess up,
please don't insult me. (As explained before, my own language above
wasn't even tongue-in-cheek.) Insult my code or my analysis pls.

BTW there's another point I'd like to ask about -- you're saying you see
the region corruption during the same boot, from the first (early)
memmap dump to the second one (when just about to enter virtual mode).
But, is this one boot the very first boot, or the kexec one?

Thanks!
Laszlo


Re: [edk2] Corrupted EFI region

2013-08-05 Thread James Bottomley
On Mon, 2013-08-05 at 17:15 +0200, Laszlo Ersek wrote:
 On 08/05/13 16:40, Borislav Petkov wrote:
  On Mon, Aug 05, 2013 at 04:27:44PM +0200, Laszlo Ersek wrote:
  I wouldn't call the design of SetVirtualAddressMap() braindead.
  
  Ok, I've always wondered and you could probably shed some light on the
  matter: why is SetVirtualAddressMap() a call-once only? Why can't I
  simply call it again and update the mappings?
 
 The current implementation (how pointers are converted) probably doesn't
 accommodate a second call.

Having actually looked at the code (trying to find why we were getting
an unconverted pointer), I second that.  However, the ugliness of the
massive pointer chase should have been an indication that something was
not quite right architecturally (or implementation wise) with
SetVirtualAddressMap().

 Of course you want to know why SetVirtualAddressMap() was designed like
 that... I didn't participate in the design so I don't know :)
 
 But, as I said, a kernel directly executing another kernel is an
 unexpected idea. IMHO the second kernel in question doesn't fit the UEFI
 phases at all. The OS booted like that (ie. the OS whose kernel is the
 2nd (=kexec) kernel) never goes through SEC, PEI, DXE, BDS.

That thinking is a bit last century (not that I'm blaming you for it, it
seems to be ingrained in the way UEFI sometimes goes about things) ...
in the old days, DOS was bootstrapped by the 512 byte jump code in a
well known sector.  In the current century, almost every OS is
bootstrapped by a sophisticated loader, which is effectively another OS
(if you don't believe this, try looking at the grub source code one
day); it's a short step from this to one OS booting another, and that's
really what kexec is.  The utility of kexec has proven itself over the
past couple of decades or so by allowing us to dump (kexec to a dump
kernel), short circuit the boot process (simply re-kexec the kernel on
crash) and now do rebootless upgrades (checkpoint the userspace and
kexec to the new kernel).  It's not even unique to Linux: Solaris used a
hidden kexec system call to do live upgrades as well and I believe
several other UNIXs have this feature.

James




Re: [edk2] Corrupted EFI region

2013-08-05 Thread Borislav Petkov
On Mon, Aug 05, 2013 at 06:41:20PM +0200, Laszlo Ersek wrote:
 I didn't realize the timestamps survive kexec. (As far as I remember
 the kernels I played with kexec on didn't have the automatic
 timestamps yet in dmesg, but I might have messed up just as well...)

No, no, no, kexec is not involved at all.

Here's the whole dmesg up until efi_enter_virtual_mode. When we have entered
efi_enter_virtual_mode, the region has changed from

[0.00] efi: mem11: type=4, attr=0xf, range=[0x7e0ad000-0x7e0cc000) (0MB)

to

[0.023004] efi: mem11: type=4, attr=0xf, range=[0x7e0ad000-0x7e0ad000) (0MB)


And yes, I still need to audit whether the kernel actually does that
change. I'm still looking...


early console in decompress_kernel

Decompressing Linux... Parsing ELF... done.
Booting the kernel.
[0.00] Initializing cgroup subsys cpu
[0.00] Linux version 3.10.0-rc7+ (boris@nazgul) (gcc version 4.7.3 
(Debian 4.7.3-4) ) #9 SMP PREEMPT Mon Aug 5 16:27:00 CEST 2013
[0.00] Command line: root=/dev/sda1 debug ignore_loglevel 
log_buf_len=10M earlyprintk=ttyS0,115200 console=ttyS0,115200 console=tty0
[0.00] e820: BIOS-provided physical RAM map:
[0.00] BIOS-e820: [mem 0x-0x0009] usable
[0.00] BIOS-e820: [mem 0x0010-0x7e667fff] usable
[0.00] BIOS-e820: [mem 0x7e668000-0x7e691fff] reserved
[0.00] BIOS-e820: [mem 0x7e692000-0x7fb11fff] usable
[0.00] BIOS-e820: [mem 0x7fb12000-0x7fb69fff] reserved
[0.00] BIOS-e820: [mem 0x7fb6a000-0x7fb71fff] ACPI data
[0.00] BIOS-e820: [mem 0x7fb72000-0x7fb75fff] ACPI NVS
[0.00] BIOS-e820: [mem 0x7fb76000-0x7ffd] usable
[0.00] BIOS-e820: [mem 0x7ffe-0x7fff] reserved
[0.00] debug: ignoring loglevel setting.
[0.00] bootconsole [earlyser0] enabled
[0.00] NX (Execute Disable) protection: active
[0.00] efi: EFI v2.31 by EDK II
[0.00] efi:  ACPI=0x7fb71000  ACPI 2.0=0x7fb71014 
[0.00] efi: mem00: type=7, attr=0xf, 
range=[0x-0x0009f000) (0MB)
[0.00] efi: mem01: type=2, attr=0xf, 
range=[0x0009f000-0x000a) (0MB)
[0.00] efi: mem02: type=7, attr=0xf, 
range=[0x0010-0x0080) (7MB)
[0.00] efi: mem03: type=4, attr=0xf, 
range=[0x0080-0x0100) (8MB)
[0.00] efi: mem04: type=7, attr=0xf, 
range=[0x0100-0x0200) (16MB)
[0.00] efi: mem05: type=2, attr=0xf, 
range=[0x0200-0x036e3000) (22MB)
[0.00] efi: mem06: type=7, attr=0xf, 
range=[0x036e3000-0x3fffb000) (969MB)
[0.00] efi: mem07: type=2, attr=0xf, 
range=[0x3fffb000-0x4000) (0MB)
[0.00] efi: mem08: type=7, attr=0xf, 
range=[0x4000-0x7c00) (960MB)
[0.00] efi: mem09: type=4, attr=0xf, 
range=[0x7c00-0x7c02) (0MB)
[0.00] efi: mem10: type=7, attr=0xf, 
range=[0x7c02-0x7e0ad000) (32MB)
[0.00] efi: mem11: type=4, attr=0xf, 
range=[0x7e0ad000-0x7e0cc000) (0MB)
[0.00] efi: mem12: type=7, attr=0xf, 
range=[0x7e0cc000-0x7e0cd000) (0MB)
[0.00] efi: mem13: type=4, attr=0xf, 
range=[0x7e0cd000-0x7e55d000) (4MB)
[0.00] efi: mem14: type=3, attr=0xf, 
range=[0x7e55d000-0x7e59c000) (0MB)
[0.00] efi: mem15: type=4, attr=0xf, 
range=[0x7e59c000-0x7e5a) (0MB)
[0.00] efi: mem16: type=3, attr=0xf, 
range=[0x7e5a-0x7e668000) (0MB)
[0.00] efi: mem17: type=5, attr=0x800f, 
range=[0x7e668000-0x7e67d000) (0MB)
[0.00] efi: mem18: type=6, attr=0x800f, 
range=[0x7e67d000-0x7e692000) (0MB)
[0.00] efi: mem19: type=4, attr=0xf, 
range=[0x7e692000-0x7f992000) (19MB)
[0.00] efi: mem20: type=7, attr=0xf, 
range=[0x7f992000-0x7f994000) (0MB)
[0.00] efi: mem21: type=3, attr=0xf, 
range=[0x7f994000-0x7fb12000) (1MB)
[0.00] efi: mem22: type=5, attr=0x800f, 
range=[0x7fb12000-0x7fb42000) (0MB)
[0.00] efi: mem23: type=6, attr=0x800f, 
range=[0x7fb42000-0x7fb66000) (0MB)
[0.00] efi: mem24: type=0, attr=0xf, 
range=[0x7fb66000-0x7fb6a000) (0MB)
[0.00] efi: mem25: type=9, attr=0xf, 

Re: [edk2] Corrupted EFI region

2013-08-05 Thread Andrew Fish

On Aug 5, 2013, at 7:40 AM, Borislav Petkov b...@alien8.de wrote:

 On Mon, Aug 05, 2013 at 04:27:44PM +0200, Laszlo Ersek wrote:
 I wouldn't call the design of SetVirtualAddressMap() braindead.
 
 Ok, I've always wondered and you could probably shed some light on the
 matter: why is SetVirtualAddressMap() a call-once only? Why can't I
 simply call it again and update the mappings?
 
 I'd rather call kexec unique and somewhat unexpected :)
 
 In all fairness, it was there before UEFI, AFAICT.
 

AFAICT EFI pre-dates kexec's merge into mainline by a number of years, as 
SetVirtualAddressMap() was part of EFI 1.0 (previous millennium). 

The EFI to UEFI conversion placed EFI 1.10 into an industry standard, 
UEFI 2.0. UEFI is an industry standard, so someone just needs to make a 
proposal to update the spec. The edk2 open source project is not part of the 
standards body, so complaining on this mailing list is not going to get anything 
changed. 

The conversion of C code to run from address A to address B is a non-trivial 
operation, and a single conversion is bad enough. The infrastructure code 
required to do the conversion from physical to virtual addressing currently 
only runs from physical mode, so a call to change virtual address mappings from 
virtual mode is more complex than the current scheme. 
In general you don't want complexity in the locked NOR FLASH of the platform 
that can only be updated by the platform vendor. Even if the platform firmware 
is easy to update you want to have complexity in the OS as it is easier to 
change and easier to get right.
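
To make the one-shot nature of the conversion concrete, here is a toy model in Python. It is not edk2 code: the descriptor layout, the function name convert_pointer and the virtual address are all made up for illustration. The idea it mirrors is that a runtime pointer is looked up by its physical address in the map handed to SetVirtualAddressMap() and then rewritten; afterwards the pointer no longer falls inside any physical range, so pushing it through the same lookup a second time fails.

```python
# Each descriptor: (physical_start, size_in_bytes, virtual_start).
# All three numbers are hypothetical.
RUNTIME_MAP = [
    (0x7E668000, 0x15000, 0xFFFFFFFEFE668000),
]


def convert_pointer(ptr, memory_map):
    """Translate a physical pointer to its virtual alias.

    Raises LookupError when ptr is not inside any physical range.
    """
    for phys, size, virt in memory_map:
        if phys <= ptr < phys + size:
            return virt + (ptr - phys)
    raise LookupError("pointer %#x is not in any physical range" % ptr)


# First conversion: physical -> virtual works.
first = convert_pointer(0x7E668010, RUNTIME_MAP)

# Second conversion of the already-converted pointer does not.
try:
    convert_pointer(first, RUNTIME_MAP)
    second_call_ok = True
except LookupError:
    second_call_ok = False

print(hex(first), second_call_ok)
```

Supporting a second call would need the firmware to remember the previous virtual layout and run the lookup from virtual mode, which is the extra complexity argued against above.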

Thanks,

Andrew Fish 

 I wouldn't wonder if we f*cked it up again like the last time. I'll give
 it a long hard look.
 
 Ah sorry, by and you guys suspect I didn't mean to imply anything
 between the lines, I was simply trying to ascertain your working idea :)
 
 As long as we get to the bottom of this, we're all fine. And I'd
 pretty much expect everyone who is dealing with EFI to have grown a
 sufficiently thick skin before starting to do so, so don't worry.
 
 :-)
 
 -- 
 Regards/Gruss,
Boris.
 
 Sent from a fat crate under my desk. Formatting is fine.
 --
 



RE: [edk2] Corrupted EFI region

2013-08-05 Thread Kinney, Michael D
Boris,

A memory map entry with zero size does not look right to me.

The memory map passed into SetVirtualAddressMap() must contain the exact same 
set of memory map entries that existed when ExitBootServices() was called with 
a return result of EFI_SUCCESS.

When you are showing comparisons of memory maps, are you showing the
ExitBootServices() one and the SetVirtualAddressMap() one?  If the memory maps
are not identical, then somehow the memory map is being modified, and we need
to figure that out.

If the ExitBootServices() memory map has the zero sized entry, then we need to
see how GetMemoryMap() is returning a zero sized entry.  It is not clear that a
zero sized entry would actually break anything, but it is a good idea to root
cause that issue and make sure those types of memory map entries are not passed
from the FW to the OS.
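
As a rough illustration of the comparison described above, the sketch below
diffs two memory map snapshots and flags zero sized entries. It is not kernel
or firmware code: Region, diff_maps and zero_sized are hypothetical names, and
the sample values are the mem11 to mem13 lines quoted in this thread.

```python
from typing import NamedTuple


class Region(NamedTuple):
    """One EFI memory descriptor, reduced to the fields shown in the dumps."""
    type: int
    attr: int
    start: int
    end: int  # exclusive, as in the "range=[start-end)" log format


def diff_maps(before, after):
    """Return (index, before, after) for every descriptor that differs."""
    if len(before) != len(after):
        raise ValueError("maps have a different number of entries")
    return [(i, b, a) for i, (b, a) in enumerate(zip(before, after)) if b != a]


def zero_sized(regions):
    """Return the indices of descriptors whose range is empty."""
    return [i for i, r in enumerate(regions) if r.end == r.start]


# mem11..mem13 as first dumped (list index 0 is mem11).
at_exit_boot_services = [
    Region(4, 0xF, 0x7E0AD000, 0x7E0CC000),
    Region(7, 0xF, 0x7E0CC000, 0x7E0CD000),
    Region(4, 0xF, 0x7E0CD000, 0x7E55D000),
]
# The same entries as dumped when entering virtual mode.
at_set_virtual_address_map = [
    Region(4, 0xF, 0x7E0AD000, 0x7E0AD000),
    Region(7, 0xF, 0x7E0CC000, 0x7E0CD000),
    Region(4, 0xF, 0x7E0CD000, 0x7E55D000),
]

changed = diff_maps(at_exit_boot_services, at_set_virtual_address_map)
empty = zero_sized(at_set_virtual_address_map)
```

With these inputs only list index 0 (mem11) is reported, both as changed and
as zero sized.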

Thanks,

Mike


-----Original Message-----
From: Borislav Petkov [mailto:b...@alien8.de] 
Sent: Monday, August 05, 2013 9:48 AM
To: Laszlo Ersek
Cc: linux-efi@vger.kernel.org; Gleb Natapov; edk2-de...@lists.sourceforge.net; 
lkml; David Woodhouse
Subject: Re: [edk2] Corrupted EFI region

On Mon, Aug 05, 2013 at 06:41:20PM +0200, Laszlo Ersek wrote:
 I didn't realize the timestamps survive kexec. (As far as I remember
 the kernels I played with kexec on didn't have the automatic
 timestamps yet in dmesg, but I might have messed up just as well...)

No, no, no, kexec is not involved at all.

Here's the whole dmesg up until efi_enter_virtual_mode. When we have entered
efi_enter_virtual_mode, the region has changed from

[0.00] efi: mem11: type=4, attr=0xf, range=[0x7e0ad000-0x7e0cc000) (0MB)

to

[0.023004] efi: mem11: type=4, attr=0xf, range=[0x7e0ad000-0x7e0ad000) (0MB)


And yes, I still need to audit whether the kernel actually does that
change. I'm still looking...


early console in decompress_kernel

Decompressing Linux... Parsing ELF... done.
Booting the kernel.
[0.00] Initializing cgroup subsys cpu
[0.00] Linux version 3.10.0-rc7+ (boris@nazgul) (gcc version 4.7.3 (Debian 4.7.3-4) ) #9 SMP PREEMPT Mon Aug 5 16:27:00 CEST 2013
[0.00] Command line: root=/dev/sda1 debug ignore_loglevel log_buf_len=10M earlyprintk=ttyS0,115200 console=ttyS0,115200 console=tty0
[0.00] e820: BIOS-provided physical RAM map:
[0.00] BIOS-e820: [mem 0x00000000-0x0009ffff] usable
[0.00] BIOS-e820: [mem 0x00100000-0x7e667fff] usable
[0.00] BIOS-e820: [mem 0x7e668000-0x7e691fff] reserved
[0.00] BIOS-e820: [mem 0x7e692000-0x7fb11fff] usable
[0.00] BIOS-e820: [mem 0x7fb12000-0x7fb69fff] reserved
[0.00] BIOS-e820: [mem 0x7fb6a000-0x7fb71fff] ACPI data
[0.00] BIOS-e820: [mem 0x7fb72000-0x7fb75fff] ACPI NVS
[0.00] BIOS-e820: [mem 0x7fb76000-0x7ffdffff] usable
[0.00] BIOS-e820: [mem 0x7ffe0000-0x7fffffff] reserved
[0.00] debug: ignoring loglevel setting.
[0.00] bootconsole [earlyser0] enabled
[0.00] NX (Execute Disable) protection: active
[0.00] efi: EFI v2.31 by EDK II
[0.00] efi:  ACPI=0x7fb71000  ACPI 2.0=0x7fb71014
[0.00] efi: mem00: type=7, attr=0xf, range=[0x00000000-0x0009f000) (0MB)
[0.00] efi: mem01: type=2, attr=0xf, range=[0x0009f000-0x000a0000) (0MB)
[0.00] efi: mem02: type=7, attr=0xf, range=[0x00100000-0x00800000) (7MB)
[0.00] efi: mem03: type=4, attr=0xf, range=[0x00800000-0x01000000) (8MB)
[0.00] efi: mem04: type=7, attr=0xf, range=[0x01000000-0x02000000) (16MB)
[0.00] efi: mem05: type=2, attr=0xf, range=[0x02000000-0x036e3000) (22MB)
[0.00] efi: mem06: type=7, attr=0xf, range=[0x036e3000-0x3fffb000) (969MB)
[0.00] efi: mem07: type=2, attr=0xf, range=[0x3fffb000-0x40000000) (0MB)
[0.00] efi: mem08: type=7, attr=0xf, range=[0x40000000-0x7c000000) (960MB)
[0.00] efi: mem09: type=4, attr=0xf, range=[0x7c000000-0x7c020000) (0MB)
[0.00] efi: mem10: type=7, attr=0xf, range=[0x7c020000-0x7e0ad000) (32MB)
[0.00] efi: mem11: type=4, attr=0xf, range=[0x7e0ad000-0x7e0cc000) (0MB)
[0.00] efi: mem12: type=7, attr=0xf, range=[0x7e0cc000-0x7e0cd000) (0MB)
[0.00] efi: mem13: type=4, attr=0xf, range=[0x7e0cd000-0x7e55d000) (4MB)
[0.00] efi: mem14: type=3, attr=0xf, range

Re: [edk2] Corrupted EFI region

2013-08-05 Thread Laszlo Ersek
On 08/05/13 18:47, Borislav Petkov wrote:
 On Mon, Aug 05, 2013 at 06:41:20PM +0200, Laszlo Ersek wrote:
 I didn't realize the timestamps survive kexec. (As far as I remember
 the kernels I played with kexec on didn't have the automatic
 timestamps yet in dmesg, but I might have messed up just as well...)
 
 No, no, no, kexec is not involved at all.

I understand. I just explained why I could not derive that fact from the
timestamps. You said,

 No, kexec is not even involved yet. If you look at the timestamps,
 there's 0.005 seconds between the two dumps during the *same* kernel
 booting on the machine, baremetal, straight from grub.

There are four memmap dumps:

(1) first boot, initial dump,
(2) first boot, dump when entering virtual mode,
(3) kexec boot, initial dump,
(4) kexec boot, dump when entering virtual mode.

I was aware that we were discussing a problem either between (1) and
(2), *or* between (3) and (4); I just didn't know inside which pair.

I misunderstood your reply and thought that you were implying the
(1)+(2) pair by the low absolute timestamps. I assumed that (3)+(4)
would print low timestamps as well (due to the time offset starting from
zero in the kexec kernel too) and took your message as a correction to
that idea. But, you didn't say anything about the magnitude of the
timestamps, only about the differences between them.

Sorry for the noise, it's clear now that we're looking at (1)-(2).

Thanks
Laszlo


Re: [edk2] Corrupted EFI region

2013-08-05 Thread Borislav Petkov
On Mon, Aug 05, 2013 at 08:50:17AM -0700, Andrew Fish wrote:
 AFAICT EFI pre-dates kexec merge into mainline by a number of years as
 SetVirtualAddressMap() was part of EFI 1.0 (previous millennium)

Ok, fair enough.

 The EFI to UEFI conversion was placing EFI 1.10 into an industry
 standard, UEFI 2.0. UEFI is an industry standard so someone just
 needs to make a proposal to update the spec. The edk2 open source
 project is not part of the standards body so complaining on this
 mailing list is not going to get anything changed.

Right, I don't think that even changing the spec would help - it would
actually make things worse because then we'd have to differentiate
between UEFI versions: those which can do SetVirtualAddressMap() more
than once and the older ones.

So let's drop the discussion here - it is what it is, it is too late to
change anything. At least we talked about it. :-)

Thanks.

-- 
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.
--


Re: [edk2] Corrupted EFI region

2013-08-05 Thread Laszlo Ersek
On 08/05/13 18:47, Borislav Petkov wrote:

 Here's the whole dmesg up until efi_enter_virtual_mode. When we have entered
 efi_enter_virtual_mode, the region has changed from
 
 [0.00] efi: mem11: type=4, attr=0xf, range=[0x7e0ad000-0x7e0cc000) (0MB)
 
 to
 
 [0.023004] efi: mem11: type=4, attr=0xf, range=[0x7e0ad000-0x7e0ad000) (0MB)
 
 
 And yes, I still need to audit whether the kernel actually does that
 change. I'm still looking...

The following is a long shot, but I have no better idea for now.

Normally the following relevant sequence of calls is made to UEFI services:
(a) GetMemoryMap() -- returns memory map and map key,
(b) ExitBootServices() -- takes map key,
(c) SetVirtualAddressMap() -- takes memory map (completed with virtual
addresses).

((a)+(b) can be repeated if (b) fails, and Linux seems to retry once.)
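
The (a)+(b) retry can be sketched as follows. This is a toy model under stated
assumptions, not the eboot.c code: FakeFirmware and exit_boot are hypothetical
names, and the only firmware behaviour modelled is that ExitBootServices()
rejects a map key that no longer matches the current memory map.

```python
class FakeFirmware:
    """Toy stand-in for the boot services; not a real UEFI binding."""

    def __init__(self, stale_calls=1):
        self._key = 0
        # Number of times the map changes between (a) and (b).
        self._stale_calls = stale_calls

    def get_memory_map(self):
        """(a): return a (dummy) memory map and the current map key."""
        self._key += 1
        return ["descriptor"], self._key

    def exit_boot_services(self, map_key):
        """(b): succeed only if map_key still matches the current map."""
        if self._stale_calls > 0:
            # Simulate the map changing after it was read: the key is stale.
            self._stale_calls -= 1
            self._key += 1
        return map_key == self._key


def exit_boot(fw, max_attempts=2):
    """Run (a)+(b), retrying once by default; return (memmap, attempts)."""
    for attempt in range(1, max_attempts + 1):
        memmap, key = fw.get_memory_map()   # (a)
        if fw.exit_boot_services(key):      # (b)
            return memmap, attempt
    raise RuntimeError("ExitBootServices() kept failing")


memmap, attempts = exit_boot(FakeFirmware(stale_calls=1))
```

The memory map returned by the successful attempt is the one that would later
be completed with virtual addresses and handed to (c).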

Now see Linux commit


http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=916f676f

by Matthew. If I understand correctly, it introduces the function
efi_reserve_boot_services(). Normally, immediately after a successful
(b) -- ExitBootServices() -- one should be allowed to free boot services
code and data. However (c) itself -- SetVirtualAddressMap() -- seems to
depend on boot services code and data in some firmware implementations
(probably violating the spec). Therefore this commit keeps boot services
code and data around long enough for SetVirtualAddressMap(), and
releases them after.

I *think* efi_reserve_boot_services() runs between (b) and (c), that is,
after the initial EFI memmap dump, and before efi_enter_virtual_mode()
does its thing (ie. before your debug memmap dump is executed there):

efi_main() [arch/x86/boot/compressed/eboot.c]
  exit_boot()
-- covers (a) and (b)

start_kernel() [init/main.c]
  setup_arch() [arch/x86/kernel/setup.c]
efi_memblock_x86_reserve_range() [arch/x86/platform/efi/efi.c]
efi_reserve_boot_services() [arch/x86/platform/efi/efi.c]
  efi_enter_virtual_mode() [arch/x86/platform/efi/efi.c]
-- covers (c)

That is, efi_reserve_boot_services() is called in a place where it can
potentially alter the EFI memmap between the two dumps.

(I only display efi_memblock_x86_reserve_range() in the callstack above
for completeness; I'll refer back to it lower down.)

Now look at Linux commit


http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=7d68dc3f

This commit changes efi_reserve_boot_services() -- it restricts the
function to reserve the boot services code and data only under some
circumstances. If those don't hold, then:

  md->num_pages = 0;

Which I think is exactly the source of the region being truncated to
zero size.
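
To see why zeroing num_pages shows up as an empty range in the dump, here is a
small sketch of the arithmetic. It assumes 4 KiB EFI pages and mimics the
"range=[start-end) (NMB)" format as it appears in this thread; format_range is
a hypothetical helper, and the page count 0x1f is the mem11 size in pages
(0x7e0cc000 - 0x7e0ad000 = 0x1f000 bytes).

```python
EFI_PAGE_SHIFT = 12  # EFI pages are 4 KiB


def format_range(phys_addr, num_pages):
    """Render a descriptor's range the way the memmap dump shows it."""
    size = num_pages << EFI_PAGE_SHIFT
    end = phys_addr + size          # exclusive end of the region
    return "range=[0x%08x-0x%08x) (%dMB)" % (phys_addr, end, size >> 20)


before = format_range(0x7E0AD000, 0x1F)  # descriptor as returned by firmware
after = format_range(0x7E0AD000, 0)      # after num_pages has been zeroed
```

Here before reproduces the first mem11 line and after reproduces the second
one, where start and end coincide.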

(memmap.phys_map is set to the EFI memory map in
efi_memblock_x86_reserve_range(), see the above partial callstack, and
memmap.map is pointed at memmap.phys_map in efi_memmap_init().
efi_reserve_boot_services() iterates over memmap.map, so we can say it
modifies the EFI memory map.)

Granted, memblock_dbg() is called too if num_pages is reset, and the
message it prints is not included in your dmesg. However I think that
could be explained by memblock_debug==0 [include/linux/memblock.h].

What happens if you pass memblock=debug on the kernel command line
(see early_memblock() in mm/memblock.c)?

(I just tried it in my Fedora 19 guest, and it in fact produced the message

[0.00] efi: Could not reserve boot range [0x80-0xff]

)


BTW, regarding Michael's answer, I think this is just one of several
ways in which Linux manipulates the EFI memmap between (b) and (c). For
example it seems to merge ranges in the map.

Thanks,
Laszlo


Re: [edk2] Corrupted EFI region

2013-08-05 Thread H. Peter Anvin
On 08/05/2013 11:12 AM, Borislav Petkov wrote:
 On Mon, Aug 05, 2013 at 08:50:17AM -0700, Andrew Fish wrote:
 AFAICT EFI pre-dates kexec merge into mainline by a number of years as
 SetVirtualAddressMap() was part of EFI 1.0 (previous millennium)
 
 Ok, fair enough.
 
 The EFI to UEFI conversion was placing EFI 1.10 into an industry
 standard, UEFI 2.0. UEFI is an industry standard so someone just
 needs to make a proposal to update the spec. The edk2 open source
 project is not part of the standards body so complaining on this
 mailing list is not going to get anything changed.
 
 Right, I don't think that even changing the spec would help - it would
 actually make things worse because then we'd have to differentiate
 between UEFI versions: those which can do SetVirtualAddressMap() more
 than once and the older ones.
 
 So let's drop the discussion here - it is what it is, it is too late to
 change anything. At least we talked about it. :-)
 

All of this would be a non-problem if there weren't buggy
implementations which can't run *without* SetVirtualAddressMap().

-=hpa




Re: [edk2] Corrupted EFI region

2013-08-05 Thread Borislav Petkov
On Mon, Aug 05, 2013 at 02:37:08PM -0700, H. Peter Anvin wrote:
 All of this would be a non-problem if there weren't buggy
 implementations which can't run *without* SetVirtualAddressMap().

Oh, you mean, if we were to call the runtime services through their
physical addresses?

-- 
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.
--


Re: [edk2] Corrupted EFI region

2013-08-05 Thread H. Peter Anvin
On 08/05/2013 02:41 PM, Borislav Petkov wrote:
 On Mon, Aug 05, 2013 at 02:37:08PM -0700, H. Peter Anvin wrote:
 All of this would be a non-problem if there weren't buggy
 implementations which can't run *without* SetVirtualAddressMap().
 
 Oh, you mean, if we were to call the runtime services through their
 physical addresses?
 

Yes.  It is supposed to work, but at least on some Apple machines it
triggers bugs.

-hpa



Re: [edk2] Corrupted EFI region

2013-08-05 Thread Laszlo Ersek
On 08/05/13 23:41, Borislav Petkov wrote:
 On Mon, Aug 05, 2013 at 02:37:08PM -0700, H. Peter Anvin wrote:
 All of this would be a non-problem if there weren't buggy
 implementations which can't run *without* SetVirtualAddressMap().
 
 Oh, you mean, if we were to call the runtime services through their
 physical addresses?

I heard that there was a (U)EFI firmware implementation that didn't even
implement SetVirtualAddressMap(). It was okay because the main OS for
that platform didn't want to call it, it thunked to physical mode for
each runtime service call.

(This is not hearsay; I'm omitting the specifics because I'm not sure if
I'm allowed to give any. I've heard about this stuff from a direct
colleague who used to work on these systems.)

Laszlo


Re: [edk2] Corrupted EFI region

2013-08-05 Thread Borislav Petkov
On Mon, Aug 05, 2013 at 11:26:46PM +0200, Laszlo Ersek wrote:
 What happens if you pass memblock=debug on the kernel command line
 (see early_memblock() in mm/memblock.c)?
 
 (I just tried it in my Fedora 19 guest, and it in fact produced the message
 
 [0.00] efi: Could not reserve boot range [0x80-0xff]

Note to self: Always look for bugs in Linux' UEFI code first, before
going anywhere else!

Yes, very good analysis and good job Laszlo!

I'll write what I see now but will doublecheck it tomorrow because I'm
almost half asleep.

[0.00] efi: efi_reserve_boot_services:  - start: 0x7e0ad000, size: 0x1f000
[0.00] efi: Could not reserve boot range [0x007e0ad000-0x007e0cbfff]

And yes, this fails because memblock_is_region_reserved(start, size)
returns true.

And why is that:

[0.00] memblock_reserve: [0x00036be000-0x00036c3000] setup_arch+0x60e/0xa63
[0.00] MEMBLOCK configuration:
[0.00]  memory size = 0x7fef1000 reserved size = 0x1724570
[0.00]  memory.cnt  = 0x4
[0.00]  memory[0x0] [0x0000001000-0x000009ffff], 0x9f000 bytes
[0.00]  memory[0x1] [0x0000100000-0x007e667fff], 0x7e568000 bytes
[0.00]  memory[0x2] [0x007e692000-0x007fb11fff], 0x1480000 bytes
[0.00]  memory[0x3] [0x007fb76000-0x007ffdffff], 0x46a000 bytes
[0.00]  reserved.cnt  = 0x3
[0.00]  reserved[0x0]   [0x000009f000-0x00000fffff], 0x61000 bytes
[0.00]  reserved[0x1]   [0x0002000000-0x00036c2fff], 0x16c3000 bytes
[0.00]  reserved[0x2]   [0x007e0ad018-0x007e0ad587], 0x570 bytes
                         ^

There are 0x570 bytes right in this region which are memblock-reserved
and so we truncate it in efi_reserve_boot_services().
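
A minimal sketch of that decision, for illustration only. It models just the
"already memblock-reserved" condition (the real efi_reserve_boot_services()
checks more than this), uses hypothetical Python names, and keeps only the one
reserved entry from the dump above that matters here.

```python
EFI_PAGE_SHIFT = 12  # EFI pages are 4 KiB


def is_region_reserved(reserved, start, size):
    """True if [start, start+size) intersects any reserved [lo, hi] range.

    Reserved ranges use an inclusive upper bound, as in the MEMBLOCK dump.
    """
    end = start + size  # exclusive
    return any(lo < end and start <= hi for lo, hi in reserved)


def reserve_boot_region(reserved, start, num_pages):
    """Return the page count the descriptor ends up with.

    Simplified model: if any part of the region is already reserved the
    descriptor is truncated to zero pages (mirrors "md->num_pages = 0"),
    otherwise the whole region is added to the reserved list.
    """
    size = num_pages << EFI_PAGE_SHIFT
    if is_region_reserved(reserved, start, size):
        return 0
    reserved.append((start, start + size - 1))
    return num_pages


# reserved[0x2] from the dump: 0x570 bytes inside mem11.
reserved = [(0x7E0AD018, 0x7E0AD587)]

mem11_pages = reserve_boot_region(reserved, 0x7E0AD000, 0x1F)    # overlaps
mem13_pages = reserve_boot_region(reserved, 0x7E0CD000, 0x490)   # does not
```

In this model mem11 ends up with zero pages because of the 0x570 reserved
bytes, while mem13 (0x7e0cd000-0x7e55d000, 0x490 pages) is reserved whole.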

This makes me say words which will offend this list so I'll instead go
out on the balcony and wake up the neighbors. :-)

Ok, thanks again for finding it, I'll go and try to figure out the whole
mess tomorrow.

Good night!

 BTW, regarding Michael's answer, I think this is just one of several
 ways in which Linux manipulates the EFI memmap between (b) and (c).
 For example it seems to merge ranges in the map.

Yes, it does so in efi_enter_virtual_mode(). That was my initial
suspicion, that's why I dumped the regions before the merging.

Thanks.

-- 
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.
--


Re: [edk2] Corrupted EFI region

2013-08-05 Thread James Bottomley
On Mon, 2013-08-05 at 23:55 +0200, Laszlo Ersek wrote:
 On 08/05/13 23:41, Borislav Petkov wrote:
  On Mon, Aug 05, 2013 at 02:37:08PM -0700, H. Peter Anvin wrote:
  All of this would be a non-problem if there weren't buggy
  implementations which can't run *without* SetVirtualAddressMap().
  
  Oh, you mean, if we were to call the runtime services through their
  physical addresses?
 
 I heard that there was a (U)EFI firmware implementation that didn't even
 implement SetVirtualAddressMap(). It was okay because the main OS for
 that platform didn't want to call it, it thunked to physical mode for
 each runtime service call.
 
 (This is not hearsay; I'm omitting the specifics because I'm not sure if
 I'm allowed to give any. I've heard about this stuff from a direct
 colleague who used to work on these systems.)

That's actually the way all non-x86 unix systems operate.  If you look
in the firmware mechanisms for almost every non-x86 system in the Linux
kernel architecture directories they do this if they have to access
firmware from Linux (we do it a lot on parisc to get the IODC to give us
the device inventory for instance).

I strongly suspect the origin of this weirdness is that once upon a time
Windows didn't run with a separated address space and so needed a way of
accessing firmware in the same address space, hence the pointer
relocation trick, but even Windows hasn't needed this for a while.

James




Corrupted EFI region

2013-07-31 Thread Borislav Petkov
Hi guys,

so I'm seeing this funny thing where an EFI region changes when we enter
efi_enter_virtual_mode when booting with edk2 on kvm. Here's the diff:

--- before  2013-07-31 22:20:52.316039492 +0200
+++ after   2013-07-31 22:21:30.960731706 +0200
@@ -9,7 +9,7 @@ efi: mem07: type=2, attr=0xf, range=[0x0
 efi: mem08: type=7, attr=0xf, range=[0x40000000-0x7c000000) (960MB)
 efi: mem09: type=4, attr=0xf, range=[0x7c000000-0x7c020000) (0MB)
 efi: mem10: type=7, attr=0xf, range=[0x7c020000-0x7e0ad000) (32MB)
-efi: mem11: type=4, attr=0xf, range=[0x7e0ad000-0x7e0cc000) (0MB)
+efi: mem11: type=4, attr=0xf, range=[0x7e0ad000-0x7e0ad000) (0MB)
 efi: mem12: type=7, attr=0xf, range=[0x7e0cc000-0x7e0cd000) (0MB)
 efi: mem13: type=4, attr=0xf, range=[0x7e0cd000-0x7e55d000) (4MB)
 efi: mem14: type=3, attr=0xf, range=[0x7e55d000-0x7e59c000) (0MB)

That second boundary of region mem11 suddenly changes *before* we merge
the regions. edk2 bug?

Whole dmesg attached.

-- 
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.
--


test-x86_64.log.gz
Description: Binary data


Re: Corrupted EFI region

2013-07-31 Thread Matthew Garrett
On Wed, Jul 31, 2013 at 10:54:31PM +0200, Borislav Petkov wrote:

  efi: mem08: type=7, attr=0xf, range=[0x40000000-0x7c000000) (960MB)
  efi: mem09: type=4, attr=0xf, range=[0x7c000000-0x7c020000) (0MB)
  efi: mem10: type=7, attr=0xf, range=[0x7c020000-0x7e0ad000) (32MB)
 -efi: mem11: type=4, attr=0xf, range=[0x7e0ad000-0x7e0cc000) (0MB)
 +efi: mem11: type=4, attr=0xf, range=[0x7e0ad000-0x7e0ad000) (0MB)
  efi: mem12: type=7, attr=0xf, range=[0x7e0cc000-0x7e0cd000) (0MB)
  efi: mem13: type=4, attr=0xf, range=[0x7e0cd000-0x7e55d000) (4MB)
  efi: mem14: type=3, attr=0xf, range=[0x7e55d000-0x7e59c000) (0MB)

Are we making any EFI calls in between? I certainly wouldn't expect the 
memory map to change after ExitBootServices, but up until that point the 
firmware's free to mess with it.

-- 
Matthew Garrett | mj...@srcf.ucam.org


Re: Corrupted EFI region

2013-07-31 Thread Matthew Garrett
On Wed, Jul 31, 2013 at 11:51:30PM +0200, Borislav Petkov wrote:

 But the problem is, something messes up the upper boundary of the region
 and it is an EFI_BOOT_SERVICES_DATA region which we need for the runtime
 services mapping and if we can't map it properly, we're probably going
 to miss functionality or not have runtime at all.

Easiest way around this would probably be to stash the address map 
after ExitBootServices() and compare it at SetVirtualAddressMap() time, 
then take the widest boundaries and trim the e820 map to match. This is 
obviously dependent upon the system not allocating anything further 
after that, but it seems safest. The worst case is finding the firmware 
writing over bits of the kernel.
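
The "take the widest boundaries" step could look roughly like this. It is only
a sketch of the idea: widest is a hypothetical helper, regions are plain
(start, end) pairs with an exclusive end, both snapshots are assumed to list
the same descriptors in the same order, and trimming the e820 map to match is
not modelled.

```python
def widest(stashed, current):
    """Per-descriptor union of two memory map snapshots."""
    return [
        (min(s0, s1), max(e0, e1))
        for (s0, e0), (s1, e1) in zip(stashed, current)
    ]


# mem11 and mem12 from the diff at the start of this thread.
stashed = [(0x7E0AD000, 0x7E0CC000), (0x7E0CC000, 0x7E0CD000)]   # "before"
current = [(0x7E0AD000, 0x7E0AD000), (0x7E0CC000, 0x7E0CD000)]   # "after"

merged = widest(stashed, current)
```

For the mem11 case this restores the original upper boundary, since the
stashed entry is the wider of the two.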

-- 
Matthew Garrett | mj...@srcf.ucam.org


Re: Corrupted EFI region

2013-07-31 Thread David Woodhouse
On Wed, 2013-07-31 at 22:54 +0200, Borislav Petkov wrote:
 so I'm seeing this funny thing where an EFI region changes when we enter
 efi_enter_virtual_mode when booting with edk2 on kvm. Here's the diff:

Perhaps the edk2-de...@lists.sourceforge.net list should be in Cc?

-- 
dwmw2


