RE: [PATCH V2 0/3] Fix EFI memory map leaks

2018-12-05 Thread Prakhya, Sai Praneeth
> > Although majority of the changes are made to
> > drivers/firmware/efi/memmap.c file (which is common across
> > architectures), this bug is only limited to x86_64 machines and hence this 
> > patch
> set shouldn't effect any other architectures.
> 
> I will give this a test run on the arm64 test hardware available at my end. I 
> have
> shared some minor comments on the patchset - see respective patch threads for
> the same.

Thanks a lot! for the comments  and taking time to test the patches :)
Really appreciate your effort :)

Regards,
Sai


RE: [PATCH V2 0/3] Fix EFI memory map leaks

2018-12-05 Thread Prakhya, Sai Praneeth
> Since we're being pedantic, it also makes sense to decide now whether 'area'
> refers to all [discontiguous] regions or just one of them. I'd say the 
> former, and
> use 'region' for the latter, i.e., an area may be made up of several regions, 
> but
> only one exists of each type.
> 
> > Parameters and local variables should be named unambiguously following
> > these concepts as well. There should be no opaque 'new' variables
> > *especially* where the primary role of the efi_memmap_insert() API
> > call is to add a new *entry* - it should be 'new_efi_map_table' (or
> > 'new_table' where it's unambiguous) instead, and new entries added
> > should be 'new_entry' - etc.
> >
> > Once this is reorganized and clarified it should be a lot more easy to
> > review 'table freeing' patches. I can help out with patches if there's
> > no conceptual objections.

I have a question,

As this table freeing patches do some cleanup of some efi functions, namely, 
efi_clean_memmap(), efi_fake_memmap(), efi_arch_mem_reserve(), 
efi_memmap_install() and __efi_enter_virtual_mode(), I am not very sure if 
you want to do the name changing now and later review the patches or review the 
patches and later do the name changing.

As you pointed out that reviewing them now is pain, which I agree, but my 
concern is 
you might be looking at code and might as well change names that might 
eventually 
be gone because of these table freeing patches.

Regards,
Sai


Re: [PATCH V2 0/3] Fix EFI memory map leaks

2018-12-05 Thread Ingo Molnar


* Ard Biesheuvel  wrote:

> So if we are going to rename these things wholesale (which is fine
> with me), I'd prefer it if we can drop the 'map' entirely.
> 
> EFI memory table
> EFI memory table entry
> EFI memory table descriptor

So if you don't actively hate the idea (which you don't seem to :) I can 
cook up a couple of patches and send the series out as an RFC - I think 
such reorganization is much better seen through the lens of an end 
result.

I can then iterate/rebase the reorganization with low overhead, without 
committing them permanently yet.

> For things described by the EFI memory table, we need to be more
> specific anyway, since it typically describes everything, so
> 
> EFI boot services area
> EFI runtime services area
> EFI reserved area
> EFI conventional memory area
> EFI MMIO area
> 
> etc etc

Yeah - that's the analogue of e820_entry->type, right?

> Since we're being pedantic, it also makes sense to decide now whether
> 'area' refers to all [discontiguous] regions or just one of them. I'd
> say the former, and use 'region' for the latter, i.e., an area may be
> made up of several regions, but only one exists of each type.

Makes sense.

Just to make clear I got it right: AFAICS we have 'struct 
efi_boot_memmap' that describes all of these areas, and boot_memmap->map 
is a linear array of efi_memory_desc_t entries, where each entry 
describes a contiguous range of memory of a given type, correct?

On that data structure level the notion of an 'area' doesn't really come 
up, which would be all the (discontiguous) entries of a given type. What 
we have is the 'map', the table, and the entries of the table.

( Regarding being pedantic: reading efi_insert_memmap() hit my 
  pain+confusion threshold, and I think it's better to follow a 
  hysteresis curve: i.e. instead of minimally moving it just below the 
  pain threshold we might as well bite the bullet and be really pedantic 
  and systematic about all of this in a single quick pass. That batches 
  it all up and is also generally a lot more fun and productive to 
  execute, as the end result will be a visible improvement (hopefully).
  We did that with the FPU code and the e820 code recently and it was 
  rather successful. )

Also, would it make sense to actually turn the table into a real C array, 
while taking the "->desc_size != sizeof(efi_memory_desc_t)" future 
expansion compatibility constraint into account:

 - A single, quick bootstrap step allocates a proper array with known
   entry sizes.

 - mem_table->entries[i] would thus actully exist on the C level, with a
   mem_table->nr_entries limit - very similar to the e820 table code.

 - This would simplify the code at the expense of a single bootstrap
   function that does the allocation and copying from EFI-format table to
   Linux-format table.

 - By all means efi_early_memdesc_ptr() and for_each_efi_memory_desc*() 
   are wrappers around a linear array with entry sizes unknown at build 
   time - that complication would largely go away.

Does the EFI runtime have any notion of the OS *modifying* the memory map 
and then the runtime relying on those modifications, or is it in practice 
treated as a read-only table?

If we can make it a real array on the C level I was also hoping to sneak 
in the concept of an 'entry' somewhere, so that readers of the simple 
e820 table/entry logic would feel at home in the EFI code as well. Could 
we perhaps name 'descriptor' as 'entry', instead of 'region'?

I.e. on the data structure level we'd have a clear hierarchy of 
table/entries/entry:

   efi_mem_table:
->entries[i]
->nr_entries

where 'entry->type' distinguishes the different areas.

It's all pretty straightforward and would allow some unification of 
utility functions with the e820 code perhaps.

In fact if we do that then we could probably rename:

struct e820_table  =>  struct x86_mem_table
struct e820_entry  =>  struct x86_mem_entry

... and extend x86_mem_entry with an 'attr' field?

That way all the information stored in the EFI entry could be stored in 
an x86_mem_entry - and the EFI code (and the e820 code) could use this 
more generic table/entry data structure directly and deal with the raw 
underlying data types as little as possible.

The e820 code already has a decoupled 'struct boot_e820_entry' data type 
for the boot protocol ABI, so I think 'e820_entry' could be extended and 
made generic - at least in principle.

There's a number of high level advantages from doing that:

 - Right now we put the EFI memory table entries into an e820 table and
   then feed that to the memblock allocator. This is a mild conceptual 
   layering violation: EFI per definition has nothing to do with E820 
   BIOS calls. The e820_table/e820_entry is the de facto 'generic' entry 
   already, and the naming should now reflect that I believe.

 - There's 23 uses of for_each_efi_memory_desc*(), so the simplification
   would be measurable even in low 

Re: [PATCH V2 0/3] Fix EFI memory map leaks

2018-12-05 Thread Bhupesh Sharma

Hi Sai,

Thanks for Cc'ing me on this patchset.

On 12/05/2018 06:03 AM, Sai Praneeth Prakhya wrote:

Presently, in EFI subsystem of kernel, every time kernel allocates memory for a
new EFI memory map, it forgets to free the memory occupied by old EFI memory 
map.
It does clear the mappings though (using efi_memmap_unmap()), but forgets to
free up the memory. Also, there is another minor issue, where in the newly
allocated memory isn't freed, should remap fail.

The first issue is addressed by adding efi_memmap_free() to efi_memmap_install()
and the second issue is addressed by pushing the remap code into
efi_memmap_alloc() and there by handling the failure condition.

Memory allocated to EFI memory map is leaked in below functions and hence they
are modified to fix the issue. Functions that modify EFI memmap are:
1. efi_clean_memmap(),
2. efi_fake_memmap(),
3. efi_arch_mem_reserve(),
4. efi_free_boot_services(),
5. and __efi_enter_virtual_mode()

More detailed explanation:
--
A typical boot flow on EFI supported x86_64 machines might look something like
below
1. EFI memory map is passed by firmware to kernel.
2. Kernel does a memblock_reserve() on this memory
(see efi_memblock_x86_reserve_range()).
3. This memory map is checked for invalid entries in efi_clean_memmap(). If any
invalid entries are found, they are omitted from EFI memory map but the
memory occupied by these invalid EFI memory descriptors isn't freed.
3. To further process this memory map (see efi_fake_memmap(), efi_bgrt_init()
and efi_esrt_init()), kernel allocates memory using efi_memmap_alloc() and
copies the processed memory map to newly allocated memory but it forgets to
free memory occupied by old EFI memory map.
4. Further, in efi_map_regions() the EFI memory map is processed again to
include only EFI memory descriptors of type Runtime Code/Data and Boot
Code/Data. Again, memory is allocated for this new memory map through
realloc_pages() and the old EFI memory map is not freed.
5. After SetVirtualAddressMap() is done, the EFI memory map is processed again
to have only EFI memory descriptors of type Runtime Code/Data. Again, memory
is allocated for this new memory map through efi_memmap_alloc() and the old
EFI memory map is not freed.

Testing:

Tested with LUV on qemu-x86_64 and on my dev machine. Checked for unchanged boot
behavior i.e. shouldn't break any existing stuff. Built for arm, arm64 and ia64
and found no new warnings/errors. Would appreciate the effort if someone could
test on arm machines.

Although majority of the changes are made to drivers/firmware/efi/memmap.c file
(which is common across architectures), this bug is only limited to x86_64
machines and hence this patch set shouldn't effect any other architectures.


I will give this a test run on the arm64 test hardware available at my 
end. I have shared some minor comments on the patchset - see respective 
patch threads for the same.


Thanks,
Bhupesh




Notes:
--
1. This patch set is based on EFI tree's "next" branch [1].
2. This patch set is an outcome of the discussion at [2].

Changes from V1:

1. Drop passing around allocation type from efi_memmap_alloc(), instead change
efi_memmap_alloc() such that it now returns a populated struct 
efi_memory_map
2. Drop fixing issues in efi_fake_memmap(), will be addressed in a separate 
patch.
3. Optimize efi_map_regions().

[1] git git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi.git
[2] https://lkml.org/lkml/2018/7/2/1095

Sai Praneeth Prakhya (3):
   efi: Introduce efi_memmap_free() and efi_memmap_unmap_and_free()
   x86/efi: Fix EFI memory map leaks
   x86/efi: Use efi_memmap_() to create runtime EFI memory
 map

  arch/x86/include/asm/efi.h  |   3 +-
  arch/x86/kernel/setup.c |   6 +
  arch/x86/platform/efi/efi.c | 141 +---
  arch/x86/platform/efi/efi_32.c  |   2 +-
  arch/x86/platform/efi/efi_64.c  |   7 +-
  arch/x86/platform/efi/quirks.c  |  45 ++--
  drivers/firmware/efi/arm-init.c |   2 +-
  drivers/firmware/efi/fake_mem.c |  21 +---
  drivers/firmware/efi/memmap.c   | 190 +---
  include/linux/efi.h |   8 +-
  10 files changed, 235 insertions(+), 190 deletions(-)

Signed-off-by: Sai Praneeth Prakhya 
Suggested-by: Ard Biesheuvel 
Cc: Ingo Molnar 
Cc: Borislav Petkov 
Cc: Bhupesh Sharma 
Cc: Ard Biesheuvel 





Re: [PATCH V2 0/3] Fix EFI memory map leaks

2018-12-04 Thread Ard Biesheuvel
On Wed, 5 Dec 2018 at 08:41, Ingo Molnar  wrote:
>
>
> * Sai Praneeth Prakhya  wrote:
>
> > Presently, in EFI subsystem of kernel, every time kernel allocates memory 
> > for a
> > new EFI memory map, it forgets to free the memory occupied by old EFI 
> > memory map.
> > It does clear the mappings though (using efi_memmap_unmap()), but forgets to
> > free up the memory. Also, there is another minor issue, where in the newly
> > allocated memory isn't freed, should remap fail.
> >
> > The first issue is addressed by adding efi_memmap_free() to 
> > efi_memmap_install()
> > and the second issue is addressed by pushing the remap code into
> > efi_memmap_alloc() and there by handling the failure condition.
> >
> > Memory allocated to EFI memory map is leaked in below functions and hence 
> > they
> > are modified to fix the issue. Functions that modify EFI memmap are:
> > 1. efi_clean_memmap(),
> > 2. efi_fake_memmap(),
> > 3. efi_arch_mem_reserve(),
> > 4. efi_free_boot_services(),
> > 5. and __efi_enter_virtual_mode()
> >
> > More detailed explanation:
> > --
> > A typical boot flow on EFI supported x86_64 machines might look something 
> > like
> > below
> > 1. EFI memory map is passed by firmware to kernel.
> > 2. Kernel does a memblock_reserve() on this memory
> >(see efi_memblock_x86_reserve_range()).
> > 3. This memory map is checked for invalid entries in efi_clean_memmap(). If 
> > any
> >invalid entries are found, they are omitted from EFI memory map but the
> >memory occupied by these invalid EFI memory descriptors isn't freed.
> > 3. To further process this memory map (see efi_fake_memmap(), 
> > efi_bgrt_init()
> >and efi_esrt_init()), kernel allocates memory using efi_memmap_alloc() 
> > and
> >copies the processed memory map to newly allocated memory but it forgets 
> > to
> >free memory occupied by old EFI memory map.
> > 4. Further, in efi_map_regions() the EFI memory map is processed again to
> >include only EFI memory descriptors of type Runtime Code/Data and Boot
> >Code/Data. Again, memory is allocated for this new memory map through
> >realloc_pages() and the old EFI memory map is not freed.
> > 5. After SetVirtualAddressMap() is done, the EFI memory map is processed 
> > again
> >to have only EFI memory descriptors of type Runtime Code/Data. Again, 
> > memory
> >is allocated for this new memory map through efi_memmap_alloc() and the 
> > old
> >EFI memory map is not freed.
>
> So it was only halfway through the changelog that I realized that by
> 'free the memory occupied by old EFI memory map' you didn't mean the EFI
> memory map itself (the system RAM described), but the EFI memory map
> *table*, this relatively small array of descriptors that we allocate and
> reallocate every now and then according to the limitations of the
> (largely non-existent yet) early boot memory management system...
>
> Could we please use much more precise terminology when referring to these
> entities, in changelogs and code alike? I.e.:
>
>  - 'EFI memory map' is the whole map and the contents
>
>  - 'EFI memory map table' is the table structure itself, without regard
>of its contents
>
>  - 'EFI memory map table entry/descriptor' is an entry in the table.
>
> This kind of clarity is what we are using on the e820 memory map side as
> well: we have struct e820_table and struct e820_entry.
>
> Ard, would you object to standardizing the EFI code in a similar fashion
> as well: efi_mem_table and efi_mem_entry?
>

I share your concern, but 'EFI memory map' already has an established
meaning beyond Linux, as the set of memory map entries describing the
address space to the OS. (for instance, the EFI boot service to
retrieve it is called GetMemoryMap())

I am not sure what 'the whole EFI memory map' would mean'. It
describes all of DRAM (used, unused or reserved) and on top of that,
some peripheral mappings that need to be virtually remapped by the OS
so that the runtime services can use them (e.g., RTC or flash). So
from UEFI spec pov, your notion of 'EFI memory map' is just the entire
memory map.

So if we are going to rename these things wholesale (which is fine
with me), I'd prefer it if we can drop the 'map' entirely.

EFI memory table
EFI memory table entry
EFI memory table descriptor

For things described by the EFI memory table, we need to be more
specific anyway, since it typically describes everything, so

EFI boot services area
EFI runtime services area
EFI reserved area
EFI conventional memory area
EFI MMIO area

etc etc

Since we're being pedantic, it also makes sense to decide now whether
'area' refers to all [discontiguous] regions or just one of them. I'd
say the former, and use 'region' for the latter, i.e., an area may be
made up of several regions, but only one exists of each type.

> Parameters and local variables should be named unambiguously following
> these concepts as well. There should be no opaque 'new' 

Re: [PATCH V2 0/3] Fix EFI memory map leaks

2018-12-04 Thread Ingo Molnar


* Sai Praneeth Prakhya  wrote:

> Presently, in EFI subsystem of kernel, every time kernel allocates memory for 
> a
> new EFI memory map, it forgets to free the memory occupied by old EFI memory 
> map.
> It does clear the mappings though (using efi_memmap_unmap()), but forgets to
> free up the memory. Also, there is another minor issue, where in the newly
> allocated memory isn't freed, should remap fail.
> 
> The first issue is addressed by adding efi_memmap_free() to 
> efi_memmap_install()
> and the second issue is addressed by pushing the remap code into
> efi_memmap_alloc() and there by handling the failure condition.
> 
> Memory allocated to EFI memory map is leaked in below functions and hence they
> are modified to fix the issue. Functions that modify EFI memmap are:
> 1. efi_clean_memmap(),
> 2. efi_fake_memmap(),
> 3. efi_arch_mem_reserve(),
> 4. efi_free_boot_services(),
> 5. and __efi_enter_virtual_mode()
> 
> More detailed explanation:
> --
> A typical boot flow on EFI supported x86_64 machines might look something like
> below
> 1. EFI memory map is passed by firmware to kernel.
> 2. Kernel does a memblock_reserve() on this memory
>(see efi_memblock_x86_reserve_range()).
> 3. This memory map is checked for invalid entries in efi_clean_memmap(). If 
> any
>invalid entries are found, they are omitted from EFI memory map but the
>memory occupied by these invalid EFI memory descriptors isn't freed.
> 3. To further process this memory map (see efi_fake_memmap(), efi_bgrt_init()
>and efi_esrt_init()), kernel allocates memory using efi_memmap_alloc() and
>copies the processed memory map to newly allocated memory but it forgets to
>free memory occupied by old EFI memory map.
> 4. Further, in efi_map_regions() the EFI memory map is processed again to
>include only EFI memory descriptors of type Runtime Code/Data and Boot
>Code/Data. Again, memory is allocated for this new memory map through
>realloc_pages() and the old EFI memory map is not freed.
> 5. After SetVirtualAddressMap() is done, the EFI memory map is processed again
>to have only EFI memory descriptors of type Runtime Code/Data. Again, 
> memory
>is allocated for this new memory map through efi_memmap_alloc() and the old
>EFI memory map is not freed.

So it was only halfway through the changelog that I realized that by 
'free the memory occupied by old EFI memory map' you didn't mean the EFI 
memory map itself (the system RAM described), but the EFI memory map 
*table*, this relatively small array of descriptors that we allocate and 
reallocate every now and then according to the limitations of the 
(largely non-existent yet) early boot memory management system...

Could we please use much more precise terminology when referring to these 
entities, in changelogs and code alike? I.e.:

 - 'EFI memory map' is the whole map and the contents

 - 'EFI memory map table' is the table structure itself, without regard 
   of its contents

 - 'EFI memory map table entry/descriptor' is an entry in the table.

This kind of clarity is what we are using on the e820 memory map side as 
well: we have struct e820_table and struct e820_entry.

Ard, would you object to standardizing the EFI code in a similar fashion 
as well: efi_mem_table and efi_mem_entry?

Parameters and local variables should be named unambiguously following 
these concepts as well. There should be no opaque 'new' variables 
*especially* where the primary role of the efi_memmap_insert() API call 
is to add a new *entry* - it should be 'new_efi_map_table' (or 
'new_table' where it's unambiguous) instead, and new entries added should 
be 'new_entry' - etc.

Once this is reorganized and clarified it should be a lot more easy to 
review 'table freeing' patches. I can help out with patches if there's no 
conceptual objections.

Thanks,

Ingo


[PATCH V2 0/3] Fix EFI memory map leaks

2018-12-04 Thread Sai Praneeth Prakhya
Presently, in EFI subsystem of kernel, every time kernel allocates memory for a
new EFI memory map, it forgets to free the memory occupied by old EFI memory 
map.
It does clear the mappings though (using efi_memmap_unmap()), but forgets to
free up the memory. Also, there is another minor issue, where in the newly
allocated memory isn't freed, should remap fail.

The first issue is addressed by adding efi_memmap_free() to efi_memmap_install()
and the second issue is addressed by pushing the remap code into
efi_memmap_alloc() and there by handling the failure condition.

Memory allocated to EFI memory map is leaked in below functions and hence they
are modified to fix the issue. Functions that modify EFI memmap are:
1. efi_clean_memmap(),
2. efi_fake_memmap(),
3. efi_arch_mem_reserve(),
4. efi_free_boot_services(),
5. and __efi_enter_virtual_mode()

More detailed explanation:
--
A typical boot flow on EFI supported x86_64 machines might look something like
below
1. EFI memory map is passed by firmware to kernel.
2. Kernel does a memblock_reserve() on this memory
   (see efi_memblock_x86_reserve_range()).
3. This memory map is checked for invalid entries in efi_clean_memmap(). If any
   invalid entries are found, they are omitted from EFI memory map but the
   memory occupied by these invalid EFI memory descriptors isn't freed.
3. To further process this memory map (see efi_fake_memmap(), efi_bgrt_init()
   and efi_esrt_init()), kernel allocates memory using efi_memmap_alloc() and
   copies the processed memory map to newly allocated memory but it forgets to
   free memory occupied by old EFI memory map.
4. Further, in efi_map_regions() the EFI memory map is processed again to
   include only EFI memory descriptors of type Runtime Code/Data and Boot
   Code/Data. Again, memory is allocated for this new memory map through
   realloc_pages() and the old EFI memory map is not freed.
5. After SetVirtualAddressMap() is done, the EFI memory map is processed again
   to have only EFI memory descriptors of type Runtime Code/Data. Again, memory
   is allocated for this new memory map through efi_memmap_alloc() and the old
   EFI memory map is not freed.

Testing:

Tested with LUV on qemu-x86_64 and on my dev machine. Checked for unchanged boot
behavior i.e. shouldn't break any existing stuff. Built for arm, arm64 and ia64
and found no new warnings/errors. Would appreciate the effort if someone could
test on arm machines.

Although majority of the changes are made to drivers/firmware/efi/memmap.c file
(which is common across architectures), this bug is only limited to x86_64
machines and hence this patch set shouldn't effect any other architectures.

Notes:
--
1. This patch set is based on EFI tree's "next" branch [1].
2. This patch set is an outcome of the discussion at [2].

Changes from V1:

1. Drop passing around allocation type from efi_memmap_alloc(), instead change
   efi_memmap_alloc() such that it now returns a populated struct efi_memory_map
2. Drop fixing issues in efi_fake_memmap(), will be addressed in a separate 
patch.
3. Optimize efi_map_regions().

[1] git git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi.git
[2] https://lkml.org/lkml/2018/7/2/1095

Sai Praneeth Prakhya (3):
  efi: Introduce efi_memmap_free() and efi_memmap_unmap_and_free()
  x86/efi: Fix EFI memory map leaks
  x86/efi: Use efi_memmap_() to create runtime EFI memory
map

 arch/x86/include/asm/efi.h  |   3 +-
 arch/x86/kernel/setup.c |   6 +
 arch/x86/platform/efi/efi.c | 141 +---
 arch/x86/platform/efi/efi_32.c  |   2 +-
 arch/x86/platform/efi/efi_64.c  |   7 +-
 arch/x86/platform/efi/quirks.c  |  45 ++--
 drivers/firmware/efi/arm-init.c |   2 +-
 drivers/firmware/efi/fake_mem.c |  21 +---
 drivers/firmware/efi/memmap.c   | 190 +---
 include/linux/efi.h |   8 +-
 10 files changed, 235 insertions(+), 190 deletions(-)

Signed-off-by: Sai Praneeth Prakhya 
Suggested-by: Ard Biesheuvel 
Cc: Ingo Molnar 
Cc: Borislav Petkov 
Cc: Bhupesh Sharma 
Cc: Ard Biesheuvel 

-- 
2.19.1