RE: [PATCH V2 0/3] Fix EFI memory map leaks
> > Although majority of the changes are made to > > drivers/firmware/efi/memmap.c file (which is common across > > architectures), this bug is only limited to x86_64 machines and hence this > > patch > set shouldn't effect any other architectures. > > I will give this a test run on the arm64 test hardware available at my end. I > have > shared some minor comments on the patchset - see respective patch threads for > the same. Thanks a lot! for the comments and taking time to test the patches :) Really appreciate your effort :) Regards, Sai
RE: [PATCH V2 0/3] Fix EFI memory map leaks
> Since we're being pedantic, it also makes sense to decide now whether 'area' > refers to all [discontiguous] regions or just one of them. I'd say the > former, and > use 'region' for the latter, i.e., an area may be made up of several regions, > but > only one exists of each type. > > > Parameters and local variables should be named unambiguously following > > these concepts as well. There should be no opaque 'new' variables > > *especially* where the primary role of the efi_memmap_insert() API > > call is to add a new *entry* - it should be 'new_efi_map_table' (or > > 'new_table' where it's unambiguous) instead, and new entries added > > should be 'new_entry' - etc. > > > > Once this is reorganized and clarified it should be a lot more easy to > > review 'table freeing' patches. I can help out with patches if there's > > no conceptual objections. I have a question, As this table freeing patches do some cleanup of some efi functions, namely, efi_clean_memmap(), efi_fake_memmap(), efi_arch_mem_reserve(), efi_memmap_install() and __efi_enter_virtual_mode(), I am not very sure if you want to do the name changing now and later review the patches or review the patches and later do the name changing. As you pointed out that reviewing them now is pain, which I agree, but my concern is you might be looking at code and might as well change names that might eventually be gone because of these table freeing patches. Regards, Sai
Re: [PATCH V2 0/3] Fix EFI memory map leaks
* Ard Biesheuvel wrote: > So if we are going to rename these things wholesale (which is fine > with me), I'd prefer it if we can drop the 'map' entirely. > > EFI memory table > EFI memory table entry > EFI memory table descriptor So if you don't actively hate the idea (which you don't seem to :) I can cook up a couple of patches and send the series out as an RFC - I think such reorganization is much better seen through the lens of an end result. I can then iterate/rebase the reorganization with low overhead, without committing them permanently yet. > For things described by the EFI memory table, we need to be more > specific anyway, since it typically describes everything, so > > EFI boot services area > EFI runtime services area > EFI reserved area > EFI conventional memory area > EFI MMIO area > > etc etc Yeah - that's the analogue of e820_entry->type, right? > Since we're being pedantic, it also makes sense to decide now whether > 'area' refers to all [discontiguous] regions or just one of them. I'd > say the former, and use 'region' for the latter, i.e., an area may be > made up of several regions, but only one exists of each type. Makes sense. Just to make clear I got it right: AFAICS we have 'struct efi_boot_memmap' that describes all of these areas, and boot_memmap->map is a linear array of efi_memory_desc_t entries, where each entry describes a contiguous range of memory of a given type, correct? On that data structure level the notion of an 'area' doesn't really come up, which would be all the (discontiguous) entries of a given type. What we have is the 'map', the table, and the entries of the table. ( Regarding being pedantic: reading efi_insert_memmap() hit my pain+confusion threshold, and I think it's better to follow a hysteresis curve: i.e. instead of minimally moving it just below the pain threshold we might as well bite the bullet and be really pedantic and systematic about all of this in a single quick pass. That batches it all up and is also generally a lot more fun and productive to execute, as the end result will be a visible improvement (hopefully). We did that with the FPU code and the e820 code recently and it was rather successful. ) Also, would it make sense to actually turn the table into a real C array, while taking the "->desc_size != sizeof(efi_memory_desc_t)" future expansion compatibility constraint into account: - A single, quick bootstrap step allocates a proper array with known entry sizes. - mem_table->entries[i] would thus actully exist on the C level, with a mem_table->nr_entries limit - very similar to the e820 table code. - This would simplify the code at the expense of a single bootstrap function that does the allocation and copying from EFI-format table to Linux-format table. - By all means efi_early_memdesc_ptr() and for_each_efi_memory_desc*() are wrappers around a linear array with entry sizes unknown at build time - that complication would largely go away. Does the EFI runtime have any notion of the OS *modifying* the memory map and then the runtime relying on those modifications, or is it in practice treated as a read-only table? If we can make it a real array on the C level I was also hoping to sneak in the concept of an 'entry' somewhere, so that readers of the simple e820 table/entry logic would feel at home in the EFI code as well. Could we perhaps name 'descriptor' as 'entry', instead of 'region'? I.e. on the data structure level we'd have a clear hierarchy of table/entries/entry: efi_mem_table: ->entries[i] ->nr_entries where 'entry->type' distinguishes the different areas. It's all pretty straightforward and would allow some unification of utility functions with the e820 code perhaps. In fact if we do that then we could probably rename: struct e820_table => struct x86_mem_table struct e820_entry => struct x86_mem_entry ... and extend x86_mem_entry with an 'attr' field? That way all the information stored in the EFI entry could be stored in an x86_mem_entry - and the EFI code (and the e820 code) could use this more generic table/entry data structure directly and deal with the raw underlying data types as little as possible. The e820 code already has a decoupled 'struct boot_e820_entry' data type for the boot protocol ABI, so I think 'e820_entry' could be extended and made generic - at least in principle. There's a number of high level advantages from doing that: - Right now we put the EFI memory table entries into an e820 table and then feed that to the memblock allocator. This is a mild conceptual layering violation: EFI per definition has nothing to do with E820 BIOS calls. The e820_table/e820_entry is the de facto 'generic' entry already, and the naming should now reflect that I believe. - There's 23 uses of for_each_efi_memory_desc*(), so the simplification would be measurable even in low
Re: [PATCH V2 0/3] Fix EFI memory map leaks
Hi Sai, Thanks for Cc'ing me on this patchset. On 12/05/2018 06:03 AM, Sai Praneeth Prakhya wrote: Presently, in EFI subsystem of kernel, every time kernel allocates memory for a new EFI memory map, it forgets to free the memory occupied by old EFI memory map. It does clear the mappings though (using efi_memmap_unmap()), but forgets to free up the memory. Also, there is another minor issue, where in the newly allocated memory isn't freed, should remap fail. The first issue is addressed by adding efi_memmap_free() to efi_memmap_install() and the second issue is addressed by pushing the remap code into efi_memmap_alloc() and there by handling the failure condition. Memory allocated to EFI memory map is leaked in below functions and hence they are modified to fix the issue. Functions that modify EFI memmap are: 1. efi_clean_memmap(), 2. efi_fake_memmap(), 3. efi_arch_mem_reserve(), 4. efi_free_boot_services(), 5. and __efi_enter_virtual_mode() More detailed explanation: -- A typical boot flow on EFI supported x86_64 machines might look something like below 1. EFI memory map is passed by firmware to kernel. 2. Kernel does a memblock_reserve() on this memory (see efi_memblock_x86_reserve_range()). 3. This memory map is checked for invalid entries in efi_clean_memmap(). If any invalid entries are found, they are omitted from EFI memory map but the memory occupied by these invalid EFI memory descriptors isn't freed. 3. To further process this memory map (see efi_fake_memmap(), efi_bgrt_init() and efi_esrt_init()), kernel allocates memory using efi_memmap_alloc() and copies the processed memory map to newly allocated memory but it forgets to free memory occupied by old EFI memory map. 4. Further, in efi_map_regions() the EFI memory map is processed again to include only EFI memory descriptors of type Runtime Code/Data and Boot Code/Data. Again, memory is allocated for this new memory map through realloc_pages() and the old EFI memory map is not freed. 5. After SetVirtualAddressMap() is done, the EFI memory map is processed again to have only EFI memory descriptors of type Runtime Code/Data. Again, memory is allocated for this new memory map through efi_memmap_alloc() and the old EFI memory map is not freed. Testing: Tested with LUV on qemu-x86_64 and on my dev machine. Checked for unchanged boot behavior i.e. shouldn't break any existing stuff. Built for arm, arm64 and ia64 and found no new warnings/errors. Would appreciate the effort if someone could test on arm machines. Although majority of the changes are made to drivers/firmware/efi/memmap.c file (which is common across architectures), this bug is only limited to x86_64 machines and hence this patch set shouldn't effect any other architectures. I will give this a test run on the arm64 test hardware available at my end. I have shared some minor comments on the patchset - see respective patch threads for the same. Thanks, Bhupesh Notes: -- 1. This patch set is based on EFI tree's "next" branch [1]. 2. This patch set is an outcome of the discussion at [2]. Changes from V1: 1. Drop passing around allocation type from efi_memmap_alloc(), instead change efi_memmap_alloc() such that it now returns a populated struct efi_memory_map 2. Drop fixing issues in efi_fake_memmap(), will be addressed in a separate patch. 3. Optimize efi_map_regions(). [1] git git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi.git [2] https://lkml.org/lkml/2018/7/2/1095 Sai Praneeth Prakhya (3): efi: Introduce efi_memmap_free() and efi_memmap_unmap_and_free() x86/efi: Fix EFI memory map leaks x86/efi: Use efi_memmap_() to create runtime EFI memory map arch/x86/include/asm/efi.h | 3 +- arch/x86/kernel/setup.c | 6 + arch/x86/platform/efi/efi.c | 141 +--- arch/x86/platform/efi/efi_32.c | 2 +- arch/x86/platform/efi/efi_64.c | 7 +- arch/x86/platform/efi/quirks.c | 45 ++-- drivers/firmware/efi/arm-init.c | 2 +- drivers/firmware/efi/fake_mem.c | 21 +--- drivers/firmware/efi/memmap.c | 190 +--- include/linux/efi.h | 8 +- 10 files changed, 235 insertions(+), 190 deletions(-) Signed-off-by: Sai Praneeth Prakhya Suggested-by: Ard Biesheuvel Cc: Ingo Molnar Cc: Borislav Petkov Cc: Bhupesh Sharma Cc: Ard Biesheuvel
Re: [PATCH V2 0/3] Fix EFI memory map leaks
On Wed, 5 Dec 2018 at 08:41, Ingo Molnar wrote: > > > * Sai Praneeth Prakhya wrote: > > > Presently, in EFI subsystem of kernel, every time kernel allocates memory > > for a > > new EFI memory map, it forgets to free the memory occupied by old EFI > > memory map. > > It does clear the mappings though (using efi_memmap_unmap()), but forgets to > > free up the memory. Also, there is another minor issue, where in the newly > > allocated memory isn't freed, should remap fail. > > > > The first issue is addressed by adding efi_memmap_free() to > > efi_memmap_install() > > and the second issue is addressed by pushing the remap code into > > efi_memmap_alloc() and there by handling the failure condition. > > > > Memory allocated to EFI memory map is leaked in below functions and hence > > they > > are modified to fix the issue. Functions that modify EFI memmap are: > > 1. efi_clean_memmap(), > > 2. efi_fake_memmap(), > > 3. efi_arch_mem_reserve(), > > 4. efi_free_boot_services(), > > 5. and __efi_enter_virtual_mode() > > > > More detailed explanation: > > -- > > A typical boot flow on EFI supported x86_64 machines might look something > > like > > below > > 1. EFI memory map is passed by firmware to kernel. > > 2. Kernel does a memblock_reserve() on this memory > >(see efi_memblock_x86_reserve_range()). > > 3. This memory map is checked for invalid entries in efi_clean_memmap(). If > > any > >invalid entries are found, they are omitted from EFI memory map but the > >memory occupied by these invalid EFI memory descriptors isn't freed. > > 3. To further process this memory map (see efi_fake_memmap(), > > efi_bgrt_init() > >and efi_esrt_init()), kernel allocates memory using efi_memmap_alloc() > > and > >copies the processed memory map to newly allocated memory but it forgets > > to > >free memory occupied by old EFI memory map. > > 4. Further, in efi_map_regions() the EFI memory map is processed again to > >include only EFI memory descriptors of type Runtime Code/Data and Boot > >Code/Data. Again, memory is allocated for this new memory map through > >realloc_pages() and the old EFI memory map is not freed. > > 5. After SetVirtualAddressMap() is done, the EFI memory map is processed > > again > >to have only EFI memory descriptors of type Runtime Code/Data. Again, > > memory > >is allocated for this new memory map through efi_memmap_alloc() and the > > old > >EFI memory map is not freed. > > So it was only halfway through the changelog that I realized that by > 'free the memory occupied by old EFI memory map' you didn't mean the EFI > memory map itself (the system RAM described), but the EFI memory map > *table*, this relatively small array of descriptors that we allocate and > reallocate every now and then according to the limitations of the > (largely non-existent yet) early boot memory management system... > > Could we please use much more precise terminology when referring to these > entities, in changelogs and code alike? I.e.: > > - 'EFI memory map' is the whole map and the contents > > - 'EFI memory map table' is the table structure itself, without regard >of its contents > > - 'EFI memory map table entry/descriptor' is an entry in the table. > > This kind of clarity is what we are using on the e820 memory map side as > well: we have struct e820_table and struct e820_entry. > > Ard, would you object to standardizing the EFI code in a similar fashion > as well: efi_mem_table and efi_mem_entry? > I share your concern, but 'EFI memory map' already has an established meaning beyond Linux, as the set of memory map entries describing the address space to the OS. (for instance, the EFI boot service to retrieve it is called GetMemoryMap()) I am not sure what 'the whole EFI memory map' would mean'. It describes all of DRAM (used, unused or reserved) and on top of that, some peripheral mappings that need to be virtually remapped by the OS so that the runtime services can use them (e.g., RTC or flash). So from UEFI spec pov, your notion of 'EFI memory map' is just the entire memory map. So if we are going to rename these things wholesale (which is fine with me), I'd prefer it if we can drop the 'map' entirely. EFI memory table EFI memory table entry EFI memory table descriptor For things described by the EFI memory table, we need to be more specific anyway, since it typically describes everything, so EFI boot services area EFI runtime services area EFI reserved area EFI conventional memory area EFI MMIO area etc etc Since we're being pedantic, it also makes sense to decide now whether 'area' refers to all [discontiguous] regions or just one of them. I'd say the former, and use 'region' for the latter, i.e., an area may be made up of several regions, but only one exists of each type. > Parameters and local variables should be named unambiguously following > these concepts as well. There should be no opaque 'new'
Re: [PATCH V2 0/3] Fix EFI memory map leaks
* Sai Praneeth Prakhya wrote: > Presently, in EFI subsystem of kernel, every time kernel allocates memory for > a > new EFI memory map, it forgets to free the memory occupied by old EFI memory > map. > It does clear the mappings though (using efi_memmap_unmap()), but forgets to > free up the memory. Also, there is another minor issue, where in the newly > allocated memory isn't freed, should remap fail. > > The first issue is addressed by adding efi_memmap_free() to > efi_memmap_install() > and the second issue is addressed by pushing the remap code into > efi_memmap_alloc() and there by handling the failure condition. > > Memory allocated to EFI memory map is leaked in below functions and hence they > are modified to fix the issue. Functions that modify EFI memmap are: > 1. efi_clean_memmap(), > 2. efi_fake_memmap(), > 3. efi_arch_mem_reserve(), > 4. efi_free_boot_services(), > 5. and __efi_enter_virtual_mode() > > More detailed explanation: > -- > A typical boot flow on EFI supported x86_64 machines might look something like > below > 1. EFI memory map is passed by firmware to kernel. > 2. Kernel does a memblock_reserve() on this memory >(see efi_memblock_x86_reserve_range()). > 3. This memory map is checked for invalid entries in efi_clean_memmap(). If > any >invalid entries are found, they are omitted from EFI memory map but the >memory occupied by these invalid EFI memory descriptors isn't freed. > 3. To further process this memory map (see efi_fake_memmap(), efi_bgrt_init() >and efi_esrt_init()), kernel allocates memory using efi_memmap_alloc() and >copies the processed memory map to newly allocated memory but it forgets to >free memory occupied by old EFI memory map. > 4. Further, in efi_map_regions() the EFI memory map is processed again to >include only EFI memory descriptors of type Runtime Code/Data and Boot >Code/Data. Again, memory is allocated for this new memory map through >realloc_pages() and the old EFI memory map is not freed. > 5. After SetVirtualAddressMap() is done, the EFI memory map is processed again >to have only EFI memory descriptors of type Runtime Code/Data. Again, > memory >is allocated for this new memory map through efi_memmap_alloc() and the old >EFI memory map is not freed. So it was only halfway through the changelog that I realized that by 'free the memory occupied by old EFI memory map' you didn't mean the EFI memory map itself (the system RAM described), but the EFI memory map *table*, this relatively small array of descriptors that we allocate and reallocate every now and then according to the limitations of the (largely non-existent yet) early boot memory management system... Could we please use much more precise terminology when referring to these entities, in changelogs and code alike? I.e.: - 'EFI memory map' is the whole map and the contents - 'EFI memory map table' is the table structure itself, without regard of its contents - 'EFI memory map table entry/descriptor' is an entry in the table. This kind of clarity is what we are using on the e820 memory map side as well: we have struct e820_table and struct e820_entry. Ard, would you object to standardizing the EFI code in a similar fashion as well: efi_mem_table and efi_mem_entry? Parameters and local variables should be named unambiguously following these concepts as well. There should be no opaque 'new' variables *especially* where the primary role of the efi_memmap_insert() API call is to add a new *entry* - it should be 'new_efi_map_table' (or 'new_table' where it's unambiguous) instead, and new entries added should be 'new_entry' - etc. Once this is reorganized and clarified it should be a lot more easy to review 'table freeing' patches. I can help out with patches if there's no conceptual objections. Thanks, Ingo
[PATCH V2 0/3] Fix EFI memory map leaks
Presently, in EFI subsystem of kernel, every time kernel allocates memory for a new EFI memory map, it forgets to free the memory occupied by old EFI memory map. It does clear the mappings though (using efi_memmap_unmap()), but forgets to free up the memory. Also, there is another minor issue, where in the newly allocated memory isn't freed, should remap fail. The first issue is addressed by adding efi_memmap_free() to efi_memmap_install() and the second issue is addressed by pushing the remap code into efi_memmap_alloc() and there by handling the failure condition. Memory allocated to EFI memory map is leaked in below functions and hence they are modified to fix the issue. Functions that modify EFI memmap are: 1. efi_clean_memmap(), 2. efi_fake_memmap(), 3. efi_arch_mem_reserve(), 4. efi_free_boot_services(), 5. and __efi_enter_virtual_mode() More detailed explanation: -- A typical boot flow on EFI supported x86_64 machines might look something like below 1. EFI memory map is passed by firmware to kernel. 2. Kernel does a memblock_reserve() on this memory (see efi_memblock_x86_reserve_range()). 3. This memory map is checked for invalid entries in efi_clean_memmap(). If any invalid entries are found, they are omitted from EFI memory map but the memory occupied by these invalid EFI memory descriptors isn't freed. 3. To further process this memory map (see efi_fake_memmap(), efi_bgrt_init() and efi_esrt_init()), kernel allocates memory using efi_memmap_alloc() and copies the processed memory map to newly allocated memory but it forgets to free memory occupied by old EFI memory map. 4. Further, in efi_map_regions() the EFI memory map is processed again to include only EFI memory descriptors of type Runtime Code/Data and Boot Code/Data. Again, memory is allocated for this new memory map through realloc_pages() and the old EFI memory map is not freed. 5. After SetVirtualAddressMap() is done, the EFI memory map is processed again to have only EFI memory descriptors of type Runtime Code/Data. Again, memory is allocated for this new memory map through efi_memmap_alloc() and the old EFI memory map is not freed. Testing: Tested with LUV on qemu-x86_64 and on my dev machine. Checked for unchanged boot behavior i.e. shouldn't break any existing stuff. Built for arm, arm64 and ia64 and found no new warnings/errors. Would appreciate the effort if someone could test on arm machines. Although majority of the changes are made to drivers/firmware/efi/memmap.c file (which is common across architectures), this bug is only limited to x86_64 machines and hence this patch set shouldn't effect any other architectures. Notes: -- 1. This patch set is based on EFI tree's "next" branch [1]. 2. This patch set is an outcome of the discussion at [2]. Changes from V1: 1. Drop passing around allocation type from efi_memmap_alloc(), instead change efi_memmap_alloc() such that it now returns a populated struct efi_memory_map 2. Drop fixing issues in efi_fake_memmap(), will be addressed in a separate patch. 3. Optimize efi_map_regions(). [1] git git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi.git [2] https://lkml.org/lkml/2018/7/2/1095 Sai Praneeth Prakhya (3): efi: Introduce efi_memmap_free() and efi_memmap_unmap_and_free() x86/efi: Fix EFI memory map leaks x86/efi: Use efi_memmap_() to create runtime EFI memory map arch/x86/include/asm/efi.h | 3 +- arch/x86/kernel/setup.c | 6 + arch/x86/platform/efi/efi.c | 141 +--- arch/x86/platform/efi/efi_32.c | 2 +- arch/x86/platform/efi/efi_64.c | 7 +- arch/x86/platform/efi/quirks.c | 45 ++-- drivers/firmware/efi/arm-init.c | 2 +- drivers/firmware/efi/fake_mem.c | 21 +--- drivers/firmware/efi/memmap.c | 190 +--- include/linux/efi.h | 8 +- 10 files changed, 235 insertions(+), 190 deletions(-) Signed-off-by: Sai Praneeth Prakhya Suggested-by: Ard Biesheuvel Cc: Ingo Molnar Cc: Borislav Petkov Cc: Bhupesh Sharma Cc: Ard Biesheuvel -- 2.19.1