Re: [PATCH v3 2/2] x86/kdump: Reserve extra memory when SME or SEV is active
* Kairui Song wrote:

> Since commit c7753208a94c ("x86, swiotlb: Add memory encryption support"),
> SWIOTLB will be enabled even if there is less than 4G of memory when SME
> is active, to support DMA of devices that do not support addresses with
> the encryption bit.
>
> And commit aba2d9a6385a ("iommu/amd: Do not disable SWIOTLB if SME is
> active") made the kernel keep SWIOTLB enabled even if there is an IOMMU.
>
> Then commit d7b417fa08d1 ("x86/mm: Add DMA support for SEV memory
> encryption") forced SWIOTLB to be enabled when SEV is active in all
> cases.
>
> Now, when either SME or SEV is active, SWIOTLB is force-enabled, and this
> is also true for the kdump kernel. As a result the kdump kernel easily
> runs out of its already scarce pre-reserved memory.
>
> So when SME/SEV is active, reserve extra memory for SWIOTLB to ensure the
> kdump kernel has enough memory, except when "crashkernel=size[KMG],high"
> is specified or any offset is used. For the high reservation case, an
> extra low memory region is always reserved and that is enough for
> SWIOTLB. If the offset format is used, the user has to be fully aware of
> any possible kdump kernel memory requirement and organize the memory
> usage carefully.
>
> Signed-off-by: Kairui Song
> ---
>  arch/x86/kernel/setup.c | 20 +---
>  1 file changed, 17 insertions(+), 3 deletions(-)
>
> diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
> index 71f20bb18cb0..ee6a2f1e2226 100644
> --- a/arch/x86/kernel/setup.c
> +++ b/arch/x86/kernel/setup.c
> @@ -530,7 +530,7 @@ static int __init crashkernel_find_region(unsigned long long *crash_base,
>  					  unsigned long long *crash_size,
>  					  bool high)
>  {
> -	unsigned long long base, size;
> +	unsigned long long base, size, mem_enc_req = 0;
>
>  	base = *crash_base;
>  	size = *crash_size;
> @@ -561,11 +561,25 @@ static int __init crashkernel_find_region(unsigned long long *crash_base,
>  	if (high)
>  		goto high_reserve;
>
> +	/*
> +	 * When SME/SEV is active and not using high reserve,
> +	 * it will always require an extra SWIOTLB region.
> +	 */
> +	if (mem_encrypt_active())
> +		mem_enc_req = ALIGN(swiotlb_size_or_default(), SZ_1M);
> +
>  	base = memblock_find_in_range(CRASH_ALIGN,
> -				      CRASH_ADDR_LOW_MAX, size,
> +				      CRASH_ADDR_LOW_MAX,
> +				      size + mem_enc_req,
>  				      CRASH_ALIGN);

What sizes are we talking about here?

 - What is the possible size range of swiotlb_size_or_default()?
 - What is the size of CRASH_ADDR_LOW_MAX (the old limit)?
 - Why do we replace one fixed limit with another fixed limit instead of
   accurately sizing the area, with each required feature adding its own
   requirement to the reservation size?

I.e. please engineer this into a proper solution instead of just modifying
it around the edges. For example, have you considered adding some sort of
kdump_memory_reserve(size) facility, which increases the reservation size
as something like SWIOTLB gets activated? That would avoid the ugly
mem_encrypt_active() flag; it would just automagically work.

Thanks,

	Ingo
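Ingo's suggested kdump_memory_reserve(size) facility could look roughly like the sketch below: a user-space model in which each feature registers its own extra requirement and the crashkernel code only reads the accumulated total. All names here (kdump_memory_reserve, kdump_total_reservation, align_up) are hypothetical illustrations, not existing kernel APIs.

```c
/*
 * Minimal user-space model of the kdump_memory_reserve(size) idea:
 * each feature that needs crash-kernel memory (SWIOTLB, etc.)
 * registers its own requirement, and the final reservation is the
 * accumulated total.  All names are hypothetical; this is not the
 * in-kernel implementation.
 */
#include <stddef.h>

#define SZ_1M (1UL << 20)

/* Round 'size' up to a multiple of 'align' (align must be a power of 2). */
unsigned long align_up(unsigned long size, unsigned long align)
{
	return (size + align - 1) & ~(align - 1);
}

static unsigned long kdump_extra_mem;	/* accumulated extra requirement */

/* Called by a feature (e.g. SWIOTLB setup) once it knows its needs. */
void kdump_memory_reserve(unsigned long size)
{
	kdump_extra_mem += align_up(size, SZ_1M);
}

/* Used by the crashkernel code when sizing the final reservation. */
unsigned long kdump_total_reservation(unsigned long base_size)
{
	return base_size + kdump_extra_mem;
}
```

With this shape, the crashkernel code never has to test mem_encrypt_active() itself; if SWIOTLB (default 64MB) registers its need, a 256MB crashkernel request simply grows to 320MB.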
Re: [PATCH 1/2] vmcore-dmesg/vmcore-dmesg.c: Fix shifting error reported by cppcheck
On Tue, Sep 10, 2019 at 7:25 PM lijiang wrote:
>
> On 2019-09-10 18:21, Bhupesh Sharma wrote:
> > Running 'cppcheck' static code analyzer (see cppcheck(1))
> > on 'vmcore-dmesg/vmcore-dmesg.c' shows the following
> > shifting error:
> >
> > $ cppcheck --enable=all vmcore-dmesg/vmcore-dmesg.c
> > Checking vmcore-dmesg/vmcore-dmesg.c ...
> > [vmcore-dmesg/vmcore-dmesg.c:17]: (error) Shifting signed 32-bit value by
> > 31 bits is undefined behaviour
> >
> > Fix the same via this patch.
> >
> > Cc: Lianbo Jiang
> > Cc: Simon Horman
> > Signed-off-by: Bhupesh Sharma
> > ---
> >  vmcore-dmesg/vmcore-dmesg.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/vmcore-dmesg/vmcore-dmesg.c b/vmcore-dmesg/vmcore-dmesg.c
> > index 81c2a58..122e536 100644
> > --- a/vmcore-dmesg/vmcore-dmesg.c
> > +++ b/vmcore-dmesg/vmcore-dmesg.c
> > @@ -6,7 +6,7 @@ typedef Elf32_Nhdr Elf_Nhdr;
> >  extern const char *fname;
> >
> >  /* stole this macro from kernel printk.c */
> > -#define LOG_BUF_LEN_MAX (uint32_t)(1 << 31)
> > +#define LOG_BUF_LEN_MAX (uint32_t)(1U << 31)
> >
>
> This looks better. Thank you, Bhupesh.

Thanks for your review, Lianbo.

Regards,
Bhupesh

___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec
[PATCH v3 0/2] x86/kdump: Reserve extra memory when SME or SEV is active
This series lets the kernel reserve extra memory for kdump when SME or SEV is active.

When SME or SEV is active, SWIOTLB will always be force-enabled, and this is also true for the kdump kernel. As a result the kdump kernel easily runs out of its already scarce pre-reserved memory.

So when SME/SEV is active, reserve extra memory for SWIOTLB to ensure the kdump kernel has enough memory, except when "crashkernel=size[KMG],high" is specified or any offset is used. With a high reservation an extra low memory region is always reserved and that is enough for SWIOTLB. With the offset format, the user has to be fully aware of any possible kdump kernel memory requirement and organize the memory usage carefully.

Patch 1/2 simply splits some code out of reserve_crashkernel, preparing for the change in the next patch.

Patch 2/2 lets crashkernel reserve extra memory when SME or SEV is active, and explains more details and history about why this change is introduced.

Update from V2:
- Refactor and split some functions out of reserve_crashkernel to make it cleaner, as suggested by Borislav Petkov
- Split into 2 patches

Update from V1:
- Use mem_encrypt_active() instead of "sme_active() || sev_active()"
- Don't reserve extra memory when ",high" or "@offset" is used, and don't print a redundant message.
- Fix coding style problem

Kairui Song (2):
  x86/kdump: Split some code out of reserve_crashkernel
  x86/kdump: Reserve extra memory when SME or SEV is active

 arch/x86/kernel/setup.c | 106
 1 file changed, 74 insertions(+), 32 deletions(-)

-- 
2.21.0
[PATCH v3 2/2] x86/kdump: Reserve extra memory when SME or SEV is active
Since commit c7753208a94c ("x86, swiotlb: Add memory encryption support"),
SWIOTLB will be enabled even if there is less than 4G of memory when SME is
active, to support DMA of devices that do not support addresses with the
encryption bit.

And commit aba2d9a6385a ("iommu/amd: Do not disable SWIOTLB if SME is
active") made the kernel keep SWIOTLB enabled even if there is an IOMMU.

Then commit d7b417fa08d1 ("x86/mm: Add DMA support for SEV memory
encryption") forced SWIOTLB to be enabled when SEV is active in all cases.

Now, when either SME or SEV is active, SWIOTLB is force-enabled, and this
is also true for the kdump kernel. As a result the kdump kernel easily runs
out of its already scarce pre-reserved memory.

So when SME/SEV is active, reserve extra memory for SWIOTLB to ensure the
kdump kernel has enough memory, except when "crashkernel=size[KMG],high" is
specified or any offset is used. For the high reservation case, an extra
low memory region is always reserved and that is enough for SWIOTLB. If the
offset format is used, the user has to be fully aware of any possible kdump
kernel memory requirement and organize the memory usage carefully.

Signed-off-by: Kairui Song
---
 arch/x86/kernel/setup.c | 20 +---
 1 file changed, 17 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index 71f20bb18cb0..ee6a2f1e2226 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -530,7 +530,7 @@ static int __init crashkernel_find_region(unsigned long long *crash_base,
 					  unsigned long long *crash_size,
 					  bool high)
 {
-	unsigned long long base, size;
+	unsigned long long base, size, mem_enc_req = 0;
 
 	base = *crash_base;
 	size = *crash_size;
@@ -561,11 +561,25 @@ static int __init crashkernel_find_region(unsigned long long *crash_base,
 	if (high)
 		goto high_reserve;
 
+	/*
+	 * When SME/SEV is active and not using high reserve,
+	 * it will always require an extra SWIOTLB region.
+	 */
+	if (mem_encrypt_active())
+		mem_enc_req = ALIGN(swiotlb_size_or_default(), SZ_1M);
+
 	base = memblock_find_in_range(CRASH_ALIGN,
-				      CRASH_ADDR_LOW_MAX, size,
+				      CRASH_ADDR_LOW_MAX,
+				      size + mem_enc_req,
 				      CRASH_ALIGN);
-	if (base)
+	if (base) {
+		if (mem_enc_req) {
+			pr_info("Memory encryption is active, crashkernel needs %ldMB extra memory\n",
+				(unsigned long)(mem_enc_req >> 20));
+			size += mem_enc_req;
+		}
 		goto found;
+	}
 
 high_reserve:
 	/* Try high reserve */
-- 
2.21.0
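The extra reservation in this patch is just the SWIOTLB size rounded up to a 1MB boundary. The arithmetic can be modeled in user space as below; the 64MB figure is the usual SWIOTLB default on a typical config (an assumption here, not taken from a running kernel), and ALIGN() is open-coded the way the kernel macro behaves.

```c
#include <stdint.h>

#define SZ_1M (1ULL << 20)

/* Kernel-style ALIGN(): round x up to a power-of-2 boundary a. */
uint64_t align(uint64_t x, uint64_t a)
{
	return (x + a - 1) & ~(a - 1);
}

/*
 * Model of the patch's sizing: with memory encryption active, the
 * region searched for becomes size + mem_enc_req.  64MB stands in
 * for swiotlb_size_or_default() on a typical configuration.
 */
uint64_t crashkernel_search_size(uint64_t size, int mem_encrypt_active)
{
	uint64_t mem_enc_req = 0;

	if (mem_encrypt_active)
		mem_enc_req = align(64 * SZ_1M, SZ_1M);

	return size + mem_enc_req;
}
```

So a crashkernel=256M request is searched as a 320MB region when SME/SEV is on, and is unchanged otherwise.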
[PATCH v3 1/2] x86/kdump: Split some code out of reserve_crashkernel
Split the code that finds a suitable region for kdump out of
reserve_crashkernel(). Clean up and refactor it to prepare for further
changes; no functional change.

Signed-off-by: Kairui Song
---
 arch/x86/kernel/setup.c | 92 +++--
 1 file changed, 60 insertions(+), 32 deletions(-)

diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index bbe35bf879f5..71f20bb18cb0 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -526,6 +526,63 @@ static int __init reserve_crashkernel_low(void)
 	return 0;
 }
 
+static int __init crashkernel_find_region(unsigned long long *crash_base,
+					  unsigned long long *crash_size,
+					  bool high)
+{
+	unsigned long long base, size;
+
+	base = *crash_base;
+	size = *crash_size;
+
+	/*
+	 * base == 0 means: find the address automatically, else just
+	 * verify the region is useable
+	 */
+	if (base) {
+		unsigned long long start;
+
+		start = memblock_find_in_range(base, base + size,
+					       size, 1 << 20);
+		if (start != base) {
+			pr_info("crashkernel reservation failed - memory is in use.\n");
+			return -1;
+		}
+		return 0;
+	}
+
+	/*
+	 * crashkernel=x,high reserves memory over 4G, also allocates
+	 * 256M extra low memory for DMA buffers and swiotlb.
+	 * But the extra memory is not required for all machines.
+	 * So try low memory first and fall back to high memory
+	 * unless "crashkernel=size[KMG],high" is specified.
+	 */
+	if (high)
+		goto high_reserve;
+
+	base = memblock_find_in_range(CRASH_ALIGN,
+				      CRASH_ADDR_LOW_MAX, size,
+				      CRASH_ALIGN);
+	if (base)
+		goto found;
+
+high_reserve:
+	/* Try high reserve */
+	base = memblock_find_in_range(CRASH_ALIGN,
+				      CRASH_ADDR_HIGH_MAX, size,
+				      CRASH_ALIGN);
+	if (base)
+		goto found;
+
+	pr_info("crashkernel reservation failed - No suitable area found.\n");
+	return -1;
+found:
+	*crash_base = base;
+	*crash_size = size;
+	return 0;
+}
+
 static void __init reserve_crashkernel(void)
 {
 	unsigned long long crash_size, crash_base, total_mem;
@@ -550,39 +607,10 @@ static void __init reserve_crashkernel(void)
 		return;
 	}
 
-	/* 0 means: find the address automatically */
-	if (!crash_base) {
-		/*
-		 * Set CRASH_ADDR_LOW_MAX upper bound for crash memory,
-		 * crashkernel=x,high reserves memory over 4G, also allocates
-		 * 256M extra low memory for DMA buffers and swiotlb.
-		 * But the extra memory is not required for all machines.
-		 * So try low memory first and fall back to high memory
-		 * unless "crashkernel=size[KMG],high" is specified.
-		 */
-		if (!high)
-			crash_base = memblock_find_in_range(CRASH_ALIGN,
-						CRASH_ADDR_LOW_MAX,
-						crash_size, CRASH_ALIGN);
-		if (!crash_base)
-			crash_base = memblock_find_in_range(CRASH_ALIGN,
-						CRASH_ADDR_HIGH_MAX,
-						crash_size, CRASH_ALIGN);
-		if (!crash_base) {
-			pr_info("crashkernel reservation failed - No suitable area found.\n");
-			return;
-		}
-	} else {
-		unsigned long long start;
+	ret = crashkernel_find_region(&crash_base, &crash_size, high);
+	if (ret)
+		return;
 
-		start = memblock_find_in_range(crash_base,
-					       crash_base + crash_size,
-					       crash_size, 1 << 20);
-		if (start != crash_base) {
-			pr_info("crashkernel reservation failed - memory is in use.\n");
-			return;
-		}
-	}
 	ret = memblock_reserve(crash_base, crash_size);
 	if (ret) {
 		pr_err("%s: Error reserving crashkernel memblock.\n", __func__);
-- 
2.21.0
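The low-first/high-fallback search that crashkernel_find_region() implements can be modeled in user space with a stubbed-out finder. Everything below is illustrative: find_in_range_stub() is a fake memblock_find_in_range(), and the limits and return addresses are invented, not kernel values.

```c
#include <stdint.h>

/* Illustrative stand-ins for the real limits (not kernel values). */
#define ADDR_LOW_MAX  (4ULL << 30)	/* below 4G */
#define ADDR_HIGH_MAX (64ULL << 40)

/*
 * Stub for memblock_find_in_range(): pretend allocations up to
 * 'low_free' bytes succeed below 4G, anything larger only above it.
 * Returns a fake nonzero base on success, 0 on failure.
 */
static uint64_t find_in_range_stub(uint64_t limit, uint64_t size,
				   uint64_t low_free)
{
	if (limit <= ADDR_LOW_MAX)
		return size <= low_free ? 1ULL << 24 : 0;
	return 5ULL << 30;	/* high memory always has room here */
}

/*
 * Model of crashkernel_find_region()'s fallback: try low memory
 * first unless ",high" was requested, then fall back to high.
 */
uint64_t find_region(uint64_t size, int high, uint64_t low_free)
{
	uint64_t base = 0;

	if (!high)
		base = find_in_range_stub(ADDR_LOW_MAX, size, low_free);
	if (!base)
		base = find_in_range_stub(ADDR_HIGH_MAX, size, low_free);
	return base;
}
```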
Re: [PATCH 1/2] vmcore-dmesg/vmcore-dmesg.c: Fix shifting error reported by cppcheck
On 2019-09-10 18:21, Bhupesh Sharma wrote:
> Running 'cppcheck' static code analyzer (see cppcheck(1))
> on 'vmcore-dmesg/vmcore-dmesg.c' shows the following
> shifting error:
>
> $ cppcheck --enable=all vmcore-dmesg/vmcore-dmesg.c
> Checking vmcore-dmesg/vmcore-dmesg.c ...
> [vmcore-dmesg/vmcore-dmesg.c:17]: (error) Shifting signed 32-bit value by 31
> bits is undefined behaviour
>
> Fix the same via this patch.
>
> Cc: Lianbo Jiang
> Cc: Simon Horman
> Signed-off-by: Bhupesh Sharma
> ---
>  vmcore-dmesg/vmcore-dmesg.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/vmcore-dmesg/vmcore-dmesg.c b/vmcore-dmesg/vmcore-dmesg.c
> index 81c2a58..122e536 100644
> --- a/vmcore-dmesg/vmcore-dmesg.c
> +++ b/vmcore-dmesg/vmcore-dmesg.c
> @@ -6,7 +6,7 @@ typedef Elf32_Nhdr Elf_Nhdr;
>  extern const char *fname;
>
>  /* stole this macro from kernel printk.c */
> -#define LOG_BUF_LEN_MAX (uint32_t)(1 << 31)
> +#define LOG_BUF_LEN_MAX (uint32_t)(1U << 31)
>

This looks better. Thank you, Bhupesh.

>  static void write_to_stdout(char *buf, unsigned int nr)
>  {
>
[PATCH 2/2] i386/kexec-mb2-x86.c: Fix compilation warning
This patch fixes the following compilation warning in 'i386/kexec-mb2-x86.c' regarding the variable 'result' which is set but not used:

kexec/arch/i386/kexec-mb2-x86.c:402:6: warning: variable ‘result’ set but not used [-Wunused-but-set-variable]
  int result;
      ^~

Cc: Simon Horman
Signed-off-by: Bhupesh Sharma
---
 kexec/arch/i386/kexec-mb2-x86.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/kexec/arch/i386/kexec-mb2-x86.c b/kexec/arch/i386/kexec-mb2-x86.c
index 7eaab0c..b839d59 100644
--- a/kexec/arch/i386/kexec-mb2-x86.c
+++ b/kexec/arch/i386/kexec-mb2-x86.c
@@ -399,7 +399,6 @@ int multiboot2_x86_load(int argc, char **argv, const char *buf, off_t len,
 	char *command_line = NULL, *tmp_cmdline = NULL;
 	int command_line_len;
 	char *imagename, *cp, *append = NULL;;
-	int result;
 	int opt;
 	int modules, mod_command_line_space;
 	uint64_t mbi_ptr;
@@ -429,7 +428,6 @@ int multiboot2_x86_load(int argc, char **argv, const char *buf, off_t len,
 	command_line_len = 0;
 	modules = 0;
 	mod_command_line_space = 0;
-	result = 0;
 
 	while((opt = getopt_long(argc, argv, short_options, options, 0)) != -1) {
 		switch(opt) {
-- 
2.17.1
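The warning class being fixed can be illustrated in isolation: a variable that is assigned but whose value is never read triggers -Wunused-but-set-variable, and the fix is simply to delete it. parse_count() below is a hypothetical stand-in, not code from kexec-tools.

```c
#include <stddef.h>

/* Before (triggers -Wunused-but-set-variable):
 *	int parse_count(const char **argv)
 *	{
 *		int result;		// set but never read
 *		int count = 0;
 *
 *		result = 0;
 *		while (argv[count])
 *			count++;
 *		return count;
 *	}
 */

/* After: the dead 'result' is gone; behavior is unchanged. */
int parse_count(const char **argv)
{
	int count = 0;

	while (argv[count])
		count++;
	return count;
}
```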
[PATCH 0/2] Fix a compilation warning and a static check error
This patchset fixes two kexec-tools issues:

- Fixes a shifting error reported by cppcheck inside 'vmcore-dmesg/vmcore-dmesg.c'.
- Fixes a compilation warning in 'i386/kexec-mb2-x86.c'.

Bhupesh Sharma (2):
  vmcore-dmesg/vmcore-dmesg.c: Fix shifting error reported by cppcheck
  i386/kexec-mb2-x86.c: Fix compilation warning

 kexec/arch/i386/kexec-mb2-x86.c | 2 --
 vmcore-dmesg/vmcore-dmesg.c     | 2 +-
 2 files changed, 1 insertion(+), 3 deletions(-)

-- 
2.17.1
[PATCH 1/2] vmcore-dmesg/vmcore-dmesg.c: Fix shifting error reported by cppcheck
Running 'cppcheck' static code analyzer (see cppcheck(1)) on 'vmcore-dmesg/vmcore-dmesg.c' shows the following shifting error:

$ cppcheck --enable=all vmcore-dmesg/vmcore-dmesg.c
Checking vmcore-dmesg/vmcore-dmesg.c ...
[vmcore-dmesg/vmcore-dmesg.c:17]: (error) Shifting signed 32-bit value by 31 bits is undefined behaviour

Fix the same via this patch.

Cc: Lianbo Jiang
Cc: Simon Horman
Signed-off-by: Bhupesh Sharma
---
 vmcore-dmesg/vmcore-dmesg.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/vmcore-dmesg/vmcore-dmesg.c b/vmcore-dmesg/vmcore-dmesg.c
index 81c2a58..122e536 100644
--- a/vmcore-dmesg/vmcore-dmesg.c
+++ b/vmcore-dmesg/vmcore-dmesg.c
@@ -6,7 +6,7 @@ typedef Elf32_Nhdr Elf_Nhdr;
 extern const char *fname;
 
 /* stole this macro from kernel printk.c */
-#define LOG_BUF_LEN_MAX (uint32_t)(1 << 31)
+#define LOG_BUF_LEN_MAX (uint32_t)(1U << 31)
 
 static void write_to_stdout(char *buf, unsigned int nr)
 {
-- 
2.17.1
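The reason this one-character fix matters: with a 32-bit int, `1 << 31` shifts a set bit into the sign position, which is undefined behaviour in C; making the literal unsigned makes the shift well defined. The standalone illustration below uses the corrected macro; clamp_log_len() is an invented consumer, not code from vmcore-dmesg.c.

```c
#include <stdint.h>

/* Well defined: 1U is unsigned, so the result is exactly 2^31. */
#define LOG_BUF_LEN_MAX ((uint32_t)(1U << 31))

/*
 * With the old signed form, 1 << 31 was undefined behaviour on
 * platforms with 32-bit int, so a compiler was free to do anything
 * here.  With 1U << 31 the value is a portable 0x80000000.
 */
uint32_t clamp_log_len(uint32_t requested)
{
	return requested > LOG_BUF_LEN_MAX ? LOG_BUF_LEN_MAX : requested;
}
```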
Re: [PATCH v4 10/17] arm64: trans_pgd: make trans_pgd_map_page generic
> > +/*
> > + * Add map entry to trans_pgd for a base-size page at PTE level.
> > + * page: page to be mapped.
> > + * dst_addr: new VA address for the pages
> > + * pgprot: protection for the page.
>
> For consistency please describe all function parameters. From my experience
> function parameter description is normally done in the C-file that implements
> the logic. Don't ask me why.

Ok, I will move the comment, and will describe every parameter. Thank you.

> > + */
> > +int trans_pgd_map_page(struct trans_pgd_info *info, pgd_t *trans_pgd,
> > +		       void *page, unsigned long dst_addr, pgprot_t pgprot);
> >
> >  #endif /* _ASM_TRANS_TABLE_H */
> > diff --git a/arch/arm64/kernel/hibernate.c b/arch/arm64/kernel/hibernate.c
> > index 94ede33bd777..9b75b680ab70 100644
> > --- a/arch/arm64/kernel/hibernate.c
> > +++ b/arch/arm64/kernel/hibernate.c
> > @@ -179,6 +179,12 @@ int arch_hibernation_header_restore(void *addr)
> >  }
> >  EXPORT_SYMBOL(arch_hibernation_header_restore);
> >
> > +static void *
> > +hibernate_page_alloc(void *arg)
>
> AFAICS no new line needed here.

Right, will fix it.

> > +{
> > +	return (void *)get_safe_page((gfp_t)(unsigned long)arg);
> > +}
> > +
> >  /*
> >   * Copies length bytes, starting at src_start into an new page,
> >   * perform cache maintenance, then maps it at the specified address low
> > @@ -195,6 +201,10 @@ static int create_safe_exec_page(void *src_start, size_t length,
> >  				 unsigned long dst_addr,
> >  				 phys_addr_t *phys_dst_addr)
> >  {
> > +	struct trans_pgd_info trans_info = {
> > +		.trans_alloc_page	= hibernate_page_alloc,
> > +		.trans_alloc_arg	= (void *)GFP_ATOMIC,
> > +	};
>
> New line between end of struct and other variables.

Sure.

> With these changes:
> Reviewed-by: Matthias Brugger

Thank you,
Pasha
Re: [PATCH v4 05/17] arm64: hibernate: remove gotos in create_safe_exec_page
> On 09/09/2019 20:12, Pavel Tatashin wrote:
> > Usually, gotos are used to handle cleanup after exception, but
> > in case of create_safe_exec_page there are no clean-ups. So,
> > simply return the errors directly.
>
> While at it, how about also cleaning up swsusp_arch_resume() which has the
> same issue.

Thank you for the suggestion. I will do the cleanups in swsusp_arch_resume() as well.

Pasha

> Regards,
> Matthias
>
> > Signed-off-by: Pavel Tatashin
> > Reviewed-by: James Morse
> > ---
> >  arch/arm64/kernel/hibernate.c | 34 +++---
> >  1 file changed, 11 insertions(+), 23 deletions(-)
> >
> > diff --git a/arch/arm64/kernel/hibernate.c b/arch/arm64/kernel/hibernate.c
> > index 47a861e0cb0c..7bbeb33c700d 100644
> > --- a/arch/arm64/kernel/hibernate.c
> > +++ b/arch/arm64/kernel/hibernate.c
> > @@ -198,7 +198,6 @@ static int create_safe_exec_page(void *src_start, size_t length,
> >  				 unsigned long dst_addr,
> >  				 phys_addr_t *phys_dst_addr)
> >  {
> > -	int rc = 0;
> >  	pgd_t *trans_pgd;
> >  	pgd_t *pgdp;
> >  	pud_t *pudp;
> > @@ -206,47 +205,37 @@ static int create_safe_exec_page(void *src_start, size_t length,
> >  	pte_t *ptep;
> >  	unsigned long dst = get_safe_page(GFP_ATOMIC);
> >
> > -	if (!dst) {
> > -		rc = -ENOMEM;
> > -		goto out;
> > -	}
> > +	if (!dst)
> > +		return -ENOMEM;
> >
> >  	memcpy((void *)dst, src_start, length);
> >  	__flush_icache_range(dst, dst + length);
> >
> >  	trans_pgd = (void *)get_safe_page(GFP_ATOMIC);
> > -	if (!trans_pgd) {
> > -		rc = -ENOMEM;
> > -		goto out;
> > -	}
> > +	if (!trans_pgd)
> > +		return -ENOMEM;
> >
> >  	pgdp = pgd_offset_raw(trans_pgd, dst_addr);
> >  	if (pgd_none(READ_ONCE(*pgdp))) {
> >  		pudp = (void *)get_safe_page(GFP_ATOMIC);
> > -		if (!pudp) {
> > -			rc = -ENOMEM;
> > -			goto out;
> > -		}
> > +		if (!pudp)
> > +			return -ENOMEM;
> >  		pgd_populate(&init_mm, pgdp, pudp);
> >  	}
> >
> >  	pudp = pud_offset(pgdp, dst_addr);
> >  	if (pud_none(READ_ONCE(*pudp))) {
> >  		pmdp = (void *)get_safe_page(GFP_ATOMIC);
> > -		if (!pmdp) {
> > -			rc = -ENOMEM;
> > -			goto out;
> > -		}
> > +		if (!pmdp)
> > +			return -ENOMEM;
> >  		pud_populate(&init_mm, pudp, pmdp);
> >  	}
> >
> >  	pmdp = pmd_offset(pudp, dst_addr);
> >  	if (pmd_none(READ_ONCE(*pmdp))) {
> >  		ptep = (void *)get_safe_page(GFP_ATOMIC);
> > -		if (!ptep) {
> > -			rc = -ENOMEM;
> > -			goto out;
> > -		}
> > +		if (!ptep)
> > +			return -ENOMEM;
> >  		pmd_populate_kernel(&init_mm, pmdp, ptep);
> >  	}
> >
> > @@ -272,8 +261,7 @@ static int create_safe_exec_page(void *src_start, size_t length,
> >
> >  	*phys_dst_addr = virt_to_phys((void *)dst);
> >
> > -out:
> > -	return rc;
> > +	return 0;
> >  }
> >
> >  #define dcache_clean_range(start, end)	__flush_dcache_area(start, (end - start))
> >
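The cleanup pattern this patch applies can be shown in miniature: when a function has no teardown to run on failure, the `rc = -ENOMEM; goto out;` indirection can become a direct return. Names below are hypothetical, and calloc() stands in for get_safe_page() (which in hibernate returns pages that are reclaimed wholesale later, so a failing caller has nothing to undo).

```c
#include <stdlib.h>
#include <string.h>

#define ENOMEM 12

/*
 * alloc_page() models get_safe_page(): in hibernate these pages are
 * reclaimed elsewhere, so a failed caller has no cleanup to perform --
 * which is exactly why the gotos could become direct returns.
 */
static void *alloc_page(void)
{
	return calloc(1, 4096);
}

int map_one_page(void **table_out)
{
	void *pgd, *pud;

	pgd = alloc_page();
	if (!pgd)
		return -ENOMEM;	/* was: rc = -ENOMEM; goto out; */

	pud = alloc_page();
	if (!pud)
		return -ENOMEM;

	memcpy(pgd, &pud, sizeof(pud));	/* "populate" the top level */
	*table_out = pgd;
	return 0;
}
```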
Re: [PATCH v4 04/17] arm64: hibernate: use get_safe_page directly
> On 09/09/2019 20:12, Pavel Tatashin wrote:
> > create_safe_exec_page() uses hibernate's allocator to create a set of page
> > tables to map a single page that will contain the relocation code.
> >
> > Remove the allocator related arguments, and use get_safe_page directly, as
> > it is done in other local functions in this file to simplify the function
> > prototype.
> >
> > Removing this function pointer makes it easier to refactor the code later.
> >
> > Signed-off-by: Pavel Tatashin
>
> Reviewed-by: Matthias Brugger
>

Thank you

___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec
Re: [PATCH v4 10/17] arm64: trans_pgd: make trans_pgd_map_page generic
Bikeshedding alarm, please see below.

On 09/09/2019 20:12, Pavel Tatashin wrote:
> kexec is going to use a different allocator, so make
> trans_pgd_map_page to accept allocator as an argument, and also
> kexec is going to use a different map protection, so also pass
> it via argument.
>
> Signed-off-by: Pavel Tatashin
> ---
>  arch/arm64/include/asm/trans_pgd.h | 24 ++--
>  arch/arm64/kernel/hibernate.c      | 12 +++-
>  arch/arm64/mm/trans_pgd.c          | 17 +++--
>  3 files changed, 44 insertions(+), 9 deletions(-)
>
> diff --git a/arch/arm64/include/asm/trans_pgd.h b/arch/arm64/include/asm/trans_pgd.h
> index c7b5402b7d87..53f67ec84cdc 100644
> --- a/arch/arm64/include/asm/trans_pgd.h
> +++ b/arch/arm64/include/asm/trans_pgd.h
> @@ -11,10 +11,30 @@
>  #include
>  #include
>
> +/*
> + * trans_alloc_page
> + *	- Allocator that should return exactly one zeroed page, if this
> + *	  allocator fails, trans_pgd returns -ENOMEM error.
> + *
> + * trans_alloc_arg
> + *	- Passed to trans_alloc_page as an argument
> + */
> +
> +struct trans_pgd_info {
> +	void * (*trans_alloc_page)(void *arg);
> +	void *trans_alloc_arg;
> +};
> +
>  int trans_pgd_create_copy(pgd_t **dst_pgdp, unsigned long start,
>  			  unsigned long end);
>
> -int trans_pgd_map_page(pgd_t *trans_pgd, void *page, unsigned long dst_addr,
> -		       pgprot_t pgprot);
> +/*
> + * Add map entry to trans_pgd for a base-size page at PTE level.
> + * page: page to be mapped.
> + * dst_addr: new VA address for the pages
> + * pgprot: protection for the page.

For consistency please describe all function parameters. From my experience
function parameter description is normally done in the C-file that implements
the logic. Don't ask me why.

> + */
> +int trans_pgd_map_page(struct trans_pgd_info *info, pgd_t *trans_pgd,
> +		       void *page, unsigned long dst_addr, pgprot_t pgprot);
>
>  #endif /* _ASM_TRANS_TABLE_H */
> diff --git a/arch/arm64/kernel/hibernate.c b/arch/arm64/kernel/hibernate.c
> index 94ede33bd777..9b75b680ab70 100644
> --- a/arch/arm64/kernel/hibernate.c
> +++ b/arch/arm64/kernel/hibernate.c
> @@ -179,6 +179,12 @@ int arch_hibernation_header_restore(void *addr)
>  }
>  EXPORT_SYMBOL(arch_hibernation_header_restore);
>
> +static void *
> +hibernate_page_alloc(void *arg)

AFAICS no new line needed here.

> +{
> +	return (void *)get_safe_page((gfp_t)(unsigned long)arg);
> +}
> +
>  /*
>   * Copies length bytes, starting at src_start into an new page,
>   * perform cache maintenance, then maps it at the specified address low
> @@ -195,6 +201,10 @@ static int create_safe_exec_page(void *src_start, size_t length,
>  				 unsigned long dst_addr,
>  				 phys_addr_t *phys_dst_addr)
>  {
> +	struct trans_pgd_info trans_info = {
> +		.trans_alloc_page	= hibernate_page_alloc,
> +		.trans_alloc_arg	= (void *)GFP_ATOMIC,
> +	};

New line between end of struct and other variables.

With these changes:
Reviewed-by: Matthias Brugger

>  	void *page = (void *)get_safe_page(GFP_ATOMIC);
>  	pgd_t *trans_pgd;
>  	int rc;
> @@ -209,7 +219,7 @@ static int create_safe_exec_page(void *src_start, size_t length,
>  	if (!trans_pgd)
>  		return -ENOMEM;
>
> -	rc = trans_pgd_map_page(trans_pgd, page, dst_addr,
> +	rc = trans_pgd_map_page(&trans_info, trans_pgd, page, dst_addr,
>  				PAGE_KERNEL_EXEC);
>  	if (rc)
>  		return rc;
> diff --git a/arch/arm64/mm/trans_pgd.c b/arch/arm64/mm/trans_pgd.c
> index 5ac712b92439..7521d558a0b9 100644
> --- a/arch/arm64/mm/trans_pgd.c
> +++ b/arch/arm64/mm/trans_pgd.c
> @@ -25,6 +25,11 @@
>  #include
>  #include
>
> +static void *trans_alloc(struct trans_pgd_info *info)
> +{
> +	return info->trans_alloc_page(info->trans_alloc_arg);
> +}
> +
>  static void _copy_pte(pte_t *dst_ptep, pte_t *src_ptep, unsigned long addr)
>  {
>  	pte_t pte = READ_ONCE(*src_ptep);
> @@ -180,8 +185,8 @@ int trans_pgd_create_copy(pgd_t **dst_pgdp, unsigned long start,
>  	return rc;
>  }
>
> -int trans_pgd_map_page(pgd_t *trans_pgd, void *page, unsigned long dst_addr,
> -		       pgprot_t pgprot)
> +int trans_pgd_map_page(struct trans_pgd_info *info, pgd_t *trans_pgd,
> +		       void *page, unsigned long dst_addr, pgprot_t pgprot)
>  {
>  	pgd_t *pgdp;
>  	pud_t *pudp;
> @@ -190,7 +195,7 @@ int trans_pgd_map_page(pgd_t *trans_pgd, void *page, unsigned long dst_addr,
>  	pgdp = pgd_offset_raw(trans_pgd, dst_addr);
>  	if (pgd_none(READ_ONCE(*pgdp))) {
> -		pudp = (void *)get_safe_page(GFP_ATOMIC);
> +		pudp = trans_alloc(info);
>  		if (!pudp)
>  			return -ENOMEM;
>  		pgd_populate(&init_mm, pgdp, pudp);
> @@ -198,7 +203,7 @@
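The pluggable-allocator pattern that trans_pgd_info introduces can be exercised in user space: the table-building code sees only a callback plus an opaque argument, so hibernate and kexec can each supply their own page source. In this sketch the struct mirrors the patch, but the calloc-backed allocator and map_dummy_page() are our stand-ins, not kernel code.

```c
#include <stdlib.h>

struct trans_pgd_info {
	/* Must return exactly one zeroed page, or NULL on failure. */
	void *(*trans_alloc_page)(void *arg);
	void *trans_alloc_arg;
};

static void *trans_alloc(struct trans_pgd_info *info)
{
	return info->trans_alloc_page(info->trans_alloc_arg);
}

/* A caller-supplied allocator: here 'arg' carries the page size. */
void *calloc_page_alloc(void *arg)
{
	return calloc(1, *(size_t *)arg);
}

/* Generic code that no longer hard-codes its memory source. */
int map_dummy_page(struct trans_pgd_info *info, void **pgd_out)
{
	void *pgd = trans_alloc(info);

	if (!pgd)
		return -1;
	*pgd_out = pgd;
	return 0;
}
```

In the patch, hibernate wires get_safe_page() through this same indirection and smuggles GFP_ATOMIC in via trans_alloc_arg; here the argument is a page size instead.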
Re: [PATCH v4 05/17] arm64: hibernate: remove gotos in create_safe_exec_page
On 09/09/2019 20:12, Pavel Tatashin wrote:
> Usually, gotos are used to handle cleanup after exception, but
> in case of create_safe_exec_page there are no clean-ups. So,
> simply return the errors directly.
>

While at it, how about also cleaning up swsusp_arch_resume() which has the
same issue.

Regards,
Matthias

> Signed-off-by: Pavel Tatashin
> Reviewed-by: James Morse
> ---
>  arch/arm64/kernel/hibernate.c | 34 +++---
>  1 file changed, 11 insertions(+), 23 deletions(-)
>
> diff --git a/arch/arm64/kernel/hibernate.c b/arch/arm64/kernel/hibernate.c
> index 47a861e0cb0c..7bbeb33c700d 100644
> --- a/arch/arm64/kernel/hibernate.c
> +++ b/arch/arm64/kernel/hibernate.c
> @@ -198,7 +198,6 @@ static int create_safe_exec_page(void *src_start, size_t length,
>  				 unsigned long dst_addr,
>  				 phys_addr_t *phys_dst_addr)
>  {
> -	int rc = 0;
>  	pgd_t *trans_pgd;
>  	pgd_t *pgdp;
>  	pud_t *pudp;
> @@ -206,47 +205,37 @@ static int create_safe_exec_page(void *src_start, size_t length,
>  	pte_t *ptep;
>  	unsigned long dst = get_safe_page(GFP_ATOMIC);
>
> -	if (!dst) {
> -		rc = -ENOMEM;
> -		goto out;
> -	}
> +	if (!dst)
> +		return -ENOMEM;
>
>  	memcpy((void *)dst, src_start, length);
>  	__flush_icache_range(dst, dst + length);
>
>  	trans_pgd = (void *)get_safe_page(GFP_ATOMIC);
> -	if (!trans_pgd) {
> -		rc = -ENOMEM;
> -		goto out;
> -	}
> +	if (!trans_pgd)
> +		return -ENOMEM;
>
>  	pgdp = pgd_offset_raw(trans_pgd, dst_addr);
>  	if (pgd_none(READ_ONCE(*pgdp))) {
>  		pudp = (void *)get_safe_page(GFP_ATOMIC);
> -		if (!pudp) {
> -			rc = -ENOMEM;
> -			goto out;
> -		}
> +		if (!pudp)
> +			return -ENOMEM;
>  		pgd_populate(&init_mm, pgdp, pudp);
>  	}
>
>  	pudp = pud_offset(pgdp, dst_addr);
>  	if (pud_none(READ_ONCE(*pudp))) {
>  		pmdp = (void *)get_safe_page(GFP_ATOMIC);
> -		if (!pmdp) {
> -			rc = -ENOMEM;
> -			goto out;
> -		}
> +		if (!pmdp)
> +			return -ENOMEM;
>  		pud_populate(&init_mm, pudp, pmdp);
>  	}
>
>  	pmdp = pmd_offset(pudp, dst_addr);
>  	if (pmd_none(READ_ONCE(*pmdp))) {
>  		ptep = (void *)get_safe_page(GFP_ATOMIC);
> -		if (!ptep) {
> -			rc = -ENOMEM;
> -			goto out;
> -		}
> +		if (!ptep)
> +			return -ENOMEM;
>  		pmd_populate_kernel(&init_mm, pmdp, ptep);
>  	}
>
> @@ -272,8 +261,7 @@ static int create_safe_exec_page(void *src_start, size_t length,
>
>  	*phys_dst_addr = virt_to_phys((void *)dst);
>
> -out:
> -	return rc;
> +	return 0;
>  }
>
>  #define dcache_clean_range(start, end)	__flush_dcache_area(start, (end - start))
>
Re: [PATCH v4 04/17] arm64: hibernate: use get_safe_page directly
On 09/09/2019 20:12, Pavel Tatashin wrote:
> create_safe_exec_page() uses hibernate's allocator to create a set of page
> tables to map a single page that will contain the relocation code.
>
> Remove the allocator related arguments, and use get_safe_page directly, as
> it is done in other local functions in this file to simplify the function
> prototype.
>
> Removing this function pointer makes it easier to refactor the code later.
>
> Signed-off-by: Pavel Tatashin

Reviewed-by: Matthias Brugger

> ---
>  arch/arm64/kernel/hibernate.c | 17 +++--
>  1 file changed, 7 insertions(+), 10 deletions(-)
>
> diff --git a/arch/arm64/kernel/hibernate.c b/arch/arm64/kernel/hibernate.c
> index 227cc26720f7..47a861e0cb0c 100644
> --- a/arch/arm64/kernel/hibernate.c
> +++ b/arch/arm64/kernel/hibernate.c
> @@ -196,9 +196,7 @@ EXPORT_SYMBOL(arch_hibernation_header_restore);
>   */
>  static int create_safe_exec_page(void *src_start, size_t length,
>  				 unsigned long dst_addr,
> -				 phys_addr_t *phys_dst_addr,
> -				 void *(*allocator)(gfp_t mask),
> -				 gfp_t mask)
> +				 phys_addr_t *phys_dst_addr)
>  {
>  	int rc = 0;
>  	pgd_t *trans_pgd;
> @@ -206,7 +204,7 @@ static int create_safe_exec_page(void *src_start, size_t length,
>  	pud_t *pudp;
>  	pmd_t *pmdp;
>  	pte_t *ptep;
> -	unsigned long dst = (unsigned long)allocator(mask);
> +	unsigned long dst = get_safe_page(GFP_ATOMIC);
>
>  	if (!dst) {
>  		rc = -ENOMEM;
> @@ -216,7 +214,7 @@ static int create_safe_exec_page(void *src_start, size_t length,
>  	memcpy((void *)dst, src_start, length);
>  	__flush_icache_range(dst, dst + length);
>
> -	trans_pgd = allocator(mask);
> +	trans_pgd = (void *)get_safe_page(GFP_ATOMIC);
>  	if (!trans_pgd) {
>  		rc = -ENOMEM;
>  		goto out;
> @@ -224,7 +222,7 @@ static int create_safe_exec_page(void *src_start, size_t length,
>
>  	pgdp = pgd_offset_raw(trans_pgd, dst_addr);
>  	if (pgd_none(READ_ONCE(*pgdp))) {
> -		pudp = allocator(mask);
> +		pudp = (void *)get_safe_page(GFP_ATOMIC);
>  		if (!pudp) {
>  			rc = -ENOMEM;
>  			goto out;
> @@ -234,7 +232,7 @@ static int create_safe_exec_page(void *src_start, size_t length,
>
>  	pudp = pud_offset(pgdp, dst_addr);
>  	if (pud_none(READ_ONCE(*pudp))) {
> -		pmdp = allocator(mask);
> +		pmdp = (void *)get_safe_page(GFP_ATOMIC);
>  		if (!pmdp) {
>  			rc = -ENOMEM;
>  			goto out;
> @@ -244,7 +242,7 @@ static int create_safe_exec_page(void *src_start, size_t length,
>
>  	pmdp = pmd_offset(pudp, dst_addr);
>  	if (pmd_none(READ_ONCE(*pmdp))) {
> -		ptep = allocator(mask);
> +		ptep = (void *)get_safe_page(GFP_ATOMIC);
>  		if (!ptep) {
>  			rc = -ENOMEM;
>  			goto out;
> @@ -530,8 +528,7 @@ int swsusp_arch_resume(void)
>  	 */
>  	rc = create_safe_exec_page(__hibernate_exit_text_start, exit_size,
>  				   (unsigned long)hibernate_exit,
> -				   &phys_hibernate_exit,
> -				   (void *)get_safe_page, GFP_ATOMIC);
> +				   &phys_hibernate_exit);
>  	if (rc) {
>  		pr_err("Failed to create safe executable page for hibernate_exit code.\n");
>  		goto out;
>  	}
>