Re: [5.16.0] build error on unrecognized opcode: ptesync
On 10/01/2022 at 13:32, Mike wrote:
> Hey, so I originally sat down to compile the fast headers V2 patch, but
> quickly discovered other things at play, and grabbed 5.16.0 a few hours
> after it lifted off. arch/powerpc/mm/mmu_context.c I had to specifically
> compile with -maltivec or it barfed on a 'dssall'. I'm fine with that,
> I've spent years in kernel land, I can deal with that. Then came
> arch/powerpc/lib/step.c with the ptesync. This seems like a totally
> normal instruction that shouldn't need any extra flags or anything, yet
> the assembler throws up, and no flag I can think of fixes it. This is a
> G4 7447. I reverted back to the Debian 5.15. defconfig before dropping
> this mail as I had tweaked my config to be more G4.

Hi Mike,

Can you provide a bit more detail about your setup and config?

Are you using GCC or LLVM? What version of GCC and binutils, or what
version of LLVM?

What is the Debian defconfig? Does it correspond to one of the standard
mainline kernel defconfigs? If not, can you provide it?

Thanks
Christophe
Re: [PATCH v2 1/2] powerpc: Fix virt_addr_valid() check
On 11/01/2022 at 05:37, Nicholas Piggin wrote:
> Excerpts from Kefeng Wang's message of January 8, 2022 9:58 pm:
>> Hi PPC maintainers, ping..
>
> Hmm. I might have confused myself about this. I'm going back and
> trying to work out what I was thinking when I suggested it. This
> works on 64e because vmalloc space is below the kernel linear map,
> right?
>
> On 64s it is the other way around and it is still possible to enable
> flatmem on 64s. Although we might just not hit the problem there because
> __pa() will not mask away the vmalloc offset for 64s so it will still
> return something that's outside the pfn_valid range for flatmem. That's
> very subtle though.

That's the way it works on PPC32 at least, so for me it's not shocking
to have it work the same way on PPC64s.

The main issue here is the way __pa() works. On PPC32, __pa = va - PAGE_OFFSET,
so it works correctly for any address. On PPC64, __pa() works by masking
out the 2 top bits instead of subtracting PAGE_OFFSET, so the test must
first add a verification that we really have the 2 top bits set. This is
what (addr >= PAGE_OFFSET) does. Once this first test is done, we can
perfectly rely on pfn_valid() just like PPC32. I see absolutely no point
in an additional test checking that the addr is below KERN_VIRT_START.

> The checks added to __pa actually don't prevent vmalloc memory from
> being passed to it either on 64s, only a more basic test.

That's correct. It is the role of pfn_valid() to check that.

Christophe

> I think 64s wants (addr >= PAGE_OFFSET && addr < KERN_VIRT_START) as
> the condition. Could possibly add that check to __pa as well to
> catch vmalloc addresses.
>
> Thanks,
> Nick
>
>> On 2021/12/25 20:06, Kefeng Wang wrote:
>>> When run ethtool eth0, the BUG occurred,
>>>
>>>    usercopy: Kernel memory exposure attempt detected from SLUB object not
>>>    in SLUB page?! (offset 0, size 1048)!
>>>    kernel BUG at mm/usercopy.c:99
>>>    ...
>>>    usercopy_abort+0x64/0xa0 (unreliable)
>>>    __check_heap_object+0x168/0x190
>>>    __check_object_size+0x1a0/0x200
>>>    dev_ethtool+0x2494/0x2b20
>>>    dev_ioctl+0x5d0/0x770
>>>    sock_do_ioctl+0xf0/0x1d0
>>>    sock_ioctl+0x3ec/0x5a0
>>>    __se_sys_ioctl+0xf0/0x160
>>>    system_call_exception+0xfc/0x1f0
>>>    system_call_common+0xf8/0x200
>>>
>>> The code shows below,
>>>
>>>    data = vzalloc(array_size(gstrings.len, ETH_GSTRING_LEN));
>>>    copy_to_user(useraddr, data, gstrings.len * ETH_GSTRING_LEN))
>>>
>>> The data is allocated by vmalloc(), virt_addr_valid(ptr) will return true
>>> on PowerPC64, which leads to the panic.
>>>
>>> As commit 4dd7554a6456 ("powerpc/64: Add VIRTUAL_BUG_ON checks for __va
>>> and __pa addresses") does, make sure the virt addr is above PAGE_OFFSET in
>>> virt_addr_valid().
>>>
>>> Signed-off-by: Kefeng Wang
>>> ---
>>>  arch/powerpc/include/asm/page.h | 5 -
>>>  1 file changed, 4 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/arch/powerpc/include/asm/page.h
>>> b/arch/powerpc/include/asm/page.h
>>> index 254687258f42..300d4c105a3a 100644
>>> --- a/arch/powerpc/include/asm/page.h
>>> +++ b/arch/powerpc/include/asm/page.h
>>> @@ -132,7 +132,10 @@ static inline bool pfn_valid(unsigned long pfn)
>>>  #define virt_to_page(kaddr)	pfn_to_page(virt_to_pfn(kaddr))
>>>  #define pfn_to_kaddr(pfn)	__va((pfn) << PAGE_SHIFT)
>>>
>>> -#define virt_addr_valid(kaddr)	pfn_valid(virt_to_pfn(kaddr))
>>> +#define virt_addr_valid(vaddr)	({					\
>>> +	unsigned long _addr = (unsigned long)vaddr;			\
>>> +	(unsigned long)(_addr) >= PAGE_OFFSET && pfn_valid(virt_to_pfn(_addr)); \
>>> +})
>>>
>>>  /*
>>>   * On Book-E parts we need __va to parse the device tree and we can't
>>
Re: [PATCH v2 1/2] powerpc: Fix virt_addr_valid() check
Excerpts from Kefeng Wang's message of January 8, 2022 9:58 pm:
> Hi PPC maintainers, ping..

Hmm. I might have confused myself about this. I'm going back and
trying to work out what I was thinking when I suggested it. This
works on 64e because vmalloc space is below the kernel linear map,
right?

On 64s it is the other way around and it is still possible to enable
flatmem on 64s. Although we might just not hit the problem there because
__pa() will not mask away the vmalloc offset for 64s so it will still
return something that's outside the pfn_valid range for flatmem. That's
very subtle though.

The checks added to __pa actually don't prevent vmalloc memory from
being passed to it either on 64s, only a more basic test.

I think 64s wants (addr >= PAGE_OFFSET && addr < KERN_VIRT_START) as
the condition. Could possibly add that check to __pa as well to
catch vmalloc addresses.

Thanks,
Nick

>
> On 2021/12/25 20:06, Kefeng Wang wrote:
>> When run ethtool eth0, the BUG occurred,
>>
>>    usercopy: Kernel memory exposure attempt detected from SLUB object not in
>>    SLUB page?! (offset 0, size 1048)!
>>    kernel BUG at mm/usercopy.c:99
>>    ...
>>    usercopy_abort+0x64/0xa0 (unreliable)
>>    __check_heap_object+0x168/0x190
>>    __check_object_size+0x1a0/0x200
>>    dev_ethtool+0x2494/0x2b20
>>    dev_ioctl+0x5d0/0x770
>>    sock_do_ioctl+0xf0/0x1d0
>>    sock_ioctl+0x3ec/0x5a0
>>    __se_sys_ioctl+0xf0/0x160
>>    system_call_exception+0xfc/0x1f0
>>    system_call_common+0xf8/0x200
>>
>> The code shows below,
>>
>>    data = vzalloc(array_size(gstrings.len, ETH_GSTRING_LEN));
>>    copy_to_user(useraddr, data, gstrings.len * ETH_GSTRING_LEN))
>>
>> The data is allocated by vmalloc(), virt_addr_valid(ptr) will return true
>> on PowerPC64, which leads to the panic.
>>
>> As commit 4dd7554a6456 ("powerpc/64: Add VIRTUAL_BUG_ON checks for __va
>> and __pa addresses") does, make sure the virt addr is above PAGE_OFFSET in
>> virt_addr_valid().
>>
>> Signed-off-by: Kefeng Wang
>> ---
>>  arch/powerpc/include/asm/page.h | 5 -
>>  1 file changed, 4 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/powerpc/include/asm/page.h
>> b/arch/powerpc/include/asm/page.h
>> index 254687258f42..300d4c105a3a 100644
>> --- a/arch/powerpc/include/asm/page.h
>> +++ b/arch/powerpc/include/asm/page.h
>> @@ -132,7 +132,10 @@ static inline bool pfn_valid(unsigned long pfn)
>>  #define virt_to_page(kaddr)	pfn_to_page(virt_to_pfn(kaddr))
>>  #define pfn_to_kaddr(pfn)	__va((pfn) << PAGE_SHIFT)
>>
>> -#define virt_addr_valid(kaddr)	pfn_valid(virt_to_pfn(kaddr))
>> +#define virt_addr_valid(vaddr)	({					\
>> +	unsigned long _addr = (unsigned long)vaddr;			\
>> +	(unsigned long)(_addr) >= PAGE_OFFSET && pfn_valid(virt_to_pfn(_addr)); \
>> +})
>>
>>  /*
>>   * On Book-E parts we need __va to parse the device tree and we can't
>
Re: [PATCH v3 5/6] KVM: PPC: mmio: Return to guest after emulation failure
Excerpts from Alexey Kardashevskiy's message of January 11, 2022 9:51 am:
>
> On 1/10/22 18:36, Nicholas Piggin wrote:
>> Excerpts from Fabiano Rosas's message of January 8, 2022 7:00 am:
>>> If MMIO emulation fails we don't want to crash the whole guest by
>>> returning to userspace.
>>>
>>> The original commit bbf45ba57eae ("KVM: ppc: PowerPC 440 KVM
>>> implementation") added a todo:
>>>
>>>    /* XXX Deliver Program interrupt to guest. */
>>>
>>> and later the commit d69614a295ae ("KVM: PPC: Separate loadstore
>>> emulation from priv emulation") added the Program interrupt injection
>>> but in another file, so I'm assuming it was missed that this block
>>> needed to be altered.
>>>
>>> Signed-off-by: Fabiano Rosas
>>> Reviewed-by: Alexey Kardashevskiy
>>> ---
>>>  arch/powerpc/kvm/powerpc.c | 2 +-
>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
>>> index 6daeea4a7de1..56b0faab7a5f 100644
>>> --- a/arch/powerpc/kvm/powerpc.c
>>> +++ b/arch/powerpc/kvm/powerpc.c
>>> @@ -309,7 +309,7 @@ int kvmppc_emulate_mmio(struct kvm_vcpu *vcpu)
>>>  	kvmppc_get_last_inst(vcpu, INST_GENERIC, &last_inst);
>>>  	kvmppc_core_queue_program(vcpu, 0);
>>>  	pr_info("%s: emulation failed (%08x)\n", __func__, last_inst);
>>> -	r = RESUME_HOST;
>>> +	r = RESUME_GUEST;
>>
>> So at this point can the pr_info just go away?
>>
>> I wonder if this shouldn't be a DSI rather than a program check.
>> DSI with DSISR[37] looks a bit more expected. Not that Linux
>> probably does much with it but at least it would give a SIGBUS
>> rather than SIGILL.
>
> It does not look like it is more expected to me, it is not about wrong
> memory attributes, it is the instruction itself which cannot execute.

It's not an illegal instruction though, it can't execute because of the
nature of the data / address it is operating on. That says d-side to me.
DSISR[37] isn't perfect but if you squint it's not terrible. It's about
certain instructions that have restrictions operating on other than
normal cacheable mappings.

Thanks,
Nick

> DSISR[37]:
> Set to 1 if the access is due to a lq, stq, lwat, ldat, lbarx, lharx,
> lwarx, ldarx, lqarx, stwat, stdat, stbcx., sthcx., stwcx., stdcx., or
> stqcx. instruction that addresses storage that is Write Through Required
> or Caching Inhibited; or if the access is due to a copy or paste.
> instruction that addresses storage that is Caching Inhibited; or if the
> access is due to a lwat, ldat, stwat, or stdat instruction that
> addresses storage that is Guarded; otherwise set to 0.
>
Re: [PATCH] powerpc/time: Fix build failure due to do_hard_irq_enable() on PPC32
Excerpts from Christophe Leroy's message of January 11, 2022 1:29 am:
>   CC      arch/powerpc/kernel/time.o
> In file included from :
> ./arch/powerpc/include/asm/hw_irq.h: In function 'do_hard_irq_enable':
> ././include/linux/compiler_types.h:335:45: error: call to '__compiletime_assert_35' declared with attribute error: BUILD_BUG failed
>   335 |         _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__)
>       |                                             ^
> ././include/linux/compiler_types.h:316:25: note: in definition of macro '__compiletime_assert'
>   316 |                 prefix ## suffix();                             \
>       |                         ^~
> ././include/linux/compiler_types.h:335:9: note: in expansion of macro '_compiletime_assert'
>   335 |         _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__)
>       |         ^~~
> ./include/linux/build_bug.h:39:37: note: in expansion of macro 'compiletime_assert'
>    39 | #define BUILD_BUG_ON_MSG(cond, msg) compiletime_assert(!(cond), msg)
>       |                                     ^~
> ./include/linux/build_bug.h:59:21: note: in expansion of macro 'BUILD_BUG_ON_MSG'
>    59 | #define BUILD_BUG() BUILD_BUG_ON_MSG(1, "BUILD_BUG failed")
>       |                     ^~~~
> ./arch/powerpc/include/asm/hw_irq.h:483:9: note: in expansion of macro 'BUILD_BUG'
>   483 |         BUILD_BUG();
>       |         ^
>
> should_hard_irq_enable() returns false on PPC32 so this BUILD_BUG()
> shouldn't trigger.
>
> Force inlining of should_hard_irq_enable()
>
> Signed-off-by: Christophe Leroy
> Fixes: 0faf20a1ad16 ("powerpc/64s/interrupt: Don't enable MSR[EE] in irq
> handlers unless perf is in use")
> Cc: Nicholas Piggin

Acked-by: Nicholas Piggin

Thanks,
Nick

> ---
>  arch/powerpc/include/asm/hw_irq.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/powerpc/include/asm/hw_irq.h
> b/arch/powerpc/include/asm/hw_irq.h
> index a58fb4aa6c81..674e5aaafcbd 100644
> --- a/arch/powerpc/include/asm/hw_irq.h
> +++ b/arch/powerpc/include/asm/hw_irq.h
> @@ -473,7 +473,7 @@ static inline bool arch_irq_disabled_regs(struct pt_regs *regs)
>  	return !(regs->msr & MSR_EE);
>  }
>
> -static inline bool should_hard_irq_enable(void)
> +static __always_inline bool should_hard_irq_enable(void)
>  {
>  	return false;
>  }
> --
> 2.33.1
>
Re: [PATCH 3/6] KVM: Remove opaque from kvm_arch_check_processor_compat
On Mon, Jan 10, 2022 at 11:06:44PM +0000, Sean Christopherson wrote:
> On Mon, Dec 27, 2021, Chao Gao wrote:
>> No arch implementation uses this opaque now.
>
> Except for the RISC-V part, this can be a pure revert of commit
> b99040853738 ("KVM: Pass kvm_init()'s opaque param to additional arch
> funcs"). I think it makes sense to process it as a revert, with a short
> blurb in the changelog to note that RISC-V is manually modified as
> RISC-V support came along in the interim.

commit b99040853738 adds opaque param to kvm_arch_hardware_setup(), which
isn't reverted in this patch. I.e., this patch is a partial revert of
b99040853738 plus manual changes to RISC-V.

Given that, "process it as a revert" means clearly say in the changelog
that this commit contains a partial revert of commit b99040853738 ("KVM:
Pass kvm_init()'s opaque param to additional arch funcs"). Right?
Re: [PATCH V2 03/17] asm-generic: fcntl: compat: Remove duplicate definitions
On Mon, Jan 10, 2022 at 9:35 PM Arnd Bergmann wrote:
>
> On Tue, Dec 28, 2021 at 3:39 PM wrote:
> >
> > From: Guo Ren
> >
> > Remove duplicate F_GETLK64,F_SETLK64,F_SETLKW64 definitions in
> > arch/*/include/asm/compat.h.
> >
> > Signed-off-by: Guo Ren
> > Signed-off-by: Guo Ren
> > Cc: Arnd Bergmann
>
> Unfortunately, this one does not look correct to me:
>
> > @@ -116,7 +116,7 @@
> >  #define F_GETSIG	11	/* for sockets. */
> >  #endif
> >
> > -#ifndef CONFIG_64BIT
> > +#if !defined(CONFIG_64BIT) || defined(CONFIG_COMPAT)
> >  #ifndef F_GETLK64
> >  #define F_GETLK64	12	/* using 'struct flock64' */
> >  #define F_SETLK64	13
>
> The problem here is that include/uapi/ headers cannot contain checks for
> CONFIG_* symbols because those may have different meanings in user space
> compared to kernel.
>
> This is a preexisting problem in the header, but I think the change
> makes it worse.
>
> With the current behavior, user space will always see the definitions,
> unless it happens to have its own definition for CONFIG_64BIT already.
> On 64-bit parisc, this has the effect of defining the macros to the
> same values as F_SETOWN/F_SETSIG/F_GETSIG, which is potentially
> harmful. On MIPS, it uses values that are different from the 32-bit numbers
> but are otherwise unused. Everywhere else, we get the definition from
> the 32-bit architecture in user space, which will do nothing in the kernel.
>
> The correct check for a uapi header would be to test for
> __BITS_PER_LONG==32. We should probably do that here, but
> this won't help you move the definitions, and it is a user-visible change
> as the incorrect definition will no longer be visible. [Adding Jeff and Bruce
> (the flock maintainers) to Cc for additional feedback on this]
>
> For your series, I would suggest just moving the macro definitions to
> include/linux/compat.h along with the 'struct compat_flock64'
> definition, and leaving the duplicate one in the uapi header unchanged
> until we have decided on a solution.

Okay.

>
> Arnd

-- 
Best Regards
Guo Ren

ML: https://lore.kernel.org/linux-csky/
Re: [PATCH V2 11/17] riscv: compat: Add elf.h implementation
On Mon, Jan 10, 2022 at 10:29 PM Arnd Bergmann wrote:
>
> On Tue, Dec 28, 2021 at 3:39 PM wrote:
> >
> > From: Guo Ren
> >
> > Implement necessary type and macro for compat elf. See the code
> > comment for detail.
> >
> > Signed-off-by: Guo Ren
> > Signed-off-by: Guo Ren
> > Cc: Arnd Bergmann
>
> This looks mostly correct,
>
> > +/*
> > + * FIXME: not sure SET_PERSONALITY for compat process is right!
> > + */
> > +#define SET_PERSONALITY(ex)					\
> > +do {	if ((ex).e_ident[EI_CLASS] == ELFCLASS32)		\
> > +		set_thread_flag(TIF_32BIT);			\
> > +	else							\
> > +		clear_thread_flag(TIF_32BIT);			\
> > +	set_personality(PER_LINUX | (current->personality & (~PER_MASK))); \
> > +} while (0)
>
> This means the personality after exec is always set to PER_LINUX, not
> PER_LINUX32, which I think is wrong: you want the PER_LINUX32
> setting to stick, just like the upper bits do in the default implementation.
>
> What the other ones do is:
>
> | arch/parisc/include/asm/elf.h- set_personality((current->personality & ~PER_MASK) | PER_LINUX); \
>
> This looks like the same problem you introduce here: always forcing PER_LINUX
> instead of PER_LINUX32 makes it impossible to use PER_LINUX32.
>
> | arch/alpha/include/asm/elf.h:#define SET_PERSONALITY(EX) \
> | arch/alpha/include/asm/elf.h- set_personality(((EX).e_flags & EF_ALPHA_32BIT) \
> | arch/alpha/include/asm/elf.h- ? PER_LINUX_32BIT : PER_LINUX)
> | arch/csky/include/asm/elf.h:#define SET_PERSONALITY(ex) set_personality(PER_LINUX)
> | arch/nds32/include/asm/elf.h:#define SET_PERSONALITY(ex) set_personality(PER_LINUX)
>
> These look even worse: instead of forcing the lower bits to
> PER_LINUX/PER_LINUX32 and leaving the upper bits untouched, these also
> clear the upper bits unconditionally.
>
> | arch/arm64/include/asm/elf.h:#define SET_PERSONALITY(ex) \
> | arch/arm64/include/asm/elf.h- current->personality &= ~READ_IMPLIES_EXEC; \
> | arch/x86/um/asm/elf.h:#define SET_PERSONALITY(ex) do {} while(0)
> | arch/x86/include/asm/elf.h:#define set_personality_64bit() do { } while (0)
> | arch/x86/kernel/process_64.c:static void __set_personality_ia32(void)
> | current->personality |= force_personality32;
>
> Inconsistent: does not enforce PER_LINUX/PER_LINUX32 as the default
> implementation does, but just leaves the value untouched (other than
> (re)setting READ_IMPLIES_EXEC). I think this is harmless otherwise, as
> we generally ignore the lower bits, except for the bit of code that
> checks for PER_LINUX32 in override_architecture() to mangle the output
> of sys_newuname() or in /proc/cpuinfo.
>
> | arch/s390/include/asm/elf.h- if (personality(current->personality) != PER_LINUX32) \
> | arch/s390/include/asm/elf.h- set_personality(PER_LINUX | \
> | arch/s390/include/asm/elf.h- (current->personality & ~PER_MASK)); \
> | arch/powerpc/include/asm/elf.h- if (personality(current->personality) != PER_LINUX32) \
> | arch/powerpc/include/asm/elf.h- set_personality(PER_LINUX | \
> | arch/powerpc/include/asm/elf.h- (current->personality & (~PER_MASK))); \
> | arch/sparc/include/asm/elf_64.h- if (personality(current->personality) != PER_LINUX32) \
> | arch/sparc/include/asm/elf_64.h- set_personality(PER_LINUX | \
> | arch/sparc/include/asm/elf_64.h- (current->personality & (~PER_MASK))); \
>
> This is probably the behavior you want to copy.

Thank you very much for your detailed explanation. Here is my modification.
+#define SET_PERSONALITY(ex)					\
+do {	if ((ex).e_ident[EI_CLASS] == ELFCLASS32)		\
+		set_thread_flag(TIF_32BIT);			\
+	else							\
+		clear_thread_flag(TIF_32BIT);			\
+	if (personality(current->personality) != PER_LINUX32)	\
+		set_personality(PER_LINUX |			\
+			(current->personality & (~PER_MASK)));	\
+} while (0)

>
> Arnd

-- 
Best Regards
Guo Ren

ML: https://lore.kernel.org/linux-csky/
[PATCH kernel v5] KVM: PPC: Merge powerpc's debugfs entry content into generic entry
At the moment KVM on PPC creates 4 types of entries under the kvm debugfs:
1) "%pid-%fd" per KVM instance (for all platforms);
2) "vm%pid" (for PPC Book3s HV KVM);
3) "vm%u_vcpu%u_timing" (for PPC Book3e KVM);
4) "kvm-xive-%p" (for XIVE PPC Book3s KVM, the same for XICS);

The problem with this is that multiple VMs per process are not allowed
for 2) and 3), which makes it possible for userspace to trigger errors
when creating duplicated debugfs entries.

This merges all these into 1).

This defines kvm_arch_create_kvm_debugfs() similar to
kvm_arch_create_vcpu_debugfs().

This defines 2 hooks in kvmppc_ops that allow specific KVM
implementations to add necessary entries; this adds the _e500 suffix to
kvmppc_create_vcpu_debugfs_e500() to make it clear what platform it is
for.

This makes use of the already existing kvm_arch_create_vcpu_debugfs() on
PPC.

This removes the no longer used debugfs_dir pointers from PPC kvm_arch
structs.

This stops removing vcpu entries as once created vcpus stay around for
the entire life of a VM and are removed when the KVM instance is closed,
see commit d56f5136b010 ("KVM: let kvm_destroy_vm_debugfs clean up vCPU
debugfs directories").
Suggested-by: Fabiano Rosas
Signed-off-by: Alexey Kardashevskiy
---
Changes:
v5:
* fixed e500mc2

v4:
* added "kvm-xive-%p"

v3:
* reworked commit log, especially, the bit about removing vcpus

v2:
* handled powerpc-booke
* s/kvm/vm/ in arch hooks
---
 arch/powerpc/include/asm/kvm_host.h    |  6 ++---
 arch/powerpc/include/asm/kvm_ppc.h     |  2 ++
 arch/powerpc/kvm/timing.h              | 12 +-
 arch/powerpc/kvm/book3s_64_mmu_hv.c    |  2 +-
 arch/powerpc/kvm/book3s_64_mmu_radix.c |  2 +-
 arch/powerpc/kvm/book3s_hv.c           | 31 ++
 arch/powerpc/kvm/book3s_xics.c         | 13 ++-
 arch/powerpc/kvm/book3s_xive.c         | 13 ++-
 arch/powerpc/kvm/book3s_xive_native.c  | 13 ++-
 arch/powerpc/kvm/e500.c                |  1 +
 arch/powerpc/kvm/e500mc.c              |  1 +
 arch/powerpc/kvm/powerpc.c             | 16 ++---
 arch/powerpc/kvm/timing.c              | 21 +
 13 files changed, 51 insertions(+), 82 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h
index 17263276189e..f5e14fa683f4 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -26,6 +26,8 @@
 #include
 #include

+#define __KVM_HAVE_ARCH_VCPU_DEBUGFS
+
 #define KVM_MAX_VCPUS NR_CPUS
 #define KVM_MAX_VCORES NR_CPUS
@@ -295,7 +297,6 @@ struct kvm_arch {
 	bool dawr1_enabled;
 	pgd_t *pgtable;
 	u64 process_table;
-	struct dentry *debugfs_dir;
 	struct kvm_resize_hpt *resize_hpt; /* protected by kvm->lock */
 #endif /* CONFIG_KVM_BOOK3S_HV_POSSIBLE */
 #ifdef CONFIG_KVM_BOOK3S_PR_POSSIBLE
@@ -673,7 +674,6 @@ struct kvm_vcpu_arch {
 	u64 timing_min_duration[__NUMBER_OF_KVM_EXIT_TYPES];
 	u64 timing_max_duration[__NUMBER_OF_KVM_EXIT_TYPES];
 	u64 timing_last_exit;
-	struct dentry *debugfs_exit_timing;
 #endif
 #ifdef CONFIG_PPC_BOOK3S
@@ -829,8 +829,6 @@ struct kvm_vcpu_arch {
 	struct kvmhv_tb_accumulator rm_exit;	/* real-mode exit code */
 	struct kvmhv_tb_accumulator guest_time;	/* guest execution */
 	struct kvmhv_tb_accumulator cede_time;	/* time napping inside guest */
-
-	struct dentry *debugfs_dir;
 #endif /* CONFIG_KVM_BOOK3S_HV_EXIT_TIMING */
 };

diff --git a/arch/powerpc/include/asm/kvm_ppc.h b/arch/powerpc/include/asm/kvm_ppc.h
index 33db83b82fbd..d2b192dea0d2 100644
--- a/arch/powerpc/include/asm/kvm_ppc.h
+++ b/arch/powerpc/include/asm/kvm_ppc.h
@@ -316,6 +316,8 @@ struct kvmppc_ops {
 	int (*svm_off)(struct kvm *kvm);
 	int (*enable_dawr1)(struct kvm *kvm);
 	bool (*hash_v3_possible)(void);
+	int (*create_vm_debugfs)(struct kvm *kvm);
+	int (*create_vcpu_debugfs)(struct kvm_vcpu *vcpu, struct dentry *debugfs_dentry);
 };

 extern struct kvmppc_ops *kvmppc_hv_ops;

diff --git a/arch/powerpc/kvm/timing.h b/arch/powerpc/kvm/timing.h
index feef7885ba82..45817ab82bb4 100644
--- a/arch/powerpc/kvm/timing.h
+++ b/arch/powerpc/kvm/timing.h
@@ -14,8 +14,8 @@
 #ifdef CONFIG_KVM_EXIT_TIMING
 void kvmppc_init_timing_stats(struct kvm_vcpu *vcpu);
 void kvmppc_update_timing_stats(struct kvm_vcpu *vcpu);
-void kvmppc_create_vcpu_debugfs(struct kvm_vcpu *vcpu, unsigned int id);
-void kvmppc_remove_vcpu_debugfs(struct kvm_vcpu *vcpu);
+int kvmppc_create_vcpu_debugfs_e500(struct kvm_vcpu *vcpu,
+				    struct dentry *debugfs_dentry);

 static inline void kvmppc_set_exit_type(struct kvm_vcpu *vcpu, int type)
 {
@@ -26,9 +26,11 @@ static inline void kvmppc_set_exit_type(struct kvm_vcpu *vcpu, int type)
 /* if exit timing is not configured there is no need to build the c file */
 static inline void kvmppc_init_timing_stats(struct kvm_vcpu *vcpu) {}
 static inline void
Re: [PATCH v3 5/6] KVM: PPC: mmio: Return to guest after emulation failure
On 1/10/22 18:36, Nicholas Piggin wrote:
> Excerpts from Fabiano Rosas's message of January 8, 2022 7:00 am:
>> If MMIO emulation fails we don't want to crash the whole guest by
>> returning to userspace.
>>
>> The original commit bbf45ba57eae ("KVM: ppc: PowerPC 440 KVM
>> implementation") added a todo:
>>
>>    /* XXX Deliver Program interrupt to guest. */
>>
>> and later the commit d69614a295ae ("KVM: PPC: Separate loadstore
>> emulation from priv emulation") added the Program interrupt injection
>> but in another file, so I'm assuming it was missed that this block
>> needed to be altered.
>>
>> Signed-off-by: Fabiano Rosas
>> Reviewed-by: Alexey Kardashevskiy
>> ---
>>  arch/powerpc/kvm/powerpc.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
>> index 6daeea4a7de1..56b0faab7a5f 100644
>> --- a/arch/powerpc/kvm/powerpc.c
>> +++ b/arch/powerpc/kvm/powerpc.c
>> @@ -309,7 +309,7 @@ int kvmppc_emulate_mmio(struct kvm_vcpu *vcpu)
>>  	kvmppc_get_last_inst(vcpu, INST_GENERIC, &last_inst);
>>  	kvmppc_core_queue_program(vcpu, 0);
>>  	pr_info("%s: emulation failed (%08x)\n", __func__, last_inst);
>> -	r = RESUME_HOST;
>> +	r = RESUME_GUEST;
>
> So at this point can the pr_info just go away?
>
> I wonder if this shouldn't be a DSI rather than a program check.
> DSI with DSISR[37] looks a bit more expected. Not that Linux
> probably does much with it but at least it would give a SIGBUS
> rather than SIGILL.

It does not look like it is more expected to me, it is not about wrong
memory attributes, it is the instruction itself which cannot execute.

DSISR[37]:
Set to 1 if the access is due to a lq, stq, lwat, ldat, lbarx, lharx,
lwarx, ldarx, lqarx, stwat, stdat, stbcx., sthcx., stwcx., stdcx., or
stqcx. instruction that addresses storage that is Write Through Required
or Caching Inhibited; or if the access is due to a copy or paste.
instruction that addresses storage that is Caching Inhibited; or if the
access is due to a lwat, ldat, stwat, or stdat instruction that
addresses storage that is Guarded; otherwise set to 0.
[5.16.0] build error on unrecognized opcode: ptesync
Hey, so I originally sat down to compile the fast headers V2 patch, but
quickly discovered other things at play, and grabbed 5.16.0 a few hours
after it lifted off. arch/powerpc/mm/mmu_context.c I had to specifically
compile with -maltivec or it barfed on a 'dssall'. I'm fine with that,
I've spent years in kernel land, I can deal with that. Then came
arch/powerpc/lib/step.c with the ptesync. This seems like a totally
normal instruction that shouldn't need any extra flags or anything, yet
the assembler throws up, and no flag I can think of fixes it. This is a
G4 7447. I reverted back to the Debian 5.15. defconfig before dropping
this mail as I had tweaked my config to be more G4.

Best regards.
Michael Heltne
Re: [PATCH 00/16] Remove usage of the deprecated "pci-dma-compat.h" API
Christophe,

> This series axes all the remaining usages of the deprecated
> "pci-dma-compat.h" API.

Applied patches 10-15 to 5.17/scsi-staging, thanks!

-- 
Martin K. Petersen
Oracle Linux Engineering
Re: [PATCH V2 03/17] asm-generic: fcntl: compat: Remove duplicate definitions
On Mon, Jan 10, 2022 at 02:35:19PM +0100, Arnd Bergmann wrote:
> > +#if !defined(CONFIG_64BIT) || defined(CONFIG_COMPAT)
> >  #ifndef F_GETLK64
> >  #define F_GETLK64	12	/* using 'struct flock64' */
> >  #define F_SETLK64	13
>
> The problem here is that include/uapi/ headers cannot contain checks for
> CONFIG_* symbols because those may have different meanings in user space
> compared to kernel.
>
> This is a preexisting problem in the header, but I think the change
> makes it worse.

FYI, this is what I did in my old branch, which also sidesteps the
duplicate value problem on parisc. The rebase is untested so far, but I
can spend some cycles on finishing it:

http://git.infradead.org/users/hch/misc.git/shortlog/refs/heads/fcntl-asm-generic-cleanup
Re: [PATCH V2 03/17] asm-generic: fcntl: compat: Remove duplicate definitions
On 28/12/2021 at 15:39, guo...@kernel.org wrote:
> From: Guo Ren
>
> Remove duplicate F_GETLK64,F_SETLK64,F_SETLKW64 definitions in
> arch/*/include/asm/compat.h.
>
> Signed-off-by: Guo Ren
> Signed-off-by: Guo Ren
> Cc: Arnd Bergmann
> ---
>  arch/arm64/include/asm/compat.h   | 4
>  arch/powerpc/include/asm/compat.h | 4
>  arch/s390/include/asm/compat.h    | 4
>  arch/sparc/include/asm/compat.h   | 4
>  arch/x86/include/asm/compat.h     | 4
>  include/uapi/asm-generic/fcntl.h  | 2 +-
>  6 files changed, 1 insertion(+), 21 deletions(-)
>
...
> diff --git a/include/uapi/asm-generic/fcntl.h
> b/include/uapi/asm-generic/fcntl.h
> index ecd0f5bdfc1d..5bc1e51d73b1 100644
> --- a/include/uapi/asm-generic/fcntl.h
> +++ b/include/uapi/asm-generic/fcntl.h
> @@ -116,7 +116,7 @@
>  #define F_GETSIG	11	/* for sockets. */
>  #endif
>
> -#ifndef CONFIG_64BIT
> +#if !defined(CONFIG_64BIT) || defined(CONFIG_COMPAT)
>  #ifndef F_GETLK64
>  #define F_GETLK64	12	/* using 'struct flock64' */
>  #define F_SETLK64	13

There seems to be a problem with this change:

  error: /linux/include/uapi/asm-generic/fcntl.h: leak CONFIG_COMPAT to user-space
  make[3]: *** [/linux/scripts/Makefile.headersinst:63: usr/include/asm-generic/fcntl.h] Error 1
  make[3]: *** Waiting for unfinished jobs
  make[2]: *** [/linux/Makefile:1283: headers] Error 2
  make[1]: *** [Makefile:219: __sub-make] Error 2
  make[2]: Leaving directory '/output'
  make[1]: Leaving directory '/linux'
  make: *** [Makefile:157: khdr] Error 2
  make: Leaving directory '/linux/tools/testing/selftests'
  ## Selftest build completed rc = 2
  ## Found 2 binaries
  !! Error build failed rc 2
  Error: Process completed with exit code 2.
[PATCH] powerpc/time: Fix build failure due to do_hard_irq_enable() on PPC32
  CC      arch/powerpc/kernel/time.o
In file included from :
./arch/powerpc/include/asm/hw_irq.h: In function 'do_hard_irq_enable':
././include/linux/compiler_types.h:335:45: error: call to '__compiletime_assert_35' declared with attribute error: BUILD_BUG failed
  335 |         _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__)
      |                                             ^
././include/linux/compiler_types.h:316:25: note: in definition of macro '__compiletime_assert'
  316 |                 prefix ## suffix();                             \
      |                         ^~
././include/linux/compiler_types.h:335:9: note: in expansion of macro '_compiletime_assert'
  335 |         _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__)
      |         ^~~
./include/linux/build_bug.h:39:37: note: in expansion of macro 'compiletime_assert'
   39 | #define BUILD_BUG_ON_MSG(cond, msg) compiletime_assert(!(cond), msg)
      |                                     ^~
./include/linux/build_bug.h:59:21: note: in expansion of macro 'BUILD_BUG_ON_MSG'
   59 | #define BUILD_BUG() BUILD_BUG_ON_MSG(1, "BUILD_BUG failed")
      |                     ^~~~
./arch/powerpc/include/asm/hw_irq.h:483:9: note: in expansion of macro 'BUILD_BUG'
  483 |         BUILD_BUG();
      |         ^

should_hard_irq_enable() returns false on PPC32 so this BUILD_BUG()
shouldn't trigger.

Force inlining of should_hard_irq_enable()

Signed-off-by: Christophe Leroy
Fixes: 0faf20a1ad16 ("powerpc/64s/interrupt: Don't enable MSR[EE] in irq
handlers unless perf is in use")
Cc: Nicholas Piggin
---
 arch/powerpc/include/asm/hw_irq.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/include/asm/hw_irq.h
b/arch/powerpc/include/asm/hw_irq.h
index a58fb4aa6c81..674e5aaafcbd 100644
--- a/arch/powerpc/include/asm/hw_irq.h
+++ b/arch/powerpc/include/asm/hw_irq.h
@@ -473,7 +473,7 @@ static inline bool arch_irq_disabled_regs(struct pt_regs *regs)
 	return !(regs->msr & MSR_EE);
 }

-static inline bool should_hard_irq_enable(void)
+static __always_inline bool should_hard_irq_enable(void)
 {
 	return false;
 }
-- 
2.33.1
[PATCH v3] powerpc/32s: Fix kasan_init_region() for KASAN
It has been reported that the kernel doesn't boot on some configurations
with KASAN enabled. This is due to wrong BAT allocation for the KASAN
area:

	---[ Data Block Address Translation ]---
	0: 0xc0000000-0xcfffffff  0x00000000  256M  Kernel rw  m
	1: 0xd0000000-0xdfffffff  0x10000000  256M  Kernel rw  m
	2: 0xe0000000-0xefffffff  0x20000000  256M  Kernel rw  m
	3: 0xf8000000-0xf9ffffff  0x2a000000   32M  Kernel rw  m
	4: 0xfa000000-0xfdffffff  0x2c000000   64M  Kernel rw  m

A BAT must have both virtual and physical addresses alignment matching
the size of the BAT. This is not the case for BAT 4 above.

Fix kasan_init_region() by using the block_size() function that is in
book3s32/mmu.c. To be able to reuse it here, make it non static and
change its name to bat_block_size() in order to avoid a name conflict
with the block_size() defined in

Also reuse find_free_bat() to avoid an error message from setbat() when
no BAT is available.

And allocate memory outside of the linear memory mapping to avoid
wasting that precious space.

With this change we get correct alignment for BATs and KASAN shadow
memory is allocated outside the linear memory space.
	---[ Data Block Address Translation ]---
	0: 0xc0000000-0xcfffffff  0x00000000  256M  Kernel rw
	1: 0xd0000000-0xdfffffff  0x10000000  256M  Kernel rw
	2: 0xe0000000-0xefffffff  0x20000000  256M  Kernel rw
	3: 0xf8000000-0xfbffffff  0x7c000000   64M  Kernel rw
	4: 0xfc000000-0xfdffffff  0x7a000000   32M  Kernel rw

Reported-by: Maxime Bizon
Fixes: 7974c4732642 ("powerpc/32s: Implement dedicated kasan_init_region()")
Cc: stable@vger.kernel.org
Signed-off-by: Christophe Leroy
Tested-by: Maxime Bizon
---
v3: Rebased
v2:
- Allocate kasan shadow memory outside precious kernel linear memory
- Properly zeroise kasan shadow memory
---
 arch/powerpc/include/asm/book3s/32/mmu-hash.h |  2 +
 arch/powerpc/mm/book3s32/mmu.c                | 10 ++--
 arch/powerpc/mm/kasan/book3s_32.c             | 59 ++-
 3 files changed, 38 insertions(+), 33 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/32/mmu-hash.h b/arch/powerpc/include/asm/book3s/32/mmu-hash.h
index 7be27862329f..78c6a5fde1d6 100644
--- a/arch/powerpc/include/asm/book3s/32/mmu-hash.h
+++ b/arch/powerpc/include/asm/book3s/32/mmu-hash.h
@@ -223,6 +223,8 @@ static __always_inline void update_user_segments(u32 val)
 	update_user_segment(15, val);
 }
 
+int __init find_free_bat(void);
+unsigned int bat_block_size(unsigned long base, unsigned long top);
 #endif /* !__ASSEMBLY__ */
 
 /* We happily ignore the smaller BATs on 601, we don't actually use
diff --git a/arch/powerpc/mm/book3s32/mmu.c b/arch/powerpc/mm/book3s32/mmu.c
index 94045b265b6b..203735caf691 100644
--- a/arch/powerpc/mm/book3s32/mmu.c
+++ b/arch/powerpc/mm/book3s32/mmu.c
@@ -76,7 +76,7 @@ unsigned long p_block_mapped(phys_addr_t pa)
 	return 0;
 }
 
-static int __init find_free_bat(void)
+int __init find_free_bat(void)
 {
 	int b;
 	int n = mmu_has_feature(MMU_FTR_USE_HIGH_BATS) ? 8 : 4;
@@ -100,7 +100,7 @@ static int __init find_free_bat(void)
  * - block size has to be a power of two. This is calculated by finding the
  *   highest bit set to 1.
  */
-static unsigned int block_size(unsigned long base, unsigned long top)
+unsigned int bat_block_size(unsigned long base, unsigned long top)
 {
 	unsigned int max_size = SZ_256M;
 	unsigned int base_shift = (ffs(base) - 1) & 31;
@@ -145,7 +145,7 @@ static unsigned long __init __mmu_mapin_ram(unsigned long base, unsigned long to
 	int idx;
 
 	while ((idx = find_free_bat()) != -1 && base != top) {
-		unsigned int size = block_size(base, top);
+		unsigned int size = bat_block_size(base, top);
 
 		if (size < 128 << 10)
 			break;
@@ -201,12 +201,12 @@ void mmu_mark_initmem_nx(void)
 	unsigned long size;
 
 	for (i = 0; i < nb - 1 && base < top;) {
-		size = block_size(base, top);
+		size = bat_block_size(base, top);
 		setibat(i++, PAGE_OFFSET + base, base, size, PAGE_KERNEL_TEXT);
 		base += size;
 	}
 	if (base < top) {
-		size = block_size(base, top);
+		size = bat_block_size(base, top);
 		if ((top - base) > size) {
 			size <<= 1;
 			if (strict_kernel_rwx_enabled() && base + size > border)
diff --git a/arch/powerpc/mm/kasan/book3s_32.c b/arch/powerpc/mm/kasan/book3s_32.c
index 35b287b0a8da..450a67ef0bbe 100644
--- a/arch/powerpc/mm/kasan/book3s_32.c
+++ b/arch/powerpc/mm/kasan/book3s_32.c
@@ -10,48 +10,51 @@ int __init kasan_init_region(void *start, size_t size)
 {
 	unsigned long k_start = (unsigned long)kasan_mem_to_shadow(start);
 	unsigned long k_end = (unsigned
Re: [PATCH V2 11/17] riscv: compat: Add elf.h implementation
On Tue, Dec 28, 2021 at 3:39 PM wrote: > > From: Guo Ren > > Implement necessary type and macro for compat elf. See the code > comment for detail. > > Signed-off-by: Guo Ren > Signed-off-by: Guo Ren > Cc: Arnd Bergmann This looks mostly correct, > +/* > + * FIXME: not sure SET_PERSONALITY for compat process is right! > + */ > +#define SET_PERSONALITY(ex) \ > +do {if ((ex).e_ident[EI_CLASS] == ELFCLASS32) \ > + set_thread_flag(TIF_32BIT);\ > + else \ > + clear_thread_flag(TIF_32BIT); \ > + set_personality(PER_LINUX | (current->personality & (~PER_MASK))); \ > +} while (0) This means the personality after exec is always set to PER_LINUX, not PER_LINUX32, which I think is wrong: you want the PER_LINUX32 setting to stick, just like the upper bits do in the default implementation. What the other ones do is: | arch/parisc/include/asm/elf.h- set_personality((current->personality & ~PER_MASK) | PER_LINUX); \ This looks like the same problem you introduce here: always forcing PER_LINUX instead of PER_LINUX32 makes it impossible to use PER_LINUX32. | arch/alpha/include/asm/elf.h:#define SET_PERSONALITY(EX) \ | arch/alpha/include/asm/elf.h- set_personality(((EX).e_flags & EF_ALPHA_32BIT) \ | arch/alpha/include/asm/elf.h- ? PER_LINUX_32BIT : PER_LINUX) | arch/csky/include/asm/elf.h:#define SET_PERSONALITY(ex) set_personality(PER_LINUX) | arch/nds32/include/asm/elf.h:#define SET_PERSONALITY(ex) set_personality(PER_LINUX) These look even worse: instead of forcing the lower bits to PER_LINUX/PER_LINUX32 and leaving the upper bits untouched, these also clear the upper bits unconditionally. 
| arch/arm64/include/asm/elf.h:#define SET_PERSONALITY(ex) \ | arch/arm64/include/asm/elf.h- current->personality &= ~READ_IMPLIES_EXEC; \ | arch/x86/um/asm/elf.h:#define SET_PERSONALITY(ex) do {} while(0) | arch/x86/include/asm/elf.h:#define set_personality_64bit() do { } while (0) | arch/x86/kernel/process_64.c:static void __set_personality_ia32(void) | current->personality |= force_personality32; Inconsistent: does not enforce PER_LINUX/PER_LINUX32 as the default implementation does, but just leaves the value untouched (other than (re)setting READ_IMPLIES_EXEC). I think this is harmless otherwise, as we generally ignore the lower bits, except for the bit of code that checks for PER_LINUX32 in override_architecture() to mangle the output of sys_newuname() or in /proc/cpuinfo. | arch/s390/include/asm/elf.h-if (personality(current->personality) != PER_LINUX32) \ | arch/s390/include/asm/elf.h-set_personality(PER_LINUX | \ | arch/s390/include/asm/elf.h- (current->personality & ~PER_MASK));\ | arch/powerpc/include/asm/elf.h- if (personality(current->personality) != PER_LINUX32) \ | arch/powerpc/include/asm/elf.h- set_personality(PER_LINUX | \ | arch/powerpc/include/asm/elf.h- (current->personality & (~PER_MASK))); \ | arch/sparc/include/asm/elf_64.h-if (personality(current->personality) != PER_LINUX32) \ | arch/sparc/include/asm/elf_64.h- set_personality(PER_LINUX | \ | arch/sparc/include/asm/elf_64.h- (current->personality & (~PER_MASK))); \ This is probably the behavior you want to copy. Arnd
[PATCH v4 6/6] module: Move duplicate mod_check_sig users code to mod_parse_sig
Multiple users of mod_check_sig check for the marker, then call mod_check_sig, extract the signature length, and remove the signature. Put this code in one place together with mod_check_sig.

This changes the error from ENOENT to ENODATA for ima_read_modsig in the case the signature marker is missing.

This also changes the buffer length in ima_read_modsig from size_t to unsigned long. This reduces the possible value range on 32bit, but the length refers to a kernel in-memory buffer which cannot be longer than ULONG_MAX.

Signed-off-by: Michal Suchanek
---
v3:
- Philipp Rudo: Update the commit with a note about the change of return value
- Preserve the EBADMSG error code while moving code around
v4:
- remove unused variable ms in module_signing
- note about buffer length
---
 include/linux/module_signature.h    |  1 +
 kernel/module_signature.c           | 56 -
 kernel/module_signing.c             | 27 +++---
 security/integrity/ima/ima_modsig.c | 22 ++--
 4 files changed, 63 insertions(+), 43 deletions(-)

diff --git a/include/linux/module_signature.h b/include/linux/module_signature.h
index 7eb4b00381ac..1343879b72b3 100644
--- a/include/linux/module_signature.h
+++ b/include/linux/module_signature.h
@@ -42,5 +42,6 @@ struct module_signature {
 
 int mod_check_sig(const struct module_signature *ms, size_t file_len,
 		  const char *name);
+int mod_parse_sig(const void *data, size_t *len, size_t *sig_len, const char *name);
 
 #endif /* _LINUX_MODULE_SIGNATURE_H */
diff --git a/kernel/module_signature.c b/kernel/module_signature.c
index 00132d12487c..b8eb30182183 100644
--- a/kernel/module_signature.c
+++ b/kernel/module_signature.c
@@ -8,14 +8,36 @@
 
 #include 
 #include 
+#include 
 #include 
 #include 
 
+/**
+ * mod_check_sig_marker - check that the given data has signature marker at the end
+ *
+ * @data: Data with appended signature
+ * @len: Length of data. Signature marker length is subtracted on success.
+ */
+static inline int mod_check_sig_marker(const void *data, size_t *len)
+{
+	const unsigned long markerlen = sizeof(MODULE_SIG_STRING) - 1;
+
+	if (markerlen > *len)
+		return -ENODATA;
+
+	if (memcmp(data + *len - markerlen, MODULE_SIG_STRING,
+		   markerlen))
+		return -ENODATA;
+
+	*len -= markerlen;
+	return 0;
+}
+
 /**
  * mod_check_sig - check that the given signature is sane
  *
  * @ms:	Signature to check.
- * @file_len:	Size of the file to which @ms is appended.
+ * @file_len:	Size of the file to which @ms is appended (without the marker).
  * @name:	What is being checked. Used for error messages.
  */
 int mod_check_sig(const struct module_signature *ms, size_t file_len,
@@ -44,3 +66,35 @@ int mod_check_sig(const struct module_signature *ms, size_t file_len,
 
 	return 0;
 }
+
+/**
+ * mod_parse_sig - check that the given signature is sane and determine signature length
+ *
+ * @data: Data with appended signature.
+ * @len: Length of data. Signature and marker length is subtracted on success.
+ * @sig_len: Length of signature. Filled on success.
+ * @name: What is being checked. Used for error messages.
+ */
+int mod_parse_sig(const void *data, size_t *len, size_t *sig_len, const char *name)
+{
+	const struct module_signature *sig;
+	int rc;
+
+	rc = mod_check_sig_marker(data, len);
+	if (rc)
+		return rc;
+
+	if (*len < sizeof(*sig))
+		return -EBADMSG;
+
+	sig = (const struct module_signature *)(data + (*len - sizeof(*sig)));
+
+	rc = mod_check_sig(sig, *len, name);
+	if (rc)
+		return rc;
+
+	*sig_len = be32_to_cpu(sig->sig_len);
+	*len -= *sig_len + sizeof(*sig);
+
+	return 0;
+}
diff --git a/kernel/module_signing.c b/kernel/module_signing.c
index 20857d2a15ca..1d4cb03cce21 100644
--- a/kernel/module_signing.c
+++ b/kernel/module_signing.c
@@ -25,35 +25,16 @@ int verify_appended_signature(const void *data, unsigned long *len,
 			      struct key *trusted_keys,
 			      enum key_being_used_for purpose)
 {
-	const unsigned long markerlen = sizeof(MODULE_SIG_STRING) - 1;
-	struct module_signature *ms;
-	unsigned long sig_len, modlen = *len;
+	unsigned long sig_len;
 	int ret;
 
-	pr_devel("==>%s %s(,%lu)\n", __func__, key_being_used_for[purpose], modlen);
+	pr_devel("==>%s %s(,%lu)\n", __func__, key_being_used_for[purpose], *len);
 
-	if (markerlen > modlen)
-		return -ENODATA;
-
-	if (memcmp(data + modlen - markerlen, MODULE_SIG_STRING,
-		   markerlen))
-		return -ENODATA;
-	modlen -= markerlen;
-
-	if (modlen <= sizeof(*ms))
-		return -EBADMSG;
-
-	ms = data + modlen - sizeof(*ms);
-
-	ret = mod_check_sig(ms,
[PATCH v4 5/6] module: Use key_being_used_for for log messages in verify_appended_signature
Add value for kexec appended signature and pass in key_being_used_for enum rather than a string to verify_appended_signature to produce log messages about the signature. Signed-off-by: Michal Suchanek --- arch/powerpc/kexec/elf_64.c | 2 +- arch/s390/kernel/machine_kexec_file.c| 2 +- crypto/asymmetric_keys/asymmetric_type.c | 1 + include/linux/verification.h | 4 +++- kernel/module.c | 3 ++- kernel/module_signing.c | 11 ++- 6 files changed, 14 insertions(+), 9 deletions(-) diff --git a/arch/powerpc/kexec/elf_64.c b/arch/powerpc/kexec/elf_64.c index 6dec8151ef73..c50869195d51 100644 --- a/arch/powerpc/kexec/elf_64.c +++ b/arch/powerpc/kexec/elf_64.c @@ -156,7 +156,7 @@ static void *elf64_load(struct kimage *image, char *kernel_buf, int elf64_verify_sig(const char *kernel, unsigned long kernel_len) { return verify_appended_signature(kernel, _len, VERIFY_USE_PLATFORM_KEYRING, -"kexec_file"); +VERIFYING_KEXEC_APPENDED_SIGNATURE); } #endif /* CONFIG_KEXEC_SIG */ diff --git a/arch/s390/kernel/machine_kexec_file.c b/arch/s390/kernel/machine_kexec_file.c index c3deccf1da83..63eec38e3137 100644 --- a/arch/s390/kernel/machine_kexec_file.c +++ b/arch/s390/kernel/machine_kexec_file.c @@ -32,7 +32,7 @@ int s390_verify_sig(const char *kernel, unsigned long kernel_len) return 0; return verify_appended_signature(kernel, _len, VERIFY_USE_PLATFORM_KEYRING, - "kexec_file"); + VERIFYING_KEXEC_APPENDED_SIGNATURE); } #endif /* CONFIG_KEXEC_SIG */ diff --git a/crypto/asymmetric_keys/asymmetric_type.c b/crypto/asymmetric_keys/asymmetric_type.c index ad8af3d70ac0..6fd20eec3882 100644 --- a/crypto/asymmetric_keys/asymmetric_type.c +++ b/crypto/asymmetric_keys/asymmetric_type.c @@ -25,6 +25,7 @@ const char *const key_being_used_for[NR__KEY_BEING_USED_FOR] = { [VERIFYING_KEXEC_PE_SIGNATURE] = "kexec PE sig", [VERIFYING_KEY_SIGNATURE] = "key sig", [VERIFYING_KEY_SELF_SIGNATURE] = "key self sig", + [VERIFYING_KEXEC_APPENDED_SIGNATURE]= "kexec appended sig", [VERIFYING_UNSPECIFIED_SIGNATURE] = 
"unspec sig", }; EXPORT_SYMBOL_GPL(key_being_used_for); diff --git a/include/linux/verification.h b/include/linux/verification.h index 32db9287a7b0..f92c49443b4f 100644 --- a/include/linux/verification.h +++ b/include/linux/verification.h @@ -26,6 +26,7 @@ enum key_being_used_for { VERIFYING_KEXEC_PE_SIGNATURE, VERIFYING_KEY_SIGNATURE, VERIFYING_KEY_SELF_SIGNATURE, + VERIFYING_KEXEC_APPENDED_SIGNATURE, VERIFYING_UNSPECIFIED_SIGNATURE, NR__KEY_BEING_USED_FOR }; @@ -61,7 +62,8 @@ extern int verify_pefile_signature(const void *pebuf, unsigned pelen, #endif int verify_appended_signature(const void *data, unsigned long *len, - struct key *trusted_keys, const char *what); + struct key *trusted_keys, + enum key_being_used_for purpose); #endif /* CONFIG_SYSTEM_DATA_VERIFICATION */ #endif /* _LINUX_VERIFY_PEFILE_H */ diff --git a/kernel/module.c b/kernel/module.c index d91ca0f93a40..0a359dc6b690 100644 --- a/kernel/module.c +++ b/kernel/module.c @@ -2891,7 +2891,8 @@ static int module_sig_check(struct load_info *info, int flags) */ if (flags == 0) { err = verify_appended_signature(mod, >len, - VERIFY_USE_SECONDARY_KEYRING, "module"); + VERIFY_USE_SECONDARY_KEYRING, + VERIFYING_MODULE_SIGNATURE); if (!err) { info->sig_ok = true; return 0; diff --git a/kernel/module_signing.c b/kernel/module_signing.c index 39a6dd7c6dd2..20857d2a15ca 100644 --- a/kernel/module_signing.c +++ b/kernel/module_signing.c @@ -19,17 +19,18 @@ * @data: The data to be verified * @len: Size of @data. 
* @trusted_keys: Keyring to use for verification - * @what: Informational string for log messages + * @purpose: The use to which the key is being put */ int verify_appended_signature(const void *data, unsigned long *len, - struct key *trusted_keys, const char *what) + struct key *trusted_keys, + enum key_being_used_for purpose) { const unsigned long markerlen = sizeof(MODULE_SIG_STRING) - 1; struct module_signature *ms; unsigned long sig_len, modlen = *len; int ret; - pr_devel("==>%s(,%lu)\n", __func__, modlen); + pr_devel("==>%s %s(,%lu)\n", __func__, key_being_used_for[purpose], modlen); if
[PATCH v4 4/6] module: strip the signature marker in the verification function.
It is stripped by each caller separately.

Note: this changes the error for kexec_file from EKEYREJECTED to ENODATA when the signature marker is missing.

Signed-off-by: Michal Suchanek
---
v3:
- Philipp Rudo: Update the commit with a note about the change of return value
- the module_signature.h include is now no longer needed for kexec_file
---
 arch/powerpc/kexec/elf_64.c           | 11 ---
 arch/s390/kernel/machine_kexec_file.c | 11 ---
 kernel/module.c                       |  7 +--
 kernel/module_signing.c               | 12 ++--
 4 files changed, 11 insertions(+), 30 deletions(-)

diff --git a/arch/powerpc/kexec/elf_64.c b/arch/powerpc/kexec/elf_64.c
index 64cd314cce0d..6dec8151ef73 100644
--- a/arch/powerpc/kexec/elf_64.c
+++ b/arch/powerpc/kexec/elf_64.c
@@ -24,7 +24,6 @@
 #include 
 #include 
 #include 
-#include 
 
 static void *elf64_load(struct kimage *image, char *kernel_buf,
 			unsigned long kernel_len, char *initrd,
@@ -156,16 +155,6 @@ static void *elf64_load(struct kimage *image, char *kernel_buf,
 #ifdef CONFIG_KEXEC_SIG
 int elf64_verify_sig(const char *kernel, unsigned long kernel_len)
 {
-	const unsigned long marker_len = sizeof(MODULE_SIG_STRING) - 1;
-
-	if (marker_len > kernel_len)
-		return -EKEYREJECTED;
-
-	if (memcmp(kernel + kernel_len - marker_len, MODULE_SIG_STRING,
-		   marker_len))
-		return -EKEYREJECTED;
-	kernel_len -= marker_len;
-
 	return verify_appended_signature(kernel, &kernel_len, VERIFY_USE_PLATFORM_KEYRING,
 					 "kexec_file");
 }
diff --git a/arch/s390/kernel/machine_kexec_file.c b/arch/s390/kernel/machine_kexec_file.c
index 345f2eab6e04..c3deccf1da83 100644
--- a/arch/s390/kernel/machine_kexec_file.c
+++ b/arch/s390/kernel/machine_kexec_file.c
@@ -12,7 +12,6 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 #include 
 #include 
@@ -28,20 +27,10 @@ const struct kexec_file_ops * const kexec_file_loaders[] = {
 #ifdef CONFIG_KEXEC_SIG
 int s390_verify_sig(const char *kernel, unsigned long kernel_len)
 {
-	const unsigned long marker_len = sizeof(MODULE_SIG_STRING) - 1;
-
 	/* Skip signature verification when not secure IPLed.
 	 */
 	if (!ipl_secure_flag)
 		return 0;
 
-	if (marker_len > kernel_len)
-		return -EKEYREJECTED;
-
-	if (memcmp(kernel + kernel_len - marker_len, MODULE_SIG_STRING,
-		   marker_len))
-		return -EKEYREJECTED;
-	kernel_len -= marker_len;
-
 	return verify_appended_signature(kernel, &kernel_len, VERIFY_USE_PLATFORM_KEYRING,
 					 "kexec_file");
 }
diff --git a/kernel/module.c b/kernel/module.c
index 8481933dfa92..d91ca0f93a40 100644
--- a/kernel/module.c
+++ b/kernel/module.c
@@ -2882,7 +2882,6 @@ static inline void kmemleak_load_module(const struct module *mod,
 static int module_sig_check(struct load_info *info, int flags)
 {
 	int err = -ENODATA;
-	const unsigned long markerlen = sizeof(MODULE_SIG_STRING) - 1;
 	const char *reason;
 	const void *mod = info->hdr;
 
@@ -2890,11 +2889,7 @@ static int module_sig_check(struct load_info *info, int flags)
 	 * Require flags == 0, as a module with version information
 	 * removed is no longer the module that was signed
 	 */
-	if (flags == 0 &&
-	    info->len > markerlen &&
-	    memcmp(mod + info->len - markerlen, MODULE_SIG_STRING, markerlen) == 0) {
-		/* We truncate the module to discard the signature */
-		info->len -= markerlen;
+	if (flags == 0) {
 		err = verify_appended_signature(mod, &info->len,
 						VERIFY_USE_SECONDARY_KEYRING, "module");
 		if (!err) {
diff --git a/kernel/module_signing.c b/kernel/module_signing.c
index 30149969f21f..39a6dd7c6dd2 100644
--- a/kernel/module_signing.c
+++ b/kernel/module_signing.c
@@ -15,8 +15,7 @@
 #include "module-internal.h"
 
 /**
- * verify_appended_signature - Verify the signature on a module with the
- * signature marker stripped.
+ * verify_appended_signature - Verify the signature on a module
  * @data: The data to be verified
 * @len: Size of @data.
 * @trusted_keys: Keyring to use for verification
@@ -25,12 +24,21 @@ int verify_appended_signature(const void *data, unsigned long *len,
 			      struct key *trusted_keys, const char *what)
 {
+	const unsigned long markerlen = sizeof(MODULE_SIG_STRING) - 1;
 	struct module_signature *ms;
 	unsigned long sig_len, modlen = *len;
 	int ret;
 
 	pr_devel("==>%s(,%lu)\n", __func__, modlen);
 
+	if (markerlen > modlen)
+		return -ENODATA;
+
+	if (memcmp(data + modlen - markerlen, MODULE_SIG_STRING,
+		   markerlen))
+		return -ENODATA;
+	modlen -= markerlen;
+
[PATCH v4 2/6] powerpc/kexec_file: Add KEXEC_SIG support.
Copy the code from s390x. Both powerpc and s390x use the appended signature format (as opposed to EFI-based platforms using the PE format).

Signed-off-by: Michal Suchanek
---
v3:
- Philipp Rudo: Update the commit message with an explanation why the s390 code is usable on powerpc.
- Include correct header for mod_check_sig
- Nayna: Mention additional IMA features in the Kconfig text
---
 arch/powerpc/Kconfig        | 16 
 arch/powerpc/kexec/elf_64.c | 36 
 2 files changed, 52 insertions(+)

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index dea74d7717c0..1cde9b6c5987 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -560,6 +560,22 @@ config KEXEC_FILE
 config ARCH_HAS_KEXEC_PURGATORY
 	def_bool KEXEC_FILE
 
+config KEXEC_SIG
+	bool "Verify kernel signature during kexec_file_load() syscall"
+	depends on KEXEC_FILE && MODULE_SIG_FORMAT
+	help
+	  This option makes kernel signature verification mandatory for
+	  the kexec_file_load() syscall.
+
+	  In addition to that option, you need to enable signature
+	  verification for the corresponding kernel image type being
+	  loaded in order for this to work.
+
+	  Note: on powerpc IMA_ARCH_POLICY also implements kexec'ed kernel
+	  verification. In addition IMA adds kernel hashes to the measurement
+	  list, extends IMA PCR in the TPM, and implements kernel image
+	  blacklist by hash.
+
 config RELOCATABLE
 	bool "Build a relocatable kernel"
 	depends on PPC64 || (FLATMEM && (44x || FSL_BOOKE))
diff --git a/arch/powerpc/kexec/elf_64.c b/arch/powerpc/kexec/elf_64.c
index eeb258002d1e..98d1cb5135b4 100644
--- a/arch/powerpc/kexec/elf_64.c
+++ b/arch/powerpc/kexec/elf_64.c
@@ -23,6 +23,7 @@
 #include 
 #include 
 #include 
+#include 
 
 static void *elf64_load(struct kimage *image, char *kernel_buf,
 			unsigned long kernel_len, char *initrd,
@@ -151,7 +152,42 @@ static void *elf64_load(struct kimage *image, char *kernel_buf,
 	return ret ? ERR_PTR(ret) : NULL;
 }
 
+#ifdef CONFIG_KEXEC_SIG
+int elf64_verify_sig(const char *kernel, unsigned long kernel_len)
+{
+	const unsigned long marker_len = sizeof(MODULE_SIG_STRING) - 1;
+	struct module_signature *ms;
+	unsigned long sig_len;
+	int ret;
+
+	if (marker_len > kernel_len)
+		return -EKEYREJECTED;
+
+	if (memcmp(kernel + kernel_len - marker_len, MODULE_SIG_STRING,
+		   marker_len))
+		return -EKEYREJECTED;
+	kernel_len -= marker_len;
+
+	ms = (void *)kernel + kernel_len - sizeof(*ms);
+	ret = mod_check_sig(ms, kernel_len, "kexec");
+	if (ret)
+		return ret;
+
+	sig_len = be32_to_cpu(ms->sig_len);
+	kernel_len -= sizeof(*ms) + sig_len;
+
+	return verify_pkcs7_signature(kernel, kernel_len,
+				      kernel + kernel_len, sig_len,
+				      VERIFY_USE_PLATFORM_KEYRING,
+				      VERIFYING_MODULE_SIGNATURE,
+				      NULL, NULL);
+}
+#endif /* CONFIG_KEXEC_SIG */
+
 const struct kexec_file_ops kexec_elf64_ops = {
 	.probe = kexec_elf_probe,
 	.load = elf64_load,
+#ifdef CONFIG_KEXEC_SIG
+	.verify_sig = elf64_verify_sig,
+#endif
 };
-- 
2.31.1
[PATCH v4 3/6] kexec_file: Don't opencode appended signature verification.
Module verification already implements appended signature verification. Reuse it for kexec_file.

Signed-off-by: Michal Suchanek
---
v3:
- Philipp Rudo: Update the dependency on MODULE_SIG_FORMAT to MODULE_SIG
- Include linux/verification.h - previously added in earlier patch
v4:
- kernel test robot: Use unsigned long rather than size_t for data length
- Update the code in kernel/module_signing.c to use a pointer rather than memcpy as the kexec and IMA code does
---
 arch/powerpc/Kconfig                  |  2 +-
 arch/powerpc/kexec/elf_64.c           | 19 +++
 arch/s390/Kconfig                     |  2 +-
 arch/s390/kernel/machine_kexec_file.c | 18 ++
 include/linux/verification.h          |  3 +++
 kernel/module-internal.h              |  2 --
 kernel/module.c                       |  4 +++-
 kernel/module_signing.c               | 34 ---
 8 files changed, 33 insertions(+), 51 deletions(-)

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 1cde9b6c5987..4092187474ff 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -562,7 +562,7 @@ config ARCH_HAS_KEXEC_PURGATORY
 
 config KEXEC_SIG
 	bool "Verify kernel signature during kexec_file_load() syscall"
-	depends on KEXEC_FILE && MODULE_SIG_FORMAT
+	depends on KEXEC_FILE && MODULE_SIG
 	help
 	  This option makes kernel signature verification mandatory for
 	  the kexec_file_load() syscall.
diff --git a/arch/powerpc/kexec/elf_64.c b/arch/powerpc/kexec/elf_64.c index 98d1cb5135b4..64cd314cce0d 100644 --- a/arch/powerpc/kexec/elf_64.c +++ b/arch/powerpc/kexec/elf_64.c @@ -23,6 +23,7 @@ #include #include #include +#include #include static void *elf64_load(struct kimage *image, char *kernel_buf, @@ -156,9 +157,6 @@ static void *elf64_load(struct kimage *image, char *kernel_buf, int elf64_verify_sig(const char *kernel, unsigned long kernel_len) { const unsigned long marker_len = sizeof(MODULE_SIG_STRING) - 1; - struct module_signature *ms; - unsigned long sig_len; - int ret; if (marker_len > kernel_len) return -EKEYREJECTED; @@ -168,19 +166,8 @@ int elf64_verify_sig(const char *kernel, unsigned long kernel_len) return -EKEYREJECTED; kernel_len -= marker_len; - ms = (void *)kernel + kernel_len - sizeof(*ms); - ret = mod_check_sig(ms, kernel_len, "kexec"); - if (ret) - return ret; - - sig_len = be32_to_cpu(ms->sig_len); - kernel_len -= sizeof(*ms) + sig_len; - - return verify_pkcs7_signature(kernel, kernel_len, - kernel + kernel_len, sig_len, - VERIFY_USE_PLATFORM_KEYRING, - VERIFYING_MODULE_SIGNATURE, - NULL, NULL); + return verify_appended_signature(kernel, _len, VERIFY_USE_PLATFORM_KEYRING, +"kexec_file"); } #endif /* CONFIG_KEXEC_SIG */ diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig index 2a5bb4f29cfe..cece7152ea35 100644 --- a/arch/s390/Kconfig +++ b/arch/s390/Kconfig @@ -544,7 +544,7 @@ config ARCH_HAS_KEXEC_PURGATORY config KEXEC_SIG bool "Verify kernel signature during kexec_file_load() syscall" - depends on KEXEC_FILE && MODULE_SIG_FORMAT + depends on KEXEC_FILE && MODULE_SIG help This option makes kernel signature verification mandatory for the kexec_file_load() syscall. 
diff --git a/arch/s390/kernel/machine_kexec_file.c b/arch/s390/kernel/machine_kexec_file.c index c944d71316c7..345f2eab6e04 100644 --- a/arch/s390/kernel/machine_kexec_file.c +++ b/arch/s390/kernel/machine_kexec_file.c @@ -29,9 +29,6 @@ const struct kexec_file_ops * const kexec_file_loaders[] = { int s390_verify_sig(const char *kernel, unsigned long kernel_len) { const unsigned long marker_len = sizeof(MODULE_SIG_STRING) - 1; - struct module_signature *ms; - unsigned long sig_len; - int ret; /* Skip signature verification when not secure IPLed. */ if (!ipl_secure_flag) @@ -45,19 +42,8 @@ int s390_verify_sig(const char *kernel, unsigned long kernel_len) return -EKEYREJECTED; kernel_len -= marker_len; - ms = (void *)kernel + kernel_len - sizeof(*ms); - ret = mod_check_sig(ms, kernel_len, "kexec"); - if (ret) - return ret; - - sig_len = be32_to_cpu(ms->sig_len); - kernel_len -= sizeof(*ms) + sig_len; - - return verify_pkcs7_signature(kernel, kernel_len, - kernel + kernel_len, sig_len, - VERIFY_USE_PLATFORM_KEYRING, - VERIFYING_MODULE_SIGNATURE, - NULL, NULL); + return verify_appended_signature(kernel, _len,
[PATCH v4 1/6] s390/kexec_file: Don't opencode appended signature check.
Module verification already implements the appended signature check. Reuse it for kexec_file.

The kexec_file implementation uses the EKEYREJECTED error in some cases when there is no key, while the common implementation uses ENOPKG or EBADMSG instead.

Signed-off-by: Michal Suchanek
Acked-by: Heiko Carstens
---
v3: Philipp Rudo: Update the commit with a note about the change of return value
---
 arch/s390/kernel/machine_kexec_file.c | 22 +-
 1 file changed, 5 insertions(+), 17 deletions(-)

diff --git a/arch/s390/kernel/machine_kexec_file.c b/arch/s390/kernel/machine_kexec_file.c
index 8f43575a4dd3..c944d71316c7 100644
--- a/arch/s390/kernel/machine_kexec_file.c
+++ b/arch/s390/kernel/machine_kexec_file.c
@@ -31,6 +31,7 @@ int s390_verify_sig(const char *kernel, unsigned long kernel_len)
 	const unsigned long marker_len = sizeof(MODULE_SIG_STRING) - 1;
 	struct module_signature *ms;
 	unsigned long sig_len;
+	int ret;
 
 	/* Skip signature verification when not secure IPLed. */
 	if (!ipl_secure_flag)
@@ -45,25 +46,12 @@ int s390_verify_sig(const char *kernel, unsigned long kernel_len)
 	kernel_len -= marker_len;
 
 	ms = (void *)kernel + kernel_len - sizeof(*ms);
-	kernel_len -= sizeof(*ms);
+	ret = mod_check_sig(ms, kernel_len, "kexec");
+	if (ret)
+		return ret;
 
 	sig_len = be32_to_cpu(ms->sig_len);
-	if (sig_len >= kernel_len)
-		return -EKEYREJECTED;
-	kernel_len -= sig_len;
-
-	if (ms->id_type != PKEY_ID_PKCS7)
-		return -EKEYREJECTED;
-
-	if (ms->algo != 0 ||
-	    ms->hash != 0 ||
-	    ms->signer_len != 0 ||
-	    ms->key_id_len != 0 ||
-	    ms->__pad[0] != 0 ||
-	    ms->__pad[1] != 0 ||
-	    ms->__pad[2] != 0) {
-		return -EBADMSG;
-	}
+	kernel_len -= sizeof(*ms) + sig_len;
 
 	return verify_pkcs7_signature(kernel, kernel_len,
 				      kernel + kernel_len, sig_len,
-- 
2.31.1
[PATCH v4 0/6] KEXEC_SIG with appended signature
Hello, This is a refresh of the KEXEC_SIG series. This adds KEXEC_SIG support on powerpc and deduplicates the code dealing with appended signatures in the kernel. powerpc supports IMA_KEXEC but that's an exception rather than the norm. On the other hand, KEXEC_SIG is portable across platforms. For distributions to have uniform security features across platforms one option should be used on all platforms. Thanks Michal Previous revision: https://lore.kernel.org/linuxppc-dev/cover.1637862358.git.msucha...@suse.de/ Patched kernel tree: https://github.com/hramrach/kernel/tree/kexec_sig Michal Suchanek (6): s390/kexec_file: Don't opencode appended signature check. powerpc/kexec_file: Add KEXEC_SIG support. kexec_file: Don't opencode appended signature verification. module: strip the signature marker in the verification function. module: Use key_being_used_for for log messages in verify_appended_signature module: Move duplicate mod_check_sig users code to mod_parse_sig arch/powerpc/Kconfig | 16 +++ arch/powerpc/kexec/elf_64.c | 12 + arch/s390/Kconfig| 2 +- arch/s390/kernel/machine_kexec_file.c| 41 + crypto/asymmetric_keys/asymmetric_type.c | 1 + include/linux/module_signature.h | 1 + include/linux/verification.h | 5 +++ kernel/module-internal.h | 2 - kernel/module.c | 12 +++-- kernel/module_signature.c| 56 +++- kernel/module_signing.c | 34 +++--- security/integrity/ima/ima_modsig.c | 22 ++ 12 files changed, 116 insertions(+), 88 deletions(-) -- 2.31.1
Re: [PATCH V2 03/17] asm-generic: fcntl: compat: Remove duplicate definitions
On Tue, Dec 28, 2021 at 3:39 PM wrote:
>
> From: Guo Ren
>
> Remove duplicate F_GETLK64,F_SETLK64,F_SETLKW64 definitions in
> arch/*/include/asm/compat.h.
>
> Signed-off-by: Guo Ren
> Signed-off-by: Guo Ren
> Cc: Arnd Bergmann

Unfortunately, this one does not look correct to me:

> @@ -116,7 +116,7 @@
>  #define F_GETSIG 11 /* for sockets. */
>  #endif
>
> -#ifndef CONFIG_64BIT
> +#if !defined(CONFIG_64BIT) || defined(CONFIG_COMPAT)
>  #ifndef F_GETLK64
>  #define F_GETLK64 12 /* using 'struct flock64' */
>  #define F_SETLK64 13

The problem here is that include/uapi/ headers cannot contain checks for CONFIG_* symbols, because those may have different meanings in user space compared to the kernel. This is a preexisting problem in the header, but I think the change makes it worse.

With the current behavior, user space will always see the definitions, unless it happens to have its own definition for CONFIG_64BIT already. On 64-bit parisc, this has the effect of defining the macros to the same values as F_SETOWN/F_SETSIG/F_GETSIG, which is potentially harmful. On MIPS, it uses values that are different from the 32-bit numbers but are otherwise unused. Everywhere else, we get the definition from the 32-bit architecture in user space, which will do nothing in the kernel.

The correct check for a uapi header would be to test for __BITS_PER_LONG==32. We should probably do that here, but this won't help you move the definitions, and it is a user-visible change as the incorrect definition will no longer be visible.

[Adding Jeff and Bruce (the flock maintainers) to Cc for additional feedback on this]

For your series, I would suggest just moving the macro definitions to include/linux/compat.h along with the 'struct compat_flock64' definition, and leaving the duplicate one in the uapi header unchanged until we have decided on a solution.

        Arnd
Re: [PATCH v5] powerpc/pseries: read the lpar name from the firmware
Hi Laurent, Thank you for the patch! Perhaps something to improve: [auto build test WARNING on powerpc/next] [also build test WARNING on linux/master linus/master v5.16 next-20220110] [If your patch is applied to the wrong git tree, kindly drop us a note. And when submitting patch, we suggest to use '--base' as documented in https://git-scm.com/docs/git-format-patch] url: https://github.com/0day-ci/linux/commits/Laurent-Dufour/powerpc-pseries-read-the-lpar-name-from-the-firmware/20220107-001503 base: https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next config: powerpc64-randconfig-r026-20220106 (https://download.01.org/0day-ci/archive/20220110/202201102154.a95oqepr-...@intel.com/config) compiler: powerpc64-linux-gcc (GCC) 11.2.0 reproduce (this is a W=1 build): wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross # https://github.com/0day-ci/linux/commit/5cf0dea6e919e93ff3088f87acd40e84608a6861 git remote add linux-review https://github.com/0day-ci/linux git fetch --no-tags linux-review Laurent-Dufour/powerpc-pseries-read-the-lpar-name-from-the-firmware/20220107-001503 git checkout 5cf0dea6e919e93ff3088f87acd40e84608a6861 # save the config file to linux build tree mkdir build_dir COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-11.2.0 make.cross O=build_dir ARCH=powerpc SHELL=/bin/bash arch/powerpc/platforms/pseries/ If you fix the issue, kindly add following tag as appropriate Reported-by: kernel test robot All warnings (new ones prefixed by >>): arch/powerpc/platforms/pseries/lparcfg.c:257: warning: Function parameter or member 'm' not described in 'parse_mpp_data' arch/powerpc/platforms/pseries/lparcfg.c:295: warning: Function parameter or member 'm' not described in 'parse_mpp_x_data' arch/powerpc/platforms/pseries/lparcfg.c:334: warning: Function parameter or member 'm' not described in 'read_RTAS_lpar_name' >> arch/powerpc/platforms/pseries/lparcfg.c:334: warning: expecting 
prototype >> for Read the lpar name using the RTAS ibm,get-system-parameter call(). Prototype was for >> read_RTAS_lpar_name() instead >> arch/powerpc/platforms/pseries/lparcfg.c:378: warning: This comment starts >> with '/**', but isn't a kernel-doc comment. Refer >> Documentation/doc-guide/kernel-doc.rst * Read the LPAR name from the Device Tree. arch/powerpc/platforms/pseries/lparcfg.c:678: warning: Function parameter or member 'entitlement' not described in 'update_mpp' arch/powerpc/platforms/pseries/lparcfg.c:678: warning: Function parameter or member 'weight' not described in 'update_mpp' vim +334 arch/powerpc/platforms/pseries/lparcfg.c 323 324 /** 325 * Read the lpar name using the RTAS ibm,get-system-parameter call. 326 * 327 * The name read through this call is updated if changes are made by the end 328 * user on the hypervisor side. 329 * 330 * Some hypervisor (like Qemu) may not provide this value. In that case, a non 331 * null value is returned. 332 */ 333 static int read_RTAS_lpar_name(struct seq_file *m) > 334 { 335 int rc, len, token; 336 union { 337 char raw_buffer[GET_SYS_PARM_BUF_SIZE]; 338 struct { 339 __be16 len; 340 char name[GET_SYS_PARM_BUF_SIZE-2]; 341 }; 342 } *local_buffer; 343 344 token = rtas_token("ibm,get-system-parameter"); 345 if (token == RTAS_UNKNOWN_SERVICE) 346 return -EINVAL; 347 348 local_buffer = kmalloc(sizeof(*local_buffer), GFP_KERNEL); 349 if (!local_buffer) 350 return -ENOMEM; 351 352 do { 353 spin_lock(&rtas_data_buf_lock); 354 memset(rtas_data_buf, 0, sizeof(*local_buffer)); 355 rc = rtas_call(token, 3, 1, NULL, SPLPAR_LPAR_NAME_TOKEN, 356 __pa(rtas_data_buf), sizeof(*local_buffer)); 357 if (!rc) 358 memcpy(local_buffer->raw_buffer, rtas_data_buf, 359 sizeof(local_buffer->raw_buffer)); 360 spin_unlock(&rtas_data_buf_lock); 361 } while (rtas_busy_delay(rc)); 362 363 if (!rc) { 364 /* Force end of string */ 365 len = min((int) be16_to_cpu(local_buffer->len), 366 (int) sizeof(local_buffer->name)-1); 367 local_buffer->name[len] = '\0'; 368 369 
seq_printf(m, "partition_name=%s\n", local_buffer->name); 370 } else 371 rc = -ENODATA; 372
[PATCH] powerpc/bpf: Always reallocate BPF_REG_5, BPF_REG_AX and TMP_REG when possible
BPF_REG_5, BPF_REG_AX and TMP_REG are mapped to non-volatile registers because there are not enough volatile registers, but they don't need to be preserved across function calls. So when some volatile registers become available, those registers can always be reallocated regardless of whether SEEN_FUNC is set or not. Suggested-by: Naveen N. Rao Signed-off-by: Christophe Leroy --- arch/powerpc/net/bpf_jit.h | 3 --- arch/powerpc/net/bpf_jit_comp32.c | 14 +++--- 2 files changed, 11 insertions(+), 6 deletions(-) diff --git a/arch/powerpc/net/bpf_jit.h b/arch/powerpc/net/bpf_jit.h index b20a2a83a6e7..b75507fc8f6b 100644 --- a/arch/powerpc/net/bpf_jit.h +++ b/arch/powerpc/net/bpf_jit.h @@ -127,9 +127,6 @@ #define SEEN_FUNC 0x20000000 /* might call external helpers */ #define SEEN_TAILCALL 0x40000000 /* uses tail calls */ -#define SEEN_VREG_MASK 0x1ff80000 /* Volatile registers r3-r12 */ -#define SEEN_NVREG_MASK 0x0003ffff /* Non volatile registers r14-r31 */ - #ifdef CONFIG_PPC64 extern const int b2p[MAX_BPF_JIT_REG + 2]; #else diff --git a/arch/powerpc/net/bpf_jit_comp32.c b/arch/powerpc/net/bpf_jit_comp32.c index d3a52cd42f53..cfec42c8a511 100644 --- a/arch/powerpc/net/bpf_jit_comp32.c +++ b/arch/powerpc/net/bpf_jit_comp32.c @@ -77,14 +77,22 @@ static int bpf_jit_stack_offsetof(struct codegen_context *ctx, int reg) return BPF_PPC_STACKFRAME(ctx) - 4; } +#define SEEN_VREG_MASK 0x1ff80000 /* Volatile registers r3-r12 */ +#define SEEN_NVREG_FULL_MASK 0x0003ffff /* Non volatile registers r14-r31 */ +#define SEEN_NVREG_TEMP_MASK 0x00001e01 /* BPF_REG_5, BPF_REG_AX, TMP_REG */ + void bpf_jit_realloc_regs(struct codegen_context *ctx) { + unsigned int nvreg_mask; + if (ctx->seen & SEEN_FUNC) - return; + nvreg_mask = SEEN_NVREG_TEMP_MASK; + else + nvreg_mask = SEEN_NVREG_FULL_MASK; - while (ctx->seen & SEEN_NVREG_MASK && + while (ctx->seen & nvreg_mask && (ctx->seen & SEEN_VREG_MASK) != SEEN_VREG_MASK) { - int old = 32 - fls(ctx->seen & (SEEN_NVREG_MASK & 0xaaaaaaab)); + int old = 32 - fls(ctx->seen & (nvreg_mask & 0xaaaaaaab)); int new = 32 - fls(~ctx->seen & (SEEN_VREG_MASK & 0xaaaaaaaa)); int i; -- 2.33.1
Re: [PATCH v2 8/8] powerpc/bpf: Reallocate BPF registers to volatile registers when possible on PPC32
Le 07/01/2022 à 12:51, Naveen N. Rao a écrit : > Christophe Leroy wrote: >> When the BPF routine doesn't call any function, the non volatile >> registers can be reallocated to volatile registers in order to >> avoid having to save them/restore on the stack. >> >> Before this patch, the test #359 ADD default X is: >> >> 0: 7c 64 1b 78 mr r4,r3 >> 4: 38 60 00 00 li r3,0 >> 8: 94 21 ff b0 stwu r1,-80(r1) >> c: 60 00 00 00 nop >> 10: 92 e1 00 2c stw r23,44(r1) >> 14: 93 01 00 30 stw r24,48(r1) >> 18: 93 21 00 34 stw r25,52(r1) >> 1c: 93 41 00 38 stw r26,56(r1) >> 20: 39 80 00 00 li r12,0 >> 24: 39 60 00 00 li r11,0 >> 28: 3b 40 00 00 li r26,0 >> 2c: 3b 20 00 00 li r25,0 >> 30: 7c 98 23 78 mr r24,r4 >> 34: 7c 77 1b 78 mr r23,r3 >> 38: 39 80 00 42 li r12,66 >> 3c: 39 60 00 00 li r11,0 >> 40: 7d 8c d2 14 add r12,r12,r26 >> 44: 39 60 00 00 li r11,0 >> 48: 7d 83 63 78 mr r3,r12 >> 4c: 82 e1 00 2c lwz r23,44(r1) >> 50: 83 01 00 30 lwz r24,48(r1) >> 54: 83 21 00 34 lwz r25,52(r1) >> 58: 83 41 00 38 lwz r26,56(r1) >> 5c: 38 21 00 50 addi r1,r1,80 >> 60: 4e 80 00 20 blr >> >> After this patch, the same test has become: >> >> 0: 7c 64 1b 78 mr r4,r3 >> 4: 38 60 00 00 li r3,0 >> 8: 94 21 ff b0 stwu r1,-80(r1) >> c: 60 00 00 00 nop >> 10: 39 80 00 00 li r12,0 >> 14: 39 60 00 00 li r11,0 >> 18: 39 00 00 00 li r8,0 >> 1c: 38 e0 00 00 li r7,0 >> 20: 7c 86 23 78 mr r6,r4 >> 24: 7c 65 1b 78 mr r5,r3 >> 28: 39 80 00 42 li r12,66 >> 2c: 39 60 00 00 li r11,0 >> 30: 7d 8c 42 14 add r12,r12,r8 >> 34: 39 60 00 00 li r11,0 >> 38: 7d 83 63 78 mr r3,r12 >> 3c: 38 21 00 50 addi r1,r1,80 >> 40: 4e 80 00 20 blr >> >> Signed-off-by: Christophe Leroy >> --- >> arch/powerpc/net/bpf_jit.h | 16 >> arch/powerpc/net/bpf_jit64.h | 2 +- >> arch/powerpc/net/bpf_jit_comp.c | 2 ++ >> arch/powerpc/net/bpf_jit_comp32.c | 30 -- >> arch/powerpc/net/bpf_jit_comp64.c | 4 >> 5 files changed, 51 insertions(+), 3 deletions(-) >> >> diff --git a/arch/powerpc/net/bpf_jit.h b/arch/powerpc/net/bpf_jit.h >> index 
a45b8266355d..776abef4d2a0 100644 >> --- a/arch/powerpc/net/bpf_jit.h >> +++ b/arch/powerpc/net/bpf_jit.h >> @@ -116,6 +116,15 @@ static inline bool is_nearbranch(int offset) >> #define SEEN_STACK 0x40000000 /* uses BPF stack */ >> #define SEEN_TAILCALL 0x80000000 /* uses tail calls */ >> >> +#define SEEN_VREG_MASK 0x1ff80000 /* Volatile registers r3-r12 */ >> +#define SEEN_NVREG_MASK 0x0003ffff /* Non volatile registers >> r14-r31 */ >> + >> +#ifdef CONFIG_PPC64 >> +extern const int b2p[MAX_BPF_JIT_REG + 2]; >> +#else >> +extern const int b2p[MAX_BPF_JIT_REG + 1]; >> +#endif >> + >> struct codegen_context { >> /* >> * This is used to track register usage as well >> @@ -129,6 +138,7 @@ struct codegen_context { >> unsigned int seen; >> unsigned int idx; >> unsigned int stack_size; >> + int b2p[ARRAY_SIZE(b2p)]; >> }; >> >> static inline void bpf_flush_icache(void *start, void *end) >> @@ -147,11 +157,17 @@ static inline void bpf_set_seen_register(struct >> codegen_context *ctx, int i) >> ctx->seen |= 1 << (31 - i); >> } >> >> +static inline void bpf_clear_seen_register(struct codegen_context >> *ctx, int i) >> +{ >> + ctx->seen &= ~(1 << (31 - i)); >> +} >> + >> void bpf_jit_emit_func_call_rel(u32 *image, struct codegen_context >> *ctx, u64 func); >> int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, struct >> codegen_context *ctx, >> u32 *addrs, bool extra_pass); >> void bpf_jit_build_prologue(u32 *image, struct codegen_context *ctx); >> void bpf_jit_build_epilogue(u32 *image, struct codegen_context *ctx); >> +void bpf_jit_realloc_regs(struct codegen_context *ctx); >> >> #endif >> >> diff --git a/arch/powerpc/net/bpf_jit64.h b/arch/powerpc/net/bpf_jit64.h >> index b05f2e67bba1..7b713edfa7e2 100644 >> --- a/arch/powerpc/net/bpf_jit64.h >> +++ b/arch/powerpc/net/bpf_jit64.h >> @@ -39,7 +39,7 @@ >> #define TMP_REG_2 (MAX_BPF_JIT_REG + 1) >> >> /* BPF to ppc register mappings */ >> -static const int b2p[] = { >> +const int b2p[MAX_BPF_JIT_REG + 2] = { >> /* function return value 
*/ >> [BPF_REG_0] = 8, >> /* function arguments */ >> diff --git a/arch/powerpc/net/bpf_jit_comp.c >> b/arch/powerpc/net/bpf_jit_comp.c >> index efac89964873..798ac4350a82 100644 >> --- a/arch/powerpc/net/bpf_jit_comp.c >> +++
Re: [PATCH V2 02/17] fs: stat: compat: Add __ARCH_WANT_COMPAT_STAT
On Tue, Dec 28, 2021 at 3:39 PM wrote: > > From: Guo Ren > > RISC-V doesn't need compat_stat, so use __ARCH_WANT_COMPAT_STAT > to exclude the unnecessary SYSCALL functions. > > Signed-off-by: Guo Ren > Signed-off-by: Guo Ren Reviewed-by: Arnd Bergmann
Re: [PATCH V2 01/17] kconfig: Add SYSVIPC_COMPAT for all architectures
On Tue, Dec 28, 2021 at 3:39 PM wrote: > > From: Guo Ren > > The existing per-arch definitions are pretty much historic cruft. > Move SYSVIPC_COMPAT into init/Kconfig. > > Signed-off-by: Guo Ren > Signed-off-by: Guo Ren > Cc: Arnd Bergmann > Cc: Christoph Hellwig Acked-by: Arnd Bergmann
Re: [PATCH 03/13] powerpc/bpf: Update ldimm64 instructions during extra pass
Christophe Leroy wrote: Le 06/01/2022 à 12:45, Naveen N. Rao a écrit : These instructions are updated after the initial JIT, so redo codegen during the extra pass. Rename bpf_jit_fixup_subprog_calls() to clarify that this is more than just subprog calls. Fixes: 69c087ba6225b5 ("bpf: Add bpf_for_each_map_elem() helper") Cc: sta...@vger.kernel.org # v5.15 Signed-off-by: Naveen N. Rao --- arch/powerpc/net/bpf_jit_comp.c | 29 +++-- arch/powerpc/net/bpf_jit_comp32.c | 6 ++ arch/powerpc/net/bpf_jit_comp64.c | 7 ++- 3 files changed, 35 insertions(+), 7 deletions(-) diff --git a/arch/powerpc/net/bpf_jit_comp.c b/arch/powerpc/net/bpf_jit_comp.c index d6ffdd0f2309d0..56dd1f4e3e4447 100644 --- a/arch/powerpc/net/bpf_jit_comp.c +++ b/arch/powerpc/net/bpf_jit_comp.c @@ -23,15 +23,15 @@ static void bpf_jit_fill_ill_insns(void *area, unsigned int size) memset32(area, BREAKPOINT_INSTRUCTION, size / 4); } -/* Fix the branch target addresses for subprog calls */ -static int bpf_jit_fixup_subprog_calls(struct bpf_prog *fp, u32 *image, - struct codegen_context *ctx, u32 *addrs) +/* Fix updated addresses (for subprog calls, ldimm64, et al) during extra pass */ +static int bpf_jit_fixup_addresses(struct bpf_prog *fp, u32 *image, + struct codegen_context *ctx, u32 *addrs) { const struct bpf_insn *insn = fp->insnsi; bool func_addr_fixed; u64 func_addr; u32 tmp_idx; - int i, ret; + int i, j, ret; for (i = 0; i < fp->len; i++) { /* @@ -66,6 +66,23 @@ static int bpf_jit_fixup_subprog_calls(struct bpf_prog *fp, u32 *image, * of the JITed sequence remains unchanged. 
*/ ctx->idx = tmp_idx; + } else if (insn[i].code == (BPF_LD | BPF_IMM | BPF_DW)) { + tmp_idx = ctx->idx; + ctx->idx = addrs[i] / 4; +#ifdef CONFIG_PPC32 + PPC_LI32(ctx->b2p[insn[i].dst_reg] - 1, (u32)insn[i + 1].imm); + PPC_LI32(ctx->b2p[insn[i].dst_reg], (u32)insn[i].imm); + for (j = ctx->idx - addrs[i] / 4; j < 4; j++) + EMIT(PPC_RAW_NOP()); +#else + func_addr = ((u64)(u32)insn[i].imm) | (((u64)(u32)insn[i + 1].imm) << 32); + PPC_LI64(b2p[insn[i].dst_reg], func_addr); + /* overwrite rest with nops */ + for (j = ctx->idx - addrs[i] / 4; j < 5; j++) + EMIT(PPC_RAW_NOP()); +#endif #ifdefs should be avoided as much as possible. Here it seems we could easily do an if (IS_ENABLED(CONFIG_PPC32)) { } else { } And it looks like the CONFIG_PPC64 alternative would in fact also work on PPC32, wouldn't it ? We never implemented PPC_LI64() for ppc32: /linux/arch/powerpc/net/bpf_jit_comp.c: In function 'bpf_jit_fixup_addresses': /linux/arch/powerpc/net/bpf_jit_comp.c:81:5: error: this decimal constant is unsigned only in ISO C90 [-Werror] 81 | PPC_LI64(b2p[insn[i].dst_reg], func_addr); | ^~~~ /linux/arch/powerpc/net/bpf_jit_comp.c:81:5: error: this decimal constant is unsigned only in ISO C90 [-Werror] In file included from /linux/arch/powerpc/net/bpf_jit_comp.c:19: /linux/arch/powerpc/net/bpf_jit.h:78:40: error: right shift count >= width of type [-Werror=shift-count-overflow] 78 | EMIT(PPC_RAW_LI(d, ((uintptr_t)(i) >> 32) & \ |^~ We should move that out from bpf_jit.h - Naveen
Re: [PATCH 02/13] powerpc32/bpf: Fix codegen for bpf-to-bpf calls
Christophe Leroy wrote: Le 06/01/2022 à 12:45, Naveen N. Rao a écrit : Pad instructions emitted for BPF_CALL so that the number of instructions generated does not change for different function addresses. This is especially important for calls to other bpf functions, whose address will only be known during extra pass. In first pass, 'image' is NULL and we emit the 4 instructions sequence already, so the code won't grow after first pass, it can only shrink. Right, but this patch addresses the scenario where the function address is only provided during the extra pass. So, even though we will not write past the end of the BPF image, the emitted instructions can still be wrong. On PPC32, a huge effort is made to minimise the situations where 'bl' cannot be used, see commit 2ec13df16704 ("powerpc/modules: Load modules closer to kernel text") And if you take the 8xx for instance, a NOP is just like any other instruction, it takes one cycle. If it is absolutely needed, then I'd prefer we use an out-of-line trampoline for the unlikely case and use 'bl' to that trampoline. Yes, something like that would be nice to do, but we will still need this patch for -stable. The other option is to redo the whole JIT during the extra pass, but only if we can ensure that we have provisioned for the maximum image size. - Naveen
Re: [PATCH 01/13] bpf: Guard against accessing NULL pt_regs in bpf_get_task_stack()
Christophe Leroy wrote: Le 06/01/2022 à 12:45, Naveen N. Rao a écrit : task_pt_regs() can return NULL on powerpc for kernel threads. This is then used in __bpf_get_stack() to check for user mode, resulting in a kernel oops. Guard against this by checking return value of task_pt_regs() before trying to obtain the call chain. I started looking at that some time ago, and I'm wondering whether it is worth keeping that powerpc particularity. We used to have a potentially different pt_regs depending on how we entered kernel, especially on PPC32, but since the following commits it is not the case anymore. 06d67d54741a ("powerpc: make process.c suitable for both 32-bit and 64-bit") db297c3b07af ("powerpc/32: Don't save thread.regs on interrupt entry") b5cfc9cd7b04 ("powerpc/32: Fix critical and debug interrupts on BOOKE") We could therefore just do like other architectures, define #define task_pt_regs(p) ((struct pt_regs *)(THREAD_SIZE + task_stack_page(p)) - 1) And then remove the regs field we have in thread_struct. Sure, I don't have an opinion on that, but I think this patch will still be needed for -stable. Fixes: fa28dcb82a38f8 ("bpf: Introduce helper bpf_get_task_stack()") Cc: sta...@vger.kernel.org # v5.9+ Signed-off-by: Naveen N. Rao --- kernel/bpf/stackmap.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/kernel/bpf/stackmap.c b/kernel/bpf/stackmap.c index 6e75bbee39f0b5..0dcaed4d3f4cec 100644 --- a/kernel/bpf/stackmap.c +++ b/kernel/bpf/stackmap.c @@ -525,13 +525,14 @@ BPF_CALL_4(bpf_get_task_stack, struct task_struct *, task, void *, buf, u32, size, u64, flags) { struct pt_regs *regs; - long res; + long res = -EINVAL; if (!try_get_task_stack(task)) return -EFAULT; regs = task_pt_regs(task); - res = __bpf_get_stack(regs, task, NULL, buf, size, flags); + if (regs) + res = __bpf_get_stack(regs, task, NULL, buf, size, flags); Should there be a comment explaining that on powerpc, 'regs' can be NULL for a kernel thread ? 
I guess this won't be required if we end up with the change you are proposing above? - Naveen
[PATCH] powerpc/security: Provide stubs for when PPC_BARRIER_NOSPEC isn't enabled
kernel test robot reported the below build error with a randconfig: powerpc64-linux-ld: arch/powerpc/net/bpf_jit_comp64.o:(.toc+0x0): undefined reference to `powerpc_security_features' This can happen if CONFIG_PPC_BARRIER_NOSPEC is not enabled. Address this by providing stub functions for security_ftr_enabled() and related helpers when the config option is not enabled. Reported-by: kernel test robot Signed-off-by: Naveen N. Rao --- arch/powerpc/include/asm/security_features.h | 15 +++ 1 file changed, 11 insertions(+), 4 deletions(-) diff --git a/arch/powerpc/include/asm/security_features.h b/arch/powerpc/include/asm/security_features.h index 27574f218b371f..f2b990052641a0 100644 --- a/arch/powerpc/include/asm/security_features.h +++ b/arch/powerpc/include/asm/security_features.h @@ -8,10 +8,6 @@ #ifndef _ASM_POWERPC_SECURITY_FEATURES_H #define _ASM_POWERPC_SECURITY_FEATURES_H - -extern u64 powerpc_security_features; -extern bool rfi_flush; - /* These are bit flags */ enum stf_barrier_type { STF_BARRIER_NONE= 0x1, @@ -20,6 +16,10 @@ enum stf_barrier_type { STF_BARRIER_SYNC_ORI= 0x8, }; +#ifdef CONFIG_PPC_BARRIER_NOSPEC +extern u64 powerpc_security_features; +extern bool rfi_flush; + void setup_stf_barrier(void); void do_stf_barrier_fixups(enum stf_barrier_type types); void setup_count_cache_flush(void); @@ -45,6 +45,13 @@ enum stf_barrier_type stf_barrier_type_get(void); static inline enum stf_barrier_type stf_barrier_type_get(void) { return STF_BARRIER_NONE; } #endif +#else /* CONFIG_PPC_BARRIER_NOSPEC */ +static inline void security_ftr_set(u64 feature) { } +static inline void security_ftr_clear(u64 feature) { } +static inline bool security_ftr_enabled(u64 feature) { return false; } +static inline enum stf_barrier_type stf_barrier_type_get(void) { return STF_BARRIER_NONE; } +#endif /* CONFIG_PPC_BARRIER_NOSPEC */ + // Features indicating support for Spectre/Meltdown mitigations // The L1-D cache can be flushed with ori r30,r30,0 base-commit: 
bdcf18e133f656b2c97390a594fc95e37849e682 -- 2.34.1
Re: [PATCH 03/13] powerpc/bpf: Update ldimm64 instructions during extra pass
Le 06/01/2022 à 12:45, Naveen N. Rao a écrit : > These instructions are updated after the initial JIT, so redo codegen > during the extra pass. Rename bpf_jit_fixup_subprog_calls() to clarify > that this is more than just subprog calls. > > Fixes: 69c087ba6225b5 ("bpf: Add bpf_for_each_map_elem() helper") > Cc: sta...@vger.kernel.org # v5.15 > Signed-off-by: Naveen N. Rao > --- > arch/powerpc/net/bpf_jit_comp.c | 29 +++-- > arch/powerpc/net/bpf_jit_comp32.c | 6 ++ > arch/powerpc/net/bpf_jit_comp64.c | 7 ++- > 3 files changed, 35 insertions(+), 7 deletions(-) > > diff --git a/arch/powerpc/net/bpf_jit_comp.c b/arch/powerpc/net/bpf_jit_comp.c > index d6ffdd0f2309d0..56dd1f4e3e4447 100644 > --- a/arch/powerpc/net/bpf_jit_comp.c > +++ b/arch/powerpc/net/bpf_jit_comp.c > @@ -23,15 +23,15 @@ static void bpf_jit_fill_ill_insns(void *area, unsigned > int size) > memset32(area, BREAKPOINT_INSTRUCTION, size / 4); > } > > -/* Fix the branch target addresses for subprog calls */ > -static int bpf_jit_fixup_subprog_calls(struct bpf_prog *fp, u32 *image, > -struct codegen_context *ctx, u32 *addrs) > +/* Fix updated addresses (for subprog calls, ldimm64, et al) during extra > pass */ > +static int bpf_jit_fixup_addresses(struct bpf_prog *fp, u32 *image, > +struct codegen_context *ctx, u32 *addrs) > { > const struct bpf_insn *insn = fp->insnsi; > bool func_addr_fixed; > u64 func_addr; > u32 tmp_idx; > - int i, ret; > + int i, j, ret; > > for (i = 0; i < fp->len; i++) { > /* > @@ -66,6 +66,23 @@ static int bpf_jit_fixup_subprog_calls(struct bpf_prog > *fp, u32 *image, >* of the JITed sequence remains unchanged. 
>*/ > ctx->idx = tmp_idx; > + } else if (insn[i].code == (BPF_LD | BPF_IMM | BPF_DW)) { > + tmp_idx = ctx->idx; > + ctx->idx = addrs[i] / 4; > +#ifdef CONFIG_PPC32 > + PPC_LI32(ctx->b2p[insn[i].dst_reg] - 1, (u32)insn[i + > 1].imm); > + PPC_LI32(ctx->b2p[insn[i].dst_reg], (u32)insn[i].imm); > + for (j = ctx->idx - addrs[i] / 4; j < 4; j++) > + EMIT(PPC_RAW_NOP()); > +#else > + func_addr = ((u64)(u32)insn[i].imm) | > (((u64)(u32)insn[i + 1].imm) << 32); > + PPC_LI64(b2p[insn[i].dst_reg], func_addr); > + /* overwrite rest with nops */ > + for (j = ctx->idx - addrs[i] / 4; j < 5; j++) > + EMIT(PPC_RAW_NOP()); > +#endif #ifdefs should be avoided as much as possible. Here it seems we could easily do an if (IS_ENABLED(CONFIG_PPC32)) { } else { } And it looks like the CONFIG_PPC64 alternative would in fact also work on PPC32, wouldn't it ? > + ctx->idx = tmp_idx; > + i++; > } > } > > @@ -200,13 +217,13 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog > *fp) > /* >* Do not touch the prologue and epilogue as they will remain >* unchanged. Only fix the branch target address for subprog > - * calls in the body. > + * calls in the body, and ldimm64 instructions. >* >* This does not change the offsets and lengths of the subprog >* call instruction sequences and hence, the size of the JITed >* image as well. >*/ > - bpf_jit_fixup_subprog_calls(fp, code_base, &cgctx, addrs); > + bpf_jit_fixup_addresses(fp, code_base, &cgctx, addrs); > > /* There is no need to perform the usual passes. 
*/ > goto skip_codegen_passes; > diff --git a/arch/powerpc/net/bpf_jit_comp32.c > b/arch/powerpc/net/bpf_jit_comp32.c > index 997a47fa615b30..2258d3886d02ec 100644 > --- a/arch/powerpc/net/bpf_jit_comp32.c > +++ b/arch/powerpc/net/bpf_jit_comp32.c > @@ -293,6 +293,8 @@ int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, > struct codegen_context * > bool func_addr_fixed; > u64 func_addr; > u32 true_cond; > + u32 tmp_idx; > + int j; > > /* >* addrs[] maps a BPF bytecode address into a real offset from > @@ -908,8 +910,12 @@ int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, > struct codegen_context * >* 16 byte instruction that uses two 'struct bpf_insn' >*/ > case BPF_LD | BPF_IMM | BPF_DW: /* dst = (u64) imm */ > + tmp_idx = ctx->idx; > PPC_LI32(dst_reg_h, (u32)insn[i + 1].imm); > PPC_LI32(dst_reg, (u32)insn[i].imm); > + /*
Re: [PATCH 11/13] powerpc64/bpf elfv2: Setup kernel TOC in r2 on entry
Le 06/01/2022 à 12:45, Naveen N. Rao a écrit : > In preparation for using kernel TOC, load the same in r2 on entry. With > elfv1, the kernel TOC is already setup by our caller so we just emit a > nop. We adjust the number of instructions to skip on a tail call > accordingly. > > Signed-off-by: Naveen N. Rao > --- > arch/powerpc/net/bpf_jit_comp64.c | 8 +++- > 1 file changed, 7 insertions(+), 1 deletion(-) > > diff --git a/arch/powerpc/net/bpf_jit_comp64.c > b/arch/powerpc/net/bpf_jit_comp64.c > index ce4fc59bbd6a92..e05b577d95bf11 100644 > --- a/arch/powerpc/net/bpf_jit_comp64.c > +++ b/arch/powerpc/net/bpf_jit_comp64.c > @@ -73,6 +73,12 @@ void bpf_jit_build_prologue(u32 *image, struct > codegen_context *ctx) > { > int i; > > +#ifdef PPC64_ELF_ABI_v2 > + PPC_BPF_LL(_R2, _R13, offsetof(struct paca_struct, kernel_toc)); > +#else > + EMIT(PPC_RAW_NOP()); > +#endif Can we avoid the #ifdef, using if (__is_defined(PPC64_ELF_ABI_v2)) PPC_BPF_LL(_R2, _R13, offsetof(struct paca_struct, kernel_toc)); else EMIT(PPC_RAW_NOP()); > + > /* >* Initialize tail_call_cnt if we do tail calls. >* Otherwise, put in NOPs so that it can be skipped when we are > @@ -87,7 +93,7 @@ void bpf_jit_build_prologue(u32 *image, struct > codegen_context *ctx) > EMIT(PPC_RAW_NOP()); > } > > -#define BPF_TAILCALL_PROLOGUE_SIZE 8 > +#define BPF_TAILCALL_PROLOGUE_SIZE 12 Why not change that for v2 ABI only instead of adding a NOP ? ABI won't change during runtime AFAIU > > if (bpf_has_stack_frame(ctx)) { > /*
Re: [PATCH 02/13] powerpc32/bpf: Fix codegen for bpf-to-bpf calls
Le 06/01/2022 à 12:45, Naveen N. Rao a écrit : > Pad instructions emitted for BPF_CALL so that the number of instructions > generated does not change for different function addresses. This is > especially important for calls to other bpf functions, whose address > will only be known during extra pass. In first pass, 'image' is NULL and we emit the 4 instructions sequence already, so the code won't grow after first pass, it can only shrink. On PPC32, a huge effort is made to minimise the situations where 'bl' cannot be used, see commit 2ec13df16704 ("powerpc/modules: Load modules closer to kernel text") And if you take the 8xx for instance, a NOP is just like any other instruction, it takes one cycle. If it is absolutely needed, then I'd prefer we use an out-of-line trampoline for the unlikely case and use 'bl' to that trampoline. > > Fixes: 51c66ad849a703 ("powerpc/bpf: Implement extended BPF on PPC32") > Cc: sta...@vger.kernel.org # v5.13+ > Signed-off-by: Naveen N. Rao > --- > arch/powerpc/net/bpf_jit_comp32.c | 3 +++ > 1 file changed, 3 insertions(+) > > diff --git a/arch/powerpc/net/bpf_jit_comp32.c > b/arch/powerpc/net/bpf_jit_comp32.c > index d3a52cd42f5346..997a47fa615b30 100644 > --- a/arch/powerpc/net/bpf_jit_comp32.c > +++ b/arch/powerpc/net/bpf_jit_comp32.c > @@ -191,6 +191,9 @@ void bpf_jit_emit_func_call_rel(u32 *image, struct > codegen_context *ctx, u64 fun > > if (image && rel < 0x2000000 && rel >= -0x2000000) { > PPC_BL_ABS(func); > + EMIT(PPC_RAW_NOP()); > + EMIT(PPC_RAW_NOP()); > + EMIT(PPC_RAW_NOP()); > } else { > /* Load function address into r0 */ > EMIT(PPC_RAW_LIS(_R0, IMM_H(func)));
Re: [PATCH 01/13] bpf: Guard against accessing NULL pt_regs in bpf_get_task_stack()
Le 06/01/2022 à 12:45, Naveen N. Rao a écrit : > task_pt_regs() can return NULL on powerpc for kernel threads. This is > then used in __bpf_get_stack() to check for user mode, resulting in a > kernel oops. Guard against this by checking return value of > task_pt_regs() before trying to obtain the call chain. I started looking at that some time ago, and I'm wondering whether it is worth keeping that powerpc particularity. We used to have a potentially different pt_regs depending on how we entered kernel, especially on PPC32, but since the following commits it is not the case anymore. 06d67d54741a ("powerpc: make process.c suitable for both 32-bit and 64-bit") db297c3b07af ("powerpc/32: Don't save thread.regs on interrupt entry") b5cfc9cd7b04 ("powerpc/32: Fix critical and debug interrupts on BOOKE") We could therefore just do like other architectures, define #define task_pt_regs(p) ((struct pt_regs *)(THREAD_SIZE + task_stack_page(p)) - 1) And then remove the regs field we have in thread_struct. > > Fixes: fa28dcb82a38f8 ("bpf: Introduce helper bpf_get_task_stack()") > Cc: sta...@vger.kernel.org # v5.9+ > Signed-off-by: Naveen N. Rao > --- > kernel/bpf/stackmap.c | 5 +++-- > 1 file changed, 3 insertions(+), 2 deletions(-) > > diff --git a/kernel/bpf/stackmap.c b/kernel/bpf/stackmap.c > index 6e75bbee39f0b5..0dcaed4d3f4cec 100644 > --- a/kernel/bpf/stackmap.c > +++ b/kernel/bpf/stackmap.c > @@ -525,13 +525,14 @@ BPF_CALL_4(bpf_get_task_stack, struct task_struct *, > task, void *, buf, > u32, size, u64, flags) > { > struct pt_regs *regs; > - long res; > + long res = -EINVAL; > > if (!try_get_task_stack(task)) > return -EFAULT; > > regs = task_pt_regs(task); > - res = __bpf_get_stack(regs, task, NULL, buf, size, flags); > + if (regs) > + res = __bpf_get_stack(regs, task, NULL, buf, size, flags); Should there be a comment explaining that on powerpc, 'regs' can be NULL for a kernel thread ? > put_task_stack(task); > > return res;
Re: [PATCH 02/16] floppy: Remove usage of the deprecated "pci-dma-compat.h" API
Looks good, Reviewed-by: Christoph Hellwig
Re: [PATCH 06/23] powerpc/membarrier: Remove special barrier on mm switch
Le 08/01/2022 à 17:43, Andy Lutomirski a écrit : > powerpc did the following on some, but not all, paths through > switch_mm_irqs_off(): > > /* > * Only need the full barrier when switching between processes. > * Barrier when switching from kernel to userspace is not > * required here, given that it is implied by mmdrop(). Barrier > * when switching from userspace to kernel is not needed after > * store to rq->curr. > */ > if (likely(!(atomic_read(&next->membarrier_state) & > (MEMBARRIER_STATE_PRIVATE_EXPEDITED | > MEMBARRIER_STATE_GLOBAL_EXPEDITED)) || !prev)) > return; > > This is puzzling: if !prev, then one might expect that we are switching > from kernel to user, not user to kernel, which is inconsistent with the > comment. But this is all nonsense, because the one and only caller would > never have prev == NULL and would, in fact, OOPS if prev == NULL. > > In any event, this code is unnecessary, since the new generic > membarrier_finish_switch_mm() provides the same barrier without arch help. I can't find this function membarrier_finish_switch_mm(), neither in Linus' tree nor in the linux-next tree. > > arch/powerpc/include/asm/membarrier.h remains as an empty header, > because a later patch in this series will add code to it. 
> > Cc: Michael Ellerman > Cc: Benjamin Herrenschmidt > Cc: Paul Mackerras > Cc: linuxppc-dev@lists.ozlabs.org > Cc: Nicholas Piggin > Cc: Mathieu Desnoyers > Cc: Peter Zijlstra > Signed-off-by: Andy Lutomirski > --- > arch/powerpc/include/asm/membarrier.h | 24 > arch/powerpc/mm/mmu_context.c | 1 - > 2 files changed, 25 deletions(-) > > diff --git a/arch/powerpc/include/asm/membarrier.h > b/arch/powerpc/include/asm/membarrier.h > index de7f79157918..b90766e95bd1 100644 > --- a/arch/powerpc/include/asm/membarrier.h > +++ b/arch/powerpc/include/asm/membarrier.h > @@ -1,28 +1,4 @@ > #ifndef _ASM_POWERPC_MEMBARRIER_H > #define _ASM_POWERPC_MEMBARRIER_H > > -static inline void membarrier_arch_switch_mm(struct mm_struct *prev, > - struct mm_struct *next, > - struct task_struct *tsk) > -{ > - /* > - * Only need the full barrier when switching between processes. > - * Barrier when switching from kernel to userspace is not > - * required here, given that it is implied by mmdrop(). Barrier > - * when switching from userspace to kernel is not needed after > - * store to rq->curr. > - */ > - if (IS_ENABLED(CONFIG_SMP) && > - likely(!(atomic_read(&next->membarrier_state) & > - (MEMBARRIER_STATE_PRIVATE_EXPEDITED | > - MEMBARRIER_STATE_GLOBAL_EXPEDITED)) || !prev)) > - return; > - > - /* > - * The membarrier system call requires a full memory barrier > - * after storing to rq->curr, before going back to user-space. > - */ > - smp_mb(); > -} > - > #endif /* _ASM_POWERPC_MEMBARRIER_H */ > diff --git a/arch/powerpc/mm/mmu_context.c b/arch/powerpc/mm/mmu_context.c > index 74246536b832..5f2daa6b0497 100644 > --- a/arch/powerpc/mm/mmu_context.c > +++ b/arch/powerpc/mm/mmu_context.c > @@ -84,7 +84,6 @@ void switch_mm_irqs_off(struct mm_struct *prev, struct > mm_struct *next, > asm volatile ("dssall"); > > if (!new_on_cpu) > - membarrier_arch_switch_mm(prev, next, tsk); Are you sure that's what you want ? 
It now means you have: if (!new_on_cpu) switch_mmu_context(prev, next, tsk); > > /* >* The actual HW switching method differs between the various
Re: [PATCH] fs: btrfs: Disable BTRFS on platforms having 256K pages
Hi Qu,

Le 05/01/2022 à 00:32, Qu Wenruo a écrit :
> Hi Christophe,
>
> I'm recently enhancing the subpage support for btrfs, and my current
> branch should solve the problem for btrfs to support larger page sizes.
>
> But unfortunately my current test environment can only provide page size
> with 64K or 4K, no 16K or 128K/256K support.
>
> Mind to test my new branch on 128K page size systems?
> (256K page size support is still lacking though, which will be addressed
> in the future)

I don't have any system with disk, I only use flashdisks with UBIFS
filesystem.

The reason why I did this commit was because of a build failure reported
by Kernel Build Robot, that's it.

Also note that powerpc doesn't have 128K pages. Only 4/16/64/256. And for
256 it requires a special version of ld and binutils that I don't have.

I have a board where I can do 16k pages, but again that board has no disk.

Christophe

>
> https://github.com/adam900710/linux/tree/metadata_subpage_switch
>
> Thanks,
> Qu
>
> On 2021/6/10 13:23, Christophe Leroy wrote:
>> With a config having PAGE_SIZE set to 256K, BTRFS build fails
>> with the following message
>>
>>   include/linux/compiler_types.h:326:38: error: call to
>> '__compiletime_assert_791' declared with attribute error: BUILD_BUG_ON
>> failed: (BTRFS_MAX_COMPRESSED % PAGE_SIZE) != 0
>>
>> BTRFS_MAX_COMPRESSED being 128K, BTRFS cannot support platforms with
>> 256K pages at the time being.
>>
>> There are two platforms that can select 256K pages:
>>   - hexagon
>>   - powerpc
>>
>> Disable BTRFS when 256K page size is selected.
>>
>> Reported-by: kernel test robot
>> Signed-off-by: Christophe Leroy
>> ---
>>   fs/btrfs/Kconfig | 2 ++
>>   1 file changed, 2 insertions(+)
>>
>> diff --git a/fs/btrfs/Kconfig b/fs/btrfs/Kconfig
>> index 68b95ad82126..520a0f6a7d9e 100644
>> --- a/fs/btrfs/Kconfig
>> +++ b/fs/btrfs/Kconfig
>> @@ -18,6 +18,8 @@ config BTRFS_FS
>>   	select RAID6_PQ
>>   	select XOR_BLOCKS
>>   	select SRCU
>> +	depends on !PPC_256K_PAGES	# powerpc
>> +	depends on !PAGE_SIZE_256KB	# hexagon
>>
>>   	help
>>   	  Btrfs is a general purpose copy-on-write filesystem with extents,
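The Kconfig guard above mirrors a compile-time check inside btrfs: BTRFS_MAX_COMPRESSED is 128K, and the build asserts it is a whole multiple of PAGE_SIZE, which only fails once pages grow past 128K. A hedged userspace sketch of that arithmetic (the function name is illustrative, not a btrfs API):

```c
#define SZ_128K			(128UL * 1024)
#define BTRFS_MAX_COMPRESSED	SZ_128K

/*
 * Mimics the failing BUILD_BUG_ON condition: a page size is compatible
 * when a 128K compressed extent spans a whole number of pages.
 */
static int btrfs_page_size_ok(unsigned long page_size)
{
	return (BTRFS_MAX_COMPRESSED % page_size) == 0;
}
```

Every page size up to and including 128K divides 128K evenly, which is why only the two 256K-page platforms need the `depends on` exclusion.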
Re: [PATCH v7 1/3] riscv: Introduce CONFIG_RELOCATABLE
Hi Palmer,

Do you think this could go in for-next?

Thanks,
Alex

On 12/6/21 10:44, Alexandre ghiti wrote:

@Palmer, can I do anything for that to be pulled in 5.17?

Thanks,
Alex

On 10/27/21 07:04, Alexandre ghiti wrote:

Hi Palmer,

On 10/26/21 11:29 PM, Palmer Dabbelt wrote:

On Sat, 09 Oct 2021 10:20:20 PDT (-0700), a...@ghiti.fr wrote:

Arf, I have sent this patchset with the wrong email address. @Palmer
tell me if you want me to resend it correctly.

Sorry for being kind of slow here. It's fine: there's a "From:" in the
patch, and git picks those up so it'll match the signed-off-by line. I
send pretty much all my patches that way, as I never managed to get my
Google address working correctly.

Thanks,
Alex

On 10/9/21 7:12 PM, Alexandre Ghiti wrote:

From: Alexandre Ghiti

This config allows to compile 64b kernel as PIE and to relocate it at
any virtual address at runtime: this paves the way to KASLR. Runtime
relocation is possible since relocation metadata are embedded into the
kernel.

IMO this should really be user selectable, at a bare minimum so it's
testable. I just sent along a patch to do that (my power's off at home,
so email is a bit wacky right now). I haven't put this on for-next yet
as I'm not sure if you had a fix for the kasan issue (which IIUC would
conflict with this).

The kasan issue only revealed that I need to move the kasan shadow
memory around with sv48 support, that's not related to the relocatable
kernel.

Thanks,
Alex

Note that relocating at runtime introduces an overhead even if the
kernel is loaded at the same address it was linked at and that the
compiler options are those used in arm64 which uses the same RELA
relocation format.

Signed-off-by: Alexandre Ghiti
---
 arch/riscv/Kconfig              | 12
 arch/riscv/Makefile             |  7 +++--
 arch/riscv/kernel/vmlinux.lds.S |  6
 arch/riscv/mm/Makefile          |  4 +++
 arch/riscv/mm/init.c            | 54 -
 5 files changed, 80 insertions(+), 3 deletions(-)

diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index ea16fa2dd768..043ba92559fa 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -213,6 +213,18 @@ config PGTABLE_LEVELS
 config LOCKDEP_SUPPORT
 	def_bool y

+config RELOCATABLE
+	bool
+	depends on MMU && 64BIT && !XIP_KERNEL
+	help
+	  This builds a kernel as a Position Independent Executable (PIE),
+	  which retains all relocation metadata required to relocate the
+	  kernel binary at runtime to a different virtual address than the
+	  address it was linked at.
+	  Since RISCV uses the RELA relocation format, this requires a
+	  relocation pass at runtime even if the kernel is loaded at the
+	  same address it was linked at.
+
 source "arch/riscv/Kconfig.socs"
 source "arch/riscv/Kconfig.erratas"

diff --git a/arch/riscv/Makefile b/arch/riscv/Makefile
index 0eb4568fbd29..2f509915f246 100644
--- a/arch/riscv/Makefile
+++ b/arch/riscv/Makefile
@@ -9,9 +9,12 @@
 #

 OBJCOPYFLAGS    := -O binary
-LDFLAGS_vmlinux :=
+ifeq ($(CONFIG_RELOCATABLE),y)
+	LDFLAGS_vmlinux += -shared -Bsymbolic -z notext -z norelro
+	KBUILD_CFLAGS += -fPIE
+endif
 ifeq ($(CONFIG_DYNAMIC_FTRACE),y)
-	LDFLAGS_vmlinux := --no-relax
+	LDFLAGS_vmlinux += --no-relax
 	KBUILD_CPPFLAGS += -DCC_USING_PATCHABLE_FUNCTION_ENTRY
 	CC_FLAGS_FTRACE := -fpatchable-function-entry=8
 endif
diff --git a/arch/riscv/kernel/vmlinux.lds.S b/arch/riscv/kernel/vmlinux.lds.S
index 5104f3a871e3..862a8c09723c 100644
--- a/arch/riscv/kernel/vmlinux.lds.S
+++ b/arch/riscv/kernel/vmlinux.lds.S
@@ -133,6 +133,12 @@ SECTIONS

 	BSS_SECTION(PAGE_SIZE, PAGE_SIZE, 0)

+	.rela.dyn : ALIGN(8) {
+		__rela_dyn_start = .;
+		*(.rela .rela*)
+		__rela_dyn_end = .;
+	}
+
 #ifdef CONFIG_EFI
 	. = ALIGN(PECOFF_SECTION_ALIGNMENT);
 	__pecoff_data_virt_size = ABSOLUTE(. - __pecoff_text_end);
diff --git a/arch/riscv/mm/Makefile b/arch/riscv/mm/Makefile
index 7ebaef10ea1b..2d33ec574bbb 100644
--- a/arch/riscv/mm/Makefile
+++ b/arch/riscv/mm/Makefile
@@ -1,6 +1,10 @@
 # SPDX-License-Identifier: GPL-2.0-only

 CFLAGS_init.o := -mcmodel=medany
+ifdef CONFIG_RELOCATABLE
+CFLAGS_init.o += -fno-pie
+endif
+
 ifdef CONFIG_FTRACE
 CFLAGS_REMOVE_init.o = $(CC_FLAGS_FTRACE)
 CFLAGS_REMOVE_cacheflush.o = $(CC_FLAGS_FTRACE)
diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c
index c0cddf0fc22d..42041c12d496 100644
--- a/arch/riscv/mm/init.c
+++ b/arch/riscv/mm/init.c
@@ -20,6 +20,9 @@
 #include
 #include
 #include
+#ifdef CONFIG_RELOCATABLE
+#include
+#endif
 #include
 #include
@@ -103,7 +106,7 @@ static void __init print_vm_layout(void)
 	print_mlm("lowmem", (unsigned long)PAGE_OFFSET,
 		  (unsigned long)high_memory);
 #ifdef CONFIG_64BIT
-	print_mlm("kernel", (unsigned long)KERNEL_LINK_ADDR,
+	print_mlm("kernel", (unsigned
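The commit message's point that RELA "requires a relocation pass at runtime even if the kernel is loaded at the same address it was linked at" can be modelled in userspace: the patched words hold no valid pointer until the loop has run. This is a simplified sketch, not the kernel's actual relocation code; the struct layout is a stand-in for `Elf64_Rela` and only the `R_RISCV_RELATIVE` case is handled.

```c
#include <stdint.h>

/* Simplified stand-in for an ELF64 RELA entry. */
typedef struct {
	uint64_t r_offset;  /* where to patch, as a link-time address */
	uint64_t r_info;    /* low bits: relocation type */
	int64_t  r_addend;  /* link-time value to rebase */
} rela_t;

#define R_RISCV_RELATIVE 3

/*
 * Walk [start, end) and patch each RELATIVE entry inside the loaded
 * image: the word becomes addend + (runtime base - link base). Even
 * when the two bases are equal the pass is still needed, because the
 * on-disk word is not the pointer value itself.
 */
static void apply_relas(const rela_t *start, const rela_t *end,
			uint8_t *image, uint64_t link_addr, uint64_t run_addr)
{
	for (const rela_t *r = start; r < end; r++) {
		if ((r->r_info & 0xffffffffULL) != R_RISCV_RELATIVE)
			continue;
		uint64_t *where = (uint64_t *)(image + (r->r_offset - link_addr));
		*where = (uint64_t)r->r_addend + (run_addr - link_addr);
	}
}
```

In the kernel patch, `__rela_dyn_start`/`__rela_dyn_end` from the linker script play the role of `start`/`end`, and the pass runs once during early boot.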
Re: [PATCH v2 1/2] powerpc: Fix virt_addr_valid() check
Le 25/12/2021 à 13:06, Kefeng Wang a écrit :
> When run ethtool eth0, the BUG occurred,
>
>    usercopy: Kernel memory exposure attempt detected from SLUB object not in
>    SLUB page?! (offset 0, size 1048)!
>    kernel BUG at mm/usercopy.c:99
>    ...
>    usercopy_abort+0x64/0xa0 (unreliable)
>    __check_heap_object+0x168/0x190
>    __check_object_size+0x1a0/0x200
>    dev_ethtool+0x2494/0x2b20
>    dev_ioctl+0x5d0/0x770
>    sock_do_ioctl+0xf0/0x1d0
>    sock_ioctl+0x3ec/0x5a0
>    __se_sys_ioctl+0xf0/0x160
>    system_call_exception+0xfc/0x1f0
>    system_call_common+0xf8/0x200
>
> The code shows below,
>
>    data = vzalloc(array_size(gstrings.len, ETH_GSTRING_LEN));
>    copy_to_user(useraddr, data, gstrings.len * ETH_GSTRING_LEN))
>
> The data is alloced by vmalloc(), virt_addr_valid(ptr) will return true
> on PowerPC64, which leads to the panic.
>
> As commit 4dd7554a6456 ("powerpc/64: Add VIRTUAL_BUG_ON checks for __va
> and __pa addresses") does, make sure the virt addr above PAGE_OFFSET in
> the virt_addr_valid().

The change done by that commit only applies to PPC64. The change you are
doing applies to both PPC64 and PPC32. Did you verify the impact (or
should I say the absence of impact) on PPC32 ?

>
> Signed-off-by: Kefeng Wang
> ---
>   arch/powerpc/include/asm/page.h | 5 -
>   1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/arch/powerpc/include/asm/page.h b/arch/powerpc/include/asm/page.h
> index 254687258f42..300d4c105a3a 100644
> --- a/arch/powerpc/include/asm/page.h
> +++ b/arch/powerpc/include/asm/page.h
> @@ -132,7 +132,10 @@ static inline bool pfn_valid(unsigned long pfn)
>   #define virt_to_page(kaddr)	pfn_to_page(virt_to_pfn(kaddr))
>   #define pfn_to_kaddr(pfn)	__va((pfn) << PAGE_SHIFT)
>
> -#define virt_addr_valid(kaddr)	pfn_valid(virt_to_pfn(kaddr))
> +#define virt_addr_valid(vaddr)	({						\
> +	unsigned long _addr = (unsigned long)vaddr;				\
> +	(unsigned long)(_addr) >= PAGE_OFFSET && pfn_valid(virt_to_pfn(_addr));	\

_addr is already an 'unsigned long' so you shouldn't need to cast it.

> +})
>
>  /*
>   * On Book-E parts we need __va to parse the device tree and we can't
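Christophe's distinction in the head of this thread between a subtracting `__pa()` (PPC32) and a mask-based `__pa()` (PPC64) is why the `>= PAGE_OFFSET` test matters. A userspace sketch with illustrative constants (stand-ins, not the kernel's exact layout):

```c
#include <stdint.h>

/* Illustrative linear-map base: top two address bits set, as on PPC64. */
#define FAKE_PAGE_OFFSET 0xc000000000000000ULL

/*
 * PPC64-style __pa(): clears the two top bits instead of subtracting
 * PAGE_OFFSET, so it "succeeds" even for addresses that were never in
 * the linear map (e.g. a user or vmalloc pointer).
 */
static uint64_t fake_pa(uint64_t va)
{
	return va & ~0xc000000000000000ULL;
}

/* The extra test the patch adds before trusting pfn_valid(__pa(addr)). */
static int in_linear_range(uint64_t va)
{
	return va >= FAKE_PAGE_OFFSET;
}
```

Because the mask maps a low address onto a plausible-looking physical address, `pfn_valid()` alone cannot reject it; the range check has to run first, after which `pfn_valid()` does the rest, as on PPC32.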