Re: [PATCH v3 26/27] powerpc/powernv/pmem: Expose the firmware version in sysfs
On 21/2/20 2:27 pm, Alastair D'Silva wrote:
> From: Alastair D'Silva
>
> This information will be used by ndctl in userspace to help users
> identify the device.

You should include the information from the subject line in the body of
the commit message too.

I think this patch could probably be squashed in with the last one.

-- 
Andrew Donnellan              OzLabs, ADL Canberra
a...@linux.ibm.com             IBM Australia Limited
Reg: KASLR backporting to 4.9 kernels for ppc platform.
Hi,

Are there any plans to backport KASLR to the 4.9 kernel? How feasible
would it be to backport it to 4.9 for PPC platforms?

Regards,
S Balamurugan
Re: [PATCH v3 0/6] implement KASLR for powerpc/fsl_booke/64
On 2020/3/2 11:24, Scott Wood wrote:
> On Mon, 2020-03-02 at 10:17 +0800, Jason Yan wrote:
> > On 2020/3/1 6:54, Scott Wood wrote:
> > > On Sat, 2020-02-29 at 15:27 +0800, Jason Yan wrote:
> > > > Turning to %p may not be a good idea in this situation. So for the
> > > > REG logs printed when dumping the stack, we can disable them when
> > > > KASLR is enabled. For the REG logs in other places like show_regs(),
> > > > only privileged users can trigger them, and they are not combined
> > > > with a symbol, so I think it's OK to keep them.
> > > >
> > > > diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
> > > > index fad50db9dcf2..659c51f0739a 100644
> > > > --- a/arch/powerpc/kernel/process.c
> > > > +++ b/arch/powerpc/kernel/process.c
> > > > @@ -2068,7 +2068,10 @@ void show_stack(struct task_struct *tsk, unsigned long *stack)
> > > >  		newsp = stack[0];
> > > >  		ip = stack[STACK_FRAME_LR_SAVE];
> > > >  		if (!firstframe || ip != lr) {
> > > > -			printk("["REG"] ["REG"] %pS", sp, ip, (void *)ip);
> > > > +			if (IS_ENABLED(CONFIG_RANDOMIZE_BASE))
> > > > +				printk("%pS", (void *)ip);
> > > > +			else
> > > > +				printk("["REG"] ["REG"] %pS", sp, ip, (void *)ip);
> > >
> > > This doesn't deal with "nokaslr" on the kernel command line. It also
> > > doesn't seem like something that every callsite should have to
> > > opencode, versus having an appropriate format specifier that behaves
> > > as I described above (and I still don't see why that format specifier
> > > should not be "%p").
> >
> > Actually I still do not understand why we should print the raw value
> > here. When KALLSYMS is enabled we have the symbol name and offset, like
> > put_cred_rcu+0x108/0x110, and when KALLSYMS is disabled we have the raw
> > address.
>
> I'm more concerned about the stack address for wading through a raw stack
> dump (to find function call arguments, etc). The return address does help
> confirm that I'm on the right stack frame though, and also makes looking
> up a line number slightly easier than having to look up a symbol address
> and then add the offset (at least for non-module addresses).
>
> As a random aside, the mismatch between Linux printing a hex offset and
> GDB using decimal in disassembly is annoying...
>

OK, I will send an RFC patch to add a new format specifier such as "%pk",
or change the existing "%pK", to print the raw value of addresses when
KASLR is disabled and the hashed value when KASLR is enabled. Let's see
what the printk guys would say :)

> -Scott
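The behaviour proposed at the end of this exchange (raw addresses when randomization is off, hashed ones when it is on, decided at run time so that "nokaslr" is honoured) can be sketched in plain userspace C. This is only an illustration under stated assumptions: `format_kaddr()`, its `kaslr_active` and `key` parameters, and `toy_hash()` are hypothetical names, and the mixing function is a toy stand-in, not the keyed hash the kernel uses for `%p`.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Toy stand-in for a keyed pointer hash (assumption: any fixed mixing
 * function is enough to illustrate the idea; this is NOT the kernel's). */
static uint32_t toy_hash(uint64_t addr, uint64_t key)
{
	uint64_t x = addr ^ key;

	x ^= x >> 33;
	x *= 0xff51afd7ed558ccdULL;
	x ^= x >> 33;
	return (uint32_t)x;
}

/* Format addr as raw hex when randomization is inactive, hashed otherwise.
 * kaslr_active is a run-time flag, so a boot-time "off" switch is honoured.
 * Returns the number of characters written, as snprintf() does. */
static int format_kaddr(char *buf, size_t len, uint64_t addr,
			bool kaslr_active, uint64_t key)
{
	if (!kaslr_active)
		return snprintf(buf, len, "%016llx", (unsigned long long)addr);
	return snprintf(buf, len, "%08x", (unsigned int)toy_hash(addr, key));
}
```

Keeping the decision inside one formatting helper is the point Scott raises: call sites such as show_stack() would not need to open-code the check.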
[RFC 2/3] mm/vma: Introduce VM_ACCESS_FLAGS
There are many places where all basic VMA access flags (read, write, exec)
are initialized or checked against as a group. One such example is during
page fault. The existing vma_is_accessible() wrapper already creates the
notion of VMA accessibility as a group of access permissions. Hence let's
just create VM_ACCESS_FLAGS (VM_READ|VM_WRITE|VM_EXEC), which will not only
reduce code duplication but also extend the VMA accessibility concept in
general.

Cc: Russell King
CC: Catalin Marinas
CC: Mark Salter
Cc: Nick Hu
CC: Ley Foon Tan
Cc: Michael Ellerman
Cc: Heiko Carstens
Cc: Yoshinori Sato
Cc: Guan Xuetao
Cc: Dave Hansen
Cc: Thomas Gleixner
Cc: Rob Springer
Cc: Greg Kroah-Hartman
Cc: Andrew Morton
Cc: linux-arm-ker...@lists.infradead.org
Cc: linux-c6x-...@linux-c6x.org
Cc: nios2-...@lists.rocketboards.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: linux-s...@vger.kernel.org
Cc: linux...@vger.kernel.org
Cc: de...@driverdev.osuosl.org
Cc: linux...@kvack.org
Cc: linux-ker...@vger.kernel.org
Signed-off-by: Anshuman Khandual
---
 arch/arm/mm/fault.c                  | 2 +-
 arch/arm64/mm/fault.c                | 2 +-
 arch/c6x/include/asm/processor.h     | 2 +-
 arch/nds32/mm/fault.c                | 2 +-
 arch/nios2/include/asm/processor.h   | 2 +-
 arch/powerpc/mm/book3s64/pkeys.c     | 2 +-
 arch/s390/mm/fault.c                 | 2 +-
 arch/sh/include/asm/processor_64.h   | 2 +-
 arch/unicore32/mm/fault.c            | 2 +-
 arch/x86/mm/pkeys.c                  | 2 +-
 drivers/staging/gasket/gasket_core.c | 2 +-
 include/linux/mm.h                   | 4 +++-
 mm/mmap.c                            | 4 ++--
 mm/mprotect.c                        | 7 +++
 14 files changed, 19 insertions(+), 18 deletions(-)

diff --git a/arch/arm/mm/fault.c b/arch/arm/mm/fault.c
index bd0f4821f7e1..2c71028d9d6b 100644
--- a/arch/arm/mm/fault.c
+++ b/arch/arm/mm/fault.c
@@ -189,7 +189,7 @@ void do_bad_area(unsigned long addr, unsigned int fsr, struct pt_regs *regs)
  */
 static inline bool access_error(unsigned int fsr, struct vm_area_struct *vma)
 {
-	unsigned int mask = VM_READ | VM_WRITE | VM_EXEC;
+	unsigned int mask = VM_ACCESS_FLAGS;
 
 	if ((fsr & FSR_WRITE) && !(fsr & FSR_CM))
 		mask = VM_WRITE;
diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
index 85566d32958f..63f31206a12e 100644
--- a/arch/arm64/mm/fault.c
+++ b/arch/arm64/mm/fault.c
@@ -445,7 +445,7 @@ static int __kprobes do_page_fault(unsigned long addr, unsigned int esr,
 	const struct fault_info *inf;
 	struct mm_struct *mm = current->mm;
 	vm_fault_t fault, major = 0;
-	unsigned long vm_flags = VM_READ | VM_WRITE | VM_EXEC;
+	unsigned long vm_flags = VM_ACCESS_FLAGS;
 	unsigned int mm_flags = FAULT_FLAG_ALLOW_RETRY | FAULT_FLAG_KILLABLE;
 
 	if (kprobe_page_fault(regs, esr))
diff --git a/arch/c6x/include/asm/processor.h b/arch/c6x/include/asm/processor.h
index 1456f5e11de3..77372b8c28d7 100644
--- a/arch/c6x/include/asm/processor.h
+++ b/arch/c6x/include/asm/processor.h
@@ -57,7 +57,7 @@ struct thread_struct {
 }
 
 #define INIT_MMAP { \
-	&init_mm, 0, 0, NULL, PAGE_SHARED, VM_READ | VM_WRITE | VM_EXEC, 1, \
+	&init_mm, 0, 0, NULL, PAGE_SHARED, VM_ACCESS_FLAGS, 1, \
 	NULL, NULL }
 
 #define task_pt_regs(task) \
diff --git a/arch/nds32/mm/fault.c b/arch/nds32/mm/fault.c
index 906dfb25353c..55387a31bf42 100644
--- a/arch/nds32/mm/fault.c
+++ b/arch/nds32/mm/fault.c
@@ -79,7 +79,7 @@ void do_page_fault(unsigned long entry, unsigned long addr,
 	struct vm_area_struct *vma;
 	int si_code;
 	vm_fault_t fault;
-	unsigned int mask = VM_READ | VM_WRITE | VM_EXEC;
+	unsigned int mask = VM_ACCESS_FLAGS;
 	unsigned int flags = FAULT_FLAG_ALLOW_RETRY | FAULT_FLAG_KILLABLE;
 
 	error_code = error_code & (ITYPE_mskINST | ITYPE_mskETYPE);
diff --git a/arch/nios2/include/asm/processor.h b/arch/nios2/include/asm/processor.h
index 94bcb86f679f..fbfb3ab14cfc 100644
--- a/arch/nios2/include/asm/processor.h
+++ b/arch/nios2/include/asm/processor.h
@@ -51,7 +51,7 @@ struct thread_struct {
 };
 
 #define INIT_MMAP \
-	{ &init_mm, (0), (0), __pgprot(0x0), VM_READ | VM_WRITE | VM_EXEC }
+	{ &init_mm, (0), (0), __pgprot(0x0), VM_ACCESS_FLAGS }
 
 # define INIT_THREAD { \
 	.kregs = NULL, \
diff --git a/arch/powerpc/mm/book3s64/pkeys.c b/arch/powerpc/mm/book3s64/pkeys.c
index 59e0ebbd8036..11fd52b24f68 100644
--- a/arch/powerpc/mm/book3s64/pkeys.c
+++ b/arch/powerpc/mm/book3s64/pkeys.c
@@ -315,7 +315,7 @@ int __execute_only_pkey(struct mm_struct *mm)
 static inline bool vma_is_pkey_exec_only(struct vm_area_struct *vma)
 {
 	/* Do this check first since the vm_flags should be hot */
-	if ((vma->vm_flags & (VM_READ | VM_WRITE | VM_EXEC)) != VM_EXEC)
+	if ((vma->vm_flags & VM_ACCESS_FLAGS) != VM_EXEC)
 		return false;
 
 	return (vma_pkey(vma) == vma->vm_mm->c
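As a rough illustration of what the grouping buys, here is a self-contained sketch of the exec-only test from the pkeys.c hunk, built outside the kernel. The three bit values are restated so the snippet compiles on its own, and `flags_are_exec_only()` is a hypothetical helper name used only here.

```c
#include <stdbool.h>

/* Restated so the sketch builds in userspace; in the kernel these come
 * from include/linux/mm.h. */
#define VM_READ   0x1UL
#define VM_WRITE  0x2UL
#define VM_EXEC   0x4UL

/* The grouping this patch introduces. */
#define VM_ACCESS_FLAGS (VM_READ | VM_WRITE | VM_EXEC)

/* Hypothetical helper: true when the mapping is executable but neither
 * readable nor writable. Bits outside the access group are ignored. */
static bool flags_are_exec_only(unsigned long vm_flags)
{
	return (vm_flags & VM_ACCESS_FLAGS) == VM_EXEC;
}
```

Masking with the group first means unrelated flag bits cannot affect the comparison, which is what every converted call site relies on.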
[RFC 1/3] mm/vma: Define a default value for VM_DATA_DEFAULT_FLAGS
There are many platforms with the exact same value for
VM_DATA_DEFAULT_FLAGS. This creates a default value for
VM_DATA_DEFAULT_FLAGS in line with the existing VM_STACK_DEFAULT_FLAGS.
While here, also define some more macros with standard VMA access flag
combinations that are used frequently across many platforms. Apart from
simplification, this reduces code duplication as well.

Cc: Richard Henderson
Cc: Vineet Gupta
Cc: Russell King
Cc: Catalin Marinas
Cc: Mark Salter
Cc: Guo Ren
Cc: Yoshinori Sato
Cc: Brian Cain
Cc: Tony Luck
Cc: Geert Uytterhoeven
Cc: Michal Simek
Cc: Ralf Baechle
Cc: Paul Burton
Cc: Nick Hu
Cc: Ley Foon Tan
Cc: Jonas Bonn
Cc: "James E.J. Bottomley"
Cc: Michael Ellerman
Cc: Paul Walmsley
Cc: Heiko Carstens
Cc: Rich Felker
Cc: "David S. Miller"
Cc: Guan Xuetao
Cc: Thomas Gleixner
Cc: Jeff Dike
Cc: Chris Zankel
Cc: Andrew Morton
Cc: linux-al...@vger.kernel.org
Cc: linux-ker...@vger.kernel.org
Cc: linux-snps-...@lists.infradead.org
Cc: linux-arm-ker...@lists.infradead.org
Cc: linux-c6x-...@linux-c6x.org
Cc: uclinux-h8-de...@lists.sourceforge.jp
Cc: linux-hexa...@vger.kernel.org
Cc: linux-i...@vger.kernel.org
Cc: linux-m...@lists.linux-m68k.org
Cc: linux-m...@vger.kernel.org
Cc: nios2-...@lists.rocketboards.org
Cc: openr...@lists.librecores.org
Cc: linux-par...@vger.kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: linux-ri...@lists.infradead.org
Cc: linux-s...@vger.kernel.org
Cc: linux...@vger.kernel.org
Cc: sparcli...@vger.kernel.org
Cc: linux...@lists.infradead.org
Cc: linux-xte...@linux-xtensa.org
Cc: linux...@kvack.org
Signed-off-by: Anshuman Khandual
---
 arch/alpha/include/asm/page.h      |  3 ---
 arch/arc/include/asm/page.h        |  2 +-
 arch/arm/include/asm/page.h        |  4 +---
 arch/arm64/include/asm/page.h      |  4 +---
 arch/c6x/include/asm/page.h        |  5 +
 arch/csky/include/asm/page.h       |  3 ---
 arch/h8300/include/asm/page.h      |  2 --
 arch/hexagon/include/asm/page.h    |  3 +--
 arch/ia64/include/asm/page.h       |  5 +
 arch/m68k/include/asm/page.h       |  3 ---
 arch/microblaze/include/asm/page.h |  2 --
 arch/mips/include/asm/page.h       |  5 +
 arch/nds32/include/asm/page.h      |  3 ---
 arch/nios2/include/asm/page.h      |  3 +--
 arch/openrisc/include/asm/page.h   |  5 -
 arch/parisc/include/asm/page.h     |  3 ---
 arch/powerpc/include/asm/page.h    |  9 ++---
 arch/powerpc/include/asm/page_64.h |  7 ++-
 arch/riscv/include/asm/page.h      |  3 +--
 arch/s390/include/asm/page.h       |  3 +--
 arch/sh/include/asm/page.h         |  3 ---
 arch/sparc/include/asm/page_32.h   |  3 ---
 arch/sparc/include/asm/page_64.h   |  3 ---
 arch/unicore32/include/asm/page.h  |  3 ---
 arch/x86/include/asm/page_types.h  |  4 +---
 arch/x86/um/asm/vm-flags.h         | 10 ++
 arch/xtensa/include/asm/page.h     |  3 ---
 include/linux/mm.h                 | 15 +++
 28 files changed, 32 insertions(+), 89 deletions(-)

diff --git a/arch/alpha/include/asm/page.h b/arch/alpha/include/asm/page.h
index f3fb2848470a..e241bd0f 100644
--- a/arch/alpha/include/asm/page.h
+++ b/arch/alpha/include/asm/page.h
@@ -90,9 +90,6 @@ typedef struct page *pgtable_t;
 #define virt_addr_valid(kaddr)	pfn_valid(__pa(kaddr) >> PAGE_SHIFT)
 #endif /* CONFIG_DISCONTIGMEM */
 
-#define VM_DATA_DEFAULT_FLAGS	(VM_READ | VM_WRITE | VM_EXEC | \
-				 VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC)
-
 #include
 #include
 
diff --git a/arch/arc/include/asm/page.h b/arch/arc/include/asm/page.h
index 0a32e8cfd074..b0dfed0f12be 100644
--- a/arch/arc/include/asm/page.h
+++ b/arch/arc/include/asm/page.h
@@ -102,7 +102,7 @@ typedef pte_t * pgtable_t;
 #define virt_addr_valid(kaddr)	pfn_valid(virt_to_pfn(kaddr))
 
 /* Default Permissions for stack/heaps pages (Non Executable) */
-#define VM_DATA_DEFAULT_FLAGS	(VM_READ | VM_WRITE | VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC)
+#define VM_DATA_DEFAULT_FLAGS	VM_DATA_FLAGS_NON_EXEC
 
 #define WANT_PAGE_VIRTUAL	1
 
diff --git a/arch/arm/include/asm/page.h b/arch/arm/include/asm/page.h
index c2b75cba26df..11b058a72a5b 100644
--- a/arch/arm/include/asm/page.h
+++ b/arch/arm/include/asm/page.h
@@ -161,9 +161,7 @@ extern int pfn_valid(unsigned long);
 
 #endif /* !__ASSEMBLY__ */
 
-#define VM_DATA_DEFAULT_FLAGS \
-	(((current->personality & READ_IMPLIES_EXEC) ? VM_EXEC : 0) | \
-	 VM_READ | VM_WRITE | VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC)
+#define VM_DATA_DEFAULT_FLAGS	VM_DATA_FLAGS_TSK_EXEC
 
 #include
 
diff --git a/arch/arm64/include/asm/page.h b/arch/arm64/include/asm/page.h
index d39ddb258a04..cb4e1e6ca385 100644
--- a/arch/arm64/include/asm/page.h
+++ b/arch/arm64/include/asm/page.h
@@ -32,9 +32,7 @@ extern int pfn_valid(unsigned long);
 
 #endif /* !__ASSEMBLY__ */
 
-#define VM_DATA_DEFAULT_FLAGS \
-	(((current->personality & READ_IMPLIES_EXEC) ? VM_EXEC : 0) | \
-	 VM_READ | VM_WRITE | VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC)
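The include/linux/mm.h hunk that defines the new combinations is not visible in this excerpt, so the following is only a sketch inferred from the per-architecture definitions being removed above (alpha: always executable; arc: never executable; arm/arm64: executable when the task personality has READ_IMPLIES_EXEC). The lower-case function names are hypothetical, and the bit values are restated so the snippet builds in userspace.

```c
/* Restated for a standalone build; the kernel defines these itself. */
#define VM_READ            0x01UL
#define VM_WRITE           0x02UL
#define VM_EXEC            0x04UL
#define VM_MAYREAD         0x10UL
#define VM_MAYWRITE        0x20UL
#define VM_MAYEXEC         0x40UL
#define READ_IMPLIES_EXEC  0x0400000UL

/* Bits shared by every removed per-arch definition. */
#define DATA_FLAGS_COMMON \
	(VM_READ | VM_WRITE | VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC)

/* arc-style: data mappings are never executable by default. */
static unsigned long data_flags_non_exec(void)
{
	return DATA_FLAGS_COMMON;
}

/* alpha-style: data mappings are always executable by default. */
static unsigned long data_flags_exec(void)
{
	return DATA_FLAGS_COMMON | VM_EXEC;
}

/* arm/arm64-style: executable only if the task asked for it. */
static unsigned long data_flags_tsk_exec(unsigned long personality)
{
	return DATA_FLAGS_COMMON |
	       ((personality & READ_IMPLIES_EXEC) ? VM_EXEC : 0);
}
```

An architecture then picks one of the three shapes instead of spelling out the full flag list, which is where the 89 deleted lines in the diffstat come from.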
[RFC 0/3] mm/vma: some new flags and helpers
The motivation here is to consolidate VMA flag combinations commonly used
across platforms and reduce code duplication while making the code less
cluttered in general.

The first patch introduces a default VM_DATA_DEFAULT_FLAGS which platforms
can fall back on without having to define any similar data flag
combinations as they currently do. It also adds some more common data flag
combinations which are generally used when platforms decide to override
the default.

The second patch consolidates VM_READ, VM_WRITE and VM_EXEC as
VM_ACCESS_FLAGS, extending the existing VMA accessibility concept provided
by vma_is_accessible(). VM_ACCESS_FLAGS replaces many other instances
which used to check all three VMA access flags simultaneously.

While here, this series also adds some more special VMA flag based helpers
which wrap similar checks at various places, thus improving readability.
The series intentionally limits these new helpers to special purpose VM
flags rather than the more common ones like VM_READ, VM_WRITE, VM_EXEC,
VM_SHARED etc, just to limit code churn. But if there is common agreement
that every flag should have its own wrapper here, we can do that as well.
Otherwise, if this patch seems really unnecessary with too much code
churn, I will be happy to drop it.

Reviews, comments, suggestions and concerns welcome. Thank you.

This series is based on v5.6-rc4 after applying these patches.

1. https://patchwork.kernel.org/cover/11399319/
2. https://patchwork.kernel.org/patch/11399379/

This series is build tested across multiple architectures but boot tested
only on arm64 and x86 platforms.
Cc: linux-al...@vger.kernel.org
Cc: linux-ker...@vger.kernel.org
Cc: linux-snps-...@lists.infradead.org
Cc: linux-arm-ker...@lists.infradead.org
Cc: linux-c6x-...@linux-c6x.org
Cc: uclinux-h8-de...@lists.sourceforge.jp
Cc: linux-hexa...@vger.kernel.org
Cc: linux-i...@vger.kernel.org
Cc: linux-m...@lists.linux-m68k.org
Cc: linux-m...@vger.kernel.org
Cc: nios2-...@lists.rocketboards.org
Cc: openr...@lists.librecores.org
Cc: linux-par...@vger.kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: linux-ri...@lists.infradead.org
Cc: linux-s...@vger.kernel.org
Cc: linux...@vger.kernel.org
Cc: sparcli...@vger.kernel.org
Cc: linux...@lists.infradead.org
Cc: linux-xte...@linux-xtensa.org
Cc: linux...@kvack.org

Anshuman Khandual (3):
  mm/vma: Define a default value for VM_DATA_DEFAULT_FLAGS
  mm/vma: Introduce VM_ACCESS_FLAGS
  mm/vma: Introduce some more VMA flag wrappers

 arch/alpha/include/asm/page.h        |  3 --
 arch/arc/include/asm/page.h          |  2 +-
 arch/arm/include/asm/page.h          |  4 +-
 arch/arm/mm/fault.c                  |  2 +-
 arch/arm64/include/asm/page.h        |  4 +-
 arch/arm64/mm/fault.c                |  2 +-
 arch/c6x/include/asm/page.h          |  5 +--
 arch/c6x/include/asm/processor.h     |  2 +-
 arch/csky/include/asm/page.h         |  3 --
 arch/h8300/include/asm/page.h        |  2 -
 arch/hexagon/include/asm/page.h      |  3 +-
 arch/ia64/include/asm/page.h         |  5 +--
 arch/m68k/include/asm/page.h         |  3 --
 arch/microblaze/include/asm/page.h   |  2 -
 arch/mips/include/asm/page.h         |  5 +--
 arch/nds32/include/asm/page.h        |  3 --
 arch/nds32/mm/fault.c                |  2 +-
 arch/nios2/include/asm/page.h        |  3 +-
 arch/nios2/include/asm/processor.h   |  2 +-
 arch/openrisc/include/asm/page.h     |  5 ---
 arch/parisc/include/asm/page.h       |  3 --
 arch/powerpc/include/asm/page.h      |  9 +
 arch/powerpc/include/asm/page_64.h   |  7 +---
 arch/powerpc/mm/book3s64/pkeys.c     |  2 +-
 arch/riscv/include/asm/page.h        |  3 +-
 arch/s390/include/asm/page.h         |  3 +-
 arch/s390/mm/fault.c                 |  2 +-
 arch/sh/include/asm/page.h           |  3 --
 arch/sh/include/asm/processor_64.h   |  2 +-
 arch/sparc/include/asm/mman.h        |  2 +-
 arch/sparc/include/asm/page_32.h     |  3 --
 arch/sparc/include/asm/page_64.h     |  3 --
 arch/unicore32/include/asm/page.h    |  3 --
 arch/unicore32/mm/fault.c            |  2 +-
 arch/x86/include/asm/page_types.h    |  4 +-
 arch/x86/mm/pkeys.c                  |  2 +-
 arch/x86/um/asm/vm-flags.h           | 10 +
 arch/xtensa/include/asm/page.h       |  3 --
 drivers/staging/gasket/gasket_core.c |  2 +-
 fs/binfmt_elf.c                      |  2 +-
 fs/proc/task_mmu.c                   | 14 +++
 include/linux/huge_mm.h              |  4 +-
 include/linux/mm.h                   | 58 +++-
 kernel/events/core.c                 |  2 +-
 kernel/events/uprobes.c              |  2 +-
 mm/gup.c                             |  2 +-
 mm/huge_memory.c                     |  6 +--
 mm/hugetlb.c                         |  4 +-
 mm/ksm.c                             |  8 ++--
 mm/madvise.c                         |  4 +-
 mm/memory.c                          |  4 +-
 mm/migrate.c                         |  4 +-
 mm/mlock.c                           |  4 +-
 mm/mmap.c
Re: [PATCH] powerpc/sysdev: fix compile errors
On 02/03/2020 at 06:37, WANG Wenhu wrote:
> Include linux/io.h into fsl_85xx_cache_sram.c to fix the
> implicit-declaration compile errors when building Cache-Sram.
>
> arch/powerpc/sysdev/fsl_85xx_cache_sram.c: In function ‘instantiate_cache_sram’:
> arch/powerpc/sysdev/fsl_85xx_cache_sram.c:97:26: error: implicit declaration of function ‘ioremap_coherent’; did you mean ‘bitmap_complement’? [-Werror=implicit-function-declaration]
>   cache_sram->base_virt = ioremap_coherent(cache_sram->base_phys,
>                           ^~~~
>                           bitmap_complement
> arch/powerpc/sysdev/fsl_85xx_cache_sram.c:97:24: error: assignment makes pointer from integer without a cast [-Werror=int-conversion]
>   cache_sram->base_virt = ioremap_coherent(cache_sram->base_phys,
>                         ^
> arch/powerpc/sysdev/fsl_85xx_cache_sram.c:123:2: error: implicit declaration of function ‘iounmap’; did you mean ‘roundup’? [-Werror=implicit-function-declaration]
>   iounmap(cache_sram->base_virt);
>   ^~~
>   roundup
> cc1: all warnings being treated as errors
>
> Fixed: commit 6db92cc9d07d ("powerpc/85xx: add cache-sram support")
> Signed-off-by: WANG Wenhu

Reviewed-by: Christophe Leroy

> ---
>  arch/powerpc/sysdev/fsl_85xx_cache_sram.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/arch/powerpc/sysdev/fsl_85xx_cache_sram.c b/arch/powerpc/sysdev/fsl_85xx_cache_sram.c
> index f6c665dac725..be3aef4229d7 100644
> --- a/arch/powerpc/sysdev/fsl_85xx_cache_sram.c
> +++ b/arch/powerpc/sysdev/fsl_85xx_cache_sram.c
> @@ -17,6 +17,7 @@
>  #include
>  #include
>  #include
> +#include <linux/io.h>
>
>  #include "fsl_85xx_cache_ctlr.h"
Re: [PATCH v3 16/27] powerpc/powernv/pmem: Register a character device for userspace to interact with
On Mon, 2020-03-02 at 16:34 +1100, Andrew Donnellan wrote:
> On 21/2/20 2:27 pm, Alastair D'Silva wrote:
> > From: Alastair D'Silva
> >
> > This patch introduces a character device (/dev/ocxl-scmX) which further
> > patches will use to interact with userspace.
>
> As with the comments on other patches in this series, this commit
> message is lacking in explanation. What's the purpose of this device?
>

I'll reword this for v4.

> > Signed-off-by: Alastair D'Silva
> > ---
> >  arch/powerpc/platforms/powernv/pmem/ocxl.c    | 116 +-
> >  .../platforms/powernv/pmem/ocxl_internal.h    |   2 +
> >  2 files changed, 116 insertions(+), 2 deletions(-)
> >
> > diff --git a/arch/powerpc/platforms/powernv/pmem/ocxl.c b/arch/powerpc/platforms/powernv/pmem/ocxl.c
> > index b8bd7e703b19..63109a870d2c 100644
> > --- a/arch/powerpc/platforms/powernv/pmem/ocxl.c
> > +++ b/arch/powerpc/platforms/powernv/pmem/ocxl.c
> > @@ -10,6 +10,7 @@
> >  #include
> >  #include
> >  #include
> > +#include
> >  #include
> >  #include
> >  #include "ocxl_internal.h"
> > @@ -339,6 +340,9 @@ static void free_ocxlpmem(struct ocxlpmem *ocxlpmem)
> >
> >  	free_minor(ocxlpmem);
> >
> > +	if (ocxlpmem->cdev.owner)
> > +		cdev_del(&ocxlpmem->cdev);
> > +
> >  	if (ocxlpmem->metadata_addr)
> >  		devm_memunmap(&ocxlpmem->dev, ocxlpmem->metadata_addr);
> >
> > @@ -396,6 +400,70 @@ static int ocxlpmem_register(struct ocxlpmem *ocxlpmem)
> >  	return device_register(&ocxlpmem->dev);
> >  }
> >
> > +static void ocxlpmem_put(struct ocxlpmem *ocxlpmem)
> > +{
> > +	put_device(&ocxlpmem->dev);
> > +}
> > +
> > +static struct ocxlpmem *ocxlpmem_get(struct ocxlpmem *ocxlpmem)
> > +{
> > +	return (get_device(&ocxlpmem->dev) == NULL) ? NULL : ocxlpmem;
> > +}
> > +
> > +static struct ocxlpmem *find_and_get_ocxlpmem(dev_t devno)
> > +{
> > +	struct ocxlpmem *ocxlpmem;
> > +	int minor = MINOR(devno);
> > +	/*
> > +	 * We don't declare an RCU critical section here, as our AFU
> > +	 * is protected by a reference counter on the device. By the time the
> > +	 * minor number of a device is removed from the idr, the ref count of
> > +	 * the device is already at 0, so no user API will access that AFU and
> > +	 * this function can't return it.
> > +	 */
> > +	ocxlpmem = idr_find(&minors_idr, minor);
> > +	if (ocxlpmem)
> > +		ocxlpmem_get(ocxlpmem);
> > +	return ocxlpmem;
> > +}
> > +
> > +static int file_open(struct inode *inode, struct file *file)
> > +{
> > +	struct ocxlpmem *ocxlpmem;
> > +
> > +	ocxlpmem = find_and_get_ocxlpmem(inode->i_rdev);
> > +	if (!ocxlpmem)
> > +		return -ENODEV;
> > +
> > +	file->private_data = ocxlpmem;
> > +	return 0;
> > +}
> > +
> > +static int file_release(struct inode *inode, struct file *file)
> > +{
> > +	struct ocxlpmem *ocxlpmem = file->private_data;
> > +
> > +	ocxlpmem_put(ocxlpmem);
> > +	return 0;
> > +}
> > +
> > +static const struct file_operations fops = {
> > +	.owner		= THIS_MODULE,
> > +	.open		= file_open,
> > +	.release	= file_release,
> > +};
> > +
> > +/**
> > + * create_cdev() - Create the chardev in /dev for the device
> > + * @ocxlpmem: the SCM metadata
> > + * Return: 0 on success, negative on failure
> > + */
> > +static int create_cdev(struct ocxlpmem *ocxlpmem)
> > +{
> > +	cdev_init(&ocxlpmem->cdev, &fops);
> > +	return cdev_add(&ocxlpmem->cdev, ocxlpmem->dev.devt, 1);
> > +}
> > +
> >  /**
> >   * ocxlpmem_remove() - Free an OpenCAPI persistent memory device
> >   * @pdev: the PCI device information struct
> > @@ -572,6 +640,11 @@ static int probe(struct pci_dev *pdev, const struct pci_device_id *ent)
> >  		goto err;
> >  	}
> >
> > +	if (create_cdev(ocxlpmem)) {
> > +		dev_err(&pdev->dev, "Could not create character device\n");
> > +		goto err;
> > +	}
> > +
> >  	elapsed = 0;
> >  	timeout = ocxlpmem->readiness_timeout + ocxlpmem->memory_available_timeout;
> >  	while (!is_usable(ocxlpmem, false)) {
> > @@ -613,20 +686,59 @@ static struct pci_driver pci_driver = {
> >  	.shutdown = ocxlpmem_remove,
> >  };
> >
> > +static int file_init(void)
> > +{
> > +	int rc;
> > +
> > +	mutex_init(&minors_idr_lock);
> > +	idr_init(&minors_idr);
> > +
> > +	rc = alloc_chrdev_region(&ocxlpmem_dev, 0, NUM_MINORS, "ocxl-pmem");
>
> If the driver is going to be called "ocxlpmem" can we standardise on
> that without the extra hyphen?

Ok

> > +	if (rc) {
> > +		idr_destroy(&minors_idr);
> > +		pr_err("Unable to allocate OpenCAPI persistent memory major number: %d\n", rc);
> > +		return rc;
> > +	}
> > +
> > +	ocxlpmem_class = class_create(THIS_MODULE, "ocxl-pmem");
> > +	if (IS_ERR(ocxlpmem_class)) {
> > +		idr_destroy(&minors_idr);
> > +		pr_err("Unable to create ocxl-pmem class\n");
[PATCH] powerpc/sysdev: fix compile errors
Include linux/io.h into fsl_85xx_cache_sram.c to fix the
implicit-declaration compile errors when building Cache-Sram.

arch/powerpc/sysdev/fsl_85xx_cache_sram.c: In function ‘instantiate_cache_sram’:
arch/powerpc/sysdev/fsl_85xx_cache_sram.c:97:26: error: implicit declaration of function ‘ioremap_coherent’; did you mean ‘bitmap_complement’? [-Werror=implicit-function-declaration]
  cache_sram->base_virt = ioremap_coherent(cache_sram->base_phys,
                          ^~~~
                          bitmap_complement
arch/powerpc/sysdev/fsl_85xx_cache_sram.c:97:24: error: assignment makes pointer from integer without a cast [-Werror=int-conversion]
  cache_sram->base_virt = ioremap_coherent(cache_sram->base_phys,
                        ^
arch/powerpc/sysdev/fsl_85xx_cache_sram.c:123:2: error: implicit declaration of function ‘iounmap’; did you mean ‘roundup’? [-Werror=implicit-function-declaration]
  iounmap(cache_sram->base_virt);
  ^~~
  roundup
cc1: all warnings being treated as errors

Fixed: commit 6db92cc9d07d ("powerpc/85xx: add cache-sram support")
Signed-off-by: WANG Wenhu
---
 arch/powerpc/sysdev/fsl_85xx_cache_sram.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/powerpc/sysdev/fsl_85xx_cache_sram.c b/arch/powerpc/sysdev/fsl_85xx_cache_sram.c
index f6c665dac725..be3aef4229d7 100644
--- a/arch/powerpc/sysdev/fsl_85xx_cache_sram.c
+++ b/arch/powerpc/sysdev/fsl_85xx_cache_sram.c
@@ -17,6 +17,7 @@
 #include
 #include
 #include
+#include <linux/io.h>
 
 #include "fsl_85xx_cache_ctlr.h"
-- 
2.17.1
Re: [PATCH v3 21/27] powerpc/powernv/pmem: Add an IOCTL to request controller health & perf data
On Fri, 2020-02-28 at 17:12 +1100, Andrew Donnellan wrote:
> On 21/2/20 2:27 pm, Alastair D'Silva wrote:
> > From: Alastair D'Silva
> >
> > When health & performance data is requested from the controller,
> > it responds with an error log containing the requested information.
> >
> > This patch allows the request to be issued via an IOCTL.
>
> A better explanation would be good - this IOCTL triggers a request to
> the controller to collect controller health/perf data, and the
> controller will later respond with an error log that can be picked up
> via the error log IOCTL that you've defined earlier.
>

Ok

-- 
Alastair D'Silva
Open Source Developer
Linux Technology Centre, IBM Australia
mob: 0423 762 819
RE: [PATCH v3 25/27] powerpc/powernv/pmem: Expose the serial number in sysfs
On Mon, 2020-03-02 at 10:42 +1100, Alastair D'Silva wrote:
> On Fri, 2020-02-28 at 08:15 +0100, Greg Kroah-Hartman wrote:
> > On Fri, Feb 28, 2020 at 05:25:31PM +1100, Andrew Donnellan wrote:
> > > On 21/2/20 2:27 pm, Alastair D'Silva wrote:
> > > > +int ocxlpmem_sysfs_add(struct ocxlpmem *ocxlpmem)
> > > > +{
> > > > +	int i, rc;
> > > > +
> > > > +	for (i = 0; i < ARRAY_SIZE(attrs); i++) {
> > > > +		rc = device_create_file(&ocxlpmem->dev, &attrs[i]);
> > > > +		if (rc) {
> > > > +			for (; --i >= 0;)
> > > > +				device_remove_file(&ocxlpmem->dev, &attrs[i]);
> > >
> > > I'd rather avoid weird for loop constructs if possible.
> > >
> > > Is it actually dangerous to call device_remove_file() on an attr that
> > > hasn't been added? If not then I'd rather define an err: label and
> > > loop over the whole array there.
> >
> > None of this should be used at all, just use attribute groups properly
> > and the driver core will handle this all for you.
> >
> > device_create/remove_file should never be called by anyone anymore if
> > at all possible.
> >
> > thanks,
> >
> > greg k-h
>
> Thanks, I'll rework it to use the .groups member of struct pci_driver.
>

I ended up making these available as DIMM attributes instead.

-- 
Alastair D'Silva
Open Source Developer
Linux Technology Centre, IBM Australia
mob: 0423 762 819
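For readers following the "weird for loop" comment, a plainer way to write create-then-roll-back is sketched below as a standalone userspace program. It is not driver code: `struct fake_attr`, `fake_create()`, `fake_remove()` and `add_all()` are hypothetical stand-ins for the attribute array and the device_create_file()/device_remove_file() calls. Greg's suggestion of attribute groups avoids needing any such loop in the driver at all.

```c
#include <stddef.h>

/* Hypothetical stand-in for a sysfs attribute: it only records state. */
struct fake_attr {
	int fail;     /* non-zero: creation fails with this error code */
	int created;  /* set while the attribute exists */
};

static int fake_create(struct fake_attr *a)
{
	if (a->fail)
		return a->fail;
	a->created = 1;
	return 0;
}

static void fake_remove(struct fake_attr *a)
{
	a->created = 0;
}

/* Create every attribute; on failure undo only the ones already created,
 * in reverse order, and return the error. */
static int add_all(struct fake_attr *attrs, size_t n)
{
	size_t i;
	int rc;

	for (i = 0; i < n; i++) {
		rc = fake_create(&attrs[i]);
		if (rc)
			goto err;
	}
	return 0;

err:
	while (i--)
		fake_remove(&attrs[i]);
	return rc;
}
```

The `while (i--)` form starts at the last successfully created entry because `i` still indexes the one that failed when control reaches the label.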
Re: [PATCH v3 16/27] powerpc/powernv/pmem: Register a character device for userspace to interact with
On 21/2/20 2:27 pm, Alastair D'Silva wrote:
> From: Alastair D'Silva
>
> This patch introduces a character device (/dev/ocxl-scmX) which further
> patches will use to interact with userspace.

As with the comments on other patches in this series, this commit
message is lacking in explanation. What's the purpose of this device?

> Signed-off-by: Alastair D'Silva
> ---
>  arch/powerpc/platforms/powernv/pmem/ocxl.c    | 116 +-
>  .../platforms/powernv/pmem/ocxl_internal.h    |   2 +
>  2 files changed, 116 insertions(+), 2 deletions(-)
>
> diff --git a/arch/powerpc/platforms/powernv/pmem/ocxl.c b/arch/powerpc/platforms/powernv/pmem/ocxl.c
> index b8bd7e703b19..63109a870d2c 100644
> --- a/arch/powerpc/platforms/powernv/pmem/ocxl.c
> +++ b/arch/powerpc/platforms/powernv/pmem/ocxl.c
> @@ -10,6 +10,7 @@
>  #include
>  #include
>  #include
> +#include
>  #include
>  #include
>  #include "ocxl_internal.h"
> @@ -339,6 +340,9 @@ static void free_ocxlpmem(struct ocxlpmem *ocxlpmem)
>
>  	free_minor(ocxlpmem);
>
> +	if (ocxlpmem->cdev.owner)
> +		cdev_del(&ocxlpmem->cdev);
> +
>  	if (ocxlpmem->metadata_addr)
>  		devm_memunmap(&ocxlpmem->dev, ocxlpmem->metadata_addr);
>
> @@ -396,6 +400,70 @@ static int ocxlpmem_register(struct ocxlpmem *ocxlpmem)
>  	return device_register(&ocxlpmem->dev);
>  }
>
> +static void ocxlpmem_put(struct ocxlpmem *ocxlpmem)
> +{
> +	put_device(&ocxlpmem->dev);
> +}
> +
> +static struct ocxlpmem *ocxlpmem_get(struct ocxlpmem *ocxlpmem)
> +{
> +	return (get_device(&ocxlpmem->dev) == NULL) ? NULL : ocxlpmem;
> +}
> +
> +static struct ocxlpmem *find_and_get_ocxlpmem(dev_t devno)
> +{
> +	struct ocxlpmem *ocxlpmem;
> +	int minor = MINOR(devno);
> +	/*
> +	 * We don't declare an RCU critical section here, as our AFU
> +	 * is protected by a reference counter on the device. By the time the
> +	 * minor number of a device is removed from the idr, the ref count of
> +	 * the device is already at 0, so no user API will access that AFU and
> +	 * this function can't return it.
> +	 */
> +	ocxlpmem = idr_find(&minors_idr, minor);
> +	if (ocxlpmem)
> +		ocxlpmem_get(ocxlpmem);
> +	return ocxlpmem;
> +}
> +
> +static int file_open(struct inode *inode, struct file *file)
> +{
> +	struct ocxlpmem *ocxlpmem;
> +
> +	ocxlpmem = find_and_get_ocxlpmem(inode->i_rdev);
> +	if (!ocxlpmem)
> +		return -ENODEV;
> +
> +	file->private_data = ocxlpmem;
> +	return 0;
> +}
> +
> +static int file_release(struct inode *inode, struct file *file)
> +{
> +	struct ocxlpmem *ocxlpmem = file->private_data;
> +
> +	ocxlpmem_put(ocxlpmem);
> +	return 0;
> +}
> +
> +static const struct file_operations fops = {
> +	.owner		= THIS_MODULE,
> +	.open		= file_open,
> +	.release	= file_release,
> +};
> +
> +/**
> + * create_cdev() - Create the chardev in /dev for the device
> + * @ocxlpmem: the SCM metadata
> + * Return: 0 on success, negative on failure
> + */
> +static int create_cdev(struct ocxlpmem *ocxlpmem)
> +{
> +	cdev_init(&ocxlpmem->cdev, &fops);
> +	return cdev_add(&ocxlpmem->cdev, ocxlpmem->dev.devt, 1);
> +}
> +
>  /**
>   * ocxlpmem_remove() - Free an OpenCAPI persistent memory device
>   * @pdev: the PCI device information struct
> @@ -572,6 +640,11 @@ static int probe(struct pci_dev *pdev, const struct pci_device_id *ent)
>  		goto err;
>  	}
>
> +	if (create_cdev(ocxlpmem)) {
> +		dev_err(&pdev->dev, "Could not create character device\n");
> +		goto err;
> +	}
> +
>  	elapsed = 0;
>  	timeout = ocxlpmem->readiness_timeout + ocxlpmem->memory_available_timeout;
>  	while (!is_usable(ocxlpmem, false)) {
> @@ -613,20 +686,59 @@ static struct pci_driver pci_driver = {
>  	.shutdown = ocxlpmem_remove,
>  };
>
> +static int file_init(void)
> +{
> +	int rc;
> +
> +	mutex_init(&minors_idr_lock);
> +	idr_init(&minors_idr);
> +
> +	rc = alloc_chrdev_region(&ocxlpmem_dev, 0, NUM_MINORS, "ocxl-pmem");

If the driver is going to be called "ocxlpmem" can we standardise on
that without the extra hyphen?

> +	if (rc) {
> +		idr_destroy(&minors_idr);
> +		pr_err("Unable to allocate OpenCAPI persistent memory major number: %d\n", rc);
> +		return rc;
> +	}
> +
> +	ocxlpmem_class = class_create(THIS_MODULE, "ocxl-pmem");
> +	if (IS_ERR(ocxlpmem_class)) {
> +		idr_destroy(&minors_idr);
> +		pr_err("Unable to create ocxl-pmem class\n");
> +		unregister_chrdev_region(ocxlpmem_dev, NUM_MINORS);
> +		return PTR_ERR(ocxlpmem_class);
> +	}
> +
> +	return 0;
> +}
> +
> +static void file_exit(void)
> +{
> +	class_destroy(ocxlpmem_class);
> +	unregister_chrdev_region(ocxlpmem_dev, NUM_MINORS);
> +	idr_destroy(&minors_idr);
> +}
> +
>  static int __init ocxlpmem_init(void)
>  {
> -	int rc = 0;
> +	int rc;
>
> -	rc = pci_register_driver(&pci_driver);
> +	rc = file_init();
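The open/release pairing in the quoted patch (look the device up by minor, take a reference, drop it on release) can be modelled in a few lines of userspace C. This is a deliberately simplified, single-threaded sketch: `struct fake_dev`, `minors[]`, `find_and_get()` and `put_dev()` are hypothetical stand-ins for the ocxlpmem structure, the idr, and the get_device()/put_device() wrappers, and it has none of the locking or atomic counting a driver needs. Unlike the quoted code, it also refuses to hand out a device whose count has already reached zero.

```c
#include <stddef.h>

#define NUM_MINORS 4  /* illustrative table size, not the driver's value */

struct fake_dev {
	int refcount;  /* 0 means the device is being torn down */
};

/* Stands in for the idr that maps minor numbers to devices. */
static struct fake_dev *minors[NUM_MINORS];

/* Look up a device by minor and take a reference if it is still live. */
static struct fake_dev *find_and_get(unsigned int minor)
{
	struct fake_dev *dev;

	if (minor >= NUM_MINORS)
		return NULL;
	dev = minors[minor];
	if (!dev || dev->refcount == 0)
		return NULL;
	dev->refcount++;
	return dev;
}

/* Drop a reference taken by find_and_get(). */
static void put_dev(struct fake_dev *dev)
{
	dev->refcount--;
}
```

In the driver, the pointer returned at open time is stored in `file->private_data` so that release can drop exactly the reference that open took.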
[RFC 05/11] perf tools: Enable record and script to record and show hazard data
From: Madhavan Srinivasan

Introduce new perf record option "--hazard" to capture cpu pipeline
hazard data. Also enable perf script -D to dump raw values of it.
Sample o/p:

  $ ./perf record -e r4010e --hazard -- ls
  $ ./perf script -D
  ... PERF_RECORD_SAMPLE(IP, 0x2): ...
  hazard information:
  Inst Type 0x1
  Inst Cache 0x1
  Hazard Stage 0x4
  Hazard Reason 0x3
  Stall Stage 0x4
  Stall Reason 0x2

Signed-off-by: Madhavan Srinivasan
Signed-off-by: Ravi Bangoria
---
 tools/perf/Documentation/perf-record.txt  |  3 +++
 tools/perf/builtin-record.c               |  1 +
 tools/perf/util/event.h                   |  1 +
 tools/perf/util/evsel.c                   | 10 ++
 tools/perf/util/perf_event_attr_fprintf.c |  1 +
 tools/perf/util/record.h                  |  1 +
 tools/perf/util/session.c                 | 16
 7 files changed, 33 insertions(+)

diff --git a/tools/perf/Documentation/perf-record.txt b/tools/perf/Documentation/perf-record.txt
index b23a4012a606..e7bd1b6938ce 100644
--- a/tools/perf/Documentation/perf-record.txt
+++ b/tools/perf/Documentation/perf-record.txt
@@ -283,6 +283,9 @@ OPTIONS
 --phys-data::
 	Record the sample physical addresses.
 
+--hazard::
+	Record processor pipeline hazard and stall information.
+
 -T::
 --timestamp::
 	Record the sample timestamps. Use it with 'perf report -D' to see the
diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 4c301466101b..6bd32d7bc4e9 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -2301,6 +2301,7 @@ static struct option __record_options[] = {
 	OPT_BOOLEAN('s', "stat", &record.opts.inherit_stat,
 		    "per thread counts"),
 	OPT_BOOLEAN('d', "data", &record.opts.sample_address, "Record the sample addresses"),
+	OPT_BOOLEAN(0, "hazard", &record.opts.hazard, "Record processor pipeline hazard and stall information"),
 	OPT_BOOLEAN(0, "phys-data", &record.opts.sample_phys_addr,
 		    "Record the sample physical addresses"),
 	OPT_BOOLEAN(0, "sample-cpu", &record.opts.sample_cpu, "Record the sample cpu"),
diff --git a/tools/perf/util/event.h b/tools/perf/util/event.h
index 85223159737c..ff0f03253a95 100644
--- a/tools/perf/util/event.h
+++ b/tools/perf/util/event.h
@@ -148,6 +148,7 @@ struct perf_sample {
 	struct stack_dump user_stack;
 	struct sample_read read;
 	struct aux_sample aux_sample;
+	struct perf_pipeline_haz_data *pipeline_haz;
 };
 
 #define PERF_MEM_DATA_SRC_NONE \
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index c8dc4450884c..e37ed7929c2c 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -1080,6 +1080,9 @@ void perf_evsel__config(struct evsel *evsel, struct record_opts *opts,
 	if (opts->sample_phys_addr)
 		perf_evsel__set_sample_bit(evsel, PHYS_ADDR);
 
+	if (opts->hazard)
+		perf_evsel__set_sample_bit(evsel, PIPELINE_HAZ);
+
 	if (opts->no_buffering) {
 		attr->watermark = 0;
 		attr->wakeup_events = 1;
@@ -2265,6 +2268,13 @@ int perf_evsel__parse_sample(struct evsel *evsel, union perf_event *event,
 		array = (void *)array + sz;
 	}
 
+	if (type & PERF_SAMPLE_PIPELINE_HAZ) {
+		sz = sizeof(struct perf_pipeline_haz_data);
+		OVERFLOW_CHECK(array, sz, max_size);
+		data->pipeline_haz = (struct perf_pipeline_haz_data *)array;
+		array = (void *)array + sz;
+	}
+
 	return 0;
 }
 
diff --git a/tools/perf/util/perf_event_attr_fprintf.c
b/tools/perf/util/perf_event_attr_fprintf.c index 651203126c71..d97e755c886b 100644 --- a/tools/perf/util/perf_event_attr_fprintf.c +++ b/tools/perf/util/perf_event_attr_fprintf.c @@ -35,6 +35,7 @@ static void __p_sample_type(char *buf, size_t size, u64 value) bit_name(BRANCH_STACK), bit_name(REGS_USER), bit_name(STACK_USER), bit_name(IDENTIFIER), bit_name(REGS_INTR), bit_name(DATA_SRC), bit_name(WEIGHT), bit_name(PHYS_ADDR), bit_name(AUX), + bit_name(PIPELINE_HAZ), { .name = NULL, } }; #undef bit_name diff --git a/tools/perf/util/record.h b/tools/perf/util/record.h index 5421fd2ad383..f1678a0bc8ce 100644 --- a/tools/perf/util/record.h +++ b/tools/perf/util/record.h @@ -67,6 +67,7 @@ struct record_opts { int affinity; int mmap_flush; unsigned int comp_level; + bool hazard; }; extern const char * const *record_usage; diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c index d0d7d25b23e3..834ca7df2349 100644 --- a/tools/perf/util/session.c +++ b/tools/perf/util/session.c @@ -1153,6 +1153,19 @@ static void stack_user__printf(struct stack_dump *dump) dump->size, dump->offset); } +static void pipeline_hazard__printf(struct perf_sample *sampl
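The parse step this patch adds to perf_evsel__parse_sample() amounts to a bounds check followed by a fixed-size read. A minimal userspace sketch of that idea, assuming a hypothetical `struct haz_sample` whose one-byte fields borrow the names used in this series (the real layout of struct perf_pipeline_haz_data comes from patch 02 and is not reproduced here):

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for struct perf_pipeline_haz_data; layout is assumed. */
struct haz_sample {
	uint8_t itype;
	uint8_t icache;
	uint8_t hazard_stage;
	uint8_t hazard_reason;
	uint8_t stall_stage;
	uint8_t stall_reason;
	uint16_t pad;
};

/*
 * Copy one fixed-size record out of the sample payload.
 * Returns the number of bytes consumed, or -1 if the record would
 * run past the end of the payload.
 */
static long haz_sample_parse(const uint8_t *payload, size_t remaining,
			     struct haz_sample *out)
{
	if (remaining < sizeof(*out))
		return -1;
	/* memcpy avoids assuming the payload is suitably aligned. */
	memcpy(out, payload, sizeof(*out));
	return (long)sizeof(*out);
}
```

The caller would advance its cursor by the returned length, which is what the `array = (void *)array + sz` line does in the hunk above.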
[RFC 11/11] perf annotate: Show hazard data in tui mode
Enable the perf report->annotate TUI mode to show hazard information. It is
hidden by default, but the user can unhide it by pressing the hot key 'S'.

Sample o/p:

         │Disassembly of section .text:
         │
         │10001cf8 :
         │compare():
         │return NULL;
         │}
         │
         │static int
         │compare(const void *p1, const void *p2)
         │{
   33.23 │      std    r31,-8(r1)
         │       {haz_stage: LSU, haz_reason: ERAT Miss, stall_stage: LSU, stall_reason: Store, icache: L1 hit}
         │       {haz_stage: LSU, haz_reason: ERAT Miss, stall_stage: LSU, stall_reason: Store, icache: L1 hit}
         │       {haz_stage: LSU, haz_reason: Load Hit Store, stall_stage: LSU, stall_reason: -, icache: L3 hit}
         │       {haz_stage: LSU, haz_reason: ERAT Miss, stall_stage: -, stall_reason: -, icache: L1 hit}
         │       {haz_stage: LSU, haz_reason: ERAT Miss, stall_stage: LSU, stall_reason: Store, icache: L1 hit}
         │       {haz_stage: LSU, haz_reason: ERAT Miss, stall_stage: LSU, stall_reason: Store, icache: L1 hit}
    0.84 │      stdu   r1,-64(r1)
         │       {haz_stage: LSU, haz_reason: ERAT Miss, stall_stage: -, stall_reason: -, icache: L1 hit}
    0.24 │      mr     r31,r1
         │       {haz_stage: -, haz_reason: -, stall_stage: -, stall_reason: -, icache: L1 hit}
   21.18 │      std    r3,32(r31)
         │       {haz_stage: LSU, haz_reason: ERAT Miss, stall_stage: LSU, stall_reason: Store, icache: L1 hit}
         │       {haz_stage: LSU, haz_reason: ERAT Miss, stall_stage: LSU, stall_reason: Store, icache: L1 hit}
         │       {haz_stage: LSU, haz_reason: ERAT Miss, stall_stage: LSU, stall_reason: Store, icache: L1 hit}

Signed-off-by: Ravi Bangoria
---
 tools/perf/builtin-annotate.c     |   5 ++
 tools/perf/ui/browsers/annotate.c | 124 ++
 tools/perf/util/annotate.c        |  51 +++-
 tools/perf/util/annotate.h        |  18 -
 4 files changed, 178 insertions(+), 20 deletions(-)

diff --git a/tools/perf/builtin-annotate.c b/tools/perf/builtin-annotate.c
index 78552a9428a6..a51313a6b019 100644
--- a/tools/perf/builtin-annotate.c
+++ b/tools/perf/builtin-annotate.c
@@ -472,6 +472,7 @@ static const char * const annotate_usage[] = {

 int cmd_annotate(int argc, const char **argv)
 {
+	bool annotate_haz = false;
 	struct perf_annotate annotate = {
 		.tool = {
 			.sample = process_sample_event,
@@ -531,6 +532,8 @@ int cmd_annotate(int argc, const char **argv)
 		    symbol__config_symfs),
 	OPT_BOOLEAN(0, "source", &annotate.opts.annotate_src,
 		    "Interleave source code with assembly code (default)"),
+	OPT_BOOLEAN(0, "hazard", &annotate_haz,
+		    "Interleave CPU pipeline hazard/stall data with assembly code"),
 	OPT_BOOLEAN(0, "asm-raw", &annotate.opts.show_asm_raw,
 		    "Display raw encoding of assembly instructions (default)"),
 	OPT_STRING('M', "disassembler-style", &annotate.opts.disassembler_style, "disassembler style",
@@ -583,6 +586,8 @@ int cmd_annotate(int argc, const char **argv)
 	if (annotate_check_args(&annotate.opts) < 0)
 		return -EINVAL;

+	annotate.opts.hide_haz_data = !annotate_haz;
+
 	if (symbol_conf.show_nr_samples && annotate.use_gtk) {
 		pr_err("--show-nr-samples is not available in --gtk mode at this time\n");
 		return ret;
diff --git a/tools/perf/ui/browsers/annotate.c b/tools/perf/ui/browsers/annotate.c
index 2e4db8216b3b..b04d825cee50 100644
--- a/tools/perf/ui/browsers/annotate.c
+++ b/tools/perf/ui/browsers/annotate.c
@@ -190,9 +190,15 @@ static void annotate_browser__draw_current_jump(struct ui_browser *browser)
 		return;
 	}

-	if (notes->options->hide_src_code) {
+	if (notes->options->hide_src_code && notes->options->hide_haz_data) {
 		from = cursor->al.idx_asm;
 		to = target->idx_asm;
+	} else if (!notes->options->hide_src_code && notes->options->hide_haz_data) {
+		from = cursor->al.idx_asm + cursor->al.idx_src + 1;
+		to = target->idx_asm + target->idx_src + 1;
+	} else if (notes->options->hide_src_code && !notes->options->hide_haz_data) {
+		from = cursor->al.idx_asm + cursor->al.idx_haz + 1;
+		to = target->idx_asm + target->idx_haz + 1;
 	} else {
 		from = (u64)cursor->al.idx;
 		to = (u64)target->idx;
@@ -293,8 +299,13 @@ static void annotate_browser__set_rb_top(struct annotate_browser *browser,
 	struct annotation_line * pos = rb_entry(nd, struct annotation_line, rb_node);
 	u32 idx = pos->idx;

-	if (notes->options->hide_src_code)
+	if (notes->options->hide_src_code && notes->options->hide_haz_data)
 		idx = pos->idx_asm;
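The index selection in annotate_browser__draw_current_jump() has four cases, one per combination of the two hide flags. A small sketch of the same decision as a standalone function; `struct line_idx` is hypothetical, and the meanings given for idx_src and idx_haz are assumptions based only on how the hunk above combines them:

```c
#include <stdbool.h>

/* Hypothetical per-line counters; names follow the fields used in the hunk. */
struct line_idx {
	unsigned int idx;     /* index with every kind of line shown */
	unsigned int idx_asm; /* index among assembly lines only */
	unsigned int idx_src; /* source lines before this one (assumed meaning) */
	unsigned int idx_haz; /* hazard lines before this one (assumed meaning) */
};

/* Pick the on-screen row for a line, given which kinds of line are hidden. */
static unsigned int visible_row(const struct line_idx *l,
				bool hide_src, bool hide_haz)
{
	if (hide_src && hide_haz)
		return l->idx_asm;
	if (!hide_src && hide_haz)
		return l->idx_asm + l->idx_src + 1;
	if (hide_src && !hide_haz)
		return l->idx_asm + l->idx_haz + 1;
	return l->idx;
}
```

Keeping the four cases in one helper would avoid repeating the flag tests at every call site, although the patch as posted open-codes them.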
[RFC 09/11] perf annotate: Introduce type for annotation_line
struct annotation_line can contain either an assembly instruction or a
source code line. To distinguish between them we currently use offset:
if offset is -1 it's a source line, otherwise it's assembly. This is a
bit cryptic when you first read the code. Introduce a new field 'type'
that denotes the type of data the annotation_line object contains.

Signed-off-by: Ravi Bangoria
---
 tools/perf/ui/browsers/annotate.c |  4 ++--
 tools/perf/ui/gtk/annotate.c      |  6 +++---
 tools/perf/util/annotate.c        | 27 ---
 tools/perf/util/annotate.h        |  8 +++-
 4 files changed, 28 insertions(+), 17 deletions(-)

diff --git a/tools/perf/ui/browsers/annotate.c b/tools/perf/ui/browsers/annotate.c
index 9023267e5643..2e4db8216b3b 100644
--- a/tools/perf/ui/browsers/annotate.c
+++ b/tools/perf/ui/browsers/annotate.c
@@ -317,7 +317,7 @@ static void annotate_browser__calc_percent(struct annotate_browser *browser,
 		double max_percent = 0.0;
 		int i;

-		if (pos->al.offset == -1) {
+		if (pos->al.type != AL_TYPE_ASM) {
 			RB_CLEAR_NODE(&pos->al.rb_node);
 			continue;
 		}
@@ -816,7 +816,7 @@ static int annotate_browser__run(struct annotate_browser *browser,
 			if (browser->selection == NULL)
 				ui_helpline__puts("Huh? No selection. Report to linux-ker...@vger.kernel.org");
-			else if (browser->selection->offset == -1)
+			else if (browser->selection->type != AL_TYPE_ASM)
 				ui_helpline__puts("Actions are only available for assembly lines.");
 			else if (!dl->ins.ops)
 				goto show_sup_ins;
diff --git a/tools/perf/ui/gtk/annotate.c b/tools/perf/ui/gtk/annotate.c
index 35f9641bf670..71c792a0b17d 100644
--- a/tools/perf/ui/gtk/annotate.c
+++ b/tools/perf/ui/gtk/annotate.c
@@ -35,7 +35,7 @@ static int perf_gtk__get_percent(char *buf, size_t size, struct symbol *sym,

 	strcpy(buf, "");

-	if (dl->al.offset == (s64) -1)
+	if (dl->al.type != AL_TYPE_ASM)
 		return 0;

 	symhist = annotation__histogram(symbol__annotation(sym), evidx);
@@ -61,7 +61,7 @@ static int perf_gtk__get_offset(char *buf, size_t size, struct map_symbol *ms,

 	strcpy(buf, "");

-	if (dl->al.offset == (s64) -1)
+	if (dl->al.type != AL_TYPE_ASM)
 		return 0;

 	return scnprintf(buf, size, "%"PRIx64, start + dl->al.offset);
@@ -78,7 +78,7 @@ static int perf_gtk__get_line(char *buf, size_t size, struct disasm_line *dl)
 	if (!line)
 		return 0;

-	if (dl->al.offset != (s64) -1)
+	if (dl->al.type == AL_TYPE_ASM)
 		markup = NULL;

 	if (markup)
diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index 4e2706274d85..8aef60a6ffea 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -1148,6 +1148,7 @@ struct annotate_args {
 	struct evsel *evsel;
 	struct annotation_options *options;
 	s64 offset;
+	u8 type;
 	char *line;
 	int line_nr;
 };
@@ -1156,6 +1157,7 @@ static void annotation_line__init(struct annotation_line *al,
 				  struct annotate_args *args,
 				  int nr)
 {
+	al->type = args->type;
 	al->offset = args->offset;
 	al->line = strdup(args->line);
 	al->line_nr = args->line_nr;
@@ -1202,7 +1204,7 @@ static struct disasm_line *disasm_line__new(struct annotate_args *args)
 	if (dl->al.line == NULL)
 		goto out_delete;

-	if (args->offset != -1) {
+	if (dl->al.type == AL_TYPE_ASM) {
 		if (disasm_line__parse(dl->al.line, &dl->ins.name, &dl->ops.raw) < 0)
 			goto out_free_line;
@@ -1246,7 +1248,7 @@ struct annotation_line *
 annotation_line__next(struct annotation_line *pos, struct list_head *head)
 {
 	list_for_each_entry_continue(pos, head, node)
-		if (pos->offset >= 0)
+		if (pos->type == AL_TYPE_ASM)
 			return pos;

 	return NULL;
@@ -1357,7 +1359,7 @@ annotation_line__print(struct annotation_line *al, struct symbol *sym, u64 start
 	static const char *prev_line;
 	static const char *prev_color;

-	if (al->offset != -1) {
+	if (al->type == AL_TYPE_ASM) {
 		double max_percent = 0.0;
 		int i, nr_percent = 1;
 		const char *color;
@@ -1500,6 +1502,7 @@ static int symbol__parse_objdump_line(struct symbol *sym,
 	}

 	args->offset = offset;
+	args->type = (offset != -1) ? AL_TYPE_ASM : AL_TYPE_SRC;
 	args->line = parsed_line;
 	args->lin
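The change replaces a sentinel comparison (offset == -1) with an explicit tag. A minimal sketch of the pattern outside perf; the enumerator names are taken from the patch, while their numeric values and the `struct line` wrapper are assumptions made for illustration:

```c
#include <stddef.h>
#include <stdint.h>

/* Names from the patch; the numeric values are assumed for this sketch. */
enum al_type { AL_TYPE_ASM, AL_TYPE_SRC };

/* Hypothetical cut-down line record. */
struct line {
	int64_t offset;    /* -1 marks a source line in the old scheme */
	enum al_type type; /* explicit tag introduced by the patch */
};

/* Derive the tag once, at parse time, from the legacy sentinel. */
static enum al_type al_type_from_offset(int64_t offset)
{
	return offset != -1 ? AL_TYPE_ASM : AL_TYPE_SRC;
}

/* Later code tests the tag instead of repeating the sentinel check. */
static size_t count_asm_lines(const struct line *lines, size_t n)
{
	size_t i, count = 0;

	for (i = 0; i < n; i++)
		if (lines[i].type == AL_TYPE_ASM)
			count++;
	return count;
}
```

A tag also leaves room for a third kind of line later, which a two-state sentinel on offset cannot express.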
[RFC 10/11] perf annotate: Preparation for hazard
Introduce 'struct hazard_hist', which holds hazard specific information
for annotate. Add an array of lists of 'struct hazard_hist' to 'struct
annotated_source', where the array length equals the symbol size and each
list member holds the hazard info from one associated perf sample. This
information is prepared while parsing samples in perf report. This is
only a preparation step for annotate; a followup patch makes the actual
annotate UI changes.

Signed-off-by: Ravi Bangoria
---
 tools/perf/builtin-report.c |  1 +
 tools/perf/util/annotate.c  | 75 +
 tools/perf/util/annotate.h  | 14 +++
 tools/perf/util/hist.c      | 13 +++
 tools/perf/util/hist.h      |  4 ++
 tools/perf/util/machine.c   |  6 +++
 tools/perf/util/machine.h   |  3 ++
 7 files changed, 116 insertions(+)

diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index a47542a12da1..ff950ff8dd51 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -301,6 +301,7 @@ static int process_sample_event(struct perf_tool *tool,
 		hist__account_cycles(sample->branch_stack, &al, sample,
 				     rep->nonany_branch_mode,
 				     &rep->total_cycles);
+		hist__capture_haz_info(&al, sample, evsel);
 	}

 	ret = hist_entry_iter__add(&iter, &al, rep->max_stack, rep);
diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index 8aef60a6ffea..766934b0f36d 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -36,6 +36,7 @@
 #include "string2.h"
 #include "util/event.h"
 #include "arch/common.h"
+#include "hazard.h"
 #include
 #include
 #include
@@ -800,6 +801,21 @@ static int symbol__alloc_hist_cycles(struct symbol *sym)
 	return 0;
 }

+static int symbol__alloc_hist_hazard(struct symbol *sym)
+{
+	struct annotation *notes = symbol__annotation(sym);
+	const size_t size = symbol__size(sym);
+	size_t i;
+
+	notes->src->haz_hist = calloc(size, sizeof(struct hazard_hist));
+	if (notes->src->haz_hist == NULL)
+		return -1;
+
+	for (i = 0; i < size; i++)
+		INIT_LIST_HEAD(&notes->src->haz_hist[i].list);
+	return 0;
+}
+
 void symbol__annotate_zero_histograms(struct symbol *sym)
 {
 	struct annotation *notes = symbol__annotation(sym);
@@ -920,6 +936,25 @@ static struct cyc_hist *symbol__cycles_hist(struct symbol *sym)
 	return notes->src->cycles_hist;
 }

+static struct hazard_hist *symbol__hazard_hist(struct symbol *sym)
+{
+	struct annotation *notes = symbol__annotation(sym);
+
+	if (notes->src == NULL) {
+		notes->src = annotated_source__new();
+		if (notes->src == NULL)
+			return NULL;
+		goto alloc_haz_hist;
+	}
+
+	if (!notes->src->haz_hist) {
+alloc_haz_hist:
+		symbol__alloc_hist_hazard(sym);
+	}
+
+	return notes->src->haz_hist;
+}
+
 struct annotated_source *symbol__hists(struct symbol *sym, int nr_hists)
 {
 	struct annotation *notes = symbol__annotation(sym);
@@ -1014,6 +1049,46 @@ int addr_map_symbol__account_cycles(struct addr_map_symbol *ams,
 	return err;
 }

+int symbol__capture_haz_info(struct addr_map_symbol *ams,
+			     struct perf_sample *sample,
+			     struct evsel *evsel)
+{
+	struct hazard_hist *hh, *tmp;
+	u64 offset;
+	const char *arch = perf_env__arch(perf_evsel__env(evsel));
+
+	if (ams->ms.sym == NULL)
+		return 0;
+
+	hh = symbol__hazard_hist(ams->ms.sym);
+	if (!hh)
+		return -ENOMEM;
+
+	if (ams->al_addr < ams->ms.sym->start || ams->al_addr >= ams->ms.sym->end)
+		return -ERANGE;
+
+	offset = ams->al_addr - ams->ms.sym->start;
+
+	tmp = zalloc(sizeof(*tmp));
+	if (!tmp)
+		return -ENOMEM;
+
+	tmp->icache = perf_haz__icache_str(sample->pipeline_haz->icache, arch);
+	tmp->haz_stage = perf_haz__hstage_str(sample->pipeline_haz->hazard_stage,
+					      arch);
+	tmp->haz_reason = perf_haz__hreason_str(sample->pipeline_haz->hazard_stage,
+						sample->pipeline_haz->hazard_reason,
+						arch);
+	tmp->stall_stage = perf_haz__sstage_str(sample->pipeline_haz->stall_stage,
+						arch);
+	tmp->stall_reason = perf_haz__sreason_str(sample->pipeline_haz->stall_stage,
+						  sample->pipeline_haz->stall_reason,
+						  arch);
+
+	list_add(&tmp->list, &hh[offset].list);
+	return 0;
+}
+
 static unsigned annotation__count_insn(struct annotation *notes, u64 start, u64 end)
 {
 	unsigned n_insn = 0;
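The data structure described in the commit message is an array with one list head per byte offset in the symbol, with each sample appended to the list for its offset. A userspace sketch of that shape using a plain singly linked list instead of the kernel's list_head; all type and function names here are hypothetical:

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* One record per sample; hypothetical, reduced to a single string field. */
struct haz_rec {
	const char *haz_reason;
	struct haz_rec *next;
};

/* One list head per byte offset in the half-open range [start, end). */
struct haz_hist {
	uint64_t start, end;
	struct haz_rec **slots;
};

static int haz_hist_init(struct haz_hist *h, uint64_t start, uint64_t end)
{
	h->start = start;
	h->end = end;
	h->slots = calloc(end - start, sizeof(*h->slots));
	return h->slots ? 0 : -ENOMEM;
}

/* Attach a record to the slot for addr; reject addresses outside the symbol. */
static int haz_hist_add(struct haz_hist *h, uint64_t addr, const char *reason)
{
	struct haz_rec *rec;

	if (addr < h->start || addr >= h->end)
		return -ERANGE;

	rec = calloc(1, sizeof(*rec));
	if (!rec)
		return -ENOMEM;

	rec->haz_reason = reason;
	rec->next = h->slots[addr - h->start];
	h->slots[addr - h->start] = rec;
	return 0;
}

/* Number of records stored for one address; 0 if addr is out of range. */
static size_t haz_hist_count(const struct haz_hist *h, uint64_t addr)
{
	const struct haz_rec *rec;
	size_t n = 0;

	if (addr < h->start || addr >= h->end)
		return 0;
	for (rec = h->slots[addr - h->start]; rec; rec = rec->next)
		n++;
	return n;
}

static void haz_hist_free(struct haz_hist *h)
{
	uint64_t i;

	for (i = 0; i < h->end - h->start; i++) {
		while (h->slots[i]) {
			struct haz_rec *rec = h->slots[i];

			h->slots[i] = rec->next;
			free(rec);
		}
	}
	free(h->slots);
	h->slots = NULL;
}
```

Memory use grows with the number of samples rather than the number of distinct hazard tuples, which is the trade-off this layout accepts for keeping every sample visible in the annotate view.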
[RFC 08/11] perf report: Enable hazard mode
From: Madhavan Srinivasan

Introduce a --hazard option for perf report to show hazard data. The
hazard mode columns are Instruction Type, Hazard Stage, Hazard Reason,
Stall Stage, Stall Reason and ICache access. The default sort order is
sym, dso, inst type, hazard stage, hazard reason, stall stage, stall
reason, inst cache.

Sample o/p on IBM PowerPC machine:

  Overhead  Symbol          Shared  Instruction Type  Hazard Stage  Hazard Reason         Stall Stage  Stall Reason  ICache access
    36.58%  [.] thread_run  ebizzy  Load              LSU           Mispredict            LSU          Load fin      L1 hit
     9.46%  [.] thread_run  ebizzy  Load              LSU           Mispredict            LSU          Dcache_miss   L1 hit
     1.76%  [.] thread_run  ebizzy  Fixed point       -             -                     -            -             L1 hit
     1.31%  [.] thread_run  ebizzy  Load              LSU           ERAT Miss             LSU          Load fin      L1 hit
     1.27%  [.] thread_run  ebizzy  Load              LSU           Mispredict            -            -             L1 hit
     1.16%  [.] thread_run  ebizzy  Fixed point       -             -                     FXU          Fixed cycle   L1 hit
     0.50%  [.] thread_run  ebizzy  Fixed point       ISU           Source Unavailable    FXU          Fixed cycle   L1 hit
     0.30%  [.] thread_run  ebizzy  Load              LSU           LMQ Full, DERAT Miss  LSU          Load fin      L1 hit
     0.24%  [.] thread_run  ebizzy  Load              LSU           ERAT Miss             -            -             L1 hit
     0.08%  [.] thread_run  ebizzy  -                 -             -                     BRU          Fixed cycle   L1 hit
     0.05%  [.] thread_run  ebizzy  Branch            -             -                     BRU          Fixed cycle   L1 hit
     0.04%  [.] thread_run  ebizzy  Fixed point       ISU           Source Unavailable    -            -             L1 hit

Signed-off-by: Madhavan Srinivasan
Signed-off-by: Ravi Bangoria
---
 tools/perf/builtin-report.c |  28 +
 tools/perf/util/hist.c      |  77
 tools/perf/util/hist.h      |   7 ++
 tools/perf/util/sort.c      | 230
 tools/perf/util/sort.h      |  22
 5 files changed, 364 insertions(+)

diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 72a12b69f120..a47542a12da1 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -77,6 +77,7 @@ struct report {
 	bool show_threads;
 	bool inverted_callchain;
 	bool mem_mode;
+	bool hazard;
 	bool stats_mode;
 	bool tasks_mode;
 	bool mmaps_mode;
@@ -285,6 +286,8 @@ static int process_sample_event(struct perf_tool *tool,
 		iter.ops = &hist_iter_branch;
 	} else if (rep->mem_mode) {
 		iter.ops = &hist_iter_mem;
+	} else if (rep->hazard) {
+		iter.ops = &hist_iter_haz;
 	} else if (symbol_conf.cumulate_callchain) {
 		iter.ops = &hist_iter_cumulative;
 	} else {
@@ -396,6 +399,14 @@ static int report__setup_sample_type(struct report *rep)
 		}
 	}

+	if (sort__mode == SORT_MODE__HAZARD) {
+		if (!is_pipe && !(sample_type & PERF_SAMPLE_PIPELINE_HAZ)) {
+			ui__error("Selected --hazard but no hazard data. "
+				  "Did you call perf record without --hazard?\n");
+			return -1;
+		}
+	}
+
 	if (symbol_conf.use_callchain || symbol_conf.cumulate_callchain) {
 		if ((sample_type & PERF_SAMPLE_REGS_USER) &&
 		    (sample_type & PERF_SAMPLE_STACK_USER)) {
@@ -484,6 +495,9 @@ static size_t hists__fprintf_nr_sample_events(struct hists *hists, struct report
 	if (rep->mem_mode) {
 		ret += fprintf(fp, "\n# Total weight : %" PRIu64, nr_events);
 		ret += fprintf(fp, "\n# Sort order : %s", sort_order ? : default_mem_sort_order);
+	} else if (rep->hazard) {
+		ret += fprintf(fp, "\n# Event count (approx.): %" PRIu64, nr_events);
+		ret += fprintf(fp, "\n# Sort order: %s", sort_order ? : default_haz_sort_order);
 	} else
 		ret += fprintf(fp, "\n# Event count (approx.): %" PRIu64, nr_events);
@@ -1228,6 +1242,7 @@ int cmd_report(int argc, const char **argv)
 	OPT_BOOLEAN(0, "demangle-kernel", &symbol_conf.demangle_kernel,
 		    "Enable kernel symbol demangling"),
 	OPT_BOOLEAN(0, "mem-mode", &report.mem_mode, "mem access profile"),
+	OPT_BOOLEAN(0, "hazard", &report.hazard, "Processor pipeline hazard and stalls"),
 	OPT_INTEGER(0, "samples", &symbol_conf.res_sample
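The default sort order in the commit message is a precedence list: later keys only matter when all earlier keys compare equal. A rough illustration of that precedence with a plain qsort() comparator; `struct haz_row` and both functions are hypothetical, and perf's real implementation goes through its sort-entry machinery in util/sort.c rather than a single comparator like this:

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical flattened row; perf keeps these values in struct hist_entry. */
struct haz_row {
	const char *sym;
	const char *dso;
	uint8_t itype, hstage, hreason, sstage, sreason, icache;
};

static int cmp_u8(uint8_t a, uint8_t b)
{
	return (a > b) - (a < b);
}

/* Compare in the documented key order, stopping at the first difference. */
static int haz_row_cmp(const void *pa, const void *pb)
{
	const struct haz_row *a = pa, *b = pb;
	int rc;

	if ((rc = strcmp(a->sym, b->sym)) != 0)
		return rc;
	if ((rc = strcmp(a->dso, b->dso)) != 0)
		return rc;
	if ((rc = cmp_u8(a->itype, b->itype)) != 0)
		return rc;
	if ((rc = cmp_u8(a->hstage, b->hstage)) != 0)
		return rc;
	if ((rc = cmp_u8(a->hreason, b->hreason)) != 0)
		return rc;
	if ((rc = cmp_u8(a->sstage, b->sstage)) != 0)
		return rc;
	if ((rc = cmp_u8(a->sreason, b->sreason)) != 0)
		return rc;
	return cmp_u8(a->icache, b->icache);
}

static void haz_rows_sort(struct haz_row *rows, size_t n)
{
	qsort(rows, n, sizeof(*rows), haz_row_cmp);
}
```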
[RFC 07/11] perf hazard: Functions to convert generic hazard data to arch specific string
From: Madhavan Srinivasan

The kernel provides pipeline hazard data in struct perf_pipeline_haz_data
format. Add code to convert this data into a meaningful string that can be
shown in perf report (followup patch). Introduce the tools/perf/util/hazard
directory, which will contain arch specific directories. Under each arch
specific directory, add the arch specific logic that is called by generic
code. This directory structure is introduced to enable cross-arch
reporting.

Signed-off-by: Madhavan Srinivasan
Signed-off-by: Ravi Bangoria
---
 tools/perf/util/Build                         |   2 +
 tools/perf/util/hazard.c                      |  51 +++
 tools/perf/util/hazard.h                      |  14 ++
 tools/perf/util/hazard/Build                  |   1 +
 .../util/hazard/powerpc/perf_pipeline_haz.h   |  80 ++
 .../perf/util/hazard/powerpc/powerpc_hazard.c | 142 ++
 .../perf/util/hazard/powerpc/powerpc_hazard.h |  14 ++
 7 files changed, 304 insertions(+)
 create mode 100644 tools/perf/util/hazard.c
 create mode 100644 tools/perf/util/hazard.h
 create mode 100644 tools/perf/util/hazard/Build
 create mode 100644 tools/perf/util/hazard/powerpc/perf_pipeline_haz.h
 create mode 100644 tools/perf/util/hazard/powerpc/powerpc_hazard.c
 create mode 100644 tools/perf/util/hazard/powerpc/powerpc_hazard.h

diff --git a/tools/perf/util/Build b/tools/perf/util/Build
index 07da6c790b63..f5e1b7d79b6d 100644
--- a/tools/perf/util/Build
+++ b/tools/perf/util/Build
@@ -118,6 +118,7 @@ perf-y += parse-regs-options.o
 perf-y += term.o
 perf-y += help-unknown-cmd.o
 perf-y += mem-events.o
+perf-y += hazard.o
 perf-y += vsprintf.o
 perf-y += units.o
 perf-y += time-utils.o
@@ -153,6 +154,7 @@ perf-$(CONFIG_LIBUNWIND_AARCH64) += libunwind/arm64.o
 perf-$(CONFIG_LIBBABELTRACE) += data-convert-bt.o
 perf-y += scripting-engines/
+perf-y += hazard/
 perf-$(CONFIG_ZLIB) += zlib.o
 perf-$(CONFIG_LZMA) += lzma.o
diff --git a/tools/perf/util/hazard.c b/tools/perf/util/hazard.c
new file mode 100644
index ..db235b26b266
--- /dev/null
+++ b/tools/perf/util/hazard.c
@@ -0,0 +1,51 @@
+// SPDX-License-Identifier: GPL-2.0
+#include +#include "hazard/powerpc/powerpc_hazard.h" + +const char *perf_haz__itype_str(u8 itype, const char *arch) +{ + if (!strncmp(arch, "powerpc", strlen("powerpc"))) + return powerpc__haz__itype_str(itype); + + return "-"; +} + +const char *perf_haz__icache_str(u8 icache, const char *arch) +{ + if (!strncmp(arch, "powerpc", strlen("powerpc"))) + return powerpc__haz__icache_str(icache); + + return "-"; +} + +const char *perf_haz__hstage_str(u8 hstage, const char *arch) +{ + if (!strncmp(arch, "powerpc", strlen("powerpc"))) + return powerpc__haz__hstage_str(hstage); + + return "-"; +} + +const char *perf_haz__hreason_str(u8 hstage, u8 hreason, const char *arch) +{ + if (!strncmp(arch, "powerpc", strlen("powerpc"))) + return powerpc__haz__hreason_str(hstage, hreason); + + return "-"; +} + +const char *perf_haz__sstage_str(u8 sstage, const char *arch) +{ + if (!strncmp(arch, "powerpc", strlen("powerpc"))) + return powerpc__haz__sstage_str(sstage); + + return "-"; +} + +const char *perf_haz__sreason_str(u8 sstage, u8 sreason, const char *arch) +{ + if (!strncmp(arch, "powerpc", strlen("powerpc"))) + return powerpc__haz__sreason_str(sstage, sreason); + + return "-"; +} diff --git a/tools/perf/util/hazard.h b/tools/perf/util/hazard.h new file mode 100644 index ..eab4190e056a --- /dev/null +++ b/tools/perf/util/hazard.h @@ -0,0 +1,14 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef __PERF_HAZARD_H +#define __PERF_HAZARD_H + +#include "sort.h" + +const char *perf_haz__itype_str(u8 itype, const char *arch); +const char *perf_haz__icache_str(u8 icache, const char *arch); +const char *perf_haz__hstage_str(u8 hstage, const char *arch); +const char *perf_haz__hreason_str(u8 hstage, u8 hreason, const char *arch); +const char *perf_haz__sstage_str(u8 sstage, const char *arch); +const char *perf_haz__sreason_str(u8 sstage, u8 sreason, const char *arch); + +#endif /* __PERF_HAZARD_H */ diff --git a/tools/perf/util/hazard/Build b/tools/perf/util/hazard/Build new file mode 
100644 index ..314c5e316383 --- /dev/null +++ b/tools/perf/util/hazard/Build @@ -0,0 +1 @@ +perf-y += powerpc/powerpc_hazard.o diff --git a/tools/perf/util/hazard/powerpc/perf_pipeline_haz.h b/tools/perf/util/hazard/powerpc/perf_pipeline_haz.h new file mode 100644 index ..de8857ec31dd --- /dev/null +++ b/tools/perf/util/hazard/powerpc/perf_pipeline_haz.h @@ -0,0 +1,80 @@ +/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */ +#ifndef _UAPI_ASM_POWERPC_PERF_PIPELINE_HAZ_H +#define _UAPI_ASM_POWERPC_PERF_PIPELINE_HAZ_H + +enum perf_inst_type { + PERF_HAZ__ITYPE_LOAD = 1, + PERF_HAZ__ITYPE_STORE, + PERF_HAZ__ITYPE_B
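The generic helpers in hazard.c all follow one pattern: match the arch string by prefix, hand the raw value to an arch specific decoder, and fall back to "-" otherwise. A standalone sketch of that pattern for the icache field; the function names are hypothetical, "L1 hit" and "L3 hit" appear in this series' sample output, and the other two labels are assumed:

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical arch decoder; index matches the enum perf_inst_cache values. */
static const char *demo_powerpc_icache_str(uint8_t icache)
{
	/* "L2 hit" and "L3 miss" are assumed labels, not taken from the patch. */
	static const char *const names[] = {
		"-", "L1 hit", "L2 hit", "L3 hit", "L3 miss",
	};

	if (icache < sizeof(names) / sizeof(names[0]))
		return names[icache];
	return "-";
}

/* Generic entry point: dispatch on the arch prefix, default to "-". */
static const char *demo_haz_icache_str(uint8_t icache, const char *arch)
{
	if (!strncmp(arch, "powerpc", strlen("powerpc")))
		return demo_powerpc_icache_str(icache);

	return "-";
}
```

Because the dispatch keys on the arch recorded in the perf.data file rather than on the host, a file recorded on powerpc can in principle be decoded on another architecture, which is the cross-arch reporting goal stated in the commit message.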
[RFC 06/11] perf hists: Make a room for hazard info in struct hist_entry
From: Madhavan Srinivasan To enable hazard mode with perf report (followup patch) we need to have cpu pipeline hazard data available in hist_entry. Add hazard info into struct hist_entry. Also add hazard_info as parameter to hists__add_entry(). Signed-off-by: Madhavan Srinivasan Signed-off-by: Ravi Bangoria --- tools/perf/builtin-annotate.c | 2 +- tools/perf/builtin-c2c.c | 4 ++-- tools/perf/builtin-diff.c | 6 +++--- tools/perf/tests/hists_link.c | 4 ++-- tools/perf/util/hist.c| 22 +++--- tools/perf/util/hist.h| 2 ++ tools/perf/util/sort.h| 1 + 7 files changed, 26 insertions(+), 15 deletions(-) diff --git a/tools/perf/builtin-annotate.c b/tools/perf/builtin-annotate.c index 6c0a0412502e..78552a9428a6 100644 --- a/tools/perf/builtin-annotate.c +++ b/tools/perf/builtin-annotate.c @@ -249,7 +249,7 @@ static int perf_evsel__add_sample(struct evsel *evsel, if (ann->has_br_stack && has_annotation(ann)) return process_branch_callback(evsel, sample, al, ann, machine); - he = hists__add_entry(hists, al, NULL, NULL, NULL, sample, true); + he = hists__add_entry(hists, al, NULL, NULL, NULL, NULL, sample, true); if (he == NULL) return -ENOMEM; diff --git a/tools/perf/builtin-c2c.c b/tools/perf/builtin-c2c.c index 246ac0b4d54f..2a1cb5cda6d9 100644 --- a/tools/perf/builtin-c2c.c +++ b/tools/perf/builtin-c2c.c @@ -292,7 +292,7 @@ static int process_sample_event(struct perf_tool *tool __maybe_unused, c2c_decode_stats(&stats, mi); he = hists__add_entry_ops(&c2c_hists->hists, &c2c_entry_ops, - &al, NULL, NULL, mi, + &al, NULL, NULL, mi, NULL, sample, true); if (he == NULL) goto free_mi; @@ -326,7 +326,7 @@ static int process_sample_event(struct perf_tool *tool __maybe_unused, goto free_mi; he = hists__add_entry_ops(&c2c_hists->hists, &c2c_entry_ops, - &al, NULL, NULL, mi, + &al, NULL, NULL, mi, NULL, sample, true); if (he == NULL) goto free_mi; diff --git a/tools/perf/builtin-diff.c b/tools/perf/builtin-diff.c index f8b6ae557d8b..e32e91f89a18 100644 --- a/tools/perf/builtin-diff.c 
+++ b/tools/perf/builtin-diff.c @@ -412,15 +412,15 @@ static int diff__process_sample_event(struct perf_tool *tool, } if (compute != COMPUTE_CYCLES) { - if (!hists__add_entry(hists, &al, NULL, NULL, NULL, sample, - true)) { + if (!hists__add_entry(hists, &al, NULL, NULL, NULL, NULL, + sample, true)) { pr_warning("problem incrementing symbol period, " "skipping event\n"); goto out_put; } } else { if (!hists__add_entry_ops(hists, &block_hist_ops, &al, NULL, - NULL, NULL, sample, true)) { + NULL, NULL, NULL, sample, true)) { pr_warning("problem incrementing symbol period, " "skipping event\n"); goto out_put; diff --git a/tools/perf/tests/hists_link.c b/tools/perf/tests/hists_link.c index a024d3f3a412..112a90818d2e 100644 --- a/tools/perf/tests/hists_link.c +++ b/tools/perf/tests/hists_link.c @@ -86,7 +86,7 @@ static int add_hist_entries(struct evlist *evlist, struct machine *machine) if (machine__resolve(machine, &al, &sample) < 0) goto out; - he = hists__add_entry(hists, &al, NULL, + he = hists__add_entry(hists, &al, NULL, NULL, NULL, NULL, &sample, true); if (he == NULL) { addr_location__put(&al); @@ -105,7 +105,7 @@ static int add_hist_entries(struct evlist *evlist, struct machine *machine) if (machine__resolve(machine, &al, &sample) < 0) goto out; - he = hists__add_entry(hists, &al, NULL, + he = hists__add_entry(hists, &al, NULL, NULL, NULL, NULL, &sample, true); if (he == NULL) { addr_location__put(&al); diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c index ca5a8f4d007e..6d23efaa52c8 100644 --- a/tools/perf/util/hist.c +++ b/tools/perf/util/hist.c @@ -604,6 +604,7 @@ static struct hist_entry *hists__findnew_entry(struct hists *hists, *
[RFC 03/11] powerpc/perf: Arch specific definitions for pipeline
From: Madhavan Srinivasan

Create powerpc specific definitions for pipeline hazards and stalls. This
information is available in the SIER register on powerpc. The current
definitions are based on the IBM PowerPC SIER specification available in
the ISA[1] and the Performance Monitor Unit User’s Guide[2].

[1]: Book III, Section 9.4.10:
     https://openpowerfoundation.org/?resource_lib=power-isa-version-3-0
[2]: https://wiki.raptorcs.com/w/images/6/6b/POWER9_PMU_UG_v12_28NOV2018_pub.pdf#G9.1106986

Signed-off-by: Madhavan Srinivasan
Signed-off-by: Ravi Bangoria
---
 .../include/uapi/asm/perf_pipeline_haz.h | 80 +++
 1 file changed, 80 insertions(+)
 create mode 100644 arch/powerpc/include/uapi/asm/perf_pipeline_haz.h

diff --git a/arch/powerpc/include/uapi/asm/perf_pipeline_haz.h b/arch/powerpc/include/uapi/asm/perf_pipeline_haz.h
new file mode 100644
index ..de8857ec31dd
--- /dev/null
+++ b/arch/powerpc/include/uapi/asm/perf_pipeline_haz.h
@@ -0,0 +1,80 @@
+/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
+#ifndef _UAPI_ASM_POWERPC_PERF_PIPELINE_HAZ_H
+#define _UAPI_ASM_POWERPC_PERF_PIPELINE_HAZ_H
+
+enum perf_inst_type {
+	PERF_HAZ__ITYPE_LOAD = 1,
+	PERF_HAZ__ITYPE_STORE,
+	PERF_HAZ__ITYPE_BRANCH,
+	PERF_HAZ__ITYPE_FP,
+	PERF_HAZ__ITYPE_FX,
+	PERF_HAZ__ITYPE_CR_OR_SC,
+};
+
+enum perf_inst_cache {
+	PERF_HAZ__ICACHE_L1_HIT = 1,
+	PERF_HAZ__ICACHE_L2_HIT,
+	PERF_HAZ__ICACHE_L3_HIT,
+	PERF_HAZ__ICACHE_L3_MISS,
+};
+
+enum perf_pipeline_stage {
+	PERF_HAZ__PIPE_STAGE_IFU = 1,
+	PERF_HAZ__PIPE_STAGE_IDU,
+	PERF_HAZ__PIPE_STAGE_ISU,
+	PERF_HAZ__PIPE_STAGE_LSU,
+	PERF_HAZ__PIPE_STAGE_BRU,
+	PERF_HAZ__PIPE_STAGE_FXU,
+	PERF_HAZ__PIPE_STAGE_FPU,
+	PERF_HAZ__PIPE_STAGE_VSU,
+	PERF_HAZ__PIPE_STAGE_OTHER,
+};
+
+enum perf_haz_bru_reason {
+	PERF_HAZ__HAZ_BRU_MPRED_DIR = 1,
+	PERF_HAZ__HAZ_BRU_MPRED_TA,
+};
+
+enum perf_haz_isu_reason {
+	PERF_HAZ__HAZ_ISU_SRC = 1,
+	PERF_HAZ__HAZ_ISU_COL = 1,
+};
+
+enum perf_haz_lsu_reason {
+	PERF_HAZ__HAZ_LSU_ERAT_MISS = 1,
+	PERF_HAZ__HAZ_LSU_LMQ,
+	PERF_HAZ__HAZ_LSU_LHS,
+	PERF_HAZ__HAZ_LSU_MPRED,
+	PERF_HAZ__HAZ_DERAT_MISS,
+	PERF_HAZ__HAZ_LSU_LMQ_DERAT_MISS,
+	PERF_HAZ__HAZ_LSU_LHS_DERAT_MISS,
+	PERF_HAZ__HAZ_LSU_MPRED_DERAT_MISS,
+};
+
+enum perf_stall_lsu_reason {
+	PERF_HAZ__STALL_LSU_DCACHE_MISS = 1,
+	PERF_HAZ__STALL_LSU_LD_FIN,
+	PERF_HAZ__STALL_LSU_ST_FWD,
+	PERF_HAZ__STALL_LSU_ST,
+};
+
+enum perf_stall_fxu_reason {
+	PERF_HAZ__STALL_FXU_MC = 1,
+	PERF_HAZ__STALL_FXU_FC,
+};
+
+enum perf_stall_bru_reason {
+	PERF_HAZ__STALL_BRU_FIN_MPRED = 1,
+	PERF_HAZ__STALL_BRU_FC,
+};
+
+enum perf_stall_vsu_reason {
+	PERF_HAZ__STALL_VSU_MC = 1,
+	PERF_HAZ__STALL_VSU_FC,
+};
+
+enum perf_stall_other_reason {
+	PERF_HAZ__STALL_NTC,
+};
+
+#endif /* _UAPI_ASM_POWERPC_PERF_PIPELINE_HAZ_H */
--
2.21.1
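The reason enums above are per stage and reuse the same small integers, so a reason value only has meaning together with its stage. A sketch of the resulting two-level decode; the stage numbers follow the position of ISU and LSU in enum perf_pipeline_stage, the strings are borrowed from the sample output elsewhere in this series, and pairing each string with a specific value (as well as the "LMQ Full" label) is an assumption:

```c
#include <stdint.h>

/* Positions of ISU and LSU in enum perf_pipeline_stage (IFU = 1). */
enum { DEMO_STAGE_ISU = 3, DEMO_STAGE_LSU = 4 };

/* Decode a (stage, reason) pair; the stage selects which table applies. */
static const char *demo_hreason_str(uint8_t stage, uint8_t reason)
{
	switch (stage) {
	case DEMO_STAGE_ISU:
		return reason == 1 ? "Source Unavailable" : "-";
	case DEMO_STAGE_LSU:
		switch (reason) {
		case 1:
			return "ERAT Miss";
		case 2:
			return "LMQ Full"; /* assumed label */
		case 3:
			return "Load Hit Store";
		case 4:
			return "Mispredict";
		}
		return "-";
	}
	return "-";
}
```

With this shape, the raw pair "Hazard Stage 0x4, Hazard Reason 0x3" decodes to "Load Hit Store", while reason 0x3 under any other stage does not.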
[RFC 04/11] powerpc/perf: Arch support to expose Hazard data
From: Madhavan Srinivasan SIER register on PowerPC hw pmu provides cpu pipeline hazard information. Add logic to convert this arch specific data into perf_pipeline_haz_data structure. Signed-off-by: Madhavan Srinivasan Signed-off-by: Ravi Bangoria --- arch/powerpc/include/asm/perf_event_server.h | 2 + arch/powerpc/perf/core-book3s.c | 4 + arch/powerpc/perf/isa207-common.c| 157 +++ arch/powerpc/perf/isa207-common.h| 12 ++ arch/powerpc/perf/power8-pmu.c | 1 + arch/powerpc/perf/power9-pmu.c | 1 + 6 files changed, 177 insertions(+) diff --git a/arch/powerpc/include/asm/perf_event_server.h b/arch/powerpc/include/asm/perf_event_server.h index 3e9703f44c7c..9b8f90439ff2 100644 --- a/arch/powerpc/include/asm/perf_event_server.h +++ b/arch/powerpc/include/asm/perf_event_server.h @@ -37,6 +37,8 @@ struct power_pmu { void(*get_mem_data_src)(union perf_mem_data_src *dsrc, u32 flags, struct pt_regs *regs); void(*get_mem_weight)(u64 *weight); + void(*get_phazard_data)(struct perf_pipeline_haz_data *phaz, + u32 flags, struct pt_regs *regs); unsigned long group_constraint_mask; unsigned long group_constraint_val; u64 (*bhrb_filter_map)(u64 branch_sample_type); diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c index 3086055bf681..fcbb4acc3a03 100644 --- a/arch/powerpc/perf/core-book3s.c +++ b/arch/powerpc/perf/core-book3s.c @@ -2096,6 +2096,10 @@ static void record_and_restart(struct perf_event *event, unsigned long val, ppmu->get_mem_weight) ppmu->get_mem_weight(&data.weight); + if (event->attr.sample_type & PERF_SAMPLE_PIPELINE_HAZ && + ppmu->get_phazard_data) + ppmu->get_phazard_data(&data.pipeline_haz, ppmu->flags, regs); + if (perf_event_overflow(event, &data, regs)) power_pmu_stop(event, 0); } diff --git a/arch/powerpc/perf/isa207-common.c b/arch/powerpc/perf/isa207-common.c index 07026bbd292b..03dafde7cace 100644 --- a/arch/powerpc/perf/isa207-common.c +++ b/arch/powerpc/perf/isa207-common.c @@ -239,6 +239,163 @@ void isa207_get_mem_weight(u64 
*weight) *weight = mantissa << (2 * exp); } +static __u8 get_inst_type(u64 sier) +{ + switch (SIER_TYPE(sier)) { + case 1: + return PERF_HAZ__ITYPE_LOAD; + case 2: + return PERF_HAZ__ITYPE_STORE; + case 3: + return PERF_HAZ__ITYPE_BRANCH; + case 4: + return PERF_HAZ__ITYPE_FP; + case 5: + return PERF_HAZ__ITYPE_FX; + case 6: + return PERF_HAZ__ITYPE_CR_OR_SC; + } + return PERF_HAZ__ITYPE_NA; +} + +static __u8 get_inst_cache(u64 sier) +{ + switch (SIER_ICACHE(sier)) { + case 1: + return PERF_HAZ__ICACHE_L1_HIT; + case 2: + return PERF_HAZ__ICACHE_L2_HIT; + case 3: + return PERF_HAZ__ICACHE_L3_HIT; + case 4: + return PERF_HAZ__ICACHE_L3_MISS; + } + return PERF_HAZ__ICACHE_NA; +} + +static void get_hazard_data(u64 sier, struct perf_pipeline_haz_data *haz) +{ + if (SIER_MPRED(sier)) { + haz->hazard_stage = PERF_HAZ__PIPE_STAGE_BRU; + + switch (SIER_MPRED_TYPE(sier)) { + case 1: + haz->hazard_reason = PERF_HAZ__HAZ_BRU_MPRED_DIR; + return; + case 2: + haz->hazard_reason = PERF_HAZ__HAZ_BRU_MPRED_TA; + return; + } + } + + if (cpu_has_feature(CPU_FTR_ARCH_300) && + (SIER_TYPE(sier) == 1 || SIER_TYPE(sier) == 2)) { + haz->hazard_stage = PERF_HAZ__PIPE_STAGE_LSU; + haz->hazard_reason = PERF_HAZ__HAZ_DERAT_MISS; + return; + } + + if (cpu_has_feature(CPU_FTR_ARCH_207S) && + (SIER_TYPE(sier) == 1 || SIER_TYPE(sier) == 2)) { + int derat_miss = SIER_DERAT_MISS(sier); + + haz->hazard_stage = PERF_HAZ__PIPE_STAGE_LSU; + + switch (p8_SIER_REJ_LSU_REASON(sier)) { + case 0: + haz->hazard_reason = PERF_HAZ__HAZ_LSU_ERAT_MISS; + return; + case 1: + haz->hazard_reason = (derat_miss) ? +PERF_HAZ__HAZ_LSU_LMQ_DERAT_MISS : +PERF_HAZ__HAZ_LSU_LMQ; + return; + case 2: + haz->hazard_reason = (derat_miss) ? +
[RFC 02/11] perf/core: Data structure to present hazard data
From: Madhavan Srinivasan Introduce new perf sample_type PERF_SAMPLE_PIPELINE_HAZ to request kernel to provide cpu pipeline hazard data. Also, introduce arch independent structure 'perf_pipeline_haz_data' to pass hazard data to userspace. This is generic structure and arch specific data needs to be converted to this format. Signed-off-by: Madhavan Srinivasan Signed-off-by: Ravi Bangoria --- include/linux/perf_event.h| 7 ++ include/uapi/linux/perf_event.h | 32 ++- kernel/events/core.c | 6 + tools/include/uapi/linux/perf_event.h | 32 ++- 4 files changed, 75 insertions(+), 2 deletions(-) diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h index 547773f5894e..d5b606e3c57d 100644 --- a/include/linux/perf_event.h +++ b/include/linux/perf_event.h @@ -1001,6 +1001,7 @@ struct perf_sample_data { u64 stack_user_size; u64 phys_addr; + struct perf_pipeline_haz_data pipeline_haz; } cacheline_aligned; /* default value for data source */ @@ -1021,6 +1022,12 @@ static inline void perf_sample_data_init(struct perf_sample_data *data, data->weight = 0; data->data_src.val = PERF_MEM_NA; data->txn = 0; + data->pipeline_haz.itype = PERF_HAZ__ITYPE_NA; + data->pipeline_haz.icache = PERF_HAZ__ICACHE_NA; + data->pipeline_haz.hazard_stage = PERF_HAZ__PIPE_STAGE_NA; + data->pipeline_haz.hazard_reason = PERF_HAZ__HREASON_NA; + data->pipeline_haz.stall_stage = PERF_HAZ__PIPE_STAGE_NA; + data->pipeline_haz.stall_reason = PERF_HAZ__SREASON_NA; } extern void perf_output_sample(struct perf_output_handle *handle, diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h index 377d794d3105..ff252618ca93 100644 --- a/include/uapi/linux/perf_event.h +++ b/include/uapi/linux/perf_event.h @@ -142,8 +142,9 @@ enum perf_event_sample_format { PERF_SAMPLE_REGS_INTR = 1U << 18, PERF_SAMPLE_PHYS_ADDR = 1U << 19, PERF_SAMPLE_AUX = 1U << 20, + PERF_SAMPLE_PIPELINE_HAZ= 1U << 21, - PERF_SAMPLE_MAX = 1U << 21, /* non-ABI */ + PERF_SAMPLE_MAX = 1U << 22, /* non-ABI */ 
__PERF_SAMPLE_CALLCHAIN_EARLY = 1ULL << 63, /* non-ABI; internal use */ }; @@ -870,6 +871,13 @@ enum perf_event_type { * { u64 phys_addr;} && PERF_SAMPLE_PHYS_ADDR * { u64 size; *char data[size]; } && PERF_SAMPLE_AUX +* { u8itype; +*u8icache; +*u8hazard_stage; +*u8hazard_reason; +*u8stall_stage; +*u8stall_reason; +*u16 pad;} && PERF_SAMPLE_PIPELINE_HAZ * }; */ PERF_RECORD_SAMPLE = 9, @@ -1185,4 +1193,26 @@ struct perf_branch_entry { reserved:40; }; +struct perf_pipeline_haz_data { + /* Instruction/Opcode type: Load, Store, Branch */ + __u8itype; + /* Instruction Cache source */ + __u8icache; + /* Instruction suffered hazard in pipeline stage */ + __u8hazard_stage; + /* Hazard reason */ + __u8hazard_reason; + /* Instruction suffered stall in pipeline stage */ + __u8stall_stage; + /* Stall reason */ + __u8stall_reason; + __u16 pad; +}; + +#define PERF_HAZ__ITYPE_NA 0x0 +#define PERF_HAZ__ICACHE_NA0x0 +#define PERF_HAZ__PIPE_STAGE_NA0x0 +#define PERF_HAZ__HREASON_NA 0x0 +#define PERF_HAZ__SREASON_NA 0x0 + #endif /* _UAPI_LINUX_PERF_EVENT_H */ diff --git a/kernel/events/core.c b/kernel/events/core.c index e453589da97c..d00037c77ccf 100644 --- a/kernel/events/core.c +++ b/kernel/events/core.c @@ -1754,6 +1754,9 @@ static void __perf_event_header_size(struct perf_event *event, u64 sample_type) if (sample_type & PERF_SAMPLE_PHYS_ADDR) size += sizeof(data->phys_addr); + if (sample_type & PERF_SAMPLE_PIPELINE_HAZ) + size += sizeof(data->pipeline_haz); + event->header_size = size; } @@ -6712,6 +6715,9 @@ void perf_output_sample(struct perf_output_handle *handle, perf_aux_sample_output(event, handle, data); } + if (sample_type & PERF_SAMPLE_PIPELINE_HAZ) + perf_output_put(handle, data->pipeline_haz); + if (!event->attr.watermark) { int wakeup_events = event->attr.wakeup_events; diff --git a/tools/include/uapi/linux/perf_event.h b/tools/include/uapi/linux/perf_event.h index 377d794d
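As a rough userspace sketch of the record layout proposed above (not part of the patch): the six __u8 fields plus the __u16 pad should add up to 8 bytes per sample on common ABIs. Here uint8_t/uint16_t stand in for __u8/__u16, and haz_data_init() is a hypothetical helper that only mirrors the lines added to perf_sample_data_init().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Userspace mirror of the proposed uapi struct (uint8_t/uint16_t replace __u8/__u16). */
struct perf_pipeline_haz_data {
	uint8_t  itype;          /* Instruction/Opcode type: Load, Store, Branch */
	uint8_t  icache;         /* Instruction Cache source */
	uint8_t  hazard_stage;   /* Instruction suffered hazard in pipeline stage */
	uint8_t  hazard_reason;  /* Hazard reason */
	uint8_t  stall_stage;    /* Instruction suffered stall in pipeline stage */
	uint8_t  stall_reason;   /* Stall reason */
	uint16_t pad;
};

/* "Not available" values as defined in the patch. */
#define PERF_HAZ__ITYPE_NA	0x0
#define PERF_HAZ__ICACHE_NA	0x0
#define PERF_HAZ__PIPE_STAGE_NA	0x0
#define PERF_HAZ__HREASON_NA	0x0
#define PERF_HAZ__SREASON_NA	0x0

/* Six single-byte fields and a two-byte pad: no hidden padding expected. */
_Static_assert(sizeof(struct perf_pipeline_haz_data) == 8,
	       "hazard record is expected to be 8 bytes");

/* Hypothetical helper: same defaults the patch sets in perf_sample_data_init(). */
static void haz_data_init(struct perf_pipeline_haz_data *haz)
{
	memset(haz, 0, sizeof(*haz));
	haz->itype = PERF_HAZ__ITYPE_NA;
	haz->icache = PERF_HAZ__ICACHE_NA;
	haz->hazard_stage = PERF_HAZ__PIPE_STAGE_NA;
	haz->hazard_reason = PERF_HAZ__HREASON_NA;
	haz->stall_stage = PERF_HAZ__PIPE_STAGE_NA;
	haz->stall_reason = PERF_HAZ__SREASON_NA;
}
```

If the layout holds as assumed, sizeof(data->pipeline_haz) in __perf_event_header_size() adds exactly one u64-sized slot to each sample.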
[RFC 01/11] powerpc/perf: Simplify ISA207_SIER macros
Instead of having separate macros for MASK and SHIFT, and using them to
derive the bits, let's have a simple macro to do the job. Also, remove
the ISA207_ prefix because some of the SIER bits which are extracted
with these macros are not defined in the ISA, for example the DATA_SRC
bits.

Signed-off-by: Ravi Bangoria
---
 arch/powerpc/perf/isa207-common.c |  8
 arch/powerpc/perf/isa207-common.h | 11 +++
 2 files changed, 7 insertions(+), 12 deletions(-)

diff --git a/arch/powerpc/perf/isa207-common.c b/arch/powerpc/perf/isa207-common.c
index 4c86da5eb28a..07026bbd292b 100644
--- a/arch/powerpc/perf/isa207-common.c
+++ b/arch/powerpc/perf/isa207-common.c
@@ -215,10 +215,10 @@ void isa207_get_mem_data_src(union perf_mem_data_src *dsrc, u32 flags,
 	}
 
 	sier = mfspr(SPRN_SIER);
-	val = (sier & ISA207_SIER_TYPE_MASK) >> ISA207_SIER_TYPE_SHIFT;
+	val = SIER_TYPE(sier);
 	if (val == 1 || val == 2) {
-		idx = (sier & ISA207_SIER_LDST_MASK) >> ISA207_SIER_LDST_SHIFT;
-		sub_idx = (sier & ISA207_SIER_DATA_SRC_MASK) >> ISA207_SIER_DATA_SRC_SHIFT;
+		idx = SIER_LDST(sier);
+		sub_idx = SIER_DATA_SRC(sier);
 
 		dsrc->val = isa207_find_source(idx, sub_idx);
 		dsrc->val |= (val == 1) ? P(OP, LOAD) : P(OP, STORE);
@@ -231,7 +231,7 @@ void isa207_get_mem_weight(u64 *weight)
 	u64 exp = MMCRA_THR_CTR_EXP(mmcra);
 	u64 mantissa = MMCRA_THR_CTR_MANT(mmcra);
 	u64 sier = mfspr(SPRN_SIER);
-	u64 val = (sier & ISA207_SIER_TYPE_MASK) >> ISA207_SIER_TYPE_SHIFT;
+	u64 val = SIER_TYPE(sier);
 
 	if (val == 0 || val == 7)
 		*weight = 0;
diff --git a/arch/powerpc/perf/isa207-common.h b/arch/powerpc/perf/isa207-common.h
index 63fd4f3f6013..7027eb9f3e40 100644
--- a/arch/powerpc/perf/isa207-common.h
+++ b/arch/powerpc/perf/isa207-common.h
@@ -202,14 +202,9 @@
 #define MAX_ALT				2
 #define MAX_PMU_COUNTERS		6
 
-#define ISA207_SIER_TYPE_SHIFT		15
-#define ISA207_SIER_TYPE_MASK		(0x7ull << ISA207_SIER_TYPE_SHIFT)
-
-#define ISA207_SIER_LDST_SHIFT		1
-#define ISA207_SIER_LDST_MASK		(0x7ull << ISA207_SIER_LDST_SHIFT)
-
-#define ISA207_SIER_DATA_SRC_SHIFT	53
-#define ISA207_SIER_DATA_SRC_MASK	(0x7ull << ISA207_SIER_DATA_SRC_SHIFT)
+#define SIER_DATA_SRC(sier)		(((sier) >> (63 - 10)) & 0x7ull)
+#define SIER_TYPE(sier)			(((sier) >> (63 - 48)) & 0x7ull)
+#define SIER_LDST(sier)			(((sier) >> (63 - 62)) & 0x7ull)
 
 #define P(a, b)				PERF_MEM_S(a, b)
 #define PH(a, b)			(P(LVL, HIT) | P(a, b))
-- 
2.21.1
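To make the equivalence concrete, here is a small standalone sketch (not part of the patch) that puts the removed MASK/SHIFT pair for the TYPE field next to the new extractor macros. The 63 - n form counts bits from the most significant end, so 63 - 48 is the same shift of 15 that the old define used. sier_type_old() and make_sier() are hypothetical helpers written only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Old style, as removed by the patch: one SHIFT and one MASK macro per field. */
#define ISA207_SIER_TYPE_SHIFT	15
#define ISA207_SIER_TYPE_MASK	(0x7ull << ISA207_SIER_TYPE_SHIFT)

/* New style, as added by the patch: one extractor macro per field. */
#define SIER_DATA_SRC(sier)	(((sier) >> (63 - 10)) & 0x7ull)
#define SIER_TYPE(sier)		(((sier) >> (63 - 48)) & 0x7ull)
#define SIER_LDST(sier)		(((sier) >> (63 - 62)) & 0x7ull)

/* Hypothetical helper: extract TYPE the way the removed code did. */
static uint64_t sier_type_old(uint64_t sier)
{
	return (sier & ISA207_SIER_TYPE_MASK) >> ISA207_SIER_TYPE_SHIFT;
}

/* Hypothetical helper: build a fake SIER value from three 3-bit fields. */
static uint64_t make_sier(uint64_t data_src, uint64_t type, uint64_t ldst)
{
	return ((data_src & 0x7ull) << 53) |
	       ((type & 0x7ull) << 15) |
	       ((ldst & 0x7ull) << 1);
}
```

For any value built with make_sier(), SIER_TYPE() and sier_type_old() should agree, which is all the patch relies on.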
[RFC 00/11] perf: Enhancing perf to export processor hazard information
Most modern microprocessors employ complex instruction execution
pipelines such that many instructions can be 'in flight' at any given
point in time. Various factors affect this pipeline, and hazards are
primary among them. Different types of hazards exist: data hazards,
structural hazards and control hazards. A data hazard occurs when data
dependencies exist between instructions in different stages of the
pipeline. A structural hazard occurs when the same processor hardware is
needed by more than one instruction in flight at the same time. Control
hazards are mostly of the branch misprediction kind.

Information about these hazards is critical for analyzing performance
issues and for tuning software to overcome such issues. Modern
processors export such hazard data in Performance Monitoring Unit (PMU)
registers. For example, the 'Sampled Instruction Event Register' on IBM
PowerPC[1][2] and 'Instruction-Based Sampling' on AMD[3] provide similar
information.

Implementation detail:

A new sample_type called PERF_SAMPLE_PIPELINE_HAZ is introduced. If it's
set, the kernel converts arch specific hazard information into a generic
format:

  struct perf_pipeline_haz_data {
         /* Instruction/Opcode type: Load, Store, Branch */
         __u8    itype;
         /* Instruction Cache source */
         __u8    icache;
         /* Instruction suffered hazard in pipeline stage */
         __u8    hazard_stage;
         /* Hazard reason */
         __u8    hazard_reason;
         /* Instruction suffered stall in pipeline stage */
         __u8    stall_stage;
         /* Stall reason */
         __u8    stall_reason;
         __u16   pad;
  };

... which can be read by the user from the mmap() ring buffer. With this
approach, a sample perf report in hazard mode looks like this (on IBM
PowerPC):

  # ./perf record --hazard ./ebizzy
  # ./perf report --hazard
  Overhead  Symbol          Shared  Instruction Type  Hazard Stage  Hazard Reason         Stall Stage  Stall Reason  ICache access
    36.58%  [.] thread_run  ebizzy  Load              LSU           Mispredict            LSU          Load fin      L1 hit
     9.46%  [.] thread_run  ebizzy  Load              LSU           Mispredict            LSU          Dcache_miss   L1 hit
     1.76%  [.] thread_run  ebizzy  Fixed point       -             -                     -            -             L1 hit
     1.31%  [.] thread_run  ebizzy  Load              LSU           ERAT Miss             LSU          Load fin      L1 hit
     1.27%  [.] thread_run  ebizzy  Load              LSU           Mispredict            -            -             L1 hit
     1.16%  [.] thread_run  ebizzy  Fixed point       -             -                     FXU          Fixed cycle   L1 hit
     0.50%  [.] thread_run  ebizzy  Fixed point       ISU           Source Unavailable    FXU          Fixed cycle   L1 hit
     0.30%  [.] thread_run  ebizzy  Load              LSU           LMQ Full, DERAT Miss  LSU          Load fin      L1 hit
     0.24%  [.] thread_run  ebizzy  Load              LSU           ERAT Miss             -            -             L1 hit
     0.08%  [.] thread_run  ebizzy  -                 -             -                     BRU          Fixed cycle   L1 hit
     0.05%  [.] thread_run  ebizzy  Branch            -             -                     BRU          Fixed cycle   L1 hit
     0.04%  [.] thread_run  ebizzy  Fixed point       ISU           Source Unavailable    -            -             L1 hit

Also perf annotate with hazard data:

         │    Disassembly of section .text:
         │
         │    10001cf8 :
         │    compare():
         │    return NULL;
         │    }
         │
         │    static int
         │    compare(const void *p1, const void *p2)
         │    {
   33.23 │      std    r31,-8(r1)
         │       {haz_stage: LSU, haz_reason: ERAT Miss, stall_stage: LSU, stall_reason: Store, icache: L1 hit}
         │       {haz_stage: LSU, haz_reason: ERAT Miss, stall_stage: LSU, stall_reason: Store, icache: L1 hit}
         │       {haz_stage: LSU, haz_reason: Load Hit Store, stall_stage: LSU, stall_reason: -, icache: L3 hit}
         │       {haz_stage: LSU, haz_reason: ERAT Miss, stall_stage: -, stall_reason: -, icache: L1 hit}
         │       {haz_stage: LSU, haz_reason: ERAT Miss, stall_stage: LSU, stall_reason: Store, icache: L1 hit}
         │       {haz_stage: LSU, haz_reason: ERAT Miss, stall_stage: LSU, stall_reason: Store, icache: L1 hit}
    0.84 │      stdu   r1,-64(r1)
         │       {haz_stage: LSU, haz_reason: ERAT Miss, stall_stage: -, stall_reason: -, icache: L1 hit}
    0.24 │      mr     r31,r1
         │       {haz_stage: -, haz_reason: -, stall_stage: -, stall_reason: -, icache: L1 hit}
   21.18 │      std    r3,32(r31)
         │       {haz_stage: LSU, haz_reason: ERAT Miss, stall_stage: LSU, stall_reason: Store, icache: L1 hit}
         │       {haz_stage: LSU, haz_reason: ERAT Miss, s
[RFC PATCH] powerpc/64s: CONFIG_PPC_HASH_MMU
This allows the 64s hash MMU code to be compiled out if radix is selected. This saves about 128kB kernel image size (90kB text) on powernv_defconfig minus KVM, 40kB on a tiny config. Signed-off-by: Nicholas Piggin --- arch/powerpc/Kconfig | 1 + arch/powerpc/include/asm/book3s/64/mmu.h | 20 - arch/powerpc/include/asm/book3s/64/pgtable.h | 6 + .../include/asm/book3s/64/tlbflush-hash.h | 10 +++-- arch/powerpc/include/asm/book3s/64/tlbflush.h | 4 arch/powerpc/include/asm/book3s/pgtable.h | 4 arch/powerpc/include/asm/mmu.h| 14 ++-- arch/powerpc/include/asm/mmu_context.h| 2 ++ arch/powerpc/include/asm/paca.h | 9 arch/powerpc/include/asm/sparsemem.h | 2 +- arch/powerpc/kernel/asm-offsets.c | 4 arch/powerpc/kernel/dt_cpu_ftrs.c | 10 - arch/powerpc/kernel/entry_64.S| 4 ++-- arch/powerpc/kernel/exceptions-64s.S | 22 --- arch/powerpc/kernel/mce.c | 2 +- arch/powerpc/kernel/mce_power.c | 10 ++--- arch/powerpc/kernel/paca.c| 18 ++- arch/powerpc/kernel/process.c | 13 ++- arch/powerpc/kernel/prom.c| 2 ++ arch/powerpc/kernel/setup_64.c| 4 arch/powerpc/kexec/core_64.c | 4 ++-- arch/powerpc/kvm/Kconfig | 1 + arch/powerpc/mm/book3s64/Makefile | 17 -- arch/powerpc/mm/book3s64/hash_pgtable.c | 1 - arch/powerpc/mm/book3s64/hash_utils.c | 9 .../{hash_hugetlbpage.c => hugetlbpage.c} | 2 ++ arch/powerpc/mm/book3s64/mmu_context.c| 18 +-- arch/powerpc/mm/book3s64/pgtable.c| 22 +-- arch/powerpc/mm/book3s64/radix_pgtable.c | 5 + arch/powerpc/mm/book3s64/slb.c| 14 arch/powerpc/mm/copro_fault.c | 2 ++ arch/powerpc/mm/fault.c | 20 + arch/powerpc/platforms/Kconfig.cputype| 20 - arch/powerpc/platforms/powernv/idle.c | 2 ++ arch/powerpc/platforms/powernv/setup.c| 2 ++ arch/powerpc/xmon/xmon.c | 8 +-- 36 files changed, 231 insertions(+), 77 deletions(-) rename arch/powerpc/mm/book3s64/{hash_hugetlbpage.c => hugetlbpage.c} (99%) diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig index 497b7d0b2d7e..50c361b8b7fd 100644 --- a/arch/powerpc/Kconfig +++ b/arch/powerpc/Kconfig @@ -963,6 +963,7 @@ 
config PPC_MEM_KEYS prompt "PowerPC Memory Protection Keys" def_bool y depends on PPC_BOOK3S_64 + depends on PPC_HASH_MMU select ARCH_USES_HIGH_VMA_FLAGS select ARCH_HAS_PKEYS help diff --git a/arch/powerpc/include/asm/book3s/64/mmu.h b/arch/powerpc/include/asm/book3s/64/mmu.h index bb3deb76c951..fca69dd23e25 100644 --- a/arch/powerpc/include/asm/book3s/64/mmu.h +++ b/arch/powerpc/include/asm/book3s/64/mmu.h @@ -107,7 +107,9 @@ typedef struct { * from EA and new context ids to build the new VAs. */ mm_context_id_t id; +#ifdef CONFIG_PPC_HASH_MMU mm_context_id_t extended_id[TASK_SIZE_USER64/TASK_CONTEXT_SIZE]; +#endif }; /* Number of bits in the mm_cpumask */ @@ -116,7 +118,9 @@ typedef struct { /* Number of users of the external (Nest) MMU */ atomic_t copros; +#ifdef CONFIG_PPC_HASH_MMU struct hash_mm_context *hash_context; +#endif unsigned long vdso_base; /* @@ -139,6 +143,7 @@ typedef struct { #endif } mm_context_t; +#ifdef CONFIG_PPC_HASH_MMU static inline u16 mm_ctx_user_psize(mm_context_t *ctx) { return ctx->hash_context->user_psize; @@ -193,11 +198,22 @@ static inline struct subpage_prot_table *mm_ctx_subpage_prot(mm_context_t *ctx) } #endif +#endif + /* * The current system page and segment sizes */ -extern int mmu_linear_psize; +#if defined(CONFIG_PPC_RADIX_MMU) && !defined(CONFIG_PPC_HASH_MMU) +#ifdef CONFIG_PPC_64K_PAGES +#define mmu_virtual_psize MMU_PAGE_64K +#else +#define mmu_virtual_psize MMU_PAGE_4K +#endif +#else extern int mmu_virtual_psize; +#endif + +extern int mmu_linear_psize; extern int mmu_vmalloc_psize; extern int mmu_vmemmap_psize; extern int mmu_io_psize; @@ -243,6 +259,7 @@ extern void radix_init_pseries(void); static inline void radix_init_pseries(void) { }; #endif +#ifdef CONFIG_PPC_HASH_MMU static inline int get_user_context(mm_context_t *ctx, unsigned long ea) { int index = ea >> MAX_EA_BITS_PER_CONTEXT; @@ -262,6 +279,7 @@ static inline unsigned long get_user_vsid(mm_context_t *ctx, return get_vsid(context, ea, ssize); } +#endif 
#endif /* __ASSEMBLY__ */ #endif /* _ASM_POWERPC_BOOK3S_64_MM
Re: [PATCH] powerpc/Kconfig: Make FSL_85XX_CACHE_SRAM configurable
From: Scott Wood
Date: 2020-03-01 07:12:58
To: "王文虎"
Cc: wangwenhu, Kumar Gala, Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman, linuxppc-dev@lists.ozlabs.org, linux-ker...@vger.kernel.org, triv...@kernel.org, Rai Harninder
Subject: Re: Re: [PATCH] powerpc/Kconfig: Make FSL_85XX_CACHE_SRAM configurable

>On Tue, 2020-01-21 at 14:38 +0800, 王文虎 wrote:
>> From: Scott Wood
>> Date: 2020-01-21 13:49:59
>> To: "王文虎"
>> Cc: wangwenhu, Kumar Gala, Benjamin Herrenschmidt, Paul Mackerras <pau...@samba.org>,
>> Michael Ellerman, linuxppc-dev@lists.ozlabs.org, linux-ker...@vger.kernel.org,
>> triv...@kernel.org, Rai Harninder
>> Subject: Re: [PATCH] powerpc/Kconfig: Make FSL_85XX_CACHE_SRAM configurable
>>
>> >On Tue, 2020-01-21 at 13:20 +0800, 王文虎 wrote:
>> > > From: Scott Wood
>> > > Date: 2020-01-21 11:25:25
>> > > To: wangwenhu, Kumar Gala <ga...@kernel.crashing.org>,
>> > > Benjamin Herrenschmidt, Paul Mackerras <pau...@samba.org>, Michael Ellerman,
>> > > linuxppc-dev@lists.ozlabs.org, linux-ker...@vger.kernel.org
>> > > Cc: triv...@kernel.org, wenhu.w...@vivo.com, Rai Harninder <harninder@nxp.com>
>> > > Subject: Re: [PATCH] powerpc/Kconfig: Make FSL_85XX_CACHE_SRAM configurable
>> > >
>> > > >On Mon, 2020-01-20 at 06:43 -0800, wangwenhu wrote:
>> > > > > From: wangwenhu
>> > > > >
>> > > > > When generating .config file with menuconfig on Freescale BOOKE
>> > > > > SOC, FSL_85XX_CACHE_SRAM is not configurable for the lack of
>> > > > > description in the Kconfig field, which makes it impossible
>> > > > > to support L2Cache-Sram driver. Add a description to make it
>> > > > > configurable.
>> > > > >
>> > > > > Signed-off-by: wangwenhu
>> > > >
>> > > > The intent was that drivers using the SRAM API would select the symbol.
>> > > > What is the use case for selecting it manually?
>> > > >
>> > >
>> > > With a repository of multiple products (meaning different defconfigs) and
>> > > multiple developers, the Kconfigs of the Kernel Source Tree change
>> > > frequently. So the "make menuconfig" process is needed for re-generating
>> > > or updating the defconfigs, because of the complexity of the dependencies
>> > > between different features defined in the Kconfigs.
>> >
>> > That doesn't answer my question of how the SRAM code would be useful other
>> > than to some other driver that uses the API (which would use "select").
>> > There is no userspace API. You could use the kernel command line to
>> > configure the SRAM but you need to get the address of it for it to be useful.
>> >
>>
>> Like you've asked below, via /dev/mem or direct calling within the Kernel.
>> And they are not submitted yet, under development.
>
>If they are calling within the kernel, then whatever driver that is should
>select FSL_85XX_CACHE_SRAM. Directly accessing /dev/mem without any way for
>the kernel to advertise where it is or which parts of SRAM are available for
>use sounds like a bad idea.
>

Yes, definitely. But when we enable the module that should select
FSL_85XX_CACHE_SRAM in order to build vmlinux, FSL_85XX_CACHE_SRAM cannot
be selected because of the Kconfig definition problem which I am trying to
fix now.

So would you please merge the patch, for the convenience of later work
that depends on the driver?

Wenhu

>-Scott
>
>
Re: [PATCH v3 0/6] implement KASLR for powerpc/fsl_booke/64
On Mon, 2020-03-02 at 10:17 +0800, Jason Yan wrote:
> 
> On 2020/3/1 6:54, Scott Wood wrote:
> > On Sat, 2020-02-29 at 15:27 +0800, Jason Yan wrote:
> > > 
> > > Turning to %p may not be a good idea in this situation. So
> > > for the REG logs printed when dumping stack, we can disable it when
> > > KASLR is enabled. For the REG logs in other places like show_regs(), only
> > > privileged users can trigger it, and they are not combined with a symbol,
> > > so I think it's ok to keep them.
> > > 
> > > diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
> > > index fad50db9dcf2..659c51f0739a 100644
> > > --- a/arch/powerpc/kernel/process.c
> > > +++ b/arch/powerpc/kernel/process.c
> > > @@ -2068,7 +2068,10 @@ void show_stack(struct task_struct *tsk, unsigned long *stack)
> > >  		newsp = stack[0];
> > >  		ip = stack[STACK_FRAME_LR_SAVE];
> > >  		if (!firstframe || ip != lr) {
> > > -			printk("["REG"] ["REG"] %pS", sp, ip, (void *)ip);
> > > +			if (IS_ENABLED(CONFIG_RANDOMIZE_BASE))
> > > +				printk("%pS", (void *)ip);
> > > +			else
> > > +				printk("["REG"] ["REG"] %pS", sp, ip, (void *)ip);
> > 
> > This doesn't deal with "nokaslr" on the kernel command line. It also doesn't
> > seem like something that every callsite should have to opencode, versus
> > having an appropriate format specifier that behaves as I described above
> > (and I still don't see why that format specifier should not be "%p").
> > 
> 
> Actually I still do not understand why we should print the raw value 
> here. When KALLSYMS is enabled we have symbol name and offset like 
> put_cred_rcu+0x108/0x110, and when KALLSYMS is disabled we have the raw 
> address.

I'm more concerned about the stack address for wading through a raw stack dump
(to find function call arguments, etc). The return address does help confirm
that I'm on the right stack frame though, and also makes looking up a line
number slightly easier than having to look up a symbol address and then add
the offset (at least for non-module addresses).

As a random aside, the mismatch between Linux printing a hex offset and GDB
using decimal in disassembly is annoying...

-Scott
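To illustrate the hex-versus-decimal aside, here is a minimal userspace sketch (not kernel code) that takes a %pS-style string such as the put_cred_rcu+0x108/0x110 example from the thread and prints the offset in decimal. Both helper names are hypothetical, and the 64-byte symbol buffer is an assumption of this sketch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: split "symbol+0xOFF/0xSIZE" into its three parts.
 * sym must point to a buffer of at least 64 bytes.
 * Returns 0 on success and -1 if the string does not match.
 */
static int parse_sym_offset(const char *s, char *sym,
			    unsigned long *off, unsigned long *size)
{
	/* %lx accepts an optional "0x" prefix, so the kernel's hex form parses directly. */
	if (sscanf(s, "%63[^+]+%lx/%lx", sym, off, size) != 3)
		return -1;
	return 0;
}

/* Hypothetical helper: print the same location with a decimal offset. */
static void print_decimal_offset(const char *sym, unsigned long off)
{
	printf("<%s+%lu>\n", sym, off);
}
```

With the example from the thread, offset 0x108 becomes 264 and the function size 0x110 becomes 272, which is the conversion one otherwise has to do by hand when matching a kernel trace against a decimal-offset disassembly.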
Re: [PATCH] mm/debug: Add tests validating arch page table helpers for core features
On 02/27/2020 04:59 PM, Christophe Leroy wrote:
> 
> 
> On 27/02/2020 at 11:33, Anshuman Khandual wrote:
>> This adds new tests validating arch page table helpers for the following
>> core memory features. These tests create and test specific mapping types at
>> various page table levels.
>>
>> * SPECIAL mapping
>> * PROTNONE mapping
>> * DEVMAP mapping
>> * SOFTDIRTY mapping
>> * SWAP mapping
>> * MIGRATION mapping
>> * HUGETLB mapping
>> * THP mapping
>>
>> Cc: Andrew Morton
>> Cc: Mike Rapoport
>> Cc: Vineet Gupta
>> Cc: Catalin Marinas
>> Cc: Will Deacon
>> Cc: Benjamin Herrenschmidt
>> Cc: Paul Mackerras
>> Cc: Michael Ellerman
>> Cc: Heiko Carstens
>> Cc: Vasily Gorbik
>> Cc: Christian Borntraeger
>> Cc: Thomas Gleixner
>> Cc: Ingo Molnar
>> Cc: Borislav Petkov
>> Cc: "H. Peter Anvin"
>> Cc: Kirill A. Shutemov
>> Cc: Paul Walmsley
>> Cc: Palmer Dabbelt
>> Cc: linux-snps-...@lists.infradead.org
>> Cc: linux-arm-ker...@lists.infradead.org
>> Cc: linuxppc-dev@lists.ozlabs.org
>> Cc: linux-s...@vger.kernel.org
>> Cc: linux-ri...@lists.infradead.org
>> Cc: x...@kernel.org
>> Cc: linux-a...@vger.kernel.org
>> Cc: linux-ker...@vger.kernel.org
>> Suggested-by: Catalin Marinas
>> Signed-off-by: Anshuman Khandual
>> ---
>> Tested on arm64 and x86 platforms without any test failures. But this has
>> only been build tested on several other platforms. Individual tests need
>> to be verified on all current enabling platforms for the test i.e s390,
>> ppc32, arc etc.
>>
>> This patch must be applied on v5.6-rc3 after these patches
>>
>> 1. https://patchwork.kernel.org/patch/11385057/
>> 2. https://patchwork.kernel.org/patch/11407715/
>>
>> OR
>>
>> This patch must be applied on linux-next (next-20200227) after this patch
>>
>> 2. https://patchwork.kernel.org/patch/11407715/
>>
>>   mm/debug_vm_pgtable.c | 310 +-
>>   1 file changed, 309 insertions(+), 1 deletion(-)
>>
>> diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c
>> index 96dd7d574cef..3fb90d5b604e 100644
>> --- a/mm/debug_vm_pgtable.c
>> +++ b/mm/debug_vm_pgtable.c
>> @@ -41,6 +41,44 @@
>>    * wrprotect(entry) = A write protected and not a write entry
>>    * pxx_bad(entry) = A mapped and non-table entry
>>    * pxx_same(entry1, entry2) = Both entries hold the exact same value
>> + *
>> + * Specific feature operations
>> + *
>> + * pte_mkspecial(entry) = Creates a special entry at PTE level
>> + * pte_special(entry) = Tests a special entry at PTE level
>> + *
>> + * pte_protnone(entry) = Tests a no access entry at PTE level
>> + * pmd_protnone(entry) = Tests a no access entry at PMD level
>> + *
>> + * pte_mkdevmap(entry) = Creates a device entry at PTE level
>> + * pmd_mkdevmap(entry) = Creates a device entry at PMD level
>> + * pud_mkdevmap(entry) = Creates a device entry at PUD level
>> + * pte_devmap(entry) = Tests a device entry at PTE level
>> + * pmd_devmap(entry) = Tests a device entry at PMD level
>> + * pud_devmap(entry) = Tests a device entry at PUD level
>> + *
>> + * pte_mksoft_dirty(entry) = Creates a soft dirty entry at PTE level
>> + * pmd_mksoft_dirty(entry) = Creates a soft dirty entry at PMD level
>> + * pte_swp_mksoft_dirty(entry) = Creates a soft dirty swap entry at PTE level
>> + * pmd_swp_mksoft_dirty(entry) = Creates a soft dirty swap entry at PMD level
>> + * pte_soft_dirty(entry) = Tests a soft dirty entry at PTE level
>> + * pmd_soft_dirty(entry) = Tests a soft dirty entry at PMD level
>> + * pte_swp_soft_dirty(entry) = Tests a soft dirty swap entry at PTE level
>> + * pmd_swp_soft_dirty(entry) = Tests a soft dirty swap entry at PMD level
>> + * pte_clear_soft_dirty(entry) = Clears a soft dirty entry at PTE level
>> + * pmd_clear_soft_dirty(entry) = Clears a soft dirty entry at PMD level
>> + * pte_swp_clear_soft_dirty(entry) = Clears a soft dirty swap entry at PTE level
>> + * pmd_swp_clear_soft_dirty(entry) = Clears a soft dirty swap entry at PMD level
>> + *
>> + * pte_mkhuge(entry) = Creates a HugeTLB entry at given level
>> + * pte_huge(entry) = Tests a HugeTLB entry at given level
>> + *
>> + * pmd_trans_huge(entry) = Tests a trans huge page at PMD level
>> + * pud_trans_huge(entry) = Tests a trans huge page at PUD level
>> + * pmd_present(entry) = Tests an entry points to memory at PMD level
>> + * pud_present(entry) = Tests an entry points to memory at PUD level
>> + * pmd_mknotpresent(entry) = Invalidates an PMD entry for MMU
>> + * pud_mknotpresent(entry) = Invalidates an PUD entry for MMU
>>    */
>>   #define VMFLAGS (VM_READ|VM_WRITE|VM_EXEC)
>> @@ -287,6 +325,233 @@ static void __init pmd_populate_tests(struct mm_struct *mm, pmd_t *pmdp,
>>       WARN_ON(pmd_bad(pmd));
>>   }
>> +#ifdef CONFIG_ARCH_HAS_PTE_SPECIAL
> 
> Can we avoid ifdefs unless nece
Re: [GIT PULL] Second batch of KVM changes for Linux 5.6-rc4 (or rc5)
On 01/03/20 22:33, Linus Torvalds wrote:
> On Sun, Mar 1, 2020 at 1:03 PM Paolo Bonzini wrote:
>>
>> Paolo Bonzini (4):
>>       KVM: allow disabling -Werror
> 
> Honestly, this is just badly done.
> 
> You've basically made it enable -Werror only for very random
> configurations - and apparently the one you test.
> Doing things like COMPILE_TEST disables it, but so does not having
> EXPERT enabled.

Yes, I took this from the i915 Kconfig. It's temporary, in 5.7 I am
planning to get it to just !KASAN, but for 5.6 I wanted to avoid more
breakage so I added the other restrictions.

The difference between x86-64 and i386 is really just the frame size
warnings, which Christoph triggered because of a higher CONFIG_NR_CPUS.
(BTW, perhaps it makes sense for Sparse to have something like __nostack
for structs that contain potentially large arrays).

> I've merged this, but I wonder why you couldn't just do what I
> suggested originally? Seriously, if you script your build tests,
> and don't even look at the results, then you might as well use
> 
>    make KCFLAGS=-Werror

I did that and I'm also adding W=1; and I threw in a smaller than
default frame size warning option too because I don't want cpumasks on
the stack anyway.

However, that wouldn't help contributors. I'm okay if I get W=1 or
frame size warnings from patches from other contributors, but I think
it's a disservice to them that they have to set KCFLAGS in order to
avoid warnings.

> the "now it causes problems for
> random compiler versions" is a real issue again - but at least it
> wouldn't be a random kernel subsystem that happens to trigger it, it
> would be a _generic_ issue, and we'd have everybody involved when a
> compiler change introduces a new warning.

Yes, and GCC prereleases are tested with Linux, for example by doing
full Rawhide rebuilds. If we started using -Werror by default
(including allyesconfig), they would probably report warnings early.
Same for clang.

I hope that Linux can have -Werror everywhere, or at least a
CONFIG_WERROR option that does it even if it defaults to n for a release
or more. But I don't think we can get there without first seeing what
issues pop up in a few subsystems or arches---even before considering
new compilers---so I decided I would just try.

Paolo

> Adding the powerpc people, since they have more history with their
> somewhat less hacky one. Except that one automatically gets disabled
> by "make allmodconfig" and friends, which is also kind of pointless.
> Michael, what tends to be the triggers for people using
> PPC_DISABLE_WERROR? Do you have reports for it? Could we have a
> _generic_ option that just gets enabled by default, except it gets
> disabled by _known_ issues (like KASAN).
> 
> Being disabled for "make allmodconfig" is kind of against one of the
> _points_ of "the build should be warning-free".
Re: [PATCH net-next 00/23] Clean driver, module and FW versions
From: Leon Romanovsky
Date: Sun, 1 Mar 2020 16:44:33 +0200

> This is the second batch of the series which removes various static versions
> in favour of the globally defined Linux kernel version.

This generally looks fine to me but I'll let it sit for a few days so that
others can review.
Re: [PATCH v3 0/6] implement KASLR for powerpc/fsl_booke/64
On 2020/3/1 6:54, Scott Wood wrote:
> On Sat, 2020-02-29 at 15:27 +0800, Jason Yan wrote:
>>
>> On 2020/2/29 12:28, Scott Wood wrote:
>>> On Fri, 2020-02-28 at 14:47 +0800, Jason Yan wrote:
>>>>
>>>> On 2020/2/28 13:53, Scott Wood wrote:
>>>>>
>>>>> I don't see any debug setting for %pK (or %p) to always print the
>>>>> actual address (closest is kptr_restrict=1 but that only works in
>>>>> certain contexts)... from looking at the code it seems it hashes even
>>>>> if kaslr is entirely disabled? Or am I missing something?
>>>>>
>>>>
>>>> Yes, %pK (or %p) always hashes whether kaslr is disabled or not. So if
>>>> we want the real value of the address, we cannot use it. But if you only
>>>> want to distinguish if two pointers are the same, it's ok.
>>>
>>> Am I the only one that finds this a bit crazy? If you want to lock a
>>> system down then fine, but why wage war on debugging even when there's
>>> no randomization going on? Comparing two pointers for equality is not
>>> always adequate.
>>>
>>
>> AFAIK, %p hashing only exists because of the many legacy address
>> printings, to force those who really want the raw values to switch to
>> %px or even %lx. It's not the opposite of debugging. Raw address
>> printing is not forbidden; people only need to estimate the risk of
>> address leaks.
>
> Yes, but I don't see any format specifier to switch to that will hash in a
> randomized production environment, but not in a debug or other
> non-randomized environment, which seems like the ideal default for most
> debug output.
>

Sorry, I have no idea why there is no format specifier that switches
between randomized and non-randomized environments. Maybe they think that
raw addresses should not leak in a non-randomized environment either.
Maybe Kees or Tobin can answer this question.

Kees? Tobin?

>>
>> Turning to %p may not be a good idea in this situation. So
>> for the REG logs printed when dumping stack, we can disable it when
>> KASLR is enabled. For the REG logs in other places like show_regs(), only
>> privileged users can trigger it, and they are not combined with a symbol,
>> so I think it's ok to keep them.
>>
>> diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
>> index fad50db9dcf2..659c51f0739a 100644
>> --- a/arch/powerpc/kernel/process.c
>> +++ b/arch/powerpc/kernel/process.c
>> @@ -2068,7 +2068,10 @@ void show_stack(struct task_struct *tsk, unsigned long *stack)
>>  		newsp = stack[0];
>>  		ip = stack[STACK_FRAME_LR_SAVE];
>>  		if (!firstframe || ip != lr) {
>> -			printk("["REG"] ["REG"] %pS", sp, ip, (void *)ip);
>> +			if (IS_ENABLED(CONFIG_RANDOMIZE_BASE))
>> +				printk("%pS", (void *)ip);
>> +			else
>> +				printk("["REG"] ["REG"] %pS", sp, ip, (void *)ip);
>
> This doesn't deal with "nokaslr" on the kernel command line. It also doesn't
> seem like something that every callsite should have to opencode, versus
> having an appropriate format specifier that behaves as I described above
> (and I still don't see why that format specifier should not be "%p").
>

Actually I still do not understand why we should print the raw value
here. When KALLSYMS is enabled we have symbol name and offset like
put_cred_rcu+0x108/0x110, and when KALLSYMS is disabled we have the raw
address.

> -Scott
>
>
> .
>
[PATCH] powerpc/64s/radix: Fix !SMP build
Signed-off-by: Nicholas Piggin
---
 arch/powerpc/mm/book3s64/radix_pgtable.c | 1 +
 arch/powerpc/mm/book3s64/radix_tlb.c     | 7 ++-
 2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/mm/book3s64/radix_pgtable.c b/arch/powerpc/mm/book3s64/radix_pgtable.c
index dd1bea45325c..2a9a0cd79490 100644
--- a/arch/powerpc/mm/book3s64/radix_pgtable.c
+++ b/arch/powerpc/mm/book3s64/radix_pgtable.c
@@ -26,6 +26,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
diff --git a/arch/powerpc/mm/book3s64/radix_tlb.c b/arch/powerpc/mm/book3s64/radix_tlb.c
index 03f43c924e00..758ade2c2b6e 100644
--- a/arch/powerpc/mm/book3s64/radix_tlb.c
+++ b/arch/powerpc/mm/book3s64/radix_tlb.c
@@ -587,6 +587,11 @@ void radix__local_flush_all_mm(struct mm_struct *mm)
 	preempt_enable();
 }
 EXPORT_SYMBOL(radix__local_flush_all_mm);
+
+static void __flush_all_mm(struct mm_struct *mm, bool fullmm)
+{
+	radix__local_flush_all_mm(mm);
+}
 #endif /* CONFIG_SMP */
 
 void radix__local_flush_tlb_page_psize(struct mm_struct *mm, unsigned long vmaddr,
@@ -777,7 +782,7 @@ void radix__flush_tlb_page(struct vm_area_struct *vma, unsigned long vmaddr)
 EXPORT_SYMBOL(radix__flush_tlb_page);
 
 #else /* CONFIG_SMP */
-#define radix__flush_all_mm radix__local_flush_all_mm
+static inline void exit_flush_lazy_tlbs(struct mm_struct *mm) { }
 #endif /* CONFIG_SMP */
 
 static void do_tlbiel_kernel(void *info)
-- 
2.23.0
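The general shape of this kind of !SMP fix can be sketched outside the kernel as follows. This is only an illustration under assumptions: struct mm, local_flush_all() and flush_all() are made-up stand-ins, and only the name exit_flush_lazy_tlbs() is borrowed from the patch. The point is that common code can call the helper unconditionally while the uniprocessor build gets an empty inline stub.

```c
/* Hypothetical stand-in for struct mm_struct, with counters for the demo. */
struct mm {
	int local_flushes;
	int remote_flushes;
};

/* Always available: flush on the current CPU only. */
static void local_flush_all(struct mm *mm)
{
	mm->local_flushes++;
}

#ifdef CONFIG_SMP
/* SMP build: other CPUs may hold lazy references that must be dropped. */
static void exit_flush_lazy_tlbs(struct mm *mm)
{
	mm->remote_flushes++;
}
#else
/* !SMP build: there are no other CPUs, so the helper is an empty stub. */
static inline void exit_flush_lazy_tlbs(struct mm *mm)
{
	(void)mm;
}
#endif

/* Common code calls the helper without any #ifdef at the call site. */
static void flush_all(struct mm *mm)
{
	exit_flush_lazy_tlbs(mm);
	local_flush_all(mm);
}
```

Built without -DCONFIG_SMP, a call to flush_all() only bumps local_flushes; built with it, both counters move.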
Re: [Intel-gfx] [PATCH v7 00/12] Introduce CAP_PERFMON to secure system performance monitoring and observability
Thanks, this looks good to me, in keeping with the CAP_SYSLOG break.

Acked-by: Serge E. Hallyn

for the set.

James/Ingo/Peter, if no one has remaining objections, whose branch should these go in through?

thanks,
-serge

On Tue, Feb 25, 2020 at 12:55:54PM +0300, Alexey Budankov wrote:
>
> Hi,
>
> Is there anything else I could do in order to move the changes forward
> or is something still missing from this patch set?
> Could you please share your mind?
>
> Thanks,
> Alexey
>
> On 17.02.2020 11:02, Alexey Budankov wrote:
> >
> > Currently access to perf_events, i915_perf and other performance
> > monitoring and observability subsystems of the kernel is open only for
> > a privileged process [1] with CAP_SYS_ADMIN capability enabled in the
> > process effective set [2].
> >
> > This patch set introduces CAP_PERFMON capability designed to secure
> > system performance monitoring and observability operations so that
> > CAP_PERFMON would assist CAP_SYS_ADMIN capability in its governing role
> > for performance monitoring and observability subsystems of the kernel.
> >
> > CAP_PERFMON intends to harden system security and integrity during
> > performance monitoring and observability operations by decreasing attack
> > surface that is available to a CAP_SYS_ADMIN privileged process [2].
> > Providing the access to performance monitoring and observability
> > operations under CAP_PERFMON capability singly, without the rest of
> > CAP_SYS_ADMIN credentials, excludes chances to misuse the credentials
> > and makes the operation more secure. Thus, CAP_PERFMON implements the
> > principle of least privilege for performance monitoring and
> > observability operations (POSIX IEEE 1003.1e: 2.2.2.39 principle of
> > least privilege: A security design principle that states that a process
> > or program be granted only those privileges (e.g., capabilities)
> > necessary to accomplish its legitimate function, and only for the time
> > that such privileges are actually required)
> >
> > CAP_PERFMON intends to meet the demand to secure system performance
> > monitoring and observability operations for adoption in security
> > sensitive, restricted, multiuser production environments (e.g. HPC
> > clusters, cloud and virtual compute environments), where root or
> > CAP_SYS_ADMIN credentials are not available to mass users of a system,
> > and securely unblock accessibility of system performance monitoring and
> > observability operations beyond root and CAP_SYS_ADMIN use cases.
> >
> > CAP_PERFMON intends to take over CAP_SYS_ADMIN credentials related to
> > system performance monitoring and observability operations and balance
> > amount of CAP_SYS_ADMIN credentials following the recommendations in
> > the capabilities man page [2] for CAP_SYS_ADMIN: "Note: this capability
> > is overloaded; see Notes to kernel developers, below." For backward
> > compatibility reasons access to system performance monitoring and
> > observability subsystems of the kernel remains open for CAP_SYS_ADMIN
> > privileged processes but CAP_SYS_ADMIN capability usage for secure
> > system performance monitoring and observability operations is
> > discouraged with respect to the designed CAP_PERFMON capability.
> >
> > Possible alternative solution to this system security hardening,
> > capabilities balancing task of making performance monitoring and
> > observability operations more secure and accessible could be to use
> > the existing CAP_SYS_PTRACE capability to govern system performance
> > monitoring and observability subsystems. However CAP_SYS_PTRACE
> > capability still provides users with more credentials than are
> > required for secure performance monitoring and observability
> > operations and this excess is avoided by the designed CAP_PERFMON.
> >
> > Although software running under CAP_PERFMON can not ensure avoidance of
> > related hardware issues, the software can still mitigate those issues
> > following the official hardware issues mitigation procedure [3]. The
> > bugs in the software itself can be fixed following the standard kernel
> > development process [4] to maintain and harden security of system
> > performance monitoring and observability operations. Finally, the patch
> > set is shaped in the way that simplifies backtracking procedure of
> > possible induced issues [5] as much as possible.
> >
> > The patch set is for tip perf/core repository:
> > git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip perf/core
> > sha1: fdb64822443ec9fb8c3a74b598a74790ae8d2e22
> >
> > ---
> > Changes in v7:
> > - updated and extended kernel.rst and perf-security.rst documentation
> > files with the information about CAP_PERFMON capability and its use cases
> > - documented the case of double audit logging of CAP_PERFMON and
> > CAP_SYS_ADMIN
> > capabilities on a SELinux enabled system
> > Changes in v6:
> > - avoided noaudit checks in perfmon_capable() to explicitly advertise
> > CAP_PERFMON usage thru audit logs to
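The access rule the cover letter describes (CAP_PERFMON is sufficient, CAP_SYS_ADMIN keeps working for backward compatibility, nothing else does) can be modelled in a few lines of userspace C. This is a sketch under assumptions, not the patch's perfmon_capable(): struct task_caps, has_cap() and perfmon_allowed() are hypothetical, and the capability numbers are hard-coded here for illustration instead of being taken from linux/capability.h.

```c
#include <stdbool.h>
#include <stdint.h>

/* Capability bit numbers, hard-coded for this illustration only. */
#define CAP_SYS_PTRACE	19
#define CAP_SYS_ADMIN	21
#define CAP_PERFMON	38

/* Hypothetical model of a task's effective capability set. */
struct task_caps {
	uint64_t effective;
};

static bool has_cap(const struct task_caps *t, int cap)
{
	return (t->effective >> cap) & 1ULL;
}

/*
 * Performance monitoring is allowed with the narrow capability, or with
 * CAP_SYS_ADMIN as the backward-compatible (but discouraged) fallback.
 */
static bool perfmon_allowed(const struct task_caps *t)
{
	return has_cap(t, CAP_PERFMON) || has_cap(t, CAP_SYS_ADMIN);
}
```

A task holding only CAP_SYS_PTRACE is rejected in this model, which mirrors the cover letter's argument that CAP_SYS_PTRACE was considered and not chosen as the governing capability.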
RE: [PATCH v3 25/27] powerpc/powernv/pmem: Expose the serial number in sysfs
On Fri, 2020-02-28 at 08:15 +0100, Greg Kroah-Hartman wrote:
> On Fri, Feb 28, 2020 at 05:25:31PM +1100, Andrew Donnellan wrote:
> > On 21/2/20 2:27 pm, Alastair D'Silva wrote:
> > > +int ocxlpmem_sysfs_add(struct ocxlpmem *ocxlpmem)
> > > +{
> > > +	int i, rc;
> > > +
> > > +	for (i = 0; i < ARRAY_SIZE(attrs); i++) {
> > > +		rc = device_create_file(&ocxlpmem->dev, &attrs[i]);
> > > +		if (rc) {
> > > +			for (; --i >= 0;)
> > > +				device_remove_file(&ocxlpmem->dev, &attrs[i]);
> >
> > I'd rather avoid weird for loop constructs if possible.
> >
> > Is it actually dangerous to call device_remove_file() on an attr that hasn't
> > been added? If not then I'd rather define an err: label and loop over the
> > whole array there.
>
> None of this should be used at all, just use attribute groups properly
> and the driver core will handle this all for you.
>
> device_create/remove_file should never be called by anyone anymore if at all
> possible.
>
> thanks,
>
> greg k-h

Thanks, I'll rework it to use the .groups member of struct pci_driver.

-- 
Alastair D'Silva
Open Source Developer
Linux Technology Centre, IBM Australia
mob: 0423 762 819
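For reference, the "err: label" unwinding that Andrew mentions can be sketched in plain C as below. Note that this is not what the thread settles on (Greg's advice is to use attribute groups and let the driver core do the work); it only illustrates the rollback pattern under discussion. create_attr(), remove_attr(), fail_at and the created[] array are hypothetical stand-ins for device_create_file(), device_remove_file() and real device state.

```c
#define NATTRS 4

/* Hypothetical bookkeeping standing in for real sysfs state. */
static int created[NATTRS];
static int fail_at = -1;	/* index whose creation fails, -1 for none */

/* Stand-in for device_create_file(): returns 0 on success. */
static int create_attr(int i)
{
	if (i == fail_at)
		return -1;
	created[i] = 1;
	return 0;
}

/* Stand-in for device_remove_file(). */
static void remove_attr(int i)
{
	created[i] = 0;
}

static int add_all_attrs(void)
{
	int i, rc = 0;

	for (i = 0; i < NATTRS; i++) {
		rc = create_attr(i);
		if (rc)
			goto err;
	}
	return 0;

err:
	/* Undo only the entries that were created, newest first. */
	while (i--)
		remove_attr(i);
	return rc;
}
```

The while (i--) loop visits indices i-1 down to 0, so the entry that failed is never passed to remove_attr(), which sidesteps the question of whether removing a never-added attribute is safe.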
Re: [GIT PULL] Second batch of KVM changes for Linux 5.6-rc4 (or rc5)
On Sun, Mar 1, 2020 at 1:03 PM Paolo Bonzini wrote:
>
> Paolo Bonzini (4):
>       KVM: allow disabling -Werror

Honestly, this is just badly done.

You've basically made it enable -Werror only for very random configurations - and apparently the one you test. Doing things like COMPILE_TEST disables it, but so does not having EXPERT enabled.

So it looks entirely ad-hoc and makes very little sense. At least the "with KASAN, disable this" part makes sense, since that's a known source of warnings. But everything else looks very random.

I've merged this, but I wonder why you couldn't just do what I suggested originally? Seriously, if you script your build tests, and don't even look at the results, then you might as well use

    make KCFLAGS=-Werror

instead of having this kind of completely random option that has almost no logic to it at all.

And if you depend entirely on random build infrastructure like the 0day bot etc, this likely _is_ going to break when it starts using a new gcc version, or when it starts testing using clang, or whatever. So then we end up with another odd random situation where now kvm (and only kvm) will fail those builds just because they are automated.

Yes, as I said in that original thread, I'd love to do -Werror in general, at which point it wouldn't be some random ad-hoc kvm special case for some random option. But the "now it causes problems for random compiler versions" is a real issue again - but at least it wouldn't be a random kernel subsystem that happens to trigger it, it would be a _generic_ issue, and we'd have everybody involved when a compiler change introduces a new warning.

I've pulled this for now, but I really think it's a horrible hack, and it's just done entirely wrong.

Adding the powerpc people, since they have more history with their somewhat less hacky one. Except that one automatically gets disabled by "make allmodconfig" and friends, which is also kind of pointless.

Michael, what tends to be the triggers for people using PPC_DISABLE_WERROR? Do you have reports for it?

Could we have a _generic_ option that just gets enabled by default, except it gets disabled by _known_ issues (like KASAN).

Being disabled for "make allmodconfig" is kind of against one of the _points_ of "the build should be warning-free".

Linus
[Bug 199471] [Bisected][Regression] windfarm_pm* no longer gets automatically loaded when CONFIG_I2C_POWERMAC=y is set
https://bugzilla.kernel.org/show_bug.cgi?id=199471

Erhard F. (erhar...@mailbox.org) changed:

           What    |Removed                     |Added
 Attachment #275509|0                           |1
        is obsolete|                            |

--- Comment #14 from Erhard F. (erhar...@mailbox.org) ---
Created attachment 287749
  --> https://bugzilla.kernel.org/attachment.cgi?id=287749&action=edit
dmesg (kernel 4.17, PowerMac G5 11,2)

-- 
You are receiving this mail because:
You are watching the assignee of the bug.
[Bug 199471] [Bisected][Regression] windfarm_pm* no longer gets automatically loaded when CONFIG_I2C_POWERMAC=y is set
https://bugzilla.kernel.org/show_bug.cgi?id=199471

--- Comment #13 from Erhard F. (erhar...@mailbox.org) ---
Created attachment 287747
  --> https://bugzilla.kernel.org/attachment.cgi?id=287747&action=edit
kernel .config (kernel 4.17, PowerMac G5 11,2)
[Bug 199471] [Bisected][Regression] windfarm_pm* no longer gets automatically loaded when CONFIG_I2C_POWERMAC=y is set
https://bugzilla.kernel.org/show_bug.cgi?id=199471

Erhard F. (erhar...@mailbox.org) changed:

           What    |Removed                     |Added
 Attachment #275507|0                           |1
        is obsolete|                            |

--- Comment #12 from Erhard F. (erhar...@mailbox.org) ---
Created attachment 287745
  --> https://bugzilla.kernel.org/attachment.cgi?id=287745&action=edit
dmesg (kernel 4.16.18, PowerMac G5 11,2)
[Bug 199471] [Bisected][Regression] windfarm_pm* no longer gets automatically loaded when CONFIG_I2C_POWERMAC=y is set
https://bugzilla.kernel.org/show_bug.cgi?id=199471

--- Comment #11 from Erhard F. (erhar...@mailbox.org) ---
(In reply to Wolfram Sang from comment #8)
> "This has been quite nice since 4.?.x up to 4.16.x as you only need
> CONFIG_I2C_POWERMAC=y which selects the proper windfarm_pmXX at boot time."
>
> I can't find that in the code. Are you sure i2c-powermac requested that
> module?

I guess so 'cause if I build i2c_powermac as a module and manually modprobe it, all the relevant windfarm modules get pulled in. But not before.

# modprobe -v i2c_powermac
insmod /lib/modules/4.16.18-PowerMacG5+/kernel/drivers/i2c/busses/i2c-powermac.ko
# dmesg | tail
[ 150.181478] 11
[ 150.182851] 0
[ 150.184220] 0
[ 150.626685] windfarm: Backside control loop started.
[ 150.690132] windfarm: Slots control loop started.
[ 150.794843] i2c i2c-0: master_xfer[0] W, addr=0x50, len=1
[ 150.796467] i2c i2c-0: master_xfer[1] R, addr=0x50, len=8
[ 150.801851] i2c i2c-0: NAK from device addr 0x50 msg #0
[ 150.807758] windfarm: Drive bay control loop started.
[Bug 199471] windfarm_pm72 no longer gets automatically loaded when CONFIG_I2C_POWERMAC=y is set (regression)
https://bugzilla.kernel.org/show_bug.cgi?id=199471

--- Comment #10 from Erhard F. (erhar...@mailbox.org) ---
Created attachment 287743
  --> https://bugzilla.kernel.org/attachment.cgi?id=287743&action=edit
bisect.log

Finally checked on that bug again and bisected it. The offending commit is:

# git bisect bad | tee -a ~/bisect02.log
af503716ac1444db61d80cb6d17cfe62929c21df is the first bad commit
commit af503716ac1444db61d80cb6d17cfe62929c21df
Author: Javier Martinez Canillas
Date:   Sun Dec 3 22:40:50 2017 +0100

    i2c: core: report OF style module alias for devices registered via OF

    The buses should honor the firmware interface used to register the device, but the I2C core reports a MODALIAS of the form i2c: even for I2C devices registered via OF.

    This means that user-space will never get an OF style uevent MODALIAS even when the drivers modules contain aliases exported from both the I2C and OF device ID tables. For example, an Atmel maXTouch Touchscreen registered by a DT node with compatible "atmel,maxtouch" has the following module alias:

    $ cat /sys/class/i2c-adapter/i2c-8/8-004b/modalias
    i2c:maxtouch

    So udev won't be able to auto-load a module for an OF-only device driver. Many OF-only drivers duplicate the OF device ID table entries in an I2C ID table only as a workaround for how the I2C core reports the module alias.

    This patch changes the I2C core to report an OF related MODALIAS uevent if the device was registered via OF. So for the previous example, after this patch, the reported MODALIAS for the Atmel maXTouch will be the following:

    $ cat /sys/class/i2c-adapter/i2c-8/8-004b/modalias
    of:NtrackpadTCatmel,maxtouch

    NOTE: This patch may break out-of-tree drivers that were relying on this behavior, and only had an I2C device ID table even when the device was registered via OF. There are no remaining drivers in mainline that do this, but out-of-tree drivers have to be fixed and define a proper OF device ID table to have module auto-loading working.

    Signed-off-by: Javier Martinez Canillas
    Tested-by: Dmitry Mastykin
    Signed-off-by: Wolfram Sang

 drivers/i2c/i2c-core-base.c | 8
 1 file changed, 8 insertions(+)
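The difference between the two alias styles quoted in the commit message can be shown with a short sketch that builds both strings. The helper names are hypothetical, and the "of:N...T...C..." layout is inferred only from the example strings above (with the type field empty, as it appears there); it is not taken from the I2C core source.

```c
#include <stddef.h>
#include <stdio.h>

/* Legacy style: the alias only carries the I2C device name. */
static int i2c_style_modalias(char *buf, size_t len, const char *dev_name)
{
	return snprintf(buf, len, "i2c:%s", dev_name);
}

/*
 * OF style: node name, type and compatible string are encoded after the
 * N, T and C markers. Assumption: field layout inferred from the example
 * in the quoted commit message.
 */
static int of_style_modalias(char *buf, size_t len, const char *node,
			     const char *type, const char *compatible)
{
	return snprintf(buf, len, "of:N%sT%sC%s", node, type, compatible);
}
```

A module that only exports an "i2c:" alias will not match the second string, which is consistent with the auto-loading regression bisected in this bug.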
[Bug 199471] windfarm_pm72 no longer gets automatically loaded when CONFIG_I2C_POWERMAC=y is set (regression)
https://bugzilla.kernel.org/show_bug.cgi?id=199471

Erhard F. (erhar...@mailbox.org) changed:

           What    |Removed                     |Added
 Attachment #275503|0                           |1
        is obsolete|                            |
 Attachment #275505|0                           |1
        is obsolete|                            |

--- Comment #9 from Erhard F. (erhar...@mailbox.org) ---
Created attachment 287741
  --> https://bugzilla.kernel.org/attachment.cgi?id=287741&action=edit
kernel .config (kernel 4.16, PowerMac G5 11,2)

With the attached kernel .config the G5 7,3 and the G5 11,2 automatically load the suitable windfarm module on kernel <4.17.

Starting from kernel 4.17 windfarm core needs to be CONFIG_WINDFARM=y to automatically load the suitable windfarm module, CONFIG_WINDFARM=m is no longer sufficient.

Needed for 4.16.x to automatically load the suitable windfarm module:
# grep -i wind .config
CONFIG_WINDFARM=m
CONFIG_WINDFARM_PM81=m
CONFIG_WINDFARM_PM72=m
CONFIG_WINDFARM_RM31=m
CONFIG_WINDFARM_PM91=m
CONFIG_WINDFARM_PM112=m
CONFIG_WINDFARM_PM121=m

Needed for >=4.17.x to automatically load the suitable windfarm module:
# grep -i wind .config
CONFIG_WINDFARM=y
CONFIG_WINDFARM_PM81=m
CONFIG_WINDFARM_PM72=m
CONFIG_WINDFARM_RM31=m
CONFIG_WINDFARM_PM91=m
CONFIG_WINDFARM_PM112=m
CONFIG_WINDFARM_PM121=m
[PATCH] tty: hvc: Use the correct style for SPDX License Identifier
This patch corrects the SPDX License Identifier style in header file related to the HVC driver. For C header files Documentation/process/license-rules.rst mandates C-like comments (opposed to C source files where C++ style should be used).

Changes made by using a script provided by Joe Perches here: https://lkml.org/lkml/2019/2/7/46.

Suggested-by: Joe Perches
Signed-off-by: Nishad Kamdar
---
 drivers/tty/hvc/hvc_console.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/tty/hvc/hvc_console.h b/drivers/tty/hvc/hvc_console.h
index e9319954c832..18d005814e4b 100644
--- a/drivers/tty/hvc/hvc_console.h
+++ b/drivers/tty/hvc/hvc_console.h
@@ -1,4 +1,4 @@
-// SPDX-License-Identifier: GPL-2.0+
+/* SPDX-License-Identifier: GPL-2.0+ */
 /*
  * hvc_console.h
  * Copyright (C) 2005 IBM Corporation
-- 
2.17.1
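The rule the commit message cites (C-like comment in headers, C++ style comment in C source files) is simple enough to express as a tiny helper. This is only a sketch: spdx_line() is a hypothetical function, it looks at nothing but a trailing ".h", and it ignores the other file types that license-rules.rst also covers.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: produce the SPDX first line for a C file.
 * Headers (*.h) get a C-like comment, everything else is treated as a
 * C source file and gets a C++ style comment.
 */
static int spdx_line(char *buf, size_t len, const char *filename,
		     const char *license)
{
	size_t n = strlen(filename);
	int is_header = n >= 2 && strcmp(filename + n - 2, ".h") == 0;

	if (is_header)
		return snprintf(buf, len,
				"/* SPDX-License-Identifier: %s */", license);
	return snprintf(buf, len, "// SPDX-License-Identifier: %s", license);
}
```

For hvc_console.h with GPL-2.0+ this yields the same line the patch adds; a .c file name gives the "//" form that the patch removes from the header.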
[PATCH net-next 22/23] net/freescale: Don't set zero if FW not-available in ucc_geth
From: Leon Romanovsky

Rely on ethtool to properly present the fact that FW is not available for the ucc_geth driver.

Signed-off-by: Leon Romanovsky
---
 drivers/net/ethernet/freescale/ucc_geth_ethtool.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/net/ethernet/freescale/ucc_geth_ethtool.c b/drivers/net/ethernet/freescale/ucc_geth_ethtool.c
index bc7ba70d176c..14c08a868190 100644
--- a/drivers/net/ethernet/freescale/ucc_geth_ethtool.c
+++ b/drivers/net/ethernet/freescale/ucc_geth_ethtool.c
@@ -334,7 +334,6 @@ uec_get_drvinfo(struct net_device *netdev, struct ethtool_drvinfo *drvinfo)
 {
 	strlcpy(drvinfo->driver, DRV_NAME, sizeof(drvinfo->driver));
-	strlcpy(drvinfo->fw_version, "N/A", sizeof(drvinfo->fw_version));
 	strlcpy(drvinfo->bus_info, "QUICC ENGINE", sizeof(drvinfo->bus_info));
 }
 
-- 
2.24.1
[PATCH net-next 20/23] net/freescale: Clean drivers from static versions
From: Leon Romanovsky

There is no need to set static versions because linux kernel is released all together with same version applicable to the whole code base.

Signed-off-by: Leon Romanovsky
---
 drivers/net/ethernet/freescale/dpaa/dpaa_ethtool.c | 2 --
 drivers/net/ethernet/freescale/enetc/enetc_pf.c | 13 -
 drivers/net/ethernet/freescale/enetc/enetc_vf.c | 12
 drivers/net/ethernet/freescale/fec_main.c | 1 -
 .../net/ethernet/freescale/fs_enet/fs_enet-main.c | 2 --
 drivers/net/ethernet/freescale/fs_enet/fs_enet.h | 2 --
 drivers/net/ethernet/freescale/gianfar.c | 2 --
 drivers/net/ethernet/freescale/gianfar.h | 1 -
 drivers/net/ethernet/freescale/gianfar_ethtool.c | 2 --
 drivers/net/ethernet/freescale/ucc_geth.c | 1 -
 drivers/net/ethernet/freescale/ucc_geth.h | 1 -
 drivers/net/ethernet/freescale/ucc_geth_ethtool.c | 1 -
 12 files changed, 40 deletions(-)

diff --git a/drivers/net/ethernet/freescale/dpaa/dpaa_ethtool.c b/drivers/net/ethernet/freescale/dpaa/dpaa_ethtool.c
index 66d150872d48..13ab669ca8b3 100644
--- a/drivers/net/ethernet/freescale/dpaa/dpaa_ethtool.c
+++ b/drivers/net/ethernet/freescale/dpaa/dpaa_ethtool.c
@@ -110,8 +110,6 @@ static void dpaa_get_drvinfo(struct net_device *net_dev,
 
 	strlcpy(drvinfo->driver, KBUILD_MODNAME,
 		sizeof(drvinfo->driver));
-	len = snprintf(drvinfo->version, sizeof(drvinfo->version),
-		       "%X", 0);
 	len = snprintf(drvinfo->fw_version, sizeof(drvinfo->fw_version),
 		       "%X", 0);
 
diff --git a/drivers/net/ethernet/freescale/enetc/enetc_pf.c b/drivers/net/ethernet/freescale/enetc/enetc_pf.c
index fc0d7d99e9a1..545a344bce00 100644
--- a/drivers/net/ethernet/freescale/enetc/enetc_pf.c
+++ b/drivers/net/ethernet/freescale/enetc/enetc_pf.c
@@ -7,12 +7,6 @@
 #include
 #include "enetc_pf.h"
 
-#define ENETC_DRV_VER_MAJ 1
-#define ENETC_DRV_VER_MIN 0
-
-#define ENETC_DRV_VER_STR __stringify(ENETC_DRV_VER_MAJ) "." \
-			  __stringify(ENETC_DRV_VER_MIN)
-static const char enetc_drv_ver[] = ENETC_DRV_VER_STR;
 #define ENETC_DRV_NAME_STR "ENETC PF driver"
 static const char enetc_drv_name[] = ENETC_DRV_NAME_STR;
 
@@ -929,9 +923,6 @@ static int enetc_pf_probe(struct pci_dev *pdev,
 
 	netif_carrier_off(ndev);
 
-	netif_info(priv, probe, ndev, "%s v%s\n",
-		   enetc_drv_name, enetc_drv_ver);
-
 	return 0;
 
 err_reg_netdev:
@@ -959,9 +950,6 @@ static void enetc_pf_remove(struct pci_dev *pdev)
 		enetc_sriov_configure(pdev, 0);
 
 	priv = netdev_priv(si->ndev);
-	netif_info(priv, drv, si->ndev, "%s v%s remove\n",
-		   enetc_drv_name, enetc_drv_ver);
-
 	unregister_netdev(si->ndev);
 
 	enetc_mdio_remove(pf);
@@ -995,4 +983,3 @@ module_pci_driver(enetc_pf_driver);
 
 MODULE_DESCRIPTION(ENETC_DRV_NAME_STR);
 MODULE_LICENSE("Dual BSD/GPL");
-MODULE_VERSION(ENETC_DRV_VER_STR);
diff --git a/drivers/net/ethernet/freescale/enetc/enetc_vf.c b/drivers/net/ethernet/freescale/enetc/enetc_vf.c
index ebd21bf4cfa1..28a786b2f3e7 100644
--- a/drivers/net/ethernet/freescale/enetc/enetc_vf.c
+++ b/drivers/net/ethernet/freescale/enetc/enetc_vf.c
@@ -4,12 +4,6 @@
 #include
 #include "enetc.h"
 
-#define ENETC_DRV_VER_MAJ 1
-#define ENETC_DRV_VER_MIN 0
-
-#define ENETC_DRV_VER_STR __stringify(ENETC_DRV_VER_MAJ) "." \
-			  __stringify(ENETC_DRV_VER_MIN)
-static const char enetc_drv_ver[] = ENETC_DRV_VER_STR;
 #define ENETC_DRV_NAME_STR "ENETC VF driver"
 static const char enetc_drv_name[] = ENETC_DRV_NAME_STR;
 
@@ -201,9 +195,6 @@ static int enetc_vf_probe(struct pci_dev *pdev,
 
 	netif_carrier_off(ndev);
 
-	netif_info(priv, probe, ndev, "%s v%s\n",
-		   enetc_drv_name, enetc_drv_ver);
-
 	return 0;
 
 err_reg_netdev:
@@ -225,8 +216,6 @@ static void enetc_vf_remove(struct pci_dev *pdev)
 	struct enetc_ndev_priv *priv;
 
 	priv = netdev_priv(si->ndev);
-	netif_info(priv, drv, si->ndev, "%s v%s remove\n",
-		   enetc_drv_name, enetc_drv_ver);
 	unregister_netdev(si->ndev);
 
 	enetc_free_msix(priv);
@@ -254,4 +243,3 @@ module_pci_driver(enetc_vf_driver);
 
 MODULE_DESCRIPTION(ENETC_DRV_NAME_STR);
 MODULE_LICENSE("Dual BSD/GPL");
-MODULE_VERSION(ENETC_DRV_VER_STR);
diff --git a/drivers/net/ethernet/freescale/fec_main.c b/drivers/net/ethernet/freescale/fec_main.c
index 12edd4e358f8..af7653e341f2 100644
--- a/drivers/net/ethernet/freescale/fec_main.c
+++ b/drivers/net/ethernet/freescale/fec_main.c
@@ -2128,7 +2128,6 @@ static void fec_enet_get_drvinfo(struct net_device *ndev,
 
 	strlcpy(info->driver, fep->pdev->dev.driver->name,
 		sizeof(info->driver));
-	strlcpy(info->version, "Revision: 1.0", sizeof(info->version));
 	strlcpy(info->bus_info, dev_name(&ndev->dev), sizeof(info->bus_inf
[PATCH net-next 00/23] Clean driver, module and FW versions
From: Leon Romanovsky

Hi,

This is the second batch of the series which removes various static versions in favour of the globally defined Linux kernel version.

The first part with better cover letter can be found here
https://lore.kernel.org/lkml/20200224085311.460338-1-l...@kernel.org

The code is based on
68e2c37690b0 ("Merge branch 'hsr-several-code-cleanup-for-hsr-module'")

and WIP branch is
https://git.kernel.org/pub/scm/linux/kernel/git/leon/linux-rdma.git/log/?h=ethtool

Thanks

Leon Romanovsky (23):
  net/broadcom: Clean broadcom code from driver versions
  net/broadcom: Don't set N/A FW if it is not available
  net/brocade: Delete driver version
  net/liquidio: Delete driver version assignment
  net/liquidio: Delete non-working LIQUIDIO_PACKAGE check
  net/cavium: Clean driver versions
  net/cavium: Delete N/A assignments for ethtool
  net/chelsio: Delete drive and module versions
  net/chelsio: Don't set N/A for not available FW
  net/cirrus: Delete driver version
  net/cisco: Delete driver and module versions
  net/cortina: Delete driver version from ethtool output
  net/davicom: Delete ethtool version assignment
  net/dec: Delete driver versions
  net/dlink: Remove driver version and release date
  net/dnet: Delete static version from the driver
  net/emulex: Delete driver version
  net/faraday: Delete driver version from the drivers
  net/fealnx: Delete driver version
  net/freescale: Clean drivers from static versions
  net/freescale: Don't set zero if FW not-available in dpaa
  net/freescale: Don't set zero if FW not-available in ucc_geth
  net/freescale: Don't set zero if FW iand bus not-available in gianfar

 drivers/net/ethernet/broadcom/b44.c | 5
 drivers/net/ethernet/broadcom/bcm63xx_enet.c | 10 ++-
 drivers/net/ethernet/broadcom/bcmsysport.c | 1 -
 drivers/net/ethernet/broadcom/bnx2.c | 11
 drivers/net/ethernet/broadcom/bnx2x/bnx2x.h | 8 +-
 .../ethernet/broadcom/bnx2x/bnx2x_ethtool.c | 7 -
 .../net/ethernet/broadcom/bnx2x/bnx2x_main.c | 7 -
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 8 --
 drivers/net/ethernet/broadcom/bnxt/bnxt.h | 4 ++-
 .../net/ethernet/broadcom/bnxt/bnxt_ethtool.c | 1 -
 drivers/net/ethernet/broadcom/bnxt/bnxt_vfr.c | 1 -
 .../net/ethernet/broadcom/genet/bcmgenet.c | 1 -
 drivers/net/ethernet/broadcom/tg3.c | 11 +---
 drivers/net/ethernet/brocade/bna/bnad.c | 4 ---
 drivers/net/ethernet/brocade/bna/bnad.h | 2 --
 .../net/ethernet/brocade/bna/bnad_ethtool.c | 1 -
 .../ethernet/cavium/liquidio/lio_ethtool.c | 2 --
 .../net/ethernet/cavium/liquidio/lio_main.c | 8 --
 .../ethernet/cavium/liquidio/lio_vf_main.c | 5 ++--
 .../cavium/liquidio/liquidio_common.h | 6 -
 .../ethernet/cavium/liquidio/octeon_console.c | 10 ++-
 .../net/ethernet/cavium/octeon/octeon_mgmt.c | 6 -
 .../ethernet/cavium/thunder/nicvf_ethtool.c | 2 --
 drivers/net/ethernet/chelsio/cxgb/common.h | 1 -
 drivers/net/ethernet/chelsio/cxgb/cxgb2.c | 3 ---
 .../net/ethernet/chelsio/cxgb3/cxgb3_main.c | 4 ---
 drivers/net/ethernet/chelsio/cxgb3/version.h | 2 --
 drivers/net/ethernet/chelsio/cxgb4/cxgb4.h | 3 +--
 .../ethernet/chelsio/cxgb4/cxgb4_ethtool.c | 6 +
 .../net/ethernet/chelsio/cxgb4/cxgb4_main.c | 10 ---
 .../ethernet/chelsio/cxgb4vf/cxgb4vf_main.c | 9 ---
 .../ethernet/chelsio/libcxgb/libcxgb_ppm.c | 2 --
 drivers/net/ethernet/cirrus/ep93xx_eth.c | 2 --
 drivers/net/ethernet/cisco/enic/enic.h | 2 --
 .../net/ethernet/cisco/enic/enic_ethtool.c | 1 -
 drivers/net/ethernet/cisco/enic/enic_main.c | 3 ---
 drivers/net/ethernet/cortina/gemini.c | 2 --
 drivers/net/ethernet/davicom/dm9000.c | 2 --
 drivers/net/ethernet/dec/tulip/de2104x.c | 15 ---
 drivers/net/ethernet/dec/tulip/dmfe.c | 14 --
 drivers/net/ethernet/dec/tulip/tulip_core.c | 26 ++-
 drivers/net/ethernet/dec/tulip/uli526x.c | 13 --
 drivers/net/ethernet/dec/tulip/winbond-840.c | 12 -
 drivers/net/ethernet/dlink/dl2k.c | 9 ---
 drivers/net/ethernet/dlink/sundance.c | 20 --
 drivers/net/ethernet/dnet.c | 1 -
 drivers/net/ethernet/dnet.h | 1 -
 drivers/net/ethernet/emulex/benet/be.h | 1 -
 .../net/ethernet/emulex/benet/be_ethtool.c | 1 -
 drivers/net/ethernet/emulex/benet/be_main.c | 5 +---
 drivers/net/ethernet/faraday/ftgmac100.c | 2 --
 drivers/net/ethernet/faraday/ftmac100.c | 3 ---
 drivers/net/ethernet/fealnx.c | 20 --
 .../ethernet/freescale/dpaa/dpaa_ethtool.c | 11
 .../net/ethernet/freescale/enetc/enetc_pf.c | 13 --
 .../net/ethernet/freescale/enetc/enetc_vf.c | 12 -
 drivers/net/ethernet/freescale/fec_main.c | 1 -
 .../ethernet/freescale/fs_enet/fs_enet-main.c | 2 --
Re: [PATCH v3 32/32] powerpc/64s: system call support for scv/rfscv instructions
Hi Nicholas,

I love your patch! Yet something to improve:

[auto build test ERROR on powerpc/next]
[also build test ERROR on v5.6-rc3 next-20200228]
[cannot apply to kvm-ppc/kvm-ppc-next scottwood/next]
[if your patch is applied to the wrong git tree, please drop us a note to help
improve the system. BTW, we also suggest to use '--base' option to specify the
base tree in git format-patch, please see https://stackoverflow.com/a/37406982]

url:      https://github.com/0day-ci/linux/commits/Nicholas-Piggin/powerpc-64-interrupts-and-syscalls-series/20200226-043224
base:     https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next
config:   powerpc-ppc64e_defconfig (attached as .config)
compiler: powerpc64-linux-gcc (GCC) 7.5.0
reproduce:
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # save the attached .config to linux build tree
        GCC_VERSION=7.5.0 make.cross ARCH=powerpc

If you fix the issue, kindly add following tag
Reported-by: kbuild test robot

All errors (new ones prefixed by >>):

   arch/powerpc/kernel/entry_64.S: Assembler messages:
>> arch/powerpc/kernel/entry_64.S:67: Error: unrecognized opcode: `interrupt_to_kernel'
>> arch/powerpc/kernel/entry_64.S:164: Error: unrecognized opcode: `rfscv_to_user'

vim +67 arch/powerpc/kernel/entry_64.S

    47
    48  /*
    49   * System calls.
    50   */
    51          .section ".toc","aw"
    52  SYS_CALL_TABLE:
    53          .tc sys_call_table[TC],sys_call_table
    54
    55  COMPAT_SYS_CALL_TABLE:
    56          .tc compat_sys_call_table[TC],compat_sys_call_table
    57
    58  /* This value is used to mark exception frames on the stack. */
    59  exception_marker:
    60          .tc ID_EXC_MARKER[TC],STACK_FRAME_REGS_MARKER
    61
    62          .section ".text"
    63          .align 7
    64
    65          .globl system_call_vectored_common
    66  system_call_vectored_common:
  > 67          INTERRUPT_TO_KERNEL
    68          mr r10,r1
    69          ld r1,PACAKSAVE(r13)
    70          std r10,0(r1)
    71          std r11,_NIP(r1)
    72          std r12,_MSR(r1)
    73          std r0,GPR0(r1)
    74          std r10,GPR1(r1)
    75          std r2,GPR2(r1)
    76          ld r2,PACATOC(r13)
    77          mfcr r12
    78          li r11,0
    79          /* Can we avoid saving r3-r8 in common case? */
    80          std r3,GPR3(r1)
    81          std r4,GPR4(r1)
    82          std r5,GPR5(r1)
    83          std r6,GPR6(r1)
    84          std r7,GPR7(r1)
    85          std r8,GPR8(r1)
    86          /* Zero r9-r12, this should only be required when restoring all GPRs */
    87          std r11,GPR9(r1)
    88          std r11,GPR10(r1)
    89          std r11,GPR11(r1)
    90          std r11,GPR12(r1)
    91          std r9,GPR13(r1)
    92          SAVE_NVGPRS(r1)
    93          std r11,_XER(r1)
    94          std r11,_LINK(r1)
    95          std r11,_CTR(r1)
    96
    97          li r11,0xc00
    98          std r11,_TRAP(r1)
    99          std r12,_CCR(r1)
   100          std r3,ORIG_GPR3(r1)
   101          addi r10,r1,STACK_FRAME_OVERHEAD
   102          ld r11,exception_marker@toc(r2)
   103          std r11,-16(r10) /* "regshere" marker */
   104
   105          /*
   106           * RECONCILE_IRQ_STATE without calling trace_hardirqs_off(), which
   107           * would clobber syscall parameters. Also we always enter with IRQs
   108           * enabled and nothing pending. system_call_exception() will call
   109           * trace_hardirqs_off().
   110           *
   111           * scv enters with MSR[EE]=1, so don't set PACA_IRQ_HARD_DIS.
   112           */
   113          li r9,IRQS_ALL_DISABLED
   114          stb r9,PACAIRQSOFTMASK(r13)
   115
   116          /* Calling convention has r9 = orig r0, r10 = regs */
   117          mr r9,r0
   118          bl system_call_exception
   119
   120  .Lsyscall_vectored_exit:
   121          addi r4,r1,STACK_FRAME_OVERHEAD
   122          li r5,1 /* scv */
   123          bl syscall_exit_prepare
   124
   125          ld r2,_CCR(r1)
   126          ld r4,_NIP(r1)
   127          ld r5,_MSR(r1)
   128
   129  BEGIN_FTR_SECTION
   130          stdcx. r0,0,r1 /* to clear the reservation */
   131  END_FTR_SECTION_IFCLR(CPU_FTR_STCX_CHECKS_ADDRESS)
   132
   133          mtlr r4
   134          mtctr r5
   135
   136          cmpdi r3,0
   137          bne syscall_vectored_restore_regs
   138          li r0,0
   139          li r4,0
   140          li r5,0
   141          li r6,0
   142          li r7,0
   143          li r8,0
   144          li r9,
Re: [PATCH 1/2] powerpc: fix hardware PMU exception bug on PowerVM compatibility mode systems
Michael Ellerman writes:
> From: "Desnes A. Nunes do Rosario"
>
> PowerVM systems running compatibility mode on a few Power8 revisions are
> still vulnerable to the hardware defect that loses PMU exceptions arriving
> prior to a context switch.
>
> The software fix for this issue is enabled through the CPU_FTR_PMAO_BUG
> cpu_feature bit, nevertheless this bit also needs to be set for PowerVM
> compatibility mode systems.
>
> Fixes: 68f2f0d431d9ea4 ("powerpc: Add a cpu feature CPU_FTR_PMAO_BUG")
> Signed-off-by: Desnes A. Nunes do Rosario
> Reviewed-by: Leonardo Bras
> Signed-off-by: Michael Ellerman
> Link: https://lore.kernel.org/r/20200227134715.9715-1-desn...@linux.ibm.com
> ---
>  arch/powerpc/kernel/cputable.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)

Ignore, PEBKAC. Don't try to operate git-send-email after 10pm.

cheers

> diff --git a/arch/powerpc/kernel/cputable.c b/arch/powerpc/kernel/cputable.c
> index e745abc5457a..245be4fafe13 100644
> --- a/arch/powerpc/kernel/cputable.c
> +++ b/arch/powerpc/kernel/cputable.c
> @@ -2193,11 +2193,13 @@ static struct cpu_spec * __init setup_cpu_spec(unsigned long offset,
>  		 * oprofile_cpu_type already has a value, then we are
>  		 * possibly overriding a real PVR with a logical one,
>  		 * and, in that case, keep the current value for
> -		 * oprofile_cpu_type.
> +		 * oprofile_cpu_type. Futhermore, let's ensure that the
> +		 * fix for the PMAO bug is enabled on compatibility mode.
>  		 */
>  		if (old.oprofile_cpu_type != NULL) {
>  			t->oprofile_cpu_type = old.oprofile_cpu_type;
>  			t->oprofile_type = old.oprofile_type;
> +			t->cpu_features |= old.cpu_features & CPU_FTR_PMAO_BUG;
>  		}
>  	}
>
> --
> 2.21.1
[PATCH] powerpc/kuap: PPC_KUAP_DEBUG should depend on PPC_KUAP
Currently you can enable PPC_KUAP_DEBUG when PPC_KUAP is disabled, even
though the former has no effect without the latter.

Fix it so that PPC_KUAP_DEBUG can only be enabled when PPC_KUAP is
enabled, not when the platform could support KUAP (PPC_HAVE_KUAP).

Fixes: 890274c2dc4c ("powerpc/64s: Implement KUAP for Radix MMU")
Signed-off-by: Michael Ellerman
---
 arch/powerpc/platforms/Kconfig.cputype | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/platforms/Kconfig.cputype b/arch/powerpc/platforms/Kconfig.cputype
index 6caedc88474f..6cd4e3240ec6 100644
--- a/arch/powerpc/platforms/Kconfig.cputype
+++ b/arch/powerpc/platforms/Kconfig.cputype
@@ -397,7 +397,7 @@ config PPC_KUAP
 config PPC_KUAP_DEBUG
 	bool "Extra debugging for Kernel Userspace Access Protection"
-	depends on PPC_HAVE_KUAP && (PPC_RADIX_MMU || PPC_32)
+	depends on PPC_KUAP && (PPC_RADIX_MMU || PPC_32)
 	help
 	  Add extra debugging for Kernel Userspace Access Protection (KUAP)
 	  If you're unsure, say N.
--
2.21.1
[PATCH 1/2] powerpc: fix hardware PMU exception bug on PowerVM compatibility mode systems
From: "Desnes A. Nunes do Rosario"

PowerVM systems running compatibility mode on a few Power8 revisions are
still vulnerable to the hardware defect that loses PMU exceptions arriving
prior to a context switch.

The software fix for this issue is enabled through the CPU_FTR_PMAO_BUG
cpu_feature bit, nevertheless this bit also needs to be set for PowerVM
compatibility mode systems.

Fixes: 68f2f0d431d9ea4 ("powerpc: Add a cpu feature CPU_FTR_PMAO_BUG")
Signed-off-by: Desnes A. Nunes do Rosario
Reviewed-by: Leonardo Bras
Signed-off-by: Michael Ellerman
Link: https://lore.kernel.org/r/20200227134715.9715-1-desn...@linux.ibm.com
---
 arch/powerpc/kernel/cputable.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/cputable.c b/arch/powerpc/kernel/cputable.c
index e745abc5457a..245be4fafe13 100644
--- a/arch/powerpc/kernel/cputable.c
+++ b/arch/powerpc/kernel/cputable.c
@@ -2193,11 +2193,13 @@ static struct cpu_spec * __init setup_cpu_spec(unsigned long offset,
 		 * oprofile_cpu_type already has a value, then we are
 		 * possibly overriding a real PVR with a logical one,
 		 * and, in that case, keep the current value for
-		 * oprofile_cpu_type.
+		 * oprofile_cpu_type. Futhermore, let's ensure that the
+		 * fix for the PMAO bug is enabled on compatibility mode.
 		 */
 		if (old.oprofile_cpu_type != NULL) {
 			t->oprofile_cpu_type = old.oprofile_cpu_type;
 			t->oprofile_type = old.oprofile_type;
+			t->cpu_features |= old.cpu_features & CPU_FTR_PMAO_BUG;
 		}
 	}

--
2.21.1
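
For readers skimming the hunk above, here is a rough userspace sketch of what the added line does: when the logical (compatibility mode) spec replaces the one found for the real PVR, only the PMAO bug bit is carried over from the old feature mask, and nothing else is inherited. The struct, the function name and the bit values below are made up for illustration; the real definitions are in the kernel's cputable code and are not reproduced here.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit positions, for illustration only. */
#define DEMO_FTR_PMAO_BUG	(UINT64_C(1) << 3)
#define DEMO_FTR_OTHER		(UINT64_C(1) << 5)

/* Cut-down stand-in for struct cpu_spec (assumption, not the kernel type). */
struct demo_cpu_spec {
	uint64_t cpu_features;
	const char *oprofile_cpu_type;
};

/*
 * Mirrors the shape of the patched block: if the previous spec already had
 * an oprofile_cpu_type, keep it, and OR in the PMAO bug bit only if the
 * previous spec had it set. Other bits of old->cpu_features are ignored.
 */
static void demo_apply_compat(struct demo_cpu_spec *t,
			      const struct demo_cpu_spec *old)
{
	if (old->oprofile_cpu_type != NULL) {
		t->oprofile_cpu_type = old->oprofile_cpu_type;
		t->cpu_features |= old->cpu_features & DEMO_FTR_PMAO_BUG;
	}
}
```

Masking with the single bit before the OR is what keeps the rest of the old feature set from leaking into the compatibility mode spec.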