Re: [PATCH v6 1/4] PCI: Remove pci_get_legacy_ide_irq and asm-generic/pci.h

2022-07-23 Thread Stafford Horne
On Fri, Jul 22, 2022 at 06:38:21PM -0500, Bjorn Helgaas wrote:
> On Sat, Jul 23, 2022 at 06:49:41AM +0900, Stafford Horne wrote:
> > The definition of the pci header function pci_get_legacy_ide_irq is only
> > used in platforms that support PNP.  So many of the architectures where
> > it is defined do not use it.  This also means we can remove
> > asm-generic/pci.h as all it provides is a definition of
> > pci_get_legacy_ide_irq.
> > 
> > Where referenced, replace the usage of pci_get_legacy_ide_irq with the
> > libata.h macros ATA_PRIMARY_IRQ and ATA_SECONDARY_IRQ which provide the
> > same functionality.  This allows removing pci_get_legacy_ide_irq from
> > headers where it is no longer used.
> > 
> > Acked-by: Geert Uytterhoeven 
> > Acked-by: Pierre Morel 
> > Acked-by: Rafael J. Wysocki 
> > Reviewed-by: Christoph Hellwig 
> > Co-developed-by: Arnd Bergmann 
> > Signed-off-by: Arnd Bergmann 
> > Signed-off-by: Stafford Horne 
> 
> I applied all 4 patches in this series to pci/header-cleanup-immutable
> for v5.20.
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci.git/log/?h=pci/header-cleanup-immutable

Thank you,

Sorry, the 0/4 cover letter is here.

  https://lore.kernel.org/lkml/20220722214944.831438-1-sho...@gmail.com/

I hadn't CC'd you because I was using ./scripts/get_maintainer.pl to maintain the
CCs.  Maybe patching MAINTAINERS like the following could help keep you CC'd on all
things PCI?  But maybe that would be too much; nevertheless, I'll make sure you
are CC'd on PCI-related patches, including cover letters, in the future.

diff --git a/MAINTAINERS b/MAINTAINERS
index f313862b2929..b64cd6bbb34f 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -15552,6 +15552,8 @@ F:  include/linux/of_pci.h
 F: include/linux/pci*
 F: include/uapi/linux/pci*
 F: lib/pci*
+K: pci
+N: pci

Palmer, we now have a branch you can use for your RISC-V for-next.  Does that
work?

-Stafford

> > ---
> >  arch/alpha/include/asm/pci.h   |  6 --
> >  arch/arm/include/asm/pci.h |  5 -
> >  arch/arm64/include/asm/pci.h   |  6 --
> >  arch/csky/include/asm/pci.h|  6 --
> >  arch/ia64/include/asm/pci.h|  6 --
> >  arch/m68k/include/asm/pci.h|  2 --
> >  arch/mips/include/asm/pci.h|  6 --
> >  arch/parisc/include/asm/pci.h  |  5 -
> >  arch/powerpc/include/asm/pci.h |  1 -
> >  arch/riscv/include/asm/pci.h   |  6 --
> >  arch/s390/include/asm/pci.h|  1 -
> >  arch/sh/include/asm/pci.h  |  6 --
> >  arch/sparc/include/asm/pci.h   |  9 -
> >  arch/um/include/asm/pci.h  |  8 
> >  arch/x86/include/asm/pci.h |  3 ---
> >  arch/xtensa/include/asm/pci.h  |  3 ---
> >  drivers/pnp/resource.c |  5 +++--
> >  include/asm-generic/pci.h  | 17 -
> >  18 files changed, 3 insertions(+), 98 deletions(-)
> >  delete mode 100644 include/asm-generic/pci.h
> > 
> > diff --git a/arch/alpha/include/asm/pci.h b/arch/alpha/include/asm/pci.h
> > index cf6bc1e64d66..6312656279d7 100644
> > --- a/arch/alpha/include/asm/pci.h
> > +++ b/arch/alpha/include/asm/pci.h
> > @@ -56,12 +56,6 @@ struct pci_controller {
> >  
> >  /* IOMMU controls.  */
> >  
> > -/* TODO: integrate with include/asm-generic/pci.h ? */
> > -static inline int pci_get_legacy_ide_irq(struct pci_dev *dev, int channel)
> > -{
> > -   return channel ? 15 : 14;
> > -}
> > -
> >  #define pci_domain_nr(bus) ((struct pci_controller *)(bus)->sysdata)->index
> >  
> >  static inline int pci_proc_domain(struct pci_bus *bus)
> > diff --git a/arch/arm/include/asm/pci.h b/arch/arm/include/asm/pci.h
> > index 68e6f25784a4..5916b88d4c94 100644
> > --- a/arch/arm/include/asm/pci.h
> > +++ b/arch/arm/include/asm/pci.h
> > @@ -22,11 +22,6 @@ static inline int pci_proc_domain(struct pci_bus *bus)
> >  #define HAVE_PCI_MMAP
> >  #define ARCH_GENERIC_PCI_MMAP_RESOURCE
> >  
> > -static inline int pci_get_legacy_ide_irq(struct pci_dev *dev, int channel)
> > -{
> > -   return channel ? 15 : 14;
> > -}
> > -
> >  extern void pcibios_report_status(unsigned int status_mask, int warn);
> >  
> >  #endif /* __KERNEL__ */
> > diff --git a/arch/arm64/include/asm/pci.h b/arch/arm64/include/asm/pci.h
> > index b33ca260e3c9..0aebc3488c32 100644
> > --- a/arch/arm64/include/asm/pci.h
> > +++ b/arch/arm64/include/asm/pci.h
> > @@ -23,12 +23,6 @@
> >  extern int isa_dma_bridge_buggy;
> >  
> >  #ifdef CONFIG_PCI
> > -static inline int pci_get_legacy_ide_irq(struct pci_dev *dev, int channel)
> > -{
> > -   /* no legacy IRQ on arm64 */
> > -   return -ENODEV;
> > -}
> > -
> >  static inline int pci_proc_domain(struct pci_bus *bus)
> >  {
> > return 1;
> > diff --git a/arch/csky/include/asm/pci.h b/arch/csky/include/asm/pci.h
> > index ebc765b1f78b..0535f1aaae38 100644
> > --- a/arch/csky/include/asm/pci.h
> > +++ b/arch/csky/include/asm/pci.h
> > @@ -18,12 +18,6 @@
> >  extern int isa_dma_bridge_buggy;
> >  
> >  #ifdef 

Re: [PATCH v6 1/4] PCI: Remove pci_get_legacy_ide_irq and asm-generic/pci.h

2022-07-23 Thread Bjorn Helgaas
On Sat, Jul 23, 2022 at 06:49:41AM +0900, Stafford Horne wrote:
> The definition of the pci header function pci_get_legacy_ide_irq is only
> used in platforms that support PNP.  So many of the architectures where
> it is defined do not use it.  This also means we can remove
> asm-generic/pci.h as all it provides is a definition of
> pci_get_legacy_ide_irq.
> 
> Where referenced, replace the usage of pci_get_legacy_ide_irq with the
> libata.h macros ATA_PRIMARY_IRQ and ATA_SECONDARY_IRQ which provide the
> same functionality.  This allows removing pci_get_legacy_ide_irq from
> headers where it is no longer used.
> 
> Acked-by: Geert Uytterhoeven 
> Acked-by: Pierre Morel 
> Acked-by: Rafael J. Wysocki 
> Reviewed-by: Christoph Hellwig 
> Co-developed-by: Arnd Bergmann 
> Signed-off-by: Arnd Bergmann 
> Signed-off-by: Stafford Horne 

I applied all 4 patches in this series to pci/header-cleanup-immutable
for v5.20.

https://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci.git/log/?h=pci/header-cleanup-immutable

> ---
>  arch/alpha/include/asm/pci.h   |  6 --
>  arch/arm/include/asm/pci.h |  5 -
>  arch/arm64/include/asm/pci.h   |  6 --
>  arch/csky/include/asm/pci.h|  6 --
>  arch/ia64/include/asm/pci.h|  6 --
>  arch/m68k/include/asm/pci.h|  2 --
>  arch/mips/include/asm/pci.h|  6 --
>  arch/parisc/include/asm/pci.h  |  5 -
>  arch/powerpc/include/asm/pci.h |  1 -
>  arch/riscv/include/asm/pci.h   |  6 --
>  arch/s390/include/asm/pci.h|  1 -
>  arch/sh/include/asm/pci.h  |  6 --
>  arch/sparc/include/asm/pci.h   |  9 -
>  arch/um/include/asm/pci.h  |  8 
>  arch/x86/include/asm/pci.h |  3 ---
>  arch/xtensa/include/asm/pci.h  |  3 ---
>  drivers/pnp/resource.c |  5 +++--
>  include/asm-generic/pci.h  | 17 -
>  18 files changed, 3 insertions(+), 98 deletions(-)
>  delete mode 100644 include/asm-generic/pci.h
> 
> diff --git a/arch/alpha/include/asm/pci.h b/arch/alpha/include/asm/pci.h
> index cf6bc1e64d66..6312656279d7 100644
> --- a/arch/alpha/include/asm/pci.h
> +++ b/arch/alpha/include/asm/pci.h
> @@ -56,12 +56,6 @@ struct pci_controller {
>  
>  /* IOMMU controls.  */
>  
> -/* TODO: integrate with include/asm-generic/pci.h ? */
> -static inline int pci_get_legacy_ide_irq(struct pci_dev *dev, int channel)
> -{
> - return channel ? 15 : 14;
> -}
> -
>  #define pci_domain_nr(bus) ((struct pci_controller *)(bus)->sysdata)->index
>  
>  static inline int pci_proc_domain(struct pci_bus *bus)
> diff --git a/arch/arm/include/asm/pci.h b/arch/arm/include/asm/pci.h
> index 68e6f25784a4..5916b88d4c94 100644
> --- a/arch/arm/include/asm/pci.h
> +++ b/arch/arm/include/asm/pci.h
> @@ -22,11 +22,6 @@ static inline int pci_proc_domain(struct pci_bus *bus)
>  #define HAVE_PCI_MMAP
>  #define ARCH_GENERIC_PCI_MMAP_RESOURCE
>  
> -static inline int pci_get_legacy_ide_irq(struct pci_dev *dev, int channel)
> -{
> - return channel ? 15 : 14;
> -}
> -
>  extern void pcibios_report_status(unsigned int status_mask, int warn);
>  
>  #endif /* __KERNEL__ */
> diff --git a/arch/arm64/include/asm/pci.h b/arch/arm64/include/asm/pci.h
> index b33ca260e3c9..0aebc3488c32 100644
> --- a/arch/arm64/include/asm/pci.h
> +++ b/arch/arm64/include/asm/pci.h
> @@ -23,12 +23,6 @@
>  extern int isa_dma_bridge_buggy;
>  
>  #ifdef CONFIG_PCI
> -static inline int pci_get_legacy_ide_irq(struct pci_dev *dev, int channel)
> -{
> - /* no legacy IRQ on arm64 */
> - return -ENODEV;
> -}
> -
>  static inline int pci_proc_domain(struct pci_bus *bus)
>  {
>   return 1;
> diff --git a/arch/csky/include/asm/pci.h b/arch/csky/include/asm/pci.h
> index ebc765b1f78b..0535f1aaae38 100644
> --- a/arch/csky/include/asm/pci.h
> +++ b/arch/csky/include/asm/pci.h
> @@ -18,12 +18,6 @@
>  extern int isa_dma_bridge_buggy;
>  
>  #ifdef CONFIG_PCI
> -static inline int pci_get_legacy_ide_irq(struct pci_dev *dev, int channel)
> -{
> - /* no legacy IRQ on csky */
> - return -ENODEV;
> -}
> -
>  static inline int pci_proc_domain(struct pci_bus *bus)
>  {
>   /* always show the domain in /proc */
> diff --git a/arch/ia64/include/asm/pci.h b/arch/ia64/include/asm/pci.h
> index 8c163d1d0189..fa8f545c24c9 100644
> --- a/arch/ia64/include/asm/pci.h
> +++ b/arch/ia64/include/asm/pci.h
> @@ -63,10 +63,4 @@ static inline int pci_proc_domain(struct pci_bus *bus)
>   return (pci_domain_nr(bus) != 0);
>  }
>  
> -#define HAVE_ARCH_PCI_GET_LEGACY_IDE_IRQ
> -static inline int pci_get_legacy_ide_irq(struct pci_dev *dev, int channel)
> -{
> - return channel ? isa_irq_to_vector(15) : isa_irq_to_vector(14);
> -}
> -
>  #endif /* _ASM_IA64_PCI_H */
> diff --git a/arch/m68k/include/asm/pci.h b/arch/m68k/include/asm/pci.h
> index 5a4bc223743b..ccdfa0dc8413 100644
> --- a/arch/m68k/include/asm/pci.h
> +++ b/arch/m68k/include/asm/pci.h
> @@ -2,8 +2,6 @@
>  #ifndef _ASM_M68K_PCI_H
>  #define _ASM_M68K_PCI_H
>  
> -#include <asm-generic/pci.h>
> -
>  #define 

[PATCH v6 2/4] PCI: Move isa_dma_bridge_buggy out of dma.h

2022-07-23 Thread Stafford Horne
During recent PCI cleanups we noticed that the isa_dma_bridge_buggy
symbol supported by all architectures is actually only used for x86_32.

This patch moves the symbol out of all architectures limiting usage to
only x86_32.  This is possible because only x86_32 platforms or quirks
existing in PCI devices supported on x86_32 ever set this.  A new global
header linux/isa-dma.h is added to provide a common place to maintain
the definition.
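
As a rough illustration (not part of the patch), the shared-header pattern
described above can be sketched in plain C.  The conditional is modelled on
the description in this message; CONFIG_PCI and CONFIG_X86_32 are normally
provided by Kconfig and are left undefined here, and
isa_dma_needs_workaround() is a hypothetical caller invented for the example.

```c
#include <assert.h>

/*
 * Sketch of the pattern: only PCI-enabled x86_32 builds see a real
 * variable; every other configuration gets a constant 0, so the compiler
 * can drop the dead branches in callers.
 */
#if defined(CONFIG_PCI) && defined(CONFIG_X86_32)
extern int isa_dma_bridge_buggy;
#else
#define isa_dma_bridge_buggy	(0)
#endif

/* Hypothetical caller, loosely modelled on an ISA DMA user. */
int isa_dma_needs_workaround(void)
{
	return isa_dma_bridge_buggy ? 1 : 0;
}

/* With neither CONFIG_* macro defined, the check folds to 0. */
void demo_isa_dma(void)
{
	assert(isa_dma_needs_workaround() == 0);
}
```

Keeping a single definition in one header means an architecture that never
sets the flag no longer has to carry its own copy of the #ifdef block.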

Suggested-by: Arnd Bergmann 
Suggested-by: Christoph Hellwig 
Acked-by: Geert Uytterhoeven 
Reviewed-by: Christoph Hellwig 
Signed-off-by: Stafford Horne 
---
 arch/alpha/include/asm/dma.h   |  9 -
 arch/arc/include/asm/dma.h |  5 -
 arch/arm/include/asm/dma.h |  6 --
 arch/arm64/include/asm/pci.h   |  2 --
 arch/csky/include/asm/pci.h|  2 --
 arch/ia64/include/asm/dma.h|  2 --
 arch/m68k/include/asm/dma.h|  6 --
 arch/microblaze/include/asm/dma.h  |  6 --
 arch/mips/include/asm/dma.h|  8 
 arch/parisc/include/asm/dma.h  |  6 --
 arch/powerpc/include/asm/dma.h |  6 --
 arch/riscv/include/asm/pci.h   |  2 --
 arch/s390/include/asm/dma.h|  6 --
 arch/sh/include/asm/dma.h  |  6 --
 arch/sparc/include/asm/dma.h   |  8 
 arch/um/include/asm/pci.h  |  2 --
 arch/x86/include/asm/dma.h |  8 
 arch/xtensa/include/asm/dma.h  |  7 ---
 drivers/comedi/drivers/comedi_isadma.c |  2 +-
 drivers/pci/pci.c  |  2 ++
 drivers/pci/quirks.c   |  4 +++-
 include/linux/isa-dma.h| 14 ++
 sound/core/isadma.c|  2 +-
 23 files changed, 21 insertions(+), 100 deletions(-)
 create mode 100644 include/linux/isa-dma.h

diff --git a/arch/alpha/include/asm/dma.h b/arch/alpha/include/asm/dma.h
index 28610ea7786d..a04d76b96089 100644
--- a/arch/alpha/include/asm/dma.h
+++ b/arch/alpha/include/asm/dma.h
@@ -365,13 +365,4 @@ extern void free_dma(unsigned int dmanr);  /* release it again */
 #define KERNEL_HAVE_CHECK_DMA
 extern int check_dma(unsigned int dmanr);
 
-/* From PCI */
-
-#ifdef CONFIG_PCI
-extern int isa_dma_bridge_buggy;
-#else
-#define isa_dma_bridge_buggy   (0)
-#endif
-
-
 #endif /* _ASM_DMA_H */
diff --git a/arch/arc/include/asm/dma.h b/arch/arc/include/asm/dma.h
index 5b744f4b10a7..02431027ed2f 100644
--- a/arch/arc/include/asm/dma.h
+++ b/arch/arc/include/asm/dma.h
@@ -7,10 +7,5 @@
 #define ASM_ARC_DMA_H
 
 #define MAX_DMA_ADDRESS 0xC000
-#ifdef CONFIG_PCI
-extern int isa_dma_bridge_buggy;
-#else
-#define isa_dma_bridge_buggy   0
-#endif
 
 #endif
diff --git a/arch/arm/include/asm/dma.h b/arch/arm/include/asm/dma.h
index a81dda65c576..907d139be431 100644
--- a/arch/arm/include/asm/dma.h
+++ b/arch/arm/include/asm/dma.h
@@ -143,10 +143,4 @@ extern int  get_dma_residue(unsigned int chan);
 
 #endif /* CONFIG_ISA_DMA_API */
 
-#ifdef CONFIG_PCI
-extern int isa_dma_bridge_buggy;
-#else
-#define isa_dma_bridge_buggy   (0)
-#endif
-
 #endif /* __ASM_ARM_DMA_H */
diff --git a/arch/arm64/include/asm/pci.h b/arch/arm64/include/asm/pci.h
index 0aebc3488c32..682c922b5658 100644
--- a/arch/arm64/include/asm/pci.h
+++ b/arch/arm64/include/asm/pci.h
@@ -20,8 +20,6 @@
 #define arch_can_pci_mmap_wc() 1
 #define ARCH_GENERIC_PCI_MMAP_RESOURCE 1
 
-extern int isa_dma_bridge_buggy;
-
 #ifdef CONFIG_PCI
 static inline int pci_proc_domain(struct pci_bus *bus)
 {
diff --git a/arch/csky/include/asm/pci.h b/arch/csky/include/asm/pci.h
index 0535f1aaae38..5c02454ec724 100644
--- a/arch/csky/include/asm/pci.h
+++ b/arch/csky/include/asm/pci.h
@@ -15,8 +15,6 @@
 /* C-SKY shim does not initialize PCI bus */
 #define pcibios_assign_all_busses() 1
 
-extern int isa_dma_bridge_buggy;
-
 #ifdef CONFIG_PCI
 static inline int pci_proc_domain(struct pci_bus *bus)
 {
diff --git a/arch/ia64/include/asm/dma.h b/arch/ia64/include/asm/dma.h
index 59625e9c1f9c..eaed2626ffda 100644
--- a/arch/ia64/include/asm/dma.h
+++ b/arch/ia64/include/asm/dma.h
@@ -12,8 +12,6 @@
 
 extern unsigned long MAX_DMA_ADDRESS;
 
-extern int isa_dma_bridge_buggy;
-
 #define free_dma(x)
 
 #endif /* _ASM_IA64_DMA_H */
diff --git a/arch/m68k/include/asm/dma.h b/arch/m68k/include/asm/dma.h
index f6c5e0dfb4e5..1c8d9c5bc2fa 100644
--- a/arch/m68k/include/asm/dma.h
+++ b/arch/m68k/include/asm/dma.h
@@ -6,10 +6,4 @@
bootmem allocator (but this should do it for this) */
 #define MAX_DMA_ADDRESS PAGE_OFFSET
 
-#ifdef CONFIG_PCI
-extern int isa_dma_bridge_buggy;
-#else
-#define isa_dma_bridge_buggy   (0)
-#endif
-
 #endif /* _M68K_DMA_H */
diff --git a/arch/microblaze/include/asm/dma.h b/arch/microblaze/include/asm/dma.h
index f801582be912..7484c9eb66c4 100644
--- a/arch/microblaze/include/asm/dma.h
+++ b/arch/microblaze/include/asm/dma.h
@@ -9,10 +9,4 @@
 /* Virtual address corresponding to last available physical memory 

[PATCH v6 1/4] PCI: Remove pci_get_legacy_ide_irq and asm-generic/pci.h

2022-07-23 Thread Stafford Horne
The definition of the pci header function pci_get_legacy_ide_irq is only
used in platforms that support PNP.  So many of the architectures where
it is defined do not use it.  This also means we can remove
asm-generic/pci.h as all it provides is a definition of
pci_get_legacy_ide_irq.

Where referenced, replace the usage of pci_get_legacy_ide_irq with the
libata.h macros ATA_PRIMARY_IRQ and ATA_SECONDARY_IRQ which provide the
same functionality.  This allows removing pci_get_legacy_ide_irq from
headers where it is no longer used.
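
As a rough illustration (not part of the patch), here is a minimal userspace
sketch of the replacement.  The macro values are assumed to mirror the
generic fallback in libata.h (14 and 15); struct pci_dev is an empty
stand-in, and legacy_ide_irq_in_use() is a hypothetical helper rather than
the actual code in drivers/pnp/resource.c.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for the kernel's struct pci_dev; only needed for signatures. */
struct pci_dev { int unused; };

/* Assumed to mirror the generic definitions in linux/libata.h. */
#define ATA_PRIMARY_IRQ(dev)	14
#define ATA_SECONDARY_IRQ(dev)	15

/* The helper being removed, as most architectures in the patch define it. */
static inline int pci_get_legacy_ide_irq(struct pci_dev *dev, int channel)
{
	(void)dev;
	return channel ? 15 : 14;
}

/* Hypothetical caller: is irq one of the legacy IDE compatibility IRQs? */
static inline bool legacy_ide_irq_in_use(struct pci_dev *dev, int irq)
{
	(void)dev;
	return ATA_PRIMARY_IRQ(dev) == irq || ATA_SECONDARY_IRQ(dev) == irq;
}

/* The macros give the same answers as the removed helper. */
void demo_legacy_ide_irq(void)
{
	struct pci_dev dev = { 0 };

	assert(pci_get_legacy_ide_irq(&dev, 0) == ATA_PRIMARY_IRQ(&dev));
	assert(pci_get_legacy_ide_irq(&dev, 1) == ATA_SECONDARY_IRQ(&dev));
	assert(legacy_ide_irq_in_use(&dev, 14));
	assert(!legacy_ide_irq_in_use(&dev, 5));
}
```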

Acked-by: Geert Uytterhoeven 
Acked-by: Pierre Morel 
Acked-by: Rafael J. Wysocki 
Reviewed-by: Christoph Hellwig 
Co-developed-by: Arnd Bergmann 
Signed-off-by: Arnd Bergmann 
Signed-off-by: Stafford Horne 
---
 arch/alpha/include/asm/pci.h   |  6 --
 arch/arm/include/asm/pci.h |  5 -
 arch/arm64/include/asm/pci.h   |  6 --
 arch/csky/include/asm/pci.h|  6 --
 arch/ia64/include/asm/pci.h|  6 --
 arch/m68k/include/asm/pci.h|  2 --
 arch/mips/include/asm/pci.h|  6 --
 arch/parisc/include/asm/pci.h  |  5 -
 arch/powerpc/include/asm/pci.h |  1 -
 arch/riscv/include/asm/pci.h   |  6 --
 arch/s390/include/asm/pci.h|  1 -
 arch/sh/include/asm/pci.h  |  6 --
 arch/sparc/include/asm/pci.h   |  9 -
 arch/um/include/asm/pci.h  |  8 
 arch/x86/include/asm/pci.h |  3 ---
 arch/xtensa/include/asm/pci.h  |  3 ---
 drivers/pnp/resource.c |  5 +++--
 include/asm-generic/pci.h  | 17 -
 18 files changed, 3 insertions(+), 98 deletions(-)
 delete mode 100644 include/asm-generic/pci.h

diff --git a/arch/alpha/include/asm/pci.h b/arch/alpha/include/asm/pci.h
index cf6bc1e64d66..6312656279d7 100644
--- a/arch/alpha/include/asm/pci.h
+++ b/arch/alpha/include/asm/pci.h
@@ -56,12 +56,6 @@ struct pci_controller {
 
 /* IOMMU controls.  */
 
-/* TODO: integrate with include/asm-generic/pci.h ? */
-static inline int pci_get_legacy_ide_irq(struct pci_dev *dev, int channel)
-{
-   return channel ? 15 : 14;
-}
-
 #define pci_domain_nr(bus) ((struct pci_controller *)(bus)->sysdata)->index
 
 static inline int pci_proc_domain(struct pci_bus *bus)
diff --git a/arch/arm/include/asm/pci.h b/arch/arm/include/asm/pci.h
index 68e6f25784a4..5916b88d4c94 100644
--- a/arch/arm/include/asm/pci.h
+++ b/arch/arm/include/asm/pci.h
@@ -22,11 +22,6 @@ static inline int pci_proc_domain(struct pci_bus *bus)
 #define HAVE_PCI_MMAP
 #define ARCH_GENERIC_PCI_MMAP_RESOURCE
 
-static inline int pci_get_legacy_ide_irq(struct pci_dev *dev, int channel)
-{
-   return channel ? 15 : 14;
-}
-
 extern void pcibios_report_status(unsigned int status_mask, int warn);
 
 #endif /* __KERNEL__ */
diff --git a/arch/arm64/include/asm/pci.h b/arch/arm64/include/asm/pci.h
index b33ca260e3c9..0aebc3488c32 100644
--- a/arch/arm64/include/asm/pci.h
+++ b/arch/arm64/include/asm/pci.h
@@ -23,12 +23,6 @@
 extern int isa_dma_bridge_buggy;
 
 #ifdef CONFIG_PCI
-static inline int pci_get_legacy_ide_irq(struct pci_dev *dev, int channel)
-{
-   /* no legacy IRQ on arm64 */
-   return -ENODEV;
-}
-
 static inline int pci_proc_domain(struct pci_bus *bus)
 {
return 1;
diff --git a/arch/csky/include/asm/pci.h b/arch/csky/include/asm/pci.h
index ebc765b1f78b..0535f1aaae38 100644
--- a/arch/csky/include/asm/pci.h
+++ b/arch/csky/include/asm/pci.h
@@ -18,12 +18,6 @@
 extern int isa_dma_bridge_buggy;
 
 #ifdef CONFIG_PCI
-static inline int pci_get_legacy_ide_irq(struct pci_dev *dev, int channel)
-{
-   /* no legacy IRQ on csky */
-   return -ENODEV;
-}
-
 static inline int pci_proc_domain(struct pci_bus *bus)
 {
/* always show the domain in /proc */
diff --git a/arch/ia64/include/asm/pci.h b/arch/ia64/include/asm/pci.h
index 8c163d1d0189..fa8f545c24c9 100644
--- a/arch/ia64/include/asm/pci.h
+++ b/arch/ia64/include/asm/pci.h
@@ -63,10 +63,4 @@ static inline int pci_proc_domain(struct pci_bus *bus)
return (pci_domain_nr(bus) != 0);
 }
 
-#define HAVE_ARCH_PCI_GET_LEGACY_IDE_IRQ
-static inline int pci_get_legacy_ide_irq(struct pci_dev *dev, int channel)
-{
-   return channel ? isa_irq_to_vector(15) : isa_irq_to_vector(14);
-}
-
 #endif /* _ASM_IA64_PCI_H */
diff --git a/arch/m68k/include/asm/pci.h b/arch/m68k/include/asm/pci.h
index 5a4bc223743b..ccdfa0dc8413 100644
--- a/arch/m68k/include/asm/pci.h
+++ b/arch/m68k/include/asm/pci.h
@@ -2,8 +2,6 @@
 #ifndef _ASM_M68K_PCI_H
 #define _ASM_M68K_PCI_H
 
-#include <asm-generic/pci.h>
-
 #define pcibios_assign_all_busses() 1
 
 #define PCIBIOS_MIN_IO  0x0100
diff --git a/arch/mips/include/asm/pci.h b/arch/mips/include/asm/pci.h
index 9ffc8192adae..3fd6e22c108b 100644
--- a/arch/mips/include/asm/pci.h
+++ b/arch/mips/include/asm/pci.h
@@ -139,10 +139,4 @@ static inline int pci_proc_domain(struct pci_bus *bus)
 /* Do platform specific device initialization at pci_enable_device() time */
 extern int pcibios_plat_dev_init(struct pci_dev *dev);
 
-/* 

[PATCH linux-next] powerpc: init jump label early in ppc 64

2022-07-23 Thread zhouzhouyi
From: Zhouyi Zhou 

On ppc64, invoking jump_label_init() from setup_feature_keys() is too late
because a static key is already used in a subroutine of early_init_devtree().

So invoke jump_label_init() earlier, in early_setup().
We cannot move setup_feature_keys() earlier because its subroutine
cpu_feature_keys_init() depends on data structures initialized in
early_init_devtree().
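
As a rough illustration (not part of the patch), the ordering problem can be
modelled in userspace C.  Everything below is a stand-in: the functions only
borrow the kernel names, and boot() is a hypothetical driver that runs the
two orderings and counts the resulting warnings.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Userspace stand-ins; none of this is the kernel's jump label code. */
static bool jump_label_initialized;
static int early_warnings;

struct static_key { bool enabled; };

static void jump_label_init(void)
{
	jump_label_initialized = true;
}

static void static_key_enable(struct static_key *key)
{
	if (!jump_label_initialized) {
		early_warnings++;
		fprintf(stderr,
			"static key used before call to jump_label_init()\n");
	}
	key->enabled = true;
}

/* Models early parameter parsing enabling a key (threadirqs in the report). */
static void early_init_devtree(struct static_key *key)
{
	static_key_enable(key);
}

/* Returns the number of warnings seen for the given ordering. */
int boot(bool init_first)
{
	struct static_key key = { false };

	jump_label_initialized = false;
	early_warnings = 0;

	if (init_first)
		jump_label_init();	/* proposed ordering */
	early_init_devtree(&key);
	if (!init_first)
		jump_label_init();	/* old ordering: too late */

	return early_warnings;
}

void demo_boot_order(void)
{
	assert(boot(false) == 1);	/* old ordering warns once */
	assert(boot(true) == 0);	/* init first: no warning */
}
```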

Signed-off-by: Zhouyi Zhou 
---
Dear PPC developers

I found this bug when trying to run rcutorture tests in a ppc VM at the
Open Source Lab of Oregon State University.

qemu-system-ppc64 -nographic -smp cores=8,threads=1 -net none -M pseries \
  -nodefaults -device spapr-vscsi \
  -serial file:/home/ubuntu/linux-next/tools/testing/selftests/rcutorture/res/2022.07.19-01.18.42-torture/results-rcutorture/TREE03/console.log \
  -m 512 \
  -kernel /home/ubuntu/linux-next/tools/testing/selftests/rcutorture/res/2022.07.19-01.18.42-torture/results-rcutorture/TREE03/vmlinux \
  -append "debug_boot_weak_hash panic=-1 console=ttyS0 rcupdate.rcu_cpu_stall_suppress_at_boot=1 torture.disable_onoff_at_boot rcupdate.rcu_task_stall_timeout=3 rcutorture.onoff_interval=200 rcutorture.onoff_holdoff=30 rcutree.gp_preinit_delay=12 rcutree.gp_init_delay=3 rcutree.gp_cleanup_delay=3 rcutree.kthread_prio=2 threadirqs tree.use_softirq=0 rcutorture.n_barrier_cbs=4 rcutorture.stat_interval=15 rcutorture.shutdown_secs=420 rcutorture.test_no_idle_hz=1 rcutorture.verbose=1"

console.log reports the following WARN:
[0.00][T0] static_key_enable_cpuslocked(): static key '0xc2953260' used before call to jump_label_init()
[0.00][T0] WARNING: CPU: 0 PID: 0 at kernel/jump_label.c:166 static_key_enable_cpuslocked+0xfc/0x120
[0.00][T0] Modules linked in:
[0.00][T0] CPU: 0 PID: 0 Comm: swapper Not tainted 5.19.0-rc5-next-20220708-dirty #131
[0.00][T0] NIP:  c038068c LR: c0380688 CTR: c0186ac0
[0.00][T0] REGS: c2867930 TRAP: 0700   Not tainted  (5.19.0-rc5-next-20220708-dirty)
[0.00][T0] MSR:  80022003   CR: 24282224  XER: 2004
[0.00][T0] CFAR: 0730 IRQMASK: 1
[0.00][T0] GPR00: c0380688 c2867bd0 c2868d00 0065
[0.00][T0] GPR04: 0001  0080 000d
[0.00][T0] GPR08:   c27fd000 000f
[0.00][T0] GPR12: c0186ac0 c2082280 0003 000d
[0.00][T0] GPR16: 02cc00d0  c2082280 0001
[0.00][T0] GPR20: c2080942
[0.00][T0] GPR24:  c10d6168  c20034c8
[0.00][T0] GPR28: 0028  c2080942 c2953260
[0.00][T0] NIP [c038068c] static_key_enable_cpuslocked+0xfc/0x120
[0.00][T0] LR [c0380688] static_key_enable_cpuslocked+0xf8/0x120
[0.00][T0] Call Trace:
[0.00][T0] [c2867bd0] [c0380688] static_key_enable_cpuslocked+0xf8/0x120 (unreliable)
[0.00][T0] [c2867c40] [c0380810] static_key_enable+0x30/0x50
[0.00][T0] [c2867c70] [c2030314] setup_forced_irqthreads+0x28/0x40
[0.00][T0] [c2867c90] [c2003568] do_early_param+0xa0/0x108
[0.00][T0] [c2867d10] [c0175340] parse_args+0x290/0x4e0
[0.00][T0] [c2867e10] [c2003c74] parse_early_options+0x48/0x5c
[0.00][T0] [c2867e30] [c2003ce0] parse_early_param+0x58/0x84
[0.00][T0] [c2867e60] [c2009878] early_init_devtree+0xd4/0x518
[0.00][T0] [c2867f10] [c200aee0] early_setup+0xb4/0x214

After this fix, the WARN does not show again.

Kind Regards
Zhouyi
--
 arch/powerpc/kernel/setup_64.c | 13 -
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/setup_64.c b/arch/powerpc/kernel/setup_64.c
index 2b2d0b0fbb30..bf2fb76221da 100644
--- a/arch/powerpc/kernel/setup_64.c
+++ b/arch/powerpc/kernel/setup_64.c
@@ -365,6 +365,9 @@ void __init early_setup(unsigned long dt_ptr)
 
udbg_printf(" -> %s(), dt_ptr: 0x%lx\n", __func__, dt_ptr);
 
+   /* Initialise jump label because subsequent calls need it */
+   jump_label_init();
+
/*
 * Do early initialization using the flattened device
 * tree, such as retrieving the physical memory map or
@@ -394,8 +397,15 @@ void __init early_setup(unsigned long dt_ptr)
 
/* Apply all the dynamic patching */
apply_feature_fixups();
-   setup_feature_keys();
+
+   /*
+* 

[PATCH 0/2] lib/nodemask: inline wrappers around bitmap

2022-07-23 Thread Yury Norov
On top of g...@github.com:/norov/linux.git bitmap-for-next.

There are just 2 functions in nodemask.c, both are thin wrappers around
bitmap API. 1st patch of this series drops dependency on <asm/machdep.h> in archrandom.h
  lib/nodemask: inline next_node_in() and node_random()

 MAINTAINERS   |  1 -
 arch/powerpc/include/asm/archrandom.h |  9 +---
 arch/powerpc/kernel/setup-common.c| 11 ++
 include/linux/nodemask.h  | 27 +++-
 lib/Makefile  |  2 +-
 lib/nodemask.c| 30 ---
 6 files changed, 35 insertions(+), 45 deletions(-)
 delete mode 100644 lib/nodemask.c

-- 
2.34.1



[PATCH 1/2] powerpc: drop dependency on <asm/machdep.h> in archrandom.h

2022-07-23 Thread Yury Norov
archrandom.h includes <asm/machdep.h> to refer to ppc_md. This causes a
circular header dependency if generic nodemask.h includes random.h:

In file included from include/linux/cred.h:16,
 from include/linux/seq_file.h:13,
 from arch/powerpc/include/asm/machdep.h:6,
 from arch/powerpc/include/asm/archrandom.h:5,
 from include/linux/random.h:109,
 from include/linux/nodemask.h:97,
 from include/linux/list_lru.h:12,
 from include/linux/fs.h:13,
 from include/linux/compat.h:17,
 from arch/powerpc/kernel/asm-offsets.c:12:
include/linux/sched.h:1203:9: error: unknown type name 'nodemask_t'
 1203 | nodemask_t  mems_allowed;
  | ^~

Fix it by removing the <asm/machdep.h> dependency from archrandom.h.
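
As a rough illustration (not part of the patch), the decoupling amounts to
leaving only a prototype in the header and moving the body, together with its
need for the ops table, into a .c file.  The sketch below folds both halves
into one file; struct machdep_calls is reduced to a single member, and
fake_seed() is a hypothetical hook used only by the demo.

```c
#include <assert.h>
#include <stdbool.h>

/* What the slimmed-down header would expose: a prototype, no includes. */
bool arch_get_random_seed_long(unsigned long *v);

/* What stays private to the .c file: a machdep-style ops table. */
struct machdep_calls {
	int (*get_random_seed)(unsigned long *v);
};
static struct machdep_calls ppc_md;	/* stand-in for the kernel's ppc_md */

/* Out-of-line definition; only this file needs to know about ppc_md. */
bool arch_get_random_seed_long(unsigned long *v)
{
	if (ppc_md.get_random_seed)
		return ppc_md.get_random_seed(v);

	return false;
}

/* Hypothetical platform hook used only by the demo. */
static int fake_seed(unsigned long *v)
{
	*v = 42;
	return 1;
}

void demo_seed(void)
{
	unsigned long v = 0;

	assert(!arch_get_random_seed_long(&v));	/* no hook installed yet */
	ppc_md.get_random_seed = fake_seed;
	assert(arch_get_random_seed_long(&v) && v == 42);
}
```

The trade-off is one extra function call in exchange for a header that no
longer drags in the machdep definitions.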

Signed-off-by: Yury Norov 
---
 arch/powerpc/include/asm/archrandom.h |  9 +
 arch/powerpc/kernel/setup-common.c| 11 +++
 2 files changed, 12 insertions(+), 8 deletions(-)

diff --git a/arch/powerpc/include/asm/archrandom.h b/arch/powerpc/include/asm/archrandom.h
index 9a53e29680f4..21def59ef1a6 100644
--- a/arch/powerpc/include/asm/archrandom.h
+++ b/arch/powerpc/include/asm/archrandom.h
@@ -4,7 +4,7 @@
 
 #ifdef CONFIG_ARCH_RANDOM
 
-#include <asm/machdep.h>
+bool __must_check arch_get_random_seed_long(unsigned long *v);
 
 static inline bool __must_check arch_get_random_long(unsigned long *v)
 {
@@ -16,13 +16,6 @@ static inline bool __must_check arch_get_random_int(unsigned int *v)
return false;
 }
 
-static inline bool __must_check arch_get_random_seed_long(unsigned long *v)
-{
-   if (ppc_md.get_random_seed)
-   return ppc_md.get_random_seed(v);
-
-   return false;
-}
 
 static inline bool __must_check arch_get_random_seed_int(unsigned int *v)
 {
diff --git a/arch/powerpc/kernel/setup-common.c b/arch/powerpc/kernel/setup-common.c
index eb0077b302e2..18c5fa5918bf 100644
--- a/arch/powerpc/kernel/setup-common.c
+++ b/arch/powerpc/kernel/setup-common.c
@@ -171,6 +171,17 @@ EXPORT_SYMBOL_GPL(machine_power_off);
 void (*pm_power_off)(void);
 EXPORT_SYMBOL_GPL(pm_power_off);
 
+#ifdef CONFIG_ARCH_RANDOM
+bool __must_check arch_get_random_seed_long(unsigned long *v)
+{
+   if (ppc_md.get_random_seed)
+   return ppc_md.get_random_seed(v);
+
+   return false;
+}
+EXPORT_SYMBOL(arch_get_random_seed_long);
+#endif
+
 void machine_halt(void)
 {
machine_shutdown();
-- 
2.34.1



[RESEND PATCH 2/2] lib/nodemask: inline next_node_in() and node_random()

2022-07-23 Thread Yury Norov
The functions are pretty thin wrappers around the find_bit engine, and
keeping them in a C file prevents the compiler from applying the
small_const_nbits() optimization, which must take place for all systems with
MAX_NUMNODES less than BITS_PER_LONG (default is 16 for me).

Moving them to the header file doesn't blow up the kernel size:
add/remove: 1/2 grow/shrink: 9/5 up/down: 968/-88 (880)
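
As a rough illustration (not part of the patch), here is a userspace sketch
of the wrap-around lookup once it lives in a header as a static inline.  The
one-word nodemask_t, find_from() and the MAX_NUMNODES value of 16 are
simplifications assumed for the example; the kernel uses its bitmap API
instead.

```c
#include <assert.h>
#include <limits.h>

#define MAX_NUMNODES	16	/* assumed small, as in the commit message */

/* One-word mask; valid only while MAX_NUMNODES fits in an unsigned long. */
typedef struct { unsigned long bits; } nodemask_t;

_Static_assert(MAX_NUMNODES <= sizeof(unsigned long) * CHAR_BIT,
	       "mask must fit in one word");

/* First set bit at or after 'start', or MAX_NUMNODES if there is none. */
static inline unsigned int find_from(const nodemask_t *m, int start)
{
	for (int n = start; n < MAX_NUMNODES; n++)
		if (m->bits & (1UL << n))
			return (unsigned int)n;
	return MAX_NUMNODES;
}

static inline unsigned int first_node(const nodemask_t *m)
{
	return find_from(m, 0);
}

static inline unsigned int next_node(int n, const nodemask_t *m)
{
	return find_from(m, n + 1);
}

/* Same wrap-around logic as __next_node_in() in the patch. */
static inline unsigned int next_node_in(int node, const nodemask_t *m)
{
	unsigned int ret = next_node(node, m);

	if (ret == MAX_NUMNODES)
		ret = first_node(m);
	return ret;
}

void demo_next_node_in(void)
{
	nodemask_t m = { (1UL << 1) | (1UL << 5) };
	nodemask_t empty = { 0 };

	assert(next_node_in(1, &m) == 5);
	assert(next_node_in(5, &m) == 1);	/* wraps to the first node */
	assert(next_node_in(0, &empty) == MAX_NUMNODES);
}
```

Because the mask size is a compile-time constant that fits in one word, an
inlined version lets the compiler reduce the search to a few word operations.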

CC: Andy Shevchenko 
CC: Rasmus Villemoes 
Signed-off-by: Yury Norov 
---
 MAINTAINERS  |  1 -
 include/linux/nodemask.h | 27 ++-
 lib/Makefile |  2 +-
 lib/nodemask.c   | 30 --
 4 files changed, 23 insertions(+), 37 deletions(-)
 delete mode 100644 lib/nodemask.c

diff --git a/MAINTAINERS b/MAINTAINERS
index 7c0b8f28aa25..19c8d0ef1177 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -3540,7 +3540,6 @@ F:lib/bitmap.c
 F: lib/cpumask.c
 F: lib/find_bit.c
 F: lib/find_bit_benchmark.c
-F: lib/nodemask.c
 F: lib/test_bitmap.c
 F: tools/include/linux/bitmap.h
 F: tools/include/linux/find.h
diff --git a/include/linux/nodemask.h b/include/linux/nodemask.h
index 0f233b76c9ce..48ebe4007955 100644
--- a/include/linux/nodemask.h
+++ b/include/linux/nodemask.h
@@ -94,6 +94,7 @@
 #include 
 #include 
 #include 
+#include <linux/random.h>
 
 typedef struct { DECLARE_BITMAP(bits, MAX_NUMNODES); } nodemask_t;
 extern nodemask_t _unused_nodemask_arg_;
@@ -276,7 +277,14 @@ static inline unsigned int __next_node(int n, const nodemask_t *srcp)
  * the first node in src if needed.  Returns MAX_NUMNODES if src is empty.
  */
 #define next_node_in(n, src) __next_node_in((n), &(src))
-unsigned int __next_node_in(int node, const nodemask_t *srcp);
+static inline unsigned int __next_node_in(int node, const nodemask_t *srcp)
+{
+   unsigned int ret = __next_node(node, srcp);
+
+   if (ret == MAX_NUMNODES)
+   ret = __first_node(srcp);
+   return ret;
+}
 
 static inline void init_nodemask_of_node(nodemask_t *mask, int node)
 {
@@ -493,14 +501,23 @@ static inline int num_node_state(enum node_states state)
 
 #endif
 
+/*
+ * Return the bit number of a random bit set in the nodemask.
+ * (returns NUMA_NO_NODE if nodemask is empty)
+ */
+static inline int node_random(const nodemask_t *maskp)
+{
 #if defined(CONFIG_NUMA) && (MAX_NUMNODES > 1)
-extern int node_random(const nodemask_t *maskp);
+   int w, bit = NUMA_NO_NODE;
+
+   w = nodes_weight(*maskp);
+   if (w)
+   bit = find_nth_bit(maskp->bits, MAX_NUMNODES, get_random_int() % w);
+   return bit;
 #else
-static inline int node_random(const nodemask_t *mask)
-{
return 0;
-}
 #endif
+}
 
 #define node_online_map    node_states[N_ONLINE]
 #define node_possible_map  node_states[N_POSSIBLE]
diff --git a/lib/Makefile b/lib/Makefile
index f99bf61f8bbc..731cea0342d1 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -33,7 +33,7 @@ lib-y := ctype.o string.o vsprintf.o cmdline.o \
 flex_proportions.o ratelimit.o show_mem.o \
 is_single_threaded.o plist.o decompress.o kobject_uevent.o \
 earlycpio.o seq_buf.o siphash.o dec_and_lock.o \
-nmi_backtrace.o nodemask.o win_minmax.o memcat_p.o \
+nmi_backtrace.o win_minmax.o memcat_p.o \
 buildid.o
 
 lib-$(CONFIG_PRINTK) += dump_stack.o
diff --git a/lib/nodemask.c b/lib/nodemask.c
deleted file mode 100644
index 7dad4ce8ff59..
--- a/lib/nodemask.c
+++ /dev/null
@@ -1,30 +0,0 @@
-// SPDX-License-Identifier: GPL-2.0
-#include 
-#include 
-#include 
-
-unsigned int __next_node_in(int node, const nodemask_t *srcp)
-{
-   unsigned int ret = __next_node(node, srcp);
-
-   if (ret == MAX_NUMNODES)
-   ret = __first_node(srcp);
-   return ret;
-}
-EXPORT_SYMBOL(__next_node_in);
-
-#ifdef CONFIG_NUMA
-/*
- * Return the bit number of a random bit set in the nodemask.
- * (returns NUMA_NO_NODE if nodemask is empty)
- */
-int node_random(const nodemask_t *maskp)
-{
-   int w, bit = NUMA_NO_NODE;
-
-   w = nodes_weight(*maskp);
-   if (w)
-   bit = find_nth_bit(maskp->bits, MAX_NUMNODES, get_random_int() % w);
-   return bit;
-}
-#endif
-- 
2.34.1



Re: [PATCH] powerpc: Remove the static variable initialisations to 0

2022-07-23 Thread Segher Boessenkool
On Sat, Jul 23, 2022 at 03:34:05PM +0200, Michal Suchánek wrote:
> Hello,
> 
> On Sat, Jul 23, 2022 at 05:24:36PM +0800, Jason Wang wrote:
> > Initialising global and static variables to 0 is always unnecessary.
> > Remove the unnecessary initialisations.
> 
> Isn't this change also unnecessary?
> 
> Initializing to 0 does not affect correctness, or even any kind of
> semantics in any way.

It did make a difference when the kernel was still compiled with
-fcommon (which used to be the GCC default on most configurations, it is
traditional on Unix).  No explicit initialiser puts an object in .bss if
you use -fcommon.  This matters a bit for data layout.
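
A small C sketch of the two spellings under discussion.  My understanding,
which is an assumption about GCC behaviour rather than something spelled out
in this thread, is that with -fcommon the first object is emitted as a
common symbol while the second is a regular zero-initialised definition, and
that with -fno-common both are regular definitions.  The C semantics are the
same either way.  The variable names are made up for the example.

```c
#include <assert.h>

int counter_tentative;		/* tentative definition: no initialiser */
int counter_explicit = 0;	/* ordinary definition, explicitly zero */

/* Both objects read as 0 at program start, wherever they are placed. */
int initial_sum(void)
{
	return counter_tentative + counter_explicit;
}

void demo_zero_init(void)
{
	assert(counter_tentative == 0);
	assert(counter_explicit == 0);
	assert(initial_sum() == 0);
}
```

Comparing the assembly from gcc -S -fcommon with that from gcc -S -fno-common
on such a file should show where each object ends up.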

> The current code is slightly easier to understand.
> 
> And changing the code introduces history noise for no gain.

Yup.

This does give you some code golf points of course ;-)


Segher


Re: Regression: Linux v5.15+ does not boot on Freescale P2020

2022-07-23 Thread Pali Rohár
Hello,

On Saturday 23 July 2022 14:42:22 Christophe Leroy wrote:
> Hello,
> 
> Le 22/07/2022 à 11:09, Pali Rohár a écrit :
> > Hello!
> > 
> > Trying to boot mainline Linux kernel v5.15+, including current version
> > from master branch, on Freescale P2020 does not work. Kernel does not
> > print anything to serial console, seems that it does not work and after
> > timeout watchdog reset the board.
> 
> Can you provide more information ? Which defconfig or .config, which 
> version of gcc, etc ... ?

I used the default mpc85xx defconfig with gcc 8, compiling for e500
cores.

If you need the exact .config contents, I can send them during the week.

> > 
> > I ran git bisect and it found the following commit:
> > 
> > 9401f4e46cf6965e23738f70e149172344a01eef is the first bad commit
> > commit 9401f4e46cf6965e23738f70e149172344a01eef
> > Author: Christophe Leroy 
> > Date:   Tue Mar 2 08:48:11 2021 +
> > 
> >  powerpc: Use lwarx/ldarx directly instead of PPC_LWARX/LDARX macros
> > 
> >  Force the eh flag at 0 on PPC32.
> > 
> >  Signed-off-by: Christophe Leroy 
> >  Signed-off-by: Michael Ellerman 
> >  Link: https://lore.kernel.org/r/1fc81f07cabebb875b963e295408cc3dd38c8d85.1614674882.git.christophe.le...@csgroup.eu
> > 
> > :04 04 fe6747e45736dfcba74914a9445e5f70f5120600 96358d08b65d3200928a973efb5b969b3d45f2b0 M  arch
> > 
> > 
> > If I revert this commit then kernel boots correctly. It also boots fine
> > if I revert this commit on top of master branch.
> > 
> > Freescale P2020 has two 32-bit e500 powerpc cores.
> > 
> > Any idea why the above commit causes the kernel to crash, and why it is
> > needed? Could setting the eh flag to 0 cause a deadlock?
> 
> Setting the eh flag to 0 is not supposed to be a change introduced by 
> that commit. Indeed that commit is not supposed to change anything at 
> all in the generated code.

My understanding of that commit is that it changed the eh flag parameter
from 1 to 0 for 32-bit powerpc, including the P2020.

> Christophe
> 
> > 
> > I have looked into e500 Reference Manual for lwarx instruction (page 562)
> > https://www.nxp.com/files-static/32bit/doc/ref_manual/EREF_RM.pdf and
> > both 0 and 1 values for EH flag should be supported.
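
For reference when comparing disassembly before and after the bisected commit, the following sketch shows where the EH hint sits in the lwarx encoding. lwarx is an X-form instruction with primary opcode 31 and extended opcode 20, and EH occupies the least significant bit of the instruction word. The helper name `encode_lwarx` and the `SKETCH_` constant are hypothetical, and the sketch says nothing about which value the kernel actually emits on e500.

```c
#include <assert.h>
#include <stdint.h>

/* Base encoding of lwarx: (31 << 26) | (20 << 1). */
#define SKETCH_INST_LWARX 0x7c000028u

/* Encode "lwarx rt,ra,rb,eh"; EH is bit 0 of the instruction word. */
static uint32_t encode_lwarx(unsigned int rt, unsigned int ra,
			     unsigned int rb, unsigned int eh)
{
	assert(rt < 32 && ra < 32 && rb < 32 && eh < 2);
	return SKETCH_INST_LWARX | (rt << 21) | (ra << 16) | (rb << 11) | eh;
}
```

With this, `encode_lwarx(9, 0, 3, 0)` gives 0x7d201828 and `encode_lwarx(9, 0, 3, 1)` gives 0x7d201829, so a single-bit difference in the low bit of each lwarx word is what to look for in `objdump -d` output.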


Re: Regression: Linux v5.15+ does not boot on Freescale P2020

2022-07-23 Thread Christophe Leroy
Hello,

Le 22/07/2022 à 11:09, Pali Rohár a écrit :
> Hello!
> 
> Trying to boot mainline Linux kernel v5.15+, including the current version
> from the master branch, on Freescale P2020 does not work. The kernel does
> not print anything to the serial console; it appears to hang, and after a
> timeout the watchdog resets the board.

Can you provide more information ? Which defconfig or .config, which 
version of gcc, etc ... ?

> 
> I ran git bisect and it found the following commit:
> 
> 9401f4e46cf6965e23738f70e149172344a01eef is the first bad commit
> commit 9401f4e46cf6965e23738f70e149172344a01eef
> Author: Christophe Leroy 
> Date:   Tue Mar 2 08:48:11 2021 +
> 
>  powerpc: Use lwarx/ldarx directly instead of PPC_LWARX/LDARX macros
> 
>  Force the eh flag at 0 on PPC32.
> 
>  Signed-off-by: Christophe Leroy 
>  Signed-off-by: Michael Ellerman 
>  Link: https://lore.kernel.org/r/1fc81f07cabebb875b963e295408cc3dd38c8d85.1614674882.git.christophe.le...@csgroup.eu
> 
> :04 04 fe6747e45736dfcba74914a9445e5f70f5120600 96358d08b65d3200928a973efb5b969b3d45f2b0 M  arch
> 
> 
> If I revert this commit then kernel boots correctly. It also boots fine
> if I revert this commit on top of master branch.
> 
> Freescale P2020 has two 32-bit e500 powerpc cores.
> 
> Any idea why the above commit causes the kernel to crash, and why it is
> needed? Could setting the eh flag to 0 cause a deadlock?

Setting the eh flag to 0 is not supposed to be a change introduced by 
that commit. Indeed that commit is not supposed to change anything at 
all in the generated code.

Christophe

> 
> I have looked into e500 Reference Manual for lwarx instruction (page 562)
> https://www.nxp.com/files-static/32bit/doc/ref_manual/EREF_RM.pdf and
> both 0 and 1 values for EH flag should be supported.

Re: [PATCH] powerpc: Remove the static variable initialisations to 0

2022-07-23 Thread Michal Suchánek
Hello,

On Sat, Jul 23, 2022 at 05:24:36PM +0800, Jason Wang wrote:
> Initialising global and static variables to 0 is always unnecessary.
> Remove the unnecessary initialisations.

Isn't this change also unnecessary?

Initializing to 0 does not affect correctness, or even any kind of
semantics in any way.

The current code is slightly easier to understand.

And changing the code introduces history noise for no gain.

Thanks

Michal

> 
> Signed-off-by: Jason Wang 
> ---
>  arch/powerpc/kexec/core_64.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/arch/powerpc/kexec/core_64.c b/arch/powerpc/kexec/core_64.c
> index c2bea9db1c1e..2407214e3f41 100644
> --- a/arch/powerpc/kexec/core_64.c
> +++ b/arch/powerpc/kexec/core_64.c
> @@ -135,7 +135,7 @@ notrace void kexec_copy_flush(struct kimage *image)
>  
>  #ifdef CONFIG_SMP
>  
> -static int kexec_all_irq_disabled = 0;
> +static int kexec_all_irq_disabled;
>  
>  static void kexec_smp_down(void *arg)
>  {
> -- 
> 2.35.1
> 


[PATCH v2 3/3] powerpc/pseries: Override lib/arch_vars.c with PowerPC architecture specific version

2022-07-23 Thread Nayna Jain
From: Greg Joyce 

Self Encrypting Drives (SED) make use of the POWER LPAR Platform KeyStore
(PLPKS) for storing their variables. Thus the block subsystem needs to
access PowerPC-specific functions to read/write objects in PLPKS.

Override the default implementations in lib/arch_vars.c with
PowerPC-specific versions.

Signed-off-by: Greg Joyce 
---
 arch/powerpc/platforms/pseries/Makefile   |   1 +
 .../platforms/pseries/plpks_arch_ops.c| 166 ++
 2 files changed, 167 insertions(+)
 create mode 100644 arch/powerpc/platforms/pseries/plpks_arch_ops.c

diff --git a/arch/powerpc/platforms/pseries/Makefile b/arch/powerpc/platforms/pseries/Makefile
index 14e143b946a3..3a545422eae5 100644
--- a/arch/powerpc/platforms/pseries/Makefile
+++ b/arch/powerpc/platforms/pseries/Makefile
@@ -29,6 +29,7 @@ obj-$(CONFIG_PPC_SPLPAR)  += vphn.o
 obj-$(CONFIG_PPC_SVM)  += svm.o
 obj-$(CONFIG_FA_DUMP)  += rtas-fadump.o
 obj-$(CONFIG_PSERIES_PLPKS) += plpks.o
+obj-$(CONFIG_PSERIES_PLPKS) += plpks_arch_ops.o
 
 obj-$(CONFIG_SUSPEND)  += suspend.o
 obj-$(CONFIG_PPC_VAS)  += vas.o vas-sysfs.o
diff --git a/arch/powerpc/platforms/pseries/plpks_arch_ops.c b/arch/powerpc/platforms/pseries/plpks_arch_ops.c
new file mode 100644
index ..48fa19f0c9c5
--- /dev/null
+++ b/arch/powerpc/platforms/pseries/plpks_arch_ops.c
@@ -0,0 +1,166 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * POWER Platform arch specific code for SED
+ * Copyright (C) 2022 IBM Corporation
+ *
+ * Define operations for generic kernel subsystems to read/write keys from
+ * POWER LPAR Platform KeyStore(PLPKS).
+ *
+ * List of subsystems/usecase using PLPKS:
+ * - Self Encrypting Drives(SED)
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include "plpks.h"
+
+/*
+ * variable structure that contains all SED data
+ */
+struct plpks_sed_object_data {
+   u_char version;
+   u_char pad1[7];
+   u_long authority;
+   u_long range;
+   u_int  key_len;
+   u_char key[32];
+};
+
+/*
+ * ext_type values
+ * 00    no extension exists
+ * 01-1F common
+ * 20-3F AIX
+ * 40-5F Linux
+ * 60-7F IBMi
+ */
+
+/*
+ * This extension is optional for version 1 sed_object_data
+ */
+struct sed_object_extension {
+   u8 ext_type;
+   u8 rsvd[3];
+   u8 ext_data[64];
+};
+
+#define PKS_SED_OBJECT_DATA_V1  1
+#define PKS_SED_MANGLED_LABEL   "/default/pri"
+#define PLPKS_SED_COMPONENT     "sed-opal"
+#define PLPKS_SED_POLICY        WORLDREADABLE
+#define PLPKS_SED_OS_COMMON     4
+
+#ifndef CONFIG_BLK_SED_OPAL
+#define OPAL_AUTH_KEY           ""
+#endif
+
+/*
+ * Read the variable data from PKS given the label
+ */
+int arch_read_variable(enum arch_variable_type type, char *varname,
+  void *varbuf, u_int *varlen)
+{
+   struct plpks_var var;
+   struct plpks_sed_object_data *data;
+   u_int offset = 0;
+   char *buf = (char *)varbuf;
+   int ret;
+
+   var.name = varname;
+   var.namelen = strlen(varname);
+   var.policy = PLPKS_SED_POLICY;
+   var.os = PLPKS_SED_OS_COMMON;
+   var.data = NULL;
+   var.datalen = 0;
+
+   switch (type) {
+   case ARCH_VAR_OPAL_KEY:
+   var.component = PLPKS_SED_COMPONENT;
+   if (strcmp(OPAL_AUTH_KEY, varname) == 0) {
+   var.name = PKS_SED_MANGLED_LABEL;
+   var.namelen = strlen(varname);
+   }
+   offset = offsetof(struct plpks_sed_object_data, key);
+   break;
+   case ARCH_VAR_OTHER:
+   var.component = "";
+   break;
+   }
+
+   ret = plpks_read_os_var(&var);
+   if (ret != 0)
+   return ret;
+
+   if (offset > var.datalen)
+   offset = 0;
+
+   switch (type) {
+   case ARCH_VAR_OPAL_KEY:
+   data = (struct plpks_sed_object_data *)var.data;
+   *varlen = data->key_len;
+   break;
+   case ARCH_VAR_OTHER:
+   *varlen = var.datalen;
+   break;
+   }
+
+   if (var.data) {
+   memcpy(varbuf, var.data + offset, var.datalen - offset);
+   buf[*varlen] = '\0';
+   kfree(var.data);
+   }
+
+   return 0;
+}
+
+/*
+ * Write the variable data to PKS given the label
+ */
+int arch_write_variable(enum arch_variable_type type, char *varname,
+   void *varbuf, u_int varlen)
+{
+   struct plpks_var var;
+   struct plpks_sed_object_data data;
+   struct plpks_var_name vname;
+
+   var.name = varname;
+   var.namelen = strlen(varname);
+   var.policy = PLPKS_SED_POLICY;
+   var.os = PLPKS_SED_OS_COMMON;
+   var.datalen = varlen;
+   var.data = varbuf;
+
+   switch (type) {
+   case ARCH_VAR_OPAL_KEY:
+   
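
As a reading aid for the read path in this patch, here is a minimal userspace sketch of pulling the key out of a stored SED object by its offset. It is not the posted implementation: `struct sed_object_sketch` and `extract_key()` are hypothetical names, fixed-width types stand in for `u_char`/`u_long`/`u_int`, and the bounds checks are an assumption about what a caller-provided buffer would need rather than something the patch states.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Userspace stand-in for struct plpks_sed_object_data (assumed LP64 sizes). */
struct sed_object_sketch {
	uint8_t  version;
	uint8_t  pad1[7];
	uint64_t authority;
	uint64_t range;
	uint32_t key_len;
	uint8_t  key[32];
};

/* The key starts after 1 + 7 + 8 + 8 + 4 = 28 bytes of header. */
static_assert(offsetof(struct sed_object_sketch, key) == 28,
	      "unexpected key offset");

/*
 * Copy only the key bytes out of a raw stored object.
 * Returns the key length, or 0 if the object or the output buffer
 * is too small for the length recorded in the header.
 */
static size_t extract_key(const void *obj, size_t obj_len,
			  uint8_t *out, size_t out_len)
{
	const size_t key_off = offsetof(struct sed_object_sketch, key);
	uint32_t key_len;

	if (obj_len < key_off)
		return 0;
	memcpy(&key_len,
	       (const uint8_t *)obj + offsetof(struct sed_object_sketch, key_len),
	       sizeof(key_len));
	if (key_len > obj_len - key_off || key_len > out_len)
		return 0;
	memcpy(out, (const uint8_t *)obj + key_off, key_len);
	return key_len;
}
```

The design choice in the sketch is to trust neither the stored `key_len` nor the total object length until both have been checked against the caller's buffer size.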

[PATCH v2 2/3] lib: define generic accessor functions for arch specific keystore

2022-07-23 Thread Nayna Jain
From: Greg Joyce 

Generic kernel subsystems may rely on a platform-specific persistent
KeyStore to store objects containing sensitive key material. In such cases,
they need to access architecture-specific functions to perform read/write
operations on these variables.

Define the generic variable read/write prototypes to be implemented by
architecture-specific versions. The default (weak) implementations of
these prototypes return -EOPNOTSUPP unless overridden by architecture
versions.

Signed-off-by: Greg Joyce 
---
 include/linux/arch_vars.h | 23 +++
 lib/Makefile  |  2 +-
 lib/arch_vars.c   | 25 +
 3 files changed, 49 insertions(+), 1 deletion(-)
 create mode 100644 include/linux/arch_vars.h
 create mode 100644 lib/arch_vars.c

diff --git a/include/linux/arch_vars.h b/include/linux/arch_vars.h
new file mode 100644
index ..9c280ff9432e
--- /dev/null
+++ b/include/linux/arch_vars.h
@@ -0,0 +1,23 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Platform variable operations.
+ *
+ * Copyright (C) 2022 IBM Corporation
+ *
+ * These are the accessor functions (read/write) for architecture specific
+ * variables. Specific architectures can provide overrides.
+ *
+ */
+
+#include 
+
+enum arch_variable_type {
+   ARCH_VAR_OPAL_KEY  = 0, /* SED Opal Authentication Key */
+   ARCH_VAR_OTHER = 1, /* Other type of variable */
+   ARCH_VAR_MAX   = 1, /* Maximum type value */
+};
+
+int arch_read_variable(enum arch_variable_type type, char *varname,
+  void *varbuf, u_int *varlen);
+int arch_write_variable(enum arch_variable_type type, char *varname,
+   void *varbuf, u_int varlen);
diff --git a/lib/Makefile b/lib/Makefile
index f99bf61f8bbc..b90c4cb0dbbb 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -48,7 +48,7 @@ obj-y += bcd.o sort.o parser.o debug_locks.o random32.o \
 bsearch.o find_bit.o llist.o memweight.o kfifo.o \
 percpu-refcount.o rhashtable.o \
 once.o refcount.o usercopy.o errseq.o bucket_locks.o \
-generic-radix-tree.o
+generic-radix-tree.o arch_vars.o
 obj-$(CONFIG_STRING_SELFTEST) += test_string.o
 obj-y += string_helpers.o
 obj-$(CONFIG_TEST_STRING_HELPERS) += test-string_helpers.o
diff --git a/lib/arch_vars.c b/lib/arch_vars.c
new file mode 100644
index ..e6f16d7d09c1
--- /dev/null
+++ b/lib/arch_vars.c
@@ -0,0 +1,25 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Platform variable operations.
+ *
+ * Copyright (C) 2022 IBM Corporation
+ *
+ * These are the accessor functions (read/write) for architecture specific
+ * variables. Specific architectures can provide overrides.
+ *
+ */
+
+#include 
+#include 
+
+int __weak arch_read_variable(enum arch_variable_type type, char *varname,
+ void *varbuf, u_int *varlen)
+{
+   return -EOPNOTSUPP;
+}
+
+int __weak arch_write_variable(enum arch_variable_type type, char *varname,
+  void *varbuf, u_int varlen)
+{
+   return -EOPNOTSUPP;
+}
-- 
2.27.0
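
The weak-default pattern this patch relies on can be sketched in userspace as follows. All names ending in `_sketch` are hypothetical, and `__attribute__((weak))` is the GCC/Clang spelling that the kernel's `__weak` macro is assumed to expand to. With only this file linked, callers get -EOPNOTSUPP; a second object file providing a non-weak definition of the same symbol would take precedence at link time.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

enum var_type_sketch { VAR_OPAL_KEY_SKETCH, VAR_OTHER_SKETCH };

/* Weak default: used only when no strong definition is linked in. */
__attribute__((weak))
int read_variable_sketch(enum var_type_sketch type, const char *name,
			 void *buf, unsigned int *len)
{
	(void)type;
	(void)name;
	(void)buf;
	(void)len;
	return -EOPNOTSUPP;
}

/* A generic caller must be prepared for the "not supported" answer. */
static void expect_unsupported_by_default(void)
{
	unsigned int len = 0;

	assert(read_variable_sketch(VAR_OTHER_SKETCH, "name", NULL, &len) ==
	       -EOPNOTSUPP);
}
```

One consequence of this design is that generic code needs no #ifdef per architecture: it calls the function unconditionally and treats -EOPNOTSUPP as "no platform keystore here".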



[PATCH v2 1/3] powerpc/pseries: define driver for Platform KeyStore

2022-07-23 Thread Nayna Jain
PowerVM provides an isolated Platform Keystore(PKS) storage allocation
for each LPAR with individually managed access controls to store
sensitive information securely. It provides a new set of hypervisor
calls for Linux kernel to access PKS storage.

Define POWER LPAR Platform KeyStore(PLPKS) driver using H_CALL interface
to access PKS storage.

Signed-off-by: Nayna Jain 
---
 arch/powerpc/include/asm/hvcall.h   |  11 +
 arch/powerpc/platforms/pseries/Kconfig  |  13 +
 arch/powerpc/platforms/pseries/Makefile |   1 +
 arch/powerpc/platforms/pseries/plpks.c  | 460 
 arch/powerpc/platforms/pseries/plpks.h  |  71 
 5 files changed, 556 insertions(+)
 create mode 100644 arch/powerpc/platforms/pseries/plpks.c
 create mode 100644 arch/powerpc/platforms/pseries/plpks.h

diff --git a/arch/powerpc/include/asm/hvcall.h b/arch/powerpc/include/asm/hvcall.h
index d92a20a85395..9f707974af1a 100644
--- a/arch/powerpc/include/asm/hvcall.h
+++ b/arch/powerpc/include/asm/hvcall.h
@@ -79,6 +79,7 @@
 #define H_NOT_ENOUGH_RESOURCES     -44
 #define H_R_STATE                  -45
 #define H_RESCINDED                -46
+#define H_P1                       -54
 #define H_P2                       -55
 #define H_P3                       -56
 #define H_P4                       -57
@@ -97,6 +98,8 @@
 #define H_OP_MODE                  -73
 #define H_COP_HW                   -74
 #define H_STATE                    -75
+#define H_IN_USE                   -77
+#define H_ABORTED                  -78
 #define H_UNSUPPORTED_FLAG_START   -256
 #define H_UNSUPPORTED_FLAG_END     -511
 #define H_MULTI_THREADS_ACTIVE     -9005
@@ -321,6 +324,14 @@
 #define H_SCM_UNBIND_ALL               0x3FC
 #define H_SCM_HEALTH                   0x400
 #define H_SCM_PERFORMANCE_STATS        0x418
+#define H_PKS_GET_CONFIG               0x41C
+#define H_PKS_SET_PASSWORD             0x420
+#define H_PKS_GEN_PASSWORD             0x424
+#define H_PKS_WRITE_OBJECT             0x42C
+#define H_PKS_GEN_KEY                  0x430
+#define H_PKS_READ_OBJECT              0x434
+#define H_PKS_REMOVE_OBJECT            0x438
+#define H_PKS_CONFIRM_OBJECT_FLUSHED   0x43C
 #define H_RPT_INVALIDATE               0x448
 #define H_SCM_FLUSH                    0x44C
 #define H_GET_ENERGY_SCALE_INFO        0x450
diff --git a/arch/powerpc/platforms/pseries/Kconfig b/arch/powerpc/platforms/pseries/Kconfig
index f7fd91d153a4..c4a6d4083a7a 100644
--- a/arch/powerpc/platforms/pseries/Kconfig
+++ b/arch/powerpc/platforms/pseries/Kconfig
@@ -142,6 +142,19 @@ config IBMEBUS
help
  Bus device driver for GX bus based adapters.
 
+config PSERIES_PLPKS
+   depends on PPC_PSERIES
+   bool "Support for the Platform Key Storage"
+   help
+ PowerVM provides an isolated Platform Keystore(PKS) storage
+ allocation for each LPAR with individually managed access
+ controls to store sensitive information securely. It can be
+ used to store asymmetric public keys or secrets as required
+ by different usecases. Select this config to enable
+ operating system interface to hypervisor to access this space.
+
+ If unsure, select N.
+
 config PAPR_SCM
depends on PPC_PSERIES && MEMORY_HOTPLUG && LIBNVDIMM
tristate "Support for the PAPR Storage Class Memory interface"
diff --git a/arch/powerpc/platforms/pseries/Makefile b/arch/powerpc/platforms/pseries/Makefile
index 7aaff5323544..14e143b946a3 100644
--- a/arch/powerpc/platforms/pseries/Makefile
+++ b/arch/powerpc/platforms/pseries/Makefile
@@ -28,6 +28,7 @@ obj-$(CONFIG_PAPR_SCM)+= papr_scm.o
 obj-$(CONFIG_PPC_SPLPAR)   += vphn.o
 obj-$(CONFIG_PPC_SVM)  += svm.o
 obj-$(CONFIG_FA_DUMP)  += rtas-fadump.o
+obj-$(CONFIG_PSERIES_PLPKS) += plpks.o
 
 obj-$(CONFIG_SUSPEND)  += suspend.o
 obj-$(CONFIG_PPC_VAS)  += vas.o vas-sysfs.o
diff --git a/arch/powerpc/platforms/pseries/plpks.c b/arch/powerpc/platforms/pseries/plpks.c
new file mode 100644
index ..52aaa2894606
--- /dev/null
+++ b/arch/powerpc/platforms/pseries/plpks.c
@@ -0,0 +1,460 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * POWER LPAR Platform KeyStore(PLPKS)
+ * Copyright (C) 2022 IBM Corporation
+ * Author: Nayna Jain 
+ *
+ * Provides access to variables stored in Power LPAR Platform KeyStore(PLPKS).
+ */
+
+#define pr_fmt(fmt) "plpks: " fmt
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "plpks.h"
+
+#define PKS_FW_OWNER          0x1
+#define PKS_BOOTLOADER_OWNER  0x2
+#define PKS_OS_OWNER          0x3
+
+#define LABEL_VERSION         0
+#define MAX_LABEL_ATTR_SIZE   16
+#define MAX_NAME_SIZE         239
+#define MAX_DATA_SIZE         4000
+
+#define PKS_FLUSH_MAX_TIMEOUT 5000 //msec
+#define PKS_FLUSH_SLEEP       10 //msec
+#define PKS_FLUSH_SLEEP_RANGE 400
+
+static u8 *ospassword;
+static u16 ospasswordlength;
+
+// Retrieved with H_PKS_GET_CONFIG
+static u16 maxpwsize;
+static u16 maxobjsize;
+
+struct plpks_auth {
+   u8 version;
+   u8 consumer;
+   __be64 rsvd0;
+   __be32 rsvd1;
+   __be16 passwordlength;
+   u8 password[];
+} 
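
The archived copy of this patch is cut off right after the closing brace of struct plpks_auth, so the following is only a sketch of how a header with a trailing flexible array member is typically allocated and filled. Everything here is an assumption for illustration: the names `auth_sketch` and `auth_sketch_alloc`, the `packed` attribute, and the choice to store the big-endian length as two explicit bytes so that the example behaves the same on any host.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical userspace mirror of the fields shown in the patch. */
struct auth_sketch {
	uint8_t  version;
	uint8_t  consumer;
	uint64_t rsvd0;
	uint32_t rsvd1;
	uint8_t  passwordlength_be[2];	/* big-endian length */
	uint8_t  password[];		/* flexible array member */
} __attribute__((packed));		/* assumption: packed layout */

/* 1 + 1 + 8 + 4 + 2 bytes of header when packed. */
static_assert(sizeof(struct auth_sketch) == 16, "unexpected header size");

/* Allocate header plus password in one block; caller frees with free(). */
static struct auth_sketch *auth_sketch_alloc(uint8_t consumer,
					     const uint8_t *pw, uint16_t pwlen)
{
	struct auth_sketch *auth = calloc(1, sizeof(*auth) + pwlen);

	if (!auth)
		return NULL;
	auth->consumer = consumer;
	auth->passwordlength_be[0] = (uint8_t)(pwlen >> 8);
	auth->passwordlength_be[1] = (uint8_t)(pwlen & 0xff);
	if (pwlen)
		memcpy(auth->password, pw, pwlen);
	return auth;
}
```

A call such as `auth_sketch_alloc(0x3, pw, len)` would correspond to the OS owner value (`PKS_OS_OWNER`) defined earlier in the patch.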

[PATCH v2 0/3] Provide PowerVM LPAR Platform KeyStore driver for Self Encrypting Drives

2022-07-23 Thread Nayna Jain
PowerVM provides an isolated Platform KeyStore(PKS)[1] storage allocation
for each partition(LPAR) with individually managed access controls to store
sensitive information securely. The Linux Kernel can access this storage by
interfacing with the hypervisor using a new set of hypervisor calls. 

This storage can be used for multiple purposes. The current two usecases
are:

1. Guest Secure Boot on PowerVM[2]
2. Self Encrypting Drives(SED) on PowerVM[3]

Initially, the PowerVM LPAR Platform KeyStore (PLPKS) driver was defined
as part of RFC patches which included the user interface design for guest
secure boot[2]. While this interface is still in progress, the same driver
is also required for Self Encrypting Drives (SED) support. For this reason,
the driver is being split from the patchset[2] and is now posted separately
with SED arch-specific code.

This patchset provides a driver for the PowerVM LPAR Platform KeyStore and
also arch-specific code for SED to make use of it.

The dependency patch from patch series[3] is moved to this patchset. This
patchset now builds completely on its own.

[1]https://community.ibm.com/community/user/power/blogs/chris-engel1/2020/11/20/powervm-introduces-the-platform-keystore
[2]https://lore.kernel.org/linuxppc-dev/20220622215648.96723-1-na...@linux.ibm.com/
[3]https://lore.kernel.org/keyrings/20220718210156.1535955-1-gjo...@linux.vnet.ibm.com/T/#m8e7b2cbbd26ee1de711bd70967fd0124c85c479f

Changelog:

v2:

* Include feedback from Gregory Joyce, Eric Richter and Murilo Opsfelder Araújo.
* Include suggestions from Michael Ellerman.
* Moved a dependency from generic SED code to this patchset. This patchset now
builds on its own.

Greg Joyce (2):
  lib: define generic accessor functions for arch specific keystore
  powerpc/pseries: Override lib/arch_vars.c with PowerPC architecture
specific version

Nayna Jain (1):
  powerpc/pseries: define driver for Platform KeyStore

 arch/powerpc/include/asm/hvcall.h |  11 +
 arch/powerpc/platforms/pseries/Kconfig|  13 +
 arch/powerpc/platforms/pseries/Makefile   |   2 +
 arch/powerpc/platforms/pseries/plpks.c| 460 ++
 arch/powerpc/platforms/pseries/plpks.h|  71 +++
 .../platforms/pseries/plpks_arch_ops.c| 166 +++
 include/linux/arch_vars.h |  23 +
 lib/Makefile  |   2 +-
 lib/arch_vars.c   |  25 +
 9 files changed, 772 insertions(+), 1 deletion(-)
 create mode 100644 arch/powerpc/platforms/pseries/plpks.c
 create mode 100644 arch/powerpc/platforms/pseries/plpks.h
 create mode 100644 arch/powerpc/platforms/pseries/plpks_arch_ops.c
 create mode 100644 include/linux/arch_vars.h
 create mode 100644 lib/arch_vars.c

-- 
2.27.0


[PATCH] powerpc: Remove the static variable initialisations to 0

2022-07-23 Thread Jason Wang
Initialising global and static variables to 0 is always unnecessary.
Remove the unnecessary initialisations.

Signed-off-by: Jason Wang 
---
 arch/powerpc/kexec/core_64.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/kexec/core_64.c b/arch/powerpc/kexec/core_64.c
index c2bea9db1c1e..2407214e3f41 100644
--- a/arch/powerpc/kexec/core_64.c
+++ b/arch/powerpc/kexec/core_64.c
@@ -135,7 +135,7 @@ notrace void kexec_copy_flush(struct kimage *image)
 
 #ifdef CONFIG_SMP
 
-static int kexec_all_irq_disabled = 0;
+static int kexec_all_irq_disabled;
 
 static void kexec_smp_down(void *arg)
 {
-- 
2.35.1



Re: [PATCH v2 0/4] mm: arm64: bring up BATCHED_UNMAP_TLB_FLUSH

2022-07-23 Thread xhao



On 7/20/22 7:18 PM, Barry Song wrote:

On Tue, Jul 19, 2022 at 1:28 AM Yicong Yang  wrote:

On 2022/7/14 12:51, Barry Song wrote:

On Thu, Jul 14, 2022 at 3:29 PM Xin Hao  wrote:

Hi Barry,

I ran some tests on a Kunpeng arm64 machine using UnixBench.

The test results are below.

With one core, we can see a performance improvement above +30%.

I am really pleased to see the 30%+ improvement on unixbench on single core.


./Run -c 1 -i 1 shell1
w/o
System Benchmarks Partial Index              BASELINE     RESULT     INDEX
Shell Scripts (1 concurrent)                     42.4     5481.0    1292.7

System Benchmarks Index Score (Partial Only)                        1292.7

w/
System Benchmarks Partial Index              BASELINE     RESULT     INDEX
Shell Scripts (1 concurrent)                     42.4     6974.6    1645.0

System Benchmarks Index Score (Partial Only)                        1645.0


But with all cores, there is a small performance degradation of more than 5%.

That is sad as we might get more concurrency between mprotect(), madvise(),
mremap(), zap_pte_range() and the deferred tlbi.


./Run -c 96 -i 1 shell1
w/o
Shell Scripts (1 concurrent)                  80765.5 lpm   (60.0 s, 1 samples)
System Benchmarks Partial Index              BASELINE     RESULT     INDEX
Shell Scripts (1 concurrent)                     42.4    80765.5   19048.5

System Benchmarks Index Score (Partial Only)                       19048.5

w/
Shell Scripts (1 concurrent)                  76333.6 lpm   (60.0 s, 1 samples)
System Benchmarks Partial Index              BASELINE     RESULT     INDEX
Shell Scripts (1 concurrent)                     42.4    76333.6   18003.2

System Benchmarks Index Score (Partial Only)                       18003.2
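
For anyone checking the percentages against the raw RESULT columns above, a small helper like the following does the arithmetic. The function name `percent_change` is hypothetical. On the figures quoted here, the single-copy run (5481.0 to 6974.6) works out to roughly +27%, and the 96-copy run (80765.5 to 76333.6) to roughly -5.5%.

```c
#include <assert.h>

/* Relative change from 'before' to 'after', in percent. */
static double percent_change(double before, double after)
{
	assert(before != 0.0);
	return (after - before) / before * 100.0;
}
```

Since each figure comes from a single 60-second sample, repeating the runs a few times and comparing the spread would help separate a real regression from run-to-run fluctuation.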

--


After discussing with you, I made some changes to the patch.

index a52381a680db..1ecba81f1277 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -727,7 +727,11 @@ void flush_tlb_batched_pending(struct mm_struct *mm)
  int flushed = batch >> TLB_FLUSH_BATCH_FLUSHED_SHIFT;

  if (pending != flushed) {
+#ifdef CONFIG_ARCH_HAS_MM_CPUMASK
  flush_tlb_mm(mm);
+#else
+   dsb(ish);
+#endif


i was guessing the problem might be flush_tlb_batched_pending()
so i asked you to change this to verify my guess.


flush_tlb_batched_pending() looks like the critical path for this issue then
the code above can mitigate this.

I cannot reproduce this on a 2P 128C Kunpeng920 server. The kernel is based on
the v5.19-rc6 and unixbench of version 5.1.3. The result of
`./Run -c 128 -i 1 shell1` is:

      iter-1   iter-2   iter-3
w/o  17708.1  17637.1  17630.1
w    17766.0  17752.3  17861.7

And flush_tlb_batched_pending() isn't the hot spot with the patch:
7.00%  sh        [kernel.kallsyms]  [k] ptep_clear_flush
4.17%  sh        [kernel.kallsyms]  [k] ptep_set_access_flags
2.43%  multi.sh  [kernel.kallsyms]  [k] ptep_clear_flush
1.98%  sh        [kernel.kallsyms]  [k] _raw_spin_unlock_irqrestore
1.69%  sh        [kernel.kallsyms]  [k] next_uptodate_page
1.66%  sort      [kernel.kallsyms]  [k] ptep_clear_flush
1.56%  multi.sh  [kernel.kallsyms]  [k] ptep_set_access_flags
1.27%  sh        [kernel.kallsyms]  [k] page_counter_cancel
1.11%  sh        [kernel.kallsyms]  [k] page_remove_rmap
1.06%  sh        [kernel.kallsyms]  [k] perf_event_alloc

Hi Xin Hao,

I'm not sure whether my test setup and config are the same as yours (96C vs
128C should not be the reason, I think). Did you check whether the 5% is a
fluctuation or not? It would be helpful if you could provide more information
for reproducing this issue.

Thanks.

I guess that is because "./Run -c 1 -i 1 shell1" isn't a memory-intensive
workload. Hi Xin, in what kinds of configurations can we reproduce your test
result?


Oh, my fault: my test was not based on the latest upstream kernel, which may
have had some impact. I will run a new test on the latest kernel.


As I suppose tlbbatch will mainly affect the performance of user scenarios
which require memory page-out/page-in like reclaiming file/anon pages.
"./Run -c 1 -i 1 shell1" on a system with sufficient free memory won't be
affected by tlbbatch at all, I believe.

Thanks
Barry


--
Best Regards!
Xin Hao



Re: [PATCH v2 0/4] mm: arm64: bring up BATCHED_UNMAP_TLB_FLUSH

2022-07-23 Thread xhao



On 7/18/22 9:28 PM, Yicong Yang wrote:

On 2022/7/14 12:51, Barry Song wrote:

On Thu, Jul 14, 2022 at 3:29 PM Xin Hao  wrote:

Hi Barry,

I ran some tests on a Kunpeng arm64 machine using UnixBench.

The test results are below.

With one core, we can see a performance improvement above +30%.

I am really pleased to see the 30%+ improvement on unixbench on single core.


./Run -c 1 -i 1 shell1
w/o
System Benchmarks Partial Index              BASELINE     RESULT     INDEX
Shell Scripts (1 concurrent)                     42.4     5481.0    1292.7

System Benchmarks Index Score (Partial Only)                        1292.7

w/
System Benchmarks Partial Index              BASELINE     RESULT     INDEX
Shell Scripts (1 concurrent)                     42.4     6974.6    1645.0

System Benchmarks Index Score (Partial Only)                        1645.0


But with all cores, there is a small performance degradation of more than 5%.

That is sad as we might get more concurrency between mprotect(), madvise(),
mremap(), zap_pte_range() and the deferred tlbi.


./Run -c 96 -i 1 shell1
w/o
Shell Scripts (1 concurrent)                  80765.5 lpm   (60.0 s, 1 samples)
System Benchmarks Partial Index              BASELINE     RESULT     INDEX
Shell Scripts (1 concurrent)                     42.4    80765.5   19048.5

System Benchmarks Index Score (Partial Only)                       19048.5

w/
Shell Scripts (1 concurrent)                  76333.6 lpm   (60.0 s, 1 samples)
System Benchmarks Partial Index              BASELINE     RESULT     INDEX
Shell Scripts (1 concurrent)                     42.4    76333.6   18003.2

System Benchmarks Index Score (Partial Only)                       18003.2

--


After discussing with you, I made some changes to the patch.

index a52381a680db..1ecba81f1277 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -727,7 +727,11 @@ void flush_tlb_batched_pending(struct mm_struct *mm)
  int flushed = batch >> TLB_FLUSH_BATCH_FLUSHED_SHIFT;

  if (pending != flushed) {
+#ifdef CONFIG_ARCH_HAS_MM_CPUMASK
  flush_tlb_mm(mm);
+#else
+   dsb(ish);
+#endif


i was guessing the problem might be flush_tlb_batched_pending()
so i asked you to change this to verify my guess.


flush_tlb_batched_pending() looks like the critical path for this issue then
the code above can mitigate this.

I cannot reproduce this on a 2P 128C Kunpeng920 server. The kernel is based on
the v5.19-rc6 and unixbench of version 5.1.3. The result of
`./Run -c 128 -i 1 shell1` is:

      iter-1   iter-2   iter-3
w/o  17708.1  17637.1  17630.1
w    17766.0  17752.3  17861.7

And flush_tlb_batched_pending() isn't the hot spot with the patch:
7.00%  sh        [kernel.kallsyms]  [k] ptep_clear_flush
4.17%  sh        [kernel.kallsyms]  [k] ptep_set_access_flags
2.43%  multi.sh  [kernel.kallsyms]  [k] ptep_clear_flush
1.98%  sh        [kernel.kallsyms]  [k] _raw_spin_unlock_irqrestore
1.69%  sh        [kernel.kallsyms]  [k] next_uptodate_page
1.66%  sort      [kernel.kallsyms]  [k] ptep_clear_flush
1.56%  multi.sh  [kernel.kallsyms]  [k] ptep_set_access_flags
1.27%  sh        [kernel.kallsyms]  [k] page_counter_cancel
1.11%  sh        [kernel.kallsyms]  [k] page_remove_rmap
1.06%  sh        [kernel.kallsyms]  [k] perf_event_alloc

Hi Xin Hao,

I'm not sure whether my test setup and config are the same as yours (96C vs
128C should not be the reason, I think). Did you check whether the 5% is a
fluctuation or not? It would be helpful if you could provide more information
for reproducing this issue.

Yes, it is not always a 5% reduction; there is some fluctuation.


Thanks.


  /*

   * If the new TLB flushing is pending during flushing, leave
   * mm->tlb_flush_batched as is, to avoid losing flushing.

There is a performance improvement with all cores, above +30%.

But I don't think it is a proper patch. There is no guarantee the cpu calling
flush_tlb_batched_pending is exactly the cpu sending the deferred
tlbi. so the solution is unsafe. But since this temporary code can bring the
30%+ performance improvement back for high concurrency, we have huge
potential to finally make it.
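
For readers following the hunk quoted above, the check guarding the flush can be sketched like this. Only the `batch >> ..._FLUSHED_SHIFT` expression and the `pending != flushed` comparison are visible in the quoted diff; the shift value, the mask, and the way `pending` is derived are assumptions made for illustration, and all `SKETCH_`/`sketch` names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed layout: low bits count pending batches, high bits count flushed. */
#define SKETCH_FLUSHED_SHIFT 16
#define SKETCH_PENDING_MASK  ((1 << (SKETCH_FLUSHED_SHIFT - 1)) - 1)

static_assert(SKETCH_PENDING_MASK == 0x7fff, "unexpected pending mask");

/* True when some batched unmap has not yet been followed by a flush. */
static bool flush_needed_sketch(int batch)
{
	int pending = batch & SKETCH_PENDING_MASK;
	int flushed = batch >> SKETCH_FLUSHED_SHIFT;

	return pending != flushed;
}
```

The safety concern raised above is about what happens inside the `pending != flushed` branch, not about this comparison: replacing a full flush with a local barrier only helps if the waiting CPU is the one that issued the deferred invalidation.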

Unfortunately I don't have an arm64 server to debug on this. I only have
8 cores which are unlikely to reproduce regression which happens in
high concurrency with 96 parallel tasks.

So I'd ask if @yicong or someone else working on kunpeng or other
arm64 servers  is able to actually debug and figure out a proper
patch for this, then add the patch as 5/5 into this series?


./Run -c 96 -i 1 shell1
96 CPUs in system; running 96 parallel copies of tests

Shell Scripts (1 concurrent) 109229.0 lpm   (60.0 s, 1 samples)
System Benchmarks Partial Index  BASELINE   RESULT    INDEX
Shell Scripts (1 concurrent) 42.4