Re: [PATCH] selftests/powerpc: Remove the repeated declaration

2021-06-17 Thread Michael Ellerman
On Tue, 1 Jun 2021 14:36:25 +0800, Shaokun Zhang wrote:
> Function 'event_ebb_init' and 'event_leader_ebb_init' are declared
> twice in the header file, so remove the repeated declaration.

Applied to powerpc/next.

[1/1] selftests/powerpc: Remove the repeated declaration
  https://git.kernel.org/powerpc/c/8f6a54bcaf62a791a7bceccc093497f7f53e2e26

cheers


Re: [PATCH] powerpc: 52xx: add fallthrough in mpc52xx_wdt_ioctl()

2021-06-17 Thread Michael Ellerman
On Tue, 1 Jun 2021 12:02:00 -0700, t...@redhat.com wrote:
> With gcc 10.3, there is this compiler error
> compiler.h:56:26: error: this statement may
>   fall through [-Werror=implicit-fallthrough=]
> 
> mpc52xx_gpt.c:586:2: note: here
>   586 |  case WDIOC_GETTIMEOUT:
>       |  ^~~~
> 
> [...]

Applied to powerpc/next.

[1/1] powerpc: 52xx: add fallthrough in mpc52xx_wdt_ioctl()
  https://git.kernel.org/powerpc/c/b629f6c0ab8668a186fda2627296d0cbcc45a368

cheers
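
A minimal userspace sketch of the kind of annotation this warning asks for. The command codes and the demo_ names are hypothetical and are not taken from mpc52xx_gpt.c; the macro only mirrors the general idea of marking a deliberate fall-through so that -Wimplicit-fallthrough stays quiet.

```c
/* Hypothetical command codes; these are not the real WDIOC_* values. */
enum { DEMO_SETTIMEOUT = 1, DEMO_GETTIMEOUT = 2 };

/* Pick an annotation the compiler understands, or fall back to a no-op. */
#if defined(__has_attribute)
# if __has_attribute(__fallthrough__)
#  define demo_fallthrough __attribute__((__fallthrough__))
# endif
#endif
#ifndef demo_fallthrough
# define demo_fallthrough do { } while (0)
#endif

static int demo_ioctl(int cmd, int *timeout, int arg)
{
	switch (cmd) {
	case DEMO_SETTIMEOUT:
		*timeout = arg;
		demo_fallthrough;	/* deliberately report the new value */
	case DEMO_GETTIMEOUT:
		return *timeout;
	default:
		return -1;
	}
}
```

Without the annotation, a compiler run with -Wimplicit-fallthrough would flag the second case label; with it, the intent is explicit.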


Re: [PATCH] powerpc/barrier: Avoid collision with clang's __lwsync macro

2021-06-17 Thread Michael Ellerman
On Fri, 28 May 2021 11:27:52 -0700, Nathan Chancellor wrote:
> A change in clang 13 results in the __lwsync macro being defined as
> __builtin_ppc_lwsync, which emits 'lwsync' or 'msync' depending on what
> the target supports. This breaks the build because of -Werror in
> arch/powerpc, along with thousands of warnings:
> 
>  In file included from arch/powerpc/kernel/pmc.c:12:
>  In file included from include/linux/bug.h:5:
>  In file included from arch/powerpc/include/asm/bug.h:109:
>  In file included from include/asm-generic/bug.h:20:
>  In file included from include/linux/kernel.h:12:
>  In file included from include/linux/bitops.h:32:
>  In file included from arch/powerpc/include/asm/bitops.h:62:
>  arch/powerpc/include/asm/barrier.h:49:9: error: '__lwsync' macro redefined [-Werror,-Wmacro-redefined]
>  #define __lwsync()  __asm__ __volatile__ (stringify_in_c(LWSYNC) : : :"memory")
>          ^
>  <built-in>:308:9: note: previous definition is here
>  #define __lwsync __builtin_ppc_lwsync
>          ^
>  1 error generated.
> 
> [...]

Applied to powerpc/next.

[1/1] powerpc/barrier: Avoid collision with clang's __lwsync macro
  https://git.kernel.org/powerpc/c/015d98149b326e0f1f02e44413112ca8b4330543

cheers
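
One common way to sidestep a clash with a compiler-predefined macro is to drop the earlier definition before supplying your own. The sketch below shows that pattern in portable C. The demo_ names are hypothetical, the first #define merely stands in for what the compiler would predefine, and the quoted text does not show whether the applied patch takes this exact route.

```c
/* Stand-in for a name the compiler predefines; purely hypothetical. */
#define demo_sync compiler_provided_sync

static int demo_sync_calls;

/*
 * Drop the earlier definition before supplying our own, so the
 * preprocessor does not report a macro redefinition.
 */
#undef demo_sync
#define demo_sync() (++demo_sync_calls)

static int demo_publish(int *slot, int value)
{
	*slot = value;
	demo_sync();	/* expands to our definition, not the stand-in */
	return demo_sync_calls;
}
```

Removing the #undef line reproduces a "macro redefined" diagnostic of the kind quoted above.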


Re: [PATCH] powerpc/signal64: Don't read sigaction arguments back from user memory

2021-06-17 Thread Michael Ellerman
On Thu, 10 Jun 2021 17:29:49 +1000, Michael Ellerman wrote:
> When delivering a signal to a sigaction style handler (SA_SIGINFO), we
> pass pointers to the siginfo and ucontext via r4 and r5.
> 
> Currently we populate the values in those registers by reading the
> pointers out of the sigframe in user memory, even though the values in
> user memory were written by the kernel just prior:
> 
> [...]

Applied to powerpc/next.

[1/1] powerpc/signal64: Don't read sigaction arguments back from user memory
  https://git.kernel.org/powerpc/c/a3309226454a7e76d76251579c1183787694f303

cheers


Re: [PATCH -next] powerpc/spufs: disp: Remove set but not used variable 'dummy'

2021-06-17 Thread Michael Ellerman
On Tue, 1 Jun 2021 16:51:27 +0800, Baokun Li wrote:
> Fixes gcc '-Wunused-but-set-variable' warning:
> 
> arch/powerpc/platforms/cell/spufs/switch.c: In function 'check_ppu_mb_stat':
> arch/powerpc/platforms/cell/spufs/switch.c:1660:6: warning:
> variable ‘dummy’ set but not used [-Wunused-but-set-variable]
> 
> arch/powerpc/platforms/cell/spufs/switch.c: In function 'check_ppuint_mb_stat':
> arch/powerpc/platforms/cell/spufs/switch.c:1675:6: warning:
> variable ‘dummy’ set but not used [-Wunused-but-set-variable]
> 
> [...]

Applied to powerpc/next.

[1/1] powerpc/spufs: disp: Remove set but not used variable 'dummy'
  https://git.kernel.org/powerpc/c/911bacda4658129bee039dc90fc0c3f193ee2695

cheers


Re: [PATCH -next] powerpc/spider-pci: Remove set but not used variable 'val'

2021-06-17 Thread Michael Ellerman
On Tue, 1 Jun 2021 16:53:19 +0800, Baokun Li wrote:
> Fixes gcc '-Wunused-but-set-variable' warning:
> 
> arch/powerpc/platforms/cell/spider-pci.c: In function 'spiderpci_io_flush':
> arch/powerpc/platforms/cell/spider-pci.c:28:6: warning:
> variable ‘val’ set but not used [-Wunused-but-set-variable]
> 
> It has never been used since its introduction.

Applied to powerpc/next.

[1/1] powerpc/spider-pci: Remove set but not used variable 'val'
  https://git.kernel.org/powerpc/c/f377f7da26d2af87e2ddc39190546f62ecdb2bd8

cheers


Re: [PATCH v2 2/2] powerpc/ps3: Re-align DTB in image

2021-06-17 Thread Michael Ellerman
On Fri, 04 Jun 2021 15:58:25 +0000, Geoff Levand wrote:
> Change the PS3 linker script to align the DTB at 8 bytes,
> the same alignment as that of the 'generic' powerpc
> linker script.

Applied to powerpc/next.

[2/2] powerpc/ps3: Re-align DTB in image
  https://git.kernel.org/powerpc/c/ff4a825e4a24cdf7f840461ced6283bf865ab7be

cheers
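
For reference, aligning an address at 8 bytes means rounding it up to the next multiple of 8. A minimal sketch of that arithmetic follows; the helper name is hypothetical and the linker script itself is not shown in the quoted text.

```c
#include <stdint.h>

/* Round addr up to the next multiple of align (align must be a power of two). */
static uintptr_t demo_align_up(uintptr_t addr, uintptr_t align)
{
	return (addr + align - 1) & ~(align - 1);
}
```

An address that is already a multiple of 8 is returned unchanged.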


Re: [PATCH v2 0/3] DMA fixes for PS3 device drivers

2021-06-17 Thread Michael Ellerman
On Thu, 03 Jun 2021 19:16:56 +0000, Geoff Levand wrote:
> This is a set of patches that fix various DMA related problems in the PS3
> device drivers, and add better error checking and improved message logging.
> 
> Changes from V1:
>   Split the V1 series into two, one series with powerpc changes, and one series
>   with gelic network driver changes.
> 
> [...]

Applied to powerpc/next.

[1/3] powerpc/ps3: Add CONFIG_PS3_VERBOSE_RESULT option
  https://git.kernel.org/powerpc/c/6caebff168235b6102e5dc57cb95a2374301720a
[2/3] powerpc/ps3: Warn on PS3 device errors
  https://git.kernel.org/powerpc/c/472b440fd26822c645befe459172dafdc2d225de
[3/3] powerpc/ps3: Add dma_mask to ps3_dma_region
  https://git.kernel.org/powerpc/c/9733862e50fdba55e7f1554e4286fcc5302ff28e

cheers


Re: [PATCH v2 0/2] PS3 Updates

2021-06-17 Thread Michael Ellerman
On Fri, 04 Jun 2021 15:58:25 +0000, Geoff Levand wrote:
> I've rebased the V1 patches to v5.13-rc4, and moved the firmware version export
> from procfs to sysfs/firmware.
> 
> Please consider.
> 
> -Geoff
> 
> [...]

Applied to powerpc/next.

[1/2] powerpc/ps3: Add firmware version to sysfs
  https://git.kernel.org/powerpc/c/07e2d6cf91079ca01db7fb989a02edd8009dcacd

cheers


Re: [PATCH v2] powerpc/tau: Remove superfluous parameter in alloc_workqueue() call

2021-06-17 Thread Michael Ellerman
On Fri, 11 Jun 2021 17:58:27 +1000, Finn Thain wrote:
> This avoids an (optional) compiler warning:
> 
> arch/powerpc/kernel/tau_6xx.c: In function 'TAU_init':
> arch/powerpc/kernel/tau_6xx.c:204:30: error: too many arguments for format [-Werror=format-extra-args]
>   tau_workq = alloc_workqueue("tau", WQ_UNBOUND, 1, 0);

Applied to powerpc/next.

[1/1] powerpc/tau: Remove superfluous parameter in alloc_workqueue() call
  https://git.kernel.org/powerpc/c/ddf4a7bcd09439e82c4d6f959f4ff6c53f07466f

cheers
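
A small sketch of why a surplus argument draws this diagnostic. The helper below is hypothetical; it is only shaped like an API that takes (fmt, flags, max_active, ...), and its format attribute tells the compiler to check the trailing arguments against fmt.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper shaped like a (fmt, flags, max_active, ...) API.
 * The format attribute makes the compiler compare the variadic arguments
 * with the conversions in fmt, which is what flags a surplus argument.
 */
__attribute__((format(printf, 3, 6)))
static int demo_make_name(char *buf, size_t len, const char *fmt,
			  unsigned int flags, int max_active, ...)
{
	va_list ap;
	int n;

	(void)flags;
	(void)max_active;
	va_start(ap, max_active);
	n = vsnprintf(buf, len, fmt, ap);
	va_end(ap);
	return n;
}
```

Calling demo_make_name(buf, len, "tau", 0, 1, 0) would pass one argument more than "tau" consumes, and a build with -Wformat-extra-args reports it in the same way as the message quoted above.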


Re: [PATCH v2] powerpc: make stack walking KASAN-safe

2021-06-17 Thread Michael Ellerman
On Mon, 14 Jun 2021 22:09:07 +1000, Daniel Axtens wrote:
> Make our stack-walking code KASAN-safe by using __no_sanitize_address.
> Generic code, arm64, s390 and x86 all make accesses unchecked for similar
> sorts of reasons: when unwinding a stack, we might touch memory that KASAN
> has marked as being out-of-bounds. In ppc64 KASAN development, I hit this
> sometimes when checking for an exception frame - because we're checking
> an arbitrary offset into the stack frame.
> 
> [...]

Applied to powerpc/next.

[1/1] powerpc: make stack walking KASAN-safe
  https://git.kernel.org/powerpc/c/b112fb913b5b5705db22efa90ec60f42518934af

cheers
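
A minimal sketch of marking a stack walker so the address sanitizer does not instrument its loads. The frame layout, the macro name and the function are assumptions made for illustration; the kernel's own annotation is __no_sanitize_address, as named in the quoted text.

```c
#include <stddef.h>

/* Hypothetical frame layout: the first word holds the back chain pointer. */
struct demo_frame {
	struct demo_frame *back_chain;
	unsigned long saved_lr;
};

#if defined(__GNUC__) || defined(__clang__)
# define demo_no_sanitize_address __attribute__((no_sanitize_address))
#else
# define demo_no_sanitize_address
#endif

/* Count frames by following back chains; loads in here are not instrumented. */
demo_no_sanitize_address
static int demo_walk(const struct demo_frame *sp, int max_depth)
{
	int depth = 0;

	while (sp && depth < max_depth) {
		depth++;
		sp = sp->back_chain;
	}
	return depth;
}
```

The depth limit is there so a corrupt or cyclic chain cannot make the walk run forever.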


Re: [PATCH v3 1/6] powerpc/nohash: Refactor update of BDI2000 pointers in switch_mmu_context()

2021-06-17 Thread Michael Ellerman
On Thu, 3 Jun 2021 09:29:02 +0000 (UTC), Christophe Leroy wrote:
> Instead of duplicating the update of BDI2000 pointers in
> set_context(), do it directly from switch_mmu_context().

Applied to powerpc/next.

[1/6] powerpc/nohash: Refactor update of BDI2000 pointers in switch_mmu_context()
  https://git.kernel.org/powerpc/c/25910260ff69fa0c37e26541aac4e8f978e1f17f
[2/6] powerpc/nohash: Convert set_context() to C
  https://git.kernel.org/powerpc/c/a56ab7c7290f5922363d1ee11bbafc4da2b9bf51
[3/6] powerpc/nohash: Remove CONFIG_SMP #ifdefery in mmu_context.h
  https://git.kernel.org/powerpc/c/c13066e53aabd8f268f051d267270765e10343aa
[4/6] powerpc/nohash: Remove DEBUG_MAP_CONSISTENCY
  https://git.kernel.org/powerpc/c/dac3db1edf8b4c75859f07789f577322f2a51e3a
[5/6] powerpc/nohash: Remove DEBUG_CLAMP_LAST_CONTEXT
  https://git.kernel.org/powerpc/c/a36c0faf3dbc429d5ddcb941afe38dd6fe6c5901
[6/6] powerpc/nohash: Remove DEBUG_HARDER
  https://git.kernel.org/powerpc/c/e2c043163d44f7b3a9e65d9161af72b647b18451

cheers


Re: [PATCH v2] powerpc/8xx: Allow disabling KUAP at boot time

2021-06-17 Thread Michael Ellerman
On Fri, 4 Jun 2021 04:49:25 +0000 (UTC), Christophe Leroy wrote:
> PPC64 uses MMU features to enable/disable KUAP at boot time.
> But feature fixups are applied way too early on PPC32.
> 
> But since commit c16728835eec ("powerpc/32: Manage KUAP in C"),
> all KUAP is in C so it is now possible to use static branches.

Applied to powerpc/next.

[1/1] powerpc/8xx: Allow disabling KUAP at boot time
  https://git.kernel.org/powerpc/c/f6025a140ba8dcabdfb8a1e27ddaf44821700dce

cheers


Re: [PATCH v2 00/12] powerpc: Optimise KUAP on book3s/32

2021-06-17 Thread Michael Ellerman
On Thu, 3 Jun 2021 08:41:35 +0000 (UTC), Christophe Leroy wrote:
> This series is a rework of KUAP on book3s/32.
> 
> On book3s32, KUAP is heavier than on other platforms because it can't
> be opened globally at once; it must be done for each 256MB segment.
> 
> Instead of opening access to all necessary segments via a heavy logic,
> only open access to the segment matching the start of the range.
> 
> [...]

Applied to powerpc/next.

[01/12] powerpc/32s: Move setup_{kuep/kuap}() into {kuep/kuap}.c
  https://git.kernel.org/powerpc/c/91ec66719d4c5c0e7b4e32585b01881660d1bc53
[02/12] powerpc/32s: Refactor update of user segment registers
  https://git.kernel.org/powerpc/c/91bb30822a2e1d7900f9f42e9e92647a9015f979
[03/12] powerpc/32s: move CTX_TO_VSID() into mmu-hash.h
  https://git.kernel.org/powerpc/c/7235bb3593781ed022d0714a73c2c0d8eb8a835f
[04/12] powerpc/32s: Convert switch_mmu_context() to C
  https://git.kernel.org/powerpc/c/863771a28e27dc9eaeaa88cea300370d032f0e0f
[05/12] powerpc/32s: Simplify calculation of segment register content
  https://git.kernel.org/powerpc/c/882136fb2f5208a35ddad9205b20e5791edd4782
[06/12] powerpc/32s: Initialise KUAP and KUEP in C
  https://git.kernel.org/powerpc/c/86f46f3432727933be82f64b739712a6edb9d704
[07/12] powerpc/32s: Allow disabling KUEP at boot time
  https://git.kernel.org/powerpc/c/50d2f104cd9572af476579eae9aa1b38de602ec7
[08/12] powerpc/32s: Allow disabling KUAP at boot time
  https://git.kernel.org/powerpc/c/6b4d630068b0c5cdd6d8e599182b131448e0cb06
[09/12] powerpc/32s: Rework Kernel Userspace Access Protection
  https://git.kernel.org/powerpc/c/16132529cee586ee9a058bb33cfbdcb5d884f6b3
[10/12] powerpc/32s: Activate KUAP and KUEP by default
  https://git.kernel.org/powerpc/c/9f5bd8f1471d7498c934c0a686fd0997cf872653
[11/12] powerpc/kuap: Remove KUAP_CURRENT_XXX
  https://git.kernel.org/powerpc/c/d008f8f8a0c3efe4fe1008a797f9497ea5965e27
[12/12] powerpc/kuap: Remove to/from/size parameters of prevent_user_access()
  https://git.kernel.org/powerpc/c/cb2f1fb205cc20695fcaef84baf80d9d3e54c88b

cheers


Re: [PATCH v2 00/12] powerpc: Cleanup use of 'struct ppc_inst'

2021-06-17 Thread Michael Ellerman
On Thu, 20 May 2021 13:50:37 +0000 (UTC), Christophe Leroy wrote:
> This series is a cleanup of the use of 'struct ppc_inst'.
> 
> The code confuses the internal representation of powerpc
> instructions, 'struct ppc_inst', with in-memory code, which is
> and always will be an array of 'unsigned int'.
> 
> This series cleans it up.
> 
> [...]

Applied to powerpc/next.

[01/12] powerpc/inst: Fix sparse detection on get_user_instr()
  https://git.kernel.org/powerpc/c/b3a9e523237013477bea914b7fbfbe420428b988
[02/12] powerpc/inst: Reduce casts in get_user_instr()
  https://git.kernel.org/powerpc/c/9134806e149ebb214f122f0f84254096d3768bb2
[03/12] powerpc/inst: Improve readability of get_user_instr() and friends
  https://git.kernel.org/powerpc/c/042e0860e1c1d60a0ab1ff3f16b7f420573133e0
[04/12] powerpc/inst: Avoid pointer dereferencing in ppc_inst_equal()
  https://git.kernel.org/powerpc/c/036b5560bebc72c61d955ae0b115e8e69da8a563
[05/12] powerpc: Do not dereference code as 'struct ppc_inst' (uprobe, code-patching, feature-fixups)
  https://git.kernel.org/powerpc/c/18c85964b10b7b78a5cb59a4959a5f82fdc77e4c
[06/12] powerpc/lib/code-patching: Make instr_is_branch_to_addr() static
  https://git.kernel.org/powerpc/c/6c0d181daabcba286db9711eef8800b566fb1cce
[07/12] powerpc/lib/code-patching: Don't use struct 'ppc_inst' for runnable code in tests.
  https://git.kernel.org/powerpc/c/e90a21ea801d1776d9a786ad02354fd3fe23ce09
[08/12] powerpc: Don't use 'struct ppc_inst' to reference instruction location
  https://git.kernel.org/powerpc/c/69d4d6e5fd9f4e805280ad831932c3df7b9d7cc7
[09/12] powerpc/inst: Refactor PPC32 and PPC64 versions
  https://git.kernel.org/powerpc/c/077c4dedef09796ade917459a5330e3940fb5860
[10/12] powerpc/optprobes: Minimise casts
  https://git.kernel.org/powerpc/c/afd3287c8872142ec4298a2b77bd9077e2209c9c
[11/12] powerpc/optprobes: Compact code source a bit.
  https://git.kernel.org/powerpc/c/f38adf86ce4fdae84904f420e175ce5806509c4c
[12/12] powerpc/optprobes: use PPC_RAW_ macros
  https://git.kernel.org/powerpc/c/0e628ad2d60896de31148fba00cc73623b8c0aa1

cheers
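
A rough sketch of the distinction the series draws: code in memory is an array of 32-bit words, while the internal representation is a separate structure filled in by reading those words. The struct and function names below are hypothetical and are not the kernel's 'struct ppc_inst' API; the only architectural fact relied on is that a prefixed instruction carries primary opcode 1 in its first word.

```c
#include <stdint.h>

/* Hypothetical internal representation: a word plus an optional suffix. */
struct demo_inst {
	uint32_t val;
	uint32_t suffix;
	int prefixed;
};

/* Prefixed instructions carry primary opcode 1 in the top six bits. */
static struct demo_inst demo_inst_read(const uint32_t *code)
{
	struct demo_inst i = { code[0], 0, 0 };

	if ((code[0] >> 26) == 1) {
		i.prefixed = 1;
		i.suffix = code[1];
	}
	return i;
}

/* Number of 32-bit words the instruction occupies in memory. */
static int demo_inst_words(struct demo_inst i)
{
	return i.prefixed ? 2 : 1;
}
```

Keeping the pointer type as a plain word pointer means the location of an instruction is never mistaken for the decoded value.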


Re: [PATCH v1 01/12] powerpc: Rework PPC_RAW_xxx() macros for prefixed instructions

2021-06-17 Thread Michael Ellerman
On Thu, 20 May 2021 10:23:00 +0000 (UTC), Christophe Leroy wrote:
> At present, we have PPC_RAW_PLXVP() and PPC_RAW_PSTXVP() which
> provide a 64-bit value, which is then split by open coding to
> format it into a 'struct ppc_inst' instruction.
> 
> Instead, define a PPC_RAW_xxx_P() and a PPC_RAW_xxx_S() to be used
> as is.

Applied to powerpc/next.

[01/12] powerpc: Rework PPC_RAW_xxx() macros for prefixed instructions
  https://git.kernel.org/powerpc/c/148a047602462ab04bff20f3529a255b0439d3df
[02/12] powerpc/opcodes: Add shorter macros for registers for use with PPC_RAW_xx()
  https://git.kernel.org/powerpc/c/07cd18320ed816dec8ff6f58a2d8b33294dcceba
[03/12] powerpc/lib/code-patching: Use PPC_RAW_() macros
  https://git.kernel.org/powerpc/c/8804d5beef9189fd2eae5aee14e1628436742e02
[04/12] powerpc/signal: Use PPC_RAW_xx() macros
  https://git.kernel.org/powerpc/c/1c9debbc2eb5391277ae6aa7d95f821e0c28613d
[05/12] powerpc/modules: Use PPC_RAW_xx() macros
  https://git.kernel.org/powerpc/c/47b04699d0709f5ff12a8aa0b3050a6246eb570e
[06/12] powerpc/security: Use PPC_RAW_BLR() and PPC_RAW_NOP()
  https://git.kernel.org/powerpc/c/e7304597560176d8755e2ae4abb599d0c4efe4f2
[07/12] powerpc/ftrace: Use PPC_RAW_MFLR() and PPC_RAW_NOP()
  https://git.kernel.org/powerpc/c/5a03e1e9728edce8f87e3e0bad6d4cd66329b129
[08/12] powerpc/ebpf64: Use PPC_RAW_MFLR()
  https://git.kernel.org/powerpc/c/e08021f8dbd256f480b7e172aa4e894219c901f2
[09/12] powerpc/ebpf32: Use _Rx macros instead of __REG_Rx ones
  https://git.kernel.org/powerpc/c/e0ea08c0cacf9370e3fd3ee8bb7456c61e79db66
[10/12] powerpc/lib/feature-fixups: Use PPC_RAW_xxx() macros
  https://git.kernel.org/powerpc/c/ef909ba954145e35c9e21352133e5e99c64ab3f4
[11/12] powerpc/traps: Start using PPC_RAW_xx() macros
  https://git.kernel.org/powerpc/c/deefd0ae990a689089ea1e4f5ad41799d63d4fd9
[12/12] powerpc: Replace PPC_INST_NOP by PPC_RAW_NOP()
  https://git.kernel.org/powerpc/c/f30becb5e9ec086257162f78be491c0920c616b7

cheers
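
To illustrate what "raw" instruction macros of this kind produce, here is a small sketch that builds 32-bit encodings from register numbers. The DEMO_ names are hypothetical stand-ins, not the kernel's PPC_RAW_xx() definitions; the field positions and base opcodes follow the Power ISA, and the addc and blr values can be cross-checked against the csum_add disassembly quoted elsewhere in this digest.

```c
#include <stdint.h>

/* Field helpers with hypothetical DEMO_ names; shifts follow the Power ISA. */
#define DEMO_RT(r)	(((uint32_t)(r) & 0x1f) << 21)
#define DEMO_RA(r)	(((uint32_t)(r) & 0x1f) << 16)
#define DEMO_RB(r)	(((uint32_t)(r) & 0x1f) << 11)

#define DEMO_RAW_NOP()		0x60000000u	/* ori 0,0,0 */
#define DEMO_RAW_BLR()		0x4e800020u
#define DEMO_RAW_MFLR(t)	(0x7c0802a6u | DEMO_RT(t))
#define DEMO_RAW_ADDC(t, a, b)	\
	(0x7c000014u | DEMO_RT(t) | DEMO_RA(a) | DEMO_RB(b))

/* Write a two-instruction stub "mflr rN; blr" into buf. */
static void demo_emit_stub(uint32_t *buf, int reg)
{
	buf[0] = DEMO_RAW_MFLR(reg);
	buf[1] = DEMO_RAW_BLR();
}
```

Composing encodings from named fields avoids scattering magic constants and manual shifts through the callers.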


Re: [PATCH] powerpc/signal32: Remove impossible #ifdef combinations

2021-06-17 Thread Michael Ellerman
On Thu, 10 Jun 2021 15:58:34 +0000 (UTC), Christophe Leroy wrote:
> PPC_TRANSACTIONAL_MEM is only on book3s/64
> SPE is only on booke
> 
> PPC_TRANSACTIONAL_MEM selects ALTIVEC and VSX
> 
> Therefore, within PPC_TRANSACTIONAL_MEM sections,
> ALTIVEC and VSX are always defined while SPE never is.
> 
> [...]

Applied to powerpc/next.

[1/1] powerpc/signal32: Remove impossible #ifdef combinations
  https://git.kernel.org/powerpc/c/ac3d085368b3abf19b24d8505b897454c7372855

cheers


Re: [PATCH] powerpc/selftests: Use gettid() instead of getppid() for null_syscall

2021-06-17 Thread Michael Ellerman
On Fri, 4 Jun 2021 12:31:09 +0000 (UTC), Christophe Leroy wrote:
> gettid() is 10% lighter than getppid(), use it for null_syscall selftest.

Applied to powerpc/next.

[1/1] powerpc/selftests: Use gettid() instead of getppid() for null_syscall
  https://git.kernel.org/powerpc/c/a1ea0ca8a6f17d7b79bbc4d05dd4e6ca162d8f15

cheers


Re: [PATCH] powerpc: Remove proc_trap()

2021-06-17 Thread Michael Ellerman
On Wed, 9 Jun 2021 05:52:50 +0000 (UTC), Christophe Leroy wrote:
> proc_trap() has never been used, remove it.

Applied to powerpc/next.

[1/1] powerpc: Remove proc_trap()
  https://git.kernel.org/powerpc/c/77b0bed74232c480b94bae188b6c7cd0ddee92e8

cheers


Re: [PATCH] powerpc: Remove CONFIG_PPC_MMU_NOHASH_32

2021-06-17 Thread Michael Ellerman
On Thu, 3 Jun 2021 07:53:49 +0000 (UTC), Christophe Leroy wrote:
> Since commit 555904d07eef ("powerpc/8xx: MM_SLICE is not needed anymore"),
> CONFIG_PPC_MMU_NOHASH_32 has not been used.
> 
> Remove it.

Applied to powerpc/next.

[1/1] powerpc: Remove CONFIG_PPC_MMU_NOHASH_32
  https://git.kernel.org/powerpc/c/c0ca0fe08c9213a5187e4513b5506667f249030f

cheers


Re: [PATCH] powerpc: Don't handle ALTIVEC/SPE in ASM in _switch(). Do it in C.

2021-06-17 Thread Michael Ellerman
On Fri, 14 May 2021 13:14:53 +0000 (UTC), Christophe Leroy wrote:
> _switch() saves and restores ALTIVEC and SPE status.
> For altivec this is redundant with what __switch_to() does with
> save_sprs() and restore_sprs() and giveup_all() before
> calling _switch().
> 
> Add support for SPE in save_sprs() and restore_sprs() and
> remove things from _switch().

Applied to powerpc/next.

[1/1] powerpc: Don't handle ALTIVEC/SPE in ASM in _switch(). Do it in C.
  https://git.kernel.org/powerpc/c/359c2ca74d2fede5c571fbf3f5ee16ba1ad98259

cheers


Re: [PATCH] powerpc/32: Display modules range in virtual memory layout

2021-06-17 Thread Michael Ellerman
On Fri, 11 Jun 2021 19:08:54 +0000 (UTC), Christophe Leroy wrote:
> book3s/32 and 8xx don't use vmalloc for modules.
> 
> Print the modules area at startup as part of the virtual memory layout:
> 
> [    0.000000] Kernel virtual memory layout:
> [    0.000000]   * 0xffafc000..0xffffc000  : fixmap
> [    0.000000]   * 0xc9000000..0xffafc000  : vmalloc & ioremap
> [    0.000000]   * 0xb0000000..0xc0000000  : modules
> [    0.000000] Memory: 118480K/131072K available (7152K kernel code, 2320K rwdata, 1328K rodata, 368K init, 854K bss, 12592K reserved, 0K cma-reserved)

Applied to powerpc/next.

[1/1] powerpc/32: Display modules range in virtual memory layout
  https://git.kernel.org/powerpc/c/baf24d23be7d2357a2aa9c5ffb6a2d680ac2a68c

cheers


Re: [PATCH] powerpc/32: Remove __main()

2021-06-17 Thread Michael Ellerman
On Tue, 8 Jun 2021 17:22:51 +0000 (UTC), Christophe Leroy wrote:
> Comment says that __main() is there to make GCC happy.
> 
> It's been there since the implementation of ppc arch in Linux 1.3.45.
> 
> ppc32 is the only architecture having that. Even ppc64 doesn't have it.
> 
> Seems like GCC is still happy without it.
> 
> [...]

Applied to powerpc/next.

[1/1] powerpc/32: Remove __main()
  https://git.kernel.org/powerpc/c/4696cfdb1380238dca2bda6199428d7e50c4ea38

cheers


Re: [PATCH] powerpc/44x: Implement Kernel Userspace Exec Protection (KUEP)

2021-06-17 Thread Michael Ellerman
On Wed, 2 Jun 2021 06:42:10 +0000 (UTC), Christophe Leroy wrote:
> Powerpc 44x has two bits for exec protection in TLBs: one
> for user (UX) and one for supervisor (SX).
> 
> Clear SX on user pages in TLB miss handlers to provide KUEP.

Applied to powerpc/next.

[1/1] powerpc/44x: Implement Kernel Userspace Exec Protection (KUEP)
  https://git.kernel.org/powerpc/c/10248dcba1205042a3a0ea65eb441030702d97cd

cheers
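
The idea of clearing SX on user pages can be sketched as a simple bit operation. The bit positions and names below are hypothetical; the real 44x TLB word layout and the assembly in the TLB miss handlers are not shown in the quoted text.

```c
#include <stdint.h>

/* Hypothetical bit positions; the real 44x TLB layout is not shown here. */
#define DEMO_TLB_UX	0x1u	/* user may execute */
#define DEMO_TLB_SX	0x2u	/* supervisor may execute */

/* For user pages, drop supervisor execute so the kernel cannot run them. */
static uint32_t demo_apply_kuep(uint32_t attr, int is_user_page)
{
	if (is_user_page)
		attr &= ~DEMO_TLB_SX;
	return attr;
}
```

Kernel pages keep SX untouched, so only execution of user memory from supervisor mode is blocked.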


Re: [PATCH] powerpc/perf: Simplify Makefile

2021-06-17 Thread Michael Ellerman
On Fri, 7 May 2021 14:01:09 +0000 (UTC), Christophe Leroy wrote:
> arch/powerpc/Kbuild descends into arch/powerpc/perf/ only when
> CONFIG_PERF_EVENTS is selected, so there is no need to take
> CONFIG_PERF_EVENTS into account in arch/powerpc/perf/Makefile.

Applied to powerpc/next.

[1/1] powerpc/perf: Simplify Makefile
  https://git.kernel.org/powerpc/c/87f19ea10100892637d4eee9069fad4ed61cb6a5

cheers


Re: [PATCH] powerpc/mm/book3s64: Fix possible build error

2021-06-17 Thread Michael Ellerman
On Thu, 10 Jun 2021 14:06:39 +0530, Aneesh Kumar K.V wrote:
> Update _tlbiel_pid() such that we can avoid build errors like below when
> using this function in other places.
> 
> arch/powerpc/mm/book3s64/radix_tlb.c: In function ‘__radix__flush_tlb_range_psize’:
> arch/powerpc/mm/book3s64/radix_tlb.c:114:2: warning: ‘asm’ operand 3 probably does not match constraints
>   114 |  asm volatile(PPC_TLBIEL(%0, %4, %3, %2, %1)
>       |  ^~~
> arch/powerpc/mm/book3s64/radix_tlb.c:114:2: error: impossible constraint in ‘asm’
> make[4]: *** [scripts/Makefile.build:271: arch/powerpc/mm/book3s64/radix_tlb.o] Error 1
> m
> 
> [...]

Applied to powerpc/next.

[1/1] powerpc/mm/book3s64: Fix possible build error
  https://git.kernel.org/powerpc/c/07d8ad6fd8a3d47f50595ca4826f41dbf4f3a0c6

cheers


Re: [PATCH] powerpc: Move update_power8_hid0() into its only user

2021-06-17 Thread Michael Ellerman
On Wed, 9 Jun 2021 06:10:29 +0000 (UTC), Christophe Leroy wrote:
> update_power8_hid0() is used only by powernv platform subcore.c
> 
> Move it there.

Applied to powerpc/next.

[1/1] powerpc: Move update_power8_hid0() into its only user
  https://git.kernel.org/powerpc/c/ab3aab292cb2f417f63b8f4887c1dd01c2a831cd

cheers


Re: [PATCH] powerpc/kuap: Force inlining of all first level KUAP helpers.

2021-06-17 Thread Michael Ellerman
On Thu, 3 Jun 2021 09:13:54 +0000 (UTC), Christophe Leroy wrote:
> All KUAP helpers defined in asm/kup.h are single line functions
> that should be inlined. But on book3s/32 build, we get many
> instances of .
> 
> Force inlining of those helpers.

Applied to powerpc/next.

[1/1] powerpc/kuap: Force inlining of all first level KUAP helpers.
  https://git.kernel.org/powerpc/c/240efd717c415e69511780044f44416bdf161523

cheers


Re: [PATCH] powerpc: Force inlining of csum_add()

2021-06-17 Thread Michael Ellerman
On Tue, 11 May 2021 06:08:06 +0000 (UTC), Christophe Leroy wrote:
> Commit 328e7e487a46 ("powerpc: force inlining of csum_partial() to
> avoid multiple csum_partial() with GCC10") inlined csum_partial().
> 
> Now that csum_partial() is inlined, GCC outlines csum_add() when
> called by csum_partial().
> 
> c064fb28 <csum_add>:
> c064fb28: 7c 63 20 14 addc    r3,r3,r4
> c064fb2c: 7c 63 01 94 addze   r3,r3
> c064fb30: 4e 80 00 20 blr
> 
> [...]

Applied to powerpc/next.

[1/1] powerpc: Force inlining of csum_add()
  https://git.kernel.org/powerpc/c/4423eff71ca6b8f2c5e0fc4cea33d8cdfe3c3740

cheers
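
For readers who want to see the operation behind the three-instruction listing, here is a minimal C sketch of a 32-bit ones' complement add with forced inlining. The demo_ names are hypothetical; the arithmetic mirrors the addc/addze pair (add, then add the carry back in), and the attribute is the usual GCC/clang spelling of "always inline".

```c
#include <stdint.h>

#if defined(__GNUC__) || defined(__clang__)
# define demo_always_inline inline __attribute__((always_inline))
#else
# define demo_always_inline inline
#endif

/* 32-bit ones' complement add: fold the carry back into the sum. */
static demo_always_inline uint32_t demo_csum_add(uint32_t csum, uint32_t addend)
{
	uint32_t res = csum + addend;

	return res + (res < addend);
}
```

With the attribute in place the compiler is not free to emit an out-of-line copy and a call for what amounts to two instructions.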


Re: [PATCH 1/3] powerpc: Define empty_zero_page[] in C

2021-06-17 Thread Michael Ellerman
On Mon, 7 Jun 2021 10:56:04 +0000 (UTC), Christophe Leroy wrote:
> Currently, empty_zero_page[] is defined in each
> platform head.S.
> 
> Define it in mm/mem.c instead, and put it in the BSS section instead
> of the DATA section. Commit 5227cfa71f9e ("arm64: mm: place
> empty_zero_page in bss") explains why it is useful to have
> it in BSS.

Applied to powerpc/next.

[1/3] powerpc: Define empty_zero_page[] in C
  https://git.kernel.org/powerpc/c/45b30fafe528601f1a4449c9d68d8ebe7bbc39ad
[2/3] powerpc: Define swapper_pg_dir[] in C
  https://git.kernel.org/powerpc/c/e72421a085a8dc81c71b0daeb89612279c2c621c
[3/3] powerpc/32s: Rename PTE_SIZE to PTE_T_SIZE
  https://git.kernel.org/powerpc/c/91e9ee7e949bff08cc3845a4811185e826b6e2f1

cheers
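
A sketch of what defining such a page in C can look like. The names and the 4096-byte page size are assumptions for illustration; the kernel uses its own PAGE_SIZE and section annotations, which the quoted text does not show.

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_PAGE_SIZE 4096	/* assumed page size for the sketch */

/*
 * Static storage with no initializer: the array is zero-filled and the
 * toolchain can place it in .bss, which typically takes no room in the
 * image file.
 */
unsigned long demo_zero_page[DEMO_PAGE_SIZE / sizeof(unsigned long)]
	__attribute__((aligned(DEMO_PAGE_SIZE)));

/* Returns 1 when every word of the page is zero. */
static int demo_page_is_zero(void)
{
	size_t i;

	for (i = 0; i < sizeof(demo_zero_page) / sizeof(demo_zero_page[0]); i++)
		if (demo_zero_page[i])
			return 0;
	return 1;
}
```

An explicit non-zero initializer would force the array into the data section instead, which is the situation the patch moves away from.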


Re: [PATCH 1/2] powerpc/64: drop redundant defination of spin_until_cond

2021-06-17 Thread Michael Ellerman
On Fri, 11 Jun 2021 19:10:57 +0000 (UTC), Christophe Leroy wrote:
> linux/processor.h has exactly the same definition of spin_until_cond.
> Drop the redundant definition in asm/processor.h.

Applied to powerpc/next.

[1/2] powerpc/64: drop redundant defination of spin_until_cond
  https://git.kernel.org/powerpc/c/db8f7066dc498acf9074ed3c11a7a24f318d8d4f
[2/2] powerpc/watchdog: include linux/processor.h for spin_until_cond
  https://git.kernel.org/powerpc/c/2400c13c437debc99d3399a7100d4e8c3fe20a08

cheers


Re: [PATCH V3 0/2] selftests/powerpc: Updates to EBB selftest for ISA v3.1

2021-06-17 Thread Michael Ellerman
On Tue, 25 May 2021 09:51:41 -0400, Athira Rajeev wrote:
> The "no_handler_test" in ebb selftests attempts to read the PMU
> registers after closing of the event via helper function
> "dump_ebb_state". With the MMCR0 control bit (PMCCEXT) in ISA v3.1,
> read access to group B registers is restricted when MMCR0 PMCC=0b00.
> Hence the call to dump_ebb_state after closing of event will generate
> a SIGILL, which is expected.
> 
> [...]

Applied to powerpc/next.

[1/2] selftests/powerpc: Fix "no_handler" EBB selftest
  https://git.kernel.org/powerpc/c/45677c9aebe926192e59475b35a1ff35ff2d4217
[2/2] selftests/powerpc: EBB selftest for MMCR0 control for PMU SPRs in ISA v3.1
  https://git.kernel.org/powerpc/c/d81090ed44c0d15abf2b07663d5f0b9e5ba51525

cheers


Re: [PATCH v1 1/1] powerpc/prom_init: Move custom isspace() to its own namespace

2021-06-17 Thread Michael Ellerman
On Mon, 10 May 2021 17:49:25 +0300, Andy Shevchenko wrote:
> If for some reason any of the headers includes ctype.h,
> we will have a name collision. Avoid this by moving isspace()
> to a dedicated namespace.
> 
> First appearance of the code is in the commit cf68787b68a2
> ("powerpc/prom_init: Evaluate mem kernel parameter for early allocation").

Applied to powerpc/next.

[1/1] powerpc/prom_init: Move custom isspace() to its own namespace
  https://git.kernel.org/powerpc/c/4cfdd9201cfb85538975f5c8fb83941c3d463ed2

cheers
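
A minimal sketch of the namespacing idea: give the private helper its own prefix so that including ctype.h, directly or through another header, cannot clash with it. The function name is illustrative and is not claimed to be the one used by the patch.

```c
#include <ctype.h>	/* pulled in on purpose to show there is no clash */

/* Private helper with its own prefix; matches space and \t \n \v \f \r. */
static int demo_prom_isspace(int c)
{
	return c == ' ' || (c >= '\t' && c <= '\r');
}
```

Had the helper been called isspace(), the include above could turn its definition into a conflicting declaration or a macro expansion.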


[PATCH v2 6/9] powerpc/microwatt: Add support for hardware random number generator

2021-06-17 Thread Paul Mackerras
Microwatt's hardware RNG is accessed using the DARN instruction.

Signed-off-by: Paul Mackerras 
---
 arch/powerpc/platforms/microwatt/Kconfig  |  1 +
 arch/powerpc/platforms/microwatt/Makefile |  2 +-
 arch/powerpc/platforms/microwatt/rng.c| 48 +++
 3 files changed, 50 insertions(+), 1 deletion(-)
 create mode 100644 arch/powerpc/platforms/microwatt/rng.c

diff --git a/arch/powerpc/platforms/microwatt/Kconfig b/arch/powerpc/platforms/microwatt/Kconfig
index 50ed0cedb5f1..8f6a81978461 100644
--- a/arch/powerpc/platforms/microwatt/Kconfig
+++ b/arch/powerpc/platforms/microwatt/Kconfig
@@ -7,6 +7,7 @@ config PPC_MICROWATT
select PPC_ICP_NATIVE
select PPC_NATIVE
select PPC_UDBG_16550
+   select ARCH_RANDOM
help
   This option enables support for FPGA-based Microwatt implementations.
 
diff --git a/arch/powerpc/platforms/microwatt/Makefile b/arch/powerpc/platforms/microwatt/Makefile
index e6885b3b2ee7..116d6d3ad3f0 100644
--- a/arch/powerpc/platforms/microwatt/Makefile
+++ b/arch/powerpc/platforms/microwatt/Makefile
@@ -1 +1 @@
-obj-y  += setup.o
+obj-y  += setup.o rng.o
diff --git a/arch/powerpc/platforms/microwatt/rng.c b/arch/powerpc/platforms/microwatt/rng.c
new file mode 100644
index 000000000000..3d8ee6eb7dad
--- /dev/null
+++ b/arch/powerpc/platforms/microwatt/rng.c
@@ -0,0 +1,48 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * Derived from arch/powerpc/platforms/powernv/rng.c, which is:
+ * Copyright 2013, Michael Ellerman, IBM Corporation.
+ */
+
+#define pr_fmt(fmt) "microwatt-rng: " fmt
+
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#define DARN_ERR 0xFFFFFFFFFFFFFFFFul
+
+int microwatt_get_random_darn(unsigned long *v)
+{
+   unsigned long val;
+
+   /* Using DARN with L=1 - 64-bit conditioned random number */
+   asm volatile(PPC_DARN(%0, 1) : "=r"(val));
+
+   if (val == DARN_ERR)
+   return 0;
+
+   *v = val;
+
+   return 1;
+}
+
+static __init int rng_init(void)
+{
+   unsigned long val;
+   int i;
+
+   for (i = 0; i < 10; i++) {
+   if (microwatt_get_random_darn(&val)) {
+   ppc_md.get_random_seed = microwatt_get_random_darn;
+   return 0;
+   }
+   }
+
+   pr_warn("Unable to use DARN for get_random_seed()\n");
+
+   return -EIO;
+}
+machine_subsys_initcall(, rng_init);
-- 
2.31.1
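
The probe-and-retry pattern in rng_init() can be tried out in userspace with a fake entropy source. Everything below is a stand-in: the demo_ names are hypothetical, the source function replaces the DARN instruction, and the all-ones sentinel is an assumption about how a failed read is reported.

```c
#define DEMO_ERR (~0ul)	/* assumed all-ones sentinel for a failed read */

static int demo_failures_left;	/* test knob: how many reads fail first */

/* Hypothetical entropy source: fails a few times, then returns a value. */
static unsigned long demo_read_source(void)
{
	if (demo_failures_left > 0) {
		demo_failures_left--;
		return DEMO_ERR;
	}
	return 0x1234ul;
}

/* Returns 1 and stores a value on success, 0 on failure. */
static int demo_get_random(unsigned long *v)
{
	unsigned long val = demo_read_source();

	if (val == DEMO_ERR)
		return 0;
	*v = val;
	return 1;
}

/* Probe up to ten times before giving up, as the init function does. */
static int demo_rng_init(void)
{
	unsigned long val;
	int i;

	for (i = 0; i < 10; i++)
		if (demo_get_random(&val))
			return 0;
	return -1;
}
```

A bounded number of attempts tolerates transient failures while still letting initialisation give up on hardware that never produces a value.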



[PATCH v2 0/9] powerpc: Add support for Microwatt soft-core

2021-06-17 Thread Paul Mackerras
This series of patches adds support for the Microwatt soft-core.
Microwatt is an open-source 64-bit Power ISA processor written in VHDL
which targets medium-sized FPGAs such as the Xilinx Artix-7 or the
Lattice ECP5.  Microwatt currently implements the scalar fixed plus
floating-point subset of Power ISA v3.0B plus the radix MMU, but not
logical partitioning (i.e. it does not have hypervisor mode or nested
radix translation).

Changes in v2:

- Dropped the patch that adds support for the PRTBL register, since it
  is not architected.  Instead, I have added support for a 1-entry
  partition table to Microwatt and implemented the PTCR register.

- Updated the device tree.

- Dropped the change to archrandom.h.

- Combined patches 10 and 11 of the previous series into one.

Paul.

 arch/powerpc/Kconfig  |   2 +-
 arch/powerpc/boot/Makefile|   4 +
 arch/powerpc/boot/devtree.c   |  59 ---
 arch/powerpc/boot/dts/microwatt.dts   | 138 
 arch/powerpc/boot/microwatt.c |  24 +++
 arch/powerpc/boot/ns16550.c   |   9 +-
 arch/powerpc/boot/wrapper |   5 +
 arch/powerpc/configs/microwatt_defconfig  |  98 
 arch/powerpc/kernel/udbg_16550.c  |  39 +
 arch/powerpc/platforms/Kconfig|   1 +
 arch/powerpc/platforms/Makefile   |   1 +
 arch/powerpc/platforms/microwatt/Kconfig  |  13 ++
 arch/powerpc/platforms/microwatt/Makefile |   1 +
 arch/powerpc/platforms/microwatt/rng.c|  48 ++
 arch/powerpc/platforms/microwatt/setup.c  |  41 +
 arch/powerpc/sysdev/xics/Kconfig  |   3 +
 arch/powerpc/sysdev/xics/Makefile |   1 +
 arch/powerpc/sysdev/xics/ics-native.c | 257 ++
 arch/powerpc/sysdev/xics/xics-common.c|   2 +
 19 files changed, 718 insertions(+), 28 deletions(-)


[PATCH v2 4/9] powerpc/xics: Add a native ICS backend for microwatt

2021-06-17 Thread Paul Mackerras
From: Benjamin Herrenschmidt 

This is a simple native ICS backend that matches the layout of
the Microwatt implementation of ICS.

Signed-off-by: Benjamin Herrenschmidt 
Signed-off-by: Paul Mackerras 
---
 arch/powerpc/boot/dts/microwatt.dts  |  18 ++
 arch/powerpc/platforms/microwatt/Kconfig |   2 +
 arch/powerpc/platforms/microwatt/setup.c |   8 +
 arch/powerpc/sysdev/xics/Kconfig |   3 +
 arch/powerpc/sysdev/xics/Makefile|   1 +
 arch/powerpc/sysdev/xics/ics-native.c| 257 +++
 arch/powerpc/sysdev/xics/xics-common.c   |   2 +
 7 files changed, 291 insertions(+)
 create mode 100644 arch/powerpc/sysdev/xics/ics-native.c

diff --git a/arch/powerpc/boot/dts/microwatt.dts b/arch/powerpc/boot/dts/microwatt.dts
index 9b6140c90370..04e5dd92270e 100644
--- a/arch/powerpc/boot/dts/microwatt.dts
+++ b/arch/powerpc/boot/dts/microwatt.dts
@@ -99,7 +99,25 @@ soc@c0000000 {
compatible = "simple-bus";
#address-cells = <1>;
#size-cells = <1>;
+   interrupt-parent = <&ICS>;
 
ranges = <0 0 0xc0000000 0x40000000>;
+
+   interrupt-controller@4000 {
+   compatible = "openpower,xics-presentation", "ibm,ppc-xicp";
+   ibm,interrupt-server-ranges = <0x0 0x1>;
+   reg = <0x4000 0x100>;
+   };
+
+   ICS: interrupt-controller@5000 {
+   compatible = "openpower,xics-sources";
+   interrupt-controller;
+   interrupt-ranges = <0x10 0x10>;
+   reg = <0x5000 0x100>;
+   #address-cells = <0>;
+   #size-cells = <0>;
+   #interrupt-cells = <2>;
+   };
+
};
 };
diff --git a/arch/powerpc/platforms/microwatt/Kconfig b/arch/powerpc/platforms/microwatt/Kconfig
index 3be01e78ce57..b52c869c0eb8 100644
--- a/arch/powerpc/platforms/microwatt/Kconfig
+++ b/arch/powerpc/platforms/microwatt/Kconfig
@@ -3,6 +3,8 @@ config PPC_MICROWATT
depends on PPC_BOOK3S_64 && !SMP
bool "Microwatt SoC platform"
select PPC_XICS
+   select PPC_ICS_NATIVE
+   select PPC_ICP_NATIVE
select PPC_NATIVE
help
   This option enables support for FPGA-based Microwatt implementations.
diff --git a/arch/powerpc/platforms/microwatt/setup.c 
b/arch/powerpc/platforms/microwatt/setup.c
index 5af4adf881bc..1c1b7791fa57 100644
--- a/arch/powerpc/platforms/microwatt/setup.c
+++ b/arch/powerpc/platforms/microwatt/setup.c
@@ -10,8 +10,15 @@
 #include 
 #include 
 #include 
+
 #include 
 #include 
+#include 
+
+static void __init microwatt_init_IRQ(void)
+{
+   xics_init();
+}
 
 static int __init microwatt_probe(void)
 {
@@ -27,5 +34,6 @@ machine_arch_initcall(microwatt, microwatt_populate);
 define_machine(microwatt) {
.name   = "microwatt",
.probe  = microwatt_probe,
+   .init_IRQ   = microwatt_init_IRQ,
.calibrate_decr = generic_calibrate_decr,
 };
diff --git a/arch/powerpc/sysdev/xics/Kconfig b/arch/powerpc/sysdev/xics/Kconfig
index 304614c920aa..063d9195891f 100644
--- a/arch/powerpc/sysdev/xics/Kconfig
+++ b/arch/powerpc/sysdev/xics/Kconfig
@@ -12,3 +12,6 @@ config PPC_ICP_HV
 
 config PPC_ICS_RTAS
def_bool n
+
+config PPC_ICS_NATIVE
+   def_bool n
diff --git a/arch/powerpc/sysdev/xics/Makefile 
b/arch/powerpc/sysdev/xics/Makefile
index ba1e3117b1c0..747063927c6c 100644
--- a/arch/powerpc/sysdev/xics/Makefile
+++ b/arch/powerpc/sysdev/xics/Makefile
@@ -4,4 +4,5 @@ obj-y   += xics-common.o
 obj-$(CONFIG_PPC_ICP_NATIVE)   += icp-native.o
 obj-$(CONFIG_PPC_ICP_HV)   += icp-hv.o
 obj-$(CONFIG_PPC_ICS_RTAS) += ics-rtas.o
+obj-$(CONFIG_PPC_ICS_NATIVE)   += ics-native.o
 obj-$(CONFIG_PPC_POWERNV)  += ics-opal.o icp-opal.o
diff --git a/arch/powerpc/sysdev/xics/ics-native.c 
b/arch/powerpc/sysdev/xics/ics-native.c
new file mode 100644
index ..d450502f4053
--- /dev/null
+++ b/arch/powerpc/sysdev/xics/ics-native.c
@@ -0,0 +1,257 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * ICS backend for OPAL managed interrupts.
+ *
+ * Copyright 2011 IBM Corp.
+ */
+
+//#define DEBUG
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+struct ics_native {
+   struct ics  ics;
+   struct device_node  *node;
+   void __iomem*base;
+   u32 ibase;
+   u32 icount;
+};
+#define to_ics_native(_ics) container_of(_ics, struct ics_native, ics)
+
+static void __iomem *ics_native_xive(struct ics_native *in, unsigned int vec)
+{
+   return in->base + 0x800 + ((vec - in->ibase) << 2);
+}
+
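For readers following the address arithmetic in ics_native_xive() above, here is a minimal standalone sketch. It models only the offset from the start of the ICS register window, not the ioremap'd pointer. The helper name xive_offset() is made up for this illustration, and the first source number 0x10 used in the note below is taken from the interrupt-ranges property in the ICS node of this patch.

```c
#include <stdint.h>

/*
 * Offset of the per-source XIVE register from the start of the ICS
 * register window: the registers start at 0x800 and are 4 bytes apart.
 * 'ibase' is the first interrupt number handled by this ICS.
 * Hypothetical helper for illustration, not part of the patch.
 */
static uint32_t xive_offset(uint32_t ibase, uint32_t vec)
{
	return 0x800 + ((vec - ibase) << 2);
}
```

With ibase = 0x10, source 0x10 lands at offset 0x800 and source 0x11 at offset 0x804.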

[PATCH v2 3/9] powerpc/microwatt: Populate platform bus from device-tree

2021-06-17 Thread Paul Mackerras
From: Benjamin Herrenschmidt 

Just like any other embedded platform.

Add an empty soc node.

Signed-off-by: Benjamin Herrenschmidt 
Signed-off-by: Paul Mackerras 
---
 arch/powerpc/boot/dts/microwatt.dts  | 7 +++
 arch/powerpc/platforms/microwatt/setup.c | 8 
 2 files changed, 15 insertions(+)

diff --git a/arch/powerpc/boot/dts/microwatt.dts 
b/arch/powerpc/boot/dts/microwatt.dts
index ac264ad3faaf..9b6140c90370 100644
--- a/arch/powerpc/boot/dts/microwatt.dts
+++ b/arch/powerpc/boot/dts/microwatt.dts
@@ -95,4 +95,11 @@ chosen {
  00 00 00 00 00 00 00 00 40 00 40];
};
 
+   soc@c000 {
+   compatible = "simple-bus";
+   #address-cells = <1>;
+   #size-cells = <1>;
+
+   ranges = <0 0 0xc000 0x4000>;
+   };
 };
diff --git a/arch/powerpc/platforms/microwatt/setup.c 
b/arch/powerpc/platforms/microwatt/setup.c
index d80d52612672..5af4adf881bc 100644
--- a/arch/powerpc/platforms/microwatt/setup.c
+++ b/arch/powerpc/platforms/microwatt/setup.c
@@ -8,6 +8,8 @@
 #include 
 #include 
 #include 
+#include 
+#include 
 #include 
 #include 
 
@@ -16,6 +18,12 @@ static int __init microwatt_probe(void)
return of_machine_is_compatible("microwatt-soc");
 }
 
+static int __init microwatt_populate(void)
+{
+   return of_platform_default_populate(NULL, NULL, NULL);
+}
+machine_arch_initcall(microwatt, microwatt_populate);
+
 define_machine(microwatt) {
.name   = "microwatt",
.probe  = microwatt_probe,
-- 
2.31.1



[PATCH v2 1/9] powerpc: Add Microwatt platform

2021-06-17 Thread Paul Mackerras
Microwatt is an FPGA-based implementation of the Power ISA.  It
currently only implements little-endian 64-bit mode, and does
not (yet) support SMP, VMX, VSX or transactional memory.  It has an
optional FPU, and an optional MMU (required for running Linux,
obviously) which implements a configurable radix tree but not
hypervisor mode or nested radix translation.

This adds a new machine type to support FPGA-based SoCs with a
Microwatt core.  CONFIG_MATH_EMULATION can be selected for Microwatt
SOCs which don't have the FPU.

Signed-off-by: Paul Mackerras 
---
 arch/powerpc/Kconfig  |  2 +-
 arch/powerpc/platforms/Kconfig|  1 +
 arch/powerpc/platforms/Makefile   |  1 +
 arch/powerpc/platforms/microwatt/Kconfig  |  9 +
 arch/powerpc/platforms/microwatt/Makefile |  1 +
 arch/powerpc/platforms/microwatt/setup.c  | 23 +++
 6 files changed, 36 insertions(+), 1 deletion(-)
 create mode 100644 arch/powerpc/platforms/microwatt/Kconfig
 create mode 100644 arch/powerpc/platforms/microwatt/Makefile
 create mode 100644 arch/powerpc/platforms/microwatt/setup.c

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 386ae12d8523..5ce51c38a346 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -422,7 +422,7 @@ config HUGETLB_PAGE_SIZE_VARIABLE
 
 config MATH_EMULATION
bool "Math emulation"
-   depends on 4xx || PPC_8xx || PPC_MPC832x || BOOKE
+   depends on 4xx || PPC_8xx || PPC_MPC832x || BOOKE || PPC_MICROWATT
select PPC_FPU_REGS
help
  Some PowerPC chips designed for embedded applications do not have
diff --git a/arch/powerpc/platforms/Kconfig b/arch/powerpc/platforms/Kconfig
index 7a5e8f4541e3..74be4d06afbf 100644
--- a/arch/powerpc/platforms/Kconfig
+++ b/arch/powerpc/platforms/Kconfig
@@ -20,6 +20,7 @@ source "arch/powerpc/platforms/embedded6xx/Kconfig"
 source "arch/powerpc/platforms/44x/Kconfig"
 source "arch/powerpc/platforms/40x/Kconfig"
 source "arch/powerpc/platforms/amigaone/Kconfig"
+source "arch/powerpc/platforms/microwatt/Kconfig"
 
 config KVM_GUEST
bool "KVM Guest support"
diff --git a/arch/powerpc/platforms/Makefile b/arch/powerpc/platforms/Makefile
index 143d4417f6cc..edcb54cdb1a8 100644
--- a/arch/powerpc/platforms/Makefile
+++ b/arch/powerpc/platforms/Makefile
@@ -22,3 +22,4 @@ obj-$(CONFIG_PPC_CELL)+= cell/
 obj-$(CONFIG_PPC_PS3)  += ps3/
 obj-$(CONFIG_EMBEDDED6xx)  += embedded6xx/
 obj-$(CONFIG_AMIGAONE) += amigaone/
+obj-$(CONFIG_PPC_MICROWATT)+= microwatt/
diff --git a/arch/powerpc/platforms/microwatt/Kconfig 
b/arch/powerpc/platforms/microwatt/Kconfig
new file mode 100644
index ..3be01e78ce57
--- /dev/null
+++ b/arch/powerpc/platforms/microwatt/Kconfig
@@ -0,0 +1,9 @@
+# SPDX-License-Identifier: GPL-2.0
+config PPC_MICROWATT
+   depends on PPC_BOOK3S_64 && !SMP
+   bool "Microwatt SoC platform"
+   select PPC_XICS
+   select PPC_NATIVE
+   help
+  This option enables support for FPGA-based Microwatt implementations.
+
diff --git a/arch/powerpc/platforms/microwatt/Makefile 
b/arch/powerpc/platforms/microwatt/Makefile
new file mode 100644
index ..e6885b3b2ee7
--- /dev/null
+++ b/arch/powerpc/platforms/microwatt/Makefile
@@ -0,0 +1 @@
+obj-y  += setup.o
diff --git a/arch/powerpc/platforms/microwatt/setup.c 
b/arch/powerpc/platforms/microwatt/setup.c
new file mode 100644
index ..d80d52612672
--- /dev/null
+++ b/arch/powerpc/platforms/microwatt/setup.c
@@ -0,0 +1,23 @@
+/*
+ * Microwatt FPGA-based SoC platform setup code.
+ *
+ * Copyright 2020 Paul Mackerras (pau...@ozlabs.org), IBM Corp.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+static int __init microwatt_probe(void)
+{
+   return of_machine_is_compatible("microwatt-soc");
+}
+
+define_machine(microwatt) {
+   .name   = "microwatt",
+   .probe  = microwatt_probe,
+   .calibrate_decr = generic_calibrate_decr,
+};
-- 
2.31.1



[PATCH v2 8/9] powerpc/boot: Fixup device-tree on little endian

2021-06-17 Thread Paul Mackerras
From: Benjamin Herrenschmidt 

This fixes the core devtree.c functions and the ns16550 UART backend.

Signed-off-by: Benjamin Herrenschmidt 
Signed-off-by: Paul Mackerras 
---
 arch/powerpc/boot/devtree.c | 59 +
 arch/powerpc/boot/ns16550.c |  9 --
 2 files changed, 41 insertions(+), 27 deletions(-)

diff --git a/arch/powerpc/boot/devtree.c b/arch/powerpc/boot/devtree.c
index 5d91036ad626..58fbcfcc98c9 100644
--- a/arch/powerpc/boot/devtree.c
+++ b/arch/powerpc/boot/devtree.c
@@ -13,6 +13,7 @@
 #include "string.h"
 #include "stdio.h"
 #include "ops.h"
+#include "of.h"
 
 void dt_fixup_memory(u64 start, u64 size)
 {
@@ -23,21 +24,25 @@ void dt_fixup_memory(u64 start, u64 size)
root = finddevice("/");
if (getprop(root, "#address-cells", , sizeof(naddr)) < 0)
naddr = 2;
+   else
+   naddr = be32_to_cpu(naddr);
if (naddr < 1 || naddr > 2)
fatal("Can't cope with #address-cells == %d in /\n\r", naddr);
 
if (getprop(root, "#size-cells", , sizeof(nsize)) < 0)
nsize = 1;
+   else
+   nsize = be32_to_cpu(nsize);
if (nsize < 1 || nsize > 2)
fatal("Can't cope with #size-cells == %d in /\n\r", nsize);
 
i = 0;
if (naddr == 2)
-   memreg[i++] = start >> 32;
-   memreg[i++] = start & 0x;
+   memreg[i++] = cpu_to_be32(start >> 32);
+   memreg[i++] = cpu_to_be32(start & 0x);
if (nsize == 2)
-   memreg[i++] = size >> 32;
-   memreg[i++] = size & 0x;
+   memreg[i++] = cpu_to_be32(size >> 32);
+   memreg[i++] = cpu_to_be32(size & 0x);
 
memory = finddevice("/memory");
if (! memory) {
@@ -45,9 +50,9 @@ void dt_fixup_memory(u64 start, u64 size)
setprop_str(memory, "device_type", "memory");
}
 
-   printf("Memory <- <0x%x", memreg[0]);
+   printf("Memory <- <0x%x", be32_to_cpu(memreg[0]));
for (i = 1; i < (naddr + nsize); i++)
-   printf(" 0x%x", memreg[i]);
+   printf(" 0x%x", be32_to_cpu(memreg[i]));
printf("> (%ldMB)\n\r", (unsigned long)(size >> 20));
 
setprop(memory, "reg", memreg, (naddr + nsize)*sizeof(u32));
@@ -65,10 +70,10 @@ void dt_fixup_cpu_clocks(u32 cpu, u32 tb, u32 bus)
printf("CPU bus-frequency <- 0x%x (%dMHz)\n\r", bus, MHZ(bus));
 
while ((devp = find_node_by_devtype(devp, "cpu"))) {
-   setprop_val(devp, "clock-frequency", cpu);
-   setprop_val(devp, "timebase-frequency", tb);
+   setprop_val(devp, "clock-frequency", cpu_to_be32(cpu));
+   setprop_val(devp, "timebase-frequency", cpu_to_be32(tb));
if (bus > 0)
-   setprop_val(devp, "bus-frequency", bus);
+   setprop_val(devp, "bus-frequency", cpu_to_be32(bus));
}
 
timebase_period_ns = 10 / tb;
@@ -80,7 +85,7 @@ void dt_fixup_clock(const char *path, u32 freq)
 
if (devp) {
printf("%s: clock-frequency <- %x (%dMHz)\n\r", path, freq, 
MHZ(freq));
-   setprop_val(devp, "clock-frequency", freq);
+   setprop_val(devp, "clock-frequency", cpu_to_be32(freq));
}
 }
 
@@ -133,8 +138,12 @@ void dt_get_reg_format(void *node, u32 *naddr, u32 *nsize)
 {
if (getprop(node, "#address-cells", naddr, 4) != 4)
*naddr = 2;
+   else
+   *naddr = be32_to_cpu(*naddr);
if (getprop(node, "#size-cells", nsize, 4) != 4)
*nsize = 1;
+   else
+   *nsize = be32_to_cpu(*nsize);
 }
 
 static void copy_val(u32 *dest, u32 *src, int naddr)
@@ -163,9 +172,9 @@ static int add_reg(u32 *reg, u32 *add, int naddr)
int i, carry = 0;
 
for (i = MAX_ADDR_CELLS - 1; i >= MAX_ADDR_CELLS - naddr; i--) {
-   u64 tmp = (u64)reg[i] + add[i] + carry;
+   u64 tmp = (u64)be32_to_cpu(reg[i]) + be32_to_cpu(add[i]) + 
carry;
carry = tmp >> 32;
-   reg[i] = (u32)tmp;
+   reg[i] = cpu_to_be32((u32)tmp);
}
 
return !carry;
@@ -180,18 +189,18 @@ static int compare_reg(u32 *reg, u32 *range, u32 
*rangesize)
u32 end;
 
for (i = 0; i < MAX_ADDR_CELLS; i++) {
-   if (reg[i] < range[i])
+   if (be32_to_cpu(reg[i]) < be32_to_cpu(range[i]))
return 0;
-   if (reg[i] > range[i])
+   if (be32_to_cpu(reg[i]) > be32_to_cpu(range[i]))
break;
}
 
for (i = 0; i < MAX_ADDR_CELLS; i++) {
-   end = range[i] + rangesize[i];
+   end = be32_to_cpu(range[i]) + be32_to_cpu(rangesize[i]);
 
-   if (reg[i] < end)
+   if (be32_to_cpu(reg[i]) < end)
break;
-   if (reg[i] > 

[PATCH v2 9/9] powerpc/boot: Add a boot wrapper for Microwatt

2021-06-17 Thread Paul Mackerras
From: Joel Stanley 

This allows microwatt's kernel to be built with an embedded device tree.

Load arch/powerpc/boot/dtbImage.microwatt to 0x50:

 mw_debug -b fpga stop load arch/powerpc/boot/dtbImage.microwatt 50 start

Signed-off-by: Joel Stanley 
Signed-off-by: Paul Mackerras 
---
 arch/powerpc/boot/Makefile|  4 
 arch/powerpc/boot/microwatt.c | 24 
 arch/powerpc/boot/wrapper |  5 +
 3 files changed, 33 insertions(+)
 create mode 100644 arch/powerpc/boot/microwatt.c

diff --git a/arch/powerpc/boot/Makefile b/arch/powerpc/boot/Makefile
index 2b8da923ceca..dfaa4094fcae 100644
--- a/arch/powerpc/boot/Makefile
+++ b/arch/powerpc/boot/Makefile
@@ -163,6 +163,8 @@ src-plat-$(CONFIG_PPC_POWERNV) += pseries-head.S
 src-plat-$(CONFIG_PPC_IBM_CELL_BLADE) += pseries-head.S
 src-plat-$(CONFIG_MVME7100) += motload-head.S mvme7100.c
 
+src-plat-$(CONFIG_PPC_MICROWATT) += fixed-head.S microwatt.c
+
 src-wlib := $(sort $(src-wlib-y))
 src-plat := $(sort $(src-plat-y))
 src-boot := $(src-wlib) $(src-plat) empty.c
@@ -355,6 +357,8 @@ image-$(CONFIG_MVME5100)+= dtbImage.mvme5100
 # Board port in arch/powerpc/platform/amigaone/Kconfig
 image-$(CONFIG_AMIGAONE)   += cuImage.amigaone
 
+image-$(CONFIG_PPC_MICROWATT)  += dtbImage.microwatt
+
 # For 32-bit powermacs, build the COFF and miboot images
 # as well as the ELF images.
 ifdef CONFIG_PPC32
diff --git a/arch/powerpc/boot/microwatt.c b/arch/powerpc/boot/microwatt.c
new file mode 100644
index ..ca9d83617fc1
--- /dev/null
+++ b/arch/powerpc/boot/microwatt.c
@@ -0,0 +1,24 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+
+#include 
+#include "stdio.h"
+#include "types.h"
+#include "io.h"
+#include "ops.h"
+
+BSS_STACK(8192);
+
+void platform_init(unsigned long r3, unsigned long r4, unsigned long r5)
+{
+   unsigned long heapsize = 16*1024*1024 - (unsigned long)_end;
+
+   /*
+* Disable interrupts and turn off MSR_RI, since we'll
+* shortly be overwriting the interrupt vectors.
+*/
+   __asm__ volatile("mtmsrd %0,1" : : "r" (0));
+
+   simple_alloc_init(_end, heapsize, 32, 64);
+   fdt_init(_dtb_start);
+   serial_console_init();
+}
diff --git a/arch/powerpc/boot/wrapper b/arch/powerpc/boot/wrapper
index 41fa0a8715e3..ae48fffa1e13 100755
--- a/arch/powerpc/boot/wrapper
+++ b/arch/powerpc/boot/wrapper
@@ -342,6 +342,11 @@ gamecube|wii)
 link_address='0x60'
 platformo="$object/$platform-head.o $object/$platform.o"
 ;;
+microwatt)
+link_address='0x50'
+platformo="$object/fixed-head.o $object/$platform.o"
+binary=y
+;;
 treeboot-currituck)
 link_address='0x100'
 ;;
-- 
2.31.1
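A small sketch of the heap sizing in platform_init() above, for anyone checking the arithmetic: the heap is whatever lies between the end of the wrapper image (_end) and the 16MB mark. The function name and the example end address in the note are made up for illustration, and the sketch assumes the image ends below 16MB.

```c
/* Top of the region handed to the simple allocator: 16MB. */
#define HEAP_TOP (16ul * 1024 * 1024)

/*
 * Heap size for a wrapper image that ends at 'end_addr'.
 * Hypothetical helper; platform_init() open-codes this subtraction.
 */
static unsigned long wrapper_heap_size(unsigned long end_addr)
{
	return HEAP_TOP - end_addr;
}
```

For a hypothetical image ending at 0x600000 this leaves 0xa00000 bytes (10MB) of heap.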



[PATCH v2 2/9] powerpc: Add Microwatt device tree

2021-06-17 Thread Paul Mackerras
Microwatt currently runs with MSR[HV] = 0, hence the usable-privilege
properties don't have bit 2 (for HV support) set, and we need the
/chosen/ibm,architecture-vec-5 property.

Signed-off-by: Paul Mackerras 
---
 arch/powerpc/boot/dts/microwatt.dts | 98 +
 1 file changed, 98 insertions(+)
 create mode 100644 arch/powerpc/boot/dts/microwatt.dts

diff --git a/arch/powerpc/boot/dts/microwatt.dts 
b/arch/powerpc/boot/dts/microwatt.dts
new file mode 100644
index ..ac264ad3faaf
--- /dev/null
+++ b/arch/powerpc/boot/dts/microwatt.dts
@@ -0,0 +1,98 @@
+/dts-v1/;
+
+/ {
+   #size-cells = <0x02>;
+   #address-cells = <0x02>;
+   model-name = "microwatt";
+   compatible = "microwatt-soc";
+
+   reserved-memory {
+   #size-cells = <0x02>;
+   #address-cells = <0x02>;
+   ranges;
+   };
+
+   memory@0 {
+   device_type = "memory";
+   reg = <0x 0x 0x 0x1000>;
+   };
+
+   cpus {
+   #size-cells = <0x00>;
+   #address-cells = <0x01>;
+
+   ibm,powerpc-cpu-features {
+   display-name = "Microwatt";
+   isa = <3000>;
+   device_type = "cpu-features";
+   compatible = "ibm,powerpc-cpu-features";
+
+   mmu-radix {
+   isa = <3000>;
+   usable-privilege = <2>;
+   };
+
+   little-endian {
+   isa = <2050>;
+   usable-privilege = <3>;
+   hwcap-bit-nr = <1>;
+   };
+
+   cache-inhibited-large-page {
+   isa = <2040>;
+   usable-privilege = <2>;
+   };
+
+   fixed-point-v3 {
+   isa = <3000>;
+   usable-privilege = <3>;
+   };
+
+   no-execute {
+   isa = <2010>;
+   usable-privilege = <2>;
+   };
+
+   floating-point {
+   hwcap-bit-nr = <27>;
+   isa = <0>;
+   usable-privilege = <3>;
+   };
+   };
+
+   PowerPC,Microwatt@0 {
+   i-cache-sets = <2>;
+   ibm,dec-bits = <64>;
+   reservation-granule-size = <64>;
+   clock-frequency = <1>;
+   timebase-frequency = <1>;
+   i-tlb-sets = <1>;
+   ibm,ppc-interrupt-server#s = <0>;
+   i-cache-block-size = <64>;
+   d-cache-block-size = <64>;
+   d-cache-sets = <2>;
+   i-tlb-size = <64>;
+   cpu-version = <0x99>;
+   status = "okay";
+   i-cache-size = <0x1000>;
+   ibm,processor-radix-AP-encodings = <0x0c 0xa010 
0x2015 0x401e>;
+   tlb-size = <0>;
+   tlb-sets = <0>;
+   device_type = "cpu";
+   d-tlb-size = <128>;
+   d-tlb-sets = <2>;
+   reg = <0>;
+   general-purpose;
+   64-bit;
+   d-cache-size = <0x1000>;
+   ibm,chip-id = <0>;
+   };
+   };
+
+   chosen {
+   bootargs = "";
+   ibm,architecture-vec-5 = [19 00 10 00 00 00 00 00 00 00 00 00 
00 00 00 00
+ 00 00 00 00 00 00 00 00 40 00 40];
+   };
+
+};
-- 
2.31.1
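To make the usable-privilege values in the device tree above easier to read, here is a small decoding sketch. The commit message only states that bit 2 means HV; the meaning of bits 0 and 1 (problem state and OS) and the macro and function names below are assumptions for the illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit layout of usable-privilege; only bit 2 is stated above. */
#define USABLE_PR (1u << 0)	/* problem state (userspace) */
#define USABLE_OS (1u << 1)	/* supervisor (OS kernel) */
#define USABLE_HV (1u << 2)	/* hypervisor */

/* True if the feature may be used by userspace. */
static bool usable_by_user(uint32_t usable_privilege)
{
	return (usable_privilege & USABLE_PR) != 0;
}

/* True if the feature may be used in hypervisor mode. */
static bool usable_in_hv(uint32_t usable_privilege)
{
	return (usable_privilege & USABLE_HV) != 0;
}
```

Under that reading, the value 2 on mmu-radix means OS only, and the value 3 on little-endian means OS plus userspace. Neither has the HV bit, which matches running with MSR[HV] = 0.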



[PATCH v2 5/9] powerpc/microwatt: Use standard 16550 UART for console

2021-06-17 Thread Paul Mackerras
From: Benjamin Herrenschmidt 

This adds support to the Microwatt platform to use the standard
16550-style UART which is available in the standalone Microwatt FPGA.

Signed-off-by: Benjamin Herrenschmidt 
Signed-off-by: Paul Mackerras 
---
 arch/powerpc/boot/dts/microwatt.dts  | 27 
 arch/powerpc/kernel/udbg_16550.c | 39 
 arch/powerpc/platforms/microwatt/Kconfig |  1 +
 arch/powerpc/platforms/microwatt/setup.c |  2 ++
 4 files changed, 63 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/boot/dts/microwatt.dts 
b/arch/powerpc/boot/dts/microwatt.dts
index 04e5dd92270e..974abbdda249 100644
--- a/arch/powerpc/boot/dts/microwatt.dts
+++ b/arch/powerpc/boot/dts/microwatt.dts
@@ -6,6 +6,10 @@ / {
model-name = "microwatt";
compatible = "microwatt-soc";
 
+   aliases {
+   serial0 = 
+   };
+
reserved-memory {
#size-cells = <0x02>;
#address-cells = <0x02>;
@@ -89,12 +93,6 @@ PowerPC,Microwatt@0 {
};
};
 
-   chosen {
-   bootargs = "";
-   ibm,architecture-vec-5 = [19 00 10 00 00 00 00 00 00 00 00 00 
00 00 00 00
- 00 00 00 00 00 00 00 00 40 00 40];
-   };
-
soc@c000 {
compatible = "simple-bus";
#address-cells = <1>;
@@ -119,5 +117,22 @@ ICS: interrupt-controller@5000 {
#interrupt-cells = <2>;
};
 
+   UART0: serial@2000 {
+   device_type = "serial";
+   compatible = "ns16550";
+   reg = <0x2000 0x8>;
+   clock-frequency = <1>;
+   current-speed = <115200>;
+   reg-shift = <2>;
+   fifo-size = <16>;
+   interrupts = <0x10 0x1>;
+   };
+   };
+
+   chosen {
+   bootargs = "";
+   ibm,architecture-vec-5 = [19 00 10 00 00 00 00 00 00 00 00 00 
00 00 00 00
+ 00 00 00 00 00 00 00 00 40 00 40];
+   stdout-path = 
};
 };
diff --git a/arch/powerpc/kernel/udbg_16550.c b/arch/powerpc/kernel/udbg_16550.c
index 9356b60d6030..8513aa49614e 100644
--- a/arch/powerpc/kernel/udbg_16550.c
+++ b/arch/powerpc/kernel/udbg_16550.c
@@ -296,3 +296,42 @@ void __init udbg_init_40x_realmode(void)
 }
 
 #endif /* CONFIG_PPC_EARLY_DEBUG_40x */
+
+#ifdef CONFIG_PPC_EARLY_DEBUG_MICROWATT
+
+#define UDBG_UART_MW_ADDR  ((void __iomem *)0xc0002000)
+
+static u8 udbg_uart_in_isa300_rm(unsigned int reg)
+{
+   uint64_t msr = mfmsr();
+   uint8_t  c;
+
+   mtmsr(msr & ~(MSR_EE|MSR_DR));
+   isync();
+   eieio();
+   c = __raw_rm_readb(UDBG_UART_MW_ADDR + (reg << 2));
+   mtmsr(msr);
+   isync();
+   return c;
+}
+
+static void udbg_uart_out_isa300_rm(unsigned int reg, u8 val)
+{
+   uint64_t msr = mfmsr();
+
+   mtmsr(msr & ~(MSR_EE|MSR_DR));
+   isync();
+   eieio();
+   __raw_rm_writeb(val, UDBG_UART_MW_ADDR + (reg << 2));
+   mtmsr(msr);
+   isync();
+}
+
+void __init udbg_init_debug_microwatt(void)
+{
+   udbg_uart_in = udbg_uart_in_isa300_rm;
+   udbg_uart_out = udbg_uart_out_isa300_rm;
+   udbg_use_uart();
+}
+
+#endif /* CONFIG_PPC_EARLY_DEBUG_MICROWATT */
diff --git a/arch/powerpc/platforms/microwatt/Kconfig 
b/arch/powerpc/platforms/microwatt/Kconfig
index b52c869c0eb8..50ed0cedb5f1 100644
--- a/arch/powerpc/platforms/microwatt/Kconfig
+++ b/arch/powerpc/platforms/microwatt/Kconfig
@@ -6,6 +6,7 @@ config PPC_MICROWATT
select PPC_ICS_NATIVE
select PPC_ICP_NATIVE
select PPC_NATIVE
+   select PPC_UDBG_16550
help
   This option enables support for FPGA-based Microwatt implementations.
 
diff --git a/arch/powerpc/platforms/microwatt/setup.c 
b/arch/powerpc/platforms/microwatt/setup.c
index 1c1b7791fa57..0b02603bdb74 100644
--- a/arch/powerpc/platforms/microwatt/setup.c
+++ b/arch/powerpc/platforms/microwatt/setup.c
@@ -14,6 +14,7 @@
 #include 
 #include 
 #include 
+#include 
 
 static void __init microwatt_init_IRQ(void)
 {
@@ -35,5 +36,6 @@ define_machine(microwatt) {
.name   = "microwatt",
.probe  = microwatt_probe,
.init_IRQ   = microwatt_init_IRQ,
+   .progress   = udbg_progress,
.calibrate_decr = generic_calibrate_decr,
 };
-- 
2.31.1
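The early-debug accessors above address the UART registers with a stride of 4 bytes, which lines up with reg-shift = <2> in the serial@2000 node. A minimal sketch of that address calculation follows; the macro and function names are invented for the example, and only the base address and the shift come from the patch.

```c
#include <stdint.h>

#define UART_MW_BASE	0xc0002000ull	/* UDBG_UART_MW_ADDR in the patch */
#define UART_REG_SHIFT	2		/* reg-shift = <2> in the DT node */

/*
 * Address of 16550 register number 'reg'.
 * Hypothetical helper; the patch open-codes "base + (reg << 2)".
 */
static uint64_t uart_reg_addr(unsigned int reg)
{
	return UART_MW_BASE + ((uint64_t)reg << UART_REG_SHIFT);
}
```

Register 5, the line status register on a 16550, therefore sits at 0xc0002014.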



[PATCH v2 7/9] powerpc/microwatt: Add microwatt_defconfig

2021-06-17 Thread Paul Mackerras
Signed-off-by: Paul Mackerras 
---
 arch/powerpc/configs/microwatt_defconfig | 98 
 1 file changed, 98 insertions(+)
 create mode 100644 arch/powerpc/configs/microwatt_defconfig

diff --git a/arch/powerpc/configs/microwatt_defconfig 
b/arch/powerpc/configs/microwatt_defconfig
new file mode 100644
index ..a08b739123da
--- /dev/null
+++ b/arch/powerpc/configs/microwatt_defconfig
@@ -0,0 +1,98 @@
+# CONFIG_SWAP is not set
+# CONFIG_CROSS_MEMORY_ATTACH is not set
+CONFIG_HIGH_RES_TIMERS=y
+CONFIG_PREEMPT_VOLUNTARY=y
+CONFIG_TICK_CPU_ACCOUNTING=y
+CONFIG_LOG_BUF_SHIFT=16
+CONFIG_PRINTK_SAFE_LOG_BUF_SHIFT=12
+CONFIG_BLK_DEV_INITRD=y
+CONFIG_CC_OPTIMIZE_FOR_SIZE=y
+CONFIG_KALLSYMS_ALL=y
+CONFIG_EMBEDDED=y
+# CONFIG_VM_EVENT_COUNTERS is not set
+# CONFIG_SLUB_DEBUG is not set
+# CONFIG_COMPAT_BRK is not set
+# CONFIG_SLAB_MERGE_DEFAULT is not set
+CONFIG_PPC64=y
+# CONFIG_PPC_KUEP is not set
+# CONFIG_PPC_KUAP is not set
+CONFIG_CPU_LITTLE_ENDIAN=y
+CONFIG_NR_IRQS=64
+CONFIG_PANIC_TIMEOUT=10
+# CONFIG_PPC_POWERNV is not set
+# CONFIG_PPC_PSERIES is not set
+CONFIG_PPC_MICROWATT=y
+# CONFIG_PPC_OF_BOOT_TRAMPOLINE is not set
+CONFIG_CPU_FREQ=y
+CONFIG_HZ_100=y
+# CONFIG_PPC_MEM_KEYS is not set
+# CONFIG_SECCOMP is not set
+# CONFIG_MQ_IOSCHED_KYBER is not set
+# CONFIG_COREDUMP is not set
+# CONFIG_COMPACTION is not set
+# CONFIG_MIGRATION is not set
+CONFIG_NET=y
+CONFIG_PACKET=y
+CONFIG_PACKET_DIAG=y
+CONFIG_UNIX=y
+CONFIG_UNIX_DIAG=y
+CONFIG_INET=y
+CONFIG_INET_UDP_DIAG=y
+CONFIG_INET_RAW_DIAG=y
+# CONFIG_WIRELESS is not set
+CONFIG_DEVTMPFS=y
+CONFIG_DEVTMPFS_MOUNT=y
+# CONFIG_STANDALONE is not set
+# CONFIG_PREVENT_FIRMWARE_BUILD is not set
+# CONFIG_FW_LOADER is not set
+# CONFIG_ALLOW_DEV_COREDUMP is not set
+CONFIG_MTD=y
+CONFIG_MTD_BLOCK=y
+CONFIG_MTD_PARTITIONED_MASTER=y
+CONFIG_MTD_SPI_NOR=y
+CONFIG_BLK_DEV_LOOP=y
+CONFIG_BLK_DEV_RAM=y
+CONFIG_NETDEVICES=y
+# CONFIG_WLAN is not set
+# CONFIG_INPUT is not set
+# CONFIG_SERIO is not set
+# CONFIG_VT is not set
+CONFIG_SERIAL_8250=y
+# CONFIG_SERIAL_8250_DEPRECATED_OPTIONS is not set
+CONFIG_SERIAL_8250_CONSOLE=y
+CONFIG_SERIAL_OF_PLATFORM=y
+CONFIG_SERIAL_NONSTANDARD=y
+# CONFIG_NVRAM is not set
+CONFIG_RANDOM_TRUST_CPU=y
+CONFIG_SPI=y
+CONFIG_SPI_DEBUG=y
+CONFIG_SPI_BITBANG=y
+CONFIG_SPI_SPIDEV=y
+# CONFIG_HWMON is not set
+# CONFIG_USB_SUPPORT is not set
+# CONFIG_VIRTIO_MENU is not set
+# CONFIG_IOMMU_SUPPORT is not set
+# CONFIG_NVMEM is not set
+CONFIG_EXT4_FS=y
+# CONFIG_FILE_LOCKING is not set
+# CONFIG_DNOTIFY is not set
+# CONFIG_INOTIFY_USER is not set
+# CONFIG_MISC_FILESYSTEMS is not set
+# CONFIG_CRYPTO_HW is not set
+# CONFIG_XZ_DEC_X86 is not set
+# CONFIG_XZ_DEC_IA64 is not set
+# CONFIG_XZ_DEC_ARM is not set
+# CONFIG_XZ_DEC_ARMTHUMB is not set
+# CONFIG_XZ_DEC_SPARC is not set
+CONFIG_PRINTK_TIME=y
+# CONFIG_SYMBOLIC_ERRNAME is not set
+# CONFIG_DEBUG_BUGVERBOSE is not set
+# CONFIG_DEBUG_MISC is not set
+# CONFIG_SCHED_DEBUG is not set
+# CONFIG_FTRACE is not set
+# CONFIG_STRICT_DEVMEM is not set
+CONFIG_PPC_DISABLE_WERROR=y
+CONFIG_XMON=y
+CONFIG_XMON_DEFAULT=y
+# CONFIG_XMON_DEFAULT_RO_MODE is not set
+# CONFIG_RUNTIME_TESTING_MENU is not set
-- 
2.31.1



Re: [PATCH] powerpc/signal64: Copy siginfo before changing regs->nip

2021-06-17 Thread Michael Ellerman
On Tue, 8 Jun 2021 23:46:05 +1000, Michael Ellerman wrote:
> In commit 96d7a4e06fab ("powerpc/signal64: Rewrite handle_rt_signal64()
> to minimise uaccess switches") the 64-bit signal code was rearranged to
> use user_write_access_begin/end().
> 
> As part of that change the call to copy_siginfo_to_user() was moved
> later in the function, so that it could be done after the
> user_write_access_end().
> 
> [...]

Applied to powerpc/fixes.

[1/1] powerpc/signal64: Copy siginfo before changing regs->nip
  https://git.kernel.org/powerpc/c/e41d6c3f4f9b4804e53ca87aba8ee11ada606c77

cheers


Re: [PATCH] powerpc: Fix initrd corruption with relative jump labels

2021-06-17 Thread Michael Ellerman
On Mon, 14 Jun 2021 23:14:40 +1000, Michael Ellerman wrote:
> Commit b0b3b2c78ec0 ("powerpc: Switch to relative jump labels") switched
> us to using relative jump labels. That involves changing the code,
> target and key members in struct jump_entry to be relative to the
> address of the jump_entry, rather than absolute addresses.
> 
> We have two static inlines that create a struct jump_entry,
> arch_static_branch() and arch_static_branch_jump(), as well as an asm
> macro ARCH_STATIC_BRANCH, which is used by the pseries-only hypervisor
> tracing code.
> 
> [...]

Applied to powerpc/fixes.

[1/1] powerpc: Fix initrd corruption with relative jump labels
  https://git.kernel.org/powerpc/c/478036c4cd1a16e613a2f883d79c03cf187faacb

cheers


Re: [PATCH] powerpc/mem: Add back missing header to fix 'no previous prototype' error

2021-06-17 Thread Michael Ellerman
On Sat, 5 Jun 2021 08:56:09 + (UTC), Christophe Leroy wrote:
> Commit b26e8f27253a ("powerpc/mem: Move cache flushing functions into
> mm/cacheflush.c") removed asm/sparsemem.h which is required when
> CONFIG_MEMORY_HOTPLUG is selected to get the declaration of
> create_section_mapping().
> 
> Add it back.

Applied to powerpc/fixes.

[1/1] powerpc/mem: Add back missing header to fix 'no previous prototype' error
  https://git.kernel.org/powerpc/c/8e11d62e2e8769fe29d1ae98b44b23c7233eb8a2

cheers


Re: [PATCH v3] lockdown,selinux: fix wrong subject in some SELinux lockdown checks

2021-06-17 Thread Paul Moore
On Wed, Jun 16, 2021 at 4:51 AM Ondrej Mosnacek  wrote:
>
> Commit 59438b46471a ("security,lockdown,selinux: implement SELinux
> lockdown") added an implementation of the locked_down LSM hook to
> SELinux, with the aim to restrict which domains are allowed to perform
> operations that would breach lockdown.
>
> However, in several places the security_locked_down() hook is called in
> situations where the current task isn't doing any action that would
> directly breach lockdown, leading to SELinux checks that are basically
> bogus.
>
> To fix this, add an explicit struct cred pointer argument to
> security_locked_down() and define NULL as a special value to pass instead
> of current_cred() in such situations. LSMs that take the subject
> credentials into account can then fall back to some default or ignore
> such calls altogether. In the SELinux lockdown hook implementation, use
> SECINITSID_KERNEL in case the cred argument is NULL.
>
> Most of the callers are updated to pass current_cred() as the cred
> pointer, thus maintaining the same behavior. The following callers are
> modified to pass NULL as the cred pointer instead:
> 1. arch/powerpc/xmon/xmon.c
>  Seems to be some interactive debugging facility. It appears that
>  the lockdown hook is called from interrupt context here, so it
>  should be more appropriate to request a global lockdown decision.
> 2. fs/tracefs/inode.c:tracefs_create_file()
>  Here the call is used to prevent creating new tracefs entries when
>  the kernel is locked down. Assumes that locking down is one-way -
>  i.e. if the hook returns non-zero once, it will never return zero
>  again, thus no point in creating these files. Also, the hook is
>  often called by a module's init function when it is loaded by
>  userspace, where it doesn't make much sense to do a check against
>  the current task's creds, since the task itself doesn't actually
>  use the tracing functionality (i.e. doesn't breach lockdown), just
>  indirectly makes some new tracepoints available to whoever is
>  authorized to use them.
> 3. net/xfrm/xfrm_user.c:copy_to_user_*()
>  Here a cryptographic secret is redacted based on the value returned
>  from the hook. There are two possible actions that may lead here:
>  a) A netlink message XFRM_MSG_GETSA with NLM_F_DUMP set - here the
> task context is relevant, since the dumped data is sent back to
> the current task.
>  b) When adding/deleting/updating an SA via XFRM_MSG_xxxSA, the
> dumped SA is broadcasted to tasks subscribed to XFRM events -
> here the current task context is not relevant as it doesn't
> represent the tasks that could potentially see the secret.
>  It doesn't seem worth it to try to keep using the current task's
>  context in the a) case, since the eventual data leak can be
>  circumvented anyway via b), plus there is no way for the task to
>  indicate that it doesn't care about the actual key value, so the
>  check could generate a lot of "false alert" denials with SELinux.
>  Thus, let's pass NULL instead of current_cred() here faute de
>  mieux.
>
> Improvements-suggested-by: Casey Schaufler 
> Improvements-suggested-by: Paul Moore 
> Fixes: 59438b46471a ("security,lockdown,selinux: implement SELinux lockdown")
> Signed-off-by: Ondrej Mosnacek 

This seems reasonable to me, but before I merge it into the SELinux
tree I think it would be good to get some ACKs from the relevant
subsystem folks.  I don't believe we ever saw a response to the last
question for the PPC folks, did we?
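For anyone skimming the thread, the NULL-cred convention described in the quoted commit message boils down to something like the sketch below. This is a standalone model, not the SELinux code: struct cred_model, KERNEL_SID_MODEL and lockdown_subject_sid() are invented stand-ins, and the numeric value is arbitrary.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the subject credentials; only the SID matters here. */
struct cred_model {
	uint32_t sid;
};

/* Plays the role of SECINITSID_KERNEL; the value is arbitrary. */
#define KERNEL_SID_MODEL 1u

/*
 * Pick the subject for a lockdown check: the caller's SID when a cred
 * is supplied, the kernel's own SID when the caller passes NULL to say
 * "this is not done on behalf of the current task".
 */
static uint32_t lockdown_subject_sid(const struct cred_model *cred)
{
	return cred ? cred->sid : KERNEL_SID_MODEL;
}
```

Callers that really act on behalf of the current task keep passing its credentials; the three call sites listed in the commit message would pass NULL.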

> ---
>
> v3:
> - add the cred argument to security_locked_down() and adapt all callers
> - keep using current_cred() in BPF, as the hook calls have been shifted
>   to program load time (commit ff40e51043af ("bpf, lockdown, audit: Fix
>   buggy SELinux lockdown permission checks"))
> - in SELinux, don't ignore hook calls where cred == NULL, but use
>   SECINITSID_KERNEL as the subject instead
> - update explanations in the commit message
>
> v2: https://lore.kernel.org/lkml/20210517092006.803332-1-omosn...@redhat.com/
> - change to a single hook based on suggestions by Casey Schaufler
>
> v1: https://lore.kernel.org/lkml/20210507114048.138933-1-omosn...@redhat.com/
>
>  arch/powerpc/xmon/xmon.c |  4 ++--
>  arch/x86/kernel/ioport.c |  4 ++--
>  arch/x86/kernel/msr.c|  4 ++--
>  arch/x86/mm/testmmiotrace.c  |  2 +-
>  drivers/acpi/acpi_configfs.c |  2 +-
>  drivers/acpi/custom_method.c |  2 +-
>  drivers/acpi/osl.c   |  3 ++-
>  drivers/acpi/tables.c|  2 +-
>  drivers/char/mem.c   |  2 +-
>  drivers/cxl/mem.c|  2 +-
>  drivers/firmware/efi/efi.c   |  2 +-
>  drivers/firmware/efi/test/efi_test.c |  2 +-
>  drivers/pci/pci-sysfs.c  |  6 +++---
>  drivers/pci/proc.c   |  6 

Re: [PATCH 01/18] mm: add a kunmap_local_dirty helper

2021-06-17 Thread Herbert Xu
On Thu, Jun 17, 2021 at 08:01:57PM -0700, Ira Weiny wrote:
>
> > +   flush_kernel_dcache_page(__page);   \
> 
> Is this required on 32bit systems?  Why is kunmap_flush_on_unmap() not
> sufficient on 64bit systems?  The normal kunmap_local() path does that.
> 
> I'm sorry but I did not see a conclusion to my query on V1. Herbert implied
> that he just copied from the crypto code.[1]  I'm concerned that this _dirty() call
> is just going to confuse the users of kmap even more.  So why can't we get to
> the bottom of why flush_kernel_dcache_page() needs so much logic around it
> before complicating the general kernel users.
> 
> I would like to see it go away if possible.

This thread may be related:

https://lwn.net/Articles/240249/

Cheers,
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


Re: [RFC PATCH 8/8] powerpc/papr_scm: Use FORM2 associativity details

2021-06-17 Thread Aneesh Kumar K.V

On 6/18/21 1:30 AM, Daniel Henrique Barboza wrote:

On 6/17/21 8:11 AM, Aneesh Kumar K.V wrote:

Daniel Henrique Barboza  writes:

On 6/17/21 4:46 AM, David Gibson wrote:

On Tue, Jun 15, 2021 at 12:35:17PM +0530, Aneesh Kumar K.V wrote:

David Gibson  writes:

In fact, the more I speak about this PMEM scenario the more I wonder:
why doesn't the PMEM driver, when switching from persistent to regular
memory and vice-versa, take care of all the necessary updates in the
numa-distance-table and kernel internals to reflect the current distances
of its current mode? Is this a technical limitation?




I sent v4 doing something similar to this.

-aneesh



Re: [PATCH 01/18] mm: add a kunmap_local_dirty helper

2021-06-17 Thread Ira Weiny
On Tue, Jun 15, 2021 at 03:24:39PM +0200, Christoph Hellwig wrote:
> Add a helper that calls flush_kernel_dcache_page before unmapping the
> local mapping.  flush_kernel_dcache_page is required for all pages
> potentially mapped into userspace that were written to using kmap*,
> so having a helper that does the right thing can be very convenient.
> 
> Signed-off-by: Christoph Hellwig 
> ---
>  include/linux/highmem-internal.h | 7 +++
>  include/linux/highmem.h  | 4 
>  2 files changed, 11 insertions(+)
> 
> diff --git a/include/linux/highmem-internal.h b/include/linux/highmem-internal.h
> index 7902c7d8b55f..bd37706db147 100644
> --- a/include/linux/highmem-internal.h
> +++ b/include/linux/highmem-internal.h
> @@ -224,4 +224,11 @@ do { \
>   __kunmap_local(__addr); \
>  } while (0)
>  
> +#define kunmap_local_dirty(__page, __addr)   \

I think having to store the page and addr to return to kunmap_local_dirty() is
going to be a pain in some code paths.  Not a show stopper but see below...

> +do { \
> + if (!PageSlab(__page))  \

Was there some clarification why the page can't be a Slab page?  Or is this
just an optimization?

> + flush_kernel_dcache_page(__page);   \

Is this required on 32bit systems?  Why is kunmap_flush_on_unmap() not
sufficient on 64bit systems?  The normal kunmap_local() path does that.

I'm sorry but I did not see a conclusion to my query on V1. Herbert implied that
he just copied from the crypto code.[1]  I'm concerned that this _dirty() call
is just going to confuse the users of kmap even more.  So why can't we get to
the bottom of why flush_kernel_dcache_page() needs so much logic around it
before complicating the general kernel users?

I would like to see it go away if possible.

Ira

[1] https://lore.kernel.org/lkml/20210615050258.ga5...@gondor.apana.org.au/

> + kunmap_local(__addr);   \
> +} while (0)
> +
>  #endif
> diff --git a/include/linux/highmem.h b/include/linux/highmem.h
> index 832b49b50c7b..65f548db4f2d 100644
> --- a/include/linux/highmem.h
> +++ b/include/linux/highmem.h
> @@ -93,6 +93,10 @@ static inline void kmap_flush_unused(void);
>   * On HIGHMEM enabled systems mapping a highmem page has the side effect of
>   * disabling migration in order to keep the virtual address stable across
>   * preemption. No caller of kmap_local_page() can rely on this side effect.
> + *
> + * If data is written to the returned kernel mapping, the callers needs to
> + * unmap the mapping using kunmap_local_dirty(), else kunmap_local() should
> + * be used.
>   */
>  static inline void *kmap_local_page(struct page *page);
>  
> -- 
> 2.30.2
> 


Re: [PATCH 02/11] powerpc: Add Microwatt device tree

2021-06-17 Thread Paul Mackerras
On Thu, Jun 17, 2021 at 02:41:28PM +1000, Michael Ellerman wrote:
> Paul Mackerras  writes:
> >
> 
> Little bit of change log never hurts :)
> 
> > Signed-off-by: Paul Mackerras 
> > ---
> >  arch/powerpc/boot/dts/microwatt.dts | 105 
> >  1 file changed, 105 insertions(+)
> >  create mode 100644 arch/powerpc/boot/dts/microwatt.dts
> >
> > diff --git a/arch/powerpc/boot/dts/microwatt.dts b/arch/powerpc/boot/dts/microwatt.dts
> > new file mode 100644
> > index ..9b2e64da9432
> > --- /dev/null
> > +++ b/arch/powerpc/boot/dts/microwatt.dts
> > @@ -0,0 +1,105 @@
> > +/dts-v1/;
> > +
> > +/ {
> > +   #size-cells = <0x02>;
> > +   #address-cells = <0x02>;
> > +   model-name = "microwatt";
> > +   compatible = "microwatt-soc";
> > +
> > +   reserved-memory {
> > +   #size-cells = <0x02>;
> > +   #address-cells = <0x02>;
> > +   ranges;
> > +   };
> > +
> > +   memory@0 {
> > +   device_type = "memory";
> > +   reg = <0x 0x 0x 0x1000>;
> > +   };
> > +
> > +   cpus {
> > +   #size-cells = <0x00>;
> > +   #address-cells = <0x01>;
> > +
> > +   ibm,powerpc-cpu-features {
> > +   display-name = "Microwatt";
> > +   isa = <3000>;
> > +   device_type = "cpu-features";
> > +   compatible = "ibm,powerpc-cpu-features";
> > +
> > +   mmu-radix {
> > +   isa = <3000>;
> > +   usable-privilege = <2>;
> 
> skiboot says 6?

That's for a machine with hypervisor mode - if I make it 6 here, then
the kernel prints a message about "HV feature passed to guest" and
then another about "missing dependency" and ends up not enabling the
feature.

Note that microwatt usually has MSR[HV] = 0 (you can set it to 1 but
it doesn't do anything).  Arguably it should force it to 1 always, but
if I do that, then the kernel starts trying to execute hrfid
instructions, which microwatt doesn't have (for example in
masked_Hinterrupt).
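
As a rough illustration of the usable-privilege point, the value can be
read as a bitmask.  The sketch below assumes the bit layout that, as far
as I can tell, the kernel's cpu-features parsing uses (bit 0 = problem
state, bit 1 = OS, bit 2 = hypervisor).  The macro names, the function
name and the msr_hv parameter are made up for this example; it is a
userspace model, not kernel code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit layout of "usable-privilege" (names local to this sketch). */
#define USABLE_PR (1u << 0)	/* problem state (userspace) */
#define USABLE_OS (1u << 1)	/* OS / supervisor */
#define USABLE_HV (1u << 2)	/* hypervisor */

/*
 * Model of the behaviour described above: a feature that claims
 * hypervisor usability is refused when MSR[HV] is 0.
 */
bool feature_accepted(uint32_t usable_privilege, bool msr_hv)
{
	if ((usable_privilege & USABLE_HV) && !msr_hv)
		return false;	/* the "HV feature passed to guest" case */
	return true;
}
```

With this reading, <2> means OS only and is accepted with MSR[HV] = 0,
while <6> means OS plus hypervisor and is refused.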

> > +   os-support = <0x00>;
> > +   };
> > +
> > +   little-endian {
> > +   isa = <0>;
> 
> I guess you just copied that from skiboot.
> 
> The binding says it's required, but AFAICS the kernel doesn't use it.
>
> And isa = 0 mean ISA_BASE, according to the skiboot source.

I changed it to 2050 since true little-endian mode was introduced for
POWER6.

> > +   PowerPC,Microwatt@0 {
> > +   i-cache-sets = <2>;
> > +   ibm,dec-bits = <64>;
> > +   reservation-granule-size = <64>;
> 
> Never seen that one before.

It's in PAPR+ (D.6.1.4, CPU Node Properties).

> > +   clock-frequency = <1>;
> > +   timebase-frequency = <1>;
> 
> Those seem quite high?

No, 100MHz is correct.

> > +   i-tlb-sets = <1>;
> > +   ibm,ppc-interrupt-server#s = <0>;
> > +   i-cache-block-size = <64>;
> > +   d-cache-block-size = <64>;
> 
> The kernel reads those, but also hard codes 128 in places.

Interesting, because it all seems to work.  I assume the critical
thing is doing the right dcbz's.

> See L1_CACHE_BYTES.
> 
> > +   ibm,pa-features = [40 00 c2 27 00 00 00 80 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 80 00 80 00 80 00 00 00 80 00 80 00 00 00 80 00 80 00 80 00 80 00 80 00 80 00 80 00 80 00 80 00 80 00 80 00 80 00 80 00];
> 
> Do you need that?
> 
> You shouldn't, if we've done things right with the cpu-features support.

Turns out I don't need it.

> > +   d-cache-sets = <2>;
> > +   ibm,pir = <0x3c>;
> 
> Needed?

Nope.

> > +   i-tlb-size = <64>;
> > +   cpu-version = <0x99>;
> > +   status = "okay";
> > +   i-cache-size = <0x1000>;
> > +   ibm,processor-radix-AP-encodings = <0x0c 0xa010 0x2015 0x401e>;
> > +   tlb-size = <0>;
> > +   tlb-sets = <0>;
> 
> Does the kernel use those? I can't find it.
> 
> > +   device_type = "cpu";
> > +   d-tlb-size = <128>;
> > +   d-tlb-sets = <2>;
> > +   reg = <0>;
> > +   general-purpose;
> > +   64-bit;
> > +   d-cache-size = <0x1000>;
> > +   ibm,chip-id = <0x00>;
> > +   };
> > +   };
> > +
> > +   chosen {
> > +   bootargs = "";
> > +   ibm,architecture-vec-5 = [19 00 10 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 40 00 40];
> 
> Do you need that?
> 
> I assume you run with MSR[HV] = 1 (you don't say anywhere), in which
> case we never look at that property.

I do need that given we're running with MSR[HV] = 0; 

Re: [PATCH v6 13/17] powerpc/pseries/vas: Setup IRQ and fault handling

2021-06-17 Thread Haren Myneni
On Fri, 2021-06-18 at 09:34 +1000, Nicholas Piggin wrote:
> Excerpts from Haren Myneni's message of June 18, 2021 6:37 am:
> > NX generates an interrupt when sees a fault on the user space
> > buffer and the hypervisor forwards that interrupt to OS. Then
> > the kernel handles the interrupt by issuing H_GET_NX_FAULT hcall
> > to retrieve the fault CRB information.
> > 
> > This patch also adds changes to setup and free IRQ per each
> > window and also handles the fault by updating the CSB.
> 
> In as much as this pretty well corresponds to the PowerNV code AFAIKS,
> it looks okay to me.
> 
> Reviewed-by: Nicholas Piggin 
> 
> Could you have an irq handler in your ops vector and have 
> the core code set up the irq and call your handler, so the Linux irq
> handling is in one place? Not something for this series, I was just
> wondering.

It is not possible to have common core code for the IRQ setup, because
the two platforms set it up at different times and granularities.

PowerNV: Every VAS instance has an IRQ, and this setup is done during
initialization (system boot). A fault FIFO is assigned to each instance
and registered with VAS so that VAS/NX writes the fault CRB into this
FIFO.

PowerVM: Each window has an IRQ, and the setup is done during window
open.

Thanks
Haren

> 
> Thanks,
> Nick
> 
> > Signed-off-by: Haren Myneni 
> > ---
> >  arch/powerpc/platforms/pseries/vas.c | 102 +++
> >  1 file changed, 102 insertions(+)
> > 
> > diff --git a/arch/powerpc/platforms/pseries/vas.c b/arch/powerpc/platforms/pseries/vas.c
> > index f5a44f2f0e99..3385b5400cc6 100644
> > --- a/arch/powerpc/platforms/pseries/vas.c
> > +++ b/arch/powerpc/platforms/pseries/vas.c
> > @@ -11,6 +11,7 @@
> >  #include 
> >  #include 
> >  #include 
> > +#include 
> >  #include 
> >  #include 
> >  #include 
> > @@ -155,6 +156,50 @@ int h_query_vas_capabilities(const u64 hcall, u8 query_type, u64 result)
> >  }
> >  EXPORT_SYMBOL_GPL(h_query_vas_capabilities);
> >  
> > +/*
> > + * hcall to get fault CRB from the hypervisor.
> > + */
> > +static int h_get_nx_fault(u32 winid, u64 buffer)
> > +{
> > +   long rc;
> > +
> > +   rc = plpar_hcall_norets(H_GET_NX_FAULT, winid, buffer);
> > +
> > +   if (rc == H_SUCCESS)
> > +   return 0;
> > +
> > +   pr_err("H_GET_NX_FAULT error: %ld, winid %u, buffer 0x%llx\n",
> > +   rc, winid, buffer);
> > +   return -EIO;
> > +
> > +}
> > +
> > +/*
> > + * Handle the fault interrupt.
> > + * When the fault interrupt is received for each window, query the
> > + * hypervisor to get the fault CRB on the specific fault. Then
> > + * process the CRB by updating CSB or send signal if the user space
> > + * CSB is invalid.
> > + * Note: The hypervisor forwards an interrupt for each fault request.
> > + * So one fault CRB to process for each H_GET_NX_FAULT hcall.
> > + */
> > +irqreturn_t pseries_vas_fault_thread_fn(int irq, void *data)
> > +{
> > +   struct pseries_vas_window *txwin = data;
> > +   struct coprocessor_request_block crb;
> > +   struct vas_user_win_ref *tsk_ref;
> > +   int rc;
> > +
> > +   rc = h_get_nx_fault(txwin->vas_win.winid, (u64)virt_to_phys());
> > +   if (!rc) {
> > +   tsk_ref = >vas_win.task_ref;
> > +   vas_dump_crb();
> > +   vas_update_csb(, tsk_ref);
> > +   }
> > +
> > +   return IRQ_HANDLED;
> > +}
> > +
> >  /*
> >   * Allocate window and setup IRQ mapping.
> >   */
> > @@ -166,10 +211,51 @@ static int allocate_setup_window(struct pseries_vas_window *txwin,
> > rc = h_allocate_vas_window(txwin, domain, wintype, DEF_WIN_CREDS);
> > if (rc)
> > return rc;
> > +   /*
> > +* On PowerVM, the hypervisor setup and forwards the fault
> > +* interrupt per window. So the IRQ setup and fault handling
> > +* will be done for each open window separately.
> > +*/
> > +   txwin->fault_virq = irq_create_mapping(NULL, txwin->fault_irq);
> > +   if (!txwin->fault_virq) {
> > +   pr_err("Failed irq mapping %d\n", txwin->fault_irq);
> > +   rc = -EINVAL;
> > +   goto out_win;
> > +   }
> > +
> > +   txwin->name = kasprintf(GFP_KERNEL, "vas-win-%d",
> > +   txwin->vas_win.winid);
> > +   if (!txwin->name) {
> > +   rc = -ENOMEM;
> > +   goto out_irq;
> > +   }
> > +
> > +   rc = request_threaded_irq(txwin->fault_virq, NULL,
> > + pseries_vas_fault_thread_fn, IRQF_ONESHOT,
> > + txwin->name, txwin);
> > +   if (rc) {
> > +   pr_err("VAS-Window[%d]: Request IRQ(%u) failed with %d\n",
> > +  txwin->vas_win.winid, txwin->fault_virq, rc);
> > +   goto out_free;
> > +   }
> >  
> > txwin->vas_win.wcreds_max = DEF_WIN_CREDS;
> >  
> > return 0;
> > +out_free:
> > +   kfree(txwin->name);
> > +out_irq:
> > +   irq_dispose_mapping(txwin->fault_virq);
> > +out_win:
> > +   h_deallocate_vas_window(txwin->vas_win.winid);
> > +   return rc;
> > +}
> > +
> > +static 

Re: [PATCH 8/8] membarrier: Rewrite sync_core_before_usermode() and improve documentation

2021-06-17 Thread Andy Lutomirski
On 6/17/21 8:16 AM, Mathieu Desnoyers wrote:
> - On Jun 15, 2021, at 11:21 PM, Andy Lutomirski l...@kernel.org wrote:
> 
> [...]
> 
>> +# An architecture that wants to support
>> +# MEMBARRIER_CMD_PRIVATE_EXPEDITED_SYNC_CORE needs to define precisely what it
>> +# is supposed to do and implement membarrier_sync_core_before_usermode() to
>> +# make it do that.  Then it can select ARCH_HAS_MEMBARRIER_SYNC_CORE via
>> +# Kconfig.  Unfortunately, MEMBARRIER_CMD_PRIVATE_EXPEDITED_SYNC_CORE is not a
>> +# fantastic API and may not make sense on all architectures.  Once an
>> +# architecture meets these requirements,
> 
> Can we please remove the editorial comment about the quality of the membarrier
> sync-core's API ?

Done.

>> +#
>> +# On x86, a program can safely modify code, issue
>> +# MEMBARRIER_CMD_PRIVATE_EXPEDITED_SYNC_CORE, and then execute that code, via
>> +# the modified address or an alias, from any thread in the calling process.
>> +#
>> +# On arm64, a program can modify code, flush the icache as needed, and issue
>> +# MEMBARRIER_CMD_PRIVATE_EXPEDITED_SYNC_CORE to force a "context synchronizing
>> +# event", aka pipeline flush on all CPUs that might run the calling process.
>> +# Then the program can execute the modified code as long as it is executed
>> +# from an address consistent with the icache flush and the CPU's cache type.
>> +#
>> +# On powerpc, a program can use MEMBARRIER_CMD_PRIVATE_EXPEDITED_SYNC_CORE
>> +# similarly to arm64.  It would be nice if the powerpc maintainers could
>> +# add a more clear explanation.
> 
> We should document the requirements on ARMv7 as well.

Done.

> 
> Thanks,
> 
> Mathieu
> 



Re: [PATCH 8/8] membarrier: Rewrite sync_core_before_usermode() and improve documentation

2021-06-17 Thread Andy Lutomirski
On 6/17/21 7:47 AM, Mathieu Desnoyers wrote:

> Please change back this #ifndef / #else / #endif within function for
> 
> if (!IS_ENABLED(CONFIG_ARCH_HAS_MEMBARRIER_SYNC_CORE)) {
>   ...
> } else {
>   ...
> }
> 
> I don't think mixing up preprocessor and code logic makes it more readable.

I agree, but I don't know how to make the result work well.
membarrier_sync_core_before_usermode() isn't defined in the !IS_ENABLED
case, so either I need to fake up a definition or use #ifdef.

If I faked up a definition, I would want to assert, at build time, that
it isn't called.  I don't think we can do:

static void membarrier_sync_core_before_usermode()
{
BUILD_BUG_IF_REACHABLE();
}
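
To make the trade-off concrete, here is a small userspace model (not
kernel code) of the "faked-up definition" variant.  HAS_SYNC_CORE,
stub_calls and maybe_sync_core() are names invented for this sketch, and
HAS_SYNC_CORE merely stands in for the Kconfig symbol.  The stub can only
count calls at run time, which is exactly the weaker guarantee compared
to a build-time assertion.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for CONFIG_ARCH_HAS_MEMBARRIER_SYNC_CORE. */
#define HAS_SYNC_CORE 0

int stub_calls;	/* counts calls that should never happen */

#if HAS_SYNC_CORE
void sync_core_before_usermode(void)
{
	/* the architecture-specific work would go here */
}
#else
/* Faked-up definition: exists only so the plain C `if` below compiles. */
void sync_core_before_usermode(void)
{
	stub_calls++;
}
#endif

/* Returns true if the sync-core path was taken. */
bool maybe_sync_core(void)
{
	if (!HAS_SYNC_CORE)
		return false;

	sync_core_before_usermode();
	return true;
}
```

The plain C `if` keeps both branches visible to the compiler, but nothing
here stops a new caller from reaching the stub by mistake.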



[Bug 213079] [bisected] IRQ problems and crashes on a PowerMac G5 with 5.12.3

2021-06-17 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=213079

Erhard F. (erhar...@mailbox.org) changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
 Attachment #296759|0                           |1
        is obsolete|                            |
 Attachment #296761|0                           |1
        is obsolete|                            |

--- Comment #10 from Erhard F. (erhar...@mailbox.org) ---
Created attachment 297439
  --> https://bugzilla.kernel.org/attachment.cgi?id=297439=edit
kernel .config (5.13-rc6, PowerMac G5 11,2)

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are watching the assignee of the bug.

[Bug 213079] [bisected] IRQ problems and crashes on a PowerMac G5 with 5.12.3

2021-06-17 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=213079

--- Comment #9 from Erhard F. (erhar...@mailbox.org) ---
Created attachment 297437
  --> https://bugzilla.kernel.org/attachment.cgi?id=297437=edit
dmesg (5.13-rc6 w. patch fbbefb3 reverted + debug, PowerMac G5 11,2)


[Bug 213079] [bisected] IRQ problems and crashes on a PowerMac G5 with 5.12.3

2021-06-17 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=213079

--- Comment #8 from Erhard F. (erhar...@mailbox.org) ---
Created attachment 297435
  --> https://bugzilla.kernel.org/attachment.cgi?id=297435=edit
dmesg (5.13-rc6 + debug, PowerMac G5 11,2)


[Bug 213079] [bisected] IRQ problems and crashes on a PowerMac G5 with 5.12.3

2021-06-17 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=213079

--- Comment #7 from Erhard F. (erhar...@mailbox.org) ---
(In reply to Oliver O'Halloran from comment #5)
> Could you add "debug" to the kernel command line and post the dmesg output
> for a boot with the patch applied and reverted?
Ok, on top of 5.13-rc6 I reverted fbbefb3, which went fine except for the
"pci-ioda.c" part, where I needed to manually apply the old code.

Here's the vanilla debug dmesg and the debug dmesg with the patch reverted.


Re: [PATCH v6 16/17] crypto/nx: Add sysfs interface to export NX capabilities

2021-06-17 Thread Nicholas Piggin
Excerpts from Haren Myneni's message of June 18, 2021 6:39 am:
> 
> Export NX-GZIP capabilities to userspace in sysfs
> /sys/devices/vio/ibm,compression-v1/nx_gzip_caps directory.
> These are queried by userspace accelerator libraries to set
> minimum length heuristics and maximum limits on request sizes.
> 
> NX-GZIP capabilities:
> min_compress_len  /*Recommended minimum compress length in bytes*/
> min_decompress_len /*Recommended minimum decompress length in bytes*/
> req_max_processed_len /* Maximum number of bytes processed in one
>   request */
> 
> NX will return RMA_Reject if the request buffer size is greater
> than req_max_processed_len.
> 
> Signed-off-by: Haren Myneni 
> Acked-by: Herbert Xu 

Again, if you could just move those ^^ C style comments into the
code. Possibly have a small comment with the sysfs stuff saying that 
it's ABI which userspace code queries when using the accelerator (or
if you have an ABI documentation somewhere, point the comments to that).

sysfs is ABI, but some bits are more ABI than others, it would just be
good to be clear that it is actually used, and can't be changed.

Aside from that,

Acked-by: Nicholas Piggin 

Thanks,
Nick
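
For context, a userspace library could apply the three exported values
roughly as follows.  This is only a sketch under assumptions: struct
nx_gzip_caps, enum nx_choice and nx_gzip_decide() are invented names,
reading the sysfs files is left out, and splitting an oversized request
is just one possible policy.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical userspace copy of the values exported under nx_gzip_caps. */
struct nx_gzip_caps {
	uint64_t min_compress_len;	/* recommended minimum, bytes */
	uint64_t min_decompress_len;	/* recommended minimum, bytes */
	uint64_t req_max_processed_len;	/* hard limit per request, bytes */
};

enum nx_choice {
	NX_USE_SOFTWARE,	/* too small to be worth offloading */
	NX_USE_ACCEL,		/* fits in a single request */
	NX_SPLIT_REQUEST,	/* larger than one request may process */
};

/* Decide how to handle a compression request of len bytes. */
enum nx_choice nx_gzip_decide(const struct nx_gzip_caps *caps, uint64_t len)
{
	if (len < caps->min_compress_len)
		return NX_USE_SOFTWARE;
	if (len > caps->req_max_processed_len)
		return NX_SPLIT_REQUEST;
	return NX_USE_ACCEL;
}
```

The minimum is treated as a heuristic and the maximum as a hard limit,
which matches how the commit message describes them.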

> ---
>  drivers/crypto/nx/nx-common-pseries.c | 43 +++
>  1 file changed, 43 insertions(+)
> 
> diff --git a/drivers/crypto/nx/nx-common-pseries.c b/drivers/crypto/nx/nx-common-pseries.c
> index 9fc2abb56019..f51a50d40504 100644
> --- a/drivers/crypto/nx/nx-common-pseries.c
> +++ b/drivers/crypto/nx/nx-common-pseries.c
> @@ -967,6 +967,36 @@ static struct attribute_group nx842_attribute_group = {
>   .attrs = nx842_sysfs_entries,
>  };
>  
> +#define  nxcop_caps_read(_name)  \
> +static ssize_t nxcop_##_name##_show(struct device *dev,  \
> + struct device_attribute *attr, char *buf)   \
> +{\
> + return sprintf(buf, "%lld\n", nx_cop_caps._name);   \
> +}
> +
> +#define NXCT_ATTR_RO(_name)  \
> + nxcop_caps_read(_name); \
> + static struct device_attribute dev_attr_##_name = __ATTR(_name, \
> + 0444,   \
> + nxcop_##_name##_show,   \
> + NULL);
> +
> +NXCT_ATTR_RO(req_max_processed_len);
> +NXCT_ATTR_RO(min_compress_len);
> +NXCT_ATTR_RO(min_decompress_len);
> +
> +static struct attribute *nxcop_caps_sysfs_entries[] = {
> + _attr_req_max_processed_len.attr,
> + _attr_min_compress_len.attr,
> + _attr_min_decompress_len.attr,
> + NULL,
> +};
> +
> +static struct attribute_group nxcop_caps_attr_group = {
> + .name   =   "nx_gzip_caps",
> + .attrs  =   nxcop_caps_sysfs_entries,
> +};
> +
>  static struct nx842_driver nx842_pseries_driver = {
>   .name = KBUILD_MODNAME,
>   .owner =THIS_MODULE,
> @@ -1056,6 +1086,16 @@ static int nx842_probe(struct vio_dev *viodev,
>   goto error;
>   }
>  
> + if (caps_feat) {
> + if (sysfs_create_group(>dev.kobj,
> + _caps_attr_group)) {
> + dev_err(>dev,
> + "Could not create sysfs NX capability entries\n");
> + ret = -1;
> + goto error;
> + }
> + }
> +
>   return 0;
>  
>  error_unlock:
> @@ -1075,6 +1115,9 @@ static void nx842_remove(struct vio_dev *viodev)
>   pr_info("Removing IBM Power 842 compression device\n");
>   sysfs_remove_group(>dev.kobj, _attribute_group);
>  
> + if (caps_feat)
> + sysfs_remove_group(>dev.kobj, _caps_attr_group);
> +
>   crypto_unregister_alg(_pseries_alg);
>  
>   spin_lock_irqsave(_mutex, flags);
> -- 
> 2.18.2
> 
> 
> 


Re: [PATCH v6 15/17] crypto/nx: Get NX capabilities for GZIP coprocessor type

2021-06-17 Thread Nicholas Piggin
Excerpts from Haren Myneni's message of June 18, 2021 6:38 am:
> 
> The hypervisor provides different NX capabilities that it
> supports. These capabilities such as recommended minimum
> compression / decompression lengths and the maximum request
> buffer size in bytes are used to define the user space NX
> request.
> 
> NX will reject the request if the buffer size is more than
> the maximum buffer size. Whereas compression / decompression
> lengths are recommended values for better performance.
> 
> Changes to get NX overall capabilities which points to the
> specific features that the hypervisor supports. Then retrieve
> the capabilities for the specific feature (available only
> for NXGZIP).
> 
> Signed-off-by: Haren Myneni 
> Acked-by: Herbert Xu 

Acked-by: Nicholas Piggin 

> ---
>  drivers/crypto/nx/nx-common-pseries.c | 87 +++
>  1 file changed, 87 insertions(+)
> 
> diff --git a/drivers/crypto/nx/nx-common-pseries.c b/drivers/crypto/nx/nx-common-pseries.c
> index cc8dd3072b8b..9fc2abb56019 100644
> --- a/drivers/crypto/nx/nx-common-pseries.c
> +++ b/drivers/crypto/nx/nx-common-pseries.c
> @@ -9,6 +9,8 @@
>   */
>  
>  #include 
> +#include 
> +#include 
>  
>  #include "nx-842.h"
>  #include "nx_csbcpb.h" /* struct nx_csbcpb */
> @@ -19,6 +21,29 @@ MODULE_DESCRIPTION("842 H/W Compression driver for IBM Power processors");
>  MODULE_ALIAS_CRYPTO("842");
>  MODULE_ALIAS_CRYPTO("842-nx");
>  
> +/*
> + * Coprocessor type specific capabilities from the hypervisor.
> + */
> +struct hv_nx_cop_caps {
> + __be64  descriptor;
> + __be64  req_max_processed_len;  /* Max bytes in one GZIP request */
> + __be64  min_compress_len;   /* Min compression size in bytes */
> + __be64  min_decompress_len; /* Min decompression size in bytes */
> +} __packed __aligned(0x1000);
> +
> +/*
> + * Coprocessor type specific capabilities.
> + */
> +struct nx_cop_caps {
> + u64 descriptor;
> + u64 req_max_processed_len;  /* Max bytes in one GZIP request */
> + u64 min_compress_len;   /* Min compression in bytes */
> + u64 min_decompress_len; /* Min decompression in bytes */
> +};
> +
> +static u64 caps_feat;
> +static struct nx_cop_caps nx_cop_caps;
> +
>  static struct nx842_constraints nx842_pseries_constraints = {
>   .alignment =DDE_BUFFER_ALIGN,
>   .multiple = DDE_BUFFER_LAST_MULT,
> @@ -1065,6 +1090,64 @@ static void nx842_remove(struct vio_dev *viodev)
>   kfree(old_devdata);
>  }
>  
> +/*
> + * Get NX capabilities from the hypervisor.
> + * Only NXGZIP capabilities are provided by the hypervisor right
> + * now and these values are available to user space with sysfs.
> + */
> +static void __init nxcop_get_capabilities(void)
> +{
> + struct hv_vas_all_caps *hv_caps;
> + struct hv_nx_cop_caps *hv_nxc;
> + int rc;
> +
> + hv_caps = kmalloc(sizeof(*hv_caps), GFP_KERNEL);
> + if (!hv_caps)
> + return;
> + /*
> +  * Get NX overall capabilities with feature type=0
> +  */
> + rc = h_query_vas_capabilities(H_QUERY_NX_CAPABILITIES, 0,
> +   (u64)virt_to_phys(hv_caps));
> + if (rc)
> + goto out;
> +
> + caps_feat = be64_to_cpu(hv_caps->feat_type);
> + /*
> +  * NX-GZIP feature available
> +  */
> + if (caps_feat & VAS_NX_GZIP_FEAT_BIT) {
> + hv_nxc = kmalloc(sizeof(*hv_nxc), GFP_KERNEL);
> + if (!hv_nxc)
> + goto out;
> + /*
> +  * Get capabilities for NX-GZIP feature
> +  */
> + rc = h_query_vas_capabilities(H_QUERY_NX_CAPABILITIES,
> +   VAS_NX_GZIP_FEAT,
> +   (u64)virt_to_phys(hv_nxc));
> + } else {
> + pr_err("NX-GZIP feature is not available\n");
> + rc = -EINVAL;
> + }
> +
> + if (!rc) {
> + nx_cop_caps.descriptor = be64_to_cpu(hv_nxc->descriptor);
> + nx_cop_caps.req_max_processed_len =
> + be64_to_cpu(hv_nxc->req_max_processed_len);
> + nx_cop_caps.min_compress_len =
> + be64_to_cpu(hv_nxc->min_compress_len);
> + nx_cop_caps.min_decompress_len =
> + be64_to_cpu(hv_nxc->min_decompress_len);
> + } else {
> + caps_feat = 0;
> + }
> +
> + kfree(hv_nxc);
> +out:
> + kfree(hv_caps);
> +}
> +
>  static const struct vio_device_id nx842_vio_driver_ids[] = {
>   {"ibm,compression-v1", "ibm,compression"},
>   {"", ""},
> @@ -1092,6 +1175,10 @@ static int __init nx842_pseries_init(void)
>   return -ENOMEM;
>  
>   RCU_INIT_POINTER(devdata, new_devdata);
> + /*
> +  * Get NX capabilities from the hypervisor.
> +  */
> + nxcop_get_capabilities();
>  
>   ret = 

Re: [PATCH v6 13/17] powerpc/pseries/vas: Setup IRQ and fault handling

2021-06-17 Thread Nicholas Piggin
Excerpts from Haren Myneni's message of June 18, 2021 6:37 am:
> 
> NX generates an interrupt when sees a fault on the user space
> buffer and the hypervisor forwards that interrupt to OS. Then
> the kernel handles the interrupt by issuing H_GET_NX_FAULT hcall
> to retrieve the fault CRB information.
> 
> This patch also adds changes to setup and free IRQ per each
> window and also handles the fault by updating the CSB.

In as much as this pretty well corresponds to the PowerNV code AFAIKS,
it looks okay to me.

Reviewed-by: Nicholas Piggin 

Could you have an irq handler in your ops vector and have 
the core code set up the irq and call your handler, so the Linux irq
handling is in one place? Not something for this series, I was just
wondering.
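
Something along these lines is what I have in mind, modelled in plain
userspace C.  All names here (struct win_ops, struct window,
core_fault_irq() and so on) are hypothetical and not the existing VAS
interfaces; the sketch only shows the shape of a core entry point that
dispatches to a per-platform handler.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-platform ops vector carrying a fault handler. */
struct win_ops {
	int (*handle_fault)(void *data);
};

struct window {
	const struct win_ops *ops;
	int faults_seen;
};

/* Platform-specific handler: in this model it only counts faults. */
int demo_handle_fault(void *data)
{
	struct window *win = data;

	win->faults_seen++;
	return 0;
}

const struct win_ops demo_ops = {
	.handle_fault = demo_handle_fault,
};

/* "Core" entry point: the one place that would own the IRQ plumbing. */
int core_fault_irq(struct window *win)
{
	if (!win->ops || !win->ops->handle_fault)
		return -1;
	return win->ops->handle_fault(win);
}
```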

Thanks,
Nick

> 
> Signed-off-by: Haren Myneni 
> ---
>  arch/powerpc/platforms/pseries/vas.c | 102 +++
>  1 file changed, 102 insertions(+)
> 
> diff --git a/arch/powerpc/platforms/pseries/vas.c b/arch/powerpc/platforms/pseries/vas.c
> index f5a44f2f0e99..3385b5400cc6 100644
> --- a/arch/powerpc/platforms/pseries/vas.c
> +++ b/arch/powerpc/platforms/pseries/vas.c
> @@ -11,6 +11,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -155,6 +156,50 @@ int h_query_vas_capabilities(const u64 hcall, u8 query_type, u64 result)
>  }
>  EXPORT_SYMBOL_GPL(h_query_vas_capabilities);
>  
> +/*
> + * hcall to get fault CRB from the hypervisor.
> + */
> +static int h_get_nx_fault(u32 winid, u64 buffer)
> +{
> + long rc;
> +
> + rc = plpar_hcall_norets(H_GET_NX_FAULT, winid, buffer);
> +
> + if (rc == H_SUCCESS)
> + return 0;
> +
> + pr_err("H_GET_NX_FAULT error: %ld, winid %u, buffer 0x%llx\n",
> + rc, winid, buffer);
> + return -EIO;
> +
> +}
> +
> +/*
> + * Handle the fault interrupt.
> + * When the fault interrupt is received for each window, query the
> + * hypervisor to get the fault CRB on the specific fault. Then
> + * process the CRB by updating CSB or send signal if the user space
> + * CSB is invalid.
> + * Note: The hypervisor forwards an interrupt for each fault request.
> + *   So one fault CRB to process for each H_GET_NX_FAULT hcall.
> + */
> +irqreturn_t pseries_vas_fault_thread_fn(int irq, void *data)
> +{
> + struct pseries_vas_window *txwin = data;
> + struct coprocessor_request_block crb;
> + struct vas_user_win_ref *tsk_ref;
> + int rc;
> +
> + rc = h_get_nx_fault(txwin->vas_win.winid, (u64)virt_to_phys());
> + if (!rc) {
> + tsk_ref = >vas_win.task_ref;
> + vas_dump_crb();
> + vas_update_csb(, tsk_ref);
> + }
> +
> + return IRQ_HANDLED;
> +}
> +
>  /*
>   * Allocate window and setup IRQ mapping.
>   */
> @@ -166,10 +211,51 @@ static int allocate_setup_window(struct pseries_vas_window *txwin,
>   rc = h_allocate_vas_window(txwin, domain, wintype, DEF_WIN_CREDS);
>   if (rc)
>   return rc;
> + /*
> +  * On PowerVM, the hypervisor setup and forwards the fault
> +  * interrupt per window. So the IRQ setup and fault handling
> +  * will be done for each open window separately.
> +  */
> + txwin->fault_virq = irq_create_mapping(NULL, txwin->fault_irq);
> + if (!txwin->fault_virq) {
> + pr_err("Failed irq mapping %d\n", txwin->fault_irq);
> + rc = -EINVAL;
> + goto out_win;
> + }
> +
> + txwin->name = kasprintf(GFP_KERNEL, "vas-win-%d",
> + txwin->vas_win.winid);
> + if (!txwin->name) {
> + rc = -ENOMEM;
> + goto out_irq;
> + }
> +
> + rc = request_threaded_irq(txwin->fault_virq, NULL,
> +   pseries_vas_fault_thread_fn, IRQF_ONESHOT,
> +   txwin->name, txwin);
> + if (rc) {
> + pr_err("VAS-Window[%d]: Request IRQ(%u) failed with %d\n",
> +txwin->vas_win.winid, txwin->fault_virq, rc);
> + goto out_free;
> + }
>  
>   txwin->vas_win.wcreds_max = DEF_WIN_CREDS;
>  
>   return 0;
> +out_free:
> + kfree(txwin->name);
> +out_irq:
> + irq_dispose_mapping(txwin->fault_virq);
> +out_win:
> + h_deallocate_vas_window(txwin->vas_win.winid);
> + return rc;
> +}
> +
> +static inline void free_irq_setup(struct pseries_vas_window *txwin)
> +{
> + free_irq(txwin->fault_virq, txwin);
> + kfree(txwin->name);
> + irq_dispose_mapping(txwin->fault_virq);
>  }
>  
>  static struct vas_window *vas_allocate_window(int vas_id, u64 flags,
> @@ -284,6 +370,11 @@ static struct vas_window *vas_allocate_window(int vas_id, u64 flags,
>   return >vas_win;
>  
>  out_free:
> + /*
> +  * Window is not operational. Free IRQ before closing
> +  * window so that do not have to hold mutex.
> +  */
> + free_irq_setup(txwin);
>   h_deallocate_vas_window(txwin->vas_win.winid);
>  out:
>

Re: [PATCH v13 09/12] swiotlb: Add restricted DMA alloc/free support

2021-06-17 Thread Stefano Stabellini
On Thu, 17 Jun 2021, Claire Chang wrote:
> Add the functions, swiotlb_{alloc,free} and is_swiotlb_for_alloc to
> support the memory allocation from restricted DMA pool.
> 
> The restricted DMA pool is preferred if available.
> 
> Note that since coherent allocation needs remapping, one must set up
> another device coherent pool by shared-dma-pool and use
> dma_alloc_from_dev_coherent instead for atomic coherent allocation.
> 
> Signed-off-by: Claire Chang 
> Reviewed-by: Christoph Hellwig 
> Tested-by: Stefano Stabellini 
> Tested-by: Will Deacon 

Acked-by: Stefano Stabellini 
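
For readers skimming the patch, the free path can be modelled in
userspace as "offer the page to the restricted pool first, and use the
generic path only if the pool did not own it".  This is a sketch under
assumptions: struct pool, pool_free(), free_pages_model() and the address
range check are invented for illustration, and the ownership test is
inferred from swiotlb_free() returning bool rather than taken from the
actual implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a restricted pool as an address range. */
struct pool {
	uintptr_t start;
	uintptr_t end;		/* exclusive */
	size_t freed_bytes;	/* bytes handed back to the pool */
};

size_t fallback_freed;	/* bytes released via the generic path */

/* Returns true only if addr belongs to the pool and was released there. */
bool pool_free(struct pool *p, uintptr_t addr, size_t size)
{
	if (!p || addr < p->start || addr >= p->end)
		return false;
	p->freed_bytes += size;
	return true;
}

/* Same shape as the free helper: pool first, generic path otherwise. */
void free_pages_model(struct pool *p, uintptr_t addr, size_t size)
{
	if (pool_free(p, addr, size))
		return;
	fallback_freed += size;
}
```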


> ---
>  include/linux/swiotlb.h | 26 ++
>  kernel/dma/direct.c | 49 +++--
>  kernel/dma/swiotlb.c| 38 ++--
>  3 files changed, 99 insertions(+), 14 deletions(-)
> 
> diff --git a/include/linux/swiotlb.h b/include/linux/swiotlb.h
> index 8d8855c77d9a..a73fad460162 100644
> --- a/include/linux/swiotlb.h
> +++ b/include/linux/swiotlb.h
> @@ -85,6 +85,7 @@ extern enum swiotlb_force swiotlb_force;
>   * @debugfs: The dentry to debugfs.
>   * @late_alloc:  %true if allocated using the page allocator
>   * @force_bounce: %true if swiotlb bouncing is forced
> + * @for_alloc:  %true if the pool is used for memory allocation
>   */
>  struct io_tlb_mem {
>   phys_addr_t start;
> @@ -96,6 +97,7 @@ struct io_tlb_mem {
>   struct dentry *debugfs;
>   bool late_alloc;
>   bool force_bounce;
> + bool for_alloc;
>   struct io_tlb_slot {
>   phys_addr_t orig_addr;
>   size_t alloc_size;
> @@ -156,4 +158,28 @@ static inline void swiotlb_adjust_size(unsigned long 
> size)
>  extern void swiotlb_print_info(void);
>  extern void swiotlb_set_max_segment(unsigned int);
>  
> +#ifdef CONFIG_DMA_RESTRICTED_POOL
> +struct page *swiotlb_alloc(struct device *dev, size_t size);
> +bool swiotlb_free(struct device *dev, struct page *page, size_t size);
> +
> +static inline bool is_swiotlb_for_alloc(struct device *dev)
> +{
> + return dev->dma_io_tlb_mem->for_alloc;
> +}
> +#else
> +static inline struct page *swiotlb_alloc(struct device *dev, size_t size)
> +{
> + return NULL;
> +}
> +static inline bool swiotlb_free(struct device *dev, struct page *page,
> + size_t size)
> +{
> + return false;
> +}
> +static inline bool is_swiotlb_for_alloc(struct device *dev)
> +{
> + return false;
> +}
> +#endif /* CONFIG_DMA_RESTRICTED_POOL */
> +
>  #endif /* __LINUX_SWIOTLB_H */
> diff --git a/kernel/dma/direct.c b/kernel/dma/direct.c
> index a92465b4eb12..2de33e5d302b 100644
> --- a/kernel/dma/direct.c
> +++ b/kernel/dma/direct.c
> @@ -75,6 +75,15 @@ static bool dma_coherent_ok(struct device *dev, 
> phys_addr_t phys, size_t size)
>   min_not_zero(dev->coherent_dma_mask, dev->bus_dma_limit);
>  }
>  
> +static void __dma_direct_free_pages(struct device *dev, struct page *page,
> + size_t size)
> +{
> + if (IS_ENABLED(CONFIG_DMA_RESTRICTED_POOL) &&
> + swiotlb_free(dev, page, size))
> + return;
> + dma_free_contiguous(dev, page, size);
> +}
> +
>  static struct page *__dma_direct_alloc_pages(struct device *dev, size_t size,
>   gfp_t gfp)
>  {
> @@ -86,6 +95,16 @@ static struct page *__dma_direct_alloc_pages(struct device *dev, size_t size,
>  
>   gfp |= dma_direct_optimal_gfp_mask(dev, dev->coherent_dma_mask,
>  &phys_limit);
> + if (IS_ENABLED(CONFIG_DMA_RESTRICTED_POOL) &&
> + is_swiotlb_for_alloc(dev)) {
> + page = swiotlb_alloc(dev, size);
> + if (page && !dma_coherent_ok(dev, page_to_phys(page), size)) {
> + __dma_direct_free_pages(dev, page, size);
> + return NULL;
> + }
> + return page;
> + }
> +
>   page = dma_alloc_contiguous(dev, size, gfp);
>   if (page && !dma_coherent_ok(dev, page_to_phys(page), size)) {
>   dma_free_contiguous(dev, page, size);
> @@ -142,7 +161,7 @@ void *dma_direct_alloc(struct device *dev, size_t size,
>   gfp |= __GFP_NOWARN;
>  
>   if ((attrs & DMA_ATTR_NO_KERNEL_MAPPING) &&
> - !force_dma_unencrypted(dev)) {
> + !force_dma_unencrypted(dev) && !is_swiotlb_for_alloc(dev)) {
>   page = __dma_direct_alloc_pages(dev, size, gfp & ~__GFP_ZERO);
>   if (!page)
>   return NULL;
> @@ -155,18 +174,23 @@ void *dma_direct_alloc(struct device *dev, size_t size,
>   }
>  
>   if (!IS_ENABLED(CONFIG_ARCH_HAS_DMA_SET_UNCACHED) &&
> - !IS_ENABLED(CONFIG_DMA_DIRECT_REMAP) &&
> - !dev_is_dma_coherent(dev))
> + !IS_ENABLED(CONFIG_DMA_DIRECT_REMAP) && !dev_is_dma_coherent(dev) &&
> + !is_swiotlb_for_alloc(dev))
>   return arch_dma_alloc(dev, size, dma_handle, gfp, attrs);
>  
>   /*
>* 

Re: [PATCH v13 06/12] swiotlb: Use is_swiotlb_force_bounce for swiotlb data bouncing

2021-06-17 Thread Stefano Stabellini
On Thu, 17 Jun 2021, Claire Chang wrote:
> Propagate the swiotlb_force into io_tlb_default_mem->force_bounce and
> use it to determine whether to bounce the data or not. This will be
> useful later to allow for different pools.
> 
> Signed-off-by: Claire Chang 
> Reviewed-by: Christoph Hellwig 
> Tested-by: Stefano Stabellini 
> Tested-by: Will Deacon 

Acked-by: Stefano Stabellini 


> ---
>  drivers/xen/swiotlb-xen.c |  2 +-
>  include/linux/swiotlb.h   | 11 +++
>  kernel/dma/direct.c   |  2 +-
>  kernel/dma/direct.h   |  2 +-
>  kernel/dma/swiotlb.c  |  4 
>  5 files changed, 18 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
> index 0c6ed09f8513..4730a146fa35 100644
> --- a/drivers/xen/swiotlb-xen.c
> +++ b/drivers/xen/swiotlb-xen.c
> @@ -369,7 +369,7 @@ static dma_addr_t xen_swiotlb_map_page(struct device 
> *dev, struct page *page,
>   if (dma_capable(dev, dev_addr, size, true) &&
>   !range_straddles_page_boundary(phys, size) &&
>   !xen_arch_need_swiotlb(dev, phys, dev_addr) &&
> - swiotlb_force != SWIOTLB_FORCE)
> + !is_swiotlb_force_bounce(dev))
>   goto done;
>  
>   /*
> diff --git a/include/linux/swiotlb.h b/include/linux/swiotlb.h
> index dd1c30a83058..8d8855c77d9a 100644
> --- a/include/linux/swiotlb.h
> +++ b/include/linux/swiotlb.h
> @@ -84,6 +84,7 @@ extern enum swiotlb_force swiotlb_force;
>   *   unmap calls.
>   * @debugfs: The dentry to debugfs.
>   * @late_alloc:  %true if allocated using the page allocator
> + * @force_bounce: %true if swiotlb bouncing is forced
>   */
>  struct io_tlb_mem {
>   phys_addr_t start;
> @@ -94,6 +95,7 @@ struct io_tlb_mem {
>   spinlock_t lock;
>   struct dentry *debugfs;
>   bool late_alloc;
> + bool force_bounce;
>   struct io_tlb_slot {
>   phys_addr_t orig_addr;
>   size_t alloc_size;
> @@ -109,6 +111,11 @@ static inline bool is_swiotlb_buffer(struct device *dev, 
> phys_addr_t paddr)
>   return mem && paddr >= mem->start && paddr < mem->end;
>  }
>  
> +static inline bool is_swiotlb_force_bounce(struct device *dev)
> +{
> + return dev->dma_io_tlb_mem->force_bounce;
> +}
> +
>  void __init swiotlb_exit(void);
>  unsigned int swiotlb_max_segment(void);
>  size_t swiotlb_max_mapping_size(struct device *dev);
> @@ -120,6 +127,10 @@ static inline bool is_swiotlb_buffer(struct device *dev, 
> phys_addr_t paddr)
>  {
>   return false;
>  }
> +static inline bool is_swiotlb_force_bounce(struct device *dev)
> +{
> + return false;
> +}
>  static inline void swiotlb_exit(void)
>  {
>  }
> diff --git a/kernel/dma/direct.c b/kernel/dma/direct.c
> index 7a88c34d0867..a92465b4eb12 100644
> --- a/kernel/dma/direct.c
> +++ b/kernel/dma/direct.c
> @@ -496,7 +496,7 @@ size_t dma_direct_max_mapping_size(struct device *dev)
>  {
>   /* If SWIOTLB is active, use its maximum mapping size */
>   if (is_swiotlb_active(dev) &&
> - (dma_addressing_limited(dev) || swiotlb_force == SWIOTLB_FORCE))
> + (dma_addressing_limited(dev) || is_swiotlb_force_bounce(dev)))
>   return swiotlb_max_mapping_size(dev);
>   return SIZE_MAX;
>  }
> diff --git a/kernel/dma/direct.h b/kernel/dma/direct.h
> index 13e9e7158d94..4632b0f4f72e 100644
> --- a/kernel/dma/direct.h
> +++ b/kernel/dma/direct.h
> @@ -87,7 +87,7 @@ static inline dma_addr_t dma_direct_map_page(struct device 
> *dev,
>   phys_addr_t phys = page_to_phys(page) + offset;
>   dma_addr_t dma_addr = phys_to_dma(dev, phys);
>  
> - if (unlikely(swiotlb_force == SWIOTLB_FORCE))
> + if (is_swiotlb_force_bounce(dev))
>   return swiotlb_map(dev, phys, size, dir, attrs);
>  
>   if (unlikely(!dma_capable(dev, dma_addr, size, true))) {
> diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
> index 409694d7a8ad..13891d5de8c9 100644
> --- a/kernel/dma/swiotlb.c
> +++ b/kernel/dma/swiotlb.c
> @@ -179,6 +179,10 @@ static void swiotlb_init_io_tlb_mem(struct io_tlb_mem *mem, phys_addr_t start,
>   mem->end = mem->start + bytes;
>   mem->index = 0;
>   mem->late_alloc = late_alloc;
> +
> + if (swiotlb_force == SWIOTLB_FORCE)
> + mem->force_bounce = true;
> +
>   spin_lock_init(&mem->lock);
>   for (i = 0; i < mem->nslabs; i++) {
>   mem->slots[i].list = IO_TLB_SEGSIZE - io_tlb_offset(i);
> -- 
> 2.32.0.288.g62a8d224e6-goog
> 


Re: [PATCH v13 05/12] swiotlb: Update is_swiotlb_active to add a struct device argument

2021-06-17 Thread Stefano Stabellini
On Thu, 17 Jun 2021, Claire Chang wrote:
> Update is_swiotlb_active to add a struct device argument. This will be
> useful later to allow for different pools.
> 
> Signed-off-by: Claire Chang 
> Reviewed-by: Christoph Hellwig 
> Tested-by: Stefano Stabellini 
> Tested-by: Will Deacon 

Acked-by: Stefano Stabellini 


> ---
>  drivers/gpu/drm/i915/gem/i915_gem_internal.c | 2 +-
>  drivers/gpu/drm/nouveau/nouveau_ttm.c| 2 +-
>  drivers/pci/xen-pcifront.c   | 2 +-
>  include/linux/swiotlb.h  | 4 ++--
>  kernel/dma/direct.c  | 2 +-
>  kernel/dma/swiotlb.c | 4 ++--
>  6 files changed, 8 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_internal.c 
> b/drivers/gpu/drm/i915/gem/i915_gem_internal.c
> index a9d65fc8aa0e..4b7afa0fc85d 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_internal.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_internal.c
> @@ -42,7 +42,7 @@ static int i915_gem_object_get_pages_internal(struct 
> drm_i915_gem_object *obj)
>  
>   max_order = MAX_ORDER;
>  #ifdef CONFIG_SWIOTLB
> - if (is_swiotlb_active()) {
> + if (is_swiotlb_active(obj->base.dev->dev)) {
>   unsigned int max_segment;
>  
>   max_segment = swiotlb_max_segment();
> diff --git a/drivers/gpu/drm/nouveau/nouveau_ttm.c 
> b/drivers/gpu/drm/nouveau/nouveau_ttm.c
> index 9662522aa066..be15bfd9e0ee 100644
> --- a/drivers/gpu/drm/nouveau/nouveau_ttm.c
> +++ b/drivers/gpu/drm/nouveau/nouveau_ttm.c
> @@ -321,7 +321,7 @@ nouveau_ttm_init(struct nouveau_drm *drm)
>   }
>  
>  #if IS_ENABLED(CONFIG_SWIOTLB) && IS_ENABLED(CONFIG_X86)
> - need_swiotlb = is_swiotlb_active();
> + need_swiotlb = is_swiotlb_active(dev->dev);
>  #endif
>  
>   ret = ttm_bo_device_init(&drm->ttm.bdev, &nouveau_bo_driver,
> diff --git a/drivers/pci/xen-pcifront.c b/drivers/pci/xen-pcifront.c
> index b7a8f3a1921f..0d56985bfe81 100644
> --- a/drivers/pci/xen-pcifront.c
> +++ b/drivers/pci/xen-pcifront.c
> @@ -693,7 +693,7 @@ static int pcifront_connect_and_init_dma(struct pcifront_device *pdev)
>  
>   spin_unlock(&pcifront_dev_lock);
>  
> - if (!err && !is_swiotlb_active()) {
> + if (!err && !is_swiotlb_active(&pdev->xdev->dev)) {
>   err = pci_xen_swiotlb_init_late();
>   if (err)
>   dev_err(&pdev->xdev->dev, "Could not setup SWIOTLB!\n");
> diff --git a/include/linux/swiotlb.h b/include/linux/swiotlb.h
> index d1f3d95881cd..dd1c30a83058 100644
> --- a/include/linux/swiotlb.h
> +++ b/include/linux/swiotlb.h
> @@ -112,7 +112,7 @@ static inline bool is_swiotlb_buffer(struct device *dev, 
> phys_addr_t paddr)
>  void __init swiotlb_exit(void);
>  unsigned int swiotlb_max_segment(void);
>  size_t swiotlb_max_mapping_size(struct device *dev);
> -bool is_swiotlb_active(void);
> +bool is_swiotlb_active(struct device *dev);
>  void __init swiotlb_adjust_size(unsigned long size);
>  #else
>  #define swiotlb_force SWIOTLB_NO_FORCE
> @@ -132,7 +132,7 @@ static inline size_t swiotlb_max_mapping_size(struct 
> device *dev)
>   return SIZE_MAX;
>  }
>  
> -static inline bool is_swiotlb_active(void)
> +static inline bool is_swiotlb_active(struct device *dev)
>  {
>   return false;
>  }
> diff --git a/kernel/dma/direct.c b/kernel/dma/direct.c
> index 84c9feb5474a..7a88c34d0867 100644
> --- a/kernel/dma/direct.c
> +++ b/kernel/dma/direct.c
> @@ -495,7 +495,7 @@ int dma_direct_supported(struct device *dev, u64 mask)
>  size_t dma_direct_max_mapping_size(struct device *dev)
>  {
>   /* If SWIOTLB is active, use its maximum mapping size */
> - if (is_swiotlb_active() &&
> + if (is_swiotlb_active(dev) &&
>   (dma_addressing_limited(dev) || swiotlb_force == SWIOTLB_FORCE))
>   return swiotlb_max_mapping_size(dev);
>   return SIZE_MAX;
> diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
> index de79e9437030..409694d7a8ad 100644
> --- a/kernel/dma/swiotlb.c
> +++ b/kernel/dma/swiotlb.c
> @@ -664,9 +664,9 @@ size_t swiotlb_max_mapping_size(struct device *dev)
>   return ((size_t)IO_TLB_SIZE) * IO_TLB_SEGSIZE;
>  }
>  
> -bool is_swiotlb_active(void)
> +bool is_swiotlb_active(struct device *dev)
>  {
> - return io_tlb_default_mem != NULL;
> + return dev->dma_io_tlb_mem != NULL;
>  }
>  EXPORT_SYMBOL_GPL(is_swiotlb_active);
>  
> -- 
> 2.32.0.288.g62a8d224e6-goog
> 


Re: [PATCH v13 04/12] swiotlb: Update is_swiotlb_buffer to add a struct device argument

2021-06-17 Thread Stefano Stabellini
On Thu, 17 Jun 2021, Claire Chang wrote:
> Update is_swiotlb_buffer to add a struct device argument. This will be
> useful later to allow for different pools.
> 
> Signed-off-by: Claire Chang 
> Reviewed-by: Christoph Hellwig 
> Tested-by: Stefano Stabellini 
> Tested-by: Will Deacon 

Acked-by: Stefano Stabellini 


> ---
>  drivers/iommu/dma-iommu.c | 12 ++--
>  drivers/xen/swiotlb-xen.c |  2 +-
>  include/linux/swiotlb.h   |  7 ---
>  kernel/dma/direct.c   |  6 +++---
>  kernel/dma/direct.h   |  6 +++---
>  5 files changed, 17 insertions(+), 16 deletions(-)
> 
> diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
> index 3087d9fa6065..10997ef541f8 100644
> --- a/drivers/iommu/dma-iommu.c
> +++ b/drivers/iommu/dma-iommu.c
> @@ -507,7 +507,7 @@ static void __iommu_dma_unmap_swiotlb(struct device *dev, 
> dma_addr_t dma_addr,
>  
>   __iommu_dma_unmap(dev, dma_addr, size);
>  
> - if (unlikely(is_swiotlb_buffer(phys)))
> + if (unlikely(is_swiotlb_buffer(dev, phys)))
>   swiotlb_tbl_unmap_single(dev, phys, size, dir, attrs);
>  }
>  
> @@ -578,7 +578,7 @@ static dma_addr_t __iommu_dma_map_swiotlb(struct device 
> *dev, phys_addr_t phys,
>   }
>  
>   iova = __iommu_dma_map(dev, phys, aligned_size, prot, dma_mask);
> - if (iova == DMA_MAPPING_ERROR && is_swiotlb_buffer(phys))
> + if (iova == DMA_MAPPING_ERROR && is_swiotlb_buffer(dev, phys))
>   swiotlb_tbl_unmap_single(dev, phys, org_size, dir, attrs);
>   return iova;
>  }
> @@ -749,7 +749,7 @@ static void iommu_dma_sync_single_for_cpu(struct device 
> *dev,
>   if (!dev_is_dma_coherent(dev))
>   arch_sync_dma_for_cpu(phys, size, dir);
>  
> - if (is_swiotlb_buffer(phys))
> + if (is_swiotlb_buffer(dev, phys))
>   swiotlb_sync_single_for_cpu(dev, phys, size, dir);
>  }
>  
> @@ -762,7 +762,7 @@ static void iommu_dma_sync_single_for_device(struct 
> device *dev,
>   return;
>  
>   phys = iommu_iova_to_phys(iommu_get_dma_domain(dev), dma_handle);
> - if (is_swiotlb_buffer(phys))
> + if (is_swiotlb_buffer(dev, phys))
>   swiotlb_sync_single_for_device(dev, phys, size, dir);
>  
>   if (!dev_is_dma_coherent(dev))
> @@ -783,7 +783,7 @@ static void iommu_dma_sync_sg_for_cpu(struct device *dev,
>   if (!dev_is_dma_coherent(dev))
>   arch_sync_dma_for_cpu(sg_phys(sg), sg->length, dir);
>  
> - if (is_swiotlb_buffer(sg_phys(sg)))
> + if (is_swiotlb_buffer(dev, sg_phys(sg)))
>   swiotlb_sync_single_for_cpu(dev, sg_phys(sg),
>   sg->length, dir);
>   }
> @@ -800,7 +800,7 @@ static void iommu_dma_sync_sg_for_device(struct device 
> *dev,
>   return;
>  
>   for_each_sg(sgl, sg, nelems, i) {
> - if (is_swiotlb_buffer(sg_phys(sg)))
> + if (is_swiotlb_buffer(dev, sg_phys(sg)))
>   swiotlb_sync_single_for_device(dev, sg_phys(sg),
>  sg->length, dir);
>  
> diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
> index 4c89afc0df62..0c6ed09f8513 100644
> --- a/drivers/xen/swiotlb-xen.c
> +++ b/drivers/xen/swiotlb-xen.c
> @@ -100,7 +100,7 @@ static int is_xen_swiotlb_buffer(struct device *dev, 
> dma_addr_t dma_addr)
>* in our domain. Therefore _only_ check address within our domain.
>*/
>   if (pfn_valid(PFN_DOWN(paddr)))
> - return is_swiotlb_buffer(paddr);
> + return is_swiotlb_buffer(dev, paddr);
>   return 0;
>  }
>  
> diff --git a/include/linux/swiotlb.h b/include/linux/swiotlb.h
> index 216854a5e513..d1f3d95881cd 100644
> --- a/include/linux/swiotlb.h
> +++ b/include/linux/swiotlb.h
> @@ -2,6 +2,7 @@
>  #ifndef __LINUX_SWIOTLB_H
>  #define __LINUX_SWIOTLB_H
>  
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -101,9 +102,9 @@ struct io_tlb_mem {
>  };
>  extern struct io_tlb_mem *io_tlb_default_mem;
>  
> -static inline bool is_swiotlb_buffer(phys_addr_t paddr)
> +static inline bool is_swiotlb_buffer(struct device *dev, phys_addr_t paddr)
>  {
> - struct io_tlb_mem *mem = io_tlb_default_mem;
> + struct io_tlb_mem *mem = dev->dma_io_tlb_mem;
>  
>   return mem && paddr >= mem->start && paddr < mem->end;
>  }
> @@ -115,7 +116,7 @@ bool is_swiotlb_active(void);
>  void __init swiotlb_adjust_size(unsigned long size);
>  #else
>  #define swiotlb_force SWIOTLB_NO_FORCE
> -static inline bool is_swiotlb_buffer(phys_addr_t paddr)
> +static inline bool is_swiotlb_buffer(struct device *dev, phys_addr_t paddr)
>  {
>   return false;
>  }
> diff --git a/kernel/dma/direct.c b/kernel/dma/direct.c
> index f737e3347059..84c9feb5474a 100644
> --- a/kernel/dma/direct.c
> +++ b/kernel/dma/direct.c
> @@ -343,7 +343,7 @@ void dma_direct_sync_sg_for_device(struct device *dev,
>   for_each_sg(sgl, sg, 

Re: [PATCH v13 03/12] swiotlb: Set dev->dma_io_tlb_mem to the swiotlb pool used

2021-06-17 Thread Stefano Stabellini
On Thu, 17 Jun 2021, Claire Chang wrote:
> Always have the pointer to the swiotlb pool used in struct device. This
> could help simplify the code for other pools.
> 
> Signed-off-by: Claire Chang 
> Reviewed-by: Christoph Hellwig 
> Tested-by: Stefano Stabellini 
> Tested-by: Will Deacon 

Acked-by: Stefano Stabellini 

> ---
>  drivers/base/core.c| 4 
>  include/linux/device.h | 4 
>  kernel/dma/swiotlb.c   | 8 
>  3 files changed, 12 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/base/core.c b/drivers/base/core.c
> index f29839382f81..cb3123e3954d 100644
> --- a/drivers/base/core.c
> +++ b/drivers/base/core.c
> @@ -27,6 +27,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  #include  /* for dma_default_coherent */
>  
> @@ -2736,6 +2737,9 @@ void device_initialize(struct device *dev)
>  defined(CONFIG_ARCH_HAS_SYNC_DMA_FOR_CPU_ALL)
>   dev->dma_coherent = dma_default_coherent;
>  #endif
> +#ifdef CONFIG_SWIOTLB
> + dev->dma_io_tlb_mem = io_tlb_default_mem;
> +#endif
>  }
>  EXPORT_SYMBOL_GPL(device_initialize);
>  
> diff --git a/include/linux/device.h b/include/linux/device.h
> index ba660731bd25..240d652a0696 100644
> --- a/include/linux/device.h
> +++ b/include/linux/device.h
> @@ -416,6 +416,7 @@ struct dev_links_info {
>   * @dma_pools:   Dma pools (if dma'ble device).
>   * @dma_mem: Internal for coherent mem override.
>   * @cma_area:Contiguous memory area for dma allocations
> + * @dma_io_tlb_mem: Pointer to the swiotlb pool used.  Not for driver use.
>   * @archdata:For arch-specific additions.
>   * @of_node: Associated device tree node.
>   * @fwnode:  Associated device node supplied by platform firmware.
> @@ -518,6 +519,9 @@ struct device {
>  #ifdef CONFIG_DMA_CMA
>   struct cma *cma_area;   /* contiguous memory area for dma
>  allocations */
> +#endif
> +#ifdef CONFIG_SWIOTLB
> + struct io_tlb_mem *dma_io_tlb_mem;
>  #endif
>   /* arch specific additions */
>   struct dev_archdata archdata;
> diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
> index 2dba659a1e73..de79e9437030 100644
> --- a/kernel/dma/swiotlb.c
> +++ b/kernel/dma/swiotlb.c
> @@ -340,7 +340,7 @@ void __init swiotlb_exit(void)
>  static void swiotlb_bounce(struct device *dev, phys_addr_t tlb_addr, size_t 
> size,
>  enum dma_data_direction dir)
>  {
> - struct io_tlb_mem *mem = io_tlb_default_mem;
> + struct io_tlb_mem *mem = dev->dma_io_tlb_mem;
>   int index = (tlb_addr - mem->start) >> IO_TLB_SHIFT;
>   unsigned int offset = (tlb_addr - mem->start) & (IO_TLB_SIZE - 1);
>   phys_addr_t orig_addr = mem->slots[index].orig_addr;
> @@ -431,7 +431,7 @@ static unsigned int wrap_index(struct io_tlb_mem *mem, 
> unsigned int index)
>  static int find_slots(struct device *dev, phys_addr_t orig_addr,
>   size_t alloc_size)
>  {
> - struct io_tlb_mem *mem = io_tlb_default_mem;
> + struct io_tlb_mem *mem = dev->dma_io_tlb_mem;
>   unsigned long boundary_mask = dma_get_seg_boundary(dev);
>   dma_addr_t tbl_dma_addr =
>   phys_to_dma_unencrypted(dev, mem->start) & boundary_mask;
> @@ -508,7 +508,7 @@ phys_addr_t swiotlb_tbl_map_single(struct device *dev, 
> phys_addr_t orig_addr,
>   size_t mapping_size, size_t alloc_size,
>   enum dma_data_direction dir, unsigned long attrs)
>  {
> - struct io_tlb_mem *mem = io_tlb_default_mem;
> + struct io_tlb_mem *mem = dev->dma_io_tlb_mem;
>   unsigned int offset = swiotlb_align_offset(dev, orig_addr);
>   unsigned int i;
>   int index;
> @@ -559,7 +559,7 @@ void swiotlb_tbl_unmap_single(struct device *hwdev, 
> phys_addr_t tlb_addr,
> size_t mapping_size, enum dma_data_direction dir,
> unsigned long attrs)
>  {
> - struct io_tlb_mem *mem = io_tlb_default_mem;
> + struct io_tlb_mem *mem = hwdev->dma_io_tlb_mem;
>   unsigned long flags;
>   unsigned int offset = swiotlb_align_offset(hwdev, tlb_addr);
>   int index = (tlb_addr - offset - mem->start) >> IO_TLB_SHIFT;
> -- 
> 2.32.0.288.g62a8d224e6-goog
> 


Re: [PATCH v13 01/12] swiotlb: Refactor swiotlb init functions

2021-06-17 Thread Stefano Stabellini
On Thu, 17 Jun 2021, Claire Chang wrote:
> Add a new function, swiotlb_init_io_tlb_mem, for the io_tlb_mem struct
> initialization to make the code reusable.
> 
> Signed-off-by: Claire Chang 
> Reviewed-by: Christoph Hellwig 
> Tested-by: Stefano Stabellini 
> Tested-by: Will Deacon 
> ---
>  kernel/dma/swiotlb.c | 50 ++--
>  1 file changed, 25 insertions(+), 25 deletions(-)
> 
> diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
> index 52e2ac526757..47bb2a766798 100644
> --- a/kernel/dma/swiotlb.c
> +++ b/kernel/dma/swiotlb.c
> @@ -168,9 +168,28 @@ void __init swiotlb_update_mem_attributes(void)
>   memset(vaddr, 0, bytes);
>  }
>  
> -int __init swiotlb_init_with_tbl(char *tlb, unsigned long nslabs, int verbose)
> +static void swiotlb_init_io_tlb_mem(struct io_tlb_mem *mem, phys_addr_t start,
> + unsigned long nslabs, bool late_alloc)
>  {
> + void *vaddr = phys_to_virt(start);
>   unsigned long bytes = nslabs << IO_TLB_SHIFT, i;
> +
> + mem->nslabs = nslabs;
> + mem->start = start;
> + mem->end = mem->start + bytes;
> + mem->index = 0;
> + mem->late_alloc = late_alloc;
> + spin_lock_init(&mem->lock);
> + for (i = 0; i < mem->nslabs; i++) {
> + mem->slots[i].list = IO_TLB_SEGSIZE - io_tlb_offset(i);
> + mem->slots[i].orig_addr = INVALID_PHYS_ADDR;
> + mem->slots[i].alloc_size = 0;
> + }
> + memset(vaddr, 0, bytes);
> +}
> +
> +int __init swiotlb_init_with_tbl(char *tlb, unsigned long nslabs, int verbose)
> +{
>   struct io_tlb_mem *mem;
>   size_t alloc_size;
>  
> @@ -186,16 +205,8 @@ int __init swiotlb_init_with_tbl(char *tlb, unsigned long nslabs, int verbose)
>   if (!mem)
>   panic("%s: Failed to allocate %zu bytes align=0x%lx\n",
> __func__, alloc_size, PAGE_SIZE);
> - mem->nslabs = nslabs;
> - mem->start = __pa(tlb);
> - mem->end = mem->start + bytes;
> - mem->index = 0;
> - spin_lock_init(&mem->lock);
> - for (i = 0; i < mem->nslabs; i++) {
> - mem->slots[i].list = IO_TLB_SEGSIZE - io_tlb_offset(i);
> - mem->slots[i].orig_addr = INVALID_PHYS_ADDR;
> - mem->slots[i].alloc_size = 0;
> - }
> +
> + swiotlb_init_io_tlb_mem(mem, __pa(tlb), nslabs, false);
>  
>   io_tlb_default_mem = mem;
>   if (verbose)
> @@ -282,8 +293,8 @@ swiotlb_late_init_with_default_size(size_t default_size)
>  int
>  swiotlb_late_init_with_tbl(char *tlb, unsigned long nslabs)
>  {
> - unsigned long bytes = nslabs << IO_TLB_SHIFT, i;
>   struct io_tlb_mem *mem;
> + unsigned long bytes = nslabs << IO_TLB_SHIFT;
>  
>   if (swiotlb_force == SWIOTLB_NO_FORCE)
>   return 0;
> @@ -297,20 +308,9 @@ swiotlb_late_init_with_tbl(char *tlb, unsigned long nslabs)
>   if (!mem)
>   return -ENOMEM;
>  
> - mem->nslabs = nslabs;
> - mem->start = virt_to_phys(tlb);
> - mem->end = mem->start + bytes;
> - mem->index = 0;
> - mem->late_alloc = 1;
> - spin_lock_init(&mem->lock);
> - for (i = 0; i < mem->nslabs; i++) {
> - mem->slots[i].list = IO_TLB_SEGSIZE - io_tlb_offset(i);
> - mem->slots[i].orig_addr = INVALID_PHYS_ADDR;
> - mem->slots[i].alloc_size = 0;
> - }
> -
> + memset(mem, 0, sizeof(*mem));
> + swiotlb_init_io_tlb_mem(mem, virt_to_phys(tlb), nslabs, true);
>   set_memory_decrypted((unsigned long)tlb, bytes >> PAGE_SHIFT);
> - memset(tlb, 0, bytes);
 
This is good for swiotlb_late_init_with_tbl. However I have just noticed
that mem could also be allocated from swiotlb_init_with_tbl, in which
case the zeroing is missing. I think we need another memset in
swiotlb_init_with_tbl as well. Or maybe it could be better to have a
single memset at the beginning of swiotlb_init_io_tlb_mem instead. Up to
you.
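
As a rough illustration of the second option (a single memset at the top of the shared init helper), here is a minimal userspace sketch. It is not the kernel code: `struct pool`, `struct pool_slot` and `pool_init` are hypothetical stand-ins for `struct io_tlb_mem` and `swiotlb_init_io_tlb_mem`, and the slot array is a plain pointer here rather than part of the descriptor.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for one slot of the pool. */
struct pool_slot {
	size_t alloc_size;
};

/* Hypothetical, simplified stand-in for struct io_tlb_mem. */
struct pool {
	unsigned long nslabs;
	unsigned long index;
	bool late_alloc;
	bool force_bounce;
	struct pool_slot *slots;	/* caller-provided slot array */
};

/*
 * Zero the descriptor once, at the top of the shared init helper, so that
 * every caller gets the same guarantee no matter how the descriptor was
 * allocated. Fields that are not set explicitly below (index, force_bounce)
 * are then known to be zero.
 */
static void pool_init(struct pool *p, struct pool_slot *slots,
		      unsigned long nslabs, bool late_alloc)
{
	unsigned long i;

	assert(p != NULL && slots != NULL);
	memset(p, 0, sizeof(*p));	/* the single memset for all callers */

	p->nslabs = nslabs;
	p->late_alloc = late_alloc;
	p->slots = slots;
	for (i = 0; i < nslabs; i++)
		p->slots[i].alloc_size = 0;
}
```

With this shape, neither the early nor the late caller needs its own memset of the descriptor, which is the simplification suggested above.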


Re: [PATCH v6 04/17] powerpc/vas: Add platform specific user window operations

2021-06-17 Thread Nicholas Piggin
Excerpts from Haren Myneni's message of June 18, 2021 6:31 am:
> 
> PowerNV uses registers to open/close VAS windows and to get the
> paste address, whereas hypervisor calls are used on PowerVM.
> 
> This patch adds the platform-specific user space window operations
> and registers them with the common VAS user space interface.
> 
> Signed-off-by: Haren Myneni 
> Reviewed-by: Nicholas Piggin 
> ---
>  arch/powerpc/include/asm/vas.h  | 16 +--
>  arch/powerpc/platforms/book3s/vas-api.c | 53 +
>  arch/powerpc/platforms/powernv/vas-window.c | 45 -
>  arch/powerpc/platforms/powernv/vas.h|  2 +
>  4 files changed, 91 insertions(+), 25 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/vas.h b/arch/powerpc/include/asm/vas.h
> index 6076adf9ab4f..163a8bb85d02 100644
> --- a/arch/powerpc/include/asm/vas.h
> +++ b/arch/powerpc/include/asm/vas.h
> @@ -5,6 +5,7 @@
>  
>  #ifndef _ASM_POWERPC_VAS_H
>  #define _ASM_POWERPC_VAS_H
> +#include 
>  
>  struct vas_window;
>  
> @@ -48,6 +49,16 @@ enum vas_cop_type {
>   VAS_COP_TYPE_MAX,
>  };
>  
> +/*
> + * User space window operations used for powernv and powerVM
> + */
> +struct vas_user_win_ops {
> + struct vas_window * (*open_win)(int vas_id, u64 flags,
> + enum vas_cop_type);

Thanks for changing that to not pass down the struct passed in by the 
user. Looks good.

Thanks,
Nick
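
For readers following the thread, the point about not passing the user's struct down can be sketched as below. This is a userspace illustration only: `struct window`, `struct user_open_attr`, `common_open`, the `close_win` member and the demo backend are all hypothetical names; only the shape of `open_win` (vas_id, flags, coprocessor type) follows the quoted hunk.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical window object; the real struct vas_window is not shown here. */
struct window {
	int vas_id;
	uint64_t flags;
	int cop;
	int open;
};

/* Ops table: backends receive already-unpacked values, not the user's struct. */
struct user_win_ops {
	struct window *(*open_win)(int vas_id, uint64_t flags, int cop);
	int (*close_win)(struct window *win);	/* hypothetical member */
};

/* Illustrative layout of what user space passes in. */
struct user_open_attr {
	int vas_id;
	uint64_t flags;
};

static struct window demo_window;	/* one static window keeps the demo simple */

static struct window *demo_open(int vas_id, uint64_t flags, int cop)
{
	demo_window.vas_id = vas_id;
	demo_window.flags = flags;
	demo_window.cop = cop;
	demo_window.open = 1;
	return &demo_window;
}

static int demo_close(struct window *win)
{
	assert(win != NULL);
	win->open = 0;
	return 0;
}

static const struct user_win_ops demo_ops = {
	.open_win = demo_open,
	.close_win = demo_close,
};

/*
 * Common layer: validate and unpack the user-supplied attributes here, then
 * hand only plain values to the platform backend.
 */
static struct window *common_open(const struct user_win_ops *ops,
				  const struct user_open_attr *attr, int cop)
{
	if (!ops || !ops->open_win || !attr || attr->vas_id < 0)
		return NULL;
	return ops->open_win(attr->vas_id, attr->flags, cop);
}
```

Keeping the user-visible layout in the common layer means a backend never has to know or re-validate it.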



Re: [PATCH v6 11/17] powerpc/pseries/vas: Implement getting capabilities from hypervisor

2021-06-17 Thread Nicholas Piggin
Excerpts from Haren Myneni's message of June 18, 2021 6:35 am:
> 
> The hypervisor provides VAS capabilities for GZIP default and QoS
> features. These capabilities give information for the specific
> features such as the total number of credits available in the LPAR,
> maximum credits allowed per window, maximum credits allowed in
> the LPAR, whether usermode copy/paste is supported, etc.
> 
> This patch adds the following:
> - Retrieve all features that are provided by hypervisor using
>   H_QUERY_VAS_CAPABILITIES hcall with 0 as feature type.
> - Retrieve capabilities for the specific feature using the same
>   hcall and the feature type (1 for QoS and 2 for default type).
> 
> Signed-off-by: Haren Myneni 

Reviewed-by: Nicholas Piggin 

> ---
>  arch/powerpc/platforms/pseries/vas.c | 122 +++
>  1 file changed, 122 insertions(+)
> 
> diff --git a/arch/powerpc/platforms/pseries/vas.c 
> b/arch/powerpc/platforms/pseries/vas.c
> index a73d7d00bf55..93794e12527d 100644
> --- a/arch/powerpc/platforms/pseries/vas.c
> +++ b/arch/powerpc/platforms/pseries/vas.c
> @@ -10,6 +10,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -20,6 +21,11 @@
>  /* The hypervisor allows one credit per window right now */
>  #define DEF_WIN_CREDS1
>  
> +static struct vas_all_caps caps_all;
> +static bool copypaste_feat;
> +
> +static struct vas_caps vascaps[VAS_MAX_FEAT_TYPE];
> +
>  static long hcall_return_busy_check(long rc)
>  {
>   /* Check if we are stalled for some time */
> @@ -145,3 +151,119 @@ int h_query_vas_capabilities(const u64 hcall, u8 
> query_type, u64 result)
>   hcall, rc, query_type, result);
>   return -EIO;
>  }
> +
> +/*
> + * Get the specific capabilities based on the feature type.
> + * Right now supports GZIP default and GZIP QoS capabilities.
> + */
> +static int get_vas_capabilities(u8 feat, enum vas_cop_feat_type type,
> + struct hv_vas_cop_feat_caps *hv_caps)
> +{
> + struct vas_cop_feat_caps *caps;
> + struct vas_caps *vcaps;
> + int rc = 0;
> +
> + vcaps = &vascaps[type];
> + memset(vcaps, 0, sizeof(*vcaps));
> + INIT_LIST_HEAD(&vcaps->list);
> +
> + caps = &vcaps->caps;
> +
> + rc = h_query_vas_capabilities(H_QUERY_VAS_CAPABILITIES, feat,
> +   (u64)virt_to_phys(hv_caps));
> + if (rc)
> + return rc;
> +
> + caps->user_mode = hv_caps->user_mode;
> + if (!(caps->user_mode & VAS_COPY_PASTE_USER_MODE)) {
> + pr_err("User space COPY/PASTE is not supported\n");
> + return -ENOTSUPP;
> + }
> +
> + caps->descriptor = be64_to_cpu(hv_caps->descriptor);
> + caps->win_type = hv_caps->win_type;
> + if (caps->win_type >= VAS_MAX_FEAT_TYPE) {
> + pr_err("Unsupported window type %u\n", caps->win_type);
> + return -EINVAL;
> + }
> + caps->max_lpar_creds = be16_to_cpu(hv_caps->max_lpar_creds);
> + caps->max_win_creds = be16_to_cpu(hv_caps->max_win_creds);
> + atomic_set(&caps->target_lpar_creds,
> +be16_to_cpu(hv_caps->target_lpar_creds));
> + if (feat == VAS_GZIP_DEF_FEAT) {
> + caps->def_lpar_creds = be16_to_cpu(hv_caps->def_lpar_creds);
> +
> + if (caps->max_win_creds < DEF_WIN_CREDS) {
> + pr_err("Window creds(%u) > max allowed window creds(%u)\n",
> +DEF_WIN_CREDS, caps->max_win_creds);
> + return -EINVAL;
> + }
> + }
> +
> + copypaste_feat = true;
> +
> + return 0;
> +}
> +
> +static int __init pseries_vas_init(void)
> +{
> + struct hv_vas_cop_feat_caps *hv_cop_caps;
> + struct hv_vas_all_caps *hv_caps;
> + int rc;
> +
> + /*
> +  * Linux supports user space COPY/PASTE only with Radix
> +  */
> + if (!radix_enabled()) {
> + pr_err("API is supported only with radix page tables\n");
> + return -ENOTSUPP;
> + }
> +
> + hv_caps = kmalloc(sizeof(*hv_caps), GFP_KERNEL);
> + if (!hv_caps)
> + return -ENOMEM;
> + /*
> +  * Get VAS overall capabilities by passing 0 to feature type.
> +  */
> + rc = h_query_vas_capabilities(H_QUERY_VAS_CAPABILITIES, 0,
> +   (u64)virt_to_phys(hv_caps));
> + if (rc)
> + goto out;
> +
> + caps_all.descriptor = be64_to_cpu(hv_caps->descriptor);
> + caps_all.feat_type = be64_to_cpu(hv_caps->feat_type);
> +
> + hv_cop_caps = kmalloc(sizeof(*hv_cop_caps), GFP_KERNEL);
> + if (!hv_cop_caps) {
> + rc = -ENOMEM;
> + goto out;
> + }
> + /*
> +  * QOS capabilities available
> +  */
> + if (caps_all.feat_type & VAS_GZIP_QOS_FEAT_BIT) {
> + rc = get_vas_capabilities(VAS_GZIP_QOS_FEAT,
> +   VAS_GZIP_QOS_FEAT_TYPE, hv_cop_caps);
> +
> 

Re: [PATCH v6 12/17] powerpc/pseries/vas: Integrate API with open/close windows

2021-06-17 Thread Nicholas Piggin
Excerpts from Haren Myneni's message of June 18, 2021 6:36 am:
> 
> This patch adds VAS window allocation/close with the corresponding
> hcalls. It also integrates with the existing user space VAS API and
> provides register/unregister functions to the NX pseries driver.
> 
> The driver register function is used to create the user space
> interface (/dev/crypto/nx-gzip) and unregister to remove this entry.
> 
> The user space process opens this device node and makes an ioctl
> to allocate a VAS window. The close interface is used to deallocate
> the window.
> 
> Signed-off-by: Haren Myneni 

Reviewed-by: Nicholas Piggin 

Unless there is some significant performance reason it might be simplest
to take the mutex for the duration of the allocate and free paths rather
than taking it several times, covering the atomic with the lock instead.

You have a big lock, so you might as well use it and not have to wonder
whether things race here or there.

But don't rework that now, maybe just something to consider for later.

Thanks,
Nick
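
A minimal userspace sketch of the locking suggestion above, under stated assumptions: it uses a pthread mutex in place of the kernel mutex, and `struct credit_pool`, `window_alloc` and `window_free` are hypothetical names, not functions from the patch. The counter is a plain int because the whole check-and-update runs under one lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical credit pool; names are illustrative, not from the patch. */
struct credit_pool {
	pthread_mutex_t lock;
	int used;	/* plain int: the mutex provides the exclusion */
	int max;
};

static struct credit_pool pool = {
	.lock = PTHREAD_MUTEX_INITIALIZER,
	.used = 0,
	.max = 2,
};

/* Take the lock once; do the limit check and the update under it. */
static bool window_alloc(struct credit_pool *p)
{
	bool ok = false;

	pthread_mutex_lock(&p->lock);
	if (p->used < p->max) {
		p->used++;
		ok = true;
	}
	pthread_mutex_unlock(&p->lock);
	return ok;
}

/* The free path mirrors the allocate path: one lock section, no atomics. */
static void window_free(struct credit_pool *p)
{
	pthread_mutex_lock(&p->lock);
	assert(p->used > 0);
	p->used--;
	pthread_mutex_unlock(&p->lock);
}
```

The trade-off is a longer critical section in exchange for not having to reason about interleavings between separate lock sections and an atomic counter.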



Re: [PATCH v6 06/17] powerpc/vas: Move update_csb/dump_crb to common book3s platform

2021-06-17 Thread Nicholas Piggin
Excerpts from Haren Myneni's message of June 18, 2021 6:32 am:
> 
> If a coprocessor encounters an error translating an address, the
> VAS will cause an interrupt in the host. The kernel processes
> the fault by updating the CSB. This functionality is the same for
> both powerNV and pseries, so this patch moves these functions to the
> common vas-api.c; the actual functionality is not changed.
> 
> Signed-off-by: Haren Myneni 
> Reviewed-by: Nicholas Piggin 
> ---
>  arch/powerpc/include/asm/vas.h |   3 +
>  arch/powerpc/platforms/book3s/vas-api.c| 167 +
>  arch/powerpc/platforms/powernv/vas-fault.c | 155 ++-
>  3 files changed, 179 insertions(+), 146 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/vas.h b/arch/powerpc/include/asm/vas.h
> index 71cff6d6bf3a..6b41c0818958 100644
> --- a/arch/powerpc/include/asm/vas.h
> +++ b/arch/powerpc/include/asm/vas.h
> @@ -230,4 +230,7 @@ int vas_register_coproc_api(struct module *mod, enum vas_cop_type cop_type,
>  void vas_unregister_coproc_api(void);
>  
>  int get_vas_user_win_ref(struct vas_user_win_ref *task_ref);
> +void vas_update_csb(struct coprocessor_request_block *crb,
> + struct vas_user_win_ref *task_ref);
> +void vas_dump_crb(struct coprocessor_request_block *crb);
>  #endif /* __ASM_POWERPC_VAS_H */
> diff --git a/arch/powerpc/platforms/book3s/vas-api.c b/arch/powerpc/platforms/book3s/vas-api.c
> index 4ce82500f4c5..30172e52e16b 100644
> --- a/arch/powerpc/platforms/book3s/vas-api.c
> +++ b/arch/powerpc/platforms/book3s/vas-api.c
> @@ -10,6 +10,9 @@
>  #include 
>  #include 
>  #include 
> +#include 
> +#include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -94,6 +97,170 @@ int get_vas_user_win_ref(struct vas_user_win_ref *task_ref)
>   return 0;
>  }
>  
> +/*
> + * Successful return must release the task reference with
> + * put_task_struct
> + */
> +static bool ref_get_pid_and_task(struct vas_user_win_ref *task_ref,
> +   struct task_struct **tskp, struct pid **pidp)
> +{
> + struct task_struct *tsk;
> + struct pid *pid;
> +
> + pid = task_ref->pid;
> + tsk = get_pid_task(pid, PIDTYPE_PID);
> + if (!tsk) {
> + pid = task_ref->tgid;
> + tsk = get_pid_task(pid, PIDTYPE_PID);
> + /*
> +  * Parent thread (tgid) will be closing window when it
> +  * exits. So should not get here.
> +  */
> + if (WARN_ON_ONCE(!tsk))
> + return false;
> + }
> +
> + /* Return if the task is exiting. */
> + if (tsk->flags & PF_EXITING) {
> + put_task_struct(tsk);
> + return false;
> + }
> +
> + *tskp = tsk;
> + *pidp = pid;
> +
> + return true;
> +}

Thanks for making this change.

I think that's good to factor all these out and put them together.

Reviewed-by: Nicholas Piggin 

> +
> +/*
> + * Update the CSB to indicate a translation error.
> + *
> + * User space will be polling on CSB after the request is issued.
> + * If NX can handle the request without any issues, it updates CSB.
> + * Whereas if NX encounters page fault, the kernel will handle the
> + * fault and update CSB with translation error.
> + *
> + * If we are unable to update the CSB, meaning copy_to_user failed due
> + * to an invalid csb_addr, send a signal to the process.
> + */
> +void vas_update_csb(struct coprocessor_request_block *crb,
> + struct vas_user_win_ref *task_ref)
> +{
> + struct coprocessor_status_block csb;
> + struct kernel_siginfo info;
> + struct task_struct *tsk;
> + void __user *csb_addr;
> + struct pid *pid;
> + int rc;
> +
> + /*
> +  * NX user space windows can not be opened for task->mm=NULL
> +  * and faults will not be generated for kernel requests.
> +  */
> + if (WARN_ON_ONCE(!task_ref->mm))
> + return;
> +
> + csb_addr = (void __user *)be64_to_cpu(crb->csb_addr);
> +
> + memset(&csb, 0, sizeof(csb));
> + csb.cc = CSB_CC_FAULT_ADDRESS;
> + csb.ce = CSB_CE_TERMINATION;
> + csb.cs = 0;
> + csb.count = 0;
> +
> + /*
> +  * NX operates and returns in BE format as defined CRB struct.
> +  * So saves fault_storage_addr in BE as NX pastes in FIFO and
> +  * expects user space to convert to CPU format.
> +  */
> + csb.address = crb->stamp.nx.fault_storage_addr;
> + csb.flags = 0;
> +
> + /*
> +  * Process closes send window after all pending NX requests are
> +  * completed. In multi-thread applications, a child thread can
> +  * open a window and can exit without closing it. May be some
> +  * requests are pending or this window can be used by other
> +  * threads later. We should handle faults if NX encounters
> +  * pages faults on these requests. Update CSB with translation
> +  * error and fault address. If csb_addr passed by user space is
> +  * 

[powerpc:next] BUILD SUCCESS 07d8ad6fd8a3d47f50595ca4826f41dbf4f3a0c6

2021-06-17 Thread kernel test robot
tree/branch: https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next
branch HEAD: 07d8ad6fd8a3d47f50595ca4826f41dbf4f3a0c6  powerpc/mm/book3s64: Fix possible build error

elapsed time: 843m

configs tested: 99
configs skipped: 3

The following configs have been built successfully.
More configs may be tested in the coming days.

gcc tested configs:
arm      defconfig
arm64    allyesconfig
arm64    defconfig
arm      allyesconfig
arm      allmodconfig
m68k     m5307c3_defconfig
sh       j2_defconfig
arm      cns3420vb_defconfig
mips     ath25_defconfig
arm      pxa3xx_defconfig
sh       rsk7203_defconfig
sh       ap325rxa_defconfig
parisc   alldefconfig
h8300    h8300h-sim_defconfig
powerpc  mpc834x_mds_defconfig
nios2    3c120_defconfig
m68k     mvme16x_defconfig
sh       hp6xx_defconfig
s390     defconfig
sh       se7750_defconfig
arc      hsdk_defconfig
openrisc simple_smp_defconfig
arm      tct_hammer_defconfig
arm      s5pv210_defconfig
arm      keystone_defconfig
m68k     multi_defconfig
ia64     allmodconfig
ia64     defconfig
ia64     allyesconfig
x86_64   allnoconfig
m68k     allmodconfig
m68k     defconfig
m68k     allyesconfig
nios2    defconfig
arc      allyesconfig
nds32    allnoconfig
nds32    defconfig
nios2    allyesconfig
csky     defconfig
alpha    defconfig
alpha    allyesconfig
xtensa   allyesconfig
h8300    allyesconfig
arc      defconfig
sh       allmodconfig
parisc   defconfig
s390     allyesconfig
s390     allmodconfig
parisc   allyesconfig
i386     allyesconfig
sparc    allyesconfig
sparc    defconfig
i386     defconfig
mips     allyesconfig
mips     allmodconfig
powerpc  allyesconfig
powerpc  allmodconfig
powerpc  allnoconfig
i386     randconfig-a002-20210617
i386     randconfig-a006-20210617
i386     randconfig-a001-20210617
i386     randconfig-a004-20210617
i386     randconfig-a005-20210617
i386     randconfig-a003-20210617
i386     randconfig-a015-20210617
i386     randconfig-a013-20210617
i386     randconfig-a016-20210617
i386     randconfig-a012-20210617
i386     randconfig-a014-20210617
i386     randconfig-a011-20210617
x86_64   randconfig-a004-20210617
x86_64   randconfig-a001-20210617
x86_64   randconfig-a002-20210617
x86_64   randconfig-a003-20210617
x86_64   randconfig-a006-20210617
x86_64   randconfig-a005-20210617
riscv    nommu_k210_defconfig
riscv    allyesconfig
riscv    nommu_virt_defconfig
riscv    allnoconfig
riscv    defconfig
riscv    rv32_defconfig
riscv    allmodconfig
um       x86_64_defconfig
um       i386_defconfig
um       kunit_defconfig
x86_64   allyesconfig
x86_64   rhel-8.3-kselftests
x86_64   defconfig
x86_64   rhel-8.3
x86_64   rhel-8.3-kbuiltin
x86_64   kexec

clang tested configs:
x86_64   randconfig-b001-20210617
x86_64   randconfig-a015-20210617
x86_64   randconfig-a011-20210617
x86_64   randconfig-a014-20210617
x86_64   randconfig-a012-20210617
x86_64   randconfig-a013-20210617
x86_64   randconfig-a016-20210617

---
0-DAY CI Kernel Test Service, Intel Corporation

[PATCH v6 17/17] crypto/nx: Register and unregister VAS interface on PowerVM

2021-06-17 Thread Haren Myneni


User space uses the /dev/crypto/nx-gzip interface to set up VAS
windows, create paste mappings and close windows. This patch
creates and removes this interface with the VAS register/unregister
functions on the PowerVM platform.

Signed-off-by: Haren Myneni 
Acked-by: Herbert Xu 
Acked-by: Nicholas Piggin 
---
 drivers/crypto/nx/Kconfig | 1 +
 drivers/crypto/nx/nx-common-pseries.c | 8 
 2 files changed, 9 insertions(+)

diff --git a/drivers/crypto/nx/Kconfig b/drivers/crypto/nx/Kconfig
index 23e3d0160e67..2a35e0e785bd 100644
--- a/drivers/crypto/nx/Kconfig
+++ b/drivers/crypto/nx/Kconfig
@@ -29,6 +29,7 @@ if CRYPTO_DEV_NX_COMPRESS
 config CRYPTO_DEV_NX_COMPRESS_PSERIES
tristate "Compression acceleration support on pSeries platform"
depends on PPC_PSERIES && IBMVIO
+   depends on PPC_VAS
default y
help
  Support for PowerPC Nest (NX) compression acceleration. This
diff --git a/drivers/crypto/nx/nx-common-pseries.c b/drivers/crypto/nx/nx-common-pseries.c
index f51a50d40504..6671f6634dda 100644
--- a/drivers/crypto/nx/nx-common-pseries.c
+++ b/drivers/crypto/nx/nx-common-pseries.c
@@ -1231,6 +1231,12 @@ static int __init nx842_pseries_init(void)
return ret;
}
 
+   ret = vas_register_api_pseries(THIS_MODULE, VAS_COP_TYPE_GZIP,
+  "nx-gzip");
+
+   if (ret)
+   pr_err("NX-GZIP is not supported. Returned=%d\n", ret);
+
return 0;
 }
 
@@ -1241,6 +1247,8 @@ static void __exit nx842_pseries_exit(void)
struct nx842_devdata *old_devdata;
unsigned long flags;
 
+   vas_unregister_api_pseries();
+
crypto_unregister_alg(_pseries_alg);
 
spin_lock_irqsave(_mutex, flags);
-- 
2.18.2




[PATCH v6 16/17] crypto/nx: Add sysfs interface to export NX capabilities

2021-06-17 Thread Haren Myneni


Export NX-GZIP capabilities to userspace in the sysfs
/sys/devices/vio/ibm,compression-v1/nx_gzip_caps directory.
These are queried by userspace accelerator libraries to set
minimum length heuristics and maximum limits on request sizes.

NX-GZIP capabilities:
min_compress_len       /* Recommended minimum compress length in bytes */
min_decompress_len     /* Recommended minimum decompress length in bytes */
req_max_processed_len  /* Maximum number of bytes processed in one request */

NX will return RMA_Reject if the request buffer size is greater
than req_max_processed_len.
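
A minimal sketch of how a userspace library might consume these
attributes, assuming the directory named above and decimal text values
(the show routine in this patch prints with "%lld"). The struct and
function names are hypothetical, and the policy shown (only hand a
request to NX when it falls between the recommended minimum and the
hard maximum) is one possible heuristic, not the behaviour of any
particular library.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Directory taken from the patch description above. */
#define NX_GZIP_CAPS_DIR "/sys/devices/vio/ibm,compression-v1/nx_gzip_caps"

/* Hypothetical userspace copy of the three exported values. */
struct nx_gzip_caps {
	uint64_t min_compress_len;
	uint64_t min_decompress_len;
	uint64_t req_max_processed_len;
};

/* Read one decimal attribute file; returns 0 on success, -1 on error. */
static int read_u64_attr(const char *dir, const char *name, uint64_t *out)
{
	char path[512];
	unsigned long long v;
	FILE *f;
	int n;

	n = snprintf(path, sizeof(path), "%s/%s", dir, name);
	if (n < 0 || (size_t)n >= sizeof(path))
		return -1;
	f = fopen(path, "r");
	if (!f)
		return -1;
	n = fscanf(f, "%llu", &v);
	fclose(f);
	if (n != 1)
		return -1;
	*out = v;
	return 0;
}

/* Read all three capabilities from the given directory. */
static int nx_gzip_read_caps(const char *dir, struct nx_gzip_caps *caps)
{
	assert(caps != NULL);
	if (read_u64_attr(dir, "min_compress_len", &caps->min_compress_len) ||
	    read_u64_attr(dir, "min_decompress_len", &caps->min_decompress_len) ||
	    read_u64_attr(dir, "req_max_processed_len",
			  &caps->req_max_processed_len))
		return -1;
	return 0;
}

/*
 * True when a request of len bytes is at least the recommended minimum
 * and does not exceed the maximum NX processes in one request.
 */
static bool nx_gzip_fits_one_request(const struct nx_gzip_caps *caps,
				     uint64_t len, bool compress)
{
	uint64_t min = compress ? caps->min_compress_len
				: caps->min_decompress_len;

	return len >= min && len <= caps->req_max_processed_len;
}
```

A caller would pass NX_GZIP_CAPS_DIR to nx_gzip_read_caps() once at
start-up and fall back to software (or split the buffer) whenever
nx_gzip_fits_one_request() returns false.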

Signed-off-by: Haren Myneni 
Acked-by: Herbert Xu 
---
 drivers/crypto/nx/nx-common-pseries.c | 43 +++
 1 file changed, 43 insertions(+)

diff --git a/drivers/crypto/nx/nx-common-pseries.c b/drivers/crypto/nx/nx-common-pseries.c
index 9fc2abb56019..f51a50d40504 100644
--- a/drivers/crypto/nx/nx-common-pseries.c
+++ b/drivers/crypto/nx/nx-common-pseries.c
@@ -967,6 +967,36 @@ static struct attribute_group nx842_attribute_group = {
.attrs = nx842_sysfs_entries,
 };
 
+#define nxcop_caps_read(_name) \
+static ssize_t nxcop_##_name##_show(struct device *dev,    \
+   struct device_attribute *attr, char *buf)   \
+{  \
+   return sprintf(buf, "%lld\n", nx_cop_caps._name);   \
+}
+
+#define NXCT_ATTR_RO(_name)\
+   nxcop_caps_read(_name); \
+   static struct device_attribute dev_attr_##_name = __ATTR(_name, \
+   0444,   \
+   nxcop_##_name##_show,   \
+   NULL);
+
+NXCT_ATTR_RO(req_max_processed_len);
+NXCT_ATTR_RO(min_compress_len);
+NXCT_ATTR_RO(min_decompress_len);
+
+static struct attribute *nxcop_caps_sysfs_entries[] = {
+   &dev_attr_req_max_processed_len.attr,
+   &dev_attr_min_compress_len.attr,
+   &dev_attr_min_decompress_len.attr,
+   NULL,
+};
+
+static struct attribute_group nxcop_caps_attr_group = {
+   .name   =   "nx_gzip_caps",
+   .attrs  =   nxcop_caps_sysfs_entries,
+};
+
 static struct nx842_driver nx842_pseries_driver = {
.name = KBUILD_MODNAME,
.owner =THIS_MODULE,
@@ -1056,6 +1086,16 @@ static int nx842_probe(struct vio_dev *viodev,
goto error;
}
 
+   if (caps_feat) {
+   if (sysfs_create_group(&viodev->dev.kobj,
+   &nxcop_caps_attr_group)) {
+   dev_err(&viodev->dev,
+   "Could not create sysfs NX capability entries\n");
+   ret = -1;
+   goto error;
+   }
+   }
+
return 0;
 
 error_unlock:
@@ -1075,6 +1115,9 @@ static void nx842_remove(struct vio_dev *viodev)
pr_info("Removing IBM Power 842 compression device\n");
    sysfs_remove_group(&viodev->dev.kobj, &nx842_attribute_group);
 
+   if (caps_feat)
+   sysfs_remove_group(&viodev->dev.kobj, &nxcop_caps_attr_group);
+
crypto_unregister_alg(_pseries_alg);
 
spin_lock_irqsave(_mutex, flags);
-- 
2.18.2




[PATCH v6 15/17] crypto/nx: Get NX capabilities for GZIP coprocessor type

2021-06-17 Thread Haren Myneni


The hypervisor provides different NX capabilities that it
supports. These capabilities such as recommended minimum
compression / decompression lengths and the maximum request
buffer size in bytes are used to define the user space NX
request.

NX rejects a request whose buffer size exceeds the maximum
buffer size, whereas the compression / decompression lengths
are only recommended values for better performance.

This patch first retrieves the overall NX capabilities, which
indicate the specific features that the hypervisor supports,
and then retrieves the capabilities for the specific feature
(available only for NXGZIP).

Signed-off-by: Haren Myneni 
Acked-by: Herbert Xu 
---
 drivers/crypto/nx/nx-common-pseries.c | 87 +++
 1 file changed, 87 insertions(+)

diff --git a/drivers/crypto/nx/nx-common-pseries.c b/drivers/crypto/nx/nx-common-pseries.c
index cc8dd3072b8b..9fc2abb56019 100644
--- a/drivers/crypto/nx/nx-common-pseries.c
+++ b/drivers/crypto/nx/nx-common-pseries.c
@@ -9,6 +9,8 @@
  */
 
 #include 
+#include 
+#include 
 
 #include "nx-842.h"
 #include "nx_csbcpb.h" /* struct nx_csbcpb */
@@ -19,6 +21,29 @@ MODULE_DESCRIPTION("842 H/W Compression driver for IBM Power processors");
 MODULE_ALIAS_CRYPTO("842");
 MODULE_ALIAS_CRYPTO("842-nx");
 
+/*
+ * Coprocessor type specific capabilities from the hypervisor.
+ */
+struct hv_nx_cop_caps {
+   __be64  descriptor;
+   __be64  req_max_processed_len;  /* Max bytes in one GZIP request */
+   __be64  min_compress_len;   /* Min compression size in bytes */
+   __be64  min_decompress_len; /* Min decompression size in bytes */
+} __packed __aligned(0x1000);
+
+/*
+ * Coprocessor type specific capabilities.
+ */
+struct nx_cop_caps {
+   u64 descriptor;
+   u64 req_max_processed_len;  /* Max bytes in one GZIP request */
+   u64 min_compress_len;   /* Min compression in bytes */
+   u64 min_decompress_len; /* Min decompression in bytes */
+};
+
+static u64 caps_feat;
+static struct nx_cop_caps nx_cop_caps;
+
 static struct nx842_constraints nx842_pseries_constraints = {
.alignment =DDE_BUFFER_ALIGN,
.multiple = DDE_BUFFER_LAST_MULT,
@@ -1065,6 +1090,64 @@ static void nx842_remove(struct vio_dev *viodev)
kfree(old_devdata);
 }
 
+/*
+ * Get NX capabilities from the hypervisor.
+ * Only NXGZIP capabilities are provided by the hypervisor right
+ * now and these values are available to user space with sysfs.
+ */
+static void __init nxcop_get_capabilities(void)
+{
+   struct hv_vas_all_caps *hv_caps;
+   struct hv_nx_cop_caps *hv_nxc;
+   int rc;
+
+   hv_caps = kmalloc(sizeof(*hv_caps), GFP_KERNEL);
+   if (!hv_caps)
+   return;
+   /*
+* Get NX overall capabilities with feature type=0
+*/
+   rc = h_query_vas_capabilities(H_QUERY_NX_CAPABILITIES, 0,
+ (u64)virt_to_phys(hv_caps));
+   if (rc)
+   goto out;
+
+   caps_feat = be64_to_cpu(hv_caps->feat_type);
+   /*
+* NX-GZIP feature available
+*/
+   if (caps_feat & VAS_NX_GZIP_FEAT_BIT) {
+   hv_nxc = kmalloc(sizeof(*hv_nxc), GFP_KERNEL);
+   if (!hv_nxc)
+   goto out;
+   /*
+* Get capabilities for NX-GZIP feature
+*/
+   rc = h_query_vas_capabilities(H_QUERY_NX_CAPABILITIES,
+ VAS_NX_GZIP_FEAT,
+ (u64)virt_to_phys(hv_nxc));
+   } else {
+   pr_err("NX-GZIP feature is not available\n");
+   rc = -EINVAL;
+   }
+
+   if (!rc) {
+   nx_cop_caps.descriptor = be64_to_cpu(hv_nxc->descriptor);
+   nx_cop_caps.req_max_processed_len =
+   be64_to_cpu(hv_nxc->req_max_processed_len);
+   nx_cop_caps.min_compress_len =
+   be64_to_cpu(hv_nxc->min_compress_len);
+   nx_cop_caps.min_decompress_len =
+   be64_to_cpu(hv_nxc->min_decompress_len);
+   } else {
+   caps_feat = 0;
+   }
+
+   kfree(hv_nxc);
+out:
+   kfree(hv_caps);
+}
+
 static const struct vio_device_id nx842_vio_driver_ids[] = {
{"ibm,compression-v1", "ibm,compression"},
{"", ""},
@@ -1092,6 +1175,10 @@ static int __init nx842_pseries_init(void)
return -ENOMEM;
 
RCU_INIT_POINTER(devdata, new_devdata);
+   /*
+* Get NX capabilities from the hypervisor.
+*/
+   nxcop_get_capabilities();
 
ret = vio_register_driver(_vio_driver);
if (ret) {
-- 
2.18.2




[PATCH v6 14/17] crypto/nx: Rename nx-842-pseries file name to nx-common-pseries

2021-06-17 Thread Haren Myneni


Rename nx-842-pseries.c to nx-common-pseries.c to add code for new
GZIP compression type. The actual functionality is not changed in
this patch.

Signed-off-by: Haren Myneni 
Acked-by: Herbert Xu 
Acked-by: Nicholas Piggin 
---
 drivers/crypto/nx/Makefile  | 2 +-
 drivers/crypto/nx/{nx-842-pseries.c => nx-common-pseries.c} | 0
 2 files changed, 1 insertion(+), 1 deletion(-)
 rename drivers/crypto/nx/{nx-842-pseries.c => nx-common-pseries.c} (100%)

diff --git a/drivers/crypto/nx/Makefile b/drivers/crypto/nx/Makefile
index bc89a20e5d9d..d00181a26dd6 100644
--- a/drivers/crypto/nx/Makefile
+++ b/drivers/crypto/nx/Makefile
@@ -14,5 +14,5 @@ nx-crypto-objs := nx.o \
 obj-$(CONFIG_CRYPTO_DEV_NX_COMPRESS_PSERIES) += nx-compress-pseries.o nx-compress.o
 obj-$(CONFIG_CRYPTO_DEV_NX_COMPRESS_POWERNV) += nx-compress-powernv.o nx-compress.o
 nx-compress-objs := nx-842.o
-nx-compress-pseries-objs := nx-842-pseries.o
+nx-compress-pseries-objs := nx-common-pseries.o
 nx-compress-powernv-objs := nx-common-powernv.o
diff --git a/drivers/crypto/nx/nx-842-pseries.c b/drivers/crypto/nx/nx-common-pseries.c
similarity index 100%
rename from drivers/crypto/nx/nx-842-pseries.c
rename to drivers/crypto/nx/nx-common-pseries.c
-- 
2.18.2




[PATCH v6 13/17] powerpc/pseries/vas: Setup IRQ and fault handling

2021-06-17 Thread Haren Myneni


NX generates an interrupt when it sees a fault on the user space
buffer, and the hypervisor forwards that interrupt to the OS. The
kernel then handles the interrupt by issuing the H_GET_NX_FAULT
hcall to retrieve the fault CRB information.

This patch also sets up and frees an IRQ for each window, and
handles the fault by updating the CSB.

Signed-off-by: Haren Myneni 
---
 arch/powerpc/platforms/pseries/vas.c | 102 +++
 1 file changed, 102 insertions(+)

diff --git a/arch/powerpc/platforms/pseries/vas.c b/arch/powerpc/platforms/pseries/vas.c
index f5a44f2f0e99..3385b5400cc6 100644
--- a/arch/powerpc/platforms/pseries/vas.c
+++ b/arch/powerpc/platforms/pseries/vas.c
@@ -11,6 +11,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -155,6 +156,50 @@ int h_query_vas_capabilities(const u64 hcall, u8 query_type, u64 result)
 }
 EXPORT_SYMBOL_GPL(h_query_vas_capabilities);
 
+/*
+ * hcall to get fault CRB from the hypervisor.
+ */
+static int h_get_nx_fault(u32 winid, u64 buffer)
+{
+   long rc;
+
+   rc = plpar_hcall_norets(H_GET_NX_FAULT, winid, buffer);
+
+   if (rc == H_SUCCESS)
+   return 0;
+
+   pr_err("H_GET_NX_FAULT error: %ld, winid %u, buffer 0x%llx\n",
+   rc, winid, buffer);
+   return -EIO;
+
+}
+
+/*
+ * Handle the fault interrupt.
+ * When the fault interrupt is received for each window, query the
+ * hypervisor to get the fault CRB on the specific fault. Then
+ * process the CRB by updating CSB or send signal if the user space
+ * CSB is invalid.
+ * Note: The hypervisor forwards an interrupt for each fault request.
+ * So one fault CRB to process for each H_GET_NX_FAULT hcall.
+ */
+irqreturn_t pseries_vas_fault_thread_fn(int irq, void *data)
+{
+   struct pseries_vas_window *txwin = data;
+   struct coprocessor_request_block crb;
+   struct vas_user_win_ref *tsk_ref;
+   int rc;
+
+   rc = h_get_nx_fault(txwin->vas_win.winid, (u64)virt_to_phys(&crb));
+   if (!rc) {
+   tsk_ref = &txwin->vas_win.task_ref;
+   vas_dump_crb(&crb);
+   vas_update_csb(&crb, tsk_ref);
+   }
+
+   return IRQ_HANDLED;
+}
+
 /*
  * Allocate window and setup IRQ mapping.
  */
@@ -166,10 +211,51 @@ static int allocate_setup_window(struct pseries_vas_window *txwin,
rc = h_allocate_vas_window(txwin, domain, wintype, DEF_WIN_CREDS);
if (rc)
return rc;
+   /*
+* On PowerVM, the hypervisor setup and forwards the fault
+* interrupt per window. So the IRQ setup and fault handling
+* will be done for each open window separately.
+*/
+   txwin->fault_virq = irq_create_mapping(NULL, txwin->fault_irq);
+   if (!txwin->fault_virq) {
+   pr_err("Failed irq mapping %d\n", txwin->fault_irq);
+   rc = -EINVAL;
+   goto out_win;
+   }
+
+   txwin->name = kasprintf(GFP_KERNEL, "vas-win-%d",
+   txwin->vas_win.winid);
+   if (!txwin->name) {
+   rc = -ENOMEM;
+   goto out_irq;
+   }
+
+   rc = request_threaded_irq(txwin->fault_virq, NULL,
+ pseries_vas_fault_thread_fn, IRQF_ONESHOT,
+ txwin->name, txwin);
+   if (rc) {
+   pr_err("VAS-Window[%d]: Request IRQ(%u) failed with %d\n",
+  txwin->vas_win.winid, txwin->fault_virq, rc);
+   goto out_free;
+   }
 
txwin->vas_win.wcreds_max = DEF_WIN_CREDS;
 
return 0;
+out_free:
+   kfree(txwin->name);
+out_irq:
+   irq_dispose_mapping(txwin->fault_virq);
+out_win:
+   h_deallocate_vas_window(txwin->vas_win.winid);
+   return rc;
+}
+
+static inline void free_irq_setup(struct pseries_vas_window *txwin)
+{
+   free_irq(txwin->fault_virq, txwin);
+   kfree(txwin->name);
+   irq_dispose_mapping(txwin->fault_virq);
 }
 
 static struct vas_window *vas_allocate_window(int vas_id, u64 flags,
@@ -284,6 +370,11 @@ static struct vas_window *vas_allocate_window(int vas_id, u64 flags,
    return &txwin->vas_win;
 
 out_free:
+   /*
+* Window is not operational. Free IRQ before closing
+* window so that do not have to hold mutex.
+*/
+   free_irq_setup(txwin);
h_deallocate_vas_window(txwin->vas_win.winid);
 out:
    atomic_dec(&cop_feat_caps->used_lpar_creds);
@@ -303,7 +394,18 @@ static int deallocate_free_window(struct pseries_vas_window *win)
 {
int rc = 0;
 
+   /*
+* The hypervisor waits for all requests including faults
+* are processed before closing the window - Means all
+* credits have to be returned. In the case of fault
+* request, a credit is returned after OS issues
+* H_GET_NX_FAULT hcall.
+* So free IRQ after executing H_DEALLOCATE_VAS_WINDOW
+* hcall.
+*/
rc = 

[PATCH v6 12/17] powerpc/pseries/vas: Integrate API with open/close windows

2021-06-17 Thread Haren Myneni


This patch adds VAS window allocation/close with the corresponding
hcalls. It also integrates with the existing user space VAS API and
provides register/unregister functions to the NX pseries driver.

The driver register function is used to create the user space
interface (/dev/crypto/nx-gzip) and unregister to remove this entry.

The user space process opens this device node and makes an ioctl
to allocate VAS window. The close interface is used to deallocate
window.
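
For orientation, the open/ioctl/close flow described above might look
roughly like this from user space. This is a sketch under assumptions:
the attribute struct and ioctl number are local mirrors of what the
powerpc uapi header asm/vas-api.h is believed to define (struct
vas_tx_win_attr and VAS_TX_WIN_OPEN), renamed with a _sketch suffix so
they cannot be mistaken for the real definitions, and
nx_gzip_open_window is a hypothetical helper. Real code should include
the kernel header instead.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>

/* Assumed mirror of struct vas_tx_win_attr from <asm/vas-api.h>. */
struct vas_tx_win_attr_sketch {
	uint32_t version;
	int16_t  vas_id;	/* -1 lets the kernel pick the instance */
	uint16_t reserved1;
	uint64_t flags;
	uint64_t reserved2[6];
};

/* Assumed mirror of VAS_TX_WIN_OPEN. */
#define VAS_TX_WIN_OPEN_SKETCH \
	_IOW('v', 0x20, struct vas_tx_win_attr_sketch)

/*
 * Open the device node and ask the kernel to allocate a VAS window.
 * Returns the file descriptor on success, -1 on failure. Closing the
 * descriptor is what deallocates the window.
 */
static int nx_gzip_open_window(const char *devnode)
{
	struct vas_tx_win_attr_sketch attr;
	int fd;

	assert(devnode != NULL);
	fd = open(devnode, O_RDWR);
	if (fd < 0)
		return -1;

	memset(&attr, 0, sizeof(attr));
	attr.version = 1;
	attr.vas_id = -1;

	if (ioctl(fd, VAS_TX_WIN_OPEN_SKETCH, &attr) < 0) {
		close(fd);
		return -1;
	}
	return fd;
}
```

A caller would pass "/dev/crypto/nx-gzip" and later close() the returned
descriptor to release the window.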

Signed-off-by: Haren Myneni 
---
 arch/powerpc/include/asm/vas.h  |   4 +
 arch/powerpc/platforms/pseries/Makefile |   1 +
 arch/powerpc/platforms/pseries/vas.c| 223 
 3 files changed, 228 insertions(+)

diff --git a/arch/powerpc/include/asm/vas.h b/arch/powerpc/include/asm/vas.h
index 99570c33058f..57573d9c1e09 100644
--- a/arch/powerpc/include/asm/vas.h
+++ b/arch/powerpc/include/asm/vas.h
@@ -254,6 +254,10 @@ struct vas_all_caps {
u64 feat_type;
 };
 
+int h_query_vas_capabilities(const u64 hcall, u8 query_type, u64 result);
+int vas_register_api_pseries(struct module *mod,
+enum vas_cop_type cop_type, const char *name);
+void vas_unregister_api_pseries(void);
 #endif
 
 /*
diff --git a/arch/powerpc/platforms/pseries/Makefile b/arch/powerpc/platforms/pseries/Makefile
index c8a2b0b05ac0..4cda0ef87be0 100644
--- a/arch/powerpc/platforms/pseries/Makefile
+++ b/arch/powerpc/platforms/pseries/Makefile
@@ -30,3 +30,4 @@ obj-$(CONFIG_PPC_SVM) += svm.o
 obj-$(CONFIG_FA_DUMP)  += rtas-fadump.o
 
 obj-$(CONFIG_SUSPEND)  += suspend.o
+obj-$(CONFIG_PPC_VAS)  += vas.o
diff --git a/arch/powerpc/platforms/pseries/vas.c b/arch/powerpc/platforms/pseries/vas.c
index 93794e12527d..f5a44f2f0e99 100644
--- a/arch/powerpc/platforms/pseries/vas.c
+++ b/arch/powerpc/platforms/pseries/vas.c
@@ -10,6 +10,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -25,6 +26,7 @@ static struct vas_all_caps caps_all;
 static bool copypaste_feat;
 
 static struct vas_caps vascaps[VAS_MAX_FEAT_TYPE];
+static DEFINE_MUTEX(vas_pseries_mutex);
 
 static long hcall_return_busy_check(long rc)
 {
@@ -151,6 +153,227 @@ int h_query_vas_capabilities(const u64 hcall, u8 query_type, u64 result)
hcall, rc, query_type, result);
return -EIO;
 }
+EXPORT_SYMBOL_GPL(h_query_vas_capabilities);
+
+/*
+ * Allocate window and setup IRQ mapping.
+ */
+static int allocate_setup_window(struct pseries_vas_window *txwin,
+u64 *domain, u8 wintype)
+{
+   int rc;
+
+   rc = h_allocate_vas_window(txwin, domain, wintype, DEF_WIN_CREDS);
+   if (rc)
+   return rc;
+
+   txwin->vas_win.wcreds_max = DEF_WIN_CREDS;
+
+   return 0;
+}
+
+static struct vas_window *vas_allocate_window(int vas_id, u64 flags,
+ enum vas_cop_type cop_type)
+{
+   long domain[PLPAR_HCALL9_BUFSIZE] = {VAS_DEFAULT_DOMAIN_ID};
+   struct vas_cop_feat_caps *cop_feat_caps;
+   struct vas_caps *caps;
+   struct pseries_vas_window *txwin;
+   int rc;
+
+   txwin = kzalloc(sizeof(*txwin), GFP_KERNEL);
+   if (!txwin)
+   return ERR_PTR(-ENOMEM);
+
+   /*
+* A VAS window can have many credits which means that many
+* requests can be issued simultaneously. But the hypervisor
+* restricts one credit per window.
+* The hypervisor introduces 2 different types of credits:
+* Default credit type (Uses normal priority FIFO):
+*  A limited number of credits are assigned to partitions
+*  based on processor entitlement. But these credits may be
+*  over-committed on a system depends on whether the CPUs
+*  are in shared or dedicated modes - that is, more requests
+*  may be issued across the system than NX can service at
+*  once which can result in paste command failure (RMA_busy).
+*  Then the process has to resend requests or fall-back to
+*  SW compression.
+* Quality of Service (QoS) credit type (Uses high priority FIFO):
+*  To avoid NX HW contention, the system admins can assign
+*  QoS credits for each LPAR so that this partition is
+*  guaranteed access to NX resources. These credits are
+*  assigned to partitions via the HMC.
+*  Refer PAPR for more information.
+*
+* Allocate window with QoS credits if user requested. Otherwise
+* default credits are used.
+*/
+   if (flags & VAS_TX_WIN_FLAG_QOS_CREDIT)
+   caps = &vascaps[VAS_GZIP_QOS_FEAT_TYPE];
+   else
+   caps = &vascaps[VAS_GZIP_DEF_FEAT_TYPE];
+
+   cop_feat_caps = &caps->caps;
+
+   if (atomic_inc_return(&cop_feat_caps->used_lpar_creds) >
+   atomic_read(&cop_feat_caps->target_lpar_creds)) {
+

[PATCH v6 11/17] powerpc/pseries/vas: Implement getting capabilities from hypervisor

2021-06-17 Thread Haren Myneni


The hypervisor provides VAS capabilities for GZIP default and QoS
features. These capabilities give information about the specific
features, such as the total number of credits available in the LPAR,
maximum credits allowed per window, maximum credits allowed in the
LPAR, and whether usermode copy/paste is supported.

This patch adds the following:
- Retrieve all features that are provided by hypervisor using
  H_QUERY_VAS_CAPABILITIES hcall with 0 as feature type.
- Retrieve capabilities for the specific feature using the same
  hcall and the feature type (1 for QoS and 2 for default type).
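
The two-step query above can be sketched as follows: the overall query
(feature type 0) yields a bitmask, and each set bit selects a feature
type to query next. The feature type numbers 1 (QoS) and 2 (default)
come from the description above. The bit positions are an assumption:
the sketch uses IBM bit numbering (bit 0 is the most significant bit)
with the bit index equal to the feature type, and all names are
hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* IBM bit numbering: bit 0 is the most significant of 64 bits. */
#define IBM_BIT(n) (UINT64_C(1) << (63 - (n)))

/* Feature types as described in the commit message. */
enum vas_feat_sketch {
	FEAT_OVERALL = 0,
	FEAT_GZIP_QOS = 1,
	FEAT_GZIP_DEF = 2,
};

/*
 * Given the overall capability mask, list the feature types that
 * should be queried individually. Returns the number written to out.
 */
static size_t features_to_query(uint64_t overall_mask,
				enum vas_feat_sketch *out, size_t max)
{
	static const enum vas_feat_sketch known[] = {
		FEAT_GZIP_QOS, FEAT_GZIP_DEF,
	};
	size_t n = 0;

	assert(out != NULL);
	for (size_t i = 0; i < sizeof(known) / sizeof(known[0]); i++) {
		if (n < max && (overall_mask & IBM_BIT(known[i])))
			out[n++] = known[i];
	}
	return n;
}
```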

Signed-off-by: Haren Myneni 
---
 arch/powerpc/platforms/pseries/vas.c | 122 +++
 1 file changed, 122 insertions(+)

diff --git a/arch/powerpc/platforms/pseries/vas.c b/arch/powerpc/platforms/pseries/vas.c
index a73d7d00bf55..93794e12527d 100644
--- a/arch/powerpc/platforms/pseries/vas.c
+++ b/arch/powerpc/platforms/pseries/vas.c
@@ -10,6 +10,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -20,6 +21,11 @@
 /* The hypervisor allows one credit per window right now */
 #define DEF_WIN_CREDS  1
 
+static struct vas_all_caps caps_all;
+static bool copypaste_feat;
+
+static struct vas_caps vascaps[VAS_MAX_FEAT_TYPE];
+
 static long hcall_return_busy_check(long rc)
 {
/* Check if we are stalled for some time */
@@ -145,3 +151,119 @@ int h_query_vas_capabilities(const u64 hcall, u8 query_type, u64 result)
hcall, rc, query_type, result);
return -EIO;
 }
+
+/*
+ * Get the specific capabilities based on the feature type.
+ * Right now supports GZIP default and GZIP QoS capabilities.
+ */
+static int get_vas_capabilities(u8 feat, enum vas_cop_feat_type type,
+   struct hv_vas_cop_feat_caps *hv_caps)
+{
+   struct vas_cop_feat_caps *caps;
+   struct vas_caps *vcaps;
+   int rc = 0;
+
+   vcaps = &vascaps[type];
+   memset(vcaps, 0, sizeof(*vcaps));
+   INIT_LIST_HEAD(&vcaps->list);
+
+   caps = &vcaps->caps;
+
+   rc = h_query_vas_capabilities(H_QUERY_VAS_CAPABILITIES, feat,
+ (u64)virt_to_phys(hv_caps));
+   if (rc)
+   return rc;
+
+   caps->user_mode = hv_caps->user_mode;
+   if (!(caps->user_mode & VAS_COPY_PASTE_USER_MODE)) {
+   pr_err("User space COPY/PASTE is not supported\n");
+   return -ENOTSUPP;
+   }
+
+   caps->descriptor = be64_to_cpu(hv_caps->descriptor);
+   caps->win_type = hv_caps->win_type;
+   if (caps->win_type >= VAS_MAX_FEAT_TYPE) {
+   pr_err("Unsupported window type %u\n", caps->win_type);
+   return -EINVAL;
+   }
+   caps->max_lpar_creds = be16_to_cpu(hv_caps->max_lpar_creds);
+   caps->max_win_creds = be16_to_cpu(hv_caps->max_win_creds);
+   atomic_set(&caps->target_lpar_creds,
+  be16_to_cpu(hv_caps->target_lpar_creds));
+   if (feat == VAS_GZIP_DEF_FEAT) {
+   caps->def_lpar_creds = be16_to_cpu(hv_caps->def_lpar_creds);
+
+   if (caps->max_win_creds < DEF_WIN_CREDS) {
+   pr_err("Window creds(%u) > max allowed window creds(%u)\n",
+  DEF_WIN_CREDS, caps->max_win_creds);
+   return -EINVAL;
+   }
+   }
+
+   copypaste_feat = true;
+
+   return 0;
+}
+
+static int __init pseries_vas_init(void)
+{
+   struct hv_vas_cop_feat_caps *hv_cop_caps;
+   struct hv_vas_all_caps *hv_caps;
+   int rc;
+
+   /*
+* Linux supports user space COPY/PASTE only with Radix
+*/
+   if (!radix_enabled()) {
+   pr_err("API is supported only with radix page tables\n");
+   return -ENOTSUPP;
+   }
+
+   hv_caps = kmalloc(sizeof(*hv_caps), GFP_KERNEL);
+   if (!hv_caps)
+   return -ENOMEM;
+   /*
+* Get VAS overall capabilities by passing 0 to feature type.
+*/
+   rc = h_query_vas_capabilities(H_QUERY_VAS_CAPABILITIES, 0,
+ (u64)virt_to_phys(hv_caps));
+   if (rc)
+   goto out;
+
+   caps_all.descriptor = be64_to_cpu(hv_caps->descriptor);
+   caps_all.feat_type = be64_to_cpu(hv_caps->feat_type);
+
+   hv_cop_caps = kmalloc(sizeof(*hv_cop_caps), GFP_KERNEL);
+   if (!hv_cop_caps) {
+   rc = -ENOMEM;
+   goto out;
+   }
+   /*
+* QOS capabilities available
+*/
+   if (caps_all.feat_type & VAS_GZIP_QOS_FEAT_BIT) {
+   rc = get_vas_capabilities(VAS_GZIP_QOS_FEAT,
+ VAS_GZIP_QOS_FEAT_TYPE, hv_cop_caps);
+
+   if (rc)
+   goto out_cop;
+   }
+   /*
+* Default capabilities available
+*/
+   if (caps_all.feat_type & VAS_GZIP_DEF_FEAT_BIT) {
+   rc = 

[PATCH v6 10/17] powerpc/pseries/vas: Add hcall wrappers for VAS handling

2021-06-17 Thread Haren Myneni


This patch adds the following hcall wrapper functions to allocate,
modify and deallocate VAS windows, and retrieve VAS capabilities.

H_ALLOCATE_VAS_WINDOW: Allocate VAS window
H_DEALLOCATE_VAS_WINDOW: Close VAS window
H_MODIFY_VAS_WINDOW: Setup window before using
H_QUERY_VAS_CAPABILITIES: Get VAS capabilities
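
The allocate, modify and deallocate wrappers in this patch share one
retry pattern: call the hypervisor, treat a "long busy" result as an
ordinary busy result after waiting, and loop until the call stops
reporting busy. A self-contained sketch of that loop is shown below.
The return codes, the callback type and fake_hcall are hypothetical
stand-ins, and the sleep or reschedule steps of the kernel code are
reduced to comments.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical return codes; the real H_* values are not used here. */
enum {
	HC_SUCCESS = 0,
	HC_BUSY = 1,
	HC_LONG_BUSY = 2,
	HC_ERROR = -1,
};

typedef long (*hcall_fn)(void *ctx);

/* Call fn until it stops reporting busy; count attempts for tracing. */
static long retry_while_busy(hcall_fn fn, void *ctx, int *attempts)
{
	long rc;

	assert(fn != NULL && attempts != NULL);
	do {
		rc = fn(ctx);
		(*attempts)++;
		if (rc == HC_LONG_BUSY) {
			/* The kernel code sleeps for the hinted time here. */
			rc = HC_BUSY;
		}
		/* On plain busy the kernel code yields before retrying. */
	} while (rc == HC_BUSY);

	return rc;
}

/* Test double: reports busy *remaining times, then succeeds. */
static long fake_hcall(void *ctx)
{
	int *remaining = ctx;

	if (*remaining > 0) {
		(*remaining)--;
		return (*remaining % 2) ? HC_LONG_BUSY : HC_BUSY;
	}
	return HC_SUCCESS;
}
```

Folding the long-busy case into the plain busy case keeps the loop
condition to a single comparison, which is the shape used by the
wrappers in this patch.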

Signed-off-by: Haren Myneni 
Reviewed-by: Nicholas Piggin 
---
 arch/powerpc/platforms/pseries/vas.c | 147 +++
 1 file changed, 147 insertions(+)
 create mode 100644 arch/powerpc/platforms/pseries/vas.c

diff --git a/arch/powerpc/platforms/pseries/vas.c b/arch/powerpc/platforms/pseries/vas.c
new file mode 100644
index ..a73d7d00bf55
--- /dev/null
+++ b/arch/powerpc/platforms/pseries/vas.c
@@ -0,0 +1,147 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * Copyright 2020-21 IBM Corp.
+ */
+
+#define pr_fmt(fmt) "vas: " fmt
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include "vas.h"
+
+#define VAS_INVALID_WIN_ADDRESS0xul
+#define VAS_DEFAULT_DOMAIN_ID  0xul
+/* The hypervisor allows one credit per window right now */
+#define DEF_WIN_CREDS  1
+
+static long hcall_return_busy_check(long rc)
+{
+   /* Check if we are stalled for some time */
+   if (H_IS_LONG_BUSY(rc)) {
+   msleep(get_longbusy_msecs(rc));
+   rc = H_BUSY;
+   } else if (rc == H_BUSY) {
+   cond_resched();
+   }
+
+   return rc;
+}
+
+/*
+ * Allocate VAS window hcall
+ */
+static int h_allocate_vas_window(struct pseries_vas_window *win, u64 *domain,
+u8 wintype, u16 credits)
+{
+   long retbuf[PLPAR_HCALL9_BUFSIZE] = {0};
+   long rc;
+
+   do {
+   rc = plpar_hcall9(H_ALLOCATE_VAS_WINDOW, retbuf, wintype,
+ credits, domain[0], domain[1], domain[2],
+ domain[3], domain[4], domain[5]);
+
+   rc = hcall_return_busy_check(rc);
+   } while (rc == H_BUSY);
+
+   if (rc == H_SUCCESS) {
+   if (win->win_addr == VAS_INVALID_WIN_ADDRESS) {
+   pr_err("H_ALLOCATE_VAS_WINDOW: COPY/PASTE is not supported\n");
+   return -ENOTSUPP;
+   }
+   win->vas_win.winid = retbuf[0];
+   win->win_addr = retbuf[1];
+   win->complete_irq = retbuf[2];
+   win->fault_irq = retbuf[3];
+   return 0;
+   }
+
+   pr_err("H_ALLOCATE_VAS_WINDOW error: %ld, wintype: %u, credits: %u\n",
+   rc, wintype, credits);
+
+   return -EIO;
+}
+
+/*
+ * Deallocate VAS window hcall.
+ */
+static int h_deallocate_vas_window(u64 winid)
+{
+   long rc;
+
+   do {
+   rc = plpar_hcall_norets(H_DEALLOCATE_VAS_WINDOW, winid);
+
+   rc = hcall_return_busy_check(rc);
+   } while (rc == H_BUSY);
+
+   if (rc == H_SUCCESS)
+   return 0;
+
+   pr_err("H_DEALLOCATE_VAS_WINDOW error: %ld, winid: %llu\n",
+   rc, winid);
+   return -EIO;
+}
+
+/*
+ * Modify VAS window.
+ * After the window is opened with allocate window hcall, configure it
+ * with flags and LPAR PID before using.
+ */
+static int h_modify_vas_window(struct pseries_vas_window *win)
+{
+   long rc;
+   u32 lpid = mfspr(SPRN_PID);
+
+   /*
+* AMR value is not supported in Linux VAS implementation.
+* The hypervisor ignores it if 0 is passed.
+*/
+   do {
+   rc = plpar_hcall_norets(H_MODIFY_VAS_WINDOW,
+   win->vas_win.winid, lpid, 0,
+   VAS_MOD_WIN_FLAGS, 0);
+
+   rc = hcall_return_busy_check(rc);
+   } while (rc == H_BUSY);
+
+   if (rc == H_SUCCESS)
+   return 0;
+
+   pr_err("H_MODIFY_VAS_WINDOW error: %ld, winid %u lpid %u\n",
+   rc, win->vas_win.winid, lpid);
+   return -EIO;
+}
+
+/*
+ * This hcall is used to determine the capabilities from the hypervisor.
+ * @hcall: H_QUERY_VAS_CAPABILITIES or H_QUERY_NX_CAPABILITIES
+ * @query_type: If 0 is passed, the hypervisor returns the overall
+ * capabilities which provides all feature(s) that are
+ * available. Then query the hypervisor to get the
+ * corresponding capabilities for the specific feature.
+ * Example: H_QUERY_VAS_CAPABILITIES provides VAS GZIP QoS
+ * and VAS GZIP Default capabilities.
+ * H_QUERY_NX_CAPABILITIES provides NX GZIP
+ * capabilities.
+ * @result: Return buffer to save capabilities.
+ */
+int h_query_vas_capabilities(const u64 hcall, u8 query_type, u64 result)
+{
+   long rc;
+
+   rc = plpar_hcall_norets(hcall, query_type, result);
+
+   if (rc == H_SUCCESS)
+   return 0;
+
+   

[PATCH v6 09/17] powerpc/vas: Define QoS credit flag to allocate window

2021-06-17 Thread Haren Myneni


PowerVM introduces two different types of credits: Default and Quality
of service (QoS).

The total number of default credits available on each LPAR depends
on the CPU resources configured. But these credits can be shared or
over-committed across LPARs in shared mode, which can result in
paste command failure (RMA_busy). To avoid NX HW contention, the
hypervisor introduces the QoS credit type, which guarantees
access to NX resources. The system admins can assign QoS credits
for each LPAR via HMC.

By default, a VAS window is allocated with the default credit type
on the PowerVM implementation. But the process can pass the
VAS_TX_WIN_FLAG_QOS_CREDIT flag with the VAS_TX_WIN_OPEN ioctl to
open a QoS type window.
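
As a rough illustration of how user space might request a QoS window, the sketch below fills the open attributes with the new flag. The struct and flag are local mirrors of the uapi definitions in this patch (a real program would include the exported header instead), and init_open_attr() is a hypothetical helper, not part of the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Local mirror of the uapi definitions touched by this patch. */
#define VAS_TX_WIN_FLAG_QOS_CREDIT 0x0001

struct vas_tx_win_open_attr {
	uint32_t version;
	int16_t  vas_id;        /* specific instance of vas or -1 for default */
	uint16_t reserved1;
	uint64_t flags;
	uint64_t reserved2[6];
};

/* Hypothetical helper: prepare attributes for a TX window open request. */
static void init_open_attr(struct vas_tx_win_open_attr *attr, int want_qos)
{
	assert(attr != NULL);
	memset(attr, 0, sizeof(*attr)); /* keep the reserved fields zeroed */
	attr->version = 1;              /* version checked by the open ioctl */
	attr->vas_id = -1;              /* default VAS instance */
	if (want_qos)
		attr->flags |= VAS_TX_WIN_FLAG_QOS_CREDIT;
}
```

The filled structure would then be passed to the VAS_TX_WIN_OPEN ioctl on the coprocessor device node. Leaving flags at 0 keeps the existing default-credit behaviour.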

Signed-off-by: Haren Myneni 
Acked-by: Nicholas Piggin 
---
 arch/powerpc/include/uapi/asm/vas-api.h | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/include/uapi/asm/vas-api.h b/arch/powerpc/include/uapi/asm/vas-api.h
index ebd4b2424785..7c81301ecdba 100644
--- a/arch/powerpc/include/uapi/asm/vas-api.h
+++ b/arch/powerpc/include/uapi/asm/vas-api.h
@@ -13,11 +13,15 @@
 #define VAS_MAGIC  'v'
 #define VAS_TX_WIN_OPEN    _IOW(VAS_MAGIC, 0x20, struct vas_tx_win_open_attr)
 
+/* Flags to VAS TX open window ioctl */
+/* To allocate a window with QoS credit, otherwise use default credit */
+#define VAS_TX_WIN_FLAG_QOS_CREDIT 0x0001
+
 struct vas_tx_win_open_attr {
__u32   version;
__s16   vas_id; /* specific instance of vas or -1 for default */
__u16   reserved1;
-   __u64   flags;  /* Future use */
+   __u64   flags;
__u64   reserved2[6];
 };
 
-- 
2.18.2




[PATCH v6 08/17] powerpc/pseries/vas: Define VAS/NXGZIP hcalls and structs

2021-06-17 Thread Haren Myneni


This patch adds hcalls and other definitions. It also defines structs
that are used in the VAS implementation on PowerVM.
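
The capability masks in this patch are built with PPC_BIT(), which numbers bits from the most significant end, and the hypervisor buffers are big-endian (__be64). The sketch below is a user-space model of both points, assuming a 64-bit word. load_be64() and has_feat() are hypothetical helpers standing in for be64_to_cpu() and a plain mask test.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* IBM bit numbering on a 64-bit word: bit 0 is the most significant bit. */
#define PPC_BIT(bit)            (UINT64_C(1) << (63 - (bit)))

#define VAS_GZIP_QOS_FEAT       0x1
#define VAS_GZIP_DEF_FEAT       0x2
#define VAS_GZIP_QOS_FEAT_BIT   PPC_BIT(VAS_GZIP_QOS_FEAT)
#define VAS_GZIP_DEF_FEAT_BIT   PPC_BIT(VAS_GZIP_DEF_FEAT)

/* Decode a big-endian 64-bit value from raw bytes, as be64_to_cpu() would. */
static uint64_t load_be64(const unsigned char *p)
{
	uint64_t v = 0;
	int i;

	assert(p != NULL);
	for (i = 0; i < 8; i++)
		v = (v << 8) | p[i];
	return v;
}

/* Report whether a capability word advertises the given feature mask. */
static int has_feat(uint64_t feat_type, uint64_t mask)
{
	return (feat_type & mask) != 0;
}
```

With this numbering, the QoS feature (bit 1) is the second-highest bit of the word, so a buffer whose first byte is 0x40 advertises GZIP QoS but not GZIP default.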

Signed-off-by: Haren Myneni 
Acked-by: Nicholas Piggin 
---
 arch/powerpc/include/asm/hvcall.h|   7 ++
 arch/powerpc/include/asm/vas.h   |  30 +++
 arch/powerpc/platforms/pseries/vas.h | 125 +++
 3 files changed, 162 insertions(+)
 create mode 100644 arch/powerpc/platforms/pseries/vas.h

diff --git a/arch/powerpc/include/asm/hvcall.h b/arch/powerpc/include/asm/hvcall.h
index e3b29eda8074..7c3418d1b5e9 100644
--- a/arch/powerpc/include/asm/hvcall.h
+++ b/arch/powerpc/include/asm/hvcall.h
@@ -294,6 +294,13 @@
 #define H_RESIZE_HPT_COMMIT        0x370
 #define H_REGISTER_PROC_TBL        0x37C
 #define H_SIGNAL_SYS_RESET         0x380
+#define H_ALLOCATE_VAS_WINDOW      0x388
+#define H_MODIFY_VAS_WINDOW        0x38C
+#define H_DEALLOCATE_VAS_WINDOW    0x390
+#define H_QUERY_VAS_WINDOW         0x394
+#define H_QUERY_VAS_CAPABILITIES   0x398
+#define H_QUERY_NX_CAPABILITIES    0x39C
+#define H_GET_NX_FAULT             0x3A0
 #define H_INT_GET_SOURCE_INFO   0x3A8
 #define H_INT_SET_SOURCE_CONFIG 0x3AC
 #define H_INT_GET_SOURCE_CONFIG 0x3B0
diff --git a/arch/powerpc/include/asm/vas.h b/arch/powerpc/include/asm/vas.h
index 14ad7982874c..99570c33058f 100644
--- a/arch/powerpc/include/asm/vas.h
+++ b/arch/powerpc/include/asm/vas.h
@@ -160,6 +160,7 @@ struct vas_tx_win_attr {
bool rx_win_ord_mode;
 };
 
+#ifdef CONFIG_PPC_POWERNV
 /*
  * Helper to map a chip id to VAS id.
  * For POWER9, this is a 1:1 mapping. In the future this maybe a 1:N
@@ -225,6 +226,35 @@ int vas_paste_crb(struct vas_window *win, int offset, bool re);
 int vas_register_api_powernv(struct module *mod, enum vas_cop_type cop_type,
 const char *name);
 void vas_unregister_api_powernv(void);
+#endif
+
+#ifdef CONFIG_PPC_PSERIES
+
+/* VAS Capabilities */
+#define VAS_GZIP_QOS_FEAT  0x1
+#define VAS_GZIP_DEF_FEAT  0x2
+#define VAS_GZIP_QOS_FEAT_BIT  PPC_BIT(VAS_GZIP_QOS_FEAT) /* Bit 1 */
+#define VAS_GZIP_DEF_FEAT_BIT  PPC_BIT(VAS_GZIP_DEF_FEAT) /* Bit 2 */
+
+/* NX Capabilities */
+#define VAS_NX_GZIP_FEAT   0x1
+#define VAS_NX_GZIP_FEAT_BIT   PPC_BIT(VAS_NX_GZIP_FEAT) /* Bit 1 */
+
+/*
+ * These structs are used to retrieve overall VAS capabilities that
+ * the hypervisor provides.
+ */
+struct hv_vas_all_caps {
+   __be64  descriptor;
+   __be64  feat_type;
+} __packed __aligned(0x1000);
+
+struct vas_all_caps {
+   u64 descriptor;
+   u64 feat_type;
+};
+
+#endif
 
 /*
  * Register / unregister coprocessor type to VAS API which will be exported
diff --git a/arch/powerpc/platforms/pseries/vas.h b/arch/powerpc/platforms/pseries/vas.h
new file mode 100644
index ..4ecb3fcabd10
--- /dev/null
+++ b/arch/powerpc/platforms/pseries/vas.h
@@ -0,0 +1,125 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+/*
+ * Copyright 2020-21 IBM Corp.
+ */
+
+#ifndef _VAS_H
+#define _VAS_H
+#include 
+#include 
+#include 
+
+/*
+ * VAS window modify flags
+ */
+#define VAS_MOD_WIN_CLOSE  PPC_BIT(0)
+#define VAS_MOD_WIN_JOBS_KILL  PPC_BIT(1)
+#define VAS_MOD_WIN_DR PPC_BIT(3)
+#define VAS_MOD_WIN_PR PPC_BIT(4)
+#define VAS_MOD_WIN_SF PPC_BIT(5)
+#define VAS_MOD_WIN_TA PPC_BIT(6)
+#define VAS_MOD_WIN_FLAGS  (VAS_MOD_WIN_JOBS_KILL | VAS_MOD_WIN_DR | \
+   VAS_MOD_WIN_PR | VAS_MOD_WIN_SF)
+
+#define VAS_WIN_ACTIVE 0x0
+#define VAS_WIN_CLOSED 0x1
+#define VAS_WIN_INACTIVE   0x2 /* Inactive due to HW failure */
+/* Process of being modified, deallocated, or quiesced */
+#define VAS_WIN_MOD_IN_PROCESS 0x3
+
+#define VAS_COPY_PASTE_USER_MODE   0x0001
+#define VAS_COP_OP_USER_MODE   0x0010
+
+/*
+ * Co-processor feature - GZIP QoS windows or GZIP default windows
+ */
+enum vas_cop_feat_type {
+   VAS_GZIP_QOS_FEAT_TYPE,
+   VAS_GZIP_DEF_FEAT_TYPE,
+   VAS_MAX_FEAT_TYPE,
+};
+
+/*
+ * Use to get feature specific capabilities from the
+ * hypervisor.
+ */
+struct hv_vas_cop_feat_caps {
+   __be64  descriptor;
+   u8  win_type;   /* Default or QoS type */
+   u8  user_mode;
+   __be16  max_lpar_creds;
+   __be16  max_win_creds;
+   union {
+   __be16  reserved;
+   __be16  def_lpar_creds; /* Used for default capabilities */
+   };
+   __be16  target_lpar_creds;
+} __packed __aligned(0x1000);
+
+/*
+ * Feature specific (QoS or default) capabilities.
+ */
+struct vas_cop_feat_caps {
+   u64 descriptor;
+   u8  win_type;   /* Default or QoS type */
+   u8  user_mode;  /* User mode copy/paste or COP HCALL */
+   u16 max_lpar_creds; /* Max credits available in LPAR */
+   /* Max credits can be assigned per window */
+   u16 max_win_creds;
+   union {
+   u16 

[PATCH v6 07/17] powerpc/vas: Define and use common vas_window struct

2021-06-17 Thread Haren Myneni


Many elements in vas_struct are used on PowerNV and PowerVM
platforms. vas_window is used for both TX and RX windows on
PowerNV and for TX windows on PowerVM. So some elements are
specific to these platforms.

So this patch defines a common vas_window struct and platform
specific window structs (pnv_vas_window on PowerNV). It also makes
the corresponding changes in the PowerNV VAS code.
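
The diff below accesses the common fields through window->vas_win, which implies the platform struct embeds the common one. A minimal sketch of that layout follows, along with one common way to get back from the embedded part to its wrapper. The structs are reduced to a few fields, and to_pnv_window() and the simplified container_of() are illustrative assumptions, not code from this patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Common part, reduced to two of the fields added by this patch. */
struct vas_window {
	uint32_t winid;
	uint32_t wcreds_max;
};

/* Reduced platform wrapper; the real pnv_vas_window has many more fields. */
struct pnv_vas_window {
	struct vas_window vas_win;      /* embedded common window */
	int tx_win;                     /* platform specific state */
};

/* Same idea as the kernel's container_of(), without its type checking. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical helper: recover the platform window from its common part. */
static struct pnv_vas_window *to_pnv_window(struct vas_window *win)
{
	assert(win != NULL);
	return container_of(win, struct pnv_vas_window, vas_win);
}
```

Common code can then pass struct vas_window pointers around, while platform code converts back to its own type when it needs platform specific fields.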

Signed-off-by: Haren Myneni 
Reviewed-by: Nicholas Piggin 
---
 arch/powerpc/include/asm/vas.h  |  14 +-
 arch/powerpc/platforms/powernv/vas-debug.c  |  27 ++--
 arch/powerpc/platforms/powernv/vas-fault.c  |  20 +--
 arch/powerpc/platforms/powernv/vas-trace.h  |   4 +-
 arch/powerpc/platforms/powernv/vas-window.c | 161 +++-
 arch/powerpc/platforms/powernv/vas.h|  44 +++---
 6 files changed, 144 insertions(+), 126 deletions(-)

diff --git a/arch/powerpc/include/asm/vas.h b/arch/powerpc/include/asm/vas.h
index 6b41c0818958..14ad7982874c 100644
--- a/arch/powerpc/include/asm/vas.h
+++ b/arch/powerpc/include/asm/vas.h
@@ -10,8 +10,6 @@
 #include 
 #include 
 
-struct vas_window;
-
 /*
  * Min and max FIFO sizes are based on Version 1.05 Section 3.1.4.25
  * (Local FIFO Size Register) of the VAS workbook.
@@ -63,6 +61,18 @@ struct vas_user_win_ref {
struct mm_struct *mm;   /* Linux process mm_struct */
 };
 
+/*
+ * Common VAS window struct on PowerNV and PowerVM
+ */
+struct vas_window {
+   u32 winid;
+   u32 wcreds_max; /* Window credits */
+   enum vas_cop_type cop;
+   struct vas_user_win_ref task_ref;
+   char *dbgname;
+   struct dentry *dbgdir;
+};
+
 /*
  * User space window operations used for powernv and powerVM
  */
diff --git a/arch/powerpc/platforms/powernv/vas-debug.c b/arch/powerpc/platforms/powernv/vas-debug.c
index 41fa90d2f4ab..3ce89a4b54be 100644
--- a/arch/powerpc/platforms/powernv/vas-debug.c
+++ b/arch/powerpc/platforms/powernv/vas-debug.c
@@ -9,6 +9,7 @@
 #include 
 #include 
 #include 
+#include 
 #include "vas.h"
 
 static struct dentry *vas_debugfs;
@@ -28,7 +29,7 @@ static char *cop_to_str(int cop)
 
 static int info_show(struct seq_file *s, void *private)
 {
-   struct vas_window *window = s->private;
+   struct pnv_vas_window *window = s->private;
 
mutex_lock(_mutex);
 
@@ -36,9 +37,9 @@ static int info_show(struct seq_file *s, void *private)
if (!window->hvwc_map)
goto unlock;
 
-   seq_printf(s, "Type: %s, %s\n", cop_to_str(window->cop),
+   seq_printf(s, "Type: %s, %s\n", cop_to_str(window->vas_win.cop),
window->tx_win ? "Send" : "Receive");
-   seq_printf(s, "Pid : %d\n", vas_window_pid(window));
+   seq_printf(s, "Pid : %d\n", vas_window_pid(>vas_win));
 
 unlock:
mutex_unlock(_mutex);
@@ -47,7 +48,7 @@ static int info_show(struct seq_file *s, void *private)
 
 DEFINE_SHOW_ATTRIBUTE(info);
 
-static inline void print_reg(struct seq_file *s, struct vas_window *win,
+static inline void print_reg(struct seq_file *s, struct pnv_vas_window *win,
char *name, u32 reg)
 {
seq_printf(s, "0x%016llx %s\n", read_hvwc_reg(win, name, reg), name);
@@ -55,7 +56,7 @@ static inline void print_reg(struct seq_file *s, struct vas_window *win,
 
 static int hvwc_show(struct seq_file *s, void *private)
 {
-   struct vas_window *window = s->private;
+   struct pnv_vas_window *window = s->private;
 
mutex_lock(_mutex);
 
@@ -103,8 +104,10 @@ static int hvwc_show(struct seq_file *s, void *private)
 
 DEFINE_SHOW_ATTRIBUTE(hvwc);
 
-void vas_window_free_dbgdir(struct vas_window *window)
+void vas_window_free_dbgdir(struct pnv_vas_window *pnv_win)
 {
+   struct vas_window *window =  _win->vas_win;
+
if (window->dbgdir) {
debugfs_remove_recursive(window->dbgdir);
kfree(window->dbgname);
@@ -113,21 +116,21 @@ void vas_window_free_dbgdir(struct vas_window *window)
}
 }
 
-void vas_window_init_dbgdir(struct vas_window *window)
+void vas_window_init_dbgdir(struct pnv_vas_window *window)
 {
struct dentry *d;
 
if (!window->vinst->dbgdir)
return;
 
-   window->dbgname = kzalloc(16, GFP_KERNEL);
-   if (!window->dbgname)
+   window->vas_win.dbgname = kzalloc(16, GFP_KERNEL);
+   if (!window->vas_win.dbgname)
return;
 
-   snprintf(window->dbgname, 16, "w%d", window->winid);
+   snprintf(window->vas_win.dbgname, 16, "w%d", window->vas_win.winid);
 
-   d = debugfs_create_dir(window->dbgname, window->vinst->dbgdir);
-   window->dbgdir = d;
+   d = debugfs_create_dir(window->vas_win.dbgname, window->vinst->dbgdir);
+   window->vas_win.dbgdir = d;
 
debugfs_create_file("info", 0444, d, window, _fops);
debugfs_create_file("hvwc", 0444, d, window, _fops);
diff --git a/arch/powerpc/platforms/powernv/vas-fault.c b/arch/powerpc/platforms/powernv/vas-fault.c
index 

[PATCH v6 06/17] powerpc/vas: Move update_csb/dump_crb to common book3s platform

2021-06-17 Thread Haren Myneni


If a coprocessor encounters an error translating an address, the
VAS will cause an interrupt in the host. The kernel processes
the fault by updating the CSB. This functionality is the same for
both powerNV and pseries, so this patch moves these functions to
the common vas-api.c. The actual functionality is not changed.

Signed-off-by: Haren Myneni 
Reviewed-by: Nicholas Piggin 
---
 arch/powerpc/include/asm/vas.h |   3 +
 arch/powerpc/platforms/book3s/vas-api.c| 167 +
 arch/powerpc/platforms/powernv/vas-fault.c | 155 ++-
 3 files changed, 179 insertions(+), 146 deletions(-)

diff --git a/arch/powerpc/include/asm/vas.h b/arch/powerpc/include/asm/vas.h
index 71cff6d6bf3a..6b41c0818958 100644
--- a/arch/powerpc/include/asm/vas.h
+++ b/arch/powerpc/include/asm/vas.h
@@ -230,4 +230,7 @@ int vas_register_coproc_api(struct module *mod, enum vas_cop_type cop_type,
 void vas_unregister_coproc_api(void);
 
 int get_vas_user_win_ref(struct vas_user_win_ref *task_ref);
+void vas_update_csb(struct coprocessor_request_block *crb,
+   struct vas_user_win_ref *task_ref);
+void vas_dump_crb(struct coprocessor_request_block *crb);
 #endif /* __ASM_POWERPC_VAS_H */
diff --git a/arch/powerpc/platforms/book3s/vas-api.c b/arch/powerpc/platforms/book3s/vas-api.c
index 4ce82500f4c5..30172e52e16b 100644
--- a/arch/powerpc/platforms/book3s/vas-api.c
+++ b/arch/powerpc/platforms/book3s/vas-api.c
@@ -10,6 +10,9 @@
 #include 
 #include 
 #include 
+#include 
+#include 
+#include 
 #include 
 #include 
 #include 
@@ -94,6 +97,170 @@ int get_vas_user_win_ref(struct vas_user_win_ref *task_ref)
return 0;
 }
 
+/*
+ * Successful return must release the task reference with
+ * put_task_struct
+ */
+static bool ref_get_pid_and_task(struct vas_user_win_ref *task_ref,
+ struct task_struct **tskp, struct pid **pidp)
+{
+   struct task_struct *tsk;
+   struct pid *pid;
+
+   pid = task_ref->pid;
+   tsk = get_pid_task(pid, PIDTYPE_PID);
+   if (!tsk) {
+   pid = task_ref->tgid;
+   tsk = get_pid_task(pid, PIDTYPE_PID);
+   /*
+* Parent thread (tgid) will be closing window when it
+* exits. So should not get here.
+*/
+   if (WARN_ON_ONCE(!tsk))
+   return false;
+   }
+
+   /* Return if the task is exiting. */
+   if (tsk->flags & PF_EXITING) {
+   put_task_struct(tsk);
+   return false;
+   }
+
+   *tskp = tsk;
+   *pidp = pid;
+
+   return true;
+}
+
+/*
+ * Update the CSB to indicate a translation error.
+ *
+ * User space will be polling on CSB after the request is issued.
+ * If NX can handle the request without any issues, it updates CSB.
+ * Whereas if NX encounters page fault, the kernel will handle the
+ * fault and update CSB with translation error.
+ *
+ * If we are unable to update the CSB means copy_to_user failed due to
+ * invalid csb_addr, send a signal to the process.
+ */
+void vas_update_csb(struct coprocessor_request_block *crb,
+   struct vas_user_win_ref *task_ref)
+{
+   struct coprocessor_status_block csb;
+   struct kernel_siginfo info;
+   struct task_struct *tsk;
+   void __user *csb_addr;
+   struct pid *pid;
+   int rc;
+
+   /*
+* NX user space windows can not be opened for task->mm=NULL
+* and faults will not be generated for kernel requests.
+*/
+   if (WARN_ON_ONCE(!task_ref->mm))
+   return;
+
+   csb_addr = (void __user *)be64_to_cpu(crb->csb_addr);
+
+   memset(, 0, sizeof(csb));
+   csb.cc = CSB_CC_FAULT_ADDRESS;
+   csb.ce = CSB_CE_TERMINATION;
+   csb.cs = 0;
+   csb.count = 0;
+
+   /*
+* NX operates and returns in BE format as defined CRB struct.
+* So saves fault_storage_addr in BE as NX pastes in FIFO and
+* expects user space to convert to CPU format.
+*/
+   csb.address = crb->stamp.nx.fault_storage_addr;
+   csb.flags = 0;
+
+   /*
+* Process closes send window after all pending NX requests are
+* completed. In multi-thread applications, a child thread can
+* open a window and can exit without closing it. May be some
+* requests are pending or this window can be used by other
+* threads later. We should handle faults if NX encounters
+* pages faults on these requests. Update CSB with translation
+* error and fault address. If csb_addr passed by user space is
+* invalid, send SEGV signal to pid saved in window. If the
+* child thread is not running, send the signal to tgid.
+* Parent thread (tgid) will close this window upon its exit.
+*
+* pid and mm references are taken when window is opened by
+* process (pid). So tgid is used only when child thread opens
+

[PATCH v6 05/17] powerpc/vas: Create take/drop pid and mm reference functions

2021-06-17 Thread Haren Myneni


Take pid and mm references when each window opens and drop them
during close. This functionality is needed for powerNV and pseries,
so this patch moves the existing code into functions in the common
book3s platform vas-api.c.

Signed-off-by: Haren Myneni 
Reviewed-by: Nicholas Piggin 
---
 arch/powerpc/include/asm/vas.h  | 40 +++
 arch/powerpc/platforms/book3s/vas-api.c | 39 +++
 arch/powerpc/platforms/powernv/vas-fault.c  | 10 ++--
 arch/powerpc/platforms/powernv/vas-window.c | 55 ++---
 arch/powerpc/platforms/powernv/vas.h|  6 +--
 5 files changed, 91 insertions(+), 59 deletions(-)

diff --git a/arch/powerpc/include/asm/vas.h b/arch/powerpc/include/asm/vas.h
index 163a8bb85d02..71cff6d6bf3a 100644
--- a/arch/powerpc/include/asm/vas.h
+++ b/arch/powerpc/include/asm/vas.h
@@ -5,6 +5,9 @@
 
 #ifndef _ASM_POWERPC_VAS_H
 #define _ASM_POWERPC_VAS_H
+#include 
+#include 
+#include 
 #include 
 
 struct vas_window;
@@ -49,6 +52,17 @@ enum vas_cop_type {
VAS_COP_TYPE_MAX,
 };
 
+/*
+ * User space VAS windows are opened by tasks and take references
+ * to pid and mm until windows are closed.
+ * Stores pid, mm, and tgid for each window.
+ */
+struct vas_user_win_ref {
+   struct pid *pid;/* PID of owner */
+   struct pid *tgid;   /* Thread group ID of owner */
+   struct mm_struct *mm;   /* Linux process mm_struct */
+};
+
 /*
  * User space window operations used for powernv and powerVM
  */
@@ -59,6 +73,31 @@ struct vas_user_win_ops {
int (*close_win)(struct vas_window *);
 };
 
+static inline void put_vas_user_win_ref(struct vas_user_win_ref *ref)
+{
+   /* Drop references to pid, tgid, and mm */
+   put_pid(ref->pid);
+   put_pid(ref->tgid);
+   if (ref->mm)
+   mmdrop(ref->mm);
+}
+
+static inline void vas_user_win_add_mm_context(struct vas_user_win_ref *ref)
+{
+   mm_context_add_vas_window(ref->mm);
+   /*
+* Even a process that has no foreign real address mapping can
+* use an unpaired COPY instruction (to no real effect). Issue
+* CP_ABORT to clear any pending COPY and prevent a covert
+* channel.
+*
+* __switch_to() will issue CP_ABORT on future context switches
+* if process / thread has any open VAS window (Use
+* current->mm->context.vas_windows).
+*/
+   asm volatile(PPC_CP_ABORT);
+}
+
 /*
  * Receive window attributes specified by the (in-kernel) owner of window.
  */
@@ -190,4 +229,5 @@ int vas_register_coproc_api(struct module *mod, enum vas_cop_type cop_type,
const struct vas_user_win_ops *vops);
 void vas_unregister_coproc_api(void);
 
+int get_vas_user_win_ref(struct vas_user_win_ref *task_ref);
 #endif /* __ASM_POWERPC_VAS_H */
diff --git a/arch/powerpc/platforms/book3s/vas-api.c b/arch/powerpc/platforms/book3s/vas-api.c
index ad566464b55b..4ce82500f4c5 100644
--- a/arch/powerpc/platforms/book3s/vas-api.c
+++ b/arch/powerpc/platforms/book3s/vas-api.c
@@ -55,6 +55,45 @@ static char *coproc_devnode(struct device *dev, umode_t *mode)
return kasprintf(GFP_KERNEL, "crypto/%s", dev_name(dev));
 }
 
+/*
+ * Take reference to pid and mm
+ */
+int get_vas_user_win_ref(struct vas_user_win_ref *task_ref)
+{
+   /*
+* Window opened by a child thread may not be closed when
+* it exits. So take reference to its pid and release it
+* when the window is free by parent thread.
+* Acquire a reference to the task's pid to make sure
+* pid will not be re-used - needed only for multithread
+* applications.
+*/
+   task_ref->pid = get_task_pid(current, PIDTYPE_PID);
+   /*
+* Acquire a reference to the task's mm.
+*/
+   task_ref->mm = get_task_mm(current);
+   if (!task_ref->mm) {
+   put_pid(task_ref->pid);
+   pr_err("VAS: pid(%d): mm_struct is not found\n",
+   current->pid);
+   return -EPERM;
+   }
+
+   mmgrab(task_ref->mm);
+   mmput(task_ref->mm);
+   /*
+* Process closes window during exit. In the case of
+* multithread application, the child thread can open
+* window and can exit without closing it. So takes tgid
+* reference until window closed to make sure tgid is not
+* reused.
+*/
+   task_ref->tgid = find_get_pid(task_tgid_vnr(current));
+
+   return 0;
+}
+
 static int coproc_open(struct inode *inode, struct file *fp)
 {
struct coproc_instance *cp_inst;
diff --git a/arch/powerpc/platforms/powernv/vas-fault.c b/arch/powerpc/platforms/powernv/vas-fault.c
index 3d21fce254b7..ac3a71ec3bd5 100644
--- a/arch/powerpc/platforms/powernv/vas-fault.c
+++ b/arch/powerpc/platforms/powernv/vas-fault.c
@@ -73,7 +73,7 @@ static void update_csb(struct vas_window *window,
 * NX user space windows can not be opened for 

[PATCH v6 04/17] powerpc/vas: Add platform specific user window operations

2021-06-17 Thread Haren Myneni


PowerNV uses registers to open and close VAS windows and to get the
paste address, whereas PowerVM uses hypervisor calls.

This patch adds the platform specific user space window operations
and registers them with the common VAS user space interface.
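
The split between a common layer and platform back ends can be pictured as a table of function pointers that the common code dispatches through. The sketch below is a reduced user-space model: the ops struct drops the cop type argument and paste_addr, and demo_open(), demo_close() and common_open() are hypothetical names. Note that the guard is written with ||, so a NULL table is rejected before any member is read. A guard written with && would dereference the NULL table.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct vas_window {
	uint32_t winid;
};

/* Reduced mirror of vas_user_win_ops from this patch. */
struct vas_user_win_ops {
	struct vas_window *(*open_win)(int vas_id, uint64_t flags);
	int (*close_win)(struct vas_window *win);
};

static struct vas_window demo_window = { .winid = 42 };

/* Hypothetical platform back end: always hands out the same window. */
static struct vas_window *demo_open(int vas_id, uint64_t flags)
{
	(void)vas_id;
	(void)flags;
	return &demo_window;
}

static int demo_close(struct vas_window *win)
{
	assert(win != NULL);
	return win == &demo_window ? 0 : -1;
}

static const struct vas_user_win_ops demo_ops = {
	.open_win = demo_open,
	.close_win = demo_close,
};

/* Common-layer dispatch: validate the table, then call the back end. */
static struct vas_window *common_open(const struct vas_user_win_ops *vops,
				      int vas_id, uint64_t flags)
{
	if (!vops || !vops->open_win)
		return NULL;    /* the kernel code returns an errno instead */
	return vops->open_win(vas_id, flags);
}
```

Each platform registers its own table once, and the ioctl path never needs to know which platform it is running on.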

Signed-off-by: Haren Myneni 
Reviewed-by: Nicholas Piggin 
---
 arch/powerpc/include/asm/vas.h  | 16 +--
 arch/powerpc/platforms/book3s/vas-api.c | 53 +
 arch/powerpc/platforms/powernv/vas-window.c | 45 -
 arch/powerpc/platforms/powernv/vas.h|  2 +
 4 files changed, 91 insertions(+), 25 deletions(-)

diff --git a/arch/powerpc/include/asm/vas.h b/arch/powerpc/include/asm/vas.h
index 6076adf9ab4f..163a8bb85d02 100644
--- a/arch/powerpc/include/asm/vas.h
+++ b/arch/powerpc/include/asm/vas.h
@@ -5,6 +5,7 @@
 
 #ifndef _ASM_POWERPC_VAS_H
 #define _ASM_POWERPC_VAS_H
+#include 
 
 struct vas_window;
 
@@ -48,6 +49,16 @@ enum vas_cop_type {
VAS_COP_TYPE_MAX,
 };
 
+/*
+ * User space window operations used for powernv and powerVM
+ */
+struct vas_user_win_ops {
+   struct vas_window * (*open_win)(int vas_id, u64 flags,
+   enum vas_cop_type);
+   u64 (*paste_addr)(struct vas_window *);
+   int (*close_win)(struct vas_window *);
+};
+
 /*
  * Receive window attributes specified by the (in-kernel) owner of window.
  */
@@ -162,8 +173,6 @@ int vas_copy_crb(void *crb, int offset);
  */
 int vas_paste_crb(struct vas_window *win, int offset, bool re);
 
-void vas_win_paste_addr(struct vas_window *window, u64 *addr,
-   int *len);
 int vas_register_api_powernv(struct module *mod, enum vas_cop_type cop_type,
 const char *name);
 void vas_unregister_api_powernv(void);
@@ -177,7 +186,8 @@ void vas_unregister_api_powernv(void);
  * used for others in future.
  */
 int vas_register_coproc_api(struct module *mod, enum vas_cop_type cop_type,
-   const char *name);
+   const char *name,
+   const struct vas_user_win_ops *vops);
 void vas_unregister_coproc_api(void);
 
 #endif /* __ASM_POWERPC_VAS_H */
diff --git a/arch/powerpc/platforms/book3s/vas-api.c b/arch/powerpc/platforms/book3s/vas-api.c
index 72c126d87216..ad566464b55b 100644
--- a/arch/powerpc/platforms/book3s/vas-api.c
+++ b/arch/powerpc/platforms/book3s/vas-api.c
@@ -42,6 +42,7 @@ static struct coproc_dev {
dev_t devt;
struct class *class;
enum vas_cop_type cop_type;
+   const struct vas_user_win_ops *vops;
 } coproc_device;
 
 struct coproc_instance {
@@ -72,11 +73,10 @@ static int coproc_open(struct inode *inode, struct file *fp)
 static int coproc_ioc_tx_win_open(struct file *fp, unsigned long arg)
 {
void __user *uptr = (void __user *)arg;
-   struct vas_tx_win_attr txattr = {};
struct vas_tx_win_open_attr uattr;
struct coproc_instance *cp_inst;
struct vas_window *txwin;
-   int rc, vasid;
+   int rc;
 
cp_inst = fp->private_data;
 
@@ -93,27 +93,20 @@ static int coproc_ioc_tx_win_open(struct file *fp, unsigned long arg)
}
 
if (uattr.version != 1) {
-   pr_err("Invalid version\n");
+   pr_err("Invalid window open API version\n");
return -EINVAL;
}
 
-   vasid = uattr.vas_id;
-
-   vas_init_tx_win_attr(, cp_inst->coproc->cop_type);
-
-   txattr.lpid = mfspr(SPRN_LPID);
-   txattr.pidr = mfspr(SPRN_PID);
-   txattr.user_win = true;
-   txattr.rsvd_txbuf_count = false;
-   txattr.pswid = false;
-
-   pr_devel("Pid %d: Opening txwin, PIDR %ld\n", txattr.pidr,
-   mfspr(SPRN_PID));
+   if (!cp_inst->coproc->vops || !cp_inst->coproc->vops->open_win) {
+   pr_err("VAS API is not registered\n");
+   return -EACCES;
+   }
 
-   txwin = vas_tx_win_open(vasid, cp_inst->coproc->cop_type, );
+   txwin = cp_inst->coproc->vops->open_win(uattr.vas_id, uattr.flags,
+   cp_inst->coproc->cop_type);
if (IS_ERR(txwin)) {
-   pr_err("%s() vas_tx_win_open() failed, %ld\n", __func__,
-   PTR_ERR(txwin));
+   pr_err("%s() VAS window open failed, %ld\n", __func__,
+   PTR_ERR(txwin));
return PTR_ERR(txwin);
}
 
@@ -125,9 +118,15 @@ static int coproc_ioc_tx_win_open(struct file *fp, unsigned long arg)
 static int coproc_release(struct inode *inode, struct file *fp)
 {
struct coproc_instance *cp_inst = fp->private_data;
+   int rc;
 
if (cp_inst->txwin) {
-   vas_win_close(cp_inst->txwin);
+   if (cp_inst->coproc->vops &&
+   cp_inst->coproc->vops->close_win) {
+   rc = cp_inst->coproc->vops->close_win(cp_inst->txwin);
+  

[PATCH v6 03/17] powerpc/powernv/vas: Rename register/unregister functions

2021-06-17 Thread Haren Myneni


powerNV and pseries drivers register / unregister to the corresponding
platform specific VAS separately. Then these VAS functions call the
common API with the specific window operations. So rename powerNV VAS
API register/unregister functions.

Signed-off-by: Haren Myneni 
Reviewed-by: Nicholas Piggin 
---
 arch/powerpc/include/asm/vas.h  |  3 +++
 arch/powerpc/platforms/book3s/vas-api.c |  2 --
 arch/powerpc/platforms/powernv/vas-window.c | 18 ++
 drivers/crypto/nx/nx-common-powernv.c   |  6 +++---
 4 files changed, 24 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/include/asm/vas.h b/arch/powerpc/include/asm/vas.h
index 3be76e813e2d..6076adf9ab4f 100644
--- a/arch/powerpc/include/asm/vas.h
+++ b/arch/powerpc/include/asm/vas.h
@@ -164,6 +164,9 @@ int vas_paste_crb(struct vas_window *win, int offset, bool re);
 
 void vas_win_paste_addr(struct vas_window *window, u64 *addr,
int *len);
+int vas_register_api_powernv(struct module *mod, enum vas_cop_type cop_type,
+const char *name);
+void vas_unregister_api_powernv(void);
 
 /*
  * Register / unregister coprocessor type to VAS API which will be exported
diff --git a/arch/powerpc/platforms/book3s/vas-api.c b/arch/powerpc/platforms/book3s/vas-api.c
index cfc9d7dd65ab..72c126d87216 100644
--- a/arch/powerpc/platforms/book3s/vas-api.c
+++ b/arch/powerpc/platforms/book3s/vas-api.c
@@ -262,7 +262,6 @@ int vas_register_coproc_api(struct module *mod, enum vas_cop_type cop_type,
unregister_chrdev_region(coproc_device.devt, 1);
return rc;
 }
-EXPORT_SYMBOL_GPL(vas_register_coproc_api);
 
 void vas_unregister_coproc_api(void)
 {
@@ -275,4 +274,3 @@ void vas_unregister_coproc_api(void)
class_destroy(coproc_device.class);
unregister_chrdev_region(coproc_device.devt, 1);
 }
-EXPORT_SYMBOL_GPL(vas_unregister_coproc_api);
diff --git a/arch/powerpc/platforms/powernv/vas-window.c b/arch/powerpc/platforms/powernv/vas-window.c
index 7ba0840fc3b5..41712b4b268e 100644
--- a/arch/powerpc/platforms/powernv/vas-window.c
+++ b/arch/powerpc/platforms/powernv/vas-window.c
@@ -1442,3 +1442,21 @@ struct vas_window *vas_pswid_to_window(struct vas_instance *vinst,
 
return window;
 }
+
+/*
+ * Supporting only nx-gzip coprocessor type now, but this API code
+ * can be extended to other coprocessor types later.
+ */
+int vas_register_api_powernv(struct module *mod, enum vas_cop_type cop_type,
+const char *name)
+{
+
+   return vas_register_coproc_api(mod, cop_type, name);
+}
+EXPORT_SYMBOL_GPL(vas_register_api_powernv);
+
+void vas_unregister_api_powernv(void)
+{
+   vas_unregister_coproc_api();
+}
+EXPORT_SYMBOL_GPL(vas_unregister_api_powernv);
diff --git a/drivers/crypto/nx/nx-common-powernv.c b/drivers/crypto/nx/nx-common-powernv.c
index 446f611726df..3b159f2fae17 100644
--- a/drivers/crypto/nx/nx-common-powernv.c
+++ b/drivers/crypto/nx/nx-common-powernv.c
@@ -1092,8 +1092,8 @@ static __init int nx_compress_powernv_init(void)
 * normal FIFO priority is assigned for userspace.
 * 842 compression is supported only in kernel.
 */
-   ret = vas_register_coproc_api(THIS_MODULE, VAS_COP_TYPE_GZIP,
-   "nx-gzip");
+   ret = vas_register_api_powernv(THIS_MODULE, VAS_COP_TYPE_GZIP,
+  "nx-gzip");
 
/*
 * GZIP is not supported in kernel right now.
@@ -1129,7 +1129,7 @@ static void __exit nx_compress_powernv_exit(void)
 * use. So delete this API use for GZIP engine.
 */
if (!nx842_ct)
-   vas_unregister_coproc_api();
+   vas_unregister_api_powernv();
 
crypto_unregister_alg(_powernv_alg);
 
-- 
2.18.2




[PATCH v6 02/17] powerpc/vas: Move VAS API to book3s common platform

2021-06-17 Thread Haren Myneni


The pseries platform will share vas and nx code and interfaces
with the PowerNV platform, so create the
arch/powerpc/platforms/book3s/ directory and move VAS API code
there. Functionality is not changed.

Signed-off-by: Haren Myneni 
Reviewed-by: Nicholas Piggin 
---
 arch/powerpc/include/asm/vas.h|  3 +++
 arch/powerpc/platforms/Kconfig|  1 +
 arch/powerpc/platforms/Makefile   |  1 +
 arch/powerpc/platforms/book3s/Kconfig | 15 +++
 arch/powerpc/platforms/book3s/Makefile|  2 ++
 .../platforms/{powernv => book3s}/vas-api.c   |  2 +-
 arch/powerpc/platforms/powernv/Kconfig| 14 --
 arch/powerpc/platforms/powernv/Makefile   |  2 +-
 arch/powerpc/platforms/powernv/vas.h  |  2 --
 9 files changed, 24 insertions(+), 18 deletions(-)
 create mode 100644 arch/powerpc/platforms/book3s/Kconfig
 create mode 100644 arch/powerpc/platforms/book3s/Makefile
 rename arch/powerpc/platforms/{powernv => book3s}/vas-api.c (99%)

diff --git a/arch/powerpc/include/asm/vas.h b/arch/powerpc/include/asm/vas.h
index e33f80b0ea81..3be76e813e2d 100644
--- a/arch/powerpc/include/asm/vas.h
+++ b/arch/powerpc/include/asm/vas.h
@@ -162,6 +162,9 @@ int vas_copy_crb(void *crb, int offset);
  */
 int vas_paste_crb(struct vas_window *win, int offset, bool re);
 
+void vas_win_paste_addr(struct vas_window *window, u64 *addr,
+   int *len);
+
 /*
  * Register / unregister coprocessor type to VAS API which will be exported
  * to user space. Applications can use this API to open / close window
diff --git a/arch/powerpc/platforms/Kconfig b/arch/powerpc/platforms/Kconfig
index 7a5e8f4541e3..594544a65b02 100644
--- a/arch/powerpc/platforms/Kconfig
+++ b/arch/powerpc/platforms/Kconfig
@@ -20,6 +20,7 @@ source "arch/powerpc/platforms/embedded6xx/Kconfig"
 source "arch/powerpc/platforms/44x/Kconfig"
 source "arch/powerpc/platforms/40x/Kconfig"
 source "arch/powerpc/platforms/amigaone/Kconfig"
+source "arch/powerpc/platforms/book3s/Kconfig"
 
 config KVM_GUEST
bool "KVM Guest support"
diff --git a/arch/powerpc/platforms/Makefile b/arch/powerpc/platforms/Makefile
index 143d4417f6cc..0e75d7df387b 100644
--- a/arch/powerpc/platforms/Makefile
+++ b/arch/powerpc/platforms/Makefile
@@ -22,3 +22,4 @@ obj-$(CONFIG_PPC_CELL)+= cell/
 obj-$(CONFIG_PPC_PS3)  += ps3/
 obj-$(CONFIG_EMBEDDED6xx)  += embedded6xx/
 obj-$(CONFIG_AMIGAONE) += amigaone/
+obj-$(CONFIG_PPC_BOOK3S)   += book3s/
diff --git a/arch/powerpc/platforms/book3s/Kconfig b/arch/powerpc/platforms/book3s/Kconfig
new file mode 100644
index 000000000000..34c931592ef0
--- /dev/null
+++ b/arch/powerpc/platforms/book3s/Kconfig
@@ -0,0 +1,15 @@
+# SPDX-License-Identifier: GPL-2.0
+config PPC_VAS
+   bool "IBM Virtual Accelerator Switchboard (VAS)"
+   depends on (PPC_POWERNV || PPC_PSERIES) && PPC_64K_PAGES
+   default y
+   help
+ This enables support for IBM Virtual Accelerator Switchboard (VAS).
+
+ VAS devices are found in POWER9-based and later systems, they
+ provide access to accelerator coprocessors such as NX-GZIP and
+ NX-842. This config allows the kernel to use NX-842 accelerators,
+ and user-mode APIs for the NX-GZIP accelerator on POWER9 PowerNV
+ and POWER10 PowerVM platforms.
+
+ If unsure, say "N".
diff --git a/arch/powerpc/platforms/book3s/Makefile b/arch/powerpc/platforms/book3s/Makefile
new file mode 100644
index 000000000000..e790f1910f61
--- /dev/null
+++ b/arch/powerpc/platforms/book3s/Makefile
@@ -0,0 +1,2 @@
+# SPDX-License-Identifier: GPL-2.0-only
+obj-$(CONFIG_PPC_VAS)  += vas-api.o
diff --git a/arch/powerpc/platforms/powernv/vas-api.c b/arch/powerpc/platforms/book3s/vas-api.c
similarity index 99%
rename from arch/powerpc/platforms/powernv/vas-api.c
rename to arch/powerpc/platforms/book3s/vas-api.c
index 98ed5d8c5441..cfc9d7dd65ab 100644
--- a/arch/powerpc/platforms/powernv/vas-api.c
+++ b/arch/powerpc/platforms/book3s/vas-api.c
@@ -10,9 +10,9 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
-#include "vas.h"
 
 /*
  * The driver creates the device node that can be used as follows:
diff --git a/arch/powerpc/platforms/powernv/Kconfig b/arch/powerpc/platforms/powernv/Kconfig
index 619b093a0657..043eefbbdd28 100644
--- a/arch/powerpc/platforms/powernv/Kconfig
+++ b/arch/powerpc/platforms/powernv/Kconfig
@@ -33,20 +33,6 @@ config PPC_MEMTRACE
  Enabling this option allows for runtime allocation of memory (RAM)
  for hardware tracing.
 
-config PPC_VAS
-   bool "IBM Virtual Accelerator Switchboard (VAS)"
-   depends on PPC_POWERNV && PPC_64K_PAGES
-   default y
-   help
- This enables support for IBM Virtual Accelerator Switchboard (VAS).
-
- VAS allows accelerators in co-processors like NX-GZIP and NX-842
- to be accessible to 

[PATCH v6 01/17] powerpc/powernv/vas: Release reference to tgid during window close

2021-06-17 Thread Haren Myneni


The kernel handles an NX fault by updating the CSB or by sending a
signal to the process. In multithreaded applications, a child thread
can open a VAS window and exit without closing it, while the parent
continues to send NX requests through that window. To prevent pid
reuse, references are taken on both pid and tgid when the window is
opened, and they must be released when the window is closed.

The current code does not release the tgid reference, which can
cause a pid leak. This patch fixes the issue.

Fixes: db1c08a740635 ("powerpc/vas: Take reference to PID and mm for user space windows")
Cc: sta...@vger.kernel.org # 5.8+
Signed-off-by: Haren Myneni 
Reported-by: Nicholas Piggin 
Reviewed-by: Nicholas Piggin 
---
 arch/powerpc/platforms/powernv/vas-window.c | 9 +
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/vas-window.c b/arch/powerpc/platforms/powernv/vas-window.c
index 5f5fe63a3d1c..7ba0840fc3b5 100644
--- a/arch/powerpc/platforms/powernv/vas-window.c
+++ b/arch/powerpc/platforms/powernv/vas-window.c
@@ -1093,9 +1093,9 @@ struct vas_window *vas_tx_win_open(int vasid, enum vas_cop_type cop,
/*
 * Process closes window during exit. In the case of
 * multithread application, the child thread can open
-* window and can exit without closing it. Expects parent
-* thread to use and close the window. So do not need
-* to take pid reference for parent thread.
+* window and can exit without closing it. so takes tgid
+* reference until window closed to make sure tgid is not
+* reused.
 */
txwin->tgid = find_get_pid(task_tgid_vnr(current));
/*
@@ -1339,8 +1339,9 @@ int vas_win_close(struct vas_window *window)
/* if send window, drop reference to matching receive window */
if (window->tx_win) {
if (window->user_win) {
-   /* Drop references to pid and mm */
+   /* Drop references to pid. tgid and mm */
put_pid(window->pid);
+   put_pid(window->tgid);
if (window->mm) {
mm_context_remove_vas_window(window->mm);
mmdrop(window->mm);
-- 
2.18.2
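
For readers following the fix above, here is a minimal user-space sketch of the get/put pairing it restores. It is only a toy model under stated assumptions: toy_pid, toy_get(), toy_put(), toy_win_open() and toy_win_close() are hypothetical stand-ins for struct pid, find_get_pid(), put_pid() and the VAS window open/close paths, and none of this is the kernel implementation.

```c
#include <assert.h>

/* Toy stand-in for the kernel's struct pid; only the refcount is modelled. */
struct toy_pid {
	int refs;
};

/* Hypothetical analogue of find_get_pid(): take one reference. */
static struct toy_pid *toy_get(struct toy_pid *p)
{
	p->refs++;
	return p;
}

/* Hypothetical analogue of put_pid(): drop one reference. */
static void toy_put(struct toy_pid *p)
{
	if (p)
		p->refs--;
}

struct toy_window {
	struct toy_pid *pid;
	struct toy_pid *tgid;
};

/* Window open takes a reference on both pid and tgid. */
static void toy_win_open(struct toy_window *w, struct toy_pid *pid,
			 struct toy_pid *tgid)
{
	w->pid = toy_get(pid);
	w->tgid = toy_get(tgid);
}

/*
 * Window close. release_tgid == 0 models the behaviour before the patch
 * (tgid reference never dropped); release_tgid == 1 models the fix.
 */
static void toy_win_close(struct toy_window *w, int release_tgid)
{
	toy_put(w->pid);
	if (release_tgid)
		toy_put(w->tgid);
}
```

With release_tgid set to 0, the tgid count stays at 1 after close, which is the leak described in the commit message. With release_tgid set to 1, both counts return to 0.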




[PATCH v6 00/17] Enable VAS and NX-GZIP support on PowerVM

2021-06-17 Thread Haren Myneni


Virtual Accelerator Switchboard (VAS) allows kernel subsystems
and user space processes to directly access the Nest Accelerator
(NX) engines, which provide HW compression. True user mode
VAS/NX support on PowerNV is already included in Linux, whereas
PowerVM support is available from P10 onwards.

This patch series enables VAS / NX-GZIP on PowerVM which allows
the user space to do copy/paste with the same existing interface
that is available on PowerNV.

VAS Enablement:
- Get all VAS capabilities that are available in the hypervisor
  using H_QUERY_VAS_CAPABILITIES. These capabilities tell the OS which
  types of features are supported (credit types such as Default and
  Quality of Service (QoS)). They also give specific capabilities for
  each credit type: maximum window credits, maximum LPAR credits,
  target credits in that partition (which vary from the max LPAR
  credits based on DLPAR operations), whether user mode COPY/PASTE
  is supported, etc.
- Register LPAR VAS operations such as open window, get paste
  address and close window with the current VAS user space API.
- Open window operation - Use the H_ALLOCATE_VAS_WINDOW HCALL to open
  the window and the H_MODIFY_VAS_WINDOW HCALL to set up the window
  with the LPAR PID, etc.
- mmap to the paste address returned by the H_ALLOCATE_VAS_WINDOW HCALL
- To close a window, the H_DEALLOCATE_VAS_WINDOW HCALL is used to
  close it in the hypervisor.

NX Enablement:
- Get NX capabilities from the hypervisor, which provides the maximum
  buffer length in a single GZIP request and the recommended minimum
  compression / decompression lengths.
- Register to VAS to enable user space VAS API

Main feature differences with PowerNV implementation:
- Each VAS window will be configured with a number of credits, which
  means that many requests can be issued simultaneously on that
  window. On PowerNV, 1K credits are configured per window,
  whereas on PowerVM, the hypervisor allows 1 credit per window
  at present.
- The hypervisor introduced 2 different types of credits: Default -
  Uses normal priority FIFO and Quality of Service (QoS) - Uses high
  priority FIFO. On PowerVM, VAS/NX HW resources are shared across
  LPARs. The total number of credits available on a system depends
  on cores configured. We may see more credits are assigned across
  the system than the NX HW resources can handle. So to avoid NX HW
  contention, the hypervisor introduced QoS credits which can be
  configured by system administration with HMC API. Then the total
  number of available default credits on LPAR varies based on QoS
  credits configured.
- On PowerNV, windows are allocated on a specific VAS instance
  and the user space can select the VAS instance with the open window
  ioctl. Since VAS instances can be shared across partitions on
  PowerVM, the hypervisor manages window allocations on different
  VAS instances. So H_ALLOCATE_VAS_WINDOW allows selection by domain
  identifiers (H_HOME_NODE_ASSOCIATIVITY values by cpu). By default
  the hypervisor selects the VAS instance closest to the CPU resources
  that the partition uses. So vas_id in the ioctl interface is ignored
  on PowerVM except vas_id=-1, which is used to allocate a window based
  on the CPU that the process is executing on. This option is needed
  for process affinity to a NUMA node.

  The existing applications that linked with libnxz should work as
  long as the job request length is restricted to
  req_max_processed_len.

  Tested the following patches on P10 successfully with test cases
  given: https://github.com/libnxz/power-gzip

  Note: The hypervisor supports user mode NX from p10 onwards. Linux
supports user mode VAS/NX on P10 only with radix page tables.

Patch 1:        Fix to release reference to tgid during window close
Patches 2 - 6:  Move the code that is needed for both PowerNV and
                PowerVM to powerpc book3s platform directory
Patch 7:        Modify vas-window struct to support both platforms
                and the related changes.
Patch 8:        Define HCALL and the related VAS/NXGZIP specific
                structs.
Patch 9:        Define QoS credit flag in window open ioctl
Patch 10:       Implement Allocate, Modify and Deallocate HCALLs
Patch 11:       Retrieve VAS capabilities from the hypervisor
Patch 12:       Implement window operations and integrate with API
Patch 13:       Setup IRQ and NX fault handling
Patch 14 - 15:  Make the code common to add NX-GZIP enablement
Patch 16:       Get NX capabilities from the hypervisor
Patch 17:       Add sysfs interface to expose NX capabilities

Changes in v2:
  - Rebase on 5.12-rc6
  - Moved VAS Kconfig changes to arch/powerpc/platform as suggested
by Christophe Leroy
  - build fix with allyesconfig (reported by kernel test build)

Changes in v3:
  - Rebase on 5.12-rc7
  - Moved vas-api.c and VAS Kconfig changes to
arch/powerpc/platform/book3s as Michael Ellerman suggested

Changes in v4:
  - Rebase on 5.13-rc2
  - Changes based on review comments from Nicholas Piggin
    - Add separate patch to define user 

[powerpc:next-test] BUILD SUCCESS 3c53642324f526c0aba411bf8e6cf2ab2471192a

2021-06-17 Thread kernel test robot
tree/branch: https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next-test
branch HEAD: 3c53642324f526c0aba411bf8e6cf2ab2471192a  Merge branch 'topic/ppc-kvm' into next

elapsed time: 736m

configs tested: 90
configs skipped: 3

The following configs have been built successfully.
More configs may be tested in the coming days.

gcc tested configs:
arm      defconfig
arm64    allyesconfig
arm64    defconfig
arm      allyesconfig
arm      allmodconfig
m68k     m5307c3_defconfig
sh       j2_defconfig
arm      cns3420vb_defconfig
mips     ath25_defconfig
sh       se7750_defconfig
arc      hsdk_defconfig
openrisc simple_smp_defconfig
arm      tct_hammer_defconfig
arm      s5pv210_defconfig
arm      keystone_defconfig
m68k     multi_defconfig
ia64     allmodconfig
ia64     defconfig
ia64     allyesconfig
x86_64   allnoconfig
m68k     allmodconfig
m68k     defconfig
m68k     allyesconfig
nios2    defconfig
arc      allyesconfig
nds32    allnoconfig
nds32    defconfig
nios2    allyesconfig
csky     defconfig
alpha    defconfig
alpha    allyesconfig
xtensa   allyesconfig
h8300    allyesconfig
arc      defconfig
sh       allmodconfig
parisc   defconfig
s390     allyesconfig
s390     allmodconfig
parisc   allyesconfig
s390     defconfig
i386     allyesconfig
sparc    allyesconfig
sparc    defconfig
i386     defconfig
mips     allyesconfig
mips     allmodconfig
powerpc  allyesconfig
powerpc  allmodconfig
powerpc  allnoconfig
x86_64   randconfig-a004-20210617
x86_64   randconfig-a001-20210617
x86_64   randconfig-a002-20210617
x86_64   randconfig-a003-20210617
x86_64   randconfig-a006-20210617
x86_64   randconfig-a005-20210617
i386 randconfig-a002-20210617
i386 randconfig-a006-20210617
i386 randconfig-a001-20210617
i386 randconfig-a004-20210617
i386 randconfig-a005-20210617
i386 randconfig-a003-20210617
i386 randconfig-a015-20210617
i386 randconfig-a013-20210617
i386 randconfig-a016-20210617
i386 randconfig-a012-20210617
i386 randconfig-a014-20210617
i386 randconfig-a011-20210617
riscv    nommu_k210_defconfig
riscv    allyesconfig
riscv    nommu_virt_defconfig
riscv    allnoconfig
riscv    defconfig
riscv    rv32_defconfig
riscv    allmodconfig
um       x86_64_defconfig
um       i386_defconfig
um       kunit_defconfig
x86_64   allyesconfig
x86_64   rhel-8.3-kselftests
x86_64   defconfig
x86_64   rhel-8.3
x86_64   rhel-8.3-kbuiltin
x86_64   kexec

clang tested configs:
x86_64   randconfig-b001-20210617
x86_64   randconfig-a015-20210617
x86_64   randconfig-a011-20210617
x86_64   randconfig-a014-20210617
x86_64   randconfig-a012-20210617
x86_64   randconfig-a013-20210617
x86_64   randconfig-a016-20210617

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org


Re: [RFC PATCH 8/8] powerpc/papr_scm: Use FORM2 associativity details

2021-06-17 Thread Daniel Henrique Barboza




On 6/17/21 8:11 AM, Aneesh Kumar K.V wrote:

Daniel Henrique Barboza  writes:


On 6/17/21 4:46 AM, David Gibson wrote:

On Tue, Jun 15, 2021 at 12:35:17PM +0530, Aneesh Kumar K.V wrote:

David Gibson  writes:


On Tue, Jun 15, 2021 at 11:27:50AM +0530, Aneesh Kumar K.V wrote:

David Gibson  writes:


On Mon, Jun 14, 2021 at 10:10:03PM +0530, Aneesh Kumar K.V wrote:

FORM2 introduces a concept of secondary domain which is identical to the
concept of FORM1 primary domain. Use secondary domain as the numa node
when using persistent memory device. For DAX kmem use the logical domain
id introduced in FORM2. This new numa node

Signed-off-by: Aneesh Kumar K.V 
---
   arch/powerpc/mm/numa.c| 28 +++
   arch/powerpc/platforms/pseries/papr_scm.c | 26 +
   arch/powerpc/platforms/pseries/pseries.h  |  1 +
   3 files changed, 45 insertions(+), 10 deletions(-)

diff --git a/arch/powerpc/mm/numa.c b/arch/powerpc/mm/numa.c
index 86cd2af014f7..b9ac6d02e944 100644
--- a/arch/powerpc/mm/numa.c
+++ b/arch/powerpc/mm/numa.c
@@ -265,6 +265,34 @@ static int associativity_to_nid(const __be32 *associativity)
return nid;
   }
   
+int get_primary_and_secondary_domain(struct device_node *node, int *primary, int *secondary)

+{
+   int secondary_index;
+   const __be32 *associativity;
+
+   if (!numa_enabled) {
+   *primary = NUMA_NO_NODE;
+   *secondary = NUMA_NO_NODE;
+   return 0;
+   }
+
+   associativity = of_get_associativity(node);
+   if (!associativity)
+   return -ENODEV;
+
+   if (of_read_number(associativity, 1) >= primary_domain_index) {
+   *primary = of_read_number(&associativity[primary_domain_index], 1);
+   secondary_index = of_read_number(&distance_ref_points[1], 1);


Secondary ID is always the second reference point, but primary depends
on the length of resources?  That seems very weird.


primary_domain_index is distance_ref_point[0]. With Form2 we would find
both primary and secondary domain ID same for all resources other than
persistent memory device. The usage w.r.t. persistent memory is
explained in patch 7.


Right, I misunderstood



With Form2 the primary domainID and secondary domainID are used to identify
the NUMA nodes the kernel should use when using persistent memory devices.


This seems kind of bogus.  With Form1, the primary/secondary ID are a
sort of hierarchy of distance (things with same primary ID are very
close, things with same secondary are kinda-close, etc.).  With Form2,
it's referring to their effective node for different purposes.

Using the same terms for different meanings seems unnecessarily
confusing.


They are essentially domainIDs. Their interpretation differs
between Form1 and Form2. Hence I kept referring to them as primary and
secondary domainID. Any suggestion on what to name them with Form2?


My point is that reusing associativity-reference-points for something
with completely unrelated semantics seems like a very poor choice.



I agree that this reuse can be confusing. I could argue that there is
precedent for that in PAPR - FORM0 puts a different spin on the same
property as well - but there is no need to keep following existing PAPR
practices in a new spec (and some might argue it's best not to).

As far as QEMU goes, renaming this property to "numa-associativity-mode"
(just an example) is a quick change to do since we separated FORM1 and FORM2
code over there.

Doing such a rename can also help with the issue of having to describe new
FORM2 semantics using "least significant boundary" or "primary domain" or
any FORM0|FORM1 related terminology.



It is not just changing the name, we will then have to explain the
meaning of ibm,associativity-reference-points with FORM2 right?


H why? My idea over there was to add a new property that indicates that
resource might have a different NUMA affinity based on the mode of operation
(like PMEM), and get rid of ibm,associativity-reference-points altogether.

The NUMA distances already express the topology. Smaller distances indicate
closer proximity, larger distances indicate the opposite. Having
"associativity-reference-points" reflect an associativity domain
relationship, when you already have all the distances from each node, is
somewhat redundant.

The concept of 'associativity domain' was necessary in FORM1 because we had no
other way of telling the distance between NUMA nodes. We needed to rely on these
overly complex and convoluted subdomain abstractions to say that "nodeA belongs
to the same third-level domain as node B, and in the second-level domain with
node C". The kernel would read that and calculate that each level is doubling
the distance from the level before and local_distance is 10, so:

distAA = 10  distAB = 20  distAC = 40

With FORM2, if this information is already explicit in ibm,numa-distance-table,
why bother calculating associativity domains? 
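
To make the FORM1 arithmetic above concrete, here is a minimal stand-alone sketch of the "double the distance per mismatching level" rule. It is an illustration under assumptions, not the kernel code: form1_distance() and the node_a/node_b/node_c arrays are made up for this example, and each array holds a node's domainIDs ordered from the closest grouping to the widest one.

```c
#include <assert.h>

#define LOCAL_DISTANCE 10

/*
 * Hypothetical per-node domainIDs at two reference levels, ordered from
 * the closest grouping to the widest one. A and B differ only at the
 * first level; A and C differ at both levels.
 */
static const int node_a[] = { 0, 0 };
static const int node_b[] = { 1, 0 };
static const int node_c[] = { 2, 1 };

/*
 * Start at LOCAL_DISTANCE and double it for every level at which the two
 * nodes are in different domains, stopping at the first shared domain.
 */
static int form1_distance(const int a[], const int b[], int depth)
{
	int distance = LOCAL_DISTANCE;
	int i;

	for (i = 0; i < depth; i++) {
		if (a[i] == b[i])
			break;
		distance *= 2;
	}
	return distance;
}
```

With these made-up domainIDs, form1_distance(node_a, node_a, 2) gives 10, node_a against node_b gives 20, and node_a against node_c gives 40, matching the distAA/distAB/distAC figures quoted above.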

[powerpc:topic/ppc-kvm] BUILD SUCCESS fae5c9f3664ba278137e54a2083b39b90c64093a

2021-06-17 Thread kernel test robot
tree/branch: https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git topic/ppc-kvm
branch HEAD: fae5c9f3664ba278137e54a2083b39b90c64093a  KVM: PPC: Book3S HV: remove ISA v3.0 and v3.1 support from P7/8 path

elapsed time: 723m

configs tested: 75
configs skipped: 97

The following configs have been built successfully.
More configs may be tested in the coming days.

gcc tested configs:
arm      defconfig
arm64    allyesconfig
arm64    defconfig
arm      allyesconfig
arm      allmodconfig
sh       se7750_defconfig
arc      hsdk_defconfig
openrisc simple_smp_defconfig
arm      tct_hammer_defconfig
arm      s5pv210_defconfig
arm      keystone_defconfig
m68k     multi_defconfig
ia64     allmodconfig
ia64     defconfig
ia64     allyesconfig
x86_64   allnoconfig
m68k     allmodconfig
m68k     defconfig
m68k     allyesconfig
nios2    defconfig
arc      allyesconfig
nds32    allnoconfig
nds32    defconfig
nios2    allyesconfig
csky     defconfig
alpha    defconfig
alpha    allyesconfig
parisc   defconfig
s390     allyesconfig
s390     allmodconfig
parisc   allyesconfig
s390     defconfig
i386     allyesconfig
sparc    allyesconfig
sparc    defconfig
i386     defconfig
mips     allyesconfig
mips     allmodconfig
powerpc  allyesconfig
powerpc  allmodconfig
powerpc  allnoconfig
i386 randconfig-a002-20210617
i386 randconfig-a006-20210617
i386 randconfig-a001-20210617
i386 randconfig-a004-20210617
i386 randconfig-a005-20210617
i386 randconfig-a003-20210617
i386 randconfig-a015-20210617
i386 randconfig-a013-20210617
i386 randconfig-a016-20210617
i386 randconfig-a012-20210617
i386 randconfig-a014-20210617
i386 randconfig-a011-20210617
x86_64   randconfig-a004-20210617
x86_64   randconfig-a001-20210617
x86_64   randconfig-a002-20210617
x86_64   randconfig-a003-20210617
x86_64   randconfig-a006-20210617
x86_64   randconfig-a005-20210617
um       x86_64_defconfig
um       i386_defconfig
um       kunit_defconfig
x86_64   allyesconfig
x86_64   rhel-8.3-kselftests
x86_64   defconfig
x86_64   rhel-8.3
x86_64   rhel-8.3-kbuiltin
x86_64   kexec

clang tested configs:
x86_64   randconfig-b001-20210617
x86_64   randconfig-a015-20210617
x86_64   randconfig-a011-20210617
x86_64   randconfig-a014-20210617
x86_64   randconfig-a012-20210617
x86_64   randconfig-a013-20210617
x86_64   randconfig-a016-20210617

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org


[PATCH] powerpc/perf: Fix crash with 'perf_instruction_pointer' when pmu is not set

2021-06-17 Thread Athira Rajeev
On systems without any specific PMU driver support registered, running
perf record causes an Oops.

The relevant portion from call trace:

BUG: Kernel NULL pointer dereference on read at 0x0040
Faulting instruction address: 0xc0021f0c
Oops: Kernel access of bad area, sig: 11 [#1]
BE PAGE_SIZE=4K PREEMPT CMPCPRO
SAF3000 DIE NOTIFICATION
CPU: 0 PID: 442 Comm: null_syscall Not tainted 5.13.0-rc6-s3k-dev-01645-g7649ee3d2957 #5164
NIP:  c0021f0c LR: c00e8ad8 CTR: c00d8a5c
NIP [c0021f0c] perf_instruction_pointer+0x10/0x60
LR [c00e8ad8] perf_prepare_sample+0x344/0x674
Call Trace:
[e6775880] [c00e8810] perf_prepare_sample+0x7c/0x674 (unreliable)
[e67758c0] [c00e8e44] perf_event_output_forward+0x3c/0x94
[e6775910] [c00dea8c] __perf_event_overflow+0x74/0x14c
[e6775930] [c00dec5c] perf_swevent_hrtimer+0xf8/0x170
[e6775a40] [c008c8d0] __hrtimer_run_queues.constprop.0+0x160/0x318
[e6775a90] [c008d94c] hrtimer_interrupt+0x148/0x3b0
[e6775ae0] [c000c0c0] timer_interrupt+0xc4/0x22c
[e6775b10] [c00046f0] Decrementer_virt+0xb8/0xbc

During a perf record session, perf_instruction_pointer() is called to
capture the sample ip. This function in core-book3s accesses ppmu->flags.
If a platform specific PMU driver is not registered, ppmu is set to NULL
and accessing its members results in a crash. Fix this crash by checking
if ppmu is set.

Fixes: 2ca13a4cc56c ("powerpc/perf: Use regs->nip when SIAR is zero")
[ Including stable for kernel versions 5.11 and 5.12 ]
Cc: sta...@vger.kernel.org
Signed-off-by: Athira Rajeev 
Reported-by: Christophe Leroy 
Tested-by: Christophe Leroy 
---
 arch/powerpc/perf/core-book3s.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c
index 16d4d1b..5162241 100644
--- a/arch/powerpc/perf/core-book3s.c
+++ b/arch/powerpc/perf/core-book3s.c
@@ -2254,7 +2254,7 @@ unsigned long perf_instruction_pointer(struct pt_regs *regs)
bool use_siar = regs_use_siar(regs);
unsigned long siar = mfspr(SPRN_SIAR);
 
-   if (ppmu->flags & PPMU_P10_DD1) {
+   if (ppmu && (ppmu->flags & PPMU_P10_DD1)) {
if (siar)
return siar;
else
-- 
1.8.3.1



Re: Oops (NULL pointer) with 'perf record' of selftest 'null_syscall'

2021-06-17 Thread Athira Rajeev



> On 17-Jun-2021, at 10:05 PM, Christophe Leroy  
> wrote:
> 
> 
> 
> Le 17/06/2021 à 08:36, Athira Rajeev a écrit :
>>> On 16-Jun-2021, at 11:56 AM, Christophe Leroy  
>>> wrote:
>>> 
>>> 
>>> 
>>> Le 16/06/2021 à 05:40, Athira Rajeev a écrit :
> On 16-Jun-2021, at 8:53 AM, Madhavan Srinivasan  
> wrote:
> 
> 
> On 6/15/21 8:35 PM, Christophe Leroy wrote:
>> For your information, I'm getting the following Oops. Detected with 
>> 5.13-rc6, it also oopses on 5.12 and 5.11.
>> Runs ok on 5.10. I'm starting bisecting now.
> 
> 
> Thanks for reporting, got the issue. What has happened in this case is
> that the pmu device is not registered and the code tries to access the
> instruction pointer, which lands in perf_instruction_pointer(). I
> recently added a workaround patch for power10 DD1, which has caused this
> breakage. My bad. We are working on a fix patch for the same and will
> post it out. Sorry again.
> 
 Hi Christophe,
 Can you please try with below patch in your environment and test if it 
 works for you.
 From 55d3afc9369dfbe28a7152c8e9f856c11c7fe43d Mon Sep 17 00:00:00 2001
 From: Athira Rajeev 
 Date: Tue, 15 Jun 2021 22:28:11 -0400
 Subject: [PATCH] powerpc/perf: Fix crash with 'perf_instruction_pointer' when
 pmu is not set
 On systems without any specific PMU driver support registered, running
 perf record causes oops:
 [   38.841073] NIP [c013af54] perf_instruction_pointer+0x24/0x100
 [   38.841079] LR [c03c7358] perf_prepare_sample+0x4e8/0x820
 [   38.841085] --- interrupt: 300
 [   38.841088] [c0001cf03440] [c03c6ef8] perf_prepare_sample+0x88/0x820 (unreliable)
 [   38.841096] [c0001cf034a0] [c03c76d0] perf_event_output_forward+0x40/0xc0
 [   38.841104] [c0001cf03520] [c03b45e8] __perf_event_overflow+0x88/0x1b0
 [   38.841112] [c0001cf03570] [c03b480c] perf_swevent_hrtimer+0xfc/0x1a0
 [   38.841119] [c0001cf03740] [c02399cc] __hrtimer_run_queues+0x17c/0x380
 [   38.841127] [c0001cf037c0] [c023a5f8] hrtimer_interrupt+0x128/0x2f0
 [   38.841135] [c0001cf03870] [c002962c] timer_interrupt+0x13c/0x370
 [   38.841143] [c0001cf038d0] [c0009ba4] decrementer_common_virt+0x1a4/0x1b0
 [   38.841151] --- interrupt: 900 at copypage_power7+0xd4/0x1c0
 During perf record session, perf_instruction_pointer() is called to
 capture the sample ip. This function in core-book3s accesses ppmu->flags.
 If a platform specific PMU driver is not registered, ppmu is set to NULL
 and accessing its members results in a crash. Fix this crash by checking
 if ppmu is set.
 Signed-off-by: Athira Rajeev 
 Reported-by: Christophe Leroy 
>>> 
>>> Fixes: 2ca13a4cc56c ("powerpc/perf: Use regs->nip when SIAR is zero")
>>> Cc: sta...@vger.kernel.org
>>> Tested-by: Christophe Leroy 
>> Hi Christophe,
>> Thanks for testing with the change. I have a newer version where I have 
>> added braces around the check.
>> Can you please check once and can I add your tested-by for the below patch.
> 
> Yes it works, you can add my Tested-by:
> Please also add Cc: sta...@vger.kernel.org, this needs to be backported as 
> soon as possible.

Sure Christophe, will add Cc also. Thanks for testing.

Athira
> 
> Thanks
> Christophe



Re: [PATCH 11/11] powerpc/microwatt: Disable interrupts in boot wrapper main program

2021-06-17 Thread Segher Boessenkool
On Thu, Jun 17, 2021 at 11:40:23AM +1000, Nicholas Piggin wrote:
> Excerpts from Segher Boessenkool's message of June 17, 2021 9:37 am:
> > On Tue, Jun 15, 2021 at 09:05:27AM +1000, Paul Mackerras wrote:
> >> This ensures that we don't get a decrementer interrupt arriving before
> >> we have set up a handler for it.
> > 
> > Maybe add a comment saying this is setting MSR[EE]=0 for that?  Or do
> > other bits here matter as well?
> 
> Hmm, it actually clears MSR[RI] as well.
> 
> __hard_irq_disable() is what we want here, unless the MSR[RI] clearing 
> is required as well, in which case there is __hard_EE_RI_disable().

I don't think it matters if MSR[RI] is set or not here, nothing will try
to recover from an actual reboot I hope :-)


Segher


[PATCH v4 7/7] powerpc/pseries: Add support for FORM2 associativity

2021-06-17 Thread Aneesh Kumar K.V
PAPR interface currently supports two different ways of communicating resource
grouping details to the OS. These are referred to as Form 0 and Form 1
associativity grouping. Form 0 is the older format and is now considered
deprecated. This patch adds another resource grouping named FORM2.

Signed-off-by: Daniel Henrique Barboza 
Signed-off-by: Aneesh Kumar K.V 
---
 Documentation/powerpc/associativity.rst   | 135 
 arch/powerpc/include/asm/firmware.h       |   3 +-
 arch/powerpc/include/asm/prom.h           |   1 +
 arch/powerpc/kernel/prom_init.c           |   3 +-
 arch/powerpc/mm/numa.c                    | 149 +-
 arch/powerpc/platforms/pseries/firmware.c |   1 +
 6 files changed, 286 insertions(+), 6 deletions(-)
 create mode 100644 Documentation/powerpc/associativity.rst

diff --git a/Documentation/powerpc/associativity.rst b/Documentation/powerpc/associativity.rst
new file mode 100644
index 000000000000..93be604ac54d
--- /dev/null
+++ b/Documentation/powerpc/associativity.rst
@@ -0,0 +1,135 @@
+
+NUMA resource associativity
+===========================
+
+Associativity represents the groupings of the various platform resources into
+domains of substantially similar mean performance relative to resources outside
+of that domain. Resource subsets of a given domain that exhibit better
+performance relative to each other than relative to other resource subsets
+are represented as being members of a sub-grouping domain. This performance
+characteristic is presented in terms of NUMA node distance within the Linux kernel.
+From the platform view, these groups are also referred to as domains.
+
+PAPR interface currently supports different ways of communicating these resource
+grouping details to the OS. These are referred to as Form 0, Form 1 and Form 2
+associativity grouping. Form 0 is the older format and is now considered deprecated.
+
+Hypervisor indicates the type/form of associativity used via the "ibm,architecture-vec-5" property.
+Bit 0 of byte 5 in the "ibm,architecture-vec-5" property indicates usage of Form 0 or Form 1.
+A value of 1 indicates the usage of Form 1 associativity. For Form 2 associativity
+bit 2 of byte 5 in the "ibm,architecture-vec-5" property is used.
+Form 0
+------
+Form 0 associativity supports only two NUMA distances (LOCAL and REMOTE).
+
+Form 1
+------
+With Form 1 a combination of ibm,associativity-reference-points and ibm,associativity
+device tree properties are used to determine the NUMA distance between resource groups/domains.
+
+The “ibm,associativity” property contains one or more lists of numbers (domainID)
+representing the resource’s platform grouping domains.
+
+The “ibm,associativity-reference-points” property contains one or more lists of numbers
+(domainID index) that represent the 1-based ordinal in the associativity lists.
+The list of domainID indexes represents an increasing hierarchy of resource grouping.
+
+ex:
+{ primary domainID index, secondary domainID index, tertiary domainID index.. }
+
+Linux kernel uses the domainID at the primary domainID index as the NUMA node id.
+Linux kernel computes NUMA distance between two domains by recursively comparing
+if they belong to the same higher-level domains. For mismatch at every higher
+level of the resource group, the kernel doubles the NUMA distance between the
+comparing domains.
+
+Form 2
+------
+Form 2 associativity format adds separate device tree properties representing NUMA node distance
+thereby making the node distance computation flexible. Form 2 also allows flexible primary
+domain numbering. With numa distance computation now detached from the index value of
+"ibm,associativity" property, Form 2 allows a large number of primary domain ids at the
+same domainID index representing resource groups of different performance/latency characteristics.
+
+Hypervisor indicates the usage of FORM2 associativity using bit 2 of byte 5 in the
+"ibm,architecture-vec-5" property.
+
+"ibm,numa-lookup-index-table" property contains one or more lists of numbers representing
+the domainIDs present in the system. The offset of the domainID in this property is considered
+the domainID index.
+
+prop-encoded-array: The number N of the domainIDs encoded as with encode-int, followed by
+N domainID encoded as with encode-int
+
+For ex:
+ibm,numa-lookup-index-table =  {4, 0, 8, 250, 252}, domainID index for domainID 8 is 1.
+
+"ibm,numa-distance-table" property contains one or more lists of numbers representing the NUMA
+distance between resource groups/domains present in the system.
+
+prop-encoded-array: The number N of the distance values encoded as with encode-int, followed by
+N distance values encoded as with encode-bytes. The max distance value we could encode is 255.
+
+For ex:
+ibm,numa-lookup-index-table =  {3, 0, 8, 40}
+ibm,numa-distance-table =  {9, 10, 20, 80, 20, 10, 160, 80, 160, 10}
+
+  | 0    8   40
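
As an aside for readers of the archive, the two example properties above can be read with a lookup like the following. This is a hedged sketch rather than the kernel's parser: domain_index() and form2_distance() are hypothetical helpers, the leading count of the distance table (9) is dropped from the array, and the table is assumed to be stored row by row in domainID-index order.

```c
#include <assert.h>

/* Example "ibm,numa-lookup-index-table": count N first, then N domainIDs. */
static const unsigned int lookup_index_table[] = { 3, 0, 8, 40 };

/*
 * Example "ibm,numa-distance-table" values without the leading count (9),
 * assumed to be laid out row by row in domainID-index order.
 */
static const unsigned char distance_table[] = {
	10,  20,  80,
	20,  10, 160,
	80, 160,  10,
};

/* Return the domainID index of domain_id, or -1 if it is not listed. */
static int domain_index(unsigned int domain_id)
{
	unsigned int n = lookup_index_table[0];
	unsigned int i;

	for (i = 0; i < n; i++)
		if (lookup_index_table[i + 1] == domain_id)
			return (int)i;
	return -1;
}

/* Distance between two domainIDs, or -1 if either one is unknown. */
static int form2_distance(unsigned int from, unsigned int to)
{
	int n = (int)lookup_index_table[0];
	int row = domain_index(from);
	int col = domain_index(to);

	if (row < 0 || col < 0)
		return -1;
	return distance_table[row * n + col];
}
```

Under these assumptions, form2_distance(0, 8) is 20 and form2_distance(8, 40) is 160. No associativity-domain comparison is involved, which is the flexibility the Form 2 text describes.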

[PATCH v4 6/7] powerpc/pseries: Add a helper for form1 cpu distance

2021-06-17 Thread Aneesh Kumar K.V
This helper is only used with the dispatch trace log collection.
A later patch will add Form2 affinity support and this change helps
in keeping that simpler. Also add a comment explaining we don't expect
the code to be called with FORM0.

Reviewed-by: David Gibson 
Signed-off-by: Aneesh Kumar K.V 
---
 arch/powerpc/mm/numa.c | 10 +-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/mm/numa.c b/arch/powerpc/mm/numa.c
index c481f08d565b..d32729f235b8 100644
--- a/arch/powerpc/mm/numa.c
+++ b/arch/powerpc/mm/numa.c
@@ -166,7 +166,7 @@ static void unmap_cpu_from_node(unsigned long cpu)
 }
 #endif /* CONFIG_HOTPLUG_CPU || CONFIG_PPC_SPLPAR */
 
-int cpu_distance(__be32 *cpu1_assoc, __be32 *cpu2_assoc)
+static int __cpu_form1_distance(__be32 *cpu1_assoc, __be32 *cpu2_assoc)
 {
int dist = 0;
 
@@ -182,6 +182,14 @@ int cpu_distance(__be32 *cpu1_assoc, __be32 *cpu2_assoc)
return dist;
 }
 
+int cpu_distance(__be32 *cpu1_assoc, __be32 *cpu2_assoc)
+{
+   /* We should not get called with FORM0 */
+   VM_WARN_ON(affinity_form == FORM0_AFFINITY);
+
+   return __cpu_form1_distance(cpu1_assoc, cpu2_assoc);
+}
+
 /* must hold reference to node during call */
 static const __be32 *of_get_associativity(struct device_node *dev)
 {
-- 
2.31.1
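
The hunks above do not show the body of __cpu_form1_distance(). As a rough userspace sketch only, and assuming that the Form 1 CPU distance is the number of reference-point levels that differ before the first level at which both associativity arrays agree, the idea could look like the following. The function name, the plain host-endian uint32_t arrays (instead of __be32) and the explicit depth parameter are all assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical sketch: walk the reference points in order and count
 * how many levels differ before the two arrays first match.
 * ref_points[i] is an index into the associativity arrays, whose
 * element 0 is assumed to hold the array length.
 */
int form1_distance_sketch(const uint32_t *cpu1_assoc,
			  const uint32_t *cpu2_assoc,
			  const uint32_t *ref_points, size_t depth)
{
	int dist = 0;

	for (size_t i = 0; i < depth; i++) {
		uint32_t index = ref_points[i];

		if (cpu1_assoc[index] == cpu2_assoc[index])
			break;	/* same domain at this level: stop counting */
		dist++;
	}
	return dist;
}
```

Under this assumption, two CPUs that share the domain at the first reference point get distance 0, and each additional mismatching level adds one.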



[PATCH v4 5/7] powerpc/pseries: Consolidate NUMA distance update during boot

2021-06-17 Thread Aneesh Kumar K.V
Instead of updating NUMA distance every time we look up a node id
from the associativity property, add helpers that can be used
during boot and do this only once. Also remove the distance
update from the node id lookup helpers.

Signed-off-by: Aneesh Kumar K.V 
---
 arch/powerpc/mm/numa.c | 135 +++--
 1 file changed, 88 insertions(+), 47 deletions(-)

diff --git a/arch/powerpc/mm/numa.c b/arch/powerpc/mm/numa.c
index 645a95e3a7ea..c481f08d565b 100644
--- a/arch/powerpc/mm/numa.c
+++ b/arch/powerpc/mm/numa.c
@@ -208,22 +208,6 @@ int __node_distance(int a, int b)
 }
 EXPORT_SYMBOL(__node_distance);
 
-static void initialize_distance_lookup_table(int nid,
-   const __be32 *associativity)
-{
-   int i;
-
-   if (affinity_form != FORM1_AFFINITY)
-   return;
-
-   for (i = 0; i < max_associativity_domain_index; i++) {
-   const __be32 *entry;
-
-   entry = &associativity[be32_to_cpu(distance_ref_points[i]) - 1];
-   distance_lookup_table[nid][i] = of_read_number(entry, 1);
-   }
-}
-
 /*
  * Returns nid in the range [0..nr_node_ids], or -1 if no useful NUMA
  * info is found.
@@ -241,15 +225,6 @@ static int associativity_to_nid(const __be32 *associativity)
/* POWER4 LPAR uses 0xffff as invalid node */
if (nid == 0xffff || nid >= nr_node_ids)
nid = NUMA_NO_NODE;
-
-   if (nid > 0 &&
-   of_read_number(associativity, 1) >= max_associativity_domain_index) {
-   /*
-* Skip the length field and send start of associativity array
-*/
-   initialize_distance_lookup_table(nid, associativity + 1);
-   }
-
 out:
return nid;
 }
@@ -291,10 +266,13 @@ static void __initialize_form1_numa_distance(const __be32 *associativity)
 {
int i, nid;
 
+   if (affinity_form != FORM1_AFFINITY)
+   return;
+
if (of_read_number(associativity, 1) >= primary_domain_index) {
nid = of_read_number(&associativity[primary_domain_index], 1);
 
-   for (i = 0; i < max_domain_index; i++) {
+   for (i = 0; i < max_associativity_domain_index; i++) {
const __be32 *entry;
 
entry = &associativity[be32_to_cpu(distance_ref_points[i])];
@@ -474,6 +452,48 @@ static int of_get_assoc_arrays(struct assoc_arrays *aa)
return 0;
 }
 
+static int get_nid_and_numa_distance(struct drmem_lmb *lmb)
+{
+   struct assoc_arrays aa = { .arrays = NULL };
+   int default_nid = NUMA_NO_NODE;
+   int nid = default_nid;
+   int rc, index;
+
+   if ((primary_domain_index < 0) || !numa_enabled)
+   return default_nid;
+
+   rc = of_get_assoc_arrays(&aa);
+   if (rc)
+   return default_nid;
+
+   if (primary_domain_index <= aa.array_sz &&
+   !(lmb->flags & DRCONF_MEM_AI_INVALID) && lmb->aa_index < aa.n_arrays) {
+   index = lmb->aa_index * aa.array_sz + primary_domain_index - 1;
+   nid = of_read_number(&aa.arrays[index], 1);
+
+   if (nid == 0xffff || nid >= nr_node_ids)
+   nid = default_nid;
+   if (nid > 0 && affinity_form == FORM1_AFFINITY) {
+   int i;
+   const __be32 *associativity;
+
+   index = lmb->aa_index * aa.array_sz;
+   associativity = &aa.arrays[index];
+   /*
+* lookup array associativity entries have different format
+* There is no length of the array as the first element.
+*/
+   for (i = 0; i < max_associativity_domain_index; i++) {
+   const __be32 *entry;
+
+   entry = &associativity[be32_to_cpu(distance_ref_points[i]) - 1];
+   distance_lookup_table[nid][i] = of_read_number(entry, 1);
+   }
+   }
+   }
+   return nid;
+}
+
 /*
  * This is like of_node_to_nid_single() for memory represented in the
  * ibm,dynamic-reconfiguration-memory node.
@@ -499,21 +519,14 @@ int of_drconf_to_nid_single(struct drmem_lmb *lmb)
 
if (nid == 0xffff || nid >= nr_node_ids)
nid = default_nid;
-
-   if (nid > 0) {
-   index = lmb->aa_index * aa.array_sz;
-   initialize_distance_lookup_table(nid,
-   &aa.arrays[index]);
-   }
}
-
return nid;
 }
 
 #ifdef CONFIG_PPC_SPLPAR
-static int vphn_get_nid(long lcpu)
+
+static int __vphn_get_associativity(long lcpu, __be32 *associativity)
 {
-   __be32 associativity[VPHN_ASSOC_BUFSIZE] = {0};
long rc, hwid;
 
/*
@@ -533,10 +546,22 @@ static int vphn_get_nid(long lcpu)
 
rc = hcall_vphn(hwid, VPHN_FLAG_VCPU, associativity);
 

[PATCH v4 4/7] powerpc/pseries: Consolidate DLPAR NUMA distance update

2021-06-17 Thread Aneesh Kumar K.V
The associativity details of the newly added resources are collected from
the hypervisor via the "ibm,configure-connector" rtas call. Update the numa
distance details of the newly added numa node after the above call. In a
later patch we will remove updating NUMA distance when we are looking
for node id from associativity array.

Signed-off-by: Aneesh Kumar K.V 
---
 arch/powerpc/mm/numa.c| 41 +++
 arch/powerpc/platforms/pseries/hotplug-cpu.c  |  2 +
 .../platforms/pseries/hotplug-memory.c|  2 +
 arch/powerpc/platforms/pseries/pseries.h  |  1 +
 4 files changed, 46 insertions(+)

diff --git a/arch/powerpc/mm/numa.c b/arch/powerpc/mm/numa.c
index 0ec16999beef..645a95e3a7ea 100644
--- a/arch/powerpc/mm/numa.c
+++ b/arch/powerpc/mm/numa.c
@@ -287,6 +287,47 @@ int of_node_to_nid(struct device_node *device)
 }
 EXPORT_SYMBOL(of_node_to_nid);
 
+static void __initialize_form1_numa_distance(const __be32 *associativity)
+{
+   int i, nid;
+
+   if (of_read_number(associativity, 1) >= primary_domain_index) {
+   nid = of_read_number(&associativity[primary_domain_index], 1);
+
+   for (i = 0; i < max_domain_index; i++) {
+   const __be32 *entry;
+
+   entry = &associativity[be32_to_cpu(distance_ref_points[i])];
+   distance_lookup_table[nid][i] = of_read_number(entry, 1);
+   }
+   }
+}
+
+static void initialize_form1_numa_distance(struct device_node *node)
+{
+   const __be32 *associativity;
+
+   associativity = of_get_associativity(node);
+   if (!associativity)
+   return;
+
+   __initialize_form1_numa_distance(associativity);
+   return;
+}
+
+/*
+ * Used to update distance information w.r.t newly added node.
+ */
+void update_numa_distance(struct device_node *node)
+{
+   if (affinity_form == FORM0_AFFINITY)
+   return;
+   else if (affinity_form == FORM1_AFFINITY) {
+   initialize_form1_numa_distance(node);
+   return;
+   }
+}
+
 static int __init find_primary_domain_index(void)
 {
int index;
diff --git a/arch/powerpc/platforms/pseries/hotplug-cpu.c b/arch/powerpc/platforms/pseries/hotplug-cpu.c
index 7e970f81d8ff..778b6ab35f0d 100644
--- a/arch/powerpc/platforms/pseries/hotplug-cpu.c
+++ b/arch/powerpc/platforms/pseries/hotplug-cpu.c
@@ -498,6 +498,8 @@ static ssize_t dlpar_cpu_add(u32 drc_index)
return saved_rc;
}
 
+   update_numa_distance(dn);
+
rc = dlpar_online_cpu(dn);
if (rc) {
saved_rc = rc;
diff --git a/arch/powerpc/platforms/pseries/hotplug-memory.c b/arch/powerpc/platforms/pseries/hotplug-memory.c
index 8377f1f7c78e..0e602c3b01ea 100644
--- a/arch/powerpc/platforms/pseries/hotplug-memory.c
+++ b/arch/powerpc/platforms/pseries/hotplug-memory.c
@@ -180,6 +180,8 @@ static int update_lmb_associativity_index(struct drmem_lmb *lmb)
return -ENODEV;
}
 
+   update_numa_distance(lmb_node);
+
dr_node = of_find_node_by_path("/ibm,dynamic-reconfiguration-memory");
if (!dr_node) {
dlpar_free_cc_nodes(lmb_node);
diff --git a/arch/powerpc/platforms/pseries/pseries.h b/arch/powerpc/platforms/pseries/pseries.h
index 1f051a786fb3..663a0859cf13 100644
--- a/arch/powerpc/platforms/pseries/pseries.h
+++ b/arch/powerpc/platforms/pseries/pseries.h
@@ -113,4 +113,5 @@ extern u32 pseries_security_flavor;
 void pseries_setup_security_mitigations(void);
 void pseries_lpar_read_hblkrm_characteristics(void);
 
+void update_numa_distance(struct device_node *node);
 #endif /* _PSERIES_PSERIES_H */
-- 
2.31.1


