Re: [PATCH v3 2/2] arch/*/io.h: remove ioremap_uc in some architectures

2023-03-06 Thread Arnd Bergmann
On Tue, Mar 7, 2023, at 02:30, Baoquan He wrote:
> On 03/07/23 at 11:58am, Michael Ellerman wrote:
>> "Arnd Bergmann"  writes:
>> > On Sun, Mar 5, 2023, at 10:29, Geert Uytterhoeven wrote:
>> >> On Sun, Mar 5, 2023 at 10:23 AM Michael Ellerman  
>> >> wrote:
>> >>> Maybe that exact code path is only reachable on x86/ia64? But if so
>> >>> please explain why.
>> >>>
>> >>> Otherwise it looks like this series could break that driver on powerpc
>> >>> at least.
>> >>
>> >> Indeed.
>> >
>> > When I last looked into this, I sent a patch to use ioremap()
>> > on non-x86:
>> >
>> > https://lore.kernel.org/all/2019192258.2234502-1-a...@arndb.de/
>> 
>> OK thanks.
>> 
>> Baoquan can you add that patch to the start of this series if/when you
>> post the next version?
>
> Sure, will do. I'm wondering whether Arnd's patch needs a change to cover
> powerpc as well as x86 and ia64, as you and Geert pointed out.

The patch fixes the aty driver for all architectures, including the
ones that were already broken before your series with the 'return NULL'
version.

The only other callers of ioremap_uc() and devm_ioremap_uc() are
in architecture specific code and in drivers/mfd/intel-lpss.c, which
is x86 specific.

 Arnd


Re: [PATCH 2/2] selftests/powerpc/pmu: fix including of utils.h when event.h is included

2023-03-06 Thread Madhavan Srinivasan



On 3/2/23 8:49 AM, Madhavan Srinivasan wrote:


On 3/2/23 3:35 AM, Benjamin Gray wrote:

On Wed, 2023-03-01 at 22:39 +0530, Kajol Jain wrote:

From: Madhavan Srinivasan 

The event.h header already includes utils.h. Avoid including
it explicitly in the code when event.h is included.

Signed-off-by: Madhavan Srinivasan 

As I understand, transitive includes should not be depended upon. If
you use a thing, and the thing is declared in a header, you should
include _that_ header. Anything else is a recipe for weird include
dependencies, ordering of the includes, etc.

These files all use FAIL_IF, etc., which are declared in utils.h. So
utils.h is a legitimate include. The fact that events.h also includes
it (for u64) is a coincidence. If the u64 type def gets moved to, e.g.,
types.h, and utils.h is removed from events.h, suddenly all these files
stop compiling.


Thanks for the review. IIUC utils.h also carries some test harness
function declarations, and some of these tests do not use the type defs
anyway. I should have written a better commit message, my bad. But I will
try out the suggested case.
Yeah, the "utils.h" includes in the testcase files are for the test_harness
declarations.

So we could get the typedef moved out of utils.h. Good catch. Thanks.
Kajol, kindly drop this patch.

Maddy


Maddy


Re: [RFC PATCH 07/13] powerpc/dexcr: Add sysctl entry for SBHE system override

2023-03-06 Thread Benjamin Gray
On Tue, 2023-03-07 at 15:30 +1000, Nicholas Piggin wrote:
> On Mon Nov 28, 2022 at 12:44 PM AEST, Benjamin Gray wrote:
> > The DEXCR Speculative Branch Hint Enable (SBHE) aspect controls
> > whether
> > the hints provided by BO field of Branch instructions are obeyed
> > during
> > speculative execution.
> > 
> > SBHE behaviour per ISA 3.1B:
> > 
> > 0:  The hints provided by BO field of Branch instructions may
> > be
> > ignored during speculative execution
> > 
> > 1:  The hints provided by BO field of Branch instructions are
> > obeyed
> > during speculative execution
> > 
> > Add a sysctl entry to allow changing this aspect globally in the
> > system
> > at runtime:
> > 
> > /proc/sys/kernel/speculative_branch_hint_enable
> > 
> > Three values are supported:
> > 
> > -1: Disable DEXCR SBHE sysctl override
> >  0: Override and set DEXCR[SBHE] aspect to 0
> >  1: Override and set DEXCR[SBHE] aspect to 1
> > 
> > Internally, introduces a mechanism to apply arbitrary system wide
> > overrides on top of the prctl() config.
> 
> Why have an override for this, and not others?
> 

Should be in the commit message of course, but this aspect bleeds over
to other processes, so a user may wish to prevent all processes from
changing the value. The other aspects are probably only relevant to
their own process, though the implementation here should support
arbitrary system wide overrides.
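
A minimal sketch of how the tri-state override described above could combine with the per-process prctl() setting. The names sbhe_override and effective_sbhe() are hypothetical and not taken from the patch; only the -1/0/1 semantics come from the commit message.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical tri-state mirroring the sysctl values in the commit message. */
enum sbhe_override {
	SBHE_OVERRIDE_NONE  = -1, /* no override: use the per-process setting */
	SBHE_OVERRIDE_CLEAR = 0,  /* force DEXCR[SBHE] to 0 for every process */
	SBHE_OVERRIDE_SET   = 1,  /* force DEXCR[SBHE] to 1 for every process */
};

/* Resolve the SBHE value a process would actually run with. */
static bool effective_sbhe(bool prctl_sbhe, int override)
{
	assert(override >= SBHE_OVERRIDE_NONE && override <= SBHE_OVERRIDE_SET);

	if (override == SBHE_OVERRIDE_NONE)
		return prctl_sbhe;
	return override == SBHE_OVERRIDE_SET;
}
```

With the override at -1 the process keeps whatever it requested; with 0 or 1 the request is ignored, which is the "prevent all processes from changing the value" case.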



Re: [RFC PATCH 13/13] Documentation: Document PowerPC kernel DEXCR interface

2023-03-06 Thread Benjamin Gray
On Tue, 2023-03-07 at 15:40 +1000, Nicholas Piggin wrote:
> Might need a bit more time and discussion on the API. Interestingly
> because the hashchk aspect is architectural, we may not be able to
> necessarily sanely enable that, because if it was disabled to start
> out with, our callchain up to the prctl call I think would have no
> return hashes set so we'd immediately fail on our first return.

I assumed it could eventually be supported in whatever startup wrapper
programs are built with, so either as one of the first things before
any calls, or the compiler could skip putting hash instructions in the
wrapper altogether. The ELF file itself might even be able to request
bits be enabled, so the kernel would start the process correctly.

As it is inherited, it's also possible for a wrapper program to set a
specific DEXCR before a child runs. E.g.,

fork();
prctl(...NPHIE...);
exec();

I hadn't thought of the prctl call itself causing an unbalanced hashchk
when it returns, but that should be solvable with an inline syscall.
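
The fork/prctl/exec wrapper idea above can be sketched roughly as follows. run_with_prctl() is a hypothetical helper, not part of the series. The prctl option and arguments are passed in by the caller, because the PR_PPC_SET_DEXCR number and flags in this RFC are not final and must not be hard-coded. The sketch also ignores the unbalanced hashchk problem mentioned above, which would need an inline syscall instead of the libc prctl() wrapper.

```c
#include <assert.h>
#include <sys/prctl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Fork, apply one prctl() in the child, then exec argv[0].
 * Returns the child's exit status, 126 if prctl() failed, 127 if the
 * exec failed, or -1 if fork()/waitpid() failed or the child was killed.
 */
static int run_with_prctl(int option, unsigned long arg2, unsigned long arg3,
			  char *const argv[])
{
	int status;
	pid_t pid;

	assert(argv && argv[0]);

	pid = fork();
	if (pid < 0)
		return -1;

	if (pid == 0) {
		/* Child: per the thread, the setting is inherited by the exec'd program. */
		if (prctl(option, arg2, arg3, 0UL, 0UL) < 0)
			_exit(126);
		execvp(argv[0], argv);
		_exit(127);
	}

	if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
		return -1;
	return WEXITSTATUS(status);
}
```

On a kernel carrying this RFC, the caller would pass the RFC's PR_PPC_SET_DEXCR option with the NPHIE aspect and a set action; any other valid prctl option can be used to try the helper elsewhere.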


Re: [RFC PATCH 13/13] Documentation: Document PowerPC kernel DEXCR interface

2023-03-06 Thread Nicholas Piggin
On Mon Nov 28, 2022 at 12:44 PM AEST, Benjamin Gray wrote:
> Describe the DEXCR and document how to interact with it via the
> prctl and sysctl interfaces.

Oh you've got the docs here, sorry. Thanks for that. I don't know enough
yet to give much useful feedback on the API. I think at least all the
mechanism stuff up to the prctl API looks pretty straightforward, so I would
like to get that merged if we can.

Might need a bit more time and discussion on the API. Interestingly
because the hashchk aspect is architectural, we may not be able to
necessarily sanely enable that, because if it was disabled to start
out with, our callchain up to the prctl call I think would have no
return hashes set so we'd immediately fail on our first return.

Thanks,
Nick



Re: [RFC PATCH 04/13] powerpc/dexcr: Support userspace ROP protection

2023-03-06 Thread Benjamin Gray
On Tue, 2023-03-07 at 15:05 +1000, Nicholas Piggin wrote:
> I think it is not quite per-process? I don't actually know how the
> user
> toolchain side is put together, but I'm thinking we can not give it a
> new
> salt on fork(), but we could on exec(). I think we could actually
> give
> each thread their own salt within a process too, right?

Yeah, the error case is we return further than we called in a given
execution context. A forked child may return after the fork, meaning it
needs the same key as the parent for the hashchk to work. Exec can get
a new key because we can't return with any existing hashes. I haven't
seen enough of kernel thread support to know if/how we can give threads
their own key. I believe they go through the fork() call that copies
the parent key currently.
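
As a toy model of the rule described above (same key across fork, fresh key on exec), assuming a hypothetical struct task_model that stands in for the hashkeyr field of thread_struct:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the hashkeyr field of thread_struct. */
struct task_model {
	uint64_t hashkeyr;
};

/*
 * fork(): the child may return through frames that were hashed with the
 * parent's key, so it has to keep the same key.
 */
static void model_fork(struct task_model *child, const struct task_model *parent)
{
	assert(child && parent);
	child->hashkeyr = parent->hashkeyr;
}

/*
 * exec(): no previously hashed frames survive, so a fresh key is safe.
 * The caller supplies the random value (get_random_long() in the patch).
 */
static void model_exec(struct task_model *task, uint64_t fresh_key)
{
	assert(task);
	task->hashkeyr = fresh_key;
}
```

Whether threads within one process could get their own key is left open here, as in the discussion.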


Re: [RFC PATCH 07/13] powerpc/dexcr: Add sysctl entry for SBHE system override

2023-03-06 Thread Nicholas Piggin
On Mon Nov 28, 2022 at 12:44 PM AEST, Benjamin Gray wrote:
> The DEXCR Speculative Branch Hint Enable (SBHE) aspect controls whether
> the hints provided by BO field of Branch instructions are obeyed during
> speculative execution.
>
> SBHE behaviour per ISA 3.1B:
>
> 0:The hints provided by BO field of Branch instructions may be
>   ignored during speculative execution
>
> 1:The hints provided by BO field of Branch instructions are obeyed
>   during speculative execution
>
> Add a sysctl entry to allow changing this aspect globally in the system
> at runtime:
>
>   /proc/sys/kernel/speculative_branch_hint_enable
>
> Three values are supported:
>
> -1:   Disable DEXCR SBHE sysctl override
>  0:   Override and set DEXCR[SBHE] aspect to 0
>  1:   Override and set DEXCR[SBHE] aspect to 1
>
> Internally, introduces a mechanism to apply arbitrary system wide
> overrides on top of the prctl() config.

Why have an override for this, and not others?

Thanks,
Nick



Re: [RFC PATCH 06/13] powerpc/dexcr: Add prctl implementation

2023-03-06 Thread Nicholas Piggin
On Mon Nov 28, 2022 at 12:44 PM AEST, Benjamin Gray wrote:
> Adds an initial prctl interface implementation. Unprivileged processes
> can query the current prctl setting, including whether an aspect is
> implemented by the hardware or is permitted to be modified by a setter
> prctl. Editable aspects can be changed by a CAP_SYS_ADMIN privileged
> process.
>
> The prctl setting represents what the process itself has requested, and
> does not account for any overrides. Either the kernel or a hypervisor
> may enforce a different setting for an aspect.
>
> Userspace can access a readonly view of the current DEXCR via SPR 812,
> and a readonly view of the aspects enforced by the hypervisor via
> SPR 455. A bitwise OR of these two SPRs will give the effective
> DEXCR aspect state of the process.
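
A small sketch of the bitwise OR described in the quoted text. The two register values are taken as parameters, since reading SPR 812 and SPR 455 needs powerpc-specific mfspr; the helper names are hypothetical. The mask assumes aspect n sits at bit (31 - n), which is how DEXCR_PRO_MASK() expands if __MASK(x) is (1UL << (x)).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Problem-state mask for an aspect: aspect n lives at bit (31 - n). */
static uint64_t dexcr_pro_mask(unsigned int aspect)
{
	assert(aspect < 32);
	return UINT64_C(1) << (31 - aspect);
}

/* Effective state: process view OR'ed with the hypervisor-enforced view. */
static uint64_t effective_dexcr(uint64_t dexcr, uint64_t hdexcr)
{
	return dexcr | hdexcr;
}

/* True if the aspect is set in either register. */
static bool aspect_enabled(uint64_t dexcr, uint64_t hdexcr, unsigned int aspect)
{
	return (effective_dexcr(dexcr, hdexcr) & dexcr_pro_mask(aspect)) != 0;
}
```

An aspect enforced only by the hypervisor still shows up as enabled, even when the process-level DEXCR has it clear.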

You said (offline) that you were looking at the PR_SPEC_* speculation
control APIs but that this was different enough that you needed a
different one.

It would be good to know what some of those issues were in the
changelog, and it would be nice to have some docs (could we add something
to spec_ctrl.rst maybe?). I assume at least one difference is that
some of our bits are not speculative but architectural (e.g., the
stack hash check).

I also wonder if we could implement some of the PR_SPEC controls
APIs by mapping relevant DEXCR aspects to them instead of (or as well
as) the DEXCR controls? Or would the PR_SPEC users be amenable to
extensions that make our usage fit a bit better?

I'm just thinking that if we can reduce reliance on arch-specific APIs a
bit, that would be nice.

>
> Signed-off-by: Benjamin Gray 
> ---
>  arch/powerpc/include/asm/processor.h |  13 +++
>  arch/powerpc/kernel/dexcr.c  | 133 ++-
>  arch/powerpc/kernel/process.c|   6 ++
>  3 files changed, 151 insertions(+), 1 deletion(-)
>
> diff --git a/arch/powerpc/include/asm/processor.h 
> b/arch/powerpc/include/asm/processor.h
> index 2381217c95dc..4c995258f668 100644
> --- a/arch/powerpc/include/asm/processor.h
> +++ b/arch/powerpc/include/asm/processor.h
> @@ -265,6 +265,9 @@ struct thread_struct {
>   unsigned long   sier2;
>   unsigned long   sier3;
>   unsigned long   hashkeyr;
> + unsigned intdexcr_override;
> + unsigned intdexcr_mask;

Hmm, what's the mask doing here? It only gets bits set and never
cleared AFAIKS. What is different between an initial state and a
SET then CLEAR state?

Thanks,
Nick


Re: [RFC PATCH 05/13] prctl: Define PowerPC DEXCR interface

2023-03-06 Thread Nicholas Piggin
On Mon Nov 28, 2022 at 12:44 PM AEST, Benjamin Gray wrote:
> Adds the definitions and generic handler for prctl control of the
> PowerPC Dynamic Execution Control Register (DEXCR).

Assuming we'd go with the later prctl patches, this prep patch
is a nice way to split out some of the mechanism.

Reviewed-by: Nicholas Piggin 

>
> Signed-off-by: Benjamin Gray 
> ---
>  include/uapi/linux/prctl.h | 14 ++
>  kernel/sys.c   | 16 
>  2 files changed, 30 insertions(+)
>
> diff --git a/include/uapi/linux/prctl.h b/include/uapi/linux/prctl.h
> index a5e06dcbba13..b4720e8de6f3 100644
> --- a/include/uapi/linux/prctl.h
> +++ b/include/uapi/linux/prctl.h
> @@ -281,6 +281,20 @@ struct prctl_mm_map {
>  # define PR_SME_VL_LEN_MASK  0x
>  # define PR_SME_VL_INHERIT   (1 << 17) /* inherit across exec */
>  
> +/* PowerPC Dynamic Execution Control Register (DEXCR) controls */
> +#define PR_PPC_GET_DEXCR 65
> +#define PR_PPC_SET_DEXCR 66
> +/* DEXCR aspect to act on */
> +# define PR_PPC_DEXCR_SBHE   0 /* Speculative branch hint enable */
> +# define PR_PPC_DEXCR_IBRTPD 1 /* Indirect branch recurrent target 
> prediction disable */
> +# define PR_PPC_DEXCR_SRAPD  2 /* Subroutine return address 
> prediction disable */
> +# define PR_PPC_DEXCR_NPHIE  3 /* Non-privileged hash instruction 
> enable */
> +/* Action to apply / return */
> +# define PR_PPC_DEXCR_PRCTL  (1 << 0)
> +# define PR_PPC_DEXCR_SET_ASPECT (1 << 1)
> +# define PR_PPC_DEXCR_FORCE_SET_ASPECT   (1 << 2)
> +# define PR_PPC_DEXCR_CLEAR_ASPECT   (1 << 3)
> +
>  #define PR_SET_VMA   0x53564d41
>  # define PR_SET_VMA_ANON_NAME0
>  
> diff --git a/kernel/sys.c b/kernel/sys.c
> index 5fd54bf0e886..55b8f7369059 100644
> --- a/kernel/sys.c
> +++ b/kernel/sys.c
> @@ -139,6 +139,12 @@
>  #ifndef GET_TAGGED_ADDR_CTRL
>  # define GET_TAGGED_ADDR_CTRL()  (-EINVAL)
>  #endif
> +#ifndef PPC_GET_DEXCR_ASPECT
> +# define PPC_GET_DEXCR_ASPECT(a, b)  (-EINVAL)
> +#endif
> +#ifndef PPC_SET_DEXCR_ASPECT
> +# define PPC_SET_DEXCR_ASPECT(a, b, c)   (-EINVAL)
> +#endif
>  
>  /*
>   * this is where the system-wide overflow UID and GID are defined, for
> @@ -2623,6 +2629,16 @@ SYSCALL_DEFINE5(prctl, int, option, unsigned long, 
> arg2, unsigned long, arg3,
>   error = sched_core_share_pid(arg2, arg3, arg4, arg5);
>   break;
>  #endif
> + case PR_PPC_GET_DEXCR:
> + if (arg3 || arg4 || arg5)
> + return -EINVAL;
> + error = PPC_GET_DEXCR_ASPECT(me, arg2);
> + break;
> + case PR_PPC_SET_DEXCR:
> + if (arg4 || arg5)
> + return -EINVAL;
> + error = PPC_SET_DEXCR_ASPECT(me, arg2, arg3);
> + break;
>   case PR_SET_VMA:
>   error = prctl_set_vma(arg2, arg3, arg4, arg5);
>   break;
> -- 
> 2.38.1



Re: [RFC PATCH 04/13] powerpc/dexcr: Support userspace ROP protection

2023-03-06 Thread Nicholas Piggin
On Mon Nov 28, 2022 at 12:44 PM AEST, Benjamin Gray wrote:
> The ISA 3.1B hashst and hashchk instructions use a per-cpu SPR HASHKEYR
> to hold a key used in the hash calculation. This key should be different
> for each process to make it harder for a malicious process to recreate
> valid hash values for a victim process.
>
> Add support for storing a per-thread hash key, and setting/clearing
> HASHKEYR appropriately.
>
> Signed-off-by: Benjamin Gray 
> ---
>  arch/powerpc/include/asm/book3s/64/kexec.h |  3 +++
>  arch/powerpc/include/asm/processor.h   |  1 +
>  arch/powerpc/include/asm/reg.h |  1 +
>  arch/powerpc/kernel/process.c  | 12 
>  4 files changed, 17 insertions(+)
>
> diff --git a/arch/powerpc/include/asm/book3s/64/kexec.h 
> b/arch/powerpc/include/asm/book3s/64/kexec.h
> index 563baf94a962..163de935df28 100644
> --- a/arch/powerpc/include/asm/book3s/64/kexec.h
> +++ b/arch/powerpc/include/asm/book3s/64/kexec.h
> @@ -24,6 +24,9 @@ static inline void reset_sprs(void)
>   if (cpu_has_feature(CPU_FTR_ARCH_31))
>   mtspr(SPRN_DEXCR, 0);
>  
> + if (cpu_has_feature(CPU_FTR_DEXCR_NPHIE))
> + mtspr(SPRN_HASHKEYR, 0);
> +
>   /*  Do we need isync()? We are going via a kexec reset */
>   isync();
>  }
> diff --git a/arch/powerpc/include/asm/processor.h 
> b/arch/powerpc/include/asm/processor.h
> index c17ec1e44c86..2381217c95dc 100644
> --- a/arch/powerpc/include/asm/processor.h
> +++ b/arch/powerpc/include/asm/processor.h
> @@ -264,6 +264,7 @@ struct thread_struct {
>   unsigned long   mmcr3;
>   unsigned long   sier2;
>   unsigned long   sier3;
> + unsigned long   hashkeyr;
>  
>  #endif
>  };
> diff --git a/arch/powerpc/include/asm/reg.h b/arch/powerpc/include/asm/reg.h
> index cdd1f174c399..854664cf844f 100644
> --- a/arch/powerpc/include/asm/reg.h
> +++ b/arch/powerpc/include/asm/reg.h
> @@ -384,6 +384,7 @@
>  #define SPRN_HRMOR   0x139   /* Real mode offset register */
>  #define SPRN_HSRR0   0x13A   /* Hypervisor Save/Restore 0 */
>  #define SPRN_HSRR1   0x13B   /* Hypervisor Save/Restore 1 */
> +#define SPRN_HASHKEYR0x1D4   /* Non-privileged hashst/hashchk key 
> register */
>  #define SPRN_ASDR0x330   /* Access segment descriptor register */
>  #define SPRN_DEXCR   0x33C   /* Dynamic execution control register */
>  #define   DEXCR_PRO_MASK(aspect) __MASK(63 - (32 + (aspect)))/* 
> Aspect number to problem state aspect mask */
> diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
> index 17d26f652b80..4d7b0c7641d0 100644
> --- a/arch/powerpc/kernel/process.c
> +++ b/arch/powerpc/kernel/process.c
> @@ -1229,6 +1229,9 @@ static inline void restore_sprs(struct thread_struct 
> *old_thread,
>   old_thread->tidr != new_thread->tidr)
>   mtspr(SPRN_TIDR, new_thread->tidr);
>  
> + if (cpu_has_feature(CPU_FTR_DEXCR_NPHIE))
> + mtspr(SPRN_HASHKEYR, new_thread->hashkeyr);

I wonder if we'd want to avoid switching it when switching to kernel
threads, and from kernel thread back to the same user thread. Might
want to optimise it to do that in future but for an initial enablement
patch this is okay.
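
One possible shape for that future optimisation, as a userspace model only. The SPR is replaced by a variable and a write counter, and struct thread_model with its is_kernel_thread flag is hypothetical; the real restore_sprs() would need its own way to detect kernel threads.

```c
#include <assert.h>
#include <stdint.h>

/* Model of the SPR: a plain variable plus a counter instead of mtspr. */
static uint64_t hashkeyr_spr;
static unsigned int hashkeyr_writes;

static void model_mtspr_hashkeyr(uint64_t val)
{
	hashkeyr_spr = val;
	hashkeyr_writes++;
}

/* Hypothetical cut-down thread state. */
struct thread_model {
	uint64_t hashkeyr;
	int is_kernel_thread;
};

/*
 * Skip the SPR write when switching to a kernel thread, and when the key
 * already loaded matches the incoming user thread.
 */
static void model_restore_hashkeyr(const struct thread_model *new_thread)
{
	assert(new_thread);

	if (new_thread->is_kernel_thread)
		return;
	if (hashkeyr_spr != new_thread->hashkeyr)
		model_mtspr_hashkeyr(new_thread->hashkeyr);
}
```

Switching user -> kernel thread -> same user thread then costs one write in total instead of three.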

> +
>   if (cpu_has_feature(CPU_FTR_ARCH_31)) {
>   unsigned long new_dexcr = get_thread_dexcr(new_thread);
>  
> @@ -1818,6 +1821,10 @@ int copy_thread(struct task_struct *p, const struct 
> kernel_clone_args *args)
>   childregs->ppr = DEFAULT_PPR;
>  
>   p->thread.tidr = 0;
> +#endif
> +#ifdef CONFIG_PPC_BOOK3S_64
> + if (cpu_has_feature(CPU_FTR_DEXCR_NPHIE))
> + p->thread.hashkeyr = current->thread.hashkeyr;
>  #endif

Similar comment about your accessor style, if we had get/set_thread_hashkeyr()
functions then no ifdef required.
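
A rough sketch of what such accessors might look like, following the get_thread_dexcr() pattern from patch 02. The struct is a cut-down stand-in for the kernel's thread_struct, copy_hashkeyr() is a hypothetical caller, and the #define at the top only stands in for Kconfig so that the example builds on its own.

```c
#include <assert.h>

/* Stand-in for Kconfig; remove this line to build the stub variants. */
#define CONFIG_PPC_BOOK3S_64 1

/* Cut-down stand-in for the kernel's struct thread_struct. */
struct thread_struct {
#ifdef CONFIG_PPC_BOOK3S_64
	unsigned long hashkeyr;
#else
	char unused;
#endif
};

#ifdef CONFIG_PPC_BOOK3S_64
static inline unsigned long get_thread_hashkeyr(const struct thread_struct *t)
{
	return t->hashkeyr;
}

static inline void set_thread_hashkeyr(struct thread_struct *t, unsigned long key)
{
	t->hashkeyr = key;
}
#else
/* Stubs: the field does not exist, so reads give 0 and writes do nothing. */
static inline unsigned long get_thread_hashkeyr(const struct thread_struct *t)
{
	(void)t;
	return 0;
}

static inline void set_thread_hashkeyr(struct thread_struct *t, unsigned long key)
{
	(void)t;
	(void)key;
}
#endif

/* The caller needs no #ifdef, unlike the copy_thread() hunk above. */
static void copy_hashkeyr(struct thread_struct *child,
			  const struct thread_struct *parent)
{
	assert(child && parent);
	set_thread_hashkeyr(child, get_thread_hashkeyr(parent));
}
```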

I think it is not quite per-process? I don't actually know how the user
toolchain side is put together, but I'm thinking we can not give it a new
salt on fork(), but we could on exec(). I think we could actually give
each thread their own salt within a process too, right?

I don't know off the top of my head whether that can be translated into
a simple test at the copy_thread level. For now you're giving out a new
salt on exec I think, which should be fine at least to start with.

Thanks,
Nick

Reviewed-by: Nicholas Piggin 

>   /*
>* Run with the current AMR value of the kernel
> @@ -1947,6 +1954,11 @@ void start_thread(struct pt_regs *regs, unsigned long 
> start, unsigned long sp)
>   current->thread.load_tm = 0;
>  #endif /* CONFIG_PPC_TRANSACTIONAL_MEM */
>  #ifdef CONFIG_PPC_BOOK3S_64
> + if (cpu_has_feature(CPU_FTR_DEXCR_NPHIE)) {
> + current->thread.hashkeyr = get_random_long();
> + mtspr(SPRN_HASHKEYR, current->thread.hashkeyr);
> + }
> +
>   if (cpu_has_feature(CPU_FTR_ARCH_31))
>   mtspr(SPRN_DEXCR, get_thread_dexcr(&current->thread));
>  #endif /* CONFIG_PPC_BOOK3S_64 */
> -- 
> 2.38.1



Re: [RFC PATCH 02/13] powerpc: Add initial Dynamic Execution Control Register (DEXCR) support

2023-03-06 Thread Nicholas Piggin
On Mon Nov 28, 2022 at 12:44 PM AEST, Benjamin Gray wrote:
> ISA 3.1B introduces the Dynamic Execution Control Register (DEXCR). It
> is a per-cpu register that allows control over various CPU behaviours
> including branch hint usage, indirect branch speculation, and
> hashst/hashchk support.
>
> Though introduced in 3.1B, no CPUs using 3.1 were released, so
> CPU_FTR_ARCH_31 is used to determine support for the register itself.
> Support for each DEXCR bit (aspect) is reported separately by the
> firmware.
>
> Add various definitions and basic support for the DEXCR in the kernel.
> Right now it just initialises and maintains the DEXCR on process
> creation/swap, and clears it in reset_sprs().
>

A couple of comments below, but it looks good:

Reviewed-by: Nicholas Piggin 

> Signed-off-by: Benjamin Gray 
> ---
>  arch/powerpc/include/asm/book3s/64/kexec.h |  3 +++
>  arch/powerpc/include/asm/cputable.h|  8 ++-
>  arch/powerpc/include/asm/processor.h   | 13 +++
>  arch/powerpc/include/asm/reg.h |  6 ++
>  arch/powerpc/kernel/Makefile   |  1 +
>  arch/powerpc/kernel/dexcr.c| 25 ++
>  arch/powerpc/kernel/dt_cpu_ftrs.c  |  4 
>  arch/powerpc/kernel/process.c  | 13 ++-
>  arch/powerpc/kernel/prom.c |  4 
>  9 files changed, 75 insertions(+), 2 deletions(-)
>  create mode 100644 arch/powerpc/kernel/dexcr.c
>
> diff --git a/arch/powerpc/include/asm/book3s/64/kexec.h 
> b/arch/powerpc/include/asm/book3s/64/kexec.h
> index d4b9d476ecba..563baf94a962 100644
> --- a/arch/powerpc/include/asm/book3s/64/kexec.h
> +++ b/arch/powerpc/include/asm/book3s/64/kexec.h
> @@ -21,6 +21,9 @@ static inline void reset_sprs(void)
>   plpar_set_ciabr(0);
>   }
>  
> + if (cpu_has_feature(CPU_FTR_ARCH_31))
> + mtspr(SPRN_DEXCR, 0);
> +
>   /*  Do we need isync()? We are going via a kexec reset */
>   isync();
>  }
> diff --git a/arch/powerpc/include/asm/cputable.h 
> b/arch/powerpc/include/asm/cputable.h
> index 757dbded11dc..03bc192f2d8b 100644
> --- a/arch/powerpc/include/asm/cputable.h
> +++ b/arch/powerpc/include/asm/cputable.h
> @@ -192,6 +192,10 @@ static inline void cpu_feature_keys_init(void) { }
>  #define CPU_FTR_P9_RADIX_PREFETCH_BUG
> LONG_ASM_CONST(0x0002)
>  #define CPU_FTR_ARCH_31  
> LONG_ASM_CONST(0x0004)
>  #define CPU_FTR_DAWR1
> LONG_ASM_CONST(0x0008)
> +#define CPU_FTR_DEXCR_SBHE   LONG_ASM_CONST(0x0010)
> +#define CPU_FTR_DEXCR_IBRTPD LONG_ASM_CONST(0x0020)
> +#define CPU_FTR_DEXCR_SRAPD  LONG_ASM_CONST(0x0040)
> +#define CPU_FTR_DEXCR_NPHIE  LONG_ASM_CONST(0x0080)

We potentially don't need to use CPU_FTR bits for each of these. We
only really want them to use instruction patching and make feature
tests fast. But we have been a bit liberal with using them and they
are kind of tied into cpu feature parsing code so maybe it's easier
to go with them for now.

>  
>  #ifndef __ASSEMBLY__
>  
> @@ -451,7 +455,9 @@ static inline void cpu_feature_keys_init(void) { }
>   CPU_FTR_CFAR | CPU_FTR_HVMODE | CPU_FTR_VMX_COPY | \
>   CPU_FTR_DBELL | CPU_FTR_HAS_PPR | CPU_FTR_ARCH_207S | \
>   CPU_FTR_ARCH_300 | CPU_FTR_ARCH_31 | \
> - CPU_FTR_DAWR | CPU_FTR_DAWR1)
> + CPU_FTR_DAWR | CPU_FTR_DAWR1 | \
> + CPU_FTR_DEXCR_SBHE | CPU_FTR_DEXCR_IBRTPD | CPU_FTR_DEXCR_SRAPD | \
> + CPU_FTR_DEXCR_NPHIE)
>  #define CPU_FTRS_CELL(CPU_FTR_LWSYNC | \
>   CPU_FTR_PPCAS_ARCH_V2 | CPU_FTR_CTRL | \
>   CPU_FTR_ALTIVEC_COMP | CPU_FTR_MMCRA | CPU_FTR_SMT | \
> diff --git a/arch/powerpc/include/asm/processor.h 
> b/arch/powerpc/include/asm/processor.h
> index 631802999d59..0a8a793b8b8b 100644
> --- a/arch/powerpc/include/asm/processor.h
> +++ b/arch/powerpc/include/asm/processor.h
> @@ -446,6 +446,19 @@ int exit_vmx_usercopy(void);
>  int enter_vmx_ops(void);
>  void *exit_vmx_ops(void *dest);
>  
> +#ifdef CONFIG_PPC_BOOK3S_64
> +
> +unsigned long get_thread_dexcr(struct thread_struct const *t);
> +
> +#else
> +
> +static inline unsigned long get_thread_dexcr(struct thread_struct const *t)
> +{
> + return 0;
> +}
> +
> +#endif /* CONFIG_PPC_BOOK3S_64 */
> +
>  #endif /* __KERNEL__ */
>  #endif /* __ASSEMBLY__ */
>  #endif /* _ASM_POWERPC_PROCESSOR_H */
> diff --git a/arch/powerpc/include/asm/reg.h b/arch/powerpc/include/asm/reg.h
> index 1e8b2e04e626..cdd1f174c399 100644
> --- a/arch/powerpc/include/asm/reg.h
> +++ b/arch/powerpc/include/asm/reg.h
> @@ -385,6 +385,12 @@
>  #define SPRN_HSRR0   0x13A   /* Hypervisor Save/Restore 0 */
>  #define SPRN_HSRR1   0x13B   /* Hypervisor Save/Restore 1 */
>  #define SPRN_ASDR0x330   /* Access segment descriptor register */
> +#define SPRN_DEXCR   0x33C   /* Dynamic 

Re: [RFC PATCH 01/13] powerpc/book3s: Add missing include

2023-03-06 Thread Nicholas Piggin
On Mon Nov 28, 2022 at 12:44 PM AEST, Benjamin Gray wrote:
> The functions here use struct thread_struct fields, so need to import
> the full definition from . The  header
> that defines current only forward declares struct thread_struct.
>
> Failing to include this  header leads to a compilation
> error when a translation unit does not also include 
> indirectly.
>
> Signed-off-by: Benjamin Gray 

Reviewed-by: Nicholas Piggin 

> ---
>  arch/powerpc/include/asm/book3s/64/kup.h | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/arch/powerpc/include/asm/book3s/64/kup.h 
> b/arch/powerpc/include/asm/book3s/64/kup.h
> index 54cf46808157..84c09e546115 100644
> --- a/arch/powerpc/include/asm/book3s/64/kup.h
> +++ b/arch/powerpc/include/asm/book3s/64/kup.h
> @@ -194,6 +194,7 @@
>  #else /* !__ASSEMBLY__ */
>  
>  #include 
> +#include 
>  
>  DECLARE_STATIC_KEY_FALSE(uaccess_flush_key);
>  
> -- 
> 2.38.1



Re: [PATCH v3 2/2] arch/*/io.h: remove ioremap_uc in some architectures

2023-03-06 Thread Baoquan He
On 03/07/23 at 11:58am, Michael Ellerman wrote:
> "Arnd Bergmann"  writes:
> > On Sun, Mar 5, 2023, at 10:29, Geert Uytterhoeven wrote:
> >> On Sun, Mar 5, 2023 at 10:23 AM Michael Ellerman  
> >> wrote:
> >>> Maybe that exact code path is only reachable on x86/ia64? But if so
> >>> please explain why.
> >>>
> >>> Otherwise it looks like this series could break that driver on powerpc
> >>> at least.
> >>
> >> Indeed.
> >
> > When I last looked into this, I sent a patch to use ioremap()
> > on non-x86:
> >
> > https://lore.kernel.org/all/2019192258.2234502-1-a...@arndb.de/
> 
> OK thanks.
> 
> Baoquan can you add that patch to the start of this series if/when you
> post the next version?

Sure, will do. I'm wondering whether Arnd's patch needs a change to cover
powerpc as well as x86 and ia64, as you and Geert pointed out.



Re: [PATCH v2 0/4] Reenable VFIO support on POWER systems

2023-03-06 Thread Alex Williamson
On Mon, 6 Mar 2023 18:35:22 -0600 (CST)
Timothy Pearson  wrote:

> - Original Message -
> > From: "Alex Williamson" 
> > To: "Timothy Pearson" 
> > Cc: "kvm" , "linuxppc-dev" 
> > 
> > Sent: Monday, March 6, 2023 5:46:07 PM
> > Subject: Re: [PATCH v2 0/4] Reenable VFIO support on POWER systems  
> 
> > On Mon, 6 Mar 2023 11:29:53 -0600 (CST)
> > Timothy Pearson  wrote:
> >   
> >> This patch series reenables VFIO support on POWER systems.  It
> >> is based on Alexey Kardashevskiys's patch series, rebased and
> >> successfully tested under QEMU with a Marvell PCIe SATA controller
> >> on a POWER9 Blackbird host.
> >> 
> >> Alexey Kardashevskiy (3):
> >>   powerpc/iommu: Add "borrowing" iommu_table_group_ops
> >>   powerpc/pci_64: Init pcibios subsys a bit later
> >>   powerpc/iommu: Add iommu_ops to report capabilities and allow blocking
> >> domains
> >> 
> >> Timothy Pearson (1):
> >>   Add myself to MAINTAINERS for Power VFIO support
> >> 
> >>  MAINTAINERS   |   5 +
> >>  arch/powerpc/include/asm/iommu.h  |   6 +-
> >>  arch/powerpc/include/asm/pci-bridge.h |   7 +
> >>  arch/powerpc/kernel/iommu.c   | 246 +-
> >>  arch/powerpc/kernel/pci_64.c  |   2 +-
> >>  arch/powerpc/platforms/powernv/pci-ioda.c |  36 +++-
> >>  arch/powerpc/platforms/pseries/iommu.c|  27 +++
> >>  arch/powerpc/platforms/pseries/pseries.h  |   4 +
> >>  arch/powerpc/platforms/pseries/setup.c|   3 +
> >>  drivers/vfio/vfio_iommu_spapr_tce.c   |  96 ++---
> >>  10 files changed, 338 insertions(+), 94 deletions(-)
> >>   
> > 
> > For vfio and MAINTAINERS portions,
> > 
> > Acked-by: Alex Williamson 
> > 
> > I'll note though that spapr_tce_take_ownership() looks like it copied a
> > bug from the old tce_iommu_take_ownership() where tbl and tbl->it_map
> > are tested before calling iommu_take_ownership() but not in the unwind
> > loop, ie. tables we might have skipped on setup are unconditionally
> > released on unwind.  Thanks,
> > 
> > Alex  
> 
> Thanks for that.  I'll put together a patch to get rid of that
> potential bug that can be applied after this series is merged, unless
> you'd rather I resubmit a v3 with the issue fixed?

Follow-up fix is fine by me.  Thanks,

Alex
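
For reference, a simplified model of the unwind issue noted above and one way a follow-up might address it: the unwind loop applies the same skip test as the setup loop. All names here (table_model, model_take(), take_all()) are hypothetical and only imitate the shape of the take-ownership code, not its actual contents.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_TABLES 4	/* hypothetical number of tables per group */

struct table_model {
	void *it_map;	/* NULL means the table is skipped */
	int owned;	/* ownership count, must never go negative */
	int fail_take;	/* test knob: make taking ownership fail */
};

static int model_take(struct table_model *tbl)
{
	if (tbl->fail_take)
		return -1;
	tbl->owned++;
	return 0;
}

static void model_release(struct table_model *tbl)
{
	tbl->owned--;
}

static int take_all(struct table_model *tables[MAX_TABLES])
{
	int i, j, rc;

	assert(tables);

	for (i = 0; i < MAX_TABLES; i++) {
		struct table_model *tbl = tables[i];

		if (!tbl || !tbl->it_map)
			continue;

		rc = model_take(tbl);
		if (rc) {
			/* Unwind with the same test that was used on setup. */
			for (j = 0; j < i; j++) {
				if (!tables[j] || !tables[j]->it_map)
					continue;
				model_release(tables[j]);
			}
			return rc;
		}
	}
	return 0;
}
```

Without the check in the inner loop, a table that was skipped on the way in would be released on the way out.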



Re: [PATCH v3 2/2] arch/*/io.h: remove ioremap_uc in some architectures

2023-03-06 Thread Michael Ellerman
"Arnd Bergmann"  writes:
> On Sun, Mar 5, 2023, at 10:29, Geert Uytterhoeven wrote:
>> On Sun, Mar 5, 2023 at 10:23 AM Michael Ellerman  wrote:
>>> Maybe that exact code path is only reachable on x86/ia64? But if so
>>> please explain why.
>>>
>>> Otherwise it looks like this series could break that driver on powerpc
>>> at least.
>>
>> Indeed.
>
> When I last looked into this, I sent a patch to use ioremap()
> on non-x86:
>
> https://lore.kernel.org/all/2019192258.2234502-1-a...@arndb.de/

OK thanks.

Baoquan can you add that patch to the start of this series if/when you
post the next version?

cheers


[PATCH 3/5] selftests/powerpc/dscr: Improve DSCR explicit random test case

2023-03-06 Thread Benjamin Gray
The tests currently have a single writer thread updating the system
DSCR with a 1/1000 chance looped only 100 times. So only around one in
10 runs actually does anything.

* Add multiple threads to the dscr_explicit_random_test case.
* Use a barrier to make all the threads start work as simultaneously as
  possible.
* Use a rwlock and make all threads have a reasonable chance to write to
  the DSCR on each iteration.
  PTHREAD_RWLOCK_PREFER_WRITER_NONRECURSIVE_NP is used to prevent
  writers from starving while all the other threads keep reading.
  Logging the reads/writes shows a decent mix across the whole test.
* Allow all threads a chance to write.
* Make the chance of writing more likely.

Signed-off-by: Benjamin Gray 
---
 tools/testing/selftests/powerpc/dscr/dscr.h   |   5 -
 .../powerpc/dscr/dscr_default_test.c  | 132 --
 .../powerpc/dscr/dscr_explicit_test.c |  94 -
 3 files changed, 114 insertions(+), 117 deletions(-)
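
The barrier plus rwlock scheme from the commit message can be tried outside the selftest harness with a sketch like the one below. It is not the patch code: a plain variable stands in for the system DSCR, and NTHREADS, ITERS, struct shared_state and run_random_test() are hypothetical names. It assumes glibc for pthread_rwlockattr_setkind_np().

```c
#define _GNU_SOURCE
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

#define NTHREADS 4
#define ITERS    1000

struct shared_state {
	pthread_rwlock_t lock;
	pthread_barrier_t barrier;
	unsigned long expected;	/* value readers should observe */
	unsigned long actual;	/* stand-in for the system DSCR */
};

struct worker {
	pthread_t tid;
	struct shared_state *st;
	unsigned int seed;
	int mismatches;
};

static void *worker_fn(void *arg)
{
	struct worker *w = arg;
	struct shared_state *st = w->st;
	int rc = pthread_barrier_wait(&st->barrier);

	assert(rc == 0 || rc == PTHREAD_BARRIER_SERIAL_THREAD);

	for (int i = 0; i < ITERS; i++) {
		/* Readers check the value under the read lock. */
		pthread_rwlock_rdlock(&st->lock);
		if (st->actual != st->expected)
			w->mismatches++;
		pthread_rwlock_unlock(&st->lock);

		/* Every thread gets a 1 in 10 chance to write. */
		if (rand_r(&w->seed) % 10 == 0) {
			pthread_rwlock_wrlock(&st->lock);
			st->expected = (st->expected + 1) % 16;
			st->actual = st->expected;
			pthread_rwlock_unlock(&st->lock);
		}
	}
	return NULL;
}

/* Returns the total number of mismatches seen by all threads. */
static int run_random_test(void)
{
	struct shared_state st = { .expected = 0, .actual = 0 };
	struct worker workers[NTHREADS];
	pthread_rwlockattr_t attr;
	int mismatches = 0;
	int rc;

	rc = pthread_rwlockattr_init(&attr);
	assert(rc == 0);
	/* Keep writers from starving behind a stream of readers. */
	rc = pthread_rwlockattr_setkind_np(&attr,
			PTHREAD_RWLOCK_PREFER_WRITER_NONRECURSIVE_NP);
	assert(rc == 0);
	rc = pthread_rwlock_init(&st.lock, &attr);
	assert(rc == 0);
	rc = pthread_barrier_init(&st.barrier, NULL, NTHREADS);
	assert(rc == 0);

	for (int i = 0; i < NTHREADS; i++) {
		workers[i] = (struct worker){ .st = &st, .seed = (unsigned int)i + 1 };
		rc = pthread_create(&workers[i].tid, NULL, worker_fn, &workers[i]);
		assert(rc == 0);
	}
	for (int i = 0; i < NTHREADS; i++) {
		pthread_join(workers[i].tid, NULL);
		mismatches += workers[i].mismatches;
	}

	pthread_barrier_destroy(&st.barrier);
	pthread_rwlock_destroy(&st.lock);
	pthread_rwlockattr_destroy(&attr);
	(void)rc;
	return mismatches;
}
```

Build with -pthread. A non-zero return would mean a reader saw a value other than the one last written under the write lock.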

diff --git a/tools/testing/selftests/powerpc/dscr/dscr.h 
b/tools/testing/selftests/powerpc/dscr/dscr.h
index 903ee0c83fac..9cd5488ab7c2 100644
--- a/tools/testing/selftests/powerpc/dscr/dscr.h
+++ b/tools/testing/selftests/powerpc/dscr/dscr.h
@@ -88,11 +88,6 @@ void set_default_dscr(unsigned long val)
}
 }
 
-double uniform_deviate(int seed)
-{
-   return seed * (1.0 / (RAND_MAX + 1.0));
-}
-
 int restrict_to_one_cpu(void)
 {
cpu_set_t cpus;
diff --git a/tools/testing/selftests/powerpc/dscr/dscr_default_test.c 
b/tools/testing/selftests/powerpc/dscr/dscr_default_test.c
index 8b7d0ff8a20a..758823d59daa 100644
--- a/tools/testing/selftests/powerpc/dscr/dscr_default_test.c
+++ b/tools/testing/selftests/powerpc/dscr/dscr_default_test.c
@@ -69,105 +69,85 @@ int dscr_default_lockstep_test(void)
return 0;
 }
 
-static unsigned long dscr; /* System DSCR default */
-static unsigned long sequence;
-static unsigned long result[THREADS];
+struct random_thread_args {
+   pthread_t thread_id;
+   unsigned long *expected_system_dscr;
+   pthread_rwlock_t *rw_lock;
+   pthread_barrier_t *barrier;
+};
 
-static void *do_test(void *in)
+static void *dscr_default_random_thread(void *in)
 {
-   unsigned long thread = (unsigned long)in;
-   unsigned long i;
+   struct random_thread_args *args = (struct random_thread_args *)in;
+   unsigned long *expected_dscr_p = args->expected_system_dscr;
+   pthread_rwlock_t *rw_lock = args->rw_lock;
+   int err;
 
-   for (i = 0; i < COUNT; i++) {
-   unsigned long d, cur_dscr, cur_dscr_usr;
-   unsigned long s1, s2;
+   srand(gettid());
 
-   s1 = READ_ONCE(sequence);
-   if (s1 & 1)
-   continue;
-   rmb();
+   err = pthread_barrier_wait(args->barrier);
+   FAIL_IF_EXIT(err != 0 && err != PTHREAD_BARRIER_SERIAL_THREAD);
 
-   d = dscr;
-   cur_dscr = get_dscr();
-   cur_dscr_usr = get_dscr_usr();
+   for (int i = 0; i < COUNT; i++) {
+   unsigned long expected_dscr;
+   unsigned long current_dscr;
+   unsigned long current_dscr_usr;
 
-   rmb();
-   s2 = sequence;
+   FAIL_IF_EXIT(pthread_rwlock_rdlock(rw_lock));
+   expected_dscr = *expected_dscr_p;
+   current_dscr = get_dscr();
+   current_dscr_usr = get_dscr_usr();
+   FAIL_IF_EXIT(pthread_rwlock_unlock(rw_lock));
 
-   if (s1 != s2)
-   continue;
+   FAIL_IF_EXIT(current_dscr != expected_dscr);
+   FAIL_IF_EXIT(current_dscr_usr != expected_dscr);
 
-   if (cur_dscr != d) {
-   fprintf(stderr, "thread %ld kernel DSCR should be %ld "
-   "but is %ld\n", thread, d, cur_dscr);
-   result[thread] = 1;
-   pthread_exit(&result[thread]);
-   }
+   if (rand() % 10 == 0) {
+   unsigned long next_dscr;
 
-   if (cur_dscr_usr != d) {
-   fprintf(stderr, "thread %ld user DSCR should be %ld "
-   "but is %ld\n", thread, d, cur_dscr_usr);
-   result[thread] = 1;
-   pthread_exit(&result[thread]);
+   FAIL_IF_EXIT(pthread_rwlock_wrlock(rw_lock));
+   next_dscr = (*expected_dscr_p + 1) % DSCR_MAX;
+   set_default_dscr(next_dscr);
+   *expected_dscr_p = next_dscr;
+   FAIL_IF_EXIT(pthread_rwlock_unlock(rw_lock));
}
}
-   result[thread] = 0;
-   pthread_exit(&result[thread]);
+
+   pthread_exit((void *)0);
 }
 
 int dscr_default_random_test(void)
 {
-   pthread_t threads[THREADS];
-   unsigned long i, *status[THREADS];
+   struct 

[PATCH 5/5] selftests/powerpc/dscr: Restore timeout to DSCR selftests

2023-03-06 Thread Benjamin Gray
Reducing the time taken by dscr_sysfs_test.c allows restoring the
default timeout, which was removed in
commit 850507f30c38 ("selftests/powerpc: Turn off timeout setting for
benchmarks, dscr, signal, tm") because that test took too long.

Signed-off-by: Benjamin Gray 
---
 tools/testing/selftests/powerpc/dscr/Makefile | 2 --
 tools/testing/selftests/powerpc/dscr/settings | 1 -
 2 files changed, 3 deletions(-)
 delete mode 100644 tools/testing/selftests/powerpc/dscr/settings

diff --git a/tools/testing/selftests/powerpc/dscr/Makefile 
b/tools/testing/selftests/powerpc/dscr/Makefile
index b29a8863a734..9289d5febe1e 100644
--- a/tools/testing/selftests/powerpc/dscr/Makefile
+++ b/tools/testing/selftests/powerpc/dscr/Makefile
@@ -3,8 +3,6 @@ TEST_GEN_PROGS := dscr_default_test dscr_explicit_test 
dscr_user_test   \
  dscr_inherit_test dscr_inherit_exec_test dscr_sysfs_test  \
  dscr_sysfs_thread_test
 
-TEST_FILES := settings
-
 top_srcdir = ../../../../..
 include ../../lib.mk
 
diff --git a/tools/testing/selftests/powerpc/dscr/settings 
b/tools/testing/selftests/powerpc/dscr/settings
deleted file mode 100644
index e7b9417537fb..
--- a/tools/testing/selftests/powerpc/dscr/settings
+++ /dev/null
@@ -1 +0,0 @@
-timeout=0
-- 
2.39.2



[PATCH 2/5] selftests/powerpc/dscr: Add lockstep test cases to DSCR explicit tests

2023-03-06 Thread Benjamin Gray
Add new cases to the relevant tests that use explicitly synchronized
threads to test the behaviour across context switches with less
randomness. By locking the participants to the same CPU we guarantee a
context switch occurs each time they make progress, which is a likely
failure point if the kernel is not tracking the thread local DSCR
correctly.

The random case is left in to keep exercising potential edge cases.

Signed-off-by: Benjamin Gray 
---
 tools/testing/selftests/powerpc/dscr/Makefile |  1 +
 tools/testing/selftests/powerpc/dscr/dscr.h   | 23 +
 .../powerpc/dscr/dscr_default_test.c  | 87 ---
 .../powerpc/dscr/dscr_explicit_test.c | 87 ++-
 4 files changed, 183 insertions(+), 15 deletions(-)

diff --git a/tools/testing/selftests/powerpc/dscr/Makefile 
b/tools/testing/selftests/powerpc/dscr/Makefile
index 845db6273a1b..b29a8863a734 100644
--- a/tools/testing/selftests/powerpc/dscr/Makefile
+++ b/tools/testing/selftests/powerpc/dscr/Makefile
@@ -9,5 +9,6 @@ top_srcdir = ../../../../..
 include ../../lib.mk
 
 $(OUTPUT)/dscr_default_test: LDLIBS += -lpthread
+$(OUTPUT)/dscr_explicit_test: LDLIBS += -lpthread
 
 $(TEST_GEN_PROGS): ../harness.c ../utils.c
diff --git a/tools/testing/selftests/powerpc/dscr/dscr.h 
b/tools/testing/selftests/powerpc/dscr/dscr.h
index 2c54998d4715..903ee0c83fac 100644
--- a/tools/testing/selftests/powerpc/dscr/dscr.h
+++ b/tools/testing/selftests/powerpc/dscr/dscr.h
@@ -11,6 +11,8 @@
 #ifndef _SELFTESTS_POWERPC_DSCR_DSCR_H
 #define _SELFTESTS_POWERPC_DSCR_DSCR_H
 
+#define _GNU_SOURCE
+
 #include 
 #include 
 #include 
@@ -90,4 +92,25 @@ double uniform_deviate(int seed)
 {
return seed * (1.0 / (RAND_MAX + 1.0));
 }
+
+int restrict_to_one_cpu(void)
+{
+   cpu_set_t cpus;
+   int cpu;
+
+   FAIL_IF(sched_getaffinity(0, sizeof(cpu_set_t), &cpus));
+
+   for (cpu = 0; cpu < CPU_SETSIZE; cpu++)
+       if (CPU_ISSET(cpu, &cpus))
+           break;
+
+   FAIL_IF(cpu == CPU_SETSIZE);
+
+   CPU_ZERO(&cpus);
+   CPU_SET(cpu, &cpus);
+   FAIL_IF(sched_setaffinity(0, sizeof(cpu_set_t), &cpus));
+
+   return 0;
+}
+
 #endif /* _SELFTESTS_POWERPC_DSCR_DSCR_H */
diff --git a/tools/testing/selftests/powerpc/dscr/dscr_default_test.c 
b/tools/testing/selftests/powerpc/dscr/dscr_default_test.c
index e76611e608af..8b7d0ff8a20a 100644
--- a/tools/testing/selftests/powerpc/dscr/dscr_default_test.c
+++ b/tools/testing/selftests/powerpc/dscr/dscr_default_test.c
@@ -9,8 +9,66 @@
  * Copyright 2012, Anton Blanchard, IBM Corporation.
  * Copyright 2015, Anshuman Khandual, IBM Corporation.
  */
+
+#define _GNU_SOURCE
+
 #include "dscr.h"
 
+#include 
+#include 
+#include 
+
+static void *dscr_default_lockstep_writer(void *arg)
+{
+   sem_t *reader_sem = (sem_t *)arg;
+   sem_t *writer_sem = (sem_t *)arg + 1;
+   unsigned long expected_dscr = 0;
+
+   for (int i = 0; i < COUNT; i++) {
+   FAIL_IF_EXIT(sem_wait(writer_sem));
+
+   set_default_dscr(expected_dscr);
+   expected_dscr = (expected_dscr + 1) % DSCR_MAX;
+
+   FAIL_IF_EXIT(sem_post(reader_sem));
+   }
+
+   return NULL;
+}
+
+int dscr_default_lockstep_test(void)
+{
+   pthread_t writer;
+   sem_t rw_semaphores[2];
+   sem_t *reader_sem = &rw_semaphores[0];
+   sem_t *writer_sem = &rw_semaphores[1];
+   unsigned long expected_dscr = 0;
+
+   SKIP_IF(!have_hwcap2(PPC_FEATURE2_DSCR));
+
+   FAIL_IF(sem_init(reader_sem, 0, 0));
+   FAIL_IF(sem_init(writer_sem, 0, 1));  /* writer starts first */
+   FAIL_IF(restrict_to_one_cpu());
+   FAIL_IF(pthread_create(&writer, NULL, dscr_default_lockstep_writer, (void *)rw_semaphores));
+
+   for (int i = 0; i < COUNT ; i++) {
+   FAIL_IF(sem_wait(reader_sem));
+
+   FAIL_IF(get_dscr() != expected_dscr);
+   FAIL_IF(get_dscr_usr() != expected_dscr);
+
+   expected_dscr = (expected_dscr + 1) % DSCR_MAX;
+
+   FAIL_IF(sem_post(writer_sem));
+   }
+
+   FAIL_IF(pthread_join(writer, NULL));
+   FAIL_IF(sem_destroy(reader_sem));
+   FAIL_IF(sem_destroy(writer_sem));
+
+   return 0;
+}
+
 static unsigned long dscr; /* System DSCR default */
 static unsigned long sequence;
 static unsigned long result[THREADS];
@@ -57,16 +115,13 @@ static void *do_test(void *in)
pthread_exit(&result[thread]);
 }
 
-int dscr_default(void)
+int dscr_default_random_test(void)
 {
pthread_t threads[THREADS];
unsigned long i, *status[THREADS];
-   unsigned long orig_dscr_default;
 
SKIP_IF(!have_hwcap2(PPC_FEATURE2_DSCR));
 
-   orig_dscr_default = get_default_dscr();
-
/* Initial DSCR default */
dscr = 1;
set_default_dscr(dscr);
@@ -75,7 +130,7 @@ int dscr_default(void)
for (i = 0; i < THREADS; i++) {
if (pthread_create(&threads[i], NULL, do_test, (void *)i)) {

[PATCH 1/5] selftests/powerpc/dscr: Correct typos

2023-03-06 Thread Benjamin Gray
Correct a couple of typos while working on other improvements to the
DSCR tests.

Signed-off-by: Benjamin Gray 
---
 tools/testing/selftests/powerpc/dscr/dscr_explicit_test.c | 4 ++--
 tools/testing/selftests/powerpc/dscr/dscr_inherit_test.c  | 4 ++--
 tools/testing/selftests/powerpc/dscr/dscr_user_test.c | 4 ++--
 3 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/tools/testing/selftests/powerpc/dscr/dscr_explicit_test.c 
b/tools/testing/selftests/powerpc/dscr/dscr_explicit_test.c
index 32fcf2b324b1..5659d98cf340 100644
--- a/tools/testing/selftests/powerpc/dscr/dscr_explicit_test.c
+++ b/tools/testing/selftests/powerpc/dscr/dscr_explicit_test.c
@@ -7,8 +7,8 @@
  * privilege state SPR and the problem state SPR for this purpose.
  *
  * When using the privilege state SPR, the instructions such as
- * mfspr or mtspr are priviledged and the kernel emulates them
- * for us. Instructions using problem state SPR can be exuecuted
+ * mfspr or mtspr are privileged and the kernel emulates them
+ * for us. Instructions using problem state SPR can be executed
  * directly without any emulation if the HW supports them. Else
  * they also get emulated by the kernel.
  *
diff --git a/tools/testing/selftests/powerpc/dscr/dscr_inherit_test.c 
b/tools/testing/selftests/powerpc/dscr/dscr_inherit_test.c
index f9dfd3d3c2d5..68ce328e813e 100644
--- a/tools/testing/selftests/powerpc/dscr/dscr_inherit_test.c
+++ b/tools/testing/selftests/powerpc/dscr/dscr_inherit_test.c
@@ -7,8 +7,8 @@
  * value using mfspr.
  *
  * When using the privilege state SPR, the instructions such as
- * mfspr or mtspr are priviledged and the kernel emulates them
- * for us. Instructions using problem state SPR can be exuecuted
+ * mfspr or mtspr are privileged and the kernel emulates them
+ * for us. Instructions using problem state SPR can be executed
  * directly without any emulation if the HW supports them. Else
  * they also get emulated by the kernel.
  *
diff --git a/tools/testing/selftests/powerpc/dscr/dscr_user_test.c 
b/tools/testing/selftests/powerpc/dscr/dscr_user_test.c
index e09072446dd3..67bb872a246a 100644
--- a/tools/testing/selftests/powerpc/dscr/dscr_user_test.c
+++ b/tools/testing/selftests/powerpc/dscr/dscr_user_test.c
@@ -8,8 +8,8 @@
  * numbers.
  *
  * When using the privilege state SPR, the instructions such as
- * mfspr or mtspr are priviledged and the kernel emulates them
- * for us. Instructions using problem state SPR can be exuecuted
+ * mfspr or mtspr are privileged and the kernel emulates them
+ * for us. Instructions using problem state SPR can be executed
  * directly without any emulation if the HW supports them. Else
  * they also get emulated by the kernel.
  *
-- 
2.39.2



[PATCH 4/5] selftests/powerpc/dscr: Speed up DSCR sysfs tests

2023-03-06 Thread Benjamin Gray
This test case is extremely slow, taking around a minute compared to
most of the other DSCR tests taking a second at most. Perf shows most
time is spent by the kernel switching to each CPU it reads in
/sys/devices/system/cpu. This switching is an unavoidable consequence
of reading all the .../cpuN/dscr values.

Remove the outer iteration loop from this test case, reducing the reads
from 1600 to 16. This still updates the DSCR 16 times and verifies on
every CPU each time, so I do not expect the lower coverage to be
meaningful. The speedup is significant: back down to ~1 second like the
other tests.

Signed-off-by: Benjamin Gray 
---
 .../testing/selftests/powerpc/dscr/dscr_sysfs_test.c  | 11 ---
 1 file changed, 4 insertions(+), 7 deletions(-)

diff --git a/tools/testing/selftests/powerpc/dscr/dscr_sysfs_test.c 
b/tools/testing/selftests/powerpc/dscr/dscr_sysfs_test.c
index 4f1fef6198fc..e7cd0d6b1fad 100644
--- a/tools/testing/selftests/powerpc/dscr/dscr_sysfs_test.c
+++ b/tools/testing/selftests/powerpc/dscr/dscr_sysfs_test.c
@@ -67,17 +67,14 @@ static int check_all_cpu_dscr_defaults(unsigned long val)
 int dscr_sysfs(void)
 {
unsigned long orig_dscr_default;
-   int i, j;
 
SKIP_IF(!have_hwcap2(PPC_FEATURE2_DSCR));
 
orig_dscr_default = get_default_dscr();
-   for (i = 0; i < COUNT; i++) {
-   for (j = 0; j < DSCR_MAX; j++) {
-   set_default_dscr(j);
-   if (check_all_cpu_dscr_defaults(j))
-   goto fail;
-   }
+   for (int i = 0; i < DSCR_MAX; i++) {
+   set_default_dscr(i);
+   if (check_all_cpu_dscr_defaults(i))
+   goto fail;
}
set_default_dscr(orig_dscr_default);
return 0;
-- 
2.39.2



[PATCH 0/5] Update DSCR tests

2023-03-06 Thread Benjamin Gray
The randomness based DSCR tests currently have a low probability of doing
any writes to the DSCR, making them inefficient in uncovering bugs.

This series adds lockstep variants to these RNG tests, to ensure the happy
path is always tested, and improves the randomness and size of the RNG
tests.

It also removes many iterations of the sysfs DSCR test, allowing the default
timeout to be re-enabled.

Benjamin Gray (5):
  selftests/powerpc/dscr: Correct typos
  selftests/powerpc/dscr: Add lockstep test cases to DSCR explicit tests
  selftests/powerpc/dscr: Improve DSCR explicit random test case
  selftests/powerpc/dscr: Speed up DSCR sysfs tests
  selftests/powerpc/dscr: Restore timeout to DSCR selftests

 tools/testing/selftests/powerpc/dscr/Makefile |   3 +-
 tools/testing/selftests/powerpc/dscr/dscr.h   |  22 +-
 .../powerpc/dscr/dscr_default_test.c  | 223 +++---
 .../powerpc/dscr/dscr_explicit_test.c | 173 +++---
 .../powerpc/dscr/dscr_inherit_test.c  |   4 +-
 .../selftests/powerpc/dscr/dscr_sysfs_test.c  |  11 +-
 .../selftests/powerpc/dscr/dscr_user_test.c   |   4 +-
 tools/testing/selftests/powerpc/dscr/settings |   1 -
 8 files changed, 300 insertions(+), 141 deletions(-)
 delete mode 100644 tools/testing/selftests/powerpc/dscr/settings


base-commit: 422fbcbf91303706823bc3babceb1df1a42112bf
--
2.39.2


Re: [PATCH v2 0/4] Reenable VFIO support on POWER systems

2023-03-06 Thread Alexey Kardashevskiy




On 07/03/2023 10:46, Alex Williamson wrote:

On Mon, 6 Mar 2023 11:29:53 -0600 (CST)
Timothy Pearson  wrote:


This patch series reenables VFIO support on POWER systems.  It
is based on Alexey Kardashevskiy's patch series, rebased and
successfully tested under QEMU with a Marvell PCIe SATA controller
on a POWER9 Blackbird host.

Alexey Kardashevskiy (3):
   powerpc/iommu: Add "borrowing" iommu_table_group_ops
   powerpc/pci_64: Init pcibios subsys a bit later
   powerpc/iommu: Add iommu_ops to report capabilities and allow blocking
 domains

Timothy Pearson (1):
   Add myself to MAINTAINERS for Power VFIO support

  MAINTAINERS   |   5 +
  arch/powerpc/include/asm/iommu.h  |   6 +-
  arch/powerpc/include/asm/pci-bridge.h |   7 +
  arch/powerpc/kernel/iommu.c   | 246 +-
  arch/powerpc/kernel/pci_64.c  |   2 +-
  arch/powerpc/platforms/powernv/pci-ioda.c |  36 +++-
  arch/powerpc/platforms/pseries/iommu.c|  27 +++
  arch/powerpc/platforms/pseries/pseries.h  |   4 +
  arch/powerpc/platforms/pseries/setup.c|   3 +
  drivers/vfio/vfio_iommu_spapr_tce.c   |  96 ++---
  10 files changed, 338 insertions(+), 94 deletions(-)



For vfio and MAINTAINERS portions,

Acked-by: Alex Williamson 

I'll note though that spapr_tce_take_ownership() looks like it copied a
bug from the old tce_iommu_take_ownership() where tbl and tbl->it_map
are tested before calling iommu_take_ownership() but not in the unwind
loop, ie. tables we might have skipped on setup are unconditionally
released on unwind.  Thanks,



Ah, true, a bug. Thanks for pointing out.


--
Alexey


Re: [PATCH v2 0/4] Reenable VFIO support on POWER systems

2023-03-06 Thread Timothy Pearson



- Original Message -
> From: "Alex Williamson" 
> To: "Timothy Pearson" 
> Cc: "kvm" , "linuxppc-dev" 
> 
> Sent: Monday, March 6, 2023 5:46:07 PM
> Subject: Re: [PATCH v2 0/4] Reenable VFIO support on POWER systems

> On Mon, 6 Mar 2023 11:29:53 -0600 (CST)
> Timothy Pearson  wrote:
> 
>> This patch series reenables VFIO support on POWER systems.  It
>> is based on Alexey Kardashevskiy's patch series, rebased and
>> successfully tested under QEMU with a Marvell PCIe SATA controller
>> on a POWER9 Blackbird host.
>> 
>> Alexey Kardashevskiy (3):
>>   powerpc/iommu: Add "borrowing" iommu_table_group_ops
>>   powerpc/pci_64: Init pcibios subsys a bit later
>>   powerpc/iommu: Add iommu_ops to report capabilities and allow blocking
>> domains
>> 
>> Timothy Pearson (1):
>>   Add myself to MAINTAINERS for Power VFIO support
>> 
>>  MAINTAINERS   |   5 +
>>  arch/powerpc/include/asm/iommu.h  |   6 +-
>>  arch/powerpc/include/asm/pci-bridge.h |   7 +
>>  arch/powerpc/kernel/iommu.c   | 246 +-
>>  arch/powerpc/kernel/pci_64.c  |   2 +-
>>  arch/powerpc/platforms/powernv/pci-ioda.c |  36 +++-
>>  arch/powerpc/platforms/pseries/iommu.c|  27 +++
>>  arch/powerpc/platforms/pseries/pseries.h  |   4 +
>>  arch/powerpc/platforms/pseries/setup.c|   3 +
>>  drivers/vfio/vfio_iommu_spapr_tce.c   |  96 ++---
>>  10 files changed, 338 insertions(+), 94 deletions(-)
>> 
> 
> For vfio and MAINTAINERS portions,
> 
> Acked-by: Alex Williamson 
> 
> I'll note though that spapr_tce_take_ownership() looks like it copied a
> bug from the old tce_iommu_take_ownership() where tbl and tbl->it_map
> are tested before calling iommu_take_ownership() but not in the unwind
> loop, ie. tables we might have skipped on setup are unconditionally
> released on unwind.  Thanks,
> 
> Alex

Thanks for that.  I'll put together a patch to get rid of that potential bug 
that can be applied after this series is merged, unless you'd rather I resubmit 
a v3 with the issue fixed?


[PATCH] powerpc/pseries/vas: Ignore VAS update for DLPAR if copy/paste is not enabled

2023-03-06 Thread Haren Myneni


The hypervisor supports user-mode NX from Power10. pseries_vas_dlpar_cpu()
is called from lparcfg_write() to update VAS windows for DLPAR CPU event
and the kernel gets -ENOTSUPP for HCALLs if the user-mode NX is not
supported.

This patch ignores updating VAS capabilities and returns success if the
copy/paste feature is not enabled.

Fixes: 2147783d6bf0 ("powerpc/pseries: Use lparcfg to reconfig VAS windows for DLPAR CPU")
Signed-off-by: Haren Myneni 
---
 arch/powerpc/platforms/pseries/vas.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/arch/powerpc/platforms/pseries/vas.c 
b/arch/powerpc/platforms/pseries/vas.c
index 559112312810..dc003849d2c5 100644
--- a/arch/powerpc/platforms/pseries/vas.c
+++ b/arch/powerpc/platforms/pseries/vas.c
@@ -856,6 +856,13 @@ int pseries_vas_dlpar_cpu(void)
 {
int new_nr_creds, rc;
 
+   /*
+* NX-GZIP is not enabled. Nothing to do for DLPAR event
+*/
+   if (!copypaste_feat)
+   return 0;
+
+
rc = h_query_vas_capabilities(H_QUERY_VAS_CAPABILITIES,
  vascaps[VAS_GZIP_DEF_FEAT_TYPE].feat,
  (u64)virt_to_phys(&hv_cop_caps));
@@ -1012,6 +1019,7 @@ static int __init pseries_vas_init(void)
 * Linux supports user space COPY/PASTE only with Radix
 */
if (!radix_enabled()) {
+   copypaste_feat = 0;
pr_err("API is supported only with radix page tables\n");
return -ENOTSUPP;
}
-- 
2.26.3




Re: [PATCH v2 0/4] Reenable VFIO support on POWER systems

2023-03-06 Thread Alex Williamson
On Mon, 6 Mar 2023 11:29:53 -0600 (CST)
Timothy Pearson  wrote:

> This patch series reenables VFIO support on POWER systems.  It
> is based on Alexey Kardashevskiy's patch series, rebased and
> successfully tested under QEMU with a Marvell PCIe SATA controller
> on a POWER9 Blackbird host.
> 
> Alexey Kardashevskiy (3):
>   powerpc/iommu: Add "borrowing" iommu_table_group_ops
>   powerpc/pci_64: Init pcibios subsys a bit later
>   powerpc/iommu: Add iommu_ops to report capabilities and allow blocking
> domains
> 
> Timothy Pearson (1):
>   Add myself to MAINTAINERS for Power VFIO support
> 
>  MAINTAINERS   |   5 +
>  arch/powerpc/include/asm/iommu.h  |   6 +-
>  arch/powerpc/include/asm/pci-bridge.h |   7 +
>  arch/powerpc/kernel/iommu.c   | 246 +-
>  arch/powerpc/kernel/pci_64.c  |   2 +-
>  arch/powerpc/platforms/powernv/pci-ioda.c |  36 +++-
>  arch/powerpc/platforms/pseries/iommu.c|  27 +++
>  arch/powerpc/platforms/pseries/pseries.h  |   4 +
>  arch/powerpc/platforms/pseries/setup.c|   3 +
>  drivers/vfio/vfio_iommu_spapr_tce.c   |  96 ++---
>  10 files changed, 338 insertions(+), 94 deletions(-)
> 

For vfio and MAINTAINERS portions,

Acked-by: Alex Williamson 

I'll note though that spapr_tce_take_ownership() looks like it copied a
bug from the old tce_iommu_take_ownership() where tbl and tbl->it_map
are tested before calling iommu_take_ownership() but not in the unwind
loop, ie. tables we might have skipped on setup are unconditionally
released on unwind.  Thanks,

Alex
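
The unwind problem described above can be sketched in plain C. This is only an illustration under stated assumptions: the demo_ types and helpers are hypothetical stand-ins, not the kernel's iommu_table structures or its ownership functions. The point it shows is that the unwind loop repeats the same tbl / it_map test the setup loop used, so tables skipped on setup are not released on failure.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration only; not the kernel's types. */
struct demo_table {
	void *it_map;	/* NULL means the table was never set up */
	bool owned;
};

#define DEMO_MAX_TABLES 4

/* Pretend ownership transfer; fails if the table is already owned. */
static int demo_take_ownership(struct demo_table *tbl)
{
	if (tbl->owned)
		return -1;
	tbl->owned = true;
	return 0;
}

static void demo_release_ownership(struct demo_table *tbl)
{
	tbl->owned = false;
}

/*
 * Take ownership of every usable table. On failure, release only the
 * tables that were actually taken: the unwind loop applies the same
 * filter as the setup loop.
 */
static int demo_take_all(struct demo_table *tables[DEMO_MAX_TABLES])
{
	int i, j;

	for (i = 0; i < DEMO_MAX_TABLES; i++) {
		struct demo_table *tbl = tables[i];

		if (!tbl || !tbl->it_map)
			continue;	/* skipped on setup */

		if (demo_take_ownership(tbl)) {
			for (j = 0; j < i; j++) {
				struct demo_table *t = tables[j];

				if (!t || !t->it_map)
					continue;	/* also skipped on unwind */
				demo_release_ownership(t);
			}
			return -1;
		}
	}
	return 0;
}
```

Without the filter in the inner loop, a NULL entry would be dereferenced and an entry with no it_map would be released even though it was never taken.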



Re: [PATCH] mm: add PTE pointer parameter to flush_tlb_fix_spurious_fault()

2023-03-06 Thread Michael Ellerman
Gerald Schaefer  writes:
> s390 can do more fine-grained handling of spurious TLB protection faults,
> when there also is the PTE pointer available.
>
> Therefore, pass on the PTE pointer to flush_tlb_fix_spurious_fault() as
> an additional parameter.
>
> This will add no functional change to other architectures, but those with
> private flush_tlb_fix_spurious_fault() implementations need to be made
> aware of the new parameter.
>
> Reviewed-by: Alexander Gordeev 
> Signed-off-by: Gerald Schaefer 
> ---
>  arch/arm64/include/asm/pgtable.h  |  2 +-
>  arch/mips/include/asm/pgtable.h   |  3 ++-
>  arch/powerpc/include/asm/book3s/64/tlbflush.h |  3 ++-
>  arch/s390/include/asm/pgtable.h   | 12 +++-
>  arch/x86/include/asm/pgtable.h|  2 +-
>  include/linux/pgtable.h   |  2 +-
>  mm/memory.c   |  3 ++-
>  mm/pgtable-generic.c  |  2 +-
>  8 files changed, 17 insertions(+), 12 deletions(-)
...
> diff --git a/arch/powerpc/include/asm/book3s/64/tlbflush.h 
> b/arch/powerpc/include/asm/book3s/64/tlbflush.h
> index 2bbc0fcce04a..ff7f0ee179e5 100644
> --- a/arch/powerpc/include/asm/book3s/64/tlbflush.h
> +++ b/arch/powerpc/include/asm/book3s/64/tlbflush.h
> @@ -121,7 +121,8 @@ static inline void flush_tlb_page(struct vm_area_struct 
> *vma,
>  
>  #define flush_tlb_fix_spurious_fault flush_tlb_fix_spurious_fault
>  static inline void flush_tlb_fix_spurious_fault(struct vm_area_struct *vma,
> - unsigned long address)
> + unsigned long address,
> + pte_t *ptep)
>  {
>   /*
>* Book3S 64 does not require spurious fault flushes because the PTE

Acked-by: Michael Ellerman  (powerpc)

cheers


Re: [PATCH v10 03/13] dt-bindings: Convert gpio-mmio to yaml

2023-03-06 Thread Linus Walleij
Hi Sean,

thanks for doing this. I never got around to it myself for lack of time.

On Mon, Mar 6, 2023 at 8:16 PM Sean Anderson  wrote:

> This is a generic binding for simple MMIO GPIO controllers. Although we
> have a single driver for these controllers, they were previously spread
> over several files. Consolidate them. The register descriptions are
> adapted from the comments in the source. There is no set order for the
> registers, so I have not specified one.
>
> Signed-off-by: Sean Anderson 
(...)

> +  compatible:
> +enum:
> +  - brcm,bcm6345-gpio # Broadcom BCM6345 GPIO controller
> +  - wd,mbl-gpio # Western Digital MyBook Live memory-mapped GPIO 
> controller
> +  - ni,169445-nand-gpio # National Instruments 169445 GPIO NAND 
> controller

I think you can inline description: statements in the enum instead of
the # hash comments, however IIRC you have to use oneOf and
const: to do it, like I do in
Documentation/devicetree/bindings/input/touchscreen/cypress,cy8ctma340.yaml
but don't overinvest in this if it is cumbersome.

Either way:
Reviewed-by: Linus Walleij 

Yours,
Linus Walleij


Re: [PATCH V5 01/15] spi: Replace all spi->chip_select and spi->cs_gpiod references with function call

2023-03-06 Thread Mark Brown
On Mon, Mar 06, 2023 at 10:50:55PM +0530, Amit Kumar Mahapatra wrote:

>  drivers/spi/spi-omap-100k.c   |  2 +-

This is also not against -rc1, this file was removed in bcace9c4c9270292
("spi: remove omap 100K driver").



Re: [PATCH V5 09/15] spi: Add stacked and parallel memories support in SPI core

2023-03-06 Thread Jonas Gorski
Hi,

On Mon, 6 Mar 2023 at 18:26, Amit Kumar Mahapatra
 wrote:
>
> To support multiple CS, the SPI device needs to be aware of all the CS
> values. So, the "chip_select" member in the spi_device structure is now an
> array that holds all the CS values.
>
> spi_device structure now has a "cs_index_mask" member. This acts as an
> index to the chip_select array. If nth bit of spi->cs_index_mask is set
> then the driver would assert spi->chip_select[n].
>
> In parallel mode all the chip selects are asserted/de-asserted
> simultaneously and each byte of data is stored in both devices, the even
> bits in one, the odd bits in the other. The split is automatically handled
> by the GQSPI controller. The GQSPI controller supports a maximum of two
> flashes connected in parallel mode. A "multi-cs-cap" flag is added in the
> spi controller data; through ctlr->multi_cs_cap the spi core will make
> sure that the controller is capable of handling multiple chip selects at
> once.
>
> For supporting multiple CS via GPIO the cs_gpiod member of the spi_device
> structure is now an array that holds the gpio descriptor for each
> chipselect.
>
> Multi CS support using GPIO is not tested due to unavailability of
> necessary hardware setup.
>
> Signed-off-by: Amit Kumar Mahapatra 
> ---
>  drivers/spi/spi.c   | 213 +++-
>  include/linux/spi/spi.h |  34 +--
>  2 files changed, 173 insertions(+), 74 deletions(-)
>
> diff --git a/drivers/spi/spi.c b/drivers/spi/spi.c
> index 5866bf5813a4..8ec7f58fa111 100644
> --- a/drivers/spi/spi.c
> +++ b/drivers/spi/spi.c
> @@ -613,7 +613,8 @@ static int spi_dev_check(struct device *dev, void *data)
> struct spi_device *new_spi = data;
>
> if (spi->controller == new_spi->controller &&
> -   spi_get_chipselect(spi, 0) == spi_get_chipselect(new_spi, 0))
> +   spi_get_chipselect(spi, 0) == spi_get_chipselect(new_spi, 0) &&
> +   spi_get_chipselect(spi, 1) == spi_get_chipselect(new_spi, 1))
> return -EBUSY;

This will only reject new devices if both chip selects are identical,
but not if they only share one, e.g. CS 1 + 2 vs 1 + 3, or 1 + 2 vs
only 2, or if the order is different (1 + 2 vs 2 + 1 - haven't read
the code too close to know if this is allowed/possible).
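
A rough sketch of the stricter check I have in mind, in plain C: two devices conflict when any chip select of one matches any chip select of the other, whatever the position. The demo_ names and the DEMO_NO_CS marker for an unused slot are hypothetical, not existing SPI core API, and whether unused slots are marked this way is an assumption.

```c
#include <stdbool.h>

#define DEMO_MAX_CS 2
#define DEMO_NO_CS  0xff	/* hypothetical marker for an unused slot */

struct demo_spi_device {
	int controller_id;
	unsigned char chip_select[DEMO_MAX_CS];
};

/* Return true if the two devices share at least one chip select. */
static bool demo_cs_conflict(const struct demo_spi_device *a,
			     const struct demo_spi_device *b)
{
	int i, j;

	if (a->controller_id != b->controller_id)
		return false;

	for (i = 0; i < DEMO_MAX_CS; i++) {
		if (a->chip_select[i] == DEMO_NO_CS)
			continue;	/* unused slots never conflict */
		for (j = 0; j < DEMO_MAX_CS; j++) {
			if (a->chip_select[i] == b->chip_select[j])
				return true;
		}
	}
	return false;
}
```

With this, 1 + 2 vs 1 + 3, 1 + 2 vs only 2, and 1 + 2 vs 2 + 1 are all rejected, while 1 + 2 vs 3 + 4 is accepted.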

Regards,
Jonas


Re: [PATCH V5 01/15] spi: Replace all spi->chip_select and spi->cs_gpiod references with function call

2023-03-06 Thread Mark Brown
On Mon, Mar 06, 2023 at 10:50:55PM +0530, Amit Kumar Mahapatra wrote:
> Supporting multi-cs in spi drivers would require the chip_select & cs_gpiod
> members of struct spi_device to be an array. But changing the type of these
> members to array would break the spi driver functionality. To make the
> transition smoother, introduce four new APIs to get/set
> spi->chip_select & spi->cs_gpiod, and replace all spi->chip_select and
> spi->cs_gpiod references with get or set API calls.
> While adding multi-cs support in further patches the chip_select & cs_gpiod
> members of the spi_device structure would be converted to arrays & the
> "idx" parameter of the APIs would be used as array index i.e.,
> spi->chip_select[idx] & spi->cs_gpiod[idx] respectively.

This doesn't apply against current code, there was a rework of the
mpc512x-psc driver.  Please check and resend.



[PATCH V5 15/15] spi: spi-zynqmp-gqspi: Add parallel memories support in GQSPI driver

2023-03-06 Thread Amit Kumar Mahapatra
During GQSPI driver probe, set ctlr->multi_cs_cap to enable the multi-CS
capability of the controller. In parallel mode the controller can either
split the data between the two flashes or send the same data to both;
the STRIPE bit selects which. When sending commands, the GQSPI driver
clears the STRIPE bit so that both flashes receive the same command.
When reading or writing data, it sets the STRIPE bit so that the data is
split evenly between the two flashes.

Signed-off-by: Amit Kumar Mahapatra 
---
 drivers/spi/spi-zynqmp-gqspi.c | 39 +-
 1 file changed, 38 insertions(+), 1 deletion(-)

diff --git a/drivers/spi/spi-zynqmp-gqspi.c b/drivers/spi/spi-zynqmp-gqspi.c
index 4759f704bf5c..9e44371bfda2 100644
--- a/drivers/spi/spi-zynqmp-gqspi.c
+++ b/drivers/spi/spi-zynqmp-gqspi.c
@@ -23,6 +23,7 @@
 #include 
 #include 
 #include 
+#include 
 
 /* Generic QSPI register offsets */
 #define GQSPI_CONFIG_OFST  0x0100
@@ -192,6 +193,7 @@ struct qspi_platform_data {
  * @op_lock:   Operational lock
  * @speed_hz:  Current SPI bus clock speed in hz
  * @has_tapdelay:  Used for tapdelay register available in qspi
+ * @is_parallel:   Used for multi CS support
  */
 struct zynqmp_qspi {
struct spi_controller *ctlr;
@@ -214,8 +216,33 @@ struct zynqmp_qspi {
struct mutex op_lock;
u32 speed_hz;
bool has_tapdelay;
+   bool is_parallel;
 };
 
+/**
+ * zynqmp_gqspi_update_stripe - For GQSPI controller data stripe capabilities
+ * @op:Pointer to mem ops
+ * Return:  Status of the data stripe
+ *
+ * Returns true if data stripe need to be enabled, else returns false
+ */
+bool zynqmp_gqspi_update_stripe(const struct spi_mem_op *op)
+{
+   if (op->cmd.opcode ==  SPINOR_OP_BE_4K ||
+   op->cmd.opcode ==  SPINOR_OP_BE_32K ||
+   op->cmd.opcode ==  SPINOR_OP_CHIP_ERASE ||
+   op->cmd.opcode ==  SPINOR_OP_SE ||
+   op->cmd.opcode ==  SPINOR_OP_BE_32K_4B ||
+   op->cmd.opcode ==  SPINOR_OP_SE_4B ||
+   op->cmd.opcode == SPINOR_OP_BE_4K_4B ||
+   op->cmd.opcode ==  SPINOR_OP_WRSR ||
+   op->cmd.opcode ==  SPINOR_OP_BRWR ||
+   (op->cmd.opcode ==  SPINOR_OP_WRSR2 && !op->addr.nbytes))
+   return false;
+
+   return true;
+}
+
 /**
  * zynqmp_gqspi_read - For GQSPI controller read operation
  * @xqspi: Pointer to the zynqmp_qspi structure
@@ -470,7 +497,14 @@ static void zynqmp_qspi_chipselect(struct spi_device 
*qspi, bool is_high)
 
genfifoentry |= GQSPI_GENFIFO_MODE_SPI;
 
-   if (qspi->cs_index_mask & GQSPI_SELECT_UPPER_CS) {
+   if ((qspi->cs_index_mask & GQSPI_SELECT_LOWER_CS) &&
+   (qspi->cs_index_mask & GQSPI_SELECT_UPPER_CS)) {
+   zynqmp_gqspi_selectslave(xqspi,
+GQSPI_SELECT_FLASH_CS_BOTH,
+GQSPI_SELECT_FLASH_BUS_BOTH);
+   if (!xqspi->is_parallel)
+   xqspi->is_parallel = true;
+   } else if (qspi->cs_index_mask & GQSPI_SELECT_UPPER_CS) {
zynqmp_gqspi_selectslave(xqspi,
 GQSPI_SELECT_FLASH_CS_UPPER,
 GQSPI_SELECT_FLASH_BUS_LOWER);
@@ -1139,6 +1173,8 @@ static int zynqmp_qspi_exec_op(struct spi_mem *mem,
}
 
if (op->data.nbytes) {
+   if (xqspi->is_parallel && zynqmp_gqspi_update_stripe(op))
+   genfifoentry |= GQSPI_GENFIFO_STRIPE;
reinit_completion(&xqspi->data_completion);
if (op->data.dir == SPI_MEM_DATA_OUT) {
xqspi->txbuf = (u8 *)op->data.buf.out;
@@ -1334,6 +1370,7 @@ static int zynqmp_qspi_probe(struct platform_device *pdev)
ctlr->bits_per_word_mask = SPI_BPW_MASK(8);
ctlr->dev.of_node = np;
ctlr->auto_runtime_pm = true;
+   ctlr->multi_cs_cap = true;
 
ret = devm_spi_register_controller(&pdev->dev, ctlr);
if (ret) {
-- 
2.25.1



[PATCH V5 14/15] mtd: spi-nor: Add parallel memories support in spi-nor

2023-03-06 Thread Amit Kumar Mahapatra
The current implementation assumes that a maximum of two flashes are
connected in parallel mode. The QSPI controller splits the data evenly
between the two flashes, so both flashes connected in parallel mode must
be identical.
During each operation SPI-NOR sets bit 0 for CS0 and bit 1 for CS1 in
nor->spimem->spi->cs_index_mask. The QSPI driver then asserts/de-asserts
CS0 and CS1.
Write operations in parallel mode are performed in chunks of page size * 2,
as each write operation writes to both flashes. Because the address space
is doubled, each operation is performed at flash offset addr/2, where
addr is the address specified by the user.
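
As an illustration of the addressing described above, here is a minimal sketch in plain C. The demo_ names are hypothetical and do not appear in the patch, and clipping the chunk at the combined page boundary is an assumption about how a write would be split, not something taken from the patch.

```c
#include <stddef.h>
#include <stdint.h>

struct demo_parallel_chunk {
	uint64_t flash_offset;	/* offset issued to each flash: addr / 2 */
	size_t len;		/* user bytes covered by this chunk */
};

/*
 * Compute the first write chunk for a user address in a two-flash
 * parallel setup. page_size is the per-flash page size; the combined
 * page is twice that because every write lands on both flashes.
 */
static struct demo_parallel_chunk
demo_parallel_first_chunk(uint64_t addr, size_t remaining, size_t page_size)
{
	struct demo_parallel_chunk c;
	size_t combined_page = page_size * 2;
	size_t page_offset = (size_t)(addr % combined_page);
	size_t room = combined_page - page_offset;

	c.flash_offset = addr / 2;
	c.len = remaining < room ? remaining : room;
	return c;
}
```

For a 256-byte flash page, a 1024-byte write at user address 0 would start with a 512-byte chunk at flash offset 0, and a write at user address 768 would be issued at flash offset 384.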

Signed-off-by: Amit Kumar Mahapatra 
---
 drivers/mtd/spi-nor/core.c  | 514 +++-
 drivers/mtd/spi-nor/core.h  |   4 +
 drivers/mtd/spi-nor/micron-st.c |   5 +
 3 files changed, 384 insertions(+), 139 deletions(-)

diff --git a/drivers/mtd/spi-nor/core.c b/drivers/mtd/spi-nor/core.c
index bb7326dc8b70..367cbb36ef69 100644
--- a/drivers/mtd/spi-nor/core.c
+++ b/drivers/mtd/spi-nor/core.c
@@ -464,17 +464,29 @@ int spi_nor_read_sr(struct spi_nor *nor, u8 *sr)
op.data.nbytes = 2;
}
 
+   if (nor->flags & SNOR_F_HAS_PARALLEL)
+   op.data.nbytes = 2;
+
spi_nor_spimem_setup_op(nor, &op, nor->reg_proto);
 
ret = spi_mem_exec_op(nor->spimem, &op);
} else {
-   ret = spi_nor_controller_ops_read_reg(nor, SPINOR_OP_RDSR, sr,
- 1);
+   if (nor->flags & SNOR_F_HAS_PARALLEL)
+   ret = spi_nor_controller_ops_read_reg(nor,
+ SPINOR_OP_RDSR,
+ sr, 2);
+   else
+   ret = spi_nor_controller_ops_read_reg(nor,
+ SPINOR_OP_RDSR,
+ sr, 1);
}
 
if (ret)
dev_dbg(nor->dev, "error %d reading SR\n", ret);
 
+   if (nor->flags & SNOR_F_HAS_PARALLEL)
+   sr[0] |= sr[1];
+
return ret;
 }
 
@@ -1466,12 +1478,122 @@ static int spi_nor_erase(struct mtd_info *mtd, struct 
erase_info *instr)
if (ret)
return ret;
 
-   /* whole-chip erase? */
-   if (len == mtd->size && !(nor->flags & SNOR_F_NO_OP_CHIP_ERASE)) {
-   unsigned long timeout;
+   if (!(nor->flags & SNOR_F_HAS_PARALLEL)) {
+   /* whole-chip erase? */
+   if (len == mtd->size && !(nor->flags & 
SNOR_F_NO_OP_CHIP_ERASE)) {
+   unsigned long timeout;
+
+   while ((cur_cs_num < SNOR_FLASH_CNT_MAX) && params) {
+   nor->spimem->spi->cs_index_mask = 1 << 
cur_cs_num;
+   ret = spi_nor_write_enable(nor);
+   if (ret)
+   goto erase_err;
+
+   ret = spi_nor_erase_chip(nor);
+   if (ret)
+   goto erase_err;
+
+   /*
+* Scale the timeout linearly with the size of the flash, with
+* a minimum calibrated to an old 2MB flash. We could try to
+* pull these from CFI/SFDP, but these values should be good
+* enough for now.
+*/
+   timeout = max(CHIP_ERASE_2MB_READY_WAIT_JIFFIES,
+ CHIP_ERASE_2MB_READY_WAIT_JIFFIES *
+ (unsigned long)(params->size /
+ SZ_2M));
+   ret = spi_nor_wait_till_ready_with_timeout(nor, timeout);
+   if (ret)
+   goto erase_err;
+
+   cur_cs_num++;
+   params = spi_nor_get_params(nor, cur_cs_num);
+   }
+
+   /* REVISIT in some cases we could speed up erasing large regions
+* by using SPINOR_OP_SE instead of SPINOR_OP_BE_4K.  We may have set up
+* to use "small sector erase", but that's not always optimal.
+*/
+
+   /* "sector"-at-a-time erase */
+   } else if (spi_nor_has_uniform_erase(nor)) {
+   /* Determine the flash from which the operation need to start */
+   while ((cur_cs_num < SNOR_FLASH_CNT_MAX) &&
+  (addr > sz - 1) && params) {
+ 

[PATCH V5 12/15] mtd: spi-nor: Add stacked memories support in spi-nor

2023-03-06 Thread Amit Kumar Mahapatra
Each flash that is connected in stacked mode should have a separate
parameter structure. So, the flash parameter member (*params) of the spi_nor
structure is changed to an array (*params[2]). The array is used to store
the parameters of each flash connected in stacked configuration.

The current implementation assumes that a maximum of two flashes are
connected in stacked mode and that both flashes are of the same make but can
differ in size. So, except for the size, all other flash parameters of both
flashes are identical.

SPI-NOR is not aware of the chip_select values. For any incoming request,
SPI-NOR will decide the flash index with the help of the individual flash
sizes and the configuration type (single/stacked). SPI-NOR will pass the flash
index information to the SPI core & SPI driver by setting the appropriate
bit in nor->spimem->spi->cs_index_mask. For example, if the nth bit of
nor->spimem->spi->cs_index_mask is set, then the driver would
assert/de-assert spi->chip_select[n].
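
For illustration only (not part of the patch): a minimal user-space sketch of
how a flash index and a cs_index_mask bit could be derived from an incoming
address and the per-flash sizes in stacked mode. The struct stacked_target
type, the stacked_select() helper and the FLASH_CNT_MAX constant are
hypothetical; the driver code in the diff below is truncated in this archive,
so this is an assumption about the idea, not a copy of it.

```c
#include <assert.h>
#include <stdint.h>

#define FLASH_CNT_MAX 2 /* assumed: at most two stacked flashes */

struct stacked_target {
	uint32_t cs_index_mask; /* bit n set => flash n is selected */
	uint64_t offset;        /* offset inside the selected flash */
};

/*
 * Walk the flashes in order, subtracting each flash size until the address
 * falls inside one of them, then report that flash as a single mask bit.
 */
static struct stacked_target stacked_select(const uint64_t size[FLASH_CNT_MAX],
					    uint64_t addr)
{
	struct stacked_target t = { 0, 0 };
	unsigned int idx = 0;

	/* The address must lie inside the combined address space. */
	assert(addr < size[0] + size[1]);

	while (idx < FLASH_CNT_MAX - 1 && addr >= size[idx]) {
		addr -= size[idx];
		idx++;
	}

	t.cs_index_mask = 1u << idx;
	t.offset = addr;
	return t;
}
```

With a 16 MiB flash stacked on a 32 MiB flash, an address just past 16 MiB
selects the second flash (mask 0x2) at a small offset, which matches the
"different sizes, otherwise identical parameters" assumption stated above.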

Signed-off-by: Amit Kumar Mahapatra 
---
 drivers/mtd/spi-nor/core.c  | 282 +---
 drivers/mtd/spi-nor/core.h  |   4 +
 include/linux/mtd/spi-nor.h |  12 +-
 3 files changed, 244 insertions(+), 54 deletions(-)

diff --git a/drivers/mtd/spi-nor/core.c b/drivers/mtd/spi-nor/core.c
index 8a4a54bf2d0e..bb7326dc8b70 100644
--- a/drivers/mtd/spi-nor/core.c
+++ b/drivers/mtd/spi-nor/core.c
@@ -1441,13 +1441,18 @@ static int spi_nor_erase_multi_sectors(struct spi_nor *nor, u64 addr, u32 len)
 static int spi_nor_erase(struct mtd_info *mtd, struct erase_info *instr)
 {
struct spi_nor *nor = mtd_to_spi_nor(mtd);
-   u32 addr, len;
+   struct spi_nor_flash_parameter *params;
+   u32 addr, len, offset, cur_cs_num = 0;
uint32_t rem;
int ret;
+   u64 sz;
 
dev_dbg(nor->dev, "at 0x%llx, len %lld\n", (long long)instr->addr,
(long long)instr->len);
 
+   params = spi_nor_get_params(nor, 0);
+   sz = params->size;
+
if (spi_nor_has_uniform_erase(nor)) {
div_u64_rem(instr->len, mtd->erasesize, );
if (rem)
@@ -1465,26 +1470,30 @@ static int spi_nor_erase(struct mtd_info *mtd, struct erase_info *instr)
if (len == mtd->size && !(nor->flags & SNOR_F_NO_OP_CHIP_ERASE)) {
unsigned long timeout;
 
-   ret = spi_nor_write_enable(nor);
-   if (ret)
-   goto erase_err;
+   while (cur_cs_num < SNOR_FLASH_CNT_MAX && params) {
+   nor->spimem->spi->cs_index_mask = 0x01 << cur_cs_num;
+   ret = spi_nor_write_enable(nor);
+   if (ret)
+   goto erase_err;
 
-   ret = spi_nor_erase_chip(nor);
-   if (ret)
-   goto erase_err;
+   ret = spi_nor_erase_chip(nor);
+   if (ret)
+   goto erase_err;
 
-   /*
-* Scale the timeout linearly with the size of the flash, with
-* a minimum calibrated to an old 2MB flash. We could try to
-* pull these from CFI/SFDP, but these values should be good
-* enough for now.
-*/
-   timeout = max(CHIP_ERASE_2MB_READY_WAIT_JIFFIES,
- CHIP_ERASE_2MB_READY_WAIT_JIFFIES *
- (unsigned long)(mtd->size / SZ_2M));
-   ret = spi_nor_wait_till_ready_with_timeout(nor, timeout);
-   if (ret)
-   goto erase_err;
+   /*
+* Scale the timeout linearly with the size of the flash, with
+* a minimum calibrated to an old 2MB flash. We could try to
+* pull these from CFI/SFDP, but these values should be good
+* enough for now.
+*/
+   timeout = max(CHIP_ERASE_2MB_READY_WAIT_JIFFIES,
+ CHIP_ERASE_2MB_READY_WAIT_JIFFIES *
+ (unsigned long)(params->size / SZ_2M));
+   ret = spi_nor_wait_till_ready_with_timeout(nor, timeout);
+   if (ret)
+   goto erase_err;
+   cur_cs_num++;
+   }
 
/* REVISIT in some cases we could speed up erasing large regions
 * by using SPINOR_OP_SE instead of SPINOR_OP_BE_4K.  We may have set up
@@ -1493,12 +1502,26 @@ static int spi_nor_erase(struct mtd_info *mtd, struct erase_info *instr)
 
/* "sector"-at-a-time erase */
} else if (spi_nor_has_uniform_erase(nor)) {
+   /* Determine the flash from which the operation need to start */
+   while ((cur_cs_num < SNOR_FLASH_CNT_MAX) && (addr > sz - 1) && params) {
+   

[PATCH V5 13/15] spi: spi-zynqmp-gqspi: Add stacked memories support in GQSPI driver

2023-03-06 Thread Amit Kumar Mahapatra
GQSPI supports two chip selects, CS0 & CS1. Update the driver to
assert/de-assert the appropriate chip select as per the bits set in
qspi->cs_index_mask.

Signed-off-by: Amit Kumar Mahapatra 
---
 drivers/spi/spi-zynqmp-gqspi.c | 21 +
 1 file changed, 13 insertions(+), 8 deletions(-)

diff --git a/drivers/spi/spi-zynqmp-gqspi.c b/drivers/spi/spi-zynqmp-gqspi.c
index 319cdd5a0bdc..4759f704bf5c 100644
--- a/drivers/spi/spi-zynqmp-gqspi.c
+++ b/drivers/spi/spi-zynqmp-gqspi.c
@@ -156,6 +156,9 @@
 #define GQSPI_FREQ_100MHZ  100000000
 #define GQSPI_FREQ_150MHZ  150000000
 
+#define GQSPI_SELECT_LOWER_CS  BIT(0)
+#define GQSPI_SELECT_UPPER_CS  BIT(1)
+
 #define SPI_AUTOSUSPEND_TIMEOUT 3000
 enum mode_type {GQSPI_MODE_IO, GQSPI_MODE_DMA};
 
@@ -467,15 +470,17 @@ static void zynqmp_qspi_chipselect(struct spi_device *qspi, bool is_high)
 
genfifoentry |= GQSPI_GENFIFO_MODE_SPI;
 
+   if (qspi->cs_index_mask & GQSPI_SELECT_UPPER_CS) {
+   zynqmp_gqspi_selectslave(xqspi,
+GQSPI_SELECT_FLASH_CS_UPPER,
+GQSPI_SELECT_FLASH_BUS_LOWER);
+   } else if (qspi->cs_index_mask & GQSPI_SELECT_LOWER_CS) {
+   zynqmp_gqspi_selectslave(xqspi,
+GQSPI_SELECT_FLASH_CS_LOWER,
+GQSPI_SELECT_FLASH_BUS_LOWER);
+   }
+   genfifoentry |= xqspi->genfifobus;
if (!is_high) {
-   if (!spi_get_chipselect(qspi, 0)) {
-   xqspi->genfifobus = GQSPI_GENFIFO_BUS_LOWER;
-   xqspi->genfifocs = GQSPI_GENFIFO_CS_LOWER;
-   } else {
-   xqspi->genfifobus = GQSPI_GENFIFO_BUS_UPPER;
-   xqspi->genfifocs = GQSPI_GENFIFO_CS_UPPER;
-   }
-   genfifoentry |= xqspi->genfifobus;
genfifoentry |= xqspi->genfifocs;
genfifoentry |= GQSPI_GENFIFO_CS_SETUP;
} else {
-- 
2.25.1



[PATCH V5 11/15] mtd: spi-nor: Add APIs to set/get nor->params

2023-03-06 Thread Amit Kumar Mahapatra
Supporting multi-cs in spi-nor would require the *params member of
struct spi_nor to be an array. To make the transition smoother, introduce
the spi_nor_get_params() & spi_nor_set_params() APIs to get & set nor->params,
add a new local variable (struct spi_nor_flash_parameter *params) to hold
the return value of the spi_nor_get_params() call, and replace all
nor->params references with "params".
When multi-cs support is added in later patches, the *params member of the
spi_nor structure will be converted to an array, and the "idx" parameter of
the APIs will be used as the array index, i.e., nor->params[idx].
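
For illustration only (not part of the patch): a minimal user-space sketch of
the accessor pattern described above. The include/linux/mtd/spi-nor.h hunk is
not visible in this archive, so the types and the model_get_params() /
model_set_params() helpers below are hypothetical stand-ins, and their
signatures are an assumption rather than the real spi_nor_get_params() /
spi_nor_set_params() prototypes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for struct spi_nor_flash_parameter. */
struct flash_params_model {
	uint64_t size;
};

/* Simplified stand-in for struct spi_nor: one params pointer for now. */
struct nor_model {
	struct flash_params_model *params;
};

/* "idx" is accepted but ignored until params becomes an array. */
static inline struct flash_params_model *
model_get_params(const struct nor_model *nor, uint8_t idx)
{
	assert(nor != NULL);
	(void)idx;
	return nor->params;
}

static inline void model_set_params(struct nor_model *nor, uint8_t idx,
				    struct flash_params_model *params)
{
	assert(nor != NULL);
	(void)idx;
	nor->params = params;
}
```

The point of the pattern is that callers already pass an index, so a later
change from a single pointer to params[idx] only touches the two accessors.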

Signed-off-by: Amit Kumar Mahapatra 
---
 drivers/mtd/spi-nor/atmel.c  |  17 ++--
 drivers/mtd/spi-nor/core.c   | 129 ---
 drivers/mtd/spi-nor/debugfs.c|   4 +-
 drivers/mtd/spi-nor/gigadevice.c |   4 +-
 drivers/mtd/spi-nor/issi.c   |  11 ++-
 drivers/mtd/spi-nor/macronix.c   |   6 +-
 drivers/mtd/spi-nor/micron-st.c  |  34 +---
 drivers/mtd/spi-nor/otp.c|  29 ---
 drivers/mtd/spi-nor/sfdp.c   |  29 ---
 drivers/mtd/spi-nor/spansion.c   |  50 +++-
 drivers/mtd/spi-nor/sst.c|   7 +-
 drivers/mtd/spi-nor/swp.c|  22 --
 drivers/mtd/spi-nor/winbond.c|  10 ++-
 drivers/mtd/spi-nor/xilinx.c |  18 +++--
 include/linux/mtd/spi-nor.h  |  10 +++
 15 files changed, 254 insertions(+), 126 deletions(-)

diff --git a/drivers/mtd/spi-nor/atmel.c b/drivers/mtd/spi-nor/atmel.c
index 656dd80a0be7..57ca9f5ee205 100644
--- a/drivers/mtd/spi-nor/atmel.c
+++ b/drivers/mtd/spi-nor/atmel.c
@@ -23,10 +23,11 @@ static int at25fs_nor_lock(struct spi_nor *nor, loff_t ofs, uint64_t len)
 
 static int at25fs_nor_unlock(struct spi_nor *nor, loff_t ofs, uint64_t len)
 {
+   struct spi_nor_flash_parameter *params = spi_nor_get_params(nor, 0);
int ret;
 
/* We only support unlocking the whole flash array */
-   if (ofs || len != nor->params->size)
+   if (ofs || len != params->size)
return -EINVAL;
 
/* Write 0x00 to the status register to disable write protection */
@@ -50,7 +51,9 @@ static const struct spi_nor_locking_ops at25fs_nor_locking_ops = {
 
 static void at25fs_nor_late_init(struct spi_nor *nor)
 {
-   nor->params->locking_ops = _nor_locking_ops;
+   struct spi_nor_flash_parameter *params = spi_nor_get_params(nor, 0);
+
+   params->locking_ops = _nor_locking_ops;
 }
 
 static const struct spi_nor_fixups at25fs_nor_fixups = {
@@ -69,11 +72,12 @@ static const struct spi_nor_fixups at25fs_nor_fixups = {
 static int atmel_nor_set_global_protection(struct spi_nor *nor, loff_t ofs,
   uint64_t len, bool is_protect)
 {
+   struct spi_nor_flash_parameter *params = spi_nor_get_params(nor, 0);
int ret;
u8 sr;
 
/* We only support locking the whole flash array */
-   if (ofs || len != nor->params->size)
+   if (ofs || len != params->size)
return -EINVAL;
 
ret = spi_nor_read_sr(nor, nor->bouncebuf);
@@ -131,9 +135,10 @@ static int atmel_nor_global_unprotect(struct spi_nor *nor, loff_t ofs,
 static int atmel_nor_is_global_protected(struct spi_nor *nor, loff_t ofs,
 uint64_t len)
 {
+   struct spi_nor_flash_parameter *params = spi_nor_get_params(nor, 0);
int ret;
 
-   if (ofs >= nor->params->size || (ofs + len) > nor->params->size)
+   if (ofs >= params->size || (ofs + len) > params->size)
return -EINVAL;
 
ret = spi_nor_read_sr(nor, nor->bouncebuf);
@@ -151,7 +156,9 @@ static const struct spi_nor_locking_ops atmel_nor_global_protection_ops = {
 
 static void atmel_nor_global_protection_late_init(struct spi_nor *nor)
 {
-   nor->params->locking_ops = _nor_global_protection_ops;
+   struct spi_nor_flash_parameter *params = spi_nor_get_params(nor, 0);
+
+   params->locking_ops = _nor_global_protection_ops;
 }
 
 static const struct spi_nor_fixups atmel_nor_global_protection_fixups = {
diff --git a/drivers/mtd/spi-nor/core.c b/drivers/mtd/spi-nor/core.c
index d8703d7dfd0a..8a4a54bf2d0e 100644
--- a/drivers/mtd/spi-nor/core.c
+++ b/drivers/mtd/spi-nor/core.c
@@ -448,14 +448,15 @@ int spi_nor_read_id(struct spi_nor *nor, u8 naddr, u8 ndummy, u8 *id,
  */
 int spi_nor_read_sr(struct spi_nor *nor, u8 *sr)
 {
+   struct spi_nor_flash_parameter *params = spi_nor_get_params(nor, 0);
int ret;
 
if (nor->spimem) {
struct spi_mem_op op = SPI_NOR_RDSR_OP(sr);
 
if (nor->reg_proto == SNOR_PROTO_8_8_8_DTR) {
-   op.addr.nbytes = nor->params->rdsr_addr_nbytes;
-   op.dummy.nbytes = nor->params->rdsr_dummy;
+   op.addr.nbytes = params->rdsr_addr_nbytes;
+   op.dummy.nbytes = params->rdsr_dummy;
/*
 * We 

[PATCH V5 10/15] mtd: spi-nor: Convert macros with inline functions

2023-03-06 Thread Amit Kumar Mahapatra
In later patches the nor->params references in the
spi_nor_otp_region_len(nor) & spi_nor_otp_n_regions(nor) macros will be
replaced with the spi_nor_get_params() API. To make the transition smoother,
first convert the macros into static inline functions.

Suggested-by: Michal Simek 
Signed-off-by: Amit Kumar Mahapatra 
---
 drivers/mtd/spi-nor/otp.c | 23 +--
 1 file changed, 21 insertions(+), 2 deletions(-)

diff --git a/drivers/mtd/spi-nor/otp.c b/drivers/mtd/spi-nor/otp.c
index 00ab0d2d6d2f..3d75899de303 100644
--- a/drivers/mtd/spi-nor/otp.c
+++ b/drivers/mtd/spi-nor/otp.c
@@ -11,8 +11,27 @@
 
 #include "core.h"
 
-#define spi_nor_otp_region_len(nor) ((nor)->params->otp.org->len)
-#define spi_nor_otp_n_regions(nor) ((nor)->params->otp.org->n_regions)
+/**
+ * spi_nor_otp_region_len() - get size of one OTP region in bytes
+ * @nor:pointer to 'struct spi_nor'
+ *
+ * Return: size of one OTP region in bytes
+ */
+static inline unsigned int spi_nor_otp_region_len(struct spi_nor *nor)
+{
+   return nor->params->otp.org->len;
+}
+
+/**
+ * spi_nor_otp_n_regions() - get number of individual OTP regions
+ * @nor:pointer to 'struct spi_nor'
+ *
+ * Return: number of individual OTP regions
+ */
+static inline unsigned int spi_nor_otp_n_regions(struct spi_nor *nor)
+{
+   return nor->params->otp.org->n_regions;
+}
 
 /**
  * spi_nor_otp_read_secr() - read security register
-- 
2.25.1



[PATCH V5 09/15] spi: Add stacked and parallel memories support in SPI core

2023-03-06 Thread Amit Kumar Mahapatra
To support multiple CS, the SPI device needs to be aware of all the CS
values. So, the "chip_select" member in the spi_device structure is now an
array that holds all the CS values.

The spi_device structure now has a "cs_index_mask" member. This acts as an
index into the chip_select array. If the nth bit of spi->cs_index_mask is set,
then the driver would assert spi->chip_select[n].
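
For illustration only (not part of the patch): a minimal user-space sketch of
how a controller driver might walk cs_index_mask to find the chip selects it
has to assert. The struct spi_dev_model type, the CS_CNT_MAX constant and the
selected_chip_selects() helper are hypothetical and only model the two fields
described above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CS_CNT_MAX 2 /* assumed: two chip selects per device */

/* Simplified stand-in for the two struct spi_device fields in question. */
struct spi_dev_model {
	uint8_t chip_select[CS_CNT_MAX]; /* physical CS line per index */
	uint32_t cs_index_mask;          /* bit n set => use chip_select[n] */
};

/*
 * Copy the physical CS lines selected by cs_index_mask into out[] and
 * return how many were selected.
 */
static unsigned int selected_chip_selects(const struct spi_dev_model *spi,
					  uint8_t out[CS_CNT_MAX])
{
	unsigned int n, count = 0;

	assert(spi != NULL && out != NULL);

	for (n = 0; n < CS_CNT_MAX; n++) {
		if (spi->cs_index_mask & (1u << n))
			out[count++] = spi->chip_select[n];
	}
	return count;
}
```

A mask with one bit set models stacked mode (one flash at a time); a mask
with both bits set models parallel mode, where both lines are driven together.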

In parallel mode all the chip selects are asserted/de-asserted
simultaneously and each byte of data is stored in both devices, the even
bits in one, the odd bits in the other. The split is automatically handled
by the GQSPI controller. The GQSPI controller supports a maximum of two
flashes connected in parallel mode. A "multi-cs-cap" flag is added to the
spi controller data; through ctlr->multi-cs-cap the spi core will make
sure that the controller is capable of handling multiple chip selects at
once.
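
For illustration only (not part of the patch): the even/odd bit split is done
by the GQSPI hardware, not by software, but a small user-space model may help
picture it. The split_byte() and merge_nibbles() helpers are hypothetical,
and the bit ordering inside each nibble, as well as which flash receives
which half, are assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Model of the parallel-mode split: bits 0, 2, 4, 6 of the byte go to one
 * device and bits 1, 3, 5, 7 go to the other, each packed into a nibble.
 */
static void split_byte(uint8_t byte, uint8_t *even, uint8_t *odd)
{
	unsigned int i;

	assert(even != NULL && odd != NULL);

	*even = 0;
	*odd = 0;
	for (i = 0; i < 4; i++) {
		*even |= (uint8_t)(((byte >> (2 * i)) & 1u) << i);
		*odd |= (uint8_t)(((byte >> (2 * i + 1)) & 1u) << i);
	}
}

/* Inverse of split_byte(): interleave the two nibbles back into a byte. */
static uint8_t merge_nibbles(uint8_t even, uint8_t odd)
{
	uint8_t byte = 0;
	unsigned int i;

	for (i = 0; i < 4; i++) {
		byte |= (uint8_t)(((even >> i) & 1u) << (2 * i));
		byte |= (uint8_t)(((odd >> i) & 1u) << (2 * i + 1));
	}
	return byte;
}
```

Splitting and then merging returns the original byte, which is the property
the controller has to guarantee for reads and writes to stay consistent.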

To support multiple CS via GPIO, the cs_gpiod member of the spi_device
structure is now an array that holds the gpio descriptor for each
chip select.

Multi-CS support using GPIO is not tested due to the unavailability of the
necessary hardware setup.

Signed-off-by: Amit Kumar Mahapatra 
---
 drivers/spi/spi.c   | 213 +++-
 include/linux/spi/spi.h |  34 +--
 2 files changed, 173 insertions(+), 74 deletions(-)

diff --git a/drivers/spi/spi.c b/drivers/spi/spi.c
index 5866bf5813a4..8ec7f58fa111 100644
--- a/drivers/spi/spi.c
+++ b/drivers/spi/spi.c
@@ -613,7 +613,8 @@ static int spi_dev_check(struct device *dev, void *data)
struct spi_device *new_spi = data;
 
if (spi->controller == new_spi->controller &&
-   spi_get_chipselect(spi, 0) == spi_get_chipselect(new_spi, 0))
+   spi_get_chipselect(spi, 0) == spi_get_chipselect(new_spi, 0) &&
+   spi_get_chipselect(spi, 1) == spi_get_chipselect(new_spi, 1))
return -EBUSY;
return 0;
 }
@@ -628,7 +629,7 @@ static int __spi_add_device(struct spi_device *spi)
 {
struct spi_controller *ctlr = spi->controller;
struct device *dev = ctlr->dev.parent;
-   int status;
+   int status, idx;
 
/*
 * We need to make sure there's no other device with this
@@ -637,8 +638,7 @@ static int __spi_add_device(struct spi_device *spi)
 */
status = bus_for_each_dev(_bus_type, NULL, spi, spi_dev_check);
if (status) {
-   dev_err(dev, "chipselect %d already in use\n",
-   spi_get_chipselect(spi, 0));
+   dev_err(dev, "chipselect %d already in use\n", spi_get_chipselect(spi, 0));
return status;
}
 
@@ -648,8 +648,10 @@ static int __spi_add_device(struct spi_device *spi)
return -ENODEV;
}
 
-   if (ctlr->cs_gpiods)
-   spi_set_csgpiod(spi, 0, ctlr->cs_gpiods[spi_get_chipselect(spi, 0)]);
+   if (ctlr->cs_gpiods) {
+   for (idx = 0; idx < SPI_CS_CNT_MAX; idx++)
+   spi_set_csgpiod(spi, idx, ctlr->cs_gpiods[spi_get_chipselect(spi, idx)]);
+   }
 
/*
 * Drivers may modify this initial i/o setup, but will
@@ -689,13 +691,15 @@ int spi_add_device(struct spi_device *spi)
 {
struct spi_controller *ctlr = spi->controller;
struct device *dev = ctlr->dev.parent;
-   int status;
+   int status, idx;
 
-   /* Chipselects are numbered 0..max; validate. */
-   if (spi_get_chipselect(spi, 0) >= ctlr->num_chipselect) {
-   dev_err(dev, "cs%d >= max %d\n", spi_get_chipselect(spi, 0),
-   ctlr->num_chipselect);
-   return -EINVAL;
+   for (idx = 0; idx < SPI_CS_CNT_MAX; idx++) {
+   /* Chipselects are numbered 0..max; validate. */
+   if (spi_get_chipselect(spi, idx) >= ctlr->num_chipselect) {
+   dev_err(dev, "cs%d >= max %d\n", spi_get_chipselect(spi, idx),
+   ctlr->num_chipselect);
+   return -EINVAL;
+   }
}
 
/* Set the bus ID string */
@@ -712,12 +716,15 @@ static int spi_add_device_locked(struct spi_device *spi)
 {
struct spi_controller *ctlr = spi->controller;
struct device *dev = ctlr->dev.parent;
+   int idx;
 
-   /* Chipselects are numbered 0..max; validate. */
-   if (spi_get_chipselect(spi, 0) >= ctlr->num_chipselect) {
-   dev_err(dev, "cs%d >= max %d\n", spi_get_chipselect(spi, 0),
-   ctlr->num_chipselect);
-   return -EINVAL;
+   for (idx = 0; idx < SPI_CS_CNT_MAX; idx++) {
+   /* Chipselects are numbered 0..max; validate. */
+   if (spi_get_chipselect(spi, idx) >= ctlr->num_chipselect) {
+   dev_err(dev, "cs%d >= max %d\n", spi_get_chipselect(spi, idx),
+   ctlr->num_chipselect);
+   return -EINVAL;
+   }
 

[PATCH V5 08/15] ALSA: hda: cs35l41: Replace all spi->chip_select references with function call

2023-03-06 Thread Amit Kumar Mahapatra
Supporting multi-cs in spi drivers would require the chip_select & cs_gpiod
members of struct spi_device to be arrays. But changing the type of these
members to arrays would break the functionality of existing spi drivers. To
make the transition smoother, introduce four new APIs to get/set
spi->chip_select & spi->cs_gpiod, and replace all spi->chip_select and
spi->cs_gpiod references with get or set API calls.
When multi-cs support is added in later patches, the chip_select & cs_gpiod
members of the spi_device structure will be converted to arrays, and the
"idx" parameter of the APIs will be used as the array index, i.e.,
spi->chip_select[idx] & spi->cs_gpiod[idx] respectively.

Signed-off-by: Amit Kumar Mahapatra 
---
 sound/pci/hda/cs35l41_hda_spi.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/sound/pci/hda/cs35l41_hda_spi.c b/sound/pci/hda/cs35l41_hda_spi.c
index 71979cfb4d7e..eb287aa5f782 100644
--- a/sound/pci/hda/cs35l41_hda_spi.c
+++ b/sound/pci/hda/cs35l41_hda_spi.c
@@ -25,7 +25,7 @@ static int cs35l41_hda_spi_probe(struct spi_device *spi)
else
return -ENODEV;
 
-   return cs35l41_hda_probe(>dev, device_name, spi->chip_select, spi->irq,
+   return cs35l41_hda_probe(>dev, device_name, spi_get_chipselect(spi, 0), spi->irq,
 devm_regmap_init_spi(spi, _regmap_spi));
 }
 
-- 
2.25.1



[PATCH V5 07/15] powerpc/83xx/mpc832x_rdb: Replace all spi->chip_select references with function call

2023-03-06 Thread Amit Kumar Mahapatra
Supporting multi-cs in spi drivers would require the chip_select & cs_gpiod
members of struct spi_device to be arrays. But changing the type of these
members to arrays would break the functionality of existing spi drivers. To
make the transition smoother, introduce four new APIs to get/set
spi->chip_select & spi->cs_gpiod, and replace all spi->chip_select and
spi->cs_gpiod references with get or set API calls.
When multi-cs support is added in later patches, the chip_select & cs_gpiod
members of the spi_device structure will be converted to arrays, and the
"idx" parameter of the APIs will be used as the array index, i.e.,
spi->chip_select[idx] & spi->cs_gpiod[idx] respectively.

Signed-off-by: Amit Kumar Mahapatra 
---
 arch/powerpc/platforms/83xx/mpc832x_rdb.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/platforms/83xx/mpc832x_rdb.c b/arch/powerpc/platforms/83xx/mpc832x_rdb.c
index caa96edf0e72..4ab1d48cd229 100644
--- a/arch/powerpc/platforms/83xx/mpc832x_rdb.c
+++ b/arch/powerpc/platforms/83xx/mpc832x_rdb.c
@@ -144,7 +144,7 @@ static int __init fsl_spi_init(struct spi_board_info *board_infos,
 
 static void mpc83xx_spi_cs_control(struct spi_device *spi, bool on)
 {
-   pr_debug("%s %d %d\n", __func__, spi->chip_select, on);
+   pr_debug("%s %d %d\n", __func__, spi_get_chipselect(spi, 0), on);
par_io_data_set(3, 13, on);
 }
 
-- 
2.25.1



[PATCH V5 06/15] platform/x86: serial-multi-instantiate: Replace all spi->chip_select and spi->cs_gpiod references with function call

2023-03-06 Thread Amit Kumar Mahapatra
Supporting multi-cs in spi drivers would require the chip_select & cs_gpiod
members of struct spi_device to be arrays. But changing the type of these
members to arrays would break the functionality of existing spi drivers. To
make the transition smoother, introduce four new APIs to get/set
spi->chip_select & spi->cs_gpiod, and replace all spi->chip_select and
spi->cs_gpiod references with get or set API calls.
When multi-cs support is added in later patches, the chip_select & cs_gpiod
members of the spi_device structure will be converted to arrays, and the
"idx" parameter of the APIs will be used as the array index, i.e.,
spi->chip_select[idx] & spi->cs_gpiod[idx] respectively.

Signed-off-by: Amit Kumar Mahapatra 
Reviewed-by: Michal Simek 
---
 drivers/platform/x86/serial-multi-instantiate.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/platform/x86/serial-multi-instantiate.c b/drivers/platform/x86/serial-multi-instantiate.c
index 5362f1a7b77c..270a4700d25d 100644
--- a/drivers/platform/x86/serial-multi-instantiate.c
+++ b/drivers/platform/x86/serial-multi-instantiate.c
@@ -139,7 +139,8 @@ static int smi_spi_probe(struct platform_device *pdev, struct smi *smi,
goto error;
}
 
-   dev_dbg(dev, "SPI device %s using chip select %u", name, spi_dev->chip_select);
+   dev_dbg(dev, "SPI device %s using chip select %u", name,
+   spi_get_chipselect(spi_dev, 0));
 
smi->spi_devs[i] = spi_dev;
smi->spi_num++;
-- 
2.25.1



[PATCH V5 05/15] staging: Replace all spi->chip_select and spi->cs_gpiod references with function call

2023-03-06 Thread Amit Kumar Mahapatra
Supporting multi-cs in spi drivers would require the chip_select & cs_gpiod
members of struct spi_device to be arrays. But changing the type of these
members to arrays would break the functionality of existing spi drivers. To
make the transition smoother, introduce four new APIs to get/set
spi->chip_select & spi->cs_gpiod, and replace all spi->chip_select and
spi->cs_gpiod references with get or set API calls.
When multi-cs support is added in later patches, the chip_select & cs_gpiod
members of the spi_device structure will be converted to arrays, and the
"idx" parameter of the APIs will be used as the array index, i.e.,
spi->chip_select[idx] & spi->cs_gpiod[idx] respectively.

Signed-off-by: Amit Kumar Mahapatra 
Reviewed-by: Greg Kroah-Hartman 
Reviewed-by: Michal Simek 
---
 drivers/staging/fbtft/fbtft-core.c | 2 +-
 drivers/staging/greybus/spilib.c   | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/staging/fbtft/fbtft-core.c b/drivers/staging/fbtft/fbtft-core.c
index afaba94d1d1c..3a4abf3bae40 100644
--- a/drivers/staging/fbtft/fbtft-core.c
+++ b/drivers/staging/fbtft/fbtft-core.c
@@ -840,7 +840,7 @@ int fbtft_register_framebuffer(struct fb_info *fb_info)
sprintf(text1, ", %zu KiB buffer memory", par->txbuf.len >> 10);
if (spi)
sprintf(text2, ", spi%d.%d at %d MHz", spi->master->bus_num,
-   spi->chip_select, spi->max_speed_hz / 1000000);
+   spi_get_chipselect(spi, 0), spi->max_speed_hz / 1000000);
dev_info(fb_info->dev,
 "%s frame buffer, %dx%d, %d KiB video memory%s, fps=%lu%s\n",
 fb_info->fix.id, fb_info->var.xres, fb_info->var.yres,
diff --git a/drivers/staging/greybus/spilib.c b/drivers/staging/greybus/spilib.c
index ad0700a0bb81..efb3bec58e15 100644
--- a/drivers/staging/greybus/spilib.c
+++ b/drivers/staging/greybus/spilib.c
@@ -237,7 +237,7 @@ static struct gb_operation *gb_spi_operation_create(struct gb_spilib *spi,
request = operation->request->payload;
request->count = cpu_to_le16(count);
request->mode = dev->mode;
-   request->chip_select = dev->chip_select;
+   request->chip_select = spi_get_chipselect(dev, 0);
 
gb_xfer = >transfers[0];
tx_data = gb_xfer + count;  /* place tx data after last gb_xfer */
-- 
2.25.1



[PATCH V5 04/15] mtd: devices: Replace all spi->chip_select and spi->cs_gpiod references with function call

2023-03-06 Thread Amit Kumar Mahapatra
Supporting multi-cs in spi drivers would require the chip_select & cs_gpiod
members of struct spi_device to be arrays. But changing the type of these
members to arrays would break the functionality of existing spi drivers. To
make the transition smoother, introduce four new APIs to get/set
spi->chip_select & spi->cs_gpiod, and replace all spi->chip_select and
spi->cs_gpiod references with get or set API calls.
When multi-cs support is added in later patches, the chip_select & cs_gpiod
members of the spi_device structure will be converted to arrays, and the
"idx" parameter of the APIs will be used as the array index, i.e.,
spi->chip_select[idx] & spi->cs_gpiod[idx] respectively.

Signed-off-by: Amit Kumar Mahapatra 
Reviewed-by: Michal Simek 
---
 drivers/mtd/devices/mtd_dataflash.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/mtd/devices/mtd_dataflash.c b/drivers/mtd/devices/mtd_dataflash.c
index 25bad4318305..34d7a0c4807b 100644
--- a/drivers/mtd/devices/mtd_dataflash.c
+++ b/drivers/mtd/devices/mtd_dataflash.c
@@ -646,7 +646,7 @@ static int add_dataflash_otp(struct spi_device *spi, char *name, int nr_pages,
 
/* name must be usable with cmdlinepart */
sprintf(priv->name, "spi%d.%d-%s",
-   spi->master->bus_num, spi->chip_select,
+   spi->master->bus_num, spi_get_chipselect(spi, 0),
name);
 
device = >mtd;
-- 
2.25.1



[PATCH V5 03/15] iio: imu: Replace all spi->chip_select and spi->cs_gpiod references with function call

2023-03-06 Thread Amit Kumar Mahapatra
Supporting multi-cs in spi drivers would require the chip_select & cs_gpiod
members of struct spi_device to be arrays. But changing the type of these
members to arrays would break the functionality of existing spi drivers. To
make the transition smoother, introduce four new APIs to get/set
spi->chip_select & spi->cs_gpiod, and replace all spi->chip_select and
spi->cs_gpiod references with get or set API calls.
When multi-cs support is added in later patches, the chip_select & cs_gpiod
members of the spi_device structure will be converted to arrays, and the
"idx" parameter of the APIs will be used as the array index, i.e.,
spi->chip_select[idx] & spi->cs_gpiod[idx] respectively.

Signed-off-by: Amit Kumar Mahapatra 
Acked-by: Jonathan Cameron 
Reviewed-by: Michal Simek 
---
 drivers/iio/imu/adis16400.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/iio/imu/adis16400.c b/drivers/iio/imu/adis16400.c
index c02fc35dceb4..3eda32e12a53 100644
--- a/drivers/iio/imu/adis16400.c
+++ b/drivers/iio/imu/adis16400.c
@@ -466,7 +466,7 @@ static int adis16400_initial_setup(struct iio_dev *indio_dev)
 
dev_info(_dev->dev, "%s: prod_id 0x%04x at CS%d (irq %d)\n",
indio_dev->name, prod_id,
-   st->adis.spi->chip_select, st->adis.spi->irq);
+   spi_get_chipselect(st->adis.spi, 0), st->adis.spi->irq);
}
/* use high spi speed if possible */
if (st->variant->flags & ADIS16400_HAS_SLOW_MODE) {
-- 
2.25.1



[PATCH V5 02/15] net: Replace all spi->chip_select and spi->cs_gpiod references with function call

2023-03-06 Thread Amit Kumar Mahapatra
Supporting multi-cs in spi drivers would require the chip_select & cs_gpiod
members of struct spi_device to be arrays. But changing the type of these
members to arrays would break the functionality of existing spi drivers. To
make the transition smoother, introduce four new APIs to get/set
spi->chip_select & spi->cs_gpiod, and replace all spi->chip_select and
spi->cs_gpiod references with get or set API calls.
When multi-cs support is added in later patches, the chip_select & cs_gpiod
members of the spi_device structure will be converted to arrays, and the
"idx" parameter of the APIs will be used as the array index, i.e.,
spi->chip_select[idx] & spi->cs_gpiod[idx] respectively.

Signed-off-by: Amit Kumar Mahapatra 
Reviewed-by: Michal Simek 
---
 drivers/net/ethernet/adi/adin1110.c| 2 +-
 drivers/net/ethernet/asix/ax88796c_main.c  | 2 +-
 drivers/net/ethernet/davicom/dm9051.c  | 2 +-
 drivers/net/ethernet/qualcomm/qca_debug.c  | 2 +-
 drivers/net/ieee802154/ca8210.c| 2 +-
 drivers/net/wan/slic_ds26522.c | 2 +-
 drivers/net/wireless/marvell/libertas/if_spi.c | 2 +-
 drivers/net/wireless/silabs/wfx/bus_spi.c  | 2 +-
 drivers/net/wireless/st/cw1200/cw1200_spi.c| 2 +-
 9 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/drivers/net/ethernet/adi/adin1110.c b/drivers/net/ethernet/adi/adin1110.c
index 0805f249fff2..aee7a98725ba 100644
--- a/drivers/net/ethernet/adi/adin1110.c
+++ b/drivers/net/ethernet/adi/adin1110.c
@@ -515,7 +515,7 @@ static int adin1110_register_mdiobus(struct adin1110_priv *priv,
return -ENOMEM;
 
snprintf(priv->mii_bus_name, MII_BUS_ID_SIZE, "%s-%u",
-priv->cfg->name, priv->spidev->chip_select);
+priv->cfg->name, spi_get_chipselect(priv->spidev, 0));
 
mii_bus->name = priv->mii_bus_name;
mii_bus->read = adin1110_mdio_read;
diff --git a/drivers/net/ethernet/asix/ax88796c_main.c b/drivers/net/ethernet/asix/ax88796c_main.c
index 21376c79f671..e551ffaed20d 100644
--- a/drivers/net/ethernet/asix/ax88796c_main.c
+++ b/drivers/net/ethernet/asix/ax88796c_main.c
@@ -1006,7 +1006,7 @@ static int ax88796c_probe(struct spi_device *spi)
ax_local->mdiobus->parent = >dev;
 
snprintf(ax_local->mdiobus->id, MII_BUS_ID_SIZE,
-"ax88796c-%s.%u", dev_name(>dev), spi->chip_select);
+"ax88796c-%s.%u", dev_name(>dev), spi_get_chipselect(spi, 0));
 
ret = devm_mdiobus_register(>dev, ax_local->mdiobus);
if (ret < 0) {
diff --git a/drivers/net/ethernet/davicom/dm9051.c b/drivers/net/ethernet/davicom/dm9051.c
index de7105a84747..70728b2e5f18 100644
--- a/drivers/net/ethernet/davicom/dm9051.c
+++ b/drivers/net/ethernet/davicom/dm9051.c
@@ -1123,7 +1123,7 @@ static int dm9051_mdio_register(struct board_info *db)
db->mdiobus->phy_mask = (u32)~BIT(1);
db->mdiobus->parent = >dev;
snprintf(db->mdiobus->id, MII_BUS_ID_SIZE,
-"dm9051-%s.%u", dev_name(>dev), spi->chip_select);
+"dm9051-%s.%u", dev_name(>dev), spi_get_chipselect(spi, 0));
 
ret = devm_mdiobus_register(>dev, db->mdiobus);
if (ret)
diff --git a/drivers/net/ethernet/qualcomm/qca_debug.c b/drivers/net/ethernet/qualcomm/qca_debug.c
index f62c39544e08..6f2fa2a42770 100644
--- a/drivers/net/ethernet/qualcomm/qca_debug.c
+++ b/drivers/net/ethernet/qualcomm/qca_debug.c
@@ -119,7 +119,7 @@ qcaspi_info_show(struct seq_file *s, void *what)
seq_printf(s, "SPI mode : %x\n",
   qca->spi_dev->mode);
seq_printf(s, "SPI chip select  : %u\n",
-  (unsigned int)qca->spi_dev->chip_select);
+  (unsigned int)spi_get_chipselect(qca->spi_dev, 0));
seq_printf(s, "SPI legacy mode  : %u\n",
   (unsigned int)qca->legacy_mode);
seq_printf(s, "SPI burst length : %u\n",
diff --git a/drivers/net/ieee802154/ca8210.c b/drivers/net/ieee802154/ca8210.c
index e1a569b99e4a..7093a07141bb 100644
--- a/drivers/net/ieee802154/ca8210.c
+++ b/drivers/net/ieee802154/ca8210.c
@@ -2967,7 +2967,7 @@ static int ca8210_test_interface_init(struct ca8210_priv *priv)
sizeof(node_name),
"ca8210@%d_%d",
priv->spi->master->bus_num,
-   priv->spi->chip_select
+   spi_get_chipselect(priv->spi, 0)
);
 
test->ca8210_dfs_spi_int = debugfs_create_file(
diff --git a/drivers/net/wan/slic_ds26522.c b/drivers/net/wan/slic_ds26522.c
index 6063552cea9b..8a51cfcff99e 100644
--- a/drivers/net/wan/slic_ds26522.c
+++ b/drivers/net/wan/slic_ds26522.c
@@ -211,7 +211,7 @@ static int slic_ds26522_probe(struct spi_device *spi)
 
ret = slic_ds26522_init_configure(spi);
if (ret == 0)
-   pr_info("DS26522 cs%d configured\n", spi->chip_select);
+   pr_info("DS26522 cs%d configured\n", spi_get_chipselect(spi, 0));
 
return ret;
 }

[PATCH V5 01/15] spi: Replace all spi->chip_select and spi->cs_gpiod references with function call

2023-03-06 Thread Amit Kumar Mahapatra
Supporting multi-cs in spi drivers would require the chip_select & cs_gpiod
members of struct spi_device to be an array. But changing the type of these
members to array would break the spi driver functionality. To make the
transition smoother, introduce four new APIs to get/set
spi->chip_select & spi->cs_gpiod, and replace all spi->chip_select and
spi->cs_gpiod references with get or set API calls.
While adding multi-cs support in further patches, the chip_select & cs_gpiod
members of the spi_device structure will be converted to arrays & the
"idx" parameter of the APIs will be used as the array index, i.e.,
spi->chip_select[idx] & spi->cs_gpiod[idx] respectively.
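
As an illustration only, the accessor pattern described above can be sketched
in user space. The struct and the mock_* names below are hypothetical
stand-ins, not the kernel definitions; the assert on idx is an addition for
the sketch and is not claimed to exist in the real helpers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for struct spi_device; only the field of interest. */
struct mock_spi_device {
	uint8_t chip_select;	/* a later patch would turn this into an array */
};

/* Same shape as a get accessor taking (spi, idx); idx is unused for now. */
static inline uint8_t mock_get_chipselect(const struct mock_spi_device *spi,
					  uint8_t idx)
{
	assert(idx == 0);	/* sketch-only check: a single chip select exists */
	return spi->chip_select;
}

/* Matching set accessor; callers never touch the member directly. */
static inline void mock_set_chipselect(struct mock_spi_device *spi,
				       uint8_t idx, uint8_t cs)
{
	assert(idx == 0);	/* sketch-only check, see above */
	spi->chip_select = cs;
}
```

Because callers go through the accessors, converting the member to an array
later only requires changing the two function bodies, not every driver.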

Signed-off-by: Amit Kumar Mahapatra 
Acked-by: Heiko Stuebner  # Rockchip drivers
Reviewed-by: Michal Simek 
Reviewed-by: Cédric Le Goater  # Aspeed driver
Reviewed-by: Dhruva Gole  # SPI Cadence QSPI
Reviewed-by: Patrice Chotard  # spi-stm32-qspi
Acked-by: William Zhang  # bcm63xx-hsspi driver
Reviewed-by: Serge Semin  # DW SSI part
---
 drivers/spi/spi-altera-core.c |  2 +-
 drivers/spi/spi-amd.c |  4 ++--
 drivers/spi/spi-ar934x.c  |  2 +-
 drivers/spi/spi-armada-3700.c |  4 ++--
 drivers/spi/spi-aspeed-smc.c  | 13 +++--
 drivers/spi/spi-at91-usart.c  |  2 +-
 drivers/spi/spi-ath79.c   |  4 ++--
 drivers/spi/spi-atmel.c   | 26 +-
 drivers/spi/spi-au1550.c  |  4 ++--
 drivers/spi/spi-axi-spi-engine.c  |  2 +-
 drivers/spi/spi-bcm-qspi.c| 10 +-
 drivers/spi/spi-bcm2835.c | 19 ++-
 drivers/spi/spi-bcm2835aux.c  |  4 ++--
 drivers/spi/spi-bcm63xx-hsspi.c   | 30 +++---
 drivers/spi/spi-bcm63xx.c |  2 +-
 drivers/spi/spi-bcmbca-hsspi.c| 30 +++---
 drivers/spi/spi-cadence-quadspi.c |  5 +++--
 drivers/spi/spi-cadence-xspi.c|  4 ++--
 drivers/spi/spi-cadence.c |  4 ++--
 drivers/spi/spi-cavium.c  |  8 
 drivers/spi/spi-coldfire-qspi.c   |  8 
 drivers/spi/spi-davinci.c | 18 +-
 drivers/spi/spi-dln2.c|  6 +++---
 drivers/spi/spi-dw-core.c |  2 +-
 drivers/spi/spi-dw-mmio.c |  4 ++--
 drivers/spi/spi-falcon.c  |  2 +-
 drivers/spi/spi-fsi.c |  2 +-
 drivers/spi/spi-fsl-dspi.c| 16 
 drivers/spi/spi-fsl-espi.c|  6 +++---
 drivers/spi/spi-fsl-lpspi.c   |  2 +-
 drivers/spi/spi-fsl-qspi.c|  6 +++---
 drivers/spi/spi-fsl-spi.c |  2 +-
 drivers/spi/spi-geni-qcom.c   |  6 +++---
 drivers/spi/spi-gpio.c|  4 ++--
 drivers/spi/spi-gxp.c |  4 ++--
 drivers/spi/spi-hisi-sfc-v3xx.c   |  2 +-
 drivers/spi/spi-img-spfi.c| 14 +++---
 drivers/spi/spi-imx.c | 30 +++---
 drivers/spi/spi-ingenic.c |  4 ++--
 drivers/spi/spi-intel.c   |  2 +-
 drivers/spi/spi-jcore.c   |  4 ++--
 drivers/spi/spi-lantiq-ssc.c  |  6 +++---
 drivers/spi/spi-mem.c |  4 ++--
 drivers/spi/spi-meson-spicc.c |  2 +-
 drivers/spi/spi-microchip-core.c  |  6 +++---
 drivers/spi/spi-mpc512x-psc.c |  8 
 drivers/spi/spi-mpc52xx.c |  2 +-
 drivers/spi/spi-mt65xx.c  |  6 +++---
 drivers/spi/spi-mt7621.c  |  2 +-
 drivers/spi/spi-mux.c |  8 
 drivers/spi/spi-mxic.c| 10 +-
 drivers/spi/spi-mxs.c |  2 +-
 drivers/spi/spi-npcm-fiu.c| 20 ++--
 drivers/spi/spi-nxp-fspi.c| 10 +-
 drivers/spi/spi-omap-100k.c   |  2 +-
 drivers/spi/spi-omap-uwire.c  |  8 
 drivers/spi/spi-omap2-mcspi.c | 24 
 drivers/spi/spi-orion.c   |  4 ++--
 drivers/spi/spi-pci1.c|  4 ++--
 drivers/spi/spi-pic32-sqi.c   |  2 +-
 drivers/spi/spi-pic32.c   |  4 ++--
 drivers/spi/spi-pl022.c   |  4 ++--
 drivers/spi/spi-pxa2xx.c  |  6 +++---
 drivers/spi/spi-qcom-qspi.c   |  2 +-
 drivers/spi/spi-rb4xx.c   |  2 +-
 drivers/spi/spi-rockchip-sfc.c|  2 +-
 drivers/spi/spi-rockchip.c| 26 ++
 drivers/spi/spi-rspi.c| 10 +-
 drivers/spi/spi-s3c64xx.c |  2 +-
 drivers/spi/spi-sc18is602.c   |  4 ++--
 drivers/spi/spi-sh-msiof.c|  6 +++---
 drivers/spi/spi-sh-sci.c  |  2 +-
 drivers/spi/spi-sifive.c  |  6 +++---
 drivers/spi/spi-sn-f-ospi.c   |  2 +-
 drivers/spi/spi-st-ssc4.c |  2 +-
 drivers/spi/spi-stm32-qspi.c  | 12 ++--
 drivers/spi/spi-sun4i.c   |  2 +-
 drivers/spi/spi-sun6i.c   |  2 +-
 drivers/spi/spi-synquacer.c   |  6 +++---
 drivers/spi/spi-tegra114.c| 28 ++--
 drivers/spi/spi-tegra20-sflash.c  |  2 +-
 drivers/spi/spi-tegra20-slink.c   |  6 +++---
 

[PATCH V5 00/15] spi: Add support for stacked/parallel memories

2023-03-06 Thread Amit Kumar Mahapatra
This series continues the discussions which happened on
'commit f89504300e94 ("spi: Stacked/parallel memories bindings")' for
adding dt-binding support for stacked/parallel memories.

This patch series updates the spi-nor, spi core and the spi drivers
to add stacked and parallel memories support.

The first patch
https://lore.kernel.org/all/20230119185342.2093323-1-amit.kumar-mahapa...@amd.com/
of the previous series got applied to
https://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi.git for-next
But the rest of the patches in the series did not get applied due to a merge
conflict, so the remaining patches are resent here after rebasing them
on top of the for-next branch.
---
BRANCH: for-next

Changes in v5:
- Rebased the patches on top of v6.3-rc1 and fixed the merge conflicts.
- Fixed compilation warnings in spi-sh-msiof.c with shmobile_defconfig  

Changes in v4:
- Fixed build error in spi-pl022.c file - reported by Mark.
- Fixed build error in spi-sn-f-ospi.c file.
- Added Reviewed-by: Serge Semin  tag.
- Added two more patches to replace spi->chip_select with API calls in
  mpc832x_rdb.c & cs35l41_hda_spi.c files.

Changes in v3:
- Rebased the patches on top of
  https://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi.git for-next
- Added a patch to convert spi_nor_otp_region_len(nor) &
  spi_nor_otp_n_regions(nor) macros into inline functions
- Added Reviewed-by & Acked-by tags

Changes in v2:
- Rebased the patches on top of v6.2-rc1
- Created separate patch to add get & set APIs for spi->chip_select &
  spi->cs_gpiod, and replaced all spi->chip_select and spi->cs_gpiod
  references with the API calls.
- Created separate patch to add get & set APIs for nor->params.
---
Amit Kumar Mahapatra (15):
  spi: Replace all spi->chip_select and spi->cs_gpiod references with
function call
  net: Replace all spi->chip_select and spi->cs_gpiod references with
function call
  iio: imu: Replace all spi->chip_select and spi->cs_gpiod references
with function call
  mtd: devices: Replace all spi->chip_select and spi->cs_gpiod
references with function call
  staging: Replace all spi->chip_select and spi->cs_gpiod references
with function call
  platform/x86: serial-multi-instantiate: Replace all spi->chip_select
and spi->cs_gpiod references with function call
  powerpc/83xx/mpc832x_rdb: Replace all spi->chip_select references with
function call
  ALSA: hda: cs35l41: Replace all spi->chip_select references with
function call
  spi: Add stacked and parallel memories support in SPI core
  mtd: spi-nor: Convert macros with inline functions
  mtd: spi-nor: Add APIs to set/get nor->params
  mtd: spi-nor: Add stacked memories support in spi-nor
  spi: spi-zynqmp-gqspi: Add stacked memories support in GQSPI driver
  mtd: spi-nor: Add parallel memories support in spi-nor
  spi: spi-zynqmp-gqspi: Add parallel memories support in GQSPI driver

 arch/powerpc/platforms/83xx/mpc832x_rdb.c |   2 +-
 drivers/iio/imu/adis16400.c   |   2 +-
 drivers/mtd/devices/mtd_dataflash.c   |   2 +-
 drivers/mtd/spi-nor/atmel.c   |  17 +-
 drivers/mtd/spi-nor/core.c| 665 +++---
 drivers/mtd/spi-nor/core.h|   8 +
 drivers/mtd/spi-nor/debugfs.c |   4 +-
 drivers/mtd/spi-nor/gigadevice.c  |   4 +-
 drivers/mtd/spi-nor/issi.c|  11 +-
 drivers/mtd/spi-nor/macronix.c|   6 +-
 drivers/mtd/spi-nor/micron-st.c   |  39 +-
 drivers/mtd/spi-nor/otp.c |  48 +-
 drivers/mtd/spi-nor/sfdp.c|  29 +-
 drivers/mtd/spi-nor/spansion.c|  50 +-
 drivers/mtd/spi-nor/sst.c |   7 +-
 drivers/mtd/spi-nor/swp.c |  22 +-
 drivers/mtd/spi-nor/winbond.c |  10 +-
 drivers/mtd/spi-nor/xilinx.c  |  18 +-
 drivers/net/ethernet/adi/adin1110.c   |   2 +-
 drivers/net/ethernet/asix/ax88796c_main.c |   2 +-
 drivers/net/ethernet/davicom/dm9051.c |   2 +-
 drivers/net/ethernet/qualcomm/qca_debug.c |   2 +-
 drivers/net/ieee802154/ca8210.c   |   2 +-
 drivers/net/wan/slic_ds26522.c|   2 +-
 .../net/wireless/marvell/libertas/if_spi.c|   2 +-
 drivers/net/wireless/silabs/wfx/bus_spi.c |   2 +-
 drivers/net/wireless/st/cw1200/cw1200_spi.c   |   2 +-
 .../platform/x86/serial-multi-instantiate.c   |   3 +-
 drivers/spi/spi-altera-core.c |   2 +-
 drivers/spi/spi-amd.c |   4 +-
 drivers/spi/spi-ar934x.c  |   2 +-
 drivers/spi/spi-armada-3700.c |   4 +-
 drivers/spi/spi-aspeed-smc.c  |  13 +-
 drivers/spi/spi-at91-usart.c  |   2 +-
 drivers/spi/spi-ath79.c   |   4 +-
 drivers/spi/spi-atmel.c   |  26 +-
 drivers/spi/spi-au1550.c 

[PATCH 6/8] powerpc/rtas: lockdep annotations

2023-03-06 Thread Nathan Lynch via B4 Relay
From: Nathan Lynch 

Add lockdep annotations for the following properties that must hold:

* Any error log retrieval must be atomically coupled with the prior
  RTAS call, without a window for another RTAS call to occur before the
  error log can be retrieved.

* All users of the core rtas_args parameter block must hold rtas_lock.

Move the definitions of rtas_lock and rtas_args up in the file so that
__do_enter_rtas_trace() can refer to them.

Signed-off-by: Nathan Lynch 
---
 arch/powerpc/kernel/rtas.c | 33 +++--
 1 file changed, 23 insertions(+), 10 deletions(-)

diff --git a/arch/powerpc/kernel/rtas.c b/arch/powerpc/kernel/rtas.c
index 96a10a0abe3a..633c925164e7 100644
--- a/arch/powerpc/kernel/rtas.c
+++ b/arch/powerpc/kernel/rtas.c
@@ -16,6 +16,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -453,6 +454,16 @@ static struct rtas_function rtas_function_table[] 
__ro_after_init = {
},
 };
 
+/*
+ * Nearly all RTAS calls need to be serialized. All uses of the
+ * default rtas_args block must hold rtas_lock.
+ *
+ * Exceptions to the RTAS serialization requirement (e.g. stop-self)
+ * must use a separate rtas_args structure.
+ */
+static DEFINE_RAW_SPINLOCK(rtas_lock);
+static struct rtas_args rtas_args;
+
 /**
  * rtas_function_token() - RTAS function token lookup.
  * @handle: Function handle, e.g. RTAS_FN_EVENT_SCAN.
@@ -560,6 +571,9 @@ static void __do_enter_rtas(struct rtas_args *args)
 static void __do_enter_rtas_trace(struct rtas_args *args)
 {
const char *name = NULL;
+
+   if (args == &rtas_args)
+   lockdep_assert_held(&rtas_lock);
/*
 * If the tracepoints that consume the function name aren't
 * active, avoid the lookup.
@@ -619,16 +633,6 @@ static void do_enter_rtas(struct rtas_args *args)
 
 struct rtas_t rtas;
 
-/*
- * Nearly all RTAS calls need to be serialized. All uses of the
- * default rtas_args block must hold rtas_lock.
- *
- * Exceptions to the RTAS serialization requirement (e.g. stop-self)
- * must use a separate rtas_args structure.
- */
-static DEFINE_RAW_SPINLOCK(rtas_lock);
-static struct rtas_args rtas_args;
-
 DEFINE_SPINLOCK(rtas_data_buf_lock);
 EXPORT_SYMBOL_GPL(rtas_data_buf_lock);
 
@@ -951,6 +955,8 @@ static char *__fetch_rtas_last_error(char *altbuf)
u32 bufsz;
char *buf = NULL;
 
+   lockdep_assert_held(&rtas_lock);
+
if (token == -1)
return NULL;
 
@@ -1107,6 +1113,7 @@ static bool token_is_restricted_errinjct(s32 token)
  */
 int rtas_call(int token, int nargs, int nret, int *outputs, ...)
 {
+   struct pin_cookie cookie;
va_list list;
int i;
unsigned long flags;
@@ -1133,6 +1140,8 @@ int rtas_call(int token, int nargs, int nret, int 
*outputs, ...)
}
 
raw_spin_lock_irqsave(&rtas_lock, flags);
+   cookie = lockdep_pin_lock(&rtas_lock);
+
/* We use the global rtas args buffer */
args = &rtas_args;
 
@@ -1150,6 +1159,7 @@ int rtas_call(int token, int nargs, int nret, int 
*outputs, ...)
outputs[i] = be32_to_cpu(args->rets[i + 1]);
ret = (nret > 0) ? be32_to_cpu(args->rets[0]) : 0;
 
+   lockdep_unpin_lock(&rtas_lock, cookie);
raw_spin_unlock_irqrestore(&rtas_lock, flags);
 
if (buff_copy) {
@@ -1781,6 +1791,7 @@ static bool block_rtas_call(int token, int nargs,
 /* We assume to be passed big endian arguments */
 SYSCALL_DEFINE1(rtas, struct rtas_args __user *, uargs)
 {
+   struct pin_cookie cookie;
struct rtas_args args;
unsigned long flags;
char *buff_copy, *errbuf = NULL;
@@ -1849,6 +1860,7 @@ SYSCALL_DEFINE1(rtas, struct rtas_args __user *, uargs)
buff_copy = get_errorlog_buffer();
 
raw_spin_lock_irqsave(&rtas_lock, flags);
+   cookie = lockdep_pin_lock(&rtas_lock);
 
rtas_args = args;
do_enter_rtas(&rtas_args);
@@ -1859,6 +1871,7 @@ SYSCALL_DEFINE1(rtas, struct rtas_args __user *, uargs)
if (be32_to_cpu(args.rets[0]) == -1)
errbuf = __fetch_rtas_last_error(buff_copy);
 
+   lockdep_unpin_lock(&rtas_lock, cookie);
raw_spin_unlock_irqrestore(&rtas_lock, flags);
 
if (buff_copy) {

-- 
2.39.1



[PATCH 2/8] powerpc/rtas: use memmove for potentially overlapping buffer copy

2023-03-06 Thread Nathan Lynch via B4 Relay
From: Nathan Lynch 

Using memcpy() isn't safe when buf is identical to rtas_err_buf, which
can happen during boot before slab is up. Full context which may not
be obvious from the diff:

if (altbuf) {
buf = altbuf;
} else {
buf = rtas_err_buf;
if (slab_is_available())
buf = kmalloc(RTAS_ERROR_LOG_MAX, GFP_ATOMIC);
}
if (buf)
memcpy(buf, rtas_err_buf, RTAS_ERROR_LOG_MAX);

This was found by inspection and I'm not aware of it causing problems
in practice. It appears to have been introduced by commit
033ef338b6e0 ("powerpc: Merge rtas.c into arch/powerpc/kernel"); the
old ppc64 version of this code did not have this problem.

Use memmove() instead.
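
A minimal user-space sketch of the aliasing case, for illustration. LOG_MAX,
err_buf and fetch_log() are hypothetical stand-ins for RTAS_ERROR_LOG_MAX,
rtas_err_buf and the kernel function; the kmalloc() branch is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define LOG_MAX 16	/* hypothetical size, stands in for RTAS_ERROR_LOG_MAX */

static char err_buf[LOG_MAX];	/* stands in for rtas_err_buf */

/* Copy the log into altbuf if given, otherwise fall back to err_buf itself. */
static char *fetch_log(char *altbuf)
{
	char *buf = altbuf ? altbuf : err_buf;

	/*
	 * buf may be identical to err_buf here. memcpy() is undefined for
	 * overlapping objects; memmove() is defined for that case.
	 */
	memmove(buf, err_buf, LOG_MAX);
	return buf;
}
```

Calling fetch_log(NULL) exercises the case where source and destination are
the same buffer, which is the situation described for early boot.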

Fixes: 033ef338b6e0 ("powerpc: Merge rtas.c into arch/powerpc/kernel")
Signed-off-by: Nathan Lynch 
---
 arch/powerpc/kernel/rtas.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/rtas.c b/arch/powerpc/kernel/rtas.c
index 31175b34856a..9256cfaa8b6f 100644
--- a/arch/powerpc/kernel/rtas.c
+++ b/arch/powerpc/kernel/rtas.c
@@ -981,7 +981,7 @@ static char *__fetch_rtas_last_error(char *altbuf)
buf = kmalloc(RTAS_ERROR_LOG_MAX, GFP_ATOMIC);
}
if (buf)
-   memcpy(buf, rtas_err_buf, RTAS_ERROR_LOG_MAX);
+   memmove(buf, rtas_err_buf, RTAS_ERROR_LOG_MAX);
}
 
return buf;

-- 
2.39.1



[PATCH 5/8] powerpc/rtas: rename va_rtas_call_unlocked() to va_rtas_call()

2023-03-06 Thread Nathan Lynch via B4 Relay
From: Nathan Lynch 

The function name va_rtas_call_unlocked() is confusing: it may be
called with or without rtas_lock held. Rename it to va_rtas_call().

Signed-off-by: Nathan Lynch 
---
 arch/powerpc/kernel/rtas.c | 9 -
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/kernel/rtas.c b/arch/powerpc/kernel/rtas.c
index c29c38b1a55a..96a10a0abe3a 100644
--- a/arch/powerpc/kernel/rtas.c
+++ b/arch/powerpc/kernel/rtas.c
@@ -996,9 +996,8 @@ static void __init init_error_log_max(void) {}
 #endif
 
 
-static void
-va_rtas_call_unlocked(struct rtas_args *args, int token, int nargs, int nret,
- va_list list)
+static void va_rtas_call(struct rtas_args *args, int token, int nargs, int 
nret,
+va_list list)
 {
int i;
 
@@ -1038,7 +1037,7 @@ void rtas_call_unlocked(struct rtas_args *args, int 
token, int nargs, int nret,
va_list list;
 
va_start(list, nret);
-   va_rtas_call_unlocked(args, token, nargs, nret, list);
+   va_rtas_call(args, token, nargs, nret, list);
va_end(list);
 }
 
@@ -1138,7 +1137,7 @@ int rtas_call(int token, int nargs, int nret, int 
*outputs, ...)
args = &rtas_args;
 
va_start(list, outputs);
-   va_rtas_call_unlocked(args, token, nargs, nret, list);
+   va_rtas_call(args, token, nargs, nret, list);
va_end(list);
 
/* A -1 return code indicates that the last command couldn't

-- 
2.39.1



[PATCH 1/8] powerpc/rtas: ensure 8-byte alignment for struct rtas_args

2023-03-06 Thread Nathan Lynch via B4 Relay
From: Nathan Lynch 

CHRP and PAPR agree: "In order to make an RTAS call, the operating
system must construct an argument call buffer aligned on an eight byte
boundary in physically contiguous real memory [...]." (7.2.7 Calling
Mechanism and Conventions).

struct rtas_args is the type used for this argument call buffer. The
unarchitected 'rets' member happens to produce 8-byte alignment for
the struct on 64-bit targets in practice. But without an alignment
directive the structure will have only 4-byte alignment on 32-bit
targets:

  $ nm b/{before,after}/chrp32/vmlinux | grep rtas_args
  c096881c b rtas_args
  c0968820 b rtas_args

Add an alignment directive to the struct rtas_args declaration so all
instances have the alignment required by the specs. rtas-types.h no
longer refers to any spinlock types, so drop the spinlock_types.h
inclusion while we're here.
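
The effect of the directive can be sketched outside the kernel as follows.
This is an illustration under assumptions: the two structs are hypothetical
models of a 32-bit layout (every member 4 bytes wide, the pointer modelled as
a uint32_t), and the GCC/Clang attribute spelling is used in place of the
kernel's __aligned() macro.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 32-bit-style layout: natural alignment is only 4 bytes. */
struct args_plain {
	uint32_t token;
	uint32_t nargs;
	uint32_t nret;
	uint32_t args[16];
	uint32_t rets;		/* models a 4-byte pointer on a 32-bit target */
};

/* Same members, with an explicit 8-byte alignment requirement. */
struct args_aligned {
	uint32_t token;
	uint32_t nargs;
	uint32_t nret;
	uint32_t args[16];
	uint32_t rets;
} __attribute__((aligned(8)));

/* Checked at compile time, so every instance is covered. */
_Static_assert(_Alignof(struct args_aligned) == 8,
	       "argument buffer must be 8-byte aligned");
```

Putting the requirement on the type rather than on one variable means stack
and static instances alike pick it up.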

Signed-off-by: Nathan Lynch 
---
 arch/powerpc/include/asm/rtas-types.h | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/include/asm/rtas-types.h 
b/arch/powerpc/include/asm/rtas-types.h
index f2ad4a96cbc5..861145c8a021 100644
--- a/arch/powerpc/include/asm/rtas-types.h
+++ b/arch/powerpc/include/asm/rtas-types.h
@@ -2,7 +2,8 @@
 #ifndef _ASM_POWERPC_RTAS_TYPES_H
 #define _ASM_POWERPC_RTAS_TYPES_H
 
-#include 
+#include 
+#include 
 
 typedef __be32 rtas_arg_t;
 
@@ -12,7 +13,7 @@ struct rtas_args {
__be32 nret;
rtas_arg_t args[16];
rtas_arg_t *rets; /* Pointer to return values in args[]. */
-};
+} __aligned(SZ_8);
 
 struct rtas_t {
unsigned long entry;/* physical address pointer */

-- 
2.39.1



[PATCH 0/8] RTAS changes for 6.4

2023-03-06 Thread Nathan Lynch via B4 Relay
Proposed changes for the RTAS subsystem and client code.

Fixes that are subject to backporting are at the front of the queue,
followed by documentation and cleanups, with enhancements at the end.

Noteworthy changes:
* Change sys_rtas() to consume -2/990x statuses instead of returning
  them to user space.
* Lockdep annotations for invariants in rtas.c.

Signed-off-by: Nathan Lynch 
---
Nathan Lynch (8):
  powerpc/rtas: ensure 8-byte alignment for struct rtas_args
  powerpc/rtas: use memmove for potentially overlapping buffer copy
  powerpc/rtas: rtas_call_unlocked() kerneldoc
  powerpc/rtas: fix miswording in rtas_function kerneldoc
  powerpc/rtas: rename va_rtas_call_unlocked() to va_rtas_call()
  powerpc/rtas: lockdep annotations
  powerpc/rtas: warn on unsafe argument to rtas_call_unlocked()
  powerpc/rtas: consume retry statuses in sys_rtas()

 arch/powerpc/include/asm/rtas-types.h |  5 +-
 arch/powerpc/kernel/rtas.c| 92 +--
 2 files changed, 69 insertions(+), 28 deletions(-)
---
base-commit: 422fbcbf91303706823bc3babceb1df1a42112bf
change-id: 20230220-rtas-queue-for-6-4-214eb2ba1407

Best regards,
-- 
Nathan Lynch 



[PATCH 8/8] powerpc/rtas: consume retry statuses in sys_rtas()

2023-03-06 Thread Nathan Lynch via B4 Relay
From: Nathan Lynch 

The kernel can handle retrying RTAS function calls in response to
-2/990x in the sys_rtas() handler instead of relaying the intermediate
status to user space.

Justifications:

* Currently it's nondeterministic and quite variable in practice
  whether a retry status is returned for any given invocation of
  sys_rtas(). Therefore user space code cannot be expecting a retry
  result without already being broken.

* This tends to significantly reduce the total number of system calls
  issued by programs such as drmgr which make use of sys_rtas(),
  improving the experience of tracing and debugging such
  programs. This is the main motivation for me: I think this change
  will make it easier for us to characterize current sys_rtas() use
  cases as we move them to other interfaces over time.

* It reduces the number of opportunities for user space to leave
  complex operations, such as those associated with DLPAR, incomplete
  and difficult to recover.

* We can expect performance improvements for existing sys_rtas()
  users, not only because of overall reduction in the number of system
  calls issued, but also due to the better handling of -2/990x in the
  kernel. For example, librtas still sleeps for 1ms on -2, which is
  completely unnecessary.

Performance differences for PHB add and remove on a small P10 PowerVM
partition are included below. For add, elapsed time is slightly
reduced. For remove, there are more significant improvements: the
number of context switches is reduced by an order of magnitude, and
elapsed time is reduced by over half.

(- before, + after):

  Performance counter stats for 'drmgr -c phb -a -s PHB 23' (5 runs):

-          1,847.58 msec task-clock   #    0.135 CPUs utilized   ( +- 14.15% )
-            10,867      cs           #    9.800 K/sec           ( +- 14.14% )
+          1,901.15 msec task-clock   #    0.148 CPUs utilized   ( +- 14.13% )
+            10,451      cs           #    9.158 K/sec           ( +- 14.14% )

- 13.656557 +- 0.000124 seconds time elapsed  ( +-  0.00% )
+  12.88080 +- 0.00404 seconds time elapsed  ( +-  0.03% )

  Performance counter stats for 'drmgr -c phb -r -s PHB 23' (5 runs):

-          1,473.75 msec task-clock   #    0.092 CPUs utilized   ( +- 14.15% )
-             2,652      cs           #    3.000 K/sec           ( +- 14.16% )
+          1,444.55 msec task-clock   #    0.221 CPUs utilized   ( +- 14.14% )
+               104      cs           #  119.957 /sec            ( +- 14.63% )

-  15.99718 +- 0.00801 seconds time elapsed  ( +-  0.05% )
+   6.54256 +- 0.00830 seconds time elapsed  ( +-  0.13% )

Move the existing rtas_lock-guarded critical section in sys_rtas()
into a conventional rtas_busy_delay()-based loop, returning to user
space only when a final success or failure result is available.
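
The shape of such a loop, as a rough user-space sketch. All mock_* names are
hypothetical; mock_busy_delay() only classifies the status and does not
sleep, unlike a real delay helper, and the lock handling is omitted.

```c
#include <assert.h>
#include <stdbool.h>

#define STATUS_BUSY (-2)	/* the -2 busy status named in the text */

static int calls_left_busy;	/* mock state: how many more calls report busy */
static int calls_made;		/* mock state: total firmware entries so far */

/* Hypothetical firmware entry: busy a few times, then success (0). */
static int mock_fw_call(void)
{
	calls_made++;
	if (calls_left_busy > 0) {
		calls_left_busy--;
		return STATUS_BUSY;
	}
	return 0;
}

/* True for retry statuses: -2, or the 990x range taken here as 9900..9905. */
static bool mock_busy_delay(int status)
{
	return status == STATUS_BUSY || (status >= 9900 && status <= 9905);
}

/* Retry until the status is final; the caller never sees -2/990x. */
static int call_until_final(void)
{
	int status;

	do {
		status = mock_fw_call();
	} while (mock_busy_delay(status));

	return status;
}
```

With three busy responses queued, the caller makes one call_until_final()
invocation where it would otherwise have needed four round trips.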

Signed-off-by: Nathan Lynch 
---
 arch/powerpc/kernel/rtas.c | 28 
 1 file changed, 16 insertions(+), 12 deletions(-)

diff --git a/arch/powerpc/kernel/rtas.c b/arch/powerpc/kernel/rtas.c
index 47a2aa43d7d4..c330a22ccc70 100644
--- a/arch/powerpc/kernel/rtas.c
+++ b/arch/powerpc/kernel/rtas.c
@@ -1798,7 +1798,6 @@ static bool block_rtas_call(int token, int nargs,
 /* We assume to be passed big endian arguments */
 SYSCALL_DEFINE1(rtas, struct rtas_args __user *, uargs)
 {
-   struct pin_cookie cookie;
struct rtas_args args;
unsigned long flags;
char *buff_copy, *errbuf = NULL;
@@ -1866,20 +1865,25 @@ SYSCALL_DEFINE1(rtas, struct rtas_args __user *, uargs)
 
buff_copy = get_errorlog_buffer();
 
-   raw_spin_lock_irqsave(&rtas_lock, flags);
-   cookie = lockdep_pin_lock(&rtas_lock);
+   do {
+   struct pin_cookie cookie;
 
-   rtas_args = args;
-   do_enter_rtas(&rtas_args);
-   args = rtas_args;
+   raw_spin_lock_irqsave(&rtas_lock, flags);
+   cookie = lockdep_pin_lock(&rtas_lock);
 
-   /* A -1 return code indicates that the last command couldn't
-  be completed due to a hardware error. */
-   if (be32_to_cpu(args.rets[0]) == -1)
-   errbuf = __fetch_rtas_last_error(buff_copy);
+   rtas_args = args;
+   do_enter_rtas(&rtas_args);
+   args = rtas_args;
 
-   lockdep_unpin_lock(&rtas_lock, cookie);
-   raw_spin_unlock_irqrestore(&rtas_lock, flags);
+   /*
+* Handle error record retrieval before releasing the lock.
+*/
+   if (be32_to_cpu(args.rets[0]) == -1)
+   errbuf = __fetch_rtas_last_error(buff_copy);
+
+   lockdep_unpin_lock(&rtas_lock, cookie);
+   raw_spin_unlock_irqrestore(&rtas_lock, flags);

[PATCH 3/8] powerpc/rtas: rtas_call_unlocked() kerneldoc

2023-03-06 Thread Nathan Lynch via B4 Relay
From: Nathan Lynch 

Add documentation for rtas_call_unlocked(), including details on how
it differs from rtas_call().

Signed-off-by: Nathan Lynch 
---
 arch/powerpc/kernel/rtas.c | 17 +
 1 file changed, 17 insertions(+)

diff --git a/arch/powerpc/kernel/rtas.c b/arch/powerpc/kernel/rtas.c
index 9256cfaa8b6f..c73b01d722f6 100644
--- a/arch/powerpc/kernel/rtas.c
+++ b/arch/powerpc/kernel/rtas.c
@@ -1016,6 +1016,23 @@ va_rtas_call_unlocked(struct rtas_args *args, int token, 
int nargs, int nret,
do_enter_rtas(args);
 }
 
+/**
+ * rtas_call_unlocked() - Invoke an RTAS firmware function without synchronization.
+ * @args: RTAS parameter block to be used for the call, must obey RTAS addressing
+ *        constraints.
+ * @token: Identifies the function being invoked.
+ * @nargs: Number of input parameters. Does not include token.
+ * @nret: Number of output parameters, including the call status.
+ * @...: List of @nargs input parameters.
+ *
+ * Invokes the RTAS function indicated by @token, which the caller
+ * should obtain via rtas_function_token().
+ *
+ * This function is similar to rtas_call(), but must be used with a
+ * limited set of RTAS calls specifically exempted from the general
+ * requirement that only one RTAS call may be in progress at any
+ * time. Examples include stop-self and ibm,nmi-interlock.
+ */
 void rtas_call_unlocked(struct rtas_args *args, int token, int nargs, int 
nret, ...)
 {
va_list list;

-- 
2.39.1



[PATCH 7/8] powerpc/rtas: warn on unsafe argument to rtas_call_unlocked()

2023-03-06 Thread Nathan Lynch via B4 Relay
From: Nathan Lynch 

Any caller of rtas_call_unlocked() must provide an rtas_args parameter
block distinct from the core rtas_args buffer used by the rtas_call()
path. It's an unlikely error to make, but the potential consequences
are grim, and it's trivial to check.

Signed-off-by: Nathan Lynch 
---
 arch/powerpc/kernel/rtas.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/arch/powerpc/kernel/rtas.c b/arch/powerpc/kernel/rtas.c
index 633c925164e7..47a2aa43d7d4 100644
--- a/arch/powerpc/kernel/rtas.c
+++ b/arch/powerpc/kernel/rtas.c
@@ -1042,6 +1042,13 @@ void rtas_call_unlocked(struct rtas_args *args, int 
token, int nargs, int nret,
 {
va_list list;
 
+   /*
+* Callers must not use rtas_args; otherwise they risk
+* corrupting the state of the rtas_call() path, which is
+* serialized by rtas_lock.
+*/
+   WARN_ON(args == &rtas_args);
+
va_start(list, nret);
va_rtas_call(args, token, nargs, nret, list);
va_end(list);

-- 
2.39.1



[PATCH 4/8] powerpc/rtas: fix miswording in rtas_function kerneldoc

2023-03-06 Thread Nathan Lynch via B4 Relay
From: Nathan Lynch 

The 'filter' member is a pointer, not a bool; fix the wording
accordingly.

Signed-off-by: Nathan Lynch 
---
 arch/powerpc/kernel/rtas.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/rtas.c b/arch/powerpc/kernel/rtas.c
index c73b01d722f6..c29c38b1a55a 100644
--- a/arch/powerpc/kernel/rtas.c
+++ b/arch/powerpc/kernel/rtas.c
@@ -68,7 +68,7 @@ struct rtas_filter {
  *functions are believed to have no users on
  *ppc64le, and we want to keep it that way. It does
  *not make sense for this to be set when @filter
- *is false.
+ *is NULL.
  */
 struct rtas_function {
s32 token;

-- 
2.39.1



Re: [PATCH v10 03/13] dt-bindings: Convert gpio-mmio to yaml

2023-03-06 Thread Sean Anderson
On 3/6/23 15:51, Jonas Gorski wrote:
> Hi,
> 
> On Mon, 6 Mar 2023 at 20:16, Sean Anderson  wrote:
>>
>> This is a generic binding for simple MMIO GPIO controllers. Although we
>> have a single driver for these controllers, they were previously spread
>> over several files. Consolidate them. The register descriptions are
>> adapted from the comments in the source. There is no set order for the
>> registers, so I have not specified one.
>>
>> Signed-off-by: Sean Anderson 
>> ---
>>
>> Changes in v10:
>> - New
>>
>>  .../bindings/gpio/brcm,bcm6345-gpio.yaml  |  16 +--
>>  .../devicetree/bindings/gpio/gpio-mmio.yaml   | 136 ++
>>  .../bindings/gpio/ni,169445-nand-gpio.txt |  38 -
>>  .../devicetree/bindings/gpio/wd,mbl-gpio.txt  |  38 -
>>  4 files changed, 137 insertions(+), 91 deletions(-)
>>  create mode 100644 Documentation/devicetree/bindings/gpio/gpio-mmio.yaml
>>  delete mode 100644 
>> Documentation/devicetree/bindings/gpio/ni,169445-nand-gpio.txt
>>  delete mode 100644 Documentation/devicetree/bindings/gpio/wd,mbl-gpio.txt
>>
>> diff --git a/Documentation/devicetree/bindings/gpio/brcm,bcm6345-gpio.yaml 
>> b/Documentation/devicetree/bindings/gpio/brcm,bcm6345-gpio.yaml
>> index 4d69f79df859..e11f4af49c52 100644
>> --- a/Documentation/devicetree/bindings/gpio/brcm,bcm6345-gpio.yaml
>> +++ b/Documentation/devicetree/bindings/gpio/brcm,bcm6345-gpio.yaml
> 
> You are (re-)moving the compatible this file is named after, you might
> want to rename the file as well then. Going by age bcm6358 would be
> the next oldest one (bcm6318 would be the newest, despite the lowest
> number).

I can do that. Would it be fine to rename to e.g. brcm,bcm63xx-gpio.yaml?

--Sean



Re: [PATCH v10 03/13] dt-bindings: Convert gpio-mmio to yaml

2023-03-06 Thread Jonas Gorski
Hi,

On Mon, 6 Mar 2023 at 20:16, Sean Anderson  wrote:
>
> This is a generic binding for simple MMIO GPIO controllers. Although we
> have a single driver for these controllers, they were previously spread
> over several files. Consolidate them. The register descriptions are
> adapted from the comments in the source. There is no set order for the
> registers, so I have not specified one.
>
> Signed-off-by: Sean Anderson 
> ---
>
> Changes in v10:
> - New
>
>  .../bindings/gpio/brcm,bcm6345-gpio.yaml  |  16 +--
>  .../devicetree/bindings/gpio/gpio-mmio.yaml   | 136 ++
>  .../bindings/gpio/ni,169445-nand-gpio.txt |  38 -
>  .../devicetree/bindings/gpio/wd,mbl-gpio.txt  |  38 -
>  4 files changed, 137 insertions(+), 91 deletions(-)
>  create mode 100644 Documentation/devicetree/bindings/gpio/gpio-mmio.yaml
>  delete mode 100644 
> Documentation/devicetree/bindings/gpio/ni,169445-nand-gpio.txt
>  delete mode 100644 Documentation/devicetree/bindings/gpio/wd,mbl-gpio.txt
>
> diff --git a/Documentation/devicetree/bindings/gpio/brcm,bcm6345-gpio.yaml 
> b/Documentation/devicetree/bindings/gpio/brcm,bcm6345-gpio.yaml
> index 4d69f79df859..e11f4af49c52 100644
> --- a/Documentation/devicetree/bindings/gpio/brcm,bcm6345-gpio.yaml
> +++ b/Documentation/devicetree/bindings/gpio/brcm,bcm6345-gpio.yaml

You are (re-)moving the compatible this file is named after, you might
want to rename the file as well then. Going by age bcm6358 would be
the next oldest one (bcm6318 would be the newest, despite the lowest
number).

Regards
Jonas


Re: [PATCH v4 31/33] powerc/mm: try VMA lock-based page fault handling first

2023-03-06 Thread Suren Baghdasaryan
On Mon, Feb 27, 2023 at 9:37 AM Suren Baghdasaryan  wrote:
>
> From: Laurent Dufour 
>
> Attempt VMA lock-based page fault handling first, and fall back to the
> existing mmap_lock-based handling if that fails.
> Copied from "x86/mm: try VMA lock-based page fault handling first"

Hi Andrew,
Laurent posted a fix for this patch at
https://lore.kernel.org/all/20230306154244.17560-1-lduf...@linux.ibm.com/.
Could you please squash the fix into this patch?
Thanks,
Suren.

>
> Signed-off-by: Laurent Dufour 
> Signed-off-by: Suren Baghdasaryan 
> ---
>  arch/powerpc/mm/fault.c| 41 ++
>  arch/powerpc/platforms/powernv/Kconfig |  1 +
>  arch/powerpc/platforms/pseries/Kconfig |  1 +
>  3 files changed, 43 insertions(+)
>
> diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
> index 2bef19cc1b98..c7ae86b04b8a 100644
> --- a/arch/powerpc/mm/fault.c
> +++ b/arch/powerpc/mm/fault.c
> @@ -469,6 +469,44 @@ static int ___do_page_fault(struct pt_regs *regs, 
> unsigned long address,
> if (is_exec)
> flags |= FAULT_FLAG_INSTRUCTION;
>
> +#ifdef CONFIG_PER_VMA_LOCK
> +   if (!(flags & FAULT_FLAG_USER))
> +   goto lock_mmap;
> +
> +   vma = lock_vma_under_rcu(mm, address);
> +   if (!vma)
> +   goto lock_mmap;
> +
> +   if (unlikely(access_pkey_error(is_write, is_exec,
> +  (error_code & DSISR_KEYFAULT), vma))) {
> +   int rc = bad_access_pkey(regs, address, vma);
> +
> +   vma_end_read(vma);
> +   return rc;
> +   }
> +
> +   if (unlikely(access_error(is_write, is_exec, vma))) {
> +   int rc = bad_access(regs, address);
> +
> +   vma_end_read(vma);
> +   return rc;
> +   }
> +
> +   fault = handle_mm_fault(vma, address, flags | FAULT_FLAG_VMA_LOCK, 
> regs);
> +   vma_end_read(vma);
> +
> +   if (!(fault & VM_FAULT_RETRY)) {
> +   count_vm_vma_lock_event(VMA_LOCK_SUCCESS);
> +   goto done;
> +   }
> +   count_vm_vma_lock_event(VMA_LOCK_RETRY);
> +
> +   if (fault_signal_pending(fault, regs))
> +   return user_mode(regs) ? 0 : SIGBUS;
> +
> +lock_mmap:
> +#endif /* CONFIG_PER_VMA_LOCK */
> +
> /* When running in the kernel we expect faults to occur only to
>  * addresses in user space.  All other faults represent errors in the
>  * kernel and should generate an OOPS.  Unfortunately, in the case of 
> an
> @@ -545,6 +583,9 @@ static int ___do_page_fault(struct pt_regs *regs, 
> unsigned long address,
>
> mmap_read_unlock(current->mm);
>
> +#ifdef CONFIG_PER_VMA_LOCK
> +done:
> +#endif
> if (unlikely(fault & VM_FAULT_ERROR))
> return mm_fault_error(regs, address, fault);
>
> diff --git a/arch/powerpc/platforms/powernv/Kconfig 
> b/arch/powerpc/platforms/powernv/Kconfig
> index ae248a161b43..70a46acc70d6 100644
> --- a/arch/powerpc/platforms/powernv/Kconfig
> +++ b/arch/powerpc/platforms/powernv/Kconfig
> @@ -16,6 +16,7 @@ config PPC_POWERNV
> select PPC_DOORBELL
> select MMU_NOTIFIER
> select FORCE_SMP
> +   select ARCH_SUPPORTS_PER_VMA_LOCK
> default y
>
>  config OPAL_PRD
> diff --git a/arch/powerpc/platforms/pseries/Kconfig 
> b/arch/powerpc/platforms/pseries/Kconfig
> index b481c5c8bae1..9c205fe0e619 100644
> --- a/arch/powerpc/platforms/pseries/Kconfig
> +++ b/arch/powerpc/platforms/pseries/Kconfig
> @@ -21,6 +21,7 @@ config PPC_PSERIES
> select HOTPLUG_CPU
> select FORCE_SMP
> select SWIOTLB
> +   select ARCH_SUPPORTS_PER_VMA_LOCK
> default y
>
>  config PARAVIRT
> --
> 2.39.2.722.g9855ee24e9-goog
>


[PATCH v10 07/13] phy: fsl: Add Lynx 10G SerDes driver

2023-03-06 Thread Sean Anderson
This adds support for the Lynx 10G "SerDes" devices found on various NXP
QorIQ SoCs. There may be up to four SerDes devices on each SoC, each
supporting up to eight lanes. Protocol support for each SerDes is highly
heterogeneous, with each SoC typically having a totally different
selection of supported protocols for each lane. Additionally, the SerDes
devices on each SoC also have differing support. One SerDes will
typically support Ethernet on most lanes, while the other will typically
support PCIe on most lanes.

There is wide hardware support for this SerDes. It is present on QorIQ
T-Series and Layerscape processors. Because each SoC typically has
specific instructions and exceptions for its SerDes, I have limited the
initial scope of this module to just the LS1046A and LS1088A.
Additionally, I have only added support for Ethernet protocols. There is
not a great need for dynamic reconfiguration for other protocols (except
perhaps for M.2 cards), so support for them may never be added.

Nevertheless, I have tried to provide an obvious path for adding support
for other SoCs as well as other protocols. SATA just needs support for
configuring LNmSSCR0. PCIe may need to configure the equalization
registers. It also uses multiple lanes. I have tried to write the driver
with multi-lane support in mind, so there should not need to be any
large changes. Although there are 6 protocols supported, I have only
tested SGMII and XFI. The rest have been implemented as described in
the datasheet. Most of these protocols should work "as-is", but
10GBASE-KR will need PCS support for link training.

Unlike some other phys where e.g. PCIe x4 will use 4 separate phys all
configured for PCIe, this driver uses one phy configured to use 4 lanes.
This is because while the individual lanes may be configured
individually, the protocol selection acts on all lanes at once.
Additionally, the order which lanes should be configured in is specified
by the datasheet. To coordinate this, lanes are reserved in phy_init,
and released in phy_exit.
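
The reservation scheme described above can be sketched roughly as follows. This is only an illustration, not the driver's code: struct serdes_state, lanes_reserve() and lanes_release() are hypothetical names, and a plain bitmask stands in for whatever bookkeeping the driver really uses.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-SerDes state; not from the driver. */
struct serdes_state {
	uint8_t lanes_in_use;	/* bit n set => lane n is reserved */
};

/* Reserve every lane in mask, or none of them (what phy_init might do). */
int lanes_reserve(struct serdes_state *s, uint8_t mask)
{
	if (s->lanes_in_use & mask)
		return -EBUSY;	/* some lane is already owned by another phy */
	s->lanes_in_use |= mask;
	return 0;
}

/* Give the lanes in mask back (what phy_exit might do). */
void lanes_release(struct serdes_state *s, uint8_t mask)
{
	s->lanes_in_use &= (uint8_t)~mask;
}
```

A four-lane phy would reserve mask 0x0f in one call, so a second consumer asking for any of those lanes is refused until the first one releases them.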

This driver was written with reference to the LS1046A reference manual.
However, it was informed by reference manuals for all processors with
mEMACs, especially the T4240 (which appears to have a "maxed-out"
configuration). The earlier P-series processors appear to be similar, but
have a different overall register layout (using "banks" instead of
separate SerDes). Perhaps those use a "5G Lynx SerDes."

Signed-off-by: Sean Anderson 
---

Changes in v10:
- Fix debugging print with incorrect error variable

Changes in v9:
- Split off clock "driver" into its own patch to allow for better
  review.
- Add ability to defer lane initialization to phy_init. This allows
  for easier transitioning between firmware-managed serdes and Linux-
  managed serdes, as the consumer (such as dpaa2, which knows what the
  firmware is doing) has the last say on who gets control.
- phy-type -> fsl,phy

Changes in v8:
- Remove unused variable from lynx_ls_mode_init

Changes in v7:
- Break out call order into generic documentation
- Refuse to switch "major" protocols
- Update Kconfig to reflect restrictions
- Remove set/clear of "pcs reset" bit, since it doesn't seem to fix
  anything.

Changes in v6:
- Update MAINTAINERS to include new files
- Include bitfield.h and slab.h to allow compilation on non-arm64
  arches.
- Depend on COMMON_CLK and either layerscape/ppc

Changes in v5:
- Remove references to PHY_INTERFACE_MODE_1000BASEKX to allow this
  series to be applied directly to linux/master.
- Add fsl,lynx-10g.h to MAINTAINERS

Changes in v4:
- Rework all debug statements to remove use of __func__. Additional
  information has been provided as necessary.
- Consider alternative parent rates in round_rate and not in set_rate.
  Trying to modify our parent's rate in set_rate will deadlock.
- Explicitly perform a stop/reset sequence in set_rate. This way we
  always ensure that the PLL is properly stopped.
- Set the power-down bit when disabling the PLL. We can do this now that
  enable/disable aren't abused during the set rate sequence.
- Fix typos in QSGMII_OFFSET and XFI_OFFSET
- Rename LNmTECR0_TEQ_TYPE_PRE to LNmTECR0_TEQ_TYPE_POST to better
  reflect its function (adding post-cursor equalization).
- Use of_clk_hw_onecell_get instead of a custom function.
- Return struct clks from lynx_clks_init instead of embedding lynx_clk
  in lynx_priv.
- Rework PCCR helper functions; T-series SoCs differ from Layerscape SoCs
  primarily in the layout and offset of the PCCRs. This will help bring a
  cleaner abstraction layer. The caps have been removed, since this handles the
  only current usage.
- Convert to use new binding format. As a result of this, we no longer need to
  have protocols for PCIe or SATA. Additionally, modes now live in lynx_group
  instead of lynx_priv.
- Remove teq from lynx_proto_params, since it can be determined from
  preq_ratio/postq_ratio.
- Fix an early return from lynx_set_mode not releasing 

[PATCH v10 12/13] arm64: dts: ls1088a: Prevent PCSs from probing as phys

2023-03-06 Thread Sean Anderson
The internal PCSs are not always accessible during boot (such as if the
serdes has deselected the appropriate link mode). Give them appropriate
compatible strings so they don't automatically (fail to) probe as
genphys.

Signed-off-by: Sean Anderson 

---

(no changes since v8)

Changes in v8:
- New

 .../arm64/boot/dts/freescale/fsl-ls1088a.dtsi | 30 ---
 1 file changed, 20 insertions(+), 10 deletions(-)

diff --git a/arch/arm64/boot/dts/freescale/fsl-ls1088a.dtsi 
b/arch/arm64/boot/dts/freescale/fsl-ls1088a.dtsi
index 59b401daad4d..bbc714f84577 100644
--- a/arch/arm64/boot/dts/freescale/fsl-ls1088a.dtsi
+++ b/arch/arm64/boot/dts/freescale/fsl-ls1088a.dtsi
@@ -932,7 +932,8 @@ pcs_mdio1: mdio@8c07000 {
#size-cells = <0>;
status = "disabled";
 
-   pcs1: ethernet-phy@0 {
+   pcs1: ethernet-pcs@0 {
+   compatible = "fsl,lynx-pcs";
reg = <0>;
};
};
@@ -945,7 +946,8 @@ pcs_mdio2: mdio@8c0b000 {
#size-cells = <0>;
status = "disabled";
 
-   pcs2: ethernet-phy@0 {
+   pcs2: ethernet-pcs@0 {
+   compatible = "fsl,lynx-pcs";
reg = <0>;
};
};
@@ -958,19 +960,23 @@ pcs_mdio3: mdio@8c0f000 {
#size-cells = <0>;
status = "disabled";
 
-   pcs3_0: ethernet-phy@0 {
+   pcs3_0: ethernet-pcs@0 {
+   compatible = "fsl,lynx-pcs";
reg = <0>;
};
 
-   pcs3_1: ethernet-phy@1 {
+   pcs3_1: ethernet-pcs@1 {
+   compatible = "fsl,lynx-pcs";
reg = <1>;
};
 
-   pcs3_2: ethernet-phy@2 {
+   pcs3_2: ethernet-pcs@2 {
+   compatible = "fsl,lynx-pcs";
reg = <2>;
};
 
-   pcs3_3: ethernet-phy@3 {
+   pcs3_3: ethernet-pcs@3 {
+   compatible = "fsl,lynx-pcs";
reg = <3>;
};
};
@@ -983,19 +989,23 @@ pcs_mdio7: mdio@8c1f000 {
#size-cells = <0>;
status = "disabled";
 
-   pcs7_0: ethernet-phy@0 {
+   pcs7_0: ethernet-pcs@0 {
+   compatible = "fsl,lynx-pcs";
reg = <0>;
};
 
-   pcs7_1: ethernet-phy@1 {
+   pcs7_1: ethernet-pcs@1 {
+   compatible = "fsl,lynx-pcs";
reg = <1>;
};
 
-   pcs7_2: ethernet-phy@2 {
+   pcs7_2: ethernet-pcs@2 {
+   compatible = "fsl,lynx-pcs";
reg = <2>;
};
 
-   pcs7_3: ethernet-phy@3 {
+   pcs7_3: ethernet-pcs@3 {
+   compatible = "fsl,lynx-pcs";
reg = <3>;
};
};
-- 
2.35.1.1320.gc452695387.dirty



[PATCH v10 13/13] arm64: dts: ls1088ardb: Add serdes descriptions

2023-03-06 Thread Sean Anderson
This adds serdes support to the LS1088ARDB. I have tested the QSGMII
ports as well as the two 10G ports. The SFP slot is now fully supported,
instead of being modeled as a fixed-link.

Linux hangs around the time the serdes is initialized if the si5341 is
enabled with the in-tree driver, so I have modeled it as two fixed
clocks instead. There are a few registers in the QIXIS FPGA which
control the SFP GPIOs; I have modeled them as discrete GPIO controllers
for now. I never saw the AQR105 interrupt fire; not sure what was going
on, but I have removed it to force polling.

To enable serdes support, the DPC needs to set the macs to
MAC_LINK_TYPE_BACKPLANE. All MACs using the same QSGMII should be
converted at once. Additionally, in order to change interface types, the
MC firmware must support DPAA2_MAC_FEATURE_PROTOCOL_CHANGE.

Signed-off-by: Sean Anderson 

---

Changes in v10:
- Move serdes bindings to SoC dtsi
- Use "descriptions" instead of "bindings"
- Don't use /clocks
- Add missing gpio-controller properties

Changes in v9:
- Add fsl,unused-lanes-reserved to allow a gradual transition, depending
  on the mac link type.
- Remove unused clocks
- Fix some phy mode node names
- phy-type -> fsl,phy

Changes in v8:
- Rename serdes phy handles like the LS1046A
- Add SFP slot binding
- Fix incorrect lane ordering (it's backwards on the LS1088A just like it is in
  the LS1046A).
- Fix duplicated lane 2 (it should have been lane 3).
- Fix incorrectly-documented value for XFI1.
- Remove interrupt for aquantia phy. It never fired for whatever reason,
  preventing the link from coming up.
- Add GPIOs for QIXIS FPGA.
- Enable MAC1 PCS
- Remove si5341 binding

Changes in v4:
- Convert to new bindings

 .../boot/dts/freescale/fsl-ls1088a-rdb.dts| 82 ++-
 1 file changed, 80 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/boot/dts/freescale/fsl-ls1088a-rdb.dts 
b/arch/arm64/boot/dts/freescale/fsl-ls1088a-rdb.dts
index ee8e932628d1..ede537b644e8 100644
--- a/arch/arm64/boot/dts/freescale/fsl-ls1088a-rdb.dts
+++ b/arch/arm64/boot/dts/freescale/fsl-ls1088a-rdb.dts
@@ -10,17 +10,55 @@
 
 /dts-v1/;
 
+#include 
+
 #include "fsl-ls1088a.dtsi"
 
 / {
model = "LS1088A RDB Board";
compatible = "fsl,ls1088a-rdb", "fsl,ls1088a";
+
+   clk_100mhz: clock-100mhz {
+   compatible = "fixed-clock";
+   #clock-cells = <0>;
+   clock-frequency = <100000000>;
+   };
+
+   clk_156mhz: clock-156mhz {
+   compatible = "fixed-clock";
+   #clock-cells = <0>;
+   clock-frequency = <156250000>;
+   };
+
+   sfp_slot: sfp {
+   compatible = "sff,sfp";
+   i2c-bus = <_i2c>;
+   los-gpios = <_stat 5 GPIO_ACTIVE_HIGH>;
+   tx-fault-gpios = <_stat 4 GPIO_ACTIVE_HIGH>;
+   tx-disable-gpios = < 4 GPIO_ACTIVE_HIGH>;
+   };
+};
+
+ {
+   clocks = <&clk_100mhz>, <&clk_156mhz>;
+   clock-names = "ref0", "ref1";
+   fsl,unused-lanes-reserved;
+   status = "okay";
+};
+
+ {
+   managed = "in-band-status";
+   pcs-handle = <>;
+   phys = <_C>;
+   sfp = <&sfp_slot>;
 };
 
  {
phy-handle = <_aquantia_phy>;
phy-connection-type = "10gbase-r";
+   managed = "in-band-status";
pcs-handle = <>;
+   phys = <_D>;
 };
 
  {
@@ -28,6 +66,7 @@  {
phy-connection-type = "qsgmii";
managed = "in-band-status";
pcs-handle = <_0>;
+   phys = <_A>;
 };
 
  {
@@ -35,6 +74,7 @@  {
phy-connection-type = "qsgmii";
managed = "in-band-status";
pcs-handle = <_1>;
+   phys = <_A>;
 };
 
  {
@@ -42,6 +82,7 @@  {
phy-connection-type = "qsgmii";
managed = "in-band-status";
pcs-handle = <_2>;
+   phys = <_A>;
 };
 
  {
@@ -49,6 +90,7 @@  {
phy-connection-type = "qsgmii";
managed = "in-band-status";
pcs-handle = <_3>;
+   phys = <_A>;
 };
 
  {
@@ -56,6 +98,7 @@  {
phy-connection-type = "qsgmii";
managed = "in-band-status";
pcs-handle = <_0>;
+   phys = <_B>;
 };
 
  {
@@ -63,6 +106,7 @@  {
phy-connection-type = "qsgmii";
managed = "in-band-status";
pcs-handle = <_1>;
+   phys = <_B>;
 };
 
  {
@@ -70,6 +114,7 @@  {
phy-connection-type = "qsgmii";
managed = "in-band-status";
pcs-handle = <_2>;
+   phys = <_B>;
 };
 
  {
@@ -77,6 +122,7 @@  {
phy-connection-type = "qsgmii";
managed = "in-band-status";
pcs-handle = <_3>;
+   phys = <_B>;
 };
 
  {
@@ -128,7 +174,6 @@  {
 
mdio2_aquantia_phy: ethernet-phy@0 {
compatible = "ethernet-phy-ieee802.3-c45";
-   interrupts-extended = < 2 IRQ_TYPE_LEVEL_LOW>;
reg = <0x0>;
};
 };
@@ -171,6 +216,12 @@ rtc@51 {
interrupts-extended = < 0 
IRQ_TYPE_LEVEL_LOW>;
};

[PATCH v10 11/13] arm64: dts: ls1088a: Add serdes nodes

2023-03-06 Thread Sean Anderson
This adds nodes for the SerDes devices. They are disabled by default
to prevent any breakage on existing boards.

Signed-off-by: Sean Anderson 
---

Changes in v10:
- Move serdes bindings to SoC dtsi
- Add support for all (ethernet) serdes modes
- Refer to "nodes" instead of "bindings"
- Move compatible/reg first

Changes in v4:
- Convert to new bindings

Changes in v3:
- New

 .../arm64/boot/dts/freescale/fsl-ls1088a.dtsi | 126 ++
 1 file changed, 126 insertions(+)

diff --git a/arch/arm64/boot/dts/freescale/fsl-ls1088a.dtsi 
b/arch/arm64/boot/dts/freescale/fsl-ls1088a.dtsi
index e5fb137ac02b..59b401daad4d 100644
--- a/arch/arm64/boot/dts/freescale/fsl-ls1088a.dtsi
+++ b/arch/arm64/boot/dts/freescale/fsl-ls1088a.dtsi
@@ -9,6 +9,7 @@
  */
 #include 
 #include 
+#include 
 #include 
 
 / {
@@ -238,6 +239,131 @@ reset: syscon@1e6 {
reg = <0x0 0x1e6 0x0 0x1>;
};
 
+   serdes1: serdes@1ea {
+   compatible = "fsl,ls1088a-serdes", "fsl,lynx-10g";
+   reg = <0x0 0x1ea 0x0 0x2000>;
+   #address-cells = <1>;
+   #size-cells = <0>;
+   #clock-cells = <1>;
+   status = "disabled";
+
+   /*
+* XXX: Lane A uses pins SD1_RX3_P/N! That is, the lane
+* numbers and pin numbers are _reversed_.
+*/
+   serdes1_A: phy@0 {
+   #phy-cells = <0>;
+   reg = <0>;
+
+   /* SG3 */
+   sgmii-0 {
+   fsl,pccr = <0x8>;
+   fsl,index = <0>;
+   fsl,cfg = <0x1>;
+   fsl,type = ;
+   };
+
+   /* QSGb */
+   qsgmii-0 {
+   fsl,pccr = <0x9>;
+   fsl,index = <0>;
+   fsl,cfg = <0x1>;
+   fsl,type = ;
+   };
+   };
+
+   serdes1_B: phy@1 {
+   #phy-cells = <0>;
+   reg = <1>;
+
+   /* SG7 */
+   sgmii-1 {
+   fsl,pccr = <0x8>;
+   fsl,index = <1>;
+   fsl,cfg = <0x1>;
+   fsl,type = ;
+   };
+
+   /* QSGa */
+   qsgmii-1 {
+   fsl,pccr = <0x9>;
+   fsl,index = <1>;
+   fsl,cfg = <0x1>;
+   fsl,type = ;
+   };
+
+   /* TODO: PCIe1 */
+   };
+
+   serdes1_C: phy@2 {
+   #phy-cells = <0>;
+   reg = <2>;
+
+   /* SG1 */
+   sgmii-2 {
+   fsl,pccr = <0x8>;
+   fsl,index = <2>;
+   fsl,cfg = <0x1>;
+   fsl,type = ;
+   };
+
+   /*
+* XFI1
+* Table 23-1 and section 23.5.16.4 disagree;
+* this reflects the table.
+*
+* fsl,cfg is documented as 1, but it is set to
+* 2 by the RCW! This is the same as the
+* LS1046A.
+*/
+   xfi-0 {
+   fsl,pccr = <0xb>;
+   fsl,index = <0>;
+   fsl,cfg = <0x2>;
+   fsl,type = ;
+   };
+   };
+
+   serdes1_D: phy@3 {
+   #phy-cells = <0>;
+   reg = <3>;
+
+   /* SG2 */
+   sgmii-3 {
+   fsl,pccr = <0x8>;
+   fsl,index = <3>;
+   fsl,cfg = <0x1>;
+   

[PATCH v10 10/13] arm64: dts: ls1046ardb: Add serdes descriptions

2023-03-06 Thread Sean Anderson
This adds appropriate descriptions for the macs which use the SerDes. The
156.25MHz fixed clock is a crystal. The 100MHz clocks (there are
actually 3) come from a Renesas 6V49205B at address 69 on i2c0. There is
no driver for this device (and as far as I know all you can do with the
100MHz clocks is gate them), so I have chosen to model it as a single
fixed clock.

Note: the SerDes1 lane numbering for the LS1046A is *reversed*.
This means that Lane A (what the driver thinks is lane 0) uses pins
SD1_TX3_P/N.
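
As a small illustration of the reversal (a sketch only; serdes1_lane_to_pin() is a hypothetical helper, and the lane count of four is an assumption taken from the A..D phy nodes in this series):

```c
#define SERDES1_LANES 4	/* assumed: lanes A..D */

/*
 * Map a lane index (0 = Lane A ... 3 = Lane D) to the pin number n in
 * SD1_TXn_P/N. Returns -1 for an out-of-range lane.
 */
int serdes1_lane_to_pin(int lane)
{
	if (lane < 0 || lane >= SERDES1_LANES)
		return -1;
	return SERDES1_LANES - 1 - lane;	/* Lane A -> pin 3 */
}
```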

Signed-off-by: Sean Anderson 

---

Changes in v10:
- Move serdes descriptions to SoC dtsi
- Don't use /clocks
- Use "descriptions" instead of "bindings"
- Split off defconfig change into separate patch

Changes in v9:
- Fix name of phy mode node
- phy-type -> fsl,phy

Changes in v8:
- Rename serdes phy handles to use _A, _B, etc. instead of _0, _1, etc.
  This should help remind readers that the numbering corresponds to the
  physical layout of the registers, and not the lane (pin) number.

Changes in v6:
- XGI.9 -> XFI.9

Changes in v4:
- Convert to new bindings

 .../boot/dts/freescale/fsl-ls1046a-rdb.dts| 26 +++
 1 file changed, 26 insertions(+)

diff --git a/arch/arm64/boot/dts/freescale/fsl-ls1046a-rdb.dts 
b/arch/arm64/boot/dts/freescale/fsl-ls1046a-rdb.dts
index 07f6cc6e354a..0d6dcfd1630a 100644
--- a/arch/arm64/boot/dts/freescale/fsl-ls1046a-rdb.dts
+++ b/arch/arm64/boot/dts/freescale/fsl-ls1046a-rdb.dts
@@ -26,6 +26,24 @@ aliases {
chosen {
stdout-path = "serial0:115200n8";
};
+
+   clk_100mhz: clock-100mhz {
+   compatible = "fixed-clock";
+   #clock-cells = <0>;
+   clock-frequency = <100000000>;
+   };
+
+   clk_156mhz: clock-156mhz {
+   compatible = "fixed-clock";
+   #clock-cells = <0>;
+   clock-frequency = <156250000>;
+   };
+};
+
+ {
+   clocks = <&clk_100mhz>, <&clk_156mhz>;
+   clock-names = "ref0", "ref1";
+   status = "okay";
 };
 
  {
@@ -140,21 +158,29 @@ ethernet@e6000 {
ethernet@e8000 {
phy-handle = <_phy1>;
phy-connection-type = "sgmii";
+   phys = <_B>;
+   phy-names = "serdes";
};
 
ethernet@ea000 {
phy-handle = <_phy2>;
phy-connection-type = "sgmii";
+   phys = <_A>;
+   phy-names = "serdes";
};
 
ethernet@f0000 { /* 10GEC1 */
phy-handle = <_phy>;
phy-connection-type = "xgmii";
+   phys = <_D>;
+   phy-names = "serdes";
};
 
ethernet@f2000 { /* 10GEC2 */
phy-connection-type = "10gbase-r";
managed = "in-band-status";
+   phys = <_C>;
+   phy-names = "serdes";
};
 
mdio@fc000 {
-- 
2.35.1.1320.gc452695387.dirty



[PATCH v10 08/13] phy: lynx10g: Enable by default on Layerscape

2023-03-06 Thread Sean Anderson
The next few patches will break ethernet if the serdes is not enabled,
so enable the serdes driver by default on Layerscape.

Signed-off-by: Sean Anderson 
---

Changes in v10:
- New

 drivers/phy/freescale/Kconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/phy/freescale/Kconfig b/drivers/phy/freescale/Kconfig
index 6bebe00f5889..b396162dc859 100644
--- a/drivers/phy/freescale/Kconfig
+++ b/drivers/phy/freescale/Kconfig
@@ -54,6 +54,7 @@ config PHY_FSL_LYNX_10G
depends on ARCH_LAYERSCAPE || PPC || COMPILE_TEST
select GENERIC_PHY
select REGMAP_MMIO
+   default y if ARCH_LAYERSCAPE
help
  This adds support for the Lynx "SerDes" devices found on various QorIQ
  SoCs. There may be up to four SerDes devices on each SoC, and each
-- 
2.35.1.1320.gc452695387.dirty



[PATCH v10 09/13] arm64: dts: ls1046a: Add serdes nodes

2023-03-06 Thread Sean Anderson
This adds nodes for the SerDes devices. They are disabled by default
to prevent any breakage on existing boards.

Signed-off-by: Sean Anderson 
---

Changes in v10:
- Move serdes bindings to SoC dtsi
- Add support for all (ethernet) serdes modes
- Refer to "nodes" instead of "bindings"
- Move compatible/reg first

Changes in v4:
- Convert to new bindings

Changes in v3:
- Describe modes in device tree

Changes in v2:
- Use one phy cell for SerDes1, since no lanes can be grouped
- Disable SerDes by default to prevent breaking boards inadvertently.

 .../arm64/boot/dts/freescale/fsl-ls1046a.dtsi | 111 ++
 1 file changed, 111 insertions(+)

diff --git a/arch/arm64/boot/dts/freescale/fsl-ls1046a.dtsi 
b/arch/arm64/boot/dts/freescale/fsl-ls1046a.dtsi
index a01e3cfec77f..f6361fafaef7 100644
--- a/arch/arm64/boot/dts/freescale/fsl-ls1046a.dtsi
+++ b/arch/arm64/boot/dts/freescale/fsl-ls1046a.dtsi
@@ -12,6 +12,7 @@
 #include 
 #include 
 #include 
+#include 
 
 / {
compatible = "fsl,ls1046a";
@@ -424,6 +425,116 @@ sfp: efuse@1e8 {
clock-names = "sfp";
};
 
+   serdes1: serdes@1ea {
+   compatible = "fsl,ls1046a-serdes", "fsl,lynx-10g";
+   reg = <0x0 0x1ea 0x0 0x2000>;
+   #address-cells = <1>;
+   #size-cells = <0>;
+   #clock-cells = <1>;
+   status = "disabled";
+
+   /*
+* XXX: Lane A uses pins SD1_RX3_P/N! That is, the lane
+* numbers and pin numbers are _reversed_. In addition,
+* the PCCR documentation is _inconsistent_ in its
+* usage of these terms!
+*
+* PCCR "Lane 0" refers to...
+*  =
+*0 Lane A
+*2 Lane A
+*8 Lane A
+*9 Lane A
+*B Lane D!
+*/
+   serdes1_A: phy@0 {
+   #phy-cells = <0>;
+   reg = <0>;
+
+   /* SGMII.6 */
+   sgmii-0 {
+   fsl,pccr = <0x8>;
+   fsl,index = <0>;
+   fsl,cfg = <0x1>;
+   fsl,type = ;
+   };
+   };
+
+   serdes1_B: phy@1 {
+   #phy-cells = <0>;
+   reg = <1>;
+
+   /* SGMII.5 */
+   sgmii-1 {
+   fsl,pccr = <0x8>;
+   fsl,index = <1>;
+   fsl,cfg = <0x1>;
+   fsl,type = ;
+   };
+
+   /* QSGMII.6,5,10,1 */
+   qsgmii-1 {
+   fsl,pccr = <0x9>;
+   fsl,index = <1>;
+   fsl,cfg = <0x1>;
+   fsl,type = ;
+   };
+
+   /* TODO: PCIe.1 */
+   };
+
+   serdes1_C: phy@2 {
+   #phy-cells = <0>;
+   reg = <2>;
+
+   /* SGMII.10 */
+   sgmii-2 {
+   fsl,pccr = <0x8>;
+   fsl,index = <2>;
+   fsl,cfg = <0x1>;
+   fsl,type = ;
+   };
+
+   /* XFI.10 */
+   xfi-0 {
+   fsl,pccr = <0xb>;
+   fsl,index = <0>;
+   fsl,cfg = <0x2>;
+   fsl,type = ;
+   };
+   };
+
+   serdes1_D: phy@3 {
+   #phy-cells = <0>;
+   reg = <3>;
+
+   /* SGMII.9 */
+   sgmii-3 {
+   fsl,pccr = <0x8>;
+   fsl,index = <3>;
+   fsl,cfg = <0x1>;
+   fsl,type = ;
+   };
+
+   /* XFI.9 */
+

[PATCH v10 04/13] dt-bindings: gpio-mmio: Add compatible for QIXIS

2023-03-06 Thread Sean Anderson
NXP has a "QIXIS" FPGA on several of their reference design boards. On
the LS1088ARDB there are several registers which control GPIOs. These
can be modeled with the MMIO GPIO driver.

Signed-off-by: Sean Anderson 
---

Changes in v10:
- New

 .../devicetree/bindings/gpio/gpio-mmio.yaml| 14 ++
 1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/Documentation/devicetree/bindings/gpio/gpio-mmio.yaml 
b/Documentation/devicetree/bindings/gpio/gpio-mmio.yaml
index fd5c7055d542..a00a249e17cb 100644
--- a/Documentation/devicetree/bindings/gpio/gpio-mmio.yaml
+++ b/Documentation/devicetree/bindings/gpio/gpio-mmio.yaml
@@ -21,10 +21,16 @@ properties:
 true
 
   compatible:
-enum:
-  - brcm,bcm6345-gpio # Broadcom BCM6345 GPIO controller
-  - wd,mbl-gpio # Western Digital MyBook Live memory-mapped GPIO controller
-  - ni,169445-nand-gpio # National Instruments 169445 GPIO NAND controller
+oneOf:
+  - enum:
+  - brcm,bcm6345-gpio # Broadcom BCM6345 GPIO controller
+  - wd,mbl-gpio # Western Digital MyBook Live memory-mapped GPIO 
controller
+  - ni,169445-nand-gpio # National Instruments 169445 GPIO NAND 
controller
+  - items:
+  - enum:
+  - fsl,fpga-qixis-los-stat
+  - fsl,fpga-qixis-brdcfg9
+  - const: ni,169445-nand-gpio
 
   '#gpio-cells':
 const: 2
-- 
2.35.1.1320.gc452695387.dirty



[PATCH v10 01/13] dt-bindings: phy: Add 2500BASE-X and 10GBASE-R

2023-03-06 Thread Sean Anderson
This adds some modes necessary for Lynx 10G support. 2500BASE-X, also
known as 2.5G SGMII, is 1000BASE-X/SGMII overclocked to 3.125 GHz, with
autonegotiation disabled. 10GBASE-R, also known as XFI, is the protocol
spoken between the PMA and PMD ethernet layers for 10GBASE-T and
10GBASE-S/L/E. It is typically used to communicate directly with SFP+
modules, or with 10GBASE-T phys.
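
To make the rates concrete, here is a short arithmetic sketch. The line codes are not stated in this patch: as far as I know, 1000BASE-X and 2500BASE-X use 8b/10b and 10GBASE-R uses 64b/66b at 10.3125 GBd. payload_kbps() is a hypothetical helper, not part of any kernel API.

```c
#include <stdint.h>

/*
 * Payload rate in kbit/s for a serial line running at line_kbaud with an
 * n-bit/m-bit line code (data_bits payload bits per code_bits line bits).
 */
uint64_t payload_kbps(uint64_t line_kbaud, unsigned int data_bits,
		      unsigned int code_bits)
{
	return line_kbaud * data_bits / code_bits;
}
```

With these assumptions, 1.25 GBd with 8b/10b carries 1 Gbit/s, the overclocked 3.125 GBd carries 2.5 Gbit/s, and 10.3125 GBd with 64b/66b carries 10 Gbit/s.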

Signed-off-by: Sean Anderson 
Acked-by: Rob Herring 
---
PR increasing phy-type maximum [1].

If this commit could be applied sooner rather than later, I'd appreciate
it. This should help avoid another respin if someone else adds another
phy type.

[1] https://github.com/devicetree-org/dt-schema/pull/85

(no changes since v6)

Changes in v6:
- Bump PHY_TYPE_2500BASEX to 13, since PHY_TYPE_USXGMII was added in the
  meantime

Changes in v4:
- New

 include/dt-bindings/phy/phy.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/include/dt-bindings/phy/phy.h b/include/dt-bindings/phy/phy.h
index 6b901b342348..5b2b674d8d25 100644
--- a/include/dt-bindings/phy/phy.h
+++ b/include/dt-bindings/phy/phy.h
@@ -23,5 +23,7 @@
 #define PHY_TYPE_DPHY  10
 #define PHY_TYPE_CPHY  11
 #define PHY_TYPE_USXGMII   12
+#define PHY_TYPE_2500BASEX 13
+#define PHY_TYPE_10GBASER  14
 
 #endif /* _DT_BINDINGS_PHY */
-- 
2.35.1.1320.gc452695387.dirty



[PATCH v10 06/13] clk: Add Lynx 10G SerDes PLL driver

2023-03-06 Thread Sean Anderson
This adds support for the PLLs found in Lynx 10G "SerDes" devices found on
various NXP QorIQ SoCs. There are two PLLs in each SerDes. This driver has
been split from the main PHY driver to allow for better review, even though
these PLLs are not present anywhere else besides the SerDes. An auxiliary
device is not used as it offers no benefits over a function call (and there
is no need to have a separate device).

The PLLs are modeled as clocks proper to let us take advantage of the
existing clock infrastructure. I have not given the same treatment to the
per-lane clocks because they need to be programmed in-concert with the rest
of the lane settings. One tricky thing is that the VCO (PLL) rate exceeds
2^32 (maxing out at around 5GHz). This will be a problem on 32-bit
platforms, since clock rates are stored as unsigned longs. To work around
this, the pll clock rate is generally treated in units of kHz.
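
A minimal sketch of the overflow concern, assuming a 32-bit unsigned long (modeled here with uint32_t). The helper names are made up for illustration and do not come from the driver.

```c
#include <stdint.h>

/* Nonzero if a rate in Hz cannot be stored in a 32-bit unsigned long. */
int rate_overflows_u32(uint64_t hz)
{
	return hz > UINT32_MAX;
}

/* Convert a rate in Hz to kHz, the unit used for the PLL rate. */
uint32_t vco_hz_to_khz(uint64_t hz)
{
	return (uint32_t)(hz / 1000);
}
```

A 5GHz VCO rate does not fit in 32 bits when expressed in Hz, but it does as 5000000 kHz.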

The PLLs are configured rather interestingly. Instead of the usual direct
programming of the appropriate divisors, the input and output clock rates
are selected directly. Generally, the only restriction is that the input
and output must be integer multiples of each other. This suggests some kind
of internal look-up table. The datasheets generally list out the supported
combinations explicitly, and not all input/output combinations are
documented. I'm not sure if this is due to lack of support, or due to an
oversight. If this becomes an issue, then some combinations can be
blacklisted (or whitelisted). This may also be necessary for other SoCs
which have more stringent clock requirements.
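
The integer-multiple restriction could be checked along these lines. This is a sketch under the assumption that both rates are given in kHz; pll_rates_compatible() is a hypothetical helper, and passing this check does not mean the datasheet documents the combination.

```c
/*
 * Nonzero if one rate is an integer multiple of the other.
 * Both rates are in kHz; a rate of 0 is never compatible.
 */
int pll_rates_compatible(unsigned long in_khz, unsigned long out_khz)
{
	if (!in_khz || !out_khz)
		return 0;
	return (out_khz % in_khz == 0) || (in_khz % out_khz == 0);
}
```

A blacklist or whitelist, if one turns out to be needed, would be an extra table lookup after this test.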

Signed-off-by: Sean Anderson 

---

Changes in v10:
- Remove unnecessary inclusion of clk.h
- Don't gate clocks in compatibility mode

Changes in v9:
- Convert some u32s to unsigned long to match arguments
- Switch from round_rate to determine_rate
- Drop explicit reference to reference clock
- Use .parent_names when requesting parents
- Use devm_clk_hw_get_clk to pass clocks back to serdes
- Fix indentation
- Split off from following patch to allow for better review

 MAINTAINERS|   7 +
 drivers/clk/Makefile   |   1 +
 drivers/clk/clk-fsl-lynx-10g.c | 510 +
 drivers/phy/freescale/Kconfig  |   6 +
 include/linux/phy/lynx-10g.h   |  16 ++
 5 files changed, 540 insertions(+)
 create mode 100644 drivers/clk/clk-fsl-lynx-10g.c
 create mode 100644 include/linux/phy/lynx-10g.h

diff --git a/MAINTAINERS b/MAINTAINERS
index edd3d562beee..e964fac26a05 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -12207,6 +12207,13 @@ S: Maintained
 W: http://linux-test-project.github.io/
 T: git https://github.com/linux-test-project/ltp.git
 
+LYNX 10G SERDES DRIVER
+M: Sean Anderson 
+S: Maintained
+F: drivers/clk/clk-fsl-lynx-10g.c
+F: include/dt-bindings/clock/fsl,lynx-10g.h
+F: include/linux/phy/lynx-10g.h
+
 LYNX 28G SERDES PHY DRIVER
 M: Ioana Ciornei 
 L: net...@vger.kernel.org
diff --git a/drivers/clk/Makefile b/drivers/clk/Makefile
index e3ca0d058a25..eebed69f6c58 100644
--- a/drivers/clk/Makefile
+++ b/drivers/clk/Makefile
@@ -33,6 +33,7 @@ obj-$(CONFIG_ARCH_SPARX5) += clk-sparx5.o
 obj-$(CONFIG_COMMON_CLK_EN7523)+= clk-en7523.o
 obj-$(CONFIG_COMMON_CLK_FIXED_MMIO)+= clk-fixed-mmio.o
 obj-$(CONFIG_COMMON_CLK_FSL_FLEXSPI)   += clk-fsl-flexspi.o
+obj-$(CONFIG_PHY_FSL_LYNX_10G) += clk-fsl-lynx-10g.o
 obj-$(CONFIG_COMMON_CLK_FSL_SAI)   += clk-fsl-sai.o
 obj-$(CONFIG_COMMON_CLK_GEMINI)+= clk-gemini.o
 obj-$(CONFIG_COMMON_CLK_ASPEED)+= clk-aspeed.o
diff --git a/drivers/clk/clk-fsl-lynx-10g.c b/drivers/clk/clk-fsl-lynx-10g.c
new file mode 100644
index 000000000000..78357303b578
--- /dev/null
+++ b/drivers/clk/clk-fsl-lynx-10g.c
@@ -0,0 +1,510 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2022 Sean Anderson 
+ *
+ * This file contains the implementation for the PLLs found on Lynx 10G phys.
+ *
+ * XXX: The VCO rate of the PLLs can exceed ~4GHz, which is the maximum rate
+ * expressable in an unsigned long. To work around this, rates are specified in
+ * kHz. This is as if there was a division by 1000 in the PLL.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#define PLL_STRIDE 0x20
+#define PLLa(a, off)   ((a) * PLL_STRIDE + (off))
+#define PLLaRSTCTL(a)  PLLa(a, 0x00)
+#define PLLaCR0(a) PLLa(a, 0x04)
+
+#define PLLaRSTCTL_RSTREQ  BIT(31)
+#define PLLaRSTCTL_RST_DONEBIT(30)
+#define PLLaRSTCTL_RST_ERR BIT(29)
+#define PLLaRSTCTL_PLLRST_BBIT(7)
+#define PLLaRSTCTL_SDRST_B BIT(6)
+#define PLLaRSTCTL_SDENBIT(5)
+
+#define PLLaRSTCTL_ENABLE_SET  (PLLaRSTCTL_RST_DONE | PLLaRSTCTL_PLLRST_B | \
+PLLaRSTCTL_SDRST_B | PLLaRSTCTL_SDEN)
+#define PLLaRSTCTL_ENABLE_MASK (PLLaRSTCTL_ENABLE_SET | PLLaRSTCTL_RST_ERR)
+
+#define 

[PATCH v10 02/13] dt-bindings: phy: Add Lynx 10G phy binding

2023-03-06 Thread Sean Anderson
This adds a binding for the SerDes module found on QorIQ processors.
Each phy is a subnode of the top-level device, possibly supporting
multiple lanes and protocols. This "thick" #phy-cells is used to
allow for better organization of parameters. Note that the particular
parameters necessary to select a protocol-controller/lane combination
vary across different SoCs, and even within different SerDes on the same
SoC.

The driver is designed to be able to completely reconfigure lanes at
runtime. Generally, the phy consumer can select the appropriate
protocol using set_mode.

There are two PLLs, each of which can be used as the master clock for
each lane. Each PLL has its own reference. For the moment they are
required, because it simplifies the driver implementation. Absent
reference clocks can be modeled by a fixed-clock with a rate of 0.

Signed-off-by: Sean Anderson 
---

(no changes since v9)

Changes in v9:
- Add fsl,unused-lanes-reserved to allow for a gradual transition
  between firmware and Linux control of the SerDes
- Change phy-type back to fsl,type, as I was getting the error
'#phy-cells' is a dependency of 'phy-type'

Changes in v7:
- Use double quotes everywhere in yaml

Changes in v6:
- fsl,type -> phy-type

Changes in v4:
- Use subnodes to describe lane configuration, instead of describing
  PCCRs. This is the same style used by phy-cadence-sierra et al.

Changes in v3:
- Manually expand yaml references
- Add mode configuration to device tree

Changes in v2:
- Rename to fsl,lynx-10g.yaml
- Refer to the device in the documentation, rather than the binding
- Move compatible first
- Document phy cells in the description
- Allow a value of 1 for phy-cells. This allows for compatibility with
  the similar (but according to Ioana Ciornei different enough) lynx-28g
  binding.
- Remove minItems
- Use list for clock-names
- Fix example binding having too many cells in regs
- Add #clock-cells. This will allow using assigned-clocks* to configure
  the PLLs.
- Document the structure of the compatible strings

 .../devicetree/bindings/phy/fsl,lynx-10g.yaml | 248 ++
 1 file changed, 248 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/phy/fsl,lynx-10g.yaml

diff --git a/Documentation/devicetree/bindings/phy/fsl,lynx-10g.yaml 
b/Documentation/devicetree/bindings/phy/fsl,lynx-10g.yaml
new file mode 100644
index 000000000000..7c364f7de85c
--- /dev/null
+++ b/Documentation/devicetree/bindings/phy/fsl,lynx-10g.yaml
@@ -0,0 +1,248 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/phy/fsl,lynx-10g.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: NXP Lynx 10G SerDes
+
+maintainers:
+  - Sean Anderson 
+
+description: |
+  These Lynx "SerDes" devices are found in NXP's QorIQ line of processors. The
+  SerDes provides up to eight lanes. Each lane may be configured individually,
+  or may be combined with adjacent lanes for a multi-lane protocol. The SerDes
+  supports a variety of protocols, including up to 10G Ethernet, PCIe, SATA, and
+  others. The specific protocols supported for each lane depend on the
+  particular SoC.
+
+properties:
+  compatible:
+items:
+  - enum:
+  - fsl,ls1046a-serdes
+  - fsl,ls1088a-serdes
+  - const: fsl,lynx-10g
+
+  "#address-cells":
+const: 1
+
+  "#size-cells":
+const: 0
+
+  "#clock-cells":
+const: 1
+description: |
+  The cell contains an ID as described in dt-bindings/clock/fsl,lynx-10g.h.
+  Note that when assigning a rate to a PLL, the PLL's rate is divided by
+  1000 to avoid overflow. A rate of 5000000 corresponds to 5GHz.
+
+  clocks:
+maxItems: 2
+description: |
+  Clock for each PLL reference clock input.
+
+  clock-names:
+minItems: 2
+maxItems: 2
+items:
+  enum:
+- ref0
+- ref1
+
+  fsl,unused-lanes-reserved:
+$ref: /schemas/types.yaml#/definitions/flag
+description: |
+  Unused lanes are reserved for firmware use, and should not be disabled.
+  Normally, groups containing unused lanes may be reconfigured or disabled
+  to save power. However, when this property is present, unused lanes will
+  not be touched until they are used by another driver. This allows
+  migrating from firmware control of lanes to driver control.
+
+  Lanes not present in any group will never be modified, regardless of the
+  presence of this property.
+
+  reg:
+maxItems: 1
+
+patternProperties:
+  "^phy@":
+type: object
+
+description: |
+  A contiguous group of lanes which will be configured together. Each group
+  corresponds to one phy device. Lanes not described by any group will be
+  left as-is.
+
+properties:
+  "#phy-cells":
+const: 0
+
+  reg:
+minItems: 1
+maxItems: 8
+description:
+  The lanes in the group. These must be listed in order. The 

[PATCH v10 03/13] dt-bindings: Convert gpio-mmio to yaml

2023-03-06 Thread Sean Anderson
This is a generic binding for simple MMIO GPIO controllers. Although we
have a single driver for these controllers, they were previously spread
over several files. Consolidate them. The register descriptions are
adapted from the comments in the source. There is no set order for the
registers, so I have not specified one.

Signed-off-by: Sean Anderson 
---

Changes in v10:
- New

 .../bindings/gpio/brcm,bcm6345-gpio.yaml  |  16 +--
 .../devicetree/bindings/gpio/gpio-mmio.yaml   | 136 ++
 .../bindings/gpio/ni,169445-nand-gpio.txt |  38 -
 .../devicetree/bindings/gpio/wd,mbl-gpio.txt  |  38 -
 4 files changed, 137 insertions(+), 91 deletions(-)
 create mode 100644 Documentation/devicetree/bindings/gpio/gpio-mmio.yaml
 delete mode 100644 Documentation/devicetree/bindings/gpio/ni,169445-nand-gpio.txt
 delete mode 100644 Documentation/devicetree/bindings/gpio/wd,mbl-gpio.txt

diff --git a/Documentation/devicetree/bindings/gpio/brcm,bcm6345-gpio.yaml b/Documentation/devicetree/bindings/gpio/brcm,bcm6345-gpio.yaml
index 4d69f79df859..e11f4af49c52 100644
--- a/Documentation/devicetree/bindings/gpio/brcm,bcm6345-gpio.yaml
+++ b/Documentation/devicetree/bindings/gpio/brcm,bcm6345-gpio.yaml
@@ -4,7 +4,7 @@
 $id: http://devicetree.org/schemas/gpio/brcm,bcm6345-gpio.yaml#
 $schema: http://devicetree.org/meta-schemas/core.yaml#
 
-title: Broadcom BCM6345 GPIO controller
+title: Broadcom BCM63xx GPIO controller
 
 maintainers:
   - Álvaro Fernández Rojas 
@@ -18,8 +18,6 @@ description: |+
 
   BCM6338 have 8-bit data and dirout registers, where GPIO state can be read
   and/or written, and the direction changed from input to output.
-  BCM6345 have 16-bit data and dirout registers, where GPIO state can be read
-  and/or written, and the direction changed from input to output.
   BCM6318, BCM6328, BCM6358, BCM6362, BCM6368 and BCM63268 have 32-bit data
   and dirout registers, where GPIO state can be read and/or written, and the
   direction changed from input to output.
@@ -29,7 +27,6 @@ properties:
 enum:
   - brcm,bcm6318-gpio
   - brcm,bcm6328-gpio
-  - brcm,bcm6345-gpio
   - brcm,bcm6358-gpio
   - brcm,bcm6362-gpio
   - brcm,bcm6368-gpio
@@ -63,17 +60,6 @@ required:
 additionalProperties: false
 
 examples:
-  - |
-gpio@fffe0406 {
-  compatible = "brcm,bcm6345-gpio";
-  reg-names = "dirout", "dat";
-  reg = <0xfffe0406 2>, <0xfffe040a 2>;
-  native-endian;
-
-  gpio-controller;
-  #gpio-cells = <2>;
-};
-
   - |
 gpio@0 {
   compatible = "brcm,bcm63268-gpio";
diff --git a/Documentation/devicetree/bindings/gpio/gpio-mmio.yaml b/Documentation/devicetree/bindings/gpio/gpio-mmio.yaml
new file mode 100644
index ..fd5c7055d542
--- /dev/null
+++ b/Documentation/devicetree/bindings/gpio/gpio-mmio.yaml
@@ -0,0 +1,136 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/gpio/gpio-mmio.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Generic MMIO GPIO
+
+maintainers:
+  - Linus Walleij 
+  - Bartosz Golaszewski 
+
+description: |
+  Some simple GPIO controllers may consist of a single data register or a pair
+  of set/clear-bit registers. Such controllers are common for glue logic in
+  FPGAs or ASICs. Commonly, these controllers are accessed over memory-mapped
+  NAND-style parallel busses.
+
+properties:
+  big-endian:
+true
+
+  compatible:
+enum:
+  - brcm,bcm6345-gpio # Broadcom BCM6345 GPIO controller
+  - wd,mbl-gpio # Western Digital MyBook Live memory-mapped GPIO controller
+  - ni,169445-nand-gpio # National Instruments 169445 GPIO NAND controller
+
+  '#gpio-cells':
+const: 2
+
+  gpio-controller:
+true
+
+  reg:
+minItems: 1
+description: |
+  A list of registers in the controller. The width of each register is
+  determined by its size. All registers must have the same width. The number
+  of GPIOs is set by the width, with bit 0 corresponding to GPIO 0.
+items:
+  - description: |
+  Register to READ the value of the GPIO lines. If GPIO line is high,
+  the bit will be set. If the GPIO line is low, the bit will be cleared.
+  This register may also be used to drive GPIOs if the SET register is
+  omitted.
+  - description: |
+  Register to SET the value of the GPIO lines. Setting a bit in this
+  register will drive the GPIO line high.
+  - description: |
+  Register to CLEAR the value of the GPIO lines. Setting a bit in this
+  register will drive the GPIO line low. If this register is omitted,
+  the SET register will be used to clear the GPIO lines as well, by
+  actively writing the line with 0.
+  - description: |
+  Register to set the line as OUTPUT. Setting a bit in this register
+  will turn that line into an output line. Conversely, clearing a 

[PATCH v10 00/13] phy: Add support for Lynx 10G SerDes

2023-03-06 Thread Sean Anderson
This adds support for the Lynx 10G SerDes found on the QorIQ T-series
and Layerscape series. Due to limited time and hardware, only support
for the LS1046ARDB and LS1088ARDB is added in this initial series.

This series is based on phy/next, but it requires phylink support. This
is already present for the LS1088A, and it was recently added for the
LS1046A in net-next/master.

Major reconfiguration of baud rate (e.g. 1G->10G) does not work. From my
testing, SerDes register settings appear identical. The issue appears to
be between the PCS and the MAC. The link itself comes up at both ends,
and a mac loopback succeeds. However, a PCS loopback results in dropped
packets. Perhaps there is some undocumented register in the PCS?

I suspect this driver is around 95% complete, but I don't have the
documentation to make it work completely. At the very least it is useful
for two cases:

- Although this is untested, it should support 2.5G SGMII as well as
  1000BASE-KX. The latter needs MAC and PCS support, but the former
  should work out of the box.
- It allows for clock configurations not supported by the RCW. This is
  very useful if you want to use e.g. SRDS_PRTCL_S1=0x and =0x1133
  on the same board. This is because the former setting will use PLL1
  as the 1G reference, but the latter will use PLL1 as the 10G
  reference. Because we can reconfigure the PLLs, it is possible to
  always use PLL1 as the 1G reference.

The final patch in this series depends on [1].

[1] https://lore.kernel.org/netdev/20221227230918.2440351-1-sean.ander...@seco.com/

Changes in v10:
- Convert gpio-mmio to yaml
- Add compatible for QIXIS
- Remove unnecessary inclusion of clk.h
- Don't gate clocks in compatibility mode
- Fix debugging print with incorrect error variable
- Move serdes bindings to SoC dtsi
- Add support for all (ethernet) serdes modes
- Refer to "nodes" instead of "bindings"
- Move compatible/reg first

Changes in v9:
- Add fsl,unused-lanes-reserved to allow for a gradual transition
  between firmware and Linux control of the SerDes
- Change phy-type back to fsl,type, as I was getting the error
  '#phy-cells' is a dependency of 'phy-type'
- Convert some u32s to unsigned long to match arguments
- Switch from round_rate to determine_rate
- Drop explicit reference to reference clock
- Use .parent_names when requesting parents
- Use devm_clk_hw_get_clk to pass clocks back to serdes
- Fix indentation
- Split off clock "driver" into its own patch to allow for better
  review.
- Add ability to defer lane initialization to phy_init. This allows
  for easier transitioning between firmware-managed serdes and Linux-
  managed serdes, as the consumer (such as dpaa2, which knows what the
  firmware is doing) has the last say on who gets control.
- Fix name of phy mode node
- Add fsl,unused-lanes-reserved to allow a gradual transition, depending
  on the mac link type.
- Remove unused clocks
- Fix some phy mode node names

Changes in v8:
- Remove unused variable from lynx_ls_mode_init
- Rename serdes phy handles to use _A, _B, etc. instead of _0, _1, etc.
  This should help remind readers that the numbering corresponds to the
  physical layout of the registers, and not the lane (pin) number.
- Prevent PCSs from probing as phys
- Rename serdes phy handles like the LS1046A
- Add SFP slot binding
- Fix incorrect lane ordering (it's backwards on the LS1088A just like it is in
  the LS1046A).
- Fix duplicated lane 2 (it should have been lane 3).
- Fix incorrectly-documented value for XFI1.
- Remove interrupt for aquantia phy. It never fired for whatever reason,
  preventing the link from coming up.
- Add GPIOs for QIXIS FPGA.
- Enable MAC1 PCS
- Remove si5341 binding

Changes in v7:
- Use double quotes everywhere in yaml
- Break out call order into generic documentation
- Refuse to switch "major" protocols
- Update Kconfig to reflect restrictions
- Remove set/clear of "pcs reset" bit, since it doesn't seem to fix
  anything.

Changes in v6:
- Bump PHY_TYPE_2500BASEX to 13, since PHY_TYPE_USXGMII was added in the
  meantime
- fsl,type -> phy-type
- frequence -> frequency
- Update MAINTAINERS to include new files
- Include bitfield.h and slab.h to allow compilation on non-arm64
  arches.
- Depend on COMMON_CLK and either layerscape/ppc
- XGI.9 -> XFI.9

Changes in v5:
- Update commit description
- Dual id header
- Remove references to PHY_INTERFACE_MODE_1000BASEKX to allow this
  series to be applied directly to linux/master.
- Add fsl,lynx-10g.h to MAINTAINERS

Changes in v4:
- Add 2500BASE-X and 10GBASE-R phy types
- Use subnodes to describe lane configuration, instead of describing
  PCCRs. This is the same style used by phy-cadence-sierra et al.
- Add ids for Lynx 10g PLLs
- Rework all debug statements to remove use of __func__. Additional
  information has been provided as necessary.
- Consider alternative parent rates in round_rate and not in set_rate.
  Trying to modify our parent's rate in set_rate will deadlock.
- 

[PATCH v10 05/13] dt-bindings: clock: Add ids for Lynx 10g PLLs

2023-03-06 Thread Sean Anderson
This adds ids for the Lynx 10g SerDes's internal PLLs. These may be used
with assigned-clock* to specify a particular frequency to use. For
example, to set the second PLL (at offset 0x20)'s frequency, use
LYNX10G_PLLa(1). These are for use only in the device tree, and are not
otherwise used by the driver.
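
As a rough illustration of the numbering only (this is not part of the
patch), the macros from the header added below can be exercised in plain C.
The helpers lynx10g_pll_of() and lynx10g_print_ids() are hypothetical names
introduced just for this sketch, and the loop bound of two PLLs is an
assumption taken from the two reference clock inputs in the binding.

```c
#include <stdio.h>

/* Copied from include/dt-bindings/clock/fsl,lynx-10g.h in this patch. */
#define LYNX10G_CLKS_PER_PLL 2

#define LYNX10G_PLLa(a)        ((a) * LYNX10G_CLKS_PER_PLL)
#define LYNX10G_PLLa_EX_DLY(a) ((a) * LYNX10G_CLKS_PER_PLL + 1)

/* Hypothetical helper: recover the PLL index from a clock id. */
static inline unsigned int lynx10g_pll_of(unsigned int id)
{
	return id / LYNX10G_CLKS_PER_PLL;
}

/* Hypothetical helper: print both ids for each of the (assumed) two PLLs. */
static inline void lynx10g_print_ids(void)
{
	for (unsigned int pll = 0; pll < 2; pll++)
		printf("PLL%u: id %u, ex_dly id %u\n", pll,
		       LYNX10G_PLLa(pll), LYNX10G_PLLa_EX_DLY(pll));
}
```

Each PLL owns a block of LYNX10G_CLKS_PER_PLL consecutive ids, so ids for
different PLLs never collide.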

Signed-off-by: Sean Anderson 
Acked-by: Rob Herring 
---

(no changes since v6)

Changes in v6:
- frequence -> frequency

Changes in v5:
- Update commit description
- Dual id header

Changes in v4:
- New

 include/dt-bindings/clock/fsl,lynx-10g.h | 14 ++
 1 file changed, 14 insertions(+)
 create mode 100644 include/dt-bindings/clock/fsl,lynx-10g.h

diff --git a/include/dt-bindings/clock/fsl,lynx-10g.h b/include/dt-bindings/clock/fsl,lynx-10g.h
new file mode 100644
index ..15362ae85304
--- /dev/null
+++ b/include/dt-bindings/clock/fsl,lynx-10g.h
@@ -0,0 +1,14 @@
+/* SPDX-License-Identifier: GPL-2.0-only OR BSD-2-Clause */
+/*
+ * Copyright (C) 2022 Sean Anderson 
+ */
+
+#ifndef __DT_BINDINGS_CLK_LYNX_10G_H
+#define __DT_BINDINGS_CLK_LYNX_10G_H
+
+#define LYNX10G_CLKS_PER_PLL 2
+
+#define LYNX10G_PLLa(a)((a) * LYNX10G_CLKS_PER_PLL)
+#define LYNX10G_PLLa_EX_DLY(a) ((a) * LYNX10G_CLKS_PER_PLL + 1)
+
+#endif /* __DT_BINDINGS_CLK_LYNX_10G_H */
-- 
2.35.1.1320.gc452695387.dirty



[PATCH v2 4/4] KVM: x86: Drop union for pages_{4k,2m,1g} stats

2023-03-06 Thread David Matlack
Drop the union for the pages_{4k,2m,1g} stats. The union is no longer
necessary now that KVM supports choosing a custom name for stats.

Eliminating the union would also allow future commits to more easily
move pages[] into common code, e.g. if KVM ever gains support for
common page table code.

An alternative would be to drop pages[] and have kvm_update_page_stats()
update pages_{4k,2m,1g} directly. But that's not a good direction to go
in since other architectures use other page sizes.
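
To make the indexing concrete, here is a small userspace sketch of updating
pages[] by level. Everything prefixed with demo_/DEMO_ is a stand-in invented
for illustration. The only property carried over from the diff below is that
the 4K level is numbered 1, so level N lands in pages[N - 1].

```c
/* Stand-in for the x86 page level enum; values are assumptions. */
enum demo_pg_level {
	DEMO_PG_LEVEL_NONE,
	DEMO_PG_LEVEL_4K,
	DEMO_PG_LEVEL_2M,
	DEMO_PG_LEVEL_1G,
};

#define DEMO_NR_PAGE_SIZES (DEMO_PG_LEVEL_1G - DEMO_PG_LEVEL_4K + 1)

/* Stand-in for the stats struct: one counter per supported page size. */
struct demo_vm_stat {
	long pages[DEMO_NR_PAGE_SIZES];
};

/* Add (or subtract) 'count' mappings at the given level. */
static inline void demo_update_page_stats(struct demo_vm_stat *stat,
					  int level, long count)
{
	stat->pages[level - 1] += count;
}
```

With an array, an architecture with different page sizes only needs a
different level enum; the update helper itself does not change.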

No functional change intended.

Link: https://lore.kernel.org/kvm/20221208193857.4090582-1-dmatl...@google.com/
Signed-off-by: David Matlack 
---
 arch/x86/include/asm/kvm_host.h | 9 +
 arch/x86/kvm/x86.c  | 6 +++---
 2 files changed, 4 insertions(+), 11 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 808c292ad3f4..a59e41355ef4 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1473,14 +1473,7 @@ struct kvm_vm_stat {
u64 mmu_recycled;
u64 mmu_cache_miss;
u64 mmu_unsync;
-   union {
-   struct {
-   atomic64_t pages_4k;
-   atomic64_t pages_2m;
-   atomic64_t pages_1g;
-   };
-   atomic64_t pages[KVM_NR_PAGE_SIZES];
-   };
+   atomic64_t pages[KVM_NR_PAGE_SIZES];
u64 nx_lpage_splits;
u64 max_mmu_page_hash_collisions;
u64 max_mmu_rmap_size;
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 072f5ba83170..101ad6b7e7b6 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -240,9 +240,9 @@ const struct _kvm_stats_desc kvm_vm_stats_desc[] = {
KVM_STAT(VM, CUMULATIVE, NONE, mmu_recycled),
KVM_STAT(VM, CUMULATIVE, NONE, mmu_cache_miss),
KVM_STAT(VM, INSTANT, NONE, mmu_unsync),
-   KVM_STAT(VM, INSTANT, NONE, pages_4k),
-   KVM_STAT(VM, INSTANT, NONE, pages_2m),
-   KVM_STAT(VM, INSTANT, NONE, pages_1g),
+   __KVM_STAT(VM, INSTANT, NONE, pages[PG_LEVEL_4K - 1], "pages_4k"),
+   __KVM_STAT(VM, INSTANT, NONE, pages[PG_LEVEL_2M - 1], "pages_2m"),
+   __KVM_STAT(VM, INSTANT, NONE, pages[PG_LEVEL_1G - 1], "pages_1g"),
KVM_STAT(VM, INSTANT, NONE, nx_lpage_splits),
KVM_STAT(VM, PEAK, NONE, max_mmu_rmap_size),
KVM_STAT(VM, PEAK, NONE, max_mmu_page_hash_collisions)
-- 
2.40.0.rc0.216.gc4246ad0f0-goog



[PATCH v2 3/4] KVM: Allow custom names for KVM_STAT()

2023-03-06 Thread David Matlack
Allow custom names to be specified for stats built on KVM_STAT() via a
new inner macro __KVM_STAT(). e.g.

  KVM_STAT(VM, CUMULATIVE, NONE, foo),
  __KVM_STAT(VM, CUMULATIVE, NONE, bar, "custom_name"),
  ...

Custom name support enables decoupling the userspace-visible stat names
from their internal representation in C. This can allow future commits
to refactor the various stats structs without impacting userspace tools
that read KVM stats.

This also allows stats to be stored in data structures such as arrays,
without needing unions to access specific stats. For example, the union
for pages_{4k,2m,1g} is no longer necessary. At Google, we have several
other out-of-tree stats that would benefit from this support.
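
The naming scheme can be sketched outside the kernel as follows. This is a
simplified model, not the code from the diff below: struct demo_stats,
struct demo_desc, DEMO_STAT() and DEMO_STAT_NAMED() are hypothetical
stand-ins, and the real descriptors carry more fields. Note that offsetof()
with an array subscript relies on behaviour that GCC and Clang accept.

```c
#include <stddef.h>

/* Simplified stand-in for a stats struct holding an array of counters. */
struct demo_stats {
	unsigned long foo;
	unsigned long pages[3];
};

/* Simplified stand-in for a stats descriptor. */
struct demo_desc {
	size_t offset;
	const char *name;
};

/* Inner macro: the caller supplies the userspace-visible name. */
#define DEMO_STAT_NAMED(_field, _name) \
	{ .offset = offsetof(struct demo_stats, _field), .name = _name }

/* Outer macro: default the name to the stringified field. */
#define DEMO_STAT(_field) DEMO_STAT_NAMED(_field, #_field)

static const struct demo_desc demo_descs[] = {
	DEMO_STAT(foo),
	DEMO_STAT_NAMED(pages[1], "pages_2m"),
};
```

Without the inner macro, stringifying pages[1] would yield the name
"pages[1]", which is why array-backed stats need an explicit name.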

No functional change intended.

Link: https://lore.kernel.org/all/20211019000459.3163029-1-jingzhan...@google.com/
Suggested-by: Jing Zhang 
Signed-off-by: David Matlack 
---
 include/linux/kvm_host.h | 35 +++
 1 file changed, 19 insertions(+), 16 deletions(-)

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 6673ae757c4e..fa026e8997b2 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -1761,40 +1761,43 @@ struct _kvm_stats_desc {
.name = _name, \
 }
 
-#define VM_GENERIC_STATS_DESC(_stat, _type, _unit, _base, _exponent, _size, \
- _bucket_size)\
-   _KVM_STATS_DESC(struct kvm_vm_stat, generic._stat, #_stat, _type,  \
+#define VM_GENERIC_STATS_DESC(_stat, _name, _type, _unit, _base, _exponent, \
+ _size, _bucket_size) \
+   _KVM_STATS_DESC(struct kvm_vm_stat, generic._stat, _name, _type,   \
_unit, _base, _exponent, _size, _bucket_size)
 
-#define VCPU_GENERIC_STATS_DESC(_stat, _type, _unit, _base, _exponent, _size, \
-   _bucket_size)  \
-   _KVM_STATS_DESC(struct kvm_vcpu_stat, generic._stat, #_stat, _type,\
+#define VCPU_GENERIC_STATS_DESC(_stat, _name, _type, _unit, _base, _exponent, \
+   _size, _bucket_size)   \
+   _KVM_STATS_DESC(struct kvm_vcpu_stat, generic._stat, _name, _type, \
_unit, _base, _exponent, _size, _bucket_size)
 
-#define VM_STATS_DESC(_stat, _type, _unit, _base, _exponent, _size,   \
+#define VM_STATS_DESC(_stat, _name, _type, _unit, _base, _exponent, _size, \
  _bucket_size)\
-   _KVM_STATS_DESC(struct kvm_vm_stat, _stat, #_stat, _type, _unit,   \
+   _KVM_STATS_DESC(struct kvm_vm_stat, _stat, _name, _type, _unit,\
_base, _exponent, _size, _bucket_size)
 
-#define VCPU_STATS_DESC(_stat, _type, _unit, _base, _exponent, _size, \
+#define VCPU_STATS_DESC(_stat, _name, _type, _unit, _base, _exponent, _size, \
_bucket_size)  \
-   _KVM_STATS_DESC(struct kvm_vcpu_stat, _stat, #_stat, _type, _unit, \
+   _KVM_STATS_DESC(struct kvm_vcpu_stat, _stat, _name, _type, _unit,  \
_base, _exponent, _size, _bucket_size)
 
 /* SCOPE: VM, VM_GENERIC, VCPU, VCPU_GENERIC */
-#define STATS_DESC(SCOPE, stat, type, unit, base, exp, sz, bsz) \
-   SCOPE##_STATS_DESC(stat, type, unit, base, exp, sz, bsz)
+#define STATS_DESC(SCOPE, stat, name, type, unit, base, exp, sz, bsz) \
+   SCOPE##_STATS_DESC(stat, name, type, unit, base, exp, sz, bsz)
 
-#define KVM_STAT(SCOPE, TYPE, UNIT, _stat)\
-   STATS_DESC(SCOPE, _stat, KVM_STATS_TYPE_##TYPE,\
+#define __KVM_STAT(SCOPE, TYPE, UNIT, _stat, _name)   \
+   STATS_DESC(SCOPE, _stat, _name, KVM_STATS_TYPE_##TYPE, \
   KVM_STATS_UNIT_##UNIT, KVM_STATS_BASE_POW10, 0, 1, 0)
 
+#define KVM_STAT(SCOPE, TYPE, UNIT, _stat)\
+   __KVM_STAT(SCOPE, TYPE, UNIT, _stat, #_stat)
+
 #define KVM_STAT_NSEC(SCOPE, _stat)   \
-   STATS_DESC(SCOPE, _stat, KVM_STATS_TYPE_CUMULATIVE,\
+   STATS_DESC(SCOPE, _stat, #_stat, KVM_STATS_TYPE_CUMULATIVE,\
   KVM_STATS_UNIT_SECONDS, KVM_STATS_BASE_POW10, -9, 1, 0)
 
 #define KVM_HIST_NSEC(SCOPE, TYPE, _stat, _size, _bucket_size)\
-   STATS_DESC(VCPU_GENERIC, _stat, KVM_STATS_TYPE_##TYPE##_HIST,  \
+   STATS_DESC(VCPU_GENERIC, _stat, #_stat, KVM_STATS_TYPE_##TYPE##_HIST,  \
   KVM_STATS_UNIT_SECONDS, KVM_STATS_BASE_POW10, -9,   \
   _size, _bucket_size)
 
-- 

[PATCH v2 1/4] KVM: Refactor stats descriptor generation macros

2023-03-06 Thread David Matlack
Refactor the various KVM stats macros to reduce the amount of duplicate
macro code. This change also improves readability by spelling out
"CUMULATIVE", "INSTANT", and "PEAK" instead of the previous short-hands
which were less clear ("COUNTER", "ICOUNTER", and "PCOUNTER").

No functional change intended.

Suggested-by: Sean Christopherson 
Signed-off-by: David Matlack 
---
 arch/arm64/kvm/guest.c|  14 +--
 arch/mips/kvm/mips.c  |  54 +--
 arch/powerpc/kvm/book3s.c |  62 ++--
 arch/powerpc/kvm/booke.c  |  48 -
 arch/riscv/kvm/vcpu.c |  16 +--
 arch/s390/kvm/kvm-s390.c  | 198 +++---
 arch/x86/kvm/x86.c|  94 +-
 include/linux/kvm_host.h  |  95 ++
 8 files changed, 272 insertions(+), 309 deletions(-)

diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c
index 07444fa22888..890ed444c237 100644
--- a/arch/arm64/kvm/guest.c
+++ b/arch/arm64/kvm/guest.c
@@ -44,13 +44,13 @@ const struct kvm_stats_header kvm_vm_stats_header = {
 
 const struct _kvm_stats_desc kvm_vcpu_stats_desc[] = {
KVM_GENERIC_VCPU_STATS(),
-   STATS_DESC_COUNTER(VCPU, hvc_exit_stat),
-   STATS_DESC_COUNTER(VCPU, wfe_exit_stat),
-   STATS_DESC_COUNTER(VCPU, wfi_exit_stat),
-   STATS_DESC_COUNTER(VCPU, mmio_exit_user),
-   STATS_DESC_COUNTER(VCPU, mmio_exit_kernel),
-   STATS_DESC_COUNTER(VCPU, signal_exits),
-   STATS_DESC_COUNTER(VCPU, exits)
+   KVM_STAT(VCPU, CUMULATIVE, NONE, hvc_exit_stat),
+   KVM_STAT(VCPU, CUMULATIVE, NONE, wfe_exit_stat),
+   KVM_STAT(VCPU, CUMULATIVE, NONE, wfi_exit_stat),
+   KVM_STAT(VCPU, CUMULATIVE, NONE, mmio_exit_user),
+   KVM_STAT(VCPU, CUMULATIVE, NONE, mmio_exit_kernel),
+   KVM_STAT(VCPU, CUMULATIVE, NONE, signal_exits),
+   KVM_STAT(VCPU, CUMULATIVE, NONE, exits)
 };
 
 const struct kvm_stats_header kvm_vcpu_stats_header = {
diff --git a/arch/mips/kvm/mips.c b/arch/mips/kvm/mips.c
index 36c8991b5d39..b7b2fa400bcf 100644
--- a/arch/mips/kvm/mips.c
+++ b/arch/mips/kvm/mips.c
@@ -53,34 +53,34 @@ const struct kvm_stats_header kvm_vm_stats_header = {
 
 const struct _kvm_stats_desc kvm_vcpu_stats_desc[] = {
KVM_GENERIC_VCPU_STATS(),
-   STATS_DESC_COUNTER(VCPU, wait_exits),
-   STATS_DESC_COUNTER(VCPU, cache_exits),
-   STATS_DESC_COUNTER(VCPU, signal_exits),
-   STATS_DESC_COUNTER(VCPU, int_exits),
-   STATS_DESC_COUNTER(VCPU, cop_unusable_exits),
-   STATS_DESC_COUNTER(VCPU, tlbmod_exits),
-   STATS_DESC_COUNTER(VCPU, tlbmiss_ld_exits),
-   STATS_DESC_COUNTER(VCPU, tlbmiss_st_exits),
-   STATS_DESC_COUNTER(VCPU, addrerr_st_exits),
-   STATS_DESC_COUNTER(VCPU, addrerr_ld_exits),
-   STATS_DESC_COUNTER(VCPU, syscall_exits),
-   STATS_DESC_COUNTER(VCPU, resvd_inst_exits),
-   STATS_DESC_COUNTER(VCPU, break_inst_exits),
-   STATS_DESC_COUNTER(VCPU, trap_inst_exits),
-   STATS_DESC_COUNTER(VCPU, msa_fpe_exits),
-   STATS_DESC_COUNTER(VCPU, fpe_exits),
-   STATS_DESC_COUNTER(VCPU, msa_disabled_exits),
-   STATS_DESC_COUNTER(VCPU, flush_dcache_exits),
-   STATS_DESC_COUNTER(VCPU, vz_gpsi_exits),
-   STATS_DESC_COUNTER(VCPU, vz_gsfc_exits),
-   STATS_DESC_COUNTER(VCPU, vz_hc_exits),
-   STATS_DESC_COUNTER(VCPU, vz_grr_exits),
-   STATS_DESC_COUNTER(VCPU, vz_gva_exits),
-   STATS_DESC_COUNTER(VCPU, vz_ghfc_exits),
-   STATS_DESC_COUNTER(VCPU, vz_gpa_exits),
-   STATS_DESC_COUNTER(VCPU, vz_resvd_exits),
+   KVM_STAT(VCPU, CUMULATIVE, NONE, wait_exits),
+   KVM_STAT(VCPU, CUMULATIVE, NONE, cache_exits),
+   KVM_STAT(VCPU, CUMULATIVE, NONE, signal_exits),
+   KVM_STAT(VCPU, CUMULATIVE, NONE, int_exits),
+   KVM_STAT(VCPU, CUMULATIVE, NONE, cop_unusable_exits),
+   KVM_STAT(VCPU, CUMULATIVE, NONE, tlbmod_exits),
+   KVM_STAT(VCPU, CUMULATIVE, NONE, tlbmiss_ld_exits),
+   KVM_STAT(VCPU, CUMULATIVE, NONE, tlbmiss_st_exits),
+   KVM_STAT(VCPU, CUMULATIVE, NONE, addrerr_st_exits),
+   KVM_STAT(VCPU, CUMULATIVE, NONE, addrerr_ld_exits),
+   KVM_STAT(VCPU, CUMULATIVE, NONE, syscall_exits),
+   KVM_STAT(VCPU, CUMULATIVE, NONE, resvd_inst_exits),
+   KVM_STAT(VCPU, CUMULATIVE, NONE, break_inst_exits),
+   KVM_STAT(VCPU, CUMULATIVE, NONE, trap_inst_exits),
+   KVM_STAT(VCPU, CUMULATIVE, NONE, msa_fpe_exits),
+   KVM_STAT(VCPU, CUMULATIVE, NONE, fpe_exits),
+   KVM_STAT(VCPU, CUMULATIVE, NONE, msa_disabled_exits),
+   KVM_STAT(VCPU, CUMULATIVE, NONE, flush_dcache_exits),
+   KVM_STAT(VCPU, CUMULATIVE, NONE, vz_gpsi_exits),
+   KVM_STAT(VCPU, CUMULATIVE, NONE, vz_gsfc_exits),
+   KVM_STAT(VCPU, CUMULATIVE, NONE, vz_hc_exits),
+   KVM_STAT(VCPU, CUMULATIVE, NONE, vz_grr_exits),
+   KVM_STAT(VCPU, CUMULATIVE, NONE, vz_gva_exits),
+   KVM_STAT(VCPU, CUMULATIVE, NONE, vz_ghfc_exits),
+   KVM_STAT(VCPU, CUMULATIVE, NONE, vz_gpa_exits),
+   

[PATCH v2 2/4] KVM: Refactor designated initializer macros for struct _kvm_stats_desc

2023-03-06 Thread David Matlack
Refactor the macros that generate struct _kvm_stats_desc designated
initializers to cut down on duplication.
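
The shape of the refactor, reduced to a userspace sketch: one macro emits the
designated initializer, and thin per-scope wrappers only pick the struct. All
demo_/DEMO_ names are hypothetical, and the descriptor is deliberately smaller
than struct _kvm_stats_desc.

```c
#include <stddef.h>

/* Hypothetical per-scope stats structs. */
struct demo_vm_stat {
	unsigned long pad;
	unsigned long foo;
};

struct demo_vcpu_stat {
	unsigned long pad;
	unsigned long exits;
};

/* Reduced descriptor: flags, field offset and name only. */
struct demo_desc {
	unsigned int flags;
	size_t offset;
	const char *name;
};

/* Single place that spells out the designated initializer. */
#define DEMO_DESC(_struct, _field, _name, _flags)	\
{							\
	.flags = (_flags),				\
	.offset = offsetof(_struct, _field),		\
	.name = _name,					\
}

/* Per-scope wrappers differ only in the struct they pass down. */
#define DEMO_VM_DESC(_stat, _flags) \
	DEMO_DESC(struct demo_vm_stat, _stat, #_stat, _flags)
#define DEMO_VCPU_DESC(_stat, _flags) \
	DEMO_DESC(struct demo_vcpu_stat, _stat, #_stat, _flags)

static const struct demo_desc demo_vm_descs[] = {
	DEMO_VM_DESC(foo, 0x1),
};

static const struct demo_desc demo_vcpu_descs[] = {
	DEMO_VCPU_DESC(exits, 0x2),
};
```

Any later change to the descriptor layout then touches one macro instead of
four near-identical copies.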

No functional change intended.

Signed-off-by: David Matlack 
---
 include/linux/kvm_host.h | 75 +++-
 1 file changed, 35 insertions(+), 40 deletions(-)

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 02b1151c2753..6673ae757c4e 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -1744,47 +1744,42 @@ struct _kvm_stats_desc {
char name[KVM_STATS_NAME_SIZE];
 };
 
-#define STATS_DESC_COMMON(type, unit, base, exp, sz, bsz) \
-   .flags = type | unit | base |  \
-BUILD_BUG_ON_ZERO(type & ~KVM_STATS_TYPE_MASK) |  \
-BUILD_BUG_ON_ZERO(unit & ~KVM_STATS_UNIT_MASK) |  \
-BUILD_BUG_ON_ZERO(base & ~KVM_STATS_BASE_MASK),   \
-   .exponent = exp,   \
-   .size = sz,\
-   .bucket_size = bsz
-
-#define VM_GENERIC_STATS_DESC(stat, type, unit, base, exp, sz, bsz)   \
+/* Generates a designated initializer list for a struct _kvm_stats_desc. */
+#define _KVM_STATS_DESC(_struct, _field, _name, _type, _unit, _base,  \
+   _exponent, _size, _bucket_size)\
+{ \
{  \
-   {  \
-   STATS_DESC_COMMON(type, unit, base, exp, sz, bsz), \
-   .offset = offsetof(struct kvm_vm_stat, generic.stat)   \
-   }, \
-   .name = #stat, \
-   }
-#define VCPU_GENERIC_STATS_DESC(stat, type, unit, base, exp, sz, bsz) \
-   {  \
-   {  \
-   STATS_DESC_COMMON(type, unit, base, exp, sz, bsz), \
-   .offset = offsetof(struct kvm_vcpu_stat, generic.stat) \
-   }, \
-   .name = #stat, \
-   }
-#define VM_STATS_DESC(stat, type, unit, base, exp, sz, bsz)   \
-   {  \
-   {  \
-   STATS_DESC_COMMON(type, unit, base, exp, sz, bsz), \
-   .offset = offsetof(struct kvm_vm_stat, stat)   \
-   }, \
-   .name = #stat, \
-   }
-#define VCPU_STATS_DESC(stat, type, unit, base, exp, sz, bsz) \
-   {  \
-   {  \
-   STATS_DESC_COMMON(type, unit, base, exp, sz, bsz), \
-   .offset = offsetof(struct kvm_vcpu_stat, stat) \
-   }, \
-   .name = #stat, \
-   }
+   .flags = _type | _unit | _base |   \
+BUILD_BUG_ON_ZERO(_type & ~KVM_STATS_TYPE_MASK) | \
+BUILD_BUG_ON_ZERO(_unit & ~KVM_STATS_UNIT_MASK) | \
+BUILD_BUG_ON_ZERO(_base & ~KVM_STATS_BASE_MASK),  \
+   .exponent = _exponent, \
+   .size = _size, \
+   .bucket_size = _bucket_size,   \
+   .offset = offsetof(_struct, _field),   \
+   }, \
+   .name = _name, \
+}
+
+#define VM_GENERIC_STATS_DESC(_stat, _type, _unit, _base, _exponent, _size, \
+ _bucket_size)\
+   _KVM_STATS_DESC(struct kvm_vm_stat, generic._stat, #_stat, _type,  \
+   _unit, _base, _exponent, _size, _bucket_size)
+
+#define VCPU_GENERIC_STATS_DESC(_stat, _type, _unit, _base, _exponent, _size,  

[PATCH v2 0/4] KVM: Refactor KVM stats macros and enable custom stat names

2023-03-06 Thread David Matlack
This series refactors the KVM stats macros to reduce duplication and
adds support for choosing custom names for stats.

Custom names make it possible to decouple the userspace-visible stat
names from their internal representation in C. This can allow future
commits to refactor the various stats structs without impacting
userspace tools that read KVM stats.

This also allows stats to be stored in data structures such as arrays,
without needing unions to access specific stats. Case in point, the last
patch in this series removes the pages_{4k,2m,1g} union, which is a
useful cleanup to prepare for sharing paging code across architectures
[1].

And for full transparency, another motivation for this series is that at
Google we have several out-of-tree stats that use arrays. Custom name
support is something we added internally and it reduces our technical
debt to get the support merged upstream.

Tested on x86. Compile tested on ARM. Not yet tested on any other
architectures.

Link: https://lore.kernel.org/kvm/20221208193857.4090582-1-dmatl...@google.com/

v2:
 - Refactor stat macros (patch 1) to reduce duplication and make it
   simpler to add custom name support [Sean]

v1: https://lore.kernel.org/kvm/20230118175300.790835-1-dmatl...@google.com/

David Matlack (4):
  KVM: Refactor stats descriptor generation macros
  KVM: Refactor designated initializer macros for struct _kvm_stats_desc
  KVM: Allow custom names for KVM_STAT()
  KVM: x86: Drop union for pages_{4k,2m,1g} stats

 arch/arm64/kvm/guest.c  |  14 +--
 arch/mips/kvm/mips.c|  54 -
 arch/powerpc/kvm/book3s.c   |  62 +-
 arch/powerpc/kvm/booke.c|  48 
 arch/riscv/kvm/vcpu.c   |  16 +--
 arch/s390/kvm/kvm-s390.c| 198 
 arch/x86/include/asm/kvm_host.h |   9 +-
 arch/x86/kvm/x86.c  |  94 +++
 include/linux/kvm_host.h| 179 +++--
 9 files changed, 314 insertions(+), 360 deletions(-)


base-commit: 45dd9bc75d9adc9483f0c7d662ba6e73ed698a0b
-- 
2.40.0.rc0.216.gc4246ad0f0-goog



Re: [PATCH] mm: add PTE pointer parameter to flush_tlb_fix_spurious_fault()

2023-03-06 Thread Gerald Schaefer
On Mon, 6 Mar 2023 17:06:44 +
Catalin Marinas  wrote:

> On Mon, Mar 06, 2023 at 05:15:48PM +0100, Gerald Schaefer wrote:
> > diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
> > index b6ba466e2e8a..0bd18de9fd97 100644
> > --- a/arch/arm64/include/asm/pgtable.h
> > +++ b/arch/arm64/include/asm/pgtable.h
> > @@ -57,7 +57,7 @@ static inline bool arch_thp_swp_supported(void)
> >   * fault on one CPU which has been handled concurrently by another CPU
> >   * does not need to perform additional invalidation.
> >   */
> > -#define flush_tlb_fix_spurious_fault(vma, address) do { } while (0)
> > +#define flush_tlb_fix_spurious_fault(vma, address, ptep) do { } while (0)
> 
> For arm64:
> 
> Acked-by: Catalin Marinas 
> 
> > diff --git a/arch/s390/include/asm/pgtable.h b/arch/s390/include/asm/pgtable.h
> > index 2c70b4d1263d..c1f6b46ec555 100644
> > --- a/arch/s390/include/asm/pgtable.h
> > +++ b/arch/s390/include/asm/pgtable.h
> > @@ -1239,7 +1239,8 @@ static inline int pte_allow_rdp(pte_t old, pte_t new)
> >  }
> >  
> >  static inline void flush_tlb_fix_spurious_fault(struct vm_area_struct *vma,
> > -   unsigned long address)
> > +   unsigned long address,
> > +   pte_t *ptep)
> >  {
> > /*
> >  * RDP might not have propagated the PTE protection reset to all CPUs,
> > @@ -1247,11 +1248,12 @@ static inline void flush_tlb_fix_spurious_fault(struct vm_area_struct *vma,
> >  * NOTE: This will also be called when a racing pagetable update on
> >  * another thread already installed the correct PTE. Both cases cannot
> >  * really be distinguished.
> > -* Therefore, only do the local TLB flush when RDP can be used, to avoid
> > -* unnecessary overhead.
> > +* Therefore, only do the local TLB flush when RDP can be used, and the
> > +* PTE does not have _PAGE_PROTECT set, to avoid unnecessary overhead.
> > +* A local RDP can be used to do the flush.
> >  */
> > -   if (MACHINE_HAS_RDP)
> > -   asm volatile("ptlb" : : : "memory");
> > +   if (MACHINE_HAS_RDP && !(pte_val(*ptep) & _PAGE_PROTECT))
> > +   __ptep_rdp(address, ptep, 0, 0, 1);
> 
> I wonder whether passing the actual entry is somewhat quicker as it
> avoids another memory access (though it might already be in the cache).

The RDP instruction itself only requires the PTE pointer as input, or more
precisely a pointer to the pagetable origin. We calculate that from the PTE
pointer, by masking out some bits, w/o actual memory access to the PTE entry
value.
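
A rough sketch of that masking, for illustration only. The table geometry
(256 entries of 8 bytes, i.e. a 2K-aligned table) is an assumption stated
here rather than taken from the patch, and demo_pgtable_origin() is a
hypothetical name, not the kernel helper.

```c
#include <stdint.h>

#define DEMO_PTRS_PER_PTE 256u	/* assumed entries per page table */
typedef uint64_t demo_pte_t;	/* assumed 8-byte entries */

/*
 * Derive the page table origin from a pointer to one of its entries.
 * Only the pointer value is used; the entry itself is never read.
 */
static inline uintptr_t demo_pgtable_origin(const demo_pte_t *ptep)
{
	uintptr_t table_size = DEMO_PTRS_PER_PTE * sizeof(demo_pte_t);

	return (uintptr_t)ptep & ~(table_size - 1);
}
```

This only works because the table is naturally aligned to its own size, so
clearing the low bits of any entry address yields the start of the table.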

Of course, there is the pte_val(*ptep) & _PAGE_PROTECT check here, with
memory access, but this might get removed in the future. TBH, I simply
wasn't sure (enough) yet, if we could technically ever end up here with
_PAGE_PROTECT set at all. For "real" spurious protection faults, it should
never be set, not so sure about racing pagetable updates though.

So this might actually be an unnecessary / overly cautious check, that
gets removed in the future, and not worth passing along the PTE value
in addition to the pointer.
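
To illustrate why finding the pagetable origin needs no load from the PTE,
here is a minimal userspace sketch. It assumes 2 KiB page tables holding 256
eight-byte entries; the helper names pte_table_origin() and
pte_index_in_table() are hypothetical and not taken from the patch.

```c
#include <stdint.h>

/* Assumed layout: 256 eight-byte entries per page table (2 KiB tables). */
#define PTRS_PER_PTE 256u
#define PTE_SIZE     8u
#define PTE_TABLE_MASK ((uintptr_t)PTRS_PER_PTE * PTE_SIZE - 1)

/*
 * Hypothetical helper: derive the table origin from the address of a PTE
 * by clearing the low bits that index into the table. Only the pointer
 * value is used; the PTE itself is never read.
 */
static inline uintptr_t pte_table_origin(uintptr_t ptep)
{
	return ptep & ~PTE_TABLE_MASK;
}

/* Hypothetical helper: index of the entry within its table. */
static inline unsigned int pte_index_in_table(uintptr_t ptep)
{
	return (unsigned int)((ptep & PTE_TABLE_MASK) / PTE_SIZE);
}
```

The only memory access left in the patched function is then the
pte_val(*ptep) & _PAGE_PROTECT check discussed above.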


[PATCH v2 0/4] Reenable VFIO support on POWER systems

2023-03-06 Thread Timothy Pearson
This patch series reenables VFIO support on POWER systems. It
is based on Alexey Kardashevskiy's patch series, rebased and
successfully tested under QEMU with a Marvell PCIe SATA controller
on a POWER9 Blackbird host.

Alexey Kardashevskiy (3):
  powerpc/iommu: Add "borrowing" iommu_table_group_ops
  powerpc/pci_64: Init pcibios subsys a bit later
  powerpc/iommu: Add iommu_ops to report capabilities and allow blocking
domains

Timothy Pearson (1):
  Add myself to MAINTAINERS for Power VFIO support

 MAINTAINERS   |   5 +
 arch/powerpc/include/asm/iommu.h  |   6 +-
 arch/powerpc/include/asm/pci-bridge.h |   7 +
 arch/powerpc/kernel/iommu.c   | 246 +-
 arch/powerpc/kernel/pci_64.c  |   2 +-
 arch/powerpc/platforms/powernv/pci-ioda.c |  36 +++-
 arch/powerpc/platforms/pseries/iommu.c|  27 +++
 arch/powerpc/platforms/pseries/pseries.h  |   4 +
 arch/powerpc/platforms/pseries/setup.c|   3 +
 drivers/vfio/vfio_iommu_spapr_tce.c   |  96 ++---
 10 files changed, 338 insertions(+), 94 deletions(-)

-- 
2.30.2


[PATCH v2 1/4] powerpc/iommu: Add "borrowing" iommu_table_group_ops

2023-03-06 Thread Timothy Pearson
The PPC64 IOMMU API defines iommu_table_group_ops, which handles DMA windows
for PEs: control the ownership and create/set/unset a hardware table
for dynamic DMA windows (DDW). VFIO uses the API to implement support
on POWER.

So far only PowerNV IODA2 (POWER8 and newer machines) implemented this;
other cases (POWER7 or nested KVM) did not and instead reused the
existing iommu_table structs. This means 1) no DDW and 2) ownership transfer
is done directly in the VFIO SPAPR TCE driver.

Soon POWER is going to get its own iommu_ops and ownership control is
going to move there. This implements spapr_tce_table_group_ops which
borrows iommu_table tables. The upside is that VFIO needs to know less
about POWER.

The new ops returns the existing table from create_table() and
only checks if the same window is already set. This is only going to work
if the default DMA window starts at table_group.tce32_start and is as big as
pe->table_group.tce32_size (not the case for IODA2+ PowerNV).

This changes iommu_table_group_ops::take_ownership() to return an error
if borrowing a table failed.
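
A minimal userspace sketch of that error path, modelled on the rollback loop
in spapr_tce_take_ownership(). The mock_* types and helpers are hypothetical
stand-ins for the kernel structures, and "busy" is an assumed reason for a
table to refuse the handover.

```c
#include <errno.h>
#include <stddef.h>

#define MOCK_MAX_TABLES 2

/* Hypothetical stand-in for struct iommu_table. */
struct mock_table {
	int busy;   /* non-zero if the table still has live mappings */
	int owned;  /* non-zero once handed to the external user */
};

/* Hypothetical stand-in for struct iommu_table_group. */
struct mock_table_group {
	struct mock_table *tables[MOCK_MAX_TABLES];
};

static int mock_take_table(struct mock_table *tbl)
{
	if (tbl->busy)
		return -EBUSY;
	tbl->owned = 1;
	return 0;
}

static void mock_release_table(struct mock_table *tbl)
{
	tbl->owned = 0;
}

/* Take every table in the group, or none of them. */
static long mock_take_ownership(struct mock_table_group *grp)
{
	int i, j, rc;

	for (i = 0; i < MOCK_MAX_TABLES; ++i) {
		struct mock_table *tbl = grp->tables[i];

		if (!tbl)
			continue;
		rc = mock_take_table(tbl);
		if (!rc)
			continue;
		/* Undo what was taken so far, then report the failure. */
		for (j = 0; j < i; ++j)
			if (grp->tables[j])
				mock_release_table(grp->tables[j]);
		return rc;
	}
	return 0;
}
```

With a void return type the caller could not tell a half-taken group from a
fully taken one, which is why the callback now returns long.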

This should not cause any visible change in behavior for PowerNV.
pSeries was not that well tested/supported anyway.

Signed-off-by: Alexey Kardashevskiy 
Signed-off-by: Timothy Pearson 
---
 arch/powerpc/include/asm/iommu.h  |  6 +-
 arch/powerpc/kernel/iommu.c   | 98 ++-
 arch/powerpc/platforms/powernv/pci-ioda.c |  6 +-
 arch/powerpc/platforms/pseries/iommu.c|  3 +
 drivers/vfio/vfio_iommu_spapr_tce.c   | 94 --
 5 files changed, 121 insertions(+), 86 deletions(-)

diff --git a/arch/powerpc/include/asm/iommu.h b/arch/powerpc/include/asm/iommu.h
index 7e29c73e3dd4..678b5bdc79b1 100644
--- a/arch/powerpc/include/asm/iommu.h
+++ b/arch/powerpc/include/asm/iommu.h
@@ -175,7 +175,7 @@ struct iommu_table_group_ops {
long (*unset_window)(struct iommu_table_group *table_group,
int num);
/* Switch ownership from platform code to external user (e.g. VFIO) */
-   void (*take_ownership)(struct iommu_table_group *table_group);
+   long (*take_ownership)(struct iommu_table_group *table_group);
/* Switch ownership from external user (e.g. VFIO) back to core */
void (*release_ownership)(struct iommu_table_group *table_group);
 };
@@ -215,6 +215,8 @@ extern long iommu_tce_xchg_no_kill(struct mm_struct *mm,
enum dma_data_direction *direction);
 extern void iommu_tce_kill(struct iommu_table *tbl,
unsigned long entry, unsigned long pages);
+
+extern struct iommu_table_group_ops spapr_tce_table_group_ops;
 #else
 static inline void iommu_register_group(struct iommu_table_group *table_group,
int pci_domain_number,
@@ -303,8 +305,6 @@ extern int iommu_tce_check_gpa(unsigned long page_shift,
iommu_tce_check_gpa((tbl)->it_page_shift, (gpa)))
 
 extern void iommu_flush_tce(struct iommu_table *tbl);
-extern int iommu_take_ownership(struct iommu_table *tbl);
-extern void iommu_release_ownership(struct iommu_table *tbl);
 
 extern enum dma_data_direction iommu_tce_direction(unsigned long tce);
 extern unsigned long iommu_direction_to_tce_perm(enum dma_data_direction dir);
diff --git a/arch/powerpc/kernel/iommu.c b/arch/powerpc/kernel/iommu.c
index ee95937bdaf1..f9f5a9418092 100644
--- a/arch/powerpc/kernel/iommu.c
+++ b/arch/powerpc/kernel/iommu.c
@@ -1086,7 +1086,7 @@ void iommu_tce_kill(struct iommu_table *tbl,
 }
 EXPORT_SYMBOL_GPL(iommu_tce_kill);
 
-int iommu_take_ownership(struct iommu_table *tbl)
+static int iommu_take_ownership(struct iommu_table *tbl)
 {
unsigned long flags, i, sz = (tbl->it_size + 7) >> 3;
int ret = 0;
@@ -1118,9 +1118,8 @@ int iommu_take_ownership(struct iommu_table *tbl)
 
return ret;
 }
-EXPORT_SYMBOL_GPL(iommu_take_ownership);
 
-void iommu_release_ownership(struct iommu_table *tbl)
+static void iommu_release_ownership(struct iommu_table *tbl)
 {
unsigned long flags, i, sz = (tbl->it_size + 7) >> 3;
 
@@ -1137,7 +1136,6 @@ void iommu_release_ownership(struct iommu_table *tbl)
spin_unlock(&tbl->pools[i].lock);
spin_unlock_irqrestore(&tbl->large_pool.lock, flags);
 }
-EXPORT_SYMBOL_GPL(iommu_release_ownership);
 
 int iommu_add_device(struct iommu_table_group *table_group, struct device *dev)
 {
@@ -1179,4 +1177,96 @@ void iommu_del_device(struct device *dev)
iommu_group_remove_device(dev);
 }
 EXPORT_SYMBOL_GPL(iommu_del_device);
+
+/*
+ * A simple iommu_table_group_ops which only allows reusing the existing
+ * iommu_table. This handles VFIO for POWER7 or the nested KVM.
+ * The ops does not allow creating windows and only allows reusing the existing
+ * one if it matches table_group->tce32_start/tce32_size/page_shift.
+ */
+static unsigned long spapr_tce_get_table_size(__u32 page_shift,
+ __u64 window_size, __u32 levels)
+{
+  

[PATCH v2 2/4] powerpc/pci_64: Init pcibios subsys a bit later

2023-03-06 Thread Timothy Pearson
The following patches are going to add dependency/use of iommu_ops which
is initialized in subsys_initcall as well.

This moves pcibios_init() to the next initcall level.

This should not cause behavioral change.
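
A small userspace sketch of the ordering this relies on: each "_sync" initcall
level runs after the plain level of the same number. The enum, the struct and
mock_iommu_setup are hypothetical; the real kernel orders initcalls through
linker sections, not by sorting, and the sketch ignores ordering within one
level.

```c
#include <stdlib.h>

/* Relative order of a few initcall levels (assumed subset). */
enum mock_initcall_level {
	LVL_POSTCORE,       /* postcore_initcall()      */
	LVL_POSTCORE_SYNC,  /* postcore_initcall_sync() */
	LVL_ARCH,           /* arch_initcall()          */
	LVL_ARCH_SYNC,      /* arch_initcall_sync()     */
	LVL_SUBSYS,         /* subsys_initcall()        */
	LVL_SUBSYS_SYNC,    /* subsys_initcall_sync()   */
};

struct mock_initcall {
	const char *name;
	enum mock_initcall_level level;
};

static int mock_cmp_level(const void *a, const void *b)
{
	const struct mock_initcall *x = a;
	const struct mock_initcall *y = b;

	return (x->level > y->level) - (x->level < y->level);
}

/* Arrange the calls in the order the levels would run them. */
static void mock_order_initcalls(struct mock_initcall *calls, size_t n)
{
	qsort(calls, n, sizeof(*calls), mock_cmp_level);
}
```

Ordering { "pcibios_init", LVL_SUBSYS_SYNC } together with a hypothetical
{ "mock_iommu_setup", LVL_SUBSYS } puts the setup call first, which is the
effect the one-line change is after.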

Signed-off-by: Alexey Kardashevskiy 
Signed-off-by: Timothy Pearson 
---
 arch/powerpc/kernel/pci_64.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/pci_64.c b/arch/powerpc/kernel/pci_64.c
index fd42059ae2a5..e27342ef128b 100644
--- a/arch/powerpc/kernel/pci_64.c
+++ b/arch/powerpc/kernel/pci_64.c
@@ -73,7 +73,7 @@ static int __init pcibios_init(void)
return 0;
 }
 
-subsys_initcall(pcibios_init);
+subsys_initcall_sync(pcibios_init);
 
 int pcibios_unmap_io_space(struct pci_bus *bus)
 {
-- 
2.30.2


[PATCH v2] KVM: PPC: Make KVM_CAP_IRQFD_RESAMPLE support platform dependent

2023-03-06 Thread Timothy Pearson

When introduced, IRQFD resampling worked on POWER8 with XICS. However
KVM on POWER9 has never implemented it - the compatibility mode code
("XICS-on-XIVE") misses the kvm_notify_acked_irq() call and the native
XIVE mode does not handle INTx in KVM at all.

This moves the capability support advertising to platforms and stops
advertising it on XIVE, i.e. POWER9 and later.
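
The resulting dispatch can be sketched in userspace as follows. The mock_*
names are hypothetical; only the shape (the generic layer answers
KVM_CAP_IRQFD, each architecture answers KVM_CAP_IRQFD_RESAMPLE, and powerpc
consults xive_enabled()) follows the diff below.

```c
#include <stdbool.h>

enum mock_cap { MOCK_CAP_IRQFD, MOCK_CAP_IRQFD_RESAMPLE, MOCK_CAP_UNKNOWN };
enum mock_arch { MOCK_ARCH_OTHER, MOCK_ARCH_POWERPC };

struct mock_platform {
	enum mock_arch arch;
	bool xive_enabled;
};

/* Per-architecture answer, standing in for kvm_vm_ioctl_check_extension(). */
static int mock_arch_check(const struct mock_platform *p, enum mock_cap cap)
{
	switch (cap) {
	case MOCK_CAP_IRQFD_RESAMPLE:
		if (p->arch == MOCK_ARCH_POWERPC)
			return !p->xive_enabled;
		return 1;
	default:
		return 0;
	}
}

/* Generic layer: answers what is common and defers the rest to the arch. */
static int mock_generic_check(const struct mock_platform *p, enum mock_cap cap)
{
	switch (cap) {
	case MOCK_CAP_IRQFD:
		return 1;
	default:
		return mock_arch_check(p, cap);
	}
}
```

Userspace keeps probing the capability exactly as before; only the answer on
XIVE hosts changes from 1 to 0.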

Signed-off-by: Alexey Kardashevskiy 
Signed-off-by: Timothy Pearson 
---
 arch/arm64/kvm/arm.c   | 3 +++
 arch/mips/kvm/mips.c   | 3 +++
 arch/powerpc/kvm/powerpc.c | 6 ++
 arch/riscv/kvm/vm.c| 3 +++
 arch/s390/kvm/kvm-s390.c   | 3 +++
 arch/x86/kvm/x86.c | 3 +++
 virt/kvm/kvm_main.c| 1 -
 7 files changed, 21 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index 3bd732eaf087..0ad50969430a 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -220,6 +220,9 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
case KVM_CAP_VCPU_ATTRIBUTES:
case KVM_CAP_PTP_KVM:
case KVM_CAP_ARM_SYSTEM_SUSPEND:
+#ifdef CONFIG_HAVE_KVM_IRQFD
+   case KVM_CAP_IRQFD_RESAMPLE:
+#endif
r = 1;
break;
case KVM_CAP_SET_GUEST_DEBUG2:
diff --git a/arch/mips/kvm/mips.c b/arch/mips/kvm/mips.c
index 36c8991b5d39..52bdc479875d 100644
--- a/arch/mips/kvm/mips.c
+++ b/arch/mips/kvm/mips.c
@@ -1046,6 +1046,9 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
case KVM_CAP_READONLY_MEM:
case KVM_CAP_SYNC_MMU:
case KVM_CAP_IMMEDIATE_EXIT:
+#ifdef CONFIG_HAVE_KVM_IRQFD
+   case KVM_CAP_IRQFD_RESAMPLE:
+#endif
r = 1;
break;
case KVM_CAP_NR_VCPUS:
diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
index 4c5405fc5538..d23e25e8432d 100644
--- a/arch/powerpc/kvm/powerpc.c
+++ b/arch/powerpc/kvm/powerpc.c
@@ -576,6 +576,12 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
break;
 #endif
 
+#ifdef CONFIG_HAVE_KVM_IRQFD
+   case KVM_CAP_IRQFD_RESAMPLE:
+   r = !xive_enabled();
+   break;
+#endif
+
case KVM_CAP_PPC_ALLOC_HTAB:
r = hv_enabled;
break;
diff --git a/arch/riscv/kvm/vm.c b/arch/riscv/kvm/vm.c
index 65a964d7e70d..0ef7a6168018 100644
--- a/arch/riscv/kvm/vm.c
+++ b/arch/riscv/kvm/vm.c
@@ -65,6 +65,9 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
case KVM_CAP_READONLY_MEM:
case KVM_CAP_MP_STATE:
case KVM_CAP_IMMEDIATE_EXIT:
+#ifdef CONFIG_HAVE_KVM_IRQFD
+   case KVM_CAP_IRQFD_RESAMPLE:
+#endif
r = 1;
break;
case KVM_CAP_NR_VCPUS:
diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
index 39b36562c043..6ca84bfdd2dc 100644
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -573,6 +573,9 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
case KVM_CAP_S390_VCPU_RESETS:
case KVM_CAP_SET_GUEST_DEBUG:
case KVM_CAP_S390_DIAG318:
+#ifdef CONFIG_HAVE_KVM_IRQFD
+   case KVM_CAP_IRQFD_RESAMPLE:
+#endif
r = 1;
break;
case KVM_CAP_SET_GUEST_DEBUG2:
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 7713420abab0..891aeace811e 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -4432,6 +4432,9 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
case KVM_CAP_VAPIC:
case KVM_CAP_ENABLE_CAP:
case KVM_CAP_VM_DISABLE_NX_HUGE_PAGES:
+#ifdef CONFIG_HAVE_KVM_IRQFD
+   case KVM_CAP_IRQFD_RESAMPLE:
+#endif
r = 1;
break;
case KVM_CAP_EXIT_HYPERCALL:
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index d255964ec331..b1679d08a216 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -4479,7 +4479,6 @@ static long kvm_vm_ioctl_check_extension_generic(struct kvm *kvm, long arg)
 #endif
 #ifdef CONFIG_HAVE_KVM_IRQFD
case KVM_CAP_IRQFD:
-   case KVM_CAP_IRQFD_RESAMPLE:
 #endif
case KVM_CAP_IOEVENTFD_ANY_LENGTH:
case KVM_CAP_CHECK_EXTENSION_VM:
-- 
2.30.2


[PATCH v2 4/4] Add myself to MAINTAINERS for Power VFIO support

2023-03-06 Thread Timothy Pearson
Signed-off-by: Timothy Pearson 
---
 MAINTAINERS | 5 +
 1 file changed, 5 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 8d5bc223f305..876f96e82d66 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -9836,6 +9836,11 @@ F:   drivers/crypto/vmx/ghash*
 F: drivers/crypto/vmx/ppc-xlate.pl
 F: drivers/crypto/vmx/vmx.c
 
+IBM Power VFIO Support
+M: Timothy Pearson 
+S: Supported
+F: drivers/vfio/vfio_iommu_spapr_tce.c
+
 IBM ServeRAID RAID DRIVER
 S: Orphan
 F: drivers/scsi/ips.*
-- 
2.30.2


[PATCH v2 3/4] powerpc/iommu: Add iommu_ops to report capabilities and allow blocking domains

2023-03-06 Thread Timothy Pearson

Up until now PPC64 managed to avoid using iommu_ops. The VFIO driver
uses a SPAPR TCE sub-driver and all iommu_ops uses were kept in
the Type1 VFIO driver. Recent development added 2 uses of iommu_ops to
the generic VFIO which broke POWER:
- a coherency capability check;
- blocking IOMMU domain - iommu_group_dma_owner_claimed()/...

This adds a simple iommu_ops which reports support for cache
coherency and provides basic support for blocking domains. No other
domain types are implemented so the default domain is NULL.

Since now iommu_ops controls the group ownership, this takes it out of
VFIO.

This adds an IOMMU device into a pci_controller (=PHB) and registers it
in the IOMMU subsystem; iommu_ops is registered at this point.
This setup is done in postcore_initcall_sync.

This replaces iommu_group_add_device() with iommu_probe_device() as
the former misses necessary steps in connecting PCI devices to IOMMU
devices. This adds a comment about why explicit iommu_probe_device()
is still needed.

The previous discussion is here:
https://patchwork.ozlabs.org/project/linuxppc-dev/patch/20220707135552.3688927-1-...@ozlabs.ru/
https://patchwork.ozlabs.org/project/kvm-ppc/patch/20220701061751.1955857-1-...@ozlabs.ru/

Fixes: e8ae0e140c05 ("vfio: Require that devices support DMA cache coherence")
Fixes: 70693f470848 ("vfio: Set DMA ownership for VFIO devices")
Cc: Deming Wang 
Cc: Robin Murphy 
Cc: Jason Gunthorpe 
Cc: Alex Williamson 
Cc: Daniel Henrique Barboza 
Cc: Fabiano Rosas 
Cc: Murilo Opsfelder Araujo 
Cc: Nicholas Piggin 
Co-authored-by: Timothy Pearson 
Signed-off-by: Alexey Kardashevskiy 
Signed-off-by: Timothy Pearson 
---
 arch/powerpc/include/asm/pci-bridge.h |   7 +
 arch/powerpc/kernel/iommu.c   | 148 +-
 arch/powerpc/platforms/powernv/pci-ioda.c |  30 +
 arch/powerpc/platforms/pseries/iommu.c|  24 
 arch/powerpc/platforms/pseries/pseries.h  |   4 +
 arch/powerpc/platforms/pseries/setup.c|   3 +
 drivers/vfio/vfio_iommu_spapr_tce.c   |   8 --
 7 files changed, 214 insertions(+), 10 deletions(-)

diff --git a/arch/powerpc/include/asm/pci-bridge.h b/arch/powerpc/include/asm/pci-bridge.h
index 71c1d26f2400..2aa3a091ef20 100644
--- a/arch/powerpc/include/asm/pci-bridge.h
+++ b/arch/powerpc/include/asm/pci-bridge.h
@@ -8,6 +8,7 @@
 #include 
 #include 
 #include 
+#include 
 
 struct device_node;
 
@@ -44,6 +45,9 @@ struct pci_controller_ops {
 #endif
 
void(*shutdown)(struct pci_controller *hose);
+
+   struct iommu_group *(*device_group)(struct pci_controller *hose,
+   struct pci_dev *pdev);
 };
 
 /*
@@ -131,6 +135,9 @@ struct pci_controller {
struct irq_domain   *dev_domain;
struct irq_domain   *msi_domain;
struct fwnode_handle*fwnode;
+
+   /* iommu_ops support */
+   struct iommu_device iommu;
 };
 
 /* These are used for config access before all the PCI probing
diff --git a/arch/powerpc/kernel/iommu.c b/arch/powerpc/kernel/iommu.c
index f9f5a9418092..b42e202af3bc 100644
--- a/arch/powerpc/kernel/iommu.c
+++ b/arch/powerpc/kernel/iommu.c
@@ -35,6 +35,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #define DBG(...)
 
@@ -1156,8 +1157,14 @@ int iommu_add_device(struct iommu_table_group *table_group, struct device *dev)
 
pr_debug("%s: Adding %s to iommu group %d\n",
 __func__, dev_name(dev),  iommu_group_id(table_group->group));
-
-   return iommu_group_add_device(table_group->group, dev);
+   /*
+* This is still not adding devices via the IOMMU bus notifier because
+* of pcibios_init() from arch/powerpc/kernel/pci_64.c which calls
+* pcibios_scan_phb() first (and this guy adds devices and triggers
+* the notifier) and only then it calls pci_bus_add_devices() which
+* configures DMA for buses which also creates PEs and IOMMU groups.
+*/
+   return iommu_probe_device(dev);
 }
 EXPORT_SYMBOL_GPL(iommu_add_device);
 
@@ -1237,6 +1244,7 @@ static long spapr_tce_take_ownership(struct iommu_table_group *table_group)
rc = iommu_take_ownership(tbl);
if (!rc)
continue;
+
for (j = 0; j < i; ++j)
iommu_release_ownership(table_group->tables[j]);
return rc;
@@ -1269,4 +1277,140 @@ struct iommu_table_group_ops spapr_tce_table_group_ops = {
.release_ownership = spapr_tce_release_ownership,
 };
 
+/*
+ * A simple iommu_ops to allow less cruft in generic VFIO code.
+ */
+static int spapr_tce_blocking_iommu_attach_dev(struct iommu_domain *dom,
+  struct device *dev)
+{
+   struct iommu_group *grp = iommu_group_get(dev);
+   struct iommu_table_group *table_group;
+   int ret = -EINVAL;
+
+   if (!grp)
+   return -ENODEV;
+
+   table_group = 

Re: [PATCH] powerpc/mm: fix mmap_lock bad unlock

2023-03-06 Thread Suren Baghdasaryan
On Mon, Mar 6, 2023 at 6:09 AM Laurent Dufour  wrote:
>
> On 06/03/2023 15:07:26, David Hildenbrand wrote:
> > On 06.03.23 14:55, Laurent Dufour wrote:
> >> When a page fault is handled while holding the per-VMA lock, bad_access_pkey()
> >> and bad_access() should not be called because they assume the mmap_lock is
> >> held.
> >> If a bad access is detected, fall back to the default path,
> >> grabbing the mmap_lock to handle the fault and report the error.
> >>
> >> Fixes: 169db3bb4609 ("powerc/mm: try VMA lock-based page fault handling
> >> first")
> >> Reported-by: Sachin Sant 
> >> Link:
> >> https://lore.kernel.org/linux-mm/842502fb-f99c-417c-9648-a37d0ecdc...@linux.ibm.com
> >> Cc: Suren Baghdasaryan 
> >> Signed-off-by: Laurent Dufour 
> >> ---
> >>   arch/powerpc/mm/fault.c | 8 ++--
> >>   1 file changed, 2 insertions(+), 6 deletions(-)
> >>
> >> diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
> >> index c7ae86b04b8a..e191b3ebd8d6 100644
> >> --- a/arch/powerpc/mm/fault.c
> >> +++ b/arch/powerpc/mm/fault.c
> >> @@ -479,17 +479,13 @@ static int ___do_page_fault(struct pt_regs *regs,
> >> unsigned long address,
> >> if (unlikely(access_pkey_error(is_write, is_exec,
> >>  (error_code & DSISR_KEYFAULT), vma))) {
> >> -int rc = bad_access_pkey(regs, address, vma);
> >> -
> >>   vma_end_read(vma);
> >> -return rc;
> >> +goto lock_mmap;
> >>   }
> >> if (unlikely(access_error(is_write, is_exec, vma))) {
> >> -int rc = bad_access(regs, address);
> >> -
> >>   vma_end_read(vma);
> >> -return rc;
> >> +goto lock_mmap;
> >>   }
> >> fault = handle_mm_fault(vma, address, flags |
> >> FAULT_FLAG_VMA_LOCK, regs);
> >
> > IIUC, that commit is neither upstream nor in mm-stable -- it's unstable.
> > Maybe raise that as a review comment in reply to the original patch, so we
> > can easily connect the dots and squash it into the original, problematic
> > patch that is still under review.
> >
> Oh yes, I missed that. I'll reply to Suren's thread.

Thanks Laurent! Seems simple enough to patch the original change.

>
> Thanks,
> Laurent.
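
For reference, the control flow of the fix can be sketched in userspace like
this. Everything named mock_* is hypothetical. The sketch assumes, as the
commit message states, that the bad_access() family expects mmap_lock to be
held and releases it, so it counts any release that happens without a
matching acquire.

```c
#include <stdbool.h>

enum fault_path { PATH_VMA_LOCK, PATH_MMAP_LOCK };

struct mock_mm {
	int vma_read_locks;   /* per-VMA read locks currently held */
	int mmap_locks;       /* mmap_lock holds currently active */
	int bad_unlocks;      /* mmap_lock released while not held */
	int errors_reported;
};

/* Stand-in for bad_access(): reports the error and drops mmap_lock. */
static void mock_bad_access(struct mock_mm *mm)
{
	mm->errors_reported++;
	if (mm->mmap_locks > 0)
		mm->mmap_locks--;
	else
		mm->bad_unlocks++;
}

static enum fault_path mock_do_page_fault(struct mock_mm *mm, bool access_ok)
{
	/* Fast path: only the per-VMA lock is held. */
	mm->vma_read_locks++;
	if (!access_ok) {
		mm->vma_read_locks--;   /* vma_end_read() */
		goto lock_mmap;         /* do not report the error from here */
	}
	/* ... the fault would be handled under the VMA lock ... */
	mm->vma_read_locks--;
	return PATH_VMA_LOCK;

lock_mmap:
	/* Slow path: take mmap_lock and redo the access checks. */
	mm->mmap_locks++;
	if (!access_ok) {
		mock_bad_access(mm);    /* releases mmap_lock itself */
		return PATH_MMAP_LOCK;
	}
	mm->mmap_locks--;
	return PATH_MMAP_LOCK;
}
```

Calling mock_bad_access() directly from the fast path would bump bad_unlocks,
which is the bad unlock named in the subject line.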


Re: [PATCH] mm: add PTE pointer parameter to flush_tlb_fix_spurious_fault()

2023-03-06 Thread Catalin Marinas
On Mon, Mar 06, 2023 at 05:15:48PM +0100, Gerald Schaefer wrote:
> diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
> index b6ba466e2e8a..0bd18de9fd97 100644
> --- a/arch/arm64/include/asm/pgtable.h
> +++ b/arch/arm64/include/asm/pgtable.h
> @@ -57,7 +57,7 @@ static inline bool arch_thp_swp_supported(void)
>   * fault on one CPU which has been handled concurrently by another CPU
>   * does not need to perform additional invalidation.
>   */
> -#define flush_tlb_fix_spurious_fault(vma, address) do { } while (0)
> +#define flush_tlb_fix_spurious_fault(vma, address, ptep) do { } while (0)

For arm64:

Acked-by: Catalin Marinas 

> diff --git a/arch/s390/include/asm/pgtable.h b/arch/s390/include/asm/pgtable.h
> index 2c70b4d1263d..c1f6b46ec555 100644
> --- a/arch/s390/include/asm/pgtable.h
> +++ b/arch/s390/include/asm/pgtable.h
> @@ -1239,7 +1239,8 @@ static inline int pte_allow_rdp(pte_t old, pte_t new)
>  }
>  
>  static inline void flush_tlb_fix_spurious_fault(struct vm_area_struct *vma,
> - unsigned long address)
> + unsigned long address,
> + pte_t *ptep)
>  {
>   /*
>* RDP might not have propagated the PTE protection reset to all CPUs,
> @@ -1247,11 +1248,12 @@ static inline void flush_tlb_fix_spurious_fault(struct vm_area_struct *vma,
>* NOTE: This will also be called when a racing pagetable update on
>* another thread already installed the correct PTE. Both cases cannot
>* really be distinguished.
> -  * Therefore, only do the local TLB flush when RDP can be used, to avoid
> -  * unnecessary overhead.
> +  * Therefore, only do the local TLB flush when RDP can be used, and the
> +  * PTE does not have _PAGE_PROTECT set, to avoid unnecessary overhead.
> +  * A local RDP can be used to do the flush.
>*/
> - if (MACHINE_HAS_RDP)
> - asm volatile("ptlb" : : : "memory");
> + if (MACHINE_HAS_RDP && !(pte_val(*ptep) & _PAGE_PROTECT))
> + __ptep_rdp(address, ptep, 0, 0, 1);

I wonder whether passing the actual entry is somewhat quicker as it
avoids another memory access (though it might already be in the cache).

-- 
Catalin


Re: [PATCH v7 00/10] Add the PowerQUICC audio support using the QMC

2023-03-06 Thread Mark Brown
On Mon, Mar 06, 2023 at 05:17:44PM +0100, Herve Codina wrote:
> Hi,
> 
> This series adds support for audio using the QMC controller available in
> some Freescale PowerQUICC SoCs.
> 
> This series contains three parts in order to show the different blocks
> hierarchy and their usage in this support.

I already applied this series, please send incremental patches with any
changes.



[PATCH v7 10/10] MAINTAINERS: add the Freescale QMC audio entry

2023-03-06 Thread Herve Codina
After contributing the component, add myself as the maintainer for the
Freescale QMC audio ASoC component.

Signed-off-by: Herve Codina 
Reviewed-by: Christophe Leroy 
---
 MAINTAINERS | 8 
 1 file changed, 8 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 5597d402fbd8..a6e6b70cf8bd 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -8315,6 +8315,14 @@ F:   sound/soc/fsl/fsl*
 F: sound/soc/fsl/imx*
 F: sound/soc/fsl/mpc8610_hpcd.c
 
+FREESCALE SOC SOUND QMC DRIVER
+M: Herve Codina 
+L: alsa-de...@alsa-project.org (moderated for non-subscribers)
+L: linuxppc-dev@lists.ozlabs.org
+S: Maintained
+F: Documentation/devicetree/bindings/sound/fsl,qmc-audio.yaml
+F: sound/soc/fsl/fsl_qmc_audio.c
+
 FREESCALE USB PERIPHERAL DRIVERS
 M: Li Yang 
 L: linux-...@vger.kernel.org
-- 
2.39.2



[PATCH v7 09/10] ASoC: fsl: Add support for QMC audio

2023-03-06 Thread Herve Codina
The QMC audio is an ASoC component which provides DAIs that use the QMC
(QUICC Multichannel Controller) to transfer the audio data.

It provides as many DAIs as the number of QMC channels it references.
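
The completion callbacks in the patch below advance per-period pointers
through the DMA buffer and wrap them back to the start once they reach the
end. A minimal userspace sketch of that arithmetic, with a hypothetical
struct mock_ring standing in for the dma_addr_t fields of struct
qmc_dai_prtd:

```c
#include <stdint.h>

/* Hypothetical stand-in for the buffer bookkeeping in struct qmc_dai_prtd. */
struct mock_ring {
	uint32_t start;        /* first byte of the DMA buffer */
	uint32_t end;          /* one past the last byte of the buffer */
	uint32_t period_size;  /* bytes per period */
	uint32_t ptr;          /* current period pointer */
};

/* Step to the next period, wrapping to the start of the buffer. */
static void mock_ring_advance(struct mock_ring *r)
{
	r->ptr += r->period_size;
	if (r->ptr >= r->end)
		r->ptr = r->start;
}
```

The sketch assumes the buffer length is a whole multiple of the period size,
so the pointer always lands exactly on the end before wrapping.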

Signed-off-by: Herve Codina 
Reviewed-by: Christophe Leroy 
Tested-by: Christophe Leroy 
---
 sound/soc/fsl/Kconfig |   9 +
 sound/soc/fsl/Makefile|   2 +
 sound/soc/fsl/fsl_qmc_audio.c | 735 ++
 3 files changed, 746 insertions(+)
 create mode 100644 sound/soc/fsl/fsl_qmc_audio.c

diff --git a/sound/soc/fsl/Kconfig b/sound/soc/fsl/Kconfig
index 614eceda6b9e..17db29c25d96 100644
--- a/sound/soc/fsl/Kconfig
+++ b/sound/soc/fsl/Kconfig
@@ -172,6 +172,15 @@ config SND_MPC52xx_DMA
 config SND_SOC_POWERPC_DMA
tristate
 
+config SND_SOC_POWERPC_QMC_AUDIO
+   tristate "QMC ALSA SoC support"
+   depends on CPM_QMC
+   help
+ ALSA SoC Audio support using the Freescale QUICC Multichannel
+ Controller (QMC).
+ Say Y or M if you want to add support for SoC audio using Freescale
+ QMC.
+
 comment "SoC Audio support for Freescale PPC boards:"
 
 config SND_SOC_MPC8610_HPCD
diff --git a/sound/soc/fsl/Makefile b/sound/soc/fsl/Makefile
index b54beb1a66fa..8db7e97d0bd5 100644
--- a/sound/soc/fsl/Makefile
+++ b/sound/soc/fsl/Makefile
@@ -28,6 +28,7 @@ snd-soc-fsl-easrc-objs := fsl_easrc.o
 snd-soc-fsl-xcvr-objs := fsl_xcvr.o
 snd-soc-fsl-aud2htx-objs := fsl_aud2htx.o
 snd-soc-fsl-rpmsg-objs := fsl_rpmsg.o
+snd-soc-fsl-qmc-audio-objs := fsl_qmc_audio.o
 
 obj-$(CONFIG_SND_SOC_FSL_AUDMIX) += snd-soc-fsl-audmix.o
 obj-$(CONFIG_SND_SOC_FSL_ASOC_CARD) += snd-soc-fsl-asoc-card.o
@@ -44,6 +45,7 @@ obj-$(CONFIG_SND_SOC_POWERPC_DMA) += snd-soc-fsl-dma.o
 obj-$(CONFIG_SND_SOC_FSL_XCVR) += snd-soc-fsl-xcvr.o
 obj-$(CONFIG_SND_SOC_FSL_AUD2HTX) += snd-soc-fsl-aud2htx.o
 obj-$(CONFIG_SND_SOC_FSL_RPMSG) += snd-soc-fsl-rpmsg.o
+obj-$(CONFIG_SND_SOC_POWERPC_QMC_AUDIO) += snd-soc-fsl-qmc-audio.o
 
 # MPC5200 Platform Support
 obj-$(CONFIG_SND_MPC52xx_DMA) += mpc5200_dma.o
diff --git a/sound/soc/fsl/fsl_qmc_audio.c b/sound/soc/fsl/fsl_qmc_audio.c
new file mode 100644
index ..7cbb8e4758cc
--- /dev/null
+++ b/sound/soc/fsl/fsl_qmc_audio.c
@@ -0,0 +1,735 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * ALSA SoC using the QUICC Multichannel Controller (QMC)
+ *
+ * Copyright 2022 CS GROUP France
+ *
+ * Author: Herve Codina 
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+struct qmc_dai {
+   char *name;
+   int id;
+   struct device *dev;
+   struct qmc_chan *qmc_chan;
+   unsigned int nb_tx_ts;
+   unsigned int nb_rx_ts;
+};
+
+struct qmc_audio {
+   struct device *dev;
+   unsigned int num_dais;
+   struct qmc_dai *dais;
+   struct snd_soc_dai_driver *dai_drivers;
+};
+
+struct qmc_dai_prtd {
+   struct qmc_dai *qmc_dai;
+   dma_addr_t dma_buffer_start;
+   dma_addr_t period_ptr_submitted;
+   dma_addr_t period_ptr_ended;
+   dma_addr_t dma_buffer_end;
+   size_t period_size;
+   struct snd_pcm_substream *substream;
+};
+
+static int qmc_audio_pcm_construct(struct snd_soc_component *component,
+  struct snd_soc_pcm_runtime *rtd)
+{
+   struct snd_card *card = rtd->card->snd_card;
+   int ret;
+
+   ret = dma_coerce_mask_and_coherent(card->dev, DMA_BIT_MASK(32));
+   if (ret)
+   return ret;
+
+   snd_pcm_set_managed_buffer_all(rtd->pcm, SNDRV_DMA_TYPE_DEV, card->dev,
+  64*1024, 64*1024);
+   return 0;
+}
+
+static int qmc_audio_pcm_hw_params(struct snd_soc_component *component,
+  struct snd_pcm_substream *substream,
+  struct snd_pcm_hw_params *params)
+{
+   struct snd_pcm_runtime *runtime = substream->runtime;
+   struct qmc_dai_prtd *prtd = substream->runtime->private_data;
+
+   prtd->dma_buffer_start = runtime->dma_addr;
+   prtd->dma_buffer_end = runtime->dma_addr + params_buffer_bytes(params);
+   prtd->period_size = params_period_bytes(params);
+   prtd->period_ptr_submitted = prtd->dma_buffer_start;
+   prtd->period_ptr_ended = prtd->dma_buffer_start;
+   prtd->substream = substream;
+
+   return 0;
+}
+
+static void qmc_audio_pcm_write_complete(void *context)
+{
+   struct qmc_dai_prtd *prtd = context;
+   int ret;
+
+   prtd->period_ptr_ended += prtd->period_size;
+   if (prtd->period_ptr_ended >= prtd->dma_buffer_end)
+   prtd->period_ptr_ended = prtd->dma_buffer_start;
+
+   prtd->period_ptr_submitted += prtd->period_size;
+   if (prtd->period_ptr_submitted >= prtd->dma_buffer_end)
+   prtd->period_ptr_submitted = prtd->dma_buffer_start;
+
+   ret = qmc_chan_write_submit(prtd->qmc_dai->qmc_chan,
+   prtd->period_ptr_submitted, 

[PATCH v7 08/10] dt-bindings: sound: Add support for QMC audio

2023-03-06 Thread Herve Codina
The QMC (QUICC Multichannel Controller) is a controller present in some
PowerQUICC SoCs such as the MPC885.
The QMC audio is an ASoC component that uses the QMC controller to
transfer the audio data.

Signed-off-by: Herve Codina 
Reviewed-by: Krzysztof Kozlowski 
Reviewed-by: Christophe Leroy 
---
 .../bindings/sound/fsl,qmc-audio.yaml | 117 ++
 1 file changed, 117 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/sound/fsl,qmc-audio.yaml

diff --git a/Documentation/devicetree/bindings/sound/fsl,qmc-audio.yaml b/Documentation/devicetree/bindings/sound/fsl,qmc-audio.yaml
new file mode 100644
index ..ff5cd9241941
--- /dev/null
+++ b/Documentation/devicetree/bindings/sound/fsl,qmc-audio.yaml
@@ -0,0 +1,117 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/sound/fsl,qmc-audio.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: QMC audio
+
+maintainers:
+  - Herve Codina 
+
+description: |
+  The QMC audio is an ASoC component which uses QMC (QUICC Multichannel
+  Controller) channels to transfer the audio data.
+  It provides as many DAI as the number of QMC channel used.
+
+allOf:
+  - $ref: dai-common.yaml#
+
+properties:
+  compatible:
+const: fsl,qmc-audio
+
+  '#address-cells':
+const: 1
+  '#size-cells':
+const: 0
+  '#sound-dai-cells':
+const: 1
+
+patternProperties:
+  '^dai@([0-9]|[1-5][0-9]|6[0-3])$':
+description:
+  A DAI managed by this controller
+type: object
+
+properties:
+  reg:
+minimum: 0
+maximum: 63
+description:
+  The DAI number
+
+  fsl,qmc-chan:
+$ref: /schemas/types.yaml#/definitions/phandle-array
+items:
+  - items:
+  - description: phandle to QMC node
+  - description: Channel number
+description:
+  Should be a phandle/number pair. The phandle to QMC node and the QMC
+  channel to use for this DAI.
+
+required:
+  - reg
+  - fsl,qmc-chan
+
+required:
+  - compatible
+  - '#address-cells'
+  - '#size-cells'
+  - '#sound-dai-cells'
+
+additionalProperties: false
+
+examples:
+  - |
+audio_controller: audio-controller {
+compatible = "fsl,qmc-audio";
+#address-cells = <1>;
+#size-cells = <0>;
+#sound-dai-cells = <1>;
+dai@16 {
+reg = <16>;
+fsl,qmc-chan = < 16>;
+};
+dai@17 {
+reg = <17>;
+fsl,qmc-chan = < 17>;
+};
+};
+
+sound {
+compatible = "simple-audio-card";
+#address-cells = <1>;
+#size-cells = <0>;
+simple-audio-card,dai-link@0 {
+reg = <0>;
+format = "dsp_b";
+cpu {
+sound-dai = <&audio_controller 16>;
+};
+codec {
+sound-dai = <>;
+dai-tdm-slot-num = <4>;
+dai-tdm-slot-width = <8>;
+/* TS 3, 5, 7, 9 */
+dai-tdm-slot-tx-mask = <0 0 0 1 0 1 0 1 0 1>;
+dai-tdm-slot-rx-mask = <0 0 0 1 0 1 0 1 0 1>;
+};
+};
+simple-audio-card,dai-link@1 {
+reg = <1>;
+format = "dsp_b";
+cpu {
+sound-dai = <&audio_controller 17>;
+};
+codec {
+sound-dai = <>;
+dai-tdm-slot-num = <4>;
+dai-tdm-slot-width = <8>;
+/* TS 2, 4, 6, 8 */
+dai-tdm-slot-tx-mask = <0 0 1 0 1 0 1 0 1>;
+dai-tdm-slot-rx-mask = <0 0 1 0 1 0 1 0 1>;
+};
+};
+};
-- 
2.39.2



[PATCH v7 07/10] MAINTAINERS: add the Freescale QMC controller entry

2023-03-06 Thread Herve Codina
After contributing the driver, add myself as the maintainer for the
Freescale QMC controller.

Signed-off-by: Herve Codina 
Reviewed-by: Christophe Leroy 
---
 MAINTAINERS | 8 
 1 file changed, 8 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index fea9ee7ade8e..5597d402fbd8 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -8247,6 +8247,14 @@ S:   Maintained
 F: drivers/soc/fsl/qe/
 F: include/soc/fsl/qe/
 
+FREESCALE QUICC ENGINE QMC DRIVER
+M: Herve Codina 
+L: linuxppc-dev@lists.ozlabs.org
+S: Maintained
+F: Documentation/devicetree/bindings/soc/fsl/cpm_qe/fsl,cpm1-scc-qmc.yaml
+F: drivers/soc/fsl/qe/qmc.c
+F: include/soc/fsl/qe/qmc.h
+
 FREESCALE QUICC ENGINE TSA DRIVER
 M: Herve Codina 
 L: linuxppc-dev@lists.ozlabs.org
-- 
2.39.2



[PATCH v7 06/10] soc: fsl: cpm1: Add support for QMC

2023-03-06 Thread Herve Codina
The QMC (QUICC Multichannel Controller) emulates up to 64
channels within one serial controller using the same TDM
physical interface routed from the TSA.

It is available in some PowerQUICC SoCs such as the
MPC885 or MPC866.

It is also available on some QUICC Engine SoCs.
The current version supports CPM1 SoCs only; some
enhancements are needed to support QUICC Engine SoCs.

Signed-off-by: Herve Codina 
Acked-by: Li Yang 
---
 drivers/soc/fsl/qe/Kconfig  |   12 +
 drivers/soc/fsl/qe/Makefile |1 +
 drivers/soc/fsl/qe/qmc.c| 1537 +++
 include/soc/fsl/qe/qmc.h|   71 ++
 4 files changed, 1621 insertions(+)
 create mode 100644 drivers/soc/fsl/qe/qmc.c
 create mode 100644 include/soc/fsl/qe/qmc.h

diff --git a/drivers/soc/fsl/qe/Kconfig b/drivers/soc/fsl/qe/Kconfig
index b0088495c323..f90cfdf0c763 100644
--- a/drivers/soc/fsl/qe/Kconfig
+++ b/drivers/soc/fsl/qe/Kconfig
@@ -44,6 +44,18 @@ config CPM_TSA
  This option enables support for this
  controller
 
+config CPM_QMC
+   tristate "CPM QMC support"
+   depends on OF && HAS_IOMEM
+   depends on CPM1 || (SOC_FSL && COMPILE_TEST)
+   depends on CPM_TSA
+   help
+ Freescale CPM QUICC Multichannel Controller
+ (QMC)
+
+ This option enables support for this
+ controller
+
 config QE_TDM
bool
default y if FSL_UCC_HDLC
diff --git a/drivers/soc/fsl/qe/Makefile b/drivers/soc/fsl/qe/Makefile
index 45c961acc81b..ec8506e13113 100644
--- a/drivers/soc/fsl/qe/Makefile
+++ b/drivers/soc/fsl/qe/Makefile
@@ -5,6 +5,7 @@
 obj-$(CONFIG_QUICC_ENGINE)+= qe.o qe_common.o qe_ic.o qe_io.o
 obj-$(CONFIG_CPM)  += qe_common.o
 obj-$(CONFIG_CPM_TSA)  += tsa.o
+obj-$(CONFIG_CPM_QMC)  += qmc.o
 obj-$(CONFIG_UCC)  += ucc.o
 obj-$(CONFIG_UCC_SLOW) += ucc_slow.o
 obj-$(CONFIG_UCC_FAST) += ucc_fast.o
diff --git a/drivers/soc/fsl/qe/qmc.c b/drivers/soc/fsl/qe/qmc.c
new file mode 100644
index ..b3c292c9a14e
--- /dev/null
+++ b/drivers/soc/fsl/qe/qmc.c
@@ -0,0 +1,1537 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * QMC driver
+ *
+ * Copyright 2022 CS GROUP France
+ *
+ * Author: Herve Codina 
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include "tsa.h"
+
+/* SCC general mode register low (32 bits) */
+#define SCC_GSMRL  0x00
+#define SCC_GSMRL_ENR  (1 << 5)
+#define SCC_GSMRL_ENT  (1 << 4)
+#define SCC_GSMRL_MODE_QMC (0x0A << 0)
+
+/* SCC general mode register high (32 bits) */
+#define SCC_GSMRH  0x04
+#define   SCC_GSMRH_CTSS   (1 << 7)
+#define   SCC_GSMRH_CDS(1 << 8)
+#define   SCC_GSMRH_CTSP   (1 << 9)
+#define   SCC_GSMRH_CDP(1 << 10)
+
+/* SCC event register (16 bits) */
+#define SCC_SCCE   0x10
+#define   SCC_SCCE_IQOV(1 << 3)
+#define   SCC_SCCE_GINT(1 << 2)
+#define   SCC_SCCE_GUN (1 << 1)
+#define   SCC_SCCE_GOV (1 << 0)
+
+/* SCC mask register (16 bits) */
+#define SCC_SCCM   0x14
+/* Multichannel base pointer (32 bits) */
+#define QMC_GBL_MCBASE 0x00
+/* Multichannel controller state (16 bits) */
+#define QMC_GBL_QMCSTATE   0x04
+/* Maximum receive buffer length (16 bits) */
+#define QMC_GBL_MRBLR  0x06
+/* Tx time-slot assignment table pointer (16 bits) */
+#define QMC_GBL_TX_S_PTR   0x08
+/* Rx pointer (16 bits) */
+#define QMC_GBL_RXPTR  0x0A
+/* Global receive frame threshold (16 bits) */
+#define QMC_GBL_GRFTHR 0x0C
+/* Global receive frame count (16 bits) */
+#define QMC_GBL_GRFCNT 0x0E
+/* Multichannel interrupt base address (32 bits) */
+#define QMC_GBL_INTBASE0x10
+/* Multichannel interrupt pointer (32 bits) */
+#define QMC_GBL_INTPTR 0x14
+/* Rx time-slot assignment table pointer (16 bits) */
+#define QMC_GBL_RX_S_PTR   0x18
+/* Tx pointer (16 bits) */
+#define QMC_GBL_TXPTR  0x1A
+/* CRC constant (32 bits) */
+#define QMC_GBL_C_MASK32   0x1C
+/* Time slot assignment table Rx (32 x 16 bits) */
+#define QMC_GBL_TSATRX 0x20
+/* Time slot assignment table Tx (32 x 16 bits) */
+#define QMC_GBL_TSATTX 0x60
+/* CRC constant (16 bits) */
+#define QMC_GBL_C_MASK16   0xA0
+
+/* TSA entry (16bit entry in TSATRX and TSATTX) */
+#define QMC_TSA_VALID  (1 << 15)
+#define QMC_TSA_WRAP   (1 << 14)
+#define QMC_TSA_MASK   (0x303F)
+#define QMC_TSA_CHANNEL(x) ((x) << 6)
+
+/* Tx buffer descriptor base address (16 bits, offset from MCBASE) */
+#define QMC_SPE_TBASE  0x00
+
+/* Channel mode register (16 bits) */
+#define QMC_SPE_CHAMR  0x02
+#define   QMC_SPE_CHAMR_MODE_HDLC  (1 << 15)
+#define   QMC_SPE_CHAMR_MODE_TRANSP((0 << 15) | (1 << 13))
+#define   QMC_SPE_CHAMR_ENT(1 << 12)
+#define   QMC_SPE_CHAMR_POL(1 << 8)
+#define   QMC_SPE_CHAMR_HDLC_IDLM  (1 

[PATCH v7 05/10] dt-bindings: soc: fsl: cpm_qe: Add QMC controller

2023-03-06 Thread Herve Codina
Add support for the QMC (QUICC Multichannel Controller) available in
some PowerQUICC SoCs such as the MPC885 or MPC866.

Signed-off-by: Herve Codina 
Reviewed-by: Krzysztof Kozlowski 
Reviewed-by: Christophe Leroy 
---
 .../soc/fsl/cpm_qe/fsl,cpm1-scc-qmc.yaml  | 162 ++
 1 file changed, 162 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/soc/fsl/cpm_qe/fsl,cpm1-scc-qmc.yaml

diff --git a/Documentation/devicetree/bindings/soc/fsl/cpm_qe/fsl,cpm1-scc-qmc.yaml b/Documentation/devicetree/bindings/soc/fsl/cpm_qe/fsl,cpm1-scc-qmc.yaml
new file mode 100644
index ..ec888f48cac8
--- /dev/null
+++ b/Documentation/devicetree/bindings/soc/fsl/cpm_qe/fsl,cpm1-scc-qmc.yaml
@@ -0,0 +1,162 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/soc/fsl/cpm_qe/fsl,cpm1-scc-qmc.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: PowerQUICC CPM QUICC Multichannel Controller (QMC)
+
+maintainers:
+  - Herve Codina 
+
+description:
+  The QMC (QUICC Multichannel Controller) emulates up to 64 channels within one
+  serial controller using the same TDM physical interface routed from TSA.
+
+properties:
+  compatible:
+items:
+  - enum:
+  - fsl,mpc885-scc-qmc
+  - fsl,mpc866-scc-qmc
+  - const: fsl,cpm1-scc-qmc
+
+  reg:
+items:
+  - description: SCC (Serial communication controller) register base
+  - description: SCC parameter ram base
+  - description: Dual port ram base
+
+  reg-names:
+items:
+  - const: scc_regs
+  - const: scc_pram
+  - const: dpram
+
+  interrupts:
+maxItems: 1
+description: SCC interrupt line in the CPM interrupt controller
+
+  fsl,tsa-serial:
+$ref: /schemas/types.yaml#/definitions/phandle-array
+items:
+  - items:
+  - description: phandle to TSA node
+  - enum: [1, 2, 3]
+description: |
+  TSA serial interface (dt-bindings/soc/cpm1-fsl,tsa.h defines these
+  values)
+   - 1: SCC2
+   - 2: SCC3
+   - 3: SCC4
+description:
+  Should be a phandle/number pair. The phandle to TSA node and the TSA
+  serial interface to use.
+
+  '#address-cells':
+const: 1
+
+  '#size-cells':
+const: 0
+
+patternProperties:
+  '^channel@([0-9]|[1-5][0-9]|6[0-3])$':
+description:
+  A channel managed by this controller
+type: object
+
+properties:
+  reg:
+minimum: 0
+maximum: 63
+description:
+  The channel number
+
+  fsl,operational-mode:
+$ref: /schemas/types.yaml#/definitions/string
+enum: [transparent, hdlc]
+default: transparent
+description: |
+  The channel operational mode
+- hdlc: The channel handles HDLC frames
+- transparent: The channel handles raw data without any processing
+
+  fsl,reverse-data:
+$ref: /schemas/types.yaml#/definitions/flag
+description:
+  The bit order as seen on the channels is reversed,
+  transmitting/receiving the MSB of each octet first.
+  This flag is used only in 'transparent' mode.
+
+  fsl,tx-ts-mask:
+$ref: /schemas/types.yaml#/definitions/uint64
+description:
+  Channel assigned Tx time-slots within the Tx time-slots routed by the
+  TSA to this cell.
+
+  fsl,rx-ts-mask:
+$ref: /schemas/types.yaml#/definitions/uint64
+description:
+  Channel assigned Rx time-slots within the Rx time-slots routed by the
+  TSA to this cell.
+
+required:
+  - reg
+  - fsl,tx-ts-mask
+  - fsl,rx-ts-mask
+
+required:
+  - compatible
+  - reg
+  - reg-names
+  - interrupts
+  - fsl,tsa-serial
+  - '#address-cells'
+  - '#size-cells'
+
+additionalProperties: false
+
+examples:
+  - |
+#include 
+
+qmc@a60 {
+compatible = "fsl,mpc885-scc-qmc", "fsl,cpm1-scc-qmc";
+reg = <0xa60 0x20>,
+  <0x3f00 0xc0>,
+  <0x2000 0x1000>;
+reg-names = "scc_regs", "scc_pram", "dpram";
+interrupts = <27>;
+interrupt-parent = <_PIC>;
+
+#address-cells = <1>;
+#size-cells = <0>;
+
+fsl,tsa-serial = < FSL_CPM_TSA_SCC4>;
+
+channel@16 {
+/* Ch16 : First 4 even TS from all routed from TSA */
+reg = <16>;
+fsl,operational-mode = "transparent";
+fsl,reverse-data;
+fsl,tx-ts-mask = <0x 0x00aa>;
+fsl,rx-ts-mask = <0x 0x00aa>;
+};
+
+channel@17 {
+/* Ch17 : First 4 odd TS from all routed from TSA */
+reg = <17>;
+fsl,operational-mode = "transparent";
+fsl,reverse-data;
+fsl,tx-ts-mask = <0x 0x0055>;
+fsl,rx-ts-mask = <0x 0x0055>;
+};
+
+channel@19 {
+/* Ch19 

[PATCH v7 04/10] powerpc/8xx: Use a larger CPM1 command check mask

2023-03-06 Thread Herve Codina
The CPM1 command mask is defined for use with the standard
CPM1 command register as described in the user's manual:
  0  |1        3|4    7|8   11|12  14| 15|
  RST|    -     |OPCODE|CH_NUM|   -  |FLG|

In the QMC extension the CPM1 command register is redefined
(QMC supplement user's manual) with the following mapping:
  0  |1        3|4    7|8           13|14| 15|
  RST|QMC OPCODE|  1110|CHANNEL_NUMBER| -|FLG|

Extend the check command mask in order to support both the
standard CH_NUM field and the QMC extension CHANNEL_NUMBER
field.

Signed-off-by: Herve Codina 
Acked-by: Christophe Leroy 
Acked-by: Michael Ellerman 
---
 arch/powerpc/platforms/8xx/cpm1.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/platforms/8xx/cpm1.c b/arch/powerpc/platforms/8xx/cpm1.c
index 8ef1f4392086..6b828b9f90d9 100644
--- a/arch/powerpc/platforms/8xx/cpm1.c
+++ b/arch/powerpc/platforms/8xx/cpm1.c
@@ -100,7 +100,7 @@ int cpm_command(u32 command, u8 opcode)
int i, ret;
unsigned long flags;
 
-   if (command & 0xff0f)
+   if (command & 0xff03)
return -EINVAL;
 
spin_lock_irqsave(_lock, flags);
-- 
2.39.2
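
The effect of widening the accepted field can be illustrated with a small
standalone sketch. It assumes, as the bit diagrams in the commit message
suggest, a 16-bit command register whose bit 0 is the most significant bit.
The helper names be_field_mask(), check_std() and check_qmc() are hypothetical
and not part of the kernel, the masks are derived from the diagrams rather than
copied from the patch, and -1 stands in for the kernel's -EINVAL.

```c
#include <stdint.h>

/*
 * Hypothetical helper: mask of a field spanning big-endian bit numbers
 * first..last (bit 0 is the MSB) of a 16-bit register.
 */
static uint32_t be_field_mask(unsigned int first, unsigned int last)
{
	unsigned int width = last - first + 1;
	unsigned int shift = 15 - last;

	return ((1u << width) - 1u) << shift;
}

/* Accept only values that fit in the standard CH_NUM field (bits 8..11). */
static int check_std(uint32_t command)
{
	return (command & ~be_field_mask(8, 11)) ? -1 : 0;
}

/* Accept values that fit in the wider QMC CHANNEL_NUMBER field (bits 8..13). */
static int check_qmc(uint32_t command)
{
	return (command & ~be_field_mask(8, 13)) ? -1 : 0;
}
```

With these definitions a value that uses the two extra low-order channel bits
is rejected by check_std() but accepted by check_qmc(), while anything outside
both fields is rejected by either check.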



[PATCH v7 03/10] MAINTAINERS: add the Freescale TSA controller entry

2023-03-06 Thread Herve Codina
After contributing the driver, add myself as the maintainer for the
Freescale TSA controller.

Signed-off-by: Herve Codina 
Reviewed-by: Christophe Leroy 
---
 MAINTAINERS | 9 +
 1 file changed, 9 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 8d5bc223f305..fea9ee7ade8e 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -8247,6 +8247,15 @@ S:   Maintained
 F: drivers/soc/fsl/qe/
 F: include/soc/fsl/qe/
 
+FREESCALE QUICC ENGINE TSA DRIVER
+M: Herve Codina 
+L: linuxppc-dev@lists.ozlabs.org
+S: Maintained
+F: Documentation/devicetree/bindings/soc/fsl/cpm_qe/fsl,cpm1-tsa.yaml
+F: drivers/soc/fsl/qe/tsa.c
+F: drivers/soc/fsl/qe/tsa.h
+F: include/dt-bindings/soc/cpm1-fsl,tsa.h
+
 FREESCALE QUICC ENGINE UCC ETHERNET DRIVER
 M: Li Yang 
 L: net...@vger.kernel.org
-- 
2.39.2



[PATCH v7 02/10] soc: fsl: cpm1: Add support for TSA

2023-03-06 Thread Herve Codina
The purpose of the TSA (Time Slot Assigner) is to route some TDM
time-slots to other internal serial controllers.

It is available in some PowerQUICC SoCs such as the MPC885 or MPC866.

It is also available on some Quicc Engine SoCs.
This current version supports CPM1 SoCs only; some enhancements are
needed to support Quicc Engine SoCs.

Signed-off-by: Herve Codina 
Acked-by: Li Yang 
Reviewed-by: Christophe Leroy 
---
 drivers/soc/fsl/qe/Kconfig  |  11 +
 drivers/soc/fsl/qe/Makefile |   1 +
 drivers/soc/fsl/qe/tsa.c| 846 
 drivers/soc/fsl/qe/tsa.h|  42 ++
 4 files changed, 900 insertions(+)
 create mode 100644 drivers/soc/fsl/qe/tsa.c
 create mode 100644 drivers/soc/fsl/qe/tsa.h

diff --git a/drivers/soc/fsl/qe/Kconfig b/drivers/soc/fsl/qe/Kconfig
index 357c5800b112..b0088495c323 100644
--- a/drivers/soc/fsl/qe/Kconfig
+++ b/drivers/soc/fsl/qe/Kconfig
@@ -33,6 +33,17 @@ config UCC
bool
default y if UCC_FAST || UCC_SLOW
 
+config CPM_TSA
+   tristate "CPM TSA support"
+   depends on OF && HAS_IOMEM
+   depends on CPM1 || COMPILE_TEST
+   help
+ Freescale CPM Time Slot Assigner (TSA)
+ controller.
+
+ This option enables support for this
+ controller
+
 config QE_TDM
bool
default y if FSL_UCC_HDLC
diff --git a/drivers/soc/fsl/qe/Makefile b/drivers/soc/fsl/qe/Makefile
index 55a555304f3a..45c961acc81b 100644
--- a/drivers/soc/fsl/qe/Makefile
+++ b/drivers/soc/fsl/qe/Makefile
@@ -4,6 +4,7 @@
 #
 obj-$(CONFIG_QUICC_ENGINE)+= qe.o qe_common.o qe_ic.o qe_io.o
 obj-$(CONFIG_CPM)  += qe_common.o
+obj-$(CONFIG_CPM_TSA)  += tsa.o
 obj-$(CONFIG_UCC)  += ucc.o
 obj-$(CONFIG_UCC_SLOW) += ucc_slow.o
 obj-$(CONFIG_UCC_FAST) += ucc_fast.o
diff --git a/drivers/soc/fsl/qe/tsa.c b/drivers/soc/fsl/qe/tsa.c
new file mode 100644
index ..3646153117b3
--- /dev/null
+++ b/drivers/soc/fsl/qe/tsa.c
@@ -0,0 +1,846 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * TSA driver
+ *
+ * Copyright 2022 CS GROUP France
+ *
+ * Author: Herve Codina 
+ */
+
+#include "tsa.h"
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+
+/* TSA SI RAM routing tables entry */
+#define TSA_SIRAM_ENTRY_LAST   (1 << 16)
+#define TSA_SIRAM_ENTRY_BYTE   (1 << 17)
+#define TSA_SIRAM_ENTRY_CNT(x) (((x) & 0x0f) << 18)
+#define TSA_SIRAM_ENTRY_CSEL_MASK  (0x7 << 22)
+#define TSA_SIRAM_ENTRY_CSEL_NU(0x0 << 22)
+#define TSA_SIRAM_ENTRY_CSEL_SCC2  (0x2 << 22)
+#define TSA_SIRAM_ENTRY_CSEL_SCC3  (0x3 << 22)
+#define TSA_SIRAM_ENTRY_CSEL_SCC4  (0x4 << 22)
+#define TSA_SIRAM_ENTRY_CSEL_SMC1  (0x5 << 22)
+#define TSA_SIRAM_ENTRY_CSEL_SMC2  (0x6 << 22)
+
+/* SI mode register (32 bits) */
+#define TSA_SIMODE 0x00
+#define   TSA_SIMODE_SMC2  0x8000
+#define   TSA_SIMODE_SMC1  0x8000
+#define   TSA_SIMODE_TDMA(x)   ((x) << 0)
+#define   TSA_SIMODE_TDMB(x)   ((x) << 16)
+#define TSA_SIMODE_TDM_MASK0x0fff
+#define TSA_SIMODE_TDM_SDM_MASK0x0c00
+#define   TSA_SIMODE_TDM_SDM_NORM  0x
+#define   TSA_SIMODE_TDM_SDM_ECHO  0x0400
+#define   TSA_SIMODE_TDM_SDM_INTL_LOOP 0x0800
+#define   TSA_SIMODE_TDM_SDM_LOOP_CTRL 0x0c00
+#define TSA_SIMODE_TDM_RFSD(x) ((x) << 8)
+#define TSA_SIMODE_TDM_DSC 0x0080
+#define TSA_SIMODE_TDM_CRT 0x0040
+#define TSA_SIMODE_TDM_STZ 0x0020
+#define TSA_SIMODE_TDM_CE  0x0010
+#define TSA_SIMODE_TDM_FE  0x0008
+#define TSA_SIMODE_TDM_GM  0x0004
+#define TSA_SIMODE_TDM_TFSD(x) ((x) << 0)
+
+/* SI global mode register (8 bits) */
+#define TSA_SIGMR  0x04
+#define TSA_SIGMR_ENB  (1<<3)
+#define TSA_SIGMR_ENA  (1<<2)
+#define TSA_SIGMR_RDM_MASK 0x03
+#define   TSA_SIGMR_RDM_STATIC_TDMA0x00
+#define   TSA_SIGMR_RDM_DYN_TDMA   0x01
+#define   TSA_SIGMR_RDM_STATIC_TDMAB   0x02
+#define   TSA_SIGMR_RDM_DYN_TDMAB  0x03
+
+/* SI status register (8 bits) */
+#define TSA_SISTR  0x06
+
+/* SI command register (8 bits) */
+#define TSA_SICMR  0x07
+
+/* SI clock route register (32 bits) */
+#define TSA_SICR   0x0C
+#define   TSA_SICR_SCC2(x) ((x) << 8)
+#define   TSA_SICR_SCC3(x) ((x) << 16)
+#define   TSA_SICR_SCC4(x) ((x) << 24)
+#define TSA_SICR_SCC_MASK  0x0ff
+#define TSA_SICR_SCC_GRX   (1 << 7)
+#define TSA_SICR_SCC_SCX_TSA   (1 << 6)
+#define TSA_SICR_SCC_RXCS_MASK (0x7 << 3)
+#define   TSA_SICR_SCC_RXCS_BRG1   (0x0 << 3)
+#define   TSA_SICR_SCC_RXCS_BRG2   (0x1 << 3)
+#define   TSA_SICR_SCC_RXCS_BRG3   (0x2 << 3)
+#define   TSA_SICR_SCC_RXCS_BRG4   

[PATCH v7 01/10] dt-bindings: soc: fsl: cpm_qe: Add TSA controller

2023-03-06 Thread Herve Codina
Add support for the time slot assigner (TSA) available in some
PowerQUICC SoCs such as the MPC885 or MPC866.

Signed-off-by: Herve Codina 
Reviewed-by: Christophe Leroy 
---
 .../bindings/soc/fsl/cpm_qe/fsl,cpm1-tsa.yaml | 205 ++
 include/dt-bindings/soc/cpm1-fsl,tsa.h|  13 ++
 2 files changed, 218 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/soc/fsl/cpm_qe/fsl,cpm1-tsa.yaml
 create mode 100644 include/dt-bindings/soc/cpm1-fsl,tsa.h

diff --git a/Documentation/devicetree/bindings/soc/fsl/cpm_qe/fsl,cpm1-tsa.yaml b/Documentation/devicetree/bindings/soc/fsl/cpm_qe/fsl,cpm1-tsa.yaml
new file mode 100644
index ..7e51c639a79a
--- /dev/null
+++ b/Documentation/devicetree/bindings/soc/fsl/cpm_qe/fsl,cpm1-tsa.yaml
@@ -0,0 +1,205 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/soc/fsl/cpm_qe/fsl,cpm1-tsa.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: PowerQUICC CPM Time-slot assigner (TSA) controller
+
+maintainers:
+  - Herve Codina 
+
+description:
+  The TSA is the time-slot assigner that can be found on some PowerQUICC SoC.
+  Its purpose is to route some TDM time-slots to other internal serial
+  controllers.
+
+properties:
+  compatible:
+items:
+  - enum:
+  - fsl,mpc885-tsa
+  - fsl,mpc866-tsa
+  - const: fsl,cpm1-tsa
+
+  reg:
+items:
+  - description: SI (Serial Interface) register base
+  - description: SI RAM base
+
+  reg-names:
+items:
+  - const: si_regs
+  - const: si_ram
+
+  '#address-cells':
+const: 1
+
+  '#size-cells':
+const: 0
+
+patternProperties:
+  '^tdm@[0-1]$':
+description:
+  The TDM managed by this controller
+type: object
+
+additionalProperties: false
+
+properties:
+  reg:
+minimum: 0
+maximum: 1
+description:
+  The TDM number for this TDM, 0 for TDMa and 1 for TDMb
+
+  fsl,common-rxtx-pins:
+$ref: /schemas/types.yaml#/definitions/flag
+description:
+  The hardware can use four dedicated pins for Tx clock, Tx sync, Rx
+  clock and Rx sync or use only two pins, Tx/Rx clock and Tx/Rx sync.
+  Without the 'fsl,common-rxtx-pins' property, the four pins are used.
+  With the 'fsl,common-rxtx-pins' property, two pins are used.
+
+  clocks:
+minItems: 2
+items:
+  - description: External clock connected to L1RSYNC pin
+  - description: External clock connected to L1RCLK pin
+  - description: External clock connected to L1TSYNC pin
+  - description: External clock connected to L1TCLK pin
+
+  clock-names:
+minItems: 2
+items:
+  - const: l1rsync
+  - const: l1rclk
+  - const: l1tsync
+  - const: l1tclk
+
+  fsl,rx-frame-sync-delay-bits:
+enum: [0, 1, 2, 3]
+default: 0
+description: |
+  Receive frame sync delay in number of bits.
+  Indicates the delay between the Rx sync and the first bit of the Rx
+  frame. 0 for no bit delay. 1, 2 or 3 for 1, 2 or 3 bits delay.
+
+  fsl,tx-frame-sync-delay-bits:
+enum: [0, 1, 2, 3]
+default: 0
+description: |
+  Transmit frame sync delay in number of bits.
+  Indicates the delay between the Tx sync and the first bit of the Tx
+  frame. 0 for no bit delay. 1, 2 or 3 for 1, 2 or 3 bits delay.
+
+  fsl,clock-falling-edge:
+$ref: /schemas/types.yaml#/definitions/flag
+description:
+  Data is sent on falling edge of the clock (and received on the rising
+  edge). If 'clock-falling-edge' is not present, data is sent on the
+  rising edge (and received on the falling edge).
+
+  fsl,fsync-rising-edge:
+$ref: /schemas/types.yaml#/definitions/flag
+description:
+  Frame sync pulses are sampled with the rising edge of the channel
+  clock. If 'fsync-rising-edge' is not present, pulses are sampled with
+  the falling edge.
+
+  fsl,double-speed-clock:
+$ref: /schemas/types.yaml#/definitions/flag
+description:
+  The channel clock is twice the data rate.
+
+patternProperties:
+  '^fsl,[rt]x-ts-routes$':
+$ref: /schemas/types.yaml#/definitions/uint32-matrix
+description: |
+  A list of tuples that indicate the Tx or Rx time-slot routes.
+items:
+  items:
+- description:
+The number of time-slots
+  minimum: 1
+  maximum: 64
+- description: |
+The source (Tx) or destination (Rx) serial interface
+(dt-bindings/soc/cpm1-fsl,tsa.h defines these values)
+ - 0: No destination
+ - 1: SCC2
+ - 2: SCC3
+ - 3: SCC4
+ - 4: 

[PATCH v7 00/10] Add the PowerQUICC audio support using the QMC

2023-03-06 Thread Herve Codina
Hi,

This series adds support for audio using the QMC controller available in
some Freescale PowerQUICC SoCs.

This series contains three parts in order to show the hierarchy of the
different blocks and their usage in this support.

The first one is related to TSA (Time Slot Assigner).
The TSA handles the data present at the pin level (TDM with up to 64
time slots) and dispatches them to one or more serial controllers (SCC).

The second is related to QMC (QUICC Multichannel Controller).
The QMC handles the data at the serial controller (SCC) level and splits
the data again to create virtual channels.

The last one is related to the audio component (QMC audio).
It is the glue between the QMC controller and the ASoC component. It
handles one or more QMC virtual channels and creates one DAI per QMC
virtual channel handled.

Compared to the previous iteration
  https://lore.kernel.org/linux-kernel/20230217145645.1768659-1-herve.cod...@bootlin.com/
this v7 series:
  - removes the '#fsl,serial-cells' (TSA) and '#fsl,chan-cells' (QMC)
    properties
  - fixes the QMC timeslots mask generation in case of 64 timeslots

Best regards,
Herve Codina

Changes v6 -> v7
  - Patch 1
Remove #fsl,serial-cells
Add 'Reviewed-by: Christophe Leroy '

  - Patch 2, 3, 7, 8 and 10
Add 'Reviewed-by: Christophe Leroy '

  - Patch 5
Remove #fsl,chan-cells
Add 'Reviewed-by: Christophe Leroy '

  - Patch 6
Fix the timeslot assigned mask in case of 64 timeslots

  - Patch 9
Add 'Reviewed-by: Christophe Leroy '
Add 'Tested-by: Christophe Leroy '

Changes v5 -> v6
  - Patch 1
Fix blank lines and spaces
Remove fsl,diagnostic-mode
Add some maxItems values
Renamed fsl,tsa.h to cpm1-fsl,tsa.h

  - Patch 2
Remove fsl,diagnostic-mode
Renamed fsl,tsa.h to cpm1-fsl,tsa.h
Add 'Acked-by: Li Yang '

  - Patch 3
Renamed fsl,tsa.h to cpm1-fsl,tsa.h

  - Patch 5
Renamed fsl,tsa.h to cpm1-fsl,tsa.h
Add 'Reviewed-by: Krzysztof Kozlowski '

Changes v4 -> v5
  - patch 1
Rename fsl,tsa.yaml to fsl,cpm1-tsa.yaml
Rename #serial-cells to #fsl,serial-cells and add a description
Fix typos
Remove examples present in description
Use a pattern property for fsl,[rt]x-ts-routes

  - patch 2
Remove one left out_8() ppc specific function call
Remove the no more needed PPC dependency in case of COMPILE_TEST

  - patch 4
Add 'Acked-by: Michael Ellerman '

  - patch 5
Rename fsl,qmc.yaml to fsl,cpm1-scc-qmc.yaml
Rename #chan-cells to #fsl,chan-cells and add a description

  - patch 6
Add the SOC_FSL dependency in case of COMPILE_TEST (issue raised by
the kernel test robot).
Fix a typo in commit log
Add 'Acked-by: Li Yang '

Changes v3 -> v4
  - patches 2, 6 and 9
Update code comment format.

  - patch 1
Fix some description formats.
Add 'additionalProperties: false' in subnode.
Move fsl,mode to fsl,diagnostic-mode.
Change clocks and clock-names properties.
Add '#serial-cells' property related to the newly introduced
fsl,tsa-serial phandle.

  - patch 2
Move fsl,mode to fsl,diagnostic-mode.
Replace the fsl,tsa phandle and the fsl,tsa-cell-id property by a
fsl,tsa-serial phandle and update the related API.
Add missing locks.

  - patch 5
Fix some description format.
Replace the fsl,tsa phandle and the fsl,tsa-cell-id property by a
fsl,tsa-serial phandle.
Rename fsl,mode to fsl,operational-mode and update its description.

  - patch 6
Replace the fsl,tsa phandle and the fsl,tsa-cell-id property by a
fsl,tsa-serial phandle and use the TSA updated API.
Rename fsl,mode to fsl,operational-mode.

  - patch 8
Add 'Reviewed-by: Krzysztof Kozlowski '

Changes v2 -> v3
  - All bindings
Rename fsl-tsa.h to fsl,tsa.h
Add missing vendor prefix
Various fixes (quotes, node names, upper/lower case)

  - patches 1 and 2 (TSA binding specific)
Remove 'reserved' values in the routing tables
Remove fsl,grant-mode
Add a better description for 'fsl,common-rxtx-pins'
Fix clocks/clocks-name handling against fsl,common-rxtx-pins
Add information related to the delays unit
Removed FSL_CPM_TSA_NBCELL
Fix license in binding header file fsl,tsa.h

  - patches 5 and 6 (QMC binding specific)
Remove fsl,cpm-command property
Add interrupt property constraint

  - patches 8 and 9 (QMC audio binding specific)
Remove 'items' in compatible property definition
Add missing 'dai-common.yaml' reference
Fix the qmc_chan phandle definition

  - patch 2 and 6
Use io{read,write}be{32,16}
Change commit subjects and logs

  - patch 4
Add 'Acked-by: Christophe Leroy '

Changes v1 -> v2:
  - patch 2 and 6
Fix kernel test robot errors

  - other patches
No changes
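
The 64-timeslot mask fix listed for patch 6 in the v6 -> v7 changes touches a
classic pitfall, sketched below. This is only an illustration under stated
assumptions: the changelog does not show the actual code, so whether the
driver's bug was exactly this shift overflow is a guess, and the helper names
ts_mask() and chan_mask_is_valid() are hypothetical. The sketch builds a mask
covering the nb_ts lowest time slots of a 64-bit value and special-cases 64,
because shifting a 64-bit value by 64 is undefined behaviour in C.

```c
#include <stdint.h>

/* Hypothetical helper: mask with the nb_ts lowest bits set, nb_ts in 0..64. */
static uint64_t ts_mask(unsigned int nb_ts)
{
	if (nb_ts >= 64)
		return UINT64_MAX;	/* 1ULL << 64 would be undefined behaviour */
	return (UINT64_C(1) << nb_ts) - 1;
}

/* Hypothetical check: a channel mask may only use time slots that are routed. */
static int chan_mask_is_valid(uint64_t chan_mask, unsigned int nb_ts)
{
	return (chan_mask & ~ts_mask(nb_ts)) == 0;
}
```

A channel mask such as 0xaa is valid when 8 time slots are routed, and a fully
set 64-bit mask is valid only when all 64 time slots are routed.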

Herve Codina (10):
  dt-bindings: soc: fsl: cpm_qe: Add TSA controller
  soc: fsl: cpm1: Add support for TSA
  MAINTAINERS: add the Freescale TSA controller entry
  powerpc/8xx: Use a larger 

[PATCH] mm: add PTE pointer parameter to flush_tlb_fix_spurious_fault()

2023-03-06 Thread Gerald Schaefer
s390 can do more fine-grained handling of spurious TLB protection faults
when the PTE pointer is also available.

Therefore, pass on the PTE pointer to flush_tlb_fix_spurious_fault() as
an additional parameter.

This will add no functional change to other architectures, but those with
private flush_tlb_fix_spurious_fault() implementations need to be made
aware of the new parameter.
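
The override mechanism this relies on can be sketched in standalone C. The
types, the protection bit value and the flush counter below are stand-ins
invented for illustration, not the kernel's definitions; only the pattern is
the point: an architecture provides its own function and defines a macro of
the same name, and the generic fallback (shown here as a no-op, like the arm64
and x86 variants in this patch) is used only when no such macro exists.

```c
#include <stddef.h>

/* Stand-ins for kernel types; purely illustrative. */
typedef struct { unsigned long val; } pte_t;
struct vm_area_struct { int unused; };

#define PAGE_PROTECT_BIT 0x200UL	/* hypothetical protection bit */

static int local_flushes;		/* counts simulated local TLB flushes */

/* Architecture override: with the PTE at hand, skip protected entries. */
static inline void flush_tlb_fix_spurious_fault(struct vm_area_struct *vma,
						unsigned long address,
						pte_t *ptep)
{
	(void)vma;
	(void)address;
	if (!(ptep->val & PAGE_PROTECT_BIT))
		local_flushes++;
}
#define flush_tlb_fix_spurious_fault flush_tlb_fix_spurious_fault

/* Generic fallback, compiled only when the architecture did not override it. */
#ifndef flush_tlb_fix_spurious_fault
#define flush_tlb_fix_spurious_fault(vma, address, ptep) do { } while (0)
#endif
```

Because every caller now passes three arguments, both the function-style
overrides and the macro-style no-ops have to accept the extra ptep parameter,
even when they ignore it.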

Reviewed-by: Alexander Gordeev 
Signed-off-by: Gerald Schaefer 
---
 arch/arm64/include/asm/pgtable.h  |  2 +-
 arch/mips/include/asm/pgtable.h   |  3 ++-
 arch/powerpc/include/asm/book3s/64/tlbflush.h |  3 ++-
 arch/s390/include/asm/pgtable.h   | 12 +++-
 arch/x86/include/asm/pgtable.h|  2 +-
 include/linux/pgtable.h   |  2 +-
 mm/memory.c   |  3 ++-
 mm/pgtable-generic.c  |  2 +-
 8 files changed, 17 insertions(+), 12 deletions(-)

diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index b6ba466e2e8a..0bd18de9fd97 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -57,7 +57,7 @@ static inline bool arch_thp_swp_supported(void)
  * fault on one CPU which has been handled concurrently by another CPU
  * does not need to perform additional invalidation.
  */
-#define flush_tlb_fix_spurious_fault(vma, address) do { } while (0)
+#define flush_tlb_fix_spurious_fault(vma, address, ptep) do { } while (0)
 
 /*
  * ZERO_PAGE is a global shared page that is always zero: used
diff --git a/arch/mips/include/asm/pgtable.h b/arch/mips/include/asm/pgtable.h
index 791389bf3c12..574fa14ac8b2 100644
--- a/arch/mips/include/asm/pgtable.h
+++ b/arch/mips/include/asm/pgtable.h
@@ -469,7 +469,8 @@ static inline pgprot_t pgprot_writecombine(pgprot_t _prot)
 }
 
 static inline void flush_tlb_fix_spurious_fault(struct vm_area_struct *vma,
-   unsigned long address)
+   unsigned long address,
+   pte_t *ptep)
 {
 }
 
diff --git a/arch/powerpc/include/asm/book3s/64/tlbflush.h b/arch/powerpc/include/asm/book3s/64/tlbflush.h
index 2bbc0fcce04a..ff7f0ee179e5 100644
--- a/arch/powerpc/include/asm/book3s/64/tlbflush.h
+++ b/arch/powerpc/include/asm/book3s/64/tlbflush.h
@@ -121,7 +121,8 @@ static inline void flush_tlb_page(struct vm_area_struct *vma,
 
 #define flush_tlb_fix_spurious_fault flush_tlb_fix_spurious_fault
 static inline void flush_tlb_fix_spurious_fault(struct vm_area_struct *vma,
-   unsigned long address)
+   unsigned long address,
+   pte_t *ptep)
 {
/*
 * Book3S 64 does not require spurious fault flushes because the PTE
diff --git a/arch/s390/include/asm/pgtable.h b/arch/s390/include/asm/pgtable.h
index 2c70b4d1263d..c1f6b46ec555 100644
--- a/arch/s390/include/asm/pgtable.h
+++ b/arch/s390/include/asm/pgtable.h
@@ -1239,7 +1239,8 @@ static inline int pte_allow_rdp(pte_t old, pte_t new)
 }
 
 static inline void flush_tlb_fix_spurious_fault(struct vm_area_struct *vma,
-   unsigned long address)
+   unsigned long address,
+   pte_t *ptep)
 {
/*
 * RDP might not have propagated the PTE protection reset to all CPUs,
@@ -1247,11 +1248,12 @@ static inline void flush_tlb_fix_spurious_fault(struct vm_area_struct *vma,
 * NOTE: This will also be called when a racing pagetable update on
 * another thread already installed the correct PTE. Both cases cannot
 * really be distinguished.
-* Therefore, only do the local TLB flush when RDP can be used, to avoid
-* unnecessary overhead.
+* Therefore, only do the local TLB flush when RDP can be used, and the
+* PTE does not have _PAGE_PROTECT set, to avoid unnecessary overhead.
+* A local RDP can be used to do the flush.
 */
-   if (MACHINE_HAS_RDP)
-   asm volatile("ptlb" : : : "memory");
+   if (MACHINE_HAS_RDP && !(pte_val(*ptep) & _PAGE_PROTECT))
+   __ptep_rdp(address, ptep, 0, 0, 1);
 }
 #define flush_tlb_fix_spurious_fault flush_tlb_fix_spurious_fault
 
diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
index 7425f32e5293..15ae4d6ba476 100644
--- a/arch/x86/include/asm/pgtable.h
+++ b/arch/x86/include/asm/pgtable.h
@@ -1097,7 +1097,7 @@ static inline void ptep_set_wrprotect(struct mm_struct *mm,
clear_bit(_PAGE_BIT_RW, (unsigned long *)>pte);
 }
 
-#define flush_tlb_fix_spurious_fault(vma, address) do { } while (0)
+#define flush_tlb_fix_spurious_fault(vma, address, ptep) do { } while (0)
 
 #define mk_pmd(page, pgprot)   

[PATCH] powerpc/mm: fix mmap_lock bad unlock

2023-03-06 Thread Laurent Dufour
When a page fault is attempted while holding the per-VMA lock,
bad_access_pkey() and bad_access() should not be called because they
assume the mmap_lock is held.
If a bad access is detected, fall back to the default path, grabbing the
mmap_lock to handle the fault and report the error.

Fixes: 169db3bb4609 ("powerc/mm: try VMA lock-based page fault handling first")
Reported-by: Sachin Sant 
Link: https://lore.kernel.org/linux-mm/842502fb-f99c-417c-9648-a37d0ecdc...@linux.ibm.com
Cc: Suren Baghdasaryan 
Signed-off-by: Laurent Dufour 
---
 arch/powerpc/mm/fault.c | 8 ++--
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
index c7ae86b04b8a..e191b3ebd8d6 100644
--- a/arch/powerpc/mm/fault.c
+++ b/arch/powerpc/mm/fault.c
@@ -479,17 +479,13 @@ static int ___do_page_fault(struct pt_regs *regs, unsigned long address,
 
if (unlikely(access_pkey_error(is_write, is_exec,
   (error_code & DSISR_KEYFAULT), vma))) {
-   int rc = bad_access_pkey(regs, address, vma);
-
vma_end_read(vma);
-   return rc;
+   goto lock_mmap;
}
 
if (unlikely(access_error(is_write, is_exec, vma))) {
-   int rc = bad_access(regs, address);
-
vma_end_read(vma);
-   return rc;
+   goto lock_mmap;
}
 
fault = handle_mm_fault(vma, address, flags | FAULT_FLAG_VMA_LOCK, regs);
-- 
2.39.2
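
The control flow of the fix can be modelled in a few lines of standalone C.
Everything here is a hypothetical stand-in (struct fault_ctx, do_fault(),
report_bad_access() and the bad_unlocks counter are not kernel code), and the
sketch assumes that the error-reporting helper releases the mmap_lock itself,
which is what makes calling it from the per-VMA-lock path a "bad unlock".

```c
#include <stdbool.h>

/* Hypothetical model of the lock state during one fault. */
struct fault_ctx {
	bool access_ok;		/* false simulates an access error */
	bool vma_locked;
	bool mmap_locked;
	int bad_unlocks;	/* unlocks of an mmap_lock that was not held */
};

enum fault_result { FAULT_HANDLED, FAULT_BAD_ACCESS };

/* Stand-in for bad_access(): expects the mmap_lock held and drops it. */
static enum fault_result report_bad_access(struct fault_ctx *ctx)
{
	if (!ctx->mmap_locked)
		ctx->bad_unlocks++;
	ctx->mmap_locked = false;
	return FAULT_BAD_ACCESS;
}

static enum fault_result do_fault(struct fault_ctx *ctx)
{
	/* Fast path: only the per-VMA lock is taken. */
	ctx->vma_locked = true;
	if (!ctx->access_ok) {
		/* Do not report here, the mmap_lock is not held. */
		ctx->vma_locked = false;
		goto lock_mmap;
	}
	ctx->vma_locked = false;
	return FAULT_HANDLED;

lock_mmap:
	/* Slow path: take the mmap_lock first, then report the error. */
	ctx->mmap_locked = true;
	return report_bad_access(ctx);
}
```

In this model the error is still reported, but only after the slow path has
taken the lock that the reporting helper expects to release.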


