Re: [PATCH 4/4] ASoC: fsl: drop unneeded snd_soc_dai_set_drvdata

2021-02-17 Thread Nicolin Chen
On Sat, Feb 13, 2021 at 11:19:07AM +0100, Julia Lawall wrote:
> snd_soc_dai_set_drvdata is not needed when the set data comes from
> snd_soc_dai_get_drvdata or dev_get_drvdata.  The problem was fixed
> using the following semantic patch: (http://coccinelle.lip6.fr/)
> 
> // <smpl>
> @@
> expression x,y,e;
> @@
>   x = dev_get_drvdata(y->dev)
>   ... when != x = e
> - snd_soc_dai_set_drvdata(y,x);
> 
> @@
> expression x,y,e;
> @@
>   x = snd_soc_dai_get_drvdata(y)
>   ... when != x = e
> - snd_soc_dai_set_drvdata(y,x);
> // </smpl>
> 
> Signed-off-by: Julia Lawall 

Acked-by: Nicolin Chen 
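
For illustration, the kind of redundant pattern the semantic patch removes looks
like this (hypothetical driver code, not taken from the series):

#include <sound/soc.h>

/*
 * Hypothetical example only: snd_soc_dai_set_drvdata() stores into dai->dev,
 * so writing back the value just read from dai->dev is a no-op.
 */
static int foo_dai_probe(struct snd_soc_dai *dai)
{
	void *priv = dev_get_drvdata(dai->dev);

	snd_soc_dai_set_drvdata(dai, priv);	/* line removed by the patch */

	return 0;
}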


Re: [PATCH] arm64: defconfig: enable modern virtio pci device

2021-02-17 Thread Jason Wang



On 2021/2/11 7:52 PM, Arnd Bergmann wrote:

On Wed, Feb 10, 2021 at 8:05 PM Anders Roxell  wrote:

Since patch ("virtio-pci: introduce modern device module") got added it
is not possible to boot a defconfig kernel in qemu with a virtio pci
device.  Add CONFIG_VIRTIO_PCI_MODERN=y fragment makes the kernel able
to boot.

Signed-off-by: Anders Roxell 
---
  arch/arm/configs/multi_v7_defconfig | 1 +
  arch/arm64/configs/defconfig| 1 +

Acked-by: Arnd Bergmann 

Michael, can you pick this up in the vhost tree that introduces the regression?

  Arnd



Hi:

Based on the previous discussion, the plan is to select 
VIRTIO_PCI_MODERN, and document that any module selecting it must depend on 
PCI.


I will post a patch soon.

Thanks



linux-next: manual merge of the devicetree tree with the powerpc tree

2021-02-17 Thread Stephen Rothwell
Hi all,

Today's linux-next merge of the devicetree tree got a conflict in:

  arch/powerpc/kexec/elf_64.c

between commit:

  2377c92e37fe ("powerpc/kexec_file: fix FDT size estimation for kdump kernel")

from the powerpc tree and commit:

  130b2d59cec0 ("powerpc: Use common of_kexec_alloc_and_setup_fdt()")

from the devicetree tree.

I can't easily see how to resolve these, so for now I have just used
the latter's changes to this file.

I fixed it up and can carry the fix as necessary. This is now fixed as
far as linux-next is concerned, but any non trivial conflicts should be
mentioned to your upstream maintainer when your tree is submitted for
merging.  You may also want to consider cooperating with the maintainer
of the conflicting tree to minimise any particularly complex conflicts.

-- 
Cheers,
Stephen Rothwell




Re: [PATCH v18 00/10] Carry forward IMA measurement log on kexec on ARM64

2021-02-17 Thread Rob Herring
On Sat, Feb 13, 2021 at 08:10:38AM -0800, Lakshmi Ramasubramanian wrote:
> On kexec file load Integrity Measurement Architecture (IMA) subsystem
> may verify the IMA signature of the kernel and initramfs, and measure
> it.  The command line parameters passed to the kernel in the kexec call
> may also be measured by IMA.  A remote attestation service can verify
> a TPM quote based on the TPM event log, the IMA measurement list, and
> the TPM PCR data.  This can be achieved only if the IMA measurement log
> is carried over from the current kernel to the next kernel across
> the kexec call.
> 
> powerpc already supports carrying forward the IMA measurement log on
> kexec.  This patch set adds support for carrying forward the IMA
> measurement log on kexec on ARM64.
> 
> This patch set moves the platform independent code defined for powerpc
> such that it can be reused for other platforms as well.  A chosen node
> "linux,ima-kexec-buffer" is added to the DTB for ARM64 to hold
> the address and the size of the memory reserved to carry
> the IMA measurement log.
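
For reference, a minimal sketch (not part of this series) of how the next kernel
can read that property back, assuming two address cells and two size cells; the
real helpers live in drivers/of/kexec.c:

#include <linux/of.h>

static int sketch_get_ima_kexec_buffer(phys_addr_t *addr, size_t *size)
{
	const __be32 *prop;
	int len;

	prop = of_get_property(of_chosen, "linux,ima-kexec-buffer", &len);
	if (!prop || len < 4 * sizeof(__be32))
		return -ENOENT;

	*addr = of_read_number(prop, 2);	/* assumed #address-cells = 2 */
	*size = of_read_number(prop + 2, 2);	/* assumed #size-cells = 2 */

	return 0;
}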
> 
> This patch set has been tested for ARM64 platform using QEMU.
> I would like help from the community for testing this change on powerpc.
> Thanks.
> 
> This patch set is based on
> commit f31e3386a4e9 ("ima: Free IMA measurement buffer after kexec syscall")
> in https://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity.git
> "ima-kexec-fixes" branch.
> 
> Changelog:
> 
> v18
>   - Added a parameter to of_kexec_alloc_and_setup_fdt() for the caller
> to specify additional space needed for the FDT buffer
>   - Renamed arm64 and powerpc ELF buffer address field in
> "struct kimage_arch" to elf_load_addr to match x86_64 name.
>   - Removed of_ima_add_kexec_buffer() and instead directly set
> ima_buffer_addr and ima_buffer_size in ima_add_kexec_buffer()
>   - Moved FDT_EXTRA_SPACE definition from include/linux/of.h to
> drivers/of/kexec.c
> 
> v17
>   - Renamed of_kexec_setup_new_fdt() to of_kexec_alloc_and_setup_fdt(),
> and moved memory allocation for the new FDT to this function.
> 
> v16
>   - Defined functions to allocate and free buffer for FDT for powerpc
> and arm64.
>   - Moved ima_buffer_addr and ima_buffer_size fields from
> "struct kimage_arch" in powerpc to "struct kimage"
> v15
>   - Included Rob's patches in the patch set, and rebased
> the changes to "next-integrity" branch.
>   - Allocate memory for DTB, on arm64, using kmalloc() instead of
> vmalloc() to keep it consistent with powerpc implementation.
>   - Call of_kexec_setup_new_fdt() from setup_new_fdt_ppc64() and
> remove setup_new_fdt() in the same patch to keep it bisect safe.
> 
> v14
>   - Select CONFIG_HAVE_IMA_KEXEC for CONFIG_KEXEC_FILE, for powerpc
> and arm64, if CONFIG_IMA is enabled.
>   - Use IS_ENABLED() macro instead of "#ifdef" in remove_ima_buffer(),
> ima_get_kexec_buffer(), and ima_free_kexec_buffer().
>   - Call of_kexec_setup_new_fdt() from setup_new_fdt_ppc64() and
> remove setup_new_fdt() in "arch/powerpc/kexec/file_load.c".
> 
> v13
>   - Moved the arch independent functions to drivers/of/kexec.c
> and then refactored the code.
>   - Moved arch_ima_add_kexec_buffer() to
> security/integrity/ima/ima_kexec.c
> 
> v12
>   - Use fdt_appendprop_addrrange() in setup_ima_buffer()
> to setup the IMA measurement list property in
> the device tree.
>   - Moved architecture independent functions from
> "arch/powerpc/kexec/ima.c" to "drivers/of/kexec."
>   - Deleted "arch/powerpc/kexec/ima.c" and
> "arch/powerpc/include/asm/ima.h".
> 
> v11
>   - Rebased the changes on the kexec code refactoring done by
> Rob Herring in his "dt/kexec" branch
>   - Removed "extern" keyword in function declarations
>   - Removed unnecessary header files included in C files
>   - Updated patch descriptions per Thiago's comments
> 
> v10
>   - Moved delete_fdt_mem_rsv(), remove_ima_buffer(),
> get_ima_kexec_buffer, and get_root_addr_size_cells()
> to drivers/of/kexec.c
>   - Moved arch_ima_add_kexec_buffer() to
> security/integrity/ima/ima_kexec.c
>   - Conditionally define IMA buffer fields in struct kimage_arch
> 
> v9
>   - Moved delete_fdt_mem_rsv() to drivers/of/kexec_fdt.c
>   - Defined a new function get_ima_kexec_buffer() in
> drivers/of/ima_kexec.c to replace do_get_kexec_buffer()
>   - Changed remove_ima_kexec_buffer() to the original function name
> remove_ima_buffer()
>   - Moved remove_ima_buffer() to drivers/of/ima_kexec.c
>   - Moved ima_get_kexec_buffer() and ima_free_kexec_buffer()
> to security/integrity/ima/ima_kexec.c
> 
> v8:
>   - Moved remove_ima_kexec_buffer(), do_get_kexec_buffer(), and
> delete_fdt_mem_rsv() to drivers/of/fdt.c
>   - Moved ima_dump_measurement_list() and ima_add_kexec_buffer()
> back to security/integrity/ima/ima_kexec.c
> 
> v7:
>   - Renamed remove_ima_buffer() to remove_ima_kexec_buffer() and moved
> this function 

Re: [PATCH 1/4] add generic builtin command line

2021-02-17 Thread Andrew Morton
On Mon, 15 Feb 2021 11:32:01 -0800 Daniel Gimpelevich 
 wrote:

> On Thu, 2019-03-21 at 15:15 -0700, Andrew Morton wrote:
> > On Thu, 21 Mar 2019 08:13:08 -0700 Daniel Walker  wrote:
> > > On Wed, Mar 20, 2019 at 08:14:33PM -0700, Andrew Morton wrote:
> > > > The patches (or some version of them) are already in linux-next,
> > > > which messes me up.  I'll disable them for now.
> > >  
> > > Those are from my tree, but I removed them when you picked up the series.
> > > The next linux-next should not have them.
> > 
> > Yup, thanks, all looks good now.
> 
> This patchset is currently neither in mainline nor in -next. May I ask
> what happened to it? Thanks.

Seems that I didn't bring them back after the conflict with the powerpc
tree resolved itself.

Please resend everything for -rc1 and let's await the reviewer
feedback,



Re: [PATCH v5 10/10] powerpc/signal64: Use __get_user() to copy sigset_t

2021-02-17 Thread Christopher M. Riedl
On Fri Feb 12, 2021 at 3:21 PM CST, Daniel Axtens wrote:
> "Christopher M. Riedl"  writes:
>
> > Usually sigset_t is exactly 8B which is a "trivial" size and does not
> > warrant using __copy_from_user(). Use __get_user() directly in
> > anticipation of future work to remove the trivial size optimizations
> > from __copy_from_user(). Calling __get_user() also results in a small
> > boost to signal handling throughput here.
>
> Modulo the comments from Christophe, this looks good to me. It seems to
> do what it says on the tin.
>
> Reviewed-by: Daniel Axtens 
>
> Do you know if this patch is responsible for the slight increase in
> radix performance you observed in your cover letter? The rest of the
> series shouldn't really make things faster than the no-KUAP case...

No, this patch just results in a really small improvement overall.

I looked over this again and I think the reason for the speedup is that
my implementation of unsafe_copy_from_user() in this series calls
__copy_tofrom_user() directly avoiding a barrier_nospec(). This speeds
up all the unsafe_get_from_user() calls in this version.

Skipping the barrier_nospec() like that is incorrect, but Christophe
recently sent a patch which moves barrier_nospec() into the uaccess
allowance helpers. It looks like mpe has already accepted that patch:

https://lists.ozlabs.org/pipermail/linuxppc-dev/2021-February/223959.html

That means that my implementation of unsafe_copy_from_user() is _now_
correct _and_ faster. We do not need to call barrier_nospec() since the
precondition for the 'unsafe' routine is that we called one of the
helpers to open up a uaccess window earlier.
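
For illustration, a minimal sketch of that calling pattern (the function and
label names here are made up, not from the series): the 'unsafe' accessors are
only valid between the user_read_access_begin()/user_read_access_end() pair,
which is where the barrier_nospec() lives once Christophe's patch is applied.

#include <linux/uaccess.h>
#include <asm/ucontext.h>

static long sketch_read_sigmask(struct ucontext __user *uc, sigset_t *set)
{
	if (!user_read_access_begin(uc, sizeof(*uc)))
		return -EFAULT;

	unsafe_get_user(set->sig[0], &uc->uc_sigmask.sig[0], fail);

	user_read_access_end();
	return 0;

fail:
	user_read_access_end();
	return -EFAULT;
}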

>
> Kind regards,
> Daniel
>
> >
> > Signed-off-by: Christopher M. Riedl 
> > ---
> >  arch/powerpc/kernel/signal_64.c | 14 --
> >  1 file changed, 12 insertions(+), 2 deletions(-)
> >
> > diff --git a/arch/powerpc/kernel/signal_64.c 
> > b/arch/powerpc/kernel/signal_64.c
> > index 817b64e1e409..42fdc4a7ff72 100644
> > --- a/arch/powerpc/kernel/signal_64.c
> > +++ b/arch/powerpc/kernel/signal_64.c
> > @@ -97,6 +97,14 @@ static void prepare_setup_sigcontext(struct task_struct 
> > *tsk, int ctx_has_vsx_re
> >  #endif /* CONFIG_VSX */
> >  }
> >  
> > +static inline int get_user_sigset(sigset_t *dst, const sigset_t *src)
> > +{
> > +   if (sizeof(sigset_t) <= 8)
> > +   return __get_user(dst->sig[0], &src->sig[0]);
> > +   else
> > +   return __copy_from_user(dst, src, sizeof(sigset_t));
> > +}
> > +
> >  /*
> >   * Set up the sigcontext for the signal frame.
> >   */
> > @@ -701,8 +709,9 @@ SYSCALL_DEFINE3(swapcontext, struct ucontext __user *, 
> > old_ctx,
> >  * We kill the task with a SIGSEGV in this situation.
> >  */
> >  
> > -   if (__copy_from_user(&set, &new_ctx->uc_sigmask, sizeof(set)))
> > +   if (get_user_sigset(&set, &new_ctx->uc_sigmask))
> > do_exit(SIGSEGV);
> > +
> > set_current_blocked(&set);
> >  
> > if (!user_read_access_begin(new_ctx, ctx_size))
> > @@ -740,8 +749,9 @@ SYSCALL_DEFINE0(rt_sigreturn)
> > if (!access_ok(uc, sizeof(*uc)))
> > goto badframe;
> >  
> > -   if (__copy_from_user(&set, &uc->uc_sigmask, sizeof(set)))
> > +   if (get_user_sigset(&set, &uc->uc_sigmask))
> > goto badframe;
> > +
> > set_current_blocked(&set);
> >  
> >  #ifdef CONFIG_PPC_TRANSACTIONAL_MEM
> > -- 
> > 2.26.1



Re: [PATCH v5 05/10] powerpc/signal64: Remove TM ifdefery in middle of if/else block

2021-02-17 Thread Christopher M. Riedl
On Thu Feb 11, 2021 at 11:21 PM CST, Daniel Axtens wrote:
> Hi Chris,
>
> > Rework the messy ifdef breaking up the if-else for TM similar to
> > commit f1cf4f93de2f ("powerpc/signal32: Remove ifdefery in middle of 
> > if/else").
>
> I'm not sure what 'the messy ifdef' and 'the if-else for TM' is (yet):
> perhaps you could start the commit message with a tiny bit of
> background.

Yup good point - I will reword this for the next spin.

>
> > Unlike that commit for ppc32, the ifdef can't be removed entirely since
> > uc_transact in sigframe depends on CONFIG_PPC_TRANSACTIONAL_MEM.
> >
> > Signed-off-by: Christopher M. Riedl 
> > ---
> >  arch/powerpc/kernel/signal_64.c | 16 +++-
> >  1 file changed, 7 insertions(+), 9 deletions(-)
> >
> > diff --git a/arch/powerpc/kernel/signal_64.c 
> > b/arch/powerpc/kernel/signal_64.c
> > index b211a8ea4f6e..8e1d804ce552 100644
> > --- a/arch/powerpc/kernel/signal_64.c
> > +++ b/arch/powerpc/kernel/signal_64.c
> > @@ -710,9 +710,7 @@ SYSCALL_DEFINE0(rt_sigreturn)
> > struct pt_regs *regs = current_pt_regs();
> > struct ucontext __user *uc = (struct ucontext __user *)regs->gpr[1];
> > sigset_t set;
> > -#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
> > unsigned long msr;
> > -#endif
> >  
> > /* Always make any pending restarted system calls return -EINTR */
> > current->restart_block.fn = do_no_restart_syscall;
> > @@ -765,7 +763,10 @@ SYSCALL_DEFINE0(rt_sigreturn)
> >  
> > if (__get_user(msr, &uc->uc_mcontext.gp_regs[PT_MSR]))
> > goto badframe;
> > +#endif
> > +
> > if (MSR_TM_ACTIVE(msr)) {
> > +#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
> > /* We recheckpoint on return. */
> > struct ucontext __user *uc_transact;
> >  
> > @@ -778,9 +779,8 @@ SYSCALL_DEFINE0(rt_sigreturn)
> > if (restore_tm_sigcontexts(current, &uc->uc_mcontext,
> >    &uc_transact->uc_mcontext))
> > goto badframe;
> > -   } else
> >  #endif
> > -   {
> > +   } else {
> > /*
> >  * Fall through, for non-TM restore
> >  *
>
> I think you can get rid of all the ifdefs in _this function_ by
> providing a couple of stubs:
>
> diff --git a/arch/powerpc/kernel/process.c
> b/arch/powerpc/kernel/process.c
> index a66f435dabbf..19059a4b798f 100644
> --- a/arch/powerpc/kernel/process.c
> +++ b/arch/powerpc/kernel/process.c
> @@ -1120,6 +1120,7 @@ void restore_tm_state(struct pt_regs *regs)
> #else
> #define tm_recheckpoint_new_task(new)
> #define __switch_to_tm(prev, new)
> +void tm_reclaim_current(uint8_t cause) {}
> #endif /* CONFIG_PPC_TRANSACTIONAL_MEM */
>  
> static inline void save_sprs(struct thread_struct *t)
> diff --git a/arch/powerpc/kernel/signal_64.c
> b/arch/powerpc/kernel/signal_64.c
> index 8e1d804ce552..ed58ca019ad9 100644
> --- a/arch/powerpc/kernel/signal_64.c
> +++ b/arch/powerpc/kernel/signal_64.c
> @@ -594,6 +594,13 @@ static long restore_tm_sigcontexts(struct
> task_struct *tsk,
>  
> return err;
> }
> +#else
> +static long restore_tm_sigcontexts(struct task_struct *tsk,
> + struct sigcontext __user *sc,
> + struct sigcontext __user *tm_sc)
> +{
> + return -EINVAL;
> +}
> #endif
>  
> /*
> @@ -722,7 +729,6 @@ SYSCALL_DEFINE0(rt_sigreturn)
> goto badframe;
> set_current_blocked();
>  
> -#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
> /*
> * If there is a transactional state then throw it away.
> * The purpose of a sigreturn is to destroy all traces of the
> @@ -763,10 +769,8 @@ SYSCALL_DEFINE0(rt_sigreturn)
>  
> if (__get_user(msr, &uc->uc_mcontext.gp_regs[PT_MSR]))
> goto badframe;
> -#endif
>  
> if (MSR_TM_ACTIVE(msr)) {
> -#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
> /* We recheckpoint on return. */
> struct ucontext __user *uc_transact;
>  
> @@ -779,7 +783,6 @@ SYSCALL_DEFINE0(rt_sigreturn)
> if (restore_tm_sigcontexts(current, &uc->uc_mcontext,
> &uc_transact->uc_mcontext))
> goto badframe;
> -#endif
> } else {
> /*
> * Fall through, for non-TM restore
>
> My only concern here was whether it was valid to access
> if (__get_user(msr, &uc->uc_mcontext.gp_regs[PT_MSR]))
> if CONFIG_PPC_TRANSACTIONAL_MEM was not defined, but I didn't think of
> any obvious reason why it wouldn't be...

Hmm, we don't really need it for the non-TM case and it is another extra
uaccess. I will take your suggestion to remove the need for the other
ifdefs but might keep this one. Keeping it also makes it very clear this
call is only here for TM and possible to remove in a potentially TM-less
future :)

>
> > @@ -818,10 +818,8 @@ int handle_rt_signal64(struct ksignal *ksig, sigset_t 
> > *set,
> > unsigned long newsp = 0;
> > long err = 0;
> > struct pt_regs *regs = tsk->thread.regs;
> > -#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
> > /* Save the thread's msr before get_tm_stackpointer() changes it */
> > unsigned long msr = regs->msr;
> > -#endif
>
> I wondered if that comment still makes sense in the absence of the
> ifdef. It is preserved in commit f1cf4f93de2f 

Re: [PATCH kernel 2/2] powerpc/iommu: Do not immediately panic when failed IOMMU table allocation

2021-02-17 Thread Leonardo Bras
On Tue, 2021-02-16 at 14:33 +1100, Alexey Kardashevskiy wrote:
> Most platforms allocate IOMMU table structures (specifically it_map)
> at the boot time and when this fails - it is a valid reason for panic().
> 
> However the powernv platform allocates it_map after a device is returned
> to the host OS after being passed through and this happens long after
> the host OS booted. It is quite possible to trigger the it_map allocation
> panic() and kill the host even though it is not necessary - the host OS
> can still use the DMA bypass mode (requires a tiny fraction of it_map's
> memory) and even if that fails, the host OS is runnable as it was without
> the device for which allocating it_map causes the panic.
> 
> Instead of immediately crashing in a powernv/ioda2 system, this prints
> an error and continues. All other platforms still call panic().
> 
> Signed-off-by: Alexey Kardashevskiy 

Hello Alexey,

This looks like a good change, that passes panic() decision to platform
code. Everything looks pretty straightforward, but I have a question
regarding this:

> @@ -1930,16 +1931,16 @@ static long pnv_pci_ioda2_setup_default_config(struct 
> pnv_ioda_pe *pe)
>   res_start = pe->phb->ioda.m32_pci_base >> tbl->it_page_shift;
>   res_end = min(window_size, SZ_4G) >> tbl->it_page_shift;
>   }
> - iommu_init_table(tbl, pe->phb->hose->node, res_start, res_end);
> - rc = pnv_pci_ioda2_set_window(&pe->table_group, 0, tbl);
> 
> + if (iommu_init_table(tbl, pe->phb->hose->node, res_start, res_end))
> + rc = pnv_pci_ioda2_set_window(&pe->table_group, 0, tbl);
> + else
> + rc = -ENOMEM;
>   if (rc) {
> - pe_err(pe, "Failed to configure 32-bit TCE table, err %ld\n",
> - rc);
> + pe_err(pe, "Failed to configure 32-bit TCE table, err %ld\n", 
> rc);
>   iommu_tce_table_put(tbl);
> - return rc;
> + tbl = NULL; /* This clears iommu_table_base below */
>   }
> -
>   if (!pnv_iommu_bypass_disabled)
>   pnv_pci_ioda2_set_bypass(pe, true);
>  
> 

If I understand correctly, previously if iommu_init_table() did not
panic() and pnv_pci_ioda2_set_window() returned something other than 0,
the function would return rc in the if (rc) clause; but now it no longer
returns there, and instead falls through to if (!pnv_iommu_bypass_disabled)
and onwards.

Is that desired?

As far as I can see, returning rc there seems like a good procedure after
iommu_init_table() fails and rc is set to -ENOMEM.
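
In other words, something along these lines (a sketch of the suggestion,
reusing the names from the hunk above, not the patch as posted):

	if (iommu_init_table(tbl, pe->phb->hose->node, res_start, res_end))
		rc = pnv_pci_ioda2_set_window(&pe->table_group, 0, tbl);
	else
		rc = -ENOMEM;
	if (rc) {
		pe_err(pe, "Failed to configure 32-bit TCE table, err %ld\n", rc);
		iommu_tce_table_put(tbl);
		return rc;	/* do not fall through to the bypass setup */
	}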

Best regards, 
Leonardo Bras  






Re: [PATCH v5 07/10] powerpc/signal64: Replace restore_sigcontext() w/ unsafe_restore_sigcontext()

2021-02-17 Thread Christopher M. Riedl
On Fri Feb 12, 2021 at 3:17 PM CST, Daniel Axtens wrote:
> Hi Chris,
>
> > Previously restore_sigcontext() performed a costly KUAP switch on every
> > uaccess operation. These repeated uaccess switches cause a significant
> > drop in signal handling performance.
> >
> > Rewrite restore_sigcontext() to assume that a userspace read access
> > window is open. Replace all uaccess functions with their 'unsafe'
> > versions which avoid the repeated uaccess switches.
> >
>
> Much of the same comments apply here as to the last patch:
> - the commit message might be improved by adding that you are also
> changing the calling functions to open the uaccess window before
> calling into the new unsafe functions
>
> - I have checked that the safe to unsafe conversions look right.
>
> - I think you're opening too wide a window in user_read_access_begin,
> it seems to me that it could be reduced to just the
> uc_mcontext. (Again, not that it makes a difference with the current
> HW.)

Ok, I'll fix these in the next version as well.
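
For concreteness, the narrowing Daniel describes could look roughly like this
(a fragment reusing the names from the quoted patch; an assumption, not the
posted code):

	/*
	 * Hypothetical narrowing: open the read window over just the
	 * sigcontext rather than the whole ucontext.
	 */
	if (!user_read_access_begin(&uc->uc_mcontext, sizeof(uc->uc_mcontext)))
		goto badframe;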

>
> Kind regards,
> Daniel
>
> > Signed-off-by: Christopher M. Riedl 
> > ---
> >  arch/powerpc/kernel/signal_64.c | 68 -
> >  1 file changed, 41 insertions(+), 27 deletions(-)
> >
> > diff --git a/arch/powerpc/kernel/signal_64.c 
> > b/arch/powerpc/kernel/signal_64.c
> > index 4248e4489ff1..d668f8af18fe 100644
> > --- a/arch/powerpc/kernel/signal_64.c
> > +++ b/arch/powerpc/kernel/signal_64.c
> > @@ -326,14 +326,14 @@ static long setup_tm_sigcontexts(struct sigcontext 
> > __user *sc,
> >  /*
> >   * Restore the sigcontext from the signal frame.
> >   */
> > -
> > -static long restore_sigcontext(struct task_struct *tsk, sigset_t *set, int 
> > sig,
> > - struct sigcontext __user *sc)
> > +#define unsafe_restore_sigcontext(tsk, set, sig, sc, e) \
> > +   unsafe_op_wrap(__unsafe_restore_sigcontext(tsk, set, sig, sc), e)
> > +static long notrace __unsafe_restore_sigcontext(struct task_struct *tsk, 
> > sigset_t *set,
> > +   int sig, struct sigcontext 
> > __user *sc)
> >  {
> >  #ifdef CONFIG_ALTIVEC
> > elf_vrreg_t __user *v_regs;
> >  #endif
> > -   unsigned long err = 0;
> > unsigned long save_r13 = 0;
> > unsigned long msr;
> > struct pt_regs *regs = tsk->thread.regs;
> > @@ -348,27 +348,28 @@ static long restore_sigcontext(struct task_struct 
> > *tsk, sigset_t *set, int sig,
> > save_r13 = regs->gpr[13];
> >  
> > /* copy the GPRs */
> > -   err |= __copy_from_user(regs->gpr, sc->gp_regs, sizeof(regs->gpr));
> > -   err |= __get_user(regs->nip, &sc->gp_regs[PT_NIP]);
> > +   unsafe_copy_from_user(regs->gpr, sc->gp_regs, sizeof(regs->gpr),
> > + efault_out);
> > +   unsafe_get_user(regs->nip, &sc->gp_regs[PT_NIP], efault_out);
> > /* get MSR separately, transfer the LE bit if doing signal return */
> > -   err |= __get_user(msr, &sc->gp_regs[PT_MSR]);
> > +   unsafe_get_user(msr, &sc->gp_regs[PT_MSR], efault_out);
> > if (sig)
> > regs->msr = (regs->msr & ~MSR_LE) | (msr & MSR_LE);
> > -   err |= __get_user(regs->orig_gpr3, &sc->gp_regs[PT_ORIG_R3]);
> > -   err |= __get_user(regs->ctr, &sc->gp_regs[PT_CTR]);
> > -   err |= __get_user(regs->link, &sc->gp_regs[PT_LNK]);
> > -   err |= __get_user(regs->xer, &sc->gp_regs[PT_XER]);
> > -   err |= __get_user(regs->ccr, &sc->gp_regs[PT_CCR]);
> > +   unsafe_get_user(regs->orig_gpr3, &sc->gp_regs[PT_ORIG_R3], efault_out);
> > +   unsafe_get_user(regs->ctr, &sc->gp_regs[PT_CTR], efault_out);
> > +   unsafe_get_user(regs->link, &sc->gp_regs[PT_LNK], efault_out);
> > +   unsafe_get_user(regs->xer, &sc->gp_regs[PT_XER], efault_out);
> > +   unsafe_get_user(regs->ccr, &sc->gp_regs[PT_CCR], efault_out);
> > /* Don't allow userspace to set SOFTE */
> > set_trap_norestart(regs);
> > -   err |= __get_user(regs->dar, &sc->gp_regs[PT_DAR]);
> > -   err |= __get_user(regs->dsisr, &sc->gp_regs[PT_DSISR]);
> > -   err |= __get_user(regs->result, &sc->gp_regs[PT_RESULT]);
> > +   unsafe_get_user(regs->dar, &sc->gp_regs[PT_DAR], efault_out);
> > +   unsafe_get_user(regs->dsisr, &sc->gp_regs[PT_DSISR], efault_out);
> > +   unsafe_get_user(regs->result, &sc->gp_regs[PT_RESULT], efault_out);
> >  
> > if (!sig)
> > regs->gpr[13] = save_r13;
> > if (set != NULL)
> > -   err |=  __get_user(set->sig[0], &sc->oldmask);
> > +   unsafe_get_user(set->sig[0], &sc->oldmask, efault_out);
> >  
> > /*
> >  * Force reload of FP/VEC.
> > @@ -378,29 +379,28 @@ static long restore_sigcontext(struct task_struct 
> > *tsk, sigset_t *set, int sig,
> > regs->msr &= ~(MSR_FP | MSR_FE0 | MSR_FE1 | MSR_VEC | MSR_VSX);
> >  
> >  #ifdef CONFIG_ALTIVEC
> > -   err |= __get_user(v_regs, &sc->v_regs);
> > -   if (err)
> > -   return err;
> > +   unsafe_get_user(v_regs, &sc->v_regs, efault_out);
> > if (v_regs && !access_ok(v_regs, 34 * sizeof(vector128)))
> > return -EFAULT;
> > /* Copy 33 vec registers (vr0..31 and vscr) from the 

Re: [PATCH v5 06/10] powerpc/signal64: Replace setup_sigcontext() w/ unsafe_setup_sigcontext()

2021-02-17 Thread Christopher M. Riedl
On Thu Feb 11, 2021 at 11:41 PM CST, Daniel Axtens wrote:
> Hi Chris,
>
> > Previously setup_sigcontext() performed a costly KUAP switch on every
> > uaccess operation. These repeated uaccess switches cause a significant
> > drop in signal handling performance.
> >
> > Rewrite setup_sigcontext() to assume that a userspace write access window
> > is open. Replace all uaccess functions with their 'unsafe' versions
> > which avoid the repeated uaccess switches.
>
> Just to clarify the commit message a bit: you're also changing the
> callers of the old safe versions to first open the window, then call the
> unsafe variants, then close the window again.

Noted!

>
> > Signed-off-by: Christopher M. Riedl 
> > ---
> >  arch/powerpc/kernel/signal_64.c | 70 -
> >  1 file changed, 43 insertions(+), 27 deletions(-)
> >
> > diff --git a/arch/powerpc/kernel/signal_64.c 
> > b/arch/powerpc/kernel/signal_64.c
> > index 8e1d804ce552..4248e4489ff1 100644
> > --- a/arch/powerpc/kernel/signal_64.c
> > +++ b/arch/powerpc/kernel/signal_64.c
> > @@ -101,9 +101,13 @@ static void prepare_setup_sigcontext(struct 
> > task_struct *tsk, int ctx_has_vsx_re
> >   * Set up the sigcontext for the signal frame.
> >   */
> >  
> > -static long setup_sigcontext(struct sigcontext __user *sc,
> > -   struct task_struct *tsk, int signr, sigset_t *set,
> > -   unsigned long handler, int ctx_has_vsx_region)
> > +#define unsafe_setup_sigcontext(sc, tsk, signr, set, handler,  
> > \
> > +   ctx_has_vsx_region, e)  \
> > +   unsafe_op_wrap(__unsafe_setup_sigcontext(sc, tsk, signr, set,   \
> > +   handler, ctx_has_vsx_region), e)
> > +static long notrace __unsafe_setup_sigcontext(struct sigcontext __user *sc,
> > +   struct task_struct *tsk, int signr, 
> > sigset_t *set,
> > +   unsigned long handler, int 
> > ctx_has_vsx_region)
> >  {
> > /* When CONFIG_ALTIVEC is set, we _always_ setup v_regs even if the
> >  * process never used altivec yet (MSR_VEC is zero in pt_regs of
> > @@ -118,20 +122,19 @@ static long setup_sigcontext(struct sigcontext __user 
> > *sc,
> >  #endif
> > struct pt_regs *regs = tsk->thread.regs;
> > unsigned long msr = regs->msr;
> > -   long err = 0;
> > /* Force usr to alway see softe as 1 (interrupts enabled) */
> > unsigned long softe = 0x1;
> >  
> > BUG_ON(tsk != current);
> >  
> >  #ifdef CONFIG_ALTIVEC
> > -   err |= __put_user(v_regs, &sc->v_regs);
> > +   unsafe_put_user(v_regs, &sc->v_regs, efault_out);
> >  
> > /* save altivec registers */
> > if (tsk->thread.used_vr) {
> > /* Copy 33 vec registers (vr0..31 and vscr) to the stack */
> > -   err |= __copy_to_user(v_regs, &tsk->thread.vr_state,
> > - 33 * sizeof(vector128));
> > +   unsafe_copy_to_user(v_regs, &tsk->thread.vr_state,
> > +   33 * sizeof(vector128), efault_out);
> > /* set MSR_VEC in the MSR value in the frame to indicate that 
> > sc->v_reg)
> >  * contains valid data.
> >  */
> > @@ -140,12 +143,12 @@ static long setup_sigcontext(struct sigcontext __user 
> > *sc,
> > /* We always copy to/from vrsave, it's 0 if we don't have or don't
> >  * use altivec.
> >  */
> > -   err |= __put_user(tsk->thread.vrsave, (u32 __user *)&v_regs[33]);
> > +   unsafe_put_user(tsk->thread.vrsave, (u32 __user *)&v_regs[33], 
> > efault_out);
> >  #else /* CONFIG_ALTIVEC */
> > -   err |= __put_user(0, &sc->v_regs);
> > +   unsafe_put_user(0, &sc->v_regs, efault_out);
> >  #endif /* CONFIG_ALTIVEC */
> > /* copy fpr regs and fpscr */
> > -   err |= copy_fpr_to_user(&sc->fp_regs, tsk);
> > +   unsafe_copy_fpr_to_user(&sc->fp_regs, tsk, efault_out);
> >  
> > /*
> >  * Clear the MSR VSX bit to indicate there is no valid state attached
> > @@ -160,24 +163,27 @@ static long setup_sigcontext(struct sigcontext __user 
> > *sc,
> >  */
> > if (tsk->thread.used_vsr && ctx_has_vsx_region) {
> > v_regs += ELF_NVRREG;
> > -   err |= copy_vsx_to_user(v_regs, tsk);
> > +   unsafe_copy_vsx_to_user(v_regs, tsk, efault_out);
> > /* set MSR_VSX in the MSR value in the frame to
> >  * indicate that sc->vs_reg) contains valid data.
> >  */
> > msr |= MSR_VSX;
> > }
> >  #endif /* CONFIG_VSX */
> > -   err |= __put_user(&sc->gp_regs, &sc->regs);
> > +   unsafe_put_user(&sc->gp_regs, &sc->regs, efault_out);
> > WARN_ON(!FULL_REGS(regs));
> > -   err |= __copy_to_user(&sc->gp_regs, regs, GP_REGS_SIZE);
> > -   err |= __put_user(msr, &sc->gp_regs[PT_MSR]);
> > -   err |= __put_user(softe, &sc->gp_regs[PT_SOFTE]);
> > -   err |= __put_user(signr, &sc->signal);
> > -   err |= __put_user(handler, &sc->handler);
> > +   unsafe_copy_to_user(&sc->gp_regs, regs, GP_REGS_SIZE, efault_out);
> > +   unsafe_put_user(msr, 

Re: [PATCH kernel 1/2] powerpc/iommu: Allocate it_map by vmalloc

2021-02-17 Thread Leonardo Bras
On Tue, 2021-02-16 at 14:33 +1100, Alexey Kardashevskiy wrote:
> The IOMMU table uses the it_map bitmap to keep track of allocated DMA
> pages. This has always been a contiguous array allocated at either
> the boot time or when a passed through device is returned to the host OS.
> The it_map memory is allocated by alloc_pages() which allocates
> contiguous physical memory.
> 
> Such allocation method occasionally creates a problem when there is
> no big chunk of memory available (no free memory or too fragmented).
> On powernv/ioda2 the default DMA window requires 16MB for it_map.
> 
> This replaces alloc_pages_node() with vzalloc_node() which allocates
> a contiguous block but in virtual memory. This should reduce the chances
> of failure but should not cause other behavioral changes as it_map is only
> used by the kernel's DMA hooks/api when MMU is on.
> 
> Signed-off-by: Alexey Kardashevskiy 

It looks like a very good change, and it also makes the code much simpler to read.

FWIW:

Reviewed-by: Leonardo Bras 

> ---
>  arch/powerpc/kernel/iommu.c | 15 +++
>  1 file changed, 3 insertions(+), 12 deletions(-)
> 
> diff --git a/arch/powerpc/kernel/iommu.c b/arch/powerpc/kernel/iommu.c
> index c00214a4355c..8eb6eb0afa97 100644
> --- a/arch/powerpc/kernel/iommu.c
> +++ b/arch/powerpc/kernel/iommu.c
> @@ -719,7 +719,6 @@ struct iommu_table *iommu_init_table(struct iommu_table 
> *tbl, int nid,
>  {
>   unsigned long sz;
>   static int welcomed = 0;
> - struct page *page;
>   unsigned int i;
>   struct iommu_pool *p;
>  
> 
> 
> 
> @@ -728,11 +727,9 @@ struct iommu_table *iommu_init_table(struct iommu_table 
> *tbl, int nid,
>   /* number of bytes needed for the bitmap */
>   sz = BITS_TO_LONGS(tbl->it_size) * sizeof(unsigned long);
>  
> 
> 
> 
> - page = alloc_pages_node(nid, GFP_KERNEL, get_order(sz));
> - if (!page)
> + tbl->it_map = vzalloc_node(sz, nid);
> + if (!tbl->it_map)
>   panic("iommu_init_table: Can't allocate %ld bytes\n", sz);
> - tbl->it_map = page_address(page);
> - memset(tbl->it_map, 0, sz);
>  
> 
> 
> 
>   iommu_table_reserve_pages(tbl, res_start, res_end);
>  
> 
> 
> 
> @@ -774,8 +771,6 @@ struct iommu_table *iommu_init_table(struct iommu_table 
> *tbl, int nid,
>  
> 
> 
> 
>  static void iommu_table_free(struct kref *kref)
>  {
> - unsigned long bitmap_sz;
> - unsigned int order;
>   struct iommu_table *tbl;
>  
> 
> 
> 
>   tbl = container_of(kref, struct iommu_table, it_kref);
> @@ -796,12 +791,8 @@ static void iommu_table_free(struct kref *kref)
>   if (!bitmap_empty(tbl->it_map, tbl->it_size))
>   pr_warn("%s: Unexpected TCEs\n", __func__);
>  
> 
> 
> 
> - /* calculate bitmap size in bytes */
> - bitmap_sz = BITS_TO_LONGS(tbl->it_size) * sizeof(unsigned long);
> -
>   /* free bitmap */
> - order = get_order(bitmap_sz);
> - free_pages((unsigned long) tbl->it_map, order);
> + vfree(tbl->it_map);
>  
> 
> 
> 
>   /* free table */
>   kfree(tbl);




Re: [PATCH 2/2] powerpc/uaccess: Move might_fault() into user_access_begin()

2021-02-17 Thread Christophe Leroy

Michael Ellerman  wrote:


Christophe Leroy  writes:

On 08/02/2021 at 14:57, Michael Ellerman wrote:

We have a might_fault() check in __unsafe_put_user_goto(), but that is
dangerous as it potentially calls lots of code while user access is
enabled.

It also triggers the check Alexey added to the irq restore path to
catch cases like that:

   WARNING: CPU: 30 PID: 1 at  
arch/powerpc/include/asm/book3s/64/kup.h:324  
arch_local_irq_restore+0x160/0x190

   NIP arch_local_irq_restore+0x160/0x190
   LR  lock_is_held_type+0x140/0x200
   Call Trace:
 0xc0007f392ff8 (unreliable)
 ___might_sleep+0x180/0x320
 __might_fault+0x50/0xe0
 filldir64+0x2d0/0x5d0
 call_filldir+0xc8/0x180
 ext4_readdir+0x948/0xb40
 iterate_dir+0x1ec/0x240
 sys_getdents64+0x80/0x290
 system_call_exception+0x160/0x280
 system_call_common+0xf0/0x27c

So remove the might fault check from unsafe_put_user().

Any call to unsafe_put_user() must be inside a region that's had user
access enabled with user_access_begin(), so move the might_fault() in
there. That also allows us to drop the is_kernel_addr() test, because
there should be no code using user_access_begin() in order to access a
kernel address.
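
The resulting shape is roughly the following (a sketch of the powerpc helper,
assuming the usual allow_read_write_user() primitive; see the merged commit for
the real code):

static __must_check inline bool
user_access_begin(const void __user *ptr, size_t len)
{
	if (unlikely(!access_ok(ptr, len)))
		return false;

	might_fault();	/* called before the user-access window is opened */

	allow_read_write_user((void __user *)ptr, ptr, len);
	return true;
}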


x86 and mips only have might_fault() on get_user() and put_user(),
neither on __get_user() nor on __put_user() nor on the unsafe
alternative.


Yeah, that's their choice, or perhaps it's by accident.

arm64 on the other hand has might_fault() in all variants.

A __get_user() can fault just as much as a get_user(), so there's no
reason the check should be omitted from __get_user(), other than perhaps
some historical argument about __get_user() being the "fast" case.


We have might_fault() in __get_user_nocheck(), which is used by
__get_user() and __get_user_allowed(), i.e. by unsafe_get_user().

Shouldn't those be dropped as well?


That was handled by Alexey's patch, which I ended up merging with this
one:

  https://git.kernel.org/torvalds/c/7d506ca97b665b95e698a53697dad99fae813c1a


ie. we still have might_fault() in __get_user_nocheck(), but it's
guarded by a check of do_allow, so we won't call it for
__get_user_allowed().

So I think the code (in my next branch) is correct, we don't have any
might_fault() calls in unsafe regions.

But I'd still be happier if we could simplify our uaccess.h more, it's a
bit of a rat's nest. We could start by making __get/put_user() ==
get/put_user() the same way arm64 did.


I agree there are several easy simplifications to do there. I'll look  
at that in the coming weeks.


I'm not sure it is good to make __get/put_user equal to get/put_user as  
it would mean calling access_ok() every time. But we could most likely  
make something simpler with get_user() calling access_ok() then  
__get_user().
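
For example, a rough sketch of that shape (an assumption, not a posted patch);
the real get_user() would also zero x when access_ok() fails:

#define get_user_sketch(x, ptr)						\
({									\
	__typeof__(ptr) __gu_ptr = (ptr);				\
	access_ok(__gu_ptr, sizeof(*__gu_ptr)) ?			\
		__get_user(x, __gu_ptr) : -EFAULT;			\
})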


I think we should also audit our use of the _inatomic variants.  
might_fault() is a no-op when pagefaults are disabled, so I think the inatomic  
variants shouldn't be needed. As far as I can see, powerpc is the only  
arch having that.


We also need to check why we still need that is_kernel_addr() check,  
because since the removal of set_fs(), the __get/put_user() helpers  
shouldn't be used for kernel addresses anymore.


Christophe