Re: [PATCH v3 3/3] clk: qcom: lpasscc-sc7180: Re-configure the PLL in case lost

2020-10-14 Thread Doug Anderson
Hi,

On Wed, Oct 14, 2020 at 4:39 PM Stephen Boyd  wrote:
>
> Quoting Douglas Anderson (2020-10-14 14:05:23)
> > diff --git a/drivers/clk/qcom/lpasscorecc-sc7180.c 
> > b/drivers/clk/qcom/lpasscorecc-sc7180.c
> > index 48d370e2108e..e12d4c2b1b70 100644
> > --- a/drivers/clk/qcom/lpasscorecc-sc7180.c
> > +++ b/drivers/clk/qcom/lpasscorecc-sc7180.c
> > @@ -388,6 +388,25 @@ static int lpass_create_pm_clks(struct platform_device 
> > *pdev)
> > return ret;
> >  }
> >
> > +static int lpass_core_cc_pm_clk_resume(struct device *dev)
> > +{
> > +   struct regmap *regmap = dev_get_regmap(dev, "lpass_core_cc");
>
> Please make "lpass_core_cc" a static const pointer in this driver so
> that it can be used here and when the regmap is made so that we're
> certain they match.

Sure.  In theory the compiler ought to use string constant pooling so
they would have pointed to the same place, but you're right that it
wouldn't catch typos or other cases where they don't match.  I'll do
both regmap names just for symmetry.  Hopefully that's OK.

-Doug


Re: [PATCH v2 14/17] resource: Move devmem revoke code to resource framework

2020-10-14 Thread Jason Gunthorpe
On Fri, Oct 09, 2020 at 11:28:54AM -0700, Dan Williams wrote:
> On Fri, Oct 9, 2020 at 7:32 AM Jason Gunthorpe  wrote:
> >
> > On Fri, Oct 09, 2020 at 04:24:45PM +0200, Daniel Vetter wrote:
> > > On Fri, Oct 9, 2020 at 2:31 PM Jason Gunthorpe  wrote:
> > > >
> > > > On Fri, Oct 09, 2020 at 09:59:31AM +0200, Daniel Vetter wrote:
> > > >
> > > > > +struct address_space *iomem_get_mapping(void)
> > > > > +{
> > > > > + return iomem_inode->i_mapping;
> > > >
> > > > This should pair an acquire with the release below
> > > >
> > > > > + /*
> > > > > +  * Publish /dev/mem initialized.
> > > > > +  * Pairs with smp_load_acquire() in revoke_iomem().
> > > > > +  */
> > > > > + smp_store_release(&iomem_inode, inode);
> > > >
> > > > However, this seems abnormal, initcalls rarely do this kind of stuff
> > > > with global data..
> > > >
> > > > The kernel crashes if this fs_initcall is raced with
> > > > iomem_get_mapping() due to the unconditional dereference, so I think
> > > > it can be safely switched to a simple assignment.
> > >
> > > Ah yes I checked this all, but forgot to correctly annotate the
> > > iomem_get_mapping access. For reference, see b34e7e298d7a ("/dev/mem:
> > > Add missing memory barriers for devmem_inode").
> >
> > Oh yikes, so revoke_iomem can run concurrently during early boot,
> > tricky.
> 
> It runs early because request_mem_region() can run before fs_initcall.
> Rather than add an unnecessary lock just arrange for the revoke to be
> skipped before the inode is initialized. The expectation is that any
> early resource reservations will block future userspace mapping
> attempts.

Actually, on this point a simple WRITE_ONCE/READ_ONCE pairing is OK,
Paul once explained that the pointer chase on the READ_ONCE side is
required to be like an acquire - this is why rcu_dereference is just
READ_ONCE

Jason


Re: [PATCH v6 16/25] init: lto: fix PREL32 relocations

2020-10-14 Thread Jann Horn
On Tue, Oct 13, 2020 at 2:34 AM Sami Tolvanen  wrote:
> With LTO, the compiler can rename static functions to avoid global
> naming collisions. As initcall functions are typically static,
> renaming can break references to them in inline assembly. This
> change adds a global stub with a stable name for each initcall to
> fix the issue when PREL32 relocations are used.

While I understand that this may be necessary for now, are there any
plans to fix this in the compiler in the future? There was a thread
about this issue at
,
and possible solutions were discussed there, but it looks like that
fizzled out...


Re: [PATCH -next] Revert "powerpc/pci: unmap legacy INTx interrupts when a PHB is removed"

2020-10-14 Thread Oliver O'Halloran
On Thu, Oct 15, 2020 at 5:28 AM Qian Cai  wrote:
>
> This reverts commit 3a3181e16fbde752007759f8759d25e0ff1fc425 which
> causes memory corruptions on POWER9 NV.

I was going to post this along with a fix for Cedric's original bug,
but I can do that separately so:

Acked-by: Oliver O'Halloran 


Re: [tip: locking/core] lockdep: Fix lockdep recursion

2020-10-14 Thread Peter Zijlstra
On Wed, Oct 14, 2020 at 11:34:05AM -0700, Paul E. McKenney wrote:
> commit 7deaa04b02298001426730ed0e6214ac20d1a1c1
> Author: Paul E. McKenney 
> Date:   Tue Oct 13 12:39:23 2020 -0700
> 
> rcu: Prevent lockdep-RCU splats on lock acquisition/release
> 
> The rcu_cpu_starting() and rcu_report_dead() functions transition the
> current CPU between online and offline state from an RCU perspective.
> Unfortunately, this means that the rcu_cpu_starting() function's lock
> acquisition and the rcu_report_dead() function's lock releases happen
> while the CPU is offline from an RCU perspective, which can result in
> lockdep-RCU splats about using RCU from an offline CPU.  In reality,
> aside from the splats, both transitions are safe because a new grace
> period cannot start until these functions release their locks.

But we call the trace_* crud before we acquire the lock. Are you sure
that's a false-positive? 



RE: [PATCH 01/13] remoteproc: Re-check state in rproc_shutdown()

2020-10-14 Thread Peng Fan
> Subject: [PATCH 01/13] remoteproc: Re-check state in rproc_shutdown()
> 
> The state of the remote processor may have changed between the time a call
> to rproc_shutdown() was made and the time it is executed.  To avoid moving
> forward with an operation that may have been cancelled, recheck while
> holding the mutex.
> 
> Cc: 
> Signed-off-by: Mathieu Poirier 
> ---
>  drivers/remoteproc/remoteproc_core.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/drivers/remoteproc/remoteproc_core.c
> b/drivers/remoteproc/remoteproc_core.c
> index 7f90eeea67e2..fb2632cbd2df 100644
> --- a/drivers/remoteproc/remoteproc_core.c
> +++ b/drivers/remoteproc/remoteproc_core.c
> @@ -1834,6 +1834,9 @@ void rproc_shutdown(struct rproc *rproc)
>   return;
>   }
> 
> + if (rproc->state != RPROC_RUNNING)
> + goto out;
> +
>   /* if the remote proc is still needed, bail out */
>   if (!atomic_dec_and_test(&rproc->power))
>   goto out;
> --

Reviewed-by: Peng Fan 


Re: [PATCH] sched/cputime: correct account of irqtime

2020-10-14 Thread Pingfan Liu
On Wed, Oct 14, 2020 at 9:02 PM Peter Zijlstra  wrote:
>
> On Mon, Oct 12, 2020 at 09:50:44PM +0800, Pingfan Liu wrote:
> > __do_softirq() may be interrupted by hardware interrupts. In this case,
> > irqtime_account_irq() will account the time slice as CPUTIME_SOFTIRQ by
> > mistake.
> >
> > By passing irqtime_account_irq() an extra param about either hardirq or
> > softirq, irqtime_account_irq() can handle the above case.
>
> I'm not sure I see the scenario in which it goes wrong.
>
> irqtime_account_irq() is designed such that we're called with the old
> preempt_count on enter and the new preempt_count on exit. This way we'll
> accumuate the delta to the previous context.
>
Oops! You are right, the time delta between a softirq and a
interrupting hardirq should be accounted into the softrq.

Thanks for your clear explanation.

Regards,
Pingfan


Re: [PATCH v2 24/24] counters: docs: add a missing include

2020-10-14 Thread Shuah Khan

On 10/13/20 6:14 AM, Mauro Carvalho Chehab wrote:

Changeset 37a0dbf631f6 ("counters: Introduce counter_atomic* counters")

Is causing two new warnings:

.../Documentation/core-api/counters.rst:8: WARNING: Undefined substitution 
referenced: "copy".
.../Documentation/core-api/counters.rst:9: WARNING: Undefined substitution 
referenced: "copy".

Because it forgot to include isonum.txt, which defines |copy|
macro.

While here, also add it to core-api index file, in order to
solve this warning:

.../Documentation/core-api/counters.rst: WARNING: document isn't 
included in any toctree

Fixes: 37a0dbf631f6 ("counters: Introduce counter_atomic* counters")
Signed-off-by: Mauro Carvalho Chehab 
---
  Documentation/core-api/counters.rst | 1 +
  Documentation/core-api/index.rst| 1 +
  2 files changed, 2 insertions(+)

diff --git a/Documentation/core-api/counters.rst 
b/Documentation/core-api/counters.rst
index 642d907f4d3a..2821aebf3f45 100644
--- a/Documentation/core-api/counters.rst
+++ b/Documentation/core-api/counters.rst
@@ -1,4 +1,5 @@
  .. SPDX-License-Identifier: GPL-2.0
+.. include:: 
  
  ==

  Simple atomic counters
diff --git a/Documentation/core-api/index.rst b/Documentation/core-api/index.rst
index 69171b1799f2..cf9cd44c1191 100644
--- a/Documentation/core-api/index.rst
+++ b/Documentation/core-api/index.rst
@@ -43,6 +43,7 @@ Library functionality that is used throughout the kernel.
 this_cpu_ops
 timekeeping
 errseq
+   counters
  
  Concurrency primitives

  ==



Thank you for the patch. I will add this to my patch series.

thanks,
-- Shuah


Re: [GIT PULL v2] objtool changes for v5.10

2020-10-14 Thread Stephen Rothwell
Hi Ingo,

On Tue, 13 Oct 2020 12:38:31 +0200 Ingo Molnar  wrote:
>
> * Ingo Molnar  wrote:
> 
> > > This seems to be missing
> > > 
> > > https://lore.kernel.org/lkml/patch-1.thread-251403.git-2514037e9477.your-ad-here.call-01602244460-ext-7088@work.hours/
> > > 
> > > or did that get sent in a previous pull request?  
> > 
> > No, that fix is still missing, thanks for the reminder. I overlooked it 
> > thinking that it's a tooling patch - but this needs to be paired with:
> > 
> >   2486baae2cf6: ("objtool: Allow nested externs to enable BUILD_BUG()")
> > 
> > I'll send a v2 pull request in an hour or two.  

Thanks for that.

-- 
Cheers,
Stephen Rothwell


pgpNYrRHMr7OL.pgp
Description: OpenPGP digital signature


Re: [PATCH v5 1/5] arm64: Add framework to turn IPI as NMI

2020-10-14 Thread Masayoshi Mizuma
On Wed, Oct 14, 2020 at 04:42:07PM +0530, Sumit Garg wrote:
> Introduce framework to turn an IPI as NMI using pseudo NMIs. In case a
> particular platform doesn't support pseudo NMIs, then request IPI as a
> regular IRQ.
> 
> The main motivation for this feature is to have an IPI that can be
> leveraged to invoke NMI functions on other CPUs. And current prospective
> users are NMI backtrace and KGDB CPUs round-up whose support is added
> via future patches.
> 
> Signed-off-by: Sumit Garg 
> ---
>  arch/arm64/include/asm/nmi.h | 16 +
>  arch/arm64/kernel/Makefile   |  2 +-
>  arch/arm64/kernel/ipi_nmi.c  | 77 
> 
>  3 files changed, 94 insertions(+), 1 deletion(-)
>  create mode 100644 arch/arm64/include/asm/nmi.h
>  create mode 100644 arch/arm64/kernel/ipi_nmi.c
> 
> diff --git a/arch/arm64/include/asm/nmi.h b/arch/arm64/include/asm/nmi.h
> new file mode 100644
> index 000..3433c55
> --- /dev/null
> +++ b/arch/arm64/include/asm/nmi.h
> @@ -0,0 +1,16 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +#ifndef __ASM_NMI_H
> +#define __ASM_NMI_H
> +
> +#ifndef __ASSEMBLER__
> +
> +#include 
> +
> +extern void arch_send_call_nmi_func_ipi_mask(cpumask_t *mask);
> +
> +void set_smp_ipi_nmi(int ipi);
> +void ipi_nmi_setup(int cpu);
> +void ipi_nmi_teardown(int cpu);
> +
> +#endif /* !__ASSEMBLER__ */
> +#endif
> diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
> index bbaf0bc..525a1e0 100644
> --- a/arch/arm64/kernel/Makefile
> +++ b/arch/arm64/kernel/Makefile
> @@ -17,7 +17,7 @@ obj-y   := debug-monitors.o entry.o 
> irq.o fpsimd.o  \
>  return_address.o cpuinfo.o cpu_errata.o  
> \
>  cpufeature.o alternative.o cacheinfo.o   
> \
>  smp.o smp_spin_table.o topology.o smccc-call.o   
> \
> -syscall.o proton-pack.o
> +syscall.o proton-pack.o ipi_nmi.o
>  
>  targets  += efi-entry.o
>  
> diff --git a/arch/arm64/kernel/ipi_nmi.c b/arch/arm64/kernel/ipi_nmi.c
> new file mode 100644
> index 000..a959256
> --- /dev/null
> +++ b/arch/arm64/kernel/ipi_nmi.c
> @@ -0,0 +1,77 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + * NMI support for IPIs
> + *
> + * Copyright (C) 2020 Linaro Limited
> + * Author: Sumit Garg 
> + */
> +
> +#include 
> +#include 
> +#include 
> +
> +#include 
> +
> +static struct irq_desc *ipi_desc __read_mostly;
> +static int ipi_id __read_mostly;
> +static bool is_nmi __read_mostly;
> +
> +void arch_send_call_nmi_func_ipi_mask(cpumask_t *mask)
> +{
> + if (WARN_ON_ONCE(!ipi_desc))
> + return;
> +
> + __ipi_send_mask(ipi_desc, mask);
> +}
> +
> +static irqreturn_t ipi_nmi_handler(int irq, void *data)
> +{
> + /* nop, NMI handlers for special features can be added here. */
> +
> + return IRQ_HANDLED;
> +}
> +
> +void ipi_nmi_setup(int cpu)
> +{
> + if (!ipi_desc)
> + return;
> +
> + if (is_nmi) {
> + if (!prepare_percpu_nmi(ipi_id))
> + enable_percpu_nmi(ipi_id, IRQ_TYPE_NONE);
> + } else {
> + enable_percpu_irq(ipi_id, IRQ_TYPE_NONE);
> + }
> +}
> +
> +void ipi_nmi_teardown(int cpu)
> +{
> + if (!ipi_desc)
> + return;
> +
> + if (is_nmi) {
> + disable_percpu_nmi(ipi_id);
> + teardown_percpu_nmi(ipi_id);
> + } else {
> + disable_percpu_irq(ipi_id);
> + }
> +}
> +
> +void __init set_smp_ipi_nmi(int ipi)
> +{
> + int err;
> +
> + err = request_percpu_nmi(ipi, ipi_nmi_handler, "IPI", &cpu_number);
> + if (err) {
> + err = request_percpu_irq(ipi, ipi_nmi_handler, "IPI",
> +  &cpu_number);
> + WARN_ON(err);
> + is_nmi = false;
> + } else {
> + is_nmi = true;
> + }
> +
> + ipi_desc = irq_to_desc(ipi);
> + irq_set_status_flags(ipi, IRQ_HIDDEN);
> + ipi_id = ipi;
> +}
> -- 

Looks good to me. Please feel free to add:

Reviewed-by: Masayoshi Mizuma 

Thanks!
Masa


Re: [PATCH v6 17/25] PCI: Fix PREL32 relocations for LTO

2020-10-14 Thread Kees Cook
On Mon, Oct 12, 2020 at 05:31:55PM -0700, Sami Tolvanen wrote:
> With Clang's Link Time Optimization (LTO), the compiler can rename
> static functions to avoid global naming collisions. As PCI fixup
> functions are typically static, renaming can break references
> to them in inline assembly. This change adds a global stub to
> DECLARE_PCI_FIXUP_SECTION to fix the issue when PREL32 relocations
> are used.
> 
> Signed-off-by: Sami Tolvanen 
> Acked-by: Bjorn Helgaas 
> Reviewed-by: Kees Cook 

Another independent patch! :) Bjorn, since you've already Acked this
patch, would be be willing to pick it up for your tree?

-Kees

> ---
>  include/linux/pci.h | 19 ++-
>  1 file changed, 14 insertions(+), 5 deletions(-)
> 
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index 835530605c0d..4e64421981c7 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -1909,19 +1909,28 @@ enum pci_fixup_pass {
>  };
>  
>  #ifdef CONFIG_HAVE_ARCH_PREL32_RELOCATIONS
> -#define __DECLARE_PCI_FIXUP_SECTION(sec, name, vendor, device, class,
> \
> - class_shift, hook)  \
> - __ADDRESSABLE(hook) \
> +#define ___DECLARE_PCI_FIXUP_SECTION(sec, name, vendor, device, class,   
> \
> + class_shift, hook, stub)\
> + void stub(struct pci_dev *dev); \
> + void stub(struct pci_dev *dev)  \
> + {   \
> + hook(dev);  \
> + }   \
>   asm(".section " #sec ", \"a\"   \n" \
>   ".balign16  \n" \
>   ".short "   #vendor ", " #device "  \n" \
>   ".long "#class ", " #class_shift "  \n" \
> - ".long "#hook " - . \n" \
> + ".long "#stub " - . \n" \
>   ".previous  \n");
> +
> +#define __DECLARE_PCI_FIXUP_SECTION(sec, name, vendor, device, class,
> \
> +   class_shift, hook, stub)  \
> + ___DECLARE_PCI_FIXUP_SECTION(sec, name, vendor, device, class,  \
> +   class_shift, hook, stub)
>  #define DECLARE_PCI_FIXUP_SECTION(sec, name, vendor, device, class,  \
> class_shift, hook)\
>   __DECLARE_PCI_FIXUP_SECTION(sec, name, vendor, device, class,   \
> -   class_shift, hook)
> +   class_shift, hook, __UNIQUE_ID(hook))
>  #else
>  /* Anonymous variables would be nice... */
>  #define DECLARE_PCI_FIXUP_SECTION(section, name, vendor, device, class,  
> \
> -- 
> 2.28.0.1011.ga647a8990f-goog
> 

-- 
Kees Cook


Re: [RFC PATCH 1/4] Add a RPMSG driver for the APU in the mt8183

2020-10-14 Thread Mathieu Poirier
Hi Alexandre,

On Wed, Sep 30, 2020 at 01:53:47PM +0200, Alexandre Bailon wrote:
> This adds a driver to communicate with the APU available
> in the mt8183. The driver is generic and could be used for other APU.
> It mostly provides a userspace interface to send messages and
> and share big buffers with the APU.
> 
> Signed-off-by: Alexandre Bailon 
> ---
>  drivers/rpmsg/Kconfig  |   9 +
>  drivers/rpmsg/Makefile |   1 +
>  drivers/rpmsg/apu_rpmsg.c  | 606 +
>  drivers/rpmsg/apu_rpmsg.h  |  52 +++
>  include/uapi/linux/apu_rpmsg.h |  36 ++
>  5 files changed, 704 insertions(+)
>  create mode 100644 drivers/rpmsg/apu_rpmsg.c
>  create mode 100644 drivers/rpmsg/apu_rpmsg.h
>  create mode 100644 include/uapi/linux/apu_rpmsg.h
> 
> diff --git a/drivers/rpmsg/Kconfig b/drivers/rpmsg/Kconfig
> index f96716893c2a..3437c6fc8647 100644
> --- a/drivers/rpmsg/Kconfig
> +++ b/drivers/rpmsg/Kconfig
> @@ -64,4 +64,13 @@ config RPMSG_VIRTIO
>   select RPMSG
>   select VIRTIO
>  
> +config RPMSG_APU
> + tristate "APU RPMSG driver"
> + help
> +   This provides a RPMSG driver that provides some facilities to
> +   communicate with an accelerated processing unit (APU).
> +   This creates one or more char files that could be used by userspace
> +   to send a message to an APU. In addition, this also take care of
> +   sharing the memory buffer with the APU.
> +
>  endmenu
> diff --git a/drivers/rpmsg/Makefile b/drivers/rpmsg/Makefile
> index ffe932ef6050..93e0f3de99c9 100644
> --- a/drivers/rpmsg/Makefile
> +++ b/drivers/rpmsg/Makefile
> @@ -8,3 +8,4 @@ obj-$(CONFIG_RPMSG_QCOM_GLINK_RPM) += qcom_glink_rpm.o
>  obj-$(CONFIG_RPMSG_QCOM_GLINK_SMEM) += qcom_glink_smem.o
>  obj-$(CONFIG_RPMSG_QCOM_SMD) += qcom_smd.o
>  obj-$(CONFIG_RPMSG_VIRTIO)   += virtio_rpmsg_bus.o
> +obj-$(CONFIG_RPMSG_APU)  += apu_rpmsg.o
> diff --git a/drivers/rpmsg/apu_rpmsg.c b/drivers/rpmsg/apu_rpmsg.c
> new file mode 100644
> index ..5131b8b8e1f2
> --- /dev/null
> +++ b/drivers/rpmsg/apu_rpmsg.c
> @@ -0,0 +1,606 @@
> +// SPDX-License-Identifier: GPL-2.0
> +//
> +// Copyright 2020 BayLibre SAS
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include "rpmsg_internal.h"
> +
> +#include 
> +
> +#include "apu_rpmsg.h"
> +
> +/* Maximum of APU devices supported */
> +#define APU_DEV_MAX 2
> +
> +#define dev_to_apu(dev) container_of(dev, struct rpmsg_apu, dev)
> +#define cdev_to_apu(i_cdev) container_of(i_cdev, struct rpmsg_apu, cdev)
> +
> +struct rpmsg_apu {
> + struct rpmsg_device *rpdev;
> + struct cdev cdev;
> + struct device dev;
> +
> + struct rproc *rproc;
> + struct iommu_domain *domain;
> + struct iova_domain *iovad;
> + int iova_limit_pfn;
> +};
> +
> +struct rpmsg_request {
> + struct completion completion;
> + struct list_head node;
> + void *req;
> +};
> +
> +struct apu_buffer {
> + int fd;
> + struct dma_buf *dma_buf;
> + struct dma_buf_attachment *attachment;
> + struct sg_table *sg_table;
> + u32 iova;
> +};
> +
> +/*
> + * Shared IOVA domain.
> + * The MT8183 has two VP6 core but they are sharing the IOVA.
> + * They could be used alone, or together. In order to avoid conflict,
> + * create an IOVA domain that could be shared by those two core.
> + * @iovad: The IOVA domain to share between the APU cores
> + * @refcount: Allow to automatically release the IOVA domain once all the APU
> + *cores has been stopped
> + */
> +struct apu_iova_domain {
> + struct iova_domain iovad;
> + struct kref refcount;
> +};
> +
> +static dev_t rpmsg_major;
> +static DEFINE_IDA(rpmsg_ctrl_ida);
> +static DEFINE_IDA(rpmsg_minor_ida);
> +static DEFINE_IDA(req_ida);
> +static LIST_HEAD(requests);
> +static struct apu_iova_domain *apu_iovad;
> +
> +static int apu_rpmsg_callback(struct rpmsg_device *dev, void *data, int 
> count,
> +   void *priv, u32 addr)
> +{
> + struct rpmsg_request *rpmsg_req;
> + struct apu_dev_request *hdr = data;
> +
> + list_for_each_entry(rpmsg_req, &requests, node) {
> + struct apu_dev_request *tmp_hdr = rpmsg_req->req;
> +
> + if (hdr->id == tmp_hdr->id) {
> + memcpy(rpmsg_req->req, data, count);
> + complete(&rpmsg_req->completion);
> +
> + return 0;
> + }
> + }
> +
> + return 0;
> +}
> +
> +static int apu_device_memory_map(struct rpmsg_apu *apu,
> +  struct apu_buffer *buffer)
> +{
> + struct rpmsg_device *rpdev = apu->rpdev;
> + phys_addr_t phys;
> + int total_buf_space;
> + int iova_pfn;
> + int ret;
> +
> + if (!buffer->fd)
> + return 0;
> +
> + buffer->dma_buf = dma_buf_get(buffer->fd);
> + if (IS_ERR(buffer->dma_buf)) {
> +  

Re: [PATCH v5 2/5] irqchip/gic-v3: Enable support for SGIs to act as NMIs

2020-10-14 Thread Masayoshi Mizuma
On Wed, Oct 14, 2020 at 04:42:08PM +0530, Sumit Garg wrote:
> Add support to handle SGIs as regular NMIs. As SGIs or IPIs defaults to a
> special flow handler: handle_percpu_devid_fasteoi_ipi(), so skip NMI
> handler update in case of SGIs.
> 
> Also, enable NMI support prior to gic_smp_init() as allocation of SGIs
> as IRQs/NMIs happen as part of this routine.
> 
> Signed-off-by: Sumit Garg 
> ---
>  drivers/irqchip/irq-gic-v3.c | 13 +++--
>  1 file changed, 11 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
> index 16fecc0..5efc865 100644
> --- a/drivers/irqchip/irq-gic-v3.c
> +++ b/drivers/irqchip/irq-gic-v3.c
> @@ -477,6 +477,11 @@ static int gic_irq_nmi_setup(struct irq_data *d)
>   if (WARN_ON(gic_irq(d) >= 8192))
>   return -EINVAL;
>  
> + if (get_intid_range(d) == SGI_RANGE) {
> + gic_irq_set_prio(d, GICD_INT_NMI_PRI);
> + return 0;
> + }
> +
>   /* desc lock should already be held */
>   if (gic_irq_in_rdist(d)) {
>   u32 idx = gic_get_ppi_index(d);
> @@ -514,6 +519,11 @@ static void gic_irq_nmi_teardown(struct irq_data *d)
>   if (WARN_ON(gic_irq(d) >= 8192))
>   return;
>  
> + if (get_intid_range(d) == SGI_RANGE) {
> + gic_irq_set_prio(d, GICD_INT_DEF_PRI);
> + return;
> + }
> +
>   /* desc lock should already be held */
>   if (gic_irq_in_rdist(d)) {
>   u32 idx = gic_get_ppi_index(d);
> @@ -1708,6 +1718,7 @@ static int __init gic_init_bases(void __iomem 
> *dist_base,
>  
>   gic_dist_init();
>   gic_cpu_init();
> + gic_enable_nmi_support();
>   gic_smp_init();
>   gic_cpu_pm_init();
>  
> @@ -1719,8 +1730,6 @@ static int __init gic_init_bases(void __iomem 
> *dist_base,
>   gicv2m_init(handle, gic_data.domain);
>   }
>  
> - gic_enable_nmi_support();
> -
>   return 0;
>  
>  out_free:
> -- 

Looks good to me. Please feel free to add:

Reviewed-by: Masayoshi Mizuma 

Thanks!
Masa


Re: [PATCH v6 3/4] rcu/trace: Add tracing for how segcb list changes

2020-10-14 Thread joel
On Wed, Oct 14, 2020 at 08:52:17PM +0530, Neeraj Upadhyay wrote:
> 
> 
> On 9/23/2020 8:52 PM, Joel Fernandes (Google) wrote:
> > Track how the segcb list changes before/after acceleration, during
> > queuing and during dequeuing.
> > 
> > This has proved useful to discover an optimization to avoid unwanted GP
> > requests when there are no callbacks accelerated. The overhead is minimal as
> > each segment's length is now stored in the respective segment.
> > 
> > Signed-off-by: Joel Fernandes (Google) 
> > ---
> >   include/trace/events/rcu.h | 25 +
> >   kernel/rcu/rcu_segcblist.c | 34 ++
> >   kernel/rcu/rcu_segcblist.h |  5 +
> >   kernel/rcu/tree.c  |  9 +
> >   4 files changed, 73 insertions(+)
> > 
> > diff --git a/include/trace/events/rcu.h b/include/trace/events/rcu.h
> > index 155b5cb43cfd..7b84df3c95df 100644
> > --- a/include/trace/events/rcu.h
> > +++ b/include/trace/events/rcu.h
> > @@ -505,6 +505,31 @@ TRACE_EVENT_RCU(rcu_callback,
> >   __entry->qlen)
> >   );
> > +TRACE_EVENT_RCU(rcu_segcb,
> > +
> > +   TP_PROTO(const char *ctx, int *cb_count, unsigned long *gp_seq),
> > +
> > +   TP_ARGS(ctx, cb_count, gp_seq),
> > +
> > +   TP_STRUCT__entry(
> > +   __field(const char *, ctx)
> > +   __array(int, cb_count, 4)
> > +   __array(unsigned long, gp_seq, 4)
> 
> Use RCU_CBLIST_NSEGS in place of 4 ?

Done.

> > +   ),
> > +
> > +   TP_fast_assign(
> > +   __entry->ctx = ctx;
> > +   memcpy(__entry->cb_count, cb_count, 4 * sizeof(int));
> > +   memcpy(__entry->gp_seq, gp_seq, 4 * sizeof(unsigned 
> > long));
> > +   ),
> > +
> > +   TP_printk("%s cb_count: (DONE=%d, WAIT=%d, NEXT_READY=%d, 
> > NEXT=%d) "
> > + "gp_seq: (DONE=%lu, WAIT=%lu, NEXT_READY=%lu, 
> > NEXT=%lu)", __entry->ctx,
> > + __entry->cb_count[0], __entry->cb_count[1], 
> > __entry->cb_count[2], __entry->cb_count[3],
> > + __entry->gp_seq[0], __entry->gp_seq[1], 
> > __entry->gp_seq[2], __entry->gp_seq[3])
> > +
> > +);
> > +
> >   /*
> >* Tracepoint for the registration of a single RCU callback of the special
> >* kvfree() form.  The first argument is the RCU type, the second argument
> > diff --git a/kernel/rcu/rcu_segcblist.c b/kernel/rcu/rcu_segcblist.c
> > index 0e6d19bd3de9..df0f31e30947 100644
> > --- a/kernel/rcu/rcu_segcblist.c
> > +++ b/kernel/rcu/rcu_segcblist.c
> > @@ -13,6 +13,7 @@
> >   #include 
> >   #include "rcu_segcblist.h"
> > +#include "rcu.h"
> >   /* Initialize simple callback list. */
> >   void rcu_cblist_init(struct rcu_cblist *rclp)
> > @@ -343,6 +344,39 @@ void rcu_segcblist_extract_done_cbs(struct 
> > rcu_segcblist *rsclp,
> > rcu_segcblist_set_seglen(rsclp, RCU_DONE_TAIL, 0);
> >   }
> > +/*
> > + * Return how many CBs each segment along with their gp_seq values.
> > + *
> > + * This function is O(N) where N is the number of callbacks. Only used from
> 
> N is number of segments?

Yes, will fix.

> > + * tracing code which is usually disabled in production.
> > + */
> > +#ifdef CONFIG_RCU_TRACE
> > +static void rcu_segcblist_countseq(struct rcu_segcblist *rsclp,
> > +int cbcount[RCU_CBLIST_NSEGS],
> > +unsigned long gpseq[RCU_CBLIST_NSEGS])
> > +{
> > +   int i;
> > +
> > +   for (i = 0; i < RCU_CBLIST_NSEGS; i++)
> > +   cbcount[i] = 0;
> > +
> 
> What is the reason for initializing to 0?

You are right, not needed. I'll remove.

> > +   for (i = 0; i < RCU_CBLIST_NSEGS; i++) {
> > +   cbcount[i] = rcu_segcblist_get_seglen(rsclp, i);
> > +   gpseq[i] = rsclp->gp_seq[i];
> > +   }
> > +}
> > +
> > +void trace_rcu_segcb_list(struct rcu_segcblist *rsclp, char *context)
> > +{
> > +   int cbs[RCU_CBLIST_NSEGS];
> > +   unsigned long gps[RCU_CBLIST_NSEGS];
> > +
> > +   rcu_segcblist_countseq(rsclp, cbs, gps);
> > +
> > +   trace_rcu_segcb(context, cbs, gps);
> > +}
> > +#endif
> > +
> >   /*
> >* Extract only those callbacks still pending (not yet ready to be
> >* invoked) from the specified rcu_segcblist structure and place them in
> > diff --git a/kernel/rcu/rcu_segcblist.h b/kernel/rcu/rcu_segcblist.h
> > index 3e0eb1056ae9..15c10d30f88c 100644
> > --- a/kernel/rcu/rcu_segcblist.h
> > +++ b/kernel/rcu/rcu_segcblist.h
> > @@ -103,3 +103,8 @@ void rcu_segcblist_advance(struct rcu_segcblist *rsclp, 
> > unsigned long seq);
> >   bool rcu_segcblist_accelerate(struct rcu_segcblist *rsclp, unsigned long 
> > seq);
> >   void rcu_segcblist_merge(struct rcu_segcblist *dst_rsclp,
> >  struct rcu_segcblist *src_rsclp);
> > +#ifdef CONFIG_RCU_TRACE
> > +void trace_rcu_segcb_list(struct rcu_segcblist *rsclp, char *context);
> > +#else
> > +#define trace_rcu_segcb_list(...)
> > +#endif
> > diff --git a/kernel/rcu/t

Re: [PATCH] fs: fix NULL dereference due to data race in prepend_path()

2020-10-14 Thread Andrii Nakryiko
On Wed, Oct 14, 2020 at 4:08 PM Al Viro  wrote:
>
> On Wed, Oct 14, 2020 at 01:45:28PM -0700, Andrii Nakryiko wrote:
> > Fix data race in prepend_path() with re-reading mnt->mnt_ns twice without
> > holding the lock. is_mounted() does check for NULL, but 
> > is_anon_ns(mnt->mnt_ns)
> > might re-read the pointer again which could be NULL already, if in between
> > reads one of kern_unmount()/kern_unmount_array()/umount_tree() sets 
> > mnt->mnt_ns
> > to NULL.
>
> Cute...  What config/compiler has resulted in that?  I agree with the 
> analysis, but

Don't know for sure, but nothing special or exotic, a typical Facebook
production kernel config and some version of GCC (I didn't check
exactly which one).

Just given enough servers in the fleet, with time and intensive
workloads races like this, however unlikely, do surface up pretty
regularly.

> I really hate the open-coded (and completely unexplained) use of 
> IS_ERR_OR_NULL()
> there.
>
> > - if (is_mounted(vfsmnt) && !is_anon_ns(mnt->mnt_ns))
> > + mnt_ns = READ_ONCE(mnt->mnt_ns);
> > + /* open-coded is_mounted() to use local mnt_ns */
> > + if (!IS_ERR_OR_NULL(mnt_ns) && !is_anon_ns(mnt_ns))
> >   error = 1;  // absolute root
> >   else
> >   error = 2;  // detached or not attached 
> > yet
>
> Better turn that into an inlined helper in fs/mount.h, next to is_mounted(), 
> IMO,
> and kill that IS_ERR_OR_NULL garbage there.  What that thing does is
> if (ns == NULL || ns == MNT_NS_INTERNAL)
> and it's *not* on any kind of hot path to warrant that kind of 
> microoptimizations.

Sounds good. I didn't know code well enough to infer NULL ||
MNT_NS_INTERNAL instead of IS_ERR_OR_NULL from is_mounted(), so just
open-coded the latter.

>
> So let's make that
>
> static inline bool is_real_ns(struct mnt_namespace *mnt_ns)
> {
> return mnt_ns && mnt_ns != MNT_NS_INTERNAL;
> }
>
> turn is_mounted(m) into is_real_ns(m->mnt_ns) and replace that line in your 
> fix
> with
> if (is_real_ns(mnt_ns) && !is_anon_ns(mnt_ns))
>
> Objections?

Nope, I'll send a follow-up patch, thanks.


Re: [PATCH v2 1/7] staging: qlge: replace ql_* with qlge_* to avoid namespace clashes with other qlogic drivers

2020-10-14 Thread Benjamin Poirier
On 2020-10-14 18:43 +0800, Coiby Xu wrote:
> To avoid namespace clashes with other qlogic drivers and also for the
> sake of naming consistency, use the "qlge_" prefix as suggested in
> drivers/staging/qlge/TODO.
> 
> Suggested-by: Benjamin Poirier 
> Signed-off-by: Coiby Xu 
> ---
>  drivers/staging/qlge/TODO   |4 -
>  drivers/staging/qlge/qlge.h |  190 ++--
>  drivers/staging/qlge/qlge_dbg.c | 1073 ---
>  drivers/staging/qlge/qlge_ethtool.c |  231 ++---
>  drivers/staging/qlge/qlge_main.c| 1257 +--
>  drivers/staging/qlge/qlge_mpi.c |  352 
>  6 files changed, 1551 insertions(+), 1556 deletions(-)
> 
> diff --git a/drivers/staging/qlge/TODO b/drivers/staging/qlge/TODO
> index f93f7428f5d5..5ac55664c3e2 100644
> --- a/drivers/staging/qlge/TODO
> +++ b/drivers/staging/qlge/TODO
> @@ -28,10 +28,6 @@
>  * the driver has a habit of using runtime checks where compile time checks 
> are
>possible (ex. ql_free_rx_buffers(), ql_alloc_rx_buffers())
>  * reorder struct members to avoid holes if it doesn't impact performance
> -* in terms of namespace, the driver uses either qlge_, ql_ (used by
> -  other qlogic drivers, with clashes, ex: ql_sem_spinlock) or nothing (with
> -  clashes, ex: struct ob_mac_iocb_req). Rename everything to use the "qlge_"
> -  prefix.

You only renamed ql -> qlge. The prefix needs to be added where there is
currently none like the second example of that text.

Besides, the next patch reintroduces the name struct ql_adapter.


Re: [PATCH v6 14/25] kbuild: lto: remove duplicate dependencies from .mod files

2020-10-14 Thread Kees Cook
On Mon, Oct 12, 2020 at 05:31:52PM -0700, Sami Tolvanen wrote:
> With LTO, llvm-nm prints out symbols for each archive member
> separately, which results in a lot of duplicate dependencies in the
> .mod file when CONFIG_TRIM_UNUSED_SYMS is enabled. When a module
> consists of several compilation units, the output can exceed the
> default xargs command size limit and split the dependency list to
> multiple lines, which results in used symbols getting trimmed.
> 
> This change removes duplicate dependencies, which will reduce the
> probability of this happening and makes .mod files smaller and
> easier to read.
> 
> Signed-off-by: Sami Tolvanen 
> Reviewed-by: Kees Cook 

Hi Masahiro,

This appears to be a general improvement as well. This looks like it can
land without depending on the rest of the series.

-Kees

> ---
>  scripts/Makefile.build | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/scripts/Makefile.build b/scripts/Makefile.build
> index ab0ddf4884fd..96d6c9e18901 100644
> --- a/scripts/Makefile.build
> +++ b/scripts/Makefile.build
> @@ -266,7 +266,7 @@ endef
>  
>  # List module undefined symbols (or empty line if not enabled)
>  ifdef CONFIG_TRIM_UNUSED_KSYMS
> -cmd_undef_syms = $(NM) $< | sed -n 's/^  *U //p' | xargs echo
> +cmd_undef_syms = $(NM) $< | sed -n 's/^  *U //p' | sort -u | xargs echo
>  else
>  cmd_undef_syms = echo
>  endif
> -- 
> 2.28.0.1011.ga647a8990f-goog
> 

-- 
Kees Cook


Re: [PATCH] drm/amdgpu: add missing newline at eof

2020-10-14 Thread Alex Deucher
On Wed, Oct 14, 2020 at 5:18 PM  wrote:
>
> From: Tom Rix 
>
> Representative checkpatch.pl warning
>
> WARNING: adding a line without newline at end of file
>  30: FILE: drivers/gpu/drm/amd/pm/swsmu/smu11/vangogh_ppt.h:30:
> +#endif
>
> Signed-off-by: Tom Rix 

Applied.  Thanks!

Alex

> ---
>  drivers/gpu/drm/amd/include/asic_reg/gc/gc_9_4_1_sh_mask.h | 2 +-
>  drivers/gpu/drm/amd/pm/swsmu/smu11/vangogh_ppt.h   | 2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/include/asic_reg/gc/gc_9_4_1_sh_mask.h 
> b/drivers/gpu/drm/amd/include/asic_reg/gc/gc_9_4_1_sh_mask.h
> index f26246a600c6..4089cfa081f5 100644
> --- a/drivers/gpu/drm/amd/include/asic_reg/gc/gc_9_4_1_sh_mask.h
> +++ b/drivers/gpu/drm/amd/include/asic_reg/gc/gc_9_4_1_sh_mask.h
> @@ -745,4 +745,4 @@
>  #define RLC_EDC_CNT2__RLC_SPM_SE7_SCRATCH_RAM_SEC_COUNT_MASK 
>  0x3000L
>  #define RLC_EDC_CNT2__RLC_SPM_SE7_SCRATCH_RAM_DED_COUNT_MASK 
>  0xC000L
>
> -#endif
> \ No newline at end of file
> +#endif
> diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11/vangogh_ppt.h 
> b/drivers/gpu/drm/amd/pm/swsmu/smu11/vangogh_ppt.h
> index 29929b360db8..d8696e2274c4 100644
> --- a/drivers/gpu/drm/amd/pm/swsmu/smu11/vangogh_ppt.h
> +++ b/drivers/gpu/drm/amd/pm/swsmu/smu11/vangogh_ppt.h
> @@ -27,4 +27,4 @@
>
>  extern void vangogh_set_ppt_funcs(struct smu_context *smu);
>
> -#endif
> \ No newline at end of file
> +#endif
> --
> 2.18.1
>
> ___
> dri-devel mailing list
> dri-de...@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [PATCH v3 01/11] counters: Introduce counter_atomic* counters

2020-10-14 Thread Shuah Khan

On 10/13/20 5:27 AM, Mauro Carvalho Chehab wrote:

Em Fri,  9 Oct 2020 09:55:56 -0600
Shuah Khan  escreveu:


Introduce Simple atomic counters.

There are a number of atomic_t usages in the kernel where atomic_t api
is used strictly for counting and not for managing object lifetime. In
some cases, atomic_t might not even be needed.

The purpose of these counters is to clearly differentiate atomic_t
counters from atomic_t usages that guard object lifetimes, hence prone
to overflow and underflow errors. It allows tools that scan for underflow
and overflow on atomic_t usages to detect overflow and underflows to scan
just the cases that are prone to errors.

Simple atomic counters api provides interfaces for simple atomic counters
that just count, and don't guard resource lifetimes. The interfaces are
built on top of atomic_t api, providing a smaller subset of atomic_t
interfaces necessary to support simple counters.

Counter wraps around to INT_MIN when it overflows and should not be used
to guard resource lifetimes, device usage and open counts that control
state changes, and pm states. Overflowing to INT_MIN is consistent with
the atomic_t api, which it is built on top of.

Using counter_atomic* to guard lifetimes could lead to use-after free
when it overflows and undefined behavior when used to manage state
changes and device usage/open states.

Reviewed-by: Greg Kroah-Hartman 
Signed-off-by: Shuah Khan 


Did you try building this with htmldocs? It produces 3 new warnings
due to wrong usage of the "ref" tag:

.../Documentation/core-api/counters.rst:46: WARNING: undefined label: 
test counters module (if the link has no caption the label must precede a 
section header)
.../Documentation/core-api/counters.rst:49: WARNING: undefined label: 
selftest for counters (if the link has no caption the label must precede a 
section header)
.../Documentation/core-api/counters.rst:62: WARNING: undefined label: 
atomic_ops (if the link has no caption the label must precede a section header)



I added the document to patch 1/11 and the referenced file gets added
in 2/11. Poor planning on my part.

I will fix it.


(plus another one that I'll be sending a fixup patch anytime soon)



Thanks for the fix.


Are those referring to some documents that don't exist yet uptream?
Or are you trying to force Sphinx to generate a cross-reference for
a C file at the tree? If it is the latter, then this won't work,
as it will only generate cross-references for files that are placed
inside the documentation output dir (Documentation/output, by default).



Th first two aren't in upstream. atomic_ops exists - I will double check
the link.

thanks,
-- Shuah


[PATCH v3 1/2] mmap locking API: Order lock of nascent mm outside lock of live mm

2020-10-14 Thread Jann Horn
Until now, the mmap lock of the nascent mm was ordered inside the mmap lock
of the old mm (in dup_mmap() and in UML's activate_mm()).
A following patch will change the exec path to very broadly lock the
nascent mm, but fine-grained locking should still work at the same time for
the old mm.

In particular, mmap locking calls are hidden behind the copy_from_user()
calls and such that are reached through functions like copy_strings() -
when a page fault occurs on a userspace memory access, the mmap lock
will be taken.

To do this in a way that lockdep is happy about, let's turn around the lock
ordering in both places that currently nest the locks.
Since SINGLE_DEPTH_NESTING is normally used for the inner nesting layer,
make up our own lock subclass MMAP_LOCK_SUBCLASS_NASCENT and use that
instead.
The added locking calls in exec_mmap() are temporary; the following patch
will move the locking out of exec_mmap().

As Johannes Berg pointed out[1][2], moving the mmap locking of
arch/um/'s activate_mm() up into the execve code also fixes an issue that
would've caused a scheduling-in-atomic bug due to mmap_write_lock_nested()
while holding a spinlock if UM had support for voluntary preemption.
(Even when a semaphore is uncontended, locking it can still trigger
rescheduling via might_sleep().)

[1] 
https://lore.kernel.org/linux-mm/115d17aa221b73a479e26ffee52899ddb18b1f53.ca...@sipsolutions.net/
[2] 
https://lore.kernel.org/linux-mm/7b7d6954b74e109e653539d880173fa9cb5c5ddf.ca...@sipsolutions.net/

Signed-off-by: Jann Horn 
---
 arch/um/include/asm/mmu_context.h |  3 +--
 fs/exec.c |  4 
 include/linux/mmap_lock.h | 23 +--
 kernel/fork.c |  7 ++-
 4 files changed, 28 insertions(+), 9 deletions(-)

diff --git a/arch/um/include/asm/mmu_context.h 
b/arch/um/include/asm/mmu_context.h
index 17ddd4edf875..c13bc5150607 100644
--- a/arch/um/include/asm/mmu_context.h
+++ b/arch/um/include/asm/mmu_context.h
@@ -48,9 +48,8 @@ static inline void activate_mm(struct mm_struct *old, struct 
mm_struct *new)
 * when the new ->mm is used for the first time.
 */
__switch_mm(&new->context.id);
-   mmap_write_lock_nested(new, SINGLE_DEPTH_NESTING);
+   mmap_assert_write_locked(new);
uml_setup_stubs(new);
-   mmap_write_unlock(new);
 }
 
 static inline void switch_mm(struct mm_struct *prev, struct mm_struct *next, 
diff --git a/fs/exec.c b/fs/exec.c
index a91003e28eaa..229dbc7aa61a 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -1114,6 +1114,8 @@ static int exec_mmap(struct mm_struct *mm)
if (ret)
return ret;
 
+   mmap_write_lock_nascent(mm);
+
if (old_mm) {
/*
 * Make sure that if there is a core dump in progress
@@ -1125,6 +1127,7 @@ static int exec_mmap(struct mm_struct *mm)
if (unlikely(old_mm->core_state)) {
mmap_read_unlock(old_mm);
mutex_unlock(&tsk->signal->exec_update_mutex);
+   mmap_write_unlock(mm);
return -EINTR;
}
}
@@ -1138,6 +1141,7 @@ static int exec_mmap(struct mm_struct *mm)
tsk->mm->vmacache_seqnum = 0;
vmacache_flush(tsk);
task_unlock(tsk);
+   mmap_write_unlock(mm);
if (old_mm) {
mmap_read_unlock(old_mm);
BUG_ON(active_mm != old_mm);
diff --git a/include/linux/mmap_lock.h b/include/linux/mmap_lock.h
index 0707671851a8..24de1fe99ee4 100644
--- a/include/linux/mmap_lock.h
+++ b/include/linux/mmap_lock.h
@@ -3,6 +3,18 @@
 
 #include 
 
+/*
+ * Lock subclasses for the mmap_lock.
+ *
+ * MMAP_LOCK_SUBCLASS_NASCENT is for core kernel code that wants to lock an mm
+ * that is still being constructed and wants to be able to access the active mm
+ * normally at the same time. It nests outside MMAP_LOCK_SUBCLASS_NORMAL.
+ */
+enum {
+   MMAP_LOCK_SUBCLASS_NORMAL = 0,
+   MMAP_LOCK_SUBCLASS_NASCENT
+};
+
 #define MMAP_LOCK_INITIALIZER(name) \
.mmap_lock = __RWSEM_INITIALIZER((name).mmap_lock),
 
@@ -16,9 +28,16 @@ static inline void mmap_write_lock(struct mm_struct *mm)
down_write(&mm->mmap_lock);
 }
 
-static inline void mmap_write_lock_nested(struct mm_struct *mm, int subclass)
+/*
+ * Lock an mm_struct that is still being set up (during fork or exec).
+ * This nests outside the mmap locks of live mm_struct instances.
+ * No interruptible/killable versions exist because at the points where you're
+ * supposed to use this helper, the mm isn't visible to anything else, so we
+ * expect the mmap_lock to be uncontended.
+ */
+static inline void mmap_write_lock_nascent(struct mm_struct *mm)
 {
-   down_write_nested(&mm->mmap_lock, subclass);
+   down_write_nested(&mm->mmap_lock, MMAP_LOCK_SUBCLASS_NASCENT);
 }
 
 static inline int mmap_write_lock_killable(struct mm_struct *mm)
diff --git a/kernel/fork.c b/kernel/fork.c

Re: [PATCH] compiler.h: Fix barrier_data() on clang

2020-10-14 Thread Nick Desaulniers
On Wed, Oct 14, 2020 at 2:26 PM Arvind Sankar  wrote:
>
> Commit
>   815f0ddb346c ("include/linux/compiler*.h: make compiler-*.h mutually 
> exclusive")
>
> neglected to copy barrier_data() from compiler-gcc.h into
> compiler-clang.h. The definition in compiler-gcc.h was really to work
> around clang's more aggressive optimization, so this broke
> barrier_data() on clang, and consequently memzero_explicit() as well.
>
> For example, this results in at least the memzero_explicit() call in
> lib/crypto/sha256.c:sha256_transform() being optimized away by clang.
>
> Fix this by moving the definition of barrier_data() into compiler.h.
>
> Also move the gcc/clang definition of barrier() into compiler.h,
> __memory_barrier() is icc-specific (and barrier() is already defined
> using it in compiler-intel.h) and doesn't belong in compiler.h.
>
> Signed-off-by: Arvind Sankar 
> Fixes: 815f0ddb346c ("include/linux/compiler*.h: make compiler-*.h mutually 
> exclusive")

Thanks for the patch! Curious how you spotted this? My mistake for
missing it.  Definite difference in the disassembly before/after.

Cc: sta...@vger.kernel.org
Reviewed-by: Nick Desaulniers 
Tested-by: Nick Desaulniers 

akpm@ would you mind picking this up when you have a chance?

See also:
commit 7829fb09a2b4 ("lib: make memzero_explicit more robust against
dead store elimination")

I'm pretty sure `man 3 explicit_bzero` was created in libc for this
exact problem, though the manual page is an interesting read.

> ---
>  include/linux/compiler-clang.h |  6 --
>  include/linux/compiler-gcc.h   | 19 ---
>  include/linux/compiler.h   | 18 --
>  3 files changed, 16 insertions(+), 27 deletions(-)
>
> diff --git a/include/linux/compiler-clang.h b/include/linux/compiler-clang.h
> index cee0c728d39a..04c0a5a717f7 100644
> --- a/include/linux/compiler-clang.h
> +++ b/include/linux/compiler-clang.h
> @@ -52,12 +52,6 @@
>  #define COMPILER_HAS_GENERIC_BUILTIN_OVERFLOW 1
>  #endif
>
> -/* The following are for compatibility with GCC, from compiler-gcc.h,
> - * and may be redefined here because they should not be shared with other
> - * compilers, like ICC.
> - */
> -#define barrier() __asm__ __volatile__("" : : : "memory")
> -
>  #if __has_feature(shadow_call_stack)
>  # define __noscs   __attribute__((__no_sanitize__("shadow-call-stack")))
>  #endif
> diff --git a/include/linux/compiler-gcc.h b/include/linux/compiler-gcc.h
> index 7a3769040d7d..fda30ffb037b 100644
> --- a/include/linux/compiler-gcc.h
> +++ b/include/linux/compiler-gcc.h
> @@ -15,25 +15,6 @@
>  # error Sorry, your compiler is too old - please upgrade it.
>  #endif
>
> -/* Optimization barrier */
> -
> -/* The "volatile" is due to gcc bugs */
> -#define barrier() __asm__ __volatile__("": : :"memory")
> -/*
> - * This version is i.e. to prevent dead stores elimination on @ptr
> - * where gcc and llvm may behave differently when otherwise using
> - * normal barrier(): while gcc behavior gets along with a normal
> - * barrier(), llvm needs an explicit input variable to be assumed
> - * clobbered. The issue is as follows: while the inline asm might
> - * access any memory it wants, the compiler could have fit all of
> - * @ptr into memory registers instead, and since @ptr never escaped
> - * from that, it proved that the inline asm wasn't touching any of
> - * it. This version works well with both compilers, i.e. we're telling
> - * the compiler that the inline asm absolutely may see the contents
> - * of @ptr. See also: https://llvm.org/bugs/show_bug.cgi?id=15495
> - */
> -#define barrier_data(ptr) __asm__ __volatile__("": :"r"(ptr) :"memory")
> -
>  /*
>   * This macro obfuscates arithmetic on a variable address so that gcc
>   * shouldn't recognize the original var, and make assumptions about it.
> diff --git a/include/linux/compiler.h b/include/linux/compiler.h
> index 92ef163a7479..dfba70b2644f 100644
> --- a/include/linux/compiler.h
> +++ b/include/linux/compiler.h
> @@ -80,11 +80,25 @@ void ftrace_likely_update(struct ftrace_likely_data *f, 
> int val,
>
>  /* Optimization barrier */
>  #ifndef barrier
> -# define barrier() __memory_barrier()
> +/* The "volatile" is due to gcc bugs */
> +# define barrier() __asm__ __volatile__("": : :"memory")
>  #endif
>
>  #ifndef barrier_data
> -# define barrier_data(ptr) barrier()
> +/*
> + * This version is i.e. to prevent dead stores elimination on @ptr
> + * where gcc and llvm may behave differently when otherwise using
> + * normal barrier(): while gcc behavior gets along with a normal
> + * barrier(), llvm needs an explicit input variable to be assumed
> + * clobbered. The issue is as follows: while the inline asm might
> + * access any memory it wants, the compiler could have fit all of
> + * @ptr into memory registers instead, and since @ptr never escaped
> + * from that, it proved that the inline asm wasn't touching any of
> + * it. This version works well with both compilers, i.e. we're telling
> + * the 

Re: [GIT PULL] Btrfs updates for 5.10

2020-10-14 Thread Stephen Rothwell
Hi David,

On Mon, 12 Oct 2020 22:25:02 +0200 David Sterba  wrote:
>
> Mishaps:
> 
> - commit 62cf5391209a ("btrfs: move btrfs_rm_dev_replace_free_srcdev
>   outside of all locks") is a rebase leftover after the patch got
>   merged to 5.9-rc8 as a466c85edc6f ("btrfs: move
>   btrfs_rm_dev_replace_free_srcdev outside of all locks"), the
>   remaining part is trivial and the patch is in the middle of the
>   series so I'm keeping it there instead of rebasing

And yet, this entire pull request has been rebased since what was in
linux-next on Tuesday (and what would still be there today except I
dropped it because of several conflicts) ...  it looks like it was
rebased a week ago, but then never included in your "for-next" branch.
So I supposed it has had your internal testing, at least.

-- 
Cheers,
Stephen Rothwell


pgpo4GaGWyXIl.pgp
Description: OpenPGP digital signature


[PATCH v3 2/2] exec: Broadly lock nascent mm until setup_arg_pages()

2020-10-14 Thread Jann Horn
While AFAIK there currently is nothing that can modify the VMA tree of a
new mm until userspace has started running under the mm, we should properly
lock the mm here anyway, both to keep lockdep happy when adding locking
assertions and to be safe in the future in case someone e.g. decides to
permit VMA-tree-mutating operations in process_madvise_behavior_valid().

The goal of this patch is to broadly lock the nascent mm in the exec path,
from around the time it is created all the way to the end of
setup_arg_pages() (because setup_arg_pages() accesses bprm->vma).
As long as the mm is write-locked, keep it around in bprm->mm, even after
it has been installed on the task (with an extra reference on the mm, to
reduce complexity in free_bprm()).
After setup_arg_pages(), we have to unlock the mm so that APIs such as
copy_to_user() will work in the following binfmt-specific setup code.

Suggested-by: Jason Gunthorpe 
Suggested-by: Michel Lespinasse 
Signed-off-by: Jann Horn 
---
 fs/exec.c   | 68 -
 include/linux/binfmts.h |  2 +-
 2 files changed, 35 insertions(+), 35 deletions(-)

diff --git a/fs/exec.c b/fs/exec.c
index 229dbc7aa61a..00edf833781f 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -254,11 +254,6 @@ static int __bprm_mm_init(struct linux_binprm *bprm)
return -ENOMEM;
vma_set_anonymous(vma);
 
-   if (mmap_write_lock_killable(mm)) {
-   err = -EINTR;
-   goto err_free;
-   }
-
/*
 * Place the stack at the largest stack address the architecture
 * supports. Later, we'll move this to an appropriate place. We don't
@@ -276,12 +271,9 @@ static int __bprm_mm_init(struct linux_binprm *bprm)
goto err;
 
mm->stack_vm = mm->total_vm = 1;
-   mmap_write_unlock(mm);
bprm->p = vma->vm_end - sizeof(void *);
return 0;
 err:
-   mmap_write_unlock(mm);
-err_free:
bprm->vma = NULL;
vm_area_free(vma);
return err;
@@ -364,9 +356,9 @@ static int bprm_mm_init(struct linux_binprm *bprm)
struct mm_struct *mm = NULL;
 
bprm->mm = mm = mm_alloc();
-   err = -ENOMEM;
if (!mm)
-   goto err;
+   return -ENOMEM;
+   mmap_write_lock_nascent(mm);
 
/* Save current stack limit for all calculations made during exec. */
task_lock(current->group_leader);
@@ -374,17 +366,12 @@ static int bprm_mm_init(struct linux_binprm *bprm)
task_unlock(current->group_leader);
 
err = __bprm_mm_init(bprm);
-   if (err)
-   goto err;
-
-   return 0;
-
-err:
-   if (mm) {
-   bprm->mm = NULL;
-   mmdrop(mm);
-   }
+   if (!err)
+   return 0;
 
+   bprm->mm = NULL;
+   mmap_write_unlock(mm);
+   mmdrop(mm);
return err;
 }
 
@@ -735,6 +722,7 @@ static int shift_arg_pages(struct vm_area_struct *vma, 
unsigned long shift)
 /*
  * Finalizes the stack vm_area_struct. The flags and permissions are updated,
  * the stack is optionally relocated, and some extra space is added.
+ * At the end of this, the mm_struct will be unlocked on success.
  */
 int setup_arg_pages(struct linux_binprm *bprm,
unsigned long stack_top,
@@ -787,9 +775,6 @@ int setup_arg_pages(struct linux_binprm *bprm,
bprm->loader -= stack_shift;
bprm->exec -= stack_shift;
 
-   if (mmap_write_lock_killable(mm))
-   return -EINTR;
-
vm_flags = VM_STACK_FLAGS;
 
/*
@@ -807,7 +792,7 @@ int setup_arg_pages(struct linux_binprm *bprm,
ret = mprotect_fixup(vma, &prev, vma->vm_start, vma->vm_end,
vm_flags);
if (ret)
-   goto out_unlock;
+   return ret;
BUG_ON(prev != vma);
 
if (unlikely(vm_flags & VM_EXEC)) {
@@ -819,7 +804,7 @@ int setup_arg_pages(struct linux_binprm *bprm,
if (stack_shift) {
ret = shift_arg_pages(vma, stack_shift);
if (ret)
-   goto out_unlock;
+   return ret;
}
 
/* mprotect_fixup is overkill to remove the temporary stack flags */
@@ -846,11 +831,17 @@ int setup_arg_pages(struct linux_binprm *bprm,
current->mm->start_stack = bprm->p;
ret = expand_stack(vma, stack_base);
if (ret)
-   ret = -EFAULT;
+   return -EFAULT;
 
-out_unlock:
+   /*
+* From this point on, anything that wants to poke around in the
+* mm_struct must lock it by itself.
+*/
+   bprm->vma = NULL;
mmap_write_unlock(mm);
-   return ret;
+   mmput(mm);
+   bprm->mm = NULL;
+   return 0;
 }
 EXPORT_SYMBOL(setup_arg_pages);
 
@@ -1114,8 +1105,6 @@ static int exec_mmap(struct mm_struct *mm)
if (ret)
return ret;
 
-   mmap_write_lock_nascent(mm);
-
if (old_mm) {
 

[PATCH v2] i2c: i2c-mux-gpio: Enable this driver in ACPI land

2020-10-14 Thread Evan Green
Enable i2c-mux-gpio devices to be defined via ACPI. The idle-state
property translates directly to a fwnode_property_*() call. The child
reg property translates naturally into _ADR in ACPI.

The i2c-parent binding is a relic from the days when all direct children
of an i2c controller in Linux had to be i2c devices. These days that
implementation detail has been worked out, so the i2c-mux can sit
as a direct child of its parent controller, which is where it makes the
most sense from a hardware description perspective. For the ACPI
implementation we'll assume that's always how the i2c-mux-gpio is
instantiated.

Signed-off-by: Evan Green 
---

Changes in v2:
 - Make it compile properly when !CONFIG_ACPI (Randy)
 - Update commit message regarding i2c-parent (Peter)

 drivers/i2c/muxes/i2c-mux-gpio.c | 103 ++-
 1 file changed, 75 insertions(+), 28 deletions(-)

diff --git a/drivers/i2c/muxes/i2c-mux-gpio.c b/drivers/i2c/muxes/i2c-mux-gpio.c
index 4effe563e9e8d..8e4008f4a9b5d 100644
--- a/drivers/i2c/muxes/i2c-mux-gpio.c
+++ b/drivers/i2c/muxes/i2c-mux-gpio.c
@@ -49,34 +49,80 @@ static int i2c_mux_gpio_deselect(struct i2c_mux_core *muxc, 
u32 chan)
return 0;
 }
 
-#ifdef CONFIG_OF
-static int i2c_mux_gpio_probe_dt(struct gpiomux *mux,
-   struct platform_device *pdev)
+#ifdef CONFIG_ACPI
+
+static int i2c_mux_gpio_get_acpi_adr(struct device *dev,
+struct fwnode_handle *fwdev,
+unsigned int *adr)
+
+{
+   unsigned long long adr64;
+   acpi_status status;
+
+   status = acpi_evaluate_integer(ACPI_HANDLE_FWNODE(fwdev),
+  METHOD_NAME__ADR,
+  NULL, &adr64);
+
+   if (!ACPI_SUCCESS(status)) {
+   dev_err(dev, "Cannot get address");
+   return -EINVAL;
+   }
+
+   *adr = adr64;
+   return 0;
+}
+
+#else
+
+static int i2c_mux_gpio_get_acpi_adr(struct device *dev,
+struct fwnode_handle *fwdev,
+unsigned int *adr)
+{
+   return -EINVAL;
+}
+
+#endif
+
+static int i2c_mux_gpio_probe_fw(struct gpiomux *mux,
+struct platform_device *pdev)
 {
-   struct device_node *np = pdev->dev.of_node;
-   struct device_node *adapter_np, *child;
-   struct i2c_adapter *adapter;
+   struct device *dev = &pdev->dev;
+   struct device_node *np = dev->of_node;
+   acpi_handle dev_handle;
+   struct device_node *adapter_np;
+   struct i2c_adapter *adapter = NULL;
+   struct fwnode_handle *child = NULL;
unsigned *values;
-   int i = 0;
+   int rc, i = 0;
 
-   if (!np)
-   return -ENODEV;
+   if (is_of_node(dev->fwnode)) {
+   if (!np)
+   return -ENODEV;
 
-   adapter_np = of_parse_phandle(np, "i2c-parent", 0);
-   if (!adapter_np) {
-   dev_err(&pdev->dev, "Cannot parse i2c-parent\n");
-   return -ENODEV;
+   adapter_np = of_parse_phandle(np, "i2c-parent", 0);
+   if (!adapter_np) {
+   dev_err(&pdev->dev, "Cannot parse i2c-parent\n");
+   return -ENODEV;
+   }
+   adapter = of_find_i2c_adapter_by_node(adapter_np);
+   of_node_put(adapter_np);
+
+   } else if (is_acpi_node(dev->fwnode)) {
+   /*
+* In ACPI land the mux should be a direct child of the i2c
+* bus it muxes.
+*/
+   dev_handle = ACPI_HANDLE(dev->parent);
+   adapter = i2c_acpi_find_adapter_by_handle(dev_handle);
}
-   adapter = of_find_i2c_adapter_by_node(adapter_np);
-   of_node_put(adapter_np);
+
if (!adapter)
return -EPROBE_DEFER;
 
mux->data.parent = i2c_adapter_id(adapter);
put_device(&adapter->dev);
 
-   mux->data.n_values = of_get_child_count(np);
-
+   mux->data.n_values = device_get_child_node_count(dev);
values = devm_kcalloc(&pdev->dev,
  mux->data.n_values, sizeof(*mux->data.values),
  GFP_KERNEL);
@@ -85,24 +131,25 @@ static int i2c_mux_gpio_probe_dt(struct gpiomux *mux,
return -ENOMEM;
}
 
-   for_each_child_of_node(np, child) {
-   of_property_read_u32(child, "reg", values + i);
+   device_for_each_child_node(dev, child) {
+   if (is_of_node(child)) {
+   fwnode_property_read_u32(child, "reg", values + i);
+
+   } else if (is_acpi_node(child)) {
+   rc = i2c_mux_gpio_get_acpi_adr(dev, child, values + i);
+   if (rc)
+   return rc;
+   }
+
i++;
}
mux->dat

Re: [PATCH v2] MIPS: DEC: Restore bootmem reservation for firmware working memory area

2020-10-14 Thread Maciej W. Rozycki
On Thu, 15 Oct 2020, Serge Semin wrote:

> > Table 2-2  REX Memory Regions
> > -
> > StartingEnding
> > Region  Address Address Use
> > -
> > 0   0xa000  0xa000  Restart block, exception vectors,
> > REX stack and bss
> > 1   0xa001  0xa0017fff  Keyboard or tty drivers
> > 
> > 2   0xa0018000  0xa001f3ff 1)   CRT driver
> > 
> > 3   0xa002  0xa002  boot, cnfg, init and t objects
> > 
> > 4   0xa002  0xa002  64KB scratch space
> > -
> > 1) Note that the last 3 Kbytes of region 2 are reserved for backward
> > compatibility with previous system software.
> > -
> > 
> 
> ...
> 
> > @@ -146,6 +150,9 @@ void __init plat_mem_setup(void)
> >  
> > ioport_resource.start = ~0UL;
> > ioport_resource.end = 0UL;
> 
> > +
> > +   /* Stay away from the firmware working memory area for now. */
> > +   memblock_reserve(PHYS_OFFSET, __pa_symbol(&_text) - PHYS_OFFSET);
> 
> I am just wondering...
> Here we reserve a region within [PHYS_OFFSET, __pa_symbol(&_text)]. But then 
> in
> the prom_free_prom_memory() method we'll get to free a region as either
> [PAGE_SIZE, __pa_symbol(&_text)] or [PAGE_SIZE, __pa_symbol(&_text) - 
> 0x0002].
> 
> First of all the start address of the being freed region is PAGE_SIZE, which 
> doesn't
> take the PHYS_OFFSET into account. I assume it won't cause problems because
> PHYS_OFFSET is set to 0 for DEC. If so then we either shouldn't use 
> PHYS_OFFSET
> here or should take PHYS_OFFSET into account there on freeing or should just 
> forget
> about that since other than confusion it won't cause any problem.)
> (I should have posted this question to v1 of this patch instead of pointing 
> out
> on the confusing size argument of the memblock_reserve() method. Sorry about
> that.)

 Technically, yes, we could use PHYS_OFFSET here, though as a separate 
change, as it's not related to the regression addressed.

 Please mind that this is very old code, which long predates the existence 
of PHYS_OFFSET, introduced with commit 6f284a2ce7b8 ("[MIPS] FLATMEM: 
introduce PHYS_OFFSET.") back in 2007 only.  While `prom_free_prom_memory' 
code dates back to commit b5766e7e6177 ("o bootmem fixes for DECstations") 
(no proper change heading here; this is from CVS repo days) from 2000, and 
I fiddled with it myself not so long afterwards with commit e25ac8bd2505 
("DECstation fixes from Maciej") in 2001 (both in the old LMO GIT repo).  
So if anytime it should have been updated in the course of a code audit 
that was surely due across all platforms with the introduction of 
PHYS_OFFSET.

 Of course it doesn't mean it must not or cannot be fixed now, but it has 
been functionally correct even if semantically broken, so no one saw the 
urge to change it (let alone notice the problem in the first place -- you 
are the first one).

> Secondly why is PAGE_SIZE selected as the start address? It belongs to the
> Region 1 in accordance with "Table 2-2 REX Memory Regions" cited above. Thus 
> we
> get to keep reserved just a part of the Region 1. Most likely it's the restart
> block and the exception vectors. Even assuming that the DEC developers knew 
> what
> they were doing, I am wondering can we be sure that a single page is enough 
> for
> all that data?..

 Again this is so old as to predate the existence of synthesised exception 
handlers we currently use (which has been a blessing BTW), which I reckon 
take a little less space than the preassembled handlers we previously had 
used to, and stay well within even the smallest supported page size of 
4KiB.  Anything else we can just overwrite as we do not mean to call into 
the firmware at this point anymore; we couldn't trust it to do the right 
thing anyway once we have booted (e.g. not to keep interrupts masked for 
too long, etc.).

 I've been occasionally thinking about a better solution in place of the 
LANCE hack here, needed because the chip has only its low 16 out of 24 
address lines wired, due to a limitation of the IOASIC DMA controller (it 
also drives one half of the IOASIC's 16-bit data bus only, communicating 
with every other byte of system memory only and hence the requirement for 
a 128KiB allocation, with every other byte wasted, rather than a 64KiB 
one).

 Had all 24 lines been used, we could use dynamic ZONE_DMA buffers as with 
PC ISA DMA, as the LANCE implements proper DMA rings, but with low 16 only 
it would not really play.  I have not come up with any solution however 
that would be significantly better than what we have, and the current one 
works, so I have left it as it is.

 Do these 

[PATCH v3 0/2] Broad write-locking of nascent mm in execve

2020-10-14 Thread Jann Horn
v3:
 - add note about how this also fixes arch/um/ locking in patch 1
   (Johannes Berg)
 - use IS_DEFINED() instead of #ifdef in patch 2 (Jason Gunthorpe)
v2:
 - fix commit message of patch 1/2 and be more verbose about where
   the old mmap lock is taken (Michel, Jason)
 - resending without mangling the diffs :/ (Michel, Jason)

These two patches replace "mmap locking API: don't check locking
if the mm isn't live yet"[1], which is currently in the mmotm tree,
and should be placed in the same spot where the old patch was.

While I originally said that this would be an alternative
patch (meaning that the existing patch would have worked just
as well), the new patches actually address an additional issue
that the old patch missed (bprm->vma is used after the switch
to the new mm).

I have boot-tested these patches on x64-64 (with lockdep) and
!MMU arm (the latter with both FLAT and ELF).

[1] 
https://lkml.kernel.org/r/cag48ez03yjg9ju_6tgimcavjutyre_o4leq7901b5zocnna...@mail.gmail.com

Jann Horn (2):
  mmap locking API: Order lock of nascent mm outside lock of live mm
  exec: Broadly lock nascent mm until setup_arg_pages()

 arch/um/include/asm/mmu_context.h |  3 +-
 fs/exec.c | 64 ---
 include/linux/binfmts.h   |  2 +-
 include/linux/mmap_lock.h | 23 ++-
 kernel/fork.c |  7 +---
 5 files changed, 59 insertions(+), 40 deletions(-)


base-commit: fb0155a09b0224a7147cb07a4ce6034c8d29667f
prerequisite-patch-id: 08f97130a51898a5f6efddeeb5b42638577398c7
prerequisite-patch-id: 577664d761cd23fe9031ffdb1d3c9ac313572c67
prerequisite-patch-id: dc29a39716aa8689f80ba2767803d9df3709beaa
prerequisite-patch-id: 42b1b546d33391ead2753621f541bcc408af1769
prerequisite-patch-id: 2cbb839f57006f32e21f4229e099ae1bd782be24
prerequisite-patch-id: 1b4daf01cf61654a5ec54b5c3f7c7508be7244ee
prerequisite-patch-id: f46cc8c99f1909fe2a65fbc3cf1f6bc57489a086
prerequisite-patch-id: 2b0caed97223241d5008898dde995d02fda544e4
prerequisite-patch-id: 6b7adcb54989e1ec3370f256ff2c35d19cf785aa
-- 
2.28.0.1011.ga647a8990f-goog



Re: [PATCH v2 20/20] ppc: propagate the calling conventions change down to csum_partial_copy_generic()

2020-10-14 Thread Linus Torvalds
On Wed, Oct 14, 2020 at 3:51 PM Linus Torvalds
 wrote:
>
> I think it's this instruction:
>
> addir1,r1,16
>
> that should be removed from the function exit, because Al removed the
>
> -   stwur1,-16(r1)
>
> on function entry.
>
> So I think you end up with a corrupt stack pointer and basically
> random behavior.
>
> Mind trying that? (This is obviously all in
> arch/powerpc/lib/checksum_32.S, the csum_partial_copy_generic()
> function).

Patch attached to make it easier to test.

NOTE! This is ENTIRELY untested. My ppc asm is so rusty that I might
be barking entirely up the wrong tree, and I just made things much
worse.

 Linus


patch
Description: Binary data


Re: [PATCH v2 20/20] ppc: propagate the calling conventions change down to csum_partial_copy_generic()

2020-10-14 Thread Linus Torvalds
On Wed, Oct 14, 2020 at 3:27 PM Jason A. Donenfeld  wrote:
>
> This patch is causing crashes in WireGuard's CI over at
> https://www.wireguard.com/build-status/ . Apparently sending a simple
> network packet winds up triggering refcount_t's warn-on-saturate code. I

Ouch.

The C parts look fairly straightforward, and I don't see how they
could cause that odd refcount issue.

So I assume it's the low-level asm code conversion that is buggy. And
it's apparently the 32-bit conversion, since your ppc64 status looks
fine.

I think it's this instruction:

addir1,r1,16

that should be removed from the function exit, because Al removed the

-   stwur1,-16(r1)

on function entry.

So I think you end up with a corrupt stack pointer and basically
random behavior.

Mind trying that? (This is obviously all in
arch/powerpc/lib/checksum_32.S, the csum_partial_copy_generic()
function).

   Linus


Re: [PATCH v6 16/25] init: lto: fix PREL32 relocations

2020-10-14 Thread Kees Cook
On Mon, Oct 12, 2020 at 05:31:54PM -0700, Sami Tolvanen wrote:
> With LTO, the compiler can rename static functions to avoid global
> naming collisions. As initcall functions are typically static,
> renaming can break references to them in inline assembly. This
> change adds a global stub with a stable name for each initcall to
> fix the issue when PREL32 relocations are used.
> 
> Signed-off-by: Sami Tolvanen 
> Reviewed-by: Kees Cook 

This is another independent improvement... this could land before the
other portions of the series.

-Kees

> ---
>  include/linux/init.h | 31 +++
>  1 file changed, 27 insertions(+), 4 deletions(-)
> 
> diff --git a/include/linux/init.h b/include/linux/init.h
> index af638cd6dd52..cea63f7e7705 100644
> --- a/include/linux/init.h
> +++ b/include/linux/init.h
> @@ -209,26 +209,49 @@ extern bool initcall_debug;
>   */
>  #define __initcall_section(__sec, __iid) \
>   #__sec ".init.." #__iid
> +
> +/*
> + * With LTO, the compiler can rename static functions to avoid
> + * global naming collisions. We use a global stub function for
> + * initcalls to create a stable symbol name whose address can be
> + * taken in inline assembly when PREL32 relocations are used.
> + */
> +#define __initcall_stub(fn, __iid, id)   \
> + __initcall_name(initstub, __iid, id)
> +
> +#define __define_initcall_stub(__stub, fn)   \
> + int __init __stub(void);\
> + int __init __stub(void) \
> + {   \
> + return fn();\
> + }   \
> + __ADDRESSABLE(__stub)
>  #else
>  #define __initcall_section(__sec, __iid) \
>   #__sec ".init"
> +
> +#define __initcall_stub(fn, __iid, id)   fn
> +
> +#define __define_initcall_stub(__stub, fn)   \
> + __ADDRESSABLE(fn)
>  #endif
>  
>  #ifdef CONFIG_HAVE_ARCH_PREL32_RELOCATIONS
> -#define define_initcall(fn, __name, __sec)   \
> - __ADDRESSABLE(fn)   \
> +#define define_initcall(fn, __stub, __name, __sec)   \
> + __define_initcall_stub(__stub, fn)  \
>   asm(".section   \"" __sec "\", \"a\"\n" \
>   __stringify(__name) ":  \n" \
> - ".long  " #fn " - . \n" \
> + ".long  " __stringify(__stub) " - . \n" \
>   ".previous  \n");
>  #else
> -#define define_initcall(fn, __name, __sec)   \
> +#define define_initcall(fn, __unused, __name, __sec) \
>   static initcall_t __name __used \
>   __attribute__((__section__(__sec))) = fn;
>  #endif
>  
>  #define __unique_initcall(fn, id, __sec, __iid)  \
>   define_initcall(fn, \
> + __initcall_stub(fn, __iid, id), \
>   __initcall_name(initcall, __iid, id),   \
>   __initcall_section(__sec, __iid))
>  
> -- 
> 2.28.0.1011.ga647a8990f-goog
> 

-- 
Kees Cook


Re: [RFC 13/13] m68k: mac: convert to generic clockevent

2020-10-14 Thread Finn Thain
On Sat, 10 Oct 2020, Arnd Bergmann wrote:

> > Perhaps patch 13 does not belong in this series (?).
> >
> > All m68k platforms will need conversion before the TODO can be removed 
> > from Documentation/features/time/clockevents/arch-support.txt.
> 
> Yes, correct. I marked this patch as RFC instead of PATCH, as I'm just 
> trying to find out where it should be headed. I would hope the other 
> patches can just get merged.
> 

I wonder whether we can improve support for your proposed configuration 
i.e. a system with no oneshot clockevent device.

The 16 platforms you identified are not all in that category but I suspect 
that there are others which are (though they don't appear in this series 
because they already use GENERIC_CLOCKEVENTS).

One useful optimization would be some way to elide oneshot clockevent 
support (perhaps with the help of Link Time Optimization).

> > On m68k, HZ is fixed at 100. Without addressing that, would there be 
> > any benefit from adopting GENERIC_CLOCKEVENTS as per this RFC patch?
> 
> I don't think so, I mainly did it to see if there is a problem with 
> mixing the two modes, and I couldn't find any. The behavior seems 
> unchanged before and after my patch, the main difference being a few 
> extra kilobytes in kernel .text for the generic clockevents code.
> 

I think that is a good reason to convert all m68k platforms at once and to 
elide some of the dead code.

> > On Thu, 8 Oct 2020, Arnd Bergmann wrote:
> >
> > > Now that the infrastructure allows kernels to have both legacy timer 
> > > ticks and clockevent drivers in the same image, start by moving one 
> > > platform to generic clockevents.
> > >
> > > As qemu only supports the q800 platform among the classic m68k, use 
> > > that as an example.
> > >
> >
> > Correct VIA emulation is suprisingly difficult, so this kind of work 
> > should be tested on real hardware.
> >
> > I say that because when I did the clocksource conversion for m68k I 
> > ran into a bug in QEMU (since fixed) and also because I once worked on 
> > some of the bugs in the emulated VIA device used in MAME/MESS.
> 
> Good point, though I would be surprised if anything went wrong with this 
> patch on real hardware but not in emulation, as all the register-level 
> interactions with the timer are the same.
> 

On the subject of register accesses, via1[ACR] is shared with ADB drivers, 
so this patch probably has to protect those accesses with 
local_irq_save/restore or local_irq_disable/enable. (I can't be sure of 
the contexts in which .set_state_shutdown and .set_state_periodic methods 
are called.)

> Adding oneshot mode is a completely different matter though, that 
> clearly needs to be tested on real hardware.
> 

Right, and many emulators trade-off timing accuracy for performance which 
makes them unsuitable for testing invasive changes of that sort.

> > > I also tried adding oneshot mode, which was successful but broke the 
> > > clocksource. It's probably not hard to make it work properly, but 
> > > this is where I've stopped.
> > >
> >
> > I'm not so sure that one timer is able to support both a clocksource 
> > driver and a clockevent driver. In some cases we may have to drop the 
> > clocksource driver (i.e. fall back on the jiffies clocksource).
> >
> > Anyway, even on Macs with only one VIA chip we still have two timers. 
> > So I think we should try to use Timer 1 as a freerunning clocksource 
> > and Timer 2 as a oneshot clock event. This may result in better 
> > accuracy and simpler code. This may require some experimentation 
> > though.
> 
> Ah, good. This is partly what I had been hoping for, as my patch can be 
> used as a starting point for that if you want to give it a go.
> 

After looking at the chip documentation I don't think it's viable to use 
the hardware timers in the way I proposed. A VIA register access requires 
at least one full VIA clock cycle (about 1.3 us) which means register 
accesses themselves cause timing delays. They also make clocksource reads 
expensive.

I think this rules out oneshot clockevent devices because if the system 
offered such a device it would preferentially get used as a tick device.

So I think your approach (periodic clockevent device driven by the 
existing periodic tick interrupt) is best for this platform due to 
simplicity (not much code) and performance (good accuracy, no additional 
overhead).

I suspect the same approach would work equally well on other platforms too 
(even though they are probably be capable of oneshot clockevent devices).

>  Arnd
> 


Re: [PATCH v5 5/5] arm64: ipi_nmi: Add support for NMI backtrace

2020-10-14 Thread Masayoshi Mizuma
On Wed, Oct 14, 2020 at 04:42:11PM +0530, Sumit Garg wrote:
> Enable NMI backtrace support on arm64 using IPI turned as an NMI
> leveraging pseudo NMIs support. It is now possible for users to get a
> backtrace of a CPU stuck in hard-lockup using magic SYSRQ.
> 
> Signed-off-by: Sumit Garg 
> ---
>  arch/arm64/include/asm/irq.h |  6 ++
>  arch/arm64/kernel/ipi_nmi.c  | 12 +++-
>  2 files changed, 17 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/arm64/include/asm/irq.h b/arch/arm64/include/asm/irq.h
> index b2b0c64..e840bf1 100644
> --- a/arch/arm64/include/asm/irq.h
> +++ b/arch/arm64/include/asm/irq.h
> @@ -6,6 +6,12 @@
>  
>  #include 
>  
> +#ifdef CONFIG_SMP
> +extern void arch_trigger_cpumask_backtrace(const cpumask_t *mask,
> +bool exclude_self);
> +#define arch_trigger_cpumask_backtrace arch_trigger_cpumask_backtrace
> +#endif
> +
>  struct pt_regs;
>  
>  static inline int nr_legacy_irqs(void)
> diff --git a/arch/arm64/kernel/ipi_nmi.c b/arch/arm64/kernel/ipi_nmi.c
> index e0a9e03..e1dc19d 100644
> --- a/arch/arm64/kernel/ipi_nmi.c
> +++ b/arch/arm64/kernel/ipi_nmi.c
> @@ -9,6 +9,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  
>  #include 
> @@ -25,12 +26,21 @@ void arch_send_call_nmi_func_ipi_mask(cpumask_t *mask)
>   __ipi_send_mask(ipi_desc, mask);
>  }
>  
> +void arch_trigger_cpumask_backtrace(const cpumask_t *mask, bool exclude_self)
> +{
> + nmi_trigger_cpumask_backtrace(mask, exclude_self,
> +   arch_send_call_nmi_func_ipi_mask);
> +}
> +
>  static irqreturn_t ipi_nmi_handler(int irq, void *data)
>  {
>   unsigned int cpu = smp_processor_id();
>  
> - ipi_kgdb_nmicallback(cpu, get_irq_regs());
> + if (nmi_cpu_backtrace(get_irq_regs()))
> + goto out;
>  
> + ipi_kgdb_nmicallback(cpu, get_irq_regs());
> +out:
>   return IRQ_HANDLED;
>  }
>  
> -- 

It works well. Please feel free to add:

Tested-by: Masayoshi Mizuma 

Thanks!
Masa


Re: [PATCH v5 3/5] arm64: smp: Allocate and setup IPI as NMI

2020-10-14 Thread Masayoshi Mizuma
On Wed, Oct 14, 2020 at 04:42:09PM +0530, Sumit Garg wrote:
> Allocate an unused IPI that can be turned as NMI using ipi_nmi framework.
> Also, invoke corresponding NMI setup/teardown APIs.
> 
> Signed-off-by: Sumit Garg 
> ---
>  arch/arm64/kernel/smp.c | 8 
>  1 file changed, 8 insertions(+)
> 
> diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
> index 82e75fc..129ebfb 100644
> --- a/arch/arm64/kernel/smp.c
> +++ b/arch/arm64/kernel/smp.c
> @@ -43,6 +43,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -962,6 +963,8 @@ static void ipi_setup(int cpu)
>  
>   for (i = 0; i < nr_ipi; i++)
>   enable_percpu_irq(ipi_irq_base + i, 0);
> +
> + ipi_nmi_setup(cpu);
>  }
>  
>  #ifdef CONFIG_HOTPLUG_CPU
> @@ -974,6 +977,8 @@ static void ipi_teardown(int cpu)
>  
>   for (i = 0; i < nr_ipi; i++)
>   disable_percpu_irq(ipi_irq_base + i);
> +
> + ipi_nmi_teardown(cpu);
>  }
>  #endif
>  
> @@ -995,6 +1000,9 @@ void __init set_smp_ipi_range(int ipi_base, int n)
>   irq_set_status_flags(ipi_base + i, IRQ_HIDDEN);
>   }
>  
> + if (n > nr_ipi)
> + set_smp_ipi_nmi(ipi_base + nr_ipi);
> +
>   ipi_irq_base = ipi_base;
>  
>   /* Setup the boot CPU immediately */
> -- 

Looks good to me. Please feel free to add:

Reviewed-by: Masayoshi Mizuma 

Thanks!
Masa



Re: [PATCH RFC V3 5/9] x86/pks: Add PKS kernel API

2020-10-14 Thread Ira Weiny
On Tue, Oct 13, 2020 at 11:43:57AM -0700, Dave Hansen wrote:
> > +static inline void pks_update_protection(int pkey, unsigned long 
> > protection)
> > +{
> > +   current->thread.saved_pkrs = update_pkey_val(current->thread.saved_pkrs,
> > +pkey, protection);
> > +   preempt_disable();
> > +   write_pkrs(current->thread.saved_pkrs);
> > +   preempt_enable();
> > +}
> 
> Why does this need preempt count manipulation in addition to the
> get/put_cpu_var() inside of write_pkrs()?

This is a bug.  The disable should be around the update_pkey_val().

> 
> > +/**
> > + * PKS access control functions
> > + *
> > + * Change the access of the domain specified by the pkey.  These are global
> > + * updates.  They only affects the current running thread.  It is 
> > undefined and
> > + * a bug for users to call this without having allocated a pkey and using 
> > it as
> > + * pkey here.
> > + *
> > + * pks_mknoaccess()
> > + * Disable all access to the domain
> > + * pks_mkread()
> > + * Make the domain Read only
> > + * pks_mkrdwr()
> > + * Make the domain Read/Write
> > + *
> > + * @pkey the pkey for which the access should change.
> > + *
> > + */
> > +void pks_mknoaccess(int pkey)
> > +{
> > +   pks_update_protection(pkey, PKEY_DISABLE_ACCESS);
> > +}
> > +EXPORT_SYMBOL_GPL(pks_mknoaccess);
> 
> These are named like PTE manipulation functions, which is kinda weird.
> 
> What's wrong with: pks_disable_access(pkey) ?

Internal review suggested these names.  I'm not dead set on them.

FWIW I would rather they not get to wordy.

I was trying to get some consistency with pks_mk*() as meaning PKS 'make' X.

Do me 'disable' implies a state transition where 'make' implies we are
'setting' an absolute value.  I think the later is a better name.  And 'make'
made more sense because 'set' is so overloaded IHO.

> 
> > +void pks_mkread(int pkey)
> > +{
> > +   pks_update_protection(pkey, PKEY_DISABLE_WRITE);
> > +}
> > +EXPORT_SYMBOL_GPL(pks_mkread);
> 
> I really don't like this name.  It doesn't make readable, or even
> read-only, *especially* if it was already access-disabled.

Ok.

But it does sense if going from access-disable to read, correct?.  I could see
this being better named pks_mkreadonly() so that going from RW to this would
make more sense.  Especially after thinking about it above 'read only' needs to
be in the name.

Before I change anything I'd like to get consensus on naming.

How about the following?

pks_mk_noaccess()
pks_mk_readonly()
pks_mk_readwrite()

?

> 
> > +static const char pks_key_user0[] = "kernel";
> > +
> > +/* Store names of allocated keys for debug.  Key 0 is reserved for the 
> > kernel.  */
> > +static const char *pks_key_users[PKS_NUM_KEYS] = {
> > +   pks_key_user0
> > +};
> > +
> > +/*
> > + * Each key is represented by a bit.  Bit 0 is set for key 0 and reserved 
> > for
> > + * its use.  We use ulong for the bit operations but only 16 bits are used.
> > + */
> > +static unsigned long pks_key_allocation_map = 1 << PKS_KERN_DEFAULT_KEY;
> > +
> > +/*
> > + * pks_key_alloc - Allocate a PKS key
> > + *
> > + * @pkey_user: String stored for debugging of key exhaustion.  The caller 
> > is
> > + * responsible to maintain this memory until pks_key_free().
> > + */
> > +int pks_key_alloc(const char * const pkey_user)
> > +{
> > +   int nr;
> > +
> > +   if (!cpu_feature_enabled(X86_FEATURE_PKS))
> > +   return -EINVAL;
> 
> I'm not sure I like -EINVAL for this.  I thought we returned -ENOSPC for
> this case for user pkeys.

-ENOTSUP?

I'm not really sure anyone will need to know the difference between the
platform not supporting the key vs running out of them.  But they are 2
different error conditions.

> 
> > +   while (1) {
> > +   nr = find_first_zero_bit(&pks_key_allocation_map, PKS_NUM_KEYS);
> > +   if (nr >= PKS_NUM_KEYS) {
> > +   pr_info("Cannot allocate supervisor key for %s.\n",
> > +   pkey_user);
> > +   return -ENOSPC;

We return -ENOSPC here when running out of keys.

> > +   }
> > +   if (!test_and_set_bit_lock(nr, &pks_key_allocation_map))
> > +   break;
> > +   }
> > +
> > +   /* for debugging key exhaustion */
> > +   pks_key_users[nr] = pkey_user;
> > +
> > +   return nr;
> > +}
> > +EXPORT_SYMBOL_GPL(pks_key_alloc);
> > +
> > +/*
> > + * pks_key_free - Free a previously allocate PKS key
> > + *
> > + * @pkey: Key to be free'ed
> > + */
> > +void pks_key_free(int pkey)
> > +{
> > +   if (!cpu_feature_enabled(X86_FEATURE_PKS))
> > +   return;
> > +
> > +   if (pkey >= PKS_NUM_KEYS || pkey <= PKS_KERN_DEFAULT_KEY)
> > +   return;
> 
> This seems worthy of a WARN_ON_ONCE() at least.  It's essentially
> corrupt data coming into a kernel API.

Ok, Done,
Ira



[PATCH 1/2] device-dax/kmem: Fix resource release

2020-10-14 Thread Dan Williams
The conversion to request_mem_region() is broken because it assumes that
the range is marked busy prior to release. However, due to the way that
the kmem driver manipulates the IORESOURCE_BUSY flag (clears it to
let {add,remove}_memory() handle busy) it requires a manual
release_resource() to perform cleanup.

Given that the actual 'struct resource *' needs to be recalled, not just
the range, add that tracking to the kmem driver-data.

Reported-by: David Hildenbrand 
Fixes: 0513bd5bb114 ("device-dax/kmem: replace release_resource() with 
release_mem_region()")
Cc: Vishal Verma 
Cc: Dave Hansen 
Cc: Pavel Tatashin 
Cc: Brice Goglin 
Cc: Dave Jiang 
Cc: Ira Weiny 
Cc: Jia He 
Cc: Joao Martins 
Cc: Jonathan Cameron 
Signed-off-by: Dan Williams 
---
 drivers/dax/kmem.c |   48 ++--
 1 file changed, 34 insertions(+), 14 deletions(-)

diff --git a/drivers/dax/kmem.c b/drivers/dax/kmem.c
index 6c933f2b604e..af04b6d1d263 100644
--- a/drivers/dax/kmem.c
+++ b/drivers/dax/kmem.c
@@ -35,11 +35,17 @@ static int dax_kmem_range(struct dev_dax *dev_dax, int i, 
struct range *r)
return 0;
 }
 
+struct dax_kmem_data {
+   const char *res_name;
+   struct resource *res[];
+};
+
 static int dev_dax_kmem_probe(struct dev_dax *dev_dax)
 {
struct device *dev = &dev_dax->dev;
+   struct dax_kmem_data *data;
+   int rc = -ENOMEM;
int i, mapped = 0;
-   char *res_name;
int numa_node;
 
/*
@@ -55,14 +61,17 @@ static int dev_dax_kmem_probe(struct dev_dax *dev_dax)
return -EINVAL;
}
 
-   res_name = kstrdup(dev_name(dev), GFP_KERNEL);
-   if (!res_name)
+   data = kzalloc(sizeof(*data) + sizeof(struct resource *) * 
dev_dax->nr_range, GFP_KERNEL);
+   if (!data)
return -ENOMEM;
 
+   data->res_name = kstrdup(dev_name(dev), GFP_KERNEL);
+   if (!data->res_name)
+   goto err_res_name;
+
for (i = 0; i < dev_dax->nr_range; i++) {
struct resource *res;
struct range range;
-   int rc;
 
rc = dax_kmem_range(dev_dax, i, &range);
if (rc) {
@@ -72,7 +81,7 @@ static int dev_dax_kmem_probe(struct dev_dax *dev_dax)
}
 
/* Region is permanently reserved if hotremove fails. */
-   res = request_mem_region(range.start, range_len(&range), 
res_name);
+   res = request_mem_region(range.start, range_len(&range), 
data->res_name);
if (!res) {
dev_warn(dev, "mapping%d: %#llx-%#llx could not reserve 
region\n",
i, range.start, range.end);
@@ -82,9 +91,10 @@ static int dev_dax_kmem_probe(struct dev_dax *dev_dax)
 */
if (mapped)
continue;
-   kfree(res_name);
-   return -EBUSY;
+   rc = -EBUSY;
+   goto err_request_mem;
}
+   data->res[i] = res;
 
/*
 * Set flags appropriate for System RAM.  Leave ..._BUSY clear
@@ -104,18 +114,25 @@ static int dev_dax_kmem_probe(struct dev_dax *dev_dax)
if (rc) {
dev_warn(dev, "mapping%d: %#llx-%#llx memory add 
failed\n",
i, range.start, range.end);
-   release_mem_region(range.start, range_len(&range));
+   release_resource(res);
+   kfree(res);
+   data->res[i] = NULL;
if (mapped)
continue;
-   kfree(res_name);
-   return rc;
+   goto err_request_mem;
}
mapped++;
}
 
-   dev_set_drvdata(dev, res_name);
+   dev_set_drvdata(dev, data);
 
return 0;
+
+err_request_mem:
+   kfree(data->res_name);
+err_res_name:
+   kfree(data);
+   return rc;
 }
 
 #ifdef CONFIG_MEMORY_HOTREMOVE
@@ -123,7 +140,7 @@ static int dev_dax_kmem_remove(struct dev_dax *dev_dax)
 {
int i, success = 0;
struct device *dev = &dev_dax->dev;
-   const char *res_name = dev_get_drvdata(dev);
+   struct dax_kmem_data *data = dev_get_drvdata(dev);
 
/*
 * We have one shot for removing memory, if some memory blocks were not
@@ -142,7 +159,9 @@ static int dev_dax_kmem_remove(struct dev_dax *dev_dax)
rc = remove_memory(dev_dax->target_node, range.start,
range_len(&range));
if (rc == 0) {
-   release_mem_region(range.start, range_len(&range));
+   release_resource(data->res[i]);
+   kfree(data->res[i]);
+   data->res[i] = NULL;
  

[PATCH 0/2] device-dax subdivision v5 to v6 fixups

2020-10-14 Thread Dan Williams
Hi,

The v5 series of the device-dax-subdivision series landed upstream which
missed some of the late breaking fixups in v6 [1]. The Xen one is
cosmetic, the kmem one is a functional problem. I will handle the kmem
in a device-dax follow-on pull request post-rc1. The Xen one can go
through the Xen tree at its own pace.

My thanks to Andrew for wrangling the thrash up to v5, and my apologies
to Andrew et al for not highlighting this gap sooner.

[1]: 
http://lore.kernel.org/r/160196728453.2166475.12832711415715687418.st...@dwillia2-desk3.amr.corp.intel.com

---

Dan Williams (2):
  device-dax/kmem: Fix resource release
  xen/unpopulated-alloc: Consolidate pgmap manipulation


 drivers/dax/kmem.c  |   48 ---
 drivers/xen/unpopulated-alloc.c |   14 ++-
 2 files changed, 41 insertions(+), 21 deletions(-)

base-commit: 4da9af0014b51c8b015ed8c622440ef28912efe6


[PATCH 2/2] xen/unpopulated-alloc: Consolidate pgmap manipulation

2020-10-14 Thread Dan Williams
Cleanup fill_list() to keep all the pgmap manipulations in a single
location of the function. Update the exit unwind path accordingly.

Link: http://lore.kernel.org/r/6186fa28-d123-12db-6171-a75cb6e61...@oracle.com

Cc: Juergen Gross 
Cc: Stefano Stabellini 
Cc: Andrew Morton 
Cc: 
Reported-by: Boris Ostrovsky 
Signed-off-by: Dan Williams 
---
 drivers/xen/unpopulated-alloc.c |   14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/xen/unpopulated-alloc.c b/drivers/xen/unpopulated-alloc.c
index 8c512ea550bb..75ab5de99868 100644
--- a/drivers/xen/unpopulated-alloc.c
+++ b/drivers/xen/unpopulated-alloc.c
@@ -27,11 +27,6 @@ static int fill_list(unsigned int nr_pages)
if (!res)
return -ENOMEM;
 
-   pgmap = kzalloc(sizeof(*pgmap), GFP_KERNEL);
-   if (!pgmap)
-   goto err_pgmap;
-
-   pgmap->type = MEMORY_DEVICE_GENERIC;
res->name = "Xen scratch";
res->flags = IORESOURCE_MEM | IORESOURCE_BUSY;
 
@@ -43,6 +38,11 @@ static int fill_list(unsigned int nr_pages)
goto err_resource;
}
 
+   pgmap = kzalloc(sizeof(*pgmap), GFP_KERNEL);
+   if (!pgmap)
+   goto err_pgmap;
+
+   pgmap->type = MEMORY_DEVICE_GENERIC;
pgmap->range = (struct range) {
.start = res->start,
.end = res->end,
@@ -91,10 +91,10 @@ static int fill_list(unsigned int nr_pages)
return 0;
 
 err_memremap:
-   release_resource(res);
-err_resource:
kfree(pgmap);
 err_pgmap:
+   release_resource(res);
+err_resource:
kfree(res);
return ret;
 }



Re: linux-next: build failure after merge of the vfio tree

2020-10-14 Thread Stephen Rothwell
Hi Alex,

On Tue, 13 Oct 2020 13:20:16 -0600 Alex Williamson  
wrote:
>
> Thanks, Stephen.  Diana has posted a 32bit build fix which I've merged,
> maybe that was the error.  Also Diana's series in my branch is currently
> dependent on fsl-bus support in GregKH's char-misc-next branch.  Looking
> at the log from the successful build, I wonder if our branches are just
> in the wrong order (vfio/next processed on line 341, char-misc-next
> processed on 387).  I don't know if you regularly re-order for this
> sort of thing, otherwise it should work out when Greg's branch gets
> merged, but testing sooner in next would be preferred.

I have put the vfio tree after the char-misc tree today (so hopefully
it will build).  The proper way to do this is for you and Greg to have
a shared branch with the commits you both depend on and bot merge that
branch.  That way, it doesn't matter what order the tress are merged
(by me or Linus).

-- 
Cheers,
Stephen Rothwell


pgpBocWdENpig.pgp
Description: OpenPGP digital signature


Re: [PATCH v3] arm64/mm: add fallback option to allocate virtually contiguous memory

2020-10-14 Thread Sudarshan Rajagopalan

On 2020-10-13 04:38, Anshuman Khandual wrote:

On 10/13/2020 04:35 AM, Sudarshan Rajagopalan wrote:
When section mappings are enabled, we allocate vmemmap pages from 
physically
continuous memory of size PMD_SIZE using vmemmap_alloc_block_buf(). 
Section
mappings are good to reduce TLB pressure. But when system is highly 
fragmented
and memory blocks are being hot-added at runtime, its possible that 
such
physically continuous memory allocations can fail. Rather than failing 
the
memory hot-add procedure, add a fallback option to allocate vmemmap 
pages from

discontinuous pages using vmemmap_populate_basepages().


There is a checkpatch warning here, which could be fixed while merging 
?


WARNING: Possible unwrapped commit description (prefer a maximum 75
chars per line)
#7:
When section mappings are enabled, we allocate vmemmap pages from 
physically


total: 0 errors, 1 warnings, 13 lines checked



Thanks Anshuman for the review. I sent out an updated patch fixing the 
checkpatch warning.




Signed-off-by: Sudarshan Rajagopalan 
Reviewed-by: Gavin Shan 
Cc: Catalin Marinas 
Cc: Will Deacon 
Cc: Anshuman Khandual 
Cc: Mark Rutland 
Cc: Logan Gunthorpe 
Cc: David Hildenbrand 
Cc: Andrew Morton 
Cc: Steven Price 


Nonetheless, this looks fine. Did not see any particular problem
while creating an experimental vmemmap with interleaving section
and base page mapping.

Reviewed-by: Anshuman Khandual 


---
 arch/arm64/mm/mmu.c | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index 75df62fea1b6..44486fd0e883 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -1121,8 +1121,11 @@ int __meminit vmemmap_populate(unsigned long 
start, unsigned long end, int node,

void *p = NULL;

p = vmemmap_alloc_block_buf(PMD_SIZE, node, altmap);
-   if (!p)
-   return -ENOMEM;
+   if (!p) {
+   if (vmemmap_populate_basepages(addr, next, 
node, altmap))
+   return -ENOMEM;
+   continue;
+   }

pmd_set_huge(pmdp, __pa(p), __pgprot(PROT_SECT_NORMAL));
} else




Sudarshan

--
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a 
Linux Foundation Collaborative Project


[PATCH v3] arm64/mm: add fallback option to allocate virtually contiguous memory

2020-10-14 Thread Sudarshan Rajagopalan
When section mappings are enabled, we allocate vmemmap pages from
physically continuous memory of size PMD_SIZE using
vmemmap_alloc_block_buf(). Section mappings are good to reduce TLB
pressure. But when system is highly fragmented and memory blocks are
being hot-added at runtime, its possible that such physically continuous
memory allocations can fail. Rather than failing the memory hot-add
procedure, add a fallback option to allocate vmemmap pages from
discontinuous pages using vmemmap_populate_basepages().

Signed-off-by: Sudarshan Rajagopalan 
Reviewed-by: Gavin Shan 
Reviewed-by: Anshuman Khandual 
Cc: Catalin Marinas 
Cc: Will Deacon 
Cc: Anshuman Khandual 
Cc: Mark Rutland 
Cc: Logan Gunthorpe 
Cc: David Hildenbrand 
Cc: Andrew Morton 
Cc: Steven Price 
---
 arch/arm64/mm/mmu.c | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index 75df62fea1b6..44486fd0e883 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -1121,8 +1121,11 @@ int __meminit vmemmap_populate(unsigned long start, 
unsigned long end, int node,
void *p = NULL;
 
p = vmemmap_alloc_block_buf(PMD_SIZE, node, altmap);
-   if (!p)
-   return -ENOMEM;
+   if (!p) {
+   if (vmemmap_populate_basepages(addr, next, 
node, altmap))
+   return -ENOMEM;
+   continue;
+   }
 
pmd_set_huge(pmdp, __pa(p), __pgprot(PROT_SECT_NORMAL));
} else
-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project



[PATCH v3] arm64/mm: add fallback option to allocate virtually contiguous memory

2020-10-14 Thread Sudarshan Rajagopalan
V1: The initial patch used the approach to abort at the first instance of 
PMD_SIZE
allocation failure, unmaps all previously mapped sections using vmemmap_free
and maps the entire request with vmemmap_populate_basepages to allocate 
virtually contiguous memory.
https://lkml.org/lkml/2020/9/10/66

V2: Allocates virtually contiguous memory only for sections that failed
PMD_SIZE allocation, and continous to allocate physically contiguous
memory for other sections.
https://lkml.org/lkml/2020/9/30/1489

V3: Addressed trivial review comments. Pass in altmap to 
vmemmap_populate_basepages.

Sudarshan Rajagopalan (1):
  arm64/mm: add fallback option to allocate virtually contiguous memory

 arch/arm64/mm/mmu.c | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project



Re: [PATCH] checkpatch: add a fixer for missing newline at eof

2020-10-14 Thread Joe Perches
On Wed, 2020-10-14 at 14:15 -0700, t...@redhat.com wrote:
> From: Tom Rix 
> 
> Remove the trailing error message from the fixed lines

Hi Tom.

> diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
[]
> @@ -3393,8 +3393,11 @@ sub process {
>  
>  # check for adding lines without a newline.
>   if ($line =~ /^\+/ && defined $lines[$linenr] && 
> $lines[$linenr] =~ /^\\ No newline at end of file/) {
> - WARN("MISSING_EOF_NEWLINE",
> -  "adding a line without newline at end of file\n" . 
> $herecurr);
> + if (WARN("MISSING_EOF_NEWLINE",
> +  "adding a line without newline at end of 
> file\n" . $herecurr) &&
> + $fix) {
> + fix_delete_line($fixlinenr+1, "No newline at end of 
> file");

This is misindented, the 4 spaces before fix_delete_line
should be a tab, otherwise this looks fine, thanks.




Re: [PATCH 0/2] net: dsa: mv88e6xxx: serdes link without phy

2020-10-14 Thread Jakub Kicinski
On Tue, 13 Oct 2020 15:18:56 +1300 Chris Packham wrote:
> This small series gets my hardware into a working state. The key points are to
> make sure we don't force the link and that we ask the MAC for the link status.
> I also have updated my dts to say `phy-mode = "1000base-x";` and `managed =
> "in-band-status";`

Russell, Andrew, PHY folks - does this look good to you?


linux-next: manual merge of the btrfs tree with Linus' tree

2020-10-14 Thread Stephen Rothwell
Hi all,

Please do *not* rebase/rewrite your linux-next included tree and then
immediately send it to Linus.  Or if you do, the please also update
what you have in linux-next (so you can sneak it past me :-().
(mutter, mutter, unnecessary conflicts :-().

I have dropped the brtfs tree from linux-next today.

-- 
Cheers,
Stephen Rothwell


pgpFSo4GbrmJU.pgp
Description: OpenPGP digital signature


[PATCH 1/2] vfs: move generic_remap_checks out of mm

2020-10-14 Thread Darrick J. Wong
From: Darrick J. Wong 

I would like to move all the generic helpers for the vfs remap range
functionality (aka clonerange and dedupe) into a separate file so that
they won't be scattered across the vfs and the mm subsystems.  The
eventual goal is to be able to deselect remap_range.c if none of the
filesystems need that code, but the tricky part here is picking a
stable(ish) part of the merge window to rearrange code.

Signed-off-by: Darrick J. Wong 
---
 fs/Makefile|3 +-
 fs/remap_range.c   |  103 
 include/linux/fs.h |2 +
 mm/filemap.c   |   81 +
 4 files changed, 108 insertions(+), 81 deletions(-)
 create mode 100644 fs/remap_range.c


diff --git a/fs/Makefile b/fs/Makefile
index 1c7b0e3f6daa..7173350739c5 100644
--- a/fs/Makefile
+++ b/fs/Makefile
@@ -13,7 +13,8 @@ obj-y :=  open.o read_write.o file_table.o super.o \
seq_file.o xattr.o libfs.o fs-writeback.o \
pnode.o splice.o sync.o utimes.o d_path.o \
stack.o fs_struct.o statfs.o fs_pin.o nsfs.o \
-   fs_types.o fs_context.o fs_parser.o fsopen.o init.o
+   fs_types.o fs_context.o fs_parser.o fsopen.o init.o \
+   remap_range.o
 
 ifeq ($(CONFIG_BLOCK),y)
 obj-y +=   buffer.o block_dev.o direct-io.o mpage.o
diff --git a/fs/remap_range.c b/fs/remap_range.c
new file mode 100644
index ..e66d8c131b69
--- /dev/null
+++ b/fs/remap_range.c
@@ -0,0 +1,103 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * linux/fs/remap_range.c
+ *
+ * Copyright (C) 1994-1999  Linus Torvalds
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include "internal.h"
+
+#include 
+#include 
+
+/*
+ * Performs necessary checks before doing a clone.
+ *
+ * Can adjust amount of bytes to clone via @req_count argument.
+ * Returns appropriate error code that caller should return or
+ * zero in case the clone should be allowed.
+ */
+int generic_remap_checks(struct file *file_in, loff_t pos_in,
+struct file *file_out, loff_t pos_out,
+loff_t *req_count, unsigned int remap_flags)
+{
+   struct inode *inode_in = file_in->f_mapping->host;
+   struct inode *inode_out = file_out->f_mapping->host;
+   uint64_t count = *req_count;
+   uint64_t bcount;
+   loff_t size_in, size_out;
+   loff_t bs = inode_out->i_sb->s_blocksize;
+   int ret;
+
+   /* The start of both ranges must be aligned to an fs block. */
+   if (!IS_ALIGNED(pos_in, bs) || !IS_ALIGNED(pos_out, bs))
+   return -EINVAL;
+
+   /* Ensure offsets don't wrap. */
+   if (pos_in + count < pos_in || pos_out + count < pos_out)
+   return -EINVAL;
+
+   size_in = i_size_read(inode_in);
+   size_out = i_size_read(inode_out);
+
+   /* Dedupe requires both ranges to be within EOF. */
+   if ((remap_flags & REMAP_FILE_DEDUP) &&
+   (pos_in >= size_in || pos_in + count > size_in ||
+pos_out >= size_out || pos_out + count > size_out))
+   return -EINVAL;
+
+   /* Ensure the infile range is within the infile. */
+   if (pos_in >= size_in)
+   return -EINVAL;
+   count = min(count, size_in - (uint64_t)pos_in);
+
+   ret = generic_write_check_limits(file_out, pos_out, &count);
+   if (ret)
+   return ret;
+
+   /*
+* If the user wanted us to link to the infile's EOF, round up to the
+* next block boundary for this check.
+*
+* Otherwise, make sure the count is also block-aligned, having
+* already confirmed the starting offsets' block alignment.
+*/
+   if (pos_in + count == size_in) {
+   bcount = ALIGN(size_in, bs) - pos_in;
+   } else {
+   if (!IS_ALIGNED(count, bs))
+   count = ALIGN_DOWN(count, bs);
+   bcount = count;
+   }
+
+   /* Don't allow overlapped cloning within the same file. */
+   if (inode_in == inode_out &&
+   pos_out + bcount > pos_in &&
+   pos_out < pos_in + bcount)
+   return -EINVAL;
+
+   /*
+* We shortened the request but the caller can't deal with that, so
+* bounce the request back to userspace.
+*/
+   if (*req_count != count && !(remap_flags & REMAP_FILE_CAN_SHORTEN))
+   return -EINVAL;
+
+   *req_count = count;
+   return 0;
+}
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 7519ae003a08..eea754a8dd67 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -3012,6 +3012,8 @@ extern ssize_t generic_write_checks(struct kiocb *, 
struct iov_iter *);
 extern int generic_remap_checks(struct file *file_in, loff_t pos_in,
struct 

[PATCH 2/2] vfs: move the remap range helpers to remap_range.c

2020-10-14 Thread Darrick J. Wong
From: Darrick J. Wong 

Complete the migration by moving the file remapping helper functions out
of read_write.c and into remap_range.c.  This reduces the clutter in the
first file and (eventually) will make it so that we can compile out the
second file if it isn't needed.

Signed-off-by: Darrick J. Wong 
---
 fs/read_write.c|  473 ---
 fs/remap_range.c   |  480 
 include/linux/fs.h |3 
 3 files changed, 477 insertions(+), 479 deletions(-)


diff --git a/fs/read_write.c b/fs/read_write.c
index d3428189f36b..f0877f1c0c49 100644
--- a/fs/read_write.c
+++ b/fs/read_write.c
@@ -1832,476 +1832,3 @@ SYSCALL_DEFINE6(copy_file_range, int, fd_in, loff_t 
__user *, off_in,
 out2:
return ret;
 }
-
-static int remap_verify_area(struct file *file, loff_t pos, loff_t len,
-bool write)
-{
-   struct inode *inode = file_inode(file);
-
-   if (unlikely(pos < 0 || len < 0))
-   return -EINVAL;
-
-if (unlikely((loff_t) (pos + len) < 0))
-   return -EINVAL;
-
-   if (unlikely(inode->i_flctx && mandatory_lock(inode))) {
-   loff_t end = len ? pos + len - 1 : OFFSET_MAX;
-   int retval;
-
-   retval = locks_mandatory_area(inode, file, pos, end,
-   write ? F_WRLCK : F_RDLCK);
-   if (retval < 0)
-   return retval;
-   }
-
-   return security_file_permission(file, write ? MAY_WRITE : MAY_READ);
-}
-/*
- * Ensure that we don't remap a partial EOF block in the middle of something
- * else.  Assume that the offsets have already been checked for block
- * alignment.
- *
- * For clone we only link a partial EOF block above or at the destination 
file's
- * EOF.  For deduplication we accept a partial EOF block only if it ends at the
- * destination file's EOF (can not link it into the middle of a file).
- *
- * Shorten the request if possible.
- */
-static int generic_remap_check_len(struct inode *inode_in,
-  struct inode *inode_out,
-  loff_t pos_out,
-  loff_t *len,
-  unsigned int remap_flags)
-{
-   u64 blkmask = i_blocksize(inode_in) - 1;
-   loff_t new_len = *len;
-
-   if ((*len & blkmask) == 0)
-   return 0;
-
-   if (pos_out + *len < i_size_read(inode_out))
-   new_len &= ~blkmask;
-
-   if (new_len == *len)
-   return 0;
-
-   if (remap_flags & REMAP_FILE_CAN_SHORTEN) {
-   *len = new_len;
-   return 0;
-   }
-
-   return (remap_flags & REMAP_FILE_DEDUP) ? -EBADE : -EINVAL;
-}
-
-/* Read a page's worth of file data into the page cache. */
-static struct page *vfs_dedupe_get_page(struct inode *inode, loff_t offset)
-{
-   struct page *page;
-
-   page = read_mapping_page(inode->i_mapping, offset >> PAGE_SHIFT, NULL);
-   if (IS_ERR(page))
-   return page;
-   if (!PageUptodate(page)) {
-   put_page(page);
-   return ERR_PTR(-EIO);
-   }
-   return page;
-}
-
-/*
- * Lock two pages, ensuring that we lock in offset order if the pages are from
- * the same file.
- */
-static void vfs_lock_two_pages(struct page *page1, struct page *page2)
-{
-   /* Always lock in order of increasing index. */
-   if (page1->index > page2->index)
-   swap(page1, page2);
-
-   lock_page(page1);
-   if (page1 != page2)
-   lock_page(page2);
-}
-
-/* Unlock two pages, being careful not to unlock the same page twice. */
-static void vfs_unlock_two_pages(struct page *page1, struct page *page2)
-{
-   unlock_page(page1);
-   if (page1 != page2)
-   unlock_page(page2);
-}
-
-/*
- * Compare extents of two files to see if they are the same.
- * Caller must have locked both inodes to prevent write races.
- */
-static int vfs_dedupe_file_range_compare(struct inode *src, loff_t srcoff,
-struct inode *dest, loff_t destoff,
-loff_t len, bool *is_same)
-{
-   loff_t src_poff;
-   loff_t dest_poff;
-   void *src_addr;
-   void *dest_addr;
-   struct page *src_page;
-   struct page *dest_page;
-   loff_t cmp_len;
-   bool same;
-   int error;
-
-   error = -EINVAL;
-   same = true;
-   while (len) {
-   src_poff = srcoff & (PAGE_SIZE - 1);
-   dest_poff = destoff & (PAGE_SIZE - 1);
-   cmp_len = min(PAGE_SIZE - src_poff,
- PAGE_SIZE - dest_poff);
-   cmp_len = min(cmp_len, len);
-   if (cmp_len <= 0)
-   goto out_error;
-
-   src_page = vfs_dedupe_get_page(src, srcoff);
-   if (IS_ERR(src

[PATCH 0/2] vfs: move the clone/dedupe/remap helpers to a single file

2020-10-14 Thread Darrick J. Wong
Hi all,

I would like to move the generic helper functions that support the file
remap range operations (aka clone and dedupe) to a separate file under
fs/.  For the moment, I have a few goals here: one is to declutter
fs/read_write.c and mm/filemap.c.  The second goal is to be able to
deselect all the remap code if no filesystems require it.

The third (and much more long term) goal is to have a place to land the
generic code for the atomic file extent swap functionality, since it
will reuse some of the functionality.  Someday.  Whenever I get around
to submitting that again.

AFAICT, nobody is attempting to land any major changes in any of the vfs
remap functions during the 5.10 window -- for-next showed conflicts only
in the Makefile, so it seems like a quiet enough time to do this.  There
are no functional changes here, it's just moving code blocks around.

So, I have a few questions, particularly for Al, Andrew, and Linus:

(1) Do you find this reorganizing acceptable?

(2) I was planning to rebase this series next Friday and try to throw it
in at the end of the merge window; is that ok?  (The current patches are
based on 5.9, and applying them manually to current master and for-next
didn't show any new conflicts.)

(3) Can I just grab the copyrights from mm/filemap.c?  Or fs/read_write.c?
Or something entirely different?

If you're going to start using this mess, you probably ought to just
pull from my git trees, which are linked below.

This is an extraordinary way to destroy everything.  Enjoy!
Comments and questions are, as always, welcome.

--D

kernel git tree:
https://git.kernel.org/cgit/linux/kernel/git/djwong/xfs-linux.git/log/?h=vfs-rearrange-remap-helpers
---
 fs/Makefile|3 
 fs/read_write.c|  473 ---
 fs/remap_range.c   |  577 
 include/linux/fs.h |5 
 mm/filemap.c   |   81 ---
 5 files changed, 582 insertions(+), 557 deletions(-)
 create mode 100644 fs/remap_range.c



Re: [PATCH 0/3] l3mdev icmp error route lookup fixes

2020-10-14 Thread Jakub Kicinski
On Mon, 12 Oct 2020 10:50:13 -0400 Mathieu Desnoyers wrote:
> Here is a series of fixes for ipv4 and ipv6 which ensure the route
> lookup is performed on the right routing table in VRF configurations
> when sending TTL expired icmp errors (useful for traceroute).
> 
> It includes tests for both ipv4 and ipv6.
> 
> These fixes address specifically address the code paths involved in
> sending TTL expired icmp errors. As detailed in the individual commit
> messages, those fixes do not address similar icmp errors related to
> network namespaces and unreachable / fragmentation needed messages,
> which appear to use different code paths.

Applied, thank you!


fs/ext4/inode.o: warning: objtool: mpage_release_unused_pages()+0x22c: unreachable instruction

2020-10-14 Thread kernel test robot
tree:   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 
master
head:   37187df45af7d28d27b5c130c23f407ca9dbefa2
commit: 0887a7ebc97770c7870abf3075a2e8cd502a7f52 ubsan: add trap 
instrumentation option
date:   6 months ago
config: x86_64-randconfig-m001-20201015 (attached as .config)
compiler: gcc-9 (Debian 9.3.0-15) 9.3.0
reproduce (this is a W=1 build):
# 
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=0887a7ebc97770c7870abf3075a2e8cd502a7f52
git remote add linus 
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
git fetch --no-tags linus master
git checkout 0887a7ebc97770c7870abf3075a2e8cd502a7f52
# save the attached .config to linux build tree
make W=1 ARCH=x86_64 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot 

All warnings (new ones prefixed by >>):

>> fs/ext4/inode.o: warning: objtool: mpage_release_unused_pages()+0x22c: 
>> unreachable instruction
--
>> block/blk-flush.o: warning: objtool: blk_flush_complete_seq()+0x6f: 
>> unreachable instruction
--
>> drivers/virtio/virtio_ring.o: warning: objtool: virtqueue_add_split()+0x3a: 
>> unreachable instruction
--
>> drivers/ide/ide-probe.o: warning: objtool: ide_host_remove()+0x5f: 
>> unreachable instruction
--
>> net/core/flow_dissector.o: warning: objtool: skb_flow_dissector_init()+0x80: 
>> unreachable instruction
--
>> drivers/usb/host/isp116x-hcd.o: warning: objtool: postproc_atl_queue()+0x6c: 
>> unreachable instruction

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org


.config.gz
Description: application/gzip


Re: [PATCH 4/7] ASoC: arizona: Allow codecs to be selected from kernel config

2020-10-14 Thread kernel test robot
Hi Richard,

I love your patch! Perhaps something to improve:

[auto build test WARNING on asoc/for-next]
[also build test WARNING on robh/for-next sound/for-next v5.9 next-20201013]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:
https://github.com/0day-ci/linux/commits/Richard-Fitzgerald/Add-dts-for-Rpi4-Cirrus-Lochnagar-and-codecs/20201014-225648
base:   https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound.git 
for-next
config: xtensa-randconfig-s032-20201014 (attached as .config)
compiler: xtensa-linux-gcc (GCC) 9.3.0
reproduce:
wget 
https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O 
~/bin/make.cross
chmod +x ~/bin/make.cross
# apt-get install sparse
# sparse version: v0.6.3-rc1-dirty
# 
https://github.com/0day-ci/linux/commit/7ddf8ce197a5426e13fe9422a3ed17f0b02a94df
git remote add linux-review https://github.com/0day-ci/linux
git fetch --no-tags linux-review 
Richard-Fitzgerald/Add-dts-for-Rpi4-Cirrus-Lochnagar-and-codecs/20201014-225648
git checkout 7ddf8ce197a5426e13fe9422a3ed17f0b02a94df
# save the attached .config to linux build tree
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross C=1 
CF='-fdiagnostic-prefix -D__CHECK_ENDIAN__' ARCH=xtensa 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot 


"sparse warnings: (new ones prefixed by >>)"
>> sound/soc/codecs/wm5102.c:687:35: sparse: sparse: cast to restricted __be16

vim +687 sound/soc/codecs/wm5102.c

cc9e92431ee9c7f Charles Keepax 2014-06-06  677  
cc9e92431ee9c7f Charles Keepax 2014-06-06  678  static int 
wm5102_out_comp_coeff_put(struct snd_kcontrol *kcontrol,
cc9e92431ee9c7f Charles Keepax 2014-06-06  679  
 struct snd_ctl_elem_value *ucontrol)
cc9e92431ee9c7f Charles Keepax 2014-06-06  680  {
0fe1daa6663ae94 Kuninori Morimoto  2018-02-13  681  struct 
snd_soc_component *component = snd_soc_kcontrol_component(kcontrol);
0fe1daa6663ae94 Kuninori Morimoto  2018-02-13  682  struct arizona *arizona 
= dev_get_drvdata(component->dev->parent);
cc9e92431ee9c7f Charles Keepax 2014-06-06  683  
d74bcaaeb668261 Lars-Peter Clausen 2014-11-09  684  
mutex_lock(&arizona->dac_comp_lock);
cc9e92431ee9c7f Charles Keepax 2014-06-06  685  
memcpy(&arizona->dac_comp_coeff, ucontrol->value.bytes.data,
cc9e92431ee9c7f Charles Keepax 2014-06-06  686 
sizeof(arizona->dac_comp_coeff));
cc9e92431ee9c7f Charles Keepax 2014-06-06 @687  arizona->dac_comp_coeff 
= be16_to_cpu(arizona->dac_comp_coeff);
d74bcaaeb668261 Lars-Peter Clausen 2014-11-09  688  
mutex_unlock(&arizona->dac_comp_lock);
cc9e92431ee9c7f Charles Keepax 2014-06-06  689  
cc9e92431ee9c7f Charles Keepax 2014-06-06  690  return 0;
cc9e92431ee9c7f Charles Keepax 2014-06-06  691  }
cc9e92431ee9c7f Charles Keepax 2014-06-06  692  

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org


.config.gz
Description: application/gzip


Re: [PATCH -next] Revert "powerpc/pci: unmap legacy INTx interrupts when a PHB is removed"

2020-10-14 Thread Stephen Rothwell
Hi Oliver,

On Thu, 15 Oct 2020 10:00:49 +1100 "Oliver O'Halloran"  wrote:
>
> On Thu, Oct 15, 2020 at 5:28 AM Qian Cai  wrote:
> >
> > This reverts commit 3a3181e16fbde752007759f8759d25e0ff1fc425 which
> > causes memory corruptions on POWER9 NV.  
> 
> I was going to post this along with a fix for Cedric's original bug,
> but I can do that separately so:
> 
> Acked-by: Oliver O'Halloran 

I added that to linux-next today.

-- 
Cheers,
Stephen Rothwell


pgpNa5oljUaz7.pgp
Description: OpenPGP digital signature


Re: [PATCH v3 7/7] selftests/ftrace: Add test case for synthetic event syntax errors

2020-10-14 Thread Masami Hiramatsu
Hi Steve,

On Wed, 14 Oct 2020 13:32:15 -0400
Steven Rostedt  wrote:

> On Wed, 14 Oct 2020 11:06:36 +0900
> Masami Hiramatsu  wrote:
> 
> > Hi Tom,
> > 
> > On Tue, 13 Oct 2020 09:17:58 -0500
> > Tom Zanussi  wrote:
> > 
> > > Add a selftest that verifies that the syntax error messages and caret
> > > positions are correct for most of the possible synthetic event syntax
> > > error cases.
> > > 
> > > Signed-off-by: Tom Zanussi 
> > > ---
> > >  .../trigger-synthetic_event_syntax_errors.tc  | 19 +++
> > >  1 file changed, 19 insertions(+)
> > >  create mode 100644 
> > > tools/testing/selftests/ftrace/test.d/trigger/inter-event/trigger-synthetic_event_syntax_errors.tc
> > > 
> > > diff --git 
> > > a/tools/testing/selftests/ftrace/test.d/trigger/inter-event/trigger-synthetic_event_syntax_errors.tc
> > >  
> > > b/tools/testing/selftests/ftrace/test.d/trigger/inter-event/trigger-synthetic_event_syntax_errors.tc
> > > new file mode 100644
> > > index ..ada594fe16cb
> > > --- /dev/null
> > > +++ 
> > > b/tools/testing/selftests/ftrace/test.d/trigger/inter-event/trigger-synthetic_event_syntax_errors.tc
> > > @@ -0,0 +1,19 @@
> > > +#!/bin/sh
> > > +# SPDX-License-Identifier: GPL-2.0
> > > +# description: event trigger - test synthetic_events syntax parser errors
> > > +# requires: synthetic_events error_log  
> > 
> > This also requires dynamic strings support. So, its "requires" line should 
> > be
> > 
> > # requires: synthetic_events error_log "char name[]' >> 
> > synthetic_events":README
> > 
> > > +
> > > +check_error() { # command-with-error-pos-by-^
> > > +ftrace_errlog_check 'synthetic_events' "$1" 'synthetic_events'
> > > +}
> > > +  
> > 
> > BTW, some errors looks a bit odd.
> > 
> > > +check_error 'myevent ^chr arg'   # INVALID_TYPE
> > > +check_error 'myevent ^char str[];; int v'# INVALID_TYPE  
> > 
> > I think there is a wrong "void" argument between ";", instead of invalid 
> > type.
> > 
> > > +check_error 'myevent char ^str]; int v'  # INVALID_NAME
> > > +check_error 'myevent char ^str;[]'   # INVALID_NAME  
> > 
> > This is also not an invalid name but '[]' is an invalid type. 
> > 
> > > +check_error 'myevent ^char str[; int v'  # INVALID_TYPE
> > > +check_error '^mye;vent char str[]'   # BAD_NAME
> > > +check_error 'myevent char str[]; ^int'   # INVALID_FIELD  
> > 
> > Isn't it an incomplete command?
> > 
> > > +check_error '^myevent'   # INCOMPLETE_CMD
> > > +
> > > +exit 0  
> > 
> 
> Hi Masami,
> 
> I finished testing this series along with other patches (some from you),
> and I'm ready to push this to next, and hopefully soon to Linus. You have a
> "tested-by" for the entire series. Are you OK with this patch too? Can we
> push this forward and fix up any issues you have later?

I think this is OK to push at least for the upstream kernel (unless 
backporting).
The above issues can be fixed in another series :)

Thank you,

-- 
Masami Hiramatsu 


Re: [PATCH 2/2] nvmem: core: add OF_RECONFIG handler for nvmem cells

2020-10-14 Thread kernel test robot
Hi Michael,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on linus/master]
[also build test WARNING on v5.9 next-20201013]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:
https://github.com/0day-ci/linux/commits/Michael-Auchter/nvmem-add-DT-overlay-support-for-cells/20201015-054223
base:   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 
37187df45af7d28d27b5c130c23f407ca9dbefa2
config: i386-randconfig-s001-20201014 (attached as .config)
compiler: gcc-9 (Debian 9.3.0-15) 9.3.0
reproduce:
# apt-get install sparse
# sparse version: v0.6.3-rc1-dirty
# 
https://github.com/0day-ci/linux/commit/a3191767e8e4e0480e36126ce93e6ab41ab6f498
git remote add linux-review https://github.com/0day-ci/linux
git fetch --no-tags linux-review 
Michael-Auchter/nvmem-add-DT-overlay-support-for-cells/20201015-054223
git checkout a3191767e8e4e0480e36126ce93e6ab41ab6f498
# save the attached .config to linux build tree
make W=1 C=1 CF='-fdiagnostic-prefix -D__CHECK_ENDIAN__' ARCH=i386 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot 


"sparse warnings: (new ones prefixed by >>)"
>> drivers/nvmem/core.c:1674:23: sparse: sparse: symbol 'nvmem_of_notifier' was 
>> not declared. Should it be static?

Please review and possibly fold the followup patch.

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org


.config.gz
Description: application/gzip


[RFC PATCH] nvmem: core: nvmem_of_notifier can be static

2020-10-14 Thread kernel test robot


Signed-off-by: kernel test robot 
---
 core.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/nvmem/core.c b/drivers/nvmem/core.c
index 859431c15d5bc3..6dd79075fdd9b0 100644
--- a/drivers/nvmem/core.c
+++ b/drivers/nvmem/core.c
@@ -1671,7 +1671,7 @@ static int of_nvmem_notify(struct notifier_block *nb, 
unsigned long action,
return NOTIFY_OK;
 }
 
-struct notifier_block nvmem_of_notifier = {
+static struct notifier_block nvmem_of_notifier = {
.notifier_call = of_nvmem_notify,
 };
 #endif /* CONFIG_OF_DYNAMIC */


linux-next: manual merge of the risc-v and vfs trees with Linus' tree

2020-10-14 Thread Stephen Rothwell
Hi all,

Today's linux-next merge of the risc-v tree got a conflict in:

  arch/x86/lib/getuser.S

between commit:

  ea6f043fc984 ("x86: Make __get_user() generate an out-of-line call")

from Linus' tree and commit:

  47058bb54b57 ("x86: remove address space overrides using set_fs()")

from the risc-v and vfs trees.

I fixed it up (see below) and can carry the fix as necessary. This
is now fixed as far as linux-next is concerned, but any non trivial
conflicts should be mentioned to your upstream maintainer when your tree
is submitted for merging.  You may also want to consider cooperating
with the maintainer of the conflicting tree to minimise any particularly
complex conflicts.

-- 
Cheers,
Stephen Rothwell

diff --cc arch/x86/lib/getuser.S
index 2cd902e06062,2f052bc96866..
--- a/arch/x86/lib/getuser.S
+++ b/arch/x86/lib/getuser.S
@@@ -35,12 -35,19 +35,21 @@@
  #include 
  #include 
  
 +#define ASM_BARRIER_NOSPEC ALTERNATIVE "", "lfence", X86_FEATURE_LFENCE_RDTSC
 +
+ #ifdef CONFIG_X86_5LEVEL
+ #define LOAD_TASK_SIZE_MINUS_N(n) \
+   ALTERNATIVE __stringify(mov $((1 << 47) - 4096 - (n)),%rdx), \
+   __stringify(mov $((1 << 56) - 4096 - (n)),%rdx), 
X86_FEATURE_LA57
+ #else
+ #define LOAD_TASK_SIZE_MINUS_N(n) \
+   mov $(TASK_SIZE_MAX - (n)),%_ASM_DX
+ #endif
+ 
.text
  SYM_FUNC_START(__get_user_1)
-   mov PER_CPU_VAR(current_task), %_ASM_DX
-   cmp TASK_addr_limit(%_ASM_DX),%_ASM_AX
+   LOAD_TASK_SIZE_MINUS_N(0)
+   cmp %_ASM_DX,%_ASM_AX
jae bad_get_user
sbb %_ASM_DX, %_ASM_DX  /* array_index_mask_nospec() */
and %_ASM_DX, %_ASM_AX


pgpyyH8ucxugv.pgp
Description: OpenPGP digital signature


Re: [tip: locking/core] lockdep: Fix lockdep recursion

2020-10-14 Thread Paul E. McKenney
On Thu, Oct 15, 2020 at 12:39:54AM +0200, Peter Zijlstra wrote:
> On Wed, Oct 14, 2020 at 03:11:52PM -0700, Paul E. McKenney wrote:
> > On Wed, Oct 14, 2020 at 11:53:19PM +0200, Peter Zijlstra wrote:
> > > On Wed, Oct 14, 2020 at 11:34:05AM -0700, Paul E. McKenney wrote:
> > > > commit 7deaa04b02298001426730ed0e6214ac20d1a1c1
> > > > Author: Paul E. McKenney 
> > > > Date:   Tue Oct 13 12:39:23 2020 -0700
> > > > 
> > > > rcu: Prevent lockdep-RCU splats on lock acquisition/release
> > > > 
> > > > The rcu_cpu_starting() and rcu_report_dead() functions transition 
> > > > the
> > > > current CPU between online and offline state from an RCU 
> > > > perspective.
> > > > Unfortunately, this means that the rcu_cpu_starting() function's 
> > > > lock
> > > > acquisition and the rcu_report_dead() function's lock releases 
> > > > happen
> > > > while the CPU is offline from an RCU perspective, which can result 
> > > > in
> > > > lockdep-RCU splats about using RCU from an offline CPU.  In reality,
> > > > aside from the splats, both transitions are safe because a new grace
> > > > period cannot start until these functions release their locks.
> > > 
> > > But we call the trace_* crud before we acquire the lock. Are you sure
> > > that's a false-positive? 
> > 
> > You lost me on this one.
> > 
> > I am assuming that you are talking about rcu_cpu_starting(), because
> > that is the one where RCU is not initially watching, that is, the
> > case where tracing before the lock acquisition would be a problem.
> > You cannot be talking about rcu_cpu_starting() itself, because it does
> > not do any tracing before acquiring the lock.  But if you are talking
> > about the caller of rcu_cpu_starting(), then that caller should put the
> > rcu_cpu_starting() before the tracing.  But that would be the other
> > patch earlier in this thread that was proposing moving the call to
> > rcu_cpu_starting() much earlier in CPU bringup.
> > 
> > So what am I missing here?
> 
> rcu_cpu_starting();
>   raw_spin_lock_irqsave();
> local_irq_save();
> preempt_disable();
> spin_acquire()
>   lock_acquire()
> trace_lock_acquire() <--- *whoopsie-doodle*
> /* uses RCU for tracing */
> arch_spin_lock_flags() <--- the actual spinlock

Gah!  Idiot here left out the most important part, so good catch!!!
Much easier this way than finding out about it the hard way...

I should have asked myself harder questions earlier today about moving
the counter from the rcu_node structure to the rcu_data structure.

Perhaps something like the following untested patch on top of the
earlier patch?

Thanx, Paul



diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 286dc0a..8b5215e 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -1159,8 +1159,8 @@ bool rcu_lockdep_current_cpu_online(void)
preempt_disable_notrace();
rdp = this_cpu_ptr(&rcu_data);
rnp = rdp->mynode;
-   seq = READ_ONCE(rdp->ofl_seq) & ~0x1;
-   if (rdp->grpmask & rcu_rnp_online_cpus(rnp) || seq != 
READ_ONCE(rdp->ofl_seq))
+   seq = READ_ONCE(rnp->ofl_seq) & ~0x1;
+   if (rdp->grpmask & rcu_rnp_online_cpus(rnp) || seq != 
READ_ONCE(rnp->ofl_seq))
ret = true;
preempt_enable_notrace();
return ret;
@@ -1982,6 +1982,7 @@ static void rcu_gp_fqs_loop(void)
 static void rcu_gp_cleanup(void)
 {
int cpu;
+   unsigned long firstseq;
bool needgp = false;
unsigned long gp_duration;
unsigned long new_gp_seq;
@@ -2019,6 +2020,12 @@ static void rcu_gp_cleanup(void)
new_gp_seq = rcu_state.gp_seq;
rcu_seq_end(&new_gp_seq);
rcu_for_each_node_breadth_first(rnp) {
+   smp_mb(); // Pair with barriers used when updating ->ofl_seq to 
odd values.
+   firstseq = READ_ONCE(rnp->ofl_seq);
+   if (firstseq & 0x1)
+   while (firstseq == smp_load_acquire(&rnp->ofl_seq))
+   schedule_timeout_idle(1);  // Can't wake unless 
RCU is watching.
+   smp_mb(); // Pair with barriers used when updating ->ofl_seq to 
even values.
raw_spin_lock_irq_rcu_node(rnp);
if (WARN_ON_ONCE(rcu_preempt_blocked_readers_cgp(rnp)))
dump_blkd_tasks(rnp, 10);
@@ -4067,8 +4074,9 @@ void rcu_cpu_starting(unsigned int cpu)
 
rnp = rdp->mynode;
mask = rdp->grpmask;
-   WRITE_ONCE(rdp->ofl_seq, rdp->ofl_seq + 1);
-   WARN_ON_ONCE(!(rdp->ofl_seq & 0x1));
+   WRITE_ONCE(rnp->ofl_seq, rnp->ofl_seq + 1);
+   WARN_ON_ONCE(!(rnp->ofl_seq & 0x1));
+   smp_mb(); // Pair with rcu_gp_cleanup()'s ->ofl_seq barrier().
raw_spin_lock_irqsave_rcu_node(rnp, flags);
WRITE_ONCE(rnp->qsmaskinitnext, rnp->qsmaskinitnext | mask);
 

Re: [PATCH 20/20] arch: dts: Fix DWC USB3 DT nodes name

2020-10-14 Thread Serge Semin
On Wed, Oct 14, 2020 at 10:04:32PM +0200, Krzysztof Kozlowski wrote:
> On Wed, 14 Oct 2020 at 19:16, Serge Semin
>  wrote:
> >
> > On Wed, Oct 14, 2020 at 12:33:25PM +0200, Krzysztof Kozlowski wrote:
> > > On Wed, 14 Oct 2020 at 12:23, Serge Semin
> > >  wrote:
> > > >
> > > > In accordance with the DWC USB3 bindings the corresponding node name is
> > > > suppose to comply with Generic USB HCD DT schema, which requires the USB
> > > > nodes to have the name acceptable by the regexp: "^usb(@.*)?" . But a 
> > > > lot
> > > > of the DWC USB3-compatible nodes defined in the ARM/ARM64 DTS files have
> > > > name as "^dwc3@.*" or "^usb[1-3]@.*" or even "^dwusb@.*", which will 
> > > > cause
> > > > the dtbs_check procedure failure. Let's fix the nodes naming to be
> > > > compatible with the DWC USB3 DT schema to make dtbs_check happy.
> > > >
> > > > Note we don't change the DWC USB3-compatible nodes names of
> > > > arch/arm64/boot/dts/apm/{apm-storm.dtsi,apm-shadowcat.dtsi} since the
> > > > in-source comment says that the nodes name need to be preserved as
> > > > "^dwusb@.*" for some backward compatibility.
> > > >
> > > > Signed-off-by: Serge Semin 
> > > >
> > > > ---
> > > >
> > > > Please, test the patch out to make sure it doesn't brake the dependent 
> > > > DTS
> > > > files. I did only a manual grepping of the possible nodes dependencies.
> > >
> >
> > > 1. It is you who should compare the decompiled DTS, not us. For example:
> > > $ for i in dts-old/*/*dtb dts-old/*/*/*dtb; do echo $i; crosc64
> > > scripts/dtc/dtx_diff ${i} dts-new/${i#dts-old/} ; done
> > >
> > > $ for i in dts-old/*/*dtb dts-old/*/*/*dtb; do echo $i; crosc64
> > > fdtdump ${i} > ${i}.fdt ; crosc64 fdtdump dts-new/${i#dts-old/} >
> > > dts-new/${i#dts-old/}.fdt ; diff -ubB ${i}.fdt
> > > dts-new/${i#dts-old/}.fdt ; done
> >
> > So basically you suggest first to compile the old and new dts files, then to
> > de-compile them, then diff old and new fdt's, and visually compare the 
> > results.
> > Personally it isn't that much better than what I did, since each old and new
> > dtbs will for sure differ due to the node names change suggested in this 
> > patch.
> > So it will lead to the visual debugging too, which isn't that effective. But
> > your approach is still more demonstrative to make sure that I didn't loose 
> > any
> > nodes redefinition, since in the occasion the old and new de-compiled nodes 
> > will
> > differ not only by the node names but with an additional old named node.
> 

> My suggestion is to compare the entire, effective DTS after all
> inclusions. Maybe you did it already, I don't know.

Only by grepping the dts'es, which include the dtsi'es modified in this patch.
So your suggestion of compiling and de-compiling has been indeed relevant.

> The point is that
> when you change node names in DTSI but you miss one in DTS, you end up
> with two nodes.

Yep, that's exactly what I meant when I said that your approach was more
demonstrative, etc.

> This is much easier to spot with dtxdiff or with
> fdtdump (which behaves better for node moves).
> 

> Indeed it is still a visual comparison - if you have any ideas how to
> automate it (e.g. ignore phandle changes), please share. It would
> solve my testings as well.

Alas I don't. That's why to save my time of coming up with an automated solution
I did a very thorough modification making sure that each affected node isn't
updated in the corresponding dts'es and asked to test the patches out.

Anyway the approach suggested by you will indeed give us a better understanding
of my patches correctness. So I'll use it before sending v3. Thanks.

> But asking others to test because you do
> not want to check it is not the best way to handle such changes.
> 
> >
> > So to speak thanks for suggesting it. I'll try it to validate the proposed
> > changes.
> >
> > Two questions:
> > 1) Any advise of a good inliner/command to compile all dtbs at once? Of 
> > course I
> > can get all the updated dtsi'es, then find out all the dts'es which include
> > them, then directly use dtc to compile the found dts'es... On the other 
> > hand I
> > can just compile all dts'es, then compare old and new ones. The diff of the
> > non-modified dtb'es will be just empty...
> 

> make dtbs

It's not that easy.) "make dtbs" will build dtbs only for enabled boards, which
first need to be enabled in the kernel config. So I'll need to have a config
with all the affected dts. The later is the same as if I just found all the
affected dts and built them one-by-one by directly calling dtc.

> touch your dts or git stash pop
> make dtbs
> compare
> diff for all unchanged will be simply empty, so easy to spot
> 
> > 2) What crosc64 is?
> 
> Ah, just an alias for cross compiling + ccache + kbuild out. I just
> copied you my helpers, so you need to tweak them.
> 
> >
> > >
> > > 2. Split it per arm architectures (and proper subject prefix - not
> > > "arch") and subarchitectures so maintainers can pick it 

Re: [patch 1/2] nohz: only wakeup a single target cpu when kicking a task

2020-10-14 Thread Frederic Weisbecker
On Wed, Oct 14, 2020 at 10:33:21AM +0200, Peter Zijlstra wrote:
> On Tue, Oct 13, 2020 at 02:13:28PM -0300, Marcelo Tosatti wrote:
> 
> > > Yes but if the task isn't running, run_posix_cpu_timers() doesn't have
> > > anything to elapse. So indeed we can spare the IPI if the task is not
> > > running. Provided ordering makes sure that the task sees the new 
> > > dependency
> > > when it schedules in of course.
> > 
> > True.
> > 
> >  * p->on_cpu <- { 0, 1 }:
> >  *
> >  *   is set by prepare_task() and cleared by finish_task() such that it 
> > will be
> >  *   set before p is scheduled-in and cleared after p is scheduled-out, both
> >  *   under rq->lock. Non-zero indicates the task is running on its CPU.
> > 
> > 
> > CPU-0 (tick_set_dep)CPU-1 (task switch)
> > 
> > STORE p->tick_dep_mask
> > smp_mb() (atomic_fetch_or())
> > LOAD p->on_cpu
> > 
> > 
> > context_switch(prev, next)
> > STORE next->on_cpu = 1
> > ... [*]
> > 
> > LOAD current->tick_dep_mask
> > 
> 
> That load is in tick_nohz_task_switch() right? (which BTW is placed
> completely wrong) You could easily do something like the below I
> suppose.
> 
> diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
> index 81632cd5e3b7..2a5fafe66bb0 100644
> --- a/kernel/time/tick-sched.c
> +++ b/kernel/time/tick-sched.c
> @@ -410,6 +410,14 @@ void __tick_nohz_task_switch(void)
>   ts = this_cpu_ptr(&tick_cpu_sched);
>  
>   if (ts->tick_stopped) {
> + /*
> +  * tick_set_dep()   (this)
> +  *
> +  * STORE p->tick_dep_mask   STORE p->on_cpu
> +  * smp_mb() smp_mb()
> +  * LOAD p->on_cpu   LOAD p->tick_dep_mask
> +  */
> + smp_mb();
>   if (atomic_read(¤t->tick_dep_mask) ||
>   atomic_read(¤t->signal->tick_dep_mask))
>   tick_nohz_full_kick();

It would then need to be unconditional (whatever value of ts->tick_stopped).
Assuming the tick isn't stopped, we may well have an interrupt firing right
after schedule() which doesn't see the new value of tick_dep_map.

Alternatively, we could rely on p->on_rq which is set to TASK_ON_RQ_QUEUED
at wake up time, prior to the schedule() full barrier. Of course that doesn't
mean that the task is actually the one running on the CPU but it's a good sign,
considering that we are running in nohz_full mode and it's usually optimized
for single task mode.

Also setting a remote task's tick dependency is only used by posix cpu timer
in case the user has the bad taste to enqueue on a task running in nohz_full
mode. It shouldn't deserve an unconditional full barrier in the schedule path.

If the target is current, as is used by RCU, I guess we can keep a special
treatment.

> re tick_nohz_task_switch() being placed wrong, it should probably be
> placed before finish_lock_switch(). Something like so.
> 
> 
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index cf044580683c..5c92c959824f 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -4084,6 +4084,7 @@ static struct rq *finish_task_switch(struct task_struct 
> *prev)
>   vtime_task_switch(prev);
>   perf_event_task_sched_in(prev, current);
>   finish_task(prev);
> + tick_nohz_task_switch();
>   finish_lock_switch(rq);
>   finish_arch_post_lock_switch();
>   kcov_finish_switch(current);
> @@ -4121,7 +4122,6 @@ static struct rq *finish_task_switch(struct task_struct 
> *prev)
>   put_task_struct_rcu_user(prev);
>   }
>  
> - tick_nohz_task_switch();

IIRC, we wanted to keep it outside rq lock because it shouldn't it...


Re: [PATCH v3 3/3] clk: qcom: lpasscc-sc7180: Re-configure the PLL in case lost

2020-10-14 Thread Stephen Boyd
Quoting Douglas Anderson (2020-10-14 14:05:23)
> diff --git a/drivers/clk/qcom/lpasscorecc-sc7180.c 
> b/drivers/clk/qcom/lpasscorecc-sc7180.c
> index 48d370e2108e..e12d4c2b1b70 100644
> --- a/drivers/clk/qcom/lpasscorecc-sc7180.c
> +++ b/drivers/clk/qcom/lpasscorecc-sc7180.c
> @@ -388,6 +388,25 @@ static int lpass_create_pm_clks(struct platform_device 
> *pdev)
> return ret;
>  }
>  
> +static int lpass_core_cc_pm_clk_resume(struct device *dev)
> +{
> +   struct regmap *regmap = dev_get_regmap(dev, "lpass_core_cc");

Please make "lpass_core_cc" a static const pointer in this driver so
that it can be used here and when the regmap is made so that we're
certain they match.

> +   unsigned int l_val;
> +   int ret;
> +
> +   ret = pm_clk_resume(dev);
> +   if (ret)
> +   return ret;
> +
> +   /* If PLL_L_VAL was cleared then we should re-init the whole PLL */
> +   regmap_read(regmap, 0x1004, &l_val);
> +   if (!l_val)
> +   clk_fabia_pll_configure(&lpass_lpaaudio_dig_pll, regmap,
> +   &lpass_lpaaudio_dig_pll_config);
> +
> +   return 0;
> +}
> +
>  static int lpass_core_cc_sc7180_probe(struct platform_device *pdev)
>  {
> const struct qcom_cc_desc *desc;


Re: [PATCH] net: phy: Prevent reporting advertised modes when autoneg is off

2020-10-14 Thread Lukasz Stelmach
It was <2020-10-14 śro 23:04>, when Russell King - ARM Linux admin wrote:
> On Wed, Oct 14, 2020 at 04:39:47PM +0200, Lukasz Stelmach wrote:
>> It was <2020-10-14 śro 14:32>, when Russell King - ARM Linux admin wrote:
>> > In any case, the mii.c code does fill in the advertising mask even
>> > when autoneg is disabled, because, rightly or wrongly, the advertising
>> > mask contains more than just the link modes, it includes the
>> > interface(s) as well. Your change means phylib no longer reports the
>> > interface modes which is at odds with the mii.c code.
>> 
>> I am afraid you are wrong. There is a rather big if()[1], which
>> depending on AN beeing enabled fills more or less information. Yes this
>> if() looks like it has been yanked from mii_ethtool_gset(). When
>> advertising is converted and copied to cmd->link_modes.advertising at
>> the end of mii_ethtool_get_link_ksettings() it is 0[2] if autonegotiation
>> is disabled.
>> 
>> [1]
>> https://protect2.fireeye.com/v1/url?k=d1d33944-8c07852c-d1d2b20b-0cc47a3356b2-cede44db8a43e7c3&q=1&e=d1dea895-d1df-46b9-8413-ac329b0bbfa7&u=https%3A%2F%2Felixir.bootlin.com%2Flinux%2Fv5.9%2Fsource%2Fdrivers%2Fnet%2Fmii.c%23L174
>> [2]
>> https://protect2.fireeye.com/v1/url?k=c840e942-9594552a-c841620d-0cc47a3356b2-510c0749446ea7df&q=1&e=d1dea895-d1df-46b9-8413-ac329b0bbfa7&u=https%3A%2F%2Felixir.bootlin.com%2Flinux%2Fv5.9%2Fsource%2Fdrivers%2Fnet%2Fmii.c%23L215
>
> I'm very sorry, but I have to disagree.  I'll quote the code here:
>
> advertising = ADVERTISED_TP | ADVERTISED_MII;
>
>   // This is your big if()
> if (bmcr & BMCR_ANENABLE) {
>   advertising |= ADVERTISED_Autoneg;
>   ...
>   } else {
>   ...
>   }
>
>   ethtool_convert_legacy_u32_to_link_mode(cmd->link_modes.advertising,
>   advertising);
>
> So, when AN is disabled, we still end up with TP and MII in the
> advertised link modes. I call TP and MII "interface modes" above
> to separate them from the "link modes" that describe the speed and
> duplex etc.

Yes.

> Note that only lp_advertising is zeroed in the "else" clause of
> the above "if" statement - advertising remains as-is with TP and MII
> set.

Yes.

> Your patch, on the other hand, merely avoids setting anything in
> cmd->link_modes.advertising, which means we do not end up with the
> "interface modes".
>
> I hope that this helps you see my point.

Yes.

Thanks.

tl;dr: v2 is coming.

Let's take a look at what ethtool prints. With AN enabled

--8<---cut here---start->8---
Settings for enx00249b14f06e:
Supported ports: [ TP MII ]
Supported link modes:   10baseT/Half 10baseT/Full 
100baseT/Half 100baseT/Full 
1000baseT/Half 1000baseT/Full 
Supported pause frame use: No
Supports auto-negotiation: Yes
Supported FEC modes: Not reported
Advertised link modes:  10baseT/Half 10baseT/Full 
100baseT/Half 100baseT/Full 
1000baseT/Full 
Advertised pause frame use: Symmetric
Advertised auto-negotiation: Yes

[...]

Link detected: yes
--8<---cut here---end--->8---

with AN disabled

--8<---cut here---start->8---
Settings for enx00249b14f06e:
Supported ports: [ TP MII ]
Supported link modes:   10baseT/Half 10baseT/Full 
100baseT/Half 100baseT/Full 
1000baseT/Half 1000baseT/Full 
Supported pause frame use: No
Supports auto-negotiation: Yes
Supported FEC modes: Not reported
Advertised link modes:  Not reported
Advertised pause frame use: No
Advertised auto-negotiation: No

[...]

Link detected: yes
--8<---cut here---end--->8---

The driver is ax88179_178a which calls mii_ethtool_get_link_ksettings()

The first line that differs is "Advertised link modes". Apparently[1]
ethtool does not consider ADVERTISED_TP and ADVERTISED_MII as
interesting modes. Indeed I should add an else clause to clear
cmd->link_modes.advertising (and lp_advertising) and set only
ETHTOOL_LINK_MODE_TP_BIT and ETHTOOL_LINK_MODE_MII_BIT.


[1] 
https://git.kernel.org/pub/scm/network/ethtool/ethtool.git/tree/ethtool.c?h=v5.8#n656
-- 
Łukasz Stelmach
Samsung R&D Institute Poland
Samsung Electronics


signature.asc
Description: PGP signature


Re: [PATCH net-next v5 08/10] bridge: cfm: Netlink GET configuration Interface.

2020-10-14 Thread Jakub Kicinski
On Mon, 12 Oct 2020 14:04:26 + Henrik Bjoernlund wrote:
> + if (nla_put_u32(skb, IFLA_BRIDGE_CFM_CC_CCM_TX_SEQ_NO_UPDATE,
> + mep->cc_ccm_tx_info.seq_no_update))
> + goto nla_put_failure;
> +
> + if (nla_put_u32(skb, IFLA_BRIDGE_CFM_CC_CCM_TX_PERIOD,
> + mep->cc_ccm_tx_info.period))
> + goto nla_put_failure;
> +
> + if (nla_put_u32(skb, IFLA_BRIDGE_CFM_CC_CCM_TX_IF_TLV,
> + mep->cc_ccm_tx_info.if_tlv))
> + goto nla_put_failure;
> +
> + if (nla_put_u8(skb, IFLA_BRIDGE_CFM_CC_CCM_TX_IF_TLV_VALUE,
> +mep->cc_ccm_tx_info.if_tlv_value))
> + goto nla_put_failure;
> +
> + if (nla_put_u32(skb, IFLA_BRIDGE_CFM_CC_CCM_TX_PORT_TLV,
> + mep->cc_ccm_tx_info.port_tlv))
> + goto nla_put_failure;
> +
> + if (nla_put_u8(skb, IFLA_BRIDGE_CFM_CC_CCM_TX_PORT_TLV_VALUE,
> +mep->cc_ccm_tx_info.port_tlv_value))
> + goto nla_put_failure;

Consider collapsing writing related attrs in a nest into a single
if statement:

if (nla_put_u32(skb, IFLA_BRIDGE_CFM_CC_CCM_TX_SEQ_NO_UPDATE,
mep->cc_ccm_tx_info.seq_no_update) ||
nla_put_u32(skb, IFLA_BRIDGE_CFM_CC_CCM_TX_PERIOD,
mep->cc_ccm_tx_info.period) ||
...


Re: [PATCH v3 2/3] clk: qcom: lpass-sc7180: Disentangle the two clock devices

2020-10-14 Thread Stephen Boyd
Quoting Doug Anderson (2020-10-14 16:07:52)
> Hi,
> 
> On Wed, Oct 14, 2020 at 4:00 PM Stephen Boyd  wrote:
> >
> > Quoting Doug Anderson (2020-10-14 15:28:58)
> > > Hi,
> > >
> > > On Wed, Oct 14, 2020 at 3:10 PM Stephen Boyd  wrote:
> > > >
> > > > Quoting Douglas Anderson (2020-10-14 14:05:22)
> > > > > diff --git a/drivers/clk/qcom/lpasscorecc-sc7180.c 
> > > > > b/drivers/clk/qcom/lpasscorecc-sc7180.c
> > > > > index abcf36006926..48d370e2108e 100644
> > > > > --- a/drivers/clk/qcom/lpasscorecc-sc7180.c
> > > > > +++ b/drivers/clk/qcom/lpasscorecc-sc7180.c
> > > > > @@ -356,12 +356,48 @@ static const struct qcom_cc_desc 
> > > > > lpass_audio_hm_sc7180_desc = {
> > > > > .num_gdscs = ARRAY_SIZE(lpass_audio_hm_sc7180_gdscs),
> > > > >  };
> > > > >
> > > > > +static void lpass_pm_runtime_disable(void *data)
> > > > > +{
> > > > > +   pm_runtime_disable(data);
> > > > > +}
> > > > > +
> > > > > +static void lapss_pm_clk_destroy(void *data)
> > > > > +{
> > > > > +   pm_clk_destroy(data);
> > > > > +}
> > > >
> > > > Why are these helpers added again? And do we even need them? Can't we
> > > > just pass pm_runtime_disable or pm_clk_destroy to the
> > > > devm_add_action_or_reset() second parameter?
> > >
> > > Unfortunately, we can't due to the C specification.  Take a look at
> > > all the other users of devm_add_action_or_reset() and they all have
> > > pretty much the same stupid thing.
> >
> > Ok, but we don't need two of the same functions, right?
> 
> How would you write it more cleanly? 

Oh I see I'm making it confusing. Patch 1 has two functions for
pm_runtime_disable() and pm_clk_destroy(), called
lpass_pm_runtime_disable() and lapss_pm_clk_destroy() respectively
(please fix the lapss typo regardless).

Then this patch seems to introduce them again, but really the diff is
getting confused and it looks like the functions are introduced again.
Can you move them to this location (or at least near it) in the first
patch so that this doesn't look like they're being introduced again?

> > > ...actually, do we even need the runtime_disable in the error path?
> > > When the dev goes away does it matter if you left pm_runtime enabled
> > > on it?
> > >
> >
> > I don't know. The device isn't destroyed but maybe when the driver is
> > unbound it resets the runtime PM counters?
> 
> Certainly it seems safest just to do it...
> 

Can you confirm? I'd rather not carry extra code.


Re: [PATCH net-next v5 04/10] bridge: cfm: Kernel space implementation of CFM. MEP create/delete.

2020-10-14 Thread Jakub Kicinski
On Mon, 12 Oct 2020 14:04:22 + Henrik Bjoernlund wrote:
> + if (config->mdlevel > 7) {
> + NL_SET_ERR_MSG_MOD(extack,
> +"MD level is invalid");
> + return -EINVAL;
> + }
> + /* The MEP-ID is a 13 bit field in the CCM PDU identifying the MEP */
> + if (config->mepid > 0x1FFF) {
> + NL_SET_ERR_MSG_MOD(extack,
> +"MEP-ID is invalid");
> + return -EINVAL;
> + }

If I'm reading patch 7 correctly these parameters come from netlink,
right? In that case please use the netlink policies to check things
like ranges or correct values.


Re: [PATCH net-next v5 06/10] bridge: cfm: Kernel space implementation of CFM. CCM frame RX added.

2020-10-14 Thread Jakub Kicinski
On Mon, 12 Oct 2020 14:04:24 + Henrik Bjoernlund wrote:
> + /* This CCM related status is based on the latest received CCM PDU. */
> + u8 port_tlv_value; /* Port Status TLV value */
> + u8 if_tlv_value; /* Interface Status TLV value */
> +
> + /* CCM has not been received for 3.25 intervals */
> + bool ccm_defect;
> +
> + /* (RDI == 1) for last received CCM PDU */
> + bool rdi;
> +
> + /* Indications that a CCM PDU has been seen. */
> + bool seen; /* CCM PDU received */
> + bool tlv_seen; /* CCM PDU with TLV received */
> + /* CCM PDU with unexpected sequence number received */
> + bool seq_unexp_seen;

Please consider using a u8 bitfield rather than a bunch of bools,
if any of this structures are expected to have many instances. 
That'd save space.


Re: [GIT PULL] SPDX patches for 5.10-rc1

2020-10-14 Thread pr-tracker-bot
The pull request you sent on Wed, 14 Oct 2020 19:52:16 +0200:

> git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/spdx.git 
> tags/spdx-5.10-rc1

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/3e4fb4346c781068610d03c12b16c0cfb0fd24a3

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html


Re: [GIT PULL] TTY/Serial driver patches for 5.10-rc1

2020-10-14 Thread pr-tracker-bot
The pull request you sent on Wed, 14 Oct 2020 19:48:26 +0200:

> git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty.git tags/tty-5.10-rc1

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/5d6c413c92a3e6fc9399141891147d0d826517c9

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html


Re: [GIT PULL] Driver core patches for 5.10-rc1

2020-10-14 Thread pr-tracker-bot
The pull request you sent on Wed, 14 Oct 2020 19:48:51 +0200:

> git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core.git 
> tags/driver-core-5.10-rc1

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/fe151462bd0f7ad0e758f1cdcbeb6426e3d1ee8e

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html


Re: [PATCH net-next v5 06/10] bridge: cfm: Kernel space implementation of CFM. CCM frame RX added.

2020-10-14 Thread Jakub Kicinski
On Mon, 12 Oct 2020 14:04:24 + Henrik Bjoernlund wrote:
> +struct br_cfm_status_tlv {
> + __u8 type;
> + __be16 length;
> + __u8 value;
> +};

This structure is unused (and likely not what you want, since it will
have 2 1 byte while unless you mark length as __packed).


Re: [GIT PULL] MFD for v5.10

2020-10-14 Thread pr-tracker-bot
The pull request you sent on Wed, 14 Oct 2020 17:09:07 +0100:

> git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd.git tags/mfd-next-5.10

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/1a31c12371556bfbe511edd268dab721b504d511

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html


Re: [GIT PULL] IPMI bug fixes for 5.10

2020-10-14 Thread pr-tracker-bot
The pull request you sent on Tue, 13 Oct 2020 10:56:49 -0500:

> https://github.com/cminyard/linux-ipmi.git tags/for-linus-5.10-1

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/6e4dc3d59284ea3bc7c3e40694bce84d988b01af

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html


Re: [GIT PULL] cgroup changes for v5.10-rc1

2020-10-14 Thread pr-tracker-bot
The pull request you sent on Tue, 13 Oct 2020 10:07:03 -0400:

> git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup.git for-5.10

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/2f6c6d0891b472bbda70dfcc51fbb3147b6f54a0

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html


Re: [GIT PULL] first round of SCSI updates for the 5.8+ merge window

2020-10-14 Thread pr-tracker-bot
The pull request you sent on Tue, 13 Oct 2020 15:53:46 -0700:

> git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git scsi-misc

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/55e0500eb5c0440a3d43074edbd8db3e95851b66

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html


Re: [GIT PULL] LEDs changes for v5.10-rc1

2020-10-14 Thread pr-tracker-bot
The pull request you sent on Wed, 14 Oct 2020 13:05:56 +0200:

> git://git.kernel.org/pub/scm/linux/kernel/git/pavel/linux-leds.git/ 
> tags/leds-5.10-rc1

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/7fafb54c7d390e9b273a1d7d377e38d9c408046e

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html


[PATCH] powerpc32: don't adjust unmoved stack pointer in csum_partial_copy_generic() epilogue

2020-10-14 Thread Jason A. Donenfeld
A recent change to the checksum code removed usage of some extra
arguments, alongside with storage on the stack for those, and the stack
pointer no longer needed to be adjusted in the function prologue. But, a
left over subtraction wasn't removed in the function epilogue, causing
the function to return with the stack pointer moved 16 bytes away from
where it should have. This corrupted local state and lead to weird
crashes. This commit simply removes the leftover instruction from the
epilogue.

Fixes: 70d65cd555c5 ("ppc: propagate the calling conventions change down to 
csum_partial_copy_generic()")
Cc: Al Viro 
Signed-off-by: Jason A. Donenfeld 
---
 arch/powerpc/lib/checksum_32.S | 1 -
 1 file changed, 1 deletion(-)

diff --git a/arch/powerpc/lib/checksum_32.S b/arch/powerpc/lib/checksum_32.S
index ec5cd2dede35..27d9070617df 100644
--- a/arch/powerpc/lib/checksum_32.S
+++ b/arch/powerpc/lib/checksum_32.S
@@ -236,7 +236,6 @@ _GLOBAL(csum_partial_copy_generic)
slwir0,r0,8
adder12,r12,r0
 66:addze   r3,r12
-   addir1,r1,16
beqlr+  cr7
rlwinm  r3,r3,8,0,31/* odd destination address: rotate one byte */
blr
-- 
2.28.0



Re: [GIT PULL] pin control changes for the v5.10 kernel series

2020-10-14 Thread pr-tracker-bot
The pull request you sent on Wed, 14 Oct 2020 14:58:20 +0200:

> git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl.git 
> tags/pinctrl-v5.10-1

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/b4e1bce85fd8f43dc814049e2641cc6beaa8146b

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html


Re: [GIT PULL] Backlight for v5.10

2020-10-14 Thread pr-tracker-bot
The pull request you sent on Wed, 14 Oct 2020 17:10:40 +0100:

> git://git.kernel.org/pub/scm/linux/kernel/git/lee/backlight.git 
> tags/backlight-next-5.10

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/6448cbf662c7858c0d9eb0b135962bedd6d0b9a0

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html


Re: [GIT PULL] Devicetree updates for v5.10

2020-10-14 Thread pr-tracker-bot
The pull request you sent on Wed, 14 Oct 2020 10:53:04 -0500:

> git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux.git 
> tags/devicetree-for-5.10

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/f888bdf9823c85fe945c4eb3ba353f749dec3856

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html


Re: [PATCH net-next v5 04/10] bridge: cfm: Kernel space implementation of CFM. MEP create/delete.

2020-10-14 Thread Jakub Kicinski
On Mon, 12 Oct 2020 14:04:22 + Henrik Bjoernlund wrote:
> with restricted management access to each other<80><99>s equipment.

Some Unicode funk in this line?


Re: [PATCH v3 2/3] clk: qcom: lpass-sc7180: Disentangle the two clock devices

2020-10-14 Thread Stephen Boyd
Quoting Doug Anderson (2020-10-14 15:28:58)
> Hi,
> 
> On Wed, Oct 14, 2020 at 3:10 PM Stephen Boyd  wrote:
> >
> > Quoting Douglas Anderson (2020-10-14 14:05:22)
> > > diff --git a/drivers/clk/qcom/lpasscorecc-sc7180.c 
> > > b/drivers/clk/qcom/lpasscorecc-sc7180.c
> > > index abcf36006926..48d370e2108e 100644
> > > --- a/drivers/clk/qcom/lpasscorecc-sc7180.c
> > > +++ b/drivers/clk/qcom/lpasscorecc-sc7180.c
> > > @@ -356,12 +356,48 @@ static const struct qcom_cc_desc 
> > > lpass_audio_hm_sc7180_desc = {
> > > .num_gdscs = ARRAY_SIZE(lpass_audio_hm_sc7180_gdscs),
> > >  };
> > >
> > > +static void lpass_pm_runtime_disable(void *data)
> > > +{
> > > +   pm_runtime_disable(data);
> > > +}
> > > +
> > > +static void lapss_pm_clk_destroy(void *data)
> > > +{
> > > +   pm_clk_destroy(data);
> > > +}
> >
> > Why are these helpers added again? And do we even need them? Can't we
> > just pass pm_runtime_disable or pm_clk_destroy to the
> > devm_add_action_or_reset() second parameter?
> 
> Unfortunately, we can't due to the C specification.  Take a look at
> all the other users of devm_add_action_or_reset() and they all have
> pretty much the same stupid thing. 

Ok, but we don't need two of the same functions, right?

> ...actually, do we even need the runtime_disable in the error path?
> When the dev goes away does it matter if you left pm_runtime enabled
> on it?
> 

I don't know. The device isn't destroyed but maybe when the driver is
unbound it resets the runtime PM counters?


Re: [PATCH net-next v5 05/10] bridge: cfm: Kernel space implementation of CFM. CCM frame TX added.

2020-10-14 Thread Jakub Kicinski
On Mon, 12 Oct 2020 14:04:23 + Henrik Bjoernlund wrote:
> + skb = dev_alloc_skb(CFM_CCM_MAX_FRAME_LENGTH);
> + if (!skb)
> + return NULL;
> +
> + rcu_read_lock();
> + b_port = rcu_dereference(mep->b_port);
> + if (!b_port) {
> + rcu_read_unlock();
> + return NULL;
> + }

At a quick scan I noticed you appear to be leaking the skb here.
So let me point out some more nit picks.


Re: [PATCH net-next v5 00/10] net: bridge: cfm: Add support for Connectivity Fault Management(CFM)

2020-10-14 Thread Jakub Kicinski
On Mon, 12 Oct 2020 14:04:18 + Henrik Bjoernlund wrote:
> Connectivity Fault Management (CFM) is defined in 802.1Q section 12.14.
> 
> Connectivity Fault Management (CFM) comprises capabilities for detecting, 
> verifying,
> and isolating connectivity failures in Virtual Bridged Networks.
> These capabilities can be used in networks operated by multiple independent 
> organizations,
> each with restricted management access to each other’s equipment.

Please wrap the cover letter and commit messages to 70 chars.

> Reviewed-by: Horatiu Vultur  
> Signed-off-by: Henrik Bjoernlund  

You have two spaces after the name in many tags.


Re: [PATCH v2 20/20] ppc: propagate the calling conventions change down to csum_partial_copy_generic()

2020-10-14 Thread Jason A. Donenfeld
On Thu, Oct 15, 2020 at 12:53 AM Linus Torvalds
 wrote:
>
> On Wed, Oct 14, 2020 at 3:51 PM Linus Torvalds
>  wrote:
> >
> > I think it's this instruction:
> >
> > addir1,r1,16
> >
> > that should be removed from the function exit, because Al removed the
> >
> > -   stwur1,-16(r1)
> >
> > on function entry.
> >
> > So I think you end up with a corrupt stack pointer and basically
> > random behavior.
> >
> > Mind trying that? (This is obviously all in
> > arch/powerpc/lib/checksum_32.S, the csum_partial_copy_generic()
> > function).
>
> Patch attached to make it easier to test.
>
> NOTE! This is ENTIRELY untested. My ppc asm is so rusty that I might
> be barking entirely up the wrong tree, and I just made things much
> worse.

Indeed - exactly the same thing that's in my tree. Writing a commit
message for it now.

Jason


Re: [PATCH v2 20/20] ppc: propagate the calling conventions change down to csum_partial_copy_generic()

2020-10-14 Thread Jason A. Donenfeld
On Thu, Oct 15, 2020 at 12:51 AM Linus Torvalds
 wrote:
>
> On Wed, Oct 14, 2020 at 3:27 PM Jason A. Donenfeld  wrote:
> >
> > This patch is causing crashes in WireGuard's CI over at
> > https://www.wireguard.com/build-status/ . Apparently sending a simple
> > network packet winds up triggering refcount_t's warn-on-saturate code. I
>
> Ouch.
>
> The C parts look fairly straightforward, and I don't see how they
> could cause that odd refcount issue.
>
> So I assume it's the low-level asm code conversion that is buggy. And
> it's apparently the 32-bit conversion, since your ppc64 status looks
> fine.
>
> I think it's this instruction:
>
> addir1,r1,16
>
> that should be removed from the function exit, because Al removed the
>
> -   stwur1,-16(r1)

I just tried that about a minute ago, and indeed that seems to be
what's up. Problem goes away without it. I'll send a patch shortly.

Jason


Re: [PATCH v2] MIPS: DEC: Restore bootmem reservation for firmware working memory area

2020-10-14 Thread Thomas Bogendoerfer
On Wed, Oct 14, 2020 at 10:34:56PM +0100, Maciej W. Rozycki wrote:
> Fix a crash on DEC platforms starting with:
> 
> VFS: Mounted root (nfs filesystem) on device 0:11.
> Freeing unused PROM memory: 124k freed
> BUG: Bad page state in process swapper  pfn:1
> page:(ptrval) refcount:0 mapcount:-128 mapping: index:0x1 pfn:0x1
> flags: 0x0()
> raw:  0100 0122  0001  ff7f 
> page dumped because: nonzero mapcount
> Modules linked in:
> CPU: 0 PID: 1 Comm: swapper Not tainted 5.9.0-00858-g865c50e1d279 #1
> Stack : 8065dc48 000b 8065d2b8 9bc27dcc 80645bfc 9bc259a4 806a1b97 
> 80703124
> 8071 8064a900 0001 80099574 806b116c 1000ec00 9bc27d88 
> 806a6f30
>   80645bfc  31232039 80706ba4 2e392e35 
> 8039f348
> 2d383538 0070 000a 35363867  806c2830 8071 
> 806b
> 8071 8064a900 0001 8100   8035af2c 
> 8070
> ...
> Call Trace:
> [<8004bc5c>] show_stack+0x34/0x104
> [<8015675c>] bad_page+0xfc/0x128
> [<80157714>] free_pcppages_bulk+0x1f4/0x5dc
> [<801591cc>] free_unref_page+0xc0/0x130
> [<8015cb04>] free_reserved_area+0x144/0x1d8
> [<805abd78>] kernel_init+0x20/0x100
> [<80046070>] ret_from_kernel_thread+0x14/0x1c
> Disabling lock debugging due to kernel taint
> 
> caused by an attempt to free bootmem space that as from commit 
> b93ddc4f9156 ("mips: Reserve memory for the kernel image resources") has 
> not been anymore reserved due to the removal of generic MIPS arch code 
> that used to reserve all the memory from the beginning of RAM up to the 
> kernel load address.
> 
> This memory does need to be reserved on DEC platforms however as it is 
> used by REX firmware as working area, as per the TURBOchannel firmware 
> specification[1]:
> 
> Table 2-2  REX Memory Regions
> -
> StartingEnding
> Region  Address Address Use
> -
> 0   0xa000  0xa000  Restart block, exception vectors,
> REX stack and bss
> 1   0xa001  0xa0017fff  Keyboard or tty drivers
> 
> 2   0xa0018000  0xa001f3ff 1)   CRT driver
> 
> 3   0xa002  0xa002  boot, cnfg, init and t objects
> 
> 4   0xa002  0xa002  64KB scratch space
> -
> 1) Note that the last 3 Kbytes of region 2 are reserved for backward
> compatibility with previous system software.
> -
> 
> (this table uses KSEG2 unmapped virtual addresses, which in the MIPS 
> architecture are offset from physical addresses by a fixed value of 
> 0xa000 and therefore the regions referred do correspond to the 
> beginning of the physical address space) and we call into the firmware 
> on several occasions throughout the bootstrap process.  It is believed 
> that pre-REX firmware used with non-TURBOchannel DEC platforms has the 
> same requirements, as hinted by note #1 cited.
> 
> Recreate the discarded reservation then, in DEC platform code, removing 
> the crash.
> 
> References:
> 
> [1] "TURBOchannel Firmware Specification", On-line version,
> EK-TCAAD-FS-004, Digital Equipment Corporation, January 1993, 
> Chapter 2 "System Module Firmware", p. 2-5
> 
> Signed-off-by: Maciej W. Rozycki 
> Fixes: b93ddc4f9156 ("mips: Reserve memory for the kernel image resources")
> Cc: sta...@vger.kernel.org # v5.2+
> ---
> Changes from v1:
> 
> - Fix 2nd argument of the call to `memblock_reserve' (thanks Serge!).
> ---
>  arch/mips/dec/setup.c |9 -
>  1 file changed, 8 insertions(+), 1 deletion(-)

applied to mips-next.

Thomas.

-- 
Crap can work. Given enough thrust pigs will fly, but it's not necessarily a
good idea.[ RFC1925, 2.3 ]


Re: [PATCH v5 3/5] counter: Add character device interface

2020-10-14 Thread David Lechner

On 9/26/20 9:18 PM, William Breathitt Gray wrote:

+static ssize_t counter_chrdev_read(struct file *filp, char __user *buf,
+  size_t len, loff_t *f_ps)
+{
+   struct counter_device *const counter = filp->private_data;
+   int err;
+   unsigned long flags;
+   unsigned int copied;
+
+   if (len < sizeof(struct counter_event))
+   return -EINVAL;
+
+   do {
+   if (kfifo_is_empty(&counter->events)) {
+   if (filp->f_flags & O_NONBLOCK)
+   return -EAGAIN;
+
+   err = wait_event_interruptible(counter->events_wait,
+   !kfifo_is_empty(&counter->events));
+   if (err)
+   return err;
+   }
+
+   raw_spin_lock_irqsave(&counter->events_lock, flags);
+   err = kfifo_to_user(&counter->events, buf, len, &copied);
+   raw_spin_unlock_irqrestore(&counter->events_lock, flags);
+   if (err)
+   return err;
+   } while (!copied);
+
+   return copied;
+}


All other uses of kfifo_to_user() I saw use a mutex instead of spin
lock. I don't see a reason for disabling interrupts here.

Example:


static ssize_t iio_event_chrdev_read(struct file *filep,
 char __user *buf,
 size_t count,
 loff_t *f_ps)
{
struct iio_dev *indio_dev = filep->private_data;
struct iio_dev_opaque *iio_dev_opaque = to_iio_dev_opaque(indio_dev);
struct iio_event_interface *ev_int = iio_dev_opaque->event_interface;
unsigned int copied;
int ret;

if (!indio_dev->info)
return -ENODEV;

if (count < sizeof(struct iio_event_data))
return -EINVAL;

do {
if (kfifo_is_empty(&ev_int->det_events)) {
if (filep->f_flags & O_NONBLOCK)
return -EAGAIN;

ret = wait_event_interruptible(ev_int->wait,
!kfifo_is_empty(&ev_int->det_events) ||
indio_dev->info == NULL);
if (ret)
return ret;
if (indio_dev->info == NULL)
return -ENODEV;
}

if (mutex_lock_interruptible(&ev_int->read_lock))
return -ERESTARTSYS;
ret = kfifo_to_user(&ev_int->det_events, buf, count, &copied);
mutex_unlock(&ev_int->read_lock);

if (ret)
return ret;

/*
 * If we couldn't read anything from the fifo (a different
 * thread might have been faster) we either return -EAGAIN if
 * the file descriptor is non-blocking, otherwise we go back to
 * sleep and wait for more data to arrive.
 */
if (copied == 0 && (filep->f_flags & O_NONBLOCK))
return -EAGAIN;

} while (copied == 0);

return copied;
}


Re: ACPI _CST introduced performance regresions on Haswll

2020-10-14 Thread Mel Gorman
On Tue, Oct 13, 2020 at 08:55:26PM +0200, Rafael J. Wysocki wrote:
> > > With C6 enabled, that state is used at
> > > least sometimes (so C1E is used less often), but PC6 doesn't seem to be
> > > really used - it looks like core C6 only is entered and which may be why 
> > > C6
> > > adds less latency than C1E (and analogously for C3).
> > > 
> > At the moment, I'm happy with either solution but mostly because I can't
> > tell what other trade-offs should be considered :/
> > 
>
> I talked to Len and Srinivas about this and my theory above didn't survive.
> 
> The most likely reason why you see a performance drop after enabling the
> ACPI support (which effectively causes C3 and C6 to be disabled by default
> on the affected machines) is because the benchmarks in question require
> sufficiently high one-CPU performance and the CPUs cannot reach high enough
> one-core turbo P-states without the other CPUs going into C6.
> 

That makes sense. It also can explain anomalies like server/clients being
split across NUMA nodes with no other activity can sometimes be better
because of turbo states being more important than memory locality for
some benchmarks.

> Inspection of the ACPI tables you sent me indicates that there is a BIOS
> switch in that system allowing C6 to be enabled.  Would it be possible to
> check whether or not there is a BIOS setup option to change that setting?
> 

Yes, it's well hidden but it's there. If the profile is made custom, then
the p-states can be selected and "custom" default enables C6 but not C3
(there is a note saying that it's not recommended for that CPU). If I
then switch it back to the normal profile, the c-states are not restored
so this is a one-way trip even if you disable the c-state in custom,
reboot, switch back, reboot. Same if the machine is reset to "optimal
default settings". Yey for BIOS developers.

This means I have a limited number of attempts to do something about
this. 2 machines can no longer reproduce the problem reliably.

However, that also says a "solution" is to enable the state on each of
these machines, discard the historical results and carry on. In practice,
if reports are received either upstream or in distros about strange
behaviour on old machines when upgrading then first check the c-states
and the CPU generation. Given long enough, the machines with that specific
CPU and a crappy BIOS will phase out :/

> Also, I need to know what happens if that system is started with intel_idle
> disabled.  That is, what idle states are visible in sysfs in that
> configuration (what their names and descriptions are in particular) and
> whether or not the issue is still present then.
> 

Kernel name c-state   exit latency   disabled?  default?
--- --   -  
5.9-vanilla  POLL latency:0  disabled:0 default:enabled
5.9-vanilla  C1   latency:2  disabled:0 default:enabled
5.9-vanilla  C1E  latency:10 disabled:0 default:enabled
5.9-vanilla  C3   latency:33 disabled:1 default:disabled
5.9-vanilla  C6   latency:133disabled:0 default:enabled
5.9-intel_idle-disabled  POLL latency:0  disabled:0 default:enabled
5.9-intel_idle-disabled  C1   latency:1  disabled:0 default:enabled
5.9-intel_idle-disabled  C2   latency:41 disabled:0 default:enabled
5.9-acpi-disable POLL latency:0  disabled:0 default:enabled
5.9-acpi-disable C1   latency:2  disabled:0 default:enabled
5.9-acpi-disable C1E  latency:10 disabled:0 default:enabled
5.9-acpi-disable C3   latency:33 disabled:0 default:enabled
5.9-acpi-disable C6   latency:133disabled:0 default:enabled
5.9-custom-powerprofile  POLL latency:0  disabled:0 default:enabled
5.9-custom-powerprofile  C1   latency:2  disabled:0 default:enabled
5.9-custom-powerprofile  C1E  latency:10 disabled:0 default:enabled
5.9-custom-powerprofile  C3   latency:33 disabled:1 default:disabled
5.9-custom-powerprofile  C6   latency:133disabled:0 default:enabled

In this case, the test results are similar. I vaguely recall the bios
was reconfigured on the second machine for unrelated reasons and it's no
longer able to reproduce the problem properly. As the results show little
difference in this case, I didn't include the turbostat figures. The
summary from mmtests showed this

  5.9 5.9 5.9 5.9
  vanillaintel_idle-disabledacpi-disablecustom-powerprofile
Hmean Avg_MHz8.318.298.358.26
Hmean Busy%  0.580.560.580.57
Hmean CPU%c1 3.19   40.813.143.18
Hmean CPU%c3 0.000.000.000.00
Hmean CPU%c692.240.00   92.21   92.20
Hmean CPU%c7 0.000.00

Re: [PATCH RFC V3 4/9] x86/pks: Preserve the PKRS MSR on context switch

2020-10-14 Thread Ira Weiny
On Tue, Oct 13, 2020 at 11:31:45AM -0700, Dave Hansen wrote:
> On 10/9/20 12:42 PM, ira.we...@intel.com wrote:
> > From: Ira Weiny 
> > 
> > The PKRS MSR is defined as a per-logical-processor register.  This
> > isolates memory access by logical CPU.  Unfortunately, the MSR is not
> > managed by XSAVE.  Therefore, tasks must save/restore the MSR value on
> > context switch.
> > 
> > Define a saved PKRS value in the task struct, as well as a cached
> > per-logical-processor MSR value which mirrors the MSR value of the
> > current CPU.  Initialize all tasks with the default MSR value.  Then, on
> > schedule in, check the saved task MSR vs the per-cpu value.  If
> > different proceed to write the MSR.  If not avoid the overhead of the
> > MSR write and continue.
> 
> It's probably nice to note how the WRMSR is special here, in addition to
> the comments below.

Sure,

> 
> >  #endif /*_ASM_X86_PKEYS_INTERNAL_H */
> > diff --git a/arch/x86/include/asm/processor.h 
> > b/arch/x86/include/asm/processor.h
> > index 97143d87994c..da2381136b2d 100644
> > --- a/arch/x86/include/asm/processor.h
> > +++ b/arch/x86/include/asm/processor.h
> > @@ -18,6 +18,7 @@ struct vm86;
> >  #include 
> >  #include 
> >  #include 
> > +#include 
> >  #include 
> >  #include 
> >  #include 
> > @@ -542,6 +543,11 @@ struct thread_struct {
> >  
> > unsigned intsig_on_uaccess_err:1;
> >  
> > +#ifdef CONFIG_ARCH_HAS_SUPERVISOR_PKEYS
> > +   /* Saved Protection key register for supervisor mappings */
> > +   u32 saved_pkrs;
> > +#endif
> 
> Could you take a look around thread_struct and see if there are some
> other MSRs near which you can stash this?  This seems like a bit of a
> lonely place.

Are you more concerned with aesthetics or the in memory struct layout?

How about I put it after error_code?

unsigned long   error_code; 
+   
+#ifdef CONFIG_ARCH_HAS_SUPERVISOR_PKEYS
+   /* Saved Protection key register for supervisor mappings */ 
+   u32 saved_pkrs; 
+#endif 
+   

?

> 
> ...
> >  void flush_thread(void)
> >  {
> > struct task_struct *tsk = current;
> > @@ -195,6 +212,8 @@ void flush_thread(void)
> > memset(tsk->thread.tls_array, 0, sizeof(tsk->thread.tls_array));
> >  
> > fpu__clear_all(&tsk->thread.fpu);
> > +
> > +   pks_init_task(tsk);
> >  }
> >  
> >  void disable_TSC(void)
> > @@ -644,6 +663,8 @@ void __switch_to_xtra(struct task_struct *prev_p, 
> > struct task_struct *next_p)
> >  
> > if ((tifp ^ tifn) & _TIF_SLD)
> > switch_to_sld(tifn);
> > +
> > +   pks_sched_in();
> >  }
> >  
> >  /*
> > diff --git a/arch/x86/mm/pkeys.c b/arch/x86/mm/pkeys.c
> > index 3cf8f775f36d..30f65dd3d0c5 100644
> > --- a/arch/x86/mm/pkeys.c
> > +++ b/arch/x86/mm/pkeys.c
> > @@ -229,3 +229,31 @@ u32 update_pkey_val(u32 pk_reg, int pkey, unsigned int 
> > flags)
> >  
> > return pk_reg;
> >  }
> > +
> > +DEFINE_PER_CPU(u32, pkrs_cache);
> > +
> > +/**
> > + * It should also be noted that the underlying WRMSR(MSR_IA32_PKRS) is not
> > + * serializing but still maintains ordering properties similar to WRPKRU.
> > + * The current SDM section on PKRS needs updating but should be the same as
> > + * that of WRPKRU.  So to quote from the WRPKRU text:
> > + *
> > + * WRPKRU will never execute transiently. Memory accesses
> > + * affected by PKRU register will not execute (even transiently)
> > + * until all prior executions of WRPKRU have completed execution
> > + * and updated the PKRU register.
> > + */
> > +void write_pkrs(u32 new_pkrs)
> > +{
> > +   u32 *pkrs;
> > +
> > +   if (!static_cpu_has(X86_FEATURE_PKS))
> > +   return;
> > +
> > +   pkrs = get_cpu_ptr(&pkrs_cache);
> > +   if (*pkrs != new_pkrs) {
> > +   *pkrs = new_pkrs;
> > +   wrmsrl(MSR_IA32_PKRS, new_pkrs);
> > +   }
> > +   put_cpu_ptr(pkrs);
> > +}
> > 
> 
> It bugs me a *bit* that this is being called in a preempt-disabled
> region, but we still bother with the get/put_cpu jazz.  Are there other
> future call-sites for this that aren't in preempt-disabled regions?

I'm not specifically disabling preempt before calling write_pkrs except in the
next patch (which is buggy because I meant to have it around the modification
of thread.saved_pkrs as well).  But that was to protect the thread variable not
the percpu cache vs MSR.

My thought above was it is safer for this call to ensure the per-cpu variable
is consistent with the register.  The other calls to write_pkrs() may require
preemption disable but for reasons unrelated to write_pkrs' state.

After some research 

Re: [PATCH v5 3/5] counter: Add character device interface

2020-10-14 Thread David Lechner

On 10/14/20 2:05 PM, William Breathitt Gray wrote:

On Wed, Oct 14, 2020 at 12:43:08PM -0500, David Lechner wrote:

On 9/26/20 9:18 PM, William Breathitt Gray wrote:

diff --git a/drivers/counter/counter-chrdev.c b/drivers/counter/counter-chrdev.c
new file mode 100644
index ..2be3846e4105
--- /dev/null
+++ b/drivers/counter/counter-chrdev.c




+/**
+ * counter_push_event - queue event for userspace reading
+ * @counter:   pointer to Counter structure
+ * @event: triggered event
+ * @channel:   event channel
+ *
+ * Note: If no one is watching for the respective event, it is silently
+ * discarded.
+ *
+ * RETURNS:
+ * 0 on success, negative error number on failure.
+ */
+int counter_push_event(struct counter_device *const counter, const u8 event,
+  const u8 channel)
+{
+   struct counter_event ev = {0};
+   unsigned int copied = 0;
+   unsigned long flags;
+   struct counter_event_node *event_node;
+   struct counter_comp_node *comp_node;
+   int err;
+
+   ev.timestamp = ktime_get_ns();
+   ev.watch.event = event;
+   ev.watch.channel = channel;
+
+   raw_spin_lock_irqsave(&counter->events_lock, flags);
+
+   /* Search for event in the list */
+   list_for_each_entry(event_node, &counter->events_list, l)
+   if (event_node->event == event &&
+   event_node->channel == channel)
+   break;
+
+   /* If event is not in the list */
+   if (&event_node->l == &counter->events_list)
+   goto exit_early;
+
+   /* Read and queue relevant comp for userspace */
+   list_for_each_entry(comp_node, &event_node->comp_list, l) {
+   err = counter_get_data(counter, comp_node, &ev.value_u8);


Currently all counter devices are memory mapped devices so calling
counter_get_data() here with interrupts disabled is probably OK, but
if any counter drivers are added that use I2C/SPI/etc. that will take
a long time to read, it would cause problems leaving interrupts
disabled here.

Brainstorming: Would it make sense to separate the event from the
component value being read? As I mentioned in one of my previous
reviews, I think there are some cases where we would just want to
know when an event happened and not read any additional data anyway.
In the case of a slow communication bus, this would also let us
queue the event in the kfifo and notify poll right away and then
defer the reads in a workqueue for later.


I don't see any problems with reporting just an event without any
component value attached (e.g. userspace could handle the component
reads via sysfs at a later point). We would just need a way to inform
userspace that the struct counter_component in the struct counter_watch
returned should be ignored.

Perhaps we can add an additional member to struct counter_watch
indicating whether it's an empty watch; or alternatively, add a new
component scope define to differentiate between an actual component and
an empty one (e.g. COUNTER_SCOPE_EVENT). What do you think?

William Breathitt Gray



I made the same suggestion in one of my other replies - except
I called it COUNTER_SCOPE_NONE.


Re: [PATCH v2 20/20] ppc: propagate the calling conventions change down to csum_partial_copy_generic()

2020-10-14 Thread Jason A. Donenfeld
Hi Al,

On Fri, Jul 24, 2020 at 02:25:46AM +0100, Al Viro wrote:
> From: Al Viro 
> 
> ... and get rid of the pointless fallback in the wrappers.  On error it used
> to zero the unwritten area and calculate the csum of the entire thing.  Not
> wanting to do it in assembler part had been very reasonable; doing that in
> the first place, OTOH...  In case of an error the caller discards the data
> we'd copied, along with whatever checksum it might've had.

This patch is causing crashes in WireGuard's CI over at
https://www.wireguard.com/build-status/ . Apparently sending a simple
network packet winds up triggering refcount_t's warn-on-saturate code. I
don't know if the new assembly failed to reset some flag or if something
else is up. I can start digging into it if you want, but I thought I
should let you know first about the issue. The splat follows below.

Thanks,
Jason

$ ping -c 10 -f -W 1 192.168.241.1
PING 192.168.241.1 (192.168.241.1) 56(84) bytes of data.
[1.432922] [ cut here ]
[1.433069] refcount_t: saturated; leaking memory.
[1.433344] WARNING: CPU: 3 PID: 90 at refcount_warn_saturate+0x100/0x1bc
[1.433646] CPU: 3 PID: 90 Comm: ping Not tainted 5.9.0+ #3
[1.433797] NIP:  c01a6fa0 LR: c01a6fa0 CTR: c01ccbec
[1.433964] REGS: cfacfb80 TRAP: 0700   Not tainted  (5.9.0+)
[1.434102] MSR:  00029000   CR: 28022404  XER: 
[1.434345]
[1.434345] GPR00: c01a6fa0 cfacfc38 cf8eeae0 0026 3fffefff cfacfa90 
cfacfaa0 00021000
[1.434345] GPR08: 0f4a1000  c08b4674 c0918704 42022404  
cfa34180 
[1.434345] GPR16:  cf8ef004   0040  
 cfbac230
[1.434345] GPR24: cfacfce8 c02a802c  cfa34180 cfacfc58 c02aa53c 
55c0a4ff 
[1.435471] NIP [c01a6fa0] refcount_warn_saturate+0x100/0x1bc
[1.435615] LR [c01a6fa0] refcount_warn_saturate+0x100/0x1bc
[1.435825] Call Trace:
[1.435922] [cfacfc38] [c01a6fa0] refcount_warn_saturate+0x100/0x1bc 
(unreliable)
[1.436149] [cfacfc48] [c02a7f14] __ip_append_data.isra.0+0x8a8/0xde0
[1.436302] [cfacfce8] [c02a84e0] ip_append_data.part.0+0x94/0xf0
[1.436438] [cfacfd18] [c02dffe0] raw_sendmsg+0x298/0xa84
[1.436544] [cfacfe48] [c020b9ec] __sys_sendto+0xdc/0x13c
[1.436641] [cfacff38] [c000f1dc] ret_from_syscall+0x0/0x38
[1.436824] --- interrupt: c01 at 0xb7e44f00
[1.436824] LR = 0xb7e21ba0
[1.437038] Instruction dump:
[1.437239] 3d20c092 39291bc1 89490001 2c0a 4082ff64 3c60c040 7c0802a6 
3941
[1.437439] 38633b74 90010014 99490001 4be9b6e1 <0fe0> 80010014 7c0803a6 
4b38
[1.437753] ---[ end trace aaa4b4788958d0a6 ]---
[1.440214] [ cut here ]
[1.440301] refcount_t: underflow; use-after-free.
[1.440397] WARNING: CPU: 3 PID: 90 at refcount_warn_saturate+0x1ac/0x1bc
[1.440587] CPU: 3 PID: 90 Comm: ping Tainted: GW 5.9.0+ #3
[1.440741] NIP:  c01a704c LR: c01a704c CTR: c01ccbec
[1.440857] REGS: cfacfaa0 TRAP: 0700   Tainted: GW  (5.9.0+)
[1.441016] MSR:  00029000   CR: 48022404  XER: 
[1.441176]
[1.441176] GPR00: c01a704c cfacfb58 cf8eeae0 0026 3fffefff cfacf9b0 
cfacf9c0 00021000
[1.441176] GPR08: 0f4a1000 0400 c08b4674 c0918704 42022404  
10020464 0003
[1.441176] GPR16: 7ff0 1002 0080 cfb27000 cfb2704c c093 
cfacfc54 c092d260
[1.441176] GPR24: 058c cfa82120 cfa8212c cfa8212c  cfa82000 
cfacfd44 cfacfc58
[1.441995] NIP [c01a704c] refcount_warn_saturate+0x1ac/0x1bc
[1.442125] LR [c01a704c] refcount_warn_saturate+0x1ac/0x1bc
[1.442252] Call Trace:
[1.442320] [cfacfb58] [c01a704c] refcount_warn_saturate+0x1ac/0x1bc 
(unreliable)
[1.442726] [cfacfb68] [c020e7dc] sock_wfree+0x130/0x134
[1.442877] [cfacfb78] [c01f1388] wg_packet_send_staged_packets+0x234/0x6b4
[1.443061] [cfacfbb8] [c01eecf8] wg_xmit+0x2a0/0x46c
[1.443204] [cfacfbf8] [c0232134] dev_hard_start_xmit+0x190/0x1c0
[1.443369] [cfacfc38] [c0232f2c] __dev_queue_xmit+0x4d0/0x844
[1.443527] [cfacfc88] [c02a7134] ip_finish_output2+0x180/0x6b8
[1.443686] [cfacfcb8] [c02aa3e8] ip_output+0xf0/0x1c0
[1.443829] [cfacfd08] [c02ab14c] ip_send_skb+0x24/0xe8
[1.443975] [cfacfd18] [c02e04bc] raw_sendmsg+0x774/0xa84
[1.444124] [cfacfe48] [c020b9ec] __sys_sendto+0xdc/0x13c
[1.444274] [cfacff38] [c000f1dc] ret_from_syscall+0x0/0x38
[1.37] --- interrupt: c01 at 0xb7e44f00
[1.37] LR = 0xb7e21ba0
[1.444644] Instruction dump:
[1.444736] 4be9b661 0fe0 80010014 7c0803a6 4bfffeb8 3c60c040 7c0802a6 
3941
[1.444989] 38633bd8 90010014 99490003 4be9b635 <0fe0> 80010014 7c0803a6 
4bfffe8c
[1.445252] ---[ end trace aaa4b4788958d0a7 ]---
[1.445583] BUG: Unable to handle kernel instruction fetch (NULL pointer?)
[1.445767] Faulting instruction address: 0x
[1.44

Re: linux-next: build warning after merge of the tip tree

2020-10-14 Thread Stephen Rothwell
Hi Kees,

On Sun, 4 Oct 2020 19:44:52 -0700 Kees Cook  wrote:
>
> On Sun, Oct 04, 2020 at 09:00:18PM +1100, Stephen Rothwell wrote:
> > Hi Kees,
> > 
> > On Sun, 4 Oct 2020 01:27:01 -0700 Kees Cook  wrote:  
> > >
> > > I assume CONFIG_CONSTRUCTORS is enabled for your build (it should be for  
> > 
> > yes, indeed.
> >   
> > > allmodconfig). Does this patch fix it? (I'm kind of blindly guessing
> > > based on my understanding of where this could be coming from...)
> > > 
> > > 
> > > diff --git a/include/asm-generic/vmlinux.lds.h 
> > > b/include/asm-generic/vmlinux.lds.h
> > > index e1843976754a..22f14956214a 100644
> > > --- a/include/asm-generic/vmlinux.lds.h
> > > +++ b/include/asm-generic/vmlinux.lds.h
> > > @@ -701,6 +701,7 @@
> > >  #ifdef CONFIG_CONSTRUCTORS
> > >  #define KERNEL_CTORS()   . = ALIGN(8);  \
> > >   __ctors_start = .; \
> > > + KEEP(*(SORT(.ctors.*)))\
> > >   KEEP(*(.ctors))\
> > >   KEEP(*(SORT(.init_array.*)))   \
> > >   KEEP(*(.init_array))   \  
> > 
> > And that makes the messages go away.  
> 
> Okay then! Thanks for testing. :) I'm not sure why the ppc-hosted
> compiler generates those. Regardless, I'll send a proper patch...

I get these warnings from Linus' tree now ...
-- 
Cheers,
Stephen Rothwell


pgpaFNBvCn2Bv.pgp
Description: OpenPGP digital signature


Re: [tip: locking/core] lockdep: Fix lockdep recursion

2020-10-14 Thread Paul E. McKenney
On Wed, Oct 14, 2020 at 11:53:19PM +0200, Peter Zijlstra wrote:
> On Wed, Oct 14, 2020 at 11:34:05AM -0700, Paul E. McKenney wrote:
> > commit 7deaa04b02298001426730ed0e6214ac20d1a1c1
> > Author: Paul E. McKenney 
> > Date:   Tue Oct 13 12:39:23 2020 -0700
> > 
> > rcu: Prevent lockdep-RCU splats on lock acquisition/release
> > 
> > The rcu_cpu_starting() and rcu_report_dead() functions transition the
> > current CPU between online and offline state from an RCU perspective.
> > Unfortunately, this means that the rcu_cpu_starting() function's lock
> > acquisition and the rcu_report_dead() function's lock releases happen
> > while the CPU is offline from an RCU perspective, which can result in
> > lockdep-RCU splats about using RCU from an offline CPU.  In reality,
> > aside from the splats, both transitions are safe because a new grace
> > period cannot start until these functions release their locks.
> 
> But we call the trace_* crud before we acquire the lock. Are you sure
> that's a false-positive? 

You lost me on this one.

I am assuming that you are talking about rcu_cpu_starting(), because
that is the one where RCU is not initially watching, that is, the
case where tracing before the lock acquisition would be a problem.
You cannot be talking about rcu_cpu_starting() itself, because it does
not do any tracing before acquiring the lock.  But if you are talking
about the caller of rcu_cpu_starting(), then that caller should put the
rcu_cpu_starting() before the tracing.  But that would be the other
patch earlier in this thread that was proposing moving the call to
rcu_cpu_starting() much earlier in CPU bringup.

So what am I missing here?

Thanx, Paul


Re: [PATCH v3 2/3] clk: qcom: lpass-sc7180: Disentangle the two clock devices

2020-10-14 Thread Stephen Boyd
Quoting Douglas Anderson (2020-10-14 14:05:22)
> diff --git a/drivers/clk/qcom/lpasscorecc-sc7180.c 
> b/drivers/clk/qcom/lpasscorecc-sc7180.c
> index abcf36006926..48d370e2108e 100644
> --- a/drivers/clk/qcom/lpasscorecc-sc7180.c
> +++ b/drivers/clk/qcom/lpasscorecc-sc7180.c
> @@ -356,12 +356,48 @@ static const struct qcom_cc_desc 
> lpass_audio_hm_sc7180_desc = {
> .num_gdscs = ARRAY_SIZE(lpass_audio_hm_sc7180_gdscs),
>  };
>  
> +static void lpass_pm_runtime_disable(void *data)
> +{
> +   pm_runtime_disable(data);
> +}
> +
> +static void lapss_pm_clk_destroy(void *data)
> +{
> +   pm_clk_destroy(data);
> +}

Why are these helpers added again? And do we even need them? Can't we
just pass pm_runtime_disable or pm_clk_destroy to the
devm_add_action_or_reset() second parameter?


Re: [PATCH] MIPS: dec: fix section mismatch

2020-10-14 Thread Thomas Bogendoerfer
On Tue, Oct 06, 2020 at 04:00:03PM +0200, Thomas Bogendoerfer wrote:
> Drop inline for memory setup functions and mark them __init to
> fix section mismatch of pmax_setup_memory_region.
> 
> Signed-off-by: Thomas Bogendoerfer 
> ---
>  arch/mips/dec/prom/memory.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)

applied to mips-next.

Thomas.

-- 
Crap can work. Given enough thrust pigs will fly, but it's not necessarily a
good idea.[ RFC1925, 2.3 ]


Re: [PATCH v3] MIPS: replace add_memory_region with memblock

2020-10-14 Thread Thomas Bogendoerfer
On Wed, Oct 14, 2020 at 01:47:06PM +0100, Maciej W. Rozycki wrote:
> On Fri, 9 Oct 2020, Thomas Bogendoerfer wrote:
> 
> > add_memory_region was the old interface for registering memory and
> > was already changed to used memblock internaly. Replace it by
> > directly calling memblock functions.
> > 
> > Signed-off-by: Thomas Bogendoerfer 
> 
>  For the DEC part:
> 
> Acked-by: Maciej W. Rozycki 
> 
> NB this does not apply cleanly to upstream master, but I was able to 
> verify this regardless as the DEC part does.  For future reference: what 
> tree do you usually use that you post patches against?

mips-next branch in 

git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux.git

Thomas.

-- 
Crap can work. Given enough thrust pigs will fly, but it's not necessarily a
good idea.[ RFC1925, 2.3 ]


Re: [PATCH v3 4/8] of: unittest: Add test for of_dma_get_max_cpu_address()

2020-10-14 Thread Rob Herring
On Wed, Oct 14, 2020 at 2:12 PM Nicolas Saenz Julienne
 wrote:
>
> Introduce a test for of_dma_get_max_cup_address(), it uses the same DT
> data as the rest of dma-ranges unit tests.
>
> Signed-off-by: Nicolas Saenz Julienne 
> ---
>  drivers/of/unittest.c | 20 
>  1 file changed, 20 insertions(+)
>
> diff --git a/drivers/of/unittest.c b/drivers/of/unittest.c
> index 06cc988faf78..2cbf2a585c9f 100644
> --- a/drivers/of/unittest.c
> +++ b/drivers/of/unittest.c
> @@ -869,6 +869,25 @@ static void __init of_unittest_changeset(void)
>  #endif
>  }
>
> +static void __init of_unittest_dma_get_max_cpu_address(void)
> +{
> +#ifdef CONFIG_HAS_DMA

Can't the unittest run without this? I run the unittests under UML.

> +   struct device_node *np;
> +   phys_addr_t cpu_addr;
> +
> +   np = of_find_node_by_path("/testcase-data/address-tests");
> +   if (!np) {
> +   pr_err("missing testcase data\n");
> +   return;
> +   }
> +
> +   cpu_addr = of_dma_get_max_cpu_address(np);
> +   unittest(cpu_addr == 0x5000ULL,
> +"of_dma_get_max_cpu_address: wrong CPU addr %pad (expecting 
> %llx)\n",
> +&cpu_addr, 0x5000ULL);
> +#endif
> +}
> +
>  static void __init of_unittest_dma_ranges_one(const char *path,
> u64 expect_dma_addr, u64 expect_paddr)
>  {
> @@ -3266,6 +3285,7 @@ static int __init of_unittest(void)
> of_unittest_changeset();
> of_unittest_parse_interrupts();
> of_unittest_parse_interrupts_extended();
> +   of_unittest_dma_get_max_cpu_address();
> of_unittest_parse_dma_ranges();
> of_unittest_pci_dma_ranges();
> of_unittest_match_node();
> --
> 2.28.0
>


Re: [PATCH v3 3/8] of/address: Introduce of_dma_get_max_cpu_address()

2020-10-14 Thread Rob Herring
On Wed, Oct 14, 2020 at 2:12 PM Nicolas Saenz Julienne
 wrote:
>
> Introduce of_dma_get_max_cpu_address(), which provides the highest CPU
> physical address addressable by all DMA masters in the system. It's
> specially useful for setting memory zones sizes at early boot time.
>
> Signed-off-by: Nicolas Saenz Julienne 
>
> ---
>
> Changes since v2:
>  - Use PHYS_ADDR_MAX
>  - return phys_dma_t
>  - Rename function
>  - Correct subject
>  - Add support to start parsing from an arbitrary device node in order
>for the function to work with unit tests
>
>  drivers/of/address.c | 42 ++
>  include/linux/of.h   |  7 +++
>  2 files changed, 49 insertions(+)
>
> diff --git a/drivers/of/address.c b/drivers/of/address.c
> index eb9ab4f1e80b..b5a9695aaf82 100644
> --- a/drivers/of/address.c
> +++ b/drivers/of/address.c
> @@ -1024,6 +1024,48 @@ int of_dma_get_range(struct device_node *np, const 
> struct bus_dma_region **map)
>  }
>  #endif /* CONFIG_HAS_DMA */
>
> +/**
> + * of_dma_get_max_cpu_address - Gets highest CPU address suitable for DMA
> + * @np: The node to start searching from or NULL to start from the root
> + *
> + * Gets the highest CPU physical address that is addressable by all DMA 
> masters
> + * in the system (or subtree when np is non-NULL). If no DMA constrained 
> device
> + * is found, it returns PHYS_ADDR_MAX.
> + */
> +phys_addr_t __init of_dma_get_max_cpu_address(struct device_node *np)
> +{
> +   phys_addr_t max_cpu_addr = PHYS_ADDR_MAX;

One issue with using phys_addr_t is it may be 32-bit even though the
DT is 64-bit addresses. LPAE capable system with LPAE disabled. Maybe
the truncation is fine here? Maybe not.

> +   struct of_range_parser parser;
> +   phys_addr_t subtree_max_addr;
> +   struct device_node *child;
> +   phys_addr_t cpu_end = 0;
> +   struct of_range range;
> +   const __be32 *ranges;
> +   int len;
> +
> +   if (!np)
> +   np = of_root;
> +
> +   ranges = of_get_property(np, "dma-ranges", &len);

I'm not really following why you changed the algorithm here. You're
skipping disabled nodes which is good. Was there some other reason?

> +   if (ranges && len) {
> +   of_dma_range_parser_init(&parser, np);
> +   for_each_of_range(&parser, &range)
> +   if (range.cpu_addr + range.size > cpu_end)
> +   cpu_end = range.cpu_addr + range.size;
> +
> +   if (max_cpu_addr > cpu_end)
> +   max_cpu_addr = cpu_end;
> +   }
> +
> +   for_each_available_child_of_node(np, child) {
> +   subtree_max_addr = of_dma_get_max_cpu_address(child);
> +   if (max_cpu_addr > subtree_max_addr)
> +   max_cpu_addr = subtree_max_addr;
> +   }
> +
> +   return max_cpu_addr;
> +}
> +
>  /**
>   * of_dma_is_coherent - Check if device is coherent
>   * @np:device node
> diff --git a/include/linux/of.h b/include/linux/of.h
> index 481ec0467285..db8db8f2c967 100644
> --- a/include/linux/of.h
> +++ b/include/linux/of.h
> @@ -558,6 +558,8 @@ int of_map_id(struct device_node *np, u32 id,
>const char *map_name, const char *map_mask_name,
>struct device_node **target, u32 *id_out);
>
> +phys_addr_t of_dma_get_max_cpu_address(struct device_node *np);
> +
>  #else /* CONFIG_OF */
>
>  static inline void of_core_init(void)
> @@ -995,6 +997,11 @@ static inline int of_map_id(struct device_node *np, u32 
> id,
> return -EINVAL;
>  }
>
> +static inline phys_addr_t of_dma_get_max_cpu_address(struct device_node *np)
> +{
> +   return PHYS_ADDR_MAX;
> +}
> +
>  #define of_match_ptr(_ptr) NULL
>  #define of_match_node(_matches, _node) NULL
>  #endif /* CONFIG_OF */
> --
> 2.28.0
>


Re: [PATCH v6 35/80] docs: fs: fscrypt.rst: get rid of :c:type: tags

2020-10-14 Thread Eric Biggers
On Wed, Oct 14, 2020 at 08:59:07AM +0200, Mauro Carvalho Chehab wrote:
> [PATCH v6.1 35/80] docs: fs: fscrypt.rst: get rid of :c:type: tags
> 
> The :c:type: tag has problems with Sphinx 3.x, as structs
> there should be declared with c:struct.
> 
> So, remove them, relying at automarkup.py extension to
> convert them into cross-references.
>
> Signed-off-by: Mauro Carvalho Chehab 

"relying at" => "relying on".

Otherwise looks fine, you can add:

Reviewed-by: Eric Biggers 

I do still wonder about your comment though:

> It should be said that, currently, if there's no documentation for "foo",
> automarkup will just keep using the regular text font, keeping the text
> untouched.

That will apply to most (maybe all) of the structures mentioned in this file.
I expected that if the documentation system now automatically recognizes
'struct foo', then it would render it in code font even when 'struct foo' isn't
documented.  Any particular reason why that isn't the case?  Not like I care
much myself, but it's a bit unexpected and it means this change actually makes
the rendered documentation look worse...

- Eric


Linux 5.9: smartpqi: controller is offline: status code 0x6100c

2020-10-14 Thread Paul Menzel

Dear Linux folks,


With Linux 5.9 and


$ lspci -nn -s 89:
89:00.0 Serial Attached SCSI controller [0107]: Adaptec Smart 
Storage PQI 12G SAS/PCIe 3 [9005:028f] (rev 01)
$ more 
/sys/devices/pci:88/:88:00.0/:89:00.0/host15/scsi_host/host15/driver_version

1.2.8-026
$ more 
/sys/devices/pci:88/:88:00.0/:89:00.0/host15/scsi_host/host15/firmware_version

2.62-0

the controller went offline with status code 0x6100c.


Oct 14 14:54:01 done.molgen.mpg.de kernel: smartpqi :89:00.0: controller is 
offline: status code 0x6100c
Oct 14 14:54:01 done.molgen.mpg.de kernel: smartpqi :89:00.0: controller 
offline
Oct 14 14:54:01 done.molgen.mpg.de kernel: sd 15:0:2:0: [sdu] tag#709 FAILED 
Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK cmd_age=6s
Oct 14 14:54:01 done.molgen.mpg.de kernel: sd 15:0:15:0: [sdah] tag#274 FAILED 
Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK cmd_age=6s
Oct 14 14:54:01 done.molgen.mpg.de kernel: sd 15:0:4:0: [sdw] tag#516 FAILED 
Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK cmd_age=6s
Oct 14 14:54:01 done.molgen.mpg.de kernel: sd 15:0:4:0: [sdw] tag#516 CDB: 
Write(10) 2a 00 0d e6 9e 88 00 00 01 00
Oct 14 14:54:01 done.molgen.mpg.de kernel: blk_update_request: I/O error, dev 
sdw, sector 1865741376 op 0x1:(WRITE) flags 0x0 phys_seg 1 prio class 0
Oct 14 14:54:01 done.molgen.mpg.de kernel: sd 15:0:0:0: [sds] tag#529 FAILED 
Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK cmd_age=6s
Oct 14 14:54:01 done.molgen.mpg.de kernel: sd 15:0:0:0: [sds] tag#529 CDB: 
Write(10) 2a 00 29 4e e8 ff 00 00 01 00
Oct 14 14:54:01 done.molgen.mpg.de kernel: blk_update_request: I/O error, dev 
sds, sector 5544298488 op 0x1:(WRITE) flags 0x0 phys_seg 1 prio class 0
Oct 14 14:54:01 done.molgen.mpg.de kernel: sd 15:0:0:0: [sds] tag#627 FAILED 
Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK cmd_age=6s
Oct 14 14:54:01 done.molgen.mpg.de kernel: sd 15:0:0:0: [sds] tag#627 CDB: 
Read(10) 28 00 5d df 2c 04 00 00 04 00
Oct 14 14:54:01 done.molgen.mpg.de kernel: blk_update_request: I/O error, dev 
sds, sector 12599255072 op 0x0:(READ) flags 0x1000 phys_seg 1 prio class
Oct 14 14:54:01 done.molgen.mpg.de kernel: sd 15:0:5:0: [sdx] tag#567 FAILED 
Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK cmd_age=6s
Oct 14 14:54:01 done.molgen.mpg.de kernel: sd 15:0:5:0: [sdx] tag#567 CDB: 
Write(10) 2a 00 21 4e ce 04 00 00 04 00


How can the status code 0x6100c be deciphered?


Kind regards,

Paul


Re: [mm/writeback] 8d92890bd6: will-it-scale.per_process_ops -15.3% regression

2020-10-14 Thread NeilBrown
On Wed, Oct 14 2020, Jan Kara wrote:

> On Wed 14-10-20 16:47:06, kernel test robot wrote:
>> Greeting,
>> 
>> FYI, we noticed a -15.3% regression of will-it-scale.per_process_ops due
>> to commit:
>> 
>> commit: 8d92890bd6b8502d6aee4b37430ae6444ade7a8c ("mm/writeback: discard
>> NR_UNSTABLE_NFS, use NR_WRITEBACK instead")
>> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
>
> Thanks for report but it doesn't quite make sense to me. If we omit
> reporting & NFS changes in that commit (which is code not excercised by
> this benchmark), what remains are changes like:
>
> nr_pages += node_page_state(pgdat, NR_FILE_DIRTY);
> -   nr_pages += node_page_state(pgdat, NR_UNSTABLE_NFS);
> nr_pages += node_page_state(pgdat, NR_WRITEBACK);
> ...
> -   nr_reclaimable = global_node_page_state(NR_FILE_DIRTY) +
> -   
> global_node_page_state(NR_UNSTABLE_NFS);
> +   nr_reclaimable = global_node_page_state(NR_FILE_DIRTY);
> ...
> -   gdtc->dirty = global_node_page_state(NR_FILE_DIRTY) +
> - global_node_page_state(NR_UNSTABLE_NFS);
> +   gdtc->dirty = global_node_page_state(NR_FILE_DIRTY);
>
> So if there's any negative performance impact of these changes, they're
> likely due to code alignment changes or something like that... So I don't
> think there's much to do here since optimal code alignment is highly specific
> to a particular CPU etc.

I agree, it seems odd.

Removing NR_UNSTABLE_NFS from enum node_stat_item would renumber all the
following value and would, I think, change NR_DIRTIED from 32 to 31.
Might that move something to a different cache line and change some
contention?

That would be easy enough to test: just re-add NR_UNSTABLE_NFS.

I have no experience reading will-it-scale results, but 15% does seem
like a lot.

NeilBrown


signature.asc
Description: PGP signature


Re: [GIT PULL] kernel_clone() for v5.10

2020-10-14 Thread pr-tracker-bot
The pull request you sent on Wed, 14 Oct 2020 10:41:54 +0200:

> g...@gitolite.kernel.org:pub/scm/linux/kernel/git/brauner/linux 
> tags/kernel-clone-v5.9

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/612e7a4c1645f09449355cf08b6fd3de80b4f8cc

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html


Re: [GIT PULL] xfs: new code for 5.10, part 1

2020-10-14 Thread pr-tracker-bot
The pull request you sent on Wed, 14 Oct 2020 13:50:59 -0700:

> git://git.kernel.org/pub/scm/fs/xfs/xfs-linux.git tags/xfs-5.10-merge-2

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/2fc61f25fb296827387a5f01129dbc00cbe3ca58

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html


<    1   2   3   4   5   6   7   8   9   10   >