Re: [PATCH] Input: trackpoint - add new trackpoint variant IDs

2020-09-13 Thread Dmitry Torokhov
Hi Vincent,

On Wed, Sep 09, 2020 at 04:36:32PM +0800, Vincent Huang wrote:
> Add trackpoint variant IDs to allow supported control
> on Synaptics trackpoints
> 
> Signed-off-by: Vincent Huang 
> ---
>  drivers/input/mouse/trackpoint.c | 2 ++
>  drivers/input/mouse/trackpoint.h | 2 ++
>  2 files changed, 4 insertions(+)
> 
> diff --git a/drivers/input/mouse/trackpoint.c 
> b/drivers/input/mouse/trackpoint.c
> index 3eefee2ee2a1..c54d2f9e1c4a 100644
> --- a/drivers/input/mouse/trackpoint.c
> +++ b/drivers/input/mouse/trackpoint.c
> @@ -21,6 +21,8 @@ static const char * const trackpoint_variants[] = {
>   [TP_VARIANT_ALPS]   = "ALPS",
>   [TP_VARIANT_ELAN]   = "Elan",
>   [TP_VARIANT_NXP]= "NXP",
> + [TP_VARIANT_JYT_SYNAPTICS]  = "JYT_SYNAPTICS",
> + [TP_VARIANT_SYNAPTICS]  = "SYNAPTICS",

Do these need to be capitalized? This is simply used in log messages, so
nicer formatted strings can be used.

Thanks.

-- 
Dmitry
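
To illustrate the point about log-friendly names, here is a minimal userspace sketch of a variant name table. The enum values and the two Synaptics strings are assumptions made for illustration only; the real IDs live in trackpoint.h and the final strings are up to the patch author.

```c
#include <assert.h>

/* Hypothetical variant IDs; the real values are defined in trackpoint.h. */
enum tp_variant {
	TP_VARIANT_ALPS = 1,
	TP_VARIANT_ELAN,
	TP_VARIANT_NXP,
	TP_VARIANT_JYT_SYNAPTICS,
	TP_VARIANT_SYNAPTICS,
	TP_VARIANT_COUNT
};

/* Designated initializers keep each string next to its ID. */
static const char * const tp_variant_names[] = {
	[TP_VARIANT_ALPS]          = "ALPS",
	[TP_VARIANT_ELAN]          = "Elan",
	[TP_VARIANT_NXP]           = "NXP",
	[TP_VARIANT_JYT_SYNAPTICS] = "JYT Synaptics",	/* illustrative string */
	[TP_VARIANT_SYNAPTICS]     = "Synaptics",	/* illustrative string */
};

/* Fails to compile if the last variant has no entry in the table. */
static_assert(sizeof(tp_variant_names) / sizeof(tp_variant_names[0]) ==
	      TP_VARIANT_COUNT, "name table must reach the last variant");

/* Return a printable name, falling back for unknown IDs and table gaps. */
static const char *tp_variant_name(unsigned int id)
{
	if (id >= TP_VARIANT_COUNT || !tp_variant_names[id])
		return "unknown";
	return tp_variant_names[id];
}
```

Since the strings only end up in log output, mixed case costs nothing and reads better next to the existing "Elan" entry.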


Re: [PATCH RFC 4/4] 9p: fix race issue in fid contention.

2020-09-13 Thread Dominique Martinet


Thanks for having a look at this!

Jianyong Wu wrote on Mon, Sep 14, 2020:
> Eric's and Greg's patch offer a mechanism to fix open-unlink-f*syscall
> bug in 9p. But there is race issue in fid comtention.
> As Greg's patch stores all of fids from opened files into according inode,
> so all the lookup fid ops can retrieve fid from inode preferentially. But
> there is no mechanism to handle the fid comtention issue. For example,
> there are two threads get the same fid in the same time and one of them
> clunk the fid before the other thread ready to discard the fid. In this
> scenario, it will lead to some fatal problems, even kernel core dump.

Ah, so that's what the problem was. Good job finding the problem!


> I introduce a mechanism to fix this race issue. A counter field introduced
> into p9_fid struct to store the reference counter to the fid. When a fid
> is allocated from the inode, the counter will increase, and will decrease
> at the end of its occupation. It is guaranteed that the fid won't be clunked
> before the reference counter go down to 0, then we can avoid the clunked
> fid to be used.
> As there is no need to retrieve fid from inode in all conditions, a enum value
> denotes the source of the fid is introduced to 9p_fid either. So we can only
> handle the reference counter as to the fid obtained from inode.

If there is no contention then an always-one refcount and an enum are
the same thing.
I'd rather not make a difference but make it a full-fledged refcount
thing; the enum in the code introduces quite a bit of code churn that
doesn't strike me as useful (and I don't like int arguments like this,
but if we can just do away with it there's no need to argue about that)

Not having exceptions for that will also make the code around
fid_atomic_dec much simpler: just have clunk do an atomic dec and only
do the actual clunk if that hit zero, and we should be able to get rid
of that helper?


Timing wise it's a bit awkward but I just dug out the async clunk
mechanism I wrote two years ago, that will conflict with this patch but
might also help a bit I guess?
I should probably have reposted them...


So to recap:
 - Let's try some more straightforward refcounting: set to 1 on alloc,
increment when it's found in fid.c, decrement in clunk and only send the
actual clunk if the counter hits 0

 - Ideally base yourself off my 9p-test branch to have async clunk:
https://github.com/martinetd/linux/commits/9p-test
I've been promising to push it to next this week™ for a couple of weeks
but if something is based on it I won't be able to delay this much
longer, it'll get pushed to 5.10 cycle anyway.
(I'll resend the patches to be clean)

 - (please, no polling 10ms then leaking something!)
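
The refcounting recapped above could look roughly like the following userspace sketch. It assumes C11 atomics, and the struct and function names are made up for illustration; this is not the 9p API, and real code would also need to serialize lookups against the final put.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct p9_fid; only the counter matters here. */
struct fid {
	atomic_int count;
	int clunks_sent;	/* test hook: how often the clunk went out */
};

/* Allocation hands the caller the first reference (count = 1). */
static struct fid *fid_alloc(void)
{
	struct fid *fid = calloc(1, sizeof(*fid));

	if (fid)
		atomic_init(&fid->count, 1);
	return fid;
}

/* A lookup that finds the fid takes an extra reference. */
static void fid_get(struct fid *fid)
{
	assert(fid);
	atomic_fetch_add(&fid->count, 1);
}

/*
 * Clunk drops one reference. Only the caller that drops the last one
 * sends the actual clunk; returns true in that case. Freeing the fid
 * is left to the caller in this sketch.
 */
static bool fid_clunk(struct fid *fid)
{
	assert(fid);
	if (atomic_fetch_sub(&fid->count, 1) != 1)
		return false;
	fid->clunks_sent++;	/* the real request would be sent here */
	return true;
}
```

With this shape there is no special case for fids that did not come from the inode: they simply never see a second reference.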


Thanks,
-- 
Dominique


Re: [PATCH 07/26] perf tools: Add check for existing link in buildid dir

2020-09-13 Thread Namhyung Kim
On Mon, Sep 14, 2020 at 6:05 AM Jiri Olsa  wrote:
>
> When adding new build id link we fail if the link is already
> there. Adding check for existing link and warn/replace the
> link with new target.
>
> Signed-off-by: Jiri Olsa 
> ---
>  tools/perf/util/build-id.c | 20 +++-
>  1 file changed, 19 insertions(+), 1 deletion(-)
>
> diff --git a/tools/perf/util/build-id.c b/tools/perf/util/build-id.c
> index bdee4e08e60d..ecdc167aa1a0 100644
> --- a/tools/perf/util/build-id.c
> +++ b/tools/perf/util/build-id.c
> @@ -751,8 +751,26 @@ int build_id_cache__add_s(const char *sbuild_id, const 
> char *name,
> tmp = dir_name + strlen(buildid_dir) - 5;
> memcpy(tmp, "../..", 5);
>
> -   if (symlink(tmp, linkname) == 0)
> +   if (symlink(tmp, linkname) == 0) {
> err = 0;
> +   } else if (errno == EEXIST) {
> +   char path[PATH_MAX];
> +
> +   if (readlink(linkname, path, sizeof(path)) == -1) {
> +   pr_err("Cant read link: %s\n", linkname);

typo

> +   goto out_free;
> +   }
> +   if (strcmp(tmp, path)) {
> +   pr_err("Inconsistent .debug record, updating [%s]\n",
> +   linkname);

But isn't it ok to copy a binary to another location?
There can be multiple binaries with the same build-id..

Thanks
Namhyung


> +
> +   unlink(linkname);
> +
> +   if (symlink(tmp, linkname))
> +   goto out_free;
> +   }
> +   err = 0;
> +   }
>
> /* Update SDT cache : error is just warned */
> if (realname &&
> --
> 2.26.2
>


[PATCH] dmaengine: sf-pdma: remove unused 'desc'

2020-09-13 Thread Vinod Koul
'desc' variable is now defined but not used in sf_pdma_donebh_tasklet(),
causing this warning:

drivers/dma/sf-pdma/sf-pdma.c: In function 'sf_pdma_donebh_tasklet':
drivers/dma/sf-pdma/sf-pdma.c:287:23: warning: unused variable 'desc' 
[-Wunused-variable]

Remove this unused variable

Reported-by: Stephen Rothwell 
Signed-off-by: Vinod Koul 
---
 drivers/dma/sf-pdma/sf-pdma.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/dma/sf-pdma/sf-pdma.c b/drivers/dma/sf-pdma/sf-pdma.c
index 754994087e5f..1e66c6990d81 100644
--- a/drivers/dma/sf-pdma/sf-pdma.c
+++ b/drivers/dma/sf-pdma/sf-pdma.c
@@ -284,7 +284,6 @@ static void sf_pdma_free_desc(struct virt_dma_desc *vdesc)
 static void sf_pdma_donebh_tasklet(unsigned long arg)
 {
struct sf_pdma_chan *chan = (struct sf_pdma_chan *)arg;
-   struct sf_pdma_desc *desc = chan->desc;
unsigned long flags;
 
spin_lock_irqsave(&chan->lock, flags);
-- 
2.26.2



Re: [PATCH v18 2/3] drivers: input:keyboard: Add mtk keypad driver

2020-09-13 Thread Dmitry Torokhov
Hi Fengping,

On Wed, Sep 09, 2020 at 03:22:00PM +0800, Fengping Yu wrote:
> From: "fengping.yu" 
> 
> This patch adds matrix keypad support for Mediatek SoCs.

I am generally happy with the driver, however I do not believe this will
be the only Mediatek driver ever. Do you think we could rename it to
mt6779-keypad.c and use mt6779_keypad_ as prefix for function names?

Thanks!

-- 
Dmitry


Re: [PATCH v18 1/3] dt-bindings: Add bindings for Mediatek matrix keypad

2020-09-13 Thread Dmitry Torokhov
Hi Rob,

On Wed, Sep 09, 2020 at 03:21:58PM +0800, Fengping Yu wrote:
> From: "fengping.yu" 
> 
> This patch add devicetree bindings for Mediatek matrix keypad driver.

I am generally happy with the driver itself; do you have any concerns
with the binding?

Thanks!

> 
> Signed-off-by: fengping.yu 
> ---
>  .../devicetree/bindings/input/mtk-kpd.yaml| 83 +++
>  1 file changed, 83 insertions(+)
>  create mode 100644 Documentation/devicetree/bindings/input/mtk-kpd.yaml
> 
> diff --git a/Documentation/devicetree/bindings/input/mtk-kpd.yaml 
> b/Documentation/devicetree/bindings/input/mtk-kpd.yaml
> new file mode 100644
> index ..eda2c6efbfbf
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/input/mtk-kpd.yaml
> @@ -0,0 +1,83 @@
> +# SPDX-License-Identifier: GPL-2.0
> +%YAML 1.2
> +---
> +version: 1
> +
> +$id: http://devicetree.org/schemas/input/mtk-keypad.yaml#
> +$schema: http://devicetree.org/meta-schemas/core.yaml#
> +
> +title: Mediatek's Keypad Controller device tree bindings
> +
> +maintainer:
> +  - Fengping Yu 
> +
> +description: |
> +  Mediatek's Keypad controller is used to interface a SoC with a matrix-type
> +  keypad device. The keypad controller supports multiple row and column 
> lines.
> +  A key can be placed at each intersection of a unique row and a unique 
> column.
> +  The keypad controller can sense a key-press and key-release and report the
> +  event using a interrupt to the cpu.
> +
> +properties:
> +  compatible:
> +oneOf:
> +  - const: "mediatek,mt6779-keypad"
> +  - const: "mediatek,mt6873-keypad"
> +
> +  clock-names:
> +description: Names of the clocks listed in clocks property in the same 
> order
> +maxItems: 1
> +items:
> + - const: kpd
> +
> +  clocks:
> +description: Must contain one entry, for the module clock
> +refs: devicetree/bindings/clocks/clock-bindings.txt for details.
> +
> +  interrupts:
> +description: A single interrupt specifier
> +maxItems: 1
> +
> +  linux,keymap:
> +description: The keymap for keys as described in the binding document
> +refs: devicetree/bindings/input/matrix-keymap.txt
> +minItems: 1
> +
> +  reg:
> +description: The base address of the Keypad register bank
> +maxItems: 1
> +
> +  wakeup-source:
> +description: use any event on keypad as wakeup event
> +type: boolean
> +
> +  keypad,num-columns:
> +description: Number of column lines connected to the keypad controller.
> +
> +  keypad,num-rows:
> +description: Number of row lines connected to the keypad controller.
> +
> +  mediatek,debounce-us:
> +description: Debounce interval in microseconds, if not specified, the 
> default
> +value is 16000
> +maximum: 256000
> +
> +required:
> +  - compatible
> +  - reg
> +  - interrupts
> +  - linux,keymap
> +  - clocks
> +  - clock-names
> +
> +examples:
> +  - |
> +
> +  kp@1001 {
> +compatible = "mediatek,kp";
> +reg = <0 0x1001 0 0x1000>;
> +linux,keymap = < MATRIX_KEY(0x00, 0x00, KEY_VOLUMEDOWN) >;
> +interrupts = ;
> +clocks = <>;
> +clock-names = "kpd";
> +  };
> -- 
> 2.18.0

-- 
Dmitry


Re: [PATCH 13/15] selftests/seccomp: powerpc: Set syscall return during ptrace syscall exit

2020-09-13 Thread Michael Ellerman
Kees Cook  writes:
> Some archs (like ppc) only support changing the return code during
> syscall exit when ptrace is used. As the syscall number might not
> be available anymore during syscall exit, it needs to be saved
> during syscall enter. Adjust the ptrace tests to do this.

I'm not that across all the fixture stuff, but if I'm reading it right
you're now calling change_syscall() on both entry and exit for all
arches.

That should work, but it no longer tests changing the return code on
entry on the arches that support it, which seems like a backward step?

cheers


> Reported-by: Thadeu Lima de Souza Cascardo 
> Suggested-by: Thadeu Lima de Souza Cascardo 
> Link: 
> https://lore.kernel.org/linux-kselftest/20200911181012.171027-1-casca...@canonical.com/
> Fixes: 58d0a862f573 ("seccomp: add tests for ptrace hole")
> Signed-off-by: Kees Cook 
> ---
>  tools/testing/selftests/seccomp/seccomp_bpf.c | 34 +++
>  1 file changed, 20 insertions(+), 14 deletions(-)
>
> diff --git a/tools/testing/selftests/seccomp/seccomp_bpf.c 
> b/tools/testing/selftests/seccomp/seccomp_bpf.c
> index bbab2420d708..26c712c6a575 100644
> --- a/tools/testing/selftests/seccomp/seccomp_bpf.c
> +++ b/tools/testing/selftests/seccomp/seccomp_bpf.c
> @@ -1949,12 +1949,19 @@ void tracer_seccomp(struct __test_metadata 
> *_metadata, pid_t tracee,
>  
>  }
>  
> +FIXTURE(TRACE_syscall) {
> + struct sock_fprog prog;
> + pid_t tracer, mytid, mypid, parent;
> + long syscall_nr;
> +};
> +
>  void tracer_ptrace(struct __test_metadata *_metadata, pid_t tracee,
>  int status, void *args)
>  {
> - int ret, nr;
> + int ret;
>   unsigned long msg;
>   static bool entry;
> + FIXTURE_DATA(TRACE_syscall) *self = args;
>  
>   /*
>* The traditional way to tell PTRACE_SYSCALL entry/exit
> @@ -1968,24 +1975,23 @@ void tracer_ptrace(struct __test_metadata *_metadata, 
> pid_t tracee,
>   EXPECT_EQ(entry ? PTRACE_EVENTMSG_SYSCALL_ENTRY
>   : PTRACE_EVENTMSG_SYSCALL_EXIT, msg);
>  
> - if (!entry)
> - return;
> -
> - nr = get_syscall(_metadata, tracee);
> + /*
> +  * Some architectures only support setting return values during
> +  * syscall exit under ptrace, and on exit the syscall number may
> +  * no longer be available. Therefore, save it here, and call
> +  * "change syscall and set return values" on both entry and exit.
> +  */
> + if (entry)
> + self->syscall_nr = get_syscall(_metadata, tracee);
>  
> - if (nr == __NR_getpid)
> + if (self->syscall_nr == __NR_getpid)
>   change_syscall(_metadata, tracee, __NR_getppid, 0);
> - if (nr == __NR_gettid)
> + if (self->syscall_nr == __NR_gettid)
>   change_syscall(_metadata, tracee, -1, 45000);
> - if (nr == __NR_openat)
> + if (self->syscall_nr == __NR_openat)
>   change_syscall(_metadata, tracee, -1, -ESRCH);
>  }
>  
> -FIXTURE(TRACE_syscall) {
> - struct sock_fprog prog;
> - pid_t tracer, mytid, mypid, parent;
> -};
> -
>  FIXTURE_VARIANT(TRACE_syscall) {
>   /*
>* All of the SECCOMP_RET_TRACE behaviors can be tested with either
> @@ -2044,7 +2050,7 @@ FIXTURE_SETUP(TRACE_syscall)
>   self->tracer = setup_trace_fixture(_metadata,
>  variant->use_ptrace ? tracer_ptrace
>  : tracer_seccomp,
> -NULL, variant->use_ptrace);
> +self, variant->use_ptrace);
>  
>   ret = prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0);
>   ASSERT_EQ(0, ret);
> -- 
> 2.25.1


Re: [PATCH] cpuidle: add riscv cpuidle driver

2020-09-13 Thread Daniel Lezcano
On 14/09/2020 03:52, liush wrote:
> This patch adds a cpuidle driver for systems based RISCV architecture.
> This patch supports state WFI. Other states will be supported in the
> future.
> 
> Signed-off-by: liush 
> ---

[ ... ]

>  
>  obj-$(CONFIG_RISCV_M_MODE)   += traps_misaligned.o
> diff --git a/arch/riscv/kernel/cpuidle.c b/arch/riscv/kernel/cpuidle.c
> new file mode 100644
> index ..a3289e7
> --- /dev/null
> +++ b/arch/riscv/kernel/cpuidle.c
> @@ -0,0 +1,8 @@
> +// SPDX-License-Identifier: GPL-2.0
> +#include 
> +
> +void cpu_do_idle(void)
> +{
> + __asm__ __volatile__ ("wfi");
> +

extra line

> +}

Since the deeper idle states should also end up in the cpu_do_idle
function, isn't there an extra operation needed along with the wfi, like
flushing the L1 cache?

> diff --git a/drivers/cpuidle/Kconfig b/drivers/cpuidle/Kconfig
> index c0aeedd..f6be0fd 100644
> --- a/drivers/cpuidle/Kconfig
> +++ b/drivers/cpuidle/Kconfig
> @@ -62,6 +62,11 @@ depends on PPC
>  source "drivers/cpuidle/Kconfig.powerpc"
>  endmenu
>  
> +menu "RISCV CPU Idle Drivers"
> +depends on RISCV
> +source "drivers/cpuidle/Kconfig.riscv"
> +endmenu
> +
>  config HALTPOLL_CPUIDLE
>   tristate "Halt poll cpuidle driver"
>   depends on X86 && KVM_GUEST
> diff --git a/drivers/cpuidle/Kconfig.riscv b/drivers/cpuidle/Kconfig.riscv
> new file mode 100644
> index ..e86d36b
> --- /dev/null
> +++ b/drivers/cpuidle/Kconfig.riscv
> @@ -0,0 +1,11 @@
> +# SPDX-License-Identifier: GPL-2.0-only
> +#
> +# RISCV CPU Idle drivers
> +#
> +config RISCV_CPUIDLE
> +bool "Generic RISCV CPU idle Driver"
> +select DT_IDLE_STATES
> + select CPU_IDLE_MULTIPLE_DRIVERS
> +help
> +  Select this option to enable generic cpuidle driver for RISCV.
> +   Now only support C0 State.

Indentation

The rest looks OK to me.


-- 
 Linaro.org │ Open source software for ARM SoCs



Re: [RFC PATCH V3 15/21] mmc: sdhci: UHS-II support, modify set_power() to handle vdd2

2020-09-13 Thread AKASHI Takahiro
Adrian,

On Fri, Aug 21, 2020 at 05:11:18PM +0300, Adrian Hunter wrote:
> On 10/07/20 2:11 pm, Ben Chuang wrote:
> > From: AKASHI Takahiro 
> > 
> > VDD2 is used for powering UHS-II interface.
> > Modify sdhci_set_power_and_bus_voltage(), sdhci_set_power_noreg()
> > and sdhci_set_power_noreg() to handle VDD2.
> 
> vdd2 is always 1.8 V and I suspect there may never be support for anything
> else, so we should start with 1.8 V only.

What do you mean here?
You don't want to add an extra argument, vdd2, to sdhci_set_power().
Correct?

> Also can we create uhs2_set_power_reg() and uhs2_set_power_noreg() and use
> the existing ->set_power() callback

Again what do you expect here?

Do you want every platform-specific mmc driver that supports UHS-II
to implement its own callback, like this:

void sdhci_foo_set_power(struct sdhci_host *host, unsigned char mode,
                         unsigned short vdd)
{
        sdhci_set_power(host, mode, vdd);

        /* in case the sdhci_uhs2 module is not inserted */
        if (!(host->mmc->caps & MMC_CAP_UHS2))
                return;

        /* vdd2-specific operation */
        if (IS_ERR_OR_NULL(host->mmc->supply.vmmc2))
                sdhci_uhs2_set_power_noreg(host, mode);
        else
                sdhci_uhs2_set_power_reg(host, mode);

        /* maybe more platform-specific initialization */
}

struct sdhci_ops sdhci_foo_ops = {
        .set_power = sdhci_foo_set_power,
        ...
};

Is this what you mean?
(I'm not quite sure yet that sdhci_uhs2_set_power_noreg() can be split off
from sdhci_set_power_noreg().)

-Takahiro Akashi


> > 
> > Signed-off-by: Ben Chuang 
> > Signed-off-by: AKASHI Takahiro 
> > ---
> >  drivers/mmc/host/sdhci-omap.c |  2 +-
> >  drivers/mmc/host/sdhci-pci-core.c |  4 +--
> >  drivers/mmc/host/sdhci-pxav3.c|  4 +--
> >  drivers/mmc/host/sdhci-xenon.c|  4 +--
> >  drivers/mmc/host/sdhci.c  | 42 ---
> >  drivers/mmc/host/sdhci.h  |  9 +++
> >  6 files changed, 43 insertions(+), 22 deletions(-)
> > 
> > diff --git a/drivers/mmc/host/sdhci-omap.c b/drivers/mmc/host/sdhci-omap.c
> > index 1ec74c2d5c17..1926585debe5 100644
> > --- a/drivers/mmc/host/sdhci-omap.c
> > +++ b/drivers/mmc/host/sdhci-omap.c
> > @@ -678,7 +678,7 @@ static void sdhci_omap_set_clock(struct sdhci_host 
> > *host, unsigned int clock)
> >  }
> >  
> >  static void sdhci_omap_set_power(struct sdhci_host *host, unsigned char 
> > mode,
> > - unsigned short vdd)
> > + unsigned short vdd, unsigned short vdd2)
> >  {
> > struct mmc_host *mmc = host->mmc;
> >  
> > diff --git a/drivers/mmc/host/sdhci-pci-core.c 
> > b/drivers/mmc/host/sdhci-pci-core.c
> > index bb6802448b2f..40f5a24a8982 100644
> > --- a/drivers/mmc/host/sdhci-pci-core.c
> > +++ b/drivers/mmc/host/sdhci-pci-core.c
> > @@ -629,12 +629,12 @@ static int bxt_get_cd(struct mmc_host *mmc)
> >  #define SDHCI_INTEL_PWR_TIMEOUT_UDELAY 100
> >  
> >  static void sdhci_intel_set_power(struct sdhci_host *host, unsigned char 
> > mode,
> > - unsigned short vdd)
> > + unsigned short vdd, unsigned short vdd2)
> >  {
> > int cntr;
> > u8 reg;
> >  
> > -   sdhci_set_power(host, mode, vdd);
> > +   sdhci_set_power(host, mode, vdd, -1);
> >  
> > if (mode == MMC_POWER_OFF)
> > return;
> > diff --git a/drivers/mmc/host/sdhci-pxav3.c b/drivers/mmc/host/sdhci-pxav3.c
> > index e55037ceda73..457e9425339a 100644
> > --- a/drivers/mmc/host/sdhci-pxav3.c
> > +++ b/drivers/mmc/host/sdhci-pxav3.c
> > @@ -298,12 +298,12 @@ static void pxav3_set_uhs_signaling(struct sdhci_host 
> > *host, unsigned int uhs)
> >  }
> >  
> >  static void pxav3_set_power(struct sdhci_host *host, unsigned char mode,
> > -   unsigned short vdd)
> > +   unsigned short vdd, unsigned short vdd2)
> >  {
> > struct mmc_host *mmc = host->mmc;
> > u8 pwr = host->pwr;
> >  
> > -   sdhci_set_power_noreg(host, mode, vdd);
> > +   sdhci_set_power_noreg(host, mode, vdd, -1);
> >  
> > if (host->pwr == pwr)
> > return;
> > diff --git a/drivers/mmc/host/sdhci-xenon.c b/drivers/mmc/host/sdhci-xenon.c
> > index 4703cd540c7f..2b0ebb91895a 100644
> > --- a/drivers/mmc/host/sdhci-xenon.c
> > +++ b/drivers/mmc/host/sdhci-xenon.c
> > @@ -214,12 +214,12 @@ static void xenon_set_uhs_signaling(struct sdhci_host 
> > *host,
> >  }
> >  
> >  static void xenon_set_power(struct sdhci_host *host, unsigned char mode,
> > -   unsigned short vdd)
> > +   unsigned short vdd, unsigned short vdd2)
> >  {
> > struct mmc_host *mmc = host->mmc;
> > u8 pwr = host->pwr;
> >  
> > -   sdhci_set_power_noreg(host, mode, vdd);
> > +   sdhci_set_power_noreg(host, mode, vdd, -1);
> >  
> > if (host->pwr == pwr)
> > return;
> > diff --git a/drivers/mmc/host/sdhci.c b/drivers/mmc/host/sdhci.c
> > index 

Re: [PATCH 05/26] perf tools: Add build_id__is_defined function

2020-09-13 Thread Namhyung Kim
On Mon, Sep 14, 2020 at 6:05 AM Jiri Olsa  wrote:
>
> Adding build_id__is_defined helper to check build id
> is defined and is != zero build id.
>
> Signed-off-by: Jiri Olsa 
> ---
>  tools/perf/util/build-id.c | 11 +++
>  tools/perf/util/build-id.h |  1 +
>  2 files changed, 12 insertions(+)
>
> diff --git a/tools/perf/util/build-id.c b/tools/perf/util/build-id.c
> index 31207b6e2066..bdee4e08e60d 100644
> --- a/tools/perf/util/build-id.c
> +++ b/tools/perf/util/build-id.c
> @@ -902,3 +902,14 @@ bool perf_session__read_build_ids(struct perf_session 
> *session, bool with_hits)
>
> return ret;
>  }
> +
> +bool build_id__is_defined(const u8 *build_id)
> +{
> +   static u8 zero[BUILD_ID_SIZE];
> +   int err = 0;
> +
> +   if (build_id)
> +   err = memcmp(build_id, &zero, BUILD_ID_SIZE);
> +
> +   return err ? true : false;
> +}

I think this is a bit confusing.. How about this?

  bool ret = false;
  if (build_id)
  ret = memcmp(...);
  return ret;

Or, it can be a oneliner..

Thanks
Namhyung


> diff --git a/tools/perf/util/build-id.h b/tools/perf/util/build-id.h
> index aad419bb165c..1ceede45c231 100644
> --- a/tools/perf/util/build-id.h
> +++ b/tools/perf/util/build-id.h
> @@ -14,6 +14,7 @@ extern struct perf_tool build_id__mark_dso_hit_ops;
>  struct dso;
>  struct feat_fd;
>
> +bool build_id__is_defined(const u8 *build_id);
>  int build_id__sprintf(const u8 *build_id, int len, char *bf);
>  int sysfs__sprintf_build_id(const char *root_dir, char *sbuild_id);
>  int filename__sprintf_build_id(const char *pathname, char *sbuild_id);
> --
> 2.26.2
>
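
The one-liner mentioned above could look like the following standalone sketch. BUILD_ID_SIZE and the u8 typedef are defined locally here as assumptions so the snippet compiles on its own; perf has its own definitions.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define BUILD_ID_SIZE 20	/* assumed here; perf defines its own */

typedef uint8_t u8;		/* stand-in for the kernel-style typedef */

/* True when build_id is non-NULL and not all zero bytes. */
static bool build_id_is_defined(const u8 *build_id)
{
	static const u8 zero[BUILD_ID_SIZE];

	return build_id && memcmp(build_id, zero, BUILD_ID_SIZE) != 0;
}
```

Comparing the memcmp() result against 0 explicitly avoids relying on the implicit int-to-bool conversion and makes the "not the zero build id" intent visible.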


Re: [PATCH v2] arm64/mm: Refactor {pgd, pud, pmd, pte}_ERROR()

2020-09-13 Thread Gavin Shan

Hi Anshuman,

On 9/14/20 3:31 PM, Anshuman Khandual wrote:
> On 09/14/2020 05:17 AM, Gavin Shan wrote:
>> The function __{pgd, pud, pmd, pte}_error() are introduced so that
>> they can be called by {pgd, pud, pmd, pte}_ERROR(). However, some
>> of the functions could never be called when the corresponding page
>> table level isn't enabled. For example, __{pud, pmd}_error() are
>> unused when PUD and PMD are folded to PGD.
>
> Right, it makes sense not to have these helpers generally available.
> Given pxx_ERROR() is enabled only when required page table level is
> available, with a CONFIG_PGTABLE_LEVEL check.

Yep.

>> This removes __{pgd, pud, pmd, pte}_error() and call pr_err() from
>> {pgd, pud, pmd, pte}_ERROR() directly, similar to what x86/powerpc
>> are doing. With this, the code looks a bit simplified either.
>
> Do we need p4d_ERROR() here as well !

p4d_ERROR() is defined in include/asm-generic/pgtable-nop4d.h, which
is always included because we have 4 levels of page tables to the
maximum.

>> Signed-off-by: Gavin Shan
>> ---
>> v2: Fix build warning caused by wrong printk format
>> ---
>>   arch/arm64/include/asm/pgtable.h | 17 -
>>   arch/arm64/kernel/traps.c        | 20
>>   2 files changed, 8 insertions(+), 29 deletions(-)
>>
>> diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
>> index d5d3fbe73953..e0ab81923c30 100644
>> --- a/arch/arm64/include/asm/pgtable.h
>> +++ b/arch/arm64/include/asm/pgtable.h
>> @@ -35,11 +35,6 @@
>>
>>  extern struct page *vmemmap;
>>
>> -extern void __pte_error(const char *file, int line, unsigned long val);
>> -extern void __pmd_error(const char *file, int line, unsigned long val);
>> -extern void __pud_error(const char *file, int line, unsigned long val);
>> -extern void __pgd_error(const char *file, int line, unsigned long val);
>> -
>>  #ifdef CONFIG_TRANSPARENT_HUGEPAGE
>>  #define __HAVE_ARCH_FLUSH_PMD_TLB_RANGE
>>
>> @@ -57,7 +52,8 @@ extern void __pgd_error(const char *file, int line, unsigned long val);
>>  extern unsigned long empty_zero_page[PAGE_SIZE / sizeof(unsigned long)];
>>  #define ZERO_PAGE(vaddr)  phys_to_page(__pa_symbol(empty_zero_page))
>>
>> -#define pte_ERROR(pte)		__pte_error(__FILE__, __LINE__, pte_val(pte))
>> +#define pte_ERROR(e)   \
>> +   pr_err("%s:%d: bad pte %016llx.\n", __FILE__, __LINE__, pte_val(e))
>>
>>  /*
>>   * Macros to convert between a physical address and its placement in a
>> @@ -541,7 +537,8 @@ static inline unsigned long pmd_page_vaddr(pmd_t pmd)
>>
>>  #if CONFIG_PGTABLE_LEVELS > 2
>>
>> -#define pmd_ERROR(pmd)		__pmd_error(__FILE__, __LINE__, pmd_val(pmd))
>> +#define pmd_ERROR(e)   \
>> +   pr_err("%s:%d: bad pmd %016llx.\n", __FILE__, __LINE__, pmd_val(e))
>>
>>  #define pud_none(pud)		(!pud_val(pud))
>>  #define pud_bad(pud)  (!(pud_val(pud) & PUD_TABLE_BIT))
>> @@ -608,7 +605,8 @@ static inline unsigned long pud_page_vaddr(pud_t pud)
>>
>>  #if CONFIG_PGTABLE_LEVELS > 3
>>
>> -#define pud_ERROR(pud)		__pud_error(__FILE__, __LINE__, pud_val(pud))
>> +#define pud_ERROR(e)   \
>> +   pr_err("%s:%d: bad pud %016llx.\n", __FILE__, __LINE__, pud_val(e))
>>
>>  #define p4d_none(p4d)		(!p4d_val(p4d))
>>  #define p4d_bad(p4d)  (!(p4d_val(p4d) & 2))
>> @@ -667,7 +665,8 @@ static inline unsigned long p4d_page_vaddr(p4d_t p4d)
>>
>>  #endif  /* CONFIG_PGTABLE_LEVELS > 3 */
>>
>> -#define pgd_ERROR(pgd)		__pgd_error(__FILE__, __LINE__, pgd_val(pgd))
>> +#define pgd_ERROR(e)   \
>> +   pr_err("%s:%d: bad pgd %016llx.\n", __FILE__, __LINE__, pgd_val(e))
>
> A line break in these macros might not be required any more, as checkpatch.pl
> now accepts bit longer lines.

Correct, but I guess it's nice to limit the width to 80 characters :)

>>
>>  #define pgd_set_fixmap(addr)	((pgd_t *)set_fixmap_offset(FIX_PGD, addr))
>>  #define pgd_clear_fixmap()	clear_fixmap(FIX_PGD)
>> diff --git a/arch/arm64/kernel/traps.c b/arch/arm64/kernel/traps.c
>> index 13ebd5ca2070..12fba7136dbd 100644
>> --- a/arch/arm64/kernel/traps.c
>> +++ b/arch/arm64/kernel/traps.c
>> @@ -935,26 +935,6 @@ asmlinkage void enter_from_user_mode(void)
>>  }
>>  NOKPROBE_SYMBOL(enter_from_user_mode);
>>
>> -void __pte_error(const char *file, int line, unsigned long val)
>> -{
>> -   pr_err("%s:%d: bad pte %016lx.\n", file, line, val);
>> -}
>> -
>> -void __pmd_error(const char *file, int line, unsigned long val)
>> -{
>> -   pr_err("%s:%d: bad pmd %016lx.\n", file, line, val);
>> -}
>> -
>> -void __pud_error(const char *file, int line, unsigned long val)
>> -{
>> -   pr_err("%s:%d: bad pud %016lx.\n", file, line, val);
>> -}
>> -
>> -void __pgd_error(const char *file, int line, unsigned long val)
>> -{
>> -   pr_err("%s:%d: bad pgd %016lx.\n", file, line, val);
>> -}
>
> While moving %016lx now becomes %016llx. I guess this should be okay.
> Looks much cleaner to have removed these helpers from trap.c

Yep.

>> -
>>  /* GENERIC_BUG traps */
>>
>>  int is_valid_bugaddr(unsigned long addr)

Thanks,
Gavin
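
As a small illustration of the format change discussed above, here is a userspace sketch that builds the same message with snprintf(). It assumes the entry value is an unsigned long long, which is what the %016llx specifier expects; the typedef and helper name are stand-ins, not the arm64 definitions.

```c
#include <stdio.h>

typedef unsigned long long entry_val_t;	/* assumed 64-bit entry value */

/*
 * Format a "bad entry" report in the same shape as the macros in the
 * patch. level is "pte", "pmd", "pud" or "pgd". Returns the snprintf()
 * result.
 */
static int format_bad_entry(char *buf, size_t size, const char *level,
			    const char *file, int line, entry_val_t val)
{
	/* %016llx matches unsigned long long; %016lx expects unsigned long. */
	return snprintf(buf, size, "%s:%d: bad %s %016llx.\n",
			file, line, level, val);
}
```

Calling the formatter from a macro that passes __FILE__ and __LINE__ keeps the reported location at the call site, which is what the pxx_ERROR() macros rely on.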



Re: [PATCH v2] x86/boot/compressed: Disable relocation relaxation

2020-09-13 Thread Ard Biesheuvel
On Mon, 14 Sep 2020 at 01:34, Arvind Sankar  wrote:
>
> On Tue, Aug 25, 2020 at 10:56:52AM -0400, Arvind Sankar wrote:
> > On Sat, Aug 15, 2020 at 01:56:49PM -0700, Nick Desaulniers wrote:
> > > Hi Ingo,
> > > I saw you picked up Arvind's other series into x86/boot.  Would you
> > > mind please including this, as well?  Our CI is quite red for x86...
> > >
> > > EOM
> > >
> >
> > Hi Ingo, while this patch is unnecessary after the series in
> > tip/x86/boot, it is still needed for 5.9 and older. Would you be able to
> > send it in for the next -rc? It shouldn't hurt the tip/x86/boot series,
> > and we can add a revert on top of that later.
> >
> > Thanks.
>
> Ping.
>
> https://lore.kernel.org/lkml/20200812004308.1448603-1-nived...@alum.mit.edu/

Acked-by: Ard Biesheuvel 


[PATCH] usb: gadget: bcm63xx_udc: fix up the error of undeclared usb_debug_root

2020-09-13 Thread Chunfeng Yun
Fix up the build error caused by undeclared usb_debug_root

Cc: stable 
Fixes: a66ada4f241c("usb: gadget: bcm63xx_udc: create debugfs directory under 
usb root")
Reported-by: kernel test robot 
Signed-off-by: Chunfeng Yun 
---
 drivers/usb/gadget/udc/bcm63xx_udc.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/usb/gadget/udc/bcm63xx_udc.c 
b/drivers/usb/gadget/udc/bcm63xx_udc.c
index feaec00..9cd4a70 100644
--- a/drivers/usb/gadget/udc/bcm63xx_udc.c
+++ b/drivers/usb/gadget/udc/bcm63xx_udc.c
@@ -26,6 +26,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
-- 
1.9.1


Re: [PATCH net-next] drivers/net/wan/x25_asy: Remove an unnecessary x25_type_trans call

2020-09-13 Thread Martin Schiller

On 2020-09-12 04:18, Xie He wrote:

> x25_type_trans only needs to be called before we call netif_rx to pass
> the skb to upper layers.
>
> It does not need to be called before lapb_data_received. The LAPB module
> does not need the fields that are set by calling it.
>
> In the other two X.25 drivers - lapbether and hdlc_x25. x25_type_trans
> is only called before netif_rx and not before lapb_data_received.
>
> Cc: Martin Schiller
> Signed-off-by: Xie He
> ---
>  drivers/net/wan/x25_asy.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/drivers/net/wan/x25_asy.c b/drivers/net/wan/x25_asy.c
> index 5a7cf8bf9d0d..ab56a5e6447a 100644
> --- a/drivers/net/wan/x25_asy.c
> +++ b/drivers/net/wan/x25_asy.c
> @@ -202,8 +202,7 @@ static void x25_asy_bump(struct x25_asy *sl)
> return;
> }
> skb_put_data(skb, sl->rbuff, count);
> -   skb->protocol = x25_type_trans(skb, sl->dev);
> -   err = lapb_data_received(skb->dev, skb);
> +   err = lapb_data_received(sl->dev, skb);
> if (err != LAPB_OK) {
> kfree_skb(skb);
> printk(KERN_DEBUG "x25_asy: data received err - %d\n", err);


Acked-by: Martin Schiller 



Re: [PATCH] MAINTAINERS: make linux-mediatek list remarks consistent

2020-09-13 Thread Lukas Bulwahn



On Mon, 14 Sep 2020, Lukas Bulwahn wrote:

> Commit 637cfacae96f ("PCI: mediatek: Add MediaTek PCIe host controller
> support") does not mention that linux-media...@lists.infradead.org is
> moderated for non-subscribers, but the other eight entries for
> linux-media...@lists.infradead.org do.
> 
> Adjust this entry to be consistent with all others.
> 
> Signed-off-by: Lukas Bulwahn 
> ---
> applies cleanly on v5.9-rc5 and next-20200911
> 
> Ryder, please ack.
> 
> Bjorn, Matthias, please pick this minor non-urgent clean-up patch.
> 
> This patch submission will also show me if linux-mediatek is moderated or
> not. I have not subscribed to linux-mediatek and if it shows up quickly in
> the archive, the list is probably not moderated; and if it takes longer, it
> is moderated, and hence, validating the patch.
> 

Okay, my patch showed up within seconds in the archive:

https://lore.kernel.org/linux-mediatek/20200914053110.23286-1-lukas.bulw...@gmail.com/


I think the linux-mediatek list is actually NOT _moderated for 
non-subscribers_.

Please IGNORE this patch until someone can confirm if it is moderated or 
not. I will then send the patch that reflects the actual state.

Thanks, Lukas


Re: [PATCH 03/26] tools headers uapi: Sync tools/include/uapi/linux/perf_event.h

2020-09-13 Thread Namhyung Kim
On Mon, Sep 14, 2020 at 6:03 AM Jiri Olsa  wrote:
>
> Sync uapi header with kernel version for mmap3 support.
>
> Signed-off-by: Jiri Olsa 
> ---
>  tools/include/uapi/linux/perf_event.h | 27 ++-
>  1 file changed, 26 insertions(+), 1 deletion(-)
>
> diff --git a/tools/include/uapi/linux/perf_event.h 
> b/tools/include/uapi/linux/perf_event.h
> index 3e5dcdd48a49..84a0cbdab1ef 100644
> --- a/tools/include/uapi/linux/perf_event.h
> +++ b/tools/include/uapi/linux/perf_event.h
> @@ -384,7 +384,8 @@ struct perf_event_attr {
> aux_output :  1, /* generate AUX records 
> instead of events */
> cgroup :  1, /* include cgroup events 
> */
> text_poke  :  1, /* include text poke 
> events */
> -   __reserved_1   : 30;
> +   mmap3  :  1, /* include bpf events */

Same here..

Thanks
Namhyung


> +   __reserved_1   : 29;
>
> union {
> __u32   wakeup_events;/* wakeup every n events */
> @@ -1060,6 +1061,30 @@ enum perf_event_type {
>  */
> PERF_RECORD_TEXT_POKE   = 20,
>
> +   /*
> +* The MMAP3 records are an augmented version of MMAP2, they add
> +* build id value to identify the exact binary behind map
> +*
> +* struct {
> +*  struct perf_event_headerheader;
> +*
> +*  u32 pid, tid;
> +*  u64 addr;
> +*  u64 len;
> +*  u64 pgoff;
> +*  u32 maj;
> +*  u32 min;
> +*  u64 ino;
> +*  u64 ino_generation;
> +*  u32 prot, flags;
> +*  u32 reserved;
> +*  u8  buildid[20];
> +*  char filename[];
> +*  struct sample_id sample_id;
> +* };
> +*/
> +   PERF_RECORD_MMAP3   = 21,
> +
> PERF_RECORD_MAX,/* non-ABI */
>  };
>
> --
> 2.26.2
>


Re: [PATCH 02/26] perf: Introduce mmap3 version of mmap event

2020-09-13 Thread Namhyung Kim
On Mon, Sep 14, 2020 at 6:03 AM Jiri Olsa  wrote:
>
> Add new version of mmap event. The MMAP3 record is an
> augmented version of MMAP2, it adds build id value to
> identify the exact binary object behind memory map:
>
>   struct {
> struct perf_event_header header;
>
> u32  pid, tid;
> u64  addr;
> u64  len;
> u64  pgoff;
> u32  maj;
> u32  min;
> u64  ino;
> u64  ino_generation;
> u32  prot, flags;
> u32  reserved;
> u8   buildid[20];

Do we need maj, min, ino and ino_generation for the mmap3 event?
I think they are there to compare binaries; if so, we can do that
with the build-id instead (and I think that would be better).


> char filename[];
> struct sample_id sample_id;
>   };
>
> Adding 4 bytes reserved field to align buildid data to 8 bytes,
> so sample_id data is properly aligned.
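
A minimal userspace sketch of the alignment argument, assuming natural
alignment and an 8-byte stand-in for struct perf_event_header. The
struct and helper names here are illustrative, not taken from the
patch. With the 4-byte reserved field in place, the fixed-size part
ends on an 8-byte boundary, so whatever follows buildid[] starts
aligned:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BUILD_ID_SIZE 20

/* Stand-in for struct perf_event_header (assumed 8 bytes in total). */
struct hdr {
	uint32_t type;
	uint16_t misc;
	uint16_t size;
};

/* Fixed-size part of the record, up to where filename[] would start. */
struct mmap3_fixed {
	struct hdr header;
	uint32_t pid, tid;
	uint64_t addr;
	uint64_t len;
	uint64_t pgoff;
	uint32_t maj;
	uint32_t min;
	uint64_t ino;
	uint64_t ino_generation;
	uint32_t prot, flags;
	uint32_t reserved;
	uint8_t  buildid[BUILD_ID_SIZE];
};

/* Offset at which the variable-length filename[] would begin. */
static size_t mmap3_filename_offset(void)
{
	return offsetof(struct mmap3_fixed, buildid) + BUILD_ID_SIZE;
}

/* Sanity check: the data after buildid[] starts on an 8-byte boundary. */
static void mmap3_layout_check(void)
{
	assert(mmap3_filename_offset() % 8 == 0);
}
```

Dropping the reserved field from this sketch would move the end of
buildid[] to a 4-byte boundary, which is what the padding avoids.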
>
> The mmap3 event is enabled by new mmap3 bit in perf_event_attr
> struct.  When set for an event, it enables the build id retrieval
> and will use mmap3 format for the event.
>
> Keeping track of mmap3 events and calling build_id_parse
> in perf_event_mmap_event only if we have any defined.
>
> Having build id attached directly to the mmap event will help
> tool like perf to skip final search through perf data for
> binaries that are needed in the report time. Also it prevents
> possible race when the binary could be removed or replaced
> during profiling.
>
> Signed-off-by: Jiri Olsa 
> ---
>  include/uapi/linux/perf_event.h | 27 ++-
>  kernel/events/core.c| 38 +++--
>  2 files changed, 57 insertions(+), 8 deletions(-)
>
> diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
> index 077e7ee69e3d..facfc3c673ed 100644
> --- a/include/uapi/linux/perf_event.h
> +++ b/include/uapi/linux/perf_event.h
> @@ -384,7 +384,8 @@ struct perf_event_attr {
> aux_output :  1, /* generate AUX records 
> instead of events */
> cgroup :  1, /* include cgroup events 
> */
> text_poke  :  1, /* include text poke 
> events */
> -   __reserved_1   : 30;
> +   mmap3  :  1, /* include bpf events */

???

> +   __reserved_1   : 29;
>
> union {
> __u32   wakeup_events;/* wakeup every n events */
> @@ -1060,6 +1061,30 @@ enum perf_event_type {
>  */
> PERF_RECORD_TEXT_POKE   = 20,
>
> +   /*
> +* The MMAP3 records are an augmented version of MMAP2, they add
> +* build id value to identify the exact binary behind map
> +*
> +* struct {
> +*  struct perf_event_header header;
> +*
> +*  u32 pid, tid;
> +*  u64 addr;
> +*  u64 len;
> +*  u64 pgoff;
> +*  u32 maj;
> +*  u32 min;
> +*  u64 ino;
> +*  u64 ino_generation;
> +*  u32 prot, flags;
> +*  u32 reserved;
> +*  u8  buildid[20];
> +*  char filename[];
> +*  struct sample_id sample_id;
> +* };
> +*/
> +   PERF_RECORD_MMAP3   = 21,
> +
> PERF_RECORD_MAX,/* non-ABI */
>  };
>
[SNIP]
> @@ -8098,6 +8116,9 @@ static void perf_event_mmap_event(struct 
> perf_mmap_event *mmap_event)
> mmap_event->prot = prot;
> mmap_event->flags = flags;
>
> +   if (atomic_read(&nr_mmap3_events))
> +   build_id_parse(vma, mmap_event->buildid);

What if it fails?  We should zero out the build-id in that case.
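
A rough userspace sketch of that idea. fake_build_id_parse() is a
hypothetical stand-in for the parser (assumed to return 0 on success
and non-zero on failure, possibly leaving stale bytes behind); it is
not the kernel function:

```c
#include <assert.h>
#include <string.h>

#define BUILD_ID_SIZE 20

/*
 * Hypothetical stand-in for the build-id parser: returns 0 on success,
 * negative on failure. It scribbles on the buffer either way to model
 * a parser that may leave partial data behind when it fails.
 */
static int fake_build_id_parse(int should_fail, unsigned char *build_id)
{
	memset(build_id, 0xab, BUILD_ID_SIZE);
	return should_fail ? -1 : 0;
}

/* Never hand a half-written build-id to the consumer. */
static void fill_build_id(int should_fail, unsigned char *build_id)
{
	assert(build_id != NULL);
	if (fake_build_id_parse(should_fail, build_id))
		memset(build_id, 0, BUILD_ID_SIZE);
}
```

An all-zero build-id then serves as the "not available" marker for
whoever reads the record.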

Thanks
Namhyung

> +
> if (!(vma->vm_flags & VM_EXEC))
> mmap_event->event_id.header.misc |= 
> PERF_RECORD_MISC_MMAP_DATA;
>
> @@ -8241,6 +8262,7 @@ void perf_event_mmap(struct vm_area_struct *vma)
> /* .ino_generation (attr_mmap2 only) */
> /* .prot (attr_mmap2 only) */
> /* .flags (attr_mmap2 only) */
> +   /* .buildid (attr_mmap3 only) */
> };
>
> perf_addr_filters_adjust(vma);
> @@ -11040,6 +11062,8 @@ static void account_event(struct perf_event *event)
> inc = true;
> if (event->attr.mmap || event->attr.mmap_data)
>  

Re: [PATCH] tty: hvcs: Don't NULL tty->driver_data until hvcs_cleanup()

2020-09-13 Thread Jiri Slaby
On 21. 08. 20, 1:46, Tyrel Datwyler wrote:
> The code currently NULLs tty->driver_data in hvcs_close() with the
> intent of informing the next call to hvcs_open() that device needs to be
> reconfigured. However, when hvcs_cleanup() is called we copy hvcsd from
> tty->driver_data which was previously NULLed by hvcs_close() and our
> call to tty_port_put(&hvcsd->port) doesn't actually do anything since
> &hvcsd->port ends up translating to NULL by chance. This has the side
> effect that when hvcs_remove() is called we have one too many port
> references preventing hvcs_destruct_port() from ever being called. This
> also prevents us from reusing the /dev/hvcsX node in a future
> hvcs_probe() and we can eventually run out of /dev/hvcsX devices.
> 
> Fix this by waiting to NULL tty->driver_data in hvcs_cleanup().

Without actually looking into the code, it looks like we need a fix
similar to:
commit 24eb2377f977fe06d84fca558f891f95bc28a449
Author: Jiri Slaby 
Date:   Tue May 26 16:56:32 2020 +0200

tty: hvc_console, fix crashes on parallel open/close

here too?

> Signed-off-by: Tyrel Datwyler 
> ---
>  drivers/tty/hvc/hvcs.c | 14 +++---
>  1 file changed, 7 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/tty/hvc/hvcs.c b/drivers/tty/hvc/hvcs.c
> index 55105ac38f89..509d1042825a 100644
> --- a/drivers/tty/hvc/hvcs.c
> +++ b/drivers/tty/hvc/hvcs.c
> @@ -1216,13 +1216,6 @@ static void hvcs_close(struct tty_struct *tty, struct 
> file *filp)
>  
>   tty_wait_until_sent(tty, HVCS_CLOSE_WAIT);
>  
> - /*
> -  * This line is important because it tells hvcs_open that this
> -  * device needs to be re-configured the next time hvcs_open is
> -  * called.
> -  */
> - tty->driver_data = NULL;
> -
>   free_irq(irq, hvcsd);
>   return;
>   } else if (hvcsd->port.count < 0) {
> @@ -1237,6 +1230,13 @@ static void hvcs_cleanup(struct tty_struct * tty)
>  {
>   struct hvcs_struct *hvcsd = tty->driver_data;
>  
> + /*
> +  * This line is important because it tells hvcs_open that this
> +  * device needs to be re-configured the next time hvcs_open is
> +  * called.
> +  */
> + tty->driver_data = NULL;
> +
>   tty_port_put(&hvcsd->port);
>  }
>  
> 

thanks,
-- 
js


Re: [PATCH v2] arm64/mm: Refactor {pgd, pud, pmd, pte}_ERROR()

2020-09-13 Thread Anshuman Khandual



On 09/14/2020 05:17 AM, Gavin Shan wrote:
> The function __{pgd, pud, pmd, pte}_error() are introduced so that
> they can be called by {pgd, pud, pmd, pte}_ERROR(). However, some
> of the functions could never be called when the corresponding page
> table level isn't enabled. For example, __{pud, pmd}_error() are
> unused when PUD and PMD are folded to PGD.

Right, it makes sense not to have these helpers generally available,
given that pxx_ERROR() is enabled only when the required page table
level is available, behind a CONFIG_PGTABLE_LEVELS check.

> 
> This removes __{pgd, pud, pmd, pte}_error() and call pr_err() from
> {pgd, pud, pmd, pte}_ERROR() directly, similar to what x86/powerpc
> are doing. With this, the code looks a bit simplified either.

Do we need p4d_ERROR() here as well?

> 
> Signed-off-by: Gavin Shan 
> ---
> v2: Fix build warning caused by wrong printk format
> ---
>  arch/arm64/include/asm/pgtable.h | 17 -
>  arch/arm64/kernel/traps.c| 20 
>  2 files changed, 8 insertions(+), 29 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/pgtable.h 
> b/arch/arm64/include/asm/pgtable.h
> index d5d3fbe73953..e0ab81923c30 100644
> --- a/arch/arm64/include/asm/pgtable.h
> +++ b/arch/arm64/include/asm/pgtable.h
> @@ -35,11 +35,6 @@
>  
>  extern struct page *vmemmap;
>  
> -extern void __pte_error(const char *file, int line, unsigned long val);
> -extern void __pmd_error(const char *file, int line, unsigned long val);
> -extern void __pud_error(const char *file, int line, unsigned long val);
> -extern void __pgd_error(const char *file, int line, unsigned long val);
> -
>  #ifdef CONFIG_TRANSPARENT_HUGEPAGE
>  #define __HAVE_ARCH_FLUSH_PMD_TLB_RANGE
>  
> @@ -57,7 +52,8 @@ extern void __pgd_error(const char *file, int line, 
> unsigned long val);
>  extern unsigned long empty_zero_page[PAGE_SIZE / sizeof(unsigned long)];
>  #define ZERO_PAGE(vaddr) phys_to_page(__pa_symbol(empty_zero_page))
>  
> -#define pte_ERROR(pte)   __pte_error(__FILE__, __LINE__, 
> pte_val(pte))
> +#define pte_ERROR(e) \
> + pr_err("%s:%d: bad pte %016llx.\n", __FILE__, __LINE__, pte_val(e))
>  
>  /*
>   * Macros to convert between a physical address and its placement in a
> @@ -541,7 +537,8 @@ static inline unsigned long pmd_page_vaddr(pmd_t pmd)
>  
>  #if CONFIG_PGTABLE_LEVELS > 2
>  
> -#define pmd_ERROR(pmd)   __pmd_error(__FILE__, __LINE__, 
> pmd_val(pmd))
> +#define pmd_ERROR(e) \
> + pr_err("%s:%d: bad pmd %016llx.\n", __FILE__, __LINE__, pmd_val(e))
>  
>  #define pud_none(pud) (!pud_val(pud))
>  #define pud_bad(pud) (!(pud_val(pud) & PUD_TABLE_BIT))
> @@ -608,7 +605,8 @@ static inline unsigned long pud_page_vaddr(pud_t pud)
>  
>  #if CONFIG_PGTABLE_LEVELS > 3
>  
> -#define pud_ERROR(pud)   __pud_error(__FILE__, __LINE__, 
> pud_val(pud))
> +#define pud_ERROR(e) \
> + pr_err("%s:%d: bad pud %016llx.\n", __FILE__, __LINE__, pud_val(e))
>  
>  #define p4d_none(p4d) (!p4d_val(p4d))
>  #define p4d_bad(p4d) (!(p4d_val(p4d) & 2))
> @@ -667,7 +665,8 @@ static inline unsigned long p4d_page_vaddr(p4d_t p4d)
>  
>  #endif  /* CONFIG_PGTABLE_LEVELS > 3 */
>  
> -#define pgd_ERROR(pgd)   __pgd_error(__FILE__, __LINE__, 
> pgd_val(pgd))
> +#define pgd_ERROR(e) \
> + pr_err("%s:%d: bad pgd %016llx.\n", __FILE__, __LINE__, pgd_val(e))

A line break in these macros might not be required any more, as checkpatch.pl
now accepts somewhat longer lines.

>  
>  #define pgd_set_fixmap(addr) ((pgd_t *)set_fixmap_offset(FIX_PGD, addr))
>  #define pgd_clear_fixmap()   clear_fixmap(FIX_PGD)
> diff --git a/arch/arm64/kernel/traps.c b/arch/arm64/kernel/traps.c
> index 13ebd5ca2070..12fba7136dbd 100644
> --- a/arch/arm64/kernel/traps.c
> +++ b/arch/arm64/kernel/traps.c
> @@ -935,26 +935,6 @@ asmlinkage void enter_from_user_mode(void)
>  }
>  NOKPROBE_SYMBOL(enter_from_user_mode);
>  
> -void __pte_error(const char *file, int line, unsigned long val)
> -{
> - pr_err("%s:%d: bad pte %016lx.\n", file, line, val);
> -}
> -
> -void __pmd_error(const char *file, int line, unsigned long val)
> -{
> - pr_err("%s:%d: bad pmd %016lx.\n", file, line, val);
> -}
> -
> -void __pud_error(const char *file, int line, unsigned long val)
> -{
> - pr_err("%s:%d: bad pud %016lx.\n", file, line, val);
> -}
> -
> -void __pgd_error(const char *file, int line, unsigned long val)
> -{
> - pr_err("%s:%d: bad pgd %016lx.\n", file, line, val);
> -}

With the move, %016lx becomes %016llx. I guess this should be okay.
It looks much cleaner to have removed these helpers from traps.c.
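
A small userspace sketch of why the specifier changes, assuming the
entry value is a 64-bit type that maps to unsigned long long (the old
helpers took an unsigned long argument instead). format_bad_pte() is
an illustrative name, not a kernel function:

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Format a 64-bit entry value with the same format string the new
 * macros use. The cast keeps the argument type in step with %llx
 * regardless of how uint64_t is defined on the host.
 */
static int format_bad_pte(char *buf, size_t len, const char *file,
			  int line, uint64_t val)
{
	assert(buf != NULL);
	return snprintf(buf, len, "%s:%d: bad pte %016llx.\n",
			file, line, (unsigned long long)val);
}
```

Passing the value through an unsigned long parameter, as the removed
helpers did, is what made %016lx correct there.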

> -
>  /* GENERIC_BUG traps */
>  
>  int is_valid_bugaddr(unsigned long addr)
> 


[PATCH] MAINTAINERS: make linux-mediatek list remarks consistent

2020-09-13 Thread Lukas Bulwahn
Commit 637cfacae96f ("PCI: mediatek: Add MediaTek PCIe host controller
support") does not mention that linux-media...@lists.infradead.org is
moderated for non-subscribers, but the other eight entries for
linux-media...@lists.infradead.org do.

Adjust this entry to be consistent with all others.

Signed-off-by: Lukas Bulwahn 
---
applies cleanly on v5.9-rc5 and next-20200911

Ryder, please ack.

Bjorn, Matthias, please pick this minor non-urgent clean-up patch.

This patch submission will also show me if linux-mediatek is moderated or
not. I have not subscribed to linux-mediatek and if it shows up quickly in
the archive, the list is probably not moderated; and if it takes longer, it
is moderated, and hence, validating the patch.

 MAINTAINERS | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index 5e6e36542c62..83c83d7ef2a5 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -13485,7 +13485,7 @@ F:  drivers/pci/controller/dwc/pcie-histb.c
 PCIE DRIVER FOR MEDIATEK
 M: Ryder Lee 
 L: linux-...@vger.kernel.org
-L: linux-media...@lists.infradead.org
+L: linux-media...@lists.infradead.org (moderated for non-subscribers)
 S: Supported
 F: Documentation/devicetree/bindings/pci/mediatek*
 F: drivers/pci/controller/*mediatek*
-- 
2.17.1



Re: [PATCH 0/5] Some improvements for blk-throttle

2020-09-13 Thread Baolin Wang
Hi Tejun and Jens,

On Mon, Sep 07, 2020 at 04:10:12PM +0800, Baolin Wang wrote:
> Hi All,
> 
> This patch set did some clean-ups, as well as removing some unnecessary
> bps/iops limitation calculation when checking if can dispatch a bio or
> not for a tg. Please help to review. Thanks.

Any comments for this patch set?

> 
> Baolin Wang (5):
>   blk-throttle: Fix some comments' typos
>   blk-throttle: Use readable READ/WRITE macros
>   blk-throttle: Define readable macros instead of static variables
>   blk-throttle: Avoid calculating bps/iops limitation repeatedly
>   blk-throttle: Avoid checking bps/iops limitation if bps or iops is
> unlimited
> 
>  block/blk-throttle.c | 59 
> 
>  1 file changed, 36 insertions(+), 23 deletions(-)
> 
> -- 
> 1.8.3.1


Re: linux-next: build warning after merge of the dmaengine tree

2020-09-13 Thread Vinod Koul
Hi Stephen,

On 14-09-20, 14:29, Stephen Rothwell wrote:
> Hi all,
> 
> After merging the dmaengine tree, today's linux-next build (x86_64
> allmodconfig) produced this warning:
> 
> drivers/dma/sf-pdma/sf-pdma.c: In function 'sf_pdma_donebh_tasklet':
> drivers/dma/sf-pdma/sf-pdma.c:287:23: warning: unused variable 'desc' 
> [-Wunused-variable]
>   287 |  struct sf_pdma_desc *desc = chan->desc;
>   |   ^~~~
> 
> Introduced by commit
> 
>   8f6b6d060602 ("dmaengine: sf-pdma: Fix an error that calls callback twice")

Thanks for the report. The function uses chan->desc directly, so yes,
this can be removed. Sending a patch shortly.

Thanks
-- 
~Vinod


Re: [PATCH net-next] octeontx2-af: Constify npc_kpu_profile_{action,cam}

2020-09-13 Thread Joe Perches
On Sat, 2020-09-12 at 00:00 +0200, Rikard Falkeborn wrote:
> These are never modified, so constify them to allow the compiler to
> place them in read-only memory. This moves about 25kB to read-only
> memory as seen by the output of the size command.

Nice.

Did you find this by tool or inspection?




Re: [PATCH net v2] drivers/net/wan/hdlc_fr: Add needed_headroom for PVC devices

2020-09-13 Thread Krzysztof Hałasa
Xie He  writes:

> The HDLC device is not actually prepending any header when it is used
> with this driver. When the PVC device has prepended its header and
> handed over the skb to the HDLC device, the HDLC device just hands it
> over to the hardware driver for transmission without prepending any
> header.

That's correct. IIRC:
- Cisco and PPP modes add 4 bytes
- Frame Relay adds 4 (specific protocols - mostly IPv4) or 10 (general
  case) bytes. There is that pvcX->hdlcX transition which adds nothing
  (the header is already in place when the packet leaves pvcX device).
- Raw mode adds nothing (IPv4 only, though it could be modified for
  both IPv4/v6 easily)
- Ethernet (hdlc_raw_eth.c) adds normal Ethernet header.
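
The list above can be restated as a small lookup, purely as a sketch.
The enum labels and the function name are made up for illustration,
and the byte counts are the ones recalled above rather than values
checked against the drivers:

```c
#include <assert.h>

/* Hypothetical labels for the modes listed above; not kernel names. */
enum hdlc_mode {
	MODE_CISCO,
	MODE_PPP,
	MODE_FR,
	MODE_RAW,
	MODE_RAW_ETH,
};

/*
 * Bytes of header each mode prepends, per the list above.
 * fr_general selects the 10-byte Frame Relay case over the 4-byte one.
 */
static int hdlc_header_len(enum hdlc_mode mode, int fr_general)
{
	switch (mode) {
	case MODE_CISCO:
	case MODE_PPP:
		return 4;
	case MODE_FR:
		return fr_general ? 10 : 4;
	case MODE_RAW:
		return 0;
	case MODE_RAW_ETH:
		return 14;	/* normal Ethernet header: 6 + 6 + 2 bytes */
	}
	assert(!"unknown mode");
	return -1;
}
```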

(I had been "unplugged" for some time).
-- 
Krzysztof Halasa

Sieć Badawcza Łukasiewicz
Przemysłowy Instytut Automatyki i Pomiarów PIAP
Al. Jerozolimskie 202, 02-486 Warszawa


Re: [PATCH] brcmfmac: initialize variable

2020-09-13 Thread Arend Van Spriel

On September 13, 2020 4:35:44 PM t...@redhat.com wrote:


> From: Tom Rix 
>
> clang static analysis flags this problem
> sdio.c:3265:13: warning: Branch condition evaluates to
>  a garbage value
>    } else if (pending) {
>   ^~~
>
> brcmf_sdio_dcmd_resp_wait() only sets pending to true.
> So pending needs to be initialized to false.


True. However, I prefer to fix it in brcmf_sdio_dcmd_resp_wait() and say:

*pending = signal_pending(current);
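
A userspace sketch of the difference between the two shapes.
fake_signal_pending is a hypothetical stand-in for
signal_pending(current), and both function names are illustrative:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for signal_pending(current). */
static bool fake_signal_pending;

/* Current shape: the out-parameter is written on one path only. */
static void wait_sets_only_true(bool *pending)
{
	assert(pending != NULL);
	if (fake_signal_pending)
		*pending = true;
}

/* Suggested shape: the out-parameter is always written. */
static void wait_always_assigns(bool *pending)
{
	assert(pending != NULL);
	*pending = fake_signal_pending;
}
```

With the second form the caller's variable is defined on return no
matter how it was (or was not) initialized.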

Regards,
Arend


> Fixes: 5b435de0d786 ("net: wireless: add brcm80211 drivers")
> Signed-off-by: Tom Rix 
> ---
> drivers/net/wireless/broadcom/brcm80211/brcmfmac/sdio.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)








Re: [RFC 00/26] perf: Add mmap3 support

2020-09-13 Thread Namhyung Kim
Hi Jiri,

On Mon, Sep 14, 2020 at 6:03 AM Jiri Olsa  wrote:
>
> hi,
> while playing with perf daemon support I realized I need
> the build id data in mmap events, so we don't need to care
> about removed/updated binaries during long perf runs.
>
> This RFC patchset adds new mmap3 events that copies mmap2
> event and adds build id in it. It makes mmap3 the default
> mmap event for synthesizing kernel/modules/tasks and adds
> some tooling enhancements to enable the workflow below.

Cool! It's nice that we can skip the final build-id collection stage
with this, although the data size will be bigger.

Thanks
Namhyung


dma-coherent property for PCIe Root

2020-09-13 Thread Valmiki

Hi All,

How will the "dma-coherent" property work for PCIe as RC on an
ARM SoC?
The endpoint device drivers are the ones that request DMA buffers,
and the root port driver is not involved in the endpoint's data path
except for handling interrupts.

How will EP DMA buffers be hardware coherent if the RC driver exposes
the dma-coherent property?

Regards,
Valmiki




RE: drivers/net/wireless/realtek/rtw88/pci.c:1477:5: warning: no previous prototype for 'rtw_pci_probe'

2020-09-13 Thread Tony Chuang
> tree:   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 
> master
> head:   ef2e9a563b0cd7965e2a1263125dcbb1c86aa6cc
> commit: ba0fbe236fb8a7b992e82d6eafb03a600f5eba43 rtw88: extract: make
> 8822c an individual kernel module
> date:   4 months ago
> config: i386-randconfig-r034-20200913 (attached as .config)
> compiler: gcc-9 (Debian 9.3.0-15) 9.3.0
> reproduce (this is a W=1 build):
> git checkout ba0fbe236fb8a7b992e82d6eafb03a600f5eba43
> # save the attached .config to linux build tree
> make W=1 ARCH=i386
> 
> If you fix the issue, kindly add following tag as appropriate
> Reported-by: kernel test robot 
> 
> All warnings (new ones prefixed by >>):
> 
> >> drivers/net/wireless/realtek/rtw88/pci.c:1477:5: warning: no previous
> >> prototype for 'rtw_pci_probe' [-Wmissing-prototypes]
> 1477 | int rtw_pci_probe(struct pci_dev *pdev,
>  | ^
> >> drivers/net/wireless/realtek/rtw88/pci.c:1557:6: warning: no previous
> >> prototype for 'rtw_pci_remove' [-Wmissing-prototypes]
> 1557 | void rtw_pci_remove(struct pci_dev *pdev)
>  |  ^~
> >> drivers/net/wireless/realtek/rtw88/pci.c:1579:6: warning: no previous
> >> prototype for 'rtw_pci_shutdown' [-Wmissing-prototypes]
> 1579 | void rtw_pci_shutdown(struct pci_dev *pdev)
>  |  ^~~~
> 
> #
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=b
> a0fbe236fb8a7b992e82d6eafb03a600f5eba43
> git remote add linus
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
> git fetch --no-tags linus master
> git checkout ba0fbe236fb8a7b992e82d6eafb03a600f5eba43
> vim +/rtw_pci_probe +1477 drivers/net/wireless/realtek/rtw88/pci.c
> 
> 79066903454b0fe Yu-Yen Ting  2019-09-03  1476
> 72f256c2b948622 Zong-Zhe Yang2020-05-15 @1477  int
> rtw_pci_probe(struct pci_dev *pdev,
> e3037485c68ec1a Yan-Hsuan Chuang 2019-04-26  1478   const
> struct pci_device_id *id)
> e3037485c68ec1a Yan-Hsuan Chuang 2019-04-26  1479  {
> e3037485c68ec1a Yan-Hsuan Chuang 2019-04-26  1480 struct
> ieee80211_hw *hw;
> e3037485c68ec1a Yan-Hsuan Chuang 2019-04-26  1481 struct rtw_dev
> *rtwdev;
> e3037485c68ec1a Yan-Hsuan Chuang 2019-04-26  1482 int
> drv_data_size;
> e3037485c68ec1a Yan-Hsuan Chuang 2019-04-26  1483 int ret;
> e3037485c68ec1a Yan-Hsuan Chuang 2019-04-26  1484
> e3037485c68ec1a Yan-Hsuan Chuang 2019-04-26  1485 drv_data_size
> = sizeof(struct rtw_dev) + sizeof(struct rtw_pci);
> e3037485c68ec1a Yan-Hsuan Chuang 2019-04-26  1486 hw =
> ieee80211_alloc_hw(drv_data_size, &rtw_ops);
> e3037485c68ec1a Yan-Hsuan Chuang 2019-04-26  1487 if (!hw) {
> e3037485c68ec1a Yan-Hsuan Chuang 2019-04-26  1488
>   dev_err(&pdev->dev, "failed to allocate hw\n");
> e3037485c68ec1a Yan-Hsuan Chuang 2019-04-26  1489 return
> -ENOMEM;
> e3037485c68ec1a Yan-Hsuan Chuang 2019-04-26  1490 }
> e3037485c68ec1a Yan-Hsuan Chuang 2019-04-26  1491
> e3037485c68ec1a Yan-Hsuan Chuang 2019-04-26  1492 rtwdev =
> hw->priv;
> e3037485c68ec1a Yan-Hsuan Chuang 2019-04-26  1493 rtwdev->hw =
> hw;
> e3037485c68ec1a Yan-Hsuan Chuang 2019-04-26  1494 rtwdev->dev =
> &pdev->dev;
> e3037485c68ec1a Yan-Hsuan Chuang 2019-04-26  1495 rtwdev->chip =
> (struct rtw_chip_info *)id->driver_data;
> e3037485c68ec1a Yan-Hsuan Chuang 2019-04-26  1496
>   rtwdev->hci.ops = &rtw_pci_ops;
> e3037485c68ec1a Yan-Hsuan Chuang 2019-04-26  1497
>   rtwdev->hci.type = RTW_HCI_TYPE_PCIE;
> e3037485c68ec1a Yan-Hsuan Chuang 2019-04-26  1498
> e3037485c68ec1a Yan-Hsuan Chuang 2019-04-26  1499 ret =
> rtw_core_init(rtwdev);
> e3037485c68ec1a Yan-Hsuan Chuang 2019-04-26  1500 if (ret)
> e3037485c68ec1a Yan-Hsuan Chuang 2019-04-26  1501 goto
> err_release_hw;
> e3037485c68ec1a Yan-Hsuan Chuang 2019-04-26  1502
> e3037485c68ec1a Yan-Hsuan Chuang 2019-04-26  1503
>   rtw_dbg(rtwdev, RTW_DBG_PCI,
> e3037485c68ec1a Yan-Hsuan Chuang 2019-04-26  1504 "rtw88 pci
> probe: vendor=0x%4.04X device=0x%4.04X rev=%d\n",
> e3037485c68ec1a Yan-Hsuan Chuang 2019-04-26  1505
>   pdev->vendor, pdev->device, pdev->revision);
> e3037485c68ec1a Yan-Hsuan Chuang 2019-04-26  1506
> e3037485c68ec1a Yan-Hsuan Chuang 2019-04-26  1507 ret =
> rtw_pci_claim(rtwdev, pdev);
> e3037485c68ec1a Yan-Hsuan Chuang 2019-04-26  1508 if (ret) {
> e3037485c68ec1a Yan-Hsuan Chuang 2019-04-26  1509
>   rtw_err(rtwdev, "failed to claim pci device\n");
> e3037485c68ec1a Yan-Hsuan Chuang 2019-04-26  1510 goto
> 

Re: [PATCH v2 12/14] habanalabs/gaudi: Add ethtool support using coresight

2020-09-13 Thread Oded Gabbay
On Mon, Sep 14, 2020 at 4:37 AM Andrew Lunn  wrote:
>
> > +static int gaudi_nic_get_module_eeprom(struct net_device *netdev,
> > + struct ethtool_eeprom *ee, u8 *data)
> > +{
> > + struct gaudi_nic_device **ptr = netdev_priv(netdev);
> > + struct gaudi_nic_device *gaudi_nic = *ptr;
> > + struct hl_device *hdev = gaudi_nic->hdev;
> > +
> > + if (!ee->len)
> > + return -EINVAL;
> > +
> > + memset(data, 0, ee->len);
> > + memcpy(data, hdev->asic_prop.cpucp_nic_info.qsfp_eeprom, ee->len);
> > +
>
> You memset and then memcpy the same number of bytes?
Thanks for catching this, we will fix it.

>
> You also need to validate ee->offset, and ee->len. Otherwise this is a
> vector for user space to read kernel memory after
> hdev->asic_prop.cpucp_nic_info.qsfp_eeprom. See drivers/net/phy/sfp.c:
> sfp_module_eeprom() as a good example of this validation.
>
> Andrew

Thanks for the pointer, we will take a look and fix it.
Oded
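
A minimal sketch of the kind of validation being asked for, written as
plain userspace C. EEPROM_SIZE and eeprom_read() are hypothetical; this
is not the code from sfp.c or from the driver, only the general shape
of an offset/length check that cannot overflow:

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

#define EEPROM_SIZE 256u	/* hypothetical size of the cached EEPROM */

/*
 * Copy len bytes starting at offset out of a fixed-size EEPROM image.
 * Rejects empty reads and any range that would run past the image.
 */
static int eeprom_read(const uint8_t *eeprom, uint32_t offset,
		       uint32_t len, uint8_t *data)
{
	assert(eeprom != NULL && data != NULL);

	if (len == 0)
		return -EINVAL;

	/* Written so that offset + len is never computed and cannot wrap. */
	if (offset >= EEPROM_SIZE || len > EEPROM_SIZE - offset)
		return -EINVAL;

	memcpy(data, eeprom + offset, len);
	return 0;
}
```

Since every byte of the destination is overwritten by the copy, a
preceding memset of the same length adds nothing.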


Re: [PATCH v2 12/14] habanalabs/gaudi: Add ethtool support using coresight

2020-09-13 Thread Oded Gabbay
On Mon, Sep 14, 2020 at 4:39 AM Florian Fainelli  wrote:
>
>
>
> On 9/12/2020 7:41 AM, Oded Gabbay wrote:
> > From: Omer Shpigelman 
> >
> > The driver supports ethtool callbacks and provides statistics using the
> > device's profiling infrastructure (coresight).
>
> Is there any relationship near or far with ARM's CoreSight:
>
> https://developer.arm.com/ip-products/system-ip/coresight-debug-and-trace
>
> if not, should you rename this?
> --
> Florian

We have a Cortex-A53 inside our ASIC and we use other ARM IPs.
One of those IPs is the CoreSight infrastructure for trace and profiling.
It also provides us with something they call SPMU (performance monitoring).
Those units provide us counters per port with which we provide the statistics.

Thanks,
Oded


[PATCH v1 1/1] mmc: sdhci-of-arasan: Enable UHS-1 support for Keem Bay SOC

2020-09-13 Thread muhammad . husaini . zulkifli
From: Muhammad Husaini Zulkifli 

Voltage switching sequence is needed to support UHS-1 interface
as Keem Bay EVM is using external voltage regulator to switch between
1.8V and 3.3V.

Signed-off-by: Muhammad Husaini Zulkifli 
Reviewed-by: Andy Shevchenko 
Reviewed-by: Adrian Hunter 
---
 drivers/mmc/host/sdhci-of-arasan.c | 140 +
 1 file changed, 140 insertions(+)

diff --git a/drivers/mmc/host/sdhci-of-arasan.c 
b/drivers/mmc/host/sdhci-of-arasan.c
index f186fbd016b1..c133408d0c74 100644
--- a/drivers/mmc/host/sdhci-of-arasan.c
+++ b/drivers/mmc/host/sdhci-of-arasan.c
@@ -16,7 +16,9 @@
  */
 
 #include 
+#include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -41,6 +43,11 @@
 #define SDHCI_ITAPDLY_ENABLE   0x100
 #define SDHCI_OTAPDLY_ENABLE   0x40
 
+/* Setting for Keem Bay IO Pad 1.8 Voltage Selection */
+#define KEEMBAY_AON_SIP_FUNC_ID    0x8200ff26
+#define KEEMBAY_AON_SET_1V8_VOLT   0x01
+#define KEEMBAY_AON_SET_3V3_VOLT   0x00
+
 /* Default settings for ZynqMP Clock Phases */
 #define ZYNQMP_ICLK_PHASE {0, 63, 63, 0, 63,  0,   0, 183, 54,  0, 0}
 #define ZYNQMP_OCLK_PHASE {0, 72, 60, 0, 60, 72, 135, 48, 72, 135, 0}
@@ -150,6 +157,7 @@ struct sdhci_arasan_data {
struct regmap   *soc_ctl_base;
const struct sdhci_arasan_soc_ctl_map *soc_ctl_map;
unsigned int quirks;
+   struct gpio_desc *uhs_gpio;
 
 /* Controller does not have CD wired and will not function normally without */
 #define SDHCI_ARASAN_QUIRK_FORCE_CDTEST BIT(0)
@@ -361,6 +369,121 @@ static int sdhci_arasan_voltage_switch(struct mmc_host 
*mmc,
return -EINVAL;
 }
 
+static int sdhci_arasan_keembay_set_voltage(int volt)
+{
+#if IS_ENABLED(CONFIG_HAVE_ARM_SMCCC)
+   struct arm_smccc_res res;
+
+   arm_smccc_smc(KEEMBAY_AON_SIP_FUNC_ID, volt, 0, 0, 0, 0, 0, 0, &res);
+   if (res.a0)
+   return -EINVAL;
+   return 0;
+#else
+   return -EINVAL;
+#endif
+}
+
+static int sdhci_arasan_keembay_voltage_switch(struct mmc_host *mmc,
+  struct mmc_ios *ios)
+{
+   struct sdhci_host *host = mmc_priv(mmc);
+   struct sdhci_pltfm_host *pltfm_host = sdhci_priv(host);
+   struct sdhci_arasan_data *sdhci_arasan = sdhci_pltfm_priv(pltfm_host);
+   u16 ctrl_2;
+   u16 clk;
+   int ret;
+
+   switch (ios->signal_voltage) {
+   case MMC_SIGNAL_VOLTAGE_180:
+   clk  = sdhci_readw(host, SDHCI_CLOCK_CONTROL);
+   clk &= ~SDHCI_CLOCK_CARD_EN;
+   sdhci_writew(host, clk, SDHCI_CLOCK_CONTROL);
+
+   clk  = sdhci_readw(host, SDHCI_CLOCK_CONTROL);
+   if (clk & SDHCI_CLOCK_CARD_EN)
+   return -EAGAIN;
+
+   sdhci_writeb(host, SDHCI_POWER_ON | SDHCI_POWER_180,
+  SDHCI_POWER_CONTROL);
+
+   /* Set VDDIO_B voltage to Low for 1.8V */
+   gpiod_set_value_cansleep(sdhci_arasan->uhs_gpio, 0);
+
+   /*
+* This is like final gatekeeper. Need to ensure changed voltage
+* is settled before and after turn on this bit.
+*/
+   usleep_range(1000, 1100);
+
+   ret = 
sdhci_arasan_keembay_set_voltage(KEEMBAY_AON_SET_1V8_VOLT);
+   if (ret)
+   return ret;
+
+   usleep_range(1000, 1100);
+
+   ctrl_2 = sdhci_readw(host, SDHCI_HOST_CONTROL2);
+   ctrl_2 |= SDHCI_CTRL_VDD_180;
+   sdhci_writew(host, ctrl_2, SDHCI_HOST_CONTROL2);
+
+   /* Sleep for 5ms to stabilize 1.8V regulator */
+   usleep_range(5000, 5500);
+
+   /* 1.8V regulator output should be stable within 5 ms */
+   ctrl_2 = sdhci_readw(host, SDHCI_HOST_CONTROL2);
+   if (!(ctrl_2 & SDHCI_CTRL_VDD_180))
+   return -EAGAIN;
+
+   clk  = sdhci_readw(host, SDHCI_CLOCK_CONTROL);
+   clk |= SDHCI_CLOCK_CARD_EN;
+   sdhci_writew(host, clk, SDHCI_CLOCK_CONTROL);
+   break;
+   case MMC_SIGNAL_VOLTAGE_330:
+   /* Set VDDIO_B voltage to High for 3.3V */
+   gpiod_set_value_cansleep(sdhci_arasan->uhs_gpio, 1);
+
+   /*
+* This is like final gatekeeper. Need to ensure changed voltage
+* is settled before and after turn on this bit.
+*/
+   usleep_range(1000, 1100);
+
+   ret = 
sdhci_arasan_keembay_set_voltage(KEEMBAY_AON_SET_3V3_VOLT);
+   if (ret)
+   return ret;
+
+   usleep_range(1000, 1100);
+
+   /* Set 1.8V Signal Enable in the Host Control2 register to 0 */
+   ctrl_2 = sdhci_readw(host, SDHCI_HOST_CONTROL2);
+   ctrl_2 &= ~SDHCI_CTRL_VDD_180;
+   sdhci_writew(host, ctrl_2, 

Re: [Linux-kernel-mentees] [PATCH] net: fix uninit value error in __sys_sendmmsg

2020-09-13 Thread Anant Thazhemadam
I can assure you that when I said "I think", I meant it in an assertive manner,
and not an assumptive one, but I can understand how that could easily get lost 
in translation.
I wouldn't have sent in the patch if I had caught the build warning, and once 
again, my apologies for not fixing it sooner, like I should have.
I didn't mean to disrespect or offend anyone, and it definitely wasn't my 
intention to waste anybody's time. Needless to say, something like this won't 
happen again from my end. :)
I have sent in a v2 for this, which doesn't add a build warning to the system.
Thank you for your time, and once again, my apologies.

Thanks,
Anant


Re: [PATCH v2 2/3] soundwire: SDCA: add helper macro to access controls

2020-09-13 Thread Vinod Koul
Hi Pierre,

On 11-09-20, 09:50, Pierre-Louis Bossart wrote:
> > > > > > > > > + *   25  0 (Reserved)
> > > > > > > > > + *   24:22   Function Number [2:0]
> > > > > > > > > + *   21  Entity[6]
> > > > > > > > > + *   20:19   Control Selector[5:4]
> > > > > > > > > + *   18  0 (Reserved)
> > > > > > > > > + *   17:15   Control Number[5:3]
> > > > > > > > > + *   14  Next
> > > > > > > > > + *   13  MBQ
> > > > > > > > > + *   12:7Entity[5:0]
> > > > > > > > > + *   6:3 Control Selector[3:0]
> > > > > > > > > + *   2:0 Control Number[2:0]
> 
> [...]
> 
> > > > > 
> > > > > #define SDCA_CONTROL_DEST_MASK1 GENMASK(20, 19)
> > > > > #define SDCA_CONTROL_ORIG_MASK1 GENMASK(5, 4)
> > > > > #define SDCA_CONTROL_DEST_MASK2 GENMASK(6, 3)
> > > > > #define SDCA_CONTROL_ORIG_MASK2 GENMASK(3, 0)
> > 
> > I think I missed ORIG and DEST stuff, what does this mean here?
> 
> If you missed this, it means my explanations are not good enough and I need
> to make it clearer in the commit log/documentation. Point taken, I'll
> improve this for the next version.
> 
> > Relooking at the bit definition, for example 'Control Number' is defined
> > in both 17:15 as well as 2:0, why is that. Is it split?
> > 
> > How does one program a control number into this?
> 
> A Control Number is represented on 6 bits.
> 
> See the documentation above.
> 
>   17:15   Control Number[5:3]
>   2:0 Control Number[2:0]
> 
> The 3 MSBs go into bits 17:15 of the address, and the 3 LSBs into bits 2:0
> of the address. The second part is simpler for Control Number but for
> entities and control selectors the LSB positions don't match.
> 
> Yes it's convoluted but it was well-intended: in most cases, there is a
> limited number of entities, control selectors, channel numbers, and putting
> the LSBs together in the 16-LSB of the address helps avoid reprogramming
> paging registers: all the addresses for a given function typically map into
> the same page.
> 
> That said, I am not sure the optimization is that great in the end, because
> we end-up having to play with bits for each address. Fewer changes of the
> paging registers but tons of operations in the core.
> 
> I wasn't around when this mapping was defined, and it is what it is now.
> There's hardware built based on this formula so we have to make it work.
> 
> Does this clarify the usage?

Thanks, that is very helpful. I had overlooked this bit.

For the LSB bits, I don't think this is an issue. I expect it to work, for example:
#define CONTROL_LSB_MASK  GENMASK(2, 0)
foo |= u32_encode_bits(control, CONTROL_LSB_MASK);

would mask the control value and program that into the specific bitfield.

But for the MSB bits, I am not sure the above will work, so you may need to
extract the bits first and then use them, for example:
#define CONTROL_MSB_BITS  GENMASK(5, 3)
#define CONTROL_MSB_MASK  GENMASK(17, 15)

control = FIELD_GET(CONTROL_MSB_BITS, control);
foo |= u32_encode_bits(control, CONTROL_MSB_MASK);

> If you have a better suggestion that the FIELD_PREP/FIELD_GET use, I am all
> ears. At the end of the day, the mapping is pre-defined and we don't have
> any degree of freedom. What I do want is that this macro/inline function is
> shared by all codec drivers so that we don't have different interpretations
> of how the address is constructed.

Absolutely, this needs to be defined here and used by everyone else.

-- 
~Vinod


Re: [PATCH v4 5/5] iommu/vt-d: Add is_aux_domain support

2020-09-13 Thread Lu Baolu

Hi Alex,

On 9/11/20 6:05 AM, Alex Williamson wrote:

On Tue,  1 Sep 2020 11:34:22 +0800
Lu Baolu  wrote:


With subdevice information opted in through the iommu_ops.aux_at(de)tach_dev()
interfaces, the vendor iommu driver can learn the relationships between
the subdevices and the aux-domains. Implement is_aux_domain() support
based on that knowledge.

Signed-off-by: Lu Baolu 
---
  drivers/iommu/intel/iommu.c | 125 ++--
  include/linux/intel-iommu.h |  17 +++--
  2 files changed, 103 insertions(+), 39 deletions(-)

diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 3c12fd06856c..50431c7b2e71 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -334,6 +334,8 @@ static int intel_iommu_attach_device(struct iommu_domain 
*domain,
 struct device *dev);
  static phys_addr_t intel_iommu_iova_to_phys(struct iommu_domain *domain,
dma_addr_t iova);
+static bool intel_iommu_dev_feat_enabled(struct device *dev,
+enum iommu_dev_features feat);
  
  #ifdef CONFIG_INTEL_IOMMU_DEFAULT_ON

  int dmar_disabled = 0;
@@ -1832,6 +1834,7 @@ static struct dmar_domain *alloc_domain(int flags)
domain->flags |= DOMAIN_FLAG_USE_FIRST_LEVEL;
domain->has_iotlb_device = false;
INIT_LIST_HEAD(&domain->devices);
+   INIT_LIST_HEAD(&domain->subdevices);
  
  	return domain;

  }
@@ -2580,7 +2583,7 @@ static struct dmar_domain 
*dmar_insert_one_dev_info(struct intel_iommu *iommu,
info->iommu = iommu;
info->pasid_table = NULL;
info->auxd_enabled = 0;
-   INIT_LIST_HEAD(&info->auxiliary_domains);
+   INIT_LIST_HEAD(&info->subdevices);
  
  	if (dev && dev_is_pci(dev)) {

struct pci_dev *pdev = to_pci_dev(info->dev);
@@ -5137,21 +5140,28 @@ static void intel_iommu_domain_free(struct iommu_domain 
*domain)
domain_exit(to_dmar_domain(domain));
  }
  
-/*

- * Check whether a @domain could be attached to the @dev through the
- * aux-domain attach/detach APIs.
- */
-static inline bool
-is_aux_domain(struct device *dev, struct iommu_domain *domain)
+/* Lookup subdev_info in the domain's subdevice siblings. */
+static struct subdev_info *
+subdev_lookup_domain(struct dmar_domain *domain, struct device *dev,
+struct device *subdev)
  {
-   struct device_domain_info *info = get_domain_info(dev);
+   struct subdev_info *sinfo = NULL, *tmp;
  
-	return info && info->auxd_enabled &&

-   domain->type == IOMMU_DOMAIN_UNMANAGED;
+   assert_spin_locked(&device_domain_lock);
+
+   list_for_each_entry(tmp, &domain->subdevices, link_domain) {
+   if ((!dev || tmp->pdev == dev) && tmp->dev == subdev) {
+   sinfo = tmp;
+   break;
+   }
+   }
+
+   return sinfo;
  }
  
-static void auxiliary_link_device(struct dmar_domain *domain,

- struct device *dev)
+static void
+subdev_link_device(struct dmar_domain *domain, struct device *dev,
+  struct subdev_info *sinfo)
  {
struct device_domain_info *info = get_domain_info(dev);
  
@@ -5159,12 +5169,13 @@ static void auxiliary_link_device(struct dmar_domain *domain,

if (WARN_ON(!info))
return;
  
-	domain->auxd_refcnt++;

-   list_add(&domain->auxd, &info->auxiliary_domains);
+   list_add(&info->subdevices, &sinfo->link_phys);
+   list_add(&domain->subdevices, &sinfo->link_domain);
  }
  
-static void auxiliary_unlink_device(struct dmar_domain *domain,

-   struct device *dev)
+static void
+subdev_unlink_device(struct dmar_domain *domain, struct device *dev,
+struct subdev_info *sinfo)
  {
struct device_domain_info *info = get_domain_info(dev);
  
@@ -5172,24 +5183,30 @@ static void auxiliary_unlink_device(struct dmar_domain *domain,

if (WARN_ON(!info))
return;
  
-	list_del(>auxd);

-   domain->auxd_refcnt--;
+   list_del(&sinfo->link_phys);
+   list_del(&sinfo->link_domain);
+   kfree(sinfo);
  
-	if (!domain->auxd_refcnt && domain->default_pasid > 0)

+   if (list_empty(&domain->subdevices) && domain->default_pasid > 0)
ioasid_free(domain->default_pasid);
  }
  
-static int aux_domain_add_dev(struct dmar_domain *domain,

- struct device *dev)
+static int aux_domain_add_dev(struct dmar_domain *domain, struct device *dev,
+ struct device *subdev)
  {
int ret;
unsigned long flags;
struct intel_iommu *iommu;
+   struct subdev_info *sinfo;
  
  	iommu = device_to_iommu(dev, NULL, NULL);

if (!iommu)
return -ENODEV;
  
+	sinfo = kzalloc(sizeof(*sinfo), GFP_KERNEL);

+   if (!sinfo)
+   return -ENOMEM;
+
if (domain->default_pasid 

Re: [PATCH bpf-next v2 5/6] bpf: Introduce bpf_this_cpu_ptr()

2020-09-13 Thread Hao Luo
Thanks for taking a look!

On Fri, Sep 4, 2020 at 1:09 PM Andrii Nakryiko
 wrote:
>
> On Thu, Sep 3, 2020 at 3:35 PM Hao Luo  wrote:
> >
> > Add bpf_this_cpu_ptr() to help access percpu var on this cpu. This
> > helper always returns a valid pointer, therefore no need to check
> > returned value for NULL. Also note that all programs run with
> > preemption disabled, which means that the returned pointer is stable
> > during all the execution of the program.
> >
> > Signed-off-by: Hao Luo 
> > ---
>
> looks good, few small things, but otherwise:
>
> Acked-by: Andrii Nakryiko 
>
[...]
> >
> >  /* eBPF function prototype used by verifier to allow BPF_CALLs from eBPF 
> > programs
> > diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> > index d0ec94d5bdbf..e7ca91c697ed 100644
> > --- a/include/uapi/linux/bpf.h
> > +++ b/include/uapi/linux/bpf.h
> > @@ -3612,6 +3612,19 @@ union bpf_attr {
> >   * bpf_per_cpu_ptr() must check the returned value.
> >   * Return
> >   * A generic pointer pointing to the kernel percpu variable on 
> > *cpu*.
> > + *
> > + * void *bpf_this_cpu_ptr(const void *percpu_ptr)
> > + * Description
> > + * Take a pointer to a percpu ksym, *percpu_ptr*, and return a
> > + * pointer to the percpu kernel variable on this cpu. See the
> > + * description of 'ksym' in **bpf_per_cpu_ptr**\ ().
> > + *
> > + * bpf_this_cpu_ptr() has the same semantic as this_cpu_ptr() 
> > in
> > + * the kernel. Different from **bpf_per_cpu_ptr**\ (), it would
> > + * never return NULL.
> > + * Return
> > + * A generic pointer pointing to the kernel percpu variable on
>
> what's "a generic pointer"? is it as opposed to sk_buff pointer or something?
>

Ack. "A pointer" should be good enough. I wrote "generic pointer"
because the per_cpu_ptr() in kernel code is a macro, whose returned
value is a typed pointer, IIUC. But here we are missing the type. This
is another difference between this helper and per_cpu_ptr(). But this
may not matter.

> >  /* integer value in 'imm' field of BPF_CALL instruction selects which 
> > helper
> > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> > index a702600ff581..e070d2abc405 100644
> > --- a/kernel/bpf/verifier.c
> > +++ b/kernel/bpf/verifier.c
> > @@ -5016,8 +5016,10 @@ static int check_helper_call(struct bpf_verifier_env 
> > *env, int func_id, int insn
> > regs[BPF_REG_0].type = PTR_TO_MEM_OR_NULL;
> > regs[BPF_REG_0].id = ++env->id_gen;
> > regs[BPF_REG_0].mem_size = meta.mem_size;
> > -   } else if (fn->ret_type == RET_PTR_TO_MEM_OR_BTF_ID_OR_NULL) {
> > +   } else if (fn->ret_type == RET_PTR_TO_MEM_OR_BTF_ID_OR_NULL ||
> > +  fn->ret_type == RET_PTR_TO_MEM_OR_BTF_ID) {
> > const struct btf_type *t;
> > +   bool not_null = fn->ret_type == RET_PTR_TO_MEM_OR_BTF_ID;
>
> nit: this is fine, but I'd inline it below
>

Ack.

> >
> > mark_reg_known_zero(env, regs, BPF_REG_0);
> > t = btf_type_skip_modifiers(btf_vmlinux, meta.ret_btf_id, 
> > NULL);
> > @@ -5034,10 +5036,12 @@ static int check_helper_call(struct 
> > bpf_verifier_env *env, int func_id, int insn
> > tname, PTR_ERR(ret));
> > return -EINVAL;
> > }
> > -   regs[BPF_REG_0].type = PTR_TO_MEM_OR_NULL;
> > +   regs[BPF_REG_0].type = not_null ?
> > +   PTR_TO_MEM : PTR_TO_MEM_OR_NULL;
> > regs[BPF_REG_0].mem_size = tsize;
> > } else {
> > -   regs[BPF_REG_0].type = PTR_TO_BTF_ID_OR_NULL;
> > +   regs[BPF_REG_0].type = not_null ?
> > +   PTR_TO_BTF_ID : PTR_TO_BTF_ID_OR_NULL;
> > regs[BPF_REG_0].btf_id = meta.ret_btf_id;
> > }
> > } else if (fn->ret_type == RET_PTR_TO_BTF_ID_OR_NULL) {
> > diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
> > index d474c1530f87..466acf82a9c7 100644
> > --- a/kernel/trace/bpf_trace.c
> > +++ b/kernel/trace/bpf_trace.c
> > @@ -1160,6 +1160,18 @@ static const struct bpf_func_proto 
> > bpf_per_cpu_ptr_proto = {
> > .arg2_type  = ARG_ANYTHING,
> >  };
> >
> > +BPF_CALL_1(bpf_this_cpu_ptr, const void *, percpu_ptr)
> > +{
> > +   return (u64)this_cpu_ptr(percpu_ptr);
>
> see previous comment, this might trigger unnecessary compilation
> warnings on 32-bit arches
>

Ack. Will cast to "unsigned long". Thanks for catching this!
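
For reference, a minimal userspace sketch of the cast being discussed. It assumes the Linux ILP32/LP64 data models, where unsigned long has the same width as a pointer; ptr_to_u64() is a hypothetical name used only for illustration, not the helper itself.

```c
#include <stdint.h>

/*
 * Hypothetical helper: widen a pointer to a 64-bit value.
 *
 * A direct (uint64_t)p cast converts a 32-bit pointer to a wider integer
 * on 32-bit arches, which compilers may warn about. Going through a
 * pointer-sized integer first (unsigned long on Linux) keeps the
 * pointer-to-integer step the same size and makes the widening explicit.
 */
static uint64_t ptr_to_u64(const void *p)
{
	return (uint64_t)(unsigned long)p;
}
```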


> > +}
> > +
> > +static const struct bpf_func_proto bpf_this_cpu_ptr_proto = {
> > +   .func   = bpf_this_cpu_ptr,
> > +   .gpl_only   = false,
> > +   .ret_type   = RET_PTR_TO_MEM_OR_BTF_ID,
> > +   .arg1_type  = 

Re: [PATCH bpf-next v2 4/6] bpf: Introduce bpf_per_cpu_ptr()

2020-09-13 Thread Hao Luo
Thanks for the review, Andrii.

One question, should I add bpf_{per, this}_cpu_ptr() to the
bpf_base_func_proto() in kernel/bpf/helpers.c?

On Fri, Sep 4, 2020 at 1:04 PM Andrii Nakryiko
 wrote:
>
> On Thu, Sep 3, 2020 at 3:35 PM Hao Luo  wrote:
> >
> > Add bpf_per_cpu_ptr() to help bpf programs access percpu vars.
> > bpf_per_cpu_ptr() has the same semantic as per_cpu_ptr() in the kernel
> > except that it may return NULL. This happens when the cpu parameter is
> > out of range. So the caller must check the returned value.
> >
> > Acked-by: Andrii Nakryiko 
> > Signed-off-by: Hao Luo 
> > ---
> >  include/linux/bpf.h|  3 ++
> >  include/linux/btf.h| 11 ++
> >  include/uapi/linux/bpf.h   | 17 +
> >  kernel/bpf/btf.c   | 10 --
> >  kernel/bpf/verifier.c  | 66 +++---
> >  kernel/trace/bpf_trace.c   | 18 ++
> >  tools/include/uapi/linux/bpf.h | 17 +
> >  7 files changed, 128 insertions(+), 14 deletions(-)
> >
> > diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> > index c6d9f2c444f4..6b2034f7665e 100644
> > --- a/include/linux/bpf.h
> > +++ b/include/linux/bpf.h
> > @@ -292,6 +292,7 @@ enum bpf_arg_type {
> > ARG_PTR_TO_ALLOC_MEM,   /* pointer to dynamically allocated memory 
> > */
> > ARG_PTR_TO_ALLOC_MEM_OR_NULL,   /* pointer to dynamically allocated 
> > memory or NULL */
> > ARG_CONST_ALLOC_SIZE_OR_ZERO,   /* number of allocated bytes 
> > requested */
> > +   ARG_PTR_TO_PERCPU_BTF_ID,   /* pointer to in-kernel percpu type 
> > */
> >  };
> >
> >  /* type of values returned from helper functions */
> > @@ -305,6 +306,7 @@ enum bpf_return_type {
> > RET_PTR_TO_SOCK_COMMON_OR_NULL, /* returns a pointer to a 
> > sock_common or NULL */
> > RET_PTR_TO_ALLOC_MEM_OR_NULL,   /* returns a pointer to dynamically 
> > allocated memory or NULL */
> > RET_PTR_TO_BTF_ID_OR_NULL,  /* returns a pointer to a btf_id or 
> > NULL */
> > +   RET_PTR_TO_MEM_OR_BTF_ID_OR_NULL, /* returns a pointer to a valid 
> > memory or a btf_id or NULL */
> >  };
> >
> >  /* eBPF function prototype used by verifier to allow BPF_CALLs from eBPF 
> > programs
> > @@ -385,6 +387,7 @@ enum bpf_reg_type {
> > PTR_TO_RDONLY_BUF_OR_NULL, /* reg points to a readonly buffer or 
> > NULL */
> > PTR_TO_RDWR_BUF, /* reg points to a read/write buffer */
> > PTR_TO_RDWR_BUF_OR_NULL, /* reg points to a read/write buffer or 
> > NULL */
> > +   PTR_TO_PERCPU_BTF_ID,/* reg points to percpu kernel type */
> >  };
> >
> >  /* The information passed from prog-specific *_is_valid_access
> > diff --git a/include/linux/btf.h b/include/linux/btf.h
> > index 592373d359b9..07b7de1c05b0 100644
> > --- a/include/linux/btf.h
> > +++ b/include/linux/btf.h
> > @@ -71,6 +71,11 @@ btf_resolve_size(const struct btf *btf, const struct 
> > btf_type *type,
> >  i < btf_type_vlen(struct_type);\
> >  i++, member++)
> >
> > +#define for_each_vsi(i, struct_type, member)   \
>
> datasec_type?
>

Hmmm, right. It seems to have come from copy-pasting "for_each_member".

> > +   for (i = 0, member = btf_type_var_secinfo(struct_type); \
> > +i < btf_type_vlen(struct_type);\
> > +i++, member++)
> > +
> >  static inline bool btf_type_is_ptr(const struct btf_type *t)
> >  {
> > return BTF_INFO_KIND(t->info) == BTF_KIND_PTR;
> > @@ -155,6 +160,12 @@ static inline const struct btf_member 
> > *btf_type_member(const struct btf_type *t)
> > return (const struct btf_member *)(t + 1);
> >  }
> >
> > +static inline const struct btf_var_secinfo *btf_type_var_secinfo(
> > +   const struct btf_type *t)
> > +{
> > +   return (const struct btf_var_secinfo *)(t + 1);
> > +}
> > +
> >  #ifdef CONFIG_BPF_SYSCALL
> >  const struct btf_type *btf_type_by_id(const struct btf *btf, u32 type_id);
> >  const char *btf_name_by_offset(const struct btf *btf, u32 offset);
> > diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> > index ab00ad9b32e5..d0ec94d5bdbf 100644
> > --- a/include/uapi/linux/bpf.h
> > +++ b/include/uapi/linux/bpf.h
> > @@ -3596,6 +3596,22 @@ union bpf_attr {
> >   * the data in *dst*. This is a wrapper of copy_from_user().
> >   * Return
> >   * 0 on success, or a negative error in case of failure.
> > + *
> > + * void *bpf_per_cpu_ptr(const void *percpu_ptr, u32 cpu)
> > + * Description
> > + * Take a pointer to a percpu ksym, *percpu_ptr*, and return a
> > + * pointer to the percpu kernel variable on *cpu*. A ksym is an
> > + * extern variable decorated with '__ksym'. For ksym, there is 
> > a
> > + * global var (either static or global) defined of the same 
> > name
> > + * in the kernel. The ksym is percpu if the global var is 

Re: [PATCH v2 1/2] scsi: ufs: Abort tasks before clear them from doorbell

2020-09-13 Thread Can Guo

On 2020-09-11 17:09, Bean Huo wrote:

On Fri, 2020-09-11 at 02:16 +, Can Guo wrote:

> >
> > So your resolution looks good to me.
> >
> > Thanks so much : )
>
> You're welcome ... but just remember I have to explain this to Linus
> when the merge window opens.  It would be a lot easier if this hadn't
> happened so please don't make it any worse ...
>
> James

Sorry that my changes got you confused, and thank you for helping resolve
the conflicts. My change ("scsi: ufs: Abort tasks before clearing them
from doorbell") serves my fixes to ufs error recovery, which only got
picked up on scsi-queue-5.10. So I checked out scsi-queue-5.10 and made
my changes on the tip of scsi-queue-5.10; the 2 changes below were not
even present in scsi-queue-5.10 back at that time.


I mentioned here https://patchwork.kernel.org/patch/11734713/

this change (scsi: ufs: Abort tasks before clearing them from doorbell)
has conflicts with the scsi-fixes branch. I don't know which branch is
the main branch we should focus on.


Bean


Yeah, I know that one, but I was not even working on scsi-fixes branch
at that time. Now I have two more fixes to ufshcd_abort(), not sure
which branch I should work on, so asking the same here.

Regards,

Can Guo.


Re: [PATCH bpf-next v2 3/6] bpf/selftests: ksyms_btf to test typed ksyms

2020-09-13 Thread Hao Luo
Thanks for taking a look, Andrii.

On Fri, Sep 4, 2020 at 12:49 PM Andrii Nakryiko
 wrote:
>
> On Thu, Sep 3, 2020 at 3:35 PM Hao Luo  wrote:
> >
> > Selftests for typed ksyms. Tests two types of ksyms: one is a struct,
> > the other is a plain int. This tests two paths in the kernel. Struct
> > ksyms will be converted into PTR_TO_BTF_ID by the verifier while int
> > typed ksyms will be converted into PTR_TO_MEM.
> >
> > Signed-off-by: Hao Luo 
> > ---
> >  .../testing/selftests/bpf/prog_tests/ksyms.c  | 31 +++--
> >  .../selftests/bpf/prog_tests/ksyms_btf.c  | 63 +++
> >  .../selftests/bpf/progs/test_ksyms_btf.c  | 23 +++
> >  tools/testing/selftests/bpf/trace_helpers.c   | 26 
> >  tools/testing/selftests/bpf/trace_helpers.h   |  4 ++
> >  5 files changed, 123 insertions(+), 24 deletions(-)
> >  create mode 100644 tools/testing/selftests/bpf/prog_tests/ksyms_btf.c
> >  create mode 100644 tools/testing/selftests/bpf/progs/test_ksyms_btf.c
> >

[...]

> > +   percpu_datasec = btf__find_by_name_kind(btf, ".data..percpu",
> > +   BTF_KIND_DATASEC);
> > +   if (percpu_datasec < 0) {
> > +   printf("%s:SKIP:no PERCPU DATASEC in kernel btf\n",
> > +  __func__);
> > +   test__skip();
>
> leaking btf here
>
> > +   return;
> > +   }
> > +
> > +   skel = test_ksyms_btf__open_and_load();
> > +   if (CHECK(!skel, "skel_open", "failed to open and load skeleton\n"))
>
> here
>

Oops. Good catches. Will fix.

> > +   return;
> > +
> > +   err = test_ksyms_btf__attach(skel);
> > +   if (CHECK(err, "skel_attach", "skeleton attach failed: %d\n", err))
> > +   goto cleanup;
> > +
> > +   /* trigger tracepoint */
> > +   usleep(1);
> > +
> > +   data = skel->data;
> > +   CHECK(data->out__runqueues != runqueues_addr, "runqueues",
> > + "got %llu, exp %llu\n", data->out__runqueues, runqueues_addr);
> > +   CHECK(data->out__bpf_prog_active != bpf_prog_active_addr, 
> > "bpf_prog_active",
> > + "got %llu, exp %llu\n", data->out__bpf_prog_active, 
> > bpf_prog_active_addr);
>
> u64 is not %llu on some arches, please cast explicitly to (unsigned long long)
>

Ack.
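
A minimal sketch of the explicit cast requested above, written as plain userspace C. format_addr() is a hypothetical helper for illustration; the selftest itself would pass the cast values to CHECK() instead.

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: format a 64-bit value portably.
 *
 * A 64-bit integer type may be "unsigned long" on some arches and
 * "unsigned long long" on others, so passing it straight to %llu can
 * trigger format warnings. Casting to unsigned long long matches the
 * format specifier everywhere.
 */
static int format_addr(char *buf, size_t len, uint64_t addr)
{
	return snprintf(buf, len, "got %llu", (unsigned long long)addr);
}
```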

> > +
> > +cleanup:
>
> ... and here (I suggest to just jump from all those locations here for 
> cleanup)
>

Makes sense. Will do.
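
The single-exit shape suggested above can be sketched as follows. The res_open()/res_close() helpers and the live_resources counter are hypothetical stand-ins for the btf and skeleton objects in the selftest, used only to show that every exit path releases what was acquired.

```c
#include <stdlib.h>

/* Hypothetical stand-in for a resource such as a btf object or a skeleton. */
struct res { int unused; };

static int live_resources; /* number of resources currently held */

static struct res *res_open(int fail)
{
	struct res *r = fail ? NULL : malloc(sizeof(*r));

	if (r)
		live_resources++;
	return r;
}

/* Like the kernel and libbpf destroy helpers, accepts NULL. */
static void res_close(struct res *r)
{
	if (!r)
		return;
	live_resources--;
	free(r);
}

/* Every failure jumps to the same label, so nothing is leaked. */
static int run_test(int fail_second_open, int fail_later_step)
{
	struct res *btf = NULL, *skel = NULL;
	int err = -1;

	btf = res_open(0);
	if (!btf)
		goto cleanup;

	skel = res_open(fail_second_open);
	if (!skel)
		goto cleanup;

	if (fail_later_step)
		goto cleanup;

	err = 0;
cleanup:
	res_close(skel);
	res_close(btf);
	return err;
}
```

Because the close helper tolerates NULL, the cleanup label does not need to know how far the function got before it failed.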

> > +   test_ksyms_btf__destroy(skel);
> > +}
> > diff --git a/tools/testing/selftests/bpf/progs/test_ksyms_btf.c 
> > b/tools/testing/selftests/bpf/progs/test_ksyms_btf.c
> > new file mode 100644
> > index ..e04e31117f84
> > --- /dev/null
> > +++ b/tools/testing/selftests/bpf/progs/test_ksyms_btf.c
> > @@ -0,0 +1,23 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +/* Copyright (c) 2020 Google */
> > +
> > +#include "vmlinux.h"
> > +
> > +#include 
> > +
> > +__u64 out__runqueues = -1;
> > +__u64 out__bpf_prog_active = -1;
>
> this is addresses, not values, so _addr part would make it clearer.
>

Ack.

> > +
> > +extern const struct rq runqueues __ksym; /* struct type global var. */
> > +extern const int bpf_prog_active __ksym; /* int type global var. */
>
> When we add non-per-CPU kernel variables, I wonder if the fact that we
> have both per-CPU and global kernel variables under the same __ksym
> section would cause any problems and confusion? It's not clear to me
> if we need to have a special __percpu_ksym section or not?..
>

Yeah. Totally agree. I thought about this. I think a separate
__percpu_ksym attribute is *probably* more clear. Not sure though. How
about we introduce a "__percpu_ksym" and make it an alias to "__ksym"
for now? If needed, we make an actual section for it in future.

> > +
> > +SEC("raw_tp/sys_enter")
> > +int handler(const void *ctx)
> > +{
> > +   out__runqueues = (__u64)&runqueues;
> > +   out__bpf_prog_active = (__u64)&bpf_prog_active;
> > +
> > +   return 0;
> > +}
> > +
> > +char _license[] SEC("license") = "GPL";
> > diff --git a/tools/testing/selftests/bpf/trace_helpers.c 
> > b/tools/testing/selftests/bpf/trace_helpers.c
> > index 4d0e913bbb22..ade555fe8294 100644
> > --- a/tools/testing/selftests/bpf/trace_helpers.c
> > +++ b/tools/testing/selftests/bpf/trace_helpers.c
> > @@ -90,6 +90,32 @@ long ksym_get_addr(const char *name)
> > return 0;
> >  }
> >
> > +/* open kallsyms and read symbol addresses on the fly. Without caching all 
> > symbols,
> > + * this is faster than load + find. */
> > +int kallsyms_find(const char *sym, unsigned long long *addr)
> > +{
> > +   char type, name[500];
> > +   unsigned long long value;
> > +   int err = 0;
> > +   FILE *f;
> > +
> > +   f = fopen("/proc/kallsyms", "r");
> > +   if (!f)
> > +   return -ENOENT;
> > +
> > +   while (fscanf(f, "%llx %c %499s%*[^\n]\n", &value, &type, name) > 
> > 0) {
> > +   if (strcmp(name, sym) == 0) {
> > +   *addr = 

Re: [PATCH bpf-next v2 2/6] bpf/libbpf: BTF support for typed ksyms

2020-09-13 Thread Hao Luo
Will follow the libbpf logging convention. Thanks for the suggestions.

On Fri, Sep 4, 2020 at 12:34 PM Andrii Nakryiko
 wrote:
>
> On Thu, Sep 3, 2020 at 3:34 PM Hao Luo  wrote:
> >
> > If a ksym is defined with a type, libbpf will try to find the ksym's btf
> > information from kernel btf. If a valid btf entry for the ksym is found,
> > libbpf can pass in the found btf id to the verifier, which validates the
> > ksym's type and value.
> >
> > Typeless ksyms (i.e. those defined as 'void') will not have such btf_id,
> > but it has the symbol's address (read from kallsyms) and its value is
> > treated as a raw pointer.
> >
> > Signed-off-by: Hao Luo 
> > ---
>
> Logic looks correct, but I have complaints about libbpf logging
> consistency, please see suggestions below.
>
> >  tools/lib/bpf/libbpf.c | 116 -
> >  1 file changed, 102 insertions(+), 14 deletions(-)
> >
>
> [...]
>
> > @@ -3119,6 +3130,8 @@ static int bpf_object__collect_externs(struct 
> > bpf_object *obj)
> > vt->type = int_btf_id;
> > vs->offset = off;
> > vs->size = sizeof(int);
> > +   pr_debug("ksym var_secinfo: var '%s', type #%d, 
> > size %d, offset %d\n",
> > +ext->name, vt->type, vs->size, vs->offset);
>
> debug leftover?
>

I was thinking we should leave a debug message when some entries in
BTF are modified. It's probably unnecessary, as I'm thinking of it
right now. I will remove this in v3.

> > }
> > sec->size = off;
> > }
> > @@ -5724,8 +5737,13 @@ bpf_program__relocate(struct bpf_program *prog, 
> > struct bpf_object *obj)
> > insn[0].imm = 
> > obj->maps[obj->kconfig_map_idx].fd;
> > insn[1].imm = ext->kcfg.data_off;
> > } else /* EXT_KSYM */ {
> > -   insn[0].imm = (__u32)ext->ksym.addr;
> > -   insn[1].imm = ext->ksym.addr >> 32;
> > +   if (ext->ksym.type_id) { /* typed ksyms */
> > +   insn[0].src_reg = BPF_PSEUDO_BTF_ID;
> > +   insn[0].imm = 
> > ext->ksym.vmlinux_btf_id;
> > +   } else { /* typeless ksyms */
> > +   insn[0].imm = (__u32)ext->ksym.addr;
> > +   insn[1].imm = ext->ksym.addr >> 32;
> > +   }
> > }
> > break;
> > case RELO_CALL:
> > @@ -6462,10 +6480,72 @@ static int bpf_object__read_kallsyms_file(struct 
> > bpf_object *obj)
> > return err;
> >  }
> >
> > +static int bpf_object__resolve_ksyms_btf_id(struct bpf_object *obj)
> > +{
> > +   struct extern_desc *ext;
> > +   int i, id;
> > +
> > +   if (!obj->btf_vmlinux) {
> > +   pr_warn("support of typed ksyms needs kernel btf.\n");
> > +   return -ENOENT;
> > +   }
>
> This check shouldn't be needed, you'd either successfully load
> btf_vmlinux by now or will fail earlier, because BTF is required but
> not found.
>
> > +
> > +   for (i = 0; i < obj->nr_extern; i++) {
> > +   const struct btf_type *targ_var, *targ_type;
> > +   __u32 targ_type_id, local_type_id;
> > +   int ret;
> > +
> > +   ext = &obj->externs[i];
> > +   if (ext->type != EXT_KSYM || !ext->ksym.type_id)
> > +   continue;
> > +
> > +   id = btf__find_by_name_kind(obj->btf_vmlinux, ext->name,
> > +   BTF_KIND_VAR);
> > +   if (id <= 0) {
> > +   pr_warn("no btf entry for ksym '%s' in vmlinux.\n",
> > +   ext->name);
>
> please try to stick to consistent style of comments:
>
> "extern (ksym) '%s': failed to find BTF ID in vmlinux BTF" or
> something like that
>
>
> > +   return -ESRCH;
> > +   }
> > +
> > +   /* find target type_id */
> > +   targ_var = btf__type_by_id(obj->btf_vmlinux, id);
> > +   targ_type = skip_mods_and_typedefs(obj->btf_vmlinux,
> > +  targ_var->type,
> > +  &targ_type_id);
> > +
> > +   /* find local type_id */
> > +   local_type_id = ext->ksym.type_id;
> > +
> > +   ret = bpf_core_types_are_compat(obj->btf_vmlinux, 
> > targ_type_id,
> > +   obj->btf, local_type_id);
>
> you reversed the order, it's always local btf/id, then target btf/id.
>
> > +   if (ret <= 0) {
> > +   const struct btf_type *local_type;
> > +   const 

Re: [PATCH bpf-next v2 1/6] bpf: Introduce pseudo_btf_id

2020-09-13 Thread Hao Luo
Andrii,

Sorry for the late reply. Your suggestions are concrete and helpful. I
can apply them in v3.

Thanks!
Hao

On Fri, Sep 4, 2020 at 12:05 PM Andrii Nakryiko
 wrote:
>
> On Thu, Sep 3, 2020 at 3:34 PM Hao Luo  wrote:
> >
> > Pseudo_btf_id is a type of ld_imm insn that associates a btf_id to a
> > ksym so that further dereferences on the ksym can use the BTF info
> > to validate accesses. Internally, when seeing a pseudo_btf_id ld insn,
> > the verifier reads the btf_id stored in the insn[0]'s imm field and
> > marks the dst_reg as PTR_TO_BTF_ID. The btf_id points to a VAR_KIND,
> > which is encoded in btf_vminux by pahole. If the VAR is not of a struct
> > type, the dst reg will be marked as PTR_TO_MEM instead of PTR_TO_BTF_ID
> > and the mem_size is resolved to the size of the VAR's type.
> >
> > From the VAR btf_id, the verifier can also read the address of the
> > ksym's corresponding kernel var from kallsyms and use that to fill
> > dst_reg.
> >
> > Therefore, the proper functionality of pseudo_btf_id depends on (1)
> > kallsyms and (2) the encoding of kernel global VARs in pahole, which
> > should be available since pahole v1.18.
> >
> > Signed-off-by: Hao Luo 
> > ---
>
> Logic looks correct, but I have a few suggestions for naming things
> and verifier logs. Please see below.
>
> >  include/linux/bpf_verifier.h   |   4 ++
> >  include/linux/btf.h|  15 +
> >  include/uapi/linux/bpf.h   |  38 ---
> >  kernel/bpf/btf.c   |  15 -
> >  kernel/bpf/verifier.c  | 112 ++---
> >  tools/include/uapi/linux/bpf.h |  38 ---
> >  6 files changed, 182 insertions(+), 40 deletions(-)
> >
> > diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
> > index 53c7bd568c5d..a14063f64d96 100644
> > --- a/include/linux/bpf_verifier.h
> > +++ b/include/linux/bpf_verifier.h
> > @@ -308,6 +308,10 @@ struct bpf_insn_aux_data {
> > u32 map_index;  /* index into used_maps[] */
> > u32 map_off;/* offset from value base 
> > address */
> > };
> > +   struct {
> > +   u32 pseudo_btf_id_type; /* type of pseudo_btf_id */
> > +   u32 pseudo_btf_id_meta; /* memsize or btf_id */
>
> a bit misleading names, not clear at all in the code what's in there.
> This section is for ldimm64 insns that are loading BTF variables,
> right? so how about this:
>
> struct {
> u32 reg_type;
> union {
> u32 btf_id;
> u32 memsize;
> };
> } btf_var;
>
> In case someone hates non-anonymous structs, I'd still go with
> btf_var_reg_type, btf_var_btf_id and btf_var_memsize.
>
> > +   };
> > };
> > u64 map_key_state; /* constant (32 bit) key tracking for maps */
> > int ctx_field_size; /* the ctx field size for load insn, maybe 0 */
> > diff --git a/include/linux/btf.h b/include/linux/btf.h
> > index a9af5e7a7ece..592373d359b9 100644
> > --- a/include/linux/btf.h
> > +++ b/include/linux/btf.h
> > @@ -106,6 +106,21 @@ static inline bool btf_type_is_func_proto(const struct 
> > btf_type *t)
> > return BTF_INFO_KIND(t->info) == BTF_KIND_FUNC_PROTO;
> >  }
> >
>
> [...]
>
> > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> > index b4e9c56b8b32..3b382c080cfd 100644
> > --- a/kernel/bpf/verifier.c
> > +++ b/kernel/bpf/verifier.c
> > @@ -7398,6 +7398,24 @@ static int check_ld_imm(struct bpf_verifier_env 
> > *env, struct bpf_insn *insn)
> > return 0;
> > }
> >
> > +   if (insn->src_reg == BPF_PSEUDO_BTF_ID) {
> > +   u32 type = aux->pseudo_btf_id_type;
> > +   u32 meta = aux->pseudo_btf_id_meta;
> > +
> > +   mark_reg_known_zero(env, regs, insn->dst_reg);
> > +
> > +   regs[insn->dst_reg].type = type;
> > +   if (type == PTR_TO_MEM) {
> > +   regs[insn->dst_reg].mem_size = meta;
> > +   } else if (type == PTR_TO_BTF_ID) {
> > +   regs[insn->dst_reg].btf_id = meta;
>
> nit: probably worthwhile to introduce a local variable (dst_reg) to
> capture pointer to regs[insn->dst_reg] in this entire function. Then
> no reall need for type and meta local vars above, everything is going
> to be short and sweet.
>
> > +   } else {
> > +   verbose(env, "bpf verifier is misconfigured\n");
> > +   return -EFAULT;
> > +   }
> > +   return 0;
> > +   }
> > +
> > map = env->used_maps[aux->map_index];
> > mark_reg_known_zero(env, regs, insn->dst_reg);
> > regs[insn->dst_reg].map_ptr = map;
> > @@ -9284,6 +9302,74 @@ static int do_check(struct bpf_verifier_env *env)
> > return 0;
> >  }
> >
> > +/* replace pseudo btf_id with kernel symbol address */
> > +static int check_pseudo_btf_id(struct bpf_verifier_env *env,

[PATCH v2 3/4] sparc64: remove mm_cpumask clearing to fix kthread_use_mm race

2020-09-13 Thread Nicholas Piggin
The de facto (and apparently uncommented) standard for using an mm had,
thanks to this code in sparc if nothing else, been that you must have a
reference on mm_users *and that reference must have been obtained with
mmget()*, i.e., from a thread with a reference to mm_users that had used
the mm.

The introduction of mmget_not_zero() in commit d2005e3f41d4
("userfaultfd: don't pin the user memory in userfaultfd_file_create()")
allowed mm_count holders to operate on user mappings asynchronously
from the actual threads using the mm, but they were not to load those
mappings into their TLB (i.e., walking vmas and page tables is okay,
kthread_use_mm() is not).

io_uring 2b188cc1bb857 ("Add io_uring IO interface") added code which
does a kthread_use_mm() from a mmget_not_zero() refcount.

The problem with this is that code which previously assumed mm == current->mm
and mm->mm_users == 1 implies the mm will remain single-threaded at
least until this thread creates another mm_users reference is now
broken.

arch/sparc/kernel/smp_64.c:

if (atomic_read(&mm->mm_users) == 1) {
cpumask_copy(mm_cpumask(mm), cpumask_of(cpu));
goto local_flush_and_out;
}

vs fs/io_uring.c

if (unlikely(!(ctx->flags & IORING_SETUP_SQPOLL) ||
 !mmget_not_zero(ctx->sqo_mm)))
return -EFAULT;
kthread_use_mm(ctx->sqo_mm);

mmget_not_zero() could come in right after the mm_users == 1 test, then
kthread_use_mm() which sets its CPU in the mm_cpumask. That update could
be lost if cpumask_copy() occurs afterward.
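
The interleaving can be replayed deterministically in a small userspace model. This is only a sketch under stated assumptions: struct fake_mm and the three step functions are hypothetical stand-ins for the kernel objects and code paths named above, and the "CPUs" are just sequential calls in the problematic order.

```c
#include <stdint.h>

/* Hypothetical model of the two mm fields involved in the race. */
struct fake_mm {
	int mm_users;
	uint32_t cpumask; /* bit n set: CPU n may hold TLB entries */
};

/* CPU 0, step 1: the mm_users == 1 test in smp_flush_tlb_mm(). */
static int flusher_sees_single_user(const struct fake_mm *mm)
{
	return mm->mm_users == 1;
}

/* CPU 1, step 2: models mmget_not_zero() followed by kthread_use_mm(). */
static void kthread_takes_mm(struct fake_mm *mm, int cpu)
{
	if (mm->mm_users == 0)
		return;
	mm->mm_users++;
	mm->cpumask |= UINT32_C(1) << cpu;
}

/* CPU 0, step 3: models cpumask_copy(mm_cpumask(mm), cpumask_of(cpu)). */
static void flusher_trims_mask(struct fake_mm *mm, int cpu)
{
	mm->cpumask = UINT32_C(1) << cpu;
}

/* Replay the bad ordering and return the resulting mask. */
static uint32_t replay_race(void)
{
	struct fake_mm mm = { .mm_users = 1, .cpumask = UINT32_C(1) << 0 };

	if (flusher_sees_single_user(&mm)) {
		kthread_takes_mm(&mm, 1);   /* sneaks in after the test */
		flusher_trims_mask(&mm, 0); /* overwrites CPU 1's bit */
	}
	return mm.cpumask;
}
```

In this model the final mask contains only CPU 0 even though CPU 1 is now using the mm, which is the lost update described above.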

I propose we fix this by allowing mmget_not_zero() to be a first-class
reference, and not have this obscure undocumented and unchecked
restriction.

The basic fix for sparc64 is to remove its mm_cpumask clearing code. The
optimisation could be effectively restored by sending IPIs to mm_cpumask
members and having them remove themselves from mm_cpumask. This is more
tricky so I leave it as an exercise for someone with a sparc64 SMP.
powerpc has a (currently similarly broken) example.

Signed-off-by: Nicholas Piggin 
---
 arch/sparc/kernel/smp_64.c | 65 --
 1 file changed, 14 insertions(+), 51 deletions(-)

diff --git a/arch/sparc/kernel/smp_64.c b/arch/sparc/kernel/smp_64.c
index e286e2badc8a..e38d8bf454e8 100644
--- a/arch/sparc/kernel/smp_64.c
+++ b/arch/sparc/kernel/smp_64.c
@@ -1039,38 +1039,9 @@ void smp_fetch_global_pmu(void)
  * are flush_tlb_*() routines, and these run after flush_cache_*()
  * which performs the flushw.
  *
- * The SMP TLB coherency scheme we use works as follows:
- *
- * 1) mm->cpu_vm_mask is a bit mask of which cpus an address
- *space has (potentially) executed on, this is the heuristic
- *we use to avoid doing cross calls.
- *
- *Also, for flushing from kswapd and also for clones, we
- *use cpu_vm_mask as the list of cpus to make run the TLB.
- *
- * 2) TLB context numbers are shared globally across all processors
- *in the system, this allows us to play several games to avoid
- *cross calls.
- *
- *One invariant is that when a cpu switches to a process, and
- *that processes tsk->active_mm->cpu_vm_mask does not have the
- *current cpu's bit set, that tlb context is flushed locally.
- *
- *If the address space is non-shared (ie. mm->count == 1) we avoid
- *cross calls when we want to flush the currently running process's
- *tlb state.  This is done by clearing all cpu bits except the current
- *processor's in current->mm->cpu_vm_mask and performing the
- *flush locally only.  This will force any subsequent cpus which run
- *this task to flush the context from the local tlb if the process
- *migrates to another cpu (again).
- *
- * 3) For shared address spaces (threads) and swapping we bite the
- *bullet for most cases and perform the cross call (but only to
- *the cpus listed in cpu_vm_mask).
- *
- *The performance gain from "optimizing" away the cross call for threads is
- *questionable (in theory the big win for threads is the massive sharing of
- *address space state across processors).
+ * mm->cpu_vm_mask is a bit mask of which cpus an address
+ * space has (potentially) executed on, this is the heuristic
+ * we use to limit cross calls.
  */
 
 /* This currently is only used by the hugetlb arch pre-fault
@@ -1080,18 +1051,13 @@ void smp_fetch_global_pmu(void)
 void smp_flush_tlb_mm(struct mm_struct *mm)
 {
u32 ctx = CTX_HWBITS(mm->context);
-   int cpu = get_cpu();
 
-   if (atomic_read(&mm->mm_users) == 1) {
-   cpumask_copy(mm_cpumask(mm), cpumask_of(cpu));
-   goto local_flush_and_out;
-   }
+   get_cpu();
 
smp_cross_call_masked(&xcall_flush_tlb_mm,
  ctx, 0, 0,
  mm_cpumask(mm));
 
-local_flush_and_out:
__flush_tlb_mm(ctx, SECONDARY_CONTEXT);
 
put_cpu();
@@ -1114,17 +1080,15 @@ void smp_flush_tlb_pending(struct 

[PATCH v2 4/4] powerpc/64s/radix: Fix mm_cpumask trimming race vs kthread_use_mm

2020-09-13 Thread Nicholas Piggin
Commit 0cef77c7798a7 ("powerpc/64s/radix: flush remote CPUs out of
single-threaded mm_cpumask") added a mechanism to trim the mm_cpumask of
a process under certain conditions. One of the assumptions is that
mm_users would not be incremented via a reference outside the process
context with mmget_not_zero() then go on to kthread_use_mm() via that
reference.

That invariant was broken by io_uring code (see previous sparc64 fix),
but I'll point Fixes: to the original powerpc commit because we are
changing that assumption going forward, so this will make backports
match up.

Fix this by no longer relying on that assumption, but by having each CPU
check the mm is not being used, and clearing their own bit from the mask
only if it hasn't been switched-to by the time the IPI is processed.
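
To make the per-CPU rule above concrete, here is a minimal userspace sketch. It is only a model under stated assumptions: sim_mm, sim_cpu, sim_init_mm, sim_exit_flush_lazy_tlb() and sim_trim() are hypothetical names, the cpumask is a plain bitmask, the "IPI" is an ordinary function call, and the TLB flush itself is not modelled.

```c
#include <stddef.h>

#define SIM_NR_CPUS 4

/* Hypothetical stand-ins; only the fields the trimming logic touches. */
struct sim_mm { unsigned int cpumask; int active_cpus; };
struct sim_cpu { struct sim_mm *mm, *active_mm; };  /* "current" on that CPU */

static struct sim_mm sim_init_mm;

/* Models the per-CPU handler: decide locally whether to leave the mask. */
static void sim_exit_flush_lazy_tlb(struct sim_cpu *cur, int cpu,
                                    struct sim_mm *mm)
{
    if (cur->mm == mm)
        return;                         /* mm is in use here: keep our bit */

    if (cur->active_mm == mm)
        cur->active_mm = &sim_init_mm;  /* lazy user: switch to init_mm */

    mm->active_cpus--;
    mm->cpumask &= ~(1u << cpu);        /* a CPU clears only its own bit */
}

/* Models the sender: run the handler on every CPU set in the mask. */
static void sim_trim(struct sim_cpu cpus[SIM_NR_CPUS], struct sim_mm *mm)
{
    unsigned int snapshot = mm->cpumask;

    for (int cpu = 0; cpu < SIM_NR_CPUS; cpu++)
        if (snapshot & (1u << cpu))
            sim_exit_flush_lazy_tlb(&cpus[cpu], cpu, mm);
}
```

In this model a CPU that has picked up the mm through something like kthread_use_mm() keeps its bit, because the decision is made on that CPU from its own current->mm rather than by the sender rewriting the whole mask.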

This relies on commit 38cf307c1f20 ("mm: fix kthread_use_mm() vs TLB
invalidate") and ARCH_WANT_IRQS_OFF_ACTIVATE_MM to disable irqs over mm
switch sequences.

Reviewed-by: Michael Ellerman 
Depends-on: 38cf307c1f20 ("mm: fix kthread_use_mm() vs TLB invalidate")
Fixes: 0cef77c7798a7 ("powerpc/64s/radix: flush remote CPUs out of single-threaded mm_cpumask")
Signed-off-by: Nicholas Piggin 
---
 arch/powerpc/include/asm/tlb.h   | 13 -
 arch/powerpc/mm/book3s64/radix_tlb.c | 23 ---
 2 files changed, 16 insertions(+), 20 deletions(-)

diff --git a/arch/powerpc/include/asm/tlb.h b/arch/powerpc/include/asm/tlb.h
index fbc6f3002f23..d97f061fecac 100644
--- a/arch/powerpc/include/asm/tlb.h
+++ b/arch/powerpc/include/asm/tlb.h
@@ -66,19 +66,6 @@ static inline int mm_is_thread_local(struct mm_struct *mm)
return false;
return cpumask_test_cpu(smp_processor_id(), mm_cpumask(mm));
 }
-static inline void mm_reset_thread_local(struct mm_struct *mm)
-{
-   WARN_ON(atomic_read(&mm->context.copros) > 0);
-   /*
-* It's possible for mm_access to take a reference on mm_users to
-* access the remote mm from another thread, but it's not allowed
-* to set mm_cpumask, so mm_users may be > 1 here.
-*/
-   WARN_ON(current->mm != mm);
-   atomic_set(&mm->context.active_cpus, 1);
-   cpumask_clear(mm_cpumask(mm));
-   cpumask_set_cpu(smp_processor_id(), mm_cpumask(mm));
-}
 #else /* CONFIG_PPC_BOOK3S_64 */
 static inline int mm_is_thread_local(struct mm_struct *mm)
 {
diff --git a/arch/powerpc/mm/book3s64/radix_tlb.c b/arch/powerpc/mm/book3s64/radix_tlb.c
index 0d233763441f..143b4fd396f0 100644
--- a/arch/powerpc/mm/book3s64/radix_tlb.c
+++ b/arch/powerpc/mm/book3s64/radix_tlb.c
@@ -645,19 +645,29 @@ static void do_exit_flush_lazy_tlb(void *arg)
struct mm_struct *mm = arg;
unsigned long pid = mm->context.id;
 
+   /*
+* A kthread could have done a mmget_not_zero() after the flushing CPU
+* checked mm_is_singlethreaded, and be in the process of
+* kthread_use_mm when interrupted here. In that case, current->mm will
+* be set to mm, because kthread_use_mm() setting ->mm and switching to
+* the mm is done with interrupts off.
+*/
if (current->mm == mm)
-   return; /* Local CPU */
+   goto out_flush;
 
if (current->active_mm == mm) {
-   /*
-* Must be a kernel thread because sender is single-threaded.
-*/
-   BUG_ON(current->mm);
+   WARN_ON_ONCE(current->mm != NULL);
+   /* Is a kernel thread and is using mm as the lazy tlb */
mmgrab(&init_mm);
-   switch_mm(mm, &init_mm, current);
current->active_mm = &init_mm;
+   switch_mm_irqs_off(mm, &init_mm, current);
mmdrop(mm);
}
+
+   atomic_dec(&mm->context.active_cpus);
+   cpumask_clear_cpu(smp_processor_id(), mm_cpumask(mm));
+
+out_flush:
_tlbiel_pid(pid, RIC_FLUSH_ALL);
 }
 
@@ -672,7 +682,6 @@ static void exit_flush_lazy_tlbs(struct mm_struct *mm)
 */
smp_call_function_many(mm_cpumask(mm), do_exit_flush_lazy_tlb,
(void *)mm, 1);
-   mm_reset_thread_local(mm);
 }
 
 void radix__flush_tlb_mm(struct mm_struct *mm)
-- 
2.23.0



[PATCH v2 0/4] more mm switching vs TLB shootdown and lazy tlb fixes

2020-09-13 Thread Nicholas Piggin
This is an attempt to fix a few different related issues around
switching mm, TLB flushing, and lazy tlb mm handling.

This will require all architectures to eventually move to disabling
irqs over activate_mm, but it's possible we could add another arch
call after irqs are re-enabled for those few which can't do their
entire activation with irqs disabled.

Testing so far indicates this has fixed a mm refcounting bug that
powerpc was running into (via distro report and backport). I haven't
had any real feedback on this series outside powerpc (and it doesn't
really affect other archs), so I propose patches 1,2,4 go via the
powerpc tree.

There is no dependency between them and patch 3, I put it there only
because it follows the history of the code (powerpc code was written
using the sparc64 logic), but I guess they have to go via different arch
trees. Dave, I'll leave patch 3 with you.

Thanks,
Nick

Since v1:
- Updates from Michael Ellerman's review comments.

Nicholas Piggin (4):
  mm: fix exec activate_mm vs TLB shootdown and lazy tlb switching race
  powerpc: select ARCH_WANT_IRQS_OFF_ACTIVATE_MM
  sparc64: remove mm_cpumask clearing to fix kthread_use_mm race
  powerpc/64s/radix: Fix mm_cpumask trimming race vs kthread_use_mm

 arch/Kconfig   |  7 +++
 arch/powerpc/Kconfig   |  1 +
 arch/powerpc/include/asm/mmu_context.h |  2 +-
 arch/powerpc/include/asm/tlb.h | 13 --
 arch/powerpc/mm/book3s64/radix_tlb.c   | 23 ++---
 arch/sparc/kernel/smp_64.c | 65 ++
 fs/exec.c  | 17 ++-
 7 files changed, 54 insertions(+), 74 deletions(-)

-- 
2.23.0



[PATCH v2 1/4] mm: fix exec activate_mm vs TLB shootdown and lazy tlb switching race

2020-09-13 Thread Nicholas Piggin
Reading and modifying current->mm and current->active_mm and switching
mm should be done with irqs off, to prevent races seeing an intermediate
state.

This is similar to commit 38cf307c1f20 ("mm: fix kthread_use_mm() vs TLB
invalidate"). At exec-time when the new mm is activated, the old one
should usually be single-threaded and no longer used, unless something
else is holding an mm_users reference (which may be possible).

Absent other mm_users, there is also a race with preemption and lazy tlb
switching. Consider the kernel_execve case where the current thread is
using a lazy tlb active mm:

  call_usermodehelper()
    kernel_execve()
      old_mm = current->mm;
      active_mm = current->active_mm;
      *** preempt *** -------------------->  schedule()
                                               prev->active_mm = NULL;
                                               mmdrop(prev active_mm);
                                             ...
                      <--------------------  schedule()
      current->mm = mm;
      current->active_mm = mm;
      if (!old_mm)
          mmdrop(active_mm);

If we switch back to the kernel thread from a different mm, there is a
double free of the old active_mm, and a missing free of the new one.
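
The refcount arithmetic of that interleaving can be checked with a small userspace model. This is a sketch under stated assumptions, not kernel code: toy_mm, toy_task and the toy_* helpers are hypothetical, the reference count is a plain int, and the "preemption" is just a function call placed inside the window.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; only the fields the race touches. */
struct toy_mm { int count; };                       /* models mm_count */
struct toy_task { struct toy_mm *mm, *active_mm; };

static void toy_mmgrab(struct toy_mm *mm) { mm->count++; }
static void toy_mmdrop(struct toy_mm *mm) { mm->count--; }

/* The kernel thread is switched out: its lazy reference is dropped. */
static void toy_switch_out(struct toy_task *kt)
{
    struct toy_mm *lazy = kt->active_mm;

    kt->active_mm = NULL;
    toy_mmdrop(lazy);
}

/* The kernel thread is switched back in and borrows the previous mm. */
static void toy_switch_in(struct toy_task *kt, struct toy_mm *prev_mm)
{
    kt->active_mm = prev_mm;
    toy_mmgrab(prev_mm);
}

/*
 * Models the exec-time update. With preempt_in_window set, a context
 * switch lands between sampling active_mm and updating the task.
 */
static void toy_exec_mmap(struct toy_task *kt, struct toy_mm *new_mm,
                          struct toy_mm *other_mm, bool preempt_in_window)
{
    struct toy_mm *old_mm = kt->mm;
    struct toy_mm *active_mm = kt->active_mm;       /* sampled early */

    if (preempt_in_window) {
        toy_switch_out(kt);
        toy_switch_in(kt, other_mm);
    }

    kt->mm = new_mm;
    kt->active_mm = new_mm;
    if (!old_mm)
        toy_mmdrop(active_mm);                      /* may be stale */
}

/* Runs one scenario and reports the resulting reference counts. */
static void toy_run(bool preempt, int *old_count, int *other_count)
{
    struct toy_mm old_lazy = { .count = 2 };  /* owner's ref + lazy ref */
    struct toy_mm other = { .count = 1 };
    struct toy_mm new_mm = { .count = 1 };
    struct toy_task kt = { .mm = NULL, .active_mm = &old_lazy };

    toy_exec_mmap(&kt, &new_mm, &other, preempt);
    *old_count = old_lazy.count;
    *other_count = other.count;
}
```

Without the preemption the old lazy mm ends at one reference (its owner's) and the other mm is untouched. With the preemption inside the window the old mm is dropped twice and the other mm keeps a reference nobody will release, which is the double free and missing free described above.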

Closing this race only requires interrupts to be disabled while ->mm
and ->active_mm are being switched, but the TLB problem requires also
holding interrupts off over activate_mm. Unfortunately not all archs
can do that yet, e.g., arm defers the switch if irqs are disabled and
expects finish_arch_post_lock_switch() to be called to complete the
flush; um takes a blocking lock in activate_mm().

So as a first step, disable interrupts across the mm/active_mm updates
to close the lazy tlb preempt race, and provide an arch option to
extend that to activate_mm which allows architectures doing IPI based
TLB shootdowns to close the second race.

This is a bit ugly, but in the interest of fixing the bug and backporting
before all architectures are converted this is a compromise.

Signed-off-by: Nicholas Piggin 
---
 arch/Kconfig |  7 +++
 fs/exec.c| 17 +++--
 2 files changed, 22 insertions(+), 2 deletions(-)

diff --git a/arch/Kconfig b/arch/Kconfig
index af14a567b493..94821e3f94d1 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -414,6 +414,13 @@ config MMU_GATHER_NO_GATHER
bool
depends on MMU_GATHER_TABLE_FREE
 
+config ARCH_WANT_IRQS_OFF_ACTIVATE_MM
+   bool
+   help
+ Temporary select until all architectures can be converted to have
+ irqs disabled over activate_mm. Architectures that do IPI based TLB
+ shootdowns should enable this.
+
 config ARCH_HAVE_NMI_SAFE_CMPXCHG
bool
 
diff --git a/fs/exec.c b/fs/exec.c
index a91003e28eaa..d4fb18baf1fb 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -1130,11 +1130,24 @@ static int exec_mmap(struct mm_struct *mm)
}
 
task_lock(tsk);
-   active_mm = tsk->active_mm;
membarrier_exec_mmap(mm);
-   tsk->mm = mm;
+
+   local_irq_disable();
+   active_mm = tsk->active_mm;
tsk->active_mm = mm;
+   tsk->mm = mm;
+   /*
+* This prevents preemption while active_mm is being loaded and
+* it and mm are being updated, which could cause problems for
+* lazy tlb mm refcounting when these are updated by context
+* switches. Not all architectures can handle irqs off over
+* activate_mm yet.
+*/
+   if (!IS_ENABLED(CONFIG_ARCH_WANT_IRQS_OFF_ACTIVATE_MM))
+   local_irq_enable();
activate_mm(active_mm, mm);
+   if (IS_ENABLED(CONFIG_ARCH_WANT_IRQS_OFF_ACTIVATE_MM))
+   local_irq_enable();
tsk->mm->vmacache_seqnum = 0;
vmacache_flush(tsk);
task_unlock(tsk);
-- 
2.23.0



[PATCH v2 2/4] powerpc: select ARCH_WANT_IRQS_OFF_ACTIVATE_MM

2020-09-13 Thread Nicholas Piggin
powerpc uses IPIs in some situations to switch a kernel thread away
from a lazy tlb mm, which is subject to the TLB flushing race
described in the changelog introducing ARCH_WANT_IRQS_OFF_ACTIVATE_MM.

Signed-off-by: Nicholas Piggin 
---
 arch/powerpc/Kconfig   | 1 +
 arch/powerpc/include/asm/mmu_context.h | 2 +-
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 65bed1fdeaad..587ba8352d01 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -149,6 +149,7 @@ config PPC
select ARCH_USE_QUEUED_RWLOCKS  if PPC_QUEUED_SPINLOCKS
select ARCH_USE_QUEUED_SPINLOCKSif PPC_QUEUED_SPINLOCKS
select ARCH_WANT_IPC_PARSE_VERSION
+   select ARCH_WANT_IRQS_OFF_ACTIVATE_MM
select ARCH_WEAK_RELEASE_ACQUIRE
select BINFMT_ELF
select BUILDTIME_TABLE_SORT
diff --git a/arch/powerpc/include/asm/mmu_context.h b/arch/powerpc/include/asm/mmu_context.h
index a3a12a8341b2..b42813359f49 100644
--- a/arch/powerpc/include/asm/mmu_context.h
+++ b/arch/powerpc/include/asm/mmu_context.h
@@ -244,7 +244,7 @@ static inline void switch_mm(struct mm_struct *prev, struct mm_struct *next,
 #define activate_mm activate_mm
 static inline void activate_mm(struct mm_struct *prev, struct mm_struct *next)
 {
-   switch_mm(prev, next, current);
+   switch_mm_irqs_off(prev, next, current);
 }
 
 /* We don't currently use enter_lazy_tlb() for anything */
-- 
2.23.0



Re: [PATCH v2 1/2] iommu/iova: Retry from last rb tree node if iova search fails

2020-09-13 Thread Vijayanand Jitta



On 8/28/2020 1:01 PM, Vijayanand Jitta wrote:
> 
> 
> On 8/20/2020 6:19 PM, vji...@codeaurora.org wrote:
>> From: Vijayanand Jitta 
>>
>> Whenever a new iova alloc request comes in, the iova is always searched
>> from the cached node and the nodes before the cached
>> node. So, even if there is free iova space available in the nodes
>> after the cached node, iova allocation can still fail
>> because of this approach.
>>
>> Consider the following sequence of iova alloc and frees on
>> 1GB of iova space
>>
>> 1) alloc - 500MB
>> 2) alloc - 12MB
>> 3) alloc - 499MB
>> 4) free -  12MB which was allocated in step 2
>> 5) alloc - 13MB
>>
>> After the above sequence we will have 12MB of free iova space and
>> the cached node will be pointing to the iova pfn of the last alloc of 13MB,
>> which will be the lowest iova pfn of that iova space. Now if we get an
>> alloc request of 2MB we just search from the cached node and then look
>> at lower iova pfns for free iova, and as there aren't any, the iova alloc
>> fails even though there is 12MB of free iova space.
>>
>> To avoid such iova search failures, retry from the last rb tree node
>> when the iova search fails; this will search the entire tree and get an iova
>> if it's available.
>>
>> Signed-off-by: Vijayanand Jitta 
>> ---
>>  drivers/iommu/iova.c | 23 +--
>>  1 file changed, 17 insertions(+), 6 deletions(-)
>>
>> diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c
>> index 49fc01f..4e77116 100644
>> --- a/drivers/iommu/iova.c
>> +++ b/drivers/iommu/iova.c
>> @@ -184,8 +184,9 @@ static int __alloc_and_insert_iova_range(struct iova_domain *iovad,
>>  struct rb_node *curr, *prev;
>>  struct iova *curr_iova;
>>  unsigned long flags;
>> -unsigned long new_pfn;
>> +unsigned long new_pfn, low_pfn_new;
>>  unsigned long align_mask = ~0UL;
>> +unsigned long high_pfn = limit_pfn, low_pfn = iovad->start_pfn;
>>  
>>  if (size_aligned)
>>  align_mask <<= fls_long(size - 1);
>> @@ -198,15 +199,25 @@ static int __alloc_and_insert_iova_range(struct iova_domain *iovad,
>>  
>>  curr = __get_cached_rbnode(iovad, limit_pfn);
>>  curr_iova = rb_entry(curr, struct iova, node);
>> +low_pfn_new = curr_iova->pfn_hi + 1;
>> +
>> +retry:
>>  do {
>> -limit_pfn = min(limit_pfn, curr_iova->pfn_lo);
>> -new_pfn = (limit_pfn - size) & align_mask;
>> +high_pfn = min(high_pfn, curr_iova->pfn_lo);
>> +new_pfn = (high_pfn - size) & align_mask;
>>  prev = curr;
>>  curr = rb_prev(curr);
>>  curr_iova = rb_entry(curr, struct iova, node);
>> -} while (curr && new_pfn <= curr_iova->pfn_hi);
>> -
>> -if (limit_pfn < size || new_pfn < iovad->start_pfn) {
>> +} while (curr && new_pfn <= curr_iova->pfn_hi && new_pfn >= low_pfn);
>> +
>> +if (high_pfn < size || new_pfn < low_pfn) {
>> +if (low_pfn == iovad->start_pfn && low_pfn_new < limit_pfn) {
>> +high_pfn = limit_pfn;
>> +low_pfn = low_pfn_new;
>> +curr = &iovad->anchor.node;
>> +curr_iova = rb_entry(curr, struct iova, node);
>> +goto retry;
>> +}
>>  iovad->max32_alloc_size = size;
>>  goto iova32_full;
>>  }
>>
> 
> ping ?
> 
> Thanks,
> Vijay
> 

ping ?

Thanks,
Vijay
-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a
member of Code Aurora Forum, hosted by The Linux Foundation
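
The alloc/free sequence quoted in this thread can be replayed with a small userspace model. This is only a sketch under stated assumptions: it uses a sorted array instead of the rb tree, works in MB instead of pfns, ignores alignment and the 32-bit limit, and every toy_* name is hypothetical. The cached start point is modelled as "search only below the most recent allocation", and a free at or above it moves the start point up to the next allocated range.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_SPACE_MB   1024UL   /* 1GB of iova space, in MB units */
#define TOY_MAX_RANGES 16

struct toy_range { unsigned long lo, hi; };     /* allocated [lo, hi) */

struct toy_domain {
    struct toy_range r[TOY_MAX_RANGES];         /* sorted by ascending lo */
    int n;
    unsigned long cached_lo;                    /* search starts below this */
};

static void toy_init(struct toy_domain *d)
{
    d->n = 0;
    d->cached_lo = TOY_SPACE_MB;
}

/* Top-down search for a gap of 'size' that lies entirely below 'top'. */
static bool toy_find_below(const struct toy_domain *d, unsigned long top,
                           unsigned long size, unsigned long *lo)
{
    unsigned long high = top;

    for (int i = d->n - 1; i >= 0; i--) {
        if (d->r[i].lo >= top)
            continue;                           /* above the start point */
        if (high - d->r[i].hi >= size) {
            *lo = high - size;
            return true;
        }
        high = d->r[i].lo;
    }
    if (high >= size) {
        *lo = high - size;
        return true;
    }
    return false;
}

/* Allocate 'size' MB; with 'retry', fall back to a search from the top. */
static bool toy_alloc(struct toy_domain *d, unsigned long size, bool retry,
                      unsigned long *lo)
{
    int i;

    if (!toy_find_below(d, d->cached_lo, size, lo) &&
        !(retry && toy_find_below(d, TOY_SPACE_MB, size, lo)))
        return false;

    assert(d->n < TOY_MAX_RANGES);
    for (i = d->n; i > 0 && d->r[i - 1].lo > *lo; i--)
        d->r[i] = d->r[i - 1];                  /* keep the array sorted */
    d->r[i].lo = *lo;
    d->r[i].hi = *lo + size;
    d->n++;
    d->cached_lo = *lo;
    return true;
}

/* Free the range starting at 'lo', moving the cached start point up. */
static void toy_free(struct toy_domain *d, unsigned long lo)
{
    for (int i = 0; i < d->n; i++) {
        if (d->r[i].lo != lo)
            continue;
        if (lo >= d->cached_lo)
            d->cached_lo = (i + 1 < d->n) ? d->r[i + 1].lo : TOY_SPACE_MB;
        for (; i + 1 < d->n; i++)
            d->r[i] = d->r[i + 1];
        d->n--;
        return;
    }
}

/* Replays steps 1-5 from the changelog, then asks for 2MB. */
static bool toy_replay(bool retry, unsigned long *lo)
{
    struct toy_domain d;
    unsigned long second;

    toy_init(&d);
    toy_alloc(&d, 500, retry, lo);
    toy_alloc(&d, 12, retry, &second);
    toy_alloc(&d, 499, retry, lo);
    toy_free(&d, second);
    toy_alloc(&d, 13, retry, lo);
    return toy_alloc(&d, 2, retry, lo);
}
```

In this model toy_replay(false, &lo) fails even though a 12MB hole exists, while toy_replay(true, &lo) places the 2MB request inside that hole, which is the behaviour the patch is after.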


mmotm 2020-09-13-21-39 uploaded

2020-09-13 Thread akpm
The mm-of-the-moment snapshot 2020-09-13-21-39 has been uploaded to

   http://www.ozlabs.org/~akpm/mmotm/

mmotm-readme.txt says

README for mm-of-the-moment:

http://www.ozlabs.org/~akpm/mmotm/

This is a snapshot of my -mm patch queue.  Uploaded at random hopefully
more than once a week.

You will need quilt to apply these patches to the latest Linus release (5.x
or 5.x-rcY).  The series file is in broken-out.tar.gz and is duplicated in
http://ozlabs.org/~akpm/mmotm/series

The file broken-out.tar.gz contains two datestamp files: .DATE and
.DATE-yyyy-mm-dd-hh-mm-ss.  Both contain the string yyyy-mm-dd-hh-mm-ss,
followed by the base kernel version against which this patch series is to
be applied.

This tree is partially included in linux-next.  To see which patches are
included in linux-next, consult the `series' file.  Only the patches
within the #NEXT_PATCHES_START/#NEXT_PATCHES_END markers are included in
linux-next.


A full copy of the full kernel tree with the linux-next and mmotm patches
already applied is available through git within an hour of the mmotm
release.  Individual mmotm releases are tagged.  The master branch always
points to the latest release, so it's constantly rebasing.

https://github.com/hnaz/linux-mm

The directory http://www.ozlabs.org/~akpm/mmots/ (mm-of-the-second)
contains daily snapshots of the -mm tree.  It is updated more frequently
than mmotm, and is untested.

A git copy of this tree is also available at

https://github.com/hnaz/linux-mm



This mmotm tree contains the following patches against 5.9-rc5:
(patches marked "*" will be included in linux-next)

* mailmap-add-older-email-addresses-for-kees-cook.patch
* mm-gup_benchmark-update-the-documentation-in-kconfig.patch
* proc-kpageflags-prevent-an-integer-overflow-in-stable_page_flags.patch
* proc-kpageflags-do-not-use-uninitialized-struct-pages.patch
* mm-thp-swap-fix-allocating-cluster-for-swapfile-by-mistake.patch
* ksm-reinstate-memcg-charge-on-copied-pages.patch
* mm-migration-of-hugetlbfs-page-skip-memcg.patch
* shmem-shmem_writepage-split-unlikely-i915-thp.patch
* mm-fix-check_move_unevictable_pages-on-thp.patch
* mlock-fix-unevictable_pgs-event-counts-on-thp.patch
* tmpfs-restore-functionality-of-nr_inodes=0.patch
* kprobes-fix-kill-kprobe-which-has-been-marked-as-gone.patch
* mm-thp-fix-__split_huge_pmd_locked-for-migration-pmd.patch
* selftests-vm-fix-display-of-page-size-in-map_hugetlb.patch
* mm-memory_hotplug-drain-per-cpu-pages-again-during-memory-offline.patch
* ftrace-let-ftrace_enable_sysctl-take-a-kernel-pointer-buffer.patch
* stackleak-let-stack_erasing_sysctl-take-a-kernel-pointer-buffer.patch
* fs-adjust-dirtytime_interval_handler-definition-to-match-prototype.patch
* kcsan-kconfig-move-to-menu-generic-kernel-debugging-instruments.patch
* checkpatch-test-git_dir-changes.patch
* compiler-clang-add-build-check-for-clang-1001.patch
* revert-kbuild-disable-clangs-default-use-of-fmerge-all-constants.patch
* revert-arm64-bti-require-clang-=-1001-for-in-kernel-bti-support.patch
* revert-arm64-vdso-fix-compilation-with-clang-older-than-8.patch
* partially-revert-arm-8905-1-emit-__gnu_mcount_nc-when-using-clang-1000-or-newer.patch
* kasan-remove-mentions-of-unsupported-clang-versions.patch
* compiler-gcc-improve-version-error.patch
* ntfs-add-check-for-mft-record-size-in-superblock.patch
* fs-ocfs2-delete-repeated-words-in-comments.patch
* ocfs2-clear-links-count-in-ocfs2_mknod-if-an-error-occurs.patch
* ocfs2-fix-ocfs2-corrupt-when-iputting-an-inode.patch
* ramfs-support-o_tmpfile.patch
* fs-xattrc-fix-kernel-doc-warnings-for-setxattr-removexattr.patch
* kernel-watchdog-flush-all-printk-nmi-buffers-when-hardlockup-detected.patch
  mm.patch
* mm-slub-branch-optimization-in-free-slowpath.patch
* mm-slub-fix-missing-alloc_slowpath-stat-when-bulk-alloc.patch
* mm-slub-make-add_full-condition-more-explicit.patch
* mm-kmemleak-rely-on-rcu-for-task-stack-scanning.patch
* x86-numa-cleanup-configuration-dependent-command-line-options.patch
* x86-numa-add-nohmat-option.patch
* x86-numa-add-nohmat-option-fix.patch
* efi-fake_mem-arrange-for-a-resource-entry-per-efi_fake_mem-instance.patch
* acpi-hmat-refactor-hmat_register_target_device-to-hmem_register_device.patch
* acpi-hmat-refactor-hmat_register_target_device-to-hmem_register_device-fix.patch
* resource-report-parent-to-walk_iomem_res_desc-callback.patch
* mm-memory_hotplug-introduce-default-phys_to_target_node-implementation.patch
* mm-memory_hotplug-introduce-default-phys_to_target_node-implementation-fix.patch
* acpi-hmat-attach-a-device-for-each-soft-reserved-range.patch
* acpi-hmat-attach-a-device-for-each-soft-reserved-range-fix.patch
* device-dax-drop-the-dax_regionpfn_flags-attribute.patch
* device-dax-move-instance-creation-parameters-to-struct-dev_dax_data.patch
* device-dax-make-pgmap-optional-for-instance-creation.patch
* device-dax-make-pgmap-optional-for-instance-creation-fix.patch
* device-dax-kill-dax_kmem_res.patch
* 

Re: [PATCH] MIPS: Remove unused BOOT_MEM_INIT_RAM

2020-09-13 Thread Jiaxun Yang




On 2020/9/12 9:59, Youling Tang wrote:

Commit a94e4f24ec83 ("MIPS: init: Drop boot_mem_map") left
the BOOT_MEM_INIT_RAM unused, remove it.

Signed-off-by: Youling Tang 
---
  arch/mips/include/asm/bootinfo.h | 1 -
  1 file changed, 1 deletion(-)

diff --git a/arch/mips/include/asm/bootinfo.h b/arch/mips/include/asm/bootinfo.h
index 147c932..39196ae 100644
--- a/arch/mips/include/asm/bootinfo.h
+++ b/arch/mips/include/asm/bootinfo.h
@@ -91,7 +91,6 @@ extern unsigned long mips_machtype;
  #define BOOT_MEM_RAM  1
  #define BOOT_MEM_ROM_DATA 2
  #define BOOT_MEM_RESERVED 3
-#define BOOT_MEM_INIT_RAM  4


If you're going to remove that, you'd better turn the memtype definitions
into an enum.

Btw: It seems you've done a lot of minor clean-up work recently.
If you'd like, I think you could try converting all the platforms to memblock
and removing all the glue between memblock and legacy code.

Thanks.

- Jiaxun


  #define BOOT_MEM_NOMAP5
  
  extern void add_memory_region(phys_addr_t start, phys_addr_t size, long type);


Re: [RFC PATCH v2] PCI/portdrv: Only disable Bus Master on kexec reboot and connected PCI devices

2020-09-13 Thread Huacai Chen
Hi, Tiezhu

> -- Original --
> From:  "Tiezhu Yang";
> Date:  Mon, Sep 14, 2020 04:29 AM
> To:  "Bjorn Helgaas";
> Cc:  "Konstantin Khlebnikov"; "Khalid 
> Aziz"; "Vivek Goyal"; "Lukas 
> Wunner"; "oohall"; 
> "rafael.j.wysocki"; "Xuefeng 
> Li"; "Huacai Chen"; "Jiaxun 
> Yang"; "linux-pci"; 
> "linux-kernel";
> Subject:  [RFC PATCH v2] PCI/portdrv: Only disable Bus Master on kexec reboot 
> and connected PCI devices
>
>
>
> After commit 745be2e700cd ("PCIe: portdrv: call pci_disable_device
> during remove") and commit cc27b735ad3a ("PCI/portdrv: Turn off PCIe
> services during shutdown"), it also calls pci_disable_device() during
> shutdown, this leads to shutdown or reboot failure occasionally due to
> clear PCI_COMMAND_MASTER on the device in do_pci_disable_device().
>
> drivers/pci/pci.c
> static void do_pci_disable_device(struct pci_dev *dev)
> {
> u16 pci_command;
>
> pci_read_config_word(dev, PCI_COMMAND, &pci_command);
> if (pci_command & PCI_COMMAND_MASTER) {
> pci_command &= ~PCI_COMMAND_MASTER;
> pci_write_config_word(dev, PCI_COMMAND, pci_command);
> }
>
> pcibios_disable_device(dev);
> }
>
> When "pci_command &= ~PCI_COMMAND_MASTER;" is removed, it works well.
>
> As Oliver O'Halloran said, no need to call pci_disable_device() when
> actually shutting down, but we should call pci_disable_device() before
> handing over to the new kernel on kexec reboot, so we can do some
> condition checks which are similar with pci_device_shutdown(), this is
> done by commit 4fc9bbf98fd6 ("PCI: Disable Bus Master only on kexec
> reboot") and commit 6e0eda3c3898 ("PCI: Don't try to disable Bus Master
> on disconnected PCI devices").
>
> drivers/pci/pci-driver.c
> static void pci_device_shutdown(struct device *dev)
> {
> ...
>
> /*
>  * If this is a kexec reboot, turn off Bus Master bit on the
>  * device to tell it to not continue to do DMA. Don't touch
>  * devices in D3cold or unknown states.
>  * If it is not a kexec reboot, firmware will hit the PCI
>  * devices with big hammer and stop their DMA any way.
>  */
> if (kexec_in_progress && (pci_dev->current_state <= PCI_D3hot))
> pci_clear_master(pci_dev);
> }
Have you really tried kexec? Why do you think kexec can disable pci
device successfully while normal reboot/poweroff cannot?

Huacai
>
> Signed-off-by: Tiezhu Yang 
> ---
>  drivers/pci/pcie/portdrv_core.c |  1 -
>  drivers/pci/pcie/portdrv_pci.c  | 25 -
>  2 files changed, 24 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/pci/pcie/portdrv_core.c b/drivers/pci/pcie/portdrv_core.c
> index 50a9522..1991aca 100644
> --- a/drivers/pci/pcie/portdrv_core.c
> +++ b/drivers/pci/pcie/portdrv_core.c
> @@ -491,7 +491,6 @@ void pcie_port_device_remove(struct pci_dev *dev)
>  {
> device_for_each_child(&dev->dev, NULL, remove_iter);
> pci_free_irq_vectors(dev);
> -   pci_disable_device(dev);
>  }
>
>  /**
> diff --git a/drivers/pci/pcie/portdrv_pci.c b/drivers/pci/pcie/portdrv_pci.c
> index 3a3ce40..ce89a9e8 100644
> --- a/drivers/pci/pcie/portdrv_pci.c
> +++ b/drivers/pci/pcie/portdrv_pci.c
> @@ -15,6 +15,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>
>  #include "../pci.h"
>  #include "portdrv.h"
> @@ -143,6 +144,28 @@ static void pcie_portdrv_remove(struct pci_dev *dev)
> }
>
> pcie_port_device_remove(dev);
> +   pci_disable_device(dev);
> +}
> +
> +static void pcie_portdrv_shutdown(struct pci_dev *dev)
> +{
> +   if (pci_bridge_d3_possible(dev)) {
> +   pm_runtime_forbid(&dev->dev);
> +   pm_runtime_get_noresume(&dev->dev);
> +   pm_runtime_dont_use_autosuspend(&dev->dev);
> +   }
> +
> +   pcie_port_device_remove(dev);
> +
> +   /*
> +* If this is a kexec reboot, turn off Bus Master bit on the
> +* device to tell it to not continue to do DMA. Don't touch
> +* devices in D3cold or unknown states.
> +* If it is not a kexec reboot, firmware will hit the PCI
> +* devices with big hammer and stop their DMA any way.
> +*/
> +   if (kexec_in_progress && (dev->current_state <= PCI_D3hot))
> +   pci_disable_device(dev);
>  }
>
>  static pci_ers_result_t pcie_portdrv_error_detected(struct pci_dev *dev,
> @@ -211,7 +234,7 @@ static void pcie_portdrv_err_resume(struct pci_dev *dev)
>
> .probe  = pcie_portdrv_probe,
> .remove = pcie_portdrv_remove,
> -   .shutdown   = pcie_portdrv_remove,
> +   .shutdown   = pcie_portdrv_shutdown,
>
> .err_handler= &pcie_portdrv_err_handler,
>
> --
> 1.8.3.1


Re: [PATCH] trace: Fix race in trace_open and buffer resize call

2020-09-13 Thread Gaurav Kohli

Hi Steven,

Please let us know if the change below looks good,
or suggest another way to solve this.

Thanks,
Gaurav



On 9/4/2020 11:39 AM, Gaurav Kohli wrote:

The race below can occur if trace_open and a resize of the
cpu buffer run in parallel on different cpus:
CPUX                                CPUY
                                    ring_buffer_resize
                                    atomic_read(&buffer->resize_disabled)
tracing_open
tracing_reset_online_cpus
ring_buffer_reset_cpu
rb_reset_cpu
                                    rb_update_pages
                                    remove/insert pages
resetting pointer
This race can cause a data abort or sometimes an infinite loop in
rb_remove_pages and rb_insert_pages while checking pages
for sanity.
Take ring buffer lock in trace_open to avoid resetting of cpu buffer.

Signed-off-by: Gaurav Kohli 

diff --git a/include/linux/ring_buffer.h b/include/linux/ring_buffer.h
index 136ea09..55f9115 100644
--- a/include/linux/ring_buffer.h
+++ b/include/linux/ring_buffer.h
@@ -163,6 +163,8 @@ bool ring_buffer_empty_cpu(struct trace_buffer *buffer, int cpu);
  
  void ring_buffer_record_disable(struct trace_buffer *buffer);

  void ring_buffer_record_enable(struct trace_buffer *buffer);
+void ring_buffer_mutex_acquire(struct trace_buffer *buffer);
+void ring_buffer_mutex_release(struct trace_buffer *buffer);
  void ring_buffer_record_off(struct trace_buffer *buffer);
  void ring_buffer_record_on(struct trace_buffer *buffer);
  bool ring_buffer_record_is_on(struct trace_buffer *buffer);
diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
index 93ef0ab..638ec8f 100644
--- a/kernel/trace/ring_buffer.c
+++ b/kernel/trace/ring_buffer.c
@@ -3632,6 +3632,25 @@ void ring_buffer_record_enable(struct trace_buffer *buffer)
  EXPORT_SYMBOL_GPL(ring_buffer_record_enable);
  
  /**

+ * ring_buffer_mutex_acquire - prevent resetting of buffer
+ * during resize
+ */
+void ring_buffer_mutex_acquire(struct trace_buffer *buffer)
+{
+   mutex_lock(&buffer->mutex);
+}
+EXPORT_SYMBOL_GPL(ring_buffer_mutex_acquire);
+
+/**
+ * ring_buffer_mutex_release - prevent resetting of buffer
+ * during resize
+ */
+void ring_buffer_mutex_release(struct trace_buffer *buffer)
+{
+   mutex_unlock(&buffer->mutex);
+}
+EXPORT_SYMBOL_GPL(ring_buffer_mutex_release);
+/**
   * ring_buffer_record_off - stop all writes into the buffer
   * @buffer: The ring buffer to stop writes to.
   *
@@ -4918,6 +4937,8 @@ void ring_buffer_reset(struct trace_buffer *buffer)
struct ring_buffer_per_cpu *cpu_buffer;
int cpu;
  
+	/* prevent another thread from changing buffer sizes */

+   mutex_lock(&buffer->mutex);
for_each_buffer_cpu(buffer, cpu) {
cpu_buffer = buffer->buffers[cpu];
  
@@ -4936,6 +4957,7 @@ void ring_buffer_reset(struct trace_buffer *buffer)

atomic_dec(&cpu_buffer->record_disabled);
atomic_dec(&cpu_buffer->resize_disabled);
}
+   mutex_unlock(&buffer->mutex);
  }
  EXPORT_SYMBOL_GPL(ring_buffer_reset);
  
diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c

index f40d850..392e9aa 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -2006,6 +2006,8 @@ void tracing_reset_online_cpus(struct array_buffer *buf)
if (!buffer)
return;
  
+	ring_buffer_mutex_acquire(buffer);

+
ring_buffer_record_disable(buffer);
  
  	/* Make sure all commits have finished */

@@ -2016,6 +2018,8 @@ void tracing_reset_online_cpus(struct array_buffer *buf)
ring_buffer_reset_online_cpus(buffer);
  
  	ring_buffer_record_enable(buffer);

+
+   ring_buffer_mutex_release(buffer);
  }
  
  /* Must have trace_types_lock held */




--
Qualcomm India Private Limited, on behalf of Qualcomm Innovation Center,
Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project.


linux-next: build warning after merge of the dmaengine tree

2020-09-13 Thread Stephen Rothwell
Hi all,

After merging the dmaengine tree, today's linux-next build (x86_64
allmodconfig) produced this warning:

drivers/dma/sf-pdma/sf-pdma.c: In function 'sf_pdma_donebh_tasklet':
drivers/dma/sf-pdma/sf-pdma.c:287:23: warning: unused variable 'desc' [-Wunused-variable]
  287 |  struct sf_pdma_desc *desc = chan->desc;
  |   ^~~~

Introduced by commit

  8f6b6d060602 ("dmaengine: sf-pdma: Fix an error that calls callback twice")

-- 
Cheers,
Stephen Rothwell




[PATCH 3/3] crypto: inside-secure - Reuse code in safexcel_hmac_alg_setkey

2020-09-13 Thread Herbert Xu
The code in the current implementation of safexcel_hmac_alg_setkey
can be reused by safexcel_cipher.  This patch does just that by
renaming the previous safexcel_hmac_setkey to __safexcel_hmac_setkey.
The now-shared safexcel_hmac_alg_setkey becomes safexcel_hmac_setkey
and a new safexcel_hmac_alg_setkey has been added for use by ahash
transforms.

As a result safexcel_aead_setkey's stack frame has been reduced by
about half in size, or about 512 bytes.

Reported-by: kernel test robot 
Signed-off-by: Herbert Xu 
---

 drivers/crypto/inside-secure/safexcel.h|5 ++-
 drivers/crypto/inside-secure/safexcel_cipher.c |   36 ++--
 drivers/crypto/inside-secure/safexcel_hash.c   |   37 +++--
 3 files changed, 36 insertions(+), 42 deletions(-)

diff --git a/drivers/crypto/inside-secure/safexcel.h b/drivers/crypto/inside-secure/safexcel.h
index dd095f6622ace..7bbb6fcec3739 100644
--- a/drivers/crypto/inside-secure/safexcel.h
+++ b/drivers/crypto/inside-secure/safexcel.h
@@ -908,8 +908,9 @@ void safexcel_rdr_req_set(struct safexcel_crypto_priv *priv,
 inline struct crypto_async_request *
 safexcel_rdr_req_get(struct safexcel_crypto_priv *priv, int ring);
 void safexcel_inv_complete(struct crypto_async_request *req, int error);
-int safexcel_hmac_setkey(const char *alg, const u8 *key, unsigned int keylen,
-void *istate, void *ostate);
+int safexcel_hmac_setkey(struct safexcel_context *base, const u8 *key,
+unsigned int keylen, const char *alg,
+unsigned int state_sz);
 
 /* available algorithms */
 extern struct safexcel_alg_template safexcel_alg_ecb_des;
diff --git a/drivers/crypto/inside-secure/safexcel_cipher.c b/drivers/crypto/inside-secure/safexcel_cipher.c
index d0cfdbb144a3a..9bcfb79a030f1 100644
--- a/drivers/crypto/inside-secure/safexcel_cipher.c
+++ b/drivers/crypto/inside-secure/safexcel_cipher.c
@@ -404,11 +404,11 @@ static int safexcel_aead_setkey(struct crypto_aead *ctfm, const u8 *key,
 {
struct crypto_tfm *tfm = crypto_aead_tfm(ctfm);
struct safexcel_cipher_ctx *ctx = crypto_tfm_ctx(tfm);
-   struct safexcel_ahash_export_state istate, ostate;
struct safexcel_crypto_priv *priv = ctx->base.priv;
struct crypto_authenc_keys keys;
struct crypto_aes_ctx aes;
int err = -EINVAL, i;
+   const char *alg;
 
if (unlikely(crypto_authenc_extractkeys(&keys, key, len)))
goto badkey;
@@ -463,53 +463,37 @@ static int safexcel_aead_setkey(struct crypto_aead *ctfm, 
const u8 *key,
/* Auth key */
switch (ctx->hash_alg) {
case CONTEXT_CONTROL_CRYPTO_ALG_SHA1:
-   if (safexcel_hmac_setkey("safexcel-sha1", keys.authkey,
-keys.authkeylen, &istate, &ostate))
-   goto badkey;
+   alg = "safexcel-sha1";
break;
case CONTEXT_CONTROL_CRYPTO_ALG_SHA224:
-   if (safexcel_hmac_setkey("safexcel-sha224", keys.authkey,
-keys.authkeylen, &istate, &ostate))
-   goto badkey;
+   alg = "safexcel-sha224";
break;
case CONTEXT_CONTROL_CRYPTO_ALG_SHA256:
-   if (safexcel_hmac_setkey("safexcel-sha256", keys.authkey,
-keys.authkeylen, &istate, &ostate))
-   goto badkey;
+   alg = "safexcel-sha256";
break;
case CONTEXT_CONTROL_CRYPTO_ALG_SHA384:
-   if (safexcel_hmac_setkey("safexcel-sha384", keys.authkey,
-keys.authkeylen, &istate, &ostate))
-   goto badkey;
+   alg = "safexcel-sha384";
break;
case CONTEXT_CONTROL_CRYPTO_ALG_SHA512:
-   if (safexcel_hmac_setkey("safexcel-sha512", keys.authkey,
-keys.authkeylen, &istate, &ostate))
-   goto badkey;
+   alg = "safexcel-sha512";
break;
case CONTEXT_CONTROL_CRYPTO_ALG_SM3:
-   if (safexcel_hmac_setkey("safexcel-sm3", keys.authkey,
-keys.authkeylen, &istate, &ostate))
-   goto badkey;
+   alg = "safexcel-sm3";
break;
default:
dev_err(priv->dev, "aead: unsupported hash algorithm\n");
goto badkey;
}
 
-   if (priv->flags & EIP197_TRC_CACHE && ctx->base.ctxr_dma &&
-   (memcmp(&ctx->base.ipad, istate.state, ctx->state_sz) ||
-memcmp(&ctx->base.opad, ostate.state, ctx->state_sz)))
-   ctx->base.needs_inv = true;
+   if (safexcel_hmac_setkey(&ctx->base, keys.authkey, keys.authkeylen,
+alg, ctx->state_sz))
+   goto badkey;
 
/* Now copy the keys into the context */
for (i = 0; i < keys.enckeylen / sizeof(u32); i++)

[PATCH 1/3] crypto: inside-secure - Move priv pointer into safexcel_context

2020-09-13 Thread Herbert Xu
This patch moves the priv pointer into struct safexcel_context
because both structs that extend safexcel_context have that pointer
as well.

Signed-off-by: Herbert Xu 
---

 drivers/crypto/inside-secure/safexcel.h|1 
 drivers/crypto/inside-secure/safexcel_cipher.c |   42 -
 drivers/crypto/inside-secure/safexcel_hash.c   |   17 --
 3 files changed, 30 insertions(+), 30 deletions(-)

diff --git a/drivers/crypto/inside-secure/safexcel.h 
b/drivers/crypto/inside-secure/safexcel.h
index 7c5fe382d2720..77eb285b335f4 100644
--- a/drivers/crypto/inside-secure/safexcel.h
+++ b/drivers/crypto/inside-secure/safexcel.h
@@ -819,6 +819,7 @@ struct safexcel_context {
 struct crypto_async_request *req, bool *complete,
 int *ret);
struct safexcel_context_record *ctxr;
+   struct safexcel_crypto_priv *priv;
dma_addr_t ctxr_dma;
 
int ring;
diff --git a/drivers/crypto/inside-secure/safexcel_cipher.c 
b/drivers/crypto/inside-secure/safexcel_cipher.c
index 1ac3253b7903a..052df0da02f47 100644
--- a/drivers/crypto/inside-secure/safexcel_cipher.c
+++ b/drivers/crypto/inside-secure/safexcel_cipher.c
@@ -375,7 +375,7 @@ static int safexcel_skcipher_aes_setkey(struct 
crypto_skcipher *ctfm,
 {
struct crypto_tfm *tfm = crypto_skcipher_tfm(ctfm);
struct safexcel_cipher_ctx *ctx = crypto_tfm_ctx(tfm);
-   struct safexcel_crypto_priv *priv = ctx->priv;
+   struct safexcel_crypto_priv *priv = ctx->base.priv;
struct crypto_aes_ctx aes;
int ret, i;
 
@@ -407,7 +407,7 @@ static int safexcel_aead_setkey(struct crypto_aead *ctfm, 
const u8 *key,
struct crypto_tfm *tfm = crypto_aead_tfm(ctfm);
struct safexcel_cipher_ctx *ctx = crypto_tfm_ctx(tfm);
struct safexcel_ahash_export_state istate, ostate;
-   struct safexcel_crypto_priv *priv = ctx->priv;
+   struct safexcel_crypto_priv *priv = ctx->base.priv;
struct crypto_authenc_keys keys;
struct crypto_aes_ctx aes;
int err = -EINVAL, i;
@@ -525,7 +525,7 @@ static int safexcel_context_control(struct 
safexcel_cipher_ctx *ctx,
struct safexcel_cipher_req *sreq,
struct safexcel_command_desc *cdesc)
 {
-   struct safexcel_crypto_priv *priv = ctx->priv;
+   struct safexcel_crypto_priv *priv = ctx->base.priv;
int ctrl_size = ctx->key_len / sizeof(u32);
 
cdesc->control_data.control1 = ctx->mode;
@@ -692,7 +692,7 @@ static int safexcel_send_req(struct crypto_async_request 
*base, int ring,
struct skcipher_request *areq = skcipher_request_cast(base);
struct crypto_skcipher *skcipher = crypto_skcipher_reqtfm(areq);
struct safexcel_cipher_ctx *ctx = crypto_tfm_ctx(base->tfm);
-   struct safexcel_crypto_priv *priv = ctx->priv;
+   struct safexcel_crypto_priv *priv = ctx->base.priv;
struct safexcel_command_desc *cdesc;
struct safexcel_command_desc *first_cdesc = NULL;
struct safexcel_result_desc *rdesc, *first_rdesc = NULL;
@@ -1020,7 +1020,7 @@ static int safexcel_cipher_send_inv(struct 
crypto_async_request *base,
int ring, int *commands, int *results)
 {
struct safexcel_cipher_ctx *ctx = crypto_tfm_ctx(base->tfm);
-   struct safexcel_crypto_priv *priv = ctx->priv;
+   struct safexcel_crypto_priv *priv = ctx->base.priv;
int ret;
 
ret = safexcel_invalidate_cache(base, priv, ctx->base.ctxr_dma, ring);
@@ -1039,7 +1039,7 @@ static int safexcel_skcipher_send(struct 
crypto_async_request *async, int ring,
struct skcipher_request *req = skcipher_request_cast(async);
struct safexcel_cipher_ctx *ctx = crypto_tfm_ctx(req->base.tfm);
struct safexcel_cipher_req *sreq = skcipher_request_ctx(req);
-   struct safexcel_crypto_priv *priv = ctx->priv;
+   struct safexcel_crypto_priv *priv = ctx->base.priv;
int ret;
 
BUG_ON(!(priv->flags & EIP197_TRC_CACHE) && sreq->needs_inv);
@@ -1072,7 +1072,7 @@ static int safexcel_aead_send(struct crypto_async_request 
*async, int ring,
struct crypto_aead *tfm = crypto_aead_reqtfm(req);
struct safexcel_cipher_ctx *ctx = crypto_tfm_ctx(req->base.tfm);
struct safexcel_cipher_req *sreq = aead_request_ctx(req);
-   struct safexcel_crypto_priv *priv = ctx->priv;
+   struct safexcel_crypto_priv *priv = ctx->base.priv;
int ret;
 
BUG_ON(!(priv->flags & EIP197_TRC_CACHE) && sreq->needs_inv);
@@ -1094,7 +1094,7 @@ static int safexcel_cipher_exit_inv(struct crypto_tfm 
*tfm,
struct safexcel_inv_result *result)
 {
struct safexcel_cipher_ctx *ctx = crypto_tfm_ctx(tfm);
-   struct safexcel_crypto_priv *priv = ctx->priv;
+   struct safexcel_crypto_priv *priv = ctx->base.priv;
int ring = ctx->base.ring;
 
  

[PATCH 2/3] crypto: inside-secure - Move ipad/opad into safexcel_context

2020-09-13 Thread Herbert Xu
As both safexcel_ahash_ctx and safexcel_cipher_ctx contain ipad
and opad buffers this patch moves them into the common struct
safexcel_context.  It also adds a union so that they can be accessed
in the appropriate endian without crazy casts.

Signed-off-by: Herbert Xu 
---

 drivers/crypto/inside-secure/safexcel.h|9 ++
 drivers/crypto/inside-secure/safexcel_cipher.c |   20 ++--
 drivers/crypto/inside-secure/safexcel_hash.c   |  106 -
 3 files changed, 72 insertions(+), 63 deletions(-)

diff --git a/drivers/crypto/inside-secure/safexcel.h 
b/drivers/crypto/inside-secure/safexcel.h
index 77eb285b335f4..dd095f6622ace 100644
--- a/drivers/crypto/inside-secure/safexcel.h
+++ b/drivers/crypto/inside-secure/safexcel.h
@@ -12,7 +12,9 @@
 #include 
 #include 
 #include 
+#include 
 #include 
+#include 
 
 #define EIP197_HIA_VERSION_BE  0xca35
 #define EIP197_HIA_VERSION_LE  0x35ca
@@ -822,6 +824,13 @@ struct safexcel_context {
struct safexcel_crypto_priv *priv;
dma_addr_t ctxr_dma;
 
+   union {
+   __le32 le[SHA3_512_BLOCK_SIZE / 4];
+   __be32 be[SHA3_512_BLOCK_SIZE / 4];
+   u32 word[SHA3_512_BLOCK_SIZE / 4];
+   u8 byte[SHA3_512_BLOCK_SIZE];
+   } ipad, opad;
+
int ring;
bool needs_inv;
bool exit_inv;
diff --git a/drivers/crypto/inside-secure/safexcel_cipher.c 
b/drivers/crypto/inside-secure/safexcel_cipher.c
index 052df0da02f47..d0cfdbb144a3a 100644
--- a/drivers/crypto/inside-secure/safexcel_cipher.c
+++ b/drivers/crypto/inside-secure/safexcel_cipher.c
@@ -61,8 +61,6 @@ struct safexcel_cipher_ctx {
/* All the below is AEAD specific */
u32 hash_alg;
u32 state_sz;
-   __be32 ipad[SHA512_DIGEST_SIZE / sizeof(u32)];
-   __be32 opad[SHA512_DIGEST_SIZE / sizeof(u32)];
 
struct crypto_cipher *hkaes;
struct crypto_aead *fback;
@@ -500,8 +498,8 @@ static int safexcel_aead_setkey(struct crypto_aead *ctfm, 
const u8 *key,
}
 
if (priv->flags & EIP197_TRC_CACHE && ctx->base.ctxr_dma &&
-   (memcmp(ctx->ipad, istate.state, ctx->state_sz) ||
-memcmp(ctx->opad, ostate.state, ctx->state_sz)))
+   (memcmp(&ctx->base.ipad, istate.state, ctx->state_sz) ||
+memcmp(&ctx->base.opad, ostate.state, ctx->state_sz)))
ctx->base.needs_inv = true;
 
/* Now copy the keys into the context */
@@ -509,8 +507,8 @@ static int safexcel_aead_setkey(struct crypto_aead *ctfm, 
const u8 *key,
ctx->key[i] = cpu_to_le32(((u32 *)keys.enckey)[i]);
ctx->key_len = keys.enckeylen;
 
-   memcpy(ctx->ipad, &istate.state, ctx->state_sz);
-   memcpy(ctx->opad, &ostate.state, ctx->state_sz);
+   memcpy(&ctx->base.ipad, &istate.state, ctx->state_sz);
+   memcpy(&ctx->base.opad, &ostate.state, ctx->state_sz);
 
memzero_explicit(&keys, sizeof(keys));
return 0;
@@ -718,10 +716,10 @@ static int safexcel_send_req(struct crypto_async_request 
*base, int ring,
totlen_dst += digestsize;
 
memcpy(ctx->base.ctxr->data + ctx->key_len / sizeof(u32),
-  ctx->ipad, ctx->state_sz);
+  &ctx->base.ipad, ctx->state_sz);
if (!ctx->xcm)
memcpy(ctx->base.ctxr->data + (ctx->key_len +
-  ctx->state_sz) / sizeof(u32), ctx->opad,
+  ctx->state_sz) / sizeof(u32), &ctx->base.opad,
   ctx->state_sz);
} else if ((ctx->mode == CONTEXT_CONTROL_CRYPTO_MODE_CBC) &&
   (sreq->direction == SAFEXCEL_DECRYPT)) {
@@ -2618,7 +2616,7 @@ static int safexcel_aead_gcm_setkey(struct crypto_aead 
*ctfm, const u8 *key,
 
if (priv->flags & EIP197_TRC_CACHE && ctx->base.ctxr_dma) {
for (i = 0; i < AES_BLOCK_SIZE / sizeof(u32); i++) {
-   if (be32_to_cpu(ctx->ipad[i]) != hashkey[i]) {
+   if (be32_to_cpu(ctx->base.ipad.be[i]) != hashkey[i]) {
ctx->base.needs_inv = true;
break;
}
@@ -2626,7 +2624,7 @@ static int safexcel_aead_gcm_setkey(struct crypto_aead 
*ctfm, const u8 *key,
}
 
for (i = 0; i < AES_BLOCK_SIZE / sizeof(u32); i++)
-   ctx->ipad[i] = cpu_to_be32(hashkey[i]);
+   ctx->base.ipad.be[i] = cpu_to_be32(hashkey[i]);
 
memzero_explicit(hashkey, AES_BLOCK_SIZE);
memzero_explicit(&aes, sizeof(aes));
@@ -2714,7 +2712,7 @@ static int safexcel_aead_ccm_setkey(struct crypto_aead 
*ctfm, const u8 *key,
 
for (i = 0; i < len / sizeof(u32); i++) {
ctx->key[i] = cpu_to_le32(aes.key_enc[i]);
-   ctx->ipad[i + 2 * AES_BLOCK_SIZE / sizeof(u32)] =
+   ctx->base.ipad.be[i + 2 * AES_BLOCK_SIZE / sizeof(u32)] =
cpu_to_be32(aes.key_enc[i]);
}
 
diff --git 
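
The union introduced in patch 2/3 above lets the same ipad/opad storage be read as endian-annotated words or as raw bytes without casts. A minimal user-space sketch of that idea follows. Every name in it (demo_pad, demo_cpu_to_be32, demo_pad_byte, DEMO_PAD_SIZE) is hypothetical and only mimics the kernel types; it is not the driver's code.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Illustrative size only; the patch sizes the buffers with SHA3_512_BLOCK_SIZE. */
#define DEMO_PAD_SIZE 72

/* User-space stand-in for the ipad/opad union (hypothetical name). */
union demo_pad {
	uint32_t word[DEMO_PAD_SIZE / 4];	/* word-sized view */
	uint8_t byte[DEMO_PAD_SIZE];		/* byte-sized view of the same storage */
};

/* Return a 32-bit value whose in-memory bytes hold v in big-endian order. */
static uint32_t demo_cpu_to_be32(uint32_t v)
{
	uint8_t b[4] = {
		(uint8_t)(v >> 24), (uint8_t)(v >> 16),
		(uint8_t)(v >> 8), (uint8_t)v,
	};
	uint32_t out;

	memcpy(&out, b, sizeof(out));
	return out;
}

/* Store one big-endian word through the word view, read a byte back. */
static uint8_t demo_pad_byte(uint32_t value, unsigned int idx)
{
	union demo_pad pad;

	assert(idx < sizeof(pad.byte));
	memset(&pad, 0, sizeof(pad));
	pad.word[0] = demo_cpu_to_be32(value);
	return pad.byte[idx];
}
```

Calling demo_pad_byte(0x01020304, 0) yields the most significant byte on any host, because the store went through the big-endian conversion and the read used the byte view of the same memory.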

[PATCH 0/3] crypto: inside-secure - Silence stack frame size warning in safexcel_aead_setkey

2020-09-13 Thread Herbert Xu
On Sun, Sep 13, 2020 at 06:42:09PM +0800, kernel test robot wrote:
> Hi Pascal,
> 
> FYI, the error/warning still remains.
> 
> tree:   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 
> master
> head:   ef2e9a563b0cd7965e2a1263125dcbb1c86aa6cc
> commit: bb7679b840cc7cf23868e05c5ef7a044e7fafd97 crypto: inside-secure - 
> Added support for authenc HMAC-SHA1/DES-CBC
> date:   12 months ago
> config: arm-randconfig-r005-20200913 (attached as .config)
> compiler: arm-linux-gnueabi-gcc (GCC) 9.3.0
> reproduce (this is a W=1 build):
> wget 
> https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O 
> ~/bin/make.cross
> chmod +x ~/bin/make.cross
> git checkout bb7679b840cc7cf23868e05c5ef7a044e7fafd97
> # save the attached .config to linux build tree
> COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross 
> ARCH=arm 
> 
> If you fix the issue, kindly add following tag as appropriate
> Reported-by: kernel test robot 
> 
> All warnings (new ones prefixed by >>):
> 
>drivers/crypto/inside-secure/safexcel_cipher.c: In function 
> 'safexcel_aead_setkey':
> >> drivers/crypto/inside-secure/safexcel_cipher.c:457:1: warning: the frame 
> >> size of 1064 bytes is larger than 1024 bytes [-Wframe-larger-than=]
>  457 | }
>  | ^

This is primarily caused by the istate/ostate variables on the
stack.  This patch series removes the warning by reusing the
ahash setkey code for aead.  Note that we've simply moved the
istate/ostate into the ahash code and the overall stack usage
is actually unchanged.

Cheers,
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


[PATCH v2] documentation: arm: sunxi: Allwinner H2+/H3 update

2020-09-13 Thread Wilken Gottwalt
Updated information about H2+ and H3 differences and added a link to a
slightly newer datasheet.

Signed-off-by: Wilken Gottwalt 
---
Changes in v2:
- addressed comments/proposals from Maxime
---
 Documentation/arm/sunxi.rst | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/Documentation/arm/sunxi.rst b/Documentation/arm/sunxi.rst
index b037428aee98..19d78eea31a9 100644
--- a/Documentation/arm/sunxi.rst
+++ b/Documentation/arm/sunxi.rst
@@ -103,12 +103,14 @@ SunXi family
 
 * No document available now, but is known to be working properly with
   H3 drivers and memory map.
+* The changes compared to a H3 are a downgraded GMAC to a 100MBit MAC
+  and the display engine (DE) not having support for 4k.
 
   - Allwinner H3 (sun8i)
 
 * Datasheet
 
-  http://dl.linux-sunxi.org/H3/Allwinner_H3_Datasheet_V1.0.pdf
+  https://linux-sunxi.org/images/4/4b/Allwinner_H3_Datasheet_V1.2.pdf
 
   - Allwinner R40 (sun8i)
 
-- 
2.28.0



Re: [PATCH] mm: memcg: yield cpu when we fail to charge pages

2020-09-13 Thread Xunlei Pang
On 2020/9/9 AM2:50, Julius Hemanth Pitti wrote:
> For non root CG, in try_charge(), we keep trying
> to charge until we succeed. On non-preemptive
> kernel, when we are OOM, this results in holding
> CPU forever.
> 
> On SMP systems, this doesn't create a big problem
> because oom_reaper gets a chance to kill the victim
> and free some pages. However, on a single-core
> CPU (or in cases where oom_reaper is pinned to the same
> CPU where try_charge is executing), oom_reaper will
> never get scheduled and we stay in try_charge forever.
> 
> Steps to reproduce this on non-SMP:
> 1. mount -t tmpfs none /sys/fs/cgroup
> 2. mkdir /sys/fs/cgroup/memory
> 3. mount -t cgroup none /sys/fs/cgroup/memory -o memory
> 4. mkdir /sys/fs/cgroup/memory/0
> 5. echo 40M > /sys/fs/cgroup/memory/0/memory.limit_in_bytes
> 6. echo $$ > /sys/fs/cgroup/memory/0/tasks
> 7. stress -m 5 --vm-bytes 10M --vm-hang 0
> 
> Signed-off-by: Julius Hemanth Pitti 
> ---
>  mm/memcontrol.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 0d6f3ea86738..4620d70267cb 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -2652,6 +2652,8 @@ static int try_charge(struct mem_cgroup *memcg, gfp_t 
> gfp_mask,
>   if (fatal_signal_pending(current))
>   goto force;
>  
> + cond_resched();
> +
>   /*
>* keep retrying as long as the memcg oom killer is able to make
>* a forward progress or bypass the charge if the oom killer
> 

This should be fixed by:
https://lkml.org/lkml/2020/8/26/1440

Thanks,
Xunlei


Re: [PATCH] brcmfmac: initialize variable

2020-09-13 Thread Nathan Chancellor
On Sun, Sep 13, 2020 at 07:35:22AM -0700, t...@redhat.com wrote:
> From: Tom Rix 
> 
> clang static analysis flags this problem
> sdio.c:3265:13: warning: Branch condition evaluates to
>   a garbage value
> } else if (pending) {
>^~~
> 
> brcmf_sdio_dcmd_resp_wait() only sets pending to true.
> So pending needs to be initialized to false.
> 
> Fixes: 5b435de0d786 ("net: wireless: add brcm80211 drivers")
> Signed-off-by: Tom Rix 

Reviewed-by: Nathan Chancellor 

> ---
>  drivers/net/wireless/broadcom/brcm80211/brcmfmac/sdio.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/sdio.c 
> b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/sdio.c
> index d4989e0cd7be..403b123710ec 100644
> --- a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/sdio.c
> +++ b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/sdio.c
> @@ -3233,7 +3233,7 @@ brcmf_sdio_bus_rxctl(struct device *dev, unsigned char 
> *msg, uint msglen)
>  {
>   int timeleft;
>   uint rxlen = 0;
> - bool pending;
> + bool pending = false;
>   u8 *buf;
>   struct brcmf_bus *bus_if = dev_get_drvdata(dev);
>   struct brcmf_sdio_dev *sdiodev = bus_if->bus_priv.sdio;
> -- 
> 2.18.1
> 
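
The reasoning in the quoted commit message (the wait helper only ever stores true, so the caller has to start from false) can be sketched in plain C. This is a hypothetical model, not the brcmfmac code: demo_resp_wait and demo_rxctl are made-up names that only mirror the shape of the problem.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical stand-in for a wait helper that reports a pending signal.
 * It only ever writes 'true'; no path stores 'false' into *pending.
 */
static int demo_resp_wait(bool signal_pending_now, bool *pending)
{
	if (signal_pending_now)
		*pending = true;
	return 0;
}

/* Caller modelled on the patched function: the flag starts out false. */
static bool demo_rxctl(bool signal_pending_now)
{
	bool pending = false;	/* without this, the read below could see garbage */

	demo_resp_wait(signal_pending_now, &pending);
	return pending;
}
```

With the initializer in place, demo_rxctl(false) reliably returns false; without it, that path would read an indeterminate value, which is what the static analyzer flagged.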


Re: [RFC PATCH v2] PCI/portdrv: Only disable Bus Master on kexec reboot and connected PCI devices

2020-09-13 Thread Lukas Wunner
On Mon, Sep 14, 2020 at 04:29:10AM +0800, Tiezhu Yang wrote:
> --- a/drivers/pci/pcie/portdrv_pci.c
> +++ b/drivers/pci/pcie/portdrv_pci.c
> @@ -143,6 +144,28 @@ static void pcie_portdrv_remove(struct pci_dev *dev)
>   }
>  
>   pcie_port_device_remove(dev);
> + pci_disable_device(dev);
> +}
> +
> +static void pcie_portdrv_shutdown(struct pci_dev *dev)
> +{
> + if (pci_bridge_d3_possible(dev)) {
> + pm_runtime_forbid(&dev->dev);
> + pm_runtime_get_noresume(&dev->dev);
> + pm_runtime_dont_use_autosuspend(&dev->dev);
> + }
> +
> + pcie_port_device_remove(dev);
> +
> + /*
> +  * If this is a kexec reboot, turn off Bus Master bit on the
> +  * device to tell it to not continue to do DMA. Don't touch
> +  * devices in D3cold or unknown states.
> +  * If it is not a kexec reboot, firmware will hit the PCI
> +  * devices with big hammer and stop their DMA any way.
> +  */
> + if (kexec_in_progress && (dev->current_state <= PCI_D3hot))
> + pci_disable_device(dev);

The last portion of this function is already executed afterwards by
pci_device_shutdown().  You don't need to duplicate it here:

device_shutdown()
  dev->bus->shutdown() == pci_device_shutdown()
drv->shutdown() == pcie_portdrv_shutdown()
  pci_disable_device()
pci_disable_device()

Thanks,

Lukas
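
The call chain drawn above can be modelled with a small user-space sketch that counts how often the disable step runs. All names here are hypothetical mock-ups (demo_dev, demo_portdrv_shutdown and so on); the model follows the ordering shown in the reply and does not reproduce the real PCI core.

```c
#include <assert.h>

/* Mock device that just counts calls to the disable step. */
struct demo_dev {
	int disable_calls;
	int kexec_in_progress;
};

static void demo_pci_disable_device(struct demo_dev *d)
{
	d->disable_calls++;
}

/* Driver-level shutdown; 'duplicate' models the extra call in the RFC patch. */
static void demo_portdrv_shutdown(struct demo_dev *d, int duplicate)
{
	if (duplicate && d->kexec_in_progress)
		demo_pci_disable_device(d);
}

/* Bus-level shutdown: calls the driver hook, then does the kexec step itself. */
static void demo_pci_device_shutdown(struct demo_dev *d, int duplicate)
{
	demo_portdrv_shutdown(d, duplicate);
	if (d->kexec_in_progress)
		demo_pci_disable_device(d);
}

/* Run one shutdown and report how many times the device was disabled. */
static int demo_count_disables(int kexec, int duplicate)
{
	struct demo_dev d = { 0, kexec };

	demo_pci_device_shutdown(&d, duplicate);
	return d.disable_calls;
}
```

In this model the duplicated call makes the disable step run twice on a kexec reboot, while leaving it out still runs it once from the bus-level path.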


Re: [PATCH V2] arm64/hotplug: Improve memory offline event notifier

2020-09-13 Thread Anshuman Khandual



On 09/11/2020 07:36 PM, Catalin Marinas wrote:
> Hi Anshuman,
> 
> On Mon, Aug 24, 2020 at 09:34:29AM +0530, Anshuman Khandual wrote:
>> This brings about three different changes to the sole memory event notifier
>> for arm64 platform and improves its robustness while also enhancing debug
>> capabilities during potential memory offlining error conditions.
>>
>> This moves the memory notifier registration bit earlier in the boot process
>> from device_initcall() to setup_arch() which will help in guarding against
>> potential early boot memory offline requests.
>>
>> This enables MEM_OFFLINE memory event handling. It will help intercept any
>> possible error condition such as if boot memory somehow still got offlined
>> even after an explicit notifier failure, potentially by a future change in
>> generic hotplug framework. This would help detect such scenarios and help
>> debug further.
>>
>> It also adds a validation function which scans entire boot memory and makes
>> sure that early memory sections are online. This check is essential for the
>> memory notifier to work properly as it cannot prevent boot memory offlining
>> if they are not online to begin with. But this additional sanity check is
>> enabled only with DEBUG_VM.
> 
> Could you please split this in separate patches rather than having a
> single one doing three somewhat related things?

Sure, will do.

> 
>> --- a/arch/arm64/kernel/setup.c
>> +++ b/arch/arm64/kernel/setup.c
>> @@ -376,6 +376,14 @@ void __init __no_sanitize_address setup_arch(char 
>> **cmdline_p)
>>  "This indicates a broken bootloader or old kernel\n",
>>  boot_args[1], boot_args[2], boot_args[3]);
>>  }
>> +
>> +/*
>> + * Register the memory notifier which will prevent boot
>> + * memory offlining requests - early enough. But there
>> + * should not be any actual offlining request till memory
>> + * block devices are initialized with memory_dev_init().
>> + */
>> +memory_hotremove_notifier();
> 
> Why can this not be an early_initcall()? As you said, memory_dev_init()
> is called much later, after the SMP was initialised.

This proposal moves memory_hotremove_notifier() to setup_arch() because it
can be, and there is no harm in calling it earlier than currently required.
If the generic MM sequence of events during memory init changes later,
this notifier will still work.

IIUC, the notifier chain registration can be called very early in the boot
process without much problem. There is some precedent on other platforms.

1. arch/s390/mm/init.c                    - In device_initcall() via s390_cma_mem_init()
2. arch/s390/mm/setup.c                   - In setup_arch() via reserve_crashkernel()
3. arch/powerpc/platforms/pseries/cmm.c   - In module_init() via cmm_init()
4. arch/powerpc/platforms/pseries/iommu.c - via iommu_init_early_pSeries()
                                            via pSeries_init()
                                            via pSeries_probe() aka ppc_md.probe()
                                            via probe_machine()
                                            via setup_arch()

> 
> You could even combine this with validate_bootmem_online_state() in a
> single early_initcall() which, after checking, registers the notifier.
> 

Yes, that will definitely be simpler, but there might still be some value
in having this registration in setup_arch(), which guards against future
generic MM changes while keeping it separate from the sanity check, i.e.
validate_bootmem_online_state(), which is enabled only with DEBUG_VM. But
I will combine both in early_initcall() with some name changes if that is
preferred.


Re: [External] Re: [PATCH] mm: memcontrol: Fix out-of-bounds on the buf returned by memory_stat_format

2020-09-13 Thread Muchun Song
On Sun, Sep 13, 2020 at 8:42 AM Andrew Morton  wrote:
>
> On Sat, 12 Sep 2020 23:51:00 +0800 Muchun Song  
> wrote:
>
> > The memory_stat_format() returns a format string, but the returned buf
> > may not include the trailing '\0'. So the users may read the buf
> > out of bounds.
>
> That sounds serious.  Is a cc:stable appropriate?
>

Yeah, I think we should cc:stable.

-- 
Yours,
Muchun


Re: [PATCH] dt-bindings: arm: sunxi: update H2+/H3 cpu clocks

2020-09-13 Thread Wilken Gottwalt
On Wed, 9 Sep 2020 17:53:07 +0200
Maxime Ripard  wrote:

> On Wed, Sep 09, 2020 at 03:54:46PM +0200, Wilken Gottwalt wrote:
> > On Wed, 9 Sep 2020 14:08:59 +0200
> > Maxime Ripard  wrote:
> > > Hi!
> > > 
> > > Thanks for contributing
> > > 
> > > The prefix isn't right though.
> > > 
> > > dt-bindings is used when you're modifying the binding itself, ie the
> > > description of what the node is supposed to look like, not when you
> > > actually use that node in a DT.
> > > 
> > > In that case, that would be ARM: dts: sunxi:
> > > 
> > > (we're on the ARM architecture, modifying dts's, for the sunxi platform)
> > 
> > Ah, I see, it was my first attempt to contribute and wasn't 100% sure, just
> > took the line from similar patches on the LKML. Thanks for the correction.
> > 
> > > On Thu, Sep 03, 2020 at 12:07:08PM +0200, Wilken Gottwalt wrote:
> > > > Change H2+/H3 clocks to 8 steps from 528 MHz up to 1200 MHz to support a
> > > > more fine-grained powersave setup. The SoCs are made for 1296 MHz, so
> > > > these clocks are still in a safe range. Tested on a NanoPi Duo and
> > > > OrangePi Zero.
> > > 
> > > How was this tested?
> > 
> > This is a longer story. It actually runs on hardware which is in production
> > for about 2-3 years, some use H2+ with full voltage regulators and some are
> > similar to the NanoPi DUO, where the voltage regulator can only switch
> > between 1.1 and 1.3 volts. It runs in two ways: A fully dynamic setup where
> > the ondemand scheduler is used and the second way where it is switched to
> > fixed values (based on load and temperature) using the cpufrequtils. The
> > > devices run a 4.14.x kernel and are tested against 4.19.x kernels.
> > These devices are routers running a custom tcp/ip stack and have a high I/O
> > load. I also prepared devices based on a custom H3 design, which ran stable
> > at 1.392 GHz (but had passive coolers attached). Do these explanations
> > help?
> 
> To some extent, but not entirely. Depending on the governor / workload,
> some OPPs might never be used at all.

I am aware of this, and the devices went through 24-hour burn-in tests with a
self-written tool very similar to the cpuburn + scripts you posted. I will try
to run the tool you pointed out, but I may need some time to do so. Getting a
Ruby installation into this embedded Linux is not easy, and a whole compiler
won't be possible at all. If you are interested, I could put our test tool on
GitHub.

> > > cpufreq OPP misconfiguration on Allwinner SoCs has been known to create
> > > some errors that are fairly hard to spot and be quite easy to go
> > > unnoticed (like caches corruptions).
> > 
> > Yeah, I noticed that in the beginning where I prepared the first kernels
> > for these devices. But after switching to multiples of 48MHz and bigger
> > steps these issues disappeared. I'm aware that this does not prove that
> > these issues do not appear, but thought I would share the values which I
> > consider stable.
> 
> The only really reliable test we've had so far is the one I pointed out,
> so please run it on one board at least
> 
> > > The only reliable test we have is:
> > > https://github.com/ssvb/cpuburn-arm/blob/master/cpufreq-ljt-stress-test
> > > 
> > > > Signed-off-by: Wilken Gottwalt 
> > > > ---
> > > >  arch/arm/boot/dts/sun8i-h3.dtsi | 34 +++--
> > > >  1 file changed, 32 insertions(+), 2 deletions(-)
> > > > 
> > > > diff --git a/arch/arm/boot/dts/sun8i-h3.dtsi 
> > > > b/arch/arm/boot/dts/sun8i-h3.dtsi
> > > > index 4e89701df91f..5517fcc02b7d 100644
> > > > --- a/arch/arm/boot/dts/sun8i-h3.dtsi
> > > > +++ b/arch/arm/boot/dts/sun8i-h3.dtsi
> > > > @@ -48,23 +48,53 @@ cpu0_opp_table: opp_table0 {
> > > > compatible = "operating-points-v2";
> > > > opp-shared;
> > > >  
> > > > -   opp-648000000 {
> > > > -   opp-hz = /bits/ 64 <648000000>;
> > > > +   opp-528000000 {
> > > > +   opp-hz = /bits/ 64 <528000000>;
> > > > +   opp-microvolt = <1020000 1020000 1300000>;
> > > > +   clock-latency-ns = <244144>; /* 8 32k periods */
> > > > +   };
> > > > +
> > > > +   opp-624000000 {
> > > > +   opp-hz = /bits/ 64 <624000000>;
> > > > opp-microvolt = <1040000 1040000 1300000>;
> > > > clock-latency-ns = <244144>; /* 8 32k periods */
> > > > };
> > > >  
> > > > +   opp-720000000 {
> > > > +   opp-hz = /bits/ 64 <720000000>;
> > > > +   opp-microvolt = <1060000 1060000 1300000>;
> > > > +   clock-latency-ns = <244144>; /* 8 32k periods */
> > > > +   };
> > > > +
> > > > opp-816000000 {
> > > > opp-hz = /bits/ 64 <816000000>;
> > > > opp-microvolt = <1100000 1100000 1300000>;
> > > > 

[PATCH -next] mt76: mt7915: convert to use le16_add_cpu()

2020-09-13 Thread Liu Shixin
Convert cpu_to_le16(le16_to_cpu(E1) + E2) to use le16_add_cpu().

Signed-off-by: Liu Shixin 
---
 drivers/net/wireless/mediatek/mt76/mt7915/mcu.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/wireless/mediatek/mt76/mt7915/mcu.c 
b/drivers/net/wireless/mediatek/mt76/mt7915/mcu.c
index ac8ec257da03..8f0b67d0e93e 100644
--- a/drivers/net/wireless/mediatek/mt76/mt7915/mcu.c
+++ b/drivers/net/wireless/mediatek/mt76/mt7915/mcu.c
@@ -714,8 +714,8 @@ mt7915_mcu_add_nested_subtlv(struct sk_buff *skb, int 
sub_tag, int sub_len,
ptlv = skb_put(skb, sub_len);
memcpy(ptlv, &tlv, sizeof(tlv));
 
-   *sub_ntlv = cpu_to_le16(le16_to_cpu(*sub_ntlv) + 1);
-   *len = cpu_to_le16(le16_to_cpu(*len) + sub_len);
+   le16_add_cpu(sub_ntlv, 1);
+   le16_add_cpu(len, sub_len);
 
return ptlv;
 }
-- 
2.25.1
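
The conversion in the patch above replaces an open-coded "convert to CPU order, add, convert back" sequence with a single helper call. A minimal user-space sketch of what such a helper does follows. The demo_* functions are hypothetical stand-ins that operate on a two-byte little-endian buffer; they are not the kernel's le16_add_cpu(), only an illustration of the same pattern.

```c
#include <assert.h>
#include <stdint.h>

/* Read a little-endian 16-bit value from memory into a host integer. */
static uint16_t demo_le16_to_cpu(const uint8_t le[2])
{
	return (uint16_t)(le[0] | (le[1] << 8));
}

/* Write a host integer to memory in little-endian byte order. */
static void demo_cpu_to_le16(uint16_t v, uint8_t le[2])
{
	le[0] = (uint8_t)(v & 0xff);
	le[1] = (uint8_t)(v >> 8);
}

/* In-place add on a little-endian field: load, add in CPU order, store back. */
static void demo_le16_add_cpu(uint8_t le[2], uint16_t val)
{
	demo_cpu_to_le16((uint16_t)(demo_le16_to_cpu(le) + val), le);
}

/* Convenience wrapper: start from a host value, add, and read the result. */
static uint16_t demo_add_and_read(uint16_t start, uint16_t val)
{
	uint8_t le[2];

	demo_cpu_to_le16(start, le);
	demo_le16_add_cpu(le, val);
	return demo_le16_to_cpu(le);
}
```

The helper keeps the byte-order handling in one place, so call sites shrink to a single line, and the carry across the byte boundary (0x00ff + 1) is handled the same way on any host.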



[PATCH -next] dh key: convert to use be32_add_cpu()

2020-09-13 Thread Liu Shixin
Convert cpu_to_be32(be32_to_cpu(E1) + E2) to use be32_add_cpu().

Signed-off-by: Liu Shixin 
---
 security/keys/dh.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/security/keys/dh.c b/security/keys/dh.c
index 1abfa70ed6e1..2635cb8a4561 100644
--- a/security/keys/dh.c
+++ b/security/keys/dh.c
@@ -186,7 +186,7 @@ static int kdf_ctr(struct kdf_sdesc *sdesc, const u8 *src, 
unsigned int slen,
 
dlen -= h;
dst += h;
-   counter = cpu_to_be32(be32_to_cpu(counter) + 1);
+   be32_add_cpu(&counter, 1);
}
 
return 0;
-- 
2.25.1



[PATCH -next] soc/qman: convert to use be32_add_cpu()

2020-09-13 Thread Liu Shixin
Signed-off-by: Liu Shixin 
---
 drivers/soc/fsl/qbman/qman_test_api.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/soc/fsl/qbman/qman_test_api.c 
b/drivers/soc/fsl/qbman/qman_test_api.c
index 2895d062cf51..7066b2f1467c 100644
--- a/drivers/soc/fsl/qbman/qman_test_api.c
+++ b/drivers/soc/fsl/qbman/qman_test_api.c
@@ -86,7 +86,7 @@ static void fd_inc(struct qm_fd *fd)
len--;
qm_fd_set_param(fd, fmt, off, len);
 
-   fd->cmd = cpu_to_be32(be32_to_cpu(fd->cmd) + 1);
+   be32_add_cpu(&fd->cmd, 1);
 }
 
 /* The only part of the 'fd' we can't memcmp() is the ppid */
-- 
2.25.1



[PATCH -next] dm integrity: convert to use le64_add_cpu()

2020-09-13 Thread Liu Shixin
Convert cpu_to_le64(le64_to_cpu(E1) + E2) to use le64_add_cpu().

Signed-off-by: Liu Shixin 
---
 drivers/md/dm-integrity.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/md/dm-integrity.c b/drivers/md/dm-integrity.c
index 3fc3757def55..cf9dadd55625 100644
--- a/drivers/md/dm-integrity.c
+++ b/drivers/md/dm-integrity.c
@@ -3696,7 +3696,7 @@ static int create_journal(struct dm_integrity_c *ic, char 
**error)
 retest_commit_id:
for (j = 0; j < i; j++) {
if (ic->commit_ids[j] == ic->commit_ids[i]) {
-   ic->commit_ids[i] = 
cpu_to_le64(le64_to_cpu(ic->commit_ids[i]) + 1);
+   le64_add_cpu(&ic->commit_ids[i], 1);
goto retest_commit_id;
}
}
-- 
2.25.1



[PATCH -next] crypto: atmel-aes - convert to use be32_add_cpu()

2020-09-13 Thread Liu Shixin
Convert cpu_to_be32(be32_to_cpu(E1) + E2) to use be32_add_cpu().

Signed-off-by: Liu Shixin 
---
 drivers/crypto/atmel-aes.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/crypto/atmel-aes.c b/drivers/crypto/atmel-aes.c
index a6e14491e080..b1d286004295 100644
--- a/drivers/crypto/atmel-aes.c
+++ b/drivers/crypto/atmel-aes.c
@@ -1539,7 +1539,7 @@ static int atmel_aes_gcm_length(struct atmel_aes_dev *dd)
 
/* Write incr32(J0) into IV. */
j0_lsw = j0[3];
-   j0[3] = cpu_to_be32(be32_to_cpu(j0[3]) + 1);
+   be32_add_cpu(&j0[3], 1);
atmel_aes_write_block(dd, AES_IVR(0), j0);
j0[3] = j0_lsw;
 
-- 
2.25.1



[PATCH -next] can: peak_usb: convert to use le32_add_cpu()

2020-09-13 Thread Liu Shixin
Convert cpu_to_le32(le32_to_cpu(E1) + E2) to use le32_add_cpu().

Signed-off-by: Liu Shixin 
---
 drivers/net/can/usb/peak_usb/pcan_usb_pro.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/can/usb/peak_usb/pcan_usb_pro.c 
b/drivers/net/can/usb/peak_usb/pcan_usb_pro.c
index 1689ab387612..d8ebf35dea1c 100644
--- a/drivers/net/can/usb/peak_usb/pcan_usb_pro.c
+++ b/drivers/net/can/usb/peak_usb/pcan_usb_pro.c
@@ -186,7 +186,7 @@ static int pcan_msg_add_rec(struct pcan_usb_pro_msg *pm, 
int id, ...)
 
len = pc - pm->rec_ptr;
if (len > 0) {
-   *pm->u.rec_cnt = cpu_to_le32(le32_to_cpu(*pm->u.rec_cnt) + 1);
+   le32_add_cpu(pm->u.rec_cnt, 1);
*pm->rec_ptr = id;
 
pm->rec_ptr = pc;
-- 
2.25.1



RE: [PATCH v2 0/5] Add support of SECVIO from SNVS on iMX8q/x

2020-09-13 Thread Aisheng Dong
> From: Shawn Guo 
> Sent: Wednesday, August 19, 2020 9:23 PM
> 
> On Tue, Aug 18, 2020 at 09:52:02AM +0200, Franck LENORMAND (OSS) wrote:
> > Hello,
> >
> > Peng was able to do a first pass of review on my patchset which led to
> > this second version. I hope a maintainer will be able to take a look
> > at this patchset once rested after all the work they did for 5.9.
> 
> @Peng, are you okay with this version?
> 
> @Aisheng, have a review on this?

Sorry, just noticed this.
Will find a time to review these two days.

Regards
Aisheng

> 
> Shawn
> 
> >
> > On mar., 2020-07-21 at 17:20 +0200, franck.lenorm...@oss.nxp.com wrote:
> > > From: Franck LENORMAND 
> > >
> > > This patchset aims to add support for the SECurity VIOlation
> > > (SECVIO) of the SNVS. A secvio is a signal emitted by the SNVS when
> > > a hardware attack is detected. On imx8x and imx8q SoC, the SNVS is
> > > controlled by the SECO and it is possible to interact with it using the
> > > SCU using the SC APIs.
> > >
> > > For the driver to communicate with the SNVS via the SCU and the SECO, I
> > > had to:
> > >  - Add support for exchange of big messages with the SCU (needed for
> > > imx_scu_irq_get_status)
> > >  - Add API to check that Linux can control the SECVIO
> > > (imx_sc_rm_is_resource_owned)
> > >  - Add APIs for the driver to read the state of the SECVIO registers
> > > of the SNVS and DGO (imx_sc_seco_secvio_enable and
> > > imx_sc_seco_secvio_enable).
> > >
> > > To check the state of the SECVIO IRQ in the SCU, I added the
> > > imx_scu_irq_get_status API.
> > >
> > > The secvio driver is designed to receive the IRQ produced by the
> > > SNVS in case of hardware attack and notify the status to the audit
> > > framework which can be used by the user.
> > >
> > > > The goal of the driver is to be self sufficient but can be extended
> > > by the user to perform custom operations on values read
> > > (imx_sc_seco_secvio_enable)
> > >
> > > v2:
> > >  - Removed (firmware: imx: scu-rm: Add Resource Management APIs)
> > >   -> Code required is already present
> > >  - Removed (firmware: imx: scu: Support reception of messages of any size)
> > >   -> The imx-scu is already working in fast-ipc mode
> > >  - (soc: imx8: Add the SC SECVIO driver):
> > >   - Fixed the warnings reported by kernel test robot
> > >
> > > Franck LENORMAND (5):
> > >   firmware: imx: scu-seco: Add SEcure Controller APIS
> > >   firmware: imx: scu-irq: Add API to retrieve status of IRQ
> > >   dt-bindings: firmware: imx-scu: Add SECVIO resource
> > >   dt-bindings: arm: imx: Documentation of the SC secvio driver
> > >   soc: imx8: Add the SC SECVIO driver
> > >
> > >  .../bindings/arm/freescale/fsl,imx-sc-secvio.yaml  |  34 +
> > >  drivers/firmware/imx/Makefile  |   2 +-
> > >  drivers/firmware/imx/imx-scu-irq.c |  37 +-
> > >  drivers/firmware/imx/imx-scu.c |   3 +
> > > >  drivers/firmware/imx/seco.c| 275 +++
> > >  drivers/soc/imx/Kconfig|  10 +
> > >  drivers/soc/imx/Makefile   |   1 +
> > >  drivers/soc/imx/secvio/Kconfig |  10 +
> > >  drivers/soc/imx/secvio/Makefile|   3 +
> > >  drivers/soc/imx/secvio/imx-secvio-audit.c  |  39 +
> > >  drivers/soc/imx/secvio/imx-secvio-debugfs.c| 379 +
> > >  drivers/soc/imx/secvio/imx-secvio-sc-int.h |  84 ++
> > > >  drivers/soc/imx/secvio/imx-secvio-sc.c | 858 +
> > >  include/dt-bindings/firmware/imx/rsrc.h|   3 +-
> > >  include/linux/firmware/imx/ipc.h   |   1 +
> > >  include/linux/firmware/imx/sci.h   |   5 +
> > >  include/linux/firmware/imx/svc/seco.h  |  73 ++
> > >  include/soc/imx/imx-secvio-sc.h| 177 +
> > >  18 files changed, 1991 insertions(+), 3 deletions(-)
> > > >  create mode 100644 Documentation/devicetree/bindings/arm/freescale/fsl,imx-sc-secvio.yaml
> > >  create mode 100644 drivers/firmware/imx/seco.c
> > >  create mode 100644 drivers/soc/imx/secvio/Kconfig
> > >  create mode 100644 drivers/soc/imx/secvio/Makefile
> > >  create mode 100644 drivers/soc/imx/secvio/imx-secvio-audit.c
> > >  create mode 100644 drivers/soc/imx/secvio/imx-secvio-debugfs.c
> > >  create mode 100644 drivers/soc/imx/secvio/imx-secvio-sc-int.h
> > >  create mode 100644 drivers/soc/imx/secvio/imx-secvio-sc.c
> > >  create mode 100644 include/linux/firmware/imx/svc/seco.h
> > >  create mode 100644 include/soc/imx/imx-secvio-sc.h
> > >
> >


Re: [PATCH v2] i2c: virtio: add a virtio i2c frontend driver

2020-09-13 Thread Jie Deng



On 2020/9/14 10:46, Jason Wang wrote:
> > +
> > +#define VIRTIO_I2C_MSG_OK    0
> > +#define VIRTIO_I2C_MSG_ERR    1
> > +
> > +/**
> > + * struct virtio_i2c_hdr - the virtio I2C message header structure
> > + * @addr: i2c_msg addr, the slave address
> > + * @flags: i2c_msg flags
> > + * @len: i2c_msg len
> > + */
> > +struct virtio_i2c_hdr {
> > +    __le16 addr;
> > +    __le16 flags;
> > +    __le16 len;
> > +};
>
> As said in v1, this should belong to uapi.

That's right. I missed this.
I will move these things to uapi. Thanks.

> > +
> > +/**
> > + * struct virtio_i2c_msg - the virtio I2C message structure
> > + * @hdr: the virtio I2C message header
> > + * @buf: virtio I2C message data buffer
> > + * @status: the processing result from the backend
> > + */
> > +struct virtio_i2c_msg {
> > +    struct virtio_i2c_hdr hdr;
> > +    u8 *buf;
> > +    u8 status;
> > +};
>
> I'm not quite sure this is the best layout.
>
> E.g. virtio-scsi keeps the out buffer separate from the in one:
>
> struct virtio_scsi_req_cmd {
> ...
> u8 dataout[];
> ...
> u8 datain[];
> }
>
> And I would like to have a look at the spec patch.
>
> Thanks

Sure. I will send the v3 along with the spec patch.
Thanks.




RE: [PATCH v3] Input: elants_i2c - Report resolution of ABS_MT_TOUCH_MAJOR by FW information.

2020-09-13 Thread Johnny.Chuang
> On Thu, 27 Aug 2020 at 19:20, Johnny Chuang
>  wrote:
> >
> > This patch adds a new behavior to report touch major resolution based
> > on information provided by firmware.
> >
> > In initial process, driver acquires touch information from touch ic.
> > It contains one byte about the resolution value of ABS_MT_TOUCH_MAJOR.
> > Touch driver will report touch major resolution by this information.
> >
> > Signed-off-by: Johnny Chuang 
> 
> Thanks Johnny!
> 
> Reviewed-by: Harry Cutts 
> 
> Harry Cutts
> Chrome OS Touch/Input team
> 

Hi,
Could you please help review this patch?

> > ---
> > Changes in v2:
> >   - register a real resolution value from firmware,
> > instead of hardcoding resolution value as 1 by flag.
> > Changes in v3:
> >   - modify git log message from flag to real value.
> >   - modify driver comment from flag to real value.
> > ---
> >  drivers/input/touchscreen/elants_i2c.c | 6 ++
> >  1 file changed, 6 insertions(+)
> >
> > diff --git a/drivers/input/touchscreen/elants_i2c.c
> > b/drivers/input/touchscreen/elants_i2c.c
> > index b0bd5bb..661a3ee 100644
> > --- a/drivers/input/touchscreen/elants_i2c.c
> > +++ b/drivers/input/touchscreen/elants_i2c.c
> > @@ -151,6 +151,7 @@ struct elants_data {
> >
> > bool wake_irq_enabled;
> > bool keep_power_in_suspend;
> > +   u8 report_major_resolution;
> >
> > /* Must be last to be used for DMA operations */
> > u8 buf[MAX_PACKET_SIZE] ____cacheline_aligned;
> > @@ -459,6 +460,9 @@ static int elants_i2c_query_ts_info(struct elants_data *ts)
> > rows = resp[2] + resp[6] + resp[10];
> > cols = resp[3] + resp[7] + resp[11];
> >
> > +   /* get report resolution value of ABS_MT_TOUCH_MAJOR */
> > +   ts->report_major_resolution = resp[16];
> > +
> > /* Process mm_to_pixel information */
> > error = elants_i2c_execute_command(client,
> >get_osr_cmd,
> > sizeof(get_osr_cmd),
> > @@ -1325,6 +1329,8 @@ static int elants_i2c_probe(struct i2c_client *client,
> >  0, MT_TOOL_PALM, 0, 0);
> > input_abs_set_res(ts->input, ABS_MT_POSITION_X, ts->x_res);
> > input_abs_set_res(ts->input, ABS_MT_POSITION_Y, ts->y_res);
> > +   if (ts->report_major_resolution > 0)
> > +   input_abs_set_res(ts->input, ABS_MT_TOUCH_MAJOR,
> > + ts->report_major_resolution);
> >
> > touchscreen_parse_properties(ts->input, true, &ts->prop);
> >
> > --
> > 2.7.4
> >



Re: [PATCH 12/15] selftests/seccomp: powerpc: Fix seccomp return value testing

2020-09-13 Thread Michael Ellerman
Kees Cook  writes:
> On powerpc, the errno is not inverted, and depends on ccr.so being
> set. Add this to a powerpc definition of SYSCALL_RET_SET().
>
> Co-developed-by: Thadeu Lima de Souza Cascardo 
> Signed-off-by: Thadeu Lima de Souza Cascardo 
> Link: 
> https://lore.kernel.org/linux-kselftest/20200911181012.171027-1-casca...@canonical.com/
> Fixes: 5d83c2b37d43 ("selftests/seccomp: Add powerpc support")
> Signed-off-by: Kees Cook 
> ---
>  tools/testing/selftests/seccomp/seccomp_bpf.c | 15 +++
>  1 file changed, 15 insertions(+)

This looks right to me, and matches what strace does AFAICS.

Reviewed-by: Michael Ellerman 

cheers

> diff --git a/tools/testing/selftests/seccomp/seccomp_bpf.c 
> b/tools/testing/selftests/seccomp/seccomp_bpf.c
> index 623953a53032..bbab2420d708 100644
> --- a/tools/testing/selftests/seccomp/seccomp_bpf.c
> +++ b/tools/testing/selftests/seccomp/seccomp_bpf.c
> @@ -1750,6 +1750,21 @@ TEST_F(TRACE_poke, getpid_runs_normally)
>  # define ARCH_REGS   struct pt_regs
>  # define SYSCALL_NUM(_regs)  (_regs).gpr[0]
>  # define SYSCALL_RET(_regs)  (_regs).gpr[3]
> +# define SYSCALL_RET_SET(_regs, _val)\
> + do {\
> + typeof(_val) _result = (_val);  \
> + /*  \
> +  * A syscall error is signaled by CR0 SO bit\
> +  * and the code is stored as a positive value.  \
> +  */ \
> + if (_result < 0) {  \
> + SYSCALL_RET(_regs) = -_result;  \
> + (_regs).ccr |= 0x1000;  \
> + } else {\
> + SYSCALL_RET(_regs) = _result;   \
> + (_regs).ccr &= ~0x1000; \
> + }   \
> + } while (0)
>  #elif defined(__s390__)
>  # define ARCH_REGS   s390_regs
>  # define SYSCALL_NUM(_regs)  (_regs).gprs[2]
> -- 
> 2.25.1


[PATCH RFC 4/4] 9p: fix race issue in fid contention.

2020-09-13 Thread Jianyong Wu
Eric's and Greg's patches offer a mechanism to fix the open-unlink-f*syscall
bug in 9p. But there is a race issue in fid contention.
As Greg's patch stores all fids from opened files in the corresponding inode,
all the fid lookup ops can retrieve a fid from the inode preferentially. But
there is no mechanism to handle fid contention. For example, two threads
may get the same fid at the same time, and one of them may clunk the fid
before the other thread is ready to discard it. This scenario can lead to
fatal problems, even a kernel core dump.

I introduce a mechanism to fix this race issue. A counter field is added
to the p9_fid struct to store the reference count of the fid. When a fid
is obtained from the inode, the counter is incremented, and it is decremented
when the caller is done with the fid. The fid is guaranteed not to be clunked
before the reference count drops to 0, so a clunked fid can no longer be
used.
As there is no need to retrieve a fid from the inode in all conditions, an enum
value denoting the source of the fid is also added to p9_fid. So we only
handle the reference counter for fids obtained from the inode.
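
The reference counting idea can be sketched in userspace with C11 atomics. This is a minimal illustration under stated assumptions, not the patch itself: struct demo_fid, demo_fid_get() and demo_fid_put() are hypothetical names, and the clunked flag merely stands in for the real clunk request.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a fid; not the real struct p9_fid. */
struct demo_fid {
	atomic_int count; /* users currently holding the fid */
	bool clunked;     /* set once the fid has been released */
};

/* Take a reference when a lookup hands the fid out. */
static void demo_fid_get(struct demo_fid *fid)
{
	atomic_fetch_add(&fid->count, 1);
}

/*
 * Drop a reference. Only the caller that drops the last reference
 * performs the clunk; returns true in that case.
 */
static bool demo_fid_put(struct demo_fid *fid)
{
	if (atomic_fetch_sub(&fid->count, 1) == 1) {
		fid->clunked = true; /* stand-in for the real clunk */
		return true;
	}
	return false;
}
```

With two holders, the first demo_fid_put() leaves the fid usable and only the second one clunks it. A real implementation would presumably also need the lookup that takes the reference to be serialized against the final put (for example by taking it under the list lock); the sketch leaves that out.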

tests:
race issue test from the old test case:
for file in {01..50}; do touch f.${file}; done
seq 1 1000 | xargs -n 1 -P 50 -I{} cat f.* > /dev/null

open-unlink-f*syscall test:
I have tested the following f* syscalls: ftruncate, fstat, fchown, fchmod, faccessat.

Signed-off-by: Jianyong Wu 
---
 fs/9p/fid.c | 27 +-
 fs/9p/fid.h | 24 +++
 fs/9p/vfs_dentry.c  |  2 +-
 fs/9p/vfs_dir.c | 23 +-
 fs/9p/vfs_file.c|  2 +-
 fs/9p/vfs_inode.c   | 42 
 fs/9p/vfs_inode_dotl.c  | 43 +++--
 fs/9p/vfs_super.c   |  7 +--
 fs/9p/xattr.c   | 18 +
 include/net/9p/client.h |  7 +++
 net/9p/client.c |  2 ++
 11 files changed, 144 insertions(+), 53 deletions(-)

diff --git a/fs/9p/fid.c b/fs/9p/fid.c
index 0b23b0fe6c51..fd8cfa4776c9 100644
--- a/fs/9p/fid.c
+++ b/fs/9p/fid.c
@@ -60,6 +60,10 @@ static struct p9_fid *v9fs_fid_find_inode(struct inode 
*inode, kuid_t uid)
break;
}
}
+   if (ret && !IS_ERR(ret)) {
+   atomic_inc(&ret->count);
+   ret->s = FID_FROM_INODE;
+   }
spin_unlock(&inode->i_lock);
return ret;
 }
@@ -84,10 +88,13 @@ void v9fs_open_fid_add(struct inode *inode, struct p9_fid 
*fid)
  * @dentry: dentry to look for fid in
  * @uid: return fid that belongs to the specified user
  * @any: if non-zero, return any fid associated with the dentry
+ * @source: from which of inode or dentry caller want to retrieve fid, 0
+ * denotes dentry and other denotes inode. Only if the f* syscall related
+ * funcs are recommended to set to non-zero.
  *
  */
 
-static struct p9_fid *v9fs_fid_find(struct dentry *dentry, kuid_t uid, int any)
+static struct p9_fid *v9fs_fid_find(struct dentry *dentry, kuid_t uid, int 
any, int source)
 {
struct p9_fid *fid, *ret;
 
@@ -96,7 +103,7 @@ static struct p9_fid *v9fs_fid_find(struct dentry *dentry, 
kuid_t uid, int any)
 any);
ret = NULL;
 
-   if (d_inode(dentry))
+   if (d_inode(dentry) && source)
ret = v9fs_fid_find_inode(d_inode(dentry), uid);
 
/* we'll recheck under lock if there's anything to look in */
@@ -109,6 +116,8 @@ static struct p9_fid *v9fs_fid_find(struct dentry *dentry, 
kuid_t uid, int any)
break;
}
}
+   if (ret && !IS_ERR(ret))
+   ret->s = FID_FROM_DENTRY;
spin_unlock(&dentry->d_lock);
}
 
@@ -144,7 +153,7 @@ static int build_path_from_dentry(struct v9fs_session_info 
*v9ses,
 }
 
 static struct p9_fid *v9fs_fid_lookup_with_uid(struct dentry *dentry,
-  kuid_t uid, int any)
+  kuid_t uid, int any, int source)
 {
struct dentry *ds;
const unsigned char **wnames, *uname;
@@ -154,7 +163,7 @@ static struct p9_fid *v9fs_fid_lookup_with_uid(struct 
dentry *dentry,
 
v9ses = v9fs_dentry2v9ses(dentry);
access = v9ses->flags & V9FS_ACCESS_MASK;
-   fid = v9fs_fid_find(dentry, uid, any);
+   fid = v9fs_fid_find(dentry, uid, any, source);
if (fid)
return fid;
/*
@@ -164,7 +173,7 @@ static struct p9_fid *v9fs_fid_lookup_with_uid(struct 
dentry *dentry,
 */
down_read(&v9ses->rename_sem);
ds = dentry->d_parent;
-   fid = v9fs_fid_find(ds, uid, any);
+   fid = v9fs_fid_find(ds, uid, any, 0);
if (fid) {
/* Found the parent fid do a lookup with that */
fid = p9_client_walk(fid, 1, &dentry->d_name.name, 1);
@@ -173,7 +182,7 @@ static struct 

[PATCH RFC 3/4] fs/9p: search open fids first

2020-09-13 Thread Jianyong Wu
From: Greg Kurz 

A previous patch fixed the "create-unlink-getattr" idiom: if getattr is
called on an unlinked file, we try to find an open fid attached to the
corresponding inode.

We have a similar issue with file permissions and setattr:

open("./test.txt", O_RDWR|O_CREAT, 0666) = 4
chmod("./test.txt", 0)  = 0
truncate("./test.txt", 0)   = -1 EACCES (Permission denied)
ftruncate(4, 0) = -1 EACCES (Permission denied)

The failure is expected with truncate() but not with ftruncate().

This happens because the lookup code does find a matching fid in the
dentry list. Unfortunately, this is not an open fid and the server
will be forced to rely on the path name, rather than on an open file
descriptor. This is the case in QEMU: the setattr operation will use
truncate() and fail because of bad write permissions.

This patch changes the logic in the lookup code, so that we consider
open fids first. It gives a chance to the server to match this open
fid to an open file descriptor and use ftruncate() instead of truncate().
This does not change the current behaviour for truncate() and other
path name based syscalls, since file permissions are checked earlier
in the VFS layer.

With this patch, we get:

open("./test.txt", O_RDWR|O_CREAT, 0666) = 4
chmod("./test.txt", 0)  = 0
truncate("./test.txt", 0)   = -1 EACCES (Permission denied)
ftruncate(4, 0) = 0
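
The syscall sequence above can be reproduced with a short C program. This is a sketch under stated assumptions: the helper name and the /tmp template are hypothetical, and the file is created with mkstemp() instead of the open() call shown in the trace. On a 9p mount the temporary file would have to live on that mount to exercise the code path discussed here.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Create a temporary file, drop all permission bits, then truncate
 * through the still-open descriptor. Returns the ftruncate() result,
 * or -1 if the setup failed.
 */
static int demo_ftruncate_after_chmod(void)
{
	char path[] = "/tmp/demo_ftruncXXXXXX"; /* hypothetical location */
	int fd = mkstemp(path);                 /* opened read-write */
	int ret = -1;

	if (fd < 0)
		return -1;
	if (chmod(path, 0) == 0)
		ret = ftruncate(fd, 0);
	close(fd);
	unlink(path);
	return ret;
}
```

On a local filesystem the descriptor keeps its write access after the chmod(), so the helper is expected to return 0; that is the behaviour the patch aims to restore for 9p.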

Change-Id: Icb657359493fc9c06956881551e83c7e1af4f024
Signed-off-by: Greg Kurz 
Signed-off-by: Jianyong Wu 
---
 fs/9p/fid.c | 9 +
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/fs/9p/fid.c b/fs/9p/fid.c
index d11dd430590d..0b23b0fe6c51 100644
--- a/fs/9p/fid.c
+++ b/fs/9p/fid.c
@@ -95,8 +95,12 @@ static struct p9_fid *v9fs_fid_find(struct dentry *dentry, 
kuid_t uid, int any)
 dentry, dentry, from_kuid(&init_user_ns, uid),
 any);
ret = NULL;
+
+   if (d_inode(dentry))
+   ret = v9fs_fid_find_inode(d_inode(dentry), uid);
+
/* we'll recheck under lock if there's anything to look in */
-   if (dentry->d_fsdata) {
+   if (!ret && dentry->d_fsdata) {
struct hlist_head *h = (struct hlist_head *)&dentry->d_fsdata;
spin_lock(&dentry->d_lock);
hlist_for_each_entry(fid, h, dlist) {
@@ -106,9 +110,6 @@ static struct p9_fid *v9fs_fid_find(struct dentry *dentry, 
kuid_t uid, int any)
}
}
spin_unlock(&dentry->d_lock);
-   } else {
-   if (dentry->d_inode)
-   ret = v9fs_fid_find_inode(dentry->d_inode, uid);
}
 
return ret;
-- 
2.17.1



[PATCH RFC 1/4] fs/9p: fix create-unlink-getattr idiom

2020-09-13 Thread Jianyong Wu
From: Eric Van Hensbergen 

Fixes several outstanding bug reports of not being able to getattr from an
open file after an unlink.  This patch cleans up transient fids on an unlink
and will search open fids on a client if it detects a dentry that appears to
have been unlinked.  This search is necessary because fstat does not pass fd
information through the VFS API to the filesystem, only the dentry which for
9p has an imperfect match to fids.
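
The idiom in question can be exercised from userspace with a few lines of C. This is a sketch under stated assumptions: the helper name and the /tmp template are hypothetical, and the file must be created on a 9p mount to reach the code changed by this patch.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Create a file, unlink it while it is still open, then fstat() the
 * descriptor. Returns the fstat() result, or -1 if the setup failed.
 */
static int demo_fstat_after_unlink(struct stat *st)
{
	char path[] = "/tmp/demo_unlinkXXXXXX"; /* hypothetical location */
	int fd = mkstemp(path);
	int ret;

	if (fd < 0)
		return -1;
	if (unlink(path) != 0) {
		close(fd);
		return -1;
	}
	ret = fstat(fd, st); /* only the open descriptor refers to the file now */
	close(fd);
	return ret;
}
```

On a local filesystem the fstat() call is expected to succeed because the open descriptor keeps the file alive; the bug reports referenced here are about 9p failing at this step.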

Inherent in this patch is also a fix for the qid handling on create/open
which apparently wasn't being set correctly and was necessary for the search
to succeed.

A possible optimization over this fix is to include accounting of open fids
with the inode in the private data (in a similar fashion to the way we track
transient fids with dentries).  This would allow a much quicker search for
a matching open fid.

Signed-off-by: Eric Van Hensbergen 
(changed v9fs_fid_find_global to v9fs_fid_find_inode in comment)
Signed-off-by: Greg Kurz 
Signed-off-by: Jianyong Wu 

Change-Id: Ifd5c8cdca8b40216e3e7d021eb6d0afd750096e7
---
 fs/9p/fid.c   | 30 ++
 fs/9p/vfs_inode.c |  4 
 net/9p/client.c   |  5 -
 3 files changed, 38 insertions(+), 1 deletion(-)

diff --git a/fs/9p/fid.c b/fs/9p/fid.c
index 3d681a2c2731..3304984c0fad 100644
--- a/fs/9p/fid.c
+++ b/fs/9p/fid.c
@@ -38,6 +38,33 @@ void v9fs_fid_add(struct dentry *dentry, struct p9_fid *fid)
spin_unlock(&dentry->d_lock);
 }
 
+/**
+ * v9fs_fid_find_inode - search for a fid off of the client list
+ * @inode: return a fid pointing to a specific inode
+ * @uid: return a fid belonging to the specified user
+ *
+ */
+
+static struct p9_fid *v9fs_fid_find_inode(struct inode *inode, kuid_t uid)
+{
+   struct p9_client *clnt = v9fs_inode2v9ses(inode)->clnt;
+   struct p9_fid *fid, *fidptr, *ret = NULL;
+   unsigned long flags;
+
+   p9_debug(P9_DEBUG_VFS, " inode: %p\n", inode);
+
+   spin_lock_irqsave(&clnt->lock, flags);
+   list_for_each_entry_safe(fid, fidptr, &clnt->fidlist, flist) {
+   if (uid_eq(fid->uid, uid) &&
+  (inode->i_ino == v9fs_qid2ino(&fid->qid))) {
+   ret = fid;
+   break;
+   }
+   }
+   spin_unlock_irqrestore(&clnt->lock, flags);
+   return ret;
+}
+
 /**
  * v9fs_fid_find - retrieve a fid that belongs to the specified uid
  * @dentry: dentry to look for fid in
@@ -65,6 +92,9 @@ static struct p9_fid *v9fs_fid_find(struct dentry *dentry, 
kuid_t uid, int any)
}
}
spin_unlock(&dentry->d_lock);
+   } else {
+   if (dentry->d_inode)
+   ret = v9fs_fid_find_inode(dentry->d_inode, uid);
}
 
return ret;
diff --git a/fs/9p/vfs_inode.c b/fs/9p/vfs_inode.c
index ae0c38ad1fcb..31c2fddabb82 100644
--- a/fs/9p/vfs_inode.c
+++ b/fs/9p/vfs_inode.c
@@ -570,6 +570,10 @@ static int v9fs_remove(struct inode *dir, struct dentry 
*dentry, int flags)
 
v9fs_invalidate_inode_attr(inode);
v9fs_invalidate_inode_attr(dir);
+
+   /* invalidate all fids associated with dentry */
+   /* NOTE: This will not include open fids */
+   dentry->d_op->d_release(dentry);
}
return retval;
 }
diff --git a/net/9p/client.c b/net/9p/client.c
index 09f1ec589b80..1a3f72bf45fc 100644
--- a/net/9p/client.c
+++ b/net/9p/client.c
@@ -1219,7 +1219,7 @@ struct p9_fid *p9_client_walk(struct p9_fid *oldfid, 
uint16_t nwname,
if (nwname)
memmove(&fid->qid, &wqids[nwqids - 1], sizeof(struct p9_qid));
else
-   fid->qid = oldfid->qid;
+   memmove(&fid->qid, &oldfid->qid, sizeof(struct p9_qid));
 
kfree(wqids);
return fid;
@@ -1272,6 +1272,7 @@ int p9_client_open(struct p9_fid *fid, int mode)
p9_is_proto_dotl(clnt) ? "RLOPEN" : "ROPEN",  qid.type,
(unsigned long long)qid.path, qid.version, iounit);
 
+   memmove(&fid->qid, &qid, sizeof(struct p9_qid));
fid->mode = mode;
fid->iounit = iounit;
 
@@ -1317,6 +1318,7 @@ int p9_client_create_dotl(struct p9_fid *ofid, const char 
*name, u32 flags, u32
(unsigned long long)qid->path,
qid->version, iounit);
 
+   memmove(&ofid->qid, qid, sizeof(struct p9_qid));
ofid->mode = mode;
ofid->iounit = iounit;
 
@@ -1362,6 +1364,7 @@ int p9_client_fcreate(struct p9_fid *fid, const char 
*name, u32 perm, int mode,
(unsigned long long)qid.path,
qid.version, iounit);
 
+   memmove(&fid->qid, &qid, sizeof(struct p9_qid));
fid->mode = mode;
fid->iounit = iounit;
 
-- 
2.17.1



[PATCH RFC 2/4] fs/9p: track open fids

2020-09-13 Thread Jianyong Wu
From: Greg Kurz 

This patch adds accounting of open fids in a list hanging off the i_private
field of the corresponding inode. This allows faster lookups compared to
searching the full 9p client list.

The lookup code is modified accordingly.
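
The data-structure change can be sketched in plain C: each inode carries its own short list of open fids, so a lookup walks only that list instead of every fid known to the client. The names below are hypothetical and the locking used in the patch is deliberately left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an open fid. */
struct demo_open_fid {
	unsigned int uid;           /* owner of the fid */
	struct demo_open_fid *next; /* link in the owning inode's list */
};

/* Hypothetical stand-in for an inode with a private list head. */
struct demo_inode {
	struct demo_open_fid *open_fids;
};

/* Add an open fid at the head of the inode's list. */
static void demo_open_fid_add(struct demo_inode *inode,
			      struct demo_open_fid *fid)
{
	fid->next = inode->open_fids;
	inode->open_fids = fid;
}

/* Return the first open fid of this inode that belongs to uid, or NULL. */
static struct demo_open_fid *demo_fid_find_inode(const struct demo_inode *inode,
						 unsigned int uid)
{
	struct demo_open_fid *fid;

	for (fid = inode->open_fids; fid; fid = fid->next)
		if (fid->uid == uid)
			return fid;
	return NULL;
}
```

The cost of a lookup then depends on how many times this one file is open rather than on the total number of fids held by the client.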

Change-Id: I9f1b99d9c7ab36a15534726cce034a8a1443d680
Signed-off-by: Greg Kurz 
Signed-off-by: Jianyong Wu 
---
 fs/9p/fid.c | 32 +++-
 fs/9p/fid.h |  1 +
 fs/9p/vfs_dir.c |  3 +++
 fs/9p/vfs_file.c|  1 +
 fs/9p/vfs_inode.c   |  6 +-
 fs/9p/vfs_inode_dotl.c  |  1 +
 include/net/9p/client.h |  1 +
 7 files changed, 35 insertions(+), 10 deletions(-)

diff --git a/fs/9p/fid.c b/fs/9p/fid.c
index 3304984c0fad..d11dd430590d 100644
--- a/fs/9p/fid.c
+++ b/fs/9p/fid.c
@@ -39,7 +39,7 @@ void v9fs_fid_add(struct dentry *dentry, struct p9_fid *fid)
 }
 
 /**
- * v9fs_fid_find_inode - search for a fid off of the client list
+ * v9fs_fid_find_inode - search for an open fid off of the inode list
  * @inode: return a fid pointing to a specific inode
  * @uid: return a fid belonging to the specified user
  *
@@ -47,24 +47,38 @@ void v9fs_fid_add(struct dentry *dentry, struct p9_fid *fid)
 
 static struct p9_fid *v9fs_fid_find_inode(struct inode *inode, kuid_t uid)
 {
-   struct p9_client *clnt = v9fs_inode2v9ses(inode)->clnt;
-   struct p9_fid *fid, *fidptr, *ret = NULL;
-   unsigned long flags;
+   struct hlist_head *h;
+   struct p9_fid *fid, *ret = NULL;
 
p9_debug(P9_DEBUG_VFS, " inode: %p\n", inode);
 
-   spin_lock_irqsave(&clnt->lock, flags);
-   list_for_each_entry_safe(fid, fidptr, &clnt->fidlist, flist) {
-   if (uid_eq(fid->uid, uid) &&
-  (inode->i_ino == v9fs_qid2ino(&fid->qid))) {
+   spin_lock(&inode->i_lock);
+   h = (struct hlist_head *)&inode->i_private;
+   hlist_for_each_entry(fid, h, ilist) {
+   if (uid_eq(fid->uid, uid)) {
ret = fid;
break;
}
}
-   spin_unlock_irqrestore(&clnt->lock, flags);
+   spin_unlock(&inode->i_lock);
return ret;
 }
 
+/**
+ * v9fs_open_fid_add - add an open fid to an inode
+ * @dentry: inode that the fid is being added to
+ * @fid: fid to add
+ *
+ */
+
+void v9fs_open_fid_add(struct inode *inode, struct p9_fid *fid)
+{
+   spin_lock(&inode->i_lock);
+   hlist_add_head(&fid->ilist, (struct hlist_head *)&inode->i_private);
+   spin_unlock(&inode->i_lock);
+}
+
+
 /**
  * v9fs_fid_find - retrieve a fid that belongs to the specified uid
  * @dentry: dentry to look for fid in
diff --git a/fs/9p/fid.h b/fs/9p/fid.h
index 928b1093f511..dfa11df02818 100644
--- a/fs/9p/fid.h
+++ b/fs/9p/fid.h
@@ -15,6 +15,7 @@ static inline struct p9_fid *v9fs_parent_fid(struct dentry 
*dentry)
 }
 void v9fs_fid_add(struct dentry *dentry, struct p9_fid *fid);
 struct p9_fid *v9fs_writeback_fid(struct dentry *dentry);
+void v9fs_open_fid_add(struct inode *inode, struct p9_fid *fid);
 static inline struct p9_fid *clone_fid(struct p9_fid *fid)
 {
return IS_ERR(fid) ? fid :  p9_client_walk(fid, 0, NULL, 1);
diff --git a/fs/9p/vfs_dir.c b/fs/9p/vfs_dir.c
index 674d22bf4f6f..d82d8a346f86 100644
--- a/fs/9p/vfs_dir.c
+++ b/fs/9p/vfs_dir.c
@@ -210,6 +210,9 @@ int v9fs_dir_release(struct inode *inode, struct file *filp)
fid = filp->private_data;
p9_debug(P9_DEBUG_VFS, "inode: %p filp: %p fid: %d\n",
 inode, filp, fid ? fid->fid : -1);
+   spin_lock(&inode->i_lock);
+   hlist_del(&fid->ilist);
+   spin_unlock(&inode->i_lock);
if (fid)
p9_client_clunk(fid);
return 0;
diff --git a/fs/9p/vfs_file.c b/fs/9p/vfs_file.c
index 92cd1d80218d..b42cc1752cd1 100644
--- a/fs/9p/vfs_file.c
+++ b/fs/9p/vfs_file.c
@@ -96,6 +96,7 @@ int v9fs_file_open(struct inode *inode, struct file *file)
mutex_unlock(&v9inode->v_mutex);
if (v9ses->cache == CACHE_LOOSE || v9ses->cache == CACHE_FSCACHE)
v9fs_cache_inode_set_cookie(inode, file);
+   v9fs_open_fid_add(inode, fid);
return 0;
 out_error:
p9_client_clunk(file->private_data);
diff --git a/fs/9p/vfs_inode.c b/fs/9p/vfs_inode.c
index 31c2fddabb82..6b243ffcbcf0 100644
--- a/fs/9p/vfs_inode.c
+++ b/fs/9p/vfs_inode.c
@@ -256,6 +256,7 @@ int v9fs_init_inode(struct v9fs_session_info *v9ses,
inode->i_rdev = rdev;
inode->i_atime = inode->i_mtime = inode->i_ctime = current_time(inode);
inode->i_mapping->a_ops = &v9fs_addr_operations;
+   inode->i_private = NULL;
 
switch (mode & S_IFMT) {
case S_IFIFO:
@@ -796,6 +797,7 @@ v9fs_vfs_atomic_open(struct inode *dir, struct dentry 
*dentry,
struct v9fs_session_info *v9ses;
struct p9_fid *fid, *inode_fid;
struct dentry *res = NULL;
+   struct inode *inode;
 
if (d_in_lookup(dentry)) {
res = v9fs_vfs_lookup(dir, dentry, 0);
@@ -824,7 +826,8 @@ v9fs_vfs_atomic_open(struct inode *dir, struct dentry 
*dentry,
}
 
 

[PATCH RFC 0/4] 9p: fix open-unlink-f*syscall bug

2020-09-13 Thread Jianyong Wu
The open-unlink-f*syscall bug is a well-known bug in 9p; we try to fix it
in this patch set.
I take Eric's and Greg's patches, which constitute 1/4 - 3/4 of this patch
set, as the main frame of the solution. In patch 4/4, I fix the fid race issue
that exists in Greg's patch.

Eric Van Hensbergen (1):
  fs/9p: fix create-unlink-getattr idiom

Greg Kurz (1):
  fs/9p: search open fids first

Jianyong Wu (2):
  fs/9p: track open fids
  9p: fix race issue in fid contention.

 fs/9p/fid.c | 72 +++--
 fs/9p/fid.h | 25 +++---
 fs/9p/vfs_dentry.c  |  2 +-
 fs/9p/vfs_dir.c | 20 ++--
 fs/9p/vfs_file.c|  3 +-
 fs/9p/vfs_inode.c   | 52 +
 fs/9p/vfs_inode_dotl.c  | 44 -
 fs/9p/vfs_super.c   |  7 ++--
 fs/9p/xattr.c   | 18 ---
 include/net/9p/client.h |  8 +
 net/9p/client.c |  7 +++-
 11 files changed, 206 insertions(+), 52 deletions(-)

-- 
2.17.1



[PATCH V2 RESEND 1/4] gpio: mxc: Support module build

2020-09-13 Thread Anson Huang
Change config to tristate, add module device table, module author,
description and license to support module build for i.MX GPIO driver.

As this is a SoC GPIO module, it provides common functions for most
of the peripheral devices, such as GPIO pin control and a secondary
interrupt controller for GPIO pin IRQs. Without the GPIO driver, most
of the peripheral devices will NOT work properly, so the GPIO module is
similar to the clock and pinctrl drivers in that it should be loaded ONCE
and never unloaded.

Since the MXC GPIO driver needs an init function to register syscore
ops once, this patch still uses subsys_initcall(), NOT module_platform_driver().

Signed-off-by: Anson Huang 
---
Changes since V1:
- no code change, just add a detailed explanation of why this patch
  does NOT support module unloading.
---
 drivers/gpio/Kconfig| 2 +-
 drivers/gpio/gpio-mxc.c | 6 ++
 2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/gpio/Kconfig b/drivers/gpio/Kconfig
index 5cfdaf3..c7292a5 100644
--- a/drivers/gpio/Kconfig
+++ b/drivers/gpio/Kconfig
@@ -397,7 +397,7 @@ config GPIO_MVEBU
select REGMAP_MMIO
 
 config GPIO_MXC
-   def_bool y
+   tristate "i.MX GPIO support"
depends on ARCH_MXC || COMPILE_TEST
select GPIO_GENERIC
select GENERIC_IRQ_CHIP
diff --git a/drivers/gpio/gpio-mxc.c b/drivers/gpio/gpio-mxc.c
index 64278a4..643f4c55 100644
--- a/drivers/gpio/gpio-mxc.c
+++ b/drivers/gpio/gpio-mxc.c
@@ -15,6 +15,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -158,6 +159,7 @@ static const struct of_device_id mxc_gpio_dt_ids[] = {
{ .compatible = "fsl,imx7d-gpio", .data = 
&mxc_gpio_devtype[IMX35_GPIO], },
{ /* sentinel */ }
 };
+MODULE_DEVICE_TABLE(of, mxc_gpio_dt_ids);
 
 /*
  * MX2 has one interrupt *for all* gpio ports. The list is used
@@ -604,3 +606,7 @@ static int __init gpio_mxc_init(void)
return platform_driver_register(&mxc_gpio_driver);
 }
 subsys_initcall(gpio_mxc_init);
+
+MODULE_AUTHOR("Shawn Guo ");
+MODULE_DESCRIPTION("i.MX GPIO Driver");
+MODULE_LICENSE("GPL");
-- 
2.7.4



[PATCH V2 RESEND 2/4] arm64: defconfig: Build in CONFIG_GPIO_MXC by default

2020-09-13 Thread Anson Huang
i.MX GPIO is no longer enabled by default, so select CONFIG_GPIO_MXC
as built-in manually.

Signed-off-by: Anson Huang 
---
No change.
---
 arch/arm64/configs/defconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/arm64/configs/defconfig b/arch/arm64/configs/defconfig
index 63003ec..c8fca1a 100644
--- a/arch/arm64/configs/defconfig
+++ b/arch/arm64/configs/defconfig
@@ -510,6 +510,7 @@ CONFIG_GPIO_PCA953X=y
 CONFIG_GPIO_PCA953X_IRQ=y
 CONFIG_GPIO_BD9571MWV=m
 CONFIG_GPIO_MAX77620=y
+CONFIG_GPIO_MXC=y
 CONFIG_POWER_AVS=y
 CONFIG_QCOM_CPR=y
 CONFIG_ROCKCHIP_IODOMAIN=y
-- 
2.7.4



[PATCH V2 RESEND 3/4] ARM: imx_v6_v7_defconfig: Build in CONFIG_GPIO_MXC by default

2020-09-13 Thread Anson Huang
i.MX GPIO is no longer enabled by default, so select CONFIG_GPIO_MXC
as built-in manually.

Signed-off-by: Anson Huang 
---
No change.
---
 arch/arm/configs/imx_v6_v7_defconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/arm/configs/imx_v6_v7_defconfig 
b/arch/arm/configs/imx_v6_v7_defconfig
index 0fa79bd..221f5c3 100644
--- a/arch/arm/configs/imx_v6_v7_defconfig
+++ b/arch/arm/configs/imx_v6_v7_defconfig
@@ -217,6 +217,7 @@ CONFIG_GPIO_PCA953X=y
 CONFIG_GPIO_PCF857X=y
 CONFIG_GPIO_STMPE=y
 CONFIG_GPIO_74X164=y
+CONFIG_GPIO_MXC=y
 CONFIG_POWER_RESET=y
 CONFIG_POWER_RESET_SYSCON=y
 CONFIG_POWER_RESET_SYSCON_POWEROFF=y
-- 
2.7.4



[PATCH V2 RESEND 4/4] ARM: multi_v7_defconfig: Build in CONFIG_GPIO_MXC by default

2020-09-13 Thread Anson Huang
i.MX GPIO is no longer enabled by default, so select CONFIG_GPIO_MXC
as built-in manually.

Signed-off-by: Anson Huang 
---
new patch.
---
 arch/arm/configs/multi_v7_defconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/arm/configs/multi_v7_defconfig 
b/arch/arm/configs/multi_v7_defconfig
index bfaa38c..d2744ff 100644
--- a/arch/arm/configs/multi_v7_defconfig
+++ b/arch/arm/configs/multi_v7_defconfig
@@ -465,6 +465,7 @@ CONFIG_GPIO_PALMAS=y
 CONFIG_GPIO_TPS6586X=y
 CONFIG_GPIO_TPS65910=y
 CONFIG_GPIO_TWL4030=y
+CONFIG_GPIO_MXC=y
 CONFIG_POWER_AVS=y
 CONFIG_ROCKCHIP_IODOMAIN=y
 CONFIG_POWER_RESET_AS3722=y
-- 
2.7.4



Re: [RFC PATCH] fork: Free per-cpu cached vmalloc'ed thread stacks with

2020-09-13 Thread Andrew Morton
On Sat, 5 Sep 2020 00:12:29 + "Isaac J. Manjarres"  
wrote:

> The per-cpu cached vmalloc'ed stacks are currently freed in the
> CPU hotplug teardown path by the free_vm_stack_cache() callback,
> which invokes vfree(), which may result in purging the list of
> lazily freed vmap areas.
> 
> Purging all of the lazily freed vmap areas can take a long time
> when the list of vmap areas is large. This is problematic, as
> free_vm_stack_cache() is invoked prior to the offline CPU's timers
> being migrated. This is not desirable as it can lead to timer
> migration delays in the CPU hotplug teardown path, and timer callbacks
> will be invoked long after the timer has expired.
> 
> For example, on a system that has only one online CPU (CPU 1) that is
> running a heavy workload, and another CPU that is being offlined,
> the online CPU will invoke free_vm_stack_cache() to free the cached
> vmalloc'ed stacks for the CPU being offlined. When there are 2702
> vmap areas that total to 13498 pages, free_vm_stack_cache() takes
> over 2 seconds to execute:
> 
> [001]   399.335808: cpuhp_enter: cpu: 0005 target:   0 step:  67 
> (free_vm_stack_cache)
> 
> /* The first vmap area to be freed */
> [001]   399.337157: __purge_vmap_area_lazy: [0:2702] 0xffc033da8000 - 
> 0xffc033dad000 (5 : 13498)
> 
> /* After two seconds */
> [001]   401.528010: __purge_vmap_area_lazy: [1563:2702] 0xffc02fe1 - 
> 0xffc02fe15000 (5 : 5765)
> 
> Instead of freeing the per-cpu cached vmalloc'ed stacks synchronously
> with respect to the CPU hotplug teardown state machine, free them
> asynchronously to help move along the CPU hotplug teardown state machine
> quickly.
> 
> ...
>
> --- a/kernel/fork.c
> +++ b/kernel/fork.c
> @@ -202,7 +202,7 @@ static int free_vm_stack_cache(unsigned int cpu)
>   if (!vm_stack)
>   continue;
>  
> - vfree(vm_stack->addr);
> + vfree_atomic(vm_stack->addr);
>   cached_vm_stacks[i] = NULL;
>   }

I guess that makes sense, although perhaps we shouldn't be permitting
purge_list to get so large - such latency issues will still appear in
other situations.

If we go with this fix-just-fork approach, can we please have a comment
in there explaining why vfree_atomic() is being used?



Re: [PATCH] lib/string.c: Clarify kerndoc for stpcpy()

2020-09-13 Thread Andrew Morton
On Sun, 6 Sep 2020 13:26:32 -0700 Kees Cook  wrote:

> On Sun, Sep 06, 2020 at 12:08:09PM -0400, Arvind Sankar wrote:
> > On Sun, Sep 06, 2020 at 03:06:29AM -0700, Kees Cook wrote:
> > > Fix the language around return values to indicate destination instead of
> > > source.
> > > 
> > > Reported-by: Masahiro Yamada 
> > > Link: 
> > > https://lore.kernel.org/lkml/CAK7LNAQvQBhjYgSkvm-dVyNz2Jd2C2qAtfyRk-rngEDfjkc38g
> > > Cc: Nick Desaulniers 
> > > Signed-off-by: Kees Cook 
> > > ---
> > > This is a fix for lib-stringc-implement-stpcpy.patch in -mm.
> > > 
> > > Andrew, please note that it would be nice to get this into -rc6
> > > to unbreak the clang builds.
> > > 
> > > Thanks!
> > > ---
> > >  lib/string.c | 12 ++--
> > >  1 file changed, 6 insertions(+), 6 deletions(-)
> > > 
> > > diff --git a/lib/string.c b/lib/string.c
> > > index 6bd0cf0fb009..32a56436c7eb 100644
> > > --- a/lib/string.c
> > > +++ b/lib/string.c
> > > @@ -280,12 +280,12 @@ EXPORT_SYMBOL(strscpy_pad);
> > >   * @src: pointer to the beginning of string being copied from. Must not 
> > > overlap
> > >   *   dest.
> > >   *
> > > - * stpcpy differs from strcpy in a key way: the return value is the new
> > > - * %NUL-terminated character. (for strcpy, the return value is a pointer 
> > > to
> > > - * src. This interface is considered unsafe as it doesn't perform bounds
> > > - * checking of the inputs. As such it's not recommended for usage. 
> > > Instead,
> > > - * its definition is provided in case the compiler lowers other libcalls 
> > > to
> > > - * stpcpy.
> > > + * stpcpy differs from strcpy in a key way: the return value is a pointer
> > > + * to the new %NUL-terminated character in @dest. (For strcpy, the return
> > > + * value is a pointer to the start of @dest. This interface is considered
> >   ^ need closing parenthesis
> > 
> > Thanks.
> 
> *face in hands* Yup. Andrew, do you want to poke that yourself or should
> I send a fix-fix? :)

I haven't got onto the stpcpy() base patch yet, so a full resend of
a v4 would be nice please.
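
The return-value difference that the kerneldoc fix describes can be sketched in user-space C. This is an illustrative re-implementation, not the lib/string.c code; sketch_stpcpy() and sketch_join() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy src into dest and return a pointer to the terminating NUL that
 * was just written into dest (strcpy() would return dest instead).
 * Like strcpy(), this performs no bounds checking. */
char *sketch_stpcpy(char *dest, const char *src)
{
    while ((*dest = *src) != '\0') {
        dest++;
        src++;
    }
    return dest;
}

/* Concatenate a and b into dest by chaining the returned end pointer.
 * dest must have room for strlen(a) + strlen(b) + 1 bytes.
 * Returns the length of the resulting string. */
size_t sketch_join(char *dest, const char *a, const char *b)
{
    char *end;

    assert(dest && a && b);
    end = sketch_stpcpy(sketch_stpcpy(dest, a), b);
    assert(strlen(dest) == (size_t)(end - dest));
    return (size_t)(end - dest);
}
```

Because the returned pointer already marks the end of the copied string, chained copies do not need to rescan dest, which is the usual reason a compiler lowers other library calls to stpcpy.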


[PATCH] uio: free uio id after uio file node is freed

2020-09-13 Thread Lang Dai
uio_register_device() does two things:
1) gets a uio id from a global pool, e.g. the id is 
2) creates file nodes like /sys/class/uio/uio

uio_unregister_device() does two things:
1) frees the uio id  and returns it to the global pool
2) frees the file node /sys/class/uio/uio

A problem arises when one worker is calling uio_unregister_device()
while another worker is calling uio_register_device().
If the two workers are X and Y, the following sequence can occur:
1) X frees the uio id 
2) Y gets a uio id 
3) Y creates file node /sys/class/uio/uio
4) X frees the file node /sys/class/uio/uio
Step 3 then fails, because Y tries to create a file node that still
exists, which is the duplicate-filename failure we saw.

The failure is reported as follows:
sysfs: cannot create duplicate filename '/class/uio/uio10'
Call Trace:
   sysfs_do_create_link_sd.isra.2+0x9e/0xb0
   sysfs_create_link+0x25/0x40
   device_add+0x2c4/0x640
   __uio_register_device+0x1c5/0x576 [uio]
   adf_uio_init_bundle_dev+0x231/0x280 [intel_qat]
   adf_uio_register+0x1c0/0x340 [intel_qat]
   adf_dev_start+0x202/0x370 [intel_qat]
   adf_dev_start_async+0x40/0xa0 [intel_qat]
   process_one_work+0x14d/0x410
   worker_thread+0x4b/0x460
   kthread+0x105/0x140
 ? process_one_work+0x410/0x410
 ? kthread_bind+0x40/0x40
 ret_from_fork+0x1f/0x40
 Code: 85 c0 48 89 c3 74 12 b9 00 10 00 00 48 89 c2 31 f6 4c 89 ef
 e8 ec c4 ff ff 4c 89 e2 48 89 de 48 c7 c7 e8 b4 ee b4 e8 6a d4 d7
 ff <0f> 0b 48 89 df e8 20 fa f3 ff 5b 41 5c 41 5d 5d c3 66 0f 1f 84
---[ end trace a7531c1ed5269e84 ]---
 c6xxvf b002:00:00.0: Failed to register UIO devices
 c6xxvf b002:00:00.0: Failed to register UIO devices

Signed-off-by: Lang Dai 

diff --git a/drivers/uio/uio.c b/drivers/uio/uio.c
index 73efb80..6dca744 100644
--- a/drivers/uio/uio.c
+++ b/drivers/uio/uio.c
@@ -1048,8 +1048,6 @@ void uio_unregister_device(struct uio_info *info)
 
    idev = info->uio_dev;
 
-   uio_free_minor(idev);
-
    mutex_lock(&idev->info_lock);
    uio_dev_del_attributes(idev);
 
@@ -1064,6 +1062,8 @@ void uio_unregister_device(struct uio_info *info)
 
    device_unregister(&idev->dev);
 
+   uio_free_minor(idev);
+
    return;
 }
 EXPORT_SYMBOL_GPL(uio_unregister_device);
-- 
2.7.4
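
The ordering problem described in this patch can be replayed deterministically in user-space C. This is a sketch under stated assumptions: the id pool and the sysfs nodes are modelled as two small boolean arrays, the interleaving of workers X and Y is written out by hand, and every name here (get_id(), create_node(), race_old_order(), and so on) is hypothetical rather than taken from drivers/uio/uio.c.

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_MAX_IDS 4

static bool id_used[SKETCH_MAX_IDS];      /* stands in for the global id pool */
static bool node_exists[SKETCH_MAX_IDS];  /* stands in for the sysfs file nodes */

/* Hand out the lowest free id, or -1 if the pool is exhausted. */
static int get_id(void)
{
    for (int i = 0; i < SKETCH_MAX_IDS; i++) {
        if (!id_used[i]) {
            id_used[i] = true;
            return i;
        }
    }
    return -1;
}

static void free_id(int id)
{
    assert(id >= 0 && id < SKETCH_MAX_IDS);
    id_used[id] = false;
}

/* Fails with -1 when a node for this id already exists, which models
 * the "cannot create duplicate filename" error. */
static int create_node(int id)
{
    assert(id >= 0 && id < SKETCH_MAX_IDS);
    if (node_exists[id])
        return -1;
    node_exists[id] = true;
    return 0;
}

static void remove_node(int id)
{
    assert(id >= 0 && id < SKETCH_MAX_IDS);
    node_exists[id] = false;
}

static void reset_state(void)
{
    for (int i = 0; i < SKETCH_MAX_IDS; i++) {
        id_used[i] = false;
        node_exists[i] = false;
    }
}

/* Old order: X frees the id before removing the node, and Y registers
 * in between. Returns the result of Y's create_node(). */
int race_old_order(void)
{
    int x, y, ret;

    reset_state();
    x = get_id();
    create_node(x);          /* X was registered earlier */

    free_id(x);              /* X: id goes back to the pool first */
    y = get_id();            /* Y: receives the same id */
    ret = create_node(y);    /* Y: node still present, so this fails */
    remove_node(x);          /* X: node removed too late */
    return ret;
}

/* New order: X removes the node first and frees the id last, so Y
 * cannot be handed X's id while X's node may still exist. */
int race_new_order(void)
{
    int x, y, ret;

    reset_state();
    x = get_id();
    create_node(x);          /* X was registered earlier */

    remove_node(x);          /* X: node removed first */
    y = get_id();            /* Y: x is still reserved, gets another id */
    ret = create_node(y);    /* Y: succeeds */
    free_id(x);              /* X: id returned only at the very end */
    return ret;
}
```

With the id released last, the window in which another registration can pick up an id whose node still exists is closed, at the cost of holding the id slightly longer.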



Re: [External] Re: [PATCH v3] mm: memcontrol: Add the missing numa_stat interface for cgroup v2

2020-09-13 Thread Muchun Song
On Mon, Sep 14, 2020 at 11:19 AM Zefan Li  wrote:
>
> On 2020/9/14 11:10, Muchun Song wrote:
> > On Mon, Sep 14, 2020 at 1:09 AM Chris Down  wrote:
> >>
> >> Muchun Song writes:
> >>> In the cgroup v1, we have a numa_stat interface. This is useful for
> >>> providing visibility into the numa locality information within an
> >>> memcg since the pages are allowed to be allocated from any physical
> >>> node. One of the use cases is evaluating application performance by
> >>> combining this information with the application's CPU allocation.
> >>> But the cgroup v2 does not. So this patch adds the missing information.
> >>>
> >>> Signed-off-by: Muchun Song 
> >>> Suggested-by: Shakeel Butt 
> >>> Reported-by: kernel test robot 
> >>
> >> This is a feature patch, why does this have LKP's Reported-by?
> >
> > In the v2 version, the kernel test robot reported a compiler error
> > on the powerpc architecture. So I added that. Thanks.
> >
>
> You should remove this reported-by, and then add v2->v3 changelog:

Got it. I see Andrew has done it for me, thank him very much for
his work. He also added this patch to the -mm tree.

>
> ...original changelog...
>
> v3:
> - fixed something reported by test robot

I already added that in the changelog. Thanks.


--
Yours,
Muchun


Re: [PATCH 1/2] ubifs: xattr: Fix some potential memory leaks while iterating entries

2020-09-13 Thread Zhihao Cheng

On 2020/9/14 3:08, Richard Weinberger wrote:

On Mon, Jun 1, 2020 at 11:11 AM Zhihao Cheng  wrote:





I agree that this needs fixing. Did you also look into getting rid of pxent?
UBIFS uses the pxent pattern over and over and the same error got copy pasted
a lot. :-(

I thought about it. I'm not sure whether it is good to drop 'pxent'.
Maybe you mean a change like the following (taking
ubifs_jnl_write_inode() as an example):


diff --git a/fs/ubifs/journal.c b/fs/ubifs/journal.c
index 4a5b06f8d812..fcd5ac047b34 100644
--- a/fs/ubifs/journal.c
+++ b/fs/ubifs/journal.c
@@ -879,13 +879,19 @@ int ubifs_jnl_write_inode(struct ubifs_info *c, 
const struct inode *inode)

union ubifs_key key;
struct fscrypt_name nm = {0};
struct inode *xino;
-   struct ubifs_dent_node *xent, *pxent = NULL;
+   struct ubifs_dent_node *xent;

if (ui->xattr_cnt >= ubifs_xattr_max_cnt(c)) {
ubifs_err(c, "Cannot delete inode, it has too much 
xattrs!");
goto out_release;
}

+   fname_name(&nm) = kmalloc(UBIFS_MAX_NLEN, GFP_NOFS);
+   if (!fname_name(&nm)) {
+   err = -ENOMEM;
+   goto out_release;
+   }
+
lowest_xent_key(c, &key, inode->i_ino);
while (1) {
xent = ubifs_tnc_next_ent(c, &key, &nm);
@@ -894,11 +900,12 @@ int ubifs_jnl_write_inode(struct ubifs_info *c, 
const struct inode *inode)

if (err == -ENOENT)
break;

+   kfree(fname_name(&nm));
goto out_release;
}

-   fname_name(&nm) = xent->name;
fname_len(&nm) = le16_to_cpu(xent->nlen);
+   strncpy(fname_name(&nm), xent->name, fname_len(&nm));

xino = ubifs_iget(c->vfs_sb, le64_to_cpu(xent->inum));
if (IS_ERR(xino)) {
@@ -907,6 +914,7 @@ int ubifs_jnl_write_inode(struct ubifs_info *c, 
const struct inode *inode)

  xent->name, err);
ubifs_ro_mode(c, err);
kfree(xent);
+   kfree(fname_name(&nm));
goto out_release;
}
ubifs_assert(c, ubifs_inode(xino)->xattr);
@@ -916,11 +924,10 @@ int ubifs_jnl_write_inode(struct ubifs_info *c, 
const struct inode *inode)

ino = (void *)ino + UBIFS_INO_NODE_SZ;
iput(xino);

-   kfree(pxent);
-   pxent = xent;
key_read(c, &xent->key, &key);
+   kfree(xent);
}
-   kfree(pxent);
+   kfree(fname_name(&nm));
}

pack_inode(c, ino, inode, 1);

The kill_xattrs process is more intuitive without pxent. However,
the buffer that holds the copy of xent->name still has to be released
on every exit path, much like pxent. If you think this is better than
v1, I will send a v2.




linux-next: build warning after merge of the tip tree

2020-09-13 Thread Stephen Rothwell
Hi all,

After merging the tip tree, today's linux-next build (x86_64 allmodconfig)
produced this warning:

x86_64-linux-gnu-ld: warning: orphan section `.ctors.65435' from 
`kernel/trace/trace_selftest_dynamic.o' being placed in section `.ctors.65435'
x86_64-linux-gnu-ld: warning: orphan section `.ctors.65435' from 
`kernel/trace/trace_kprobe_selftest.o' being placed in section `.ctors.65435'
x86_64-linux-gnu-ld: warning: orphan section `.ctors.65435' from 
`kernel/trace/trace_clock.o' being placed in section `.ctors.65435'
x86_64-linux-gnu-ld: warning: orphan section `.ctors.65435' from 
`kernel/trace/ftrace.o' being placed in section `.ctors.65435'
x86_64-linux-gnu-ld: warning: orphan section `.ctors.65435' from 
`kernel/trace/ring_buffer.o' being placed in section `.ctors.65435'
x86_64-linux-gnu-ld: warning: orphan section `.ctors.65435' from 
`kernel/trace/trace.o' being placed in section `.ctors.65435'
x86_64-linux-gnu-ld: warning: orphan section `.ctors.65435' from 
`kernel/trace/trace_output.o' being placed in section `.ctors.65435'
x86_64-linux-gnu-ld: warning: orphan section `.ctors.65435' from 
`kernel/trace/trace_seq.o' being placed in section `.ctors.65435'
x86_64-linux-gnu-ld: warning: orphan section `.ctors.65435' from 
`kernel/trace/trace_stat.o' being placed in section `.ctors.65435'
x86_64-linux-gnu-ld: warning: orphan section `.ctors.65435' from 
`kernel/trace/trace_printk.o' being placed in section `.ctors.65435'
x86_64-linux-gnu-ld: warning: orphan section `.ctors.65435' from 
`kernel/trace/tracing_map.o' being placed in section `.ctors.65435'
x86_64-linux-gnu-ld: warning: orphan section `.ctors.65435' from 
`kernel/trace/trace_sched_switch.o' being placed in section `.ctors.65435'
x86_64-linux-gnu-ld: warning: orphan section `.ctors.65435' from 
`kernel/trace/trace_functions.o' being placed in section `.ctors.65435'
x86_64-linux-gnu-ld: warning: orphan section `.ctors.65435' from 
`kernel/trace/trace_preemptirq.o' being placed in section `.ctors.65435'
x86_64-linux-gnu-ld: warning: orphan section `.ctors.65435' from 
`kernel/trace/trace_irqsoff.o' being placed in section `.ctors.65435'
x86_64-linux-gnu-ld: warning: orphan section `.ctors.65435' from 
`kernel/trace/trace_sched_wakeup.o' being placed in section `.ctors.65435'
x86_64-linux-gnu-ld: warning: orphan section `.ctors.65435' from 
`kernel/trace/trace_hwlat.o' being placed in section `.ctors.65435'
x86_64-linux-gnu-ld: warning: orphan section `.ctors.65435' from 
`kernel/trace/trace_nop.o' being placed in section `.ctors.65435'
x86_64-linux-gnu-ld: warning: orphan section `.ctors.65435' from 
`kernel/trace/trace_stack.o' being placed in section `.ctors.65435'
x86_64-linux-gnu-ld: warning: orphan section `.ctors.65435' from 
`kernel/trace/trace_mmiotrace.o' being placed in section `.ctors.65435'
x86_64-linux-gnu-ld: warning: orphan section `.ctors.65435' from 
`kernel/trace/trace_functions_graph.o' being placed in section `.ctors.65435'
x86_64-linux-gnu-ld: warning: orphan section `.ctors.65435' from 
`kernel/trace/blktrace.o' being placed in section `.ctors.65435'
x86_64-linux-gnu-ld: warning: orphan section `.ctors.65435' from 
`kernel/trace/fgraph.o' being placed in section `.ctors.65435'
x86_64-linux-gnu-ld: warning: orphan section `.ctors.65435' from 
`kernel/trace/trace_events.o' being placed in section `.ctors.65435'
x86_64-linux-gnu-ld: warning: orphan section `.ctors.65435' from 
`kernel/trace/trace_export.o' being placed in section `.ctors.65435'
x86_64-linux-gnu-ld: warning: orphan section `.ctors.65435' from 
`kernel/trace/trace_syscalls.o' being placed in section `.ctors.65435'
x86_64-linux-gnu-ld: warning: orphan section `.ctors.65435' from 
`kernel/trace/trace_event_perf.o' being placed in section `.ctors.65435'
x86_64-linux-gnu-ld: warning: orphan section `.ctors.65435' from 
`kernel/trace/trace_events_filter.o' being placed in section `.ctors.65435'
x86_64-linux-gnu-ld: warning: orphan section `.ctors.65435' from 
`kernel/trace/trace_events_trigger.o' being placed in section `.ctors.65435'
x86_64-linux-gnu-ld: warning: orphan section `.ctors.65435' from 
`kernel/trace/trace_events_inject.o' being placed in section `.ctors.65435'
x86_64-linux-gnu-ld: warning: orphan section `.ctors.65435' from 
`kernel/trace/trace_events_synth.o' being placed in section `.ctors.65435'
x86_64-linux-gnu-ld: warning: orphan section `.ctors.65435' from 
`kernel/trace/trace_events_hist.o' being placed in section `.ctors.65435'
x86_64-linux-gnu-ld: warning: orphan section `.ctors.65435' from 
`kernel/trace/bpf_trace.o' being placed in section `.ctors.65435'
x86_64-linux-gnu-ld: warning: orphan section `.ctors.65435' from 
`kernel/trace/trace_kprobe.o' being placed in section `.ctors.65435'
x86_64-linux-gnu-ld: warning: orphan section `.ctors.65435' from 
`kernel/trace/power-traces.o' being placed in section `.ctors.65435'
x86_64-linux-gnu-ld: warning: orphan section `.ctors.65435' from 
`kernel/trace/rpm-traces.o' being placed in 


Re: [External] Re: [PATCH v3] mm: memcontrol: Add the missing numa_stat interface for cgroup v2

2020-09-13 Thread Zefan Li
On 2020/9/14 11:10, Muchun Song wrote:
> On Mon, Sep 14, 2020 at 1:09 AM Chris Down  wrote:
>>
>> Muchun Song writes:
>>> In the cgroup v1, we have a numa_stat interface. This is useful for
>>> providing visibility into the numa locality information within an
>>> memcg since the pages are allowed to be allocated from any physical
>>> node. One of the use cases is evaluating application performance by
>>> combining this information with the application's CPU allocation.
>>> But the cgroup v2 does not. So this patch adds the missing information.
>>>
>>> Signed-off-by: Muchun Song 
>>> Suggested-by: Shakeel Butt 
>>> Reported-by: kernel test robot 
>>
>> This is a feature patch, why does this have LKP's Reported-by?
> 
> In the v2 version, the kernel test robot reported a compiler error
> on the powerpc architecture. So I added that. Thanks.
> 

You should remove this reported-by, and then add v2->v3 changelog:

...original changelog...

v3:
- fixed something reported by test robot


drivers/tee/optee/shm_pool.c:34:28-34: ERROR: application of sizeof to pointer

2020-09-13 Thread kernel test robot
tree:   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 
master
head:   856deb866d16e29bd65952e0289066f6078af773
commit: 5a769f6ff439cedc547395a6dc78faa26108f741 optee: Fix multi page dynamic 
shm pool alloc
date:   9 months ago
config: arm64-randconfig-c004-20200913 (attached as .config)
compiler: aarch64-linux-gcc (GCC) 9.3.0

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot 


coccinelle warnings: (new ones prefixed by >>)

>> drivers/tee/optee/shm_pool.c:34:28-34: ERROR: application of sizeof to 
>> pointer

Please review and possibly fold the followup patch.
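
For context, this class of coccinelle warning concerns sizeof applied to a pointer variable where the size of the pointed-to object may have been intended. The sketch below is a generic illustration only and does not reproduce the flagged line in shm_pool.c; struct sketch_item and the function names are hypothetical. Note that taking the size of a pointer is correct when the array really holds pointers, which is why such reports ask for review rather than assert a bug.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical element type, deliberately larger than a pointer. */
struct sketch_item {
    char payload[64];
};

/* sizeof applied to the pointer: yields the size of the pointer itself. */
size_t size_of_pointer(void)
{
    struct sketch_item *p = NULL;

    return sizeof(p);
}

/* sizeof applied to the dereferenced pointer: yields the element size.
 * The operand is not evaluated, so p being NULL is harmless here. */
size_t size_of_element(void)
{
    struct sketch_item *p = NULL;

    return sizeof(*p);
}

/* Allocation idiom that keeps the element size tied to the variable. */
struct sketch_item *alloc_items(size_t n)
{
    struct sketch_item *items = calloc(n, sizeof(*items));

    return items;
}
```

Writing sizeof(*items) rather than a type name or the bare pointer keeps the allocation correct even if the element type changes later.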

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org




Re: [External] Re: [PATCH v3] mm: memcontrol: Add the missing numa_stat interface for cgroup v2

2020-09-13 Thread Muchun Song
On Mon, Sep 14, 2020 at 1:09 AM Chris Down  wrote:
>
> Muchun Song writes:
> >In the cgroup v1, we have a numa_stat interface. This is useful for
> >providing visibility into the numa locality information within an
> >memcg since the pages are allowed to be allocated from any physical
> >node. One of the use cases is evaluating application performance by
> >combining this information with the application's CPU allocation.
> >But the cgroup v2 does not. So this patch adds the missing information.
> >
> >Signed-off-by: Muchun Song 
> >Suggested-by: Shakeel Butt 
> >Reported-by: kernel test robot 
>
> This is a feature patch, why does this have LKP's Reported-by?

In the v2 version, the kernel test robot reported a compiler error
on the powerpc architecture. So I added that. Thanks.

-- 
Yours,
Muchun


Re: [PATCH v4 2/5] iommu: Add iommu_at(de)tach_subdev_group()

2020-09-13 Thread Lu Baolu

Hi Alex,

On 9/11/20 6:05 AM, Alex Williamson wrote:

On Tue,  1 Sep 2020 11:34:19 +0800
Lu Baolu  wrote:


This adds two new APIs for the use cases like vfio/mdev where subdevices
derived from physical devices are created and put in an iommu_group. The
new IOMMU API interfaces mimic the vfio_mdev_at(de)tach_domain() directly,
testing whether the resulting device supports IOMMU_DEV_FEAT_AUX and using
an aux vs non-aux at(de)tach.

By doing this we could

- Set the iommu_group.domain. The iommu_group.domain is private to iommu
   core (therefore vfio code cannot set it), but we need it set in order
   for iommu_get_domain_for_dev() to work with a group attached to an aux
   domain.

- Prefer to use the _attach_group() interfaces while the _attach_device()
   interfaces are relegated to special cases.

Link: https://lore.kernel.org/linux-iommu/20200730134658.44c57...@x1.home/
Link: https://lore.kernel.org/linux-iommu/20200730151703.5daf8...@x1.home/
Signed-off-by: Lu Baolu 
---
  drivers/iommu/iommu.c | 136 ++
  include/linux/iommu.h |  20 +++
  2 files changed, 156 insertions(+)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 38cdfeb887e1..fb21c2ff4861 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -2757,6 +2757,142 @@ int iommu_aux_get_pasid(struct iommu_domain *domain, 
struct device *dev)
  }
  EXPORT_SYMBOL_GPL(iommu_aux_get_pasid);
  
+static int __iommu_aux_attach_device(struct iommu_domain *domain,

+struct device *phys_dev,
+struct device *sub_dev)
+{
+   int ret;
+
+   if (unlikely(!domain->ops->aux_attach_dev))
+   return -ENODEV;
+
+   ret = domain->ops->aux_attach_dev(domain, phys_dev, sub_dev);
+   if (!ret)
+   trace_attach_device_to_domain(sub_dev);
+
+   return ret;
+}
+
+static void __iommu_aux_detach_device(struct iommu_domain *domain,
+ struct device *phys_dev,
+ struct device *sub_dev)
+{
+   if (unlikely(!domain->ops->aux_detach_dev))
+   return;
+
+   domain->ops->aux_detach_dev(domain, phys_dev, sub_dev);
+   trace_detach_device_from_domain(sub_dev);
+}
+
+static int __iommu_attach_subdev_group(struct iommu_domain *domain,
+  struct iommu_group *group,
+  iommu_device_lookup_t fn)
+{
+   struct group_device *device;
+   struct device *phys_dev;
+   int ret = -ENODEV;
+
+   list_for_each_entry(device, &group->devices, list) {
+   phys_dev = fn(device->dev);
+   if (!phys_dev) {
+   ret = -ENODEV;
+   break;
+   }
+
+   if (iommu_dev_feature_enabled(phys_dev, IOMMU_DEV_FEAT_AUX))
+   ret = __iommu_aux_attach_device(domain, phys_dev,
+   device->dev);
+   else
+   ret = __iommu_attach_device(domain, phys_dev);
+
+   if (ret)
+   break;
+   }
+
+   return ret;
+}
+
+static void __iommu_detach_subdev_group(struct iommu_domain *domain,
+   struct iommu_group *group,
+   iommu_device_lookup_t fn)
+{
+   struct group_device *device;
+   struct device *phys_dev;
+
+   list_for_each_entry(device, &group->devices, list) {
+   phys_dev = fn(device->dev);
+   if (!phys_dev)
+   break;



Seems like this should be a continue rather than a break.  On the
unwind path maybe we're relying on holding the group mutex and
deterministic behavior from the fn() callback to unwind to the same
point, but we still have an entirely separate detach interface and I'm
not sure we couldn't end up with an inconsistent state if we don't
iterate each group device here.  Thanks,


Yes, agreed.

Best regards,
baolu



Alex


+
+   if (iommu_dev_feature_enabled(phys_dev, IOMMU_DEV_FEAT_AUX))
+   __iommu_aux_detach_device(domain, phys_dev, 
device->dev);
+   else
+   __iommu_detach_device(domain, phys_dev);
+   }
+}
+
+/**
+ * iommu_attach_subdev_group - attach domain to an iommu_group which
+ *contains subdevices.
+ *
+ * @domain: domain
+ * @group:  iommu_group which contains subdevices
+ * @fn: callback for each subdevice in the @iommu_group to retrieve the
+ *  physical device where the subdevice was created from.
+ *
+ * Returns 0 on success, or an error value.
+ */
+int iommu_attach_subdev_group(struct iommu_domain *domain,
+ struct iommu_group *group,
+ iommu_device_lookup_t fn)
+{
+   int ret = -ENODEV;
+
+   mutex_lock(&group->mutex);
+  
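
The review point in this thread, that the detach loop should use continue rather than break when a lookup fails, can be illustrated with a small user-space sketch. It assumes a plain array in place of the group's device list and a has_parent flag in place of the fn() lookup; struct sketch_dev and detach_all() are hypothetical names, not IOMMU core code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a group device. */
struct sketch_dev {
    int has_parent;  /* models whether the lookup callback finds a device */
    int attached;    /* models whether the device is attached to the domain */
};

/* Detach every device whose parent can be resolved.
 * skip_with_continue selects the behaviour under discussion:
 *   0: stop at the first failed lookup (break)
 *   1: skip only the failing entry (continue)
 * Returns the number of devices detached. */
size_t detach_all(struct sketch_dev *devs, size_t n, int skip_with_continue)
{
    size_t detached = 0;

    assert(devs != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (!devs[i].has_parent) {
            if (skip_with_continue)
                continue;  /* later entries are still visited */
            break;         /* later entries are silently left attached */
        }
        if (devs[i].attached) {
            devs[i].attached = 0;
            detached++;
        }
    }
    return detached;
}
```

With break, one failed lookup in the middle of the list leaves every later device attached, which is the inconsistent state the review comment warns about.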

[PATCH V5 01/17] dt-bindings: soc: Add dvfsrc driver bindings

2020-09-13 Thread Henry Chen
Document the binding for enabling dvfsrc on MediaTek SoC.

Signed-off-by: Henry Chen 
Reviewed-by: Rob Herring 
---
 .../devicetree/bindings/soc/mediatek/dvfsrc.txt| 25 ++
 include/dt-bindings/soc/mtk,dvfsrc.h   | 14 
 2 files changed, 39 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/soc/mediatek/dvfsrc.txt
 create mode 100644 include/dt-bindings/soc/mtk,dvfsrc.h

diff --git a/Documentation/devicetree/bindings/soc/mediatek/dvfsrc.txt 
b/Documentation/devicetree/bindings/soc/mediatek/dvfsrc.txt
new file mode 100644
index 000..d5a47d8
--- /dev/null
+++ b/Documentation/devicetree/bindings/soc/mediatek/dvfsrc.txt
@@ -0,0 +1,25 @@
+MediaTek DVFSRC
+
+The Dynamic Voltage and Frequency Scaling Resource Collector (DVFSRC) is a
+HW module which is used to collect all the requests from both software and
+hardware and turn into the decision of minimum operating voltage and minimum
+DRAM frequency to fulfill those requests.
+
+Required Properties:
+- compatible: Should be one of the following
+   - "mediatek,mt6873-dvfsrc": For MT6873 SoC
+   - "mediatek,mt8183-dvfsrc": For MT8183 SoC
+   - "mediatek,mt8192-dvfsrc": For MT8192 SoC
+- reg: Address range of the DVFSRC unit
+- clock-names: Must include the following entries:
+   "dvfsrc": DVFSRC module clock
+- clocks: Must contain an entry for each entry in clock-names.
+
+Example:
+
+   dvfsrc@10012000 {
+   compatible = "mediatek,mt8183-dvfsrc";
+   reg = <0 0x10012000 0 0x1000>;
+   clocks = < CLK_INFRA_DVFSRC>;
+   clock-names = "dvfsrc";
+   };
diff --git a/include/dt-bindings/soc/mtk,dvfsrc.h 
b/include/dt-bindings/soc/mtk,dvfsrc.h
new file mode 100644
index 000..a522488
--- /dev/null
+++ b/include/dt-bindings/soc/mtk,dvfsrc.h
@@ -0,0 +1,14 @@
+/* SPDX-License-Identifier: GPL-2.0
+ *
+ * Copyright (c) 2018 MediaTek Inc.
+ */
+
+#ifndef _DT_BINDINGS_POWER_MTK_DVFSRC_H
+#define _DT_BINDINGS_POWER_MTK_DVFSRC_H
+
+#define MT8183_DVFSRC_LEVEL_1  1
+#define MT8183_DVFSRC_LEVEL_2  2
+#define MT8183_DVFSRC_LEVEL_3  3
+#define MT8183_DVFSRC_LEVEL_4  4
+
+#endif /* _DT_BINDINGS_POWER_MTK_DVFSRC_H */
-- 
1.9.1


[PATCH V5 05/17] soc: mediatek: add header for mediatek SIP interface

2020-09-13 Thread Henry Chen
From: Arvin Wang 

Add a header to collect SIPs and add one SIP call to initialize power
management hardware for the SIP interface defined to access the SPM
handling vcore voltage and ddr rate changes on mt8183 (and most likely
later socs).

Signed-off-by: Arvin Wang 
---
 include/linux/soc/mediatek/mtk_sip_svc.h | 4 
 1 file changed, 4 insertions(+)

diff --git a/include/linux/soc/mediatek/mtk_sip_svc.h 
b/include/linux/soc/mediatek/mtk_sip_svc.h
index 082398e..079bbcb 100644
--- a/include/linux/soc/mediatek/mtk_sip_svc.h
+++ b/include/linux/soc/mediatek/mtk_sip_svc.h
@@ -22,4 +22,8 @@
ARM_SMCCC_CALL_VAL(ARM_SMCCC_FAST_CALL, MTK_SIP_SMC_CONVENTION, \
   ARM_SMCCC_OWNER_SIP, fn_id)
 
+/* VCOREFS */
+#define MTK_SIP_VCOREFS_CONTROL \
+   MTK_SIP_SMC_CMD(0x506)
+
 #endif
-- 
1.9.1


[PATCH V5 13/17] arm64: dts: mt8192: add dvfsrc related nodes

2020-09-13 Thread Henry Chen
Add the DDR EMI provider that dictates DRAM interconnect bus performance on
MT8192-based platforms.

Signed-off-by: Henry Chen 
---
 arch/arm64/boot/dts/mediatek/mt8192.dtsi | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/arm64/boot/dts/mediatek/mt8192.dtsi 
b/arch/arm64/boot/dts/mediatek/mt8192.dtsi
index 1eae441..647c57a 100644
--- a/arch/arm64/boot/dts/mediatek/mt8192.dtsi
+++ b/arch/arm64/boot/dts/mediatek/mt8192.dtsi
@@ -10,6 +10,7 @@
 #include 
 #include 
 #include 
+#include 
 
 / {
compatible = "mediatek,mt8192";
@@ -420,6 +421,7 @@
ddr_emi: dvfsrc@10012000 {
compatible = "mediatek,mt8192-dvfsrc";
reg = <0 0x10012000 0 0x1000>;
+   #interconnect-cells = <1>;
};
 
systimer: timer@10017000 {
-- 
1.9.1


[PATCH V5 02/17] dt-bindings: soc: Add opp table on scpsys bindings

2020-09-13 Thread Henry Chen
Add opp table on scpsys dt-bindings for Mediatek SoC.

Signed-off-by: Henry Chen 
Reviewed-by: Rob Herring 
---
 .../devicetree/bindings/soc/mediatek/scpsys.txt| 38 ++
 1 file changed, 38 insertions(+)

diff --git a/Documentation/devicetree/bindings/soc/mediatek/scpsys.txt 
b/Documentation/devicetree/bindings/soc/mediatek/scpsys.txt
index 7f322f9..4b96fdc 100644
--- a/Documentation/devicetree/bindings/soc/mediatek/scpsys.txt
+++ b/Documentation/devicetree/bindings/soc/mediatek/scpsys.txt
@@ -90,6 +90,27 @@ Example:
 < CLK_TOP_VENC_SEL>,
 < CLK_TOP_VENC_LT_SEL>;
clock-names = "mfg", "mm", "venc", "venc_lt";
+   operating-points-v2 = <&dvfsrc_opp_table>;
+
+   dvfsrc_opp_table: opp-table {
+   compatible = "operating-points-v2-level";
+
+   dvfsrc_vol_min: opp1 {
+   opp,level = ;
+   };
+
+   dvfsrc_freq_medium: opp2 {
+   opp,level = ;
+   };
+
+   dvfsrc_freq_max: opp3 {
+   opp,level = ;
+   };
+
+   dvfsrc_vol_max: opp4 {
+   opp,level = ;
+   };
+   };
};
 
 Example(power domain sub node within power controller):
@@ -151,4 +172,21 @@ Example consumer:
afe: mt8173-afe-pcm@1122 {
compatible = "mediatek,mt8173-afe-pcm";
power-domains = < MT8173_POWER_DOMAIN_AUDIO>;
+   operating-points-v2 = <&aud_opp_table>;
+   };
+
+   aud_opp_table: aud-opp-table {
+   compatible = "operating-points-v2";
+   opp1 {
+   opp-hz = /bits/ 64 <79300>;
+   required-opps = <&dvfsrc_vol_min>;
+   };
+   opp2 {
+   opp-hz = /bits/ 64 <91000>;
+   required-opps = <&dvfsrc_vol_max>;
+   };
+   opp3 {
+   opp-hz = /bits/ 64 <101400>;
+   required-opps = <&dvfsrc_vol_max>;
+   };
};
-- 
1.9.1


[PATCH V5 11/17] interconnect: mediatek: Add interconnect provider driver

2020-09-13 Thread Henry Chen
Introduce Mediatek MT6873/MT8183/MT8192 specific provider driver
using the interconnect framework.

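             ICC provider         ICC Nodes
                              ----          ----
             ---------       |CPU |   |--- |VPU |
    -----   |         |-----  ----    |     ----
   |DRAM |--|DRAM     |       ----    |     ----
   |     |--|scheduler|----- |GPU |   |--- |DISP|
   |     |--|(EMI)    |       ----    |     ----
   |     |--|         |       -----   |     ----
    -----   |         |----- |MMSYS|--|--- |VDEC|
             ---------        -----   |     ----
               /|\                    |     ----
                |change DRAM freq     |--- |VENC|
             ----------               |     ----
            |  DVFSR   |              |
            |          |              |     ----
             ----------               |--- |IMG |
                                      |     ----
                                      |     ----
                                      |--- |CAM |
                                            ----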

Signed-off-by: Henry Chen 
---
 drivers/interconnect/Kconfig|   1 +
 drivers/interconnect/Makefile   |   1 +
 drivers/interconnect/mediatek/Kconfig   |  13 ++
 drivers/interconnect/mediatek/Makefile  |   3 +
 drivers/interconnect/mediatek/mtk-emi.c | 330 
 5 files changed, 348 insertions(+)
 create mode 100644 drivers/interconnect/mediatek/Kconfig
 create mode 100644 drivers/interconnect/mediatek/Makefile
 create mode 100644 drivers/interconnect/mediatek/mtk-emi.c

diff --git a/drivers/interconnect/Kconfig b/drivers/interconnect/Kconfig
index 5b7204e..e939f5a 100644
--- a/drivers/interconnect/Kconfig
+++ b/drivers/interconnect/Kconfig
@@ -13,5 +13,6 @@ if INTERCONNECT
 
 source "drivers/interconnect/imx/Kconfig"
 source "drivers/interconnect/qcom/Kconfig"
+source "drivers/interconnect/mediatek/Kconfig"
 
 endif
diff --git a/drivers/interconnect/Makefile b/drivers/interconnect/Makefile
index 4825c28..6f4b88a 100644
--- a/drivers/interconnect/Makefile
+++ b/drivers/interconnect/Makefile
@@ -6,3 +6,4 @@ icc-core-objs   := core.o
 obj-$(CONFIG_INTERCONNECT) += icc-core.o
 obj-$(CONFIG_INTERCONNECT_IMX) += imx/
 obj-$(CONFIG_INTERCONNECT_QCOM)+= qcom/
+obj-$(CONFIG_INTERCONNECT_MTK) += mediatek/
diff --git a/drivers/interconnect/mediatek/Kconfig 
b/drivers/interconnect/mediatek/Kconfig
new file mode 100644
index 000..972d3bb
--- /dev/null
+++ b/drivers/interconnect/mediatek/Kconfig
@@ -0,0 +1,13 @@
+config INTERCONNECT_MTK
+   bool "Mediatek Network-on-Chip interconnect drivers"
+   depends on ARCH_MEDIATEK
+   help
+ Support for Mediatek's Network-on-Chip interconnect hardware.
+
+config INTERCONNECT_MTK_EMI
+   tristate "Mediatek EMI interconnect driver"
+   depends on INTERCONNECT_MTK
+   depends on (MTK_DVFSRC && OF)
+   help
+ This is a driver for the Mediatek Network-on-Chip on DVFSRC-based
+ platforms.
diff --git a/drivers/interconnect/mediatek/Makefile 
b/drivers/interconnect/mediatek/Makefile
new file mode 100644
index 000..353842b
--- /dev/null
+++ b/drivers/interconnect/mediatek/Makefile
@@ -0,0 +1,3 @@
+# SPDX-License-Identifier: GPL-2.0
+
+obj-$(CONFIG_INTERCONNECT_MTK_EMI) += mtk-emi.o
\ No newline at end of file
diff --git a/drivers/interconnect/mediatek/mtk-emi.c 
b/drivers/interconnect/mediatek/mtk-emi.c
new file mode 100644
index 000..9670077
--- /dev/null
+++ b/drivers/interconnect/mediatek/mtk-emi.c
@@ -0,0 +1,330 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (c) 2020, The Linux Foundation. All rights reserved.
+ *
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+enum mtk_icc_name {
+   SLAVE_DDR_EMI,
+   MASTER_MCUSYS,
+   MASTER_GPUSYS,
+   MASTER_MMSYS,
+   MASTER_MM_VPU,
+   MASTER_MM_DISP,
+   MASTER_MM_VDEC,
+   MASTER_MM_VENC,
+   MASTER_MM_CAM,
+   MASTER_MM_IMG,
+   MASTER_MM_MDP,
+   MASTER_VPUSYS,
+   MASTER_VPU_PORT_0,
+   MASTER_VPU_PORT_1,
+   MASTER_MDLASYS,
+   MASTER_MDLA_PORT_0,
+   MASTER_UFS,
+   MASTER_PCIE,
+   MASTER_USB,
+   MASTER_WIFI,
+   MASTER_BT,
+   MASTER_NETSYS,
+   MASTER_DBGIF,
+
+   SLAVE_HRT_DDR_EMI,
+   MASTER_HRT_MMSYS,
+   MASTER_HRT_MM_DISP,
+   MASTER_HRT_MM_VDEC,
+   MASTER_HRT_MM_VENC,
+   MASTER_HRT_MM_CAM,
+   MASTER_HRT_MM_IMG,
+   MASTER_HRT_MM_MDP,
+   MASTER_HRT_DBGIF,
+};
+
+#define MT8183_MAX_LINKS   1
+
+/**
+ * struct mtk_icc_node - Mediatek specific interconnect nodes
+ * @name: the node name used in debugfs
+ * @ep : the type of this endpoint
+ * @id: a unique node identifier
+ * @links: an array of nodes where we can go next while traversing
+ * @num_links: the total number of @links
+ * @sum_avg: current sum 

[PATCH V5 04/17] arm64: dts: mt8183: add performance state support of scpsys

2020-09-13 Thread Henry Chen
Add support for performance state of scpsys on mt8183 platform

Signed-off-by: Henry Chen 
---
 arch/arm64/boot/dts/mediatek/mt8183.dtsi | 22 ++
 1 file changed, 22 insertions(+)

diff --git a/arch/arm64/boot/dts/mediatek/mt8183.dtsi b/arch/arm64/boot/dts/mediatek/mt8183.dtsi
index d85bae7..82ca929 100644
--- a/arch/arm64/boot/dts/mediatek/mt8183.dtsi
+++ b/arch/arm64/boot/dts/mediatek/mt8183.dtsi
@@ -12,6 +12,7 @@
 #include 
 #include 
 #include "mt8183-pinfunc.h"
+#include 
 
 / {
 	compatible = "mediatek,mt8183";
@@ -340,6 +341,27 @@
 			#address-cells = <1>;
 			#size-cells = <0>;
 
+			operating-points-v2 = <_opp_table>;
+			dvfsrc_opp_table: opp-table {
+				compatible = "operating-points-v2-level";
+
+				dvfsrc_vol_min: opp1 {
+					opp,level = ;
+				};
+
+				dvfsrc_freq_medium: opp2 {
+					opp,level = ;
+				};
+
+				dvfsrc_freq_max: opp3 {
+					opp,level = ;
+				};
+
+				dvfsrc_vol_max: opp4 {
+					opp,level = ;
+				};
+			};
+
 			audio@MT8183_POWER_DOMAIN_AUDIO {
 				reg = ;
 			};
-- 
1.9.1


[PATCH V5 12/17] arm64: dts: mt8183: add dvfsrc related nodes

2020-09-13 Thread Henry Chen
Add DDR EMI provider dictating dram interconnect bus performance found on
MT8192-based platforms

Signed-off-by: Henry Chen 
---
 arch/arm64/boot/dts/mediatek/mt8183.dtsi | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/arm64/boot/dts/mediatek/mt8183.dtsi b/arch/arm64/boot/dts/mediatek/mt8183.dtsi
index 4046603..63a4decd 100644
--- a/arch/arm64/boot/dts/mediatek/mt8183.dtsi
+++ b/arch/arm64/boot/dts/mediatek/mt8183.dtsi
@@ -13,6 +13,7 @@
 #include 
 #include "mt8183-pinfunc.h"
 #include 
+#include 
 
 / {
 	compatible = "mediatek,mt8183";
@@ -472,6 +473,7 @@
 		ddr_emi: dvfsrc@10012000 {
 			compatible = "mediatek,mt8183-dvfsrc";
 			reg = <0 0x10012000 0 0x1000>;
+			#interconnect-cells = <1>;
 		};
 
 		pwrap: pwrap@1000d000 {
-- 
1.9.1


[PATCH V5 06/17] soc: mediatek: add driver for dvfsrc support

2020-09-13 Thread Henry Chen
Add dvfsrc driver for MT6873/MT8183/MT8192

Signed-off-by: Henry Chen 
---
 drivers/soc/mediatek/Kconfig            |  12 +
 drivers/soc/mediatek/Makefile           |   1 +
 drivers/soc/mediatek/mtk-dvfsrc.c       | 618 
 include/linux/soc/mediatek/mtk_dvfsrc.h |  34 ++
 4 files changed, 665 insertions(+)
 create mode 100644 drivers/soc/mediatek/mtk-dvfsrc.c
 create mode 100644 include/linux/soc/mediatek/mtk_dvfsrc.h
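
Note for readers: the Kconfig help text in this patch describes the DVFSRC
as collecting requests from the system and resolving them into a minimum
Vcore voltage and a minimum DRAM frequency. The userspace sketch below
illustrates that idea only. It is not the driver's code: the table contents,
the demo_ names and the "first entry that satisfies every request" policy
are assumptions made for illustration; only the vcore_opp/dram_opp field
names are borrowed from struct dvfsrc_opp in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Mirrors the two fields of struct dvfsrc_opp from the patch. */
struct demo_opp {
	uint32_t vcore_opp;
	uint32_t dram_opp;
};

/* One requester's minimum needs (hypothetical type, not in the patch). */
struct demo_request {
	uint32_t min_vcore;
	uint32_t min_dram;
};

/* Hypothetical OPP table, ordered from lowest to highest operating point. */
static const struct demo_opp demo_table[] = {
	{ 0, 0 },
	{ 0, 1 },
	{ 1, 1 },
	{ 1, 2 },
};

/*
 * Fold all requests into one (vcore, dram) floor, then return the index of
 * the first table entry that meets both floors, or -1 if none does.
 */
static int demo_pick_opp(const struct demo_opp *table, size_t num_opp,
			 const struct demo_request *reqs, size_t num_req)
{
	uint32_t need_vcore = 0;
	uint32_t need_dram = 0;
	size_t i;

	assert(table != NULL);

	for (i = 0; i < num_req; i++) {
		if (reqs[i].min_vcore > need_vcore)
			need_vcore = reqs[i].min_vcore;
		if (reqs[i].min_dram > need_dram)
			need_dram = reqs[i].min_dram;
	}

	for (i = 0; i < num_opp; i++) {
		if (table[i].vcore_opp >= need_vcore &&
		    table[i].dram_opp >= need_dram)
			return (int)i;
	}

	return -1;
}
```

With the table above, one request for DRAM level 1 and another for Vcore
level 1 resolve to index 2, the lowest entry that covers both. How the real
hardware arbitrates is not described in this patch.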

diff --git a/drivers/soc/mediatek/Kconfig b/drivers/soc/mediatek/Kconfig
index 3f5e5cb..ac78c47 100644
--- a/drivers/soc/mediatek/Kconfig
+++ b/drivers/soc/mediatek/Kconfig
@@ -16,6 +16,18 @@ config MTK_CMDQ
  time limitation, such as updating display configuration during the
  vblank.
 
+config MTK_DVFSRC
+   tristate "MediaTek DVFSRC Support"
+   depends on ARCH_MEDIATEK
+   depends on MTK_SCPSYS
+   help
+ Say yes here to add support for the MediaTek DVFSRC (dynamic voltage
+ and frequency scaling resource collector) found on various MediaTek
+ SoCs. The DVFSRC is a proprietary hardware block that collects
+ requests from the system and turns them into a decision on the
+ minimum Vcore voltage and minimum DRAM frequency needed to fulfill
+ those requests.
+
 config MTK_PMIC_WRAP
tristate "MediaTek PMIC Wrapper Support"
depends on RESET_CONTROLLER
diff --git a/drivers/soc/mediatek/Makefile b/drivers/soc/mediatek/Makefile
index 2afa7b9..65e9597 100644
--- a/drivers/soc/mediatek/Makefile
+++ b/drivers/soc/mediatek/Makefile
@@ -1,5 +1,6 @@
 # SPDX-License-Identifier: GPL-2.0-only
 obj-$(CONFIG_MTK_CMDQ) += mtk-cmdq-helper.o
+obj-$(CONFIG_MTK_DVFSRC) += mtk-dvfsrc.o
 obj-$(CONFIG_MTK_PMIC_WRAP) += mtk-pmic-wrap.o
 obj-$(CONFIG_MTK_SCPSYS) += mtk-scpsys.o
 obj-$(CONFIG_MTK_MMSYS) += mtk-mmsys.o
diff --git a/drivers/soc/mediatek/mtk-dvfsrc.c b/drivers/soc/mediatek/mtk-dvfsrc.c
new file mode 100644
index 000..c539677
--- /dev/null
+++ b/drivers/soc/mediatek/mtk-dvfsrc.c
@@ -0,0 +1,618 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2018 MediaTek Inc.
+ */
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include "mtk-scpsys.h"
+
+#define DVFSRC_IDLE 0x00
+#define DVFSRC_GET_TARGET_LEVEL(x)  (((x) >> 0) & 0x)
+#define DVFSRC_GET_CURRENT_LEVEL(x) (((x) >> 16) & 0x)
+#define kbps_to_mbps(x) ((x) / 1000)
+
+#define MT8183_DVFSRC_OPP_LP4   0
+#define MT8183_DVFSRC_OPP_LP4X  1
+#define MT8183_DVFSRC_OPP_LP3   2
+
+#define POLL_TIMEOUT            1000
+#define STARTUP_TIME            1
+
+#define MTK_SIP_DVFSRC_INIT     0x00
+
+#define DVFSRC_OPP_DESC(_opp_table)            \
+{                                              \
+   .opps = _opp_table,                         \
+   .num_opp = ARRAY_SIZE(_opp_table),          \
+}
+
+struct dvfsrc_opp {
+   u32 vcore_opp;
+   u32 dram_opp;
+};
+
+struct dvfsrc_domain {
+   u32 id;
+   u32 state;
+};
+
+struct dvfsrc_opp_desc {
+   const struct dvfsrc_opp *opps;
+   u32 num_opp;
+};
+
+struct mtk_dvfsrc;
+struct dvfsrc_soc_data {
+   const int *regs;
+   u32 num_domains;
+   struct dvfsrc_domain *domains;
+   const struct dvfsrc_opp_desc *opps_desc;
+   int (*get_target_level)(struct mtk_dvfsrc *dvfsrc);
+   int (*get_current_level)(struct mtk_dvfsrc *dvfsrc);
+   u32 (*get_vcore_level)(struct mtk_dvfsrc *dvfsrc);
+   u32 (*get_vcp_level)(struct mtk_dvfsrc *dvfsrc);
+   void (*set_dram_bw)(struct mtk_dvfsrc *dvfsrc, u64 bw);
+   void (*set_dram_peak_bw)(struct mtk_dvfsrc *dvfsrc, u64 bw);
+   void (*set_dram_hrtbw)(struct mtk_dvfsrc *dvfsrc, u64 bw);
+   void (*set_opp_level)(struct mtk_dvfsrc *dvfsrc, u32 level);
+   void (*set_vcore_level)(struct mtk_dvfsrc *dvfsrc, u32 level);
+   void (*set_vscp_level)(struct mtk_dvfsrc *dvfsrc, u32 level);
+   int (*wait_for_opp_level)(struct mtk_dvfsrc *dvfsrc, u32 level);
+   int (*wait_for_vcore_level)(struct mtk_dvfsrc *dvfsrc, u32 level);
+};
+
+struct mtk_dvfsrc {
+   struct device *dev;
+   struct platform_device *icc;
+   struct platform_device *regulator;
+   const struct dvfsrc_soc_data *dvd;
+   int dram_type;
+   const struct dvfsrc_opp_desc *curr_opps;
+   void __iomem *regs;
+   spinlock_t req_lock;
+   struct mutex pstate_lock;
+   struct notifier_block scpsys_notifier;
+};
+
+static u32 dvfsrc_read(struct mtk_dvfsrc *dvfs, u32 offset)
+{
+   return readl(dvfs->regs + dvfs->dvd->regs[offset]);
+}
+
+static void dvfsrc_write(struct mtk_dvfsrc *dvfs, u32 offset, u32 val)
+{
+   writel(val, dvfs->regs + dvfs->dvd->regs[offset]);
+}
+
+#define dvfsrc_rmw(dvfs, offset, val, mask, shift) \
+   dvfsrc_write(dvfs, offset, \
+   (dvfsrc_read(dvfs, offset) & ~(mask << shift)) | (val << shift))
+
+enum dvfsrc_regs {
+   DVFSRC_SW_REQ,
+   DVFSRC_SW_REQ2,
+   DVFSRC_LEVEL,
+   
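
Note for readers: dvfsrc_read(), dvfsrc_write() and dvfsrc_rmw() above
access registers through a per-SoC offset table indexed by a logical
register enum. The userspace sketch below shows that pattern with a plain
array standing in for the ioremapped window. The demo_ names and the offsets
are hypothetical and chosen only for this example. Unlike the macro in the
patch, the helper here also masks val before shifting, which is a defensive
choice of this sketch and not something the patch does.

```c
#include <assert.h>
#include <stdint.h>

/* Logical register names; each SoC table maps them to byte offsets. */
enum demo_regs {
	DEMO_SW_REQ,
	DEMO_LEVEL,
	DEMO_NUM_REGS,
};

/* Hypothetical byte offsets, invented for this sketch. */
static const int demo_soc_regs[DEMO_NUM_REGS] = {
	[DEMO_SW_REQ] = 0x4,
	[DEMO_LEVEL]  = 0xC,
};

struct demo_dev {
	uint32_t mmio[4];	/* stand-in for the mapped register window */
	const int *regs;	/* per-SoC offset table */
};

static uint32_t demo_read(const struct demo_dev *d, unsigned int idx)
{
	return d->mmio[d->regs[idx] / 4];
}

static void demo_write(struct demo_dev *d, unsigned int idx, uint32_t val)
{
	d->mmio[d->regs[idx] / 4] = val;
}

/* Replace the field selected by (mask << shift); keep all other bits. */
static void demo_rmw(struct demo_dev *d, unsigned int idx, uint32_t val,
		     uint32_t mask, unsigned int shift)
{
	uint32_t old = demo_read(d, idx);

	demo_write(d, idx, (old & ~(mask << shift)) | ((val & mask) << shift));
}

/* Write all ones, then set the 2-bit field at bits 9:8 to 0x2. */
static uint32_t demo_example(void)
{
	struct demo_dev d = { .mmio = { 0 }, .regs = demo_soc_regs };

	demo_write(&d, DEMO_SW_REQ, 0xFFFFFFFFu);
	demo_rmw(&d, DEMO_SW_REQ, 0x2, 0x3, 8);

	/* The neighbouring register must be untouched. */
	assert(demo_read(&d, DEMO_LEVEL) == 0);

	return demo_read(&d, DEMO_SW_REQ);
}
```

demo_example() returns 0xFFFFFEFF: only bits 9:8 change, from 0b11 to 0b10.
Keeping the offsets in a table is what lets one set of accessors serve SoCs
whose register layouts differ.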
