[PATCH RFC] s390: convert to GENERIC_VDSO

2020-08-02 Thread Sven Schnelle
These two patches convert the s390 architecture to generic VDSO. The
first patch adds an option to add architecture specific information
to struct vdso_data. We need that information because the old s390
assembly code had a steering capability, which steered the clock slowly.
To emulate that behaviour we need to add the steering offset to struct
vdso_data.

This requirement results in the need for a seqlock kind of lock, which is
implemented open-coded in __arch_get_hw_counter(). It is open-coded because we
cannot include seqlock.h in userspace code (and using the normal seqlock
interface on kernel side might result in people changing struct seqlock,
but not changing the vdso userspace part), therefore both sides are
open-coded. I think in theory we could also call vdso_write_begin()/
vdso_write_end(). What do you think?
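
For readers unfamiliar with the pattern, here is a minimal userspace sketch of an open-coded sequence counter protecting a small block of steering data. It is not the code from the patch: the structure, its field names (delta, end) and the function names are assumptions made for illustration, and C11 atomics stand in for the kernel's barriers.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for the arch-specific part of the vDSO data page. */
struct steering_data {
	atomic_uint seq;         /* odd while an update is in progress */
	_Atomic uint64_t delta;  /* hypothetical steering offset */
	_Atomic uint64_t end;    /* hypothetical end of the steering interval */
};

/* Writer side (the kernel in the real design). Writers must be serialized. */
static void steering_update(struct steering_data *d, uint64_t delta,
			    uint64_t end)
{
	unsigned int s = atomic_load_explicit(&d->seq, memory_order_relaxed);

	assert((s & 1) == 0);  /* no other update may be in flight */
	atomic_store_explicit(&d->seq, s + 1, memory_order_relaxed); /* odd */
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&d->delta, delta, memory_order_relaxed);
	atomic_store_explicit(&d->end, end, memory_order_relaxed);
	atomic_store_explicit(&d->seq, s + 2, memory_order_release); /* even */
}

/* Reader side (the vDSO in the real design): retry until a stable snapshot. */
static void steering_read(struct steering_data *d, uint64_t *delta,
			  uint64_t *end)
{
	unsigned int s1, s2;

	do {
		s1 = atomic_load_explicit(&d->seq, memory_order_acquire);
		*delta = atomic_load_explicit(&d->delta, memory_order_relaxed);
		*end = atomic_load_explicit(&d->end, memory_order_relaxed);
		atomic_thread_fence(memory_order_acquire);
		s2 = atomic_load_explicit(&d->seq, memory_order_relaxed);
	} while ((s1 & 1) || s1 != s2);
}
```

The reader never takes a lock. It retries whenever the counter was odd or changed during the read, which is why reader and writer must agree exactly on the counter protocol.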

If there are no objections we would carry both patches through the s390 tree.

Thanks
Sven




[PATCH 2/2] s390: convert to GENERIC_VDSO

2020-08-02 Thread Sven Schnelle
This patch converts s390 to the generic vDSO. There are a few special
things on s390:

- vDSO can be called without a stack frame - glibc did this in the past.
  So we need to allocate a stackframe on our own.

- The former assembly code used stcke to get the TOD clock and applied
  time steering to it. We need to do the same in the new code. This is done
  in the architecture specific __arch_get_hw_counter function. The steering
  information is stored in an architecture specific area in the vDSO data.

- CPUCLOCK_VIRT is now handled with a syscall fallback, which might
  be slower/less accurate than the old implementation.

The getcpu() function stays as an assembly function because there is no
generic implementation and the code is just a few lines.

Performance numbers from my system for 100 million gettimeofday() calls:

Plain syscall: 8.6s
Generic VDSO:  1.3s
old ASM VDSO:  1s

So it's a bit slower but still much faster than syscalls.
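
A comparison of this kind can be reproduced with a small loop like the one below. This is only a sketch of one possible harness, not the program used for the numbers above. The function name and its parameters are made up for illustration, and it assumes a Linux system where SYS_gettimeofday is defined.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <sys/time.h>
#include <time.h>
#include <unistd.h>

/*
 * Time `iterations` gettimeofday() calls and return the elapsed seconds.
 * use_syscall != 0 forces a kernel entry via syscall(2); otherwise the
 * libc wrapper is used, which normally resolves to the vDSO.
 */
static double bench_gettimeofday(long iterations, int use_syscall)
{
	struct timespec start, stop;
	struct timeval tv;

	assert(iterations >= 0);

	clock_gettime(CLOCK_MONOTONIC, &start);
	for (long i = 0; i < iterations; i++) {
		if (use_syscall)
			syscall(SYS_gettimeofday, &tv, NULL);
		else
			gettimeofday(&tv, NULL);
	}
	clock_gettime(CLOCK_MONOTONIC, &stop);

	return (double)(stop.tv_sec - start.tv_sec) +
	       (double)(stop.tv_nsec - start.tv_nsec) / 1e9;
}
```

Calling bench_gettimeofday(100000000L, 1) and bench_gettimeofday(100000000L, 0) gives the syscall and vDSO timings respectively. Absolute values depend on the machine and libc.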

Signed-off-by: Sven Schnelle 
Reviewed-by: Heiko Carstens 
---
 arch/s390/Kconfig   |   3 +
 arch/s390/include/asm/clocksource.h |   7 +
 arch/s390/include/asm/vdso.h|  25 +--
 arch/s390/include/asm/vdso/clocksource.h|   8 +
 arch/s390/include/asm/vdso/data.h   |  14 ++
 arch/s390/include/asm/vdso/gettimeofday.h   |  78 ++
 arch/s390/include/asm/vdso/processor.h  |   5 +
 arch/s390/include/asm/vdso/vdso.h   |   0
 arch/s390/include/asm/vdso/vsyscall.h   |  26 
 arch/s390/kernel/asm-offsets.c  |  20 ---
 arch/s390/kernel/entry.S|   6 -
 arch/s390/kernel/setup.c|   1 -
 arch/s390/kernel/time.c |  70 ++---
 arch/s390/kernel/vdso.c |  29 +---
 arch/s390/kernel/vdso64/Makefile|  19 ++-
 arch/s390/kernel/vdso64/clock_getres.S  |  50 --
 arch/s390/kernel/vdso64/clock_gettime.S | 163 
 arch/s390/kernel/vdso64/gettimeofday.S  |  71 -
 arch/s390/kernel/vdso64/vdso64_generic.c|  18 +++
 arch/s390/kernel/vdso64/vdso_user_wrapper.S |  38 +
 20 files changed, 232 insertions(+), 419 deletions(-)
 create mode 100644 arch/s390/include/asm/clocksource.h
 create mode 100644 arch/s390/include/asm/vdso/clocksource.h
 create mode 100644 arch/s390/include/asm/vdso/data.h
 create mode 100644 arch/s390/include/asm/vdso/gettimeofday.h
 create mode 100644 arch/s390/include/asm/vdso/processor.h
 create mode 100644 arch/s390/include/asm/vdso/vdso.h
 create mode 100644 arch/s390/include/asm/vdso/vsyscall.h
 delete mode 100644 arch/s390/kernel/vdso64/clock_getres.S
 delete mode 100644 arch/s390/kernel/vdso64/clock_gettime.S
 delete mode 100644 arch/s390/kernel/vdso64/gettimeofday.S
 create mode 100644 arch/s390/kernel/vdso64/vdso64_generic.c
 create mode 100644 arch/s390/kernel/vdso64/vdso_user_wrapper.S

diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig
index c7d7ede6300c..eb50f748e151 100644
--- a/arch/s390/Kconfig
+++ b/arch/s390/Kconfig
@@ -73,6 +73,7 @@ config S390
select ARCH_HAS_STRICT_MODULE_RWX
select ARCH_HAS_SYSCALL_WRAPPER
select ARCH_HAS_UBSAN_SANITIZE_ALL
+   select ARCH_HAS_VDSO_DATA
select ARCH_HAVE_NMI_SAFE_CMPXCHG
select ARCH_INLINE_READ_LOCK
select ARCH_INLINE_READ_LOCK_BH
@@ -118,6 +119,7 @@ config S390
select GENERIC_CPU_AUTOPROBE
select GENERIC_CPU_VULNERABILITIES
select GENERIC_FIND_FIRST_BIT
+   select GENERIC_GETTIMEOFDAY
select GENERIC_SMP_IDLE_THREAD
select GENERIC_TIME_VSYSCALL
select HAVE_ALIGNED_STRUCT_PAGE if SLUB
@@ -149,6 +151,7 @@ config S390
select HAVE_FUNCTION_TRACER
select HAVE_FUTEX_CMPXCHG if FUTEX
select HAVE_GCC_PLUGINS
+   select HAVE_GENERIC_VDSO
select HAVE_KERNEL_BZIP2
select HAVE_KERNEL_GZIP
select HAVE_KERNEL_LZ4
diff --git a/arch/s390/include/asm/clocksource.h b/arch/s390/include/asm/clocksource.h
new file mode 100644
index ..03434369fce4
--- /dev/null
+++ b/arch/s390/include/asm/clocksource.h
@@ -0,0 +1,7 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/* s390-specific clocksource additions */
+
+#ifndef _ASM_S390_CLOCKSOURCE_H
+#define _ASM_S390_CLOCKSOURCE_H
+
+#endif /* _ASM_S390_CLOCKSOURCE_H */
diff --git a/arch/s390/include/asm/vdso.h b/arch/s390/include/asm/vdso.h
index 0cd085cdeb4f..82f86b3c394b 100644
--- a/arch/s390/include/asm/vdso.h
+++ b/arch/s390/include/asm/vdso.h
@@ -2,6 +2,8 @@
 #ifndef __S390_VDSO_H__
 #define __S390_VDSO_H__
 
+#include 
+
 /* Default link addresses for the vDSOs */
 #define VDSO32_LBASE   0
 #define VDSO64_LBASE   0
@@ -18,30 +20,7 @@
  * itself and may change without notice.
  */
 
-struct vdso_data {
-   __u64 tb_update_count;  /* Timebase atomicity ctr   0x00 */
-   __u64 xtime_tod_stamp;  /* TOD clock for xtime  0x08 */
-   __u64 xtime_clock_sec;  /* Kernel time  

[PATCH 1/2] vdso: allow to add architecture-specific vdso data

2020-08-02 Thread Sven Schnelle
Add the possibility to add architecture specific vDSO
data to struct vdso_data. This is useful if the arch specific
user space VDSO code needs additional data during execution.
If CONFIG_ARCH_HAS_VDSO_DATA is defined, the generic code will
include asm/vdso/data.h which should contain 'struct arch_vdso_data'.
This structure will be embedded in the generic vDSO data.
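
The mechanism can be modelled in a few lines of plain C. The sketch below is a trimmed userspace stand-in, not the real struct vdso_data layout. The member example_offset and the helper read_example_offset() are hypothetical, and the config symbol is defined by hand here instead of coming from Kconfig.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the Kconfig symbol; normally set by the build system. */
#define CONFIG_ARCH_HAS_VDSO_DATA 1

#ifdef CONFIG_ARCH_HAS_VDSO_DATA
/* Hypothetical contents; an architecture defines this in asm/vdso/data.h. */
struct arch_vdso_data {
	int64_t example_offset;
};
#endif

/* Heavily trimmed model of the generic structure, not the real layout. */
struct vdso_data_model {
	int32_t tz_dsttime;
	uint32_t hrtimer_res;
	uint32_t pad;
#ifdef CONFIG_ARCH_HAS_VDSO_DATA
	struct arch_vdso_data arch;  /* only present when the arch opts in */
#endif
};

/* Arch code reaches its private fields through the embedded member. */
static int64_t read_example_offset(const struct vdso_data_model *vd)
{
	assert(vd != NULL);
#ifdef CONFIG_ARCH_HAS_VDSO_DATA
	return vd->arch.example_offset;
#else
	return 0;
#endif
}
```

Architectures that do not select the option pay nothing: the member is compiled out and the generic structure keeps its size.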

Signed-off-by: Sven Schnelle 
Reviewed-by: Heiko Carstens 
---
 arch/Kconfig| 3 +++
 include/vdso/datapage.h | 7 +++
 2 files changed, 10 insertions(+)

diff --git a/arch/Kconfig b/arch/Kconfig
index 8cc35dc556c7..e1017ce979e2 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -979,6 +979,9 @@ config HAVE_SPARSE_SYSCALL_NR
  entries at 4000, 5000 and 6000 locations. This option turns on syscall
  related optimizations for a given architecture.
 
+config ARCH_HAS_VDSO_DATA
+   bool
+
 source "kernel/gcov/Kconfig"
 
 source "scripts/gcc-plugins/Kconfig"
diff --git a/include/vdso/datapage.h b/include/vdso/datapage.h
index 7955c56d6b3c..74e730238ce6 100644
--- a/include/vdso/datapage.h
+++ b/include/vdso/datapage.h
@@ -19,6 +19,10 @@
 #include 
 #include 
 
+#ifdef CONFIG_ARCH_HAS_VDSO_DATA
+#include <asm/vdso/data.h>
+#endif
+
 #define VDSO_BASES (CLOCK_TAI + 1)
 #define VDSO_HRES  (BIT(CLOCK_REALTIME)| \
 BIT(CLOCK_MONOTONIC)   | \
@@ -97,6 +101,9 @@ struct vdso_data {
s32 tz_dsttime;
u32 hrtimer_res;
u32 __unused;
+#ifdef CONFIG_ARCH_HAS_VDSO_DATA
+   struct arch_vdso_data arch;
+#endif
 };
 
 /*
-- 
2.17.1



Re: [md] e1a86dbbbd: mdadm-selftests.enchmarks/mdadm-selftests/tests/07layouts.fail

2020-08-02 Thread Song Liu



> On Jul 29, 2020, at 2:04 AM, kernel test robot  wrote:
> 
> Greeting,
> 
> FYI, we noticed the following commit (built with gcc-9):
> 
> commit: e1a86dbbbd6a77f73c3d099030495fa31f181e2f ("md: fix deadlock causing 
> by sysfs_notify")
> https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git master
> 
> 
> in testcase: mdadm-selftests
> with following parameters:
> 
>   disk: 1HDD
>   test_prefix: 07layout
>   ucode: 0x21
> 
> 
> 
> on test machine: 4 threads Intel(R) Core(TM) i3-3220 CPU @ 3.30GHz with 4G 
> memory
> 
> caused below changes (please refer to attached dmesg/kmsg for entire 
> log/backtrace):
> 
> 
> 
> If you fix the issue, kindly add following tag
> Reported-by: kernel test robot 
> 
> 
> 
> 2020-07-29 01:06:34 mkdir -p /var/tmp
> 2020-07-29 01:06:34 mke2fs -t ext3 -b 4096 -J size=4 -q /dev/sda3
> 2020-07-29 01:07:36 mount -t ext3 /dev/sda3 /var/tmp
> sed -e 's/{DEFAULT_METADATA}/1.2/g' \
> -e 's,{MAP_PATH},/run/mdadm/map,g'  mdadm.8.in > mdadm.8
> /usr/bin/install -D -m 644 mdadm.8 /usr/share/man/man8/mdadm.8
> /usr/bin/install -D -m 644 mdmon.8 /usr/share/man/man8/mdmon.8
> /usr/bin/install -D -m 644 md.4 /usr/share/man/man4/md.4
> /usr/bin/install -D -m 644 mdadm.conf.5 /usr/share/man/man5/mdadm.conf.5
> /usr/bin/install -D -m 644 udev-md-raid-creating.rules 
> /lib/udev/rules.d/01-md-raid-creating.rules
> /usr/bin/install -D -m 644 udev-md-raid-arrays.rules 
> /lib/udev/rules.d/63-md-raid-arrays.rules
> /usr/bin/install -D -m 644 udev-md-raid-assembly.rules 
> /lib/udev/rules.d/64-md-raid-assembly.rules
> /usr/bin/install -D -m 644 udev-md-clustered-confirm-device.rules 
> /lib/udev/rules.d/69-md-clustered-confirm-device.rules
> /usr/bin/install -D  -m 755 mdadm /sbin/mdadm
> /usr/bin/install -D  -m 755 mdmon /sbin/mdmon
> Testing on linux-5.8.0-rc4-00129-ge1a86dbbbd6a7 kernel
> /lkp/benchmarks/mdadm-selftests/tests/07layouts... FAILED - see 
> /var/tmp/07layouts.log and /var/tmp/fail07layouts.log for details
> 07layouts TIMEOUT
> 
> 
> 
> To reproduce:
> 
>git clone https://github.com/intel/lkp-tests.git
>cd lkp-tests
>bin/lkp install job.yaml  # job file is attached in this email
>bin/lkp run job.yaml
> 
> 
> 
> Thanks,
> Rong Chen
> 
> <07layouts.log>

Hi Junxiao, 

Could you please look into this issue? 

Thanks,
Song



Re: [PATCH] scsi: esas2r: fix possible buffer overflow caused by bad DMA value in esas2r_process_fs_ioctl()

2020-08-02 Thread James Bottomley
On Mon, 2020-08-03 at 11:07 +0800, Jia-Ju Bai wrote:
> 
> On 2020/8/2 23:47, James Bottomley wrote:
> > On Sun, 2020-08-02 at 23:21 +0800, Jia-Ju Bai wrote:
> > > Because "fs" is mapped to DMA, its data can be modified at
> > > anytime by malicious or malfunctioning hardware. In this case,
> > > the check "if (fsc->command >= cmdcnt)" can be passed, and then
> > > "fsc->command" can be modified by hardware to cause buffer
> > > overflow.
> > 
> > This threat model seems to be completely bogus.  If the device were
> > malicious it would have given the mailbox incorrect values a priori
> > ... it wouldn't give the correct value then update it.  For most
> > systems we do assume correct operation of the device but if there's
> > a worry about incorrect operation, the usual approach is to guard
> > the device with an IOMMU which, again, would make this sort of fix
> > unnecessary because the IOMMU will have removed access to the
> > buffer after the command completed.
> 
> Thanks for the reply :)
> 
> In my opinion, IOMMU is used to prevent the hardware from accessing 
> arbitrary memory addresses, but it cannot prevent the hardware from 
> writing a bad value into a valid memory address.

I think that's what I said above.  It would give us a bad a priori
value which copying can't help with.

> For this reason, I think that the hardware can normally access 
> "fsc->command" and modify it into arbitrary value at any time,
> because IOMMU considers the address of "fsc->command" is valid for
> the hardware.

Not if we suspected the device.  I think esas2r does keep the buffer
mapped, but if we suspected the device we'd only map it for the reply
then unmap it.

The point I'm making is we have hardware tools at our disposal to
corral suspect devices if need be, but they're really only used in
exceptional VM circumstances.  Under ordinary circumstances we simply
trust the device.  So if you had evidence that esas2r were prone to
faults, we'd usually force the manufacturer to fix the firmware and as
a last resort we might consider corralling it with an iommu we wouldn't
just copy some values.

If you want an example of defensive coding, we had to add a load of
checks to TPM devices to cope with the bus interposer situation. 
That's one case where we no longer trust the device to return correct
information.  However, to do the same for any SCSI device we'd need a
convincing rationale for why.

James
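
For context, the kind of change under discussion is a "fetch once, then validate the local copy" pattern. The sketch below illustrates only that pattern. It is not the esas2r code, and the structure, table and function names are invented for the example. Whether such a copy is warranted for a trusted device is exactly what the thread disputes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CMD_COUNT 4  /* hypothetical size of the command table */

/* Hypothetical stand-in for a request living in DMA-visible memory. */
struct shared_req {
	volatile uint8_t command;
};

/* Hypothetical per-command payload sizes. */
static const uint8_t cmd_size[CMD_COUNT] = { 8, 16, 32, 64 };

/*
 * Read the index once, validate the local copy, and use only that copy.
 * Returns the payload size, or -1 when the index is out of range.
 */
static int lookup_cmd_size(const struct shared_req *req)
{
	uint8_t command;

	assert(req != NULL);
	command = req->command;  /* single fetch from shared memory */

	if (command >= CMD_COUNT)
		return -1;

	return cmd_size[command];  /* never re-reads req->command */
}
```

The copy closes the window between check and use. It does nothing about a value that was already wrong when first read, which is the objection raised above.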



[PATCHv2] x86/purgatory: don't generate debug info for purgatory.ro

2020-08-02 Thread Pingfan Liu
Purgatory.ro is a standalone binary that is not linked against the rest of
the kernel.  Its image is copied into an array that is linked to the
kernel, and from there kexec relocates it wherever it desires.

Unlike the debug info for vmlinux, which can be used for analyzing crashes,
such info is useless in purgatory.ro. Discarding it saves about 200K of
space.

Original:
  259080  kexec-purgatory.o
Stripped debug info:
   29152  kexec-purgatory.o

Signed-off-by: Pingfan Liu 
Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: Borislav Petkov 
Cc: "H. Peter Anvin" 
Cc: Hans de Goede 
Cc: Nick Desaulniers 
Cc: Arvind Sankar 
Cc: Steve Wahl 
Cc: linux-kernel@vger.kernel.org
Cc: ke...@lists.infradead.org
To: x...@kernel.org
---
 arch/x86/purgatory/Makefile | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/arch/x86/purgatory/Makefile b/arch/x86/purgatory/Makefile
index 088bd76..d24b43a 100644
--- a/arch/x86/purgatory/Makefile
+++ b/arch/x86/purgatory/Makefile
@@ -32,7 +32,7 @@ KCOV_INSTRUMENT := n
 # make up the standalone purgatory.ro
 
 PURGATORY_CFLAGS_REMOVE := -mcmodel=kernel
-PURGATORY_CFLAGS := -mcmodel=large -ffreestanding -fno-zero-initialized-in-bss
+PURGATORY_CFLAGS := -mcmodel=large -ffreestanding -fno-zero-initialized-in-bss -g0
 PURGATORY_CFLAGS += $(DISABLE_STACKLEAK_PLUGIN) -DDISABLE_BRANCH_PROFILING
 PURGATORY_CFLAGS += $(call cc-option,-fno-stack-protector)
 
@@ -64,6 +64,9 @@ CFLAGS_sha256.o   += $(PURGATORY_CFLAGS)
 CFLAGS_REMOVE_string.o += $(PURGATORY_CFLAGS_REMOVE)
 CFLAGS_string.o+= $(PURGATORY_CFLAGS)
 
+AFLAGS_REMOVE_setup-x86_$(BITS).o  += -Wa,-gdwarf-2
+AFLAGS_REMOVE_entry64.o+= -Wa,-gdwarf-2
+
 $(obj)/purgatory.ro: $(PURGATORY_OBJS) FORCE
$(call if_changed,ld)
 
-- 
2.7.5



Re: [PATCH] xen: hypercall.h: fix duplicated word

2020-08-02 Thread Jürgen Groß

On 26.07.20 02:17, Randy Dunlap wrote:

Change the repeated word "as" to "as a".

Signed-off-by: Randy Dunlap 
Cc: Boris Ostrovsky 
Cc: Juergen Gross 
Cc: xen-de...@lists.xenproject.org


Pushed to: xen/tip.git for-linus-5.9


Juergen


Re: [PATCH] xen/gntdev: gntdev.h: drop a duplicated word

2020-08-02 Thread Jürgen Groß

On 19.07.20 02:33, Randy Dunlap wrote:

Drop the repeated word "of" in a comment.

Signed-off-by: Randy Dunlap 
Cc: Boris Ostrovsky 
Cc: Juergen Gross 
Cc: xen-de...@lists.xenproject.org


Pushed to xen/tip.git for-linus-5.9


Juergen


Re: [PATCH] kprobes: fix NULL pointer dereference at kprobe_ftrace_handler

2020-08-02 Thread Muchun Song
Ping guys. Any comments or suggestions?

On Tue, Jul 28, 2020 at 2:45 PM Muchun Song  wrote:
>
> We found a case of kernel panic on our server. The stack trace is as
> follows (some irrelevant information omitted):
>
>   BUG: kernel NULL pointer dereference, address: 0080
>   RIP: 0010:kprobe_ftrace_handler+0x5e/0xe0
>   RSP: 0018:b512c6550998 EFLAGS: 00010282
>   RAX:  RBX: 8e9d16eea018 RCX: 
>   RDX: be1179c0 RSI: c0535564 RDI: c0534ec0
>   RBP: c0534ec1 R08: 8e9d1bbb0f00 R09: 0004
>   R10:  R11:  R12: 
>   R13: 8e9d1f797060 R14: bacc R15: 8e9ce13eca00
>   CS:  0010 DS:  ES:  CR0: 80050033
>   CR2: 0080 CR3: 0008453d0005 CR4: 003606e0
>   DR0:  DR1:  DR2: 
>   DR3:  DR6: fffe0ff0 DR7: 0400
>   Call Trace:
>
>ftrace_ops_assist_func+0x56/0xe0
>ftrace_call+0x5/0x34
>tcpa_statistic_send+0x5/0x130 [ttcp_engine]
>
> The tcpa_statistic_send is the function being kprobed. After analysis,
> the root cause is that the fourth parameter regs of kprobe_ftrace_handler
> is NULL. Why is regs NULL? We use the crash tool to analyze the kdump.
>
>   crash> dis tcpa_statistic_send -r
>  : callq 0xbd8018c0 
>
> The tcpa_statistic_send calls ftrace_caller instead of ftrace_regs_caller.
> So it is reasonable that the fourth parameter regs of kprobe_ftrace_handler
> is NULL. In theory, we should call the ftrace_regs_caller instead of the
> ftrace_caller. After in-depth analysis, we found a reproducible path.
>
>   Writing a simple kernel module which starts a periodic timer. The
>   timer's handler is named 'kprobe_test_timer_handler'. The module
>   name is kprobe_test.ko.
>
>   1) insmod kprobe_test.ko
>   2) bpftrace -e 'kretprobe:kprobe_test_timer_handler {}'
>   3) echo 0 > /proc/sys/kernel/ftrace_enabled
>   4) rmmod kprobe_test
>   5) stop step 2) kprobe
>   6) insmod kprobe_test.ko
>   7) bpftrace -e 'kretprobe:kprobe_test_timer_handler {}'
>
> We mark the kprobe as GONE but do not disarm it in step 4).
> Step 5) does not disarm the kprobe either when unregistering it, so
> the ip is never removed from the filter. When the module is loaded
> again in step 6), the call site is patched to ftrace_caller via
> ftrace_module_enable(). When we register the kprobe again, ftrace_caller
> is not replaced with ftrace_regs_caller because ftrace was disabled
> in step 3). So step 7) triggers the kernel panic. Fix this problem
> by disarming the kprobe when the module is going away.
>
> Signed-off-by: Muchun Song 
> Co-developed-by: Chengming Zhou 
> Signed-off-by: Chengming Zhou 
> ---
>  kernel/kprobes.c | 7 +++
>  1 file changed, 7 insertions(+)
>
> diff --git a/kernel/kprobes.c b/kernel/kprobes.c
> index 146c648eb943..503add629599 100644
> --- a/kernel/kprobes.c
> +++ b/kernel/kprobes.c
> @@ -2148,6 +2148,13 @@ static void kill_kprobe(struct kprobe *p)
>  * the original probed function (which will be freed soon) any more.
>  */
> arch_remove_kprobe(p);
> +
> +   /*
> +* The module is going away. We should disarm the kprobe which
> +* is using ftrace.
> +*/
> +   if (kprobe_ftrace(p))
> +   disarm_kprobe_ftrace(p);
>  }
>
>  /* Disable one kprobe */
> --
> 2.11.0
>


-- 
Yours,
Muchun


[PATCH v3 0/2] i2c: stm32: add host-notify support via i2c slave

2020-08-02 Thread Alain Volmat
This series replaces the previous 'stm32-f7: Addition of SMBus Alert /
Host-notify features' series to focus only on the SMBus Host-Notify feature.
It should be applied with "[PATCH] i2c: add binding to mark a bus as SMBus"
from Wolfram which defines the newly introduced "smbus" binding.

Alain Volmat (2):
  i2c: smbus: add core function handling SMBus host-notify
  i2c: stm32f7: Add SMBus Host-Notify protocol support

 drivers/i2c/busses/Kconfig   |   1 +
 drivers/i2c/busses/i2c-stm32f7.c | 110 +--
 drivers/i2c/i2c-smbus.c  | 107 +
 include/linux/i2c-smbus.h|  12 +
 4 files changed, 215 insertions(+), 15 deletions(-)

-- 
v3: move smbus host-notify slave code into i2c-smbus.c file
rework slave callback index handling
add sanity check in slave free function

v2: fix a bad test within the i2c-stm32f7 driver leading to decrease of
available slave slot



linux-next: build failure after merge of the tip tree

2020-08-02 Thread Stephen Rothwell
Hi all,

After merging the tip tree, today's linux-next build (x86_64 allmodconfig)
failed like this:

ERROR: modpost: "sched_setscheduler" [drivers/gpu/drm/drm.ko] undefined!

Caused by commit

  616d91b68cd5 ("sched: Remove sched_setscheduler*() EXPORTs")

interacting with commit

  5e6c2b4f9161 ("drm/vblank: Add vblank works")

from the drm tree.

I have reverted commit 616d91b68cd5 again for now.

-- 
Cheers,
Stephen Rothwell




Re: [PATCH v3 0/3] Few bug fixes and Convert to pin_user_pages*()

2020-08-02 Thread Jürgen Groß

On 12.07.20 05:39, Souptick Joarder wrote:

This series contains a few cleanups and minor bug fixes, and
converts get_user_pages() to pin_user_pages().

I have compile tested this, but am unable to run-time test it,
so any testing help is much appreciated.

v2:
 Addressed few review comments and compile issue.
 Patch[1/2] from v1 split into 2 in v2.
v3:
Address review comment. Add review tag.

Cc: John Hubbard 
Cc: Boris Ostrovsky 
Cc: Paul Durrant 

Souptick Joarder (3):
   xen/privcmd: Corrected error handling path
   xen/privcmd: Mark pages as dirty
   xen/privcmd: Convert get_user_pages*() to pin_user_pages*()

  drivers/xen/privcmd.c | 32 ++--
  1 file changed, 14 insertions(+), 18 deletions(-)



Series pushed to xen/tip.git for-linus-5.9


Juergen


[PATCH] include/linux/miscdevice.h - Fix typo/grammar

2020-08-02 Thread Sebastian Fricke
Improve the clarity and grammar of the descriptive comment above the
minor number assignments.

Fix a typo within 2 comments for macros.
s/This helps in eleminating of boilerplate code.
 /This helps to eliminate boilerplate code./

Signed-off-by: Sebastian Fricke 
---
 include/linux/miscdevice.h | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/include/linux/miscdevice.h b/include/linux/miscdevice.h
index c7a93002a3c1..0676f18093f9 100644
--- a/include/linux/miscdevice.h
+++ b/include/linux/miscdevice.h
@@ -7,9 +7,9 @@
 #include 
 
 /*
- * These allocations are managed by dev...@lanana.org. If you use an
- * entry that is not in assigned your entry may well be moved and
- * reassigned, or set dynamic if a fixed value is not justified.
+ * These allocations are managed by dev...@lanana.org. If you need
+ * an entry that is not assigned here, it can be moved and
+ * reassigned or dynamically set if a fixed value is not justified.
  */
 
 #define PSMOUSE_MINOR  1
@@ -93,14 +93,14 @@ extern void misc_deregister(struct miscdevice *misc);
 
 /*
  * Helper macro for drivers that don't do anything special in the initcall.
- * This helps in eleminating of boilerplate code.
+ * This helps to eliminate boilerplate code.
  */
 #define builtin_misc_device(__misc_device) \
builtin_driver(__misc_device, misc_register)
 
 /*
  * Helper macro for drivers that don't do anything special in module init / exit
- * call. This helps in eleminating of boilerplate code.
+ * call. This helps to eliminate boilerplate code.
  */
 #define module_misc_device(__misc_device) \
module_driver(__misc_device, misc_register, misc_deregister)
-- 
2.20.1



[PATCH] hwmon: axi-fan-control: remove duplicate macros

2020-08-02 Thread Alexandru Ardelean
These macros are also present in the "include/linux/fpga/adi-axi-common.h"
file which is included in this driver.

This patch removes them from the AXI Fan Control driver. No sense in having
them in 2 places.

Signed-off-by: Alexandru Ardelean 
---
 drivers/hwmon/axi-fan-control.c | 4 
 1 file changed, 4 deletions(-)

diff --git a/drivers/hwmon/axi-fan-control.c b/drivers/hwmon/axi-fan-control.c
index 38d9cdb3db1a..e3f6b03e6764 100644
--- a/drivers/hwmon/axi-fan-control.c
+++ b/drivers/hwmon/axi-fan-control.c
@@ -15,10 +15,6 @@
 #include 
 #include 
 
-#define ADI_AXI_PCORE_VER_MAJOR(version)   (((version) >> 16) & 0xff)
-#define ADI_AXI_PCORE_VER_MINOR(version)   (((version) >> 8) & 0xff)
-#define ADI_AXI_PCORE_VER_PATCH(version)   ((version) & 0xff)
-
 /* register map */
 #define ADI_REG_RSTN   0x0080
 #define ADI_REG_PWM_WIDTH  0x0084
-- 
2.17.1



Re: [PATCH] ASoC: fsl_sai: Clean code for synchronize mode

2020-08-02 Thread Nicolin Chen
On Mon, Aug 03, 2020 at 11:17:54AM +0800, Shengjiu Wang wrote:
> TX synchronous with RX: The RMR is no need to be changed when
> Tx is enabled, the other configuration in hw_params() is enough for

Probably you should explain why RMR can be removed, like what
it really does, so as to make it clear that there's no such
relationship between RMR and clock generation.

Anyway, this is against the warning comments in the driver:
/*
 * For SAI master mode, when Tx(Rx) sync with Rx(Tx) clock, Rx(Tx) will
 * generate bclk and frame clock for Tx(Rx), we should set RCR4(TCR4),
 * RCR5(TCR5) and RMR(TMR) for playback(capture), or there will be sync
 * error.
 */

So that comment would need to be updated.

> clock generation. The TCSR.TE is no need to enabled when only RX
> is enabled.

You are correct if there's only RX running without TX joining.
However, that's something we can't guarantee. Then we'd enable
TE after RE is enabled, which is against what RM recommends:

# From 54.3.3.1 Synchronous mode in IMX6SXRM
# If the receiver bit clock and frame sync are to be used by
# both the transmitter and receiver, it is recommended that
# the receiver is the last enabled and the first disabled.

I remember I did this "ugly" design by strictly following what
RM says. If hardware team has updated the RM or removed this
limitation, please quote in the commit logs.

> + if (!sai->synchronous[TX] && sai->synchronous[RX] && !tx) {
> + regmap_update_bits(sai->regmap, FSL_SAI_xCSR((!tx), ofs),
> +FSL_SAI_CSR_TERE, FSL_SAI_CSR_TERE);
> + } else if (!sai->synchronous[RX] && sai->synchronous[TX] && tx) {
> + regmap_update_bits(sai->regmap, FSL_SAI_xCSR((!tx), ofs),
> +FSL_SAI_CSR_TERE, FSL_SAI_CSR_TERE);

Two identical regmap_update_bits calls -- both on !tx (RX?)


[PATCH v3 1/2] i2c: smbus: add core function handling SMBus host-notify

2020-08-02 Thread Alain Volmat
The SMBus Host-Notify protocol, from the adapter's point of view,
consists of receiving a message from a client, including the
client address and some other data.

It can be simply handled by creating a new slave device
and registering a callback performing the parsing of the
message received from the client.

This commit introduces two new core functions
  * i2c_new_slave_host_notify_device
  * i2c_free_slave_host_notify_device
that take care of registration of the new slave device and
callback and will call i2c_handle_smbus_host_notify once a
Host-Notify event is received.
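
The byte counting done by the slave callback can be tried out in userspace with the simplified model below. It mirrors the state machine in the diff that follows, but the event enum, the structure and the notified/last_addr bookkeeping are stand-ins invented for this sketch. The real code calls i2c_handle_smbus_host_notify() instead of recording the event.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SMBUS_HOST_NOTIFY_LEN 3

/* Simplified stand-ins for the kernel's slave events. */
enum slave_event {
	SLAVE_WRITE_REQUESTED,
	SLAVE_WRITE_RECEIVED,
	SLAVE_STOP,
};

struct host_notify_status {
	uint8_t index;      /* bytes received in the current transfer */
	uint8_t addr;       /* first byte: address of the notifying client */
	int notified;       /* model only: number of events reported */
	uint8_t last_addr;  /* model only: address of the last event */
};

static void host_notify_event(struct host_notify_status *st,
			      enum slave_event event, uint8_t val)
{
	assert(st != NULL);

	switch (event) {
	case SLAVE_WRITE_RECEIVED:
		/* Keep only the first byte; just count the rest. */
		if (st->index == 0)
			st->addr = val;
		if (st->index < UINT8_MAX)
			st->index++;
		break;
	case SLAVE_STOP:
		/* Only a complete 3-byte message counts as a notification. */
		if (st->index == SMBUS_HOST_NOTIFY_LEN) {
			st->notified++;
			st->last_addr = st->addr;
		}
		st->index = 0;
		break;
	case SLAVE_WRITE_REQUESTED:
		st->index = 0;
		break;
	}
}
```

Feeding it a write request, three received bytes and a stop reports one event for the first byte's address. A shorter or longer transfer is silently dropped at the stop.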

Signed-off-by: Alain Volmat 
---
 v3: move smbus host-notify slave code into i2c-smbus.c file
 rework slave callback index handling
 add sanity check in slave free function
 v2: identical to v1
 drivers/i2c/i2c-smbus.c   | 107 ++
 include/linux/i2c-smbus.h |  12 ++
 2 files changed, 119 insertions(+)

diff --git a/drivers/i2c/i2c-smbus.c b/drivers/i2c/i2c-smbus.c
index dc0108287ccf..d3d06e3b4f3b 100644
--- a/drivers/i2c/i2c-smbus.c
+++ b/drivers/i2c/i2c-smbus.c
@@ -197,6 +197,113 @@ EXPORT_SYMBOL_GPL(i2c_handle_smbus_alert);
 
 module_i2c_driver(smbalert_driver);
 
+#if IS_ENABLED(CONFIG_I2C_SLAVE)
+#define SMBUS_HOST_NOTIFY_LEN  3
+struct i2c_slave_host_notify_status {
+   u8 index;
+   u8 addr;
+};
+
+static int i2c_slave_host_notify_cb(struct i2c_client *client,
+   enum i2c_slave_event event, u8 *val)
+{
+   struct i2c_slave_host_notify_status *status = client->dev.platform_data;
+
+   switch (event) {
+   case I2C_SLAVE_WRITE_RECEIVED:
+   /* We only retrieve the first byte received (addr)
+* since there is currently no support to retrieve the data
+* parameter from the client.
+*/
+   if (status->index == 0)
+   status->addr = *val;
+   if (status->index < U8_MAX)
+   status->index++;
+   break;
+   case I2C_SLAVE_STOP:
+   if (status->index == SMBUS_HOST_NOTIFY_LEN)
+   i2c_handle_smbus_host_notify(client->adapter,
+status->addr);
+   fallthrough;
+   case I2C_SLAVE_WRITE_REQUESTED:
+   status->index = 0;
+   break;
+   case I2C_SLAVE_READ_REQUESTED:
+   case I2C_SLAVE_READ_PROCESSED:
+   *val = 0xff;
+   break;
+   }
+
+   return 0;
+}
+
+/**
+ * i2c_new_slave_host_notify_device - get a client for SMBus host-notify support
+ * @adapter: the target adapter
+ * Context: can sleep
+ *
+ * Setup handling of the SMBus host-notify protocol on a given I2C bus segment.
+ *
+ * Handling is done by creating a device and its callback and handling data
+ * received via the SMBus host-notify address (0x8)
+ *
+ * This returns the client, which should be ultimately freed using
+ * i2c_free_slave_host_notify_device(); or an ERRPTR to indicate an error.
+ */
+struct i2c_client *i2c_new_slave_host_notify_device(struct i2c_adapter *adapter)
+{
+   struct i2c_board_info host_notify_board_info = {
+   I2C_BOARD_INFO("smbus_host_notify", 0x08),
+   .flags  = I2C_CLIENT_SLAVE,
+   };
+   struct i2c_slave_host_notify_status *status;
+   struct i2c_client *client;
+   int ret;
+
+   status = kzalloc(sizeof(struct i2c_slave_host_notify_status),
+GFP_KERNEL);
+   if (!status)
+   return ERR_PTR(-ENOMEM);
+
+   host_notify_board_info.platform_data = status;
+
+   client = i2c_new_client_device(adapter, &host_notify_board_info);
+   if (IS_ERR(client)) {
+   kfree(status);
+   return client;
+   }
+
+   ret = i2c_slave_register(client, i2c_slave_host_notify_cb);
+   if (ret) {
+   i2c_unregister_device(client);
+   kfree(status);
+   return ERR_PTR(ret);
+   }
+
+   return client;
+}
+EXPORT_SYMBOL_GPL(i2c_new_slave_host_notify_device);
+
+/**
+ * i2c_free_slave_host_notify_device - free the client for SMBus host-notify
+ * support
+ * @client: the client to free
+ * Context: can sleep
+ *
+ * Free the i2c_client allocated via i2c_new_slave_host_notify_device
+ */
+void i2c_free_slave_host_notify_device(struct i2c_client *client)
+{
+   if (IS_ERR_OR_NULL(client))
+   return;
+
+   i2c_slave_unregister(client);
+   kfree(client->dev.platform_data);
+   i2c_unregister_device(client);
+}
+EXPORT_SYMBOL_GPL(i2c_free_slave_host_notify_device);
+#endif
+
 /*
  * SPD is not part of SMBus but we include it here for convenience as the
  * target systems are the same.
diff --git a/include/linux/i2c-smbus.h b/include/linux/i2c-smbus.h
index 1e4e0de4ef8b..1ef421818d3a 100644
--- a/include/linux/i2c-smbus.h
+++ b/include/linux/i2c-smbus.h
@@ -38,6 +38,18 @@ static 

[PATCH v4 16/23] mm/memremap_pages: Convert to 'struct range'

2020-08-02 Thread Dan Williams
The 'struct resource' in 'struct dev_pagemap' is only used for holding
resource span information. The other fields, 'name', 'flags', 'desc',
'parent', 'sibling', and 'child' are all unused wasted space.

This is in preparation for introducing a multi-range extension of
devm_memremap_pages().

The bulk of this change is unwinding all the places internal to
libnvdimm that used 'struct resource' unnecessarily.

P2PDMA had a minor usage of the flags field, but only to report failures
with "%pR". That is replaced with an open coded print of the range.
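
As a reading aid for the diff, the sketch below models the span type in userspace. The range.h hunk is not visible in this excerpt, so the definition of range_len() here is an assumption: it is taken to mirror resource_size(), that is, an inclusive length of end - start + 1, which the one-for-one substitutions in the hunks suggest.

```c
#include <assert.h>
#include <stdint.h>

/* Userspace model of 'struct range': an inclusive [start, end] span. */
struct range {
	uint64_t start;
	uint64_t end;
};

/* Assumed to match resource_size(): both ends are part of the span. */
static inline uint64_t range_len(const struct range *range)
{
	assert(range->end >= range->start);  /* span must not be inverted */
	return range->end - range->start + 1;
}

/* Number of pages covered, given a page shift (12 for 4 KiB pages). */
static inline uint64_t range_nr_pages(const struct range *range,
				      unsigned int page_shift)
{
	return range_len(range) >> page_shift;
}
```

A span from 0x1000 to 0x1fff therefore has a length of 0x1000 bytes, or one 4 KiB page, the same value resource_size() would report for an equivalent resource.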

Cc: Paul Mackerras 
Cc: Michael Ellerman 
Cc: Benjamin Herrenschmidt 
Cc: Dan Williams 
Cc: Vishal Verma 
Cc: Dave Jiang 
Cc: Ben Skeggs 
Cc: David Airlie 
Cc: Daniel Vetter 
Cc: Ira Weiny 
Cc: Jason Gunthorpe 
Signed-off-by: Dan Williams 
---
 arch/powerpc/kvm/book3s_hv_uvmem.c |   13 +++--
 drivers/dax/bus.c  |   10 ++--
 drivers/dax/bus.h  |2 -
 drivers/dax/dax-private.h  |5 --
 drivers/dax/device.c   |3 -
 drivers/dax/hmem/hmem.c|5 ++
 drivers/dax/pmem/core.c|   12 ++---
 drivers/gpu/drm/nouveau/nouveau_dmem.c |   14 +++---
 drivers/nvdimm/badrange.c  |   26 +--
 drivers/nvdimm/claim.c |   13 +++--
 drivers/nvdimm/nd.h|3 +
 drivers/nvdimm/pfn_devs.c  |   12 ++---
 drivers/nvdimm/pmem.c  |   26 ++-
 drivers/nvdimm/region.c|   21 +
 drivers/pci/p2pdma.c   |   11 ++---
 include/linux/memremap.h   |5 +-
 include/linux/range.h  |6 ++
 lib/test_hmm.c |   14 +++---
 mm/memremap.c  |   77 
 tools/testing/nvdimm/test/iomap.c  |2 -
 20 files changed, 147 insertions(+), 133 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_hv_uvmem.c b/arch/powerpc/kvm/book3s_hv_uvmem.c
index 7705d5557239..29ec555055c2 100644
--- a/arch/powerpc/kvm/book3s_hv_uvmem.c
+++ b/arch/powerpc/kvm/book3s_hv_uvmem.c
@@ -687,9 +687,9 @@ static struct page *kvmppc_uvmem_get_page(unsigned long gpa, struct kvm *kvm)
struct kvmppc_uvmem_page_pvt *pvt;
unsigned long pfn_last, pfn_first;
 
-   pfn_first = kvmppc_uvmem_pgmap.res.start >> PAGE_SHIFT;
+   pfn_first = kvmppc_uvmem_pgmap.range.start >> PAGE_SHIFT;
pfn_last = pfn_first +
-  (resource_size(&kvmppc_uvmem_pgmap.res) >> PAGE_SHIFT);
+  (range_len(&kvmppc_uvmem_pgmap.range) >> PAGE_SHIFT);
 
spin_lock(&kvmppc_uvmem_bitmap_lock);
bit = find_first_zero_bit(kvmppc_uvmem_bitmap,
@@ -1007,7 +1007,7 @@ static vm_fault_t kvmppc_uvmem_migrate_to_ram(struct vm_fault *vmf)
 static void kvmppc_uvmem_page_free(struct page *page)
 {
unsigned long pfn = page_to_pfn(page) -
-   (kvmppc_uvmem_pgmap.res.start >> PAGE_SHIFT);
+   (kvmppc_uvmem_pgmap.range.start >> PAGE_SHIFT);
struct kvmppc_uvmem_page_pvt *pvt;
 
spin_lock(&kvmppc_uvmem_bitmap_lock);
@@ -1170,7 +1170,8 @@ int kvmppc_uvmem_init(void)
}
 
kvmppc_uvmem_pgmap.type = MEMORY_DEVICE_PRIVATE;
-   kvmppc_uvmem_pgmap.res = *res;
+   kvmppc_uvmem_pgmap.range.start = res->start;
+   kvmppc_uvmem_pgmap.range.end = res->end;
kvmppc_uvmem_pgmap.ops = &kvmppc_uvmem_ops;
/* just one global instance: */
kvmppc_uvmem_pgmap.owner = &kvmppc_uvmem_pgmap;
@@ -1205,7 +1206,7 @@ void kvmppc_uvmem_free(void)
return;
 
memunmap_pages(&kvmppc_uvmem_pgmap);
-   release_mem_region(kvmppc_uvmem_pgmap.res.start,
-  resource_size(&kvmppc_uvmem_pgmap.res));
+   release_mem_region(kvmppc_uvmem_pgmap.range.start,
+  range_len(&kvmppc_uvmem_pgmap.range));
kfree(kvmppc_uvmem_bitmap);
 }
diff --git a/drivers/dax/bus.c b/drivers/dax/bus.c
index 53d07f2f1285..00fa73a8dfb4 100644
--- a/drivers/dax/bus.c
+++ b/drivers/dax/bus.c
@@ -515,7 +515,7 @@ static void dax_region_unregister(void *region)
 }
 
 struct dax_region *alloc_dax_region(struct device *parent, int region_id,
-   struct resource *res, int target_node, unsigned int align,
+   struct range *range, int target_node, unsigned int align,
unsigned long flags)
 {
struct dax_region *dax_region;
@@ -530,8 +530,8 @@ struct dax_region *alloc_dax_region(struct device *parent, int region_id,
return NULL;
}
 
-   if (!IS_ALIGNED(res->start, align)
-   || !IS_ALIGNED(resource_size(res), align))
+   if (!IS_ALIGNED(range->start, align)
+   || !IS_ALIGNED(range_len(range), align))
return NULL;
 
dax_region = kzalloc(sizeof(*dax_region), GFP_KERNEL);
@@ -546,8 +546,8 @@ struct dax_region *alloc_dax_region(struct device *parent, int region_id,

Re: powerpc: build failures in Linus' tree

2020-08-02 Thread Willy Tarreau
On Mon, Aug 03, 2020 at 02:10:17PM +1000, Stephen Rothwell wrote:
> Our mails have crossed.

Ah indeed :-)

> I just sent a more comprehensive patch.  I
> think your patch would require a lot of build testing and even then may
> fail for some CONFIG combination that we didn't test or added in the
> future (or someone just made up).

Yours looks far more complete and very likely more future-proof, I
totally agree.

Thanks!
Willy


[PATCH v4 18/23] device-dax: Add dis-contiguous resource support

2020-08-02 Thread Dan Williams
Break the requirement that device-dax instances are physically
contiguous. Removing this constraint allows fragmented available
capacity to be fully allocated.

This capability is useful to mitigate the "noisy neighbor" problem with
memory-side-cache management for virtual machines, or any other scenario
where a platform address boundary also designates a performance
boundary. For example a direct mapped memory side cache might rotate
cache colors at 1GB boundaries.  With dis-contiguous allocations a
device-dax instance could be configured to contain only 1 cache color.

It also satisfies Joao's use case (see link) for partitioning memory for
exclusive guest access. It allows for a future potential mode where the
host kernel need not allocate 'struct page' capacity up-front.
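
To make the consequence of dis-contiguous allocation concrete, here is a
minimal userspace sketch, assuming a device that is described by an array of
inclusive ranges rather than by a single range. The 'dev_dax_sketch' type is a
hypothetical simplification; only the idea of summing range_len() over
nr_range entries comes from the patch.

```c
#include <stdint.h>

/* Inclusive physical address range, mirroring the kernel's struct range. */
struct range {
	uint64_t start;
	uint64_t end;
};

/* Hypothetical, simplified stand-in for a device-dax instance. */
struct dev_dax_sketch {
	int nr_range;               /* number of valid entries in ranges[] */
	const struct range *ranges; /* extents, not necessarily adjacent */
};

static uint64_t range_len(const struct range *range)
{
	return range->end - range->start + 1;
}

/* Total capacity is the sum over all extents; no ranges means a 0-size device. */
static uint64_t dev_dax_size(const struct dev_dax_sketch *dev_dax)
{
	uint64_t size = 0;

	for (int i = 0; i < dev_dax->nr_range; i++)
		size += range_len(&dev_dax->ranges[i]);

	return size;
}
```

With this shape, "is the device empty?" becomes a question about the summed
size instead of about one range, which is why checks on a single range length
no longer suffice.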

Link: https://lore.kernel.org/lkml/20200110190313.17144-1-joao.m.mart...@oracle.com/
Reported-by: Joao Martins 
Signed-off-by: Dan Williams 
---
 drivers/dax/bus.c  |  230 +++-
 drivers/dax/dax-private.h  |9 +-
 drivers/dax/device.c   |   55 ++
 drivers/dax/kmem.c |  132 +++
 tools/testing/nvdimm/dax-dev.c |   20 ++-
 5 files changed, 319 insertions(+), 127 deletions(-)

diff --git a/drivers/dax/bus.c b/drivers/dax/bus.c
index 00fa73a8dfb4..8dd82ea9d53d 100644
--- a/drivers/dax/bus.c
+++ b/drivers/dax/bus.c
@@ -136,15 +136,27 @@ static bool is_static(struct dax_region *dax_region)
return (dax_region->res.flags & IORESOURCE_DAX_STATIC) != 0;
 }
 
+static u64 dev_dax_size(struct dev_dax *dev_dax)
+{
+   u64 size = 0;
+   int i;
+
+   device_lock_assert(&dev_dax->dev);
+
+   for (i = 0; i < dev_dax->nr_range; i++)
+   size += range_len(&dev_dax->ranges[i].range);
+
+   return size;
+}
+
 static int dax_bus_probe(struct device *dev)
 {
struct dax_device_driver *dax_drv = to_dax_drv(dev->driver);
struct dev_dax *dev_dax = to_dev_dax(dev);
struct dax_region *dax_region = dev_dax->region;
-   struct range *range = &dev_dax->range;
int rc;
 
-   if (range_len(range) == 0 || dev_dax->id < 0)
+   if (dev_dax_size(dev_dax) == 0 || dev_dax->id < 0)
return -ENXIO;
 
rc = dax_drv->probe(dev_dax);
@@ -354,15 +366,19 @@ void kill_dev_dax(struct dev_dax *dev_dax)
 }
 EXPORT_SYMBOL_GPL(kill_dev_dax);
 
-static void free_dev_dax_range(struct dev_dax *dev_dax)
+static void free_dev_dax_ranges(struct dev_dax *dev_dax)
 {
struct dax_region *dax_region = dev_dax->region;
-   struct range *range = &dev_dax->range;
+   int i;
 
device_lock_assert(dax_region->dev);
-   if (range_len(range))
+   for (i = 0; i < dev_dax->nr_range; i++) {
+   struct range *range = &dev_dax->ranges[i].range;
+
__release_region(&dax_region->res, range->start,
range_len(range));
+   }
+   dev_dax->nr_range = 0;
 }
 
 static void unregister_dev_dax(void *dev)
@@ -372,7 +388,7 @@ static void unregister_dev_dax(void *dev)
dev_dbg(dev, "%s\n", __func__);
 
kill_dev_dax(dev_dax);
-   free_dev_dax_range(dev_dax);
+   free_dev_dax_ranges(dev_dax);
device_del(dev);
put_device(dev);
 }
@@ -423,7 +439,7 @@ static ssize_t delete_store(struct device *dev, struct device_attribute *attr,
device_lock(dev);
device_lock(victim);
dev_dax = to_dev_dax(victim);
-   if (victim->driver || range_len(&dev_dax->range))
+   if (victim->driver || dev_dax_size(dev_dax))
rc = -EBUSY;
else {
/*
@@ -569,51 +585,83 @@ static int alloc_dev_dax_range(struct dev_dax *dev_dax, u64 start,
struct dax_region *dax_region = dev_dax->region;
struct resource *res = &dax_region->res;
struct device *dev = &dev_dax->dev;
+   struct dev_dax_range *ranges;
+   unsigned long pgoff = 0;
struct resource *alloc;
+   int i;
 
device_lock_assert(dax_region->dev);
 
/* handle the seed alloc special case */
if (!size) {
-   dev_dax->range = (struct range) {
-   .start = res->start,
-   .end = res->start - 1,
-   };
+   if (dev_WARN_ONCE(dev, dev_dax->nr_range,
+   "0-size allocation must be first\n"))
+   return -EBUSY;
+   /* nr_range == 0 is elsewhere special cased as 0-size device */
return 0;
}
 
+   ranges = krealloc(dev_dax->ranges, sizeof(*ranges)
+   * (dev_dax->nr_range + 1), GFP_KERNEL);
+   if (!ranges)
+   return -ENOMEM;
+
alloc = __request_region(res, start, size, dev_name(dev), 0);
-   if (!alloc)
+   if (!alloc && !dev_dax->nr_range) {
+   /*
+* If we adjusted an existing @ranges leave it alone,
+* 

Re: [PATCH v4] scsi: ufs: Cleanup completed request without interrupt notification

2020-08-02 Thread Stanley Chu
Hi Can,

On Mon, 2020-08-03 at 13:14 +0800, Can Guo wrote:
> Hi Stanley,
> 
> On 2020-08-03 11:00, Stanley Chu wrote:
> > Hi Can,
> > 
> > On Sat, 2020-08-01 at 07:17 +0800, Can Guo wrote:
> >> Hi Bart,
> >> 
> >> On 2020-08-01 00:51, Bart Van Assche wrote:
> >> > On 2020-07-31 01:00, Can Guo wrote:
> >> >> AFAIK, synchronization of scsi_done is not a problem here, because the
> >> >> scsi layer uses the atomic state, namely SCMD_STATE_COMPLETE, of a scsi
> >> >> cmd to prevent the concurrency of abort and real completion of it.
> >> >>
> >> >> Check func scsi_times_out(), hope it helps.
> >> >>
> >> >> enum blk_eh_timer_return scsi_times_out(struct request *req)
> >> >> {
> >> >> ...
> >> >>     if (rtn == BLK_EH_DONE) {
> >> >>         /*
> >> >>          * Set the command to complete first in order to prevent a real
> >> >>          * completion from releasing the command while error handling
> >> >>          * is using it. If the command was already completed, then the
> >> >>          * lower level driver beat the timeout handler, and it is safe
> >> >>          * to return without escalating error recovery.
> >> >>          *
> >> >>          * If timeout handling lost the race to a real completion, the
> >> >>          * block layer may ignore that due to a fake timeout injection,
> >> >>          * so return RESET_TIMER to allow error handling another shot
> >> >>          * at this command.
> >> >>          */
> >> >>         if (test_and_set_bit(SCMD_STATE_COMPLETE, &scmd->state))
> >> >>             return BLK_EH_RESET_TIMER;
> >> >>         if (scsi_abort_command(scmd) != SUCCESS) {
> >> >>             set_host_byte(scmd, DID_TIME_OUT);
> >> >>             scsi_eh_scmd_add(scmd);
> >> >>         }
> >> >>     }
> >> >> }
> >> >
> >> > I am familiar with this mechanism. My concern is that both the regular
> >> > completion path and the abort handler must call scsi_dma_unmap() before
> >> > calling cmd->scsi_done(cmd). I don't see how
> >> > test_and_set_bit(SCMD_STATE_COMPLETE, &scmd->state) could prevent the
> >> > regular completion path and the abort handler from calling
> >> > scsi_dma_unmap() concurrently, since both calls happen before the
> >> > SCMD_STATE_COMPLETE bit is set?
> >> >
> >> > Thanks,
> >> >
> >> > Bart.
> >> 
> >> For the scsi_dma_unmap() part, that is true - we should serialize it with
> >> any other completion paths. I found it during my fault injection test, so
> >> I've made a patch to fix it, but it only comes in my next error recovery
> >> enhancement patch series. Please check the attachment.
> >> 
> > 
> > Your patch looks good to me.
> > 
> > I had the same idea before, but I found that calling scsi_done() (via
> > __ufshcd_transfer_req_compl()) in ufshcd_abort() on old kernels (e.g.,
> > 4.14) causes issues. That has been resolved by the SCMD_STATE_COMPLETE
> > flag introduced in newer kernels, so your patch makes sense.
> > 
> > Would you mind sending out this draft patch as a formal patch together
> > with my patch to fix issues in ufshcd_abort()? Our patches aim to fix
> > cases where host/device reset is never triggered as a result of
> > ufshcd_abort(), for example, when the command is aborted successfully
> > or is not pending in the device with its doorbell also cleared.
> > 
> > Thanks,
> > Stanley Chu
> > 
> 
> I don't quite follow your fix here, and I didn't test a fault injection
> scenario similar to yours, so I am not sure if I should just absorb your
> fix into mine. How about I put my fix in my current error recovery patch
> series (maybe in the next version of it) and you can give your review. So
> you can still go with your fix as it is. Mine will be picked up later by
> Martin. What do you think?
> 

Sure, that's good to me.

Thanks,

Stanley Chu

> Thanks,
> 
> Can Guo.
> 
> >> Thanks,
> >> 
> >> Can Guo.
> >> 
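
The point of contention in this thread can be sketched in userspace C11:
an atomic test-and-set only protects work that happens after the bit is
claimed. In the sketch below, which is not the patch Can Guo attached (that
patch is not reproduced here), both the completion path and the abort path
would call finish_cmd(), and the unmap step runs only for whichever caller
wins the bit. All names are hypothetical stand-ins.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define CMD_STATE_COMPLETE 0x1u /* hypothetical stand-in for SCMD_STATE_COMPLETE */

struct cmd_sketch {
	atomic_uint state;
	int unmap_calls; /* counts how often the "unmap" step ran */
};

static void cmd_init(struct cmd_sketch *cmd)
{
	atomic_init(&cmd->state, 0);
	cmd->unmap_calls = 0;
}

/* Atomically set the bit and report whether it was already set. */
static bool test_and_set_complete(struct cmd_sketch *cmd)
{
	return atomic_fetch_or(&cmd->state, CMD_STATE_COMPLETE) &
	       CMD_STATE_COMPLETE;
}

/* Called by both paths; only the first caller performs the teardown. */
static bool finish_cmd(struct cmd_sketch *cmd)
{
	if (test_and_set_complete(cmd))
		return false; /* the other path already owns the command */

	cmd->unmap_calls++; /* stands in for scsi_dma_unmap() then scsi_done() */
	return true;
}
```

If the unmap step were moved above the test-and-set, both callers could reach
it, which is the concurrency Bart describes.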



[PATCH] i2c: stm32f7: add SMBus-Alert support

2020-08-02 Thread Alain Volmat
Add support for the SMBus-Alert protocol.

Signed-off-by: Alain Volmat 
---
 This patch has to be integrated on top of the patch
 'i2c: stm32f7: Add SMBus Host-Notify protocol support' since SMBus Alert is
 enabled by the DT binding 'smbus' introduced in that patch.

 drivers/i2c/busses/i2c-stm32f7.c | 71 
 1 file changed, 71 insertions(+)

diff --git a/drivers/i2c/busses/i2c-stm32f7.c b/drivers/i2c/busses/i2c-stm32f7.c
index 223c238c3c09..fe7641da54ef 100644
--- a/drivers/i2c/busses/i2c-stm32f7.c
+++ b/drivers/i2c/busses/i2c-stm32f7.c
@@ -51,6 +51,7 @@
 
 /* STM32F7 I2C control 1 */
 #define STM32F7_I2C_CR1_PECEN  BIT(23)
+#define STM32F7_I2C_CR1_ALERTEN BIT(22)
 #define STM32F7_I2C_CR1_SMBHEN BIT(20)
 #define STM32F7_I2C_CR1_WUPEN  BIT(18)
 #define STM32F7_I2C_CR1_SBC    BIT(16)
@@ -123,6 +124,7 @@
(((n) & STM32F7_I2C_ISR_ADDCODE_MASK) >> 17)
 #define STM32F7_I2C_ISR_DIR    BIT(16)
 #define STM32F7_I2C_ISR_BUSY   BIT(15)
+#define STM32F7_I2C_ISR_ALERT  BIT(13)
 #define STM32F7_I2C_ISR_PECERR BIT(11)
 #define STM32F7_I2C_ISR_ARLO   BIT(9)
 #define STM32F7_I2C_ISR_BERR   BIT(8)
@@ -136,6 +138,7 @@
 #define STM32F7_I2C_ISR_TXE    BIT(0)
 
 /* STM32F7 I2C Interrupt Clear */
+#define STM32F7_I2C_ICR_ALERTCF BIT(13)
 #define STM32F7_I2C_ICR_PECCF  BIT(11)
 #define STM32F7_I2C_ICR_ARLOCF BIT(9)
 #define STM32F7_I2C_ICR_BERRCF BIT(8)
@@ -277,6 +280,17 @@ struct stm32f7_i2c_msg {
 };
 
 /**
+ * struct stm32f7_i2c_alert - SMBus alert specific data
+ * @setup: platform data for the smbus_alert i2c client
+ * @ara: I2C slave device used to respond to the SMBus Alert with Alert
+ * Response Address
+ */
+struct stm32f7_i2c_alert {
+   struct i2c_smbus_alert_setup setup;
+   struct i2c_client *ara;
+};
+
+/**
  * struct stm32f7_i2c_dev - private data of the controller
  * @adap: I2C adapter for this controller
  * @dev: device for this controller
@@ -305,6 +319,7 @@ struct stm32f7_i2c_msg {
  * @wakeup_src: boolean to know if the device is a wakeup source
  * @smbus_mode: states that the controller is configured in SMBus mode
  * @host_notify_client: SMBus host-notify client
+ * @alert: SMBus alert specific data
  */
 struct stm32f7_i2c_dev {
struct i2c_adapter adap;
@@ -333,6 +348,7 @@ struct stm32f7_i2c_dev {
bool wakeup_src;
bool smbus_mode;
struct i2c_client *host_notify_client;
+   struct stm32f7_i2c_alert *alert;
 };
 
 /*
@@ -1601,6 +1617,13 @@ static irqreturn_t stm32f7_i2c_isr_error(int irq, void *data)
f7_msg->result = -EINVAL;
}
 
+   if (status & STM32F7_I2C_ISR_ALERT) {
+   dev_dbg(dev, "<%s>: SMBus alert received\n", __func__);
+   writel_relaxed(STM32F7_I2C_ICR_ALERTCF, base + STM32F7_I2C_ICR);
+   i2c_handle_smbus_alert(i2c_dev->alert->ara);
+   return IRQ_HANDLED;
+   }
+
if (!i2c_dev->slave_running) {
u32 mask;
/* Disable interrupts */
@@ -1967,6 +1990,42 @@ static void stm32f7_i2c_disable_smbus_host(struct stm32f7_i2c_dev *i2c_dev)
}
 }
 
+static int stm32f7_i2c_enable_smbus_alert(struct stm32f7_i2c_dev *i2c_dev)
+{
+   struct stm32f7_i2c_alert *alert;
+   struct i2c_adapter *adap = &i2c_dev->adap;
+   struct device *dev = i2c_dev->dev;
+   void __iomem *base = i2c_dev->base;
+
+   alert = devm_kzalloc(dev, sizeof(*alert), GFP_KERNEL);
+   if (!alert)
+   return -ENOMEM;
+
+   alert->ara = i2c_new_smbus_alert_device(adap, &alert->setup);
+   if (IS_ERR(alert->ara))
+   return PTR_ERR(alert->ara);
+
+   i2c_dev->alert = alert;
+
+   /* Enable SMBus Alert */
+   stm32f7_i2c_set_bits(base + STM32F7_I2C_CR1, STM32F7_I2C_CR1_ALERTEN);
+
+   return 0;
+}
+
+static void stm32f7_i2c_disable_smbus_alert(struct stm32f7_i2c_dev *i2c_dev)
+{
+   struct stm32f7_i2c_alert *alert = i2c_dev->alert;
+   void __iomem *base = i2c_dev->base;
+
+   if (alert) {
+   /* Disable SMBus Alert */
+   stm32f7_i2c_clr_bits(base + STM32F7_I2C_CR1,
+STM32F7_I2C_CR1_ALERTEN);
+   i2c_unregister_device(alert->ara);
+   }
+}
+
 static u32 stm32f7_i2c_func(struct i2c_adapter *adap)
 {
struct stm32f7_i2c_dev *i2c_dev = i2c_get_adapdata(adap);
@@ -2161,6 +2220,14 @@ static int stm32f7_i2c_probe(struct platform_device *pdev)
ret);
goto i2c_adapter_remove;
}
+
+   ret = stm32f7_i2c_enable_smbus_alert(i2c_dev);
+   if (ret) {
+   dev_err(i2c_dev->dev,
+   

Re: PATCH: rtsx_pci driver - don't disable the rts5229 card reader on Intel NUC boxes

2020-08-02 Thread Chris Clayton
Hi, Ricky

On 03/08/2020 04:01, 吳昊澄 Ricky wrote:
> Hi Chris,
> 
> We don’t think this is our bug...
> This register (FPDCTL) write to OC_POWER_DOWN is for our power saving
> feature, not to disable the reader.
> As we mentioned before, we cannot reproduce this on our side; we don’t have
> the platform (Intel NUC Tall Arches Canyon NUC6CAYH Celeron J345) on which
> to see this issue.
> But we think this issue may occur only on this platform; our RTS5229 works
> well with the new kernel on all the platforms that we have.
> 
> Ricky

Perhaps I should have used the word regression rather than bug. I didn't buy
the machine until earlier this year, but other people who have reported this
problem have indicated that until bede03a579b3 was applied (during the 5.1
merge window), the driver supported the card reader on the Intel NUC boxes. I
know that is true because I built and tested a 5.0 kernel and the card reader
worked fine. I've also built and tested a 5.1-rc1 kernel and the card reader
no longer works. Whether by design or by accident, the card reader worked and
now it doesn't. That's a regression in my book!

Since you signed off the patch that caused the regression, I believe it is
your bug.

Thanks.

Chris
> 
>> -Original Message-
>> From: Chris Clayton [mailto:chris2...@googlemail.com]
>> Sent: Monday, August 03, 2020 3:59 AM
>> To: LKML; 吳昊澄 Ricky; gre...@linuxfoundation.org; rdun...@infradead.org;
>> philqua...@gmail.com; Arnd Bergmann
>> Subject: Re: PATCH: rtsx_pci driver - don't disable the rts5229 card reader
>> on Intel NUC boxes
>>
>> Sorry, I should have said that the patch is against 5.7.12. It applies to
>> upstream, but with offsets.
>>
>> On 02/08/2020 20:48, Chris Clayton wrote:
>>> bede03a579b3 introduced a bug which leaves the rts5229 PCI Express card
>>> reader on my Intel NUC6CAYH box disabled.
>>>
>>> The bug is in drivers/misc/cardreader/rtsx_pcr.c. A call to
>>> rtsx_pci_init_ocp() was added to rtsx_pci_init_hw(). At the call point,
>>> pcr->ops->init_ocp is NULL and pcr->option.ocp_en is 0, so in
>>> rtsx_pci_init_ocp() the cardreader gets disabled.
>>>
>>> I've avoided this by making execution of the code that results in the
>>> reader being disabled conditional on the device not being an RTS5229. Of
>>> course, other rtsxxx card readers may also be disabled by this bug. I
>>> don't have the knowledge to address that, so I'll leave it to the driver
>>> maintainers.
>>>
>>> The patch to avoid the bug is attached.
>>>
>>> Fixes: bede03a579b3 ("misc: rtsx: Enable OCP for rts522a rts524a rts525a rts5260")
>>> Link: https://marc.info/?l=linux-kernel&m=159105912832257
>>> Link: https://bugzilla.kernel.org/show_bug.cgi?id=204003
>>> Signed-off-by: Chris Clayton 
>>>
>>> Chris
>>>
>>


[v7] dt-bindings: msm: disp: add yaml schemas for DPU and DSI bindings

2020-08-02 Thread Krishna Manikandan
MSM Mobile Display Subsystem (MDSS) encapsulates sub-blocks
like the DPU display controller, DSI, etc. Add YAML schemas
for the device tree bindings of these sub-blocks.

Signed-off-by: Krishna Manikandan 

Changes in v2:
- Changed dpu to DPU (Sam Ravnborg)
- Fixed indentation issues (Sam Ravnborg)
- Added empty line between different properties (Sam Ravnborg)
- Replaced reference txt files with their corresponding
  yaml files (Sam Ravnborg)
- Modified the file to use "|" only when it is
  necessary (Sam Ravnborg)

Changes in v3:
- Corrected the license used (Rob Herring)
- Added maxItems for properties (Rob Herring)
- Dropped generic descriptions (Rob Herring)
- Added ranges property (Rob Herring)
- Corrected the indentation (Rob Herring)
- Added additionalProperties (Rob Herring)
- Split dsi file into two, one for dsi controller
  and another one for dsi phy per target (Rob Herring)
- Corrected description for pinctrl-names (Rob Herring)
- Corrected the examples used in yaml file (Rob Herring)
- Delete dsi.txt and dpu.txt (Rob Herring)

Changes in v4:
- Move schema up by one level (Rob Herring)
- Add patternProperties for mdp node (Rob Herring)
- Corrected description of some properties (Rob Herring)

Changes in v5:
- Correct the indentation (Rob Herring)
- Remove unnecessary description from properties (Rob Herring)
- Correct the number of interconnect entries (Rob Herring)
- Add interconnect names for sc7180 (Rob Herring)
- Add description for ports (Rob Herring)
- Remove common properties (Rob Herring)
- Add unevaluatedProperties (Rob Herring)
- Reference existing dsi controller yaml in the common
  dsi controller file (Rob Herring)
- Correct the description of clock names to include only the
  clocks that are required (Rob Herring)
- Remove properties which are already covered under the common
  binding (Rob Herring)
- Add dsi phy supply nodes which are required for sc7180 and
  sdm845 targets (Rob Herring)
- Add type ref for syscon-sfpb (Rob Herring)

Changes in v6:
- Fixed errors during dt_binding_check (Rob Herring)
- Add maxItems for phys and phys-names (Rob Herring)
- Use unevaluatedProperties wherever required (Rob Herring)
- Removed interrupt controller from required properties for
  dsi controller (Rob Herring)
- Add constraints for dsi-phy reg-names based on the compatible
  phy version (Rob Herring)
- Add constraints for dsi-phy supply nodes based on the
  compatible phy version (Rob Herring)

Changes in v7:
- Add default value for qcom,mdss-mdp-transfer-time-us (Rob Herring)
- Modify the schema for data-lanes (Rob Herring)
- Split the phy schema into separate schemas based on
  the phy version (Rob Herring)
---
 .../bindings/display/msm/dpu-sc7180.yaml   | 236 +++
 .../bindings/display/msm/dpu-sdm845.yaml   | 216 ++
 .../devicetree/bindings/display/msm/dpu.txt| 141 
 .../display/msm/dsi-common-controller.yaml | 249 +
 .../display/msm/dsi-controller-sc7180.yaml | 120 ++
 .../display/msm/dsi-controller-sdm845.yaml | 120 ++
 .../bindings/display/msm/dsi-phy-10nm.yaml |  62 +
 .../bindings/display/msm/dsi-phy-14nm.yaml |  61 +
 .../bindings/display/msm/dsi-phy-20nm.yaml |  66 ++
 .../bindings/display/msm/dsi-phy-28nm.yaml |  62 +
 .../bindings/display/msm/dsi-phy-sc7180.yaml   |  80 +++
 .../bindings/display/msm/dsi-phy-sdm845.yaml   |  82 +++
 .../devicetree/bindings/display/msm/dsi.txt| 246 
 13 files changed, 1354 insertions(+), 387 deletions(-)
 create mode 100644 Documentation/devicetree/bindings/display/msm/dpu-sc7180.yaml
 create mode 100644 Documentation/devicetree/bindings/display/msm/dpu-sdm845.yaml
 delete mode 100644 Documentation/devicetree/bindings/display/msm/dpu.txt
 create mode 100644 Documentation/devicetree/bindings/display/msm/dsi-common-controller.yaml
 create mode 100644 Documentation/devicetree/bindings/display/msm/dsi-controller-sc7180.yaml
 create mode 100644 Documentation/devicetree/bindings/display/msm/dsi-controller-sdm845.yaml
 create mode 100644 Documentation/devicetree/bindings/display/msm/dsi-phy-10nm.yaml
 create mode 100644 Documentation/devicetree/bindings/display/msm/dsi-phy-14nm.yaml
 create mode 100644 Documentation/devicetree/bindings/display/msm/dsi-phy-20nm.yaml
 create mode 100644 Documentation/devicetree/bindings/display/msm/dsi-phy-28nm.yaml
 create mode 100644 Documentation/devicetree/bindings/display/msm/dsi-phy-sc7180.yaml
 create mode 100644 Documentation/devicetree/bindings/display/msm/dsi-phy-sdm845.yaml
 delete mode 100644 Documentation/devicetree/bindings/display/msm/dsi.txt

diff --git 

[PATCH v4 17/23] mm/memremap_pages: Support multiple ranges per invocation

2020-08-02 Thread Dan Williams
In support of device-dax growing the ability to front physically
dis-contiguous ranges of memory, update devm_memremap_pages() to track
multiple ranges with a single reference counter and devm instance.

Cc: Paul Mackerras 
Cc: Michael Ellerman 
Cc: Benjamin Herrenschmidt 
Cc: Dan Williams 
Cc: Vishal Verma 
Cc: Dave Jiang 
Cc: Ben Skeggs 
Cc: David Airlie 
Cc: Daniel Vetter 
Cc: Ira Weiny 
Cc: Jason Gunthorpe 
Signed-off-by: Dan Williams 
---
 arch/powerpc/kvm/book3s_hv_uvmem.c |1 
 drivers/dax/device.c   |1 
 drivers/gpu/drm/nouveau/nouveau_dmem.c |1 
 drivers/nvdimm/pfn_devs.c  |1 
 drivers/nvdimm/pmem.c  |1 
 drivers/pci/p2pdma.c   |1 
 include/linux/memremap.h   |   10 +
 lib/test_hmm.c |1 
 mm/memremap.c  |  258 +++-
 9 files changed, 165 insertions(+), 110 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_hv_uvmem.c b/arch/powerpc/kvm/book3s_hv_uvmem.c
index 29ec555055c2..84e5a2dc8be5 100644
--- a/arch/powerpc/kvm/book3s_hv_uvmem.c
+++ b/arch/powerpc/kvm/book3s_hv_uvmem.c
@@ -1172,6 +1172,7 @@ int kvmppc_uvmem_init(void)
kvmppc_uvmem_pgmap.type = MEMORY_DEVICE_PRIVATE;
kvmppc_uvmem_pgmap.range.start = res->start;
kvmppc_uvmem_pgmap.range.end = res->end;
+   kvmppc_uvmem_pgmap.nr_range = 1;
kvmppc_uvmem_pgmap.ops = &kvmppc_uvmem_ops;
/* just one global instance: */
kvmppc_uvmem_pgmap.owner = &kvmppc_uvmem_pgmap;
diff --git a/drivers/dax/device.c b/drivers/dax/device.c
index fffc54ce0911..f3755df4ae29 100644
--- a/drivers/dax/device.c
+++ b/drivers/dax/device.c
@@ -417,6 +417,7 @@ int dev_dax_probe(struct dev_dax *dev_dax)
if (!pgmap)
return -ENOMEM;
pgmap->range = *range;
+   pgmap->nr_range = 1;
}
pgmap->type = MEMORY_DEVICE_DEVDAX;
addr = devm_memremap_pages(dev, pgmap);
diff --git a/drivers/gpu/drm/nouveau/nouveau_dmem.c b/drivers/gpu/drm/nouveau/nouveau_dmem.c
index 25811ed7e274..a13c6215bba8 100644
--- a/drivers/gpu/drm/nouveau/nouveau_dmem.c
+++ b/drivers/gpu/drm/nouveau/nouveau_dmem.c
@@ -251,6 +251,7 @@ nouveau_dmem_chunk_alloc(struct nouveau_drm *drm, struct page **ppage)
chunk->pagemap.type = MEMORY_DEVICE_PRIVATE;
chunk->pagemap.range.start = res->start;
chunk->pagemap.range.end = res->end;
+   chunk->pagemap.nr_range = 1;
chunk->pagemap.ops = &nouveau_dmem_pagemap_ops;
chunk->pagemap.owner = drm->dev;
 
diff --git a/drivers/nvdimm/pfn_devs.c b/drivers/nvdimm/pfn_devs.c
index 3c4787b92a6a..b499df630d4d 100644
--- a/drivers/nvdimm/pfn_devs.c
+++ b/drivers/nvdimm/pfn_devs.c
@@ -693,6 +693,7 @@ static int __nvdimm_setup_pfn(struct nd_pfn *nd_pfn, struct dev_pagemap *pgmap)
.start = nsio->res.start + start_pad,
.end = nsio->res.end - end_trunc,
};
+   pgmap->nr_range = 1;
if (nd_pfn->mode == PFN_MODE_RAM) {
if (offset < reserve)
return -EINVAL;
diff --git a/drivers/nvdimm/pmem.c b/drivers/nvdimm/pmem.c
index 69cc0e783709..1f45af363a94 100644
--- a/drivers/nvdimm/pmem.c
+++ b/drivers/nvdimm/pmem.c
@@ -442,6 +442,7 @@ static int pmem_attach_disk(struct device *dev,
} else if (pmem_should_map_pages(dev)) {
pmem->pgmap.range.start = res->start;
pmem->pgmap.range.end = res->end;
+   pmem->pgmap.nr_range = 1;
pmem->pgmap.type = MEMORY_DEVICE_FS_DAX;
pmem->pgmap.ops = &fsdax_pagemap_ops;
addr = devm_memremap_pages(dev, &pmem->pgmap);
diff --git a/drivers/pci/p2pdma.c b/drivers/pci/p2pdma.c
index dd6b0d51a50c..403304785561 100644
--- a/drivers/pci/p2pdma.c
+++ b/drivers/pci/p2pdma.c
@@ -187,6 +187,7 @@ int pci_p2pdma_add_resource(struct pci_dev *pdev, int bar, size_t size,
pgmap = &p2p_pgmap->pgmap;
pgmap->range.start = pci_resource_start(pdev, bar) + offset;
pgmap->range.end = pgmap->range.start + size - 1;
+   pgmap->nr_range = 1;
pgmap->type = MEMORY_DEVICE_PCI_P2PDMA;
 
p2p_pgmap->provider = pdev;
diff --git a/include/linux/memremap.h b/include/linux/memremap.h
index 6c21951bdb16..4e9c738f4b31 100644
--- a/include/linux/memremap.h
+++ b/include/linux/memremap.h
@@ -95,7 +95,6 @@ struct dev_pagemap_ops {
 /**
  * struct dev_pagemap - metadata for ZONE_DEVICE mappings
  * @altmap: pre-allocated/reserved memory for vmemmap allocations
- * @range: physical address range covered by @ref
  * @ref: reference count that pins the devm_memremap_pages() mapping
  * @internal_ref: internal reference if @ref is not provided by the caller
  * @done: completion for @internal_ref
@@ -105,10 +104,12 @@ struct dev_pagemap_ops {
  * @owner: an opaque pointer identifying the entity that manages this
  * instance.  Used by various helpers to make sure 

[PATCH v4 14/23] drivers/base: Make device_find_child_by_name() compatible with sysfs inputs

2020-08-02 Thread Dan Williams
Use sysfs_streq() in device_find_child_by_name() to allow it to use a
sysfs input string that might contain a trailing newline.

The other "device by name" interfaces,
{bus,driver,class}_find_device_by_name(), already account for sysfs
strings.
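
The difference matters because a value written with "echo name > attr"
arrives with a trailing newline. The following userspace sketch, assuming the
usual sysfs_streq() semantics (two strings compare equal if they match
exactly or differ only by a single trailing newline), shows the comparison.
The function name streq_sysfs() is a hypothetical stand-in so it does not
clash with the kernel symbol.

```c
#include <stdbool.h>

/*
 * Return true if s1 and s2 are equal, treating one trailing '\n' on either
 * string as equivalent to the end of the string.
 */
static bool streq_sysfs(const char *s1, const char *s2)
{
	/* Skip the common prefix. */
	while (*s1 && *s1 == *s2) {
		s1++;
		s2++;
	}

	if (*s1 == *s2)
		return true; /* both strings ended together */
	if (!*s1 && *s2 == '\n' && !s2[1])
		return true; /* s2 has exactly one extra trailing newline */
	if (*s1 == '\n' && !s1[1] && !*s2)
		return true; /* s1 has exactly one extra trailing newline */
	return false;
}
```

A plain strcmp() of "dax0.0" against "dax0.0\n" reports a mismatch, so a
lookup by a name taken straight from a sysfs write would miss an existing
child.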

Cc: "Rafael J. Wysocki" 
Reviewed-by: Greg Kroah-Hartman 
Signed-off-by: Dan Williams 
---
 drivers/base/core.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/base/core.c b/drivers/base/core.c
index 2169c5132558..231189dd6599 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -3328,7 +3328,7 @@ struct device *device_find_child_by_name(struct device *parent,
 
klist_iter_init(&parent->p->klist_children, &i);
while ((child = next_device(&i)))
-   if (!strcmp(dev_name(child), name) && get_device(child))
+   if (sysfs_streq(dev_name(child), name) && get_device(child))
break;
klist_iter_exit(&i);
return child;



[PATCH v4 19/23] device-dax: Introduce 'mapping' devices

2020-08-02 Thread Dan Williams
In support of interrogating the physical address layout of a device with
dis-contiguous ranges, introduce a sysfs directory with 'start', 'end',
and 'page_offset' attributes. The alternative is trying to parse
/proc/iomem, and that file will not reflect the extent layout until the
device is enabled.
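
As a sketch of how the three attributes relate, assume that 'page_offset' is
the offset, in pages, at which a range begins within the device's linear
address space, and that ranges are laid out back to back in allocation order.
Under those assumptions (and a 4 KiB page size), a device page offset can be
translated to a physical address as below. The type and function names are
hypothetical and are not the driver's own.

```c
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12 /* assumption: 4 KiB pages */

/* Inclusive physical address range, mirroring the kernel's struct range. */
struct range {
	uint64_t start;
	uint64_t end;
};

struct dev_dax_range_sketch {
	unsigned long pgoff; /* first device page offset backed by this range */
	struct range range;
};

static uint64_t range_len(const struct range *range)
{
	return range->end - range->start + 1;
}

/* Give each range the page offset at which it starts within the device. */
static void assign_pgoffs(struct dev_dax_range_sketch *ranges, int nr_range)
{
	unsigned long pgoff = 0;

	for (int i = 0; i < nr_range; i++) {
		ranges[i].pgoff = pgoff;
		pgoff += range_len(&ranges[i].range) >> SKETCH_PAGE_SHIFT;
	}
}

/* Translate a device page offset to a physical address; UINT64_MAX if unmapped. */
static uint64_t pgoff_to_phys(const struct dev_dax_range_sketch *ranges,
			      int nr_range, unsigned long pgoff)
{
	for (int i = 0; i < nr_range; i++) {
		unsigned long nr_pages =
			range_len(&ranges[i].range) >> SKETCH_PAGE_SHIFT;

		if (pgoff < ranges[i].pgoff ||
		    pgoff >= ranges[i].pgoff + nr_pages)
			continue;
		return ranges[i].range.start +
		       ((uint64_t)(pgoff - ranges[i].pgoff) << SKETCH_PAGE_SHIFT);
	}
	return UINT64_MAX;
}
```

Userspace reading 'start', 'end', and 'page_offset' for every mapping
directory could rebuild the same table without consulting /proc/iomem.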

Cc: Vishal Verma 
Signed-off-by: Dan Williams 
---
 drivers/dax/bus.c |  191 +
 drivers/dax/dax-private.h |   14 +++
 2 files changed, 203 insertions(+), 2 deletions(-)

diff --git a/drivers/dax/bus.c b/drivers/dax/bus.c
index 8dd82ea9d53d..2779c65dc7c0 100644
--- a/drivers/dax/bus.c
+++ b/drivers/dax/bus.c
@@ -579,6 +579,167 @@ struct dax_region *alloc_dax_region(struct device *parent, int region_id,
 }
 EXPORT_SYMBOL_GPL(alloc_dax_region);
 
+static void dax_mapping_release(struct device *dev)
+{
+   struct dax_mapping *mapping = to_dax_mapping(dev);
+   struct dev_dax *dev_dax = to_dev_dax(dev->parent);
+
+   ida_free(&dev_dax->ida, mapping->id);
+   kfree(mapping);
+}
+
+static void unregister_dax_mapping(void *data)
+{
+   struct device *dev = data;
+   struct dax_mapping *mapping = to_dax_mapping(dev);
+   struct dev_dax *dev_dax = to_dev_dax(dev->parent);
+   struct dax_region *dax_region = dev_dax->region;
+
+   dev_dbg(dev, "%s\n", __func__);
+
+   device_lock_assert(dax_region->dev);
+
+   dev_dax->ranges[mapping->range_id].mapping = NULL;
+   mapping->range_id = -1;
+
+   device_del(dev);
+   put_device(dev);
+}
+
+static struct dev_dax_range *get_dax_range(struct device *dev)
+{
+   struct dax_mapping *mapping = to_dax_mapping(dev);
+   struct dev_dax *dev_dax = to_dev_dax(dev->parent);
+   struct dax_region *dax_region = dev_dax->region;
+
+   device_lock(dax_region->dev);
+   if (mapping->range_id < 0) {
+   device_unlock(dax_region->dev);
+   return NULL;
+   }
+
+   return &dev_dax->ranges[mapping->range_id];
+}
+
+static void put_dax_range(struct dev_dax_range *dax_range)
+{
+   struct dax_mapping *mapping = dax_range->mapping;
+   struct dev_dax *dev_dax = to_dev_dax(mapping->dev.parent);
+   struct dax_region *dax_region = dev_dax->region;
+
+   device_unlock(dax_region->dev);
+}
+
+static ssize_t start_show(struct device *dev,
+   struct device_attribute *attr, char *buf)
+{
+   struct dev_dax_range *dax_range;
+   ssize_t rc;
+
+   dax_range = get_dax_range(dev);
+   if (!dax_range)
+   return -ENXIO;
+   rc = sprintf(buf, "%#llx\n", dax_range->range.start);
+   put_dax_range(dax_range);
+
+   return rc;
+}
+static DEVICE_ATTR(start, 0400, start_show, NULL);
+
+static ssize_t end_show(struct device *dev,
+   struct device_attribute *attr, char *buf)
+{
+   struct dev_dax_range *dax_range;
+   ssize_t rc;
+
+   dax_range = get_dax_range(dev);
+   if (!dax_range)
+   return -ENXIO;
+   rc = sprintf(buf, "%#llx\n", dax_range->range.end);
+   put_dax_range(dax_range);
+
+   return rc;
+}
+static DEVICE_ATTR(end, 0400, end_show, NULL);
+
+static ssize_t pgoff_show(struct device *dev,
+   struct device_attribute *attr, char *buf)
+{
+   struct dev_dax_range *dax_range;
+   ssize_t rc;
+
+   dax_range = get_dax_range(dev);
+   if (!dax_range)
+   return -ENXIO;
+   rc = sprintf(buf, "%#lx\n", dax_range->pgoff);
+   put_dax_range(dax_range);
+
+   return rc;
+}
+static DEVICE_ATTR(page_offset, 0400, pgoff_show, NULL);
+
+static struct attribute *dax_mapping_attributes[] = {
+   &dev_attr_start.attr,
+   &dev_attr_end.attr,
+   &dev_attr_page_offset.attr,
+   NULL,
+};
+
+static const struct attribute_group dax_mapping_attribute_group = {
+   .attrs = dax_mapping_attributes,
+};
+
+static const struct attribute_group *dax_mapping_attribute_groups[] = {
+   &dax_mapping_attribute_group,
+   NULL,
+};
+
+static struct device_type dax_mapping_type = {
+   .release = dax_mapping_release,
+   .groups = dax_mapping_attribute_groups,
+};
+
+static int devm_register_dax_mapping(struct dev_dax *dev_dax, int range_id)
+{
+   struct dax_region *dax_region = dev_dax->region;
+   struct dax_mapping *mapping;
+   struct device *dev;
+   int rc;
+
+   device_lock_assert(dax_region->dev);
+
+   if (dev_WARN_ONCE(&dev_dax->dev, !dax_region->dev->driver,
+   "region disabled\n"))
+   return -ENXIO;
+
+   mapping = kzalloc(sizeof(*mapping), GFP_KERNEL);
+   if (!mapping)
+   return -ENOMEM;
+   mapping->range_id = range_id;
+   mapping->id = ida_alloc(&dev_dax->ida, GFP_KERNEL);
+   if (mapping->id < 0) {
+   kfree(mapping);
+   return -ENOMEM;
+   }
+   dev_dax->ranges[range_id].mapping = mapping;
+   dev = &mapping->dev;
+   device_initialize(dev);
+   

[PATCH v4 22/23] dax/hmem: Introduce dax_hmem.region_idle parameter

2020-08-02 Thread Dan Williams
From: Joao Martins 

Introduce a new module parameter for dax_hmem which
initializes all region devices as free, rather than allocating
a pagemap for the region by default.

All hmem devices created with dax_hmem.region_idle=1 will have full
available size for creating dynamic dax devices.

Signed-off-by: Joao Martins 
Link: https://lore.kernel.org/r/20200716172913.19658-4-joao.m.mart...@oracle.com
Signed-off-by: Dan Williams 
---
 drivers/dax/hmem/hmem.c |5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/dax/hmem/hmem.c b/drivers/dax/hmem/hmem.c
index 1a3347bb6143..1bf040dbc834 100644
--- a/drivers/dax/hmem/hmem.c
+++ b/drivers/dax/hmem/hmem.c
@@ -5,6 +5,9 @@
 #include 
 #include "../bus.h"
 
+static bool region_idle;
+module_param_named(region_idle, region_idle, bool, 0644);
+
 static int dax_hmem_probe(struct platform_device *pdev)
 {
struct device *dev = &pdev->dev;
@@ -30,7 +33,7 @@ static int dax_hmem_probe(struct platform_device *pdev)
data = (struct dev_dax_data) {
.dax_region = dax_region,
.id = -1,
-   .size = resource_size(res),
+   .size = region_idle ? 0 : resource_size(res),
};
dev_dax = devm_create_dev_dax(&data);
if (IS_ERR(dev_dax))



[PATCH v4 23/23] device-dax: Add a range mapping allocation attribute

2020-08-02 Thread Dan Williams
From: Joao Martins 

Add a sysfs attribute which denotes a range from the dax region
to be allocated. It's a write-only @mapping sysfs attribute in
the format of '<start>-<end>' to allocate a range. @start and
@end use hexadecimal values and the @pgoff is implicitly ordered
with respect to previous writes to @mapping sysfs, e.g. after a
write of a range of length 1G the pgoff is 0..1G(-4K), and a second
write will use @pgoff for 1G+4K...
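
As an illustration of the input format only, here is a minimal userspace
sketch that parses a '<start>-<end>' string of hexadecimal values and
computes the inclusive length. The names hex_range, parse_hex_range() and
hex_range_len() are hypothetical and are not part of this patch; the
kernel side uses range_parse() as shown in the diff below.

```c
#include <ctype.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical userspace mirror of an inclusive address range. */
struct hex_range {
	uint64_t start;
	uint64_t end;	/* inclusive */
};

/* Parse "<start>-<end>", both hexadecimal; a trailing newline is allowed. */
static int parse_hex_range(const char *s, struct hex_range *r)
{
	const char *dash = strchr(s, '-');
	char *stop;

	/* Both fields must be present and begin with a hex digit. */
	if (!dash || !isxdigit((unsigned char)s[0]) ||
	    !isxdigit((unsigned char)dash[1]))
		return -EINVAL;

	errno = 0;
	r->start = strtoull(s, &stop, 16);
	if (errno || stop != dash)
		return -EINVAL;

	r->end = strtoull(dash + 1, &stop, 16);
	if (errno || (*stop != '\0' && *stop != '\n'))
		return -EINVAL;

	return r->end < r->start ? -EINVAL : 0;
}

/* Length of an inclusive range. */
static uint64_t hex_range_len(const struct hex_range *r)
{
	return r->end - r->start + 1;
}
```

With this sketch, the string "100000000-13fffffff" describes a 1G range
starting at 4G.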

This range mapping interface is useful for:

 1) Applications which want to implement their own allocation logic,
 and thus pick the desired ranges from dax_region.

 2) Use cases like VMM fast restart[0] where after kexec we
 want the same gpa<->phys mappings (as originally created
 before kexec).

[0] 
https://static.sched.com/hosted_files/kvmforum2019/66/VMM-fast-restart_kvmforum2019.pdf

Signed-off-by: Joao Martins 
Link: https://lore.kernel.org/r/20200716172913.19658-5-joao.m.mart...@oracle.com
Signed-off-by: Dan Williams 
---
 drivers/dax/bus.c |   64 +
 1 file changed, 64 insertions(+)

diff --git a/drivers/dax/bus.c b/drivers/dax/bus.c
index b984213c315f..092112bba6ed 100644
--- a/drivers/dax/bus.c
+++ b/drivers/dax/bus.c
@@ -1040,6 +1040,67 @@ static ssize_t size_store(struct device *dev, struct 
device_attribute *attr,
 }
 static DEVICE_ATTR_RW(size);
 
+static ssize_t range_parse(const char *opt, size_t len, struct range *range)
+{
+   unsigned long long addr = 0;
+   char *start, *end, *str;
+   ssize_t rc = -EINVAL;
+
+   str = kstrdup(opt, GFP_KERNEL);
+   if (!str)
+   return rc;
+
+   end = str;
+   start = strsep(&end, "-");
+   if (!start || !end)
+   goto err;
+
+   rc = kstrtoull(start, 16, &addr);
+   if (rc)
+   goto err;
+   range->start = addr;
+
+   rc = kstrtoull(end, 16, &addr);
+   if (rc)
+   goto err;
+   range->end = addr;
+
+err:
+   kfree(str);
+   return rc;
+}
+
+static ssize_t mapping_store(struct device *dev, struct device_attribute *attr,
+   const char *buf, size_t len)
+{
+   struct dev_dax *dev_dax = to_dev_dax(dev);
+   struct dax_region *dax_region = dev_dax->region;
+   size_t to_alloc;
+   struct range r;
+   ssize_t rc;
+
+   rc = range_parse(buf, len, &r);
+   if (rc)
+   return rc;
+
+   rc = -ENXIO;
+   device_lock(dax_region->dev);
+   if (!dax_region->dev->driver) {
+   device_unlock(dax_region->dev);
+   return rc;
+   }
+   device_lock(dev);
+
+   to_alloc = range_len(&r);
+   if (alloc_is_aligned(dev_dax, to_alloc))
+   rc = alloc_dev_dax_range(dev_dax, r.start, to_alloc);
+   device_unlock(dev);
+   device_unlock(dax_region->dev);
+
+   return rc == 0 ? len : rc;
+}
+static DEVICE_ATTR_WO(mapping);
+
 static ssize_t align_show(struct device *dev,
struct device_attribute *attr, char *buf)
 {
@@ -1172,6 +1233,8 @@ static umode_t dev_dax_visible(struct kobject *kobj, 
struct attribute *a, int n)
return 0;
if (a == &dev_attr_numa_node.attr && !IS_ENABLED(CONFIG_NUMA))
return 0;
+   if (a == &dev_attr_mapping.attr && is_static(dax_region))
+   return 0;
if ((a == &dev_attr_align.attr ||
 a == &dev_attr_size.attr) && is_static(dax_region))
return 0444;
@@ -1181,6 +1244,7 @@ static umode_t dev_dax_visible(struct kobject *kobj, 
struct attribute *a, int n)
 static struct attribute *dev_dax_attributes[] = {
&dev_attr_modalias.attr,
&dev_attr_size.attr,
+   &dev_attr_mapping.attr,
&dev_attr_target_node.attr,
&dev_attr_align.attr,
&dev_attr_resource.attr,



[PATCH v4 21/23] device-dax: Add an 'align' attribute

2020-08-02 Thread Dan Williams
From: Joao Martins 

Introduce a device align attribute. While doing so, rename the
region align attribute so that it is explicitly named as such in
the code, but keep its sysfs name as @align to retain the API
for tools like daxctl.

Changes to align may not always be valid, for example when certain
mappings were created with 2M and we then switch to 1G. So, we
validate all ranges against the new value being attempted,
post resizing.
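
The validation described above can be sketched in plain C as follows.
This is an illustrative userspace model, not the patch's code: the
helper names size_is_aligned() and first_misaligned_range() are
hypothetical, the alignment is assumed to be a power of two, and the
kernel's additional memremap_compat_align() lower bound is omitted.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* True if size is a multiple of align; align must be a power of two. */
static bool size_is_aligned(uint64_t size, uint64_t align)
{
	return (size & (align - 1)) == 0;
}

/*
 * Return the index of the first range length that is not a multiple
 * of the candidate alignment, or -1 if every range is acceptable.
 */
static int first_misaligned_range(const uint64_t *lens, size_t nr,
				  uint64_t align)
{
	for (size_t i = 0; i < nr; i++)
		if (!size_is_aligned(lens[i], align))
			return (int)i;
	return -1;
}
```

For a device with one 1G range and one 2M range, a 2M alignment passes
while a switch to 1G is rejected because of the second range.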

Signed-off-by: Joao Martins 
Link: https://lore.kernel.org/r/20200716172913.19658-3-joao.m.mart...@oracle.com
Signed-off-by: Dan Williams 
---
 drivers/dax/bus.c |   93 -
 drivers/dax/dax-private.h |   18 +
 2 files changed, 101 insertions(+), 10 deletions(-)

diff --git a/drivers/dax/bus.c b/drivers/dax/bus.c
index 9edfdf83408e..b984213c315f 100644
--- a/drivers/dax/bus.c
+++ b/drivers/dax/bus.c
@@ -230,14 +230,15 @@ static ssize_t region_size_show(struct device *dev,
 static struct device_attribute dev_attr_region_size = __ATTR(size, 0444,
region_size_show, NULL);
 
-static ssize_t align_show(struct device *dev,
+static ssize_t region_align_show(struct device *dev,
struct device_attribute *attr, char *buf)
 {
struct dax_region *dax_region = dev_get_drvdata(dev);
 
return sprintf(buf, "%u\n", dax_region->align);
 }
-static DEVICE_ATTR_RO(align);
+static struct device_attribute dev_attr_region_align =
+   __ATTR(align, 0400, region_align_show, NULL);
 
 #define for_each_dax_region_resource(dax_region, res) \
for (res = (dax_region)->res.child; res; res = res->sibling)
@@ -488,7 +489,7 @@ static umode_t dax_region_visible(struct kobject *kobj, 
struct attribute *a,
 static struct attribute *dax_region_attributes[] = {
&dev_attr_available_size.attr,
&dev_attr_region_size.attr,
-   &dev_attr_align.attr,
+   &dev_attr_region_align.attr,
&dev_attr_create.attr,
&dev_attr_seed.attr,
&dev_attr_delete.attr,
@@ -855,15 +856,13 @@ static ssize_t size_show(struct device *dev,
return sprintf(buf, "%llu\n", size);
 }
 
-static bool alloc_is_aligned(struct dax_region *dax_region,
-   resource_size_t size)
+static bool alloc_is_aligned(struct dev_dax *dev_dax, resource_size_t size)
 {
/*
 * The minimum mapping granularity for a device instance is a
 * single subsection, unless the arch says otherwise.
 */
-   return IS_ALIGNED(size, max_t(unsigned long, dax_region->align,
-   memremap_compat_align()));
+   return IS_ALIGNED(size, max_t(unsigned long, dev_dax->align, 
memremap_compat_align()));
 }
 
 static int dev_dax_shrink(struct dev_dax *dev_dax, resource_size_t size)
@@ -958,7 +957,7 @@ static ssize_t dev_dax_resize(struct dax_region *dax_region,
return dev_dax_shrink(dev_dax, size);
 
to_alloc = size - dev_size;
-   if (dev_WARN_ONCE(dev, !alloc_is_aligned(dax_region, to_alloc),
+   if (dev_WARN_ONCE(dev, !alloc_is_aligned(dev_dax, to_alloc),
"resize of %pa misaligned\n", &to_alloc))
return -ENXIO;
 
@@ -1022,7 +1021,7 @@ static ssize_t size_store(struct device *dev, struct 
device_attribute *attr,
if (rc)
return rc;
 
-   if (!alloc_is_aligned(dax_region, val)) {
+   if (!alloc_is_aligned(dev_dax, val)) {
dev_dbg(dev, "%s: size: %lld misaligned\n", __func__, val);
return -EINVAL;
}
@@ -1041,6 +1040,78 @@ static ssize_t size_store(struct device *dev, struct 
device_attribute *attr,
 }
 static DEVICE_ATTR_RW(size);
 
+static ssize_t align_show(struct device *dev,
+   struct device_attribute *attr, char *buf)
+{
+   struct dev_dax *dev_dax = to_dev_dax(dev);
+
+   return sprintf(buf, "%d\n", dev_dax->align);
+}
+
+static ssize_t dev_dax_validate_align(struct dev_dax *dev_dax)
+{
+   resource_size_t dev_size = dev_dax_size(dev_dax);
+   struct device *dev = &dev_dax->dev;
+   int i;
+
+   if (dev_size > 0 && !alloc_is_aligned(dev_dax, dev_size)) {
+   dev_dbg(dev, "%s: align %u invalid for size %pa\n",
+   __func__, dev_dax->align, &dev_size);
+   return -EINVAL;
+   }
+
+   for (i = 0; i < dev_dax->nr_range; i++) {
+   size_t len = range_len(&dev_dax->ranges[i].range);
+
+   if (!alloc_is_aligned(dev_dax, len)) {
+   dev_dbg(dev, "%s: align %u invalid for range %d\n",
+   __func__, dev_dax->align, i);
+   return -EINVAL;
+   }
+   }
+
+   return 0;
+}
+
+static ssize_t align_store(struct device *dev, struct device_attribute *attr,
+   const char *buf, size_t len)
+{
+   struct dev_dax *dev_dax = to_dev_dax(dev);
+   struct dax_region *dax_region = dev_dax->region;
+   unsigned long val, align_save;
+   ssize_t rc;
+
+   rc = 

[PATCH v4 15/23] device-dax: Add resize support

2020-08-02 Thread Dan Williams
Make the device-dax 'size' attribute writable to allow capacity to be
split between multiple instances in a region. The intended consumers of
this capability are users that want to split a scarce memory resource
between device-dax and System-RAM access, or users that want to have
multiple security domains for a large region.

By default the hmem instance provider allocates an entire region to the
first instance. The process of creating a new instance (assuming a
region-id of 0) is to find the region and trigger the 'create' attribute,
which yields an empty instance to configure. For example:

cd /sys/bus/dax/devices
echo dax0.0 > dax0.0/driver/unbind
echo $new_size > dax0.0/size
echo 1 > $(readlink -f dax0.0)../dax_region/create
seed=$(cat $(readlink -f dax0.0)../dax_region/seed)
echo $new_size > $seed/size
echo dax0.0 > ../drivers/{device_dax,kmem}/bind
echo dax0.1 > ../drivers/{device_dax,kmem}/bind

Instances can be destroyed by:

echo $device > $(readlink -f $device)../dax_region/delete

Cc: Vishal Verma 
Signed-off-by: Dan Williams 
---
 drivers/dax/bus.c |  161 ++---
 1 file changed, 152 insertions(+), 9 deletions(-)

diff --git a/drivers/dax/bus.c b/drivers/dax/bus.c
index dce9413a4394..53d07f2f1285 100644
--- a/drivers/dax/bus.c
+++ b/drivers/dax/bus.c
@@ -6,6 +6,7 @@
 #include 
 #include 
 #include 
+#include 
 #include "dax-private.h"
 #include "bus.h"
 
@@ -562,7 +563,8 @@ struct dax_region *alloc_dax_region(struct device *parent, 
int region_id,
 }
 EXPORT_SYMBOL_GPL(alloc_dax_region);
 
-static int alloc_dev_dax_range(struct dev_dax *dev_dax, resource_size_t size)
+static int alloc_dev_dax_range(struct dev_dax *dev_dax, u64 start,
+   resource_size_t size)
 {
struct dax_region *dax_region = dev_dax->region;
struct resource *res = &dax_region->res;
@@ -580,12 +582,7 @@ static int alloc_dev_dax_range(struct dev_dax *dev_dax, 
resource_size_t size)
return 0;
}
 
-   /* TODO: handle multiple allocations per region */
-   if (res->child)
-   return -ENOMEM;
-
-   alloc = __request_region(res, res->start, size, dev_name(dev), 0);
-
+   alloc = __request_region(res, start, size, dev_name(dev), 0);
if (!alloc)
return -ENOMEM;
 
@@ -597,6 +594,29 @@ static int alloc_dev_dax_range(struct dev_dax *dev_dax, 
resource_size_t size)
return 0;
 }
 
+static int adjust_dev_dax_range(struct dev_dax *dev_dax, struct resource *res, 
resource_size_t size)
+{
+   struct dax_region *dax_region = dev_dax->region;
+   struct range *range = &dev_dax->range;
+   int rc = 0;
+
+   device_lock_assert(dax_region->dev);
+
+   if (size)
+   rc = adjust_resource(res, range->start, size);
+   else
+   __release_region(&dax_region->res, range->start, range_len(range));
+   if (rc)
+   return rc;
+
+   dev_dax->range = (struct range) {
+   .start = range->start,
+   .end = range->start + size - 1,
+   };
+
+   return 0;
+}
+
 static ssize_t size_show(struct device *dev,
struct device_attribute *attr, char *buf)
 {
@@ -605,7 +625,127 @@ static ssize_t size_show(struct device *dev,
 
return sprintf(buf, "%llu\n", size);
 }
-static DEVICE_ATTR_RO(size);
+
+static bool alloc_is_aligned(struct dax_region *dax_region,
+   resource_size_t size)
+{
+   /*
+* The minimum mapping granularity for a device instance is a
+* single subsection, unless the arch says otherwise.
+*/
+   return IS_ALIGNED(size, max_t(unsigned long, dax_region->align,
+   memremap_compat_align()));
+}
+
+static int dev_dax_shrink(struct dev_dax *dev_dax, resource_size_t size)
+{
+   struct dax_region *dax_region = dev_dax->region;
+   struct range *range = &dev_dax->range;
+   struct resource *res, *adjust = NULL;
+   struct device *dev = &dev_dax->dev;
+
+   for_each_dax_region_resource(dax_region, res)
+   if (strcmp(res->name, dev_name(dev)) == 0
+   && res->start == range->start) {
+   adjust = res;
+   break;
+   }
+
+   if (dev_WARN_ONCE(dev, !adjust, "failed to find matching resource\n"))
+   return -ENXIO;
+   return adjust_dev_dax_range(dev_dax, adjust, size);
+}
+
+static ssize_t dev_dax_resize(struct dax_region *dax_region,
+   struct dev_dax *dev_dax, resource_size_t size)
+{
+   resource_size_t avail = dax_region_avail_size(dax_region), to_alloc;
+   resource_size_t dev_size = range_len(&dev_dax->range);
+   struct resource *region_res = &dax_region->res;
+   struct device *dev = &dev_dax->dev;
+   const char *name = dev_name(dev);
+   struct resource *res, *first;
+
+   if (dev->driver)
+   return -EBUSY;
+   if 

[PATCH v4 20/23] device-dax: Make align a per-device property

2020-08-02 Thread Dan Williams
From: Joao Martins 

Introduce @align to struct dev_dax.

When creating a new device, we still initialize to the default
dax_region @align. Child devices belonging to a region may wish
to keep a different alignment property instead of a global
region-defined one.

Signed-off-by: Joao Martins 
Link: https://lore.kernel.org/r/20200716172913.19658-2-joao.m.mart...@oracle.com
Signed-off-by: Dan Williams 
---
 drivers/dax/bus.c |1 +
 drivers/dax/dax-private.h |3 +++
 drivers/dax/device.c  |   37 +++--
 3 files changed, 19 insertions(+), 22 deletions(-)

diff --git a/drivers/dax/bus.c b/drivers/dax/bus.c
index 2779c65dc7c0..9edfdf83408e 100644
--- a/drivers/dax/bus.c
+++ b/drivers/dax/bus.c
@@ -1215,6 +1215,7 @@ struct dev_dax *devm_create_dev_dax(struct dev_dax_data 
*data)
 
dev_dax->dax_dev = dax_dev;
dev_dax->target_node = dax_region->target_node;
+   dev_dax->align = dax_region->align;
ida_init(&dev_dax->ida);
kref_get(&dax_region->kref);
 
diff --git a/drivers/dax/dax-private.h b/drivers/dax/dax-private.h
index 13780f62b95e..5fd3a26cfcea 100644
--- a/drivers/dax/dax-private.h
+++ b/drivers/dax/dax-private.h
@@ -62,6 +62,7 @@ struct dax_mapping {
 struct dev_dax {
struct dax_region *region;
struct dax_device *dax_dev;
+   unsigned int align;
int target_node;
int id;
struct ida ida;
@@ -84,4 +85,6 @@ static inline struct dax_mapping *to_dax_mapping(struct 
device *dev)
 {
return container_of(dev, struct dax_mapping, dev);
 }
+
+phys_addr_t dax_pgoff_to_phys(struct dev_dax *dev_dax, pgoff_t pgoff, unsigned 
long size);
 #endif
diff --git a/drivers/dax/device.c b/drivers/dax/device.c
index 2bfc5c83e3b0..d2b1892cb1b2 100644
--- a/drivers/dax/device.c
+++ b/drivers/dax/device.c
@@ -17,7 +17,6 @@
 static int check_vma(struct dev_dax *dev_dax, struct vm_area_struct *vma,
const char *func)
 {
-   struct dax_region *dax_region = dev_dax->region;
struct device *dev = &dev_dax->dev;
unsigned long mask;
 
@@ -32,7 +31,7 @@ static int check_vma(struct dev_dax *dev_dax, struct 
vm_area_struct *vma,
return -EINVAL;
}
 
-   mask = dax_region->align - 1;
+   mask = dev_dax->align - 1;
if (vma->vm_start & mask || vma->vm_end & mask) {
dev_info_ratelimited(dev,
"%s: %s: fail, unaligned vma (%#lx - %#lx, 
%#lx)\n",
@@ -78,21 +77,19 @@ static vm_fault_t __dev_dax_pte_fault(struct dev_dax 
*dev_dax,
struct vm_fault *vmf, pfn_t *pfn)
 {
struct device *dev = &dev_dax->dev;
-   struct dax_region *dax_region;
phys_addr_t phys;
unsigned int fault_size = PAGE_SIZE;
 
if (check_vma(dev_dax, vmf->vma, __func__))
return VM_FAULT_SIGBUS;
 
-   dax_region = dev_dax->region;
-   if (dax_region->align > PAGE_SIZE) {
+   if (dev_dax->align > PAGE_SIZE) {
dev_dbg(dev, "alignment (%#x) > fault size (%#x)\n",
-   dax_region->align, fault_size);
+   dev_dax->align, fault_size);
return VM_FAULT_SIGBUS;
}
 
-   if (fault_size != dax_region->align)
+   if (fault_size != dev_dax->align)
return VM_FAULT_SIGBUS;
 
phys = dax_pgoff_to_phys(dev_dax, vmf->pgoff, PAGE_SIZE);
@@ -120,15 +117,15 @@ static vm_fault_t __dev_dax_pmd_fault(struct dev_dax 
*dev_dax,
return VM_FAULT_SIGBUS;
 
dax_region = dev_dax->region;
-   if (dax_region->align > PMD_SIZE) {
+   if (dev_dax->align > PMD_SIZE) {
dev_dbg(dev, "alignment (%#x) > fault size (%#x)\n",
-   dax_region->align, fault_size);
+   dev_dax->align, fault_size);
return VM_FAULT_SIGBUS;
}
 
-   if (fault_size < dax_region->align)
+   if (fault_size < dev_dax->align)
return VM_FAULT_SIGBUS;
-   else if (fault_size > dax_region->align)
+   else if (fault_size > dev_dax->align)
return VM_FAULT_FALLBACK;
 
/* if we are outside of the VMA */
@@ -164,15 +161,15 @@ static vm_fault_t __dev_dax_pud_fault(struct dev_dax 
*dev_dax,
return VM_FAULT_SIGBUS;
 
dax_region = dev_dax->region;
-   if (dax_region->align > PUD_SIZE) {
+   if (dev_dax->align > PUD_SIZE) {
dev_dbg(dev, "alignment (%#x) > fault size (%#x)\n",
-   dax_region->align, fault_size);
+   dev_dax->align, fault_size);
return VM_FAULT_SIGBUS;
}
 
-   if (fault_size < dax_region->align)
+   if (fault_size < dev_dax->align)
return VM_FAULT_SIGBUS;
-   else if (fault_size > dax_region->align)
+   else if (fault_size > dev_dax->align)
return VM_FAULT_FALLBACK;
 
/* if we are outside 

[PATCH v4 11/23] device-dax: Kill dax_kmem_res

2020-08-02 Thread Dan Williams
Several related issues around this unneeded attribute:

- The dax_kmem_res property allows the kmem driver to stash the adjusted
  resource range that was used for the hotplug operation, but that can be
  recalculated from the original base range.

- kmem is using an open coded release_resource() + kfree() when an
  idiomatic release_mem_region() is sufficient.

- The driver managed resource need only manage the busy flag. Other flags
  are of no concern to the kmem driver. In fact, if kmem inherits some
  memory range that add_memory_driver_managed() rejects, that is a
  memory-hotplug-core policy that the driver is in no position to
  override.

- The implementation trusts that a failed remove_memory() results in the
  entire resource range remaining pinned busy. The driver need not make
  that layering-violation assumption and can just maintain the busy state
  in its local resource.

- The "Hot-remove not yet implemented." comment is stale since hotremove
  support is now included.
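
To illustrate why the adjusted range need not be stashed, here is a
minimal userspace sketch that recomputes a block-aligned range from the
base range on demand. The names mem_range and block_align() are
hypothetical stand-ins, and the block size is passed in as a parameter
rather than queried from the kernel; the patch's own helper is
dax_kmem_range() in the diff below.

```c
#include <stdint.h>

/* Hypothetical inclusive address range, modelled on struct range. */
struct mem_range {
	uint64_t start;
	uint64_t end;	/* inclusive */
};

/* Round x up to a multiple of a; a must be a power of two. */
static uint64_t align_up(uint64_t x, uint64_t a)
{
	return (x + a - 1) & ~(a - 1);
}

/* Round x down to a multiple of a; a must be a power of two. */
static uint64_t align_down(uint64_t x, uint64_t a)
{
	return x & ~(a - 1);
}

/* Shrink base so that it covers only whole blocks of the given size. */
static struct mem_range block_align(struct mem_range base, uint64_t block)
{
	struct mem_range r;

	r.start = align_up(base.start, block);
	r.end = align_down(base.end + 1, block) - 1;
	return r;
}
```

Because the result depends only on the base range and the block size,
probe and remove can each derive it independently. Note that in this
sketch a base range smaller than one block yields end < start, which a
caller would have to treat as empty.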

Cc: David Hildenbrand 
Cc: Vishal Verma 
Cc: Dave Hansen 
Cc: Pavel Tatashin 
Signed-off-by: Dan Williams 
---
 drivers/dax/dax-private.h |3 -
 drivers/dax/kmem.c|  123 +
 2 files changed, 58 insertions(+), 68 deletions(-)

diff --git a/drivers/dax/dax-private.h b/drivers/dax/dax-private.h
index 6779f683671d..12a2dbc43b40 100644
--- a/drivers/dax/dax-private.h
+++ b/drivers/dax/dax-private.h
@@ -42,8 +42,6 @@ struct dax_region {
  * @dev - device core
  * @pgmap - pgmap for memmap setup / lifetime (driver owned)
  * @range: resource range for the instance
- * @dax_mem_res: physical address range of hotadded DAX memory
- * @dax_mem_name: name for hotadded DAX memory via add_memory_driver_managed()
  */
 struct dev_dax {
struct dax_region *region;
@@ -52,7 +50,6 @@ struct dev_dax {
struct device dev;
struct dev_pagemap *pgmap;
struct range range;
-   struct resource *dax_kmem_res;
 };
 
 static inline u64 range_len(struct range *range)
diff --git a/drivers/dax/kmem.c b/drivers/dax/kmem.c
index 5bb133df147d..77e25361fbeb 100644
--- a/drivers/dax/kmem.c
+++ b/drivers/dax/kmem.c
@@ -19,16 +19,24 @@ static const char *kmem_name;
 /* Set if any memory will remain added when the driver will be unloaded. */
 static bool any_hotremove_failed;
 
+static struct range dax_kmem_range(struct dev_dax *dev_dax)
+{
+   struct range range;
+
+   /* memory-block align the hotplug range */
+   range.start = ALIGN(dev_dax->range.start, memory_block_size_bytes());
+   range.end = ALIGN_DOWN(dev_dax->range.end + 1,
+   memory_block_size_bytes()) - 1;
+   return range;
+}
+
 int dev_dax_kmem_probe(struct device *dev)
 {
struct dev_dax *dev_dax = to_dev_dax(dev);
-   struct range *range = &dev_dax->range;
-   resource_size_t kmem_start;
-   resource_size_t kmem_size;
-   resource_size_t kmem_end;
-   struct resource *new_res;
-   const char *new_res_name;
-   int numa_node;
+   struct range range = dax_kmem_range(dev_dax);
+   int numa_node = dev_dax->target_node;
+   struct resource *res;
+   char *res_name;
int rc;
 
/*
@@ -37,109 +45,94 @@ int dev_dax_kmem_probe(struct device *dev)
 * could be mixed in a node with faster memory, causing
 * unavoidable performance issues.
 */
-   numa_node = dev_dax->target_node;
if (numa_node < 0) {
dev_warn(dev, "rejecting DAX region with invalid node: %d\n",
numa_node);
return -EINVAL;
}
 
-   /* Hotplug starting at the beginning of the next block: */
-   kmem_start = ALIGN(range->start, memory_block_size_bytes());
-
-   kmem_size = range_len(range);
-   /* Adjust the size down to compensate for moving up kmem_start: */
-   kmem_size -= kmem_start - range->start;
-   /* Align the size down to cover only complete blocks: */
-   kmem_size &= ~(memory_block_size_bytes() - 1);
-   kmem_end = kmem_start + kmem_size;
-
-   new_res_name = kstrdup(dev_name(dev), GFP_KERNEL);
-   if (!new_res_name)
+   res_name = kstrdup(dev_name(dev), GFP_KERNEL);
+   if (!res_name)
return -ENOMEM;
 
-   /* Region is permanently reserved if hotremove fails. */
-   new_res = request_mem_region(kmem_start, kmem_size, new_res_name);
-   if (!new_res) {
-   dev_warn(dev, "could not reserve region [%pa-%pa]\n",
-   &kmem_start, &kmem_end);
-   kfree(new_res_name);
+   res = request_mem_region(range.start, range_len(&range), res_name);
+   if (!res) {
+   dev_warn(dev, "could not reserve region [%#llx-%#llx]\n",
+   range.start, range.end);
+   kfree(res_name);
return -EBUSY;
}
 
/*
-* Set flags appropriate for System RAM.  Leave ..._BUSY clear
-

[PATCH v4 03/23] efi/fake_mem: Arrange for a resource entry per efi_fake_mem instance

2020-08-02 Thread Dan Williams
In preparation for attaching a platform device per iomem resource, teach
the efi_fake_mem code to create an e820 entry per instance. Similar to
E820_TYPE_PRAM, bypass merging resources when the e820 map is sanitized.
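
The effect of a no-merge type can be sketched with a simplified model.
This is not the kernel's change-point algorithm: it assumes the entries
are already sorted and non-overlapping, and the names mem_type,
mem_entry, type_nomerge() and coalesce() are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the e820 types named above. */
enum mem_type { MEM_RAM = 1, MEM_PRAM, MEM_SOFT_RESERVED };

struct mem_entry {
	uint64_t addr;
	uint64_t size;
	enum mem_type type;
};

/* Types whose adjacent entries must stay distinct. */
static bool type_nomerge(enum mem_type t)
{
	return t == MEM_PRAM || t == MEM_SOFT_RESERVED;
}

/*
 * Coalesce contiguous entries of the same type in place, except for
 * no-merge types. Entries must be sorted by addr and non-overlapping.
 * Returns the new number of entries.
 */
static size_t coalesce(struct mem_entry *e, size_t nr)
{
	size_t out = 0;

	for (size_t i = 0; i < nr; i++) {
		if (out > 0 &&
		    e[out - 1].type == e[i].type &&
		    !type_nomerge(e[i].type) &&
		    e[out - 1].addr + e[out - 1].size == e[i].addr) {
			e[out - 1].size += e[i].size;
			continue;
		}
		e[out++] = e[i];
	}
	return out;
}
```

In this model two adjacent RAM entries collapse into one, while two
adjacent soft-reserved entries remain separate and can each back their
own resource.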

Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: Borislav Petkov 
Cc: "H. Peter Anvin" 
Cc: x...@kernel.org
Acked-by: Ard Biesheuvel 
Signed-off-by: Dan Williams 
---
 arch/x86/kernel/e820.c  |   16 +++-
 drivers/firmware/efi/x86_fake_mem.c |   12 +---
 2 files changed, 24 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kernel/e820.c b/arch/x86/kernel/e820.c
index 983cd53ed4c9..22aad412f965 100644
--- a/arch/x86/kernel/e820.c
+++ b/arch/x86/kernel/e820.c
@@ -305,6 +305,20 @@ static int __init cpcompare(const void *a, const void *b)
return (ap->addr != ap->entry->addr) - (bp->addr != bp->entry->addr);
 }
 
+static bool e820_nomerge(enum e820_type type)
+{
+   /*
+* These types may indicate distinct platform ranges aligned to
+* numa node, protection domain, performance domain, or other
+* boundaries. Do not merge them.
+*/
+   if (type == E820_TYPE_PRAM)
+   return true;
+   if (type == E820_TYPE_SOFT_RESERVED)
+   return true;
+   return false;
+}
+
 int __init e820__update_table(struct e820_table *table)
 {
struct e820_entry *entries = table->entries;
@@ -380,7 +394,7 @@ int __init e820__update_table(struct e820_table *table)
}
 
/* Continue building up new map based on this information: */
-   if (current_type != last_type || current_type == 
E820_TYPE_PRAM) {
+   if (current_type != last_type || e820_nomerge(current_type)) {
if (last_type != 0)  {
new_entries[new_nr_entries].size = 
change_point[chg_idx]->addr - last_addr;
/* Move forward only if the new size was 
non-zero: */
diff --git a/drivers/firmware/efi/x86_fake_mem.c 
b/drivers/firmware/efi/x86_fake_mem.c
index e5d6d5a1b240..0bafcc1bb0f6 100644
--- a/drivers/firmware/efi/x86_fake_mem.c
+++ b/drivers/firmware/efi/x86_fake_mem.c
@@ -38,7 +38,7 @@ void __init efi_fake_memmap_early(void)
m_start = mem->range.start;
m_end = mem->range.end;
for_each_efi_memory_desc(md) {
-   u64 start, end;
+   u64 start, end, size;
 
if (md->type != EFI_CONVENTIONAL_MEMORY)
continue;
@@ -58,11 +58,17 @@ void __init efi_fake_memmap_early(void)
 */
start = max(start, m_start);
end = min(end, m_end);
+   size = end - start + 1;
 
if (end <= start)
continue;
-   e820__range_update(start, end - start + 1, 
E820_TYPE_RAM,
-   E820_TYPE_SOFT_RESERVED);
+
+   /*
+* Ensure each efi_fake_mem instance results in
+* a unique e820 resource
+*/
+   e820__range_remove(start, size, E820_TYPE_RAM, 1);
+   e820__range_add(start, size, E820_TYPE_SOFT_RESERVED);
e820__update_table(e820_table);
}
}



[PATCH v4 13/23] device-dax: Introduce 'seed' devices

2020-08-02 Thread Dan Williams
Add a seed device concept for dynamic dax regions to be able to split
the region amongst multiple sub-instances. The seed device, similar to
libnvdimm seed devices, is a device that starts with zero capacity
allocated and unbound to a driver. In contrast to libnvdimm seed devices,
explicit 'create' and 'delete' interfaces are added to the region to
trigger seeds to be created and unused devices to be reclaimed. The
explicit create and delete replace the implicit create as a side effect
of probe and the implicit delete when writing 0 to the size that
libnvdimm implements.

Delete can be performed on any 0-sized and idle device.  This avoids the
gymnastics of needing to move device_unregister() to its own async
context.  Specifically, it avoids the deadlock of deleting a device via
one of its own attributes. It is also less surprising to userspace which
never sees an extra device it did not request.

For now just add the device creation, teardown, and ->probe()
prevention. A later patch will arrange for the 'dax/size' attribute to
be writable to allocate capacity from the region.

Cc: Vishal Verma 
Signed-off-by: Dan Williams 
---
 drivers/dax/bus.c |  317 -
 drivers/dax/bus.h |4 -
 drivers/dax/dax-private.h |9 +
 drivers/dax/device.c  |   12 +-
 drivers/dax/hmem/hmem.c   |2 
 drivers/dax/kmem.c|   14 +-
 drivers/dax/pmem/compat.c |2 
 7 files changed, 304 insertions(+), 56 deletions(-)

diff --git a/drivers/dax/bus.c b/drivers/dax/bus.c
index 0a48ce378686..dce9413a4394 100644
--- a/drivers/dax/bus.c
+++ b/drivers/dax/bus.c
@@ -135,10 +135,46 @@ static bool is_static(struct dax_region *dax_region)
return (dax_region->res.flags & IORESOURCE_DAX_STATIC) != 0;
 }
 
+static int dax_bus_probe(struct device *dev)
+{
+   struct dax_device_driver *dax_drv = to_dax_drv(dev->driver);
+   struct dev_dax *dev_dax = to_dev_dax(dev);
+   struct dax_region *dax_region = dev_dax->region;
+   struct range *range = &dev_dax->range;
+   int rc;
+
+   if (range_len(range) == 0 || dev_dax->id < 0)
+   return -ENXIO;
+
+   rc = dax_drv->probe(dev_dax);
+
+   if (rc || is_static(dax_region))
+   return rc;
+
+   /*
+* Track new seed creation only after successful probe of the
+* previous seed.
+*/
+   if (dax_region->seed == dev)
+   dax_region->seed = NULL;
+
+   return 0;
+}
+
+static int dax_bus_remove(struct device *dev)
+{
+   struct dax_device_driver *dax_drv = to_dax_drv(dev->driver);
+   struct dev_dax *dev_dax = to_dev_dax(dev);
+
+   return dax_drv->remove(dev_dax);
+}
+
 static struct bus_type dax_bus_type = {
.name = "dax",
.uevent = dax_bus_uevent,
.match = dax_bus_match,
+   .probe = dax_bus_probe,
+   .remove = dax_bus_remove,
.drv_groups = dax_drv_groups,
 };
 
@@ -219,14 +255,216 @@ static ssize_t available_size_show(struct device *dev,
 }
 static DEVICE_ATTR_RO(available_size);
 
+static ssize_t seed_show(struct device *dev,
+   struct device_attribute *attr, char *buf)
+{
+   struct dax_region *dax_region = dev_get_drvdata(dev);
+   struct device *seed;
+   ssize_t rc;
+
+   if (is_static(dax_region))
+   return -EINVAL;
+
+   device_lock(dev);
+   seed = dax_region->seed;
+   rc = sprintf(buf, "%s\n", seed ? dev_name(seed) : "");
+   device_unlock(dev);
+
+   return rc;
+}
+static DEVICE_ATTR_RO(seed);
+
+static ssize_t create_show(struct device *dev,
+   struct device_attribute *attr, char *buf)
+{
+   struct dax_region *dax_region = dev_get_drvdata(dev);
+   struct device *youngest;
+   ssize_t rc;
+
+   if (is_static(dax_region))
+   return -EINVAL;
+
+   device_lock(dev);
+   youngest = dax_region->youngest;
+   rc = sprintf(buf, "%s\n", youngest ? dev_name(youngest) : "");
+   device_unlock(dev);
+
+   return rc;
+}
+
+static ssize_t create_store(struct device *dev, struct device_attribute *attr,
+   const char *buf, size_t len)
+{
+   struct dax_region *dax_region = dev_get_drvdata(dev);
+   unsigned long long avail;
+   ssize_t rc;
+   int val;
+
+   if (is_static(dax_region))
+   return -EINVAL;
+
+   rc = kstrtoint(buf, 0, &val);
+   if (rc)
+   return rc;
+   if (val != 1)
+   return -EINVAL;
+
+   device_lock(dev);
+   avail = dax_region_avail_size(dax_region);
+   if (avail == 0)
+   rc = -ENOSPC;
+   else {
+   struct dev_dax_data data = {
+   .dax_region = dax_region,
+   .size = 0,
+   .id = -1,
+   };
+   struct dev_dax *dev_dax = devm_create_dev_dax(&data);
+
+   if (IS_ERR(dev_dax))
+   rc = PTR_ERR(dev_dax);
+   

[PATCH v4 10/23] device-dax: Make pgmap optional for instance creation

2020-08-02 Thread Dan Williams
The passed in dev_pagemap is only required in the pmem case as the
libnvdimm core may have reserved a vmem_altmap for dev_memremap_pages()
to place the memmap in pmem directly. In the hmem case there is no
agent reserving an altmap so it can all be handled by a core internal
default.

Pass the resource range via a new @range property of 'struct
dev_dax_data'.

Cc: Vishal Verma 
Signed-off-by: Dan Williams 
---
 drivers/dax/bus.c  |   29 +++--
 drivers/dax/bus.h  |2 ++
 drivers/dax/dax-private.h  |9 -
 drivers/dax/device.c   |   28 +++-
 drivers/dax/hmem/hmem.c|8 
 drivers/dax/kmem.c |   12 ++--
 drivers/dax/pmem/core.c|4 
 tools/testing/nvdimm/dax-dev.c |8 
 8 files changed, 62 insertions(+), 38 deletions(-)

diff --git a/drivers/dax/bus.c b/drivers/dax/bus.c
index dffa4655e128..96bd64ba95a5 100644
--- a/drivers/dax/bus.c
+++ b/drivers/dax/bus.c
@@ -271,7 +271,7 @@ static ssize_t size_show(struct device *dev,
struct device_attribute *attr, char *buf)
 {
struct dev_dax *dev_dax = to_dev_dax(dev);
-   unsigned long long size = resource_size(&dev_dax->region->res);
+   unsigned long long size = range_len(&dev_dax->range);
 
return sprintf(buf, "%llu\n", size);
 }
@@ -293,19 +293,12 @@ static ssize_t target_node_show(struct device *dev,
 }
 static DEVICE_ATTR_RO(target_node);
 
-static unsigned long long dev_dax_resource(struct dev_dax *dev_dax)
-{
-   struct dax_region *dax_region = dev_dax->region;
-
-   return dax_region->res.start;
-}
-
 static ssize_t resource_show(struct device *dev,
struct device_attribute *attr, char *buf)
 {
struct dev_dax *dev_dax = to_dev_dax(dev);
 
-   return sprintf(buf, "%#llx\n", dev_dax_resource(dev_dax));
+   return sprintf(buf, "%#llx\n", dev_dax->range.start);
 }
 static DEVICE_ATTR(resource, 0400, resource_show, NULL);
 
@@ -376,6 +369,7 @@ static void dev_dax_release(struct device *dev)
 
dax_region_put(dax_region);
put_dax(dax_dev);
+   kfree(dev_dax->pgmap);
kfree(dev_dax);
 }
 
@@ -412,7 +406,12 @@ struct dev_dax *devm_create_dev_dax(struct dev_dax_data *data)
if (!dev_dax)
return ERR_PTR(-ENOMEM);
 
-   memcpy(&dev_dax->pgmap, data->pgmap, sizeof(struct dev_pagemap));
+   if (data->pgmap) {
+   dev_dax->pgmap = kmemdup(data->pgmap,
+   sizeof(struct dev_pagemap), GFP_KERNEL);
+   if (!dev_dax->pgmap)
+   goto err_pgmap;
+   }
 
/*
 * No 'host' or dax_operations since there is no access to this
@@ -421,18 +420,19 @@ struct dev_dax *devm_create_dev_dax(struct dev_dax_data *data)
dax_dev = alloc_dax(dev_dax, NULL, NULL, DAXDEV_F_SYNC);
if (IS_ERR(dax_dev)) {
rc = PTR_ERR(dax_dev);
-   goto err;
+   goto err_alloc_dax;
}
 
/* a device_dax instance is dead while the driver is not attached */
kill_dax(dax_dev);
 
-   /* from here on we're committed to teardown via dax_dev_release() */
+   /* from here on we're committed to teardown via dev_dax_release() */
dev = &dev_dax->dev;
device_initialize(dev);
 
dev_dax->dax_dev = dax_dev;
dev_dax->region = dax_region;
+   dev_dax->range = data->range;
dev_dax->target_node = dax_region->target_node;
kref_get(&dax_region->kref);
 
@@ -458,8 +458,9 @@ struct dev_dax *devm_create_dev_dax(struct dev_dax_data *data)
return ERR_PTR(rc);
 
return dev_dax;
-
- err:
+err_alloc_dax:
+   kfree(dev_dax->pgmap);
+err_pgmap:
kfree(dev_dax);
 
return ERR_PTR(rc);
diff --git a/drivers/dax/bus.h b/drivers/dax/bus.h
index 299c2e7fac09..4aeb36da83a4 100644
--- a/drivers/dax/bus.h
+++ b/drivers/dax/bus.h
@@ -3,6 +3,7 @@
 #ifndef __DAX_BUS_H__
 #define __DAX_BUS_H__
 #include 
+#include 
 
 struct dev_dax;
 struct resource;
@@ -21,6 +22,7 @@ struct dev_dax_data {
struct dax_region *dax_region;
struct dev_pagemap *pgmap;
enum dev_dax_subsys subsys;
+   struct range range;
int id;
 };
 
diff --git a/drivers/dax/dax-private.h b/drivers/dax/dax-private.h
index 8a4c40ccd2ef..6779f683671d 100644
--- a/drivers/dax/dax-private.h
+++ b/drivers/dax/dax-private.h
@@ -41,6 +41,7 @@ struct dax_region {
  * @target_node: effective numa node if dev_dax memory range is onlined
  * @dev - device core
  * @pgmap - pgmap for memmap setup / lifetime (driver owned)
+ * @range: resource range for the instance
  * @dax_mem_res: physical address range of hotadded DAX memory
  * @dax_mem_name: name for hotadded DAX memory via add_memory_driver_managed()
  */
@@ -49,10 +50,16 @@ struct dev_dax {
struct dax_device *dax_dev;
int target_node;
struct device dev;
- 
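
The optional copy and the split error labels in devm_create_dev_dax() can be sketched in user space as follows. This is only an illustration under stated assumptions: malloc()/free() stand in for kmemdup()/kfree(), and `struct pagemap`, `struct instance`, `instance_create()` and `instance_destroy()` are hypothetical names, not the driver's types. A negative id stands in for "a later setup step failed".

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for struct dev_pagemap and struct dev_dax. */
struct pagemap {
	unsigned long long start;
	unsigned long long end;
};

struct instance {
	struct pagemap *pgmap;	/* NULL when the caller supplied none */
	int id;
};

/*
 * Create an instance, duplicating @pgmap only when one is passed in.
 * Returns NULL on failure; each label undoes exactly the steps that
 * had succeeded before the jump.
 */
static struct instance *instance_create(const struct pagemap *pgmap, int id)
{
	struct instance *inst = calloc(1, sizeof(*inst));

	if (!inst)
		return NULL;

	if (pgmap) {
		inst->pgmap = malloc(sizeof(*pgmap));
		if (!inst->pgmap)
			goto err_pgmap;
		memcpy(inst->pgmap, pgmap, sizeof(*pgmap));
	}

	if (id < 0)	/* stands in for a later setup step failing */
		goto err_setup;

	inst->id = id;
	return inst;

err_setup:
	free(inst->pgmap);	/* free(NULL) is a no-op */
err_pgmap:
	free(inst);
	return NULL;
}

/* Release path: the private copy is freed together with the instance. */
static void instance_destroy(struct instance *inst)
{
	if (!inst)
		return;
	free(inst->pgmap);
	free(inst);
}
```

Because the instance owns a private copy, the caller's pagemap may live on the stack, which is what lets callers without an altmap pass NULL instead.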

[PATCH v3 2/2] i2c: stm32f7: Add SMBus Host-Notify protocol support

2020-08-02 Thread Alain Volmat
Rely on the core functions to implement the host-notify
protocol via an I2C slave device.

Signed-off-by: Alain Volmat 
---
 v3: identical to v2
 v2: fix slot #0 usage condition within stm32f7_i2c_get_free_slave_id

 drivers/i2c/busses/Kconfig   |   1 +
 drivers/i2c/busses/i2c-stm32f7.c | 110 +--
 2 files changed, 96 insertions(+), 15 deletions(-)

diff --git a/drivers/i2c/busses/Kconfig b/drivers/i2c/busses/Kconfig
index 735bf31a3fdf..ae8671727a4c 100644
--- a/drivers/i2c/busses/Kconfig
+++ b/drivers/i2c/busses/Kconfig
@@ -1036,6 +1036,7 @@ config I2C_STM32F7
tristate "STMicroelectronics STM32F7 I2C support"
depends on ARCH_STM32 || COMPILE_TEST
select I2C_SLAVE
+   select I2C_SMBUS
help
  Enable this option to add support for STM32 I2C controller embedded
  in STM32F7 SoCs.
diff --git a/drivers/i2c/busses/i2c-stm32f7.c b/drivers/i2c/busses/i2c-stm32f7.c
index bff3479fe122..223c238c3c09 100644
--- a/drivers/i2c/busses/i2c-stm32f7.c
+++ b/drivers/i2c/busses/i2c-stm32f7.c
@@ -18,6 +18,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -50,6 +51,7 @@
 
 /* STM32F7 I2C control 1 */
 #define STM32F7_I2C_CR1_PECEN  BIT(23)
+#define STM32F7_I2C_CR1_SMBHEN BIT(20)
 #define STM32F7_I2C_CR1_WUPEN  BIT(18)
 #define STM32F7_I2C_CR1_SBCBIT(16)
 #define STM32F7_I2C_CR1_RXDMAENBIT(15)
@@ -150,7 +152,7 @@
 
 #define STM32F7_I2C_MAX_LEN0xff
 #define STM32F7_I2C_DMA_LEN_MIN0x16
-#define STM32F7_I2C_MAX_SLAVE  0x2
+#define STM32F7_I2C_MAX_SLAVE  0x3
 
 #define STM32F7_I2C_DNF_DEFAULT0
 #define STM32F7_I2C_DNF_MAX16
@@ -301,6 +303,8 @@ struct stm32f7_i2c_msg {
  * @fmp_creg: register address for clearing Fast Mode Plus bits
  * @fmp_mask: mask for Fast Mode Plus bits in set register
  * @wakeup_src: boolean to know if the device is a wakeup source
+ * @smbus_mode: states that the controller is configured in SMBus mode
+ * @host_notify_client: SMBus host-notify client
  */
 struct stm32f7_i2c_dev {
struct i2c_adapter adap;
@@ -327,6 +331,8 @@ struct stm32f7_i2c_dev {
u32 fmp_creg;
u32 fmp_mask;
bool wakeup_src;
+   bool smbus_mode;
+   struct i2c_client *host_notify_client;
 };
 
 /*
@@ -1321,10 +1327,18 @@ static int stm32f7_i2c_get_free_slave_id(struct stm32f7_i2c_dev *i2c_dev,
int i;
 
/*
-* slave[0] supports 7-bit and 10-bit slave address
-* slave[1] supports 7-bit slave address only
+* slave[0] support only SMBus Host address (0x8)
+* slave[1] supports 7-bit and 10-bit slave address
+* slave[2] supports 7-bit slave address only
 */
-   for (i = STM32F7_I2C_MAX_SLAVE - 1; i >= 0; i--) {
+   if (i2c_dev->smbus_mode && (slave->addr == 0x08)) {
+   if (i2c_dev->slave[0])
+   goto fail;
+   *id = 0;
+   return 0;
+   }
+
+   for (i = STM32F7_I2C_MAX_SLAVE - 1; i > 0; i--) {
if (i == 1 && (slave->flags & I2C_CLIENT_TEN))
continue;
if (!i2c_dev->slave[i]) {
@@ -1333,6 +1347,7 @@ static int stm32f7_i2c_get_free_slave_id(struct stm32f7_i2c_dev *i2c_dev,
}
}
 
+fail:
dev_err(dev, "Slave 0x%x could not be registered\n", slave->addr);
 
return -EINVAL;
@@ -1776,7 +1791,13 @@ static int stm32f7_i2c_reg_slave(struct i2c_client *slave)
if (!stm32f7_i2c_is_slave_registered(i2c_dev))
stm32f7_i2c_enable_wakeup(i2c_dev, true);
 
-   if (id == 0) {
+   switch (id) {
+   case 0:
+   /* Slave SMBus Host */
+   i2c_dev->slave[id] = slave;
+   break;
+
+   case 1:
/* Configure Own Address 1 */
oar1 = readl_relaxed(i2c_dev->base + STM32F7_I2C_OAR1);
oar1 &= ~STM32F7_I2C_OAR1_MASK;
@@ -1789,7 +1810,9 @@ static int stm32f7_i2c_reg_slave(struct i2c_client *slave)
oar1 |= STM32F7_I2C_OAR1_OA1EN;
i2c_dev->slave[id] = slave;
writel_relaxed(oar1, i2c_dev->base + STM32F7_I2C_OAR1);
-   } else if (id == 1) {
+   break;
+
+   case 2:
/* Configure Own Address 2 */
oar2 = readl_relaxed(i2c_dev->base + STM32F7_I2C_OAR2);
oar2 &= ~STM32F7_I2C_OAR2_MASK;
@@ -1802,7 +1825,10 @@ static int stm32f7_i2c_reg_slave(struct i2c_client *slave)
oar2 |= STM32F7_I2C_OAR2_OA2EN;
i2c_dev->slave[id] = slave;
writel_relaxed(oar2, i2c_dev->base + STM32F7_I2C_OAR2);
-   } else {
+   break;
+
+   default:
+   dev_err(dev, "I2C slave id not supported\n");
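
The slot policy described in the updated comment of stm32f7_i2c_get_free_slave_id() can be sketched in plain C as below. It is a simplified illustration, not the driver code: `struct client` and `pick_slot()` are hypothetical, and the 10-bit exclusion is applied to slot 2 because that is what the comment states, which is an assumption of this sketch.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_SLAVE	3	/* mirrors STM32F7_I2C_MAX_SLAVE after this patch */
#define SMBUS_HOST_ADDR	0x08

/* Hypothetical, simplified view of a client asking for a slot. */
struct client {
	unsigned short addr;
	bool ten_bit;
};

/*
 * Pick a free slot following the comment in the hunk:
 *   slot 0: SMBus Host address only (when SMBus mode is on)
 *   slot 1: 7-bit or 10-bit address
 *   slot 2: 7-bit address only
 * Returns the slot index, or -1 if nothing suitable is free.
 */
static int pick_slot(const struct client *slots[MAX_SLAVE], bool smbus_mode,
		     const struct client *c)
{
	int i;

	/* The SMBus Host address never competes for slots 1 and 2. */
	if (smbus_mode && c->addr == SMBUS_HOST_ADDR)
		return slots[0] ? -1 : 0;

	/* Try the most restricted slot first so slot 1 stays free for 10-bit. */
	for (i = MAX_SLAVE - 1; i > 0; i--) {
		if (i == 2 && c->ten_bit)
			continue;	/* slot 2 cannot hold a 10-bit address */
		if (!slots[i])
			return i;
	}
	return -1;
}
```

Scanning from the highest index down means a 7-bit client lands in slot 2 when it is free, leaving slot 1 available for a later 10-bit client.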
 

[PATCH v4 08/23] device-dax: Drop the dax_region.pfn_flags attribute

2020-08-02 Thread Dan Williams
All callers specify the same flags to alloc_dax_region(), so there is no
need to allow for anything other than PFN_DEV|PFN_MAP, or carry a
->pfn_flags around on the region. Device-dax instances are always page
backed.

Cc: Vishal Verma 
Signed-off-by: Dan Williams 
---
 drivers/dax/bus.c |4 +---
 drivers/dax/bus.h |3 +--
 drivers/dax/dax-private.h |2 --
 drivers/dax/device.c  |   26 +++---
 drivers/dax/hmem/hmem.c   |2 +-
 drivers/dax/pmem/core.c   |3 +--
 6 files changed, 7 insertions(+), 33 deletions(-)

diff --git a/drivers/dax/bus.c b/drivers/dax/bus.c
index df238c8b6ef2..f06ffa66cd78 100644
--- a/drivers/dax/bus.c
+++ b/drivers/dax/bus.c
@@ -226,8 +226,7 @@ static void dax_region_unregister(void *region)
 }
 
 struct dax_region *alloc_dax_region(struct device *parent, int region_id,
-   struct resource *res, int target_node, unsigned int align,
-   unsigned long long pfn_flags)
+   struct resource *res, int target_node, unsigned int align)
 {
struct dax_region *dax_region;
 
@@ -251,7 +250,6 @@ struct dax_region *alloc_dax_region(struct device *parent, int region_id,
 
dev_set_drvdata(parent, dax_region);
memcpy(&dax_region->res, res, sizeof(*res));
-   dax_region->pfn_flags = pfn_flags;
kref_init(&dax_region->kref);
dax_region->id = region_id;
dax_region->align = align;
diff --git a/drivers/dax/bus.h b/drivers/dax/bus.h
index 9e4eba67e8b9..55577e9791da 100644
--- a/drivers/dax/bus.h
+++ b/drivers/dax/bus.h
@@ -10,8 +10,7 @@ struct dax_device;
 struct dax_region;
 void dax_region_put(struct dax_region *dax_region);
 struct dax_region *alloc_dax_region(struct device *parent, int region_id,
-   struct resource *res, int target_node, unsigned int align,
-   unsigned long long flags);
+   struct resource *res, int target_node, unsigned int align);
 
 enum dev_dax_subsys {
DEV_DAX_BUS,
diff --git a/drivers/dax/dax-private.h b/drivers/dax/dax-private.h
index 16850d5388ab..8a4c40ccd2ef 100644
--- a/drivers/dax/dax-private.h
+++ b/drivers/dax/dax-private.h
@@ -23,7 +23,6 @@ void dax_bus_exit(void);
  * @dev: parent device backing this region
  * @align: allocation and mapping alignment for child dax devices
  * @res: physical address range of the region
- * @pfn_flags: identify whether the pfns are paged back or not
  */
 struct dax_region {
int id;
@@ -32,7 +31,6 @@ struct dax_region {
struct device *dev;
unsigned int align;
struct resource res;
-   unsigned long long pfn_flags;
 };
 
 /**
diff --git a/drivers/dax/device.c b/drivers/dax/device.c
index 4c0af2eb7e19..bffef1b21144 100644
--- a/drivers/dax/device.c
+++ b/drivers/dax/device.c
@@ -41,14 +41,6 @@ static int check_vma(struct dev_dax *dev_dax, struct vm_area_struct *vma,
return -EINVAL;
}
 
-   if ((dax_region->pfn_flags & (PFN_DEV|PFN_MAP)) == PFN_DEV
-   && (vma->vm_flags & VM_DONTCOPY) == 0) {
-   dev_info_ratelimited(dev,
-   "%s: %s: fail, dax range requires MADV_DONTFORK\n",
-   current->comm, func);
-   return -EINVAL;
-   }
-
if (!vma_is_dax(vma)) {
dev_info_ratelimited(dev,
"%s: %s: fail, vma is not DAX capable\n",
@@ -102,7 +94,7 @@ static vm_fault_t __dev_dax_pte_fault(struct dev_dax *dev_dax,
return VM_FAULT_SIGBUS;
}
 
-   *pfn = phys_to_pfn_t(phys, dax_region->pfn_flags);
+   *pfn = phys_to_pfn_t(phys, PFN_DEV|PFN_MAP);
 
return vmf_insert_mixed(vmf->vma, vmf->address, *pfn);
 }
@@ -127,12 +119,6 @@ static vm_fault_t __dev_dax_pmd_fault(struct dev_dax *dev_dax,
return VM_FAULT_SIGBUS;
}
 
-   /* dax pmd mappings require pfn_t_devmap() */
-   if ((dax_region->pfn_flags & (PFN_DEV|PFN_MAP)) != (PFN_DEV|PFN_MAP)) {
-   dev_dbg(dev, "region lacks devmap flags\n");
-   return VM_FAULT_SIGBUS;
-   }
-
if (fault_size < dax_region->align)
return VM_FAULT_SIGBUS;
else if (fault_size > dax_region->align)
@@ -150,7 +136,7 @@ static vm_fault_t __dev_dax_pmd_fault(struct dev_dax *dev_dax,
return VM_FAULT_SIGBUS;
}
 
-   *pfn = phys_to_pfn_t(phys, dax_region->pfn_flags);
+   *pfn = phys_to_pfn_t(phys, PFN_DEV|PFN_MAP);
 
return vmf_insert_pfn_pmd(vmf, *pfn, vmf->flags & FAULT_FLAG_WRITE);
 }
@@ -177,12 +163,6 @@ static vm_fault_t __dev_dax_pud_fault(struct dev_dax *dev_dax,
return VM_FAULT_SIGBUS;
}
 
-   /* dax pud mappings require pfn_t_devmap() */
-   if ((dax_region->pfn_flags & (PFN_DEV|PFN_MAP)) != (PFN_DEV|PFN_MAP)) {
-   dev_dbg(dev, "region lacks devmap flags\n");
-   return VM_FAULT_SIGBUS;
-   }
-
   

[PATCH v4 09/23] device-dax: Move instance creation parameters to 'struct dev_dax_data'

2020-08-02 Thread Dan Williams
In preparation for adding more parameters to instance creation, move
existing parameters to a new struct.

Cc: Vishal Verma 
Signed-off-by: Dan Williams 
---
 drivers/dax/bus.c   |   14 +++---
 drivers/dax/bus.h   |   16 
 drivers/dax/hmem/hmem.c |8 +++-
 drivers/dax/pmem/core.c |9 -
 4 files changed, 30 insertions(+), 17 deletions(-)

diff --git a/drivers/dax/bus.c b/drivers/dax/bus.c
index f06ffa66cd78..dffa4655e128 100644
--- a/drivers/dax/bus.c
+++ b/drivers/dax/bus.c
@@ -395,9 +395,9 @@ static void unregister_dev_dax(void *dev)
put_device(dev);
 }
 
-struct dev_dax *__devm_create_dev_dax(struct dax_region *dax_region, int id,
-   struct dev_pagemap *pgmap, enum dev_dax_subsys subsys)
+struct dev_dax *devm_create_dev_dax(struct dev_dax_data *data)
 {
+   struct dax_region *dax_region = data->dax_region;
struct device *parent = dax_region->dev;
struct dax_device *dax_dev;
struct dev_dax *dev_dax;
@@ -405,14 +405,14 @@ struct dev_dax *__devm_create_dev_dax(struct dax_region *dax_region, int id,
struct device *dev;
int rc = -ENOMEM;
 
-   if (id < 0)
+   if (data->id < 0)
return ERR_PTR(-EINVAL);
 
dev_dax = kzalloc(sizeof(*dev_dax), GFP_KERNEL);
if (!dev_dax)
return ERR_PTR(-ENOMEM);
 
-   memcpy(&dev_dax->pgmap, pgmap, sizeof(*pgmap));
+   memcpy(&dev_dax->pgmap, data->pgmap, sizeof(struct dev_pagemap));
 
/*
 * No 'host' or dax_operations since there is no access to this
@@ -438,13 +438,13 @@ struct dev_dax *__devm_create_dev_dax(struct dax_region *dax_region, int id,
 
inode = dax_inode(dax_dev);
dev->devt = inode->i_rdev;
-   if (subsys == DEV_DAX_BUS)
+   if (data->subsys == DEV_DAX_BUS)
dev->bus = &dax_bus_type;
else
dev->class = dax_class;
dev->parent = parent;
dev->type = &dev_dax_type;
-   dev_set_name(dev, "dax%d.%d", dax_region->id, id);
+   dev_set_name(dev, "dax%d.%d", dax_region->id, data->id);
 
rc = device_add(dev);
if (rc) {
@@ -464,7 +464,7 @@ struct dev_dax *__devm_create_dev_dax(struct dax_region *dax_region, int id,
 
return ERR_PTR(rc);
 }
-EXPORT_SYMBOL_GPL(__devm_create_dev_dax);
+EXPORT_SYMBOL_GPL(devm_create_dev_dax);
 
 static int match_always_count;
 
diff --git a/drivers/dax/bus.h b/drivers/dax/bus.h
index 55577e9791da..299c2e7fac09 100644
--- a/drivers/dax/bus.h
+++ b/drivers/dax/bus.h
@@ -13,18 +13,18 @@ struct dax_region *alloc_dax_region(struct device *parent, int region_id,
struct resource *res, int target_node, unsigned int align);
 
 enum dev_dax_subsys {
-   DEV_DAX_BUS,
+   DEV_DAX_BUS = 0, /* zeroed dev_dax_data picks this by default */
DEV_DAX_CLASS,
 };
 
-struct dev_dax *__devm_create_dev_dax(struct dax_region *dax_region, int id,
-   struct dev_pagemap *pgmap, enum dev_dax_subsys subsys);
+struct dev_dax_data {
+   struct dax_region *dax_region;
+   struct dev_pagemap *pgmap;
+   enum dev_dax_subsys subsys;
+   int id;
+};
 
-static inline struct dev_dax *devm_create_dev_dax(struct dax_region *dax_region,
-   int id, struct dev_pagemap *pgmap)
-{
-   return __devm_create_dev_dax(dax_region, id, pgmap, DEV_DAX_BUS);
-}
+struct dev_dax *devm_create_dev_dax(struct dev_dax_data *data);
 
 /* to be deleted when DEV_DAX_CLASS is removed */
 struct dev_dax *__dax_pmem_probe(struct device *dev, enum dev_dax_subsys subsys);
diff --git a/drivers/dax/hmem/hmem.c b/drivers/dax/hmem/hmem.c
index 506893861253..b84fe17178d8 100644
--- a/drivers/dax/hmem/hmem.c
+++ b/drivers/dax/hmem/hmem.c
@@ -11,6 +11,7 @@ static int dax_hmem_probe(struct platform_device *pdev)
struct dev_pagemap pgmap = { };
struct dax_region *dax_region;
struct memregion_info *mri;
+   struct dev_dax_data data;
struct dev_dax *dev_dax;
struct resource *res;
 
@@ -26,7 +27,12 @@ static int dax_hmem_probe(struct platform_device *pdev)
if (!dax_region)
return -ENOMEM;
 
-   dev_dax = devm_create_dev_dax(dax_region, 0, &pgmap);
+   data = (struct dev_dax_data) {
+   .dax_region = dax_region,
+   .id = 0,
+   .pgmap = &pgmap,
+   };
+   dev_dax = devm_create_dev_dax(&data);
if (IS_ERR(dev_dax))
return PTR_ERR(dev_dax);
 
diff --git a/drivers/dax/pmem/core.c b/drivers/dax/pmem/core.c
index ea52bb77a294..08ee5947a49c 100644
--- a/drivers/dax/pmem/core.c
+++ b/drivers/dax/pmem/core.c
@@ -14,6 +14,7 @@ struct dev_dax *__dax_pmem_probe(struct device *dev, enum dev_dax_subsys subsys)
resource_size_t offset;
struct nd_pfn_sb *pfn_sb;
struct dev_dax *dev_dax;
+   struct dev_dax_data data;
struct nd_namespace_io *nsio;
struct dax_region *dax_region;
struct dev_pagemap pgmap = 

[PATCH v4 05/23] resource: Report parent to walk_iomem_res_desc() callback

2020-08-02 Thread Dan Williams
In support of detecting whether a resource might have been claimed,
report the parent to the walk_iomem_res_desc() callback. For example,
the ACPI HMAT parser publishes "hmem" platform devices per target range.
However, if the HMAT is disabled or missing, a fallback driver can attach
devices to the raw memory ranges when it sees unclaimed / orphan
"Soft Reserved" resources in the resource tree.

Otherwise, find_next_iomem_res() returns a resource with garbage data
from the stack allocation in __walk_iomem_res_desc() for the res->parent
field.

There are currently no users that expect ->child and ->sibling to be
valid, and the resource_lock would be needed to traverse them. Use a
compound literal to implicitly zero initialize the fields that are not
being returned in addition to setting ->parent.

Cc: Jason Gunthorpe 
Cc: Dave Hansen 
Cc: Wei Yang 
Cc: Tom Lendacky 
Signed-off-by: Dan Williams 
---
 kernel/resource.c |   11 +++
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/kernel/resource.c b/kernel/resource.c
index 841737bbda9e..f1175ce93a1d 100644
--- a/kernel/resource.c
+++ b/kernel/resource.c
@@ -382,10 +382,13 @@ static int find_next_iomem_res(resource_size_t start, resource_size_t end,
 
if (p) {
/* copy data */
-   res->start = max(start, p->start);
-   res->end = min(end, p->end);
-   res->flags = p->flags;
-   res->desc = p->desc;
+   *res = (struct resource) {
+   .start = max(start, p->start),
+   .end = min(end, p->end),
+   .flags = p->flags,
+   .desc = p->desc,
+   .parent = p->parent,
+   };
}
 
read_unlock(&resource_lock);
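
The effect of the compound-literal assignment can be shown with a small user-space sketch. `struct res`, `copy_clipped()` and the min/max helpers are hypothetical stand-ins for struct resource and the kernel's min()/max(); the point is only that members not named in the initializer end up zeroed rather than keeping whatever was in the destination.

```c
#include <stddef.h>

/* Hypothetical, trimmed-down stand-in for struct resource. */
struct res {
	unsigned long long start;
	unsigned long long end;
	unsigned long flags;
	struct res *parent;
	struct res *sibling;
	struct res *child;
};

static unsigned long long max_ull(unsigned long long a, unsigned long long b)
{
	return a > b ? a : b;
}

static unsigned long long min_ull(unsigned long long a, unsigned long long b)
{
	return a < b ? a : b;
}

/*
 * Copy the part of @p that overlaps [start, end] into @out. Assigning a
 * compound literal overwrites every member: those not named in the
 * initializer (sibling, child) become NULL, so stale contents of an
 * uninitialized @out never leak through.
 */
static void copy_clipped(struct res *out, const struct res *p,
			 unsigned long long start, unsigned long long end)
{
	*out = (struct res) {
		.start = max_ull(start, p->start),
		.end = min_ull(end, p->end),
		.flags = p->flags,
		.parent = p->parent,
	};
}
```

With member-by-member assignment, any field that is not explicitly written keeps its previous value, which is how the garbage ->parent described above came about.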



[PATCH v4 06/23] mm/memory_hotplug: Introduce default phys_to_target_node() implementation

2020-08-02 Thread Dan Williams
In preparation for setting a fallback value for dev_dax->target_node,
introduce generic fallback helpers for phys_to_target_node().

A generic implementation based on node-data or memblock was proposed,
but as noted by Mike:

"Here again, I would prefer to add a weak default for
 phys_to_target_node() because the "generic" implementation is not really
 generic.

 The fallback to reserved ranges is x86 specific because on x86 most of
 the reserved areas is not in memblock.memory. AFAIK, no other
 architecture does this."

The info message in the generic memory_add_physaddr_to_nid()
implementation is fixed up to properly reflect that
memory_add_physaddr_to_nid() communicates "online" node info and
phys_to_target_node() indicates "target / to-be-onlined" node info.

Cc: David Hildenbrand 
Cc: Mike Rapoport 
Cc: Jia He 
Signed-off-by: Dan Williams 
---
 arch/x86/mm/numa.c |1 -
 include/linux/memory_hotplug.h |5 +
 include/linux/numa.h   |   11 ---
 mm/memory_hotplug.c|   10 +-
 4 files changed, 14 insertions(+), 13 deletions(-)

diff --git a/arch/x86/mm/numa.c b/arch/x86/mm/numa.c
index f3805bbaa784..c62e274d52d0 100644
--- a/arch/x86/mm/numa.c
+++ b/arch/x86/mm/numa.c
@@ -917,7 +917,6 @@ int phys_to_target_node(phys_addr_t start)
 
return meminfo_to_nid(&numa_reserved_meminfo, start);
 }
-EXPORT_SYMBOL_GPL(phys_to_target_node);
 
 int memory_add_physaddr_to_nid(u64 start)
 {
diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
index 375515803cd8..dcdc7d6206d5 100644
--- a/include/linux/memory_hotplug.h
+++ b/include/linux/memory_hotplug.h
@@ -151,11 +151,16 @@ int add_pages(int nid, unsigned long start_pfn, unsigned long nr_pages,
 
 #ifdef CONFIG_NUMA
 extern int memory_add_physaddr_to_nid(u64 start);
+extern int phys_to_target_node(u64 start);
 #else
 static inline int memory_add_physaddr_to_nid(u64 start)
 {
return 0;
 }
+static inline int phys_to_target_node(u64 start)
+{
+   return 0;
+}
 #endif
 
 #ifdef CONFIG_HAVE_ARCH_NODEDATA_EXTENSION
diff --git a/include/linux/numa.h b/include/linux/numa.h
index a42df804679e..8cb33ccfb671 100644
--- a/include/linux/numa.h
+++ b/include/linux/numa.h
@@ -23,22 +23,11 @@
 #ifdef CONFIG_NUMA
 /* Generic implementation available */
 int numa_map_to_online_node(int node);
-
-/*
- * Optional architecture specific implementation, users need a "depends
- * on $ARCH"
- */
-int phys_to_target_node(phys_addr_t addr);
 #else
 static inline int numa_map_to_online_node(int node)
 {
return NUMA_NO_NODE;
 }
-
-static inline int phys_to_target_node(phys_addr_t addr)
-{
-   return NUMA_NO_NODE;
-}
 #endif
 
 #endif /* _LINUX_NUMA_H */
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index dcdf3271f87e..426b79adf529 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -353,11 +353,19 @@ int __ref __add_pages(int nid, unsigned long pfn, unsigned long nr_pages,
 #ifdef CONFIG_NUMA
 int __weak memory_add_physaddr_to_nid(u64 start)
 {
-   pr_info_once("Unknown target node for memory at 0x%llx, assuming node 0\n",
+   pr_info_once("Unknown online node for memory at 0x%llx, assuming node 0\n",
start);
return 0;
 }
 EXPORT_SYMBOL_GPL(memory_add_physaddr_to_nid);
+
+int __weak phys_to_target_node(u64 start)
+{
+   pr_info_once("Unknown target node for memory at 0x%llx, assuming node 0\n",
+   start);
+   return 0;
+}
+EXPORT_SYMBOL_GPL(phys_to_target_node);
 #endif
 
 /* find the smallest valid pfn in the range [start_pfn, end_pfn) */
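
The weak-default mechanism can be tried outside the kernel with GCC or Clang. The sketch below assumes `__attribute__((weak))` as the user-space spelling of what the kernel's __weak annotation provides, and uses fprintf() plus a static flag as a rough, non-thread-safe stand-in for pr_info_once().

```c
#include <stdio.h>

/*
 * Generic fallback. A non-weak definition of phys_to_target_node() in
 * another object file would replace this one at link time, which is
 * how an architecture supplies its own lookup.
 */
__attribute__((weak)) int phys_to_target_node(unsigned long long start)
{
	static int warned;

	if (!warned) {	/* print the notice only on the first call */
		warned = 1;
		fprintf(stderr,
			"Unknown target node for memory at 0x%llx, assuming node 0\n",
			start);
	}
	return 0;
}
```

Callers need no #ifdef: they always link against some phys_to_target_node(), and only the quality of the answer depends on whether an architecture override exists.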



[PATCH v4 04/23] ACPI: HMAT: Refactor hmat_register_target_device to hmem_register_device

2020-08-02 Thread Dan Williams
In preparation for exposing "Soft Reserved" memory ranges without an
HMAT, move the hmem device registration to its own compilation unit and
make the implementation generic.

The generic implementation drops usage of acpi_map_pxm_to_online_node()
that was translating ACPI proximity domain values and instead relies on
numa_map_to_online_node() to determine the numa node for the device.

Cc: "Rafael J. Wysocki" 
Link: https://lore.kernel.org/r/158318761484.2216124.2049322072599482736.st...@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams 
---
 drivers/acpi/numa/hmat.c  |   68 -
 drivers/dax/Kconfig   |4 +++
 drivers/dax/Makefile  |3 +-
 drivers/dax/hmem.c|   56 -
 drivers/dax/hmem/Makefile |5 +++
 drivers/dax/hmem/device.c |   65 +++
 drivers/dax/hmem/hmem.c   |   56 +
 include/linux/dax.h   |8 +
 8 files changed, 145 insertions(+), 120 deletions(-)
 delete mode 100644 drivers/dax/hmem.c
 create mode 100644 drivers/dax/hmem/Makefile
 create mode 100644 drivers/dax/hmem/device.c
 create mode 100644 drivers/dax/hmem/hmem.c

diff --git a/drivers/acpi/numa/hmat.c b/drivers/acpi/numa/hmat.c
index a12e36a12618..134bcb40b2af 100644
--- a/drivers/acpi/numa/hmat.c
+++ b/drivers/acpi/numa/hmat.c
@@ -24,6 +24,7 @@
 #include 
 #include 
 #include 
+#include 
 
 static u8 hmat_revision;
 static int hmat_disable __initdata;
@@ -640,66 +641,6 @@ static void hmat_register_target_perf(struct memory_target *target)
node_set_perf_attrs(mem_nid, &target->hmem_attrs, 0);
 }
 
-static void hmat_register_target_device(struct memory_target *target,
-   struct resource *r)
-{
-   /* define a clean / non-busy resource for the platform device */
-   struct resource res = {
-   .start = r->start,
-   .end = r->end,
-   .flags = IORESOURCE_MEM,
-   };
-   struct platform_device *pdev;
-   struct memregion_info info;
-   int rc, id;
-
-   rc = region_intersects(res.start, resource_size(&res), IORESOURCE_MEM,
-   IORES_DESC_SOFT_RESERVED);
-   if (rc != REGION_INTERSECTS)
-   return;
-
-   id = memregion_alloc(GFP_KERNEL);
-   if (id < 0) {
-   pr_err("memregion allocation failure for %pr\n", &res);
-   return;
-   }
-
-   pdev = platform_device_alloc("hmem", id);
-   if (!pdev) {
-   pr_err("hmem device allocation failure for %pr\n", &res);
-   goto out_pdev;
-   }
-
-   pdev->dev.numa_node = acpi_map_pxm_to_online_node(target->memory_pxm);
-   info = (struct memregion_info) {
-   .target_node = acpi_map_pxm_to_node(target->memory_pxm),
-   };
-   rc = platform_device_add_data(pdev, &info, sizeof(info));
-   if (rc < 0) {
-   pr_err("hmem memregion_info allocation failure for %pr\n", &res);
-   goto out_pdev;
-   }
-
-   rc = platform_device_add_resources(pdev, &res, 1);
-   if (rc < 0) {
-   pr_err("hmem resource allocation failure for %pr\n", &res);
-   goto out_resource;
-   }
-
-   rc = platform_device_add(pdev);
-   if (rc < 0) {
-   dev_err(&pdev->dev, "device add failed for %pr\n", &res);
-   goto out_resource;
-   }
-
-   return;
-
-out_resource:
-   put_device(&pdev->dev);
-out_pdev:
-   memregion_free(id);
-}
-
 static void hmat_register_target_devices(struct memory_target *target)
 {
struct resource *res;
@@ -711,8 +652,11 @@ static void hmat_register_target_devices(struct memory_target *target)
if (!IS_ENABLED(CONFIG_DEV_DAX_HMEM))
return;
 
-   for (res = target->memregions.child; res; res = res->sibling)
-   hmat_register_target_device(target, res);
+   for (res = target->memregions.child; res; res = res->sibling) {
+   int target_nid = acpi_map_pxm_to_node(target->memory_pxm);
+
+   hmem_register_device(target_nid, res);
+   }
 }
 
 static void hmat_register_target(struct memory_target *target)
diff --git a/drivers/dax/Kconfig b/drivers/dax/Kconfig
index 3b6c06f07326..a229f45d34aa 100644
--- a/drivers/dax/Kconfig
+++ b/drivers/dax/Kconfig
@@ -48,6 +48,10 @@ config DEV_DAX_HMEM
 
  Say M if unsure.
 
+config DEV_DAX_HMEM_DEVICES
+   depends on DEV_DAX_HMEM
+   def_bool y
+
 config DEV_DAX_KMEM
tristate "KMEM DAX: volatile-use of persistent memory"
default DEV_DAX
diff --git a/drivers/dax/Makefile b/drivers/dax/Makefile
index 80065b38b3c4..9d4ba672d305 100644
--- a/drivers/dax/Makefile
+++ b/drivers/dax/Makefile
@@ -2,11 +2,10 @@
 obj-$(CONFIG_DAX) += dax.o
 obj-$(CONFIG_DEV_DAX) += device_dax.o
 obj-$(CONFIG_DEV_DAX_KMEM) += kmem.o
-obj-$(CONFIG_DEV_DAX_HMEM) += dax_hmem.o
 
 dax-y := super.o
 dax-y += bus.o
 device_dax-y := 

[PATCH v4 07/23] ACPI: HMAT: Attach a device for each soft-reserved range

2020-08-02 Thread Dan Williams
The hmem enabling in commit 'cf8741ac57ed ("ACPI: NUMA: HMAT: Register
"soft reserved" memory as an "hmem" device")' only registered ranges to
the hmem driver for each soft-reservation that also appeared in the
HMAT. While this is meant to encourage platform firmware to "do the
right thing" and publish an HMAT, the corollary is that platforms that
fail to publish an accurate HMAT will strand memory from Linux usage.
Additionally, using the "efi_fake_mem" kernel command line option
will strand memory by default without an HMAT.

Arrange for "soft reserved" memory that goes unclaimed by HMAT entries
to be published as raw resource ranges for the hmem driver to consume.

Include a module parameter to disable either this fallback behavior, or
the hmat enabling from creating hmem devices. The module parameter
requires the hmem device enabling to have a unique name in the module
namespace: "device_hmem".

The driver depends on the architecture providing phys_to_target_node()
which is only x86 via numa_meminfo() and arm64 via a generic memblock
implementation.

Cc: Jonathan Cameron 
Cc: Brice Goglin 
Cc: Ard Biesheuvel 
Cc: "Rafael J. Wysocki" 
Cc: Jeff Moyer 
Cc: Catalin Marinas 
Cc: Will Deacon 
Reviewed-by: Joao Martins 
Signed-off-by: Dan Williams 
---
 drivers/dax/hmem/Makefile |3 ++-
 drivers/dax/hmem/device.c |   35 +++
 2 files changed, 37 insertions(+), 1 deletion(-)

diff --git a/drivers/dax/hmem/Makefile b/drivers/dax/hmem/Makefile
index a9d353d0c9ed..57377b4c3d47 100644
--- a/drivers/dax/hmem/Makefile
+++ b/drivers/dax/hmem/Makefile
@@ -1,5 +1,6 @@
 # SPDX-License-Identifier: GPL-2.0
 obj-$(CONFIG_DEV_DAX_HMEM) += dax_hmem.o
-obj-$(CONFIG_DEV_DAX_HMEM_DEVICES) += device.o
+obj-$(CONFIG_DEV_DAX_HMEM_DEVICES) += device_hmem.o
 
+device_hmem-y := device.o
 dax_hmem-y := hmem.o
diff --git a/drivers/dax/hmem/device.c b/drivers/dax/hmem/device.c
index b9dd6b27745c..cb6401c9e9a4 100644
--- a/drivers/dax/hmem/device.c
+++ b/drivers/dax/hmem/device.c
@@ -5,6 +5,9 @@
 #include 
 #include 
 
+static bool nohmem;
+module_param_named(disable, nohmem, bool, 0444);
+
 void hmem_register_device(int target_nid, struct resource *r)
 {
/* define a clean / non-busy resource for the platform device */
@@ -17,6 +20,9 @@ void hmem_register_device(int target_nid, struct resource *r)
struct memregion_info info;
int rc, id;
 
+   if (nohmem)
+   return;
+
rc = region_intersects(res.start, resource_size(&res), IORESOURCE_MEM,
IORES_DESC_SOFT_RESERVED);
if (rc != REGION_INTERSECTS)
@@ -63,3 +69,32 @@ void hmem_register_device(int target_nid, struct resource *r)
 out_pdev:
memregion_free(id);
 }
+
+static __init int hmem_register_one(struct resource *res, void *data)
+{
+   /*
+* If the resource is not a top-level resource it was already
+* assigned to a device by the HMAT parsing.
+*/
+   if (res->parent != &iomem_resource) {
+   pr_info("HMEM: skip %pr, already claimed\n", res);
+   return 0;
+   }
+
+   hmem_register_device(phys_to_target_node(res->start), res);
+
+   return 0;
+}
+
+static __init int hmem_init(void)
+{
+   walk_iomem_res_desc(IORES_DESC_SOFT_RESERVED,
+   IORESOURCE_MEM, 0, -1, NULL, hmem_register_one);
+   return 0;
+}
+
+/*
+ * As this is a fallback for address ranges unclaimed by the ACPI HMAT
+ * parsing it must be at an initcall level greater than hmat_init().
+ */
+late_initcall(hmem_init);
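
The "only register top-level ranges" rule in hmem_register_one() can be sketched with a toy walker. Everything here is hypothetical scaffolding: `struct res`, `root`, `walk()` and `register_one()` stand in for struct resource, iomem_resource, walk_iomem_res_desc() and the callback, and "registering" is reduced to bumping a counter.

```c
#include <stddef.h>

/* Hypothetical, trimmed-down stand-in for struct resource. */
struct res {
	unsigned long long start;
	unsigned long long end;
	const struct res *parent;
};

/* Stand-in for the root of the resource tree (iomem_resource in the patch). */
static const struct res root = { 0, ~0ULL, NULL };

/* Callback: count ranges that are still direct children of the root. */
static int register_one(const struct res *r, void *data)
{
	int *registered = data;

	if (r->parent != &root)
		return 0;	/* already claimed by an earlier pass, skip */
	(*registered)++;
	return 0;
}

/* Minimal walker: call @fn on each entry, stop on the first non-zero return. */
static int walk(const struct res *tbl, size_t n, void *data,
		int (*fn)(const struct res *, void *))
{
	size_t i;
	int rc = 0;

	for (i = 0; i < n && !rc; i++)
		rc = fn(&tbl[i], data);
	return rc;
}
```

The check relies on the callback being handed a meaningful ->parent, and on the fallback pass running after whatever pass does the claiming, which is why the patch uses late_initcall().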



[PATCH v4 12/23] device-dax: Add an allocation interface for device-dax instances

2020-08-02 Thread Dan Williams
In preparation for a facility that enables dax regions to be
sub-divided, introduce infrastructure to track and allocate region
capacity.

The new dax_region/available_size attribute is only enabled for volatile
hmem devices, not pmem devices that are defined by nvdimm namespace
boundaries. This is per Jeff's feedback the last time dynamic device-dax
capacity allocation support was discussed.

Link: https://lore.kernel.org/linux-nvdimm/x49shpp3zn8@segfault.boston.devel.redhat.com
Cc: Vishal Verma 
Signed-off-by: Dan Williams 
---
 drivers/dax/bus.c |  120 +
 drivers/dax/bus.h |7 ++-
 drivers/dax/dax-private.h |2 -
 drivers/dax/hmem/hmem.c   |7 +--
 drivers/dax/pmem/core.c   |8 +--
 5 files changed, 121 insertions(+), 23 deletions(-)

diff --git a/drivers/dax/bus.c b/drivers/dax/bus.c
index 96bd64ba95a5..0a48ce378686 100644
--- a/drivers/dax/bus.c
+++ b/drivers/dax/bus.c
@@ -130,6 +130,11 @@ ATTRIBUTE_GROUPS(dax_drv);
 
 static int dax_bus_match(struct device *dev, struct device_driver *drv);
 
+static bool is_static(struct dax_region *dax_region)
+{
+   return (dax_region->res.flags & IORESOURCE_DAX_STATIC) != 0;
+}
+
 static struct bus_type dax_bus_type = {
.name = "dax",
.uevent = dax_bus_uevent,
@@ -185,7 +190,48 @@ static ssize_t align_show(struct device *dev,
 }
 static DEVICE_ATTR_RO(align);
 
+#define for_each_dax_region_resource(dax_region, res) \
+   for (res = (dax_region)->res.child; res; res = res->sibling)
+
+static unsigned long long dax_region_avail_size(struct dax_region *dax_region)
+{
+   resource_size_t size = resource_size(&dax_region->res);
+   struct resource *res;
+
+   device_lock_assert(dax_region->dev);
+
+   for_each_dax_region_resource(dax_region, res)
+   size -= resource_size(res);
+   return size;
+}
+
+static ssize_t available_size_show(struct device *dev,
+   struct device_attribute *attr, char *buf)
+{
+   struct dax_region *dax_region = dev_get_drvdata(dev);
+   unsigned long long size;
+
+   device_lock(dev);
+   size = dax_region_avail_size(dax_region);
+   device_unlock(dev);
+
+   return sprintf(buf, "%llu\n", size);
+}
+static DEVICE_ATTR_RO(available_size);
+
+static umode_t dax_region_visible(struct kobject *kobj, struct attribute *a,
+   int n)
+{
+   struct device *dev = container_of(kobj, struct device, kobj);
+   struct dax_region *dax_region = dev_get_drvdata(dev);
+
+   if (is_static(dax_region) && a == &dev_attr_available_size.attr)
+   return 0;
+   return a->mode;
+}
+
 static struct attribute *dax_region_attributes[] = {
+   &dev_attr_available_size.attr,
&dev_attr_region_size.attr,
&dev_attr_align.attr,
&dev_attr_id.attr,
@@ -195,6 +241,7 @@ static struct attribute *dax_region_attributes[] = {
 static const struct attribute_group dax_region_attribute_group = {
.name = "dax_region",
.attrs = dax_region_attributes,
+   .is_visible = dax_region_visible,
 };
 
 static const struct attribute_group *dax_region_attribute_groups[] = {
@@ -226,7 +273,8 @@ static void dax_region_unregister(void *region)
 }
 
 struct dax_region *alloc_dax_region(struct device *parent, int region_id,
-   struct resource *res, int target_node, unsigned int align)
+   struct resource *res, int target_node, unsigned int align,
+   unsigned long flags)
 {
struct dax_region *dax_region;
 
@@ -249,12 +297,17 @@ struct dax_region *alloc_dax_region(struct device *parent, int region_id,
return NULL;
 
dev_set_drvdata(parent, dax_region);
-   memcpy(&dax_region->res, res, sizeof(*res));
kref_init(&dax_region->kref);
dax_region->id = region_id;
dax_region->align = align;
dax_region->dev = parent;
dax_region->target_node = target_node;
+   dax_region->res = (struct resource) {
+   .start = res->start,
+   .end = res->end,
+   .flags = IORESOURCE_MEM | flags,
+   };
+
if (sysfs_create_groups(&parent->kobj, dax_region_attribute_groups)) {
kfree(dax_region);
return NULL;
@@ -267,6 +320,32 @@ struct dax_region *alloc_dax_region(struct device *parent, int region_id,
 }
 EXPORT_SYMBOL_GPL(alloc_dax_region);
 
+static int alloc_dev_dax_range(struct dev_dax *dev_dax, resource_size_t size)
+{
+   struct dax_region *dax_region = dev_dax->region;
+   struct resource *res = &dax_region->res;
+   struct device *dev = &dev_dax->dev;
+   struct resource *alloc;
+
+   device_lock_assert(dax_region->dev);
+
+   /* TODO: handle multiple allocations per region */
+   if (res->child)
+   return -ENOMEM;
+
+   alloc = __request_region(res, res->start, size, dev_name(dev), 0);
+
+   if (!alloc)
+   return -ENOMEM;
+
+   dev_dax->range = (struct 

[PATCH v4 00/23] device-dax: Support sub-dividing soft-reserved ranges

2020-08-02 Thread Dan Williams
Changes since v3 [1]:
- Update x86 boot options documentation for 'nohmat' (Randy)

- Fixup a handful of kbuild robot reports, the most significant being
  moving usage of PUD_SIZE and PMD_SIZE under
  #ifdef CONFIG_TRANSPARENT_HUGEPAGE protection.

[1]: http://lore.kernel.org/r/159625229779.3040297.11363509688097221416.st...@dwillia2-desk3.amr.corp.intel.com

---
Merge notes:

Well, no v5.8-rc8 to line this up for v5.9, so next best is early
integration into -mm before other collisions develop.

Chatted with Justin offline and it currently appears that the missing
NUMA information is the fault of the platform firmware, which fails to
populate all the necessary NUMA data in the NFIT.

---
Cover:

The device-dax facility allows an address range to be directly mapped
through a chardev, or optionally hotplugged to the core kernel page
allocator as System-RAM. It is the mechanism for converting persistent
memory (pmem) to be used as another volatile memory pool i.e. the
current Memory Tiering hot topic on linux-mm.

In the case of pmem the nvdimm-namespace-label mechanism can sub-divide
it, but that labeling mechanism is not available / applicable to
soft-reserved ("EFI specific purpose") memory [3]. This series provides
a sysfs-mechanism for the daxctl utility to enable provisioning of
volatile-soft-reserved memory ranges.

The motivations for this facility are:

1/ Allow performance differentiated memory ranges to be split between
   kernel-managed and directly-accessed use cases.

2/ Allow physical memory to be provisioned along performance relevant
   address boundaries. For example, divide a memory-side cache [4] along
   cache-color boundaries.

3/ Parcel out soft-reserved memory to VMs using device-dax as a security
   / permissions boundary [5]. Specifically I have seen people (ab)using
   memmap=nn!ss (mark System-RAM as Persistent Memory) just to get the
   device-dax interface on custom address ranges. A follow-on for the VM
   use case is to teach device-dax to dynamically allocate 'struct page' at
   runtime to reduce the duplication of 'struct page' space in both the
   guest and the host kernel for the same physical pages.

[2]: http://lore.kernel.org/r/20200713160837.13774-11-joao.m.mart...@oracle.com
[3]: http://lore.kernel.org/r/157309097008.1579826.12818463304589384434.st...@dwillia2-desk3.amr.corp.intel.com
[4]: http://lore.kernel.org/r/154899811738.3165233.12325692939590944259.st...@dwillia2-desk3.amr.corp.intel.com
[5]: http://lore.kernel.org/r/20200110190313.17144-1-joao.m.mart...@oracle.com

---

Dan Williams (19):
  x86/numa: Cleanup configuration dependent command-line options
  x86/numa: Add 'nohmat' option
  efi/fake_mem: Arrange for a resource entry per efi_fake_mem instance
  ACPI: HMAT: Refactor hmat_register_target_device to hmem_register_device
  resource: Report parent to walk_iomem_res_desc() callback
  mm/memory_hotplug: Introduce default phys_to_target_node() implementation
  ACPI: HMAT: Attach a device for each soft-reserved range
  device-dax: Drop the dax_region.pfn_flags attribute
  device-dax: Move instance creation parameters to 'struct dev_dax_data'
  device-dax: Make pgmap optional for instance creation
  device-dax: Kill dax_kmem_res
  device-dax: Add an allocation interface for device-dax instances
  device-dax: Introduce 'seed' devices
  drivers/base: Make device_find_child_by_name() compatible with sysfs inputs
  device-dax: Add resize support
  mm/memremap_pages: Convert to 'struct range'
  mm/memremap_pages: Support multiple ranges per invocation
  device-dax: Add dis-contiguous resource support
  device-dax: Introduce 'mapping' devices

Joao Martins (4):
  device-dax: Make align a per-device property
  device-dax: Add an 'align' attribute
  dax/hmem: Introduce dax_hmem.region_idle parameter
  device-dax: Add a range mapping allocation attribute


 Documentation/x86/x86_64/boot-options.rst |4 
 arch/powerpc/kvm/book3s_hv_uvmem.c|   14 
 arch/x86/include/asm/numa.h   |8 
 arch/x86/kernel/e820.c|   16 
 arch/x86/mm/numa.c|   11 
 arch/x86/mm/numa_emulation.c  |3 
 arch/x86/xen/enlighten_pv.c   |2 
 drivers/acpi/numa/hmat.c  |   76 --
 drivers/acpi/numa/srat.c  |9 
 drivers/base/core.c   |2 
 drivers/dax/Kconfig   |4 
 drivers/dax/Makefile  |3 
 drivers/dax/bus.c | 1046 +++--
 drivers/dax/bus.h |   28 -
 drivers/dax/dax-private.h |   60 +-
 drivers/dax/device.c  |  134 ++--
 drivers/dax/hmem.c|   56 --
 drivers/dax/hmem/Makefile |6 
 drivers/dax/hmem/device.c |  100 +++
 drivers/dax/hmem/hmem.c   

[PATCH v4 02/23] x86/numa: Add 'nohmat' option

2020-08-02 Thread Dan Williams
Disable parsing of the HMAT for debug, to work around broken platform
instances, or for cases where it is otherwise not wanted.

Cc: x...@kernel.org
Cc: "Rafael J. Wysocki" 
Cc: Dave Hansen 
Cc: Andy Lutomirski 
Cc: Peter Zijlstra 
Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: Borislav Petkov 
Cc: "H. Peter Anvin" 
Signed-off-by: Dan Williams 
---
 Documentation/x86/x86_64/boot-options.rst |4 
 arch/x86/mm/numa.c|2 ++
 drivers/acpi/numa/hmat.c  |8 +++-
 include/acpi/acpi_numa.h  |8 
 4 files changed, 21 insertions(+), 1 deletion(-)

diff --git a/Documentation/x86/x86_64/boot-options.rst b/Documentation/x86/x86_64/boot-options.rst
index 2b98efb5ba7f..324cefff92e7 100644
--- a/Documentation/x86/x86_64/boot-options.rst
+++ b/Documentation/x86/x86_64/boot-options.rst
@@ -173,6 +173,10 @@ NUMA
   numa=noacpi
 Don't parse the SRAT table for NUMA setup
 
+  numa=nohmat
+Don't parse the HMAT table for NUMA setup, or soft-reserved memory
+partitioning.
+
   numa=fake=[MG]
 If given as a memory unit, fills all system RAM with nodes of
 size interleaved over physical nodes.
diff --git a/arch/x86/mm/numa.c b/arch/x86/mm/numa.c
index 87c52822cc44..f3805bbaa784 100644
--- a/arch/x86/mm/numa.c
+++ b/arch/x86/mm/numa.c
@@ -41,6 +41,8 @@ static __init int numa_setup(char *opt)
return numa_emu_cmdline(opt + 5);
if (!strncmp(opt, "noacpi", 6))
disable_srat();
+   if (!strncmp(opt, "nohmat", 6))
+   disable_hmat();
return 0;
 }
 early_param("numa", numa_setup);
diff --git a/drivers/acpi/numa/hmat.c b/drivers/acpi/numa/hmat.c
index 2c32cfb72370..a12e36a12618 100644
--- a/drivers/acpi/numa/hmat.c
+++ b/drivers/acpi/numa/hmat.c
@@ -26,6 +26,12 @@
 #include 
 
 static u8 hmat_revision;
+static int hmat_disable __initdata;
+
+void __init disable_hmat(void)
+{
+   hmat_disable = 1;
+}
 
 static LIST_HEAD(targets);
 static LIST_HEAD(initiators);
@@ -814,7 +820,7 @@ static __init int hmat_init(void)
enum acpi_hmat_type i;
acpi_status status;
 
-   if (srat_disabled())
+   if (srat_disabled() || hmat_disable)
return 0;
 
status = acpi_get_table(ACPI_SIG_SRAT, 0, &tbl);
diff --git a/include/acpi/acpi_numa.h b/include/acpi/acpi_numa.h
index 8784183b2204..0e9302285f14 100644
--- a/include/acpi/acpi_numa.h
+++ b/include/acpi/acpi_numa.h
@@ -27,4 +27,12 @@ static inline void disable_srat(void)
 {
 }
 #endif /* CONFIG_ACPI_NUMA */
+
+#ifdef CONFIG_ACPI_HMAT
+extern void disable_hmat(void);
+#else  /* CONFIG_ACPI_HMAT */
+static inline void disable_hmat(void)
+{
+}
+#endif /* CONFIG_ACPI_HMAT */
 #endif /* __ACP_NUMA_H */



[PATCH v4 01/23] x86/numa: Cleanup configuration dependent command-line options

2020-08-02 Thread Dan Williams
In preparation for adding a new numa= option, clean up the existing ones
to avoid ifdefs in numa_setup(), and provide feedback when the
numa=fake= option is invalid due to kernel config. The same does not
need to be done for numa=noacpi, since the capability is already hard
disabled at compile-time.

Suggested-by: Rafael J. Wysocki 
Signed-off-by: Dan Williams 
---
 arch/x86/include/asm/numa.h  |8 +++-
 arch/x86/mm/numa.c   |8 ++--
 arch/x86/mm/numa_emulation.c |3 ++-
 arch/x86/xen/enlighten_pv.c  |2 +-
 drivers/acpi/numa/srat.c |9 +++--
 include/acpi/acpi_numa.h |6 +-
 6 files changed, 24 insertions(+), 12 deletions(-)

diff --git a/arch/x86/include/asm/numa.h b/arch/x86/include/asm/numa.h
index bbfde3d2662f..0aecc0b629e0 100644
--- a/arch/x86/include/asm/numa.h
+++ b/arch/x86/include/asm/numa.h
@@ -3,6 +3,7 @@
 #define _ASM_X86_NUMA_H
 
 #include 
+#include 
 
 #include 
 #include 
@@ -77,7 +78,12 @@ void debug_cpumask_set_cpu(int cpu, int node, bool enable);
 #ifdef CONFIG_NUMA_EMU
 #define FAKE_NODE_MIN_SIZE ((u64)32 << 20)
 #define FAKE_NODE_MIN_HASH_MASK(~(FAKE_NODE_MIN_SIZE - 1UL))
-void numa_emu_cmdline(char *);
+int numa_emu_cmdline(char *str);
+#else /* CONFIG_NUMA_EMU */
+static inline int numa_emu_cmdline(char *str)
+{
+   return -EINVAL;
+}
 #endif /* CONFIG_NUMA_EMU */
 
 #endif /* _ASM_X86_NUMA_H */
diff --git a/arch/x86/mm/numa.c b/arch/x86/mm/numa.c
index aa76ec2d359b..87c52822cc44 100644
--- a/arch/x86/mm/numa.c
+++ b/arch/x86/mm/numa.c
@@ -37,14 +37,10 @@ static __init int numa_setup(char *opt)
return -EINVAL;
if (!strncmp(opt, "off", 3))
numa_off = 1;
-#ifdef CONFIG_NUMA_EMU
if (!strncmp(opt, "fake=", 5))
-   numa_emu_cmdline(opt + 5);
-#endif
-#ifdef CONFIG_ACPI_NUMA
+   return numa_emu_cmdline(opt + 5);
if (!strncmp(opt, "noacpi", 6))
-   acpi_numa = -1;
-#endif
+   disable_srat();
return 0;
 }
 early_param("numa", numa_setup);
diff --git a/arch/x86/mm/numa_emulation.c b/arch/x86/mm/numa_emulation.c
index c5174b4e318b..847c23196e57 100644
--- a/arch/x86/mm/numa_emulation.c
+++ b/arch/x86/mm/numa_emulation.c
@@ -13,9 +13,10 @@
 static int emu_nid_to_phys[MAX_NUMNODES];
 static char *emu_cmdline __initdata;
 
-void __init numa_emu_cmdline(char *str)
+int __init numa_emu_cmdline(char *str)
 {
emu_cmdline = str;
+   return 0;
 }
 
 static int __init emu_find_memblk_by_nid(int nid, const struct numa_meminfo *mi)
diff --git a/arch/x86/xen/enlighten_pv.c b/arch/x86/xen/enlighten_pv.c
index 2aab43a13a8c..64b81ba5a4d6 100644
--- a/arch/x86/xen/enlighten_pv.c
+++ b/arch/x86/xen/enlighten_pv.c
@@ -1350,7 +1350,7 @@ asmlinkage __visible void __init xen_start_kernel(void)
 * any NUMA information the kernel tries to get from ACPI will
 * be meaningless.  Prevent it from trying.
 */
-   acpi_numa = -1;
+   disable_srat();
 #endif
WARN_ON(xen_cpuhp_setup(xen_cpu_up_prepare_pv, xen_cpu_dead_pv));
 
diff --git a/drivers/acpi/numa/srat.c b/drivers/acpi/numa/srat.c
index 15bbaab8500b..1b0ae0a1959b 100644
--- a/drivers/acpi/numa/srat.c
+++ b/drivers/acpi/numa/srat.c
@@ -27,7 +27,12 @@ static int node_to_pxm_map[MAX_NUMNODES]
= { [0 ... MAX_NUMNODES - 1] = PXM_INVAL };
 
 unsigned char acpi_srat_revision __initdata;
-int acpi_numa __initdata;
+static int acpi_numa __initdata;
+
+void __init disable_srat(void)
+{
+   acpi_numa = -1;
+}
 
 int pxm_to_node(int pxm)
 {
@@ -163,7 +168,7 @@ static int __init slit_valid(struct acpi_table_slit *slit)
 void __init bad_srat(void)
 {
pr_err("SRAT: SRAT not used.\n");
-   acpi_numa = -1;
+   disable_srat();
 }
 
 int __init srat_disabled(void)
diff --git a/include/acpi/acpi_numa.h b/include/acpi/acpi_numa.h
index fdebcfc6c8df..8784183b2204 100644
--- a/include/acpi/acpi_numa.h
+++ b/include/acpi/acpi_numa.h
@@ -17,10 +17,14 @@ extern int pxm_to_node(int);
 extern int node_to_pxm(int);
 extern int acpi_map_pxm_to_node(int);
 extern unsigned char acpi_srat_revision;
-extern int acpi_numa __initdata;
+extern void disable_srat(void);
 
 extern void bad_srat(void);
 extern int srat_disabled(void);
 
+#else  /* CONFIG_ACPI_NUMA */
+static inline void disable_srat(void)
+{
+}
 #endif /* CONFIG_ACPI_NUMA */
 #endif /* __ACP_NUMA_H */



[PATCH] scsi/aic7xxx/aicasm: Add missing fclose() call

2020-08-02 Thread Youling Tang
Add missing fclose() call to close "regdiagfile" in the function stop().

Signed-off-by: Youling Tang 
---
 drivers/scsi/aic7xxx/aicasm/aicasm.c | 9 +
 1 file changed, 9 insertions(+)

diff --git a/drivers/scsi/aic7xxx/aicasm/aicasm.c b/drivers/scsi/aic7xxx/aicasm/aicasm.c
index 5f474e4..a504058 100644
--- a/drivers/scsi/aic7xxx/aicasm/aicasm.c
+++ b/drivers/scsi/aic7xxx/aicasm/aicasm.c
@@ -722,6 +722,15 @@ stop(const char *string, int err_code)
}
}
 
+   if (regdiagfile != NULL) {
+   fclose(regdiagfile);
+   if (err_code != 0) {
+   fprintf(stderr, "%s: Removing %s due to error\n",
+   appname, regdiagfilename);
+   unlink(regdiagfilename);
+   }
+   }
+
symlist_free(&patch_functions);
symtable_close();
 
-- 
2.1.0



Re: [PATCH] iio: trigger: sysfs: Disable irqs before calling iio_trigger_poll()

2020-08-02 Thread Christian Eggers
On Saturday, 1 August 2020, 18:02:34 CEST, Jonathan Cameron wrote:
> On Mon, 27 Jul 2020 16:57:13 +0200
> 
> Christian Eggers  wrote:
> > iio_trigger_poll() calls generic_handle_irq(). This function expects to
> > be run with local IRQs disabled.
> 
> Was there an error or warning that led to this patch?
[   17.448466] 000: ------------[ cut here ]------------
[   17.448481] 000: WARNING: CPU: 0 PID: 9 at kernel/irq/handle.c:152 __handle_irq_event_percpu+0x55/0xae
[   17.448511] 000: irq 236 handler irq_default_primary_handler+0x1/0x4 enabled interrupts
[   17.448526] 000: Modules linked in: bridge stp llc usb_f_ncm u_ether libcomposite sd_mod configfs cdc_acm usb_storage scsi_mod ci_hdrc_imx ci_hdrc st_magn_spi ulpi st_sensors_spi ehci_hcd regmap_spi tcpm roles st_magn_i2c typec st_sensors_i2c udc_core st_magn as73211 st_sensors imx_thermal usb49xx usbcore industrialio_triggered_buffer rtc_rv3028 kfifo_buf at24 usb_common nls_base i2c_dev usbmisc_imx phy_mxs_usb anatop_regulator imx2_wdt imx_fan spidev leds_pwm leds_gpio led_class iio_trig_sysfs imx6sx_adc industrialio fixed at25 spi_imx spi_bitbang imx_napi dev imx_sdma virt_dma nfsv3 nfs lockd grace sunrpc ksz9477_i2c ksz9477 tag_ksz ksz_common dsa_core phylink regmap_i2c i2c_imx i2c_core fec ptp pps_core micrel
[   17.448712] 000: CPU: 0 PID: 9 Comm: ksoftirqd/0 Not tainted 5.4.47-rt28+ #446
[   17.448723] 000: Hardware name: Freescale i.MX6 Ultralite (Device Tree)
[   17.448738] 000: [] (unwind_backtrace) from [] (show_stack+0xb/0xc)
[   17.448754] 000: [] (show_stack) from [] (__warn+0x7b/0x8c)
[   17.448772] 000: [] (__warn) from [] (warn_slowpath_fmt+0x31/0x50)
[   17.448787] 000: [] (warn_slowpath_fmt) from [] (__handle_irq_event_percpu+0x55/0xae)
[   17.448807] 000: [] (__handle_irq_event_percpu) from [] (handle_irq_event_percpu+0x19/0x40)
[   17.448823] 000: [] (handle_irq_event_percpu) from [] (handle_irq_event+0x3f/0x5c)
[   17.448839] 000: [] (handle_irq_event) from [] (handle_simple_irq+0x67/0x6a)
[   17.448854] 000: [] (handle_simple_irq) from [] (generic_handle_irq+0xd/0x16)
[   17.448870] 000: [] (generic_handle_irq) from [] (iio_trigger_poll+0x33/0x44 [industrialio])
[   17.448962] 000: [] (iio_trigger_poll [industrialio]) from [] (irq_work_run_list+0x43/0x66)
[   17.449010] 000: [] (irq_work_run_list) from [] (run_timer_softirq+0x7/0x3c)
[   17.449029] 000: [] (run_timer_softirq) from [] (__do_softirq+0x10f/0x160)
[   17.449045] 000: [] (__do_softirq) from [] (run_ksoftirqd+0x19/0x2c)
[   17.449061] 000: [] (run_ksoftirqd) from [] (smpboot_thread_fn+0x13b/0x140)
[   17.449078] 000: [] (smpboot_thread_fn) from [] (kthread+0xa3/0xac)
[   17.449095] 000: [] (kthread) from [] (ret_from_fork+0x11/0x20)
[   17.449110] 000: Exception stack(0xc2063fb0 to 0xc2063ff8)
[   17.449119] 000: 3fa0:   
 
[   17.449130] 000: 3fc0:       
 
[   17.449139] 000: 3fe0:     0013 
[   17.449146] 000: ---[ end trace 0002 ]---



> Or can you point to what call in generic_handle_irq is making the
> assumption that we are breaking?
> 
> Given this is using the irq_work framework I'm wondering if this is
> a more general problem?

If I understand correctly, the kernel temporarily disables hardware interrupts
while hardware irq handlers are run. In case of the iio-trig-hrtimer and
iio-trig-sysfs interrupts, __handle_irq_event_percpu() is not called from a
hardware irq (where interrupts would be disabled), but from software.

Similar examples found here:
0a29ac5bd3 ("net: usb: lan78xx: Disable interrupts before calling generic_handle_irq()")

and
drivers/i2c/busses/i2c-cht-wc.c:103


> 
> Basically more info please!
> 
> Thanks,
> 
> Jonathan
> 
Regards
Christian





Re: [PATCH v4] scsi: ufs: Cleanup completed request without interrupt notification

2020-08-02 Thread Can Guo

Hi Stanley,

On 2020-08-03 11:00, Stanley Chu wrote:

Hi Can,

On Sat, 2020-08-01 at 07:17 +0800, Can Guo wrote:

Hi Bart,

On 2020-08-01 00:51, Bart Van Assche wrote:
> On 2020-07-31 01:00, Can Guo wrote:
>> AFAIK, synchronization of scsi_done is not a problem here, because the
>> scsi layer uses the atomic state, namely SCMD_STATE_COMPLETE, of a scsi
>> cmd to prevent the concurrency of abort and real completion of it.
>>
>> Check func scsi_times_out(), hope it helps.
>>
>> enum blk_eh_timer_return scsi_times_out(struct request *req)
>> {
>> ...
>> if (rtn == BLK_EH_DONE) {
>> /*
>>  * Set the command to complete first in order to prevent a real
>>  * completion from releasing the command while error handling
>>  * is using it. If the command was already completed, then the
>>  * lower level driver beat the timeout handler, and it is safe
>>  * to return without escalating error recovery.
>>  *
>>  * If timeout handling lost the race to a real completion, the
>>  * block layer may ignore that due to a fake timeout injection,
>>  * so return RESET_TIMER to allow error handling another shot
>>  * at this command.
>>  */
>> if (test_and_set_bit(SCMD_STATE_COMPLETE, &scmd->state))
>> return BLK_EH_RESET_TIMER;
>> if (scsi_abort_command(scmd) != SUCCESS) {
>> set_host_byte(scmd, DID_TIME_OUT);
>> scsi_eh_scmd_add(scmd);
>> }
>> }
>> }
>
> I am familiar with this mechanism. My concern is that both the regular
> completion path and the abort handler must call scsi_dma_unmap() before
> calling cmd->scsi_done(cmd). I don't see how
> test_and_set_bit(SCMD_STATE_COMPLETE, &scmd->state) could prevent that
> the regular completion path and the abort handler call scsi_dma_unmap()
> concurrently since both calls happen before the SCMD_STATE_COMPLETE bit
> is set?
>
> Thanks,
>
> Bart.

For scsi_dma_unmap() part, that is true - we should make it serialized
with any other completion paths. I've found it during my fault injection
test, so I've made a patch to fix it, but it only comes in my next error
recovery enhancement patch series. Please check the attachment.



Your patch looks good to me.

I had the same idea before, but I found that calling scsi_done() (by
__ufshcd_transfer_req_compl()) in ufshcd_abort() in an old kernel (e.g.,
4.14) causes issues. That has been resolved by the SCMD_STATE_COMPLETE
flag introduced in newer kernels, so your patch makes sense.

Would you mind sending out this draft patch as a formal patch together
with my patch to fix issues in ufshcd_abort()? Our patches aim to fix
cases where a host/device reset is eventually not triggered as a result
of ufshcd_abort(), for example when a command is aborted successfully
or a command is not pending in the device with its doorbell also cleared.

Thanks,
Stanley Chu



I don't quite follow your fix here, and I haven't tested a fault
injection scenario similar to yours, so I am not sure if I should just
absorb your fix into mine. How about I put my fix in my current error
recovery patch series (maybe in the next version of it) and you can give
your review. So you can still go with your fix as it is. Mine will be
picked up later by Martin. What do you think?

Thanks,

Can Guo.





Re: [PATCH v4] scsi: ufs: Cleanup completed request without interrupt notification

2020-08-02 Thread Can Guo

Hi Bart,

On 2020-08-03 11:12, Bart Van Assche wrote:

On 2020-07-31 16:17, Can Guo wrote:
For scsi_dma_unmap() part, that is true - we should make it serialized
with any other completion paths. I've found it during my fault injection
test, so I've made a patch to fix it, but it only comes in my next error
recovery enhancement patch series. Please check the attachment.


Hi Can,

It is not clear to me how that patch serializes scsi_dma_unmap() against
other completion paths? Doesn't the regular completion path call
__ufshcd_transfer_req_compl() without holding the host lock?

Thanks,

Bart.


FYI, ufshcd_intr() holds the host spin lock the whole time. So, to your
question, the regular completion path from IRQ handler has the host lock
held.


Thanks,

Can Guo.


Re: [PATCH bpf-next 5/5] selftests/bpf: add benchmark for uprobe vs. user_prog

2020-08-02 Thread Andrii Nakryiko
On Sun, Aug 2, 2020 at 9:47 PM Song Liu  wrote:
>
>
> > On Aug 2, 2020, at 6:51 PM, Andrii Nakryiko wrote:
> >
> > On Sat, Aug 1, 2020 at 1:50 AM Song Liu  wrote:
> >>
> >> Add a benchmark to compare performance of
> >>  1) uprobe;
> >>  2) user program w/o args;
> >>  3) user program w/ args;
> >>  4) user program w/ args on random cpu.
> >>
> >
> > Can you please add it to the existing benchmark runner instead, e.g.,
> > along the other bench_trigger benchmarks? No need to re-implement
> > benchmark setup. And also that would also allow to compare existing
> > ways of cheaply triggering a program vs this new _USER program?
>
> Will try.
>
> >
> > If the performance is not significantly better than other ways, do you
> > think it still makes sense to add a new BPF program type? I think
> > triggering KPROBE/TRACEPOINT from bpf_prog_test_run() would be very
> > nice, maybe it's possible to add that instead of a new program type?
> > Either way, let's see comparison with other program triggering
> > mechanisms first.
>
> Triggering KPROBE and TRACEPOINT from bpf_prog_test_run() will be useful.
> But I don't think they can be used instead of user program, for a couple
> reasons. First, KPROBE/TRACEPOINT may be triggered by other programs
> running in the system, so user will have to filter those noise out in
> each program. Second, it is not easy to specify CPU for KPROBE/TRACEPOINT,
> while this feature could be useful in many cases, e.g. get stack trace
> on a given CPU.
>

Right, it's not as convenient with KPROBE/TRACEPOINT as with the USER
program you've added specifically with that feature in mind. But if
you pin user-space thread on the needed CPU and trigger kprobe/tp,
then you'll get what you want. As for the "noise", see how
bench_trigger() deals with that: it records thread ID and filters
everything not matching. You can do the same with CPU ID. It's not as
automatic as with a special BPF program type, but still pretty simple,
which is why I'm still deciding (for myself) whether USER program type
is necessary :)
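
The pin-and-filter approach described above can be sketched in user space
roughly as follows. This is only a minimal sketch, not code from this series
or from bench_trigger: pin_to_cpu() and event_is_ours() are hypothetical
helper names, and the only real interfaces used are sched_setaffinity(2) and
the glibc CPU_* macros. In an actual benchmark the filter would sit in the
BPF program rather than in user space.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>

/*
 * Hypothetical helper: pin the calling thread to a single CPU so that
 * whatever it triggers (kprobe, tracepoint) fires on that CPU.
 * Returns 0 on success, -1 on error (errno set by sched_setaffinity()).
 */
int pin_to_cpu(int cpu)
{
	cpu_set_t set;

	assert(cpu >= 0 && cpu < CPU_SETSIZE);
	CPU_ZERO(&set);
	CPU_SET(cpu, &set);
	/* A pid of 0 means "the calling thread". */
	return sched_setaffinity(0, sizeof(set), &set);
}

/*
 * Hypothetical filter: accept an event only if it came from the recorded
 * thread ID and the CPU we pinned to; everything else is noise.
 */
int event_is_ours(int event_tid, int event_cpu, int want_tid, int want_cpu)
{
	return event_tid == want_tid && event_cpu == want_cpu;
}
```

A caller would record its thread ID, call pin_to_cpu() once, and then drop
every event for which event_is_ours() returns 0.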


> Thanks,
> Song


Re: [PATCH bpf-next 3/5] selftests/bpf: add selftest for BPF_PROG_TYPE_USER

2020-08-02 Thread Andrii Nakryiko
On Sun, Aug 2, 2020 at 9:33 PM Song Liu  wrote:
>
>
>
> > On Aug 2, 2020, at 6:43 PM, Andrii Nakryiko wrote:
> >
> > On Sat, Aug 1, 2020 at 1:50 AM Song Liu  wrote:
> >>
> >> This test checks the correctness of BPF_PROG_TYPE_USER program, including:
> >> running on the right cpu, passing in correct args, returning retval, and
> >> being able to call bpf_get_stack|stackid.
> >>
> >> Signed-off-by: Song Liu 
> >> ---
> >> .../selftests/bpf/prog_tests/user_prog.c  | 52 +
> >> tools/testing/selftests/bpf/progs/user_prog.c | 56 +++
> >> 2 files changed, 108 insertions(+)
> >> create mode 100644 tools/testing/selftests/bpf/prog_tests/user_prog.c
> >> create mode 100644 tools/testing/selftests/bpf/progs/user_prog.c
> >>
> >> diff --git a/tools/testing/selftests/bpf/prog_tests/user_prog.c b/tools/testing/selftests/bpf/prog_tests/user_prog.c
> >> new file mode 100644
> >> index 0..416707b3bff01
> >> --- /dev/null
> >> +++ b/tools/testing/selftests/bpf/prog_tests/user_prog.c
> >> @@ -0,0 +1,52 @@
> >> +// SPDX-License-Identifier: GPL-2.0
> >> +/* Copyright (c) 2020 Facebook */
> >> +#include 
> >> +#include "user_prog.skel.h"
> >> +
> >> +static int duration;
> >> +
> >> +void test_user_prog(void)
> >> +{
> >> +   struct bpf_user_prog_args args = {{0, 1, 2, 3, 4}};
> >> +   struct bpf_prog_test_run_attr attr = {};
> >> +   struct user_prog *skel;
> >> +   int i, numcpu, ret;
> >> +
> >> +   skel = user_prog__open_and_load();
> >> +
> >> +   if (CHECK(!skel, "user_prog__open_and_load",
> >> + "skeleton open_and_load failed\n"))
> >> +   return;
> >> +
> >> +   numcpu = libbpf_num_possible_cpus();
> >
> > nit: possible doesn't mean online right now, so it will fail on
> > offline or non-present CPUs
>
> Just found parse_cpu_mask_file(), will use it to fix this.
>
> [...]
>
> >> +
> >> +volatile int cpu_match = 1;
> >> +volatile __u64 sum = 1;
> >> +volatile int get_stack_success = 0;
> >> +volatile int get_stackid_success = 0;
> >> +volatile __u64 stacktrace[PERF_MAX_STACK_DEPTH];
> >
> > nit: no need for volatile for non-static variables
> >
> >> +
> >> +SEC("user")
> >> +int user_func(struct bpf_user_prog_ctx *ctx)
> >
> > If you put args in bpf_user_prog_ctx as a first field, you should be
> > able to re-use the BPF_PROG macro to access those arguments in a more
> > user-friendly way.
>
> I am not sure I am following here. Do you mean something like:
>
> struct bpf_user_prog_ctx {
> __u64 args[BPF_USER_PROG_MAX_ARGS];
> struct pt_regs *regs;
> };
>
> (swap args and regs)?
>

Yes, BPF_PROG assumes that context is a plain u64[5] array, so if you
put args at the beginning, it will work nicely with BPF_PROG.
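
To illustrate why the field order matters, here is a minimal user-space
sketch. The struct and function names are hypothetical stand-ins, not the
definitions from the patch, and the argument count of five is taken from the
selftest above. With args as the first member, code that treats the context
pointer as a plain array of u64 reads the arguments directly:

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_ARGS 5	/* hypothetical; mirrors the five args in the selftest */

/* Hypothetical layout with args first, as suggested above. */
struct sketch_user_prog_ctx {
	uint64_t args[SKETCH_MAX_ARGS];
	void *regs;		/* stand-in for struct pt_regs * */
};

/*
 * Read argument n by treating the context as a plain u64 array. This is
 * only valid because args sits at offset 0 of the struct.
 */
uint64_t sketch_ctx_arg(const void *ctx, unsigned int n)
{
	const uint64_t *raw = ctx;

	return raw[n];
}
```

If regs came first instead, raw[0] would be the pointer value rather than the
first argument, which is the mismatch the swap avoids.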

> Thanks,
> Song
>
>


Re: BUG: unable to handle kernel NULL pointer dereference in bpf_prog_ADDR

2020-08-02 Thread John Fastabend
Eric Dumazet wrote:
> 
> 
> On 8/2/20 3:45 PM, syzbot wrote:
> > Hello,
> > 
> > syzbot found the following issue on:
> > 
> > HEAD commit:ac3a0c84 Merge git://git.kernel.org/pub/scm/linux/kernel/g..
> > git tree:   upstream
> > console output: https://syzkaller.appspot.com/x/log.txt?x=1323497090
> > kernel config:  https://syzkaller.appspot.com/x/.config?x=c0cfcf935bcc94d2
> > dashboard link: https://syzkaller.appspot.com/bug?extid=192a7fbbece55f740074
> > compiler:   gcc (GCC) 10.1.0-syz 20200507
> > syz repro:  https://syzkaller.appspot.com/x/repro.syz?x=141541ea90
> > 
> > IMPORTANT: if you fix the issue, please add the following tag to the commit:
> > Reported-by: syzbot+192a7fbbece55f740...@syzkaller.appspotmail.com
> > 
> > BUG: kernel NULL pointer dereference, address: 
> > #PF: supervisor read access in kernel mode
> > #PF: error_code(0x) - not-present page
> > PGD 9176a067 P4D 9176a067 PUD 9176b067 PMD 0 
> > Oops:  [#1] PREEMPT SMP KASAN
> > CPU: 1 PID: 8142 Comm: syz-executor.2 Not tainted 5.8.0-rc7-syzkaller #0
> > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
> > RIP: 0010:bpf_prog_e48ebe87b99394c4+0x1f/0x590
> > Code: cc cc cc cc cc cc cc cc cc cc cc 0f 1f 44 00 00 55 48 89 e5 48 81 ec 00 00 00 00 53 41 55 41 56 41 57 6a 00 31 c0 48 8b 47 28 <48> 8b 40 00 8b 80 00 01 00 00 5b 41 5f 41 5e 41 5d 5b c9 c3 cc cc
> > RSP: 0018:c900038a7b00 EFLAGS: 00010246
> > RAX:  RBX: dc00 RCX: dc00
> > RDX: 88808cfb0200 RSI: c9e7e038 RDI: c900038a7ca8
> > RBP: c900038a7b28 R08:  R09: 
> > R10:  R11:  R12: c9e7e000
> > R13: c9e7e000 R14: 0001 R15: 
> > FS:  7fda07fef700() GS:8880ae70() knlGS:
> > CS:  0010 DS:  ES:  CR0: 80050033
> > CR2:  CR3: 91769000 CR4: 001406e0
> > DR0:  DR1:  DR2: 
> > DR3:  DR6: fffe0ff0 DR7: 0400
> > Call Trace:
> >  bpf_prog_run_xdp include/linux/filter.h:734 [inline]
> >  bpf_test_run+0x221/0xc70 net/bpf/test_run.c:47
> >  bpf_prog_test_run_xdp+0x2ca/0x510 net/bpf/test_run.c:524
> >  bpf_prog_test_run kernel/bpf/syscall.c:2983 [inline]
> >  __do_sys_bpf+0x2117/0x4b10 kernel/bpf/syscall.c:4135
> >  do_syscall_64+0x60/0xe0 arch/x86/entry/common.c:384
> >  entry_SYSCALL_64_after_hwframe+0x44/0xa9
> > RIP: 0033:0x45cc79
> > Code: 2d b6 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 fb b5 fb ff c3 66 2e 0f 1f 84 00 00 00 00
> > RSP: 002b:7fda07feec78 EFLAGS: 0246 ORIG_RAX: 0141
> > RAX: ffda RBX: 1740 RCX: 0045cc79
> > RDX: 0028 RSI: 2080 RDI: 000a
> > RBP: 0078bfe0 R08:  R09: 
> > R10:  R11: 0246 R12: 0078bfac
> > R13: 7ffc3ef769bf R14: 7fda07fef9c0 R15: 0078bfac
> > Modules linked in:
> > CR2: 
> > ---[ end trace b2d24107e7fdae7d ]---
> > RIP: 0010:bpf_prog_e48ebe87b99394c4+0x1f/0x590
> > Code: cc cc cc cc cc cc cc cc cc cc cc 0f 1f 44 00 00 55 48 89 e5 48 81 ec 00 00 00 00 53 41 55 41 56 41 57 6a 00 31 c0 48 8b 47 28 <48> 8b 40 00 8b 80 00 01 00 00 5b 41 5f 41 5e 41 5d 5b c9 c3 cc cc
> > RSP: 0018:c900038a7b00 EFLAGS: 00010246
> > RAX:  RBX: dc00 RCX: dc00
> > RDX: 88808cfb0200 RSI: c9e7e038 RDI: c900038a7ca8
> > RBP: c900038a7b28 R08:  R09: 
> > R10:  R11:  R12: c9e7e000
> > R13: c9e7e000 R14: 0001 R15: 
> > FS:  7fda07fef700() GS:8880ae70() knlGS:
> > CS:  0010 DS:  ES:  CR0: 80050033
> > CR2:  CR3: 91769000 CR4: 001406e0
> > DR0:  DR1:  DR2: 
> > DR3:  DR6: fffe0ff0 DR7: 0400
> > 
> > 
> > ---
> > This report is generated by a bot. It may contain errors.
> > See https://goo.gl/tpsmEJ for more information about syzbot.
> > syzbot engineers can be reached at syzkal...@googlegroups.com.
> > 
> > syzbot will keep track of this issue. See:
> > https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
> > syzbot can test patches for this issue, for details see:
> > https://goo.gl/tpsmEJ#testing-patches
> > 
> 
> 
> # https://syzkaller.appspot.com/bug?id=d60883a0b19a778d2bcab55f3f6459467f4a3ea7
> # See https://goo.gl/kgGztJ for information about syzkaller reproducers.
> 

Re: [PATCH bpf-next 2/5] libbpf: support BPF_PROG_TYPE_USER programs

2020-08-02 Thread Andrii Nakryiko
On Sun, Aug 2, 2020 at 9:21 PM Song Liu  wrote:
>
>
>
> > On Aug 2, 2020, at 6:40 PM, Andrii Nakryiko wrote:
> >
> > On Sat, Aug 1, 2020 at 1:50 AM Song Liu  wrote:
> >>
> >> Add cpu_plus to bpf_prog_test_run_attr. Add BPF_PROG_SEC "user" for
> >> BPF_PROG_TYPE_USER programs.
> >>
> >> Signed-off-by: Song Liu 
> >> ---
> >> tools/lib/bpf/bpf.c   | 1 +
> >> tools/lib/bpf/bpf.h   | 3 +++
> >> tools/lib/bpf/libbpf.c| 1 +
> >> tools/lib/bpf/libbpf_probes.c | 1 +
> >> 4 files changed, 6 insertions(+)
> >>
> >> diff --git a/tools/lib/bpf/bpf.c b/tools/lib/bpf/bpf.c
> >> index e1bdf214f75fe..b28c3daa9c270 100644
> >> --- a/tools/lib/bpf/bpf.c
> >> +++ b/tools/lib/bpf/bpf.c
> >> @@ -693,6 +693,7 @@ int bpf_prog_test_run_xattr(struct 
> >> bpf_prog_test_run_attr *test_attr)
> >>attr.test.ctx_size_in = test_attr->ctx_size_in;
> >>attr.test.ctx_size_out = test_attr->ctx_size_out;
> >>attr.test.repeat = test_attr->repeat;
> >> +   attr.test.cpu_plus = test_attr->cpu_plus;
> >>
> >>ret = sys_bpf(BPF_PROG_TEST_RUN, &attr, sizeof(attr));
> >>test_attr->data_size_out = attr.test.data_size_out;
> >> diff --git a/tools/lib/bpf/bpf.h b/tools/lib/bpf/bpf.h
> >> index 6d367e01d05e9..0c799740df566 100644
> >> --- a/tools/lib/bpf/bpf.h
> >> +++ b/tools/lib/bpf/bpf.h
> >> @@ -205,6 +205,9 @@ struct bpf_prog_test_run_attr {
> >>void *ctx_out;  /* optional */
> >>__u32 ctx_size_out; /* in: max length of ctx_out
> >> * out: length of cxt_out */
> >> +   __u32 cpu_plus; /* specify which cpu to run the test with
> >> +* cpu_plus = cpu_id + 1.
> >> +* If cpu_plus = 0, run on current cpu */
> >
> > We can't do this due to ABI guarantees. We'll have to add a new API
> > using OPTS arguments.
>
> To make sure I understand this correctly, the concern is when we compile
> the binary with one version of libbpf and run it with libbpf.so of a
> different version, right?
>

yep, exactly

> Thanks,
> Song
>
> >
> >> };
> >>
> >> LIBBPF_API int bpf_prog_test_run_xattr(struct bpf_prog_test_run_attr 
> >> *test_attr);
> >> diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
> >> index b9f11f854985b..9ce175a486214 100644
> >> --- a/tools/lib/bpf/libbpf.c
> >> +++ b/tools/lib/bpf/libbpf.c
> >> @@ -6922,6 +6922,7 @@ static const struct bpf_sec_def section_defs[] = {
> >>BPF_PROG_SEC("lwt_out", BPF_PROG_TYPE_LWT_OUT),
> >>BPF_PROG_SEC("lwt_xmit",BPF_PROG_TYPE_LWT_XMIT),
> >>BPF_PROG_SEC("lwt_seg6local",   
> >> BPF_PROG_TYPE_LWT_SEG6LOCAL),
> >> +   BPF_PROG_SEC("user",BPF_PROG_TYPE_USER),
> >
> > let's do "user/" for consistency with most other prog types (and nice
> > separation between prog type and custom user name)
>
> Will update.
>

thanks!
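
For readers following the ABI point above: appending a field to a struct that callers pass by pointer is unsafe when the caller and libbpf.so come from different versions, because an older caller's struct is shorter than what the library expects. A minimal sketch of a size-prefixed options struct, similar in spirit to the OPTS approach mentioned above, could look like the following. The names run_opts, OPTS_HAS, OPTS_GET and effective_cpu_plus are made up for illustration and are not the libbpf definitions.

```c
#include <stddef.h>

/* Hypothetical options struct. The first field records the size of the
 * struct as the caller knew it at compile time. */
struct run_opts {
    size_t sz;
    unsigned int repeat;
    unsigned int cpu_plus; /* imagine this was added in a later version */
};

/* True if the caller's struct is large enough to contain `field`. */
#define OPTS_HAS(opts, field) \
    ((opts) && (opts)->sz >= offsetof(struct run_opts, field) + sizeof((opts)->field))

/* Read `field` if the caller provided it, otherwise use `fallback`. */
#define OPTS_GET(opts, field, fallback) \
    (OPTS_HAS(opts, field) ? (opts)->field : (fallback))

/* Library-side accessor: never touches memory past what the caller owns. */
unsigned int effective_cpu_plus(const struct run_opts *opts)
{
    return OPTS_GET(opts, cpu_plus, 0);
}
```

A caller built against the older layout sets sz to the smaller size, so the library falls back to 0 ("run on current cpu") instead of reading past the end of the caller's struct.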


KASAN: use-after-free Write in sco_chan_del

2020-08-02 Thread syzbot
Hello,

syzbot found the following issue on:

HEAD commit:ac3a0c84 Merge git://git.kernel.org/pub/scm/linux/kernel/g..
git tree:   upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=11638a4290
kernel config:  https://syzkaller.appspot.com/x/.config?x=c0cfcf935bcc94d2
dashboard link: https://syzkaller.appspot.com/bug?extid=8f6017ee5c7fb9515782
compiler:   gcc (GCC) 10.1.0-syz 20200507
syz repro:  https://syzkaller.appspot.com/x/repro.syz?x=17fd776c90
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=15ac701490

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+8f6017ee5c7fb9515...@syzkaller.appspotmail.com

==
BUG: KASAN: use-after-free in instrument_atomic_write 
include/linux/instrumented.h:71 [inline]
BUG: KASAN: use-after-free in atomic_dec_and_test 
include/asm-generic/atomic-instrumented.h:748 [inline]
BUG: KASAN: use-after-free in hci_conn_drop 
include/net/bluetooth/hci_core.h:1049 [inline]
BUG: KASAN: use-after-free in sco_chan_del+0xe6/0x430 net/bluetooth/sco.c:148
Write of size 4 at addr 8880a03d8010 by task syz-executor104/6978

CPU: 1 PID: 6978 Comm: syz-executor104 Not tainted 5.8.0-rc7-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 
01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x18f/0x20d lib/dump_stack.c:118
 print_address_description.constprop.0.cold+0xae/0x436 mm/kasan/report.c:383
 __kasan_report mm/kasan/report.c:513 [inline]
 kasan_report.cold+0x1f/0x37 mm/kasan/report.c:530
 check_memory_region_inline mm/kasan/generic.c:186 [inline]
 check_memory_region+0x13d/0x180 mm/kasan/generic.c:192
 instrument_atomic_write include/linux/instrumented.h:71 [inline]
 atomic_dec_and_test include/asm-generic/atomic-instrumented.h:748 [inline]
 hci_conn_drop include/net/bluetooth/hci_core.h:1049 [inline]
 sco_chan_del+0xe6/0x430 net/bluetooth/sco.c:148
 __sco_sock_close+0x16e/0x5b0 net/bluetooth/sco.c:433
 sco_sock_close net/bluetooth/sco.c:447 [inline]
 sco_sock_release+0x69/0x290 net/bluetooth/sco.c:1021
 __sock_release+0xcd/0x280 net/socket.c:605
 sock_close+0x18/0x20 net/socket.c:1278
 __fput+0x33c/0x880 fs/file_table.c:281
 task_work_run+0xdd/0x190 kernel/task_work.c:135
 exit_task_work include/linux/task_work.h:25 [inline]
 do_exit+0xb72/0x2a40 kernel/exit.c:805
 do_group_exit+0x125/0x310 kernel/exit.c:903
 get_signal+0x40b/0x1ee0 kernel/signal.c:2743
 do_signal+0x82/0x2520 arch/x86/kernel/signal.c:810
 exit_to_usermode_loop arch/x86/entry/common.c:235 [inline]
 __prepare_exit_to_usermode+0x156/0x1f0 arch/x86/entry/common.c:269
 do_syscall_64+0x6c/0xe0 arch/x86/entry/common.c:393
 entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x446e69
Code: Bad RIP value.
RSP: 002b:7fff15a45008 EFLAGS: 0246 ORIG_RAX: 002a
RAX: fffc RBX:  RCX: 00446e69
RDX: 0008 RSI: 2000 RDI: 0004
RBP: 0004 R08: 0002 R09: 0003000100ff
R10: 0004 R11: 0246 R12: 
R13: 00407ac0 R14:  R15: 

Allocated by task 6978:
 save_stack+0x1b/0x40 mm/kasan/common.c:48
 set_track mm/kasan/common.c:56 [inline]
 __kasan_kmalloc.constprop.0+0xc2/0xd0 mm/kasan/common.c:494
 kmem_cache_alloc_trace+0x14f/0x2d0 mm/slab.c:3551
 kmalloc include/linux/slab.h:555 [inline]
 kzalloc include/linux/slab.h:669 [inline]
 hci_conn_add+0x53/0x1340 net/bluetooth/hci_conn.c:525
 hci_connect_sco+0x350/0x860 net/bluetooth/hci_conn.c:1279
 sco_connect net/bluetooth/sco.c:240 [inline]
 sco_sock_connect+0x308/0x980 net/bluetooth/sco.c:576
 __sys_connect_file+0x155/0x1a0 net/socket.c:1854
 __sys_connect+0x160/0x190 net/socket.c:1871
 __do_sys_connect net/socket.c:1882 [inline]
 __se_sys_connect net/socket.c:1879 [inline]
 __x64_sys_connect+0x6f/0xb0 net/socket.c:1879
 do_syscall_64+0x60/0xe0 arch/x86/entry/common.c:384
 entry_SYSCALL_64_after_hwframe+0x44/0xa9

Freed by task 6972:
 save_stack+0x1b/0x40 mm/kasan/common.c:48
 set_track mm/kasan/common.c:56 [inline]
 kasan_set_free_info mm/kasan/common.c:316 [inline]
 __kasan_slab_free+0xf5/0x140 mm/kasan/common.c:455
 __cache_free mm/slab.c:3426 [inline]
 kfree+0x103/0x2c0 mm/slab.c:3757
 device_release+0x71/0x200 drivers/base/core.c:1579
 kobject_cleanup lib/kobject.c:693 [inline]
 kobject_release lib/kobject.c:722 [inline]
 kref_put include/linux/kref.h:65 [inline]
 kobject_put+0x1c0/0x270 lib/kobject.c:739
 put_device+0x1b/0x30 drivers/base/core.c:2799
 hci_conn_del+0x27e/0x6a0 net/bluetooth/hci_conn.c:645
 hci_phy_link_complete_evt.isra.0+0x508/0x790 net/bluetooth/hci_event.c:4921
 hci_event_packet+0x481a/0x86f5 net/bluetooth/hci_event.c:6180
 hci_rx_work+0x22e/0xb10 net/bluetooth/hci_core.c:4705
 process_one_work+0x94c/0x1670 kernel/workqueue.c:2269
 worker_thread+0x64c/0x1120 

Re: [PATCH] tools/bpf/bpftool: Fix wrong return value in do_dump()

2020-08-02 Thread John Fastabend
Andrii Nakryiko wrote:
> On Sun, Aug 2, 2020 at 4:16 AM Tianjia Zhang
>  wrote:
> >
> > If btf_id does not exist, a negative error code -ENOENT
> > should be returned.
> >
> > Fixes: c93cc69004df3 ("bpftool: add ability to dump BTF types")
> > Cc: Andrii Nakryiko 
> > Signed-off-by: Tianjia Zhang 
> > ---
> 
> 
> Acked-by: Andrii Nakryiko 
> 

Acked-by: John Fastabend 
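
As a small illustration of the convention the fix restores (return a negative errno value on failure, 0 on success), here is a standalone sketch. The table and the function toy_type_name_by_id are hypothetical and are not bpftool code.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical id-to-name table standing in for a set of BTF types. */
struct toy_type {
    int id;
    const char *name;
};

static const struct toy_type toy_types[] = {
    { 1, "int" },
    { 2, "char" },
};

/* Returns 0 and stores the name on success, -ENOENT if the id is unknown. */
int toy_type_name_by_id(int id, const char **name)
{
    for (size_t i = 0; i < sizeof(toy_types) / sizeof(toy_types[0]); i++) {
        if (toy_types[i].id == id) {
            *name = toy_types[i].name;
            return 0;
        }
    }
    return -ENOENT; /* negative, so callers testing "err < 0" see the failure */
}
```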


Re: [PATCH v6] scsi: ufs: Quiesce all scsi devices before shutdown

2020-08-02 Thread Can Guo

Hi Stanley,

On 2020-08-03 12:25, Stanley Chu wrote:

Currently, I/O requests can still be submitted to the UFS device while
the UFS shutdown flow is in progress. This may lead to races such as the
scenario below, and the system may eventually crash due to unclocked
register accesses.

To fix this kind of issue, in ufshcd_shutdown():

1. Use pm_runtime_get_sync() instead of resuming UFS device by
   ufshcd_runtime_resume() "internally" to let runtime PM framework
   manage and prevent concurrent runtime operations by incoming I/O
   requests.

2. Specifically quiesce all SCSI devices to block all I/O requests
   after device is resumed.

Example of racing scenario: While UFS device is runtime-suspended

Thread #1: Executing UFS shutdown flow, e.g.,
   ufshcd_suspend(UFS_SHUTDOWN_PM)

Thread #2: Executing runtime resume flow triggered by I/O request,
   e.g., ufshcd_resume(UFS_RUNTIME_PM)

This breaks the assumption that UFS PM flows can not be running
concurrently and some unexpected racing behavior may happen.

Signed-off-by: Stanley Chu 
---
Changes:
  - Since v4: Use pm_runtime_get_sync() instead of resuming UFS device
by ufshcd_runtime_resume() "internally".
---
 drivers/scsi/ufs/ufshcd.c | 39 ++-
 1 file changed, 34 insertions(+), 5 deletions(-)

diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
index 307622284239..fc01171d13b1 100644
--- a/drivers/scsi/ufs/ufshcd.c
+++ b/drivers/scsi/ufs/ufshcd.c
@@ -159,6 +159,12 @@ struct ufs_pm_lvl_states ufs_pm_lvl_states[] = {
{UFS_POWERDOWN_PWR_MODE, UIC_LINK_OFF_STATE},
 };

+#define ufshcd_scsi_for_each_sdev(fn) \
+   list_for_each_entry(starget, &hba->host->__targets, siblings) { \
+   __starget_for_each_device(starget, NULL, \
+ fn); \
+   }
+
 static inline enum ufs_dev_pwr_mode
 ufs_get_pm_lvl_to_dev_pwr_mode(enum ufs_pm_level lvl)
 {
@@ -8629,6 +8635,13 @@ int ufshcd_runtime_idle(struct ufs_hba *hba)
 }
 EXPORT_SYMBOL(ufshcd_runtime_idle);

+static void ufshcd_quiesce_sdev(struct scsi_device *sdev, void *data)
+{
+   /* Suspended devices are already quiesced so can be skipped */


Why can runtime suspended sdevs be skipped? Block layer can still resume
them at any time, no?


+   if (!pm_runtime_suspended(&sdev->sdev_gendev))
+   scsi_device_quiesce(sdev);
+}
+
 /**
  * ufshcd_shutdown - shutdown routine
  * @hba: per adapter instance
@@ -8640,6 +8653,7 @@ EXPORT_SYMBOL(ufshcd_runtime_idle);
 int ufshcd_shutdown(struct ufs_hba *hba)
 {
int ret = 0;
+   struct scsi_target *starget;

if (!hba->is_powered)
goto out;
@@ -8647,11 +8661,26 @@ int ufshcd_shutdown(struct ufs_hba *hba)
if (ufshcd_is_ufs_dev_poweroff(hba) && ufshcd_is_link_off(hba))
goto out;

-   if (pm_runtime_suspended(hba->dev)) {
-   ret = ufshcd_runtime_resume(hba);
-   if (ret)
-   goto out;
-   }
+   /*
+* Let runtime PM framework manage and prevent concurrent runtime
+* operations with shutdown flow.
+*/
+   pm_runtime_get_sync(hba->dev);
+
+   /*
+* Quiesce all SCSI devices to prevent any non-PM requests sending
+* from block layer during and after shutdown.
+*
+* Here we can not use blk_cleanup_queue() since PM requests
+* (with BLK_MQ_REQ_PREEMPT flag) are still required to be sent
+* through block layer. Therefore SCSI command queued after the
+* scsi_target_quiesce() call returned will block until
+* blk_cleanup_queue() is called.
+*
+* Besides, scsi_target_"un"quiesce (e.g., scsi_target_resume) can
+* be ignored since shutdown is one-way flow.
+*/
+   ufshcd_scsi_for_each_sdev(ufshcd_quiesce_sdev);


Any reason not to use scsi_target_quiesce() here?

Thanks,

Can Guo.



ret = ufshcd_suspend(hba, UFS_SHUTDOWN_PM);
 out:
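
To make the first review question above concrete, here is a toy userspace model, not SCSI or UFS code. It assumes, as the patch comment states, that a quiesced device still admits PM requests but blocks normal I/O. All names (toy_sdev, toy_quiesce_all, toy_runtime_resume, toy_admits) are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

enum req_kind { REQ_NORMAL, REQ_PM };

/* Toy stand-in for a SCSI device: only the two bits this model needs. */
struct toy_sdev {
    bool runtime_suspended;
    bool quiesced;
};

/* Quiesce every device, optionally skipping runtime-suspended ones. */
void toy_quiesce_all(struct toy_sdev *devs, size_t n, bool skip_suspended)
{
    for (size_t i = 0; i < n; i++) {
        if (skip_suspended && devs[i].runtime_suspended)
            continue;
        devs[i].quiesced = true;
    }
}

/* Models a runtime resume triggered later by incoming I/O. */
void toy_runtime_resume(struct toy_sdev *d)
{
    d->runtime_suspended = false;
}

/* PM requests always pass; normal I/O passes only if not quiesced. */
bool toy_admits(const struct toy_sdev *d, enum req_kind kind)
{
    return kind == REQ_PM || !d->quiesced;
}
```

In this model, a device that was skipped because it was suspended still admits normal I/O once it is resumed, which is the situation the question asks about. Whether the real driver can reach that state depends on details the model does not capture.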


Re: [PATCH] powerpc: fix up PPC_FSL_BOOK3E build

2020-08-02 Thread Stephen Rothwell
Hi all,

On Mon, 3 Aug 2020 13:54:47 +1000 Stephen Rothwell  
wrote:
>
> Commit
> 
>   1c9df907da83 ("random: fix circular include dependency on arm64 after 
> addition of percpu.h")
> 
> exposed a circular include dependency:
> 
> asm/mmu.h includes asm/percpu.h, which includes asm/paca.h, which
> includes asm/mmu.h
> 
> So fix it by extracting the small part of asm/mmu.h that needs
> asm/percpu.h into a new file and including that where necessary.
> 
> Cc: Willy Tarreau 
> Cc: 
> Signed-off-by: Stephen Rothwell 

I should have put:

Fixes: 1c9df907da83 ("random: fix circular include dependency on arm64 after 
addition of percpu.h")

-- 
Cheers,
Stephen Rothwell




Re: [Linux-kernel-mentees] [PATCH net] rds: Prevent kernel-infoleak in rds_notify_queue_get()

2020-08-02 Thread Leon Romanovsky
On Sun, Aug 02, 2020 at 03:45:40PM -0700, Joe Perches wrote:
> On Sun, 2020-08-02 at 19:28 -0300, Jason Gunthorpe wrote:
> > On Sun, Aug 02, 2020 at 03:23:58PM -0700, Joe Perches wrote:
> > > On Sun, 2020-08-02 at 19:10 -0300, Jason Gunthorpe wrote:
> > > > On Sat, Aug 01, 2020 at 08:38:33AM +0300, Leon Romanovsky wrote:
> > > >
> > > > > I'm using {} instead of {0} because of this GCC bug.
> > > > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53119
> > > >
> > > > This is why the {} extension exists..
> > >
> > > There is no guarantee that the gcc struct initialization {}
> > > extension also zeros padding.
> >
> > We just went over this. Yes there is, C11 requires it.
>
> c11 is not c90.  The kernel uses c90.

That is not accurate: the kernel uses the gnu89 dialect, which is C90 with
some C99 features [1]. In our case, we rely on the GCC extension {}, which
doesn't contradict the standard [2] and fills holes with zeros too.

[1] Makefile:500
   496 KBUILD_CFLAGS   := -Wall -Wundef -Werror=strict-prototypes -Wno-trigraphs \
   497                    -fno-strict-aliasing -fno-common -fshort-wchar -fno-PIE \
   498                    -Werror=implicit-function-declaration -Werror=implicit-int \
   499                    -Wno-format-security \
   500                    -std=gnu89

[2] From GCC:
https://gcc.gnu.org/onlinedocs/gcc/C-Dialect-Options.html
"When a base standard is specified, the compiler accepts all programs
following that standard plus those using GNU extensions that do not
contradict it."

Thanks

>
>
>
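
For context on the padding discussion in this thread: one common way to avoid depending on initializer semantics at all is to memset() the whole object before filling in its members. The sketch below is a userspace illustration only. The struct toy_notifier and both helpers are hypothetical and are not the RDS code. It relies on the usual compiler behaviour that member stores leave padding bytes untouched.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical struct with padding between the two members on
 * typical ABIs (1-byte field followed by an 8-byte field). */
struct toy_notifier {
    unsigned char status;
    unsigned long long user_token;
};

/* Zero every byte, padding included, then set the members. */
void toy_notifier_fill(struct toy_notifier *n, unsigned char status,
                       unsigned long long token)
{
    memset(n, 0, sizeof(*n));
    n->status = status;
    n->user_token = token;
}

/* Count non-zero bytes in the gap between `status` and `user_token`. */
size_t toy_nonzero_padding(const struct toy_notifier *n)
{
    const unsigned char *bytes = (const unsigned char *)n;
    size_t first = offsetof(struct toy_notifier, status) + sizeof(n->status);
    size_t last = offsetof(struct toy_notifier, user_token);
    size_t count = 0;

    for (size_t i = first; i < last; i++)
        if (bytes[i] != 0)
            count++;
    return count;
}
```

Copying such an object out byte-for-byte then cannot expose stale stack contents through the gap, regardless of how {} or {0} treat padding on a given compiler.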


Re: kernel panic: panic_on_warn set

2020-08-02 Thread Dmitry Vyukov
On Mon, Aug 3, 2020 at 6:55 AM Dmitry Vyukov  wrote:
>
> On Mon, Aug 3, 2020 at 5:24 AM butt3rflyh4ck  
> wrote:
> >
> > Hi, syzkaller always gets this crash. I think this crash is not a
> > bug; maybe some wrong config
> > causes it. Can you give me some help? Thanks.
> >
> > log is below:
> > 888063151a00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
> >  888063151a80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
> > ==
> > Kernel panic - not syncing: panic_on_warn set ...
> > CPU: 0 PID: 18555 Comm: syz-executor.2 Tainted: GB 
> > 5.8.0-rc7+ #3
> > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
> > 1.10.2-1ubuntu1 04/01/2014
> > Call Trace:
> >  __dump_stack lib/dump_stack.c:77 [inline]
> >  dump_stack+0x18f/0x20d lib/dump_stack.c:118
> >  panic+0x2e3/0x75c kernel/panic.c:231
> >  end_report+0x4d/0x53 mm/kasan/report.c:104
> >  __kasan_report mm/kasan/report.c:520 [inline]
> >  kasan_report.cold+0xd/0x37 mm/kasan/report.c:530
> >  __fb_pad_aligned_buffer include/linux/fb.h:655 [inline]
> >  bit_putcs_aligned drivers/video/fbdev/core/bitblit.c:96 [inline]
> >  bit_putcs+0xbb6/0xd20 drivers/video/fbdev/core/bitblit.c:185
> >  fbcon_putcs+0x33c/0x3f0 drivers/video/fbdev/core/fbcon.c:1362
> >  do_update_region+0x399/0x630 drivers/tty/vt/vt.c:683
> >  redraw_screen+0x64c/0x770 drivers/tty/vt/vt.c:1029
> >  vc_do_resize+0x/0x13f0 drivers/tty/vt/vt.c:1320
> >  vt_ioctl+0x2037/0x2670 drivers/tty/vt/vt_ioctl.c:901
> >  tty_ioctl+0x1019/0x15f0 drivers/tty/tty_io.c:2656
> >  vfs_ioctl fs/ioctl.c:48 [inline]
> >  ksys_ioctl+0x11a/0x180 fs/ioctl.c:753
> >  __do_sys_ioctl fs/ioctl.c:762 [inline]
> >  __se_sys_ioctl fs/ioctl.c:760 [inline]
> >  __x64_sys_ioctl+0x6f/0xb0 fs/ioctl.c:760
> >  do_syscall_64+0x60/0xe0 arch/x86/entry/common.c:384
> >  entry_SYSCALL_64_after_hwframe+0x44/0xa9
> > RIP: 0033:0x467129
> > Code: f7 d8 64 89 02 b8 ff ff ff ff c3 66 0f 1f 44 00 00 48 89 f8 48
> > 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d
> > 01 f0 ff ff 73 01 c3 48 c7 c1 bc ff ff ff f7 d8 64 89 01 48
> > RSP: 002b:7fb6f7854c58 EFLAGS: 0246 ORIG_RAX: 0010
> > RAX: ffda RBX: 0071e7c0 RCX: 00467129
> > RDX: 2000 RSI: 560a RDI: 0003
> > RBP: 004c0bd5 R08:  R09: 
> > R10:  R11: 0246 R12: 0076bf00
> > R13:  R14: 0076bf00 R15: 7ffd478a3fc0
> > Dumping ftrace buffer:
> >(ftrace buffer empty)
> > Kernel Offset: disabled
> > Rebooting in 1 seconds..
>
> +syzkaller mailing list, LKML
> -syzkaller-bugs to BCC
>
> Hi butt3rflyh4ck,
>
> This is a very real kernel bug, see:
> https://groups.google.com/forum/#!searchin/syzkaller-bugs/%22bit_putcs_aligned%22%7Csort:date
> There are some reproducers available if you need them.

Or, if you are asking specifically about "Kernel panic - not syncing:
panic_on_warn set", it happens because you set the panic_on_warn=1
cmdline argument. But it only triggers after a bug has already happened
in the kernel; it just turns some non-fatal bugs into fatal ones. So
removing panic_on_warn=1 won't help. It's the right setting for syzkaller.


Re: kernel panic: panic_on_warn set

2020-08-02 Thread Dmitry Vyukov
On Mon, Aug 3, 2020 at 5:24 AM butt3rflyh4ck  wrote:
>
> Hi, syzkaller always gets this crash. I think this crash is not a
> bug; maybe some wrong config
> causes it. Can you give me some help? Thanks.
>
> log is below:
> 888063151a00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>  888063151a80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
> ==
> Kernel panic - not syncing: panic_on_warn set ...
> CPU: 0 PID: 18555 Comm: syz-executor.2 Tainted: GB 5.8.0-rc7+ 
> #3
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
> 1.10.2-1ubuntu1 04/01/2014
> Call Trace:
>  __dump_stack lib/dump_stack.c:77 [inline]
>  dump_stack+0x18f/0x20d lib/dump_stack.c:118
>  panic+0x2e3/0x75c kernel/panic.c:231
>  end_report+0x4d/0x53 mm/kasan/report.c:104
>  __kasan_report mm/kasan/report.c:520 [inline]
>  kasan_report.cold+0xd/0x37 mm/kasan/report.c:530
>  __fb_pad_aligned_buffer include/linux/fb.h:655 [inline]
>  bit_putcs_aligned drivers/video/fbdev/core/bitblit.c:96 [inline]
>  bit_putcs+0xbb6/0xd20 drivers/video/fbdev/core/bitblit.c:185
>  fbcon_putcs+0x33c/0x3f0 drivers/video/fbdev/core/fbcon.c:1362
>  do_update_region+0x399/0x630 drivers/tty/vt/vt.c:683
>  redraw_screen+0x64c/0x770 drivers/tty/vt/vt.c:1029
>  vc_do_resize+0x/0x13f0 drivers/tty/vt/vt.c:1320
>  vt_ioctl+0x2037/0x2670 drivers/tty/vt/vt_ioctl.c:901
>  tty_ioctl+0x1019/0x15f0 drivers/tty/tty_io.c:2656
>  vfs_ioctl fs/ioctl.c:48 [inline]
>  ksys_ioctl+0x11a/0x180 fs/ioctl.c:753
>  __do_sys_ioctl fs/ioctl.c:762 [inline]
>  __se_sys_ioctl fs/ioctl.c:760 [inline]
>  __x64_sys_ioctl+0x6f/0xb0 fs/ioctl.c:760
>  do_syscall_64+0x60/0xe0 arch/x86/entry/common.c:384
>  entry_SYSCALL_64_after_hwframe+0x44/0xa9
> RIP: 0033:0x467129
> Code: f7 d8 64 89 02 b8 ff ff ff ff c3 66 0f 1f 44 00 00 48 89 f8 48
> 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d
> 01 f0 ff ff 73 01 c3 48 c7 c1 bc ff ff ff f7 d8 64 89 01 48
> RSP: 002b:7fb6f7854c58 EFLAGS: 0246 ORIG_RAX: 0010
> RAX: ffda RBX: 0071e7c0 RCX: 00467129
> RDX: 2000 RSI: 560a RDI: 0003
> RBP: 004c0bd5 R08:  R09: 
> R10:  R11: 0246 R12: 0076bf00
> R13:  R14: 0076bf00 R15: 7ffd478a3fc0
> Dumping ftrace buffer:
>(ftrace buffer empty)
> Kernel Offset: disabled
> Rebooting in 1 seconds..

+syzkaller mailing list, LKML
-syzkaller-bugs to BCC

Hi butt3rflyh4ck,

This is a very real kernel bug, see:
https://groups.google.com/forum/#!searchin/syzkaller-bugs/%22bit_putcs_aligned%22%7Csort:date
There are some reproducers available if you need them.


Re: [PATCH bpf-next 5/5] selftests/bpf: add benchmark for uprobe vs. user_prog

2020-08-02 Thread Song Liu


> On Aug 2, 2020, at 6:51 PM, Andrii Nakryiko  wrote:
> 
> On Sat, Aug 1, 2020 at 1:50 AM Song Liu  wrote:
>> 
>> Add a benchmark to compare performance of
>>  1) uprobe;
>>  2) user program w/o args;
>>  3) user program w/ args;
>>  4) user program w/ args on random cpu.
>> 
> 
> Can you please add it to the existing benchmark runner instead, e.g.,
> along the other bench_trigger benchmarks? No need to re-implement
> benchmark setup. And also that would also allow to compare existing
> ways of cheaply triggering a program vs this new _USER program?

Will try. 

> 
> If the performance is not significantly better than other ways, do you
> think it still makes sense to add a new BPF program type? I think
> triggering KPROBE/TRACEPOINT from bpf_prog_test_run() would be very
> nice, maybe it's possible to add that instead of a new program type?
> Either way, let's see comparison with other program triggering
> mechanisms first.

Triggering KPROBE and TRACEPOINT from bpf_prog_test_run() will be useful. 
But I don't think they can be used instead of user program, for a couple
reasons. First, KPROBE/TRACEPOINT may be triggered by other programs 
running in the system, so user will have to filter those noise out in
each program. Second, it is not easy to specify CPU for KPROBE/TRACEPOINT,
while this feature could be useful in many cases, e.g. get stack trace 
on a given CPU. 

Thanks,
Song

[GIT PULL] Crypto Update for 5.9

2020-08-02 Thread Herbert Xu
Hi Linus:

API:

- Add support for allocating transforms on a specific NUMA Node.
- Introduce the flag CRYPTO_ALG_ALLOCATES_MEMORY for storage users.

Algorithms:

- Drop PMULL based ghash on arm64.
- Fixes for building with clang on x86.
- Add sha256 helper that does the digest in one go.
- Add SP800-56A rev 3 validation checks to dh.

Drivers:

- Permit users to specify NUMA node in hisilicon/zip.
- Add support for i.MX6 in imx-rngc.
- Add sa2ul crypto driver.
- Add BA431 hwrng driver.
- Add Ingenic JZ4780 and X1000 hwrng driver.
- Spread IRQ affinity in inside-secure and marvell/cesa.

There may be a conflict with the tip tree because of the removal
of arch/x86/include/asm/inst.h.  This file was previously only used
by the Crypto API and just as we stopped using it the tip tree
started using it.  So taking the version from the tip tree should
do the trick.

There is also a conflict with the jc_docs tree due to unrelated
changes to the same file.  The resolution should be straightforward.

The following changes since commit e04ec0de61c1eb9693179093e83ab8ca68a30d08:

  padata: upgrade smp_mb__after_atomic to smp_mb in padata_do_serial 
(2020-06-18 17:09:54 +1000)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6.git linus 

for you to fetch changes up to 3cbfe80737c18ac6e635421ab676716a393d3074:

  crypto: sa2ul - Fix inconsistent IS_ERR and PTR_ERR (2020-07-31 18:25:29 
+1000)


Alexander A. Klimov (2):
  hwrng: ks-sa - Replace HTTP links with HTTPS ones
  crypto: Replace HTTP links with HTTPS ones

Andrei Botila (1):
  crypto: caam/qi2 - add support for dpseci_reset()

Andrey Smirnov (1):
  crypto: caam - add clock info for VFxxx SoCs

Ard Biesheuvel (20):
  crypto: arm64/ghash - drop PMULL based shash
  crypto: arm64/gcm - disentangle ghash and gcm setkey() routines
  crypto: arm64/gcm - use variably sized key struct
  crypto: arm64/gcm - use inline helper to suppress indirect calls
  crypto: arm/ghash - use variably sized key struct
  crypto: amlogic-gxl - default to build as module
  crypto: amlogic-gxl - permit async skcipher as fallback
  crypto: omap-aes - permit asynchronous skcipher as fallback
  crypto: sun4i - permit asynchronous skcipher as fallback
  crypto: sun8i-ce - permit asynchronous skcipher as fallback
  crypto: sun8i-ss - permit asynchronous skcipher as fallback
  crypto: ccp - permit asynchronous skcipher as fallback
  crypto: chelsio - permit asynchronous skcipher as fallback
  crypto: mxs-dcp - permit asynchronous skcipher as fallback
  crypto: picoxcell - permit asynchronous skcipher as fallback
  crypto: qce - permit asynchronous skcipher as fallback
  crypto: sahara - permit asynchronous skcipher as fallback
  crypto: mediatek - use AES library for GCM key derivation
  crypto: x86/chacha-sse3 - use unaligned loads for state array
  crypto: xts - Replace memcpy() invocation with simple assignment

Arnd Bergmann (1):
  crypto: x86/crc32c - fix building with clang ias

Barry Song (2):
  crypto: api - permit users to specify numa node of acomp hardware
  crypto: hisilicon/zip - permit users to specify NUMA node

Christophe JAILLET (2):
  crypto: chelsio - Avoid some code duplication
  crypto: chelsio - Fix some pr_xxx messages

Colin Ian King (4):
  crypto: caam/qi2 - remove redundant assignment to ret
  crypto: ccp - remove redundant assignment to variable ret
  crypto: img-hash - remove redundant initialization of variable err
  hwrng: core - remove redundant initialization of variable ret

Dan Carpenter (1):
  crypto: hisilicon - allow smaller reads in debugfs

Dan Douglass (1):
  crypto: caam/jr - remove incorrect reference to caam_jr_register()

Daniel Jordan (6):
  padata: remove start function
  padata: remove stop function
  padata: inline single call of pd_setup_cpumasks()
  padata: remove effective cpumasks from the instance
  padata: fold padata_alloc_possible() into padata_alloc()
  padata: remove padata_parallel_queue

Dinghao Liu (1):
  crypto: sun8i-ce - Fix runtime PM imbalance in sun8i_ce_cipher_init

Eric Biggers (14):
  crc-t10dif: use fallback in initial state
  crc-t10dif: clean up some more things
  crypto: sparc - rename sha256 to sha256_alg
  crypto: lib/sha256 - add sha256() function
  efi: use sha256() instead of open coding
  mptcp: use sha256() instead of open coding
  ASoC: cros_ec_codec: use sha256() instead of open coding
  crypto: geniv - remove unneeded arguments from aead_geniv_alloc()
  crypto: seqiv - remove seqiv_create()
  crypto: algapi - use common mechanism for inheriting flags
  crypto: algapi - add NEED_FALLBACK to INHERITED_FLAGS
  crypto: algapi - introduce the flag CRYPTO_ALG_ALLOCATES_MEMORY
  

Re: [PATCH] iommu/vt-d:Add support for ACPI device in RMRR

2020-08-02 Thread FelixCui-oc
Hi baolu:
Some ACPI devices need to issue DMA requests to access the reserved
memory area, so the BIOS uses the device scope type ACPI_NAMESPACE_DEVICE
in RMRR to report these ACPI devices.
At present, the kernel does not parse RMRR device scope entries of type
ACPI_NAMESPACE_DEVICE.
This patch mainly adds parsing of the ACPI_NAMESPACE_DEVICE device scope
type in the RMRR structure and establishes identity mappings for these
ACPI devices. In addition, some names have been changed in the patch to
distinguish ACPI devices from PCI devices.
You can refer to the description of the type field in section 8.3.1
(Device Scope) of the VT-d spec.

Best regards
FelixCui-oc
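
To illustrate what parsing these entries involves, here is a standalone sketch that walks a buffer of variable-length device scope entries and looks for an ACPI namespace entry with a given enumeration ID. The byte layout (type, length, two reserved bytes, enumeration ID, start bus, then path bytes) and the type value 5 are taken from my reading of the VT-d spec section mentioned above and should be treated as assumptions. The function name find_namespace_scope is hypothetical and this is not the kernel implementation.

```c
#include <stddef.h>
#include <stdint.h>

#define SCOPE_HDR_LEN        6 /* type, length, reserved[2], enum id, bus */
#define SCOPE_TYPE_NAMESPACE 5 /* assumed value for ACPI_NAMESPACE_DEVICE */

/* Returns the byte offset of the first namespace-type entry whose
 * enumeration ID equals device_number, or -1 if none is found or the
 * buffer is malformed. */
long find_namespace_scope(const uint8_t *start, size_t size,
                          uint8_t device_number)
{
    size_t off = 0;

    while (off + SCOPE_HDR_LEN <= size) {
        const uint8_t *entry = start + off;
        uint8_t type = entry[0];
        uint8_t len = entry[1];

        /* A too-short or overrunning length would otherwise loop
         * forever or read out of bounds. */
        if (len < SCOPE_HDR_LEN || off + len > size)
            return -1;

        if (type == SCOPE_TYPE_NAMESPACE && entry[4] == device_number)
            return (long)off;

        off += len;
    }
    return -1;
}
```

The explicit length check is a choice made for this sketch: a loop that only advances by the entry's length field makes no progress if a firmware table ever reports a zero length.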



-----Original Message-----
From: Lu Baolu  
Sent: August 3, 2020 10:32
To: FelixCui-oc ; Joerg Roedel ; 
io...@lists.linux-foundation.org; linux-kernel@vger.kernel.org; David Woodhouse 

Cc: baolu...@linux.intel.com; Cobe Chen(BJ-RD) ; Raymond 
Pang(BJ-RD) 
Subject: Re: [PATCH] iommu/vt-d:Add support for ACPI device in RMRR

Hi,

On 8/2/20 6:07 PM, FelixCuioc wrote:
> Some ACPI devices require access to the specified reserved memory 
> region. The BIOS reports the specified reserved memory region through RMRR 
> structures. Add parsing of ACPI devices in RMRR and establish identity 
> mapping for them.

Can you please add more words about the problem you want to solve? Do you mean 
some RMRRs are not enumerated correctly? Or, enumerated, but not identity 
mapped?

Nit: add version and change log once you refreshed your patch.

> 
> Reported-by: kernel test robot 

No need to add this. The problem you want to solve through this patch is not 
reported by lkp.

Best regards,
baolu

> Signed-off-by: FelixCuioc 
> ---
>   drivers/iommu/intel/dmar.c  | 74 -
>   drivers/iommu/intel/iommu.c | 46 ++-
>   drivers/iommu/iommu.c   |  6 +++
>   include/linux/dmar.h| 12 +-
>   include/linux/iommu.h   |  3 ++
>   5 files changed, 105 insertions(+), 36 deletions(-)
> 
> diff --git a/drivers/iommu/intel/dmar.c b/drivers/iommu/intel/dmar.c 
> index 93e6345f3414..024ca38dba12 100644
> --- a/drivers/iommu/intel/dmar.c
> +++ b/drivers/iommu/intel/dmar.c
> @@ -215,7 +215,7 @@ static bool dmar_match_pci_path(struct 
> dmar_pci_notify_info *info, int bus,
>   }
>   
>   /* Return: > 0 if match found, 0 if no match found, < 0 if error 
> happens */ -int dmar_insert_dev_scope(struct dmar_pci_notify_info 
> *info,
> +int dmar_pci_insert_dev_scope(struct dmar_pci_notify_info *info,
> void *start, void*end, u16 segment,
> struct dmar_dev_scope *devices,
> int devices_cnt)
> @@ -304,7 +304,7 @@ static int dmar_pci_bus_add_dev(struct 
> dmar_pci_notify_info *info)
>   
>   drhd = container_of(dmaru->hdr,
>   struct acpi_dmar_hardware_unit, header);
> - ret = dmar_insert_dev_scope(info, (void *)(drhd + 1),
> + ret = dmar_pci_insert_dev_scope(info, (void *)(drhd + 1),
>   ((void *)drhd) + drhd->header.length,
>   dmaru->segment,
>   dmaru->devices, dmaru->devices_cnt); @@ -696,48 
> +696,56 @@ 
> dmar_find_matched_drhd_unit(struct pci_dev *dev)
>   
>   return dmaru;
>   }
> -
> -static void __init dmar_acpi_insert_dev_scope(u8 device_number,
> -   struct acpi_device *adev)
> +int dmar_acpi_insert_dev_scope(u8 device_number,
> + struct acpi_device *adev,
> + void *start, void *end,
> + struct dmar_dev_scope *devices,
> + int devices_cnt)
>   {
> - struct dmar_drhd_unit *dmaru;
> - struct acpi_dmar_hardware_unit *drhd;
>   struct acpi_dmar_device_scope *scope;
>   struct device *tmp;
>   int i;
>   struct acpi_dmar_pci_path *path;
>   
> + for (; start < end; start += scope->length) {
> + scope = start;
> + if (scope->entry_type != ACPI_DMAR_SCOPE_TYPE_NAMESPACE)
> + continue;
> + if (scope->enumeration_id != device_number)
> + continue;
> + path = (void *)(scope + 1);
> + for_each_dev_scope(devices, devices_cnt, i, tmp)
> + if (tmp == NULL) {
> + devices[i].bus = scope->bus;
> + devices[i].devfn = PCI_DEVFN(path->device, 
> path->function);
> + rcu_assign_pointer(devices[i].dev,
> +get_device(&adev->dev));
> + return 1;
> + }
> + WARN_ON(i >= devices_cnt);
> + }
> + return 0;
> +}
> +static int dmar_acpi_bus_add_dev(u8 device_number, 

Re: [PATCH bpf-next 4/5] selftests/bpf: move two functions to test_progs.c

2020-08-02 Thread Song Liu



> On Aug 2, 2020, at 6:46 PM, Andrii Nakryiko  wrote:
> 
> On Sat, Aug 1, 2020 at 1:50 AM Song Liu  wrote:
>> 
>> Move time_get_ns() and get_base_addr() to test_progs.c, so they can be
>> used in other tests.
>> 
>> Signed-off-by: Song Liu 
>> ---
>> .../selftests/bpf/prog_tests/attach_probe.c   | 21 -
>> .../selftests/bpf/prog_tests/test_overhead.c  |  8 -
>> tools/testing/selftests/bpf/test_progs.c  | 30 +++
>> tools/testing/selftests/bpf/test_progs.h  |  2 ++
>> 4 files changed, 32 insertions(+), 29 deletions(-)
>> 
> 
> [...]
> 
>> static int test_task_rename(const char *prog)
>> {
>>int i, fd, duration = 0, err;
>> diff --git a/tools/testing/selftests/bpf/test_progs.c 
>> b/tools/testing/selftests/bpf/test_progs.c
>> index b1e4dadacd9b4..c9e6a5ad5b9a4 100644
>> --- a/tools/testing/selftests/bpf/test_progs.c
>> +++ b/tools/testing/selftests/bpf/test_progs.c
>> @@ -622,6 +622,36 @@ int cd_flavor_subdir(const char *exec_name)
>>return chdir(flavor);
>> }
>> 
>> +__u64 time_get_ns(void)
>> +{
> 
> I'd try to avoid adding stuff to test_progs.c. There is generic
> testing_helpers.c, maybe let's put this there?
> 
>> +   struct timespec ts;
>> +
>> +   clock_gettime(CLOCK_MONOTONIC, &ts);
>> +   return ts.tv_sec * 1000000000ull + ts.tv_nsec;
>> +}
>> +
>> +ssize_t get_base_addr(void)
>> +{
> 
> This would definitely be better in trace_helpers.c, though.

Will update. 

Thanks,
Song
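
For reference, a standalone userspace version of the time helper discussed above could look like this. It is a sketch under the assumption of a POSIX system with CLOCK_MONOTONIC. The return type uint64_t is used here in place of __u64 so the snippet builds without kernel headers.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdint.h>
#include <time.h>

/* Monotonic time in nanoseconds: seconds scaled by 1e9 plus the
 * nanosecond remainder. */
uint64_t time_get_ns(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}
```

Two successive calls should never go backwards, which is what makes this clock suitable for measuring benchmark durations.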

Re: [PATCH bpf-next 3/5] selftests/bpf: add selftest for BPF_PROG_TYPE_USER

2020-08-02 Thread Song Liu



> On Aug 2, 2020, at 6:43 PM, Andrii Nakryiko  wrote:
> 
> On Sat, Aug 1, 2020 at 1:50 AM Song Liu  wrote:
>> 
>> This test checks the correctness of BPF_PROG_TYPE_USER program, including:
>> running on the right cpu, passing in correct args, returning retval, and
>> being able to call bpf_get_stack|stackid.
>> 
>> Signed-off-by: Song Liu 
>> ---
>> .../selftests/bpf/prog_tests/user_prog.c  | 52 +
>> tools/testing/selftests/bpf/progs/user_prog.c | 56 +++
>> 2 files changed, 108 insertions(+)
>> create mode 100644 tools/testing/selftests/bpf/prog_tests/user_prog.c
>> create mode 100644 tools/testing/selftests/bpf/progs/user_prog.c
>> 
>> diff --git a/tools/testing/selftests/bpf/prog_tests/user_prog.c 
>> b/tools/testing/selftests/bpf/prog_tests/user_prog.c
>> new file mode 100644
>> index 0..416707b3bff01
>> --- /dev/null
>> +++ b/tools/testing/selftests/bpf/prog_tests/user_prog.c
>> @@ -0,0 +1,52 @@
>> +// SPDX-License-Identifier: GPL-2.0
>> +/* Copyright (c) 2020 Facebook */
>> +#include 
>> +#include "user_prog.skel.h"
>> +
>> +static int duration;
>> +
>> +void test_user_prog(void)
>> +{
>> +   struct bpf_user_prog_args args = {{0, 1, 2, 3, 4}};
>> +   struct bpf_prog_test_run_attr attr = {};
>> +   struct user_prog *skel;
>> +   int i, numcpu, ret;
>> +
>> +   skel = user_prog__open_and_load();
>> +
>> +   if (CHECK(!skel, "user_prog__open_and_load",
>> + "skeleton open_and_load failed\n"))
>> +   return;
>> +
>> +   numcpu = libbpf_num_possible_cpus();
> 
> nit: possible doesn't mean online right now, so it will fail on
> offline or non-present CPUs

Just found parse_cpu_mask_file(), will use it to fix this. 
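
To illustrate the possible-versus-online distinction: the kernel exposes the online CPUs as a list such as "0-2,4" in /sys/devices/system/cpu/online. The sketch below parses that list format into a boolean mask. It is not libbpf's parse_cpu_mask_file(); the function name and its signature are assumptions made for this example only.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/*
 * Parse a CPU list such as "0-2,4" into mask[0..n-1].
 * Returns 0 on success, -1 on malformed input or a CPU id outside the mask.
 */
int parse_cpu_list_sketch(const char *s, bool *mask, int n)
{
	memset(mask, 0, (size_t)n * sizeof(*mask));
	while (*s && *s != '\n') {
		int start, end, len;

		/* Read the first (or only) CPU id of this entry. */
		if (sscanf(s, "%d%n", &start, &len) != 1)
			return -1;
		s += len;
		end = start;
		/* An optional "-N" turns the entry into a range. */
		if (*s == '-') {
			s++;
			if (sscanf(s, "%d%n", &end, &len) != 1)
				return -1;
			s += len;
		}
		if (start < 0 || end < start || end >= n)
			return -1;
		for (int cpu = start; cpu <= end; cpu++)
			mask[cpu] = true;
		/* Entries are comma separated; anything else is an error. */
		if (*s == ',')
			s++;
		else if (*s && *s != '\n')
			return -1;
	}
	return 0;
}
```

A test could then loop over CPU ids up to the possible count and skip every id whose mask entry is false, instead of assuming that each possible CPU can run the program.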

[...]

>> +
>> +volatile int cpu_match = 1;
>> +volatile __u64 sum = 1;
>> +volatile int get_stack_success = 0;
>> +volatile int get_stackid_success = 0;
>> +volatile __u64 stacktrace[PERF_MAX_STACK_DEPTH];
> 
> nit: no need for volatile for non-static variables
> 
>> +
>> +SEC("user")
>> +int user_func(struct bpf_user_prog_ctx *ctx)
> 
> If you put args in bpf_user_prog_ctx as a first field, you should be
> able to re-use the BPF_PROG macro to access those arguments in a more
> user-friendly way.

I am not sure I am following here. Do you mean something like:

struct bpf_user_prog_ctx {
__u64 args[BPF_USER_PROG_MAX_ARGS];
struct pt_regs *regs;
};

(swap args and regs)? 

Thanks,
Song
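
A small sketch of what the swapped layout would buy, under stated assumptions: the struct and macro names are stand-ins, the argument count of 5 is only a guess based on the five values used in the test above, and the accessor mimics what I understand BPF_PROG-style macros to do, namely treating the context as a flat array of 64-bit words.

```c
#include <stddef.h>
#include <stdint.h>

#define USER_PROG_MAX_ARGS_SKETCH 5	/* assumed value, for illustration */

struct pt_regs_sketch;			/* opaque stand-in for struct pt_regs */

/* Layout with args first, i.e. args and regs swapped. */
struct user_prog_ctx_sketch {
	uint64_t args[USER_PROG_MAX_ARGS_SKETCH];
	struct pt_regs_sketch *regs;
};

/* Read the n-th argument by indexing the context as 64-bit words. */
uint64_t ctx_arg_sketch(const void *ctx, unsigned int n)
{
	return ((const uint64_t *)ctx)[n];
}
```

Because args sits at offset 0, word n of the context is argument n. With regs first, the same indexing would be off by one pointer-sized slot.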




Re: [PATCH] vop: Add missing __iomem annotation in vop_dc_to_vdev()

2020-08-02 Thread Dixit, Ashutosh
On Sun, 02 Aug 2020 21:24:01 -0700, Greg Kroah-Hartman wrote:
>
> On Sun, Aug 02, 2020 at 04:28:12PM -0700, Ashutosh Dixit wrote:
> > Fix the following sparse warnings in drivers/misc/mic/vop//vop_main.c:
> >
> > 551:58: warning: incorrect type in argument 1 (different address spaces)
> > 551:58:expected void const volatile [noderef] __iomem *addr
> > 551:58:got restricted __le64 *
> > 560:49: warning: incorrect type in argument 1 (different address spaces)
> > 560:49:expected struct mic_device_ctrl *dc
> > 560:49:got struct mic_device_ctrl [noderef] __iomem *dc
> > 579:49: warning: incorrect type in argument 1 (different address spaces)
> > 579:49:expected struct mic_device_ctrl *dc
> > 579:49:got struct mic_device_ctrl [noderef] __iomem *dc
> >
> > Cc: Michael S. Tsirkin 
> > Cc: Sudeep Dutt 
> > Cc: Arnd Bergmann 
> > Cc: Vincent Whitchurch 
> > Cc: stable 
>
> Why is this a stable thing?  It doesn't fix a real bug, and sparse
> warnings are not needed for stable trees, unless this is the last sparse
> warning there.

It is the last sparse warning. Sorry I wasn't sure about stable so I
thought might as well. Please ignore if it's not required. Thanks.


RE: [PATCH v2 7/9] usb: cdns3: core: removed 'goto not_otg'

2020-08-02 Thread Pawel Laszczak
>
>On 20-07-13 12:05:52, Pawel Laszczak wrote:
>> Patch removes 'goto not_otg' instruction from
>> cdns3_hw_role_state_machine function.
>>
>> Signed-off-by: Pawel Laszczak 
>> ---
>>  drivers/usb/cdns3/core.c | 20 +---
>>  1 file changed, 9 insertions(+), 11 deletions(-)
>>
>> diff --git a/drivers/usb/cdns3/core.c b/drivers/usb/cdns3/core.c
>> index c498b585eb13..8e3996f211a8 100644
>> --- a/drivers/usb/cdns3/core.c
>> +++ b/drivers/usb/cdns3/core.c
>> @@ -191,11 +191,17 @@ static int cdns3_core_init_role(struct cdns3 *cdns)
>>   */
>>  static enum usb_role cdns3_hw_role_state_machine(struct cdns3 *cdns)
>>  {
>> -enum usb_role role;
>> +enum usb_role role = USB_ROLE_NONE;
>>  int id, vbus;
>>
>> -if (cdns->dr_mode != USB_DR_MODE_OTG)
>> -goto not_otg;
>> +if (cdns->dr_mode != USB_DR_MODE_OTG) {
>> +if (cdns3_is_host(cdns))
>> +role = USB_ROLE_HOST;
>> +if (cdns3_is_device(cdns))
>> +role = USB_ROLE_DEVICE;
>> +
>> +return role;
>> +}
>
>Would you please improve it a bit like below:
>
>   if (cdns->dr_mode != USB_DR_MODE_OTG) {
>   if (cdns3_is_host(cdns))
>   role = USB_ROLE_HOST;
>   else if (cdns3_is_device(cdns))
>   role = USB_ROLE_DEVICE;
>   else
>   role = USB_ROLE_NONE;
>
>   return role;
>   }
>

Sorry for the delay, I was on holiday.
Greg has already added this patch to his usb-next branch, so
I don't want to change anything now. I will include this kind of change next time.
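
For comparison, here is a standalone sketch of the two shapes discussed above, with the cdns3 helpers replaced by plain booleans. The enum and function names are hypothetical. The two forms agree whenever at most one predicate is true and differ only if both are true, which presumably cannot happen for a non-OTG controller.

```c
#include <stdbool.h>

enum usb_role_sketch { ROLE_NONE, ROLE_HOST, ROLE_DEVICE };

/* Shape used in the patch: two independent checks, the later one wins. */
enum usb_role_sketch pick_role_two_ifs(bool is_host, bool is_device)
{
	enum usb_role_sketch role = ROLE_NONE;

	if (is_host)
		role = ROLE_HOST;
	if (is_device)
		role = ROLE_DEVICE;
	return role;
}

/* Shape suggested in review: one chain, the first match wins. */
enum usb_role_sketch pick_role_chain(bool is_host, bool is_device)
{
	enum usb_role_sketch role;

	if (is_host)
		role = ROLE_HOST;
	else if (is_device)
		role = ROLE_DEVICE;
	else
		role = ROLE_NONE;
	return role;
}
```

The chained form makes the mutual exclusion explicit and no longer relies on the initializer for the "neither" case.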

>Peter
>>
>>  id = cdns3_get_id(cdns);
>>  vbus = cdns3_get_vbus(cdns);
>> @@ -232,14 +238,6 @@ static enum usb_role cdns3_hw_role_state_machine(struct 
>> cdns3 *cdns)
>>  dev_dbg(cdns->dev, "role %d -> %d\n", cdns->role, role);
>>
>>  return role;
>> -
>> -not_otg:
>> -if (cdns3_is_host(cdns))
>> -role = USB_ROLE_HOST;
>> -if (cdns3_is_device(cdns))
>> -role = USB_ROLE_DEVICE;
>> -
>> -return role;
>>  }
>>
>>  static int cdns3_idle_role_start(struct cdns3 *cdns)
>> --
>> 2.17.1
>>
>
>--
>
>Thanks,
>Peter Chen

Thanks,
Pawel


[PATCH v6] scsi: ufs: Quiesce all scsi devices before shutdown

2020-08-02 Thread Stanley Chu
Currently, I/O requests can still be submitted to the UFS device while
UFS is going through its shutdown flow. This can lead to races such as
the scenario below, and the system may eventually crash due to unclocked
register accesses.

To fix this kind of issue, in ufshcd_shutdown():

1. Use pm_runtime_get_sync() instead of resuming the UFS device by
   calling ufshcd_runtime_resume() "internally", so that the runtime PM
   framework manages the resume and prevents concurrent runtime
   operations triggered by incoming I/O requests.

2. Explicitly quiesce all SCSI devices to block all I/O requests
   after the device is resumed.

Example of a racing scenario, while the UFS device is runtime-suspended:

Thread #1: Executing the UFS shutdown flow, e.g.,
   ufshcd_suspend(UFS_SHUTDOWN_PM)

Thread #2: Executing the runtime resume flow triggered by an I/O request,
   e.g., ufshcd_resume(UFS_RUNTIME_PM)

This breaks the assumption that UFS PM flows cannot run concurrently,
and unexpected racing behavior may occur.

Signed-off-by: Stanley Chu 
---
Changes:
  - Since v4: Use pm_runtime_get_sync() instead of resuming UFS device by 
ufshcd_runtime_resume() "internally".
---
 drivers/scsi/ufs/ufshcd.c | 39 ++-
 1 file changed, 34 insertions(+), 5 deletions(-)

diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
index 307622284239..fc01171d13b1 100644
--- a/drivers/scsi/ufs/ufshcd.c
+++ b/drivers/scsi/ufs/ufshcd.c
@@ -159,6 +159,12 @@ struct ufs_pm_lvl_states ufs_pm_lvl_states[] = {
{UFS_POWERDOWN_PWR_MODE, UIC_LINK_OFF_STATE},
 };
 
+#define ufshcd_scsi_for_each_sdev(fn) \
+   list_for_each_entry(starget, &hba->host->__targets, siblings) { \
+   __starget_for_each_device(starget, NULL, \
+ fn); \
+   }
+
 static inline enum ufs_dev_pwr_mode
 ufs_get_pm_lvl_to_dev_pwr_mode(enum ufs_pm_level lvl)
 {
@@ -8629,6 +8635,13 @@ int ufshcd_runtime_idle(struct ufs_hba *hba)
 }
 EXPORT_SYMBOL(ufshcd_runtime_idle);
 
+static void ufshcd_quiesce_sdev(struct scsi_device *sdev, void *data)
+{
+   /* Suspended devices are already quiesced so can be skipped */
+   if (!pm_runtime_suspended(&sdev->sdev_gendev))
+   scsi_device_quiesce(sdev);
+}
+
 /**
  * ufshcd_shutdown - shutdown routine
  * @hba: per adapter instance
@@ -8640,6 +8653,7 @@ EXPORT_SYMBOL(ufshcd_runtime_idle);
 int ufshcd_shutdown(struct ufs_hba *hba)
 {
int ret = 0;
+   struct scsi_target *starget;
 
if (!hba->is_powered)
goto out;
@@ -8647,11 +8661,26 @@ int ufshcd_shutdown(struct ufs_hba *hba)
if (ufshcd_is_ufs_dev_poweroff(hba) && ufshcd_is_link_off(hba))
goto out;
 
-   if (pm_runtime_suspended(hba->dev)) {
-   ret = ufshcd_runtime_resume(hba);
-   if (ret)
-   goto out;
-   }
+   /*
+* Let runtime PM framework manage and prevent concurrent runtime
+* operations with shutdown flow.
+*/
+   pm_runtime_get_sync(hba->dev);
+
+   /*
+* Quiesce all SCSI devices to prevent any non-PM requests sending
+* from block layer during and after shutdown.
+*
+* Here we can not use blk_cleanup_queue() since PM requests
+* (with BLK_MQ_REQ_PREEMPT flag) are still required to be sent
+* through block layer. Therefore SCSI command queued after the
+* scsi_target_quiesce() call returned will block until
+* blk_cleanup_queue() is called.
+*
+* Besides, scsi_target_"un"quiesce (e.g., scsi_target_resume) can
+* be ignored since shutdown is one-way flow.
+*/
+   ufshcd_scsi_for_each_sdev(ufshcd_quiesce_sdev);
 
ret = ufshcd_suspend(hba, UFS_SHUTDOWN_PM);
 out:
-- 
2.18.0


Re: [PATCH] vop: Add missing __iomem annotation in vop_dc_to_vdev()

2020-08-02 Thread Greg Kroah-Hartman
On Sun, Aug 02, 2020 at 04:28:12PM -0700, Ashutosh Dixit wrote:
> Fix the following sparse warnings in drivers/misc/mic/vop//vop_main.c:
> 
> 551:58: warning: incorrect type in argument 1 (different address spaces)
> 551:58:expected void const volatile [noderef] __iomem *addr
> 551:58:got restricted __le64 *
> 560:49: warning: incorrect type in argument 1 (different address spaces)
> 560:49:expected struct mic_device_ctrl *dc
> 560:49:got struct mic_device_ctrl [noderef] __iomem *dc
> 579:49: warning: incorrect type in argument 1 (different address spaces)
> 579:49:expected struct mic_device_ctrl *dc
> 579:49:got struct mic_device_ctrl [noderef] __iomem *dc
> 
> Cc: Michael S. Tsirkin 
> Cc: Sudeep Dutt 
> Cc: Arnd Bergmann 
> Cc: Vincent Whitchurch 
> Cc: stable 

Why is this a stable thing?  It doesn't fix a real bug, and sparse
warnings are not needed for stable trees, unless this is the last sparse
warning there.

thanks,

greg k-h


Re: [PATCH bpf-next 2/5] libbpf: support BPF_PROG_TYPE_USER programs

2020-08-02 Thread Song Liu



> On Aug 2, 2020, at 6:40 PM, Andrii Nakryiko  wrote:
> 
> On Sat, Aug 1, 2020 at 1:50 AM Song Liu  wrote:
>> 
>> Add cpu_plus to bpf_prog_test_run_attr. Add BPF_PROG_SEC "user" for
>> BPF_PROG_TYPE_USER programs.
>> 
>> Signed-off-by: Song Liu 
>> ---
>> tools/lib/bpf/bpf.c   | 1 +
>> tools/lib/bpf/bpf.h   | 3 +++
>> tools/lib/bpf/libbpf.c| 1 +
>> tools/lib/bpf/libbpf_probes.c | 1 +
>> 4 files changed, 6 insertions(+)
>> 
>> diff --git a/tools/lib/bpf/bpf.c b/tools/lib/bpf/bpf.c
>> index e1bdf214f75fe..b28c3daa9c270 100644
>> --- a/tools/lib/bpf/bpf.c
>> +++ b/tools/lib/bpf/bpf.c
>> @@ -693,6 +693,7 @@ int bpf_prog_test_run_xattr(struct 
>> bpf_prog_test_run_attr *test_attr)
>>attr.test.ctx_size_in = test_attr->ctx_size_in;
>>attr.test.ctx_size_out = test_attr->ctx_size_out;
>>attr.test.repeat = test_attr->repeat;
>> +   attr.test.cpu_plus = test_attr->cpu_plus;
>> 
>>ret = sys_bpf(BPF_PROG_TEST_RUN, &attr, sizeof(attr));
>>test_attr->data_size_out = attr.test.data_size_out;
>> diff --git a/tools/lib/bpf/bpf.h b/tools/lib/bpf/bpf.h
>> index 6d367e01d05e9..0c799740df566 100644
>> --- a/tools/lib/bpf/bpf.h
>> +++ b/tools/lib/bpf/bpf.h
>> @@ -205,6 +205,9 @@ struct bpf_prog_test_run_attr {
>>void *ctx_out;  /* optional */
>>__u32 ctx_size_out; /* in: max length of ctx_out
>> * out: length of cxt_out */
>> +   __u32 cpu_plus; /* specify which cpu to run the test with
>> +* cpu_plus = cpu_id + 1.
>> +* If cpu_plus = 0, run on current cpu */
> 
> We can't do this due to ABI guarantees. We'll have to add a new API
> using OPTS arguments.

To make sure I understand this correctly, the concern is when we compile
the binary with one version of libbpf and run it with libbpf.so of a 
different version, right? 

Thanks,
Song
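
To make the ABI concern concrete: if a field is appended to a struct that callers allocate themselves, a binary built against the old header hands the library a smaller object than the new code expects. A size-prefixed options struct avoids this. The sketch below is similar in spirit to libbpf's OPTS helpers but is not their implementation; every name in it is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

struct test_run_opts_sketch {
	size_t sz;		/* caller sets this to sizeof() of its own view */
	uint32_t repeat;
	uint32_t cpu_plus;	/* newer field: cpu_id + 1, 0 = current CPU */
};

/* True if the caller's struct is large enough to contain `field`. */
#define OPTS_HAS_SKETCH(opts, field)					\
	((opts) && (opts)->sz >=					\
	 offsetof(struct test_run_opts_sketch, field) +		\
	 sizeof((opts)->field))

/* Read cpu_plus only if the caller's struct includes it, else use 0. */
uint32_t opts_cpu_plus_sketch(const struct test_run_opts_sketch *opts)
{
	return OPTS_HAS_SKETCH(opts, cpu_plus) ? opts->cpu_plus : 0;
}

/* Decode cpu_plus: 0 means "current CPU" (returned as -1). */
int cpu_id_from_plus_sketch(uint32_t cpu_plus)
{
	return cpu_plus ? (int)(cpu_plus - 1) : -1;
}
```

An old caller that only knows about repeat reports a smaller sz, so the library never reads past what that caller actually allocated.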

> 
>> };
>> 
>> LIBBPF_API int bpf_prog_test_run_xattr(struct bpf_prog_test_run_attr 
>> *test_attr);
>> diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
>> index b9f11f854985b..9ce175a486214 100644
>> --- a/tools/lib/bpf/libbpf.c
>> +++ b/tools/lib/bpf/libbpf.c
>> @@ -6922,6 +6922,7 @@ static const struct bpf_sec_def section_defs[] = {
>>BPF_PROG_SEC("lwt_out", BPF_PROG_TYPE_LWT_OUT),
>>BPF_PROG_SEC("lwt_xmit",BPF_PROG_TYPE_LWT_XMIT),
>>BPF_PROG_SEC("lwt_seg6local",   BPF_PROG_TYPE_LWT_SEG6LOCAL),
>> +   BPF_PROG_SEC("user",BPF_PROG_TYPE_USER),
> 
> let's do "user/" for consistency with most other prog types (and nice
> separation between prog type and custom user name)

Will update. 



[PATCH v5] scsi: ufs: Quiesce all scsi devices before shutdown

2020-08-02 Thread Stanley Chu
Currently, I/O requests can still be submitted to the UFS device while
UFS is going through its shutdown flow. This can lead to races such as
the scenario below, and the system may eventually crash due to unclocked
register accesses.

To fix this kind of issue, in ufshcd_shutdown():

1. Use pm_runtime_get_sync() instead of resuming the UFS device by
   calling ufshcd_runtime_resume() "internally", so that the runtime PM
   framework manages the resume and prevents concurrent runtime
   operations triggered by incoming I/O requests.

2. Explicitly quiesce all SCSI devices to block all I/O requests
   after the device is resumed.

Example of a racing scenario, while the UFS device is runtime-suspended:

Thread #1: Executing the UFS shutdown flow, e.g.,
   ufshcd_suspend(UFS_SHUTDOWN_PM)

Thread #2: Executing the runtime resume flow triggered by an I/O request,
   e.g., ufshcd_resume(UFS_RUNTIME_PM)

This breaks the assumption that UFS PM flows cannot run concurrently,
and unexpected racing behavior may occur.

Signed-off-by: Stanley Chu 
---
 drivers/scsi/ufs/ufshcd.c | 40 ++-
 1 file changed, 35 insertions(+), 5 deletions(-)

diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
index 307622284239..e5b99f1b826a 100644
--- a/drivers/scsi/ufs/ufshcd.c
+++ b/drivers/scsi/ufs/ufshcd.c
@@ -159,6 +159,12 @@ struct ufs_pm_lvl_states ufs_pm_lvl_states[] = {
{UFS_POWERDOWN_PWR_MODE, UIC_LINK_OFF_STATE},
 };
 
+#define ufshcd_scsi_for_each_sdev(fn) \
+   list_for_each_entry(starget, &hba->host->__targets, siblings) { \
+   __starget_for_each_device(starget, NULL, \
+ fn); \
+   }
+
 static inline enum ufs_dev_pwr_mode
 ufs_get_pm_lvl_to_dev_pwr_mode(enum ufs_pm_level lvl)
 {
@@ -8629,6 +8635,13 @@ int ufshcd_runtime_idle(struct ufs_hba *hba)
 }
 EXPORT_SYMBOL(ufshcd_runtime_idle);
 
+static void ufshcd_quiesce_sdev(struct scsi_device *sdev, void *data)
+{
+   /* Suspended devices are already quiesced so can be skipped */
+   if (!pm_runtime_suspended(&sdev->sdev_gendev))
+   scsi_device_quiesce(sdev);
+}
+
 /**
  * ufshcd_shutdown - shutdown routine
  * @hba: per adapter instance
@@ -8640,6 +8653,7 @@ EXPORT_SYMBOL(ufshcd_runtime_idle);
 int ufshcd_shutdown(struct ufs_hba *hba)
 {
int ret = 0;
+   struct scsi_target *starget;
 
if (!hba->is_powered)
goto out;
@@ -8647,11 +8661,27 @@ int ufshcd_shutdown(struct ufs_hba *hba)
if (ufshcd_is_ufs_dev_poweroff(hba) && ufshcd_is_link_off(hba))
goto out;
 
-   if (pm_runtime_suspended(hba->dev)) {
-   ret = ufshcd_runtime_resume(hba);
-   if (ret)
-   goto out;
-   }
+   /*
+* Let runtime PM framework manage and prevent concurrent runtime
+* operations with shutdown flow.
+*/
+   if (pm_runtime_get_sync(hba->dev))
+   pm_runtime_put_noidle(hba->dev);
+
+   /*
+* Quiesce all SCSI devices to prevent any non-PM requests sending
+* from block layer during and after shutdown.
+*
+* Here we can not use blk_cleanup_queue() since PM requests
+* (with BLK_MQ_REQ_PREEMPT flag) are still required to be sent
+* through block layer. Therefore SCSI command queued after the
+* scsi_target_quiesce() call returned will block until
+* blk_cleanup_queue() is called.
+*
+* Besides, scsi_target_"un"quiesce (e.g., scsi_target_resume) can
+* be ignored since shutdown is one-way flow.
+*/
+   ufshcd_scsi_for_each_sdev(ufshcd_quiesce_sdev);
 
ret = ufshcd_suspend(hba, UFS_SHUTDOWN_PM);
 out:
-- 
2.18.0


Re: [PATCH] x86/purgatory: strip debug info

2020-08-02 Thread Pingfan Liu
On Sat, Aug 1, 2020 at 2:18 AM Nick Desaulniers  wrote:
>
> On Fri, Jul 31, 2020 at 2:36 AM Pingfan Liu  wrote:
> >
> > On Fri, Jul 31, 2020 at 7:11 AM Nick Desaulniers
> >  wrote:
> > >
> > > On Thu, Jul 30, 2020 at 1:27 AM Pingfan Liu  wrote:
> > > >
> > > > It is useless to keep debug info in purgatory. And discarding them saves
> > > > about 200K space.
> > > >
> > > > Original:
> > > >   259080  kexec-purgatory.o
> > > > Stripped:
> > > >29152  kexec-purgatory.o
> > > >
> > > > Signed-off-by: Pingfan Liu 
> > > > Cc: Thomas Gleixner 
> > > > Cc: Ingo Molnar 
> > > > Cc: Borislav Petkov 
> > > > Cc: "H. Peter Anvin" 
> > > > Cc: Hans de Goede 
> > > > Cc: Nick Desaulniers 
> > > > Cc: Arvind Sankar 
> > > > Cc: Steve Wahl 
> > > > Cc: linux-kernel@vger.kernel.org
> > > > To: x...@kernel.org
> > >
> > > I don't see any code in
> > > arch/x86/purgatory/
> > > arch/x86/include/asm/purgatory.h
> > > include/linux/purgatory.h
> > > include/uapi/linux/kexec.h
> > > kernel/kexec*
> > > include/linux/kexec.h
> > > include/linux/crash_dump.h
> > > kernel/crash_dump.c
> > > arch/x86/kernel/crash*
> > > https://github.com/horms/kexec-tools/tree/master/kexec/arch/x86_64
> > > that mentions any kind of debug info section.  I'm not sure what you'd
> > > do with the debug info anyway for this binary.  So I suspect this
> > > information should be ok to discard.
> > >
> > > This works, but it might be faster to build to not generate the
> > > compile info in the first place via compile flag `-g0`, which could be
> > > added `ifdef CONFIG_DEBUG_INFO` or even just unconditionally.  That
> > > way we're not doing additional work to generate debug info, then
> > > additional work to throw it away.
> > What about:
> > diff --git a/arch/x86/purgatory/Makefile b/arch/x86/purgatory/Makefile
> > index 088bd76..7e1ad9e 100644
> > --- a/arch/x86/purgatory/Makefile
> > +++ b/arch/x86/purgatory/Makefile
> > @@ -32,7 +32,7 @@ KCOV_INSTRUMENT := n
> >  # make up the standalone purgatory.ro
> >
> >  PURGATORY_CFLAGS_REMOVE := -mcmodel=kernel
> > -PURGATORY_CFLAGS := -mcmodel=large -ffreestanding 
> > -fno-zero-initialized-in-bss
> > +PURGATORY_CFLAGS := -mcmodel=large -ffreestanding
> > -fno-zero-initialized-in-bss -g0
> >  PURGATORY_CFLAGS += $(DISABLE_STACKLEAK_PLUGIN) -DDISABLE_BRANCH_PROFILING
> >  PURGATORY_CFLAGS += $(call cc-option,-fno-stack-protector)
>
> I tested your patch but still see .debug_* sections in the .ro from a few .o.
>
> At least on
> * setup-x86_64.o
> * entry64.o
>
> If you add the following hunk to your diff:
> ```
> @@ -64,6 +64,9 @@ CFLAGS_sha256.o   += $(PURGATORY_CFLAGS)
>  CFLAGS_REMOVE_string.o += $(PURGATORY_CFLAGS_REMOVE)
>  CFLAGS_string.o+= $(PURGATORY_CFLAGS)
>
> +AFLAGS_REMOVE_setup-x86_$(BITS).o  += -Wa,-gdwarf-2
> +AFLAGS_REMOVE_entry64.o+= -Wa,-gdwarf-2
> +
I went through the as and gcc man pages and could not find a simpler
method than your suggestion.
>  $(obj)/purgatory.ro: $(PURGATORY_OBJS) FORCE
> $(call if_changed,ld)
> ```
> then that should do it.  Then you can verify the .ro file via:
> $ llvm-readelf -S arch/x86/purgatory/purgatory.ro | not grep debug_
> (no output, should return zero)
Thank you for the good suggestion; I will send a V2.

Regards,
Pingfan


Re: powerpc: build failures in Linus' tree

2020-08-02 Thread Stephen Rothwell
Hi Willy,

On Mon, 3 Aug 2020 05:45:47 +0200 Willy Tarreau  wrote:
>
> On Sun, Aug 02, 2020 at 07:20:19PM +0200, Willy Tarreau wrote:
> > On Sun, Aug 02, 2020 at 08:48:42PM +1000, Stephen Rothwell wrote:
> > > 
> > > We are getting build failures in some PowerPC configs for Linus' tree.
> > > See e.g. http://kisskb.ellerman.id.au/kisskb/buildresult/14306515/
> > > 
> > > In file included from /kisskb/src/arch/powerpc/include/asm/paca.h:18,
> > >  from /kisskb/src/arch/powerpc/include/asm/percpu.h:13,
> > >  from /kisskb/src/include/linux/random.h:14,
> > >  from /kisskb/src/include/linux/net.h:18,
> > >  from /kisskb/src/net/ipv6/ip6_fib.c:20:
> > > /kisskb/src/arch/powerpc/include/asm/mmu.h:139:22: error: unknown type 
> > > name 'next_tlbcam_idx'
> > >   139 | DECLARE_PER_CPU(int, next_tlbcam_idx);
> > > 
> > > I assume this is caused by commit
> > > 
> > >   1c9df907da83 ("random: fix circular include dependency on arm64 after 
> > > addition of percpu.h")
> > > 
> > > But I can't see how, sorry.
> > 
> > So there, asm/mmu.h includes asm/percpu.h, which includes asm/paca.h, which
> > includes asm/mmu.h.
> > 
> > I suspect that we can remove asm/paca.h from asm/percpu.h as it *seems*
> > to be only used by the #define __my_cpu_offset but I don't know if anything
> > will break further, especially if this __my_cpu_offset is used anywhere
> > without this paca definition.
> 
> I tried this and it fixed 5.8 for me with your config above. I'm appending
> a patch that does just this. I didn't test other configs as I don't know
> which ones to test though. If it fixes the problem for you, maybe it can
> be picked by the PPC maintainers.

Our mails have crossed.  I just sent a more comprehensive patch.  I
think your patch would require a lot of build testing and even then may
fail for some CONFIG combination that we didn't test or added in the
future (or someone just made up).

-- 
Cheers,
Stephen Rothwell




Re: [PATCH v4 2/2] soc: mediatek: add mtk-devapc driver

2020-08-02 Thread Neal Liu
Hi Chun-Kuang,

On Sun, 2020-08-02 at 07:50 +0800, Chun-Kuang Hu wrote:
> Hi, Neal:
> 
> On Wed, Jul 29, 2020 at 4:29 PM, Neal Liu wrote:
> >
> > MediaTek bus fabric provides TrustZone security support and data
> > protection to prevent slaves from being accessed by unexpected
> > masters.
> > The security violation is logged and sent to the processor for
> > further analysis or countermeasures.
> >
> > Any occurrence of security violation would raise an interrupt, and
> > it will be handled by mtk-devapc driver. The violation
> > information is printed in order to find the murderer.
> >
> > Signed-off-by: Neal Liu 
> > ---
> 
> [snip]
> 
> > +
> > +struct mtk_devapc_context {
> > +   struct device *dev;
> > +   u32 vio_idx_num;
> > +   void __iomem *devapc_pd_base;
> 
> This is devapc context, so prefix 'devapc' is redundant.
> And, what does 'pd' mean?

'pd' means power down. We will remove that part of the name as well;
I suggest renaming the field to 'infra_base'.

> 
> Regards,
> Chun-Kuang.
> 
> > +   struct mtk_devapc_vio_info *vio_info;
> > +   const struct mtk_devapc_pd_offset *offset;
> > +   const struct mtk_devapc_vio_dbgs *vio_dbgs;
> > +};
> > +



[PATCH v4 0/2] Add documentation and machine driver for SC7180 sound card

2020-08-02 Thread Cheng-Yi Chiang
Note:
- The machine driver patch depends on LPASS patch series so it is not ready to 
be merged now.
  ASoC: qcom: Add support for SC7180 lpass variant 
https://patchwork.kernel.org/cover/11678133/
- The machine driver patch is made by the collaboration of
  Cheng-Yi Chiang 
  Rohit kumar 
  Ajit Pandey 
  But Ajit has left codeaurora.

Changes from v1 to v2:
- Documentation: Addressed all suggestions from Doug.
- Machine driver:
  - Fix comment style for license.
  - Sort includes.
  - Remove sc7180_snd_hw_params.
  - Remove sc7180_dai_init and use aux device instead for headset jack 
registration.
  - Statically define format for Primary MI2S.
  - Atomic is not a concern because there is mutex in card to make sure
startup and shutdown happen sequentially.
  - Fix missing return -EINVAL in startup.
  - Use static sound card.
  - Use devm_kzalloc to avoid kfree.

Changes from v2 to v3:
- Documentation: Addressed suggestions from Srini.
- Machine driver:
  - Reuse qcom_snd_parse_of to parse properties.
  - Remove playback-only and capture-only.
  - Misc fixes to address comments.

Changes from v3 to v4:
- Documentation: Addressed suggestions from Rob.
 - Remove definition of dai.
 - Use 'sound-dai: true' for sound-dai schema.
 - Add reg property to pass 'make dt_binding_check' check although reg is not 
used in the driver.
- Machine driver:
 - Add Reviewed-by: Tzung-Bi Shih 

Ajit Pandey (1):
  ASoC: qcom: sc7180: Add machine driver for sound card registration

Cheng-Yi Chiang (1):
  ASoC: qcom: dt-bindings: Add sc7180 machine bindings

 .../bindings/sound/qcom,sc7180.yaml   | 113 
 sound/soc/qcom/Kconfig|  12 +
 sound/soc/qcom/Makefile   |   2 +
 sound/soc/qcom/sc7180.c   | 244 ++
 4 files changed, 371 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/sound/qcom,sc7180.yaml
 create mode 100644 sound/soc/qcom/sc7180.c

-- 
2.28.0.163.g6104cc2f0b6-goog



[PATCH v4 2/2] ASoC: qcom: sc7180: Add machine driver for sound card registration

2020-08-02 Thread Cheng-Yi Chiang
From: Ajit Pandey 

Add new driver to register sound card on sc7180 trogdor board and
do the required configuration for lpass cpu dai and external codecs
connected over MI2S interfaces.

Signed-off-by: Ajit Pandey 
Signed-off-by: Cheng-Yi Chiang 
Reviewed-by: Tzung-Bi Shih 
---
 sound/soc/qcom/Kconfig  |  12 ++
 sound/soc/qcom/Makefile |   2 +
 sound/soc/qcom/sc7180.c | 244 
 3 files changed, 258 insertions(+)
 create mode 100644 sound/soc/qcom/sc7180.c

diff --git a/sound/soc/qcom/Kconfig b/sound/soc/qcom/Kconfig
index 5d6b2466a2f2..54aa2ede229c 100644
--- a/sound/soc/qcom/Kconfig
+++ b/sound/soc/qcom/Kconfig
@@ -110,3 +110,15 @@ config SND_SOC_SDM845
  To add support for audio on Qualcomm Technologies Inc.
  SDM845 SoC-based systems.
  Say Y if you want to use audio device on this SoCs.
+
+config SND_SOC_SC7180
+   tristate "SoC Machine driver for SC7180 boards"
+   depends on SND_SOC_QCOM
+   select SND_SOC_QCOM_COMMON
+   select SND_SOC_LPASS_SC7180
+   select SND_SOC_MAX98357A
+   select SND_SOC_RT5682
+   help
+To add support for audio on Qualcomm Technologies Inc.
+SC7180 SoC-based systems.
+Say Y if you want to use audio device on this SoCs.
diff --git a/sound/soc/qcom/Makefile b/sound/soc/qcom/Makefile
index 41b2c7a23a4d..3f6275d90526 100644
--- a/sound/soc/qcom/Makefile
+++ b/sound/soc/qcom/Makefile
@@ -15,12 +15,14 @@ snd-soc-storm-objs := storm.o
 snd-soc-apq8016-sbc-objs := apq8016_sbc.o
 snd-soc-apq8096-objs := apq8096.o
 snd-soc-sdm845-objs := sdm845.o
+snd-soc-sc7180-objs := sc7180.o
 snd-soc-qcom-common-objs := common.o
 
 obj-$(CONFIG_SND_SOC_STORM) += snd-soc-storm.o
 obj-$(CONFIG_SND_SOC_APQ8016_SBC) += snd-soc-apq8016-sbc.o
 obj-$(CONFIG_SND_SOC_MSM8996) += snd-soc-apq8096.o
 obj-$(CONFIG_SND_SOC_SDM845) += snd-soc-sdm845.o
+obj-$(CONFIG_SND_SOC_SC7180) += snd-soc-sc7180.o
 obj-$(CONFIG_SND_SOC_QCOM_COMMON) += snd-soc-qcom-common.o
 
 #DSP lib
diff --git a/sound/soc/qcom/sc7180.c b/sound/soc/qcom/sc7180.c
new file mode 100644
index ..7849376f63ba
--- /dev/null
+++ b/sound/soc/qcom/sc7180.c
@@ -0,0 +1,244 @@
+// SPDX-License-Identifier: GPL-2.0-only
+//
+// Copyright (c) 2020, The Linux Foundation. All rights reserved.
+//
+// sc7180.c -- ALSA SoC Machine driver for SC7180
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "../codecs/rt5682.h"
+#include "common.h"
+#include "lpass.h"
+
+#define DEFAULT_SAMPLE_RATE_48K 48000
+#define DEFAULT_MCLK_RATE  19200000
+#define RT5682_PLL1_FREQ (48000 * 512)
+
+struct sc7180_snd_data {
+   struct snd_soc_jack jack;
+   u32 pri_mi2s_clk_count;
+};
+
+static void sc7180_jack_free(struct snd_jack *jack)
+{
+   struct snd_soc_component *component = jack->private_data;
+
+   snd_soc_component_set_jack(component, NULL, NULL);
+}
+
+static int sc7180_headset_init(struct snd_soc_component *component)
+{
+   struct snd_soc_card *card = component->card;
+   struct sc7180_snd_data *pdata = snd_soc_card_get_drvdata(card);
+   struct snd_jack *jack;
+   int rval;
+
+   rval = snd_soc_card_jack_new(
+   card, "Headset Jack",
+   SND_JACK_HEADSET |
+   SND_JACK_HEADPHONE |
+   SND_JACK_BTN_0 | SND_JACK_BTN_1 |
+   SND_JACK_BTN_2 | SND_JACK_BTN_3,
+   &pdata->jack, NULL, 0);
+
+   if (rval < 0) {
+   dev_err(card->dev, "Unable to add Headset Jack\n");
+   return rval;
+   }
+
+   jack = pdata->jack.jack;
+
+   snd_jack_set_key(jack, SND_JACK_BTN_0, KEY_PLAYPAUSE);
+   snd_jack_set_key(jack, SND_JACK_BTN_1, KEY_VOICECOMMAND);
+   snd_jack_set_key(jack, SND_JACK_BTN_2, KEY_VOLUMEUP);
+   snd_jack_set_key(jack, SND_JACK_BTN_3, KEY_VOLUMEDOWN);
+
+   jack->private_data = component;
+   jack->private_free = sc7180_jack_free;
+
+   rval = snd_soc_component_set_jack(component,
+ &pdata->jack, NULL);
+   if (rval != 0 && rval != -EOPNOTSUPP) {
+   dev_warn(card->dev, "Failed to set jack: %d\n", rval);
+   return rval;
+   }
+
+   return 0;
+}
+
+static struct snd_soc_aux_dev sc7180_headset_dev = {
+   .dlc = COMP_EMPTY(),
+   .init = sc7180_headset_init,
+};
+
+static int sc7180_snd_startup(struct snd_pcm_substream *substream)
+{
+   struct snd_soc_pcm_runtime *rtd = substream->private_data;
+   struct snd_soc_card *card = rtd->card;
+   struct sc7180_snd_data *data = snd_soc_card_get_drvdata(card);
+   struct snd_soc_dai *cpu_dai = asoc_rtd_to_cpu(rtd, 0);
+   struct snd_soc_dai *codec_dai = asoc_rtd_to_codec(rtd, 0);
+   int ret;
+
+   switch (cpu_dai->id) {
+   case MI2S_PRIMARY:
+   if (++data->pri_mi2s_clk_count 

Re: [PATCH v4 2/2] soc: mediatek: add mtk-devapc driver

2020-08-02 Thread Neal Liu
Hi Chun-Kuang,

On Sat, 2020-08-01 at 08:12 +0800, Chun-Kuang Hu wrote:
> Hi, Neal:
> 
> This patch is for "mediatek,mt6779-devapc", so I think commit title
> should show the SoC ID.

Okay, I'll change title to 'soc:mediatek: add mt6779 devapc driver'.

> 
> On Wed, Jul 29, 2020 at 4:29 PM, Neal Liu wrote:
> >
> > MediaTek bus fabric provides TrustZone security support and data
> > protection to prevent slaves from being accessed by unexpected
> > masters.
> > The security violation is logged and sent to the processor for
> > further analysis or countermeasures.
> >
> > Any occurrence of security violation would raise an interrupt, and
> > it will be handled by mtk-devapc driver. The violation
> > information is printed in order to find the murderer.
> >
> > Signed-off-by: Neal Liu 
> > ---
> 
> [snip]
> 
> > +
> > +struct mtk_devapc_context {
> > +   struct device *dev;
> > +   u32 vio_idx_num;
> > +   void __iomem *devapc_pd_base;
> > +   struct mtk_devapc_vio_info *vio_info;
> > +   const struct mtk_devapc_pd_offset *offset;
> > +   const struct mtk_devapc_vio_dbgs *vio_dbgs;
> > +};
> 
> I think this structure should separate the constant part. The constant part 
> is:
> 
> struct mtk_devapc_data {
> const u32 vio_idx_num;
> const struct mtk_devapc_pd_offset *offset; /* I would like to
> remove struct mtk_devapc_pd_offset and directly put its member into
> this structure */
> const struct mtk_devapc_vio_dbgs *vio_dbgs; /* This may disappear */
> };
> 
> And the context is:
> 
> struct mtk_devapc_context {
> struct device *dev;
> void __iomem *devapc_pd_base;
> const struct mtk_devapc_data *data;
> };
> 
> So when you define this, you would not waste memory to store non-constant 
> data.
> 
> static const struct mtk_devapc_data devapc_mt6779 = {
>  .vio_idx_num = 510,
>  .offset = _pd_offset,
>  .vio_dbgs = _vio_dbgs,
> };
> 

Sorry, I still don't understand how this refactoring avoids wasting
memory on non-constant data. Could you explain in more detail?
As I understand it, we still have to allocate memory to store dev
and devapc_pd_base.
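
One possible reading of the suggestion, sketched in plain C with hypothetical names: the per-SoC constants live once in read-only data, and each runtime context only stores a pointer to them next to its genuinely per-device fields. The value 510 is taken from the example above; everything else is an assumption for illustration.

```c
#include <stdint.h>

/* Per-SoC constants: one read-only instance per supported SoC. */
struct devapc_data_sketch {
	uint32_t vio_idx_num;
};

/* Per-device state: only what actually differs between devices. */
struct devapc_context_sketch {
	void *base;				/* mapped registers, set at probe */
	const struct devapc_data_sketch *data;	/* shared, never copied */
};

static const struct devapc_data_sketch devapc_mt6779_sketch = {
	.vio_idx_num = 510,
};

/* Fill in a context; the constants are referenced, not duplicated. */
void devapc_context_init_sketch(struct devapc_context_sketch *ctx, void *base)
{
	ctx->base = base;
	ctx->data = &devapc_mt6779_sketch;
}
```

Memory for dev and the base address is still allocated per device, as noted above. What goes away is the per-device copy of values that never change after build time.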

> Regards,
> Chun-Kuang.
> 
> > +
> > +#endif /* __MTK_DEVAPC_H__ */
> > --
> > 1.7.9.5



[PATCH v4 1/2] ASoC: qcom: dt-bindings: Add sc7180 machine bindings

2020-08-02 Thread Cheng-Yi Chiang
Add devicetree bindings documentation file for sc7180 sound card.

Signed-off-by: Cheng-Yi Chiang 
---
 .../bindings/sound/qcom,sc7180.yaml   | 113 ++
 1 file changed, 113 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/sound/qcom,sc7180.yaml

diff --git a/Documentation/devicetree/bindings/sound/qcom,sc7180.yaml 
b/Documentation/devicetree/bindings/sound/qcom,sc7180.yaml
new file mode 100644
index ..c74f0fe9fb3b
--- /dev/null
+++ b/Documentation/devicetree/bindings/sound/qcom,sc7180.yaml
@@ -0,0 +1,113 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/sound/qcom,sc7180.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Qualcomm Technologies Inc. SC7180 ASoC sound card driver
+
+maintainers:
+  - Rohit kumar 
+  - Cheng-Yi Chiang 
+
+description:
+  This binding describes the SC7180 sound card which uses LPASS for audio.
+
+properties:
+  compatible:
+contains:
+  const: qcom,sc7180-sndcard
+
+  audio-routing:
+$ref: /schemas/types.yaml#/definitions/non-unique-string-array
+description:
+  A list of the connections between audio components. Each entry is a
+  pair of strings, the first being the connection's sink, the second
+  being the connection's source.
+
+  model:
+$ref: /schemas/types.yaml#/definitions/string
+description: User specified audio sound card name
+
+  aux-dev:
+$ref: /schemas/types.yaml#/definitions/phandle
+description: phandle of the codec for headset detection
+
+patternProperties:
+  "^dai-link(@[0-9]+)?$":
+description:
+  Each subnode represents a dai link. Subnodes of each dai links would be
+  cpu/codec dais.
+
+type: object
+
+properties:
+  link-name:
+description: Indicates dai-link name and PCM stream name.
+$ref: /schemas/types.yaml#/definitions/string
+maxItems: 1
+
+  reg:
+description: dai link address.
+$ref: /schemas/types.yaml#/definitions/uint32
+maxItems: 1
+
+  cpu:
+description: Holds subnode which indicates cpu dai.
+type: object
+properties:
+  sound-dai: true
+
+  codec:
+description: Holds subnode which indicates codec dai.
+type: object
+properties:
+  sound-dai: true
+
+required:
+  - link-name
+  - cpu
+  - codec
+
+additionalProperties: false
+
+examples:
+
+  - |
+sound {
+compatible = "qcom,sc7180-sndcard";
+model = "sc7180-snd-card";
+
+audio-routing =
+"Headphone Jack", "HPOL",
+"Headphone Jack", "HPOR";
+
+aux-dev = <>;
+
+#address-cells = <1>;
+#size-cells = <0>;
+
+dai-link@0 {
+link-name = "MultiMedia0";
+reg = <0>;
+cpu {
+sound-dai = <_cpu 0>;
+};
+
+codec {
+sound-dai = < 0>;
+};
+};
+
+dai-link@1 {
+link-name = "MultiMedia1";
+reg = <1>;
+cpu {
+sound-dai = <_cpu 1>;
+};
+
+codec {
+sound-dai = <>;
+};
+};
+};
-- 
2.28.0.163.g6104cc2f0b6-goog



Re: [PATCH 4/6] perf tools: Add support to store time of day in CTF data conversion

2020-08-02 Thread Namhyung Kim
On Thu, Jul 30, 2020 at 11:39:48PM +0200, Jiri Olsa wrote:
> Adding support to convert and store time of day in CTF
> data conversion for 'perf data convert' subcommand.
> 
> The perf.data used for conversion needs to have clock data
> information (it must be recorded with the -k/--clockid option).
> 
> New --tod option is added to 'perf data convert' subcommand
> to convert data with timestamps converted to wall clock time.
> 
> Record data with clockid set:
>   # perf record -k CLOCK_MONOTONIC kill
>   kill: not enough arguments
>   [ perf record: Woken up 1 times to write data ]
>   [ perf record: Captured and wrote 0.033 MB perf.data (8 samples) ]
> 
> Convert data with TOD timestamps:
>   # perf data convert --tod --to-ctf ./ctf
>   [ perf data convert: Converted 'perf.data' into CTF data './ctf' ]
>   [ perf data convert: Converted and wrote 0.000 MB (8 samples) ]
> 
> Display data in perf script:
>   # perf script -F+tod --ns
>     perf 262150 2020-07-13 18:38:50.097678523 153633.958246159:   1 cycles: ...
>     perf 262150 2020-07-13 18:38:50.097682941 153633.958250577:   1 cycles: ...
>     perf 262150 2020-07-13 18:38:50.097684997 153633.958252633:   7 cycles: ...
>   ...

I believe this belongs to a later patch.

Thanks
Namhyung

> 
> Display data in babeltrace:
>   # babeltrace --clock-date  ./ctf
>   [2020-07-13 18:38:50.097678523] (+?.?) cycles: { cpu_id = 0 }, { perf_ip = 0xFFF ...
>   [2020-07-13 18:38:50.097682941] (+0.04418) cycles: { cpu_id = 0 }, { perf_ip = 0xFFF ...
>   [2020-07-13 18:38:50.097684997] (+0.02056) cycles: { cpu_id = 0 }, { perf_ip = 0xFFF ...
>   ...
> 
> It's available only for recording with clockid specified,
> because it's the only case where we can get reference time
> to wallclock time. It can't do that with perf clock yet.
> 
> An error is displayed if you try to use --tod on data without
> clockid specified:
> 
>   # perf data convert --tod --to-ctf ./ctf
>   Can't provide --tod time, missing clock data. Please record with -k/--clockid option.
>   Failed to setup CTF writer.
>   Error during conversion setup.
> 
> Signed-off-by: Jiri Olsa 
> ---
>  tools/perf/Documentation/perf-data.txt |  3 ++
>  tools/perf/builtin-data.c  |  1 +
>  tools/perf/util/data-convert-bt.c  | 56 +-
>  tools/perf/util/data-convert.h |  1 +
>  4 files changed, 41 insertions(+), 20 deletions(-)


Re: [PATCH 2/6] perf tools: Store clock references for -k/--clockid option

2020-08-02 Thread Namhyung Kim
On Thu, Jul 30, 2020 at 11:39:46PM +0200, Jiri Olsa wrote:
> Adding new CLOCK_DATA feature that stores reference times
> when -k/--clockid option is specified.
> 
> It contains clock id and its reference time together with
> wall clock time taken at the 'same time', both values are
> in nanoseconds.
> 
> The format of data is as below:
> 
>   struct {
>u32 version;  /* version = 1 */
>u32 clockid;
>u64 clockid_time_ns;
>u64 wall_clock_ns;
>   };
> 
> These clock reference times will be used in following changes
> to display wall clock for perf events.
> 
> It's available only for recording with clockid specified,
> because it's the only case where we can get reference time
> to wallclock time. It can't do that with perf clock yet.
> 
> Original-patch-by: David Ahern 
> Signed-off-by: Jiri Olsa 
> ---
>  .../Documentation/perf.data-file-format.txt   |  13 ++
>  tools/perf/builtin-record.c   |  41 +++
>  tools/perf/util/env.h |  12 ++
>  tools/perf/util/header.c  | 112 ++
>  tools/perf/util/header.h  |   1 +
>  5 files changed, 179 insertions(+)
> 
> diff --git a/tools/perf/Documentation/perf.data-file-format.txt 
> b/tools/perf/Documentation/perf.data-file-format.txt
> index b6472e463284..c484e81987c7 100644
> --- a/tools/perf/Documentation/perf.data-file-format.txt
> +++ b/tools/perf/Documentation/perf.data-file-format.txt
> @@ -389,6 +389,19 @@ struct {
>  Example:
>   cpu pmu capabilities: branches=32, max_precise=3, pmu_name=icelake
>  
> + HEADER_CLOCK_DATA = 29,
> +
> + Contains clock id and its reference time together with wall clock
> + time taken at the 'same time', both values are in nanoseconds.
> + The format of data is as below.
> +
> +struct {
> + u32 version;  /* version = 1 */
> + u32 clockid;
> + u64 clockid_time_ns;
> + u64 wall_clock_ns;
> +};
> +

This seems slightly different from what is actually written to the file.
Specifically, I think the last two fields should be swapped.


>   other bits are reserved and should ignored for now
>   HEADER_FEAT_BITS= 256,
>
[SNIP]

> diff --git a/tools/perf/util/env.h b/tools/perf/util/env.h
> index 1ab2682d5d2b..4098a63d5e64 100644
> --- a/tools/perf/util/env.h
> +++ b/tools/perf/util/env.h
> @@ -100,6 +100,18 @@ struct perf_env {
>   /* For fast cpu to numa node lookup via perf_env__numa_node */
>   int *numa_map;
>   int  nr_numa_map;
> +
> + /* For real clock time refference. */

typo: reference

> + struct {
> + u64 tod_ns;
> + u64 clockid_ns;
> + int clockid;
> + /*
> +  * enabled is valid for report mode, and is true if above
> +  * values are set, it's set in process_clock_data
> +  */
> + boolenabled;
> + } clock;
>  };
>  
>  enum perf_compress_type {

[SNIP]
> +static void print_clock_data(struct feat_fd *ff, FILE *fp)
> +{
> + struct timespec clockid_ns;
> + char tstr[64], date[64];
> + struct timeval tod_ns;
> + clockid_t clockid;
> + struct tm ltime;
> + u64 ref;
> +
> + if (!ff->ph->env.clock.enabled) {
> + fprintf(fp, "# reference time disabled\n");
> + return;
> + }
> +
> + /* Compute TOD time. */
> + ref = ff->ph->env.clock.tod_ns;
> + tod_ns.tv_sec = ref / NSEC_PER_SEC;
> + ref -= tod_ns.tv_sec * NSEC_PER_SEC;
> + tod_ns.tv_usec = ref / NSEC_PER_USEC;
> +
> + /* Compute clockid time. */
> + ref = ff->ph->env.clock.clockid_ns;
> + clockid_ns.tv_sec = ref / NSEC_PER_SEC;
> + ref -= clockid_ns.tv_sec * NSEC_PER_SEC;
> + clockid_ns.tv_nsec = ref;
> +
> + clockid = ff->ph->env.clock.clockid;
> +
> + if (localtime_r(_ns.tv_sec, ) == NULL)
> + snprintf(tstr, sizeof(tstr), "");
> + else {
> + strftime(date, sizeof(date), "%F %T", );
> + scnprintf(tstr, sizeof(tstr), "%s.%06d",
> +   date, (int) tod_ns.tv_usec);
> + }
> +
> + fprintf(fp, "# clockid: %s (%u)\n", clockid_name(clockid), clockid);
> + fprintf(fp, "# reference time: %s = %ld.%06d (TOD) = %ld.%ld (%s)\n",

Shouldn't the last one be %ld.%09ld?
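
To illustrate the point about the format string, here is a small
standalone sketch (plain userspace C, not the perf code; the function
names are made up for this example). It splits a nanosecond count into
seconds and a nanosecond remainder and prints the remainder with and
without zero padding. Without the 09 width, a remainder of 42 ns is
rendered as ".42", which reads as 0.42 s.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

#define NSEC_PER_SEC 1000000000ULL

/* "sec.nsec" with a zero-padded, 9-digit fractional part. */
static int format_ns_padded(char *buf, size_t len, uint64_t ns)
{
	return snprintf(buf, len, "%" PRIu64 ".%09" PRIu64,
			ns / NSEC_PER_SEC, ns % NSEC_PER_SEC);
}

/* Same split, but the fraction is printed without padding. */
static int format_ns_unpadded(char *buf, size_t len, uint64_t ns)
{
	return snprintf(buf, len, "%" PRIu64 ".%" PRIu64,
			ns / NSEC_PER_SEC, ns % NSEC_PER_SEC);
}
```

For 5000000042 ns the padded variant gives "5.000000042", while the
unpadded one gives "5.42".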

Thanks
Namhyung


> + tstr, tod_ns.tv_sec, (int) tod_ns.tv_usec,
> + clockid_ns.tv_sec, clockid_ns.tv_nsec,
> + clockid_name(clockid));
> +}
> +
>  static void print_dir_format(struct feat_fd *ff, FILE *fp)
>  {
>   struct perf_session *session;
> @@ -2738,6 +2815,40 @@ static int process_clockid(struct feat_fd *ff,
>   return 0;
>  }
>  
> +static int process_clock_data(struct feat_fd *ff,
> +   void *_data __maybe_unused)
> +{
> + u32 data32;
> + u64 data64;
> +
> + /* version */
> + if (do_read_u32(ff, ))
> + 

[PATCH] powerpc: fix up PPC_FSL_BOOK3E build

2020-08-02 Thread Stephen Rothwell
Commit

  1c9df907da83 ("random: fix circular include dependency on arm64 after addition of percpu.h")

exposed a circular include dependency:

asm/mmu.h includes asm/percpu.h, which includes asm/paca.h, which
includes asm/mmu.h

So fix it by extracting the small part of asm/mmu.h that needs
asm/percpu.h into a new file and including that where necessary.

Cc: Willy Tarreau 
Cc: 
Signed-off-by: Stephen Rothwell 
---

I have done powerpc test builds of allmodconfig, ppc64e_defconfig and
corenet64_smp_defconfig.

 arch/powerpc/include/asm/mmu.h  |  5 -
 arch/powerpc/include/asm/mmu_fsl_e.h| 10 ++
 arch/powerpc/kernel/smp.c   |  1 +
 arch/powerpc/mm/mem.c   |  1 +
 arch/powerpc/mm/nohash/book3e_hugetlbpage.c |  1 +
 arch/powerpc/mm/nohash/tlb.c|  1 +
 6 files changed, 14 insertions(+), 5 deletions(-)
 create mode 100644 arch/powerpc/include/asm/mmu_fsl_e.h

diff --git a/arch/powerpc/include/asm/mmu.h b/arch/powerpc/include/asm/mmu.h
index f4ac25d4df05..fa602a4cf303 100644
--- a/arch/powerpc/include/asm/mmu.h
+++ b/arch/powerpc/include/asm/mmu.h
@@ -134,11 +134,6 @@
 
 typedef pte_t *pgtable_t;
 
-#ifdef CONFIG_PPC_FSL_BOOK3E
-#include 
-DECLARE_PER_CPU(int, next_tlbcam_idx);
-#endif
-
 enum {
MMU_FTRS_POSSIBLE =
 #ifdef CONFIG_PPC_BOOK3S
diff --git a/arch/powerpc/include/asm/mmu_fsl_e.h 
b/arch/powerpc/include/asm/mmu_fsl_e.h
new file mode 100644
index ..c74a81556ce5
--- /dev/null
+++ b/arch/powerpc/include/asm/mmu_fsl_e.h
@@ -0,0 +1,10 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_POWERPC_MMU_FSL_E_H_
+#define _ASM_POWERPC_MMU_FSL_E_H_
+
+#ifdef CONFIG_PPC_FSL_BOOK3E
+#include 
+DECLARE_PER_CPU(int, next_tlbcam_idx);
+#endif
+
+#endif
diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
index 73199470c265..142b3e7882bf 100644
--- a/arch/powerpc/kernel/smp.c
+++ b/arch/powerpc/kernel/smp.c
@@ -35,6 +35,7 @@
 #include 
 #include 
 
+#include 
 #include 
 #include 
 #include 
diff --git a/arch/powerpc/mm/mem.c b/arch/powerpc/mm/mem.c
index c2c11eb8dcfc..7371061b2126 100644
--- a/arch/powerpc/mm/mem.c
+++ b/arch/powerpc/mm/mem.c
@@ -37,6 +37,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
diff --git a/arch/powerpc/mm/nohash/book3e_hugetlbpage.c 
b/arch/powerpc/mm/nohash/book3e_hugetlbpage.c
index 8b88be91b622..cacda4ee5da5 100644
--- a/arch/powerpc/mm/nohash/book3e_hugetlbpage.c
+++ b/arch/powerpc/mm/nohash/book3e_hugetlbpage.c
@@ -9,6 +9,7 @@
 #include 
 #include 
 
+#include 
 #include 
 
 #ifdef CONFIG_PPC64
diff --git a/arch/powerpc/mm/nohash/tlb.c b/arch/powerpc/mm/nohash/tlb.c
index 696f568253a0..8b3a68ce7fde 100644
--- a/arch/powerpc/mm/nohash/tlb.c
+++ b/arch/powerpc/mm/nohash/tlb.c
@@ -171,6 +171,7 @@ int extlb_level_exc;
 #endif /* CONFIG_PPC64 */
 
 #ifdef CONFIG_PPC_FSL_BOOK3E
+#include 
 /* next_tlbcam_idx is used to round-robin tlbcam entry assignment */
 DEFINE_PER_CPU(int, next_tlbcam_idx);
 EXPORT_PER_CPU_SYMBOL(next_tlbcam_idx);
-- 
2.28.0

-- 
Cheers,
Stephen Rothwell




[PATCH v3 2/2] PCI: Reduce warnings on possible RW1C corruption

2020-08-02 Thread Mark Tomlinson
For hardware that only supports 32-bit writes to PCI config space, a
smaller write has to be emulated with a read-modify-write, which can
inadvertently clear RW1C (write-one-to-clear) bits. A rate-limited
message was introduced by fb2659230120, but rate-limiting is not the
best choice here. Some devices may not show the warnings they should if
another device has just produced a bunch of warnings. Also, the number
of messages can be a nuisance on devices which are otherwise working
fine.
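
As a rough illustration of the hazard (a userspace model under stated
assumptions, not the kernel code): the register below is hypothetical,
with its low 16 bits plain read/write and its high 16 bits RW1C. The
mask arithmetic in rmw_write() follows the read-modify-write in
pci_generic_config_write32(); everything else is made up for the
example.

```c
#include <stdint.h>

/* Hypothetical register: bits 31..16 are RW1C status, bits 15..0 are RW. */
#define RW1C_MASK 0xffff0000u

struct fake_reg {
	uint32_t value;
};

/* Model of a full 32-bit hardware write: RW bits take the new value,
 * RW1C bits are cleared wherever a 1 is written. */
static void hw_write32(struct fake_reg *r, uint32_t v)
{
	uint32_t rw = v & ~RW1C_MASK;
	uint32_t status = r->value & RW1C_MASK & ~v;

	r->value = rw | status;
}

/* Emulate a 1- or 2-byte write with a 32-bit read-modify-write. */
static void rmw_write(struct fake_reg *r, int where, int size, uint32_t val)
{
	uint32_t mask, tmp;

	if (size == 4) {
		hw_write32(r, val);
		return;
	}

	mask = ~(((1u << (size * 8)) - 1) << ((where & 0x3) * 8));
	tmp = r->value & mask;              /* read back, status bits included */
	tmp |= val << ((where & 0x3) * 8);  /* merge in the narrow value */
	hw_write32(r, tmp);                 /* status 1s are written back */
}
```

Starting from value 0x00010000 (one status bit set), a 2-byte write of
0x0006 at offset 0 leaves 0x00000006: the status bit was read back as 1,
written back as 1, and therefore cleared.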

This patch changes the ratelimit to a single warning per bus. This
ensures no bus is 'starved' of emitting a warning and also that there
isn't a continuous stream of warnings. It would be preferable to have a
warning per device, but the pci_dev structure is not available here, and
a lookup from devfn would be far too slow.
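
The warn-once-per-bus idea can be sketched in isolation like this. It is
a userspace model with hypothetical names (struct fake_bus and
warn_rw1c_once() are not kernel interfaces); only the flag name
unsafe_warn mirrors the patch.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for the relevant part of struct pci_bus. */
struct fake_bus {
	unsigned int number;
	bool unsafe_warn;   /* already warned about RW1C config write */
};

/* Emit the warning at most once per bus; returns 1 if it was printed. */
static int warn_rw1c_once(struct fake_bus *bus, int size, unsigned int where)
{
	if (bus->unsafe_warn)
		return 0;

	fprintf(stderr,
		"bus %02x: %d-byte config write at offset %#x may corrupt adjacent RW1C bits\n",
		bus->number, size, where);
	bus->unsafe_warn = true;
	return 1;
}
```

Each bus keeps its own flag, so a noisy bus cannot suppress the first
warning on another bus, and no bus produces more than one message.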

Suggested-by: Bjorn Helgaas 
Fixes: fb2659230120 ("PCI: Warn on possible RW1C corruption for sub-32 bit config writes")
Signed-off-by: Mark Tomlinson 
---
 drivers/pci/access.c | 9 ++---
 include/linux/pci.h  | 1 +
 2 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/drivers/pci/access.c b/drivers/pci/access.c
index 79c4a2ef269a..ab85cb7df9b6 100644
--- a/drivers/pci/access.c
+++ b/drivers/pci/access.c
@@ -160,9 +160,12 @@ int pci_generic_config_write32(struct pci_bus *bus, 
unsigned int devfn,
 * write happen to have any RW1C (write-one-to-clear) bits set, we
 * just inadvertently cleared something we shouldn't have.
 */
-   dev_warn_ratelimited(>dev, "%d-byte config write to 
%04x:%02x:%02x.%d offset %#x may corrupt adjacent RW1C bits\n",
-size, pci_domain_nr(bus), bus->number,
-PCI_SLOT(devfn), PCI_FUNC(devfn), where);
+   if (!bus->unsafe_warn) {
+   dev_warn(>dev, "%d-byte config write to %04x:%02x:%02x.%d 
offset %#x may corrupt adjacent RW1C bits\n",
+size, pci_domain_nr(bus), bus->number,
+PCI_SLOT(devfn), PCI_FUNC(devfn), where);
+   bus->unsafe_warn = true;
+   }
 
mask = ~(((1 << (size * 8)) - 1) << ((where & 0x3) * 8));
tmp = readl(addr) & mask;
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 34c1c4f45288..5b6ab593ae09 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -613,6 +613,7 @@ struct pci_bus {
unsigned char   primary;/* Number of primary bridge */
unsigned char   max_bus_speed;  /* enum pci_bus_speed */
unsigned char   cur_bus_speed;  /* enum pci_bus_speed */
+   boolunsafe_warn;/* warned about RW1C config write */
 #ifdef CONFIG_PCI_DOMAINS_GENERIC
int domain_nr;
 #endif
-- 
2.28.0



[PATCH v3 1/2] PCI: iproc: Set affinity mask on MSI interrupts

2020-08-02 Thread Mark Tomlinson
The core interrupt code expects the irq_set_affinity call to update the
effective affinity for the interrupt. This was not being done, so update
iproc_msi_irq_set_affinity() to do so.

Fixes: 3bc2b2348835 ("PCI: iproc: Add iProc PCIe MSI support")
Signed-off-by: Mark Tomlinson 
---
changes in v2:
 - Patch 1/2 Added Fixes tag
 - Patch 2/2 Replace original change with change suggested by Bjorn
   Helgaas.

changes in v3:
 - Use bitfield rather than bool to save memory.

 drivers/pci/controller/pcie-iproc-msi.c | 13 +
 1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/drivers/pci/controller/pcie-iproc-msi.c 
b/drivers/pci/controller/pcie-iproc-msi.c
index 3176ad3ab0e5..908475d27e0e 100644
--- a/drivers/pci/controller/pcie-iproc-msi.c
+++ b/drivers/pci/controller/pcie-iproc-msi.c
@@ -209,15 +209,20 @@ static int iproc_msi_irq_set_affinity(struct irq_data 
*data,
struct iproc_msi *msi = irq_data_get_irq_chip_data(data);
int target_cpu = cpumask_first(mask);
int curr_cpu;
+   int ret;
 
curr_cpu = hwirq_to_cpu(msi, data->hwirq);
if (curr_cpu == target_cpu)
-   return IRQ_SET_MASK_OK_DONE;
+   ret = IRQ_SET_MASK_OK_DONE;
+   else {
+   /* steer MSI to the target CPU */
+   data->hwirq = hwirq_to_canonical_hwirq(msi, data->hwirq) + 
target_cpu;
+   ret = IRQ_SET_MASK_OK;
+   }
 
-   /* steer MSI to the target CPU */
-   data->hwirq = hwirq_to_canonical_hwirq(msi, data->hwirq) + target_cpu;
+   irq_data_update_effective_affinity(data, cpumask_of(target_cpu));
 
-   return IRQ_SET_MASK_OK;
+   return ret;
 }
 
 static void iproc_msi_irq_compose_msi_msg(struct irq_data *data,
-- 
2.28.0



[PATCH] drm/nouveau/acr: fix a coding style in nvkm_acr_lsfw_load_bl_inst_data_sig()

2020-08-02 Thread Jing Xiangfeng
This patch makes the following changes:
1. remove the redundant parentheses around the nvkm_acr_lsfw_add() call
2. do the assignment before the if condition, which is more readable

Signed-off-by: Jing Xiangfeng 
---
 drivers/gpu/drm/nouveau/nvkm/subdev/acr/lsfw.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/acr/lsfw.c 
b/drivers/gpu/drm/nouveau/nvkm/subdev/acr/lsfw.c
index 07d1830126ab..5f6006418472 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/acr/lsfw.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/acr/lsfw.c
@@ -191,7 +191,8 @@ nvkm_acr_lsfw_load_bl_inst_data_sig(struct nvkm_subdev 
*subdev,
u32 *bldata;
int ret;
 
-   if (IS_ERR((lsfw = nvkm_acr_lsfw_add(func, acr, falcon, id
+   lsfw = nvkm_acr_lsfw_add(func, acr, falcon, id);
+   if (IS_ERR(lsfw))
return PTR_ERR(lsfw);
 
ret = nvkm_firmware_load_name(subdev, path, "bl", ver, );
-- 
2.17.1



Re: powerpc: build failures in Linus' tree

2020-08-02 Thread Willy Tarreau
Hi again Stephen,

On Sun, Aug 02, 2020 at 07:20:19PM +0200, Willy Tarreau wrote:
> On Sun, Aug 02, 2020 at 08:48:42PM +1000, Stephen Rothwell wrote:
> > Hi all,
> > 
> > We are getting build failures in some PowerPC configs for Linus' tree.
> > See e.g. http://kisskb.ellerman.id.au/kisskb/buildresult/14306515/
> > 
> > In file included from /kisskb/src/arch/powerpc/include/asm/paca.h:18,
> >  from /kisskb/src/arch/powerpc/include/asm/percpu.h:13,
> >  from /kisskb/src/include/linux/random.h:14,
> >  from /kisskb/src/include/linux/net.h:18,
> >  from /kisskb/src/net/ipv6/ip6_fib.c:20:
> > /kisskb/src/arch/powerpc/include/asm/mmu.h:139:22: error: unknown type name 'next_tlbcam_idx'
> >   139 | DECLARE_PER_CPU(int, next_tlbcam_idx);
> > 
> > I assume this is caused by commit
> > 
> >   1c9df907da83 ("random: fix circular include dependency on arm64 after addition of percpu.h")
> > 
> > But I can't see how, sorry.
> 
> So there, asm/mmu.h includes asm/percpu.h, which includes asm/paca.h, which
> includes asm/mmu.h.
> 
> I suspect that we can remove asm/paca.h from asm/percpu.h as it *seems*
> to be only used by the #define __my_cpu_offset but I don't know if anything
> will break further, especially if this __my_cpu_offset is used anywhere
> without this paca definition.

I tried this and it fixed 5.8 for me with your config above. I'm appending
a patch that does just this. I didn't test other configs as I don't know
which ones to test though. If it fixes the problem for you, maybe it can
be picked by the PPC maintainers.

Willy
From bcd64a7d0f3445c9a75d3b4dc4837d2ce61660c9 Mon Sep 17 00:00:00 2001
From: Willy Tarreau 
Date: Mon, 3 Aug 2020 05:27:57 +0200
Subject: powerpc: fix circular dependency in percpu.h

After random.h started to include percpu.h (commit f227e3e), several
archs broke in circular dependencies around percpu.h.

In https://lore.kernel.org/lkml/20200802204842.36bca...@canb.auug.org.au/
Stephen Rothwell reported breakage for powerpc with CONFIG_PPC_FSL_BOOK3E.

It turns out that asm/percpu.h includes asm/paca.h, which itself
includes mmu.h, which includes percpu.h when CONFIG_PPC_FSL_BOOK3E=y.

Percpu seems to include asm/paca.h only for local_paca which is used in
the __my_cpu_offset macro. Removing this include solves the issue for
this config.

Reported-by: Stephen Rothwell 
Fixes: f227e3e ("random32: update the net random state on interrupt and activity")
Link: https://lore.kernel.org/lkml/20200802204842.36bca...@canb.auug.org.au/
Cc: Linus Torvalds 
Cc: Michael Ellerman 
Cc: Benjamin Herrenschmidt 
Cc: Paul Mackerras 
Signed-off-by: Willy Tarreau 
---
 arch/powerpc/include/asm/percpu.h | 2 --
 1 file changed, 2 deletions(-)

diff --git a/arch/powerpc/include/asm/percpu.h 
b/arch/powerpc/include/asm/percpu.h
index dce863a..cd3f6e5 100644
--- a/arch/powerpc/include/asm/percpu.h
+++ b/arch/powerpc/include/asm/percpu.h
@@ -10,8 +10,6 @@
 
 #ifdef CONFIG_SMP
 
-#include 
-
 #define __my_cpu_offset local_paca->data_offset
 
 #endif /* CONFIG_SMP */
-- 
2.9.0



Re: [PATCH v2 2/2] PCI: Reduce warnings on possible RW1C corruption

2020-08-02 Thread Mark Tomlinson
On Fri, 2020-07-31 at 09:32 -0600, Rob Herring wrote:
> 
> If we don't want to just warn when an 8- or 16-bit access occurs (I'm
> not sure if 32-bit-only accesses are possible or common. Seems like
> PCI_COMMAND would always get written?), then a simple way to do this
> is just move this out of line and do something like this where the bus
> or device is created/registered:
> 
> if (bus->ops->write == pci_generic_config_write32)
> warn()
> 
This doesn't work for many of the PCI drivers, since they wrap the call
to pci_generic_config_write32() in their own function.

> > 
> > diff --git a/include/linux/pci.h b/include/linux/pci.h
> > index 34c1c4f45288..5b6ab593ae09 100644
> > --- a/include/linux/pci.h
> > +++ b/include/linux/pci.h
> > @@ -613,6 +613,7 @@ struct pci_bus {
> > unsigned char   primary;/* Number of primary bridge */
> > unsigned char   max_bus_speed;  /* enum pci_bus_speed */
> > unsigned char   cur_bus_speed;  /* enum pci_bus_speed */
> > +   boolunsafe_warn;/* warned about RW1C config write */
> 
> Make this a bitfield next to 'is_added'.

Will do, thanks.




Re: [PATCH v2] scsi: 3w-9xxx: Fix endianness issues found by sparse

2020-08-02 Thread Samuel Holland
On 7/31/20 2:29 AM, Arnd Bergmann wrote:
> On Fri, Jul 31, 2020 at 12:07 AM Samuel Holland  wrote:
>>
>> The main issue observed was at the call to scsi_set_resid, where the
>> byteswapped parameter would eventually trigger the alignment check at
>> drivers/scsi/sd.c:2009. At that point, the kernel would continuously
>> complain about an "Unaligned partial completion", and no further I/O
>> could occur.
>>
>> This gets the controller working on big endian powerpc64.
>>
>> Signed-off-by: Samuel Holland 
>> ---
>>
>> Changes since v1:
>>  - Include changes to use __le?? types in command structures
>>  - Use an object literal for the intermediate "schedulertime" value
>>  - Use local "error" variable to avoid repeated byte swapping
>>  - Create a local "length" variable to avoid very long lines
>>  - Move byte swapping to TW_REQ_LUN_IN/TW_LUN_OUT to avoid long lines
>>
> 
> Looks much better, thanks for the update. I see one more issue here
>>  /* Command Packet */
>>  typedef struct TW_Command {
>> -   unsigned char opcode__sgloffset;
>> -   unsigned char size;
>> -   unsigned char request_id;
>> -   unsigned char unit__hostid;
>> +   u8  opcode__sgloffset;
>> +   u8  size;
>> +   u8  request_id;
>> +   u8  unit__hostid;
>> /* Second DWORD */
>> -   unsigned char status;
>> -   unsigned char flags;
>> +   u8  status;
>> +   u8  flags;
>> union {
>> -   unsigned short block_count;
>> -   unsigned short parameter_count;
>> +   __le16  block_count;
>> +   __le16  parameter_count;
>> } byte6_offset;
>> union {
>> struct {
>> -   u32 lba;
>> -   TW_SG_Entry sgl[TW_ESCALADE_MAX_SGL_LENGTH];
>> -   dma_addr_t padding;
>> +   __le32  lba;
>> +   TW_SG_Entry sgl[TW_ESCALADE_MAX_SGL_LENGTH];
>> +   dma_addr_t  padding;
> 
> 
> The use of dma_addr_t here seems odd, since this is neither endian-safe nor
> fixed-length. I see you replaced the dma_addr_t in TW_SG_Entry with
> a variable-length fixed-endian word. I guess there is a chance that this is
> correct, but it is really confusing. On top of that, it seems that there is
> implied padding in the structure when built with a 64-bit dma_addr_t
> on most architectures but not on x86-32 (which uses 32-bit alignment for
> 64-bit integers). I don't know what the hardware definition is for TW_Command,
> but ideally this would be expressed using only fixed-endian fixed-length
> members and explicit padding.

All of the command structures are packed, due to the "#pragma pack(1)" earlier
in the file. So alignment is not an issue. This dma_addr_t member _is_ the
explicit padding to make sizeof(TW_Command) -
sizeof(TW_Command.byte8_offset.{io,param}.sgl) equal TW_COMMAND_SIZE * 4. And
indeed the structure is expected to be a different size depending on
sizeof(dma_addr_t).
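
The effect of the pragma can be sketched outside the driver. The two
structs below are hypothetical and are not the TW_Command layout; they
only show that under "#pragma pack(1)" no implied padding is inserted
before a 64-bit member, whereas the natural layout depends on the ABI's
alignment rules.

```c
#include <stddef.h>
#include <stdint.h>

/* Natural layout: the compiler may pad after 'opcode' to align 'addr'. */
struct natural_cmd {
	uint8_t  opcode;
	uint64_t addr;
};

#pragma pack(push, 1)
/* Packed layout: members are placed back to back, with no padding. */
struct packed_cmd {
	uint8_t  opcode;
	uint64_t addr;
};
#pragma pack(pop)
```

With packing, sizeof(struct packed_cmd) is 9 and addr sits at offset 1
regardless of architecture; the unpacked struct is larger by an amount
that depends on the alignment the ABI requires for 64-bit integers.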

I left the padding member alone to avoid the #ifdef; since it's never accessed,
the endianness doesn't matter. In fact, since in both cases it's at the end of
the structure, it could probably be removed entirely. I don't see
sizeof(TW_Command) being used anywhere, but I'm not 100% certain. The downside
of removing it would be TW_COMMAND_SIZE becoming a slightly more magic number.

Regards,
Samuel


Re: [PATCH v4 2/2] soc: mediatek: add mtk-devapc driver

2020-08-02 Thread Neal Liu
Hi Chun-Kuang,

On Fri, 2020-07-31 at 23:55 +0800, Chun-Kuang Hu wrote:
> Hi, Neal:
> 
> Neal Liu wrote on Fri, Jul 31, 2020 at 10:52 AM:
> >
> > Hi Chun-Kuang,
> >
> > On Fri, 2020-07-31 at 00:14 +0800, Chun-Kuang Hu wrote:
> > > Hi, Neal:
> > >
> > > Neal Liu wrote on Wed, Jul 29, 2020 at 4:29 PM:
> > > >
> > > > MediaTek bus fabric provides TrustZone security support and data
> > > > protection to prevent slaves from being accessed by unexpected
> > > > masters.
> > > > The security violation is logged and sent to the processor for
> > > > further analysis or countermeasures.
> > > >
> > > > Any occurrence of security violation would raise an interrupt, and
> > > > it will be handled by mtk-devapc driver. The violation
> > > > information is printed in order to find the murderer.
> > > >
> > > > Signed-off-by: Neal Liu 
> > > > ---
> > >
> > > [snip]
> > >
> > > > +
> > > > +/*
> > > > > + * devapc_extract_vio_dbg - extract full violation information after doing
> > > > > + *  shift mechanism.
> > > > + */
> > > > +static void devapc_extract_vio_dbg(struct mtk_devapc_context *ctx)
> > > > +{
> > > > +   const struct mtk_devapc_vio_dbgs *vio_dbgs;
> > > > +   struct mtk_devapc_vio_info *vio_info;
> > > > +   void __iomem *vio_dbg0_reg;
> > > > +   void __iomem *vio_dbg1_reg;
> > > > +   u32 dbg0;
> > > > +
> > > > +   vio_dbg0_reg = ctx->devapc_pd_base + ctx->offset->vio_dbg0;
> > > > +   vio_dbg1_reg = ctx->devapc_pd_base + ctx->offset->vio_dbg1;
> > > > +
> > > > +   vio_dbgs = ctx->vio_dbgs;
> > > > +   vio_info = ctx->vio_info;
> > > > +
> > > > +   /* Starts to extract violation information */
> > > > +   dbg0 = readl(vio_dbg0_reg);
> > > > +   vio_info->vio_addr = readl(vio_dbg1_reg);
> > > > +
> > > > +   vio_info->master_id = (dbg0 & vio_dbgs->mstid.mask) >>
> > > > + vio_dbgs->mstid.start;
> > > > +   vio_info->domain_id = (dbg0 & vio_dbgs->dmnid.mask) >>
> > > > + vio_dbgs->dmnid.start;
> > > > +   vio_info->write = ((dbg0 & vio_dbgs->vio_w.mask) >>
> > > > +   vio_dbgs->vio_w.start) == 1;
> > > > +   vio_info->read = ((dbg0 & vio_dbgs->vio_r.mask) >>
> > > > + vio_dbgs->vio_r.start) == 1;
> > > > +   vio_info->vio_addr_high = (dbg0 & vio_dbgs->addr_h.mask) >>
> > > > + vio_dbgs->addr_h.start;
> > >
> > >
> > > I would like to define the type of ctx->vio_info to be
> > >
> > > struct mtk_devapc_vio_dbgs {
> > > u32 mstid:16;
> > > u32 dmnid:6;
> > > u32 vio_w:1;
> > > u32 vio_r:1;
> > > u32 addr_h:4;
> > > u32 resv:4;
> > > };
> > >
> > > so the code could be as simple as
> > >
> > > ctx->vio_info = (struct mtk_devapc_vio_dbgs)readl(vio_dbg1_reg);
> > >
> >
> > This idea looks great! Is it possible to pass the bit layout by
> > DT data, and still keep this operation simple?
> > I am asking because this bit layout is platform
> > dependent.
> 
> I doubt these info would be in a single 32-bits register for all
> future SoC. If they are not in single 32-bits register, you may create
> a vio_dbgs_type in DT data, and the code may be
> 
> if (ctx->vio_dbgs_type == VIO_DBGS_TYPE_MT) {
> ctx->vio_info = (struct mtk_devapc_vio_dbgs)readl(vio_dbg1_reg);
> } else if (ctx->vio_dbgs_type == VIO_DBGS_TYPE_MT) {
> ctx->vio_info->mstid = readl(vio_mstid_reg);
> ctx->vio_info->dmnid = readl(vio_dmnid_reg);
> ctx->vio_info->vio_w = readl(vio_vio_w_reg);
> ctx->vio_info->vio_r = readl(vio_vio_r_reg);
> }
> 
> I think we need not consider what future SoCs will look like. Once the
> second SoC driver is being upstreamed, we can find the best solution
> for it.
> 

Okay, I'll apply this in the next patch.
Thanks !
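
For illustration, here is one way the decode could look in standalone C.
A u32 cannot be cast directly to a struct type in C, and bit-field
ordering is implementation-defined, so this sketch uses explicit shifts
and masks instead. The bit positions are an assumption: they follow the
field widths proposed above (mstid 16, dmnid 6, vio_w 1, vio_r 1,
addr_h 4), packed from bit 0 upwards. The struct and function names are
hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Decoded violation info (hypothetical userspace stand-in). */
struct vio_info {
	uint32_t master_id;
	uint32_t domain_id;
	bool write;
	bool read;
	uint32_t addr_high;
};

/* Extract 'width' bits starting at bit 'start'. */
static uint32_t vio_field(uint32_t reg, unsigned int start, unsigned int width)
{
	return (reg >> start) & ((1u << width) - 1);
}

/* Decode one debug word, assuming the LSB-first layout described above. */
static struct vio_info decode_vio_dbg0(uint32_t dbg0)
{
	struct vio_info v;

	v.master_id = vio_field(dbg0, 0, 16);
	v.domain_id = vio_field(dbg0, 16, 6);
	v.write     = vio_field(dbg0, 22, 1);
	v.read      = vio_field(dbg0, 23, 1);
	v.addr_high = vio_field(dbg0, 24, 4);
	return v;
}
```

With this assumed layout, 0x0A4512AB decodes to master_id 0x12AB,
domain_id 5, write set, read clear, and addr_high 0xA.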

> Regards,
> Chun-Kuang.
> 
> >
> > > Regards,
> > > Chun-Kuang.
> > >
> > > > +
> > > > +   devapc_vio_info_print(ctx);
> > > > +}
> > > > +
> >



[PATCH] driver core: Use the ktime_us_delta() helper

2020-08-02 Thread Zenghui Yu
Use the ktime_us_delta() helper to measure the driver probe time. Given the
helpers already returns an s64 value, let's drop the unnecessary casting to
s64 as well. There is no functional change.
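
As a rough userspace illustration of what the helper computes (a sketch
under stated assumptions, not the kernel implementation):
ktime_us_delta(later, earlier) amounts to converting the difference of
the two timestamps to microseconds. The type and function names below
are hypothetical stand-ins, with timestamps held as signed 64-bit
nanosecond counts.

```c
#include <stdint.h>

/* Hypothetical stand-in for ktime_t: nanoseconds as a signed 64-bit value. */
typedef int64_t fake_ktime_t;

#define FAKE_NSEC_PER_USEC 1000

/* Microseconds elapsed between two timestamps, returned as a 64-bit value. */
static int64_t fake_ktime_us_delta(fake_ktime_t later, fake_ktime_t earlier)
{
	return (later - earlier) / FAKE_NSEC_PER_USEC;
}
```

Because the result is already a 64-bit signed value, it can be passed
straight to a %lld-style format without an extra cast on platforms where
int64_t is long long, which is the situation the patch relies on for s64.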

Signed-off-by: Zenghui Yu 
---
 drivers/base/dd.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/base/dd.c b/drivers/base/dd.c
index 3cefe89e94ca..0ef5b2f32932 100644
--- a/drivers/base/dd.c
+++ b/drivers/base/dd.c
@@ -618,15 +618,14 @@ static int really_probe(struct device *dev, struct 
device_driver *drv)
  */
 static int really_probe_debug(struct device *dev, struct device_driver *drv)
 {
-   ktime_t calltime, delta, rettime;
+   ktime_t calltime, rettime;
int ret;
 
calltime = ktime_get();
ret = really_probe(dev, drv);
rettime = ktime_get();
-   delta = ktime_sub(rettime, calltime);
pr_debug("probe of %s returned %d after %lld usecs\n",
-dev_name(dev), ret, (s64) ktime_to_us(delta));
+dev_name(dev), ret, ktime_us_delta(rettime, calltime));
return ret;
 }
 
-- 
2.19.1



Re: [PATCH v4 2/2] soc: mediatek: add mtk-devapc driver

2020-08-02 Thread Neal Liu
Hi Chun-Kuang,

On Fri, 2020-07-31 at 23:03 +0800, Chun-Kuang Hu wrote:
> Hi, Neal:
> 
> Neal Liu wrote on Fri, Jul 31, 2020 at 10:44 AM:
> >
> > Hi Chun-Kuang,
> >
> >
> > On Thu, 2020-07-30 at 00:38 +0800, Chun-Kuang Hu wrote:
> > > Hi, Neal:
> > >
> > > Neal Liu wrote on Wed, Jul 29, 2020 at 4:29 PM:
> > > >
> > > > MediaTek bus fabric provides TrustZone security support and data
> > > > protection to prevent slaves from being accessed by unexpected
> > > > masters.
> > > > The security violation is logged and sent to the processor for
> > > > further analysis or countermeasures.
> > > >
> > > > Any occurrence of security violation would raise an interrupt, and
> > > > it will be handled by mtk-devapc driver. The violation
> > > > information is printed in order to find the murderer.
> > > >
> > > > Signed-off-by: Neal Liu 
> > > > ---
> > >
> > > [snip]
> > >
> > > > +
> > > > +static int get_shift_group(struct mtk_devapc_context *ctx)
> > > > +{
> > > > +   u32 vio_shift_sta;
> > > > +   void __iomem *reg;
> > > > +
> > > > +   reg = ctx->devapc_pd_base + ctx->offset->vio_shift_sta;
> > > > +   vio_shift_sta = readl(reg);
> > > > +
> > > > +   if (vio_shift_sta)
> > > > +   return __ffs(vio_shift_sta);
> > > > +
> > > > +   return -EIO;
> > > > +}
> > >
> > > get_shift_group() is a small function, I would like to merge this
> > > function into sync_vio_dbg() to make code more simple.
> >
> > This function has a specific purpose, and it makes this
> > driver more readable. I would like to keep it as a function. Is that
> > okay with you?
> After merge, the function would be:
> 
> static bool sync_min_shift_group_vio_dbg(struct mtk_devapc_context *ctx)
> {
> 	int min_shift_group;
> 	int ret;
> 	u32 val;
> 
> 	/* find the minimum shift group which has violation */
> 	val = readl(ctx->devapc_pd_base + ctx->offset->vio_shift_sta);
> 	if (!val)
> 		return false;
> 
> 	min_shift_group = __ffs(val);
> 
> 	/* Assign the group to sync */
> 	writel(0x1 << min_shift_group,
> 	       ctx->devapc_pd_base + ctx->offset->vio_shift_sel);
> 
> 	/* Start syncing */
> 	writel(0x1, ctx->devapc_pd_base + ctx->offset->vio_shift_con);
> 
> 	ret = readl_poll_timeout(ctx->devapc_pd_base + ctx->offset->vio_shift_con,
> 				 val, val == 0x3, 0, PHY_DEVAPC_TIMEOUT);
> 	if (ret) {
> 		dev_err(ctx->dev, "%s: Shift violation info failed\n", __func__);
> 		return false;
> 	}
> 
> 	/* Stop syncing */
> 	writel(0x0, ctx->devapc_pd_base + ctx->offset->vio_shift_con);
> 
> 	/* ? */
> 	writel(0x1 << min_shift_group,
> 	       ctx->devapc_pd_base + ctx->offset->vio_shift_sta);
> 
> 	return true;
> }
> 
> The whole function syncs the violation info of the minimum shift group, so
> I do not see why any part of it should be split into an independent
> function. Every function call costs some CPU time, so I would rather not
> break this function up. With good comments, I think everybody can
> understand what each register does.
> After the merge, the calling code becomes as simple as:
> 
> while (sync_min_shift_group_vio_dbg(ctx))
> 	devapc_extract_vio_dbg(ctx);
> 

Okay, this looks good to me. I'll apply this in the next patch.
Thanks!
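
The "sync the lowest pending group, then extract" loop agreed on above can be modeled outside the kernel. The following is only a minimal user-space sketch, not the driver code: lowest_set_bit() is a hypothetical stand-in for the kernel's __ffs(), the pending word stands in for the VIO_SHIFT_STA register, and clearing the bit after each step assumes that the final write to the status register acknowledges the group (the thread itself leaves that step commented as "?").

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the kernel's __ffs(): index of the lowest set
 * bit. As with __ffs(), the caller must make sure word is not zero.
 */
static unsigned int lowest_set_bit(uint32_t word)
{
	unsigned int bit = 0;

	while (!(word & 1u)) {
		word >>= 1;
		bit++;
	}
	return bit;
}

/*
 * Model of one sync step: pick the lowest pending group and acknowledge it.
 * Returns false when no group has a pending violation.
 */
static bool sync_min_shift_group(uint32_t *pending, unsigned int *group)
{
	if (*pending == 0)
		return false;

	*group = lowest_set_bit(*pending);
	/* Assumption: acknowledging a group clears its status bit. */
	*pending &= ~(UINT32_C(1) << *group);
	return true;
}

/* Drain all pending groups, recording the order in which they are handled. */
static unsigned int drain_groups(uint32_t pending, unsigned int *order,
				 unsigned int max)
{
	unsigned int n = 0;
	unsigned int group;

	while (n < max && sync_min_shift_group(&pending, &group))
		order[n++] = group;
	return n;
}
```

With a pending word of 0x29 (bits 0, 3 and 5 set), drain_groups() handles groups 0, 3 and 5 in that order and then stops, which is the behaviour the two-line while loop relies on.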

> >
> > >
> > > > +
> > >
> > > [snip]
> > >
> > > > +
> > > > +#define PHY_DEVAPC_TIMEOUT 0x1
> > > > +
> > > > +/*
> > > > + * sync_vio_dbg - do the "shift" mechanism to get full violation information.
> > > > + *	The shift mechanism depends on the devapc hardware design.
> > > > + *	MediaTek devapc sets multiple slaves as a group. When a violation
> > > > + *	is triggered, violation info is kept inside the devapc hardware.
> > > > + *	The driver should do the shift mechanism to "shift" full violation
> > > > + *	info to the VIO_DBG registers.
> > > > + *
> > > > + */
> > > > +static int sync_vio_dbg(struct mtk_devapc_context *ctx, u32 shift_bit)
> > > > +{
> > > > +   void __iomem *pd_vio_shift_sta_reg;
> > > > +   void __iomem *pd_vio_shift_sel_reg;
> > > > +   void __iomem *pd_vio_shift_con_reg;
> > > > +   int ret;
> > > > +   u32 val;
> > > > +
> > > > +   pd_vio_shift_sta_reg = ctx->devapc_pd_base + ctx->offset->vio_shift_sta;
> > > > +   pd_vio_shift_sel_reg = ctx->devapc_pd_base + ctx->offset->vio_shift_sel;
> > > > +   pd_vio_shift_con_reg = ctx->devapc_pd_base + ctx->offset->vio_shift_con;
> > > > +
> > > > +   /* Enable shift mechanism */
> > > > +   writel(0x1 << shift_bit, pd_vio_shift_sel_reg);
> > > > +   writel(0x1, pd_vio_shift_con_reg);
> > > > +
> > > > +   ret = readl_poll_timeout(pd_vio_shift_con_reg, val, val == 0x3, 0,
> > > > +                            PHY_DEVAPC_TIMEOUT);
> > > > +   if (ret)
> > > > +           dev_err(ctx->dev, "%s: Shift violation info failed\n", __func__);
> > > > +
> > > > +   /* Disable shift mechanism */
> > > > +   writel(0x0, pd_vio_shift_con_reg);
> > > > +   writel(0x0, 

[PATCH] ASoC: fsl_sai: Clean code for synchronize mode

2020-08-02 Thread Shengjiu Wang
TX synchronous with RX: The RMR does not need to be changed when
TX is enabled; the other configuration in hw_params() is enough for
clock generation. The TCSR.TE does not need to be enabled when only RX
is enabled.

RX synchronous with TX: The TMR does not need to be changed when
RX is enabled; the other configuration in hw_params() is enough for
clock generation. The RCSR.RE does not need to be enabled when only TX
is enabled.

Signed-off-by: Shengjiu Wang 
---
 sound/soc/fsl/fsl_sai.c | 17 ++---
 1 file changed, 10 insertions(+), 7 deletions(-)

diff --git a/sound/soc/fsl/fsl_sai.c b/sound/soc/fsl/fsl_sai.c
index cdff739924e2..a210c9836a9a 100644
--- a/sound/soc/fsl/fsl_sai.c
+++ b/sound/soc/fsl/fsl_sai.c
@@ -482,8 +482,6 @@ static int fsl_sai_hw_params(struct snd_pcm_substream *substream,
 			regmap_update_bits(sai->regmap, FSL_SAI_TCR5(ofs),
 				FSL_SAI_CR5_WNW_MASK | FSL_SAI_CR5_W0W_MASK |
 				FSL_SAI_CR5_FBT_MASK, val_cr5);
-			regmap_write(sai->regmap, FSL_SAI_TMR,
-				~0UL - ((1 << channels) - 1));
 		} else if (!sai->synchronous[RX] && sai->synchronous[TX] && tx) {
 			regmap_update_bits(sai->regmap, FSL_SAI_RCR4(ofs),
 				FSL_SAI_CR4_SYWD_MASK | FSL_SAI_CR4_FRSZ_MASK,
@@ -491,8 +489,6 @@ static int fsl_sai_hw_params(struct snd_pcm_substream *substream,
 			regmap_update_bits(sai->regmap, FSL_SAI_RCR5(ofs),
 				FSL_SAI_CR5_WNW_MASK | FSL_SAI_CR5_W0W_MASK |
 				FSL_SAI_CR5_FBT_MASK, val_cr5);
-			regmap_write(sai->regmap, FSL_SAI_RMR,
-				~0UL - ((1 << channels) - 1));
 		}
 	}
 
@@ -553,11 +549,18 @@ static int fsl_sai_trigger(struct snd_pcm_substream *substream, int cmd,
 		regmap_update_bits(sai->regmap, FSL_SAI_xCSR(tx, ofs),
 				   FSL_SAI_CSR_FRDE, FSL_SAI_CSR_FRDE);
 
-		regmap_update_bits(sai->regmap, FSL_SAI_RCSR(ofs),
-				   FSL_SAI_CSR_TERE, FSL_SAI_CSR_TERE);
-		regmap_update_bits(sai->regmap, FSL_SAI_TCSR(ofs),
+		regmap_update_bits(sai->regmap, FSL_SAI_xCSR(tx, ofs),
 				   FSL_SAI_CSR_TERE, FSL_SAI_CSR_TERE);
 
+		/* Enable opposite direction when necessary */
+		if (!sai->synchronous[TX] && sai->synchronous[RX] && !tx) {
+			regmap_update_bits(sai->regmap, FSL_SAI_xCSR((!tx), ofs),
+					   FSL_SAI_CSR_TERE, FSL_SAI_CSR_TERE);
+		} else if (!sai->synchronous[RX] && sai->synchronous[TX] && tx) {
+			regmap_update_bits(sai->regmap, FSL_SAI_xCSR((!tx), ofs),
+					   FSL_SAI_CSR_TERE, FSL_SAI_CSR_TERE);
+		}
+
 		regmap_update_bits(sai->regmap, FSL_SAI_xCSR(tx, ofs),
 				   FSL_SAI_CSR_xIE_MASK, FSL_SAI_FLAGS);
 		break;
-- 
2.27.0
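
The condition added to fsl_sai_trigger() in the patch above can be read as a small truth table. The following is only a user-space sketch of that condition, not driver code: needs_opposite_direction() is a hypothetical helper name, and sync_tx / sync_rx stand in for sai->synchronous[TX] and sai->synchronous[RX].

```c
#include <stdbool.h>

/*
 * Returns true when starting one direction also requires enabling the
 * opposite direction, because the started direction takes its clocks
 * from the other one.
 *
 * sync_tx: TX runs synchronously with RX (stand-in for synchronous[TX])
 * sync_rx: RX runs synchronously with TX (stand-in for synchronous[RX])
 * tx:      true when the stream being started is the TX stream
 */
static bool needs_opposite_direction(bool sync_tx, bool sync_rx, bool tx)
{
	/* RX is synchronous with TX and RX is being started: enable TX too. */
	if (!sync_tx && sync_rx && !tx)
		return true;

	/* TX is synchronous with RX and TX is being started: enable RX too. */
	if (!sync_rx && sync_tx && tx)
		return true;

	/* Asynchronous mode, or the clock-providing direction itself. */
	return false;
}
```

Under this reading, starting only the clock-providing direction, or running both directions asynchronously, never touches the opposite direction's enable bit, which matches the intent stated in the commit message.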



Re: [PATCH v3] ASoC: fsl-asoc-card: Remove fsl_asoc_card_set_bias_level function

2020-08-02 Thread Nicolin Chen
On Mon, Aug 03, 2020 at 10:13:31AM +0800, Shengjiu Wang wrote:
> With this case:
> aplay -Dhw:x 16khz.wav 24khz.wav
> There is sound distortion for 24khz.wav. The reason is that setting
> PLL of WM8962 with set_bias_level function, the bias level is not
> changed when 24khz.wav is played, then the PLL won't be reset, the
> clock is not correct, so distortion happens.
> 
> The resolution of this issue is to remove fsl_asoc_card_set_bias_level.
> Move PLL configuration to hw_params and hw_free.
> 
> After removing fsl_asoc_card_set_bias_level, also test WM8960 case,
> it can work.
> 
> Fixes: 708b4351f08c ("ASoC: fsl: Add Freescale Generic ASoC Sound Card with 
> ASRC support")
> Signed-off-by: Shengjiu Wang 

Acked-by: Nicolin Chen 


Re: [PATCH v6 2/3] binder: add trace at free transaction.

2020-08-02 Thread Frankie Chang
On Fri, 2020-07-31 at 11:50 -0700, Todd Kjos wrote:
> On Mon, Jul 27, 2020 at 8:28 PM Frankie Chang
>  wrote:
> >
> > From: "Frankie.Chang" 
> >
> > Since the original trace_binder_transaction_received cannot
> > precisely present the real finish time of a transaction, adding a
> > trace_binder_txn_latency_free at the point where the transaction is
> > freed may be closer to it.
> >
> > Signed-off-by: Frankie.Chang 
> > ---
> >  drivers/android/binder.c   |6 ++
> >  drivers/android/binder_trace.h |   27 +++
> >  2 files changed, 33 insertions(+)
> >
> > diff --git a/drivers/android/binder.c b/drivers/android/binder.c
> > index 2df146f..1e6fc40 100644
> > --- a/drivers/android/binder.c
> > +++ b/drivers/android/binder.c
> > @@ -1522,6 +1522,9 @@ static void binder_free_transaction(struct binder_transaction *t)
> >  * If the transaction has no target_proc, then
> >  * t->buffer->transaction has already been cleared.
> >  */
> > +   spin_lock(&t->lock);
> > +   trace_binder_txn_latency_free(t);
> > +   spin_unlock(&t->lock);
> 
> Hmm. I don't prefer taking the lock just to call a trace. It doesn't
> make clear why the lock has to be taken. I'd prefer something like:
> 
> if (trace_binder_txn_latency_free_enabled()) {
c
> }
> 
> And then the trace would use the passed-in values instead of accessing
> via t->to_proc/to_thread.
> 
Then would we still take the lock inside the hook function when the trace
is disabled?

Or we could pass these values to the hook function whether or not the trace
is enabled. I think this way makes it clearer that the lock protects @from,
@to_proc and @to_thread. Then there is no need to take the lock in the hook
function.

int from_proc, from_thread, to_proc, to_thread;

spin_lock(&t->lock);
from_proc = t->from ? t->from->proc->pid : 0;
from_thread = t->from ? t->from->pid : 0;
to_proc = t->to_proc ? t->to_proc->pid : 0;
to_thread = t->to_thread ? t->to_thread->pid : 0;
spin_unlock(&t->lock);
trace_binder_txn_latency_free(t, from_proc, from_thread, to_proc,
			      to_thread);
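
The "snapshot the fields under the lock, then hand plain values to the trace hook" pattern discussed here can be tried out in user space. The following is only a sketch under stated assumptions, not binder code: all *_sketch types are hypothetical, and a C11 atomic_flag spinlock stands in for the kernel spinlock.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the binder structures. */
struct proc_sketch { int pid; };
struct thread_sketch { int pid; struct proc_sketch *proc; };

struct txn_sketch {
	atomic_flag lock;		/* stand-in for the kernel spinlock */
	struct thread_sketch *from;
	struct proc_sketch *to_proc;
	struct thread_sketch *to_thread;
};

/* Plain values that are safe to use after the lock is dropped. */
struct txn_ids { int from_proc, from_thread, to_proc, to_thread; };

static void txn_init(struct txn_sketch *t, struct thread_sketch *from,
		     struct proc_sketch *to_proc,
		     struct thread_sketch *to_thread)
{
	atomic_flag_clear(&t->lock);
	t->from = from;
	t->to_proc = to_proc;
	t->to_thread = to_thread;
}

static void txn_lock(struct txn_sketch *t)
{
	while (atomic_flag_test_and_set_explicit(&t->lock, memory_order_acquire))
		;	/* spin until the lock is free */
}

static void txn_unlock(struct txn_sketch *t)
{
	atomic_flag_clear_explicit(&t->lock, memory_order_release);
}

/* Read every pointer-derived field while holding the lock. */
static struct txn_ids txn_snapshot_ids(struct txn_sketch *t)
{
	struct txn_ids ids;

	txn_lock(t);
	ids.from_proc = t->from ? t->from->proc->pid : 0;
	ids.from_thread = t->from ? t->from->pid : 0;
	ids.to_proc = t->to_proc ? t->to_proc->pid : 0;
	ids.to_thread = t->to_thread ? t->to_thread->pid : 0;
	txn_unlock(t);

	return ids;
}
```

The returned struct holds only integers, so whatever consumes it (a trace hook in the discussion above) does not need to take the lock or dereference pointers that may change after the unlock.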

> > binder_free_txn_fixups(t);
> > kfree(t);
> > binder_stats_deleted(BINDER_STAT_TRANSACTION);
> > @@ -3093,6 +3096,9 @@ static void binder_transaction(struct binder_proc *proc,
> > kfree(tcomplete);
> > binder_stats_deleted(BINDER_STAT_TRANSACTION_COMPLETE);
> >  err_alloc_tcomplete_failed:
> > +   spin_lock(&t->lock);
> > +   trace_binder_txn_latency_free(t);
> > +   spin_unlock(&t->lock);
> > kfree(t);
> > binder_stats_deleted(BINDER_STAT_TRANSACTION);
> >  err_alloc_t_failed:
> > diff --git a/drivers/android/binder_trace.h b/drivers/android/binder_trace.h
> > index 6731c3c..8ac87d1 100644
> > --- a/drivers/android/binder_trace.h
> > +++ b/drivers/android/binder_trace.h
> > @@ -95,6 +95,33 @@
> >   __entry->thread_todo)
> >  );
> >
> > +TRACE_EVENT(binder_txn_latency_free,
> > +   TP_PROTO(struct binder_transaction *t),
> > +   TP_ARGS(t),
> > +   TP_STRUCT__entry(
> > +   __field(int, debug_id)
> > +   __field(int, from_proc)
> > +   __field(int, from_thread)
> > +   __field(int, to_proc)
> > +   __field(int, to_thread)
> > +   __field(unsigned int, code)
> > +   __field(unsigned int, flags)
> > +   ),
> > +   TP_fast_assign(
> > +   __entry->debug_id = t->debug_id;
> > +   __entry->from_proc = t->from ? t->from->proc->pid : 0;
> > +   __entry->from_thread = t->from ? t->from->pid : 0;
> > +   __entry->to_proc = t->to_proc ? t->to_proc->pid : 0;
> > +   __entry->to_thread = t->to_thread ? t->to_thread->pid : 0;
> > +   __entry->code = t->code;
> > +   __entry->flags = t->flags;
> > +   ),
> > +   TP_printk("transaction=%d from %d:%d to %d:%d flags=0x%x code=0x%x",
> > + __entry->debug_id, __entry->from_proc, __entry->from_thread,
> > + __entry->to_proc, __entry->to_thread, __entry->code,
> > + __entry->flags)
> > +);
> > +
> >  TRACE_EVENT(binder_transaction,
> > TP_PROTO(bool reply, struct binder_transaction *t,
> >  struct binder_node *target_node),
> > --
> > 1.7.9.5



Re: [PATCH v4] scsi: ufs: Cleanup completed request without interrupt notification

2020-08-02 Thread Bart Van Assche
On 2020-07-31 16:17, Can Guo wrote:
> For scsi_dma_unmap() part, that is true - we should make it serialized with
> any other completion paths. I've found it during my fault injection test, so
> I've made a patch to fix it, but it only comes in my next error recovery
> enhancement patch series. Please check the attachment.

Hi Can,

It is not clear to me how that patch serializes scsi_dma_unmap() against
other completion paths? Doesn't the regular completion path call
__ufshcd_transfer_req_compl() without holding the host lock?

Thanks,

Bart.



Re: [PATCH] scsi: esas2r: fix possible buffer overflow caused by bad DMA value in esas2r_process_fs_ioctl()

2020-08-02 Thread Jia-Ju Bai




On 2020/8/2 23:47, James Bottomley wrote:
> On Sun, 2020-08-02 at 23:21 +0800, Jia-Ju Bai wrote:
> > Because "fs" is mapped to DMA, its data can be modified at anytime by
> > malicious or malfunctioning hardware. In this case, the check
> > "if (fsc->command >= cmdcnt)" can be passed, and then "fsc->command"
> > can be modified by hardware to cause buffer overflow.
>
> This threat model seems to be completely bogus.  If the device were
> malicious it would have given the mailbox incorrect values a priori ...
> it wouldn't give the correct value then update it.  For most systems we
> do assume correct operation of the device but if there's a worry about
> incorrect operation, the usual approach is to guard the device with an
> IOMMU which, again, would make this sort of fix unnecessary because the
> IOMMU will have removed access to the buffer after the command
> completed.

Thanks for the reply :)

In my opinion, an IOMMU prevents the hardware from accessing
arbitrary memory addresses, but it cannot prevent the hardware from
writing a bad value to a valid memory address.
For this reason, I think the hardware can still access
"fsc->command" and change it to an arbitrary value at any time, because
the IOMMU considers the address of "fsc->command" valid for the hardware.
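
The concern raised in this thread is a double fetch: a value in device-writable memory is checked once and then read again for use. A common mitigation for this class of issue is to read the value once into a local variable and use only that copy. The following is a minimal user-space sketch of that idea, not the esas2r code: the struct, the table, and lookup_command() are hypothetical names, and a volatile read stands in for a single fetch from shared memory.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a structure living in device-writable memory. */
struct fs_cmd_sketch {
	volatile uint8_t command;
};

/* Hypothetical per-command table indexed by the command value. */
static const int cmd_table_sketch[] = { 10, 20, 30 };
#define CMD_COUNT_SKETCH \
	(sizeof(cmd_table_sketch) / sizeof(cmd_table_sketch[0]))

/*
 * Returns the table entry for the command, or -1 if it is out of range.
 * The shared field is fetched exactly once; the bounds check and the
 * array index both use the local copy, so a later change to the shared
 * field cannot bypass the check.
 */
static int lookup_command(const struct fs_cmd_sketch *shared)
{
	uint8_t cmd = shared->command;	/* single fetch */

	if (cmd >= CMD_COUNT_SKETCH)
		return -1;

	return cmd_table_sketch[cmd];
}
```

Whether the threat model justifies such a change in this driver is exactly what the thread is debating; the sketch only shows what the single-fetch pattern looks like.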



Best wishes,
Jia-Ju Bai



linux-next: manual merge of the bpf-next tree with the net-next tree

2020-08-02 Thread Stephen Rothwell
Hi all,

Today's linux-next merge of the bpf-next tree got a conflict in:

  net/core/dev.c

between commit:

  829eb208e80d ("rtnetlink: add support for protodown reason")

from the net-next tree and commits:

  7f0a838254bd ("bpf, xdp: Maintain info on attached XDP BPF programs in net_device")
  aa8d3a716b59 ("bpf, xdp: Add bpf_link-based XDP attachment API")

from the bpf-next tree.

I fixed it up (see below) and can carry the fix as necessary. This
is now fixed as far as linux-next is concerned, but any non trivial
conflicts should be mentioned to your upstream maintainer when your tree
is submitted for merging.  You may also want to consider cooperating
with the maintainer of the conflicting tree to minimise any particularly
complex conflicts.

-- 
Cheers,
Stephen Rothwell

diff --cc net/core/dev.c
index f7ef0f5c5569,c8b911b10187..
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@@ -8715,54 -8712,75 +8711,100 @@@ int dev_change_proto_down_generic(struc
  }
  EXPORT_SYMBOL(dev_change_proto_down_generic);
  
 +/**
 + *dev_change_proto_down_reason - proto down reason
 + *
 + *@dev: device
 + *@mask: proto down mask
 + *@value: proto down value
 + */
 +void dev_change_proto_down_reason(struct net_device *dev, unsigned long mask,
 +u32 value)
 +{
 +  int b;
 +
 +  if (!mask) {
 +  dev->proto_down_reason = value;
 +  } else {
 +  for_each_set_bit(b, &mask, 32) {
 +  if (value & (1 << b))
 +  dev->proto_down_reason |= BIT(b);
 +  else
 +  dev->proto_down_reason &= ~BIT(b);
 +  }
 +  }
 +}
 +EXPORT_SYMBOL(dev_change_proto_down_reason);
 +
- u32 __dev_xdp_query(struct net_device *dev, bpf_op_t bpf_op,
-   enum bpf_netdev_command cmd)
+ struct bpf_xdp_link {
+   struct bpf_link link;
+   struct net_device *dev; /* protected by rtnl_lock, no refcnt held */
+   int flags;
+ };
+ 
+ static enum bpf_xdp_mode dev_xdp_mode(u32 flags)
  {
-   struct netdev_bpf xdp;
+   if (flags & XDP_FLAGS_HW_MODE)
+   return XDP_MODE_HW;
+   if (flags & XDP_FLAGS_DRV_MODE)
+   return XDP_MODE_DRV;
+   return XDP_MODE_SKB;
+ }
  
-   if (!bpf_op)
-   return 0;
+ static bpf_op_t dev_xdp_bpf_op(struct net_device *dev, enum bpf_xdp_mode mode)
+ {
+   switch (mode) {
+   case XDP_MODE_SKB:
+   return generic_xdp_install;
+   case XDP_MODE_DRV:
+   case XDP_MODE_HW:
+   return dev->netdev_ops->ndo_bpf;
+   default:
+   return NULL;
+   };
+ }
  
-   memset(&xdp, 0, sizeof(xdp));
-   xdp.command = cmd;
+ static struct bpf_xdp_link *dev_xdp_link(struct net_device *dev,
+enum bpf_xdp_mode mode)
+ {
+   return dev->xdp_state[mode].link;
+ }
+ 
+ static struct bpf_prog *dev_xdp_prog(struct net_device *dev,
+enum bpf_xdp_mode mode)
+ {
+   struct bpf_xdp_link *link = dev_xdp_link(dev, mode);
+ 
+   if (link)
+   return link->link.prog;
+   return dev->xdp_state[mode].prog;
+ }
+ 
+ u32 dev_xdp_prog_id(struct net_device *dev, enum bpf_xdp_mode mode)
+ {
+   struct bpf_prog *prog = dev_xdp_prog(dev, mode);
  
-   /* Query must always succeed. */
-   WARN_ON(bpf_op(dev, &xdp) < 0 && cmd == XDP_QUERY_PROG);
+   return prog ? prog->aux->id : 0;
+ }
  
-   return xdp.prog_id;
+ static void dev_xdp_set_link(struct net_device *dev, enum bpf_xdp_mode mode,
+struct bpf_xdp_link *link)
+ {
+   dev->xdp_state[mode].link = link;
+   dev->xdp_state[mode].prog = NULL;
  }
  
- static int dev_xdp_install(struct net_device *dev, bpf_op_t bpf_op,
-  struct netlink_ext_ack *extack, u32 flags,
-  struct bpf_prog *prog)
+ static void dev_xdp_set_prog(struct net_device *dev, enum bpf_xdp_mode mode,
+struct bpf_prog *prog)
+ {
+   dev->xdp_state[mode].link = NULL;
+   dev->xdp_state[mode].prog = prog;
+ }
+ 
+ static int dev_xdp_install(struct net_device *dev, enum bpf_xdp_mode mode,
+  bpf_op_t bpf_op, struct netlink_ext_ack *extack,
+  u32 flags, struct bpf_prog *prog)
  {
-   bool non_hw = !(flags & XDP_FLAGS_HW_MODE);
-   struct bpf_prog *prev_prog = NULL;
struct netdev_bpf xdp;
int err;
  


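
The mask/value update performed by dev_change_proto_down_reason() in the merge resolution above can be checked in isolation. The following is a user-space sketch of the same bit logic, not the kernel function: apply_proto_down_reason() is a hypothetical name, it returns the new reason word instead of storing it in a net_device, and a plain loop stands in for for_each_set_bit().

```c
#include <stdint.h>

/*
 * With an empty mask, the whole reason word is replaced by value.
 * Otherwise only the bits selected by mask are copied from value; all
 * other bits of the current reason are left untouched.
 */
static uint32_t apply_proto_down_reason(uint32_t reason, unsigned long mask,
					uint32_t value)
{
	unsigned int b;

	if (!mask)
		return value;

	for (b = 0; b < 32; b++) {
		if (!(mask & (1UL << b)))
			continue;	/* bit not selected by the mask */

		if (value & (UINT32_C(1) << b))
			reason |= UINT32_C(1) << b;
		else
			reason &= ~(UINT32_C(1) << b);
	}

	return reason;
}
```

For example, starting from a reason of 0xF0 with mask 0x11 and value 0x01, bit 0 is set and bit 4 is cleared, giving 0xE1, while bits 5 to 7 are preserved.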


RE: PATCH: rtsx_pci driver - don't disable the rts5229 card reader on Intel NUC boxes

2020-08-02 Thread 吳昊澄 Ricky
Hi Chris,

We don’t think this is our bug.
Writing OC_POWER_DOWN to this register (FPDCTL) is for our power saving
feature, not to disable the reader.
As we mentioned before, we cannot reproduce your case on our side, because we
don’t have the platform (Intel NUC Tall Arches Canyon NUC6CAYH Celeron J345)
to see this issue.
But we think this issue may occur only on this platform; our RTS5229 works
well with the new kernel on all the platforms that we have.

Ricky

> -Original Message-
> From: Chris Clayton [mailto:chris2...@googlemail.com]
> Sent: Monday, August 03, 2020 3:59 AM
> To: LKML; 吳昊澄 Ricky; gre...@linuxfoundation.org; rdun...@infradead.org;
> philqua...@gmail.com; Arnd Bergmann
> Subject: Re: PATCH: rtsx_pci driver - don't disable the rts5229 card reader on
> Intel NUC boxes
> 
> Sorry, I should have said that the patch is against 5.7.12. It applies to 
> upstream,
> but with offsets.
> 
> On 02/08/2020 20:48, Chris Clayton wrote:
> > bede03a579b3 introduced a bug which leaves the rts5229 PCI Express card
> reader on my Intel NUC6CAYH box.
> >
> > The bug is in drivers/misc/cardreader/rtsx_pcr.c. A call to 
> > rtsx_pci_init_ocp()
> was added to rtsx_pci_init_hw().
> > At the call point, pcr->ops->init_ocp is NULL and pcr->option.ocp_en is 0, 
> > so in
> rtsx_pci_init_ocp() the cardreader
> > gets disabled.
> >
> > I've avoided this by making execution of the code that results in the reader being
> disabled conditional on the device
> > not being an RTS5229. Of course, other rtsxxx card readers may also be
> disabled by this bug. I don't have the
> > knowledge to address that, so I'll leave to the driver maintainers.
> >
> > The patch to avoid the bug is attached.
> >
> > Fixes: bede03a579b3 ("misc: rtsx: Enable OCP for rts522a rts524a rts525a
> rts5260")
> > Link: https://marc.info/?l=linux-kernel&m=159105912832257
> > Link: https://bugzilla.kernel.org/show_bug.cgi?id=204003
> > Signed-off-by: Chris Clayton 
> >
> > Chris
> >
> 


Re: [PATCH v4] scsi: ufs: Cleanup completed request without interrupt notification

2020-08-02 Thread Stanley Chu
Hi Can,

On Sat, 2020-08-01 at 07:17 +0800, Can Guo wrote:
> Hi Bart,
> 
> On 2020-08-01 00:51, Bart Van Assche wrote:
> > On 2020-07-31 01:00, Can Guo wrote:
> >> AFAIK, synchronization of scsi_done is not a problem here, because the
> >> scsi layer uses the atomic state of a scsi cmd, namely
> >> SCMD_STATE_COMPLETE, to prevent an abort and a real completion of it
> >> from running concurrently.
> >> 
> >> Check func scsi_times_out(), hope it helps.
> >> 
> >> enum blk_eh_timer_return scsi_times_out(struct request *req)
> >> {
> >> ...
> >>         if (rtn == BLK_EH_DONE) {
> >>                 /*
> >>                  * Set the command to complete first in order to prevent a real
> >>                  * completion from releasing the command while error handling
> >>                  * is using it. If the command was already completed, then the
> >>                  * lower level driver beat the timeout handler, and it is safe
> >>                  * to return without escalating error recovery.
> >>                  *
> >>                  * If timeout handling lost the race to a real completion, the
> >>                  * block layer may ignore that due to a fake timeout injection,
> >>                  * so return RESET_TIMER to allow error handling another shot
> >>                  * at this command.
> >>                  */
> >>                 if (test_and_set_bit(SCMD_STATE_COMPLETE, &scmd->state))
> >>                         return BLK_EH_RESET_TIMER;
> >>                 if (scsi_abort_command(scmd) != SUCCESS) {
> >>                         set_host_byte(scmd, DID_TIME_OUT);
> >>                         scsi_eh_scmd_add(scmd);
> >>                 }
> >>         }
> >> }
> > 
> > I am familiar with this mechanism. My concern is that both the regular
> > completion path and the abort handler must call scsi_dma_unmap() before
> > calling cmd->scsi_done(cmd). I don't see how
> > test_and_set_bit(SCMD_STATE_COMPLETE, &scmd->state) could prevent that
> > the regular completion path and the abort handler call scsi_dma_unmap()
> > concurrently since both calls happen before the SCMD_STATE_COMPLETE bit
> > is set?
> > 
> > Thanks,
> > 
> > Bart.
> 
> For scsi_dma_unmap() part, that is true - we should make it serialized with
> any other completion paths. I've found it during my fault injection test, so
> I've made a patch to fix it, but it only comes in my next error recovery
> enhancement patch series. Please check the attachment.
> 

Your patch looks good to me.

I had the same idea before, but I found that calling scsi_done() (via
__ufshcd_transfer_req_compl()) in ufshcd_abort() caused issues in older
kernels (e.g., 4.14). That has been resolved by the SCMD_STATE_COMPLETE
flag introduced in newer kernels, so your patch makes sense.

Would you mind sending out this draft patch as a formal patch together
with my patch to fix issues in ufshcd_abort()? Our patches aim to fix
cases where a host/device reset is eventually not triggered by the
result of ufshcd_abort(), for example when the command is aborted
successfully, or when the command is not pending in the device and its
doorbell is also cleared.

Thanks,
Stanley Chu

> Thanks,
> 
> Can Guo.
> 
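
The "only one path may complete a command" mechanism discussed in this thread relies on an atomic test-and-set of a state bit. The following is a minimal user-space sketch of that idea using C11 atomics, not the SCSI midlayer code: cmd_sketch, CMD_STATE_COMPLETE_SKETCH and cmd_claim_completion() are hypothetical names, and atomic_fetch_or() stands in for the kernel's test_and_set_bit().

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the SCMD_STATE_COMPLETE bit. */
#define CMD_STATE_COMPLETE_SKETCH 0x1u

/* Hypothetical, simplified stand-in for a SCSI command. */
struct cmd_sketch {
	atomic_uint state;
};

/*
 * Atomically set the COMPLETE bit and report whether this caller was the
 * one that set it. Exactly one caller gets true, no matter how many paths
 * (regular completion, abort, timeout) race on the same command.
 */
static bool cmd_claim_completion(struct cmd_sketch *cmd)
{
	unsigned int old = atomic_fetch_or(&cmd->state,
					   CMD_STATE_COMPLETE_SKETCH);

	return !(old & CMD_STATE_COMPLETE_SKETCH);
}
```

As pointed out earlier in the thread, this only protects work that happens after the bit has been claimed. Anything a driver does before that point, such as unmapping DMA, is not serialized by the bit and needs its own protection.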



Re: [RFC PATCH 0/8] KVM: x86/mmu: Introduce pinned SPTEs framework

2020-08-02 Thread Eric van Tassell

Sean,
What commit did you base your series on?
Thanks.

-evt(Eric van Tassell)

On 7/31/20 4:23 PM, Sean Christopherson wrote:

SEV currently needs to pin guest memory as it doesn't support migrating
encrypted pages.  Introduce a framework in KVM's MMU to support pinning
pages on demand without requiring additional memory allocations, and with
(somewhat hazy) line of sight toward supporting more advanced features for
encrypted guest memory, e.g. host page migration.

The idea is to use a software available bit in the SPTE to track that a
page has been pinned.  The decision to pin a page and the actual pinning
management is handled by vendor code via kvm_x86_ops hooks.  There are
intentionally two hooks (zap and unzap) introduced that are not needed for
SEV.  I included them to again show how the flag (probably renamed?) could
be used for more than just pin/unpin.
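
Tracking state in a software-available bit of a page table entry, as described above, amounts to a few mask operations on a 64-bit value. The following is only an illustrative sketch: the bit position and all names are hypothetical and are not taken from the series, which may use a different bit and different helpers.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical software-available bit used to mark an entry as pinned.
 * Bit 62 is chosen for illustration only; it is not from the series.
 */
#define SPTE_PINNED_SKETCH (UINT64_C(1) << 62)

/* Mark an entry as pinned without touching any other bit. */
static uint64_t spte_mark_pinned(uint64_t spte)
{
	return spte | SPTE_PINNED_SKETCH;
}

/* Clear the pinned mark without touching any other bit. */
static uint64_t spte_clear_pinned(uint64_t spte)
{
	return spte & ~SPTE_PINNED_SKETCH;
}

/* Test whether an entry carries the pinned mark. */
static bool spte_is_pinned(uint64_t spte)
{
	return (spte & SPTE_PINNED_SKETCH) != 0;
}
```

Because the flag lives in the entry itself, no side allocation is needed to remember which pages are pinned, which is the property the cover letter highlights.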

Bugs in the core implementation are pretty much guaranteed.  The basic
concept has been tested, but in a fairly different incarnation.  Most
notably, tagging PRESENT SPTEs as PINNED has not been tested, although
using the PINNED flag to track zapped (and known to be pinned) SPTEs has
been tested.  I cobbled this variation together fairly quickly to get the
code out there for discussion.

The last patch to pin SEV pages during sev_launch_update_data() is
incomplete; it's there to show how we might leverage MMU-based pinning to
support pinning pages before the guest is live.

Sean Christopherson (8):
   KVM: x86/mmu: Return old SPTE from mmu_spte_clear_track_bits()
   KVM: x86/mmu: Use bits 2:0 to check for present SPTEs
   KVM: x86/mmu: Refactor handling of not-present SPTEs in mmu_set_spte()
   KVM: x86/mmu: Add infrastructure for pinning PFNs on demand
   KVM: SVM: Use the KVM MMU SPTE pinning hooks to pin pages on demand
   KVM: x86/mmu: Move 'pfn' variable to caller of direct_page_fault()
   KVM: x86/mmu: Introduce kvm_mmu_map_tdp_page() for use by SEV
   KVM: SVM: Pin SEV pages in MMU during sev_launch_update_data()

  arch/x86/include/asm/kvm_host.h |   7 ++
  arch/x86/kvm/mmu.h  |   3 +
  arch/x86/kvm/mmu/mmu.c  | 186 +---
  arch/x86/kvm/mmu/paging_tmpl.h  |   3 +-
  arch/x86/kvm/svm/sev.c  | 141 +++-
  arch/x86/kvm/svm/svm.c  |   3 +
  arch/x86/kvm/svm/svm.h  |   3 +
  7 files changed, 302 insertions(+), 44 deletions(-)


