Re: [PATCH v3] ASoC: fsl_ssi: Fix bclk calculation for mono channel

2020-06-15 Thread Nicolin Chen
On Tue, Jun 16, 2020 at 10:53:48AM +0800, Shengjiu Wang wrote:
> For mono channel, SSI will switch to Normal mode.
> 
> In Normal mode and Network mode, the Word Length Control bits
> control the word length divider in clock generator, which is
> different with I2S Master mode (the word length is fixed to
> 32bit), it should be the value of params_width(hw_params).
> 
> The condition "slots == 2" is not good for I2S Master mode,
> because for Network mode and Normal mode, the slots can also
> be 2. Then we need to use (ssi->i2s_net & SSI_SCR_I2S_MODE_MASK)
> to check if it is I2S Master mode.
> 
> So we refine the formula for mono channel, otherwise there
> will be sound issue for S24_LE.
> 
> Fixes: b0a7043d5c2c ("ASoC: fsl_ssi: Caculate bit clock rate using slot 
> number and width")
> Signed-off-by: Shengjiu Wang 

Reviewed-by: Nicolin Chen 


[PATCH v3] All arch: remove system call sys_sysctl

2020-06-15 Thread Xiaoming Ni
Since the commit 61a47c1ad3a4dc ("sysctl: Remove the sysctl system call"),
sys_sysctl is actually unavailable: any input can only return an error.

We have been warning about people using the sysctl system call for years
and believe there are no more users.  Even if there are users of this
interface if they have not complained or fixed their code by now they
probably are not going to, so there is no point in warning them any
longer.

So completely remove sys_sysctl on all architectures.

Signed-off-by: Xiaoming Ni 
Acked-by: Will Deacon 
Acked-by: "Eric W. Biederman" 

changes in v3:
  restore include/uapi/linux/sysctl.h
  rebase on commit bc7d17d55762 ("Add linux-next specific files for 20200615")
  Conflicts:
arch/sh/include/uapi/asm/unistd_64.h
arch/sh/kernel/syscalls_64.S

v2:
  According to Kees Cook's suggestion, completely remove sys_sysctl on all arch
  According to Eric W. Biederman's suggestion, update the commit log

V1: 
https://lore.kernel.org/lkml/1591683605-8585-1-git-send-email-nixiaom...@huawei.com/
  Delete the code of sys_sysctl and return -ENOSYS directly at the function 
entry
---
 arch/alpha/kernel/syscalls/syscall.tbl|   2 +-
 arch/arm/configs/am200epdkit_defconfig|   1 -
 arch/arm/tools/syscall.tbl|   2 +-
 arch/arm64/include/asm/unistd32.h |   4 +-
 arch/ia64/kernel/syscalls/syscall.tbl |   2 +-
 arch/m68k/kernel/syscalls/syscall.tbl |   2 +-
 arch/microblaze/kernel/syscalls/syscall.tbl   |   2 +-
 arch/mips/configs/cu1000-neo_defconfig|   1 -
 arch/mips/kernel/syscalls/syscall_n32.tbl |   2 +-
 arch/mips/kernel/syscalls/syscall_n64.tbl |   2 +-
 arch/mips/kernel/syscalls/syscall_o32.tbl |   2 +-
 arch/parisc/kernel/syscalls/syscall.tbl   |   2 +-
 arch/powerpc/kernel/syscalls/syscall.tbl  |   2 +-
 arch/s390/kernel/syscalls/syscall.tbl |   2 +-
 arch/sh/configs/dreamcast_defconfig   |   1 -
 arch/sh/configs/espt_defconfig|   1 -
 arch/sh/configs/hp6xx_defconfig   |   1 -
 arch/sh/configs/landisk_defconfig |   1 -
 arch/sh/configs/lboxre2_defconfig |   1 -
 arch/sh/configs/microdev_defconfig|   1 -
 arch/sh/configs/migor_defconfig   |   1 -
 arch/sh/configs/r7780mp_defconfig |   1 -
 arch/sh/configs/r7785rp_defconfig |   1 -
 arch/sh/configs/rts7751r2d1_defconfig |   1 -
 arch/sh/configs/rts7751r2dplus_defconfig  |   1 -
 arch/sh/configs/se7206_defconfig  |   1 -
 arch/sh/configs/se7343_defconfig  |   1 -
 arch/sh/configs/se7619_defconfig  |   1 -
 arch/sh/configs/se7705_defconfig  |   1 -
 arch/sh/configs/se7750_defconfig  |   1 -
 arch/sh/configs/se7751_defconfig  |   1 -
 arch/sh/configs/secureedge5410_defconfig  |   1 -
 arch/sh/configs/sh03_defconfig|   1 -
 arch/sh/configs/sh7710voipgw_defconfig|   1 -
 arch/sh/configs/sh7757lcr_defconfig   |   1 -
 arch/sh/configs/sh7763rdp_defconfig   |   1 -
 arch/sh/configs/shmin_defconfig   |   1 -
 arch/sh/configs/titan_defconfig   |   1 -
 arch/sh/kernel/syscalls/syscall.tbl   |   2 +-
 arch/sparc/kernel/syscalls/syscall.tbl|   2 +-
 arch/x86/entry/syscalls/syscall_32.tbl|   2 +-
 arch/x86/entry/syscalls/syscall_64.tbl|   2 +-
 arch/xtensa/kernel/syscalls/syscall.tbl   |   2 +-
 include/linux/compat.h|   1 -
 include/linux/syscalls.h  |   2 -
 include/linux/sysctl.h|   6 +-
 kernel/Makefile   |   2 +-
 kernel/sys_ni.c   |   1 -
 kernel/sysctl_binary.c| 171 --
 .../arch/powerpc/entry/syscalls/syscall.tbl   |   2 +-
 .../perf/arch/s390/entry/syscalls/syscall.tbl |   2 +-
 .../arch/x86/entry/syscalls/syscall_64.tbl|   2 +-
 52 files changed, 24 insertions(+), 227 deletions(-)
 delete mode 100644 kernel/sysctl_binary.c

diff --git a/arch/alpha/kernel/syscalls/syscall.tbl 
b/arch/alpha/kernel/syscalls/syscall.tbl
index b249824c8267..0da7f1c61529 100644
--- a/arch/alpha/kernel/syscalls/syscall.tbl
+++ b/arch/alpha/kernel/syscalls/syscall.tbl
@@ -249,7 +249,7 @@
 316common  mlockallsys_mlockall
 317common  munlockall  sys_munlockall
 318common  sysinfo sys_sysinfo
-319common  _sysctl sys_sysctl
+319common  _sysctl sys_ni_syscall
 # 320 was sys_idle
 321common  oldumount   sys_oldumount
 322common  swapon  sys_swapon
diff --git a/arch/arm/configs/am200epdkit_defconfig 
b/arch/arm/configs/am200epdkit_defconfig
index f56ac394caf1..4e49d6cb2f62 100644
--- a/arch/arm

Re: [PATCH 2/2] cpufreq: Specify default governor on command line

2020-06-15 Thread Viresh Kumar
On 15-06-20, 17:55, Quentin Perret wrote:
> +static void cpufreq_get_default_governor(void)
> +{
> + default_governor = cpufreq_parse_governor(cpufreq_param_governor);
> + if (!default_governor) {
> + if (*cpufreq_param_governor)
> + pr_warn("Failed to find %s\n", cpufreq_param_governor);
> + default_governor = cpufreq_default_governor();

A module_get() never happened for this case and so maybe a
module_put() should never get called.

> + }
> +}
> +
> +static void cpufreq_put_default_governor(void)
> +{
> + if (!default_governor)
> + return;
> + module_put(default_governor->owner);
> + default_governor = NULL;
> +}
> +
>  static int cpufreq_init_governor(struct cpufreq_policy *policy)
>  {
>   int ret;
> @@ -2701,6 +2721,8 @@ int cpufreq_register_driver(struct cpufreq_driver 
> *driver_data)
>  
>   if (driver_data->setpolicy)
>   driver_data->flags |= CPUFREQ_CONST_LOOPS;
> + else
> + cpufreq_get_default_governor();
>  
>   if (cpufreq_boost_supported()) {
>   ret = create_boost_sysfs_file();
> @@ -2769,6 +2791,7 @@ int cpufreq_unregister_driver(struct cpufreq_driver 
> *driver)
>   subsys_interface_unregister(_interface);
>   remove_boost_sysfs_file();
>   cpuhp_remove_state_nocalls_cpuslocked(hp_online);
> + cpufreq_put_default_governor();
>  
>   write_lock_irqsave(_driver_lock, flags);
>  
> @@ -2792,4 +2815,5 @@ static int __init cpufreq_core_init(void)
>   return 0;
>  }

And since this is a per boot thing, there is perhaps no need of doing
these at driver register/unregister, I would rather do it at:
cpufreq_core_init() time itself and so we will never need to run
cpufreq_put_default_governor() and so can be removed.

And another thing I am not able to understand (despite you commenting
about that in the commit log) is what happens if the default governor
chosen is built as a module ?

-- 
viresh


Re: [PATCH 1/2] cpufreq: Register governors at core_initcall

2020-06-15 Thread Viresh Kumar
On 15-06-20, 17:55, Quentin Perret wrote:
> Currently, most CPUFreq governors are registered at core_initcall time
> when used as default, and module_init otherwise. In preparation for
> letting users specify the default governor on the kernel command line,
> change all of them to use core_initcall unconditionally, as is already
> the case for schedutil and performance. This will enable us to assume
> builtin governors have been registered before the builtin CPUFreq
> drivers probe.
> 
> And since all governors now have similar init/exit patterns, introduce
> two new macros cpufreq_governor_{init,exit}() to factorize the code.
> 
> Signed-off-by: Quentin Perret 
> ---
> Note: I couldn't boot-test the change to spudemand, by lack of hardware.
> But I can confirm cell_defconfig compiles just fine.
> ---
>  .../platforms/cell/cpufreq_spudemand.c| 26 ++-
>  drivers/cpufreq/cpufreq_conservative.c| 22 
>  drivers/cpufreq/cpufreq_ondemand.c| 24 +
>  drivers/cpufreq/cpufreq_performance.c | 14 ++
>  drivers/cpufreq/cpufreq_powersave.c   | 18 +++--
>  drivers/cpufreq/cpufreq_userspace.c   | 18 +++--
>  include/linux/cpufreq.h   | 14 ++
>  kernel/sched/cpufreq_schedutil.c  |  6 +
>  8 files changed, 36 insertions(+), 106 deletions(-)

Acked-by: Viresh Kumar 

-- 
viresh


Re: [PATCH 00/17] spelling.txt: /decriptors/descriptors/

2020-06-15 Thread Martin K. Petersen
On Tue, 9 Jun 2020 13:45:53 +0100, Kieran Bingham wrote:

> I wouldn't normally go through spelling fixes, but I caught sight of
> this typo twice, and then foolishly grepped the tree for it, and saw how
> pervasive it was.
> 
> so here I am ... fixing a typo globally... but with an addition in
> scripts/spelling.txt so it shouldn't re-appear ;-)
> 
> [...]

Applied to 5.9/scsi-queue, thanks!

[06/17] scsi: Fix trivial spelling
https://git.kernel.org/mkp/scsi/c/0a19a725c0ed

-- 
Martin K. Petersen  Oracle Linux Engineering


Re: [PATCH v4 1/3] mm/slab: Use memzero_explicit() in kzfree()

2020-06-15 Thread Eric Biggers
On Mon, Jun 15, 2020 at 09:57:16PM -0400, Waiman Long wrote:
> The kzfree() function is normally used to clear some sensitive
> information, like encryption keys, in the buffer before freeing it back
> to the pool. Memset() is currently used for the buffer clearing. However,
> it is entirely possible that the compiler may choose to optimize away the
> memory clearing especially if LTO is being used. To make sure that this
> optimization will not happen, memzero_explicit(), which is introduced
> in v3.18, is now used in kzfree() to do the clearing.
> 
> Fixes: 3ef0e5ba4673 ("slab: introduce kzfree()")
> Cc: sta...@vger.kernel.org
> Signed-off-by: Waiman Long 
> ---
>  mm/slab_common.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/mm/slab_common.c b/mm/slab_common.c
> index 9e72ba224175..37d48a56431d 100644
> --- a/mm/slab_common.c
> +++ b/mm/slab_common.c
> @@ -1726,7 +1726,7 @@ void kzfree(const void *p)
>   if (unlikely(ZERO_OR_NULL_PTR(mem)))
>   return;
>   ks = ksize(mem);
> - memset(mem, 0, ks);
> + memzero_explicit(mem, ks);
>   kfree(mem);
>  }
>  EXPORT_SYMBOL(kzfree);

This is a good change, but the commit message isn't really accurate.  AFAIK, no
one has found any case where this memset() gets optimized out.  And even with
LTO, it would be virtually impossible due to all the synchronization and global
data structures that kfree() uses.  (Remember that this isn't the C standard
function "free()", so the compiler can't assign it any special meaning.)
Not to mention that LTO support isn't actually upstream yet.

I still agree with the change, but it might be helpful if the commit message
were honest that this is really a hardening measure and about properly conveying
the intent.  As-is this sounds like a critical fix, which might confuse people.

- Eric


Re: [PATCH] kbuild: reuse vmlinux.o in vmlinux_link

2020-06-15 Thread Masahiro Yamada
On Tue, Jun 16, 2020 at 6:47 AM Sami Tolvanen  wrote:
>
> On Sat, May 23, 2020 at 8:13 AM Masahiro Yamada  wrote:
> >
> > Hi Nicholas,
> > (+CC: Sam Ravnborg)
> >
> >
> > On Sat, May 23, 2020 at 7:06 PM Nicholas Piggin  wrote:
> > >
> > > Excerpts from Masahiro Yamada's message of May 23, 2020 3:44 am:
> > > > + Michael, and PPC ML.
> > > >
> > > > They may know something about the reason of failure.
> > >
> > > Because the linker can't put branch stubs within object code sections,
> > > so when you incrementally link them too large, the linker can't resolve
> > > branches into other object files.
> >
> >
> > Ah, you are right.
> >
> > So, this is a problem not only for PPC
> > but also for ARM (both 32 and 64 bit), etc.
> >
> > ARM needs to insert a veneer to jump far.
> >
> > Prior to thin archive, we could not compile
> > ARCH=arm allyesconfig because
> > drivers/built-in.o was too large.
> >
> > This patch gets us back to the too large
> > incremental object situation.
> >
> > With my quick compile-testing,
> > ARCH=arm allyesconfig
> > and ARCH=arm64 allyesconfig are broken.
>
> Thanks for looking into this! Clang doesn't appear to have this issue
> with LTO because it always enables both -ffunction-sections and
> -fdata-sections. I confirmed that -ffunction-sections also fixes arm64
> allyesconfig with this patch. While I'm fine with reusing vmlinux.o
> only with LTO, how would you feel about enabling -ffunction-sections
> in the kernel by default?


I am OK if it works.

Please do compile tests for some architectures.
(especially, ARCH=powerpc defconfig, and ARCH=arm(64) allyesconfig)


Thank you.




-- 
Best Regards
Masahiro Yamada


[PATCH v3] ASoC: fsl_ssi: Fix bclk calculation for mono channel

2020-06-15 Thread Shengjiu Wang
For mono channel, SSI will switch to Normal mode.

In Normal mode and Network mode, the Word Length Control bits
control the word length divider in clock generator, which is
different with I2S Master mode (the word length is fixed to
32bit), it should be the value of params_width(hw_params).

The condition "slots == 2" is not good for I2S Master mode,
because for Network mode and Normal mode, the slots can also
be 2. Then we need to use (ssi->i2s_net & SSI_SCR_I2S_MODE_MASK)
to check if it is I2S Master mode.

So we refine the formula for mono channel, otherwise there
will be sound issue for S24_LE.

Fixes: b0a7043d5c2c ("ASoC: fsl_ssi: Caculate bit clock rate using slot number 
and width")
Signed-off-by: Shengjiu Wang 
---
changes in v3
- update according to Nicolin's comments

changes in v2
- refine patch for Network mode and Normal mode.

 sound/soc/fsl/fsl_ssi.c | 13 +
 1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/sound/soc/fsl/fsl_ssi.c b/sound/soc/fsl/fsl_ssi.c
index bad89b0d129e..1a2fa7f18142 100644
--- a/sound/soc/fsl/fsl_ssi.c
+++ b/sound/soc/fsl/fsl_ssi.c
@@ -678,8 +678,9 @@ static int fsl_ssi_set_bclk(struct snd_pcm_substream 
*substream,
struct regmap *regs = ssi->regs;
u32 pm = 999, div2, psr, stccr, mask, afreq, factor, i;
unsigned long clkrate, baudrate, tmprate;
-   unsigned int slots = params_channels(hw_params);
-   unsigned int slot_width = 32;
+   unsigned int channels = params_channels(hw_params);
+   unsigned int slot_width = params_width(hw_params);
+   unsigned int slots = 2;
u64 sub, savesub = 10;
unsigned int freq;
bool baudclk_is_used;
@@ -688,10 +689,14 @@ static int fsl_ssi_set_bclk(struct snd_pcm_substream 
*substream,
/* Override slots and slot_width if being specifically set... */
if (ssi->slots)
slots = ssi->slots;
-   /* ...but keep 32 bits if slots is 2 -- I2S Master mode */
-   if (ssi->slot_width && slots != 2)
+   if (ssi->slot_width)
slot_width = ssi->slot_width;
 
+   /* ...but force 32 bits for stereo audio using I2S Master Mode */
+   if (channels == 2 &&
+   (ssi->i2s_net & SSI_SCR_I2S_MODE_MASK) == SSI_SCR_I2S_MODE_MASTER)
+   slot_width = 32;
+
/* Generate bit clock based on the slot number and slot width */
freq = slots * slot_width * params_rate(hw_params);
 
-- 
2.21.0



[PATCH 2/2] cpufreq: Specify default governor on command line

2020-06-15 Thread Quentin Perret
Currently, the only way to specify the default CPUfreq governor is via
Kconfig options, which suits users who can build the kernel themselves
perfectly.

However, for those who use a distro-like kernel (such as Android, with
the Generic Kernel Image project), the only way to use a different
default is to boot to userspace, and to then switch using the sysfs
interface. Being able to specify the default governor on the command
line, like is the case for cpuidle, would enable those users to specify
their governor of choice earlier on, and to simplify slighlty the
userspace boot procedure.

To support this use-case, add a kernel command line parameter enabling
to specify a default governor for CPUfreq, which takes precedence over
the builtin default.

This implementation has one notable limitation: the default governor
must be registered before the driver. This is solved for builtin
governors and drivers using appropriate *_initcall() functions. And in
the modular case, this must be reflected as a constraint on the module
loading order.

Signed-off-by: Quentin Perret 
---
 .../admin-guide/kernel-parameters.txt |  5 +++
 Documentation/admin-guide/pm/cpufreq.rst  |  6 ++--
 drivers/cpufreq/cpufreq.c | 34 ---
 3 files changed, 37 insertions(+), 8 deletions(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt 
b/Documentation/admin-guide/kernel-parameters.txt
index fb95fad81c79..5fd3c9f187eb 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -703,6 +703,11 @@
cpufreq.off=1   [CPU_FREQ]
disable the cpufreq sub-system
 
+   cpufreq.default_governor=
+   [CPU_FREQ] Name of the default cpufreq governor to use.
+   This governor must be registered in the kernel before
+   the cpufreq driver probes.
+
cpu_init_udelay=N
[X86] Delay for N microsec between assert and de-assert
of APIC INIT to start processors.  This delay occurs
diff --git a/Documentation/admin-guide/pm/cpufreq.rst 
b/Documentation/admin-guide/pm/cpufreq.rst
index 0c74a7784964..368e612145d2 100644
--- a/Documentation/admin-guide/pm/cpufreq.rst
+++ b/Documentation/admin-guide/pm/cpufreq.rst
@@ -147,9 +147,9 @@ CPUs in it.
 
 The next major initialization step for a new policy object is to attach a
 scaling governor to it (to begin with, that is the default scaling governor
-determined by the kernel configuration, but it may be changed later
-via ``sysfs``).  First, a pointer to the new policy object is passed to the
-governor's ``->init()`` callback which is expected to initialize all of the
+determined by the kernel command line or configuration, but it may be changed
+later via ``sysfs``).  First, a pointer to the new policy object is passed to
+the governor's ``->init()`` callback which is expected to initialize all of the
 data structures necessary to handle the given policy and, possibly, to add
 a governor ``sysfs`` interface to it.  Next, the governor is started by
 invoking its ``->start()`` callback.
diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index 0128de3603df..0f05caedc320 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -50,6 +50,9 @@ static LIST_HEAD(cpufreq_governor_list);
 #define for_each_governor(__governor)  \
list_for_each_entry(__governor, _governor_list, governor_list)
 
+static char cpufreq_param_governor[CPUFREQ_NAME_LEN];
+static struct cpufreq_governor *default_governor;
+
 /**
  * The "cpufreq driver" - the arch- or hardware-dependent low
  * level driver of CPUFreq support, and its spinlock. This lock
@@ -1055,7 +1058,6 @@ __weak struct cpufreq_governor 
*cpufreq_default_governor(void)
 
 static int cpufreq_init_policy(struct cpufreq_policy *policy)
 {
-   struct cpufreq_governor *def_gov = cpufreq_default_governor();
struct cpufreq_governor *gov = NULL;
unsigned int pol = CPUFREQ_POLICY_UNKNOWN;
 
@@ -1065,8 +1067,8 @@ static int cpufreq_init_policy(struct cpufreq_policy 
*policy)
if (gov) {
pr_debug("Restoring governor %s for cpu %d\n",
 policy->governor->name, policy->cpu);
-   } else if (def_gov) {
-   gov = def_gov;
+   } else if (default_governor) {
+   gov = default_governor;
} else {
return -ENODATA;
}
@@ -1074,8 +1076,8 @@ static int cpufreq_init_policy(struct cpufreq_policy 
*policy)
/* Use the default policy if there is no last_policy. */
if (policy->last_policy) {
pol = policy->last_policy;
-   } else if (def_gov) {
-   pol = cpufreq_parse_policy(def_gov->name);
+

Re: [PATCH v13 3/6] powerpc/papr_scm: Fetch nvdimm health information from PHYP

2020-06-15 Thread Ira Weiny
On Mon, Jun 15, 2020 at 06:14:04PM +0530, Vaibhav Jain wrote:
> Implement support for fetching nvdimm health information via
> H_SCM_HEALTH hcall as documented in Ref[1]. The hcall returns a pair
> of 64-bit bitmap, bitwise-and of which is then stored in
> 'struct papr_scm_priv' and subsequently partially exposed to
> user-space via newly introduced dimm specific attribute
> 'papr/flags'. Since the hcall is costly, the health information is
> cached and only re-queried, 60s after the previous successful hcall.
> 
> The patch also adds a  documentation text describing flags reported by
> the the new sysfs attribute 'papr/flags' is also introduced at
> Documentation/ABI/testing/sysfs-bus-papr-pmem.
> 
> [1] commit 58b278f568f0 ("powerpc: Provide initial documentation for
> PAPR hcalls")
> 
> Cc: "Aneesh Kumar K . V" 
> Cc: Dan Williams 
> Cc: Michael Ellerman 
> Cc: Ira Weiny 

Reviewed-by: Ira Weiny 

> Signed-off-by: Vaibhav Jain 
> ---
> Changelog:
> 
> v12..v13:
> * None
> 
> v11..v12:
> * None
> 
> v10..v11:
> * None
> 
> v9..v10:
> * Removed an avoidable 'goto' in __drc_pmem_query_health. [ Ira ].
> 
> Resend:
> * Added ack from Aneesh.
> 
> v8..v9:
> * Rename some variables and defines to reduce usage of term SCM
>   replacing it with PMEM [Dan Williams, Aneesh]
> * s/PAPR_SCM_DIMM/PAPR_PMEM/g
> * s/papr_scm_nd_attributes/papr_nd_attributes/g
> * s/papr_scm_nd_attribute_group/papr_nd_attribute_group/g
> * s/papr_scm_dimm_attr_groups/papr_nd_attribute_groups/g
> * Renamed file sysfs-bus-papr-scm to sysfs-bus-papr-pmem
> 
> v7..v8:
> * Update type of variable 'rc' in __drc_pmem_query_health() and
>   drc_pmem_query_health() to long and int respectively. [ Ira ]
> * Updated the patch description to s/64 bit Big Endian Number/64-bit
>   bitmap/ [ Ira, Aneesh ].
> 
> Resend:
> * None
> 
> v6..v7 :
> * Used the exported buf_seq_printf() function to generate content for
>   'papr/flags'
> * Moved the PAPR_SCM_DIMM_* bit-flags macro definitions to papr_scm.c
>   and removed the papr_scm.h file [Mpe]
> * Some minor consistency issued in sysfs-bus-papr-scm
>   documentation. [Mpe]
> * s/dimm_mutex/health_mutex/g [Mpe]
> * Split drc_pmem_query_health() into two function one of which takes
>   care of caching and locking. [Mpe]
> * Fixed a local copy creation of dimm health information using
>   READ_ONCE(). [Mpe]
> 
> v5..v6 :
> * Change the flags sysfs attribute from 'papr_flags' to 'papr/flags'
>   [Dan Williams]
> * Include documentation for 'papr/flags' attr [Dan Williams]
> * Change flag 'save_fail' to 'flush_fail' [Dan Williams]
> * Caching of health bitmap to reduce expensive hcalls [Dan Williams]
> * Removed usage of PPC_BIT from 'papr-scm.h' header [Mpe]
> * Replaced two __be64 integers from papr_scm_priv to a single u64
>   integer [Mpe]
> * Updated patch description to reflect the changes made in this
>   version.
> * Removed avoidable usage of 'papr_scm_priv.dimm_mutex' from
>   flags_show() [Dan Williams]
> 
> v4..v5 :
> * None
> 
> v3..v4 :
> * None
> 
> v2..v3 :
> * Removed PAPR_SCM_DIMM_HEALTH_NON_CRITICAL as a condition for
>NVDIMM unarmed [Aneesh]
> 
> v1..v2 :
> * New patch in the series.
> ---
>  Documentation/ABI/testing/sysfs-bus-papr-pmem |  27 +++
>  arch/powerpc/platforms/pseries/papr_scm.c | 168 +-
>  2 files changed, 193 insertions(+), 2 deletions(-)
>  create mode 100644 Documentation/ABI/testing/sysfs-bus-papr-pmem
> 
> diff --git a/Documentation/ABI/testing/sysfs-bus-papr-pmem 
> b/Documentation/ABI/testing/sysfs-bus-papr-pmem
> new file mode 100644
> index ..5b10d036a8d4
> --- /dev/null
> +++ b/Documentation/ABI/testing/sysfs-bus-papr-pmem
> @@ -0,0 +1,27 @@
> +What:/sys/bus/nd/devices/nmemX/papr/flags
> +Date:Apr, 2020
> +KernelVersion:   v5.8
> +Contact: linuxppc-dev , 
> linux-nvd...@lists.01.org,
> +Description:
> + (RO) Report flags indicating various states of a
> + papr-pmem NVDIMM device. Each flag maps to a one or
> + more bits set in the dimm-health-bitmap retrieved in
> + response to H_SCM_HEALTH hcall. The details of the bit
> + flags returned in response to this hcall is available
> + at 'Documentation/powerpc/papr_hcalls.rst' . Below are
> + the flags reported in this sysfs file:
> +
> + * "not_armed"   : Indicates that NVDIMM contents will not
> +   survive a power cycle.
> + * "flush_fail"  : Indicates that NVDIMM contents
> +   couldn't be flushed during last
> +   shut-down event.
> + * "restore_fail": Indicates that NVDIMM contents
> +   couldn't be restored during NVDIMM
> +   initialization.
> + * "encrypted"   : NVDIMM contents are encrypted.
> + * "smart_notify": There is health event for the 

Re: [PATCH v2] ASoC: fsl_ssi: Fix bclk calculation for mono channel

2020-06-15 Thread Shengjiu Wang
On Tue, Jun 16, 2020 at 9:59 AM Nicolin Chen  wrote:
>
> On Tue, Jun 16, 2020 at 09:48:39AM +0800, Shengjiu Wang wrote:
> > On Tue, Jun 16, 2020 at 7:11 AM Nicolin Chen  wrote:
> > >
> > > On Mon, Jun 15, 2020 at 01:56:18PM +0800, Shengjiu Wang wrote:
> > > > For mono channel, SSI will switch to Normal mode.
> > > >
> > > > In Normal mode and Network mode, the Word Length Control bits
> > > > control the word length divider in clock generator, which is
> > > > different with I2S Master mode (the word length is fixed to
> > > > 32bit), it should be the value of params_width(hw_params).
> > > >
> > > > The condition "slots == 2" is not good for I2S Master mode,
> > > > because for Network mode and Normal mode, the slots can also
> > > > be 2. Then we need to use (ssi->i2s_net & SSI_SCR_I2S_MODE_MASK)
> > > > to check if it is I2S Master mode.
> > >
> > > The fsl_ssi_set_bclk is only called when fsl_ssi_is_i2s_master,
> > > though I agree that that line of comments sounds confusing now.
> >
> > Actually I think fsl_ssi_is_i2s_master is not accurate, it just checks
> > the Master mode,  but didn't check the I2S mode.
> >
> > >
> > > > So we refine the famula for mono channel, otherwise there
> > >
> > > famula => formula?
> > >
> > > > will be sound issue for S24_LE.
> > > >
> > > > Fixes: b0a7043d5c2c ("ASoC: fsl_ssi: Caculate bit clock rate using slot 
> > > > number and width")
> > > > Signed-off-by: Shengjiu Wang 
> > > > ---
> > > > changes in v2
> > > > - refine patch for Network mode and Normal mode.
> > > >
> > > >  sound/soc/fsl/fsl_ssi.c | 15 +++
> > > >  1 file changed, 11 insertions(+), 4 deletions(-)
> > > >
> > > > diff --git a/sound/soc/fsl/fsl_ssi.c b/sound/soc/fsl/fsl_ssi.c
> > > > index bad89b0d129e..cbf67d132fda 100644
> > > > --- a/sound/soc/fsl/fsl_ssi.c
> > > > +++ b/sound/soc/fsl/fsl_ssi.c
> > > > @@ -678,7 +678,8 @@ static int fsl_ssi_set_bclk(struct 
> > > > snd_pcm_substream *substream,
> > > >   struct regmap *regs = ssi->regs;
> > > >   u32 pm = 999, div2, psr, stccr, mask, afreq, factor, i;
> > > >   unsigned long clkrate, baudrate, tmprate;
> > > > - unsigned int slots = params_channels(hw_params);
> > > > + unsigned int channels = params_channels(hw_params);
> > > > + unsigned int slots;
> > > >   unsigned int slot_width = 32;
> > > >   u64 sub, savesub = 10;
> > > >   unsigned int freq;
> > > > @@ -688,9 +689,15 @@ static int fsl_ssi_set_bclk(struct 
> > > > snd_pcm_substream *substream,
> > > >   /* Override slots and slot_width if being specifically set... */
> > > >   if (ssi->slots)
> > > >   slots = ssi->slots;
> > > > - /* ...but keep 32 bits if slots is 2 -- I2S Master mode */
> > > > - if (ssi->slot_width && slots != 2)
> > > > - slot_width = ssi->slot_width;
> > > > + else
> > > > + /* Apply two slots for mono channel, because DC = 2 */
> > > > + slots = (channels == 1) ? 2 : channels;
> > > > +
> > > > + /* ...but keep 32 bits if I2S Master mode */
> > > > + if ((ssi->i2s_net & SSI_SCR_I2S_MODE_MASK) != 
> > > > SSI_SCR_I2S_MODE_MASTER ||
> > > > + channels == 1)
> > > > + slot_width = ssi->slot_width ? ssi->slot_width :
> > >
> > > This looks very complicated...can you review and try mine?
> > > (Basically, take 32-bit out of default but force it later)
> > >
> > > @@ -678,8 +678,9 @@ static int fsl_ssi_set_bclk(struct snd_pcm_substream 
> > > *substream,
> > > struct regmap *regs = ssi->regs;
> > > u32 pm = 999, div2, psr, stccr, mask, afreq, factor, i;
> > > unsigned long clkrate, baudrate, tmprate;
> > > -   unsigned int slots = params_channels(hw_params);
> > > -   unsigned int slot_width = 32;
> > > +   unsigned int channels = params_channels(hw_params);
> > > +   unsigned int slot_width = params_width(hw_params);
> > > +   unsigned int slots = 2;
> > > u64 sub, savesub = 10;
> > > unsigned int freq;
> > > bool baudclk_is_used;
> > > @@ -688,10 +689,16 @@ static int fsl_ssi_set_bclk(struct 
> > > snd_pcm_substream *substream,
> > > /* Override slots and slot_width if being specifically set... */
> > > if (ssi->slots)
> > > slots = ssi->slots;
> > > -   /* ...but keep 32 bits if slots is 2 -- I2S Master mode */
> > > -   if (ssi->slot_width && slots != 2)
> > > +   if (ssi->slot_width)
> > > slot_width = ssi->slot_width;
> > >
> > > +   /*
> > > +* ...but force 32 bits for stereo audio. Note that mono audio is 
> > > also
> > > +* sent in 2 slots via NORMAL mode, so check both slots and 
> > > channels.
> > > +*/
> > > +   if (slots == 2 && channels == 2)
> > > +   slot_width = 32;
> >
> > slots ==2 && channels ==2 does not mean the I2S Master mode.
> > For LEFT_J, it is also slots =2 & channels = 2, then the slot_width
> > should be 

Re: [PATCH v2] ASoC: fsl_ssi: Fix bclk calculation for mono channel

2020-06-15 Thread Nicolin Chen
On Tue, Jun 16, 2020 at 09:48:39AM +0800, Shengjiu Wang wrote:
> On Tue, Jun 16, 2020 at 7:11 AM Nicolin Chen  wrote:
> >
> > On Mon, Jun 15, 2020 at 01:56:18PM +0800, Shengjiu Wang wrote:
> > > For mono channel, SSI will switch to Normal mode.
> > >
> > > In Normal mode and Network mode, the Word Length Control bits
> > > control the word length divider in clock generator, which is
> > > different with I2S Master mode (the word length is fixed to
> > > 32bit), it should be the value of params_width(hw_params).
> > >
> > > The condition "slots == 2" is not good for I2S Master mode,
> > > because for Network mode and Normal mode, the slots can also
> > > be 2. Then we need to use (ssi->i2s_net & SSI_SCR_I2S_MODE_MASK)
> > > to check if it is I2S Master mode.
> >
> > The fsl_ssi_set_bclk is only called when fsl_ssi_is_i2s_master,
> > though I agree that that line of comments sounds confusing now.
> 
> Actually I think fsl_ssi_is_i2s_master is not accurate, it just checks
> the Master mode,  but didn't check the I2S mode.
> 
> >
> > > So we refine the famula for mono channel, otherwise there
> >
> > famula => formula?
> >
> > > will be sound issue for S24_LE.
> > >
> > > Fixes: b0a7043d5c2c ("ASoC: fsl_ssi: Caculate bit clock rate using slot 
> > > number and width")
> > > Signed-off-by: Shengjiu Wang 
> > > ---
> > > changes in v2
> > > - refine patch for Network mode and Normal mode.
> > >
> > >  sound/soc/fsl/fsl_ssi.c | 15 +++
> > >  1 file changed, 11 insertions(+), 4 deletions(-)
> > >
> > > diff --git a/sound/soc/fsl/fsl_ssi.c b/sound/soc/fsl/fsl_ssi.c
> > > index bad89b0d129e..cbf67d132fda 100644
> > > --- a/sound/soc/fsl/fsl_ssi.c
> > > +++ b/sound/soc/fsl/fsl_ssi.c
> > > @@ -678,7 +678,8 @@ static int fsl_ssi_set_bclk(struct snd_pcm_substream 
> > > *substream,
> > >   struct regmap *regs = ssi->regs;
> > >   u32 pm = 999, div2, psr, stccr, mask, afreq, factor, i;
> > >   unsigned long clkrate, baudrate, tmprate;
> > > - unsigned int slots = params_channels(hw_params);
> > > + unsigned int channels = params_channels(hw_params);
> > > + unsigned int slots;
> > >   unsigned int slot_width = 32;
> > >   u64 sub, savesub = 10;
> > >   unsigned int freq;
> > > @@ -688,9 +689,15 @@ static int fsl_ssi_set_bclk(struct snd_pcm_substream 
> > > *substream,
> > >   /* Override slots and slot_width if being specifically set... */
> > >   if (ssi->slots)
> > >   slots = ssi->slots;
> > > - /* ...but keep 32 bits if slots is 2 -- I2S Master mode */
> > > - if (ssi->slot_width && slots != 2)
> > > - slot_width = ssi->slot_width;
> > > + else
> > > + /* Apply two slots for mono channel, because DC = 2 */
> > > + slots = (channels == 1) ? 2 : channels;
> > > +
> > > + /* ...but keep 32 bits if I2S Master mode */
> > > + if ((ssi->i2s_net & SSI_SCR_I2S_MODE_MASK) != 
> > > SSI_SCR_I2S_MODE_MASTER ||
> > > + channels == 1)
> > > + slot_width = ssi->slot_width ? ssi->slot_width :
> >
> > This looks very complicated...can you review and try mine?
> > (Basically, take 32-bit out of default but force it later)
> >
> > @@ -678,8 +678,9 @@ static int fsl_ssi_set_bclk(struct snd_pcm_substream 
> > *substream,
> > struct regmap *regs = ssi->regs;
> > u32 pm = 999, div2, psr, stccr, mask, afreq, factor, i;
> > unsigned long clkrate, baudrate, tmprate;
> > -   unsigned int slots = params_channels(hw_params);
> > -   unsigned int slot_width = 32;
> > +   unsigned int channels = params_channels(hw_params);
> > +   unsigned int slot_width = params_width(hw_params);
> > +   unsigned int slots = 2;
> > u64 sub, savesub = 10;
> > unsigned int freq;
> > bool baudclk_is_used;
> > @@ -688,10 +689,16 @@ static int fsl_ssi_set_bclk(struct snd_pcm_substream 
> > *substream,
> > /* Override slots and slot_width if being specifically set... */
> > if (ssi->slots)
> > slots = ssi->slots;
> > -   /* ...but keep 32 bits if slots is 2 -- I2S Master mode */
> > -   if (ssi->slot_width && slots != 2)
> > +   if (ssi->slot_width)
> > slot_width = ssi->slot_width;
> >
> > +   /*
> > +* ...but force 32 bits for stereo audio. Note that mono audio is 
> > also
> > +* sent in 2 slots via NORMAL mode, so check both slots and 
> > channels.
> > +*/
> > +   if (slots == 2 && channels == 2)
> > +   slot_width = 32;
> 
> slots ==2 && channels ==2 does not mean the I2S Master mode.
> For LEFT_J, it is also slots =2 & channels = 2, then the slot_width
> should be params_width(hw_params).
> and DSP_A/B also supports stereo.

I think you have a point. Then would this condition work?

+   /* ...but force 32 bits for stereo audio using I2S Master Mode */
+   if (channels == 2 &&
+   ssi->i2s_net & SSI_SCR_I2S_MODE_MASK == 

[PATCH v4 3/3] btrfs: Use kfree() in btrfs_ioctl_get_subvol_info()

2020-06-15 Thread Waiman Long
In btrfs_ioctl_get_subvol_info(), there is a classic case where kzalloc()
was incorrectly paired with kzfree(). According to David Sterba, there
isn't any sensitive information in the subvol_info that needs to be
cleared before freeing. So kfree_sensitive() isn't really needed,
use kfree() instead.

Reported-by: David Sterba 
Signed-off-by: Waiman Long 
---
 fs/btrfs/ioctl.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index f1dd9e4271e9..e8f7c5f00894 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -2692,7 +2692,7 @@ static int btrfs_ioctl_get_subvol_info(struct file *file, 
void __user *argp)
btrfs_put_root(root);
 out_free:
btrfs_free_path(path);
-   kfree_sensitive(subvol_info);
+   kfree(subvol_info);
return ret;
 }
 
-- 
2.18.1



[PATCH v4 2/3] mm, treewide: Rename kzfree() to kfree_sensitive()

2020-06-15 Thread Waiman Long
As said by Linus:

  A symmetric naming is only helpful if it implies symmetries in use.
  Otherwise it's actively misleading.

  In "kzalloc()", the z is meaningful and an important part of what the
  caller wants.

  In "kzfree()", the z is actively detrimental, because maybe in the
  future we really _might_ want to use that "memfill(0xdeadbeef)" or
  something. The "zero" part of the interface isn't even _relevant_.

The main reason that kzfree() exists is to clear sensitive information
that should not be leaked to other future users of the same memory
objects.

Rename kzfree() to kfree_sensitive() to follow the example of the
recently added kvfree_sensitive() and make the intention of the API
more explicit. In addition, memzero_explicit() is used to clear the
memory to make sure that it won't get optimized away by the compiler.

The renaming is done by using the command sequence:

  git grep -w --name-only kzfree |\
  xargs sed -i 's/\bkzfree\b/kfree_sensitive/'

followed by some editing of the kfree_sensitive() kerneldoc and the
use of memzero_explicit() instead of memset().

Suggested-by: Joe Perches 
Acked-by: David Howells 
Acked-by: Michal Hocko 
Acked-by: Johannes Weiner 
Signed-off-by: Waiman Long 
---
 arch/s390/crypto/prng.c   |  4 +--
 arch/x86/power/hibernate.c|  2 +-
 crypto/adiantum.c |  2 +-
 crypto/ahash.c|  4 +--
 crypto/api.c  |  2 +-
 crypto/asymmetric_keys/verify_pefile.c|  4 +--
 crypto/deflate.c  |  2 +-
 crypto/drbg.c | 10 +++---
 crypto/ecc.c  |  8 ++---
 crypto/ecdh.c |  2 +-
 crypto/gcm.c  |  2 +-
 crypto/gf128mul.c |  4 +--
 crypto/jitterentropy-kcapi.c  |  2 +-
 crypto/rng.c  |  2 +-
 crypto/rsa-pkcs1pad.c |  6 ++--
 crypto/seqiv.c|  2 +-
 crypto/shash.c|  2 +-
 crypto/skcipher.c |  2 +-
 crypto/testmgr.c  |  6 ++--
 crypto/zstd.c |  2 +-
 .../allwinner/sun8i-ce/sun8i-ce-cipher.c  |  2 +-
 .../allwinner/sun8i-ss/sun8i-ss-cipher.c  |  2 +-
 drivers/crypto/amlogic/amlogic-gxl-cipher.c   |  4 +--
 drivers/crypto/atmel-ecc.c|  2 +-
 drivers/crypto/caam/caampkc.c | 28 +++
 drivers/crypto/cavium/cpt/cptvf_main.c|  6 ++--
 drivers/crypto/cavium/cpt/cptvf_reqmanager.c  | 12 +++
 drivers/crypto/cavium/nitrox/nitrox_lib.c |  4 +--
 drivers/crypto/cavium/zip/zip_crypto.c|  6 ++--
 drivers/crypto/ccp/ccp-crypto-rsa.c   |  6 ++--
 drivers/crypto/ccree/cc_aead.c|  4 +--
 drivers/crypto/ccree/cc_buffer_mgr.c  |  4 +--
 drivers/crypto/ccree/cc_cipher.c  |  6 ++--
 drivers/crypto/ccree/cc_hash.c|  8 ++---
 drivers/crypto/ccree/cc_request_mgr.c |  2 +-
 drivers/crypto/marvell/cesa/hash.c|  2 +-
 .../crypto/marvell/octeontx/otx_cptvf_main.c  |  6 ++--
 .../marvell/octeontx/otx_cptvf_reqmgr.h   |  2 +-
 drivers/crypto/mediatek/mtk-aes.c |  2 +-
 drivers/crypto/nx/nx.c|  4 +--
 drivers/crypto/virtio/virtio_crypto_algs.c| 12 +++
 drivers/crypto/virtio/virtio_crypto_core.c|  2 +-
 drivers/md/dm-crypt.c | 32 -
 drivers/md/dm-integrity.c |  6 ++--
 drivers/misc/ibmvmc.c |  6 ++--
 .../hisilicon/hns3/hns3pf/hclge_mbx.c |  2 +-
 .../net/ethernet/intel/ixgbe/ixgbe_ipsec.c|  6 ++--
 drivers/net/ppp/ppp_mppe.c|  6 ++--
 drivers/net/wireguard/noise.c |  4 +--
 drivers/net/wireguard/peer.c  |  2 +-
 drivers/net/wireless/intel/iwlwifi/pcie/rx.c  |  2 +-
 .../net/wireless/intel/iwlwifi/pcie/tx-gen2.c |  6 ++--
 drivers/net/wireless/intel/iwlwifi/pcie/tx.c  |  6 ++--
 drivers/net/wireless/intersil/orinoco/wext.c  |  4 +--
 drivers/s390/crypto/ap_bus.h  |  4 +--
 drivers/staging/ks7010/ks_hostif.c|  2 +-
 drivers/staging/rtl8723bs/core/rtw_security.c |  2 +-
 drivers/staging/wlan-ng/p80211netdev.c|  2 +-
 drivers/target/iscsi/iscsi_target_auth.c  |  2 +-
 fs/btrfs/ioctl.c  |  2 +-
 fs/cifs/cifsencrypt.c |  2 +-
 fs/cifs/connect.c | 10 +++---
 fs/cifs/dfs_cache.c   |  2 +-
 fs/cifs/misc.c|  8 ++---
 fs/crypto/keyring.c   |  6 ++--
 fs/crypto/keysetup_v1.c   |  4 +--
 fs/ecryptfs/keystore.c

[PATCH v4 1/3] mm/slab: Use memzero_explicit() in kzfree()

2020-06-15 Thread Waiman Long
The kzfree() function is normally used to clear some sensitive
information, like encryption keys, in the buffer before freeing it back
to the pool. Memset() is currently used for the buffer clearing. However,
it is entirely possible that the compiler may choose to optimize away the
memory clearing especially if LTO is being used. To make sure that this
optimization will not happen, memzero_explicit(), which is introduced
in v3.18, is now used in kzfree() to do the clearing.

Fixes: 3ef0e5ba4673 ("slab: introduce kzfree()")
Cc: sta...@vger.kernel.org
Signed-off-by: Waiman Long 
---
 mm/slab_common.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/slab_common.c b/mm/slab_common.c
index 9e72ba224175..37d48a56431d 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -1726,7 +1726,7 @@ void kzfree(const void *p)
if (unlikely(ZERO_OR_NULL_PTR(mem)))
return;
ks = ksize(mem);
-   memset(mem, 0, ks);
+   memzero_explicit(mem, ks);
kfree(mem);
 }
 EXPORT_SYMBOL(kzfree);
-- 
2.18.1



[PATCH v4 0/3] mm, treewide: Rename kzfree() to kfree_sensitive()

2020-06-15 Thread Waiman Long
 v4:
  - Break out the memzero_explicit() change as suggested by Dan Carpenter
so that it can be backported to stable.
  - Drop the "crypto: Remove unnecessary memzero_explicit()" patch for
now as there can be a bit more discussion on what is best. It will be
introduced as a separate patch later on after this one is merged.

This patchset makes a global rename of the kzfree() to kfree_sensitive()
to highlight the fact buffer clearing is only needed if the data objects
contain sensitive information like encrpytion key. The fact that kzfree()
uses memset() to do the clearing isn't totally safe either as compiler
may compile out the clearing in their optimizer especially if LTO is
used. Instead, the new kfree_sensitive() uses memzero_explicit() which
won't get compiled out.

Waiman Long (3):
  mm/slab: Use memzero_explicit() in kzfree()
  mm, treewide: Rename kzfree() to kfree_sensitive()
  btrfs: Use kfree() in btrfs_ioctl_get_subvol_info()

 arch/s390/crypto/prng.c   |  4 +--
 arch/x86/power/hibernate.c|  2 +-
 crypto/adiantum.c |  2 +-
 crypto/ahash.c|  4 +--
 crypto/api.c  |  2 +-
 crypto/asymmetric_keys/verify_pefile.c|  4 +--
 crypto/deflate.c  |  2 +-
 crypto/drbg.c | 10 +++---
 crypto/ecc.c  |  8 ++---
 crypto/ecdh.c |  2 +-
 crypto/gcm.c  |  2 +-
 crypto/gf128mul.c |  4 +--
 crypto/jitterentropy-kcapi.c  |  2 +-
 crypto/rng.c  |  2 +-
 crypto/rsa-pkcs1pad.c |  6 ++--
 crypto/seqiv.c|  2 +-
 crypto/shash.c|  2 +-
 crypto/skcipher.c |  2 +-
 crypto/testmgr.c  |  6 ++--
 crypto/zstd.c |  2 +-
 .../allwinner/sun8i-ce/sun8i-ce-cipher.c  |  2 +-
 .../allwinner/sun8i-ss/sun8i-ss-cipher.c  |  2 +-
 drivers/crypto/amlogic/amlogic-gxl-cipher.c   |  4 +--
 drivers/crypto/atmel-ecc.c|  2 +-
 drivers/crypto/caam/caampkc.c | 28 +++
 drivers/crypto/cavium/cpt/cptvf_main.c|  6 ++--
 drivers/crypto/cavium/cpt/cptvf_reqmanager.c  | 12 +++
 drivers/crypto/cavium/nitrox/nitrox_lib.c |  4 +--
 drivers/crypto/cavium/zip/zip_crypto.c|  6 ++--
 drivers/crypto/ccp/ccp-crypto-rsa.c   |  6 ++--
 drivers/crypto/ccree/cc_aead.c|  4 +--
 drivers/crypto/ccree/cc_buffer_mgr.c  |  4 +--
 drivers/crypto/ccree/cc_cipher.c  |  6 ++--
 drivers/crypto/ccree/cc_hash.c|  8 ++---
 drivers/crypto/ccree/cc_request_mgr.c |  2 +-
 drivers/crypto/marvell/cesa/hash.c|  2 +-
 .../crypto/marvell/octeontx/otx_cptvf_main.c  |  6 ++--
 .../marvell/octeontx/otx_cptvf_reqmgr.h   |  2 +-
 drivers/crypto/mediatek/mtk-aes.c |  2 +-
 drivers/crypto/nx/nx.c|  4 +--
 drivers/crypto/virtio/virtio_crypto_algs.c| 12 +++
 drivers/crypto/virtio/virtio_crypto_core.c|  2 +-
 drivers/md/dm-crypt.c | 32 -
 drivers/md/dm-integrity.c |  6 ++--
 drivers/misc/ibmvmc.c |  6 ++--
 .../hisilicon/hns3/hns3pf/hclge_mbx.c |  2 +-
 .../net/ethernet/intel/ixgbe/ixgbe_ipsec.c|  6 ++--
 drivers/net/ppp/ppp_mppe.c|  6 ++--
 drivers/net/wireguard/noise.c |  4 +--
 drivers/net/wireguard/peer.c  |  2 +-
 drivers/net/wireless/intel/iwlwifi/pcie/rx.c  |  2 +-
 .../net/wireless/intel/iwlwifi/pcie/tx-gen2.c |  6 ++--
 drivers/net/wireless/intel/iwlwifi/pcie/tx.c  |  6 ++--
 drivers/net/wireless/intersil/orinoco/wext.c  |  4 +--
 drivers/s390/crypto/ap_bus.h  |  4 +--
 drivers/staging/ks7010/ks_hostif.c|  2 +-
 drivers/staging/rtl8723bs/core/rtw_security.c |  2 +-
 drivers/staging/wlan-ng/p80211netdev.c|  2 +-
 drivers/target/iscsi/iscsi_target_auth.c  |  2 +-
 fs/btrfs/ioctl.c  |  2 +-
 fs/cifs/cifsencrypt.c |  2 +-
 fs/cifs/connect.c | 10 +++---
 fs/cifs/dfs_cache.c   |  2 +-
 fs/cifs/misc.c|  8 ++---
 fs/crypto/keyring.c   |  6 ++--
 fs/crypto/keysetup_v1.c   |  4 +--
 fs/ecryptfs/keystore.c|  4 +--
 fs/ecryptfs/messaging.c   |  2 +-
 include/crypto/aead.h |  2 +-
 include/crypto/akcipher.h |  2 +-
 include/crypto/gf128mul.h |  2 +-
 include/crypto/hash.h 

Re: [PATCH v2] ASoC: fsl_ssi: Fix bclk calculation for mono channel

2020-06-15 Thread Shengjiu Wang
On Tue, Jun 16, 2020 at 7:11 AM Nicolin Chen  wrote:
>
> On Mon, Jun 15, 2020 at 01:56:18PM +0800, Shengjiu Wang wrote:
> > For mono channel, SSI will switch to Normal mode.
> >
> > In Normal mode and Network mode, the Word Length Control bits
> > control the word length divider in clock generator, which is
> > different with I2S Master mode (the word length is fixed to
> > 32bit), it should be the value of params_width(hw_params).
> >
> > The condition "slots == 2" is not good for I2S Master mode,
> > because for Network mode and Normal mode, the slots can also
> > be 2. Then we need to use (ssi->i2s_net & SSI_SCR_I2S_MODE_MASK)
> > to check if it is I2S Master mode.
>
> The fsl_ssi_set_bclk is only called when fsl_ssi_is_i2s_master,
> though I agree that that line of comments sounds confusing now.

Actually I think fsl_ssi_is_i2s_master is not accurate, it just checks
the Master mode,  but didn't check the I2S mode.

>
> > So we refine the famula for mono channel, otherwise there
>
> famula => formula?
>
> > will be sound issue for S24_LE.
> >
> > Fixes: b0a7043d5c2c ("ASoC: fsl_ssi: Caculate bit clock rate using slot 
> > number and width")
> > Signed-off-by: Shengjiu Wang 
> > ---
> > changes in v2
> > - refine patch for Network mode and Normal mode.
> >
> >  sound/soc/fsl/fsl_ssi.c | 15 +++
> >  1 file changed, 11 insertions(+), 4 deletions(-)
> >
> > diff --git a/sound/soc/fsl/fsl_ssi.c b/sound/soc/fsl/fsl_ssi.c
> > index bad89b0d129e..cbf67d132fda 100644
> > --- a/sound/soc/fsl/fsl_ssi.c
> > +++ b/sound/soc/fsl/fsl_ssi.c
> > @@ -678,7 +678,8 @@ static int fsl_ssi_set_bclk(struct snd_pcm_substream 
> > *substream,
> >   struct regmap *regs = ssi->regs;
> >   u32 pm = 999, div2, psr, stccr, mask, afreq, factor, i;
> >   unsigned long clkrate, baudrate, tmprate;
> > - unsigned int slots = params_channels(hw_params);
> > + unsigned int channels = params_channels(hw_params);
> > + unsigned int slots;
> >   unsigned int slot_width = 32;
> >   u64 sub, savesub = 10;
> >   unsigned int freq;
> > @@ -688,9 +689,15 @@ static int fsl_ssi_set_bclk(struct snd_pcm_substream 
> > *substream,
> >   /* Override slots and slot_width if being specifically set... */
> >   if (ssi->slots)
> >   slots = ssi->slots;
> > - /* ...but keep 32 bits if slots is 2 -- I2S Master mode */
> > - if (ssi->slot_width && slots != 2)
> > - slot_width = ssi->slot_width;
> > + else
> > + /* Apply two slots for mono channel, because DC = 2 */
> > + slots = (channels == 1) ? 2 : channels;
> > +
> > + /* ...but keep 32 bits if I2S Master mode */
> > + if ((ssi->i2s_net & SSI_SCR_I2S_MODE_MASK) != SSI_SCR_I2S_MODE_MASTER 
> > ||
> > + channels == 1)
> > + slot_width = ssi->slot_width ? ssi->slot_width :
>
> This looks very complicated...can you review and try mine?
> (Basically, take 32-bit out of default but force it later)
>
> @@ -678,8 +678,9 @@ static int fsl_ssi_set_bclk(struct snd_pcm_substream 
> *substream,
> struct regmap *regs = ssi->regs;
> u32 pm = 999, div2, psr, stccr, mask, afreq, factor, i;
> unsigned long clkrate, baudrate, tmprate;
> -   unsigned int slots = params_channels(hw_params);
> -   unsigned int slot_width = 32;
> +   unsigned int channels = params_channels(hw_params);
> +   unsigned int slot_width = params_width(hw_params);
> +   unsigned int slots = 2;
> u64 sub, savesub = 10;
> unsigned int freq;
> bool baudclk_is_used;
> @@ -688,10 +689,16 @@ static int fsl_ssi_set_bclk(struct snd_pcm_substream 
> *substream,
> /* Override slots and slot_width if being specifically set... */
> if (ssi->slots)
> slots = ssi->slots;
> -   /* ...but keep 32 bits if slots is 2 -- I2S Master mode */
> -   if (ssi->slot_width && slots != 2)
> +   if (ssi->slot_width)
> slot_width = ssi->slot_width;
>
> +   /*
> +* ...but force 32 bits for stereo audio. Note that mono audio is also
> +* sent in 2 slots via NORMAL mode, so check both slots and channels.
> +*/
> +   if (slots == 2 && channels == 2)
> +   slot_width = 32;

slots ==2 && channels ==2 does not mean the I2S Master mode.
For LEFT_J, it is also slots =2 & channels = 2, then the slot_width
should be params_width(hw_params).
and DSP_A/B also supports stereo.


Re: [PATCH v5 01/13] powerpc: Remove Xilinx PPC405/PPC440 support

2020-06-15 Thread Nathan Chancellor
On Thu, May 21, 2020 at 04:55:52PM +, Christophe Leroy wrote:
> From: Michal Simek 
> 
> The latest Xilinx design tools called ISE and EDK has been released in
> October 2013. New tool doesn't support any PPC405/PPC440 new designs.
> These platforms are no longer supported and tested.
> 
> PowerPC 405/440 port is orphan from 2013 by
> commit cdeb89943bfc ("MAINTAINERS: Fix incorrect status tag") and
> commit 19624236cce1 ("MAINTAINERS: Update Grant's email address and 
> maintainership")
> that's why it is time to remove the support fot these platforms.
> 
> Signed-off-by: Michal Simek 
> Acked-by: Arnd Bergmann 
> Signed-off-by: Christophe Leroy 

This patch causes qemu-system-ppc to fail to load ppc44x_defconfig:

$ make -skj"$(nproc)" ARCH=powerpc CROSS_COMPILE=powerpc-linux- O=out/ppc 
distclean ppc44x_defconfig zImage

$ timeout --foreground 30s unbuffer \
qemu-system-ppc \
-machine bamboo \
-no-reboot \
-append "console=ttyS0" \
-display none \
-initrd ../../cbl/github/boot-utils/images/ppc32/rootfs.cpio \
-kernel out/ppc/arch/powerpc/boot/zImage \
-m 128m \
-nodefaults \
-serial mon:stdio
qemu-system-ppc: could not load kernel 'out/ppc/arch/powerpc/boot/zImage'

$ ls out/ppc/arch/powerpc/boot/zImage
out/ppc/arch/powerpc/boot/zImage

Is this expected? Is there some other config or machine that we should
be testing instead?

Cheers,
Nathan


Re: [PATCH 00/17] spelling.txt: /decriptors/descriptors/

2020-06-15 Thread Mark Brown
On Tue, 9 Jun 2020 13:45:53 +0100, Kieran Bingham wrote:
> I wouldn't normally go through spelling fixes, but I caught sight of
> this typo twice, and then foolishly grepped the tree for it, and saw how
> pervasive it was.
> 
> so here I am ... fixing a typo globally... but with an addition in
> scripts/spelling.txt so it shouldn't re-appear ;-)
> 
> [...]

Applied to

   https://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator.git 
for-next

Thanks!

[1/2] regulator: Fix trivial spelling
  commit: d3f3723387f97118c337689fc73e4199fb4331ce
[2/2] regulator: gpio: Fix trivial spelling
  commit: 1f0b740004f09d2f1b716fd6c2fdca81004ded05

All being well this means that it will be integrated into the linux-next
tree (usually sometime in the next 24 hours) and sent to Linus during
the next merge window (or sooner if it is a bug fix), however if
problems are discovered then the patch may be dropped or reverted.

You may get further e-mails resulting from automated or manual testing
and review of the tree, please engage with people reporting problems and
send followup patches addressing any issues that are reported if needed.

If any updates are required or you are submitting further changes they
should be sent as incremental updates against current git, existing
patches will not be replaced.

Please add any relevant lists and maintainers to the CCs when replying
to this mail.

Thanks,
Mark


Re: [PATCH 0/3] ASoC: fsl_easrc: Fix several warnings

2020-06-15 Thread Mark Brown
On Wed, 3 Jun 2020 11:39:38 +0800, Shengjiu Wang wrote:
> Fix several warnings with "make W=1"
> 
> Shengjiu Wang (3):
>   ASoC: fsl_easrc: Fix -Wmissing-prototypes warning
>   ASoC: fsl_easrc: Fix -Wunused-but-set-variable
>   ASoC: fsl_easrc: Fix "Function parameter not described" warnings
> 
> [...]

Applied to

   https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound.git for-next

Thanks!

[1/3] ASoC: fsl_easrc: Fix -Wmissing-prototypes warning
  commit: e4cc0aaac390a87f80ae542c75d4c84de08816f9
[2/3] ASoC: fsl_easrc: Fix -Wunused-but-set-variable
  commit: 633a2c7d6e621e748d69423fa85be88c7dcd4f94
[3/3] ASoC: fsl_easrc: Fix "Function parameter not described" warnings
  commit: d73d682a9e87fa494868e8094fcc5b6a6b505464

All being well this means that it will be integrated into the linux-next
tree (usually sometime in the next 24 hours) and sent to Linus during
the next merge window (or sooner if it is a bug fix), however if
problems are discovered then the patch may be dropped or reverted.

You may get further e-mails resulting from automated or manual testing
and review of the tree, please engage with people reporting problems and
send followup patches addressing any issues that are reported if needed.

If any updates are required or you are submitting further changes they
should be sent as incremental updates against current git, existing
patches will not be replaced.

Please add any relevant lists and maintainers to the CCs when replying
to this mail.

Thanks,
Mark


Re: [PATCH v2 04/12] docs: x86: Add documentation for SVA (Shared Virtual Addressing)

2020-06-15 Thread Fenghua Yu
Hi, Baolu,

On Sat, Jun 13, 2020 at 08:17:40PM +0800, Lu Baolu wrote:
> Hi Fenghua,
> 
> On 2020/6/13 8:41, Fenghua Yu wrote:
> >+implement implement fairness or ensure forward progress can be made.
> 
> Repeated "implement".

Will fix this.

> >+For example, the Intel Data Streaming Accelerator (DSA) uses
> >+intel_svm_bind_mm(), which will do the following.
> 
> The Intel SVM APIs have been deprecated. Drivers should use
> iommu_sva_bind_device() instead. Please also update other places in
> this document.

Will fix this.

Thanks.

-Fenghua



Re: [PATCH v2] ASoC: fsl_ssi: Fix bclk calculation for mono channel

2020-06-15 Thread Nicolin Chen
On Mon, Jun 15, 2020 at 01:56:18PM +0800, Shengjiu Wang wrote:
> For mono channel, SSI will switch to Normal mode.
> 
> In Normal mode and Network mode, the Word Length Control bits
> control the word length divider in clock generator, which is
> different with I2S Master mode (the word length is fixed to
> 32bit), it should be the value of params_width(hw_params).
> 
> The condition "slots == 2" is not good for I2S Master mode,
> because for Network mode and Normal mode, the slots can also
> be 2. Then we need to use (ssi->i2s_net & SSI_SCR_I2S_MODE_MASK)
> to check if it is I2S Master mode.

The fsl_ssi_set_bclk is only called when fsl_ssi_is_i2s_master,
though I agree that that line of comments sounds confusing now.

> So we refine the famula for mono channel, otherwise there

famula => formula?

> will be sound issue for S24_LE.
> 
> Fixes: b0a7043d5c2c ("ASoC: fsl_ssi: Caculate bit clock rate using slot 
> number and width")
> Signed-off-by: Shengjiu Wang 
> ---
> changes in v2
> - refine patch for Network mode and Normal mode.
> 
>  sound/soc/fsl/fsl_ssi.c | 15 +++
>  1 file changed, 11 insertions(+), 4 deletions(-)
> 
> diff --git a/sound/soc/fsl/fsl_ssi.c b/sound/soc/fsl/fsl_ssi.c
> index bad89b0d129e..cbf67d132fda 100644
> --- a/sound/soc/fsl/fsl_ssi.c
> +++ b/sound/soc/fsl/fsl_ssi.c
> @@ -678,7 +678,8 @@ static int fsl_ssi_set_bclk(struct snd_pcm_substream 
> *substream,
>   struct regmap *regs = ssi->regs;
>   u32 pm = 999, div2, psr, stccr, mask, afreq, factor, i;
>   unsigned long clkrate, baudrate, tmprate;
> - unsigned int slots = params_channels(hw_params);
> + unsigned int channels = params_channels(hw_params);
> + unsigned int slots;
>   unsigned int slot_width = 32;
>   u64 sub, savesub = 10;
>   unsigned int freq;
> @@ -688,9 +689,15 @@ static int fsl_ssi_set_bclk(struct snd_pcm_substream 
> *substream,
>   /* Override slots and slot_width if being specifically set... */
>   if (ssi->slots)
>   slots = ssi->slots;
> - /* ...but keep 32 bits if slots is 2 -- I2S Master mode */
> - if (ssi->slot_width && slots != 2)
> - slot_width = ssi->slot_width;
> + else
> + /* Apply two slots for mono channel, because DC = 2 */
> + slots = (channels == 1) ? 2 : channels;
> +
> + /* ...but keep 32 bits if I2S Master mode */
> + if ((ssi->i2s_net & SSI_SCR_I2S_MODE_MASK) != SSI_SCR_I2S_MODE_MASTER ||
> + channels == 1)
> + slot_width = ssi->slot_width ? ssi->slot_width :

This looks very complicated...can you review and try mine?
(Basically, take 32-bit out of default but force it later)

@@ -678,8 +678,9 @@ static int fsl_ssi_set_bclk(struct snd_pcm_substream 
*substream,
struct regmap *regs = ssi->regs;
u32 pm = 999, div2, psr, stccr, mask, afreq, factor, i;
unsigned long clkrate, baudrate, tmprate;
-   unsigned int slots = params_channels(hw_params);
-   unsigned int slot_width = 32;
+   unsigned int channels = params_channels(hw_params);
+   unsigned int slot_width = params_width(hw_params);
+   unsigned int slots = 2;
u64 sub, savesub = 10;
unsigned int freq;
bool baudclk_is_used;
@@ -688,10 +689,16 @@ static int fsl_ssi_set_bclk(struct snd_pcm_substream 
*substream,
/* Override slots and slot_width if being specifically set... */
if (ssi->slots)
slots = ssi->slots;
-   /* ...but keep 32 bits if slots is 2 -- I2S Master mode */
-   if (ssi->slot_width && slots != 2)
+   if (ssi->slot_width)
slot_width = ssi->slot_width;
 
+   /*
+* ...but force 32 bits for stereo audio. Note that mono audio is also
+* sent in 2 slots via NORMAL mode, so check both slots and channels.
+*/
+   if (slots == 2 && channels == 2)
+   slot_width = 32;
+
/* Generate bit clock based on the slot number and slot width */
freq = slots * slot_width * params_rate(hw_params);
 


Re: [PATCH v2 12/12] x86/traps: Fix up invalid PASID

2020-06-15 Thread Raj, Ashok
On Mon, Jun 15, 2020 at 06:03:57PM +0200, Peter Zijlstra wrote:
> 
> I don't get why you need a rdmsr here, or why not having one would
> require a TIF flag. Is that because this MSR is XSAVE/XRSTOR managed?
> 
> > > > +*/
> > > > +   rdmsrl(MSR_IA32_PASID, pasid_msr);
> > > > +   if (pasid_msr & MSR_IA32_PASID_VALID)
> > > > +   return false;
> > > > +
> > > > +   /* Fix up the MSR if the MSR doesn't have a valid PASID. */
> > > > +   wrmsrl(MSR_IA32_PASID, pasid | MSR_IA32_PASID_VALID);
> 
> How much more expensive is the wrmsr over the rdmsr? Can't we just
> unconditionally write the current PASID and call it a day?

The reason to check the rdmsr() is because we are using a hueristic taking
GP faults. If we already setup the MSR, but we get it a second time it
means the reason is something other than PASID_MSR not being set.

Ideally we should use the TIF_ to track this would be cheaper, but we were
told those bits aren't easy to give out. 

Cheers,
Ashok


[PATCH 17/25] mm/powerpc: Use mm_fault_accounting()

2020-06-15 Thread Peter Xu
Use the new mm_fault_accounting() helper for page fault accounting.

cmo_account_page_fault() is special.  Keep that.

CC: Michael Ellerman 
CC: Benjamin Herrenschmidt 
CC: Paul Mackerras 
CC: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Peter Xu 
---
 arch/powerpc/mm/fault.c | 13 -
 1 file changed, 4 insertions(+), 9 deletions(-)

diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
index 84af6c8eecf7..6043b639ae42 100644
--- a/arch/powerpc/mm/fault.c
+++ b/arch/powerpc/mm/fault.c
@@ -481,8 +481,6 @@ static int __do_page_fault(struct pt_regs *regs, unsigned 
long address,
if (!arch_irq_disabled_regs(regs))
local_irq_enable();
 
-   perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, address);
-
if (error_code & DSISR_KEYFAULT)
return bad_key_fault_exception(regs, address,
   get_mm_addr_key(mm, address));
@@ -604,14 +602,11 @@ static int __do_page_fault(struct pt_regs *regs, unsigned 
long address,
/*
 * Major/minor page fault accounting.
 */
-   if (major) {
-   current->maj_flt++;
-   perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MAJ, 1, regs, address);
+   if (major)
cmo_account_page_fault();
-   } else {
-   current->min_flt++;
-   perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MIN, 1, regs, address);
-   }
+
+   mm_fault_accounting(current, regs, address, major);
+
return 0;
 }
 NOKPROBE_SYMBOL(__do_page_fault);
-- 
2.26.2



[Bug 208197] OF: /pci@f2000000/mac-io@17/gpio@50/...: could not find phandle

2020-06-15 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=208197

--- Comment #1 from Erhard F. (erhar...@mailbox.org) ---
Created attachment 289687
  --> https://bugzilla.kernel.org/attachment.cgi?id=289687=edit
kernel .config (5.8-rc1, PowerMac G4 DP)

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

[Bug 208197] New: OF: /pci@f2000000/mac-io@17/gpio@50/...: could not find phandle

2020-06-15 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=208197

Bug ID: 208197
   Summary: OF: /pci@f200/mac-io@17/gpio@50/...: could not
find phandle
   Product: Platform Specific/Hardware
   Version: 2.5
Kernel Version: 5.8-rc1
  Hardware: PPC-32
OS: Linux
  Tree: Mainline
Status: NEW
  Severity: normal
  Priority: P1
 Component: PPC-32
  Assignee: platform_ppc...@kernel-bugs.osdl.org
  Reporter: erhar...@mailbox.org
Regression: Yes

Created attachment 289685
  --> https://bugzilla.kernel.org/attachment.cgi?id=289685=edit
dmesg (5.8-rc1, PowerMac G4 DP)

Since 5.8-rc1 I get these OF errors on my PowerMac G4 DP:

[...]
T600 kernel: OF: /pci@f200/mac-io@17/gpio@50/gpio5@6f: could not find
phandle
T600 kernel: OF: /pci@f200/mac-io@17/gpio@50/extint-gpio15@67: could not
find phandle
T600 kernel: OF: /pci@f200/mac-io@17/gpio@50/gpio6@70: could not find
phandle
T600 kernel: OF: /pci@f200/mac-io@17/gpio@50/extint-gpio16@68: could not
find phandle
T600 kernel: OF: /pci@f200/mac-io@17/gpio@50/extint-gpio14@66: could not
find phandle
T600 kernel: OF: /pci@f200/mac-io@17/gpio@50/gpio12@76: could not find
phandle
T600 kernel: OF: /pci@f200/mac-io@17/gpio@50/gpio11@75: could not find
phandle
T600 kernel: OF: /pci@f200/mac-io@17/gpio@50/gpio5@6f: could not find
phandle
T600 kernel: OF: /pci@f200/mac-io@17/gpio@50/gpio6@70: could not find
phandle
T600 kernel: OF: /pci@f200/mac-io@17/gpio@50/extint-gpio4@5c: could not
find phandle
T600 kernel: OF: /pci@f200/mac-io@17/gpio@50/gpio11@75: could not find
phandle
T600 kernel: OF: /pci@f200/mac-io@17/gpio@50/extint-gpio15@67: could not
find phandle[...]

This has not been the case in kernel 5.7 or earlier.

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

Re: [PATCH v2 12/12] x86/traps: Fix up invalid PASID

2020-06-15 Thread Peter Zijlstra
On Mon, Jun 15, 2020 at 01:17:35PM -0700, Fenghua Yu wrote:
> Hi, Peter,
> 
> On Mon, Jun 15, 2020 at 09:09:28PM +0200, Peter Zijlstra wrote:
> > On Mon, Jun 15, 2020 at 11:55:29AM -0700, Fenghua Yu wrote:
> > 
> > > Or do you suggest to add a random new flag in struct thread_info instead
> > > of a TIF flag?
> > 
> > Why thread_info? What's wrong with something simple like the below. It
> > takes a bit from the 'strictly current' flags word.
> > 
> > 
> > diff --git a/include/linux/sched.h b/include/linux/sched.h
> > index b62e6aaf28f0..fca830b97055 100644
> > --- a/include/linux/sched.h
> > +++ b/include/linux/sched.h
> > @@ -801,6 +801,9 @@ struct task_struct {
> > /* Stalled due to lack of memory */
> > unsignedin_memstall:1;
> >  #endif
> > +#ifdef CONFIG_PCI_PASID
> > +   unsignedhas_valid_pasid:1;
> > +#endif
> >  
> > unsigned long   atomic_flags; /* Flags requiring atomic 
> > access. */
> >  
> > diff --git a/kernel/fork.c b/kernel/fork.c
> > index 142b23645d82..10b3891be99e 100644
> > --- a/kernel/fork.c
> > +++ b/kernel/fork.c
> > @@ -955,6 +955,10 @@ static struct task_struct *dup_task_struct(struct 
> > task_struct *orig, int node)
> > tsk->use_memdelay = 0;
> >  #endif
> >  
> > +#ifdef CONFIG_PCI_PASID
> > +   tsk->has_valid_pasid = 0;
> > +#endif
> > +
> >  #ifdef CONFIG_MEMCG
> > tsk->active_memcg = NULL;
> >  #endif
> 
> The PASID MSR is x86 specific although PASID is PCIe concept and per-mm.
> Checking if the MSR has valid PASID (bit31=1) is an x86 specifc work.
> The flag should be cleared in cloned()/forked() and is only set and
> read in fixup() in x86 #GP for heuristic. It's not used anywhere outside
> of x86.
> 
> That's why we think the flag should be in x86 struct thread_info instead
> of in generice struct task_struct.

I don't think anybody really cares, it's just one bit, we have plenty
left.

On x86_64 there's a u32 sized alignment hole in thread_info, also we
don't use the upper 32bit of thread_info::flags, however using those
would still mean you have to use atomic ops, which you really don't
need.




RE: [PATCH v2 12/12] x86/traps: Fix up invalid PASID

2020-06-15 Thread Luck, Tony
> So what’s the RDMSR for?  Surely you
> have some state somewhere that says “this task has a PASID.”
> Can’t you just make sure that stays in sync with the MSR?  Then, on #GP, if 
> the task already has a PASID, you know the MSR is set.

We have state that says the process ("mm") has been allocated a PASID. But not 
for each task.

E.g. a process may clone a bunch of tasks, then one of them opens a device that 
needs
a PASID.   We did internally have patches to go hunt down all those other tasks 
and
force a PASID onto each. But the code was big and ugly. Also maybe the wrong 
thing
to do if those threads didn't ever need to access the device.

PeterZ suggested that we can add a bit to the task structure for this purpose.

Fenghua is hesitant about adding an x86 only bit there.

-Tony


Re: [PATCH v13 2/6] seq_buf: Export seq_buf_printf

2020-06-15 Thread Steven Rostedt
On Mon, 15 Jun 2020 14:55:52 +0200
Borislav Petkov  wrote:

> Can you please resend your patchset once a week like everyone else and
> not flood inboxes with it?

Boris,

Haven't you automated your inbox yet? I have patchwork reading my INBOX
and it's smart enough to understand new series, and the old one simply
disappears as "superseded".

Email is just the wires to our infrastructure. Get with the times man! ;-)

-- Steve


Re: [PATCH v2 12/12] x86/traps: Fix up invalid PASID

2020-06-15 Thread Andy Lutomirski



> On Jun 15, 2020, at 1:56 PM, Luck, Tony  wrote:
> 
> 
>> 
>> Are we planning to keep PASID live once a task has used it once or are we 
>> going to swap it lazily?  If the latter, a percpu variable might be better.
> 
> Current plan is "touch it once and the task owns it until exit(2)"
> 
> Maybe someday in the future when we have data on how applications
> actually use accelerators we could look at something more complex
> if usage patterns look like it would be beneficial.
> 
> 

So what’s the RDMSR for?  Surely you
have some state somewhere that says “this task has a PASID.”  Can’t you just 
make sure that stays in sync with the MSR?  Then, on #GP, if the task already 
has a PASID, you know the MSR is set.

RE: [PATCH v2 12/12] x86/traps: Fix up invalid PASID

2020-06-15 Thread Luck, Tony
> Are we planning to keep PASID live once a task has used it once or are we 
> going to swap it lazily?  If the latter, a percpu variable might be better.

Current plan is "touch it once and the task owns it until exit(2)"

Maybe someday in the future when we have data on how applications
actually use accelerators we could look at something more complex
if usage patterns look like it would be beneficial.

-Tony


[Bug 206203] kmemleak reports various leaks in drivers/of/unittest.c

2020-06-15 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=206203

Erhard F. (erhar...@mailbox.org) changed:

   What|Removed |Added

 Attachment #287675|0   |1
is obsolete||

--- Comment #15 from Erhard F. (erhar...@mailbox.org) ---
Created attachment 289679
  --> https://bugzilla.kernel.org/attachment.cgi?id=289679=edit
kernel .config (kernel 5.8-rc1, PowerMac G5 11,2)

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

Re: [PATCH v2 12/12] x86/traps: Fix up invalid PASID

2020-06-15 Thread Andy Lutomirski


> On Jun 15, 2020, at 1:17 PM, Fenghua Yu  wrote:
> 
> Hi, Peter,
> 
>> On Mon, Jun 15, 2020 at 09:09:28PM +0200, Peter Zijlstra wrote:
>>> On Mon, Jun 15, 2020 at 11:55:29AM -0700, Fenghua Yu wrote:
>>> 
>>> Or do you suggest to add a random new flag in struct thread_info instead
>>> of a TIF flag?
>> 
>> Why thread_info? What's wrong with something simple like the below. It
>> takes a bit from the 'strictly current' flags word.
>> 
>> 
>> diff --git a/include/linux/sched.h b/include/linux/sched.h
>> index b62e6aaf28f0..fca830b97055 100644
>> --- a/include/linux/sched.h
>> +++ b/include/linux/sched.h
>> @@ -801,6 +801,9 @@ struct task_struct {
>>/* Stalled due to lack of memory */
>>unsignedin_memstall:1;
>> #endif
>> +#ifdef CONFIG_PCI_PASID
>> +unsignedhas_valid_pasid:1;
>> +#endif
>> 
>>unsigned longatomic_flags; /* Flags requiring atomic access. 
>> */
>> 
>> diff --git a/kernel/fork.c b/kernel/fork.c
>> index 142b23645d82..10b3891be99e 100644
>> --- a/kernel/fork.c
>> +++ b/kernel/fork.c
>> @@ -955,6 +955,10 @@ static struct task_struct *dup_task_struct(struct 
>> task_struct *orig, int node)
>>tsk->use_memdelay = 0;
>> #endif
>> 
>> +#ifdef CONFIG_PCI_PASID
>> +tsk->has_valid_pasid = 0;
>> +#endif
>> +
>> #ifdef CONFIG_MEMCG
>>tsk->active_memcg = NULL;
>> #endif
> 
> The PASID MSR is x86 specific although PASID is PCIe concept and per-mm.
> Checking if the MSR has valid PASID (bit31=1) is an x86 specifc work.
> The flag should be cleared in cloned()/forked() and is only set and
> read in fixup() in x86 #GP for heuristic. It's not used anywhere outside
> of x86.
> 
> That's why we think the flag should be in x86 struct thread_info instead
> of in generice struct task_struct.
> 

Are we planning to keep PASID live once a task has used it once or are we going 
to swap it lazily?  If the latter, a percpu variable might be better.

> Please advice.
> 
> Thanks.
> 
> -Fenghua


[Bug 206203] kmemleak reports various leaks in drivers/of/unittest.c

2020-06-15 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=206203

Erhard F. (erhar...@mailbox.org) changed:

   What|Removed |Added

 Attachment #287673|0   |1
is obsolete||
 Attachment #288567|0   |1
is obsolete||

--- Comment #14 from Erhard F. (erhar...@mailbox.org) ---
Created attachment 289677
  --> https://bugzilla.kernel.org/attachment.cgi?id=289677=edit
dmesg (kernel 5.8-rc1, PowerMac G5 11,2)

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

[Bug 206203] kmemleak reports various leaks in drivers/of/unittest.c

2020-06-15 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=206203

Erhard F. (erhar...@mailbox.org) changed:

   What|Removed |Added

 Attachment #287671|0   |1
is obsolete||
 Attachment #288565|0   |1
is obsolete||

--- Comment #13 from Erhard F. (erhar...@mailbox.org) ---
Created attachment 289675
  --> https://bugzilla.kernel.org/attachment.cgi?id=289675=edit
kmemleak output (kernel 5.8-rc1, PowerMac G5 11,2)

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

Re: [PATCH net v2] ibmvnic: Harden device login requests

2020-06-15 Thread David Miller
From: Thomas Falcon 
Date: Mon, 15 Jun 2020 10:29:23 -0500

> The VNIC driver's "login" command sequence is the final step
> in the driver's initialization process with device firmware,
> confirming the available device queue resources to be utilized
> by the driver. Under high system load, firmware may not respond
> to the request in a timely manner or may abort the request. In
> such cases, the driver should reattempt the login command
> sequence. In case of a device error, the number of retries
> is bounded.
> 
> Signed-off-by: Thomas Falcon 
> ---
> v2: declare variables in Reverse Christmas tree format

Applied, thanks.


Re: [PATCH v2 12/12] x86/traps: Fix up invalid PASID

2020-06-15 Thread Fenghua Yu
Hi, Peter,

On Mon, Jun 15, 2020 at 09:09:28PM +0200, Peter Zijlstra wrote:
> On Mon, Jun 15, 2020 at 11:55:29AM -0700, Fenghua Yu wrote:
> 
> > Or do you suggest to add a random new flag in struct thread_info instead
> > of a TIF flag?
> 
> Why thread_info? What's wrong with something simple like the below. It
> takes a bit from the 'strictly current' flags word.
> 
> 
> diff --git a/include/linux/sched.h b/include/linux/sched.h
> index b62e6aaf28f0..fca830b97055 100644
> --- a/include/linux/sched.h
> +++ b/include/linux/sched.h
> @@ -801,6 +801,9 @@ struct task_struct {
>   /* Stalled due to lack of memory */
>   unsignedin_memstall:1;
>  #endif
> +#ifdef CONFIG_PCI_PASID
> + unsignedhas_valid_pasid:1;
> +#endif
>  
>   unsigned long   atomic_flags; /* Flags requiring atomic 
> access. */
>  
> diff --git a/kernel/fork.c b/kernel/fork.c
> index 142b23645d82..10b3891be99e 100644
> --- a/kernel/fork.c
> +++ b/kernel/fork.c
> @@ -955,6 +955,10 @@ static struct task_struct *dup_task_struct(struct 
> task_struct *orig, int node)
>   tsk->use_memdelay = 0;
>  #endif
>  
> +#ifdef CONFIG_PCI_PASID
> + tsk->has_valid_pasid = 0;
> +#endif
> +
>  #ifdef CONFIG_MEMCG
>   tsk->active_memcg = NULL;
>  #endif

The PASID MSR is x86 specific although PASID is PCIe concept and per-mm.
Checking if the MSR has valid PASID (bit31=1) is an x86 specifc work.
The flag should be cleared in cloned()/forked() and is only set and
read in fixup() in x86 #GP for heuristic. It's not used anywhere outside
of x86.

That's why we think the flag should be in x86 struct thread_info instead
of in generice struct task_struct.

Please advice.

Thanks.

-Fenghua


Re: [PATCH 2/6] exec: simplify the compat syscall handling

2020-06-15 Thread Brian Gerst
On Mon, Jun 15, 2020 at 2:47 PM Arnd Bergmann  wrote:
>
> On Mon, Jun 15, 2020 at 4:48 PM Brian Gerst  wrote:
> > On Mon, Jun 15, 2020 at 10:13 AM Christoph Hellwig  wrote:
> > > On Mon, Jun 15, 2020 at 03:31:35PM +0200, Arnd Bergmann wrote:
>
> > >
> > > I'd rather keep it in common code as that allows all the low-level
> > > exec stuff to be marked static, and avoid us growing new pointless
> > > compat variants through copy and paste.
> > > smart compiler to d
> > >
> > > > I don't really understand
> > > > the comment, why can't this just use this?
> > >
> > > That errors out with:
> > >
> > > ld: arch/x86/entry/syscall_x32.o:(.rodata+0x1040): undefined reference to
> > > `__x32_sys_execve'
> > > ld: arch/x86/entry/syscall_x32.o:(.rodata+0x1108): undefined reference to
> > > `__x32_sys_execveat'
> > > make: *** [Makefile:1139: vmlinux] Error 1
> >
> > I think I have a fix for this, by modifying the syscall wrappers to
> > add an alias for the __x32 variant to the native __x64_sys_foo().
> > I'll get back to you with a patch.
>
> Do we actually need the __x32 prefix any more, or could we just
> change all x32 specific calls to use __x64_compat_sys_foo()?

I suppose that would work too.  The prefix really describes the
register mapping.

--
Brian Gerst


Re: [PATCH v13 2/6] seq_buf: Export seq_buf_printf

2020-06-15 Thread Dan Williams
On Mon, Jun 15, 2020 at 5:56 AM Borislav Petkov  wrote:
>
> On Mon, Jun 15, 2020 at 06:14:03PM +0530, Vaibhav Jain wrote:
> > 'seq_buf' provides a very useful abstraction for writing to a string
> > buffer without needing to worry about it over-flowing. However even
> > though the API has been stable for couple of years now its still not
> > exported to kernel loadable modules limiting its usage.
> >
> > Hence this patch proposes update to 'seq_buf.c' to mark
> > seq_buf_printf() which is part of the seq_buf API to be exported to
> > kernel loadable GPL modules. This symbol will be used in later parts
> > of this patch-set to simplify content creation for a sysfs attribute.
> >
> > Cc: Piotr Maziarz 
> > Cc: Cezary Rojewski 
> > Cc: Christoph Hellwig 
> > Cc: Steven Rostedt 
> > Cc: Borislav Petkov 
> > Acked-by: Steven Rostedt (VMware) 
> > Signed-off-by: Vaibhav Jain 
> > ---
> > Changelog:
> >
> > v12..v13:
> > * None
> >
> > v11..v12:
> > * None
>
> Can you please resend your patchset once a week like everyone else and
> not flood inboxes with it?

Hi Boris,

I gave Vaibhav some long shot hope that his series could be included
in my libnvdimm pull request for -rc1. Save for a last minute clang
report that I misread as a gcc warning, I likely would have included.
This spin is looking to address the last of the comments I had and
something I would consider for -rc2. So, in this case the resends were
requested by me and I'll take the grumbles on Vaibhav's behalf.


Re: [PATCH v6 0/2] Append new variables to vmcoreinfo (TCR_EL1.T1SZ for arm64 and MAX_PHYSMEM_BITS for all archs)

2020-06-15 Thread Bhupesh Sharma
Hello Catalin, Will,

On Tue, Jun 2, 2020 at 10:54 AM Bhupesh Sharma  wrote:
>
> Hello,
>
> On Thu, May 14, 2020 at 12:22 AM Bhupesh Sharma  wrote:
> >
> > Apologies for the delayed update. Its been quite some time since I
> > posted the last version (v5), but I have been really caught up in some
> > other critical issues.
> >
> > Changes since v5:
> > 
> > - v5 can be viewed here:
> >   http://lists.infradead.org/pipermail/kexec/2019-November/024055.html
> > - Addressed review comments from James Morse and Boris.
> > - Added Tested-by received from John on v5 patchset.
> > - Rebased against arm64 (for-next/ptr-auth) branch which has Amit's
> >   patchset for ARMv8.3-A Pointer Authentication feature vmcoreinfo
> >   applied.
> >
> > Changes since v4:
> > 
> > - v4 can be seen here:
> >   http://lists.infradead.org/pipermail/kexec/2019-November/023961.html
> > - Addressed comments from Dave and added patches for documenting
> >   new variables appended to vmcoreinfo documentation.
> > - Added testing report shared by Akashi for PATCH 2/5.
> >
> > Changes since v3:
> > 
> > - v3 can be seen here:
> >   http://lists.infradead.org/pipermail/kexec/2019-March/022590.html
> > - Addressed comments from James and exported TCR_EL1.T1SZ in vmcoreinfo
> >   instead of PTRS_PER_PGD.
> > - Added a new patch (via [PATCH 3/3]), which fixes a simple typo in
> >   'Documentation/arm64/memory.rst'
> >
> > Changes since v2:
> > 
> > - v2 can be seen here:
> >   http://lists.infradead.org/pipermail/kexec/2019-March/022531.html
> > - Protected 'MAX_PHYSMEM_BITS' vmcoreinfo variable under CONFIG_SPARSEMEM
> >   ifdef sections, as suggested by Kazu.
> > - Updated vmcoreinfo documentation to add description about
> >   'MAX_PHYSMEM_BITS' variable (via [PATCH 3/3]).
> >
> > Changes since v1:
> > 
> > - v1 was sent out as a single patch which can be seen here:
> >   http://lists.infradead.org/pipermail/kexec/2019-February/022411.html
> >
> > - v2 breaks the single patch into two independent patches:
> >   [PATCH 1/2] appends 'PTRS_PER_PGD' to vmcoreinfo for arm64 arch, whereas
> >   [PATCH 2/2] appends 'MAX_PHYSMEM_BITS' to vmcoreinfo in core kernel code 
> > (all archs)
> >
> > This patchset primarily fixes the regression reported in user-space
> > utilities like 'makedumpfile' and 'crash-utility' on arm64 architecture
> > with the availability of 52-bit address space feature in underlying
> > kernel. These regressions have been reported both on CPUs which don't
> > support ARMv8.2 extensions (i.e. LVA, LPA) and are running newer kernels
> > and also on prototype platforms (like ARMv8 FVP simulator model) which
> > support ARMv8.2 extensions and are running newer kernels.
> >
> > The reason for these regressions is that right now user-space tools
> > have no direct access to these values (since these are not exported
> > from the kernel) and hence need to rely on a best-guess method of
> > determining value of 'vabits_actual' and 'MAX_PHYSMEM_BITS' supported
> > by underlying kernel.
> >
> > Exporting these values via vmcoreinfo will help user-land in such cases.
> > In addition, as per suggestion from makedumpfile maintainer (Kazu),
> > it makes more sense to append 'MAX_PHYSMEM_BITS' to
> > vmcoreinfo in the core code itself rather than in arm64 arch-specific
> > code, so that the user-space code for other archs can also benefit from
> > this addition to the vmcoreinfo and use it as a standard way of
> > determining 'SECTIONS_SHIFT' value in user-land.
> >
> > Cc: Boris Petkov 
> > Cc: Ingo Molnar 
> > Cc: Thomas Gleixner 
> > Cc: Jonathan Corbet 
> > Cc: James Morse 
> > Cc: Mark Rutland 
> > Cc: Will Deacon 
> > Cc: Steve Capper 
> > Cc: Catalin Marinas 
> > Cc: Ard Biesheuvel 
> > Cc: Michael Ellerman 
> > Cc: Paul Mackerras 
> > Cc: Benjamin Herrenschmidt 
> > Cc: Dave Anderson 
> > Cc: Kazuhito Hagio 
> > Cc: John Donnelly 
> > Cc: scott.bran...@broadcom.com
> > Cc: Amit Kachhap 
> > Cc: x...@kernel.org
> > Cc: linuxppc-dev@lists.ozlabs.org
> > Cc: linux-arm-ker...@lists.infradead.org
> > Cc: linux-ker...@vger.kernel.org
> > Cc: linux-...@vger.kernel.org
> > Cc: ke...@lists.infradead.org
> >
> > Bhupesh Sharma (2):
> >   crash_core, vmcoreinfo: Append 'MAX_PHYSMEM_BITS' to vmcoreinfo
> >   arm64/crash_core: Export TCR_EL1.T1SZ in vmcoreinfo
> >
> >  Documentation/admin-guide/kdump/vmcoreinfo.rst | 16 
> >  arch/arm64/include/asm/pgtable-hwdef.h |  1 +
> >  arch/arm64/kernel/crash_core.c | 10 ++
> >  kernel/crash_core.c|  1 +
> >  4 files changed, 28 insertions(+)
>
> Ping. @James Morse , Others
>
> Please share if you have some comments regarding this patchset.

Ping. I think we have two Tested-by available from Oracle and Marvell
folks on this patchset and no further review-comments.
Can you please help review/pick this patchset via the arm64 tree?

User-space utilities 

Re: [PATCH v2 12/12] x86/traps: Fix up invalid PASID

2020-06-15 Thread Peter Zijlstra
On Mon, Jun 15, 2020 at 11:55:29AM -0700, Fenghua Yu wrote:

> Or do you suggest to add a random new flag in struct thread_info instead
> of a TIF flag?

Why thread_info? What's wrong with something simple like the below. It
takes a bit from the 'strictly current' flags word.


diff --git a/include/linux/sched.h b/include/linux/sched.h
index b62e6aaf28f0..fca830b97055 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -801,6 +801,9 @@ struct task_struct {
/* Stalled due to lack of memory */
unsignedin_memstall:1;
 #endif
+#ifdef CONFIG_PCI_PASID
+   unsignedhas_valid_pasid:1;
+#endif
 
unsigned long   atomic_flags; /* Flags requiring atomic 
access. */
 
diff --git a/kernel/fork.c b/kernel/fork.c
index 142b23645d82..10b3891be99e 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -955,6 +955,10 @@ static struct task_struct *dup_task_struct(struct 
task_struct *orig, int node)
tsk->use_memdelay = 0;
 #endif
 
+#ifdef CONFIG_PCI_PASID
+   tsk->has_valid_pasid = 0;
+#endif
+
 #ifdef CONFIG_MEMCG
tsk->active_memcg = NULL;
 #endif


Re: [PATCH 00/17] spelling.txt: /decriptors/descriptors/

2020-06-15 Thread Jason Gunthorpe
On Tue, Jun 09, 2020 at 01:45:53PM +0100, Kieran Bingham wrote:
> I wouldn't normally go through spelling fixes, but I caught sight of
> this typo twice, and then foolishly grepped the tree for it, and saw how
> pervasive it was.
> 
> so here I am ... fixing a typo globally... but with an addition in
> scripts/spelling.txt so it shouldn't re-appear ;-)
> 
> Cc: linux-arm-ker...@lists.infradead.org (moderated list:TI DAVINCI MACHINE 
> SUPPORT)
> Cc: linux-ker...@vger.kernel.org (open list)
> Cc: linux...@vger.kernel.org (open list:DEVICE FREQUENCY EVENT 
> (DEVFREQ-EVENT))
> Cc: linux-g...@vger.kernel.org (open list:GPIO SUBSYSTEM)
> Cc: dri-de...@lists.freedesktop.org (open list:DRM DRIVERS)
> Cc: linux-r...@vger.kernel.org (open list:HFI1 DRIVER)
> Cc: linux-in...@vger.kernel.org (open list:INPUT (KEYBOARD, MOUSE, JOYSTICK, 
> TOUCHSCREEN)...)
> Cc: linux-...@lists.infradead.org (open list:NAND FLASH SUBSYSTEM)
> Cc: net...@vger.kernel.org (open list:NETWORKING DRIVERS)
> Cc: ath...@lists.infradead.org (open list:QUALCOMM ATHEROS ATH10K WIRELESS 
> DRIVER)
> Cc: linux-wirel...@vger.kernel.org (open list:NETWORKING DRIVERS (WIRELESS))
> Cc: linux-s...@vger.kernel.org (open list:IBM Power Virtual FC Device Drivers)
> Cc: linuxppc-dev@lists.ozlabs.org (open list:LINUX FOR POWERPC (32-BIT AND 
> 64-BIT))
> Cc: linux-...@vger.kernel.org (open list:USB SUBSYSTEM)
> Cc: virtualizat...@lists.linux-foundation.org (open list:VIRTIO CORE AND NET 
> DRIVERS)
> Cc: linux...@kvack.org (open list:MEMORY MANAGEMENT)
> 
> 
> Kieran Bingham (17):
>   arch: arm: mach-davinci: Fix trivial spelling
>   drivers: infiniband: Fix trivial spelling
>   drivers: infiniband: Fix trivial spelling

I took these two RDMA patches and merged them, thanks

Jason


Re: [PATCH v2 12/12] x86/traps: Fix up invalid PASID

2020-06-15 Thread Fenghua Yu
Hi, Peter,

On Mon, Jun 15, 2020 at 08:31:16PM +0200, Peter Zijlstra wrote:
> On Mon, Jun 15, 2020 at 11:12:59AM -0700, Fenghua Yu wrote:
> > > I don't get why you need a rdmsr here, or why not having one would
> > > require a TIF flag. Is that because this MSR is XSAVE/XRSTOR managed?
> > 
> > My concern is TIF flags are precious (only 3 slots available). Defining
> > a new TIF flag may be not worth it while rdmsr() can check if PASID
> > is valid in the MSR. And performance here might not be a big issue
> > in #GP.
> > 
> > But if you think using TIF flag is better, I can define a new TIF flag
> > and maintain it per thread (init 0 when clone()/fork(), set 1 in fixup()).
> > Then we can avoid using rdmsr() to check valid PASID in the MSR.
> 
> WHY ?!?! What do you need a TIF flag for?

We need "a way" to check if the per thread MSR has a valid PASID. If yes,
no need to fix up the MSR (wrmsr()), and let other handler to handle the #GP.
Otherwise, apply the heuristics and fix up the MSR and exit the #GP.

The way to check the valid PASID in the MSR is rdmsr() in this series.
A TIF flag will be much faster than rdmsr() and seems a sutiable way
to check valid PASID status per thread. That's why it could replace
rdmsr() to check PASID in the MSR.

Or do you suggest to add a random new flag in struct thread_info instead
of a TIF flag?

Thanks.

-Fenghua



Re: [PATCH 2/6] exec: simplify the compat syscall handling

2020-06-15 Thread Arnd Bergmann
On Mon, Jun 15, 2020 at 4:48 PM Brian Gerst  wrote:
> On Mon, Jun 15, 2020 at 10:13 AM Christoph Hellwig  wrote:
> > On Mon, Jun 15, 2020 at 03:31:35PM +0200, Arnd Bergmann wrote:

> >
> > I'd rather keep it in common code as that allows all the low-level
> > exec stuff to be marked static, and avoid us growing new pointless
> > compat variants through copy and paste.
> > smart compiler to d
> >
> > > I don't really understand
> > > the comment, why can't this just use this?
> >
> > That errors out with:
> >
> > ld: arch/x86/entry/syscall_x32.o:(.rodata+0x1040): undefined reference to
> > `__x32_sys_execve'
> > ld: arch/x86/entry/syscall_x32.o:(.rodata+0x1108): undefined reference to
> > `__x32_sys_execveat'
> > make: *** [Makefile:1139: vmlinux] Error 1
>
> I think I have a fix for this, by modifying the syscall wrappers to
> add an alias for the __x32 variant to the native __x64_sys_foo().
> I'll get back to you with a patch.

Do we actually need the __x32 prefix any more, or could we just
change all x32 specific calls to use __x64_compat_sys_foo()?

  Arnd


Re: [PATCH 1/2] mm, treewide: Rename kzfree() to kfree_sensitive()

2020-06-15 Thread Waiman Long

On 6/15/20 2:07 PM, Dan Carpenter wrote:

On Mon, Apr 13, 2020 at 05:15:49PM -0400, Waiman Long wrote:

diff --git a/mm/slab_common.c b/mm/slab_common.c
index 23c7500eea7d..c08bc7eb20bd 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -1707,17 +1707,17 @@ void *krealloc(const void *p, size_t new_size, gfp_t 
flags)
  EXPORT_SYMBOL(krealloc);
  
  /**

- * kzfree - like kfree but zero memory
+ * kfree_sensitive - Clear sensitive information in memory before freeing
   * @p: object to free memory of
   *
   * The memory of the object @p points to is zeroed before freed.
- * If @p is %NULL, kzfree() does nothing.
+ * If @p is %NULL, kfree_sensitive() does nothing.
   *
   * Note: this function zeroes the whole allocated buffer which can be a good
   * deal bigger than the requested buffer size passed to kmalloc(). So be
   * careful when using this function in performance sensitive code.
   */
-void kzfree(const void *p)
+void kfree_sensitive(const void *p)
  {
size_t ks;
void *mem = (void *)p;
@@ -1725,10 +1725,10 @@ void kzfree(const void *p)
if (unlikely(ZERO_OR_NULL_PTR(mem)))
return;
ks = ksize(mem);
-   memset(mem, 0, ks);
+   memzero_explicit(mem, ks);

 ^
This is an unrelated bug fix.  It really needs to be pulled into a
separate patch by itself and back ported to stable kernels.


kfree(mem);
  }
-EXPORT_SYMBOL(kzfree);
+EXPORT_SYMBOL(kfree_sensitive);
  
  /**

   * ksize - get the actual amount of memory allocated for a given object

regards,
dan carpenter


Thanks for the suggestion. I will break it out and post a version soon.

Cheers,
Longman



Re: [PATCH v2 12/12] x86/traps: Fix up invalid PASID

2020-06-15 Thread Peter Zijlstra
On Mon, Jun 15, 2020 at 11:19:21AM -0700, Raj, Ashok wrote:
> On Mon, Jun 15, 2020 at 06:03:57PM +0200, Peter Zijlstra wrote:
> > 
> > I don't get why you need a rdmsr here, or why not having one would
> > require a TIF flag. Is that because this MSR is XSAVE/XRSTOR managed?
> > 
> > > > > +  */
> > > > > + rdmsrl(MSR_IA32_PASID, pasid_msr);
> > > > > + if (pasid_msr & MSR_IA32_PASID_VALID)
> > > > > + return false;
> > > > > +
> > > > > + /* Fix up the MSR if the MSR doesn't have a valid PASID. */
> > > > > + wrmsrl(MSR_IA32_PASID, pasid | MSR_IA32_PASID_VALID);
> > 
> > How much more expensive is the wrmsr over the rdmsr? Can't we just
> > unconditionally write the current PASID and call it a day?
> 
> The reason to check the rdmsr() is because we are using a hueristic taking
> GP faults. If we already setup the MSR, but we get it a second time it
> means the reason is something other than PASID_MSR not being set.
> 
> Ideally we should use the TIF_ to track this would be cheaper, but we were
> told those bits aren't easy to give out. 

Why do you need a TIF flag? Why not any other random flag in current?


Re: [PATCH v2 12/12] x86/traps: Fix up invalid PASID

2020-06-15 Thread Peter Zijlstra
On Mon, Jun 15, 2020 at 11:12:59AM -0700, Fenghua Yu wrote:
> > I don't get why you need a rdmsr here, or why not having one would
> > require a TIF flag. Is that because this MSR is XSAVE/XRSTOR managed?
> 
> My concern is TIF flags are precious (only 3 slots available). Defining
> a new TIF flag may be not worth it while rdmsr() can check if PASID
> is valid in the MSR. And performance here might not be a big issue
> in #GP.
> 
> But if you think using TIF flag is better, I can define a new TIF flag
> and maintain it per thread (init 0 when clone()/fork(), set 1 in fixup()).
> Then we can avoid using rdmsr() to check valid PASID in the MSR.

WHY ?!?! What do you need a TIF flag for?


Re: [PATCH v2 12/12] x86/traps: Fix up invalid PASID

2020-06-15 Thread Fenghua Yu
On Mon, Jun 15, 2020 at 06:03:57PM +0200, Peter Zijlstra wrote:
> On Mon, Jun 15, 2020 at 08:48:54AM -0700, Fenghua Yu wrote:
> > Hi, Peter,
> > On Mon, Jun 15, 2020 at 09:56:49AM +0200, Peter Zijlstra wrote:
> > > On Fri, Jun 12, 2020 at 05:41:33PM -0700, Fenghua Yu wrote:
> > > > +/*
> > > > + * Apply some heuristics to see if the #GP fault was caused by a thread
> > > > + * that hasn't had the IA32_PASID MSR initialized.  If it looks like 
> > > > that
> > > > + * is the problem, try initializing the IA32_PASID MSR. If the 
> > > > heuristic
> > > > + * guesses incorrectly, take one more #GP fault.
> > > 
> > > How is that going to help? Aren't we then going to run this same
> > > heuristic again and again and again?
> > 
> > The heuristic always initializes the MSR with the per mm PASID IIF the
> > mm has a valid PASID but the MSR doesn't have one. This heuristic usually
> > happens only once on the first #GP in a thread.
> 
> But it doesn't guarantee the PASID is the right one. Suppose both the mm
> has a PASID and the MSR has a VALID one, but the MSR isn't the mm one.
> Then we NO-OP. So if the exception was due to us having the wrong PASID,
> we stuck.

The MSR for each thread was cleared during fork() and clone(). The PASID
was cleared during mm_init(). The per-mm PASID was assigned when fist
bind_mm() is called and remains the same value until process exit().

The MSR is only fixed up when the first ENQCMD is executed in a thread:
bit 31 in the MSR is 0 and the PASID in the mm is non-zero.

The MSR remains the same PASID value once it's fixed up until the thread
exits.

So the work flow ensures the PASID goes from mm's PASID to the MSR.

The PASID could be unbund from the mm. In this case, iommu will generate
#PF and kernel oops instead of #GP.

> 
> > If the next #GP still comes in, the heuristic finds out the MSR already
> > has a valid PASID and thus will not fixup the MSR any more. The fixup()
> > returns "false" and lets others to handle the #GP.
> > 
> > So the heuristic will be executed once (at most) and won't be executed
> > again and again.
> 
> So I get that you want to set the PASID on-demand, but I find this all
> really weird code to make that happen.

We could keep PASID same in all threads sychronously by propogating the MSRs
when the PASID is bound to the mm via IPIs or taskworks to all
threads in the process. But the code is complex and error-prone and
overhead could be high:
1. The user can call driver to do binding and unbinding multiple times.
   The IPIs or taskworks will be sent multiple times to make sure only
   the last IPIs or taskworks take action.
2. Even if a thread never executes ENQCMD and thus never uses the MSR,
   the MSR still needs to be updated whenever bind_mm() and needs to be
   context switched each time. This could cause high overhead.

Setting the PASID on-demand is simpler and cleaner and was recommended
by Thomas.

> 
> > > > +bool __fixup_pasid_exception(void)
> > > > +{
> > > > +   u64 pasid_msr;
> > > > +   unsigned int pasid;
> > > > +
> > > > +   /*
> > > > +* This function is called only when this #GP was triggered 
> > > > from user
> > > > +* space. So the mm cannot be NULL.
> > > > +*/
> > > > +   pasid = current->mm->pasid;
> > > > +   /* If the mm doesn't have a valid PASID, then can't help. */
> > > > +   if (invalid_pasid(pasid))
> > > > +   return false;
> > > > +
> > > > +   /*
> > > > +* Since IRQ is disabled now, the current task still owns the 
> > > > FPU on
> > > 
> > > That's just weird and confusing. What you want to say is that you rely
> > > on the exception disabling the interrupt.
> > 
> > I checked SDM again. You are right. #GP can be interrupted by machine check
> > or other interrupts. So I cannot assume the current task still owns the FPU.
> > Instead of directly rdmsr() and wrmsr(), I will add helpers that can access
> > either the MSR on the processor or the PASID state in the memory.
> 
> That's not in fact what I meant, but yes, you can take exceptions while
> !IF just fine.
> 
> > > > +* this CPU and the PASID MSR can be directly accessed.
> > > > +*
> > > > +* If the MSR has a valid PASID, the #GP must be for some other 
> > > > reason.
> > > > +*
> > > > +* If rdmsr() is really a performance issue, a TIF_ flag may be
> > > > +* added to check if the thread has a valid PASID instead of 
> > > > rdmsr().
> > > 
> > > I don't understand any of this. Nobody except us writes to this MSR, we
> > > should bloody well know what's in it. What gives?
> > 
> > Patch 4 describes how to manage the MSR and patch 7 describes the format
> > of the MSR (20-bit PASID value and bit 31 is valid bit).
> > 
> > Are they sufficient to help? Or do you mean something else?
> 
> I don't get why you need a rdmsr here, or why not having one would
> require a TIF flag. Is that because this MSR is XSAVE/XRSTOR managed?

My 

Re: [PATCH 1/2] mm, treewide: Rename kzfree() to kfree_sensitive()

2020-06-15 Thread Dan Carpenter
On Mon, Apr 13, 2020 at 05:15:49PM -0400, Waiman Long wrote:
> diff --git a/mm/slab_common.c b/mm/slab_common.c
> index 23c7500eea7d..c08bc7eb20bd 100644
> --- a/mm/slab_common.c
> +++ b/mm/slab_common.c
> @@ -1707,17 +1707,17 @@ void *krealloc(const void *p, size_t new_size, gfp_t 
> flags)
>  EXPORT_SYMBOL(krealloc);
>  
>  /**
> - * kzfree - like kfree but zero memory
> + * kfree_sensitive - Clear sensitive information in memory before freeing
>   * @p: object to free memory of
>   *
>   * The memory of the object @p points to is zeroed before freed.
> - * If @p is %NULL, kzfree() does nothing.
> + * If @p is %NULL, kfree_sensitive() does nothing.
>   *
>   * Note: this function zeroes the whole allocated buffer which can be a good
>   * deal bigger than the requested buffer size passed to kmalloc(). So be
>   * careful when using this function in performance sensitive code.
>   */
> -void kzfree(const void *p)
> +void kfree_sensitive(const void *p)
>  {
>   size_t ks;
>   void *mem = (void *)p;
> @@ -1725,10 +1725,10 @@ void kzfree(const void *p)
>   if (unlikely(ZERO_OR_NULL_PTR(mem)))
>   return;
>   ks = ksize(mem);
> - memset(mem, 0, ks);
> + memzero_explicit(mem, ks);
^
This is an unrelated bug fix.  It really needs to be pulled into a
separate patch by itself and back ported to stable kernels.

>   kfree(mem);
>  }
> -EXPORT_SYMBOL(kzfree);
> +EXPORT_SYMBOL(kfree_sensitive);
>  
>  /**
>   * ksize - get the actual amount of memory allocated for a given object

regards,
dan carpenter


RE: [PATCH v2 12/12] x86/traps: Fix up invalid PASID

2020-06-15 Thread Luck, Tony
>> The heuristic always initializes the MSR with the per mm PASID IIF the
>> mm has a valid PASID but the MSR doesn't have one. This heuristic usually
>> happens only once on the first #GP in a thread.
>
> But it doesn't guarantee the PASID is the right one. Suppose both the mm
> has a PASID and the MSR has a VALID one, but the MSR isn't the mm one.
> Then we NO-OP. So if the exception was due to us having the wrong PASID,
> we stuck.

ENQCMD only checks the 'valid' bit of the IA32_PASID MSR to decide whether
to #GP or not.  H/W has no concept of the "right" pasid value.

If IA32_PASID is valid with the wrong value ... then the system is about to
see some major corruption because the operations in the accelerator are
not going to translate to the physical addresses for pages owned by the process
that issued the ENQCMD.

-Tony


Re: [PATCH v1 4/4] KVM: PPC: Book3S HV: migrate hot plugged memory

2020-06-15 Thread Laurent Dufour

Le 31/05/2020 à 04:27, Ram Pai a écrit :

From: Laurent Dufour 

When a memory slot is hot plugged to a SVM, GFNs associated with that
memory slot automatically default to secure GFN. Hence migrate the
PFNs associated with these GFNs to device-PFNs.

uv_migrate_mem_slot() is called to achieve that. It will not call
UV_PAGE_IN since this request is ignored by the Ultravisor.
NOTE: Ultravisor does not trust any page content provided by
the Hypervisor, ones the VM turns secure.

Cc: Paul Mackerras 
Cc: Benjamin Herrenschmidt 
Cc: Michael Ellerman 
Cc: Bharata B Rao 
Cc: Aneesh Kumar K.V 
Cc: Sukadev Bhattiprolu 
Cc: Laurent Dufour 
Cc: Thiago Jung Bauermann 
Cc: David Gibson 
Cc: Claudio Carvalho 
Cc: kvm-...@vger.kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Ram Pai 
(fixed merge conflicts. Modified the commit message)
Signed-off-by: Laurent Dufour 
---
  arch/powerpc/include/asm/kvm_book3s_uvmem.h |  4 
  arch/powerpc/kvm/book3s_hv.c| 11 +++
  arch/powerpc/kvm/book3s_hv_uvmem.c  |  3 +--
  3 files changed, 12 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_book3s_uvmem.h 
b/arch/powerpc/include/asm/kvm_book3s_uvmem.h
index f0c5708..2ec2e5afb 100644
--- a/arch/powerpc/include/asm/kvm_book3s_uvmem.h
+++ b/arch/powerpc/include/asm/kvm_book3s_uvmem.h
@@ -23,6 +23,7 @@ unsigned long kvmppc_h_svm_page_out(struct kvm *kvm,
  void kvmppc_uvmem_drop_pages(const struct kvm_memory_slot *free,
 struct kvm *kvm, bool skip_page_out,
 bool purge_gfn);
+int uv_migrate_mem_slot(struct kvm *kvm, const struct kvm_memory_slot 
*memslot);
  #else
  static inline int kvmppc_uvmem_init(void)
  {
@@ -78,5 +79,8 @@ static inline int kvmppc_send_page_to_uv(struct kvm *kvm, 
unsigned long gfn)
  kvmppc_uvmem_drop_pages(const struct kvm_memory_slot *free,
struct kvm *kvm, bool skip_page_out,
bool purge_gfn) { }
+
+static int uv_migrate_mem_slot(struct kvm *kvm,
+   const struct kvm_memory_slot *memslot);
  #endif /* CONFIG_PPC_UV */
  #endif /* __ASM_KVM_BOOK3S_UVMEM_H__ */
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index 4c62bfe..604d062 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -4516,13 +4516,16 @@ static void kvmppc_core_commit_memory_region_hv(struct 
kvm *kvm,
case KVM_MR_CREATE:
if (kvmppc_uvmem_slot_init(kvm, new))
return;
-   uv_register_mem_slot(kvm->arch.lpid,
-new->base_gfn << PAGE_SHIFT,
-new->npages * PAGE_SIZE,
-0, new->id);
+   if (uv_register_mem_slot(kvm->arch.lpid,
+new->base_gfn << PAGE_SHIFT,
+new->npages * PAGE_SIZE,
+0, new->id))
+   return;
+   uv_migrate_mem_slot(kvm, new);
break;
case KVM_MR_DELETE:
uv_unregister_mem_slot(kvm->arch.lpid, old->id);
+   kvmppc_uvmem_drop_pages(old, kvm, true, true);


My mistake, kvmppc_radix_flush_memslot() called just before is already 
triggering the call to kvmppc_uvmem_drop_pages(), so that call is useless.


You should remove it in your v2.


kvmppc_uvmem_slot_free(kvm, old);
break;
default:
diff --git a/arch/powerpc/kvm/book3s_hv_uvmem.c 
b/arch/powerpc/kvm/book3s_hv_uvmem.c
index 36dda1d..1fa5f2a 100644
--- a/arch/powerpc/kvm/book3s_hv_uvmem.c
+++ b/arch/powerpc/kvm/book3s_hv_uvmem.c
@@ -377,8 +377,7 @@ static int kvmppc_svm_migrate_page(struct vm_area_struct 
*vma,
return ret;
  }
  
-static int uv_migrate_mem_slot(struct kvm *kvm,

-   const struct kvm_memory_slot *memslot)
+int uv_migrate_mem_slot(struct kvm *kvm, const struct kvm_memory_slot *memslot)
  {
unsigned long gfn = memslot->base_gfn;
unsigned long end;





Re: [PATCH 2/6] exec: simplify the compat syscall handling

2020-06-15 Thread Christoph Hellwig
On Mon, Jun 15, 2020 at 11:33:49AM -0400, Brian Gerst wrote:
> If you move those aliases above all the __SYSCALL_* defines it will
> work, since that will get the forward declaration too.  This would be
> the simplest workaround.

That compiles and also passes my exaustive x32 tests (chroot + ls -l).

This is the updated version:

http://git.infradead.org/users/hch/misc.git/commitdiff/c8d319711ad2f53be003ae8e9be08519068bdcee


Re: [PATCH v2 12/12] x86/traps: Fix up invalid PASID

2020-06-15 Thread Peter Zijlstra
On Mon, Jun 15, 2020 at 08:48:54AM -0700, Fenghua Yu wrote:
> Hi, Peter,
> On Mon, Jun 15, 2020 at 09:56:49AM +0200, Peter Zijlstra wrote:
> > On Fri, Jun 12, 2020 at 05:41:33PM -0700, Fenghua Yu wrote:
> > > +/*
> > > + * Apply some heuristics to see if the #GP fault was caused by a thread
> > > + * that hasn't had the IA32_PASID MSR initialized.  If it looks like that
> > > + * is the problem, try initializing the IA32_PASID MSR. If the heuristic
> > > + * guesses incorrectly, take one more #GP fault.
> > 
> > How is that going to help? Aren't we then going to run this same
> > heuristic again and again and again?
> 
> The heuristic always initializes the MSR with the per mm PASID IIF the
> mm has a valid PASID but the MSR doesn't have one. This heuristic usually
> happens only once on the first #GP in a thread.

But it doesn't guarantee the PASID is the right one. Suppose both the mm
has a PASID and the MSR has a VALID one, but the MSR isn't the mm one.
Then we NO-OP. So if the exception was due to us having the wrong PASID,
we stuck.

> If the next #GP still comes in, the heuristic finds out the MSR already
> has a valid PASID and thus will not fixup the MSR any more. The fixup()
> returns "false" and lets others to handle the #GP.
> 
> So the heuristic will be executed once (at most) and won't be executed
> again and again.

So I get that you want to set the PASID on-demand, but I find this all
really weird code to make that happen.

> > > +bool __fixup_pasid_exception(void)
> > > +{
> > > + u64 pasid_msr;
> > > + unsigned int pasid;
> > > +
> > > + /*
> > > +  * This function is called only when this #GP was triggered from user
> > > +  * space. So the mm cannot be NULL.
> > > +  */
> > > + pasid = current->mm->pasid;
> > > + /* If the mm doesn't have a valid PASID, then can't help. */
> > > + if (invalid_pasid(pasid))
> > > + return false;
> > > +
> > > + /*
> > > +  * Since IRQ is disabled now, the current task still owns the FPU on
> > 
> > That's just weird and confusing. What you want to say is that you rely
> > on the exception disabling the interrupt.
> 
> I checked SDM again. You are right. #GP can be interrupted by machine check
> or other interrupts. So I cannot assume the current task still owns the FPU.
> Instead of directly rdmsr() and wrmsr(), I will add helpers that can access
> either the MSR on the processor or the PASID state in the memory.

That's not in fact what I meant, but yes, you can take exceptions while
!IF just fine.

> > > +  * this CPU and the PASID MSR can be directly accessed.
> > > +  *
> > > +  * If the MSR has a valid PASID, the #GP must be for some other reason.
> > > +  *
> > > +  * If rdmsr() is really a performance issue, a TIF_ flag may be
> > > +  * added to check if the thread has a valid PASID instead of rdmsr().
> > 
> > I don't understand any of this. Nobody except us writes to this MSR, we
> > should bloody well know what's in it. What gives?
> 
> Patch 4 describes how to manage the MSR and patch 7 describes the format
> of the MSR (20-bit PASID value and bit 31 is valid bit).
> 
> Are they sufficient to help? Or do you mean something else?

I don't get why you need a rdmsr here, or why not having one would
require a TIF flag. Is that because this MSR is XSAVE/XRSTOR managed?

> > > +  */
> > > + rdmsrl(MSR_IA32_PASID, pasid_msr);
> > > + if (pasid_msr & MSR_IA32_PASID_VALID)
> > > + return false;
> > > +
> > > + /* Fix up the MSR if the MSR doesn't have a valid PASID. */
> > > + wrmsrl(MSR_IA32_PASID, pasid | MSR_IA32_PASID_VALID);

How much more expensive is the wrmsr over the rdmsr? Can't we just
unconditionally write the current PASID and call it a day?


Re: [PATCH v2 12/12] x86/traps: Fix up invalid PASID

2020-06-15 Thread Fenghua Yu
Hi, Peter,
On Mon, Jun 15, 2020 at 09:56:49AM +0200, Peter Zijlstra wrote:
> On Fri, Jun 12, 2020 at 05:41:33PM -0700, Fenghua Yu wrote:
> > +/*
> > + * Apply some heuristics to see if the #GP fault was caused by a thread
> > + * that hasn't had the IA32_PASID MSR initialized.  If it looks like that
> > + * is the problem, try initializing the IA32_PASID MSR. If the heuristic
> > + * guesses incorrectly, take one more #GP fault.
> 
> How is that going to help? Aren't we then going to run this same
> heuristic again and again and again?

The heuristic always initializes the MSR with the per mm PASID IIF the
mm has a valid PASID but the MSR doesn't have one. This heuristic usually
happens only once on the first #GP in a thread.

If the next #GP still comes in, the heuristic finds out the MSR already
has a valid PASID and thus will not fixup the MSR any more. The fixup()
returns "false" and lets others to handle the #GP.

So the heuristic will be executed once (at most) and won't be executed
again and again.

> 
> > + */
> > +bool __fixup_pasid_exception(void)
> > +{
> > +   u64 pasid_msr;
> > +   unsigned int pasid;
> > +
> > +   /*
> > +* This function is called only when this #GP was triggered from user
> > +* space. So the mm cannot be NULL.
> > +*/
> > +   pasid = current->mm->pasid;
> > +   /* If the mm doesn't have a valid PASID, then can't help. */
> > +   if (invalid_pasid(pasid))
> > +   return false;
> > +
> > +   /*
> > +* Since IRQ is disabled now, the current task still owns the FPU on
> 
> That's just weird and confusing. What you want to say is that you rely
> on the exception disabling the interrupt.

I checked SDM again. You are right. #GP can be interrupted by machine check
or other interrupts. So I cannot assume the current task still owns the FPU.
Instead of directly rdmsr() and wrmsr(), I will add helpers that can access
either the MSR on the processor or the PASID state in the memory.

> 
> > +* this CPU and the PASID MSR can be directly accessed.
> > +*
> > +* If the MSR has a valid PASID, the #GP must be for some other reason.
> > +*
> > +* If rdmsr() is really a performance issue, a TIF_ flag may be
> > +* added to check if the thread has a valid PASID instead of rdmsr().
> 
> I don't understand any of this. Nobody except us writes to this MSR, we
> should bloody well know what's in it. What gives?

Patch 4 describes how to manage the MSR and patch 7 describes the format
of the MSR (20-bit PASID value and bit 31 is valid bit).

Are they sufficient to help? Or do you mean something else?

> > +*/
> > +   rdmsrl(MSR_IA32_PASID, pasid_msr);
> > +   if (pasid_msr & MSR_IA32_PASID_VALID)
> > +   return false;
> > +
> > +   /* Fix up the MSR if the MSR doesn't have a valid PASID. */
> > +   wrmsrl(MSR_IA32_PASID, pasid | MSR_IA32_PASID_VALID);
> > +
> > +   return true;
> > +}
> > -- 
> > 2.19.1
> > 

Thanks.

-Fenghua


Re: [PATCH 2/6] exec: simplify the compat syscall handling

2020-06-15 Thread Brian Gerst
On Mon, Jun 15, 2020 at 11:10 AM Christoph Hellwig  wrote:
>
> On Mon, Jun 15, 2020 at 04:46:15PM +0200, Arnd Bergmann wrote:
> > How about this one:
> >
> > diff --git a/arch/x86/entry/syscall_x32.c b/arch/x86/entry/syscall_x32.c
> > index 3d8d70d3896c..0ce15807cf54 100644
> > --- a/arch/x86/entry/syscall_x32.c
> > +++ b/arch/x86/entry/syscall_x32.c
> > @@ -16,6 +16,9 @@
> >  #undef __SYSCALL_X32
> >  #undef __SYSCALL_COMMON
> >
> > +#define __x32_sys_execve __x64_sys_execve
> > +#define __x32_sys_execveat __x64_sys_execveat
> > +
>
>
> arch/x86/entry/syscall_x32.c:19:26: error: ‘__x64_sys_execve’ undeclared here 
> (not in a function); did you mean ‘__x32_sys_execve’?
>19 | #define __x32_sys_execve __x64_sys_execve
>   |  ^~~~
> arch/x86/entry/syscall_x32.c:22:39: note: in expansion of macro 
> ‘__x32_sys_execve’
>22 | #define __SYSCALL_X32(nr, sym) [nr] = __x32_##sym,
>   |   ^~
> ./arch/x86/include/generated/asm/syscalls_64.h:344:1: note: in expansion of 
> macro ‘__SYSCALL_X32’
>   344 | __SYSCALL_X32(520, sys_execve)
>   | ^
> arch/x86/entry/syscall_x32.c:20:28: error: ‘__x64_sys_execveat’ undeclared 
> here (not in a function); did you mean ‘__x32_sys_execveat’?
>20 | #define __x32_sys_execveat __x64_sys_execveat
>   |^~
> arch/x86/entry/syscall_x32.c:22:39: note: in expansion of macro 
> ‘__x32_sys_execveat’
>22 | #define __SYSCALL_X32(nr, sym) [nr] = __x32_##sym,
>   |   ^~
> ./arch/x86/include/generated/asm/syscalls_64.h:369:1: note: in expansion of 
> macro ‘__SYSCALL_X32’
>   369 | __SYSCALL_X32(545, sys_execveat)
>   | ^
> make[2]: *** [scripts/Makefile.build:281: arch/x86/entry/syscall_x32.o] Error 
> 1
> make[1]: *** [scripts/Makefile.build:497: arch/x86/entry] Error 2
> make[1]: *** Waiting for unfinished jobs
> kernel/exit.o: warning: objtool: __x64_sys_exit_group()+0x14: unreachable 
> instruction
> make: *** [Makefile:1764: arch/x86] Error 2
> make: *** Waiting for unfinished jobs

If you move those aliases above all the __SYSCALL_* defines it will
work, since that will get the forward declaration too.  This would be
the simplest workaround.

--
Brian Gerst


[PATCH net v2] ibmvnic: Harden device login requests

2020-06-15 Thread Thomas Falcon
The VNIC driver's "login" command sequence is the final step
in the driver's initialization process with device firmware,
confirming the available device queue resources to be utilized
by the driver. Under high system load, firmware may not respond
to the request in a timely manner or may abort the request. In
such cases, the driver should reattempt the login command
sequence. In case of a device error, the number of retries
is bounded.

Signed-off-by: Thomas Falcon 
---
v2: declare variables in Reverse Christmas tree format
---
 drivers/net/ethernet/ibm/ibmvnic.c | 21 +
 1 file changed, 17 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/ibm/ibmvnic.c 
b/drivers/net/ethernet/ibm/ibmvnic.c
index 1b4d04e..2baf7b3 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.c
+++ b/drivers/net/ethernet/ibm/ibmvnic.c
@@ -842,12 +842,13 @@ static int ibmvnic_login(struct net_device *netdev)
struct ibmvnic_adapter *adapter = netdev_priv(netdev);
unsigned long timeout = msecs_to_jiffies(3);
int retry_count = 0;
+   int retries = 10;
bool retry;
int rc;
 
do {
retry = false;
-   if (retry_count > IBMVNIC_MAX_QUEUES) {
+   if (retry_count > retries) {
netdev_warn(netdev, "Login attempts exceeded\n");
return -1;
}
@@ -862,11 +863,23 @@ static int ibmvnic_login(struct net_device *netdev)
 
if (!wait_for_completion_timeout(>init_done,
 timeout)) {
-   netdev_warn(netdev, "Login timed out\n");
-   return -1;
+   netdev_warn(netdev, "Login timed out, retrying...\n");
+   retry = true;
+   adapter->init_done_rc = 0;
+   retry_count++;
+   continue;
}
 
-   if (adapter->init_done_rc == PARTIALSUCCESS) {
+   if (adapter->init_done_rc == ABORTED) {
+   netdev_warn(netdev, "Login aborted, retrying...\n");
+   retry = true;
+   adapter->init_done_rc = 0;
+   retry_count++;
+   /* FW or device may be busy, so
+* wait a bit before retrying login
+*/
+   msleep(500);
+   } else if (adapter->init_done_rc == PARTIALSUCCESS) {
retry_count++;
release_sub_crqs(adapter, 1);
 
-- 
1.8.3.1



Re: [PATCH] scsi: target/sbp: remove firewire SBP target driver

2020-06-15 Thread Chris Boot
On 15/06/2020 00:28, Finn Thain wrote:
> On Sun, 14 Jun 2020, Chris Boot wrote:
> 
>> I expect that if someone finds this useful it can stick around (but 
>> that's not my call).
> 
> Who's call is that? If the patch had said "From: Martin K. Petersen" and 
> "This driver is being removed because it has the following defects..." 
> that would be some indication of a good-faith willingness to accept users 
> as developers in the spirit of the GPL, which is what you seem to be 
> alluding to (?).

If you're asking me, I'd say it was martin's call:

> SCSI TARGET SUBSYSTEM 
>  
> M:  "Martin K. Petersen"  
>  
[...]
> F:  drivers/target/   
>  
> F:  include/target/   
>  

>> I just don't have the time or inclination or hardware to be able to 
>> maintain it anymore, so someone else would have to pick it up.
>>
> 
> Which is why most drivers get orphaned, right?

Sure, but that's not what Martin asked me to do, hence this patch.

-- 
Chris Boot
bo...@boo.tc


Re: [PATCH 2/6] exec: simplify the compat syscall handling

2020-06-15 Thread Christoph Hellwig
On Mon, Jun 15, 2020 at 04:46:15PM +0200, Arnd Bergmann wrote:
> How about this one:
> 
> diff --git a/arch/x86/entry/syscall_x32.c b/arch/x86/entry/syscall_x32.c
> index 3d8d70d3896c..0ce15807cf54 100644
> --- a/arch/x86/entry/syscall_x32.c
> +++ b/arch/x86/entry/syscall_x32.c
> @@ -16,6 +16,9 @@
>  #undef __SYSCALL_X32
>  #undef __SYSCALL_COMMON
> 
> +#define __x32_sys_execve __x64_sys_execve
> +#define __x32_sys_execveat __x64_sys_execveat
> +


arch/x86/entry/syscall_x32.c:19:26: error: ‘__x64_sys_execve’ undeclared here 
(not in a function); did you mean ‘__x32_sys_execve’?
   19 | #define __x32_sys_execve __x64_sys_execve
  |  ^~~~
arch/x86/entry/syscall_x32.c:22:39: note: in expansion of macro 
‘__x32_sys_execve’
   22 | #define __SYSCALL_X32(nr, sym) [nr] = __x32_##sym,
  |   ^~
./arch/x86/include/generated/asm/syscalls_64.h:344:1: note: in expansion of 
macro ‘__SYSCALL_X32’
  344 | __SYSCALL_X32(520, sys_execve)
  | ^
arch/x86/entry/syscall_x32.c:20:28: error: ‘__x64_sys_execveat’ undeclared here 
(not in a function); did you mean ‘__x32_sys_execveat’?
   20 | #define __x32_sys_execveat __x64_sys_execveat
  |^~
arch/x86/entry/syscall_x32.c:22:39: note: in expansion of macro 
‘__x32_sys_execveat’
   22 | #define __SYSCALL_X32(nr, sym) [nr] = __x32_##sym,
  |   ^~
./arch/x86/include/generated/asm/syscalls_64.h:369:1: note: in expansion of 
macro ‘__SYSCALL_X32’
  369 | __SYSCALL_X32(545, sys_execveat)
  | ^
make[2]: *** [scripts/Makefile.build:281: arch/x86/entry/syscall_x32.o] Error 1
make[1]: *** [scripts/Makefile.build:497: arch/x86/entry] Error 2
make[1]: *** Waiting for unfinished jobs
kernel/exit.o: warning: objtool: __x64_sys_exit_group()+0x14: unreachable 
instruction
make: *** [Makefile:1764: arch/x86] Error 2
make: *** Waiting for unfinished jobs


Re: [PATCH] powerpc/fsl_booke/32: fix build with CONFIG_RANDOMIZE_BASE

2020-06-15 Thread Scott Wood
On Sat, 2020-06-13 at 23:28 +0700, Arseny Solokha wrote:
> Building the current 5.8 kernel for a e500 machine with
> CONFIG_RANDOMIZE_BASE set yields the following failure:
> 
>   arch/powerpc/mm/nohash/kaslr_booke.c: In function 'kaslr_early_init':
>   arch/powerpc/mm/nohash/kaslr_booke.c:387:2: error: implicit declaration
> of function 'flush_icache_range'; did you mean 'flush_tlb_range'?
> [-Werror=implicit-function-declaration]
> 
> Indeed, including asm/cacheflush.h into kaslr_booke.c fixes the build.
> 
> The issue dates back to the introduction of that file and probably went
> unnoticed because there's no in-tree defconfig with CONFIG_RANDOMIZE_BASE
> set.
> 
> Fixes: 2b0e86cc5de6 ("powerpc/fsl_booke/32: implement KASLR infrastructure")
> Cc: sta...@vger.kernel.org
> Signed-off-by: Arseny Solokha 
> ---
>  arch/powerpc/mm/nohash/kaslr_booke.c | 1 +
>  1 file changed, 1 insertion(+)

Acked-by: Scott Wood 

-Scott




Re: [PATCH v2 00/12] x86: tag application address space for devices

2020-06-15 Thread Fenghua Yu
Hi, Peter,
On Mon, Jun 15, 2020 at 09:52:02AM +0200, Peter Zijlstra wrote:
> On Fri, Jun 12, 2020 at 05:41:21PM -0700, Fenghua Yu wrote:
> 
> > This series only provides simple and basic support for ENQCMD and the MSR:
> > 1. Clean up type definitions (patch 1-3). These patches can be in a
> >separate series.
> >- Define "pasid" as "unsigned int" consistently (patch 1 and 2).
> >- Define "flags" as "unsigned int"
> > 2. Explain different various technical terms used in the series (patch 4).
> > 3. Enumerate support for ENQCMD in the processor (patch 5).
> > 4. Handle FPU PASID state and the MSR during context switch (patches 6-7).
> > 5. Define "pasid" in mm_struct (patch 8).
> > 5. Clear PASID state for new mm and forked and cloned thread (patch 9-10).
> > 6. Allocate and free PASID for a process (patch 11).
> > 7. Fix up the PASID MSR in #GP handler when one thread in a process
> >executes ENQCMD for the first time (patches 12).
> 
> If this is per mm, should not switch_mm() update the MSR ? I'm not
> seeing that, nor do I see it explained why not.

PASID value is per mm and all threads in a process have the same PASID
value in the MSR. However, the MSR is per thread and is context switched
by XSAVES/XRSTROS in patches 6-7.

Thanks.

-Fenghua


Re: [PATCH net] ibmvnic: Harden device login requests

2020-06-15 Thread Thomas Falcon

On 6/12/20 4:10 PM, David Miller wrote:

From: Thomas Falcon 
Date: Fri, 12 Jun 2020 13:31:39 -0500


@@ -841,13 +841,14 @@ static int ibmvnic_login(struct net_device *netdev)
  {
struct ibmvnic_adapter *adapter = netdev_priv(netdev);
unsigned long timeout = msecs_to_jiffies(3);
+   int retries = 10;
int retry_count = 0;
bool retry;
int rc;

Reverse christmas tree, please.


Oops, sending a v2 soon.

Thanks,

Tom



Re: [PATCH 2/6] exec: simplify the compat syscall handling

2020-06-15 Thread Brian Gerst
On Mon, Jun 15, 2020 at 10:13 AM Christoph Hellwig  wrote:
>
> On Mon, Jun 15, 2020 at 03:31:35PM +0200, Arnd Bergmann wrote:
> > >  #ifdef CONFIG_COMPAT
> > > -   if (unlikely(argv.is_compat)) {
> > > +   if (in_compat_syscall()) {
> > > +   const compat_uptr_t __user *compat_argv =
> > > +   compat_ptr((unsigned long)argv);
> > > compat_uptr_t compat;
> > >
> > > -   if (get_user(compat, argv.ptr.compat + nr))
> > > +   if (get_user(compat, compat_argv + nr))
> > > return ERR_PTR(-EFAULT);
> > >
> > > return compat_ptr(compat);
> > > }
> > >  #endif
> >
> > I would expect that the "#ifdef CONFIG_COMPAT" can be removed
> > now, since compat_ptr() and in_compat_syscall() are now defined
> > unconditionally. I have not tried that though.
>
> True, I'll give it a spin.
>
> > > +/*
> > > + * x32 syscalls are listed in the same table as x86_64 ones, so we need 
> > > to
> > > + * define compat syscalls that are exactly the same as the native 
> > > version for
> > > + * the syscall table machinery to work.  Sigh..
> > > + */
> > > +#ifdef CONFIG_X86_X32
> > >  COMPAT_SYSCALL_DEFINE3(execve, const char __user *, filename,
> > > -   const compat_uptr_t __user *, argv,
> > > -   const compat_uptr_t __user *, envp)
> > > +  const char __user *const __user *, argv,
> > > +  const char __user *const __user *, envp)
> > >  {
> > > -   return do_compat_execve(AT_FDCWD, getname(filename), argv, envp, 
> > > 0);
> > > +   return do_execveat(AT_FDCWD, getname(filename), argv, envp, 0, 
> > > NULL);
> > >  }
> >
> > Maybe move it to arch/x86/kernel/process_64.c or 
> > arch/x86/entry/syscall_x32.c
> > to keep it out of the common code if this is needed.
>
> I'd rather keep it in common code as that allows all the low-level
> exec stuff to be marked static, and avoid us growing new pointless
> compat variants through copy and paste.
> smart compiler to d
>
> > I don't really understand
> > the comment, why can't this just use this?
>
> That errors out with:
>
> ld: arch/x86/entry/syscall_x32.o:(.rodata+0x1040): undefined reference to
> `__x32_sys_execve'
> ld: arch/x86/entry/syscall_x32.o:(.rodata+0x1108): undefined reference to
> `__x32_sys_execveat'
> make: *** [Makefile:1139: vmlinux] Error 1

I think I have a fix for this, by modifying the syscall wrappers to
add an alias for the __x32 variant to the native __x64_sys_foo().
I'll get back to you with a patch.

--
Brian Gerst


Re: [PATCH 2/6] exec: simplify the compat syscall handling

2020-06-15 Thread Arnd Bergmann
On Mon, Jun 15, 2020 at 4:43 PM Christoph Hellwig  wrote:
>
> On Mon, Jun 15, 2020 at 04:40:28PM +0200, Arnd Bergmann wrote:
> > > ld: arch/x86/entry/syscall_x32.o:(.rodata+0x1040): undefined reference to
> > > `__x32_sys_execve'
> > > ld: arch/x86/entry/syscall_x32.o:(.rodata+0x1108): undefined reference to
> > > `__x32_sys_execveat'
> > > make: *** [Makefile:1139: vmlinux] Error 1
> >
> > Ah, I see: it's marked x32-only, so arch/x86/entry/syscall_x32.c
> > uses the __x32 prefix instead of the __x64 one. Marking it 'common'
> > instead would make it work, but also create an extra entry point
> > for native processes, something that commit
> > 6365b842aae4 ("x86/syscalls: Split the x32 syscalls into their own table")
> > was trying to avoid.
>
> Marking it common also doesn't compile at all because __NR_execve
> and __NR_execveat get redefined in unistd_64.h.  I then tried to rename
> the x32 versions, which failed in yet another way.  At that point I gave
> up instead of digging myself into a deeper hole..

How about this one:

diff --git a/arch/x86/entry/syscall_x32.c b/arch/x86/entry/syscall_x32.c
index 3d8d70d3896c..0ce15807cf54 100644
--- a/arch/x86/entry/syscall_x32.c
+++ b/arch/x86/entry/syscall_x32.c
@@ -16,6 +16,9 @@
 #undef __SYSCALL_X32
 #undef __SYSCALL_COMMON

+#define __x32_sys_execve __x64_sys_execve
+#define __x32_sys_execveat __x64_sys_execveat
+
 #define __SYSCALL_X32(nr, sym) [nr] = __x32_##sym,
 #define __SYSCALL_COMMON(nr, sym) [nr] = __x64_##sym,

Still ugly, but much simpler and more localized (if it works).

Arnd


Re: [PATCH 2/6] exec: simplify the compat syscall handling

2020-06-15 Thread Arnd Bergmann
On Mon, Jun 15, 2020 at 4:12 PM Christoph Hellwig  wrote:
> On Mon, Jun 15, 2020 at 03:31:35PM +0200, Arnd Bergmann wrote:

>
> > I don't really understand
> > the comment, why can't this just use this?
>
> That errors out with:
>
> ld: arch/x86/entry/syscall_x32.o:(.rodata+0x1040): undefined reference to
> `__x32_sys_execve'
> ld: arch/x86/entry/syscall_x32.o:(.rodata+0x1108): undefined reference to
> `__x32_sys_execveat'
> make: *** [Makefile:1139: vmlinux] Error 1

Ah, I see: it's marked x32-only, so arch/x86/entry/syscall_x32.c
uses the __x32 prefix instead of the __x64 one. Marking it 'common'
instead would make it work, but also create an extra entry point
for native processes, something that commit
6365b842aae4 ("x86/syscalls: Split the x32 syscalls into their own table")
was trying to avoid.

  Arnd


Re: [PATCH 2/6] exec: simplify the compat syscall handling

2020-06-15 Thread Christoph Hellwig
On Mon, Jun 15, 2020 at 04:40:28PM +0200, Arnd Bergmann wrote:
> > ld: arch/x86/entry/syscall_x32.o:(.rodata+0x1040): undefined reference to
> > `__x32_sys_execve'
> > ld: arch/x86/entry/syscall_x32.o:(.rodata+0x1108): undefined reference to
> > `__x32_sys_execveat'
> > make: *** [Makefile:1139: vmlinux] Error 1
> 
> Ah, I see: it's marked x32-only, so arch/x86/entry/syscall_x32.c
> uses the __x32 prefix instead of the __x64 one. Marking it 'common'
> instead would make it work, but also create an extra entry point
> for native processes, something that commit
> 6365b842aae4 ("x86/syscalls: Split the x32 syscalls into their own table")
> was trying to avoid.

Marking it common also doesn't compile at all because __NR_execve
and __NR_execveat get redefined in unistd_64.h.  I then tried to rename
the x32 versions, which failed in yet another way.  At that point I gave
up instead of digging myself into a deeper hole..


Re: [PATCH 2/6] exec: simplify the compat syscall handling

2020-06-15 Thread Christoph Hellwig
On Mon, Jun 15, 2020 at 03:31:35PM +0200, Arnd Bergmann wrote:
> >  #ifdef CONFIG_COMPAT
> > -   if (unlikely(argv.is_compat)) {
> > +   if (in_compat_syscall()) {
> > +   const compat_uptr_t __user *compat_argv =
> > +   compat_ptr((unsigned long)argv);
> > compat_uptr_t compat;
> >
> > -   if (get_user(compat, argv.ptr.compat + nr))
> > +   if (get_user(compat, compat_argv + nr))
> > return ERR_PTR(-EFAULT);
> >
> > return compat_ptr(compat);
> > }
> >  #endif
> 
> I would expect that the "#ifdef CONFIG_COMPAT" can be removed
> now, since compat_ptr() and in_compat_syscall() are now defined
> unconditionally. I have not tried that though.

True, I'll give it a spin.

> > +/*
> > + * x32 syscalls are listed in the same table as x86_64 ones, so we need to
> > + * define compat syscalls that are exactly the same as the native version 
> > for
> > + * the syscall table machinery to work.  Sigh..
> > + */
> > +#ifdef CONFIG_X86_X32
> >  COMPAT_SYSCALL_DEFINE3(execve, const char __user *, filename,
> > -   const compat_uptr_t __user *, argv,
> > -   const compat_uptr_t __user *, envp)
> > +  const char __user *const __user *, argv,
> > +  const char __user *const __user *, envp)
> >  {
> > -   return do_compat_execve(AT_FDCWD, getname(filename), argv, envp, 0);
> > +   return do_execveat(AT_FDCWD, getname(filename), argv, envp, 0, 
> > NULL);
> >  }
> 
> Maybe move it to arch/x86/kernel/process_64.c or arch/x86/entry/syscall_x32.c
> to keep it out of the common code if this is needed.

I'd rather keep it in common code as that allows all the low-level
exec stuff to be marked static, and avoid us growing new pointless
compat variants through copy and paste.
smart compiler to d

> I don't really understand
> the comment, why can't this just use this?

That errors out with:

ld: arch/x86/entry/syscall_x32.o:(.rodata+0x1040): undefined reference to
`__x32_sys_execve'
ld: arch/x86/entry/syscall_x32.o:(.rodata+0x1108): undefined reference to
`__x32_sys_execveat'
make: *** [Makefile:1139: vmlinux] Error 1


Re: [PATCH 0/8 v2] PCI: Align return values of PCIe capability and PCI accessors

2020-06-15 Thread Jason Gunthorpe
On Mon, Jun 15, 2020 at 09:32:17AM +0200, refactormys...@gmail.com wrote:
> From: Bolarinwa Olayemi Saheed 
> 
> 
> PATCH 1/8 to 7/8:
> PCIBIOS_ error codes have positive values and they are passed down the
> call heirarchy from accessors. For functions which are meant to return
> only a negative value on failure, passing on this value is a bug.
> To mitigate this, call pcibios_err_to_errno() before passing on return
> value from PCIe capability accessors call heirarchy. This function
> converts any positive PCIBIOS_ error codes to negative generic error
> values.
> 
> PATCH 8/8:
> The PCIe capability accessors can return 0, -EINVAL, or any PCIBIOS_ error
> code. The pci accessor on the other hand can only return 0 or any PCIBIOS_
> error code.This inconsistency among these accessor makes it harder for
> callers to check for errors.
> Return PCIBIOS_BAD_REGISTER_NUMBER instead of -EINVAL in all PCIe
> capability accessors.
> 
> MERGING:
> These may all be merged via the PCI tree, since it is a collection of
> similar fixes. This way they all get merged at once.

I prefer this not happen for active trees, it just risks needless
merge conflicts.

I will take the hfi1 patches at least, let me know when they are
reviewed

Thanks,
Jason


Re: [PATCH 2/6] exec: simplify the compat syscall handling

2020-06-15 Thread Arnd Bergmann
On Mon, Jun 15, 2020 at 3:00 PM Christoph Hellwig  wrote:
>
> The only differenence betweeen the compat exec* syscalls and their
> native versions is that compat_ptr sign extension, and the fact that
> the pointer arithmetics for the two dimensional arrays needs to use
> the compat pointer size.  Instead of the compat wrappers and the
> struct user_arg_ptr machinery just use in_compat_syscall() to do the
> right thing for the compat case deep inside get_user_arg_ptr().
>
> Signed-off-by: Christoph Hellwig 

Nice!

I have an older patch doing the same for sys_mount() somewhere,
but never got around to send that. One day we should see if we
can just do this for all of them.

> -
> -static const char __user *get_user_arg_ptr(struct user_arg_ptr argv, int nr)
> +static const char __user *
> +get_user_arg_ptr(const char __user *const __user *argv, int nr)
>  {
> const char __user *native;
>
>  #ifdef CONFIG_COMPAT
> -   if (unlikely(argv.is_compat)) {
> +   if (in_compat_syscall()) {
> +   const compat_uptr_t __user *compat_argv =
> +   compat_ptr((unsigned long)argv);
> compat_uptr_t compat;
>
> -   if (get_user(compat, argv.ptr.compat + nr))
> +   if (get_user(compat, compat_argv + nr))
> return ERR_PTR(-EFAULT);
>
> return compat_ptr(compat);
> }
>  #endif

I would expect that the "#ifdef CONFIG_COMPAT" can be removed
now, since compat_ptr() and in_compat_syscall() are now defined
unconditionally. I have not tried that though.

> +/*
> + * x32 syscalls are listed in the same table as x86_64 ones, so we need to
> + * define compat syscalls that are exactly the same as the native version for
> + * the syscall table machinery to work.  Sigh..
> + */
> +#ifdef CONFIG_X86_X32
>  COMPAT_SYSCALL_DEFINE3(execve, const char __user *, filename,
> -   const compat_uptr_t __user *, argv,
> -   const compat_uptr_t __user *, envp)
> +  const char __user *const __user *, argv,
> +  const char __user *const __user *, envp)
>  {
> -   return do_compat_execve(AT_FDCWD, getname(filename), argv, envp, 0);
> +   return do_execveat(AT_FDCWD, getname(filename), argv, envp, 0, NULL);
>  }

Maybe move it to arch/x86/kernel/process_64.c or arch/x86/entry/syscall_x32.c
to keep it out of the common code if this is needed. I don't really understand
the comment, why can't this just use this?

--- a/arch/x86/entry/syscalls/syscall_64.tbl
+++ b/arch/x86/entry/syscalls/syscall_64.tbl
@@ -379,7 +379,7 @@
 517x32 recvfromcompat_sys_recvfrom
 518x32 sendmsg compat_sys_sendmsg
 519x32 recvmsg compat_sys_recvmsg
-520x32 execve  compat_sys_execve
+520x32 execve  sys_execve
 521x32 ptrace  compat_sys_ptrace
 522x32 rt_sigpending   compat_sys_rt_sigpending
 523x32 rt_sigtimedwait compat_sys_rt_sigtimedwait_time64
@@ -404,7 +404,7 @@
 542x32 getsockopt  compat_sys_getsockopt
 543x32 io_setupcompat_sys_io_setup
 544x32 io_submit   compat_sys_io_submit
-545x32 execveatcompat_sys_execveat
+545x32 execveatsys_execveat
 546x32 preadv2 compat_sys_preadv64v2
 547x32 pwritev2compat_sys_pwritev64v2
 548x32 process_madvise compat_sys_process_madvise

   Arnd


Re: properly support exec and wait with kernel pointers

2020-06-15 Thread Arnd Bergmann
On Mon, Jun 15, 2020 at 3:00 PM Christoph Hellwig  wrote:
>
> Hi all,
>
> this series first cleans up the exec code and then adds proper
> kernel_execveat and kernel_wait callers instead of relying on the fact
> that the early init code and kernel threads implicitly run with
> the address limit set to KERNEL_DS.
>
> Note that the cleanup removes the compat execve(at) handlers (almost)
> entirely, as we can handle the compat difference very nicely in a
> unified codebase.  The only exception is x86 where this would list the
> handlers twice in the same syscall table due to the messed up x32
> design.  I had to add an extra compat handler just for that case, but
> maybe someone has a better idea.

I looked at all the patches and I like it a lot. I replied with some suggestions
for x32, but maybe I misunderstood what its problem is, as I don't see
anything preventing us from having two entries in the x32 table pointing
to the same function.

   Arnd


Re: [PATCH 3/3] powerpc/8xx: Provide ptep_get() with 16k pages

2020-06-15 Thread Peter Zijlstra
On Mon, Jun 15, 2020 at 12:57:59PM +, Christophe Leroy wrote:
> READ_ONCE() now enforces atomic read, which leads to:


> Fixes: 2ab3a0a02905 ("READ_ONCE: Enforce atomicity for {READ,WRITE}_ONCE() 
> memory accesses")
> Cc: Will Deacon 
> Signed-off-by: Christophe Leroy 
> ---
>  arch/powerpc/include/asm/nohash/32/pgtable.h | 10 ++
>  1 file changed, 10 insertions(+)
> 
> diff --git a/arch/powerpc/include/asm/nohash/32/pgtable.h 
> b/arch/powerpc/include/asm/nohash/32/pgtable.h
> index b56f14160ae5..77addb599ce7 100644
> --- a/arch/powerpc/include/asm/nohash/32/pgtable.h
> +++ b/arch/powerpc/include/asm/nohash/32/pgtable.h
> @@ -286,6 +286,16 @@ static inline pte_t ptep_get_and_clear(struct mm_struct 
> *mm, unsigned long addr,
>   return __pte(pte_update(mm, addr, ptep, ~0, 0, 0));
>  }
>  
> +#if defined(CONFIG_PPC_8xx) && defined(CONFIG_PPC_16K_PAGES)
> +#define __HAVE_ARCH_PTEP_GET
> +static inline pte_t ptep_get(pte_t *ptep)
> +{
> + pte_t pte = {READ_ONCE(ptep->pte), 0, 0, 0};
> +
> + return pte;
> +}
> +#endif

Would it make sense to have a comment with this magic? The casual reader
might wonder WTH just happened when he stumbles on this :-)

Other than that, the series looks good to me, Thanks!

Acked-by: Peter Zijlstra (Intel) 


[PATCH] powerpc/ptdump: Fix build failure in hashpagetable.c

2020-06-15 Thread Christophe Leroy
H_SUCCESS is only defined when CONFIG_PPC_PSERIES is defined.

!= H_SUCCESS means != 0. Modify the test accordingly.

Reported-by: kernel test robot 
Fixes: 65e701b2d2a8 ("powerpc/ptdump: drop non vital #ifdefs")
Cc: sta...@vger.kernel.org
Signed-off-by: Christophe Leroy 
---
 arch/powerpc/mm/ptdump/hashpagetable.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/mm/ptdump/hashpagetable.c 
b/arch/powerpc/mm/ptdump/hashpagetable.c
index a2c33efc7ce8..5b8bd34cd3a1 100644
--- a/arch/powerpc/mm/ptdump/hashpagetable.c
+++ b/arch/powerpc/mm/ptdump/hashpagetable.c
@@ -258,7 +258,7 @@ static int pseries_find(unsigned long ea, int psize, bool 
primary, u64 *v, u64 *
for (i = 0; i < HPTES_PER_GROUP; i += 4, hpte_group += 4) {
lpar_rc = plpar_pte_read_4(0, hpte_group, (void *)ptes);
 
-   if (lpar_rc != H_SUCCESS)
+   if (lpar_rc)
continue;
for (j = 0; j < 4; j++) {
if (HPTE_V_COMPARE(ptes[j].v, want_v) &&
-- 
2.25.0



[PATCH 6/6] kernel: add a kernel_wait helper

2020-06-15 Thread Christoph Hellwig
Add a helper that waits for a pid and stores the status in the passed
in kernel pointer.  Use it to fix the usage of kernel_wait4 in
call_usermodehelper_exec_sync that only happens to work due to the
implicit set_fs(KERNEL_DS) for kernel threads.

Signed-off-by: Christoph Hellwig 
---
 include/linux/sched/task.h |  1 +
 kernel/exit.c  | 16 
 kernel/umh.c   | 29 -
 3 files changed, 21 insertions(+), 25 deletions(-)

diff --git a/include/linux/sched/task.h b/include/linux/sched/task.h
index 38359071236ad7..a80007df396e95 100644
--- a/include/linux/sched/task.h
+++ b/include/linux/sched/task.h
@@ -102,6 +102,7 @@ struct task_struct *fork_idle(int);
 struct mm_struct *copy_init_mm(void);
 extern pid_t kernel_thread(int (*fn)(void *), void *arg, unsigned long flags);
 extern long kernel_wait4(pid_t, int __user *, int, struct rusage *);
+int kernel_wait(pid_t pid, int *stat);
 
 extern void free_task(struct task_struct *tsk);
 
diff --git a/kernel/exit.c b/kernel/exit.c
index 727150f2810338..fd598846df0b17 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -1626,6 +1626,22 @@ long kernel_wait4(pid_t upid, int __user *stat_addr, int 
options,
return ret;
 }
 
+int kernel_wait(pid_t pid, int *stat)
+{
+   struct wait_opts wo = {
+   .wo_type= PIDTYPE_PID,
+   .wo_pid = find_get_pid(pid),
+   .wo_flags   = WEXITED,
+   };
+   int ret;
+
+   ret = do_wait();
+   if (ret > 0 && wo.wo_stat)
+   *stat = wo.wo_stat;
+   put_pid(wo.wo_pid);
+   return ret;
+}
+
 SYSCALL_DEFINE4(wait4, pid_t, upid, int __user *, stat_addr,
int, options, struct rusage __user *, ru)
 {
diff --git a/kernel/umh.c b/kernel/umh.c
index 1284823dbad338..6fd948e478bec4 100644
--- a/kernel/umh.c
+++ b/kernel/umh.c
@@ -126,37 +126,16 @@ static void call_usermodehelper_exec_sync(struct 
subprocess_info *sub_info)
 {
pid_t pid;
 
-   /* If SIGCLD is ignored kernel_wait4 won't populate the status. */
+   /* If SIGCLD is ignored do_wait won't populate the status. */
kernel_sigaction(SIGCHLD, SIG_DFL);
pid = kernel_thread(call_usermodehelper_exec_async, sub_info, SIGCHLD);
-   if (pid < 0) {
+   if (pid < 0)
sub_info->retval = pid;
-   } else {
-   int ret = -ECHILD;
-   /*
-* Normally it is bogus to call wait4() from in-kernel because
-* wait4() wants to write the exit code to a userspace address.
-* But call_usermodehelper_exec_sync() always runs as kernel
-* thread (workqueue) and put_user() to a kernel address works
-* OK for kernel threads, due to their having an mm_segment_t
-* which spans the entire address space.
-*
-* Thus the __user pointer cast is valid here.
-*/
-   kernel_wait4(pid, (int __user *), 0, NULL);
-
-   /*
-* If ret is 0, either call_usermodehelper_exec_async failed and
-* the real error code is already in sub_info->retval or
-* sub_info->retval is 0 anyway, so don't mess with it then.
-*/
-   if (ret)
-   sub_info->retval = ret;
-   }
+   else
+   kernel_wait(pid, _info->retval);
 
/* Restore default kernel sig handler */
kernel_sigaction(SIGCHLD, SIG_IGN);
-
umh_complete(sub_info);
 }
 
-- 
2.26.2



[PATCH 5/6] exec: add a kernel_execveat helper

2020-06-15 Thread Christoph Hellwig
Add a kernel_execveat helper to execute a binary with kernel space argv
and envp pointers.  Switch executing init and user mode helpers to this
new helper instead of relying on the implicit set_fs(KERNEL_DS) for early
init code and kernel threads, and move the getname call into the
do_execve helper.

Signed-off-by: Christoph Hellwig 
---
 fs/exec.c   | 116 +++-
 include/linux/binfmts.h |   6 +--
 init/main.c |   6 +--
 kernel/umh.c|   8 ++-
 4 files changed, 97 insertions(+), 39 deletions(-)

diff --git a/fs/exec.c b/fs/exec.c
index 696b1e8180d7d8..9e16d56aa53e86 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -439,6 +439,21 @@ static int count_strings(const char __user *const __user 
*argv)
return i;
 }
 
+static int count_kernel_strings(const char *const *argv)
+{
+   int i;
+
+   if (!argv)
+   return 0;
+
+   for (i = 0; argv[i]; i++) {
+   if (i >= MAX_ARG_STRINGS)
+   return -E2BIG;
+   }
+
+   return i;
+}
+
 static int check_arg_limit(struct linux_binprm *bprm)
 {
unsigned long limit, ptr_size;
@@ -615,6 +630,19 @@ int copy_string_kernel(const char *arg, struct 
linux_binprm *bprm)
 }
 EXPORT_SYMBOL(copy_string_kernel);
 
+static int copy_strings_kernel(int argc, const char *const *argv,
+   struct linux_binprm *bprm)
+{
+   int ret;
+
+   while (argc-- > 0) {
+   ret = copy_string_kernel(argv[argc], bprm);
+   if (ret)
+   break;
+   }
+   return ret;
+}
+
 #ifdef CONFIG_MMU
 
 /*
@@ -1797,9 +1825,11 @@ static int exec_binprm(struct linux_binprm *bprm)
return 0;
 }
 
-int do_execveat(int fd, struct filename *filename,
+static int __do_execveat(int fd, struct filename *filename,
const char __user *const __user *argv,
const char __user *const __user *envp,
+   const char *const *kernel_argv,
+   const char *const *kernel_envp,
int flags, struct file *file)
 {
char *pathbuf = NULL;
@@ -1880,16 +1910,30 @@ int do_execveat(int fd, struct filename *filename,
if (retval)
goto out_unmark;
 
-   bprm->argc = count_strings(argv);
-   if (bprm->argc < 0) {
-   retval = bprm->argc;
-   goto out;
-   }
+   if (unlikely(kernel_argv)) {
+   bprm->argc = count_kernel_strings(kernel_argv);
+   if (bprm->argc < 0) {
+   retval = bprm->argc;
+   goto out;
+   }
 
-   bprm->envc = count_strings(envp);
-   if (bprm->envc < 0) {
-   retval = bprm->envc;
-   goto out;
+   bprm->envc = count_kernel_strings(kernel_envp);
+   if (bprm->envc < 0) {
+   retval = bprm->envc;
+   goto out;
+   }
+   } else {
+   bprm->argc = count_strings(argv);
+   if (bprm->argc < 0) {
+   retval = bprm->argc;
+   goto out;
+   }
+
+   bprm->envc = count_strings(envp);
+   if (bprm->envc < 0) {
+   retval = bprm->envc;
+   goto out;
+   }
}
 
retval = check_arg_limit(bprm);
@@ -1906,13 +1950,22 @@ int do_execveat(int fd, struct filename *filename,
goto out;
 
bprm->exec = bprm->p;
-   retval = copy_strings(bprm->envc, envp, bprm);
-   if (retval < 0)
-   goto out;
 
-   retval = copy_strings(bprm->argc, argv, bprm);
-   if (retval < 0)
-   goto out;
+   if (unlikely(kernel_argv)) {
+   retval = copy_strings_kernel(bprm->envc, kernel_envp, bprm);
+   if (retval < 0)
+   goto out;
+   retval = copy_strings_kernel(bprm->argc, kernel_argv, bprm);
+   if (retval < 0)
+   goto out;
+   } else {
+   retval = copy_strings(bprm->envc, envp, bprm);
+   if (retval < 0)
+   goto out;
+   retval = copy_strings(bprm->argc, argv, bprm);
+   if (retval < 0)
+   goto out;
+   }
 
retval = exec_binprm(bprm);
if (retval < 0)
@@ -1963,6 +2016,23 @@ int do_execveat(int fd, struct filename *filename,
return retval;
 }
 
+static int do_execveat(int fd, const char *filename,
+  const char __user *const __user *argv,
+  const char __user *const __user *envp, int flags)
+{
+   int lookup_flags = (flags & AT_EMPTY_PATH) ? LOOKUP_EMPTY : 0;
+   struct filename *name = getname_flags(filename, lookup_flags, NULL);
+
+   return __do_execveat(fd, name, argv, envp, NULL, NULL, flags, NULL);
+}
+
+int kernel_execveat(int fd, const char 

[PATCH 4/6] exec: split prepare_arg_pages

2020-06-15 Thread Christoph Hellwig
Move counting the arguments and enviroment variables out of
prepare_arg_pages and rename the rest of the function to check_arg_limit.
This prepares for a version of do_execvat that takes kernel pointers.

Signed-off-by: Christoph Hellwig 
---
 fs/exec.c | 26 ++
 1 file changed, 14 insertions(+), 12 deletions(-)

diff --git a/fs/exec.c b/fs/exec.c
index 6e4d9d1ffa35fa..696b1e8180d7d8 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -439,20 +439,10 @@ static int count_strings(const char __user *const __user 
*argv)
return i;
 }
 
-static int prepare_arg_pages(struct linux_binprm *bprm,
-   const char __user *const __user *argv,
-   const char __user *const __user *envp)
+static int check_arg_limit(struct linux_binprm *bprm)
 {
unsigned long limit, ptr_size;
 
-   bprm->argc = count_strings(argv);
-   if (bprm->argc < 0)
-   return bprm->argc;
-
-   bprm->envc = count_strings(envp);
-   if (bprm->envc < 0)
-   return bprm->envc;
-
/*
 * Limit to 1/4 of the max stack size or 3/4 of _STK_LIM
 * (whichever is smaller) for the argv+env strings.
@@ -1890,7 +1880,19 @@ int do_execveat(int fd, struct filename *filename,
if (retval)
goto out_unmark;
 
-   retval = prepare_arg_pages(bprm, argv, envp);
+   bprm->argc = count_strings(argv);
+   if (bprm->argc < 0) {
+   retval = bprm->argc;
+   goto out;
+   }
+
+   bprm->envc = count_strings(envp);
+   if (bprm->envc < 0) {
+   retval = bprm->envc;
+   goto out;
+   }
+
+   retval = check_arg_limit(bprm);
if (retval < 0)
goto out;
 
-- 
2.26.2



[PATCH 3/6] exec: cleanup the count() function

2020-06-15 Thread Christoph Hellwig
Remove the max argument as it is hard wired to MAX_ARG_STRINGS, and
give the function a slightly less generic name.

Signed-off-by: Christoph Hellwig 
---
 fs/exec.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/fs/exec.c b/fs/exec.c
index 94107eceda8a67..6e4d9d1ffa35fa 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -411,9 +411,9 @@ get_user_arg_ptr(const char __user *const __user *argv, int 
nr)
 }
 
 /*
- * count() counts the number of strings in array ARGV.
+ * count_strings() counts the number of strings in array ARGV.
  */
-static int count(const char __user *const __user *argv, int max)
+static int count_strings(const char __user *const __user *argv)
 {
int i = 0;
 
@@ -427,7 +427,7 @@ static int count(const char __user *const __user *argv, int 
max)
if (IS_ERR(p))
return -EFAULT;
 
-   if (i >= max)
+   if (i >= MAX_ARG_STRINGS)
return -E2BIG;
++i;
 
@@ -445,11 +445,11 @@ static int prepare_arg_pages(struct linux_binprm *bprm,
 {
unsigned long limit, ptr_size;
 
-   bprm->argc = count(argv, MAX_ARG_STRINGS);
+   bprm->argc = count_strings(argv);
if (bprm->argc < 0)
return bprm->argc;
 
-   bprm->envc = count(envp, MAX_ARG_STRINGS);
+   bprm->envc = count_strings(envp);
if (bprm->envc < 0)
return bprm->envc;
 
-- 
2.26.2



[PATCH 2/6] exec: simplify the compat syscall handling

2020-06-15 Thread Christoph Hellwig
The only differenence betweeen the compat exec* syscalls and their
native versions is that compat_ptr sign extension, and the fact that
the pointer arithmetics for the two dimensional arrays needs to use
the compat pointer size.  Instead of the compat wrappers and the
struct user_arg_ptr machinery just use in_compat_syscall() to do the
right thing for the compat case deep inside get_user_arg_ptr().

Signed-off-by: Christoph Hellwig 
---
 arch/arm64/include/asm/unistd32.h |  4 +-
 arch/mips/kernel/syscalls/syscall_n32.tbl |  4 +-
 arch/mips/kernel/syscalls/syscall_o32.tbl |  4 +-
 arch/parisc/kernel/syscalls/syscall.tbl   |  4 +-
 arch/powerpc/kernel/syscalls/syscall.tbl  |  4 +-
 arch/s390/kernel/syscalls/syscall.tbl |  4 +-
 arch/sparc/kernel/syscalls.S  |  4 +-
 arch/x86/entry/syscalls/syscall_32.tbl|  4 +-
 fs/exec.c | 89 ++-
 include/linux/compat.h|  7 --
 include/uapi/asm-generic/unistd.h |  4 +-
 tools/include/uapi/asm-generic/unistd.h   |  4 +-
 .../arch/powerpc/entry/syscalls/syscall.tbl   |  4 +-
 .../perf/arch/s390/entry/syscalls/syscall.tbl |  4 +-
 14 files changed, 53 insertions(+), 91 deletions(-)

diff --git a/arch/arm64/include/asm/unistd32.h 
b/arch/arm64/include/asm/unistd32.h
index 6d95d0c8bf2f47..141f5d2ff1c34f 100644
--- a/arch/arm64/include/asm/unistd32.h
+++ b/arch/arm64/include/asm/unistd32.h
@@ -33,7 +33,7 @@ __SYSCALL(__NR_link, sys_link)
 #define __NR_unlink 10
 __SYSCALL(__NR_unlink, sys_unlink)
 #define __NR_execve 11
-__SYSCALL(__NR_execve, compat_sys_execve)
+__SYSCALL(__NR_execve, sys_execve)
 #define __NR_chdir 12
 __SYSCALL(__NR_chdir, sys_chdir)
/* 13 was sys_time */
@@ -785,7 +785,7 @@ __SYSCALL(__NR_memfd_create, sys_memfd_create)
 #define __NR_bpf 386
 __SYSCALL(__NR_bpf, sys_bpf)
 #define __NR_execveat 387
-__SYSCALL(__NR_execveat, compat_sys_execveat)
+__SYSCALL(__NR_execveat, sys_execveat)
 #define __NR_userfaultfd 388
 __SYSCALL(__NR_userfaultfd, sys_userfaultfd)
 #define __NR_membarrier 389
diff --git a/arch/mips/kernel/syscalls/syscall_n32.tbl 
b/arch/mips/kernel/syscalls/syscall_n32.tbl
index f777141f52568f..e861b5ab7179c9 100644
--- a/arch/mips/kernel/syscalls/syscall_n32.tbl
+++ b/arch/mips/kernel/syscalls/syscall_n32.tbl
@@ -64,7 +64,7 @@
 54 n32 getsockopt  compat_sys_getsockopt
 55 n32 clone   __sys_clone
 56 n32 fork__sys_fork
-57 n32 execve  compat_sys_execve
+57 n32 execve  sys_execve
 58 n32 exitsys_exit
 59 n32 wait4   compat_sys_wait4
 60 n32 killsys_kill
@@ -328,7 +328,7 @@
 317n32 getrandom   sys_getrandom
 318n32 memfd_createsys_memfd_create
 319n32 bpf sys_bpf
-320n32 execveatcompat_sys_execveat
+320n32 execveatsys_execveat
 321n32 userfaultfd sys_userfaultfd
 322n32 membarrier  sys_membarrier
 323n32 mlock2  sys_mlock2
diff --git a/arch/mips/kernel/syscalls/syscall_o32.tbl 
b/arch/mips/kernel/syscalls/syscall_o32.tbl
index 13280625d312e9..bba80f74e9968e 100644
--- a/arch/mips/kernel/syscalls/syscall_o32.tbl
+++ b/arch/mips/kernel/syscalls/syscall_o32.tbl
@@ -18,7 +18,7 @@
 8  o32 creat   sys_creat
 9  o32 linksys_link
 10 o32 unlink  sys_unlink
-11 o32 execve  sys_execve  
compat_sys_execve
+11 o32 execve  sys_execve
 12 o32 chdir   sys_chdir
 13 o32 timesys_time32
 14 o32 mknod   sys_mknod
@@ -367,7 +367,7 @@
 353o32 getrandom   sys_getrandom
 354o32 memfd_createsys_memfd_create
 355o32 bpf sys_bpf
-356o32 execveatsys_execveat
compat_sys_execveat
+356o32 execveatsys_execveat
 357o32 userfaultfd sys_userfaultfd
 358o32 membarrier  sys_membarrier
 359o32 mlock2  sys_mlock2
diff --git a/arch/parisc/kernel/syscalls/syscall.tbl 
b/arch/parisc/kernel/syscalls/syscall.tbl
index 5a758fa6ec5242..23fa0d0edf3384 100644
--- a/arch/parisc/kernel/syscalls/syscall.tbl
+++ b/arch/parisc/kernel/syscalls/syscall.tbl
@@ -18,7 +18,7 @@
 8  common  

properly support exec and wait with kernel pointers

2020-06-15 Thread Christoph Hellwig
Hi all,

this series first cleans up the exec code and then adds proper
kernel_execveat and kernel_wait callers instead of relying on the fact
that the early init code and kernel threads implicitly run with
the address limit set to KERNEL_DS.

Note that the cleanup removes the compat execve(at) handlers (almost)
entirely, as we can handle the compat difference very nicely in a
unified codebase.  The only exception is x86 where this would list the
handlers twice in the same syscall table due to the messed up x32
design.  I had to add an extra compat handler just for that case, but
maybe someone has a better idea.


[PATCH 1/6] exec: cleanup the execve wrappers

2020-06-15 Thread Christoph Hellwig
Remove a whole bunch of wrappers that eventually all call
__do_execve_file, and consolidate the execvce helpers to:

  (1) __do_execveat, which is the lowest level helper implementing the
  actual functionality
  (2) do_execvat, which is used by all callers that want native
  pointers
  (3) do_compat_execve, which is used by all compat syscalls

Signed-off-by: Christoph Hellwig 
---
 fs/exec.c   | 98 +++--
 include/linux/binfmts.h | 12 ++---
 init/main.c |  7 +--
 kernel/umh.c| 16 +++
 4 files changed, 41 insertions(+), 92 deletions(-)

diff --git a/fs/exec.c b/fs/exec.c
index e6e8a9a7032784..354fdaa536ae7d 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -1815,10 +1815,7 @@ static int exec_binprm(struct linux_binprm *bprm)
return 0;
 }
 
-/*
- * sys_execve() executes a new program.
- */
-static int __do_execve_file(int fd, struct filename *filename,
+static int __do_execveat(int fd, struct filename *filename,
struct user_arg_ptr argv,
struct user_arg_ptr envp,
int flags, struct file *file)
@@ -1972,74 +1969,16 @@ static int __do_execve_file(int fd, struct filename 
*filename,
return retval;
 }
 
-static int do_execveat_common(int fd, struct filename *filename,
- struct user_arg_ptr argv,
- struct user_arg_ptr envp,
- int flags)
-{
-   return __do_execve_file(fd, filename, argv, envp, flags, NULL);
-}
-
-int do_execve_file(struct file *file, void *__argv, void *__envp)
-{
-   struct user_arg_ptr argv = { .ptr.native = __argv };
-   struct user_arg_ptr envp = { .ptr.native = __envp };
-
-   return __do_execve_file(AT_FDCWD, NULL, argv, envp, 0, file);
-}
-
-int do_execve(struct filename *filename,
-   const char __user *const __user *__argv,
-   const char __user *const __user *__envp)
-{
-   struct user_arg_ptr argv = { .ptr.native = __argv };
-   struct user_arg_ptr envp = { .ptr.native = __envp };
-   return do_execveat_common(AT_FDCWD, filename, argv, envp, 0);
-}
-
 int do_execveat(int fd, struct filename *filename,
const char __user *const __user *__argv,
const char __user *const __user *__envp,
-   int flags)
+   int flags, struct file *file)
 {
struct user_arg_ptr argv = { .ptr.native = __argv };
struct user_arg_ptr envp = { .ptr.native = __envp };
 
-   return do_execveat_common(fd, filename, argv, envp, flags);
-}
-
-#ifdef CONFIG_COMPAT
-static int compat_do_execve(struct filename *filename,
-   const compat_uptr_t __user *__argv,
-   const compat_uptr_t __user *__envp)
-{
-   struct user_arg_ptr argv = {
-   .is_compat = true,
-   .ptr.compat = __argv,
-   };
-   struct user_arg_ptr envp = {
-   .is_compat = true,
-   .ptr.compat = __envp,
-   };
-   return do_execveat_common(AT_FDCWD, filename, argv, envp, 0);
-}
-
-static int compat_do_execveat(int fd, struct filename *filename,
- const compat_uptr_t __user *__argv,
- const compat_uptr_t __user *__envp,
- int flags)
-{
-   struct user_arg_ptr argv = {
-   .is_compat = true,
-   .ptr.compat = __argv,
-   };
-   struct user_arg_ptr envp = {
-   .is_compat = true,
-   .ptr.compat = __envp,
-   };
-   return do_execveat_common(fd, filename, argv, envp, flags);
+   return __do_execveat(fd, filename, argv, envp, flags, file);
 }
-#endif
 
 void set_binfmt(struct linux_binfmt *new)
 {
@@ -2070,7 +2009,7 @@ SYSCALL_DEFINE3(execve,
const char __user *const __user *, argv,
const char __user *const __user *, envp)
 {
-   return do_execve(getname(filename), argv, envp);
+   return do_execveat(AT_FDCWD, getname(filename), argv, envp, 0, NULL);
 }
 
 SYSCALL_DEFINE5(execveat,
@@ -2080,18 +2019,34 @@ SYSCALL_DEFINE5(execveat,
int, flags)
 {
int lookup_flags = (flags & AT_EMPTY_PATH) ? LOOKUP_EMPTY : 0;
+   struct filename *name = getname_flags(filename, lookup_flags, NULL);
 
-   return do_execveat(fd,
-  getname_flags(filename, lookup_flags, NULL),
-  argv, envp, flags);
+   return do_execveat(fd, name, argv, envp, flags, NULL);
 }
 
 #ifdef CONFIG_COMPAT
+static int do_compat_execve(int fd, struct filename *filename,
+   const compat_uptr_t __user *__argv,
+   const compat_uptr_t __user *__envp,
+   int flags)
+{
+   struct user_arg_ptr argv = {
+   .is_compat = true,
+   .ptr.compat = __argv,
+   };
+   struct user_arg_ptr envp = {
+   .is_compat = true,
+  

[PATCH 3/3] powerpc/8xx: Provide ptep_get() with 16k pages

2020-06-15 Thread Christophe Leroy
READ_ONCE() now enforces atomic read, which leads to:

  CC  mm/gup.o
In file included from ./include/linux/kernel.h:11:0,
 from mm/gup.c:2:
In function 'gup_hugepte.constprop',
inlined from 'gup_huge_pd.isra.79' at mm/gup.c:2465:8:
./include/linux/compiler.h:392:38: error: call to '__compiletime_assert_222' 
declared with attribute error: Unsupported access size for {READ,WRITE}_ONCE().
  _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__)
  ^
./include/linux/compiler.h:373:4: note: in definition of macro 
'__compiletime_assert'
prefix ## suffix();\
^
./include/linux/compiler.h:392:2: note: in expansion of macro 
'_compiletime_assert'
  _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__)
  ^
./include/linux/compiler.h:405:2: note: in expansion of macro 
'compiletime_assert'
  compiletime_assert(__native_word(t) || sizeof(t) == sizeof(long long), \
  ^
./include/linux/compiler.h:291:2: note: in expansion of macro 
'compiletime_assert_rwonce_type'
  compiletime_assert_rwonce_type(x);\
  ^
mm/gup.c:2428:8: note: in expansion of macro 'READ_ONCE'
  pte = READ_ONCE(*ptep);
^
In function 'gup_get_pte',
inlined from 'gup_pte_range' at mm/gup.c:2228:9,
inlined from 'gup_pmd_range' at mm/gup.c:2613:15,
inlined from 'gup_pud_range' at mm/gup.c:2641:15,
inlined from 'gup_p4d_range' at mm/gup.c:2666:15,
inlined from 'gup_pgd_range' at mm/gup.c:2694:15,
inlined from 'internal_get_user_pages_fast' at mm/gup.c:2795:3:
./include/linux/compiler.h:392:38: error: call to '__compiletime_assert_219' 
declared with attribute error: Unsupported access size for {READ,WRITE}_ONCE().
  _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__)
  ^
./include/linux/compiler.h:373:4: note: in definition of macro 
'__compiletime_assert'
prefix ## suffix();\
^
./include/linux/compiler.h:392:2: note: in expansion of macro 
'_compiletime_assert'
  _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__)
  ^
./include/linux/compiler.h:405:2: note: in expansion of macro 
'compiletime_assert'
  compiletime_assert(__native_word(t) || sizeof(t) == sizeof(long long), \
  ^
./include/linux/compiler.h:291:2: note: in expansion of macro 
'compiletime_assert_rwonce_type'
  compiletime_assert_rwonce_type(x);\
  ^
mm/gup.c:2199:9: note: in expansion of macro 'READ_ONCE'
  return READ_ONCE(*ptep);
 ^
make[2]: *** [mm/gup.o] Error 1

Define ptep_get() on 8xx when using 16k pages.

Fixes: 2ab3a0a02905 ("READ_ONCE: Enforce atomicity for {READ,WRITE}_ONCE() 
memory accesses")
Cc: Will Deacon 
Signed-off-by: Christophe Leroy 
---
 arch/powerpc/include/asm/nohash/32/pgtable.h | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/arch/powerpc/include/asm/nohash/32/pgtable.h 
b/arch/powerpc/include/asm/nohash/32/pgtable.h
index b56f14160ae5..77addb599ce7 100644
--- a/arch/powerpc/include/asm/nohash/32/pgtable.h
+++ b/arch/powerpc/include/asm/nohash/32/pgtable.h
@@ -286,6 +286,16 @@ static inline pte_t ptep_get_and_clear(struct mm_struct 
*mm, unsigned long addr,
return __pte(pte_update(mm, addr, ptep, ~0, 0, 0));
 }
 
+#if defined(CONFIG_PPC_8xx) && defined(CONFIG_PPC_16K_PAGES)
+#define __HAVE_ARCH_PTEP_GET
+static inline pte_t ptep_get(pte_t *ptep)
+{
+   pte_t pte = {READ_ONCE(ptep->pte), 0, 0, 0};
+
+   return pte;
+}
+#endif
+
 #define __HAVE_ARCH_PTEP_SET_WRPROTECT
 static inline void ptep_set_wrprotect(struct mm_struct *mm, unsigned long addr,
  pte_t *ptep)
-- 
2.25.0



[PATCH 0/3] Fix build failure with v5.8-rc1

2020-06-15 Thread Christophe Leroy
Commit 2ab3a0a02905 ("READ_ONCE: Enforce atomicity for
{READ,WRITE}_ONCE() memory accesses") leads to following build
failure on powerpc 8xx.

To fix it, this small series introduces a new helper named ptep_get()
to replace the direct access with READ_ONCE(). This new helper
can be overriden by architectures.


  CC  mm/gup.o
In file included from ./include/linux/kernel.h:11:0,
 from mm/gup.c:2:
In function 'gup_hugepte.constprop',
inlined from 'gup_huge_pd.isra.79' at mm/gup.c:2465:8:
./include/linux/compiler.h:392:38: error: call to '__compiletime_assert_222' 
declared with attribute error: Unsupported access size for {READ,WRITE}_ONCE().
  _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__)
  ^
./include/linux/compiler.h:373:4: note: in definition of macro 
'__compiletime_assert'
prefix ## suffix();\
^
./include/linux/compiler.h:392:2: note: in expansion of macro 
'_compiletime_assert'
  _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__)
  ^
./include/linux/compiler.h:405:2: note: in expansion of macro 
'compiletime_assert'
  compiletime_assert(__native_word(t) || sizeof(t) == sizeof(long long), \
  ^
./include/linux/compiler.h:291:2: note: in expansion of macro 
'compiletime_assert_rwonce_type'
  compiletime_assert_rwonce_type(x);\
  ^
mm/gup.c:2428:8: note: in expansion of macro 'READ_ONCE'
  pte = READ_ONCE(*ptep);
^
In function 'gup_get_pte',
inlined from 'gup_pte_range' at mm/gup.c:2228:9,
inlined from 'gup_pmd_range' at mm/gup.c:2613:15,
inlined from 'gup_pud_range' at mm/gup.c:2641:15,
inlined from 'gup_p4d_range' at mm/gup.c:2666:15,
inlined from 'gup_pgd_range' at mm/gup.c:2694:15,
inlined from 'internal_get_user_pages_fast' at mm/gup.c:2795:3:
./include/linux/compiler.h:392:38: error: call to '__compiletime_assert_219' 
declared with attribute error: Unsupported access size for {READ,WRITE}_ONCE().
  _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__)
  ^
./include/linux/compiler.h:373:4: note: in definition of macro 
'__compiletime_assert'
prefix ## suffix();\
^
./include/linux/compiler.h:392:2: note: in expansion of macro 
'_compiletime_assert'
  _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__)
  ^
./include/linux/compiler.h:405:2: note: in expansion of macro 
'compiletime_assert'
  compiletime_assert(__native_word(t) || sizeof(t) == sizeof(long long), \
  ^
./include/linux/compiler.h:291:2: note: in expansion of macro 
'compiletime_assert_rwonce_type'
  compiletime_assert_rwonce_type(x);\
  ^
mm/gup.c:2199:9: note: in expansion of macro 'READ_ONCE'
  return READ_ONCE(*ptep);
 ^
make[2]: *** [mm/gup.o] Error 1

Christophe Leroy (3):
  mm/gup: Use huge_ptep_get() in gup_hugepte()
  mm: Allow arches to provide ptep_get()
  powerpc/8xx: Provide ptep_get() with 16k pages

 arch/powerpc/include/asm/nohash/32/pgtable.h | 10 ++
 include/asm-generic/hugetlb.h|  2 +-
 include/linux/pgtable.h  |  7 +++
 mm/gup.c |  4 ++--
 4 files changed, 20 insertions(+), 3 deletions(-)

-- 
2.25.0



[PATCH 1/3] mm/gup: Use huge_ptep_get() in gup_hugepte()

2020-06-15 Thread Christophe Leroy
gup_hugepte() reads hugepage table entries, it can't read
them directly, huge_ptep_get() must be used.

Fixes: 2ab3a0a02905 ("READ_ONCE: Enforce atomicity for {READ,WRITE}_ONCE() 
memory accesses")
Cc: Will Deacon 
Signed-off-by: Christophe Leroy 
---
 mm/gup.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/gup.c b/mm/gup.c
index de9e36262ccb..761df4944ef5 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -2425,7 +2425,7 @@ static int gup_hugepte(pte_t *ptep, unsigned long sz, 
unsigned long addr,
if (pte_end < end)
end = pte_end;
 
-   pte = READ_ONCE(*ptep);
+   pte = huge_ptep_get(ptep);
 
if (!pte_access_permitted(pte, flags & FOLL_WRITE))
return 0;
-- 
2.25.0



[PATCH 2/3] mm: Allow arches to provide ptep_get()

2020-06-15 Thread Christophe Leroy
Since commit 2ab3a0a02905 ("READ_ONCE: Enforce atomicity for
{READ,WRITE}_ONCE() memory accesses"), it is not possible anymore
to use READ_ONCE() to access complex page table entries like
the one defined for powerpc 8xx with 16k size pages.

Define a ptep_get() helper that architectures can override instead
of performing a READ_ONCE() on the page table entry pointer.

Fixes: 2ab3a0a02905 ("READ_ONCE: Enforce atomicity for {READ,WRITE}_ONCE() 
memory accesses")
Cc: Will Deacon 
Signed-off-by: Christophe Leroy 
---
 include/asm-generic/hugetlb.h | 2 +-
 include/linux/pgtable.h   | 7 +++
 mm/gup.c  | 2 +-
 3 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/include/asm-generic/hugetlb.h b/include/asm-generic/hugetlb.h
index 40f85decc2ee..8e1e6244a89d 100644
--- a/include/asm-generic/hugetlb.h
+++ b/include/asm-generic/hugetlb.h
@@ -122,7 +122,7 @@ static inline int huge_ptep_set_access_flags(struct 
vm_area_struct *vma,
 #ifndef __HAVE_ARCH_HUGE_PTEP_GET
 static inline pte_t huge_ptep_get(pte_t *ptep)
 {
-   return READ_ONCE(*ptep);
+   return ptep_get(ptep);
 }
 #endif
 
diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h
index 32b6c52d41b9..56c1e8eb7bb0 100644
--- a/include/linux/pgtable.h
+++ b/include/linux/pgtable.h
@@ -249,6 +249,13 @@ static inline pte_t ptep_get_and_clear(struct mm_struct 
*mm,
 }
 #endif
 
+#ifndef __HAVE_ARCH_PTEP_GET
+static inline pte_t ptep_get(pte_t *ptep)
+{
+   return READ_ONCE(*ptep);
+}
+#endif
+
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
 #ifndef __HAVE_ARCH_PMDP_HUGE_GET_AND_CLEAR
 static inline pmd_t pmdp_huge_get_and_clear(struct mm_struct *mm,
diff --git a/mm/gup.c b/mm/gup.c
index 761df4944ef5..6f47697f8fb0 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -2196,7 +2196,7 @@ static inline pte_t gup_get_pte(pte_t *ptep)
  */
 static inline pte_t gup_get_pte(pte_t *ptep)
 {
-   return READ_ONCE(*ptep);
+   return ptep_get(ptep);
 }
 #endif /* CONFIG_GUP_GET_PTE_LOW_HIGH */
 
-- 
2.25.0



Re: [PATCH v13 2/6] seq_buf: Export seq_buf_printf

2020-06-15 Thread Borislav Petkov
On Mon, Jun 15, 2020 at 06:14:03PM +0530, Vaibhav Jain wrote:
> 'seq_buf' provides a very useful abstraction for writing to a string
> buffer without needing to worry about it over-flowing. However even
> though the API has been stable for couple of years now its still not
> exported to kernel loadable modules limiting its usage.
> 
> Hence this patch proposes update to 'seq_buf.c' to mark
> seq_buf_printf() which is part of the seq_buf API to be exported to
> kernel loadable GPL modules. This symbol will be used in later parts
> of this patch-set to simplify content creation for a sysfs attribute.
> 
> Cc: Piotr Maziarz 
> Cc: Cezary Rojewski 
> Cc: Christoph Hellwig 
> Cc: Steven Rostedt 
> Cc: Borislav Petkov 
> Acked-by: Steven Rostedt (VMware) 
> Signed-off-by: Vaibhav Jain 
> ---
> Changelog:
> 
> v12..v13:
> * None
> 
> v11..v12:
> * None

Can you please resend your patchset once a week like everyone else and
not flood inboxes with it?

-- 
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette


Re: [PATCH v12 0/6] powerpc/papr_scm: Add support for reporting nvdimm health

2020-06-15 Thread Vaibhav Jain
This accidently got reposted. Please ignore.

v13 version of the patch series located at
https://lore.kernel.org/linux-nvdimm/20200615124407.32596-1-vaib...@linux.ibm.com


Vaibhav Jain  writes:

> Changes since v11 [1]:
> * Minor update to 'papr_pdsm.h' fixing a misleading comment about
>   'possible' padding being added by GCC which doesn't apply in case
>   structs are marked as __packed.
> * Fix the order of initialization of 'struct nd_papr_pdsm_health' in
>   papr_pdsm_health().
> * Added acks from Ira for various patches.
>
> [1] 
> https://lore.kernel.org/linux-nvdimm/20200607131339.476036-1-vaib...@linux.ibm.com
> ---
>
> The PAPR standard[2][4] provides mechanisms to query the health and
> performance stats of an NVDIMM via various hcalls as described in
> Ref[3].  Until now these stats were never available nor exposed to the
> user-space tools like 'ndctl'. This is partly due to PAPR platform not
> having support for ACPI and NFIT. Hence 'ndctl' is unable to query and
> report the dimm health status and a user had no way to determine the
> current health status of a NDVIMM.
>
> To overcome this limitation, this patch-set updates papr_scm kernel
> module to query and fetch NVDIMM health stats using hcalls described
> in Ref[3].  This health and performance stats are then exposed to
> userspace via sysfs and PAPR-NVDIMM-Specific-Methods(PDSM) issued by
> libndctl.
>
> These changes coupled with proposed ndtcl changes located at Ref[5]
> should provide a way for the user to retrieve NVDIMM health status
> using ndtcl.
>
> Below is a sample output using proposed kernel + ndctl for PAPR NVDIMM
> in a emulation environment:
>
>  # ndctl list -DH
> [
>   {
> "dev":"nmem0",
> "health":{
>   "health_state":"fatal",
>   "shutdown_state":"dirty"
> }
>   }
> ]
>
> Dimm health report output on a pseries guest lpar with vPMEM or HMS
> based NVDIMMs that are in perfectly healthy conditions:
>
>  # ndctl list -d nmem0 -H
> [
>   {
> "dev":"nmem0",
> "health":{
>   "health_state":"ok",
>   "shutdown_state":"clean"
> }
>   }
> ]
>
> PAPR NVDIMM-Specific-Methods(PDSM)
> ==
>
> PDSM requests are issued by vendor specific code in libndctl to
> execute certain operations or fetch information from NVDIMMS. PDSMs
> requests can be sent to papr_scm module via libndctl(userspace) and
> libnvdimm (kernel) using the ND_CMD_CALL ioctl command which can be
> handled in the dimm control function papr_scm_ndctl(). Current
> patchset proposes a single PDSM to retrieve NVDIMM health, defined in
> the newly introduced uapi header named 'papr_pdsm.h'. Support for
> more PDSMs will be added in future.
>
> Structure of the patch-set
> ==
>
> The patch-set starts with a doc patch documenting details of hcall
> H_SCM_HEALTH. Second patch exports kernel symbol seq_buf_printf()
> thats used in subsequent patches to generate sysfs attribute content.
>
> Third patch implements support for fetching NVDIMM health information
> from PHYP and partially exposing it to user-space via a NVDIMM sysfs
> flag.
>
> Fourth patch updates papr_scm_ndctl() to handle a possible error case
> and also improve debug logging.
>
> Fifth patch deals with implementing support for servicing PDSM
> commands in papr_scm module.
>
> Finally the last patch implements support for servicing PDSM
> 'PAPR_PDSM_HEALTH' that returns the NVDIMM health information to
> libndctl.
>
> References:
> [2] "Power Architecture Platform Reference"
>   https://en.wikipedia.org/wiki/Power_Architecture_Platform_Reference
> [3] commit 58b278f568f0
>  ("powerpc: Provide initial documentation for PAPR hcalls")
> [4] "Linux on Power Architecture Platform Reference"
>  https://members.openpowerfoundation.org/document/dl/469
> [5] https://github.com/vaibhav92/ndctl/tree/papr_scm_health_v12
>
> ---
>
> Vaibhav Jain (6):
>   powerpc: Document details on H_SCM_HEALTH hcall
>   seq_buf: Export seq_buf_printf
>   powerpc/papr_scm: Fetch nvdimm health information from PHYP
>   powerpc/papr_scm: Improve error logging and handling papr_scm_ndctl()
>   ndctl/papr_scm,uapi: Add support for PAPR nvdimm specific methods
>   powerpc/papr_scm: Implement support for PAPR_PDSM_HEALTH
>
>  Documentation/ABI/testing/sysfs-bus-papr-pmem |  27 ++
>  Documentation/powerpc/papr_hcalls.rst |  46 ++-
>  arch/powerpc/include/uapi/asm/papr_pdsm.h | 125 ++
>  arch/powerpc/platforms/pseries/papr_scm.c | 373 +-
>  include/uapi/linux/ndctl.h|   1 +
>  lib/seq_buf.c |   1 +
>  6 files changed, 562 insertions(+), 11 deletions(-)
>  create mode 100644 Documentation/ABI/testing/sysfs-bus-papr-pmem
>  create mode 100644 arch/powerpc/include/uapi/asm/papr_pdsm.h
>
> -- 
> 2.26.2
>

-- 
Cheers
~ Vaibhav


Re: [PATCH v12 1/6] powerpc: Document details on H_SCM_HEALTH hcall

2020-06-15 Thread Vaibhav Jain


This accidently got reposted. Please ignore.

Vaibhav Jain  writes:

> Add documentation to 'papr_hcalls.rst' describing the bitmap flags
> that are returned from H_SCM_HEALTH hcall as per the PAPR-SCM
> specification.
>
> Cc: "Aneesh Kumar K . V" 
> Cc: Dan Williams 
> Cc: Michael Ellerman 
> Cc: Ira Weiny 
> Acked-by: Ira Weiny 
> Signed-off-by: Vaibhav Jain 
> ---
> Changelog:
>
> v11..v12:
> * None
>
> v10..v11:
> * None
>
> v9..v10:
> * Added ack from Ira.
>
> Resend:
> * None
>
> v8..v9:
> * s/SCM/PMEM device. [ Dan Williams, Aneesh ]
>
> v7..v8:
> * Added a clarification on bit-ordering of Health Bitmap
>
> Resend:
> * None
>
> v6..v7:
> * None
>
> v5..v6:
> * New patch in the series
> ---
>  Documentation/powerpc/papr_hcalls.rst | 46 ---
>  1 file changed, 42 insertions(+), 4 deletions(-)
>
> diff --git a/Documentation/powerpc/papr_hcalls.rst 
> b/Documentation/powerpc/papr_hcalls.rst
> index 3493631a60f8..48fcf1255a33 100644
> --- a/Documentation/powerpc/papr_hcalls.rst
> +++ b/Documentation/powerpc/papr_hcalls.rst
> @@ -220,13 +220,51 @@ from the LPAR memory.
>  **H_SCM_HEALTH**
>  
>  | Input: drcIndex
> -| Out: *health-bitmap, health-bit-valid-bitmap*
> +| Out: *health-bitmap (r4), health-bit-valid-bitmap (r5)*
>  | Return Value: *H_Success, H_Parameter, H_Hardware*
>  
>  Given a DRC Index return the info on predictive failure and overall health of
> -the NVDIMM. The asserted bits in the health-bitmap indicate a single 
> predictive
> -failure and health-bit-valid-bitmap indicate which bits in health-bitmap are
> -valid.
> +the PMEM device. The asserted bits in the health-bitmap indicate one or more 
> states
> +(described in table below) of the PMEM device and health-bit-valid-bitmap 
> indicate
> +which bits in health-bitmap are valid. The bits are reported in
> +reverse bit ordering for example a value of 0xC400
> +indicates bits 0, 1, and 5 are valid.
> +
> +Health Bitmap Flags:
> +
> ++--+---+
> +|  Bit |   Definition
>   |
> ++==+===+
> +|  00  | PMEM device is unable to persist memory contents.   
>   |
> +|  | If the system is powered down, nothing will be saved.   
>   |
> ++--+---+
> +|  01  | PMEM device failed to persist memory contents. Either contents were 
>   |
> +|  | not saved successfully on power down or were not restored properly 
> on |
> +|  | power up.   
>   |
> ++--+---+
> +|  02  | PMEM device contents are persisted from previous IPL. The data from 
>   |
> +|  | the last boot were successfully restored.   
>   |
> ++--+---+
> +|  03  | PMEM device contents are not persisted from previous IPL. There was 
> no|
> +|  | data to restore from the last boot. 
>   |
> ++--+---+
> +|  04  | PMEM device memory life remaining is critically low 
>   |
> ++--+---+
> +|  05  | PMEM device will be garded off next IPL due to failure  
>   |
> ++--+---+
> +|  06  | PMEM device contents cannot persist due to current platform health  
>   |
> +|  | status. A hardware failure may prevent data from being saved or 
>   |
> +|  | restored.   
>   |
> ++--+---+
> +|  07  | PMEM device is unable to persist memory contents in certain 
> conditions|
> ++--+---+
> +|  08  | PMEM device is encrypted
>   |
> ++--+---+
> +|  09  | PMEM device has successfully completed a requested erase or secure  
>   |
> +|  | erase procedure.
>   |
> ++--+---+
> +|10:63 | Reserved / Unused   
>   |
> ++--+---+
>  
>  **H_SCM_PERFORMANCE_STATS**
>  
> -- 
> 2.26.2
>

-- 
Cheers
~ Vaibhav


[PATCH v13 5/6] ndctl/papr_scm, uapi: Add support for PAPR nvdimm specific methods

2020-06-15 Thread Vaibhav Jain
Introduce support for PAPR NVDIMM Specific Methods (PDSM) in papr_scm
module and add the command family NVDIMM_FAMILY_PAPR to the white list
of NVDIMM command sets. Also advertise support for ND_CMD_CALL for the
nvdimm command mask and implement necessary scaffolding in the module
to handle ND_CMD_CALL ioctl and PDSM requests that we receive.

The layout of the PDSM request as we expect from libnvdimm/libndctl is
described in newly introduced uapi header 'papr_pdsm.h' which
defines a 'struct nd_pkg_pdsm' and a maximal union named
'nd_pdsm_payload'. These new structs together with 'struct nd_cmd_pkg'
for a pdsm envelop thats sent by libndctl to libnvdimm and serviced by
papr_scm in 'papr_scm_service_pdsm()'. The PDSM request is
communicated by member 'struct nd_cmd_pkg.nd_command' together with
other information on the pdsm payload (size-in, size-out).

The patch also introduces 'struct pdsm_cmd_desc' instances of which
are stored in an array __pdsm_cmd_descriptors[] indexed with PDSM cmd
and corresponding access function pdsm_cmd_desc() is
introduced. 'struct pdsm_cdm_desc' holds the service function for a
given PDSM and corresponding payload in/out sizes.

A new function papr_scm_service_pdsm() is introduced and is called from
papr_scm_ndctl() in case of a PDSM request is received via ND_CMD_CALL
command from libnvdimm. The function performs validation on the PDSM
payload based on info present in corresponding PDSM descriptor and if
valid calls the 'struct pdcm_cmd_desc.service' function to service the
PDSM.

Cc: "Aneesh Kumar K . V" 
Cc: Dan Williams 
Cc: Michael Ellerman 
Cc: Ira Weiny 
Signed-off-by: Vaibhav Jain 
---
Changelog:

v12..v13:
* s/struct nd_pdsm_cmd_pkg/struct nd_pkg_pdsm/
* Removed instance of 'struct nd_cmd_pkg hdr' from 'struct
  nd_pdsm_cmd_pkg'.
* Introduced 'union nd_pdsm_payload' thats a maximal union of all
  possible payload structs.
* Instead of having flexible 'payload' member 'struct nd_pdsm_cmd_pkg'
  replace it with a 'union nd_pdsm_payload'.
* Introduce pdsm descriptor 'struct pdsm_cmd_desc' and its array
  __pdsm_cmd_descriptors[] that holds a payload 'size_[in|out]' and
  service function for each pdsm. This is analogus to
  '__nd_cmd_dimm_descs[]'
* Introduce function 'pdsm_cmd_desc()' to fetch the corrosponding pdsm
  descriptor for each valid pdsm.
* Updated papr_scm_service_pdsm() to use 'pdsm_cmd_desc()' and apply
  checks on 'nd_cmd_pkg' payload based on psdm descriptor members and
  finally service the pdsm using the 'service' member of the
  descriptor.
* Updated 'papr_scm_ndctl()' to send the 'struct nd_cmd_pkg*' instance
  to papr_scm_service_pdsm() instead of a 'struct nd_pdsm_cmd_pkg*'.
* Updated documentation in 'papr_psdm.h' to reflect new pdsm envelope
  layout.
* Update patch description.

v11..v12:
* Updated a misleading comment in 'papr_pdsm.h' regarding payload
  size. [ Ira ]

v10..v11:
* Moved in-lines 'nd_pdsm_cmd_pkg()' and 'pdsm_cmd_to_payload()' from
  'papr_pdsm.h' header to 'papr_scm.c'. The avoids a potential license
  incompatibility issue with non-GPL-2.0 user-space code trying to
  include the header in its code. [ Ira ]
* Verified papr_pdsm.h with UAPI_HEADER_TEST config.
* Moved the is_cmd_valid() check in papr_scm_ndctl() before check for
  cmd_rc == NULL. This prevents cmd_rc to be updated in case the
  nd-cmd is invalid or unknown.

v9..v10:
* Simplified 'struct nd_pdsm_cmd_pkg' by removing the
  'payload_version' field.
* Removed the corrosponding documentation on versioning and backward
  compatibility from 'papr_pdsm.h'
* Reduced the size of reserved fields to 4-bytes making 'struct
  nd_pdsm_cmd_pkg' 64 + 8 bytes long.
* Updated is_cmd_valid() to enforce validation checks on pdsm
  commands. [ Dan Williams ]
* Added check for reserved fields being set to '0' in is_cmd_valid()
  [ Ira ]
* Moved changes for checking cmd_rc == NULL and logging improvements
  to a separate prelim patch [ Ira ].
* Moved  pdsm package validation checks from papr_scm_service_pdsm()
  to is_cmd_valid().
* Marked papr_scm_service_pdsm() return type as 'void' since errors
  are reported in nd_pdsm_cmd_pkg.cmd_status field.

Resend:
* Added ack from Aneesh.

v8..v9:
* Reduced the usage of term SCM replacing it with appropriate
  replacement [ Dan Williams, Aneesh ]
* Renamed 'papr_scm_pdsm.h' to 'papr_pdsm.h'
* s/PAPR_SCM_PDSM_*/PAPR_PDSM_*/g
* s/NVDIMM_FAMILY_PAPR_SCM/NVDIMM_FAMILY_PAPR/g
* Minor updates to 'papr_psdm.h' to replace usage of term 'SCM'.
* Minor update to patch description.

v7..v8:
* Removed the 'payload_offset' field from 'struct
  nd_pdsm_cmd_pkg'. Instead command payload is always assumed to start
  at 'nd_pdsm_cmd_pkg.payload'. [ Aneesh ]
* To enable introducing new fields to 'struct nd_pdsm_cmd_pkg',
  'reserved' field of 10-bytes is introduced. [ Aneesh ]
* Fixed a typo in "Backward Compatibility" section of papr_scm_pdsm.h
  [ Ira ]

Resend:
* None

v6..v7 :
* Removed the re-definitions of __packed macro from papr_scm_pdsm.h
  [Mpe].
* 

[PATCH v13 2/6] seq_buf: Export seq_buf_printf

2020-06-15 Thread Vaibhav Jain
'seq_buf' provides a very useful abstraction for writing to a string
buffer without needing to worry about it over-flowing. However even
though the API has been stable for couple of years now its still not
exported to kernel loadable modules limiting its usage.

Hence this patch proposes update to 'seq_buf.c' to mark
seq_buf_printf() which is part of the seq_buf API to be exported to
kernel loadable GPL modules. This symbol will be used in later parts
of this patch-set to simplify content creation for a sysfs attribute.

Cc: Piotr Maziarz 
Cc: Cezary Rojewski 
Cc: Christoph Hellwig 
Cc: Steven Rostedt 
Cc: Borislav Petkov 
Acked-by: Steven Rostedt (VMware) 
Signed-off-by: Vaibhav Jain 
---
Changelog:

v12..v13:
* None

v11..v12:
* None

v10..v11:
* None

v9..v10:
* None

Resend:
* Added ack from Steven Rostedt

v8..v9:
* None

v7..v8:
* Updated the patch title [ Christoph Hellwig ]
* Updated patch description to replace confusing term 'external kernel
  modules' to 'kernel lodable modules'.

Resend:
* Added ack from Steven Rostedt

v6..v7:
* New patch in the series
---
 lib/seq_buf.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/lib/seq_buf.c b/lib/seq_buf.c
index 4e865d42ab03..707453f5d58e 100644
--- a/lib/seq_buf.c
+++ b/lib/seq_buf.c
@@ -91,6 +91,7 @@ int seq_buf_printf(struct seq_buf *s, const char *fmt, ...)
 
return ret;
 }
+EXPORT_SYMBOL_GPL(seq_buf_printf);
 
 #ifdef CONFIG_BINARY_PRINTF
 /**
-- 
2.26.2



[PATCH v13 4/6] powerpc/papr_scm: Improve error logging and handling papr_scm_ndctl()

2020-06-15 Thread Vaibhav Jain
Since papr_scm_ndctl() can be called from outside papr_scm, its
exposed to the possibility of receiving NULL as value of 'cmd_rc'
argument. This patch updates papr_scm_ndctl() to protect against such
possibility by assigning it pointer to a local variable in case cmd_rc
== NULL.

Finally the patch also updates the 'default' add a debug log unknown
'cmd' values.

Cc: "Aneesh Kumar K . V" 
Cc: Dan Williams 
Cc: Michael Ellerman 
Cc: Ira Weiny 
Reviewed-by: Ira Weiny 
Signed-off-by: Vaibhav Jain 
---
Changelog:

v12..v13:
* None

v11..v12:
* Added ack from Ira

v10..v11:
* Instead of returning *cmd_rd just return '0' in case nd_cmd is
  handled. In case of unknown nd-cmd return -EINVAL
  [ Ira and Dan Williams ]
* Updated patch description.

v9..v10
* New patch in the series
---
 arch/powerpc/platforms/pseries/papr_scm.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/arch/powerpc/platforms/pseries/papr_scm.c 
b/arch/powerpc/platforms/pseries/papr_scm.c
index 0c091622b15e..692ad3d79826 100644
--- a/arch/powerpc/platforms/pseries/papr_scm.c
+++ b/arch/powerpc/platforms/pseries/papr_scm.c
@@ -355,11 +355,16 @@ static int papr_scm_ndctl(struct nvdimm_bus_descriptor 
*nd_desc,
 {
struct nd_cmd_get_config_size *get_size_hdr;
struct papr_scm_priv *p;
+   int rc;
 
/* Only dimm-specific calls are supported atm */
if (!nvdimm)
return -EINVAL;
 
+   /* Use a local variable in case cmd_rc pointer is NULL */
+   if (!cmd_rc)
+   cmd_rc = 
+
p = nvdimm_provider_data(nvdimm);
 
switch (cmd) {
@@ -381,6 +386,7 @@ static int papr_scm_ndctl(struct nvdimm_bus_descriptor 
*nd_desc,
break;
 
default:
+   dev_dbg(>pdev->dev, "Unknown command = %d\n", cmd);
return -EINVAL;
}
 
-- 
2.26.2



[PATCH v13 6/6] powerpc/papr_scm: Implement support for PAPR_PDSM_HEALTH

2020-06-15 Thread Vaibhav Jain
This patch implements support for PDSM request 'PAPR_PDSM_HEALTH'
that returns a newly introduced 'struct nd_papr_pdsm_health' instance
containing dimm health information back to user space in response to
ND_CMD_CALL. This functionality is implemented in newly introduced
papr_pdsm_health() that queries the nvdimm health information and
then copies this information to the package payload whose layout is
defined by 'struct nd_papr_pdsm_health'.

Cc: "Aneesh Kumar K . V" 
Cc: Dan Williams 
Cc: Michael Ellerman 
Cc: Ira Weiny 
Signed-off-by: Vaibhav Jain 
---
Changelog:
v12..v13:
* Added the 'struct nd_papr_pdsm_health' to 'union nd_pdsm_payload'
* Added pdsm descriptor of 'PAPR_PDSM_HEALTH' to
  __pdsm_cmd_descriptors[].
* Updated papr_pdsm_health() to copy the dimm health information
  directly into PDSM payload rather then copying it to an intermediate
  struct and then memcpy it to the PDSM payload area.

v11..v12:
* Minor: Reodered the initialization of 'struct nd_papr_pdsm_health'
  fields to match order present in its definition. [ Ira ]
* Added ack from Ira

v10..v11:
* Changed the definition of 'struct nd_papr_pdsm_health' to a maximal
  struct 184 bytes in size [ Dan Williams ]
* Added new field 'extension_flags' to 'struct nd_papr_pdsm_health'
  [ Dan Williams ]
* Updated papr_pdsm_health() to set field 'extension_flags' to 0.
* Introduced a define ND_PDSM_PAYLOAD_MAX_SIZE that indicates the
  maximum size of a payload.
* Fixed a suspicious conversion from u64 to u8 in papr_pdsm_health
  that was preventing correct initialization of 'struct
  nd_papr_pdsm_health'. [ Ira ]

v9..v10:
* Removed code in papr_pdsm_health that performed validation on pdsm
  payload version and corrosponding struct and defines used for
  validation of payload version.
* Dropped usage of struct papr_pdsm_health in 'struct
  papr_scm_priv'. Instead papr_psdm_health() now uses
  'papr_scm_priv.health_bitmap' to populate the pdsm payload.
* Above change also fixes the problem where this patch was removing
  the code that was previously introduced in this patch-series.
  [ Ira ]
* Introduced a new def ND_PDSM_ENVELOPE_HDR_SIZE that indicates the
  space allocated to 'struct nd_pdsm_cmd_pkg' fields except 'struct
  nd_cmd_pkg'. This def is useful in validating payload sizes.
* Reworked papr_pdsm_health() to enforce a specific payload size for
  'PAPR_PDSM_HEALTH' pdsm request.

Resend:
* Added ack from Aneesh.

v8..v9:
* s/PAPR_SCM_PDSM_HEALTH/PAPR_PDSM_HEALTH/g  [ Dan , Aneesh ]
* s/PAPR_SCM_PSDM_DIMM_*/PAPR_PDSM_DIMM_*/g
* Renamed papr_scm_get_health() to papr_psdm_health()
* Updated patch description to replace papr-scm dimm with nvdimm.

v7..v8:
* None

Resend:
* None

v6..v7:
* Updated flags_show() to use seq_buf_printf(). [Mpe]
* Updated papr_scm_get_health() to use newly introduced
  __drc_pmem_query_health() bypassing the cache [Mpe].

v5..v6:
* Added attribute '__packed' to 'struct nd_papr_pdsm_health_v1' to
  gaurd against possibility of different compilers adding different
  paddings to the struct [ Dan Williams ]

* Updated 'struct nd_papr_pdsm_health_v1' to use __u8 instead of
  'bool' and also updated drc_pmem_query_health() to take this into
  account. [ Dan Williams ]

v4..v5:
* None

v3..v4:
* Call the DSM_PAPR_SCM_HEALTH service function from
  papr_scm_service_dsm() instead of papr_scm_ndctl(). [Aneesh]

v2..v3:
* Updated struct nd_papr_scm_dimm_health_stat_v1 to use '__xx' types
  as its exported to the userspace [Aneesh]
* Changed the constants DSM_PAPR_SCM_DIMM_XX indicating dimm health
  from enum to #defines [Aneesh]

v1..v2:
* New patch in the series
---
 arch/powerpc/include/uapi/asm/papr_pdsm.h | 37 
 arch/powerpc/platforms/pseries/papr_scm.c | 51 +++
 2 files changed, 88 insertions(+)

diff --git a/arch/powerpc/include/uapi/asm/papr_pdsm.h 
b/arch/powerpc/include/uapi/asm/papr_pdsm.h
index 28115152aa4e..9ccecc1d6840 100644
--- a/arch/powerpc/include/uapi/asm/papr_pdsm.h
+++ b/arch/powerpc/include/uapi/asm/papr_pdsm.h
@@ -66,17 +66,54 @@
 #define ND_PDSM_HDR_SIZE \
(sizeof(struct nd_pkg_pdsm) - ND_PDSM_PAYLOAD_MAX_SIZE)
 
+/* Various nvdimm health indicators */
+#define PAPR_PDSM_DIMM_HEALTHY   0
+#define PAPR_PDSM_DIMM_UNHEALTHY 1
+#define PAPR_PDSM_DIMM_CRITICAL  2
+#define PAPR_PDSM_DIMM_FATAL 3
+
+/*
+ * Struct exchanged between kernel & ndctl in for PAPR_PDSM_HEALTH
+ * Various flags indicate the health status of the dimm.
+ *
+ * extension_flags : Any extension fields present in the struct.
+ * dimm_unarmed: Dimm not armed. So contents wont persist.
+ * dimm_bad_shutdown   : Previous shutdown did not persist contents.
+ * dimm_bad_restore: Contents from previous shutdown werent restored.
+ * dimm_scrubbed   : Contents of the dimm have been scrubbed.
+ * dimm_locked : Contents of the dimm cant be modified until CEC reboot
+ * dimm_encrypted  : Contents of dimm are encrypted.
+ * dimm_health 

[PATCH v13 3/6] powerpc/papr_scm: Fetch nvdimm health information from PHYP

2020-06-15 Thread Vaibhav Jain
Implement support for fetching nvdimm health information via
H_SCM_HEALTH hcall as documented in Ref[1]. The hcall returns a pair
of 64-bit bitmap, bitwise-and of which is then stored in
'struct papr_scm_priv' and subsequently partially exposed to
user-space via newly introduced dimm specific attribute
'papr/flags'. Since the hcall is costly, the health information is
cached and only re-queried, 60s after the previous successful hcall.

The patch also adds a  documentation text describing flags reported by
the the new sysfs attribute 'papr/flags' is also introduced at
Documentation/ABI/testing/sysfs-bus-papr-pmem.

[1] commit 58b278f568f0 ("powerpc: Provide initial documentation for
PAPR hcalls")

Cc: "Aneesh Kumar K . V" 
Cc: Dan Williams 
Cc: Michael Ellerman 
Cc: Ira Weiny 
Signed-off-by: Vaibhav Jain 
---
Changelog:

v12..v13:
* None

v11..v12:
* None

v10..v11:
* None

v9..v10:
* Removed an avoidable 'goto' in __drc_pmem_query_health. [ Ira ].

Resend:
* Added ack from Aneesh.

v8..v9:
* Rename some variables and defines to reduce usage of term SCM
  replacing it with PMEM [Dan Williams, Aneesh]
* s/PAPR_SCM_DIMM/PAPR_PMEM/g
* s/papr_scm_nd_attributes/papr_nd_attributes/g
* s/papr_scm_nd_attribute_group/papr_nd_attribute_group/g
* s/papr_scm_dimm_attr_groups/papr_nd_attribute_groups/g
* Renamed file sysfs-bus-papr-scm to sysfs-bus-papr-pmem

v7..v8:
* Update type of variable 'rc' in __drc_pmem_query_health() and
  drc_pmem_query_health() to long and int respectively. [ Ira ]
* Updated the patch description to s/64 bit Big Endian Number/64-bit
  bitmap/ [ Ira, Aneesh ].

Resend:
* None

v6..v7 :
* Used the exported buf_seq_printf() function to generate content for
  'papr/flags'
* Moved the PAPR_SCM_DIMM_* bit-flags macro definitions to papr_scm.c
  and removed the papr_scm.h file [Mpe]
* Some minor consistency issued in sysfs-bus-papr-scm
  documentation. [Mpe]
* s/dimm_mutex/health_mutex/g [Mpe]
* Split drc_pmem_query_health() into two function one of which takes
  care of caching and locking. [Mpe]
* Fixed a local copy creation of dimm health information using
  READ_ONCE(). [Mpe]

v5..v6 :
* Change the flags sysfs attribute from 'papr_flags' to 'papr/flags'
  [Dan Williams]
* Include documentation for 'papr/flags' attr [Dan Williams]
* Change flag 'save_fail' to 'flush_fail' [Dan Williams]
* Caching of health bitmap to reduce expensive hcalls [Dan Williams]
* Removed usage of PPC_BIT from 'papr-scm.h' header [Mpe]
* Replaced two __be64 integers from papr_scm_priv to a single u64
  integer [Mpe]
* Updated patch description to reflect the changes made in this
  version.
* Removed avoidable usage of 'papr_scm_priv.dimm_mutex' from
  flags_show() [Dan Williams]

v4..v5 :
* None

v3..v4 :
* None

v2..v3 :
* Removed PAPR_SCM_DIMM_HEALTH_NON_CRITICAL as a condition for
 NVDIMM unarmed [Aneesh]

v1..v2 :
* New patch in the series.
---
 Documentation/ABI/testing/sysfs-bus-papr-pmem |  27 +++
 arch/powerpc/platforms/pseries/papr_scm.c | 168 +-
 2 files changed, 193 insertions(+), 2 deletions(-)
 create mode 100644 Documentation/ABI/testing/sysfs-bus-papr-pmem

diff --git a/Documentation/ABI/testing/sysfs-bus-papr-pmem 
b/Documentation/ABI/testing/sysfs-bus-papr-pmem
new file mode 100644
index ..5b10d036a8d4
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-bus-papr-pmem
@@ -0,0 +1,27 @@
+What:  /sys/bus/nd/devices/nmemX/papr/flags
+Date:  Apr, 2020
+KernelVersion: v5.8
+Contact:   linuxppc-dev , 
linux-nvd...@lists.01.org,
+Description:
+   (RO) Report flags indicating various states of a
+   papr-pmem NVDIMM device. Each flag maps to a one or
+   more bits set in the dimm-health-bitmap retrieved in
+   response to H_SCM_HEALTH hcall. The details of the bit
+   flags returned in response to this hcall is available
+   at 'Documentation/powerpc/papr_hcalls.rst' . Below are
+   the flags reported in this sysfs file:
+
+   * "not_armed"   : Indicates that NVDIMM contents will not
+ survive a power cycle.
+   * "flush_fail"  : Indicates that NVDIMM contents
+ couldn't be flushed during last
+ shut-down event.
+   * "restore_fail": Indicates that NVDIMM contents
+ couldn't be restored during NVDIMM
+ initialization.
+   * "encrypted"   : NVDIMM contents are encrypted.
+   * "smart_notify": There is health event for the NVDIMM.
+   * "scrubbed": Indicating that contents of the
+ NVDIMM have been scrubbed.
+   * "locked"  : Indicating that NVDIMM contents cant
+ be modified until next power cycle.
diff --git a/arch/powerpc/platforms/pseries/papr_scm.c 

[PATCH v13 1/6] powerpc: Document details on H_SCM_HEALTH hcall

2020-06-15 Thread Vaibhav Jain
Add documentation to 'papr_hcalls.rst' describing the bitmap flags
that are returned from H_SCM_HEALTH hcall as per the PAPR-SCM
specification.

Cc: "Aneesh Kumar K . V" 
Cc: Dan Williams 
Cc: Michael Ellerman 
Cc: Ira Weiny 
Acked-by: Ira Weiny 
Signed-off-by: Vaibhav Jain 
---
Changelog:

v12..v13:
* None

v11..v12:
* None

v10..v11:
* None

v9..v10:
* Added ack from Ira.

Resend:
* None

v8..v9:
* s/SCM/PMEM device. [ Dan Williams, Aneesh ]

v7..v8:
* Added a clarification on bit-ordering of Health Bitmap

Resend:
* None

v6..v7:
* None

v5..v6:
* New patch in the series
---
 Documentation/powerpc/papr_hcalls.rst | 46 ---
 1 file changed, 42 insertions(+), 4 deletions(-)

diff --git a/Documentation/powerpc/papr_hcalls.rst 
b/Documentation/powerpc/papr_hcalls.rst
index 3493631a60f8..48fcf1255a33 100644
--- a/Documentation/powerpc/papr_hcalls.rst
+++ b/Documentation/powerpc/papr_hcalls.rst
@@ -220,13 +220,51 @@ from the LPAR memory.
 **H_SCM_HEALTH**
 
 | Input: drcIndex
-| Out: *health-bitmap, health-bit-valid-bitmap*
+| Out: *health-bitmap (r4), health-bit-valid-bitmap (r5)*
 | Return Value: *H_Success, H_Parameter, H_Hardware*
 
 Given a DRC Index return the info on predictive failure and overall health of
-the NVDIMM. The asserted bits in the health-bitmap indicate a single predictive
-failure and health-bit-valid-bitmap indicate which bits in health-bitmap are
-valid.
+the PMEM device. The asserted bits in the health-bitmap indicate one or more 
states
+(described in table below) of the PMEM device and health-bit-valid-bitmap 
indicate
+which bits in health-bitmap are valid. The bits are reported in
+reverse bit ordering for example a value of 0xC400
+indicates bits 0, 1, and 5 are valid.
+
+Health Bitmap Flags:
+
++--+---+
+|  Bit |   Definition  
|
++==+===+
+|  00  | PMEM device is unable to persist memory contents. 
|
+|  | If the system is powered down, nothing will be saved. 
|
++--+---+
+|  01  | PMEM device failed to persist memory contents. Either contents were   
|
+|  | not saved successfully on power down or were not restored properly on 
|
+|  | power up. 
|
++--+---+
+|  02  | PMEM device contents are persisted from previous IPL. The data from   
|
+|  | the last boot were successfully restored. 
|
++--+---+
+|  03  | PMEM device contents are not persisted from previous IPL. There was 
no|
+|  | data to restore from the last boot.   
|
++--+---+
+|  04  | PMEM device memory life remaining is critically low   
|
++--+---+
+|  05  | PMEM device will be garded off next IPL due to failure
|
++--+---+
+|  06  | PMEM device contents cannot persist due to current platform health
|
+|  | status. A hardware failure may prevent data from being saved or   
|
+|  | restored. 
|
++--+---+
+|  07  | PMEM device is unable to persist memory contents in certain 
conditions|
++--+---+
+|  08  | PMEM device is encrypted  
|
++--+---+
+|  09  | PMEM device has successfully completed a requested erase or secure
|
+|  | erase procedure.  
|
++--+---+
+|10:63 | Reserved / Unused 
|
++--+---+
 
 **H_SCM_PERFORMANCE_STATS**
 
-- 
2.26.2



[PATCH v13 0/6] powerpc/papr_scm: Add support for reporting nvdimm health

2020-06-15 Thread Vaibhav Jain
Changes since v12 [1]:
* Fixed the clang warning regarding variable length object' being not at
  the end of the 'struct nd_pdsm_cmd_pkg' by introducing a new layout.
* Removed instance of 'struct nd_cmd_pkg hdr' from 'struct
  nd_pdsm_cmd_pkg' and renamed the struct to 'struct nd_pkg_pdsm' to
  match the libndctl naming convention.
* Introduced 'union nd_pdsm_payload' thats a maximal union of all
  possible payload structs and use it instead of having flexible
  'payload' member 'struct nd_pdsm_cmd_pkg'.
* Introduce pdsm descriptor 'struct pdsm_cmd_desc' and its array
  __pdsm_cmd_descriptors[] that holds a payload 'size_[in|out]' and
  service function for each pdsm. This is analogues to
  '__nd_cmd_dimm_descs[]'
* Introduce function 'pdsm_cmd_desc()' to fetch the corresponding pdsm
  descriptor for each valid pdsm.
* Updated papr_scm_service_pdsm() to use 'pdsm_cmd_desc()' and apply
  checks on 'nd_cmd_pkg' payload based on psdm descriptor members and
  finally service the pdsm using the 'service' member of the
  descriptor.
* Updated Patch-5 that to use the updated 'struct nd_pkg_pdsm'
  definition.
  
[1] 
https://lore.kernel.org/linux-nvdimm/20200608211026.67573-1-vaib...@linux.ibm.com
---

The PAPR standard[2][4] provides mechanisms to query the health and
performance stats of an NVDIMM via various hcalls as described in
Ref[3].  Until now these stats were never available nor exposed to the
user-space tools like 'ndctl'. This is partly due to PAPR platform not
having support for ACPI and NFIT. Hence 'ndctl' is unable to query and
report the dimm health status and a user had no way to determine the
current health status of a NDVIMM.

To overcome this limitation, this patch-set updates papr_scm kernel
module to query and fetch NVDIMM health stats using hcalls described
in Ref[3].  This health and performance stats are then exposed to
userspace via sysfs and PAPR-NVDIMM-Specific-Methods(PDSM) issued by
libndctl.

These changes coupled with proposed ndtcl changes located at Ref[5]
should provide a way for the user to retrieve NVDIMM health status
using ndtcl.

Below is a sample output using proposed kernel + ndctl for PAPR NVDIMM
in a emulation environment:

 # ndctl list -DH
[
  {
"dev":"nmem0",
"health":{
  "health_state":"fatal",
  "shutdown_state":"dirty"
}
  }
]

Dimm health report output on a pseries guest lpar with vPMEM or HMS
based NVDIMMs that are in perfectly healthy conditions:

 # ndctl list -d nmem0 -H
[
  {
"dev":"nmem0",
"health":{
  "health_state":"ok",
  "shutdown_state":"clean"
}
  }
]

PAPR NVDIMM-Specific-Methods(PDSM)
==

PDSM requests are issued by vendor specific code in libndctl to
execute certain operations or fetch information from NVDIMMS. PDSMs
requests can be sent to papr_scm module via libndctl(userspace) and
libnvdimm (kernel) using the ND_CMD_CALL ioctl command which can be
handled in the dimm control function papr_scm_ndctl(). Current
patchset proposes a single PDSM to retrieve NVDIMM health, defined in
the newly introduced uapi header named 'papr_pdsm.h'. Support for
more PDSMs will be added in future.

Structure of the patch-set
==

The patch-set starts with a doc patch documenting details of hcall
H_SCM_HEALTH. Second patch exports kernel symbol seq_buf_printf()
thats used in subsequent patches to generate sysfs attribute content.

Third patch implements support for fetching NVDIMM health information
from PHYP and partially exposing it to user-space via a NVDIMM sysfs
flag.

Fourth patch updates papr_scm_ndctl() to handle a possible error case
and also improve debug logging.

Fifth patch deals with implementing support for servicing PDSM
commands in papr_scm module.

Finally the last patch implements support for servicing PDSM
'PAPR_PDSM_HEALTH' that returns the NVDIMM health information to
libndctl.

References:
[2] "Power Architecture Platform Reference"
  https://en.wikipedia.org/wiki/Power_Architecture_Platform_Reference
[3] commit 58b278f568f0
 ("powerpc: Provide initial documentation for PAPR hcalls")
[4] "Linux on Power Architecture Platform Reference"
 https://members.openpowerfoundation.org/document/dl/469
[5] https://github.com/vaibhav92/ndctl/tree/papr_scm_health_v13

---


Vaibhav Jain (6):
  powerpc: Document details on H_SCM_HEALTH hcall
  seq_buf: Export seq_buf_printf
  powerpc/papr_scm: Fetch nvdimm health information from PHYP
  powerpc/papr_scm: Improve error logging and handling papr_scm_ndctl()
  ndctl/papr_scm,uapi: Add support for PAPR nvdimm specific methods
  powerpc/papr_scm: Implement support for PAPR_PDSM_HEALTH

 Documentation/ABI/testing/sysfs-bus-papr-pmem |  27 ++
 Documentation/powerpc/papr_hcalls.rst |  46 +-
 arch/powerpc/include/uapi/asm/papr_pdsm.h | 132 ++
 arch/powerpc/platforms/pseries/papr_scm.c | 420 +-
 include/uapi/linux/ndctl.h|   1 +
 lib/seq_buf.c 

Re: [PATCH v2] SUNRPC: Add missing definition of ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE

2020-06-15 Thread Chuck Lever



> On Jun 15, 2020, at 2:25 AM, Christophe Leroy  
> wrote:
> 
> Even if that's only a warning, not including asm/cacheflush.h
> leads to svc_flush_bvec() being empty allthough powerpc defines
> ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE.
> 
>  CC  net/sunrpc/svcsock.o
> net/sunrpc/svcsock.c:227:5: warning: "ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE" is 
> not defined [-Wundef]
> #if ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE
> ^
> 
> Include linux/highmem.h so that asm/cacheflush.h will be included.
> 
> Reported-by: Christophe Leroy 
> Reported-by: kernel test robot 
> Cc: Chuck Lever 
> Fixes: ca07eda33e01 ("SUNRPC: Refactor svc_recvfrom()")
> Signed-off-by: Christophe Leroy 

LGTM.

Acked-by: Chuck Lever 


> ---
> v2: Use linux/highmem.h instead of asm/cacheflush.sh
> Signed-off-by: Christophe Leroy 
> ---
> net/sunrpc/svcsock.c | 1 +
> 1 file changed, 1 insertion(+)
> 
> diff --git a/net/sunrpc/svcsock.c b/net/sunrpc/svcsock.c
> index 5c4ec9386f81..c537272f9c7e 100644
> --- a/net/sunrpc/svcsock.c
> +++ b/net/sunrpc/svcsock.c
> @@ -44,6 +44,7 @@
> #include 
> #include 
> #include 
> +#include 
> #include 
> 
> #include 
> -- 
> 2.25.0
> 

--
Chuck Lever





[PATCH v12 6/6] powerpc/papr_scm: Implement support for PAPR_PDSM_HEALTH

2020-06-15 Thread Vaibhav Jain
This patch implements support for PDSM request 'PAPR_PDSM_HEALTH'
that returns a newly introduced 'struct nd_papr_pdsm_health' instance
containing dimm health information back to user space in response to
ND_CMD_CALL. This functionality is implemented in newly introduced
papr_pdsm_health() that queries the nvdimm health information and
then copies this information to the package payload whose layout is
defined by 'struct nd_papr_pdsm_health'.

Cc: "Aneesh Kumar K . V" 
Cc: Dan Williams 
Cc: Michael Ellerman 
Cc: Ira Weiny 
Reviewed-by: Ira Weiny 
Signed-off-by: Vaibhav Jain 
---
Changelog:

v11..v12:
* Minor: Reodered the initialization of 'struct nd_papr_pdsm_health'
  fields to match order present in its definition. [ Ira ]
* Added ack from Ira

v10..v11:
* Changed the definition of 'struct nd_papr_pdsm_health' to a maximal
  struct 184 bytes in size [ Dan Williams ]
* Added new field 'extension_flags' to 'struct nd_papr_pdsm_health'
  [ Dan Williams ]
* Updated papr_pdsm_health() to set field 'extension_flags' to 0.
* Introduced a define ND_PDSM_PAYLOAD_MAX_SIZE that indicates the
  maximum size of a payload.
* Fixed a suspicious conversion from u64 to u8 in papr_pdsm_health
  that was preventing correct initialization of 'struct
  nd_papr_pdsm_health'. [ Ira ]

v9..v10:
* Removed code in papr_pdsm_health that performed validation on pdsm
  payload version and corrosponding struct and defines used for
  validation of payload version.
* Dropped usage of struct papr_pdsm_health in 'struct
  papr_scm_priv'. Instead papr_psdm_health() now uses
  'papr_scm_priv.health_bitmap' to populate the pdsm payload.
* Above change also fixes the problem where this patch was removing
  the code that was previously introduced in this patch-series.
  [ Ira ]
* Introduced a new def ND_PDSM_ENVELOPE_HDR_SIZE that indicates the
  space allocated to 'struct nd_pdsm_cmd_pkg' fields except 'struct
  nd_cmd_pkg'. This def is useful in validating payload sizes.
* Reworked papr_pdsm_health() to enforce a specific payload size for
  'PAPR_PDSM_HEALTH' pdsm request.

Resend:
* Added ack from Aneesh.

v8..v9:
* s/PAPR_SCM_PDSM_HEALTH/PAPR_PDSM_HEALTH/g  [ Dan , Aneesh ]
* s/PAPR_SCM_PSDM_DIMM_*/PAPR_PDSM_DIMM_*/g
* Renamed papr_scm_get_health() to papr_psdm_health()
* Updated patch description to replace papr-scm dimm with nvdimm.

v7..v8:
* None

Resend:
* None

v6..v7:
* Updated flags_show() to use seq_buf_printf(). [Mpe]
* Updated papr_scm_get_health() to use newly introduced
  __drc_pmem_query_health() bypassing the cache [Mpe].

v5..v6:
* Added attribute '__packed' to 'struct nd_papr_pdsm_health_v1' to
  gaurd against possibility of different compilers adding different
  paddings to the struct [ Dan Williams ]

* Updated 'struct nd_papr_pdsm_health_v1' to use __u8 instead of
  'bool' and also updated drc_pmem_query_health() to take this into
  account. [ Dan Williams ]

v4..v5:
* None

v3..v4:
* Call the DSM_PAPR_SCM_HEALTH service function from
  papr_scm_service_dsm() instead of papr_scm_ndctl(). [Aneesh]

v2..v3:
* Updated struct nd_papr_scm_dimm_health_stat_v1 to use '__xx' types
  as its exported to the userspace [Aneesh]
* Changed the constants DSM_PAPR_SCM_DIMM_XX indicating dimm health
  from enum to #defines [Aneesh]

v1..v2:
* New patch in the series
---
 arch/powerpc/include/uapi/asm/papr_pdsm.h | 43 ++
 arch/powerpc/platforms/pseries/papr_scm.c | 71 +++
 2 files changed, 114 insertions(+)

diff --git a/arch/powerpc/include/uapi/asm/papr_pdsm.h 
b/arch/powerpc/include/uapi/asm/papr_pdsm.h
index 34d1a41d2406..d453baea13c4 100644
--- a/arch/powerpc/include/uapi/asm/papr_pdsm.h
+++ b/arch/powerpc/include/uapi/asm/papr_pdsm.h
@@ -70,13 +70,56 @@ struct nd_pdsm_cmd_pkg {
__u8 payload[]; /* In/Out: Sub-cmd data buffer */
 } __packed;
 
+/* Calculate size used by the pdsm header fields minus 'struct nd_cmd_pkg' */
+#define ND_PDSM_HDR_SIZE \
+   (sizeof(struct nd_pdsm_cmd_pkg) - sizeof(struct nd_cmd_pkg))
+
+/* Max payload size that we can handle */
+#define ND_PDSM_PAYLOAD_MAX_SIZE 184
+
 /*
  * Methods to be embedded in ND_CMD_CALL request. These are sent to the kernel
  * via 'nd_pdsm_cmd_pkg.hdr.nd_command' member of the ioctl struct
  */
 enum papr_pdsm {
PAPR_PDSM_MIN = 0x0,
+   PAPR_PDSM_HEALTH,
PAPR_PDSM_MAX,
 };
 
+/* Various nvdimm health indicators */
+#define PAPR_PDSM_DIMM_HEALTHY   0
+#define PAPR_PDSM_DIMM_UNHEALTHY 1
+#define PAPR_PDSM_DIMM_CRITICAL  2
+#define PAPR_PDSM_DIMM_FATAL 3
+
+/*
+ * Struct exchanged between kernel & ndctl in for PAPR_PDSM_HEALTH
+ * Various flags indicate the health status of the dimm.
+ *
+ * extension_flags : Any extension fields present in the struct.
+ * dimm_unarmed: Dimm not armed. So contents wont persist.
+ * dimm_bad_shutdown   : Previous shutdown did not persist contents.
+ * dimm_bad_restore: Contents from previous shutdown werent restored.
+ * 

[PATCH v12 5/6] ndctl/papr_scm, uapi: Add support for PAPR nvdimm specific methods

2020-06-15 Thread Vaibhav Jain
Introduce support for PAPR NVDIMM Specific Methods (PDSM) in papr_scm
module and add the command family NVDIMM_FAMILY_PAPR to the white list
of NVDIMM command sets. Also advertise support for ND_CMD_CALL for the
nvdimm command mask and implement necessary scaffolding in the module
to handle ND_CMD_CALL ioctl and PDSM requests that we receive.

The layout of the PDSM request as we expect from libnvdimm/libndctl is
described in newly introduced uapi header 'papr_pdsm.h' which
defines a new 'struct nd_pdsm_cmd_pkg' header. This header is used
to communicate the PDSM request via member
'nd_cmd_pkg.nd_command' and size of payload that need to be
sent/received for servicing the PDSM.

A new function is_cmd_valid() is implemented that reads the args to
papr_scm_ndctl() and performs sanity tests on them. A new function
papr_scm_service_pdsm() is introduced and is called from
papr_scm_ndctl() in case of a PDSM request is received via ND_CMD_CALL
command from libnvdimm.

Cc: "Aneesh Kumar K . V" 
Cc: Dan Williams 
Cc: Michael Ellerman 
Cc: Ira Weiny 
Signed-off-by: Vaibhav Jain 
---
Changelog:

v11..v12:
* Updated a misleading comment in 'papr_pdsm.h' regarding payload
  size. [ Ira ]

v10..v11:
* Moved in-lines 'nd_pdsm_cmd_pkg()' and 'pdsm_cmd_to_payload()' from
  'papr_pdsm.h' header to 'papr_scm.c'. The avoids a potential license
  incompatibility issue with non-GPL-2.0 user-space code trying to
  include the header in its code. [ Ira ]
* Verified papr_pdsm.h with UAPI_HEADER_TEST config.
* Moved the is_cmd_valid() check in papr_scm_ndctl() before check for
  cmd_rc == NULL. This prevents cmd_rc to be updated in case the
  nd-cmd is invalid or unknown.

v9..v10:
* Simplified 'struct nd_pdsm_cmd_pkg' by removing the
  'payload_version' field.
* Removed the corrosponding documentation on versioning and backward
  compatibility from 'papr_pdsm.h'
* Reduced the size of reserved fields to 4-bytes making 'struct
  nd_pdsm_cmd_pkg' 64 + 8 bytes long.
* Updated is_cmd_valid() to enforce validation checks on pdsm
  commands. [ Dan Williams ]
* Added check for reserved fields being set to '0' in is_cmd_valid()
  [ Ira ]
* Moved changes for checking cmd_rc == NULL and logging improvements
  to a separate prelim patch [ Ira ].
* Moved  pdsm package validation checks from papr_scm_service_pdsm()
  to is_cmd_valid().
* Marked papr_scm_service_pdsm() return type as 'void' since errors
  are reported in nd_pdsm_cmd_pkg.cmd_status field.

Resend:
* Added ack from Aneesh.

v8..v9:
* Reduced the usage of term SCM replacing it with appropriate
  replacement [ Dan Williams, Aneesh ]
* Renamed 'papr_scm_pdsm.h' to 'papr_pdsm.h'
* s/PAPR_SCM_PDSM_*/PAPR_PDSM_*/g
* s/NVDIMM_FAMILY_PAPR_SCM/NVDIMM_FAMILY_PAPR/g
* Minor updates to 'papr_psdm.h' to replace usage of term 'SCM'.
* Minor update to patch description.

v7..v8:
* Removed the 'payload_offset' field from 'struct
  nd_pdsm_cmd_pkg'. Instead command payload is always assumed to start
  at 'nd_pdsm_cmd_pkg.payload'. [ Aneesh ]
* To enable introducing new fields to 'struct nd_pdsm_cmd_pkg',
  'reserved' field of 10-bytes is introduced. [ Aneesh ]
* Fixed a typo in "Backward Compatibility" section of papr_scm_pdsm.h
  [ Ira ]

Resend:
* None

v6..v7 :
* Removed the re-definitions of __packed macro from papr_scm_pdsm.h
  [Mpe].
* Removed the usage of __KERNEL__ macros in papr_scm_pdsm.h [Mpe].
* Removed macros that were unused in papr_scm.c from papr_scm_pdsm.h
  [Mpe].
* Made functions defined in papr_scm_pdsm.h as static inline. [Mpe]

v5..v6 :
* Changed the usage of the term DSM to PDSM to distinguish it from the
  ACPI term [ Dan Williams ]
* Renamed papr_scm_dsm.h to papr_scm_pdsm.h and updated various struct
  to reflect the new terminology.
* Updated the patch description and title to reflect the new terminology.
* Squashed patch to introduce new command family in 'ndctl.h' with
  this patch [ Dan Williams ]
* Updated the papr_scm_pdsm method starting index from 0x1 to 0x0
  [ Dan Williams ]
* Removed redundant license text from the papr_scm_psdm.h file.
  [ Dan Williams ]
* s/envelop/envelope/ at various places [ Dan Williams ]
* Added '__packed' attribute to command package header to gaurd
  against different compiler adding paddings between the fields.
  [ Dan Williams]
* Converted various pr_debug to dev_debug [ Dan Williams ]

v4..v5 :
* None

v3..v4 :
* None

v2..v3 :
* Updated the patch prefix to 'ndctl/uapi' [Aneesh]

v1..v2 :
* None
---
 arch/powerpc/include/uapi/asm/papr_pdsm.h |  82 ++
 arch/powerpc/platforms/pseries/papr_scm.c | 126 +-
 include/uapi/linux/ndctl.h|   1 +
 3 files changed, 205 insertions(+), 4 deletions(-)
 create mode 100644 arch/powerpc/include/uapi/asm/papr_pdsm.h

diff --git a/arch/powerpc/include/uapi/asm/papr_pdsm.h 
b/arch/powerpc/include/uapi/asm/papr_pdsm.h
new file mode 100644
index ..34d1a41d2406
--- /dev/null
+++ b/arch/powerpc/include/uapi/asm/papr_pdsm.h
@@ -0,0 +1,82 @@
+/* 

[PATCH v12 4/6] powerpc/papr_scm: Improve error logging and handling papr_scm_ndctl()

2020-06-15 Thread Vaibhav Jain
Since papr_scm_ndctl() can be called from outside papr_scm, its
exposed to the possibility of receiving NULL as value of 'cmd_rc'
argument. This patch updates papr_scm_ndctl() to protect against such
possibility by assigning it pointer to a local variable in case cmd_rc
== NULL.

Finally the patch also updates the 'default' add a debug log unknown
'cmd' values.

Cc: "Aneesh Kumar K . V" 
Cc: Dan Williams 
Cc: Michael Ellerman 
Cc: Ira Weiny 
Reviewed-by: Ira Weiny 
Signed-off-by: Vaibhav Jain 
---
Changelog:

v11..v12:
* Added ack from Ira

v10..v11:
* Instead of returning *cmd_rd just return '0' in case nd_cmd is
  handled. In case of unknown nd-cmd return -EINVAL
  [ Ira and Dan Williams ]
* Updated patch description.

v9..v10
* New patch in the series
---
 arch/powerpc/platforms/pseries/papr_scm.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/arch/powerpc/platforms/pseries/papr_scm.c 
b/arch/powerpc/platforms/pseries/papr_scm.c
index 0c091622b15e..692ad3d79826 100644
--- a/arch/powerpc/platforms/pseries/papr_scm.c
+++ b/arch/powerpc/platforms/pseries/papr_scm.c
@@ -355,11 +355,16 @@ static int papr_scm_ndctl(struct nvdimm_bus_descriptor 
*nd_desc,
 {
struct nd_cmd_get_config_size *get_size_hdr;
struct papr_scm_priv *p;
+   int rc;
 
/* Only dimm-specific calls are supported atm */
if (!nvdimm)
return -EINVAL;
 
+   /* Use a local variable in case cmd_rc pointer is NULL */
+   if (!cmd_rc)
+   cmd_rc = 
+
p = nvdimm_provider_data(nvdimm);
 
switch (cmd) {
@@ -381,6 +386,7 @@ static int papr_scm_ndctl(struct nvdimm_bus_descriptor 
*nd_desc,
break;
 
default:
+   dev_dbg(>pdev->dev, "Unknown command = %d\n", cmd);
return -EINVAL;
}
 
-- 
2.26.2



[PATCH v12 2/6] seq_buf: Export seq_buf_printf

2020-06-15 Thread Vaibhav Jain
'seq_buf' provides a very useful abstraction for writing to a string
buffer without needing to worry about it over-flowing. However even
though the API has been stable for couple of years now its still not
exported to kernel loadable modules limiting its usage.

Hence this patch proposes update to 'seq_buf.c' to mark
seq_buf_printf() which is part of the seq_buf API to be exported to
kernel loadable GPL modules. This symbol will be used in later parts
of this patch-set to simplify content creation for a sysfs attribute.

Cc: Piotr Maziarz 
Cc: Cezary Rojewski 
Cc: Christoph Hellwig 
Cc: Steven Rostedt 
Cc: Borislav Petkov 
Acked-by: Steven Rostedt (VMware) 
Signed-off-by: Vaibhav Jain 
---
Changelog:

v11..v12:
* None

v10..v11:
* None

v9..v10:
* None

Resend:
* Added ack from Steven Rostedt

v8..v9:
* None

v7..v8:
* Updated the patch title [ Christoph Hellwig ]
* Updated patch description to replace confusing term 'external kernel
  modules' to 'kernel lodable modules'.

Resend:
* Added ack from Steven Rostedt

v6..v7:
* New patch in the series
---
 lib/seq_buf.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/lib/seq_buf.c b/lib/seq_buf.c
index 4e865d42ab03..707453f5d58e 100644
--- a/lib/seq_buf.c
+++ b/lib/seq_buf.c
@@ -91,6 +91,7 @@ int seq_buf_printf(struct seq_buf *s, const char *fmt, ...)
 
return ret;
 }
+EXPORT_SYMBOL_GPL(seq_buf_printf);
 
 #ifdef CONFIG_BINARY_PRINTF
 /**
-- 
2.26.2



[PATCH v12 3/6] powerpc/papr_scm: Fetch nvdimm health information from PHYP

2020-06-15 Thread Vaibhav Jain
Implement support for fetching nvdimm health information via
H_SCM_HEALTH hcall as documented in Ref[1]. The hcall returns a pair
of 64-bit bitmap, bitwise-and of which is then stored in
'struct papr_scm_priv' and subsequently partially exposed to
user-space via newly introduced dimm specific attribute
'papr/flags'. Since the hcall is costly, the health information is
cached and only re-queried, 60s after the previous successful hcall.

The patch also adds a  documentation text describing flags reported by
the the new sysfs attribute 'papr/flags' is also introduced at
Documentation/ABI/testing/sysfs-bus-papr-pmem.

[1] commit 58b278f568f0 ("powerpc: Provide initial documentation for
PAPR hcalls")

Cc: "Aneesh Kumar K . V" 
Cc: Dan Williams 
Cc: Michael Ellerman 
Cc: Ira Weiny 
Signed-off-by: Vaibhav Jain 
---
Changelog:

v11..v12:
* None

v10..v11:
* None

v9..v10:
* Removed an avoidable 'goto' in __drc_pmem_query_health. [ Ira ].

Resend:
* Added ack from Aneesh.

v8..v9:
* Rename some variables and defines to reduce usage of term SCM
  replacing it with PMEM [Dan Williams, Aneesh]
* s/PAPR_SCM_DIMM/PAPR_PMEM/g
* s/papr_scm_nd_attributes/papr_nd_attributes/g
* s/papr_scm_nd_attribute_group/papr_nd_attribute_group/g
* s/papr_scm_dimm_attr_groups/papr_nd_attribute_groups/g
* Renamed file sysfs-bus-papr-scm to sysfs-bus-papr-pmem

v7..v8:
* Update type of variable 'rc' in __drc_pmem_query_health() and
  drc_pmem_query_health() to long and int respectively. [ Ira ]
* Updated the patch description to s/64 bit Big Endian Number/64-bit
  bitmap/ [ Ira, Aneesh ].

Resend:
* None

v6..v7 :
* Used the exported buf_seq_printf() function to generate content for
  'papr/flags'
* Moved the PAPR_SCM_DIMM_* bit-flags macro definitions to papr_scm.c
  and removed the papr_scm.h file [Mpe]
* Some minor consistency issued in sysfs-bus-papr-scm
  documentation. [Mpe]
* s/dimm_mutex/health_mutex/g [Mpe]
* Split drc_pmem_query_health() into two function one of which takes
  care of caching and locking. [Mpe]
* Fixed a local copy creation of dimm health information using
  READ_ONCE(). [Mpe]

v5..v6 :
* Change the flags sysfs attribute from 'papr_flags' to 'papr/flags'
  [Dan Williams]
* Include documentation for 'papr/flags' attr [Dan Williams]
* Change flag 'save_fail' to 'flush_fail' [Dan Williams]
* Caching of health bitmap to reduce expensive hcalls [Dan Williams]
* Removed usage of PPC_BIT from 'papr-scm.h' header [Mpe]
* Replaced two __be64 integers from papr_scm_priv to a single u64
  integer [Mpe]
* Updated patch description to reflect the changes made in this
  version.
* Removed avoidable usage of 'papr_scm_priv.dimm_mutex' from
  flags_show() [Dan Williams]

v4..v5 :
* None

v3..v4 :
* None

v2..v3 :
* Removed PAPR_SCM_DIMM_HEALTH_NON_CRITICAL as a condition for
 NVDIMM unarmed [Aneesh]

v1..v2 :
* New patch in the series.
---
 Documentation/ABI/testing/sysfs-bus-papr-pmem |  27 +++
 arch/powerpc/platforms/pseries/papr_scm.c | 168 +-
 2 files changed, 193 insertions(+), 2 deletions(-)
 create mode 100644 Documentation/ABI/testing/sysfs-bus-papr-pmem

diff --git a/Documentation/ABI/testing/sysfs-bus-papr-pmem 
b/Documentation/ABI/testing/sysfs-bus-papr-pmem
new file mode 100644
index ..5b10d036a8d4
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-bus-papr-pmem
@@ -0,0 +1,27 @@
+What:  /sys/bus/nd/devices/nmemX/papr/flags
+Date:  Apr, 2020
+KernelVersion: v5.8
+Contact:   linuxppc-dev , 
linux-nvd...@lists.01.org,
+Description:
+   (RO) Report flags indicating various states of a
+   papr-pmem NVDIMM device. Each flag maps to a one or
+   more bits set in the dimm-health-bitmap retrieved in
+   response to H_SCM_HEALTH hcall. The details of the bit
+   flags returned in response to this hcall is available
+   at 'Documentation/powerpc/papr_hcalls.rst' . Below are
+   the flags reported in this sysfs file:
+
+   * "not_armed"   : Indicates that NVDIMM contents will not
+ survive a power cycle.
+   * "flush_fail"  : Indicates that NVDIMM contents
+ couldn't be flushed during last
+ shut-down event.
+   * "restore_fail": Indicates that NVDIMM contents
+ couldn't be restored during NVDIMM
+ initialization.
+   * "encrypted"   : NVDIMM contents are encrypted.
+   * "smart_notify": There is health event for the NVDIMM.
+   * "scrubbed": Indicating that contents of the
+ NVDIMM have been scrubbed.
+   * "locked"  : Indicating that NVDIMM contents cant
+ be modified until next power cycle.
diff --git a/arch/powerpc/platforms/pseries/papr_scm.c 

[PATCH v12 1/6] powerpc: Document details on H_SCM_HEALTH hcall

2020-06-15 Thread Vaibhav Jain
Add documentation to 'papr_hcalls.rst' describing the bitmap flags
that are returned from H_SCM_HEALTH hcall as per the PAPR-SCM
specification.

Cc: "Aneesh Kumar K . V" 
Cc: Dan Williams 
Cc: Michael Ellerman 
Cc: Ira Weiny 
Acked-by: Ira Weiny 
Signed-off-by: Vaibhav Jain 
---
Changelog:

v11..v12:
* None

v10..v11:
* None

v9..v10:
* Added ack from Ira.

Resend:
* None

v8..v9:
* s/SCM/PMEM device. [ Dan Williams, Aneesh ]

v7..v8:
* Added a clarification on bit-ordering of Health Bitmap

Resend:
* None

v6..v7:
* None

v5..v6:
* New patch in the series
---
 Documentation/powerpc/papr_hcalls.rst | 46 ---
 1 file changed, 42 insertions(+), 4 deletions(-)

diff --git a/Documentation/powerpc/papr_hcalls.rst 
b/Documentation/powerpc/papr_hcalls.rst
index 3493631a60f8..48fcf1255a33 100644
--- a/Documentation/powerpc/papr_hcalls.rst
+++ b/Documentation/powerpc/papr_hcalls.rst
@@ -220,13 +220,51 @@ from the LPAR memory.
 **H_SCM_HEALTH**
 
 | Input: drcIndex
-| Out: *health-bitmap, health-bit-valid-bitmap*
+| Out: *health-bitmap (r4), health-bit-valid-bitmap (r5)*
 | Return Value: *H_Success, H_Parameter, H_Hardware*
 
 Given a DRC Index return the info on predictive failure and overall health of
-the NVDIMM. The asserted bits in the health-bitmap indicate a single predictive
-failure and health-bit-valid-bitmap indicate which bits in health-bitmap are
-valid.
+the PMEM device. The asserted bits in the health-bitmap indicate one or more 
states
+(described in table below) of the PMEM device and health-bit-valid-bitmap 
indicate
+which bits in health-bitmap are valid. The bits are reported in
+reverse bit ordering for example a value of 0xC400
+indicates bits 0, 1, and 5 are valid.
+
+Health Bitmap Flags:
+
++--+---+
+|  Bit |   Definition  
|
++==+===+
+|  00  | PMEM device is unable to persist memory contents. 
|
+|  | If the system is powered down, nothing will be saved. 
|
++--+---+
+|  01  | PMEM device failed to persist memory contents. Either contents were   
|
+|  | not saved successfully on power down or were not restored properly on 
|
+|  | power up. 
|
++--+---+
+|  02  | PMEM device contents are persisted from previous IPL. The data from   
|
+|  | the last boot were successfully restored. 
|
++--+---+
+|  03  | PMEM device contents are not persisted from previous IPL. There was 
no|
+|  | data to restore from the last boot.   
|
++--+---+
+|  04  | PMEM device memory life remaining is critically low   
|
++--+---+
+|  05  | PMEM device will be garded off next IPL due to failure
|
++--+---+
+|  06  | PMEM device contents cannot persist due to current platform health
|
+|  | status. A hardware failure may prevent data from being saved or   
|
+|  | restored. 
|
++--+---+
+|  07  | PMEM device is unable to persist memory contents in certain 
conditions|
++--+---+
+|  08  | PMEM device is encrypted  
|
++--+---+
+|  09  | PMEM device has successfully completed a requested erase or secure
|
+|  | erase procedure.  
|
++--+---+
+|10:63 | Reserved / Unused 
|
++--+---+
 
 **H_SCM_PERFORMANCE_STATS**
 
-- 
2.26.2



[PATCH v12 0/6] powerpc/papr_scm: Add support for reporting nvdimm health

2020-06-15 Thread Vaibhav Jain
Changes since v11 [1]:
* Minor update to 'papr_pdsm.h' fixing a misleading comment about
  'possible' padding being added by GCC which doesn't apply in case
  structs are marked as __packed.
* Fix the order of initialization of 'struct nd_papr_pdsm_health' in
  papr_pdsm_health().
* Added acks from Ira for various patches.

[1] 
https://lore.kernel.org/linux-nvdimm/20200607131339.476036-1-vaib...@linux.ibm.com
---

The PAPR standard[2][4] provides mechanisms to query the health and
performance stats of an NVDIMM via various hcalls as described in
Ref[3].  Until now these stats were never available nor exposed to the
user-space tools like 'ndctl'. This is partly due to PAPR platform not
having support for ACPI and NFIT. Hence 'ndctl' is unable to query and
report the dimm health status and a user had no way to determine the
current health status of a NDVIMM.

To overcome this limitation, this patch-set updates papr_scm kernel
module to query and fetch NVDIMM health stats using hcalls described
in Ref[3].  This health and performance stats are then exposed to
userspace via sysfs and PAPR-NVDIMM-Specific-Methods(PDSM) issued by
libndctl.

These changes coupled with proposed ndtcl changes located at Ref[5]
should provide a way for the user to retrieve NVDIMM health status
using ndtcl.

Below is a sample output using proposed kernel + ndctl for PAPR NVDIMM
in a emulation environment:

 # ndctl list -DH
[
  {
"dev":"nmem0",
"health":{
  "health_state":"fatal",
  "shutdown_state":"dirty"
}
  }
]

Dimm health report output on a pseries guest lpar with vPMEM or HMS
based NVDIMMs that are in perfectly healthy conditions:

 # ndctl list -d nmem0 -H
[
  {
"dev":"nmem0",
"health":{
  "health_state":"ok",
  "shutdown_state":"clean"
}
  }
]

PAPR NVDIMM-Specific-Methods(PDSM)
==

PDSM requests are issued by vendor specific code in libndctl to
execute certain operations or fetch information from NVDIMMS. PDSMs
requests can be sent to papr_scm module via libndctl(userspace) and
libnvdimm (kernel) using the ND_CMD_CALL ioctl command which can be
handled in the dimm control function papr_scm_ndctl(). Current
patchset proposes a single PDSM to retrieve NVDIMM health, defined in
the newly introduced uapi header named 'papr_pdsm.h'. Support for
more PDSMs will be added in future.

Structure of the patch-set
==

The patch-set starts with a doc patch documenting details of hcall
H_SCM_HEALTH. Second patch exports kernel symbol seq_buf_printf()
thats used in subsequent patches to generate sysfs attribute content.

Third patch implements support for fetching NVDIMM health information
from PHYP and partially exposing it to user-space via a NVDIMM sysfs
flag.

Fourth patch updates papr_scm_ndctl() to handle a possible error case
and also improve debug logging.

Fifth patch deals with implementing support for servicing PDSM
commands in papr_scm module.

Finally the last patch implements support for servicing PDSM
'PAPR_PDSM_HEALTH' that returns the NVDIMM health information to
libndctl.

References:
[2] "Power Architecture Platform Reference"
  https://en.wikipedia.org/wiki/Power_Architecture_Platform_Reference
[3] commit 58b278f568f0
 ("powerpc: Provide initial documentation for PAPR hcalls")
[4] "Linux on Power Architecture Platform Reference"
 https://members.openpowerfoundation.org/document/dl/469
[5] https://github.com/vaibhav92/ndctl/tree/papr_scm_health_v12

---

Vaibhav Jain (6):
  powerpc: Document details on H_SCM_HEALTH hcall
  seq_buf: Export seq_buf_printf
  powerpc/papr_scm: Fetch nvdimm health information from PHYP
  powerpc/papr_scm: Improve error logging and handling papr_scm_ndctl()
  ndctl/papr_scm,uapi: Add support for PAPR nvdimm specific methods
  powerpc/papr_scm: Implement support for PAPR_PDSM_HEALTH

 Documentation/ABI/testing/sysfs-bus-papr-pmem |  27 ++
 Documentation/powerpc/papr_hcalls.rst |  46 ++-
 arch/powerpc/include/uapi/asm/papr_pdsm.h | 125 ++
 arch/powerpc/platforms/pseries/papr_scm.c | 373 +-
 include/uapi/linux/ndctl.h|   1 +
 lib/seq_buf.c |   1 +
 6 files changed, 562 insertions(+), 11 deletions(-)
 create mode 100644 Documentation/ABI/testing/sysfs-bus-papr-pmem
 create mode 100644 arch/powerpc/include/uapi/asm/papr_pdsm.h

-- 
2.26.2



  1   2   >