[RESEND][PATCH] selftests/vm: Fix vm tests build and run

2020-02-09 Thread Harish
A recent change overrides the ARCH env variable and hence runs
using make fails with the following.

$ make -C vm/
make: Entering directory '/home/harish/linux/tools/testing/selftests/vm'
make --no-builtin-rules ARCH=ppc64le -C ../../../.. headers_install
make[1]: Entering directory '/home/harish/linux'
Makefile:652: arch/ppc64le/Makefile: No such file or directory
make[1]: *** No rule to make target 'arch/ppc64le/Makefile'.  Stop.
make[1]: Leaving directory '/home/harish/linux'
make: *** [../lib.mk:50: khdr] Error 2
make: Leaving directory '/home/harish/linux/tools/testing/selftests/vm'

Patch fixes this issue and also handles ppc64/ppc64le archs to enable
few tests

Signed-off-by: Harish 
---
 tools/testing/selftests/vm/Makefile| 4 ++--
 tools/testing/selftests/vm/run_vmtests | 2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/tools/testing/selftests/vm/Makefile 
b/tools/testing/selftests/vm/Makefile
index 7f9a8a8c31da..49bb15be1447 100644
--- a/tools/testing/selftests/vm/Makefile
+++ b/tools/testing/selftests/vm/Makefile
@@ -1,7 +1,7 @@
 # SPDX-License-Identifier: GPL-2.0
 # Makefile for vm selftests
 uname_M := $(shell uname -m 2>/dev/null || echo not)
-ARCH ?= $(shell echo $(uname_M) | sed -e 's/aarch64.*/arm64/')
+ARCH_USED ?= $(shell echo $(uname_M) | sed -e 's/aarch64.*/arm64/' -e 
's/ppc64.*/ppc64/')
 
 CFLAGS = -Wall -I ../../../../usr/include $(EXTRA_CFLAGS)
 LDLIBS = -lrt
@@ -19,7 +19,7 @@ TEST_GEN_FILES += thuge-gen
 TEST_GEN_FILES += transhuge-stress
 TEST_GEN_FILES += userfaultfd
 
-ifneq (,$(filter $(ARCH),arm64 ia64 mips64 parisc64 ppc64 riscv64 s390x sh64 
sparc64 x86_64))
+ifneq (,$(filter $(ARCH_USED),arm64 ia64 mips64 parisc64 ppc64 riscv64 s390x 
sh64 sparc64 x86_64))
 TEST_GEN_FILES += va_128TBswitch
 TEST_GEN_FILES += virtual_address_range
 endif
diff --git a/tools/testing/selftests/vm/run_vmtests 
b/tools/testing/selftests/vm/run_vmtests
index a692ea828317..da63dfb9713a 100755
--- a/tools/testing/selftests/vm/run_vmtests
+++ b/tools/testing/selftests/vm/run_vmtests
@@ -61,7 +61,7 @@ fi
 #filter 64bit architectures
 ARCH64STR="arm64 ia64 mips64 parisc64 ppc64 riscv64 s390x sh64 sparc64 x86_64"
 if [ -z $ARCH ]; then
-  ARCH=`uname -m 2>/dev/null | sed -e 's/aarch64.*/arm64/'`
+  ARCH=`uname -m 2>/dev/null | sed -e 's/aarch64.*/arm64/' -e 
's/ppc64.*/ppc64/'`
 fi
 VADDR64=0
 echo "$ARCH64STR" | grep $ARCH && VADDR64=1
-- 
2.21.0



Re: [PATCH V13] mm/debug: Add tests validating architecture page table helpers

2020-02-09 Thread Christophe Leroy




Le 10/02/2020 à 06:35, Anshuman Khandual a écrit :



On 02/10/2020 10:22 AM, Andrew Morton wrote:

On Thu, 6 Feb 2020 13:49:35 +0530 Anshuman Khandual  
wrote:



On 02/06/2020 04:40 AM, kbuild test robot wrote:

Hi Anshuman,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on powerpc/next]
[also build test ERROR on s390/features linus/master arc/for-next v5.5]
[cannot apply to mmotm/master tip/x86/core arm64/for-next/core next-20200205]
[if your patch is applied to the wrong git tree, please drop us a note to help
improve the system. BTW, we also suggest to use '--base' option to specify the
base tree in git format-patch, please see https://stackoverflow.com/a/37406982]

url:
https://github.com/0day-ci/linux/commits/Anshuman-Khandual/mm-debug-Add-tests-validating-architecture-page-table-helpers/20200205-215507
base:   https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next
config: ia64-allmodconfig (attached as .config)
compiler: ia64-linux-gcc (GCC) 7.5.0
reproduce:
 wget 
https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O 
~/bin/make.cross
 chmod +x ~/bin/make.cross
 # save the attached .config to linux build tree
 GCC_VERSION=7.5.0 make.cross ARCH=ia64

If you fix the issue, kindly add following tag
Reported-by: kbuild test robot 

All error/warnings (new ones prefixed by >>):

In file included from include/asm-generic/pgtable-nopud.h:8:0,
 from arch/ia64/include/asm/pgtable.h:586,
 from include/linux/mm.h:99,
 from include/linux/highmem.h:8,
 from mm/debug_vm_pgtable.c:14:
mm/debug_vm_pgtable.c: In function 'pud_clear_tests':

include/asm-generic/pgtable-nop4d-hack.h:47:32: error: implicit declaration of 
function '__pgd'; did you mean '__p4d'? [-Werror=implicit-function-declaration]

 #define __pud(x)((pud_t) { __pgd(x) })
^

mm/debug_vm_pgtable.c:141:8: note: in expansion of macro '__pud'

  pud = __pud(pud_val(pud) | RANDOM_ORVALUE);
^

include/asm-generic/pgtable-nop4d-hack.h:47:22: warning: missing braces around 
initializer [-Wmissing-braces]

 #define __pud(x)((pud_t) { __pgd(x) })
  ^

mm/debug_vm_pgtable.c:141:8: note: in expansion of macro '__pud'

  pud = __pud(pud_val(pud) | RANDOM_ORVALUE);
^
cc1: some warnings being treated as errors


This build failure is expected now given that we have allowed DEBUG_VM_PGTABLE
with EXPERT without platform requiring ARCH_HAS_DEBUG_VM_PGTABLE. This problem
i.e build failure caused without a platform __pgd(), is known to exist both on
ia64 and arm (32bit) platforms. Please refer https://lkml.org/lkml/2019/9/24/314
for details where this was discussed earlier.



I'd prefer not to merge a patch which is known to cause build
regressions.  Is there some temporary thing we can do to prevent these
errors until arch maintainers(?) get around to implementing the
long-term fixes?


We could explicitly disable CONFIG_DEBUG_VM_PGTABLE on ia64 and arm platforms
which will ensure that others can still use the EXPERT path.

config DEBUG_VM_PGTABLE
bool "Debug arch page table for semantics compliance"
depends on MMU
depends on !(IA64 || ARM)
depends on ARCH_HAS_DEBUG_VM_PGTABLE || EXPERT
default n if !ARCH_HAS_DEBUG_VM_PGTABLE
default y if DEBUG_VM



On both ia32 and arm, the fix is trivial.

Can we include the fix within this patch, just the same way as the 
mm_p4d_folded() fix for x86 ?


Christophe


Re: [PATCH V13] mm/debug: Add tests validating architecture page table helpers

2020-02-09 Thread Anshuman Khandual



On 02/10/2020 10:22 AM, Andrew Morton wrote:
> On Thu, 6 Feb 2020 13:49:35 +0530 Anshuman Khandual 
>  wrote:
> 
>>
>> On 02/06/2020 04:40 AM, kbuild test robot wrote:
>>> Hi Anshuman,
>>>
>>> Thank you for the patch! Yet something to improve:
>>>
>>> [auto build test ERROR on powerpc/next]
>>> [also build test ERROR on s390/features linus/master arc/for-next v5.5]
>>> [cannot apply to mmotm/master tip/x86/core arm64/for-next/core 
>>> next-20200205]
>>> [if your patch is applied to the wrong git tree, please drop us a note to 
>>> help
>>> improve the system. BTW, we also suggest to use '--base' option to specify 
>>> the
>>> base tree in git format-patch, please see 
>>> https://stackoverflow.com/a/37406982]
>>>
>>> url:
>>> https://github.com/0day-ci/linux/commits/Anshuman-Khandual/mm-debug-Add-tests-validating-architecture-page-table-helpers/20200205-215507
>>> base:   https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git 
>>> next
>>> config: ia64-allmodconfig (attached as .config)
>>> compiler: ia64-linux-gcc (GCC) 7.5.0
>>> reproduce:
>>> wget 
>>> https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O 
>>> ~/bin/make.cross
>>> chmod +x ~/bin/make.cross
>>> # save the attached .config to linux build tree
>>> GCC_VERSION=7.5.0 make.cross ARCH=ia64 
>>>
>>> If you fix the issue, kindly add following tag
>>> Reported-by: kbuild test robot 
>>>
>>> All error/warnings (new ones prefixed by >>):
>>>
>>>In file included from include/asm-generic/pgtable-nopud.h:8:0,
>>> from arch/ia64/include/asm/pgtable.h:586,
>>> from include/linux/mm.h:99,
>>> from include/linux/highmem.h:8,
>>> from mm/debug_vm_pgtable.c:14:
>>>mm/debug_vm_pgtable.c: In function 'pud_clear_tests':
> include/asm-generic/pgtable-nop4d-hack.h:47:32: error: implicit 
> declaration of function '__pgd'; did you mean '__p4d'? 
> [-Werror=implicit-function-declaration]
>>> #define __pud(x)((pud_t) { __pgd(x) })
>>>^
> mm/debug_vm_pgtable.c:141:8: note: in expansion of macro '__pud'
>>>  pud = __pud(pud_val(pud) | RANDOM_ORVALUE);
>>>^
> include/asm-generic/pgtable-nop4d-hack.h:47:22: warning: missing braces 
> around initializer [-Wmissing-braces]
>>> #define __pud(x)((pud_t) { __pgd(x) })
>>>  ^
> mm/debug_vm_pgtable.c:141:8: note: in expansion of macro '__pud'
>>>  pud = __pud(pud_val(pud) | RANDOM_ORVALUE);
>>>^
>>>cc1: some warnings being treated as errors
>>
>> This build failure is expected now given that we have allowed 
>> DEBUG_VM_PGTABLE
>> with EXPERT without platform requiring ARCH_HAS_DEBUG_VM_PGTABLE. This 
>> problem
>> i.e build failure caused without a platform __pgd(), is known to exist both 
>> on
>> ia64 and arm (32bit) platforms. Please refer 
>> https://lkml.org/lkml/2019/9/24/314
>> for details where this was discussed earlier.
>>
> 
> I'd prefer not to merge a patch which is known to cause build
> regressions.  Is there some temporary thing we can do to prevent these
> errors until arch maintainers(?) get around to implementing the
> long-term fixes?

We could explicitly disable CONFIG_DEBUG_VM_PGTABLE on ia64 and arm platforms
which will ensure that others can still use the EXPERT path.

config DEBUG_VM_PGTABLE
bool "Debug arch page table for semantics compliance"
depends on MMU
depends on !(IA64 || ARM)
depends on ARCH_HAS_DEBUG_VM_PGTABLE || EXPERT
default n if !ARCH_HAS_DEBUG_VM_PGTABLE
default y if DEBUG_VM


Re: [PATCH V13] mm/debug: Add tests validating architecture page table helpers

2020-02-09 Thread Andrew Morton
On Thu, 6 Feb 2020 13:49:35 +0530 Anshuman Khandual  
wrote:

> 
> On 02/06/2020 04:40 AM, kbuild test robot wrote:
> > Hi Anshuman,
> > 
> > Thank you for the patch! Yet something to improve:
> > 
> > [auto build test ERROR on powerpc/next]
> > [also build test ERROR on s390/features linus/master arc/for-next v5.5]
> > [cannot apply to mmotm/master tip/x86/core arm64/for-next/core 
> > next-20200205]
> > [if your patch is applied to the wrong git tree, please drop us a note to 
> > help
> > improve the system. BTW, we also suggest to use '--base' option to specify 
> > the
> > base tree in git format-patch, please see 
> > https://stackoverflow.com/a/37406982]
> > 
> > url:
> > https://github.com/0day-ci/linux/commits/Anshuman-Khandual/mm-debug-Add-tests-validating-architecture-page-table-helpers/20200205-215507
> > base:   https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git 
> > next
> > config: ia64-allmodconfig (attached as .config)
> > compiler: ia64-linux-gcc (GCC) 7.5.0
> > reproduce:
> > wget 
> > https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O 
> > ~/bin/make.cross
> > chmod +x ~/bin/make.cross
> > # save the attached .config to linux build tree
> > GCC_VERSION=7.5.0 make.cross ARCH=ia64 
> > 
> > If you fix the issue, kindly add following tag
> > Reported-by: kbuild test robot 
> > 
> > All error/warnings (new ones prefixed by >>):
> > 
> >In file included from include/asm-generic/pgtable-nopud.h:8:0,
> > from arch/ia64/include/asm/pgtable.h:586,
> > from include/linux/mm.h:99,
> > from include/linux/highmem.h:8,
> > from mm/debug_vm_pgtable.c:14:
> >mm/debug_vm_pgtable.c: In function 'pud_clear_tests':
> >>> include/asm-generic/pgtable-nop4d-hack.h:47:32: error: implicit 
> >>> declaration of function '__pgd'; did you mean '__p4d'? 
> >>> [-Werror=implicit-function-declaration]
> > #define __pud(x)((pud_t) { __pgd(x) })
> >^
> >>> mm/debug_vm_pgtable.c:141:8: note: in expansion of macro '__pud'
> >  pud = __pud(pud_val(pud) | RANDOM_ORVALUE);
> >^
> >>> include/asm-generic/pgtable-nop4d-hack.h:47:22: warning: missing braces 
> >>> around initializer [-Wmissing-braces]
> > #define __pud(x)((pud_t) { __pgd(x) })
> >  ^
> >>> mm/debug_vm_pgtable.c:141:8: note: in expansion of macro '__pud'
> >  pud = __pud(pud_val(pud) | RANDOM_ORVALUE);
> >^
> >cc1: some warnings being treated as errors
> 
> This build failure is expected now given that we have allowed DEBUG_VM_PGTABLE
> with EXPERT without platform requiring ARCH_HAS_DEBUG_VM_PGTABLE. This problem
> i.e build failure caused without a platform __pgd(), is known to exist both on
> ia64 and arm (32bit) platforms. Please refer 
> https://lkml.org/lkml/2019/9/24/314
> for details where this was discussed earlier.
> 

I'd prefer not to merge a patch which is known to cause build
regressions.  Is there some temporary thing we can do to prevent these
errors until arch maintainers(?) get around to implementing the
long-term fixes?




Re: [PATCH V5 06/14] powerpc/vas: Setup thread IRQ handler per VAS instance

2020-02-09 Thread Haren Myneni
On Fri, 2020-02-07 at 16:57 +1100, Michael Neuling wrote:
> >  /*
> > + * Process CRBs that we receive on the fault window.
> > + */
> > +irqreturn_t vas_fault_handler(int irq, void *data)
> > +{
> > +   struct vas_instance *vinst = data;
> > +   struct coprocessor_request_block buf, *crb;
> > +   struct vas_window *window;
> > +   void *fifo;
> > +
> > +   /*
> > +* VAS can interrupt with multiple page faults. So process all
> > +* valid CRBs within fault FIFO until reaches invalid CRB.
> > +* NX updates nx_fault_stamp in CRB and pastes in fault FIFO.
> > +* kernel retrives send window from parition send window ID
> > +* (pswid) in nx_fault_stamp. So pswid should be non-zero and
> > +* use this to check whether CRB is valid.
> > +* After reading CRB entry, it is reset with 0's in fault FIFO.
> > +*
> > +* In case kernel receives another interrupt with different page
> > +* fault and CRBs are processed by the previous handling, will be
> > +* returned from this function when it sees invalid CRB (means 0's).
> > +*/
> > +   do {
> > +   mutex_lock(>mutex);
> 
> This isn't going to work.
> 
> From Documentation/locking/mutex-design.rst
> 
> - Mutexes may not be used in hardware or software interrupt
>   contexts such as tasklets and timers.

Initially used kernel thread per VAS instance and later using IRQ
thread. 

vas_fault_handler() is IRQ thread function, not IRQ handler. I thought
we can use mutex_lock() in thread function.

> 
> Mikey
> 




Re: [PATCH V5 09/14] powerpc/vas: Update CSB and notify process for fault CRBs

2020-02-09 Thread Haren Myneni
Mikey, Thanks for your review comments.

On Fri, 2020-02-07 at 16:46 +1100, Michael Neuling wrote:
> On Wed, 2020-01-22 at 00:17 -0800, Haren Myneni wrote:
> > For each fault CRB, update fault address in CRB (fault_storage_addr)
> > and translation error status in CSB so that user space can touch the
> > fault address and resend the request. If the user space passed invalid
> > CSB address send signal to process with SIGSEGV.
> > 
> > Signed-off-by: Sukadev Bhattiprolu 
> > Signed-off-by: Haren Myneni 
> > ---
> >  arch/powerpc/platforms/powernv/vas-fault.c | 116
> > +
> >  1 file changed, 116 insertions(+)
> > 
> > diff --git a/arch/powerpc/platforms/powernv/vas-fault.c
> > b/arch/powerpc/platforms/powernv/vas-fault.c
> > index 5c2cada..2cfab0c 100644
> > --- a/arch/powerpc/platforms/powernv/vas-fault.c
> > +++ b/arch/powerpc/platforms/powernv/vas-fault.c
> > @@ -11,6 +11,7 @@
> >  #include 
> >  #include 
> >  #include 
> > +#include 
> >  #include 
> >  #include 
> >  
> > @@ -26,6 +27,120 @@
> >  #define VAS_FAULT_WIN_FIFO_SIZE(4 << 20)
> >  
> >  /*
> > + * Update the CSB to indicate a translation error.
> > + *
> > + * If the fault is in the CSB address itself or if we are unable to
> > + * update the CSB, send a signal to the process, because we have no
> > + * other way of notifying the user process.
> > + *
> > + * Remaining settings in the CSB are based on wait_for_csb() of
> > + * NX-GZIP.
> > + */
> > +static void update_csb(struct vas_window *window,
> > +   struct coprocessor_request_block *crb)
> > +{
> > +   int rc;
> > +   struct pid *pid;
> > +   void __user *csb_addr;
> > +   struct task_struct *tsk;
> > +   struct kernel_siginfo info;
> > +   struct coprocessor_status_block csb;
> > +
> > +   /*
> > +* NX user space windows can not be opened for task->mm=NULL
> > +* and faults will not be generated for kernel requests.
> > +*/
> > +   if (!window->mm || !window->user_win)
> > +   return;
> > +
> > +   csb_addr = (void *)be64_to_cpu(crb->csb_addr);
> > +
> > +   csb.cc = CSB_CC_TRANSLATION;
> > +   csb.ce = CSB_CE_TERMINATION;
> > +   csb.cs = 0;
> > +   csb.count = 0;
> > +
> > +   /*
> > +* Returns the fault address in CPU format since it is passed with
> > +* signal. But if the user space expects BE format, need changes.
> > +* i.e either kernel (here) or user should convert to CPU format.
> > +* Not both!
> > +*/
> > +   csb.address = be64_to_cpu(crb->stamp.nx.fault_storage_addr);
> 
> This looks wrong and I don't understand the comment. You need to convert this
> back to be64 to write it to csb.address. ie.
> 
>   csb.address = cpu_to_be64(be64_to_cpu(crb->stamp.nx.fault_storage_addr));
> 
> Which I think you can just avoid the endian conversion all together.

NX pastes fault CRB in big-endian, so passing this address in CPU format
to user space, otherwise the library has to convert. 

What is the standard way for passing to user space? 

> 
> > +   csb.flags = 0;
> > +
> > +   pid = window->pid;
> > +   tsk = get_pid_task(pid, PIDTYPE_PID);
> > +   /*
> > +* Send window will be closed after processing all NX requests
> > +* and process exits after closing all windows. In multi-thread
> > +* applications, thread may not exists, but does not close FD
> > +* (means send window) upon exit. Parent thread (tgid) can use
> > +* and close the window later.
> > +* pid and mm references are taken when window is opened by
> > +* process (pid). So tgid is used only when child thread opens
> > +* a window and exits without closing it in multithread tasks.
> > +*/
> > +   if (!tsk) {
> > +   pid = window->tgid;
> > +   tsk = get_pid_task(pid, PIDTYPE_PID);
> > +   /*
> > +* Parent thread will be closing window during its exit.
> > +* So should not get here.
> > +*/
> > +   if (!tsk)
> > +   return;
> > +   }
> > +
> > +   /* Return if the task is exiting. */
> > +   if (tsk->flags & PF_EXITING) {
> > +   put_task_struct(tsk);
> > +   return;
> > +   }
> > +
> > +   use_mm(window->mm);
> > +   rc = copy_to_user(csb_addr, , sizeof(csb));
> > +   /*
> > +* User space polls on csb.flags (first byte). So add barrier
> > +* then copy first byte with csb flags update.
> > +*/
> > +   smp_mb();
> > +   if (!rc) {
> > +   csb.flags = CSB_V;
> > +   rc = copy_to_user(csb_addr, , sizeof(u8));
> > +   }
> > +   unuse_mm(window->mm);
> > +   put_task_struct(tsk);
> > +
> > +   /* Success */
> > +   if (!rc)
> > +   return;
> > +
> > +   pr_err("Invalid CSB address 0x%p signalling pid(%d)\n",
> > +   csb_addr, pid_vnr(pid));
> 
> This is a userspace error, not a kernel error. This should not be a pr_err().
> 
> Userspace could spam the console with this.

Will change it to pr_debug/info. Added pr_err() during development and
missed to 

Re: [PATCH v6 2/6] sysfs: wrap __compat_only_sysfs_link_entry_to_kobj function to change the symlink name

2020-02-09 Thread Michael Ellerman
Sourabh Jain  writes:
> The __compat_only_sysfs_link_entry_to_kobj function creates a symlink to a
> kobject but doesn't provide an option to change the symlink file name.
>
> This patch adds a wrapper function compat_only_sysfs_link_entry_to_kobj
> that extends the __compat_only_sysfs_link_entry_to_kobj functionality
> which allows function caller to customize the symlink name.
>
> Signed-off-by: Sourabh Jain 
> ---
>  fs/sysfs/group.c  | 28 +---
>  include/linux/sysfs.h | 12 
>  2 files changed, 37 insertions(+), 3 deletions(-)

I'll assume no one has any objections to this and merge it via the
powerpc tree with the rest of the series.

cheers

> diff --git a/fs/sysfs/group.c b/fs/sysfs/group.c
> index d41c21fef138..0993645f0b59 100644
> --- a/fs/sysfs/group.c
> +++ b/fs/sysfs/group.c
> @@ -424,6 +424,25 @@ EXPORT_SYMBOL_GPL(sysfs_remove_link_from_group);
>  int __compat_only_sysfs_link_entry_to_kobj(struct kobject *kobj,
> struct kobject *target_kobj,
> const char *target_name)
> +{
> + return compat_only_sysfs_link_entry_to_kobj(kobj, target_kobj,
> + target_name, NULL);
> +}
> +EXPORT_SYMBOL_GPL(__compat_only_sysfs_link_entry_to_kobj);
> +
> +/**
> + * compat_only_sysfs_link_entry_to_kobj - add a symlink to a kobject pointing
> + * to a group or an attribute
> + * @kobj:The kobject containing the group.
> + * @target_kobj: The target kobject.
> + * @target_name: The name of the target group or attribute.
> + * @symlink_name:The name of the symlink file (target_name will be
> + *   considered if symlink_name is NULL).
> + */
> +int compat_only_sysfs_link_entry_to_kobj(struct kobject *kobj,
> +  struct kobject *target_kobj,
> +  const char *target_name,
> +  const char *symlink_name)
>  {
>   struct kernfs_node *target;
>   struct kernfs_node *entry;
> @@ -448,12 +467,15 @@ int __compat_only_sysfs_link_entry_to_kobj(struct 
> kobject *kobj,
>   return -ENOENT;
>   }
>  
> - link = kernfs_create_link(kobj->sd, target_name, entry);
> + if (!symlink_name)
> + symlink_name = target_name;
> +
> + link = kernfs_create_link(kobj->sd, symlink_name, entry);
>   if (IS_ERR(link) && PTR_ERR(link) == -EEXIST)
> - sysfs_warn_dup(kobj->sd, target_name);
> + sysfs_warn_dup(kobj->sd, symlink_name);
>  
>   kernfs_put(entry);
>   kernfs_put(target);
>   return PTR_ERR_OR_ZERO(link);
>  }
> -EXPORT_SYMBOL_GPL(__compat_only_sysfs_link_entry_to_kobj);
> +EXPORT_SYMBOL_GPL(compat_only_sysfs_link_entry_to_kobj);
> diff --git a/include/linux/sysfs.h b/include/linux/sysfs.h
> index 5420817ed317..15b195a4529d 100644
> --- a/include/linux/sysfs.h
> +++ b/include/linux/sysfs.h
> @@ -300,6 +300,10 @@ void sysfs_remove_link_from_group(struct kobject *kobj, 
> const char *group_name,
>  int __compat_only_sysfs_link_entry_to_kobj(struct kobject *kobj,
> struct kobject *target_kobj,
> const char *target_name);
> +int compat_only_sysfs_link_entry_to_kobj(struct kobject *kobj,
> +  struct kobject *target_kobj,
> +  const char *target_name,
> +  const char *symlink_name);
>  
>  void sysfs_notify(struct kobject *kobj, const char *dir, const char *attr);
>  
> @@ -508,6 +512,14 @@ static inline int __compat_only_sysfs_link_entry_to_kobj(
>   return 0;
>  }
>  
> +static int compat_only_sysfs_link_entry_to_kobj(struct kobject *kobj,
> + struct kobject *target_kobj,
> + const char *target_name,
> + const char *symlink_name)
> +{
> + return 0;
> +}
> +
>  static inline void sysfs_notify(struct kobject *kobj, const char *dir,
>   const char *attr)
>  {
> -- 
> 2.17.2


[PATCH] powerpc/8xx: Fix clearing of bits 20-23 in ITLB miss

2020-02-09 Thread Christophe Leroy
In ITLB miss handled the line supposed to clear bits 20-23 on the
L2 ITLB entry is buggy and does indeed nothing, leading to undefined
value which could allow execution when it shouldn't.

Properly do the clearing with the relevant instruction.

Fixes: 74fabcadfd43 ("powerpc/8xx: don't use r12/SPRN_SPRG_SCRATCH2 in TLB Miss 
handlers")
Cc: sta...@vger.kernel.org
Signed-off-by: Christophe Leroy 
---
 arch/powerpc/kernel/head_8xx.S | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/head_8xx.S b/arch/powerpc/kernel/head_8xx.S
index 9922306ae512..073a651787df 100644
--- a/arch/powerpc/kernel/head_8xx.S
+++ b/arch/powerpc/kernel/head_8xx.S
@@ -256,7 +256,7 @@ InstructionTLBMiss:
 * set.  All other Linux PTE bits control the behavior
 * of the MMU.
 */
-   rlwimi  r10, r10, 0, 0x0f00 /* Clear bits 20-23 */
+   rlwinm  r10, r10, 0, ~0x0f00/* Clear bits 20-23 */
rlwimi  r10, r10, 4, 0x0400 /* Copy _PAGE_EXEC into bit 21 */
ori r10, r10, RPN_PATTERN | 0x200 /* Set 22 and 24-27 */
mtspr   SPRN_MI_RPN, r10/* Update TLB entry */
-- 
2.25.0



Re: [PATCH v2] libnvdimm: Update persistence domain value for of_pmem and papr_scm device

2020-02-09 Thread Dan Williams
On Tue, Feb 4, 2020 at 9:21 PM Aneesh Kumar K.V
 wrote:
>
> Currently, kernel shows the below values
> "persistence_domain":"cpu_cache"
> "persistence_domain":"memory_controller"
> "persistence_domain":"unknown"
>
> "cpu_cache" indicates no extra instructions is needed to ensure the 
> persistence
> of data in the pmem media on power failure.
>
> "memory_controller" indicates platform provided instructions need to be issued

No, it does not. The only requirement implied by "memory_controller"
is global visibility outside the cpu cache. If there are special
instructions beyond that then it isn't persistent memory, at least not
pmem that is safe for dax. virtio-pmem is an example of pmem-like
memory that is not enabled for userspace flushing (MAP_SYNC disabled).

> as per documented sequence to make sure data get flushed so that it is
> guaranteed to be on pmem media in case of system power loss.
>
> Based on the above use memory_controller for non volatile regions on ppc64.
>
> Signed-off-by: Aneesh Kumar K.V 
> ---
>  arch/powerpc/platforms/pseries/papr_scm.c | 7 ++-
>  drivers/nvdimm/of_pmem.c  | 4 +++-
>  include/linux/libnvdimm.h | 1 -
>  3 files changed, 9 insertions(+), 3 deletions(-)
>
> diff --git a/arch/powerpc/platforms/pseries/papr_scm.c 
> b/arch/powerpc/platforms/pseries/papr_scm.c
> index 7525635a8536..ffcd0d7a867c 100644
> --- a/arch/powerpc/platforms/pseries/papr_scm.c
> +++ b/arch/powerpc/platforms/pseries/papr_scm.c
> @@ -359,8 +359,13 @@ static int papr_scm_nvdimm_init(struct papr_scm_priv *p)
>
> if (p->is_volatile)
> p->region = nvdimm_volatile_region_create(p->bus, _desc);
> -   else
> +   else {
> +   /*
> +* We need to flush things correctly to guarantee persistance
> +*/

There are never guarantees. If you're going to comment what does
software need to flush, and how?

> +   set_bit(ND_REGION_PERSIST_MEMCTRL, _desc.flags);
> p->region = nvdimm_pmem_region_create(p->bus, _desc);
> +   }
> if (!p->region) {
> dev_err(dev, "Error registering region %pR from %pOF\n",
> ndr_desc.res, p->dn);
> diff --git a/drivers/nvdimm/of_pmem.c b/drivers/nvdimm/of_pmem.c
> index 8224d1431ea9..6826a274a1f1 100644
> --- a/drivers/nvdimm/of_pmem.c
> +++ b/drivers/nvdimm/of_pmem.c
> @@ -62,8 +62,10 @@ static int of_pmem_region_probe(struct platform_device 
> *pdev)
>
> if (is_volatile)
> region = nvdimm_volatile_region_create(bus, 
> _desc);
> -   else
> +   else {
> +   set_bit(ND_REGION_PERSIST_MEMCTRL, _desc.flags);
> region = nvdimm_pmem_region_create(bus, _desc);
> +   }
>
> if (!region)
> dev_warn(>dev, "Unable to register region %pR 
> from %pOF\n",
> diff --git a/include/linux/libnvdimm.h b/include/linux/libnvdimm.h
> index 0f366706b0aa..771d888a5ed7 100644
> --- a/include/linux/libnvdimm.h
> +++ b/include/linux/libnvdimm.h
> @@ -54,7 +54,6 @@ enum {
> /*
>  * Platform provides mechanisms to automatically flush outstanding
>  * write data from memory controler to pmem on system power loss.
> -* (ADR)

I'd rather not delete critical terminology for a developer / platform
owner to be able to consult documentation, or their vendor. Can you
instead add the PowerPC equivalent term for this capability? I.e. list
(x86: ADR PowerPC: foo ...).


[PATCH v2] powerpc/hugetlb: Fix 8M hugepages on 8xx

2020-02-09 Thread Christophe Leroy
With HW assistance all page tables must be 4k aligned, the 8xx
drops the last 12 bits during the walk.

Redefine HUGEPD_SHIFT_MASK to mask last 12 bits out.
HUGEPD_SHIFT_MASK is used to for alignment of page table cache.

Fixes: 22569b881d37 ("powerpc/8xx: Enable 8M hugepage support with HW 
assistance")
Cc: sta...@vger.kernel.org
Signed-off-by: Christophe Leroy 
---
v2: Only do the fix of alignment which is the only vital fix.
---
 arch/powerpc/include/asm/page.h | 5 +
 1 file changed, 5 insertions(+)

diff --git a/arch/powerpc/include/asm/page.h b/arch/powerpc/include/asm/page.h
index 86332080399a..080a0bf8e54b 100644
--- a/arch/powerpc/include/asm/page.h
+++ b/arch/powerpc/include/asm/page.h
@@ -295,8 +295,13 @@ static inline bool pfn_valid(unsigned long pfn)
 /*
  * Some number of bits at the level of the page table that points to
  * a hugepte are used to encode the size.  This masks those bits.
+ * On 8xx, HW assistance requires 4k alignment for the hugepte.
  */
+#ifdef CONFIG_PPC_8xx
+#define HUGEPD_SHIFT_MASK 0xfff
+#else
 #define HUGEPD_SHIFT_MASK 0x3f
+#endif
 
 #ifndef __ASSEMBLY__
 
-- 
2.25.0



[PATCH 1/6] powerpc: kernel: no need to check return value of debugfs_create functions

2020-02-09 Thread Greg Kroah-Hartman
When calling debugfs functions, there is no need to ever check the
return value.  The function can work or not, but the code logic should
never do something different based on this.

Cc: Benjamin Herrenschmidt 
Cc: Paul Mackerras 
Cc: Michael Ellerman 
Cc: Hari Bathini 
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Greg Kroah-Hartman 
---
 arch/powerpc/kernel/fadump.c   |  9 ++---
 arch/powerpc/kernel/setup-common.c |  3 +--
 arch/powerpc/kernel/traps.c| 25 +
 3 files changed, 8 insertions(+), 29 deletions(-)

diff --git a/arch/powerpc/kernel/fadump.c b/arch/powerpc/kernel/fadump.c
index ff0114aeba9b..b83fa42c19e1 100644
--- a/arch/powerpc/kernel/fadump.c
+++ b/arch/powerpc/kernel/fadump.c
@@ -1432,7 +1432,6 @@ DEFINE_SHOW_ATTRIBUTE(fadump_region);
 
 static void fadump_init_files(void)
 {
-   struct dentry *debugfs_file;
int rc = 0;
 
rc = sysfs_create_file(kernel_kobj, _attr.attr);
@@ -1445,12 +1444,8 @@ static void fadump_init_files(void)
printk(KERN_ERR "fadump: unable to create sysfs file"
" fadump_registered (%d)\n", rc);
 
-   debugfs_file = debugfs_create_file("fadump_region", 0444,
-   powerpc_debugfs_root, NULL,
-   _region_fops);
-   if (!debugfs_file)
-   printk(KERN_ERR "fadump: unable to create debugfs file"
-   " fadump_region\n");
+   debugfs_create_file("fadump_region", 0444, powerpc_debugfs_root, NULL,
+   _region_fops);
 
if (fw_dump.dump_active) {
rc = sysfs_create_file(kernel_kobj, _release_attr.attr);
diff --git a/arch/powerpc/kernel/setup-common.c 
b/arch/powerpc/kernel/setup-common.c
index 7f8c890360fe..f9c0d888ce8a 100644
--- a/arch/powerpc/kernel/setup-common.c
+++ b/arch/powerpc/kernel/setup-common.c
@@ -787,8 +787,7 @@ EXPORT_SYMBOL(powerpc_debugfs_root);
 static int powerpc_debugfs_init(void)
 {
powerpc_debugfs_root = debugfs_create_dir("powerpc", NULL);
-
-   return powerpc_debugfs_root == NULL;
+   return 0;
 }
 arch_initcall(powerpc_debugfs_init);
 #endif
diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c
index 82a3438300fd..3fca22276bb1 100644
--- a/arch/powerpc/kernel/traps.c
+++ b/arch/powerpc/kernel/traps.c
@@ -2278,35 +2278,20 @@ void ppc_warn_emulated_print(const char *type)
 
 static int __init ppc_warn_emulated_init(void)
 {
-   struct dentry *dir, *d;
+   struct dentry *dir;
unsigned int i;
struct ppc_emulated_entry *entries = (void *)_emulated;
 
-   if (!powerpc_debugfs_root)
-   return -ENODEV;
-
dir = debugfs_create_dir("emulated_instructions",
 powerpc_debugfs_root);
-   if (!dir)
-   return -ENOMEM;
 
-   d = debugfs_create_u32("do_warn", 0644, dir,
-  _warn_emulated);
-   if (!d)
-   goto fail;
+   debugfs_create_u32("do_warn", 0644, dir, _warn_emulated);
 
-   for (i = 0; i < sizeof(ppc_emulated)/sizeof(*entries); i++) {
-   d = debugfs_create_u32(entries[i].name, 0644, dir,
-  (u32 *)[i].val.counter);
-   if (!d)
-   goto fail;
-   }
+   for (i = 0; i < sizeof(ppc_emulated)/sizeof(*entries); i++)
+   debugfs_create_u32(entries[i].name, 0644, dir,
+  (u32 *)[i].val.counter);
 
return 0;
-
-fail:
-   debugfs_remove_recursive(dir);
-   return -ENOMEM;
 }
 
 device_initcall(ppc_warn_emulated_init);
-- 
2.25.0



[PATCH 6/6] powerpc: powernv: no need to check return value of debugfs_create functions

2020-02-09 Thread Greg Kroah-Hartman
When calling debugfs functions, there is no need to ever check the
return value.  The function can work or not, but the code logic should
never do something different based on this.

Cc: Benjamin Herrenschmidt 
Cc: Paul Mackerras 
Cc: Michael Ellerman 
Cc: Sukadev Bhattiprolu 
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Greg Kroah-Hartman 
---
 arch/powerpc/platforms/powernv/memtrace.c  |  7 
 arch/powerpc/platforms/powernv/opal-imc.c  | 24 --
 arch/powerpc/platforms/powernv/pci-ioda.c  |  5 ---
 arch/powerpc/platforms/powernv/vas-debug.c | 37 ++
 4 files changed, 10 insertions(+), 63 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/memtrace.c 
b/arch/powerpc/platforms/powernv/memtrace.c
index eb2e75dac369..d6d64f8718e6 100644
--- a/arch/powerpc/platforms/powernv/memtrace.c
+++ b/arch/powerpc/platforms/powernv/memtrace.c
@@ -187,11 +187,6 @@ static int memtrace_init_debugfs(void)
 
snprintf(ent->name, 16, "%08x", ent->nid);
dir = debugfs_create_dir(ent->name, memtrace_debugfs_dir);
-   if (!dir) {
-   pr_err("Failed to create debugfs directory for node 
%d\n",
-   ent->nid);
-   return -1;
-   }
 
ent->dir = dir;
debugfs_create_file("trace", 0400, dir, ent, _fops);
@@ -314,8 +309,6 @@ static int memtrace_init(void)
 {
memtrace_debugfs_dir = debugfs_create_dir("memtrace",
  powerpc_debugfs_root);
-   if (!memtrace_debugfs_dir)
-   return -1;
 
debugfs_create_file("enable", 0600, memtrace_debugfs_dir,
NULL, _init_fops);
diff --git a/arch/powerpc/platforms/powernv/opal-imc.c 
b/arch/powerpc/platforms/powernv/opal-imc.c
index 000b350d4060..968b9a4d1cd9 100644
--- a/arch/powerpc/platforms/powernv/opal-imc.c
+++ b/arch/powerpc/platforms/powernv/opal-imc.c
@@ -35,11 +35,10 @@ static int imc_mem_set(void *data, u64 val)
 }
 DEFINE_DEBUGFS_ATTRIBUTE(fops_imc_x64, imc_mem_get, imc_mem_set, 
"0x%016llx\n");
 
-static struct dentry *imc_debugfs_create_x64(const char *name, umode_t mode,
-struct dentry *parent, u64  *value)
+static void imc_debugfs_create_x64(const char *name, umode_t mode,
+  struct dentry *parent, u64  *value)
 {
-   return debugfs_create_file_unsafe(name, mode, parent,
- value, _imc_x64);
+   debugfs_create_file_unsafe(name, mode, parent, value, _imc_x64);
 }
 
 /*
@@ -59,9 +58,6 @@ static void export_imc_mode_and_cmd(struct device_node *node,
 
imc_debugfs_parent = debugfs_create_dir("imc", powerpc_debugfs_root);
 
-   if (!imc_debugfs_parent)
-   return;
-
if (of_property_read_u32(node, "cb_offset", _offset))
cb_offset = IMC_CNTL_BLK_OFFSET;
 
@@ -69,21 +65,15 @@ static void export_imc_mode_and_cmd(struct device_node 
*node,
loc = (u64)(ptr->vbase) + cb_offset;
imc_mode_addr = (u64 *)(loc + IMC_CNTL_BLK_MODE_OFFSET);
sprintf(mode, "imc_mode_%d", (u32)(ptr->id));
-   if (!imc_debugfs_create_x64(mode, 0600, imc_debugfs_parent,
-   imc_mode_addr))
-   goto err;
+   imc_debugfs_create_x64(mode, 0600, imc_debugfs_parent,
+  imc_mode_addr);
 
imc_cmd_addr = (u64 *)(loc + IMC_CNTL_BLK_CMD_OFFSET);
sprintf(cmd, "imc_cmd_%d", (u32)(ptr->id));
-   if (!imc_debugfs_create_x64(cmd, 0600, imc_debugfs_parent,
-   imc_cmd_addr))
-   goto err;
+   imc_debugfs_create_x64(cmd, 0600, imc_debugfs_parent,
+  imc_cmd_addr);
ptr++;
}
-   return;
-
-err:
-   debugfs_remove_recursive(imc_debugfs_parent);
 }
 
 /*
diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c 
b/arch/powerpc/platforms/powernv/pci-ioda.c
index 22c22cd7bd82..57d3a6af1d52 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -3174,11 +3174,6 @@ static void pnv_pci_ioda_create_dbgfs(void)
 
sprintf(name, "PCI%04x", hose->global_number);
phb->dbgfs = debugfs_create_dir(name, powerpc_debugfs_root);
-   if (!phb->dbgfs) {
-   pr_warn("%s: Error on creating debugfs on PHB#%x\n",
-   __func__, hose->global_number);
-   continue;
-   }
 
debugfs_create_file_unsafe("dump_diag_regs", 0200, phb->dbgfs,
   phb, _pci_diag_data_fops);
diff --git a/arch/powerpc/platforms/powernv/vas-debug.c 

[PATCH 5/6] powerpc: cell: axon_msi: no need to check return value of debugfs_create functions

2020-02-09 Thread Greg Kroah-Hartman
When calling debugfs functions, there is no need to ever check the
return value.  The function can work or not, but the code logic should
never do something different based on this.

Cc: Arnd Bergmann 
Cc: Benjamin Herrenschmidt 
Cc: Paul Mackerras 
Cc: Michael Ellerman 
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Greg Kroah-Hartman 
---
 arch/powerpc/platforms/cell/axon_msi.c | 6 +-
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/arch/powerpc/platforms/cell/axon_msi.c 
b/arch/powerpc/platforms/cell/axon_msi.c
index 57c4e0e86c88..ca2555b8a0c2 100644
--- a/arch/powerpc/platforms/cell/axon_msi.c
+++ b/arch/powerpc/platforms/cell/axon_msi.c
@@ -480,10 +480,6 @@ void axon_msi_debug_setup(struct device_node *dn, struct 
axon_msic *msic)
 
snprintf(name, sizeof(name), "msic_%d", of_node_to_nid(dn));
 
-   if (!debugfs_create_file(name, 0600, powerpc_debugfs_root,
-msic, _msic)) {
-   pr_devel("axon_msi: debugfs_create_file failed!\n");
-   return;
-   }
+   debugfs_create_file(name, 0600, powerpc_debugfs_root, msic, _msic);
 }
 #endif /* DEBUG */
-- 
2.25.0



[PATCH 4/6] powerpc: mm: ptdump: no need to check return value of debugfs_create functions

2020-02-09 Thread Greg Kroah-Hartman
When calling debugfs functions, there is no need to ever check the
return value.  The function can work or not, but the code logic should
never do something different based on this.

Cc: Benjamin Herrenschmidt 
Cc: Paul Mackerras 
Cc: Michael Ellerman 
Cc: Christophe Leroy 
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Greg Kroah-Hartman 
---
 arch/powerpc/mm/ptdump/bats.c  | 8 +++-
 arch/powerpc/mm/ptdump/hashpagetable.c | 7 ++-
 arch/powerpc/mm/ptdump/ptdump.c| 8 +++-
 arch/powerpc/mm/ptdump/segment_regs.c  | 8 +++-
 4 files changed, 11 insertions(+), 20 deletions(-)

diff --git a/arch/powerpc/mm/ptdump/bats.c b/arch/powerpc/mm/ptdump/bats.c
index 4154feac1da3..d3a5d6b318d1 100644
--- a/arch/powerpc/mm/ptdump/bats.c
+++ b/arch/powerpc/mm/ptdump/bats.c
@@ -164,10 +164,8 @@ static const struct file_operations bats_fops = {
 
 static int __init bats_init(void)
 {
-   struct dentry *debugfs_file;
-
-   debugfs_file = debugfs_create_file("block_address_translation", 0400,
-  powerpc_debugfs_root, NULL, 
_fops);
-   return debugfs_file ? 0 : -ENOMEM;
+   debugfs_create_file("block_address_translation", 0400,
+   powerpc_debugfs_root, NULL, _fops);
+   return 0;
 }
 device_initcall(bats_init);
diff --git a/arch/powerpc/mm/ptdump/hashpagetable.c 
b/arch/powerpc/mm/ptdump/hashpagetable.c
index a07278027c6f..b6ed9578382f 100644
--- a/arch/powerpc/mm/ptdump/hashpagetable.c
+++ b/arch/powerpc/mm/ptdump/hashpagetable.c
@@ -527,13 +527,10 @@ static const struct file_operations ptdump_fops = {
 
 static int ptdump_init(void)
 {
-   struct dentry *debugfs_file;
-
if (!radix_enabled()) {
populate_markers();
-   debugfs_file = debugfs_create_file("kernel_hash_pagetable",
-   0400, NULL, NULL, _fops);
-   return debugfs_file ? 0 : -ENOMEM;
+   debugfs_create_file("kernel_hash_pagetable", 0400, NULL, NULL,
+   _fops);
}
return 0;
 }
diff --git a/arch/powerpc/mm/ptdump/ptdump.c b/arch/powerpc/mm/ptdump/ptdump.c
index 206156255247..d92bb8ea229c 100644
--- a/arch/powerpc/mm/ptdump/ptdump.c
+++ b/arch/powerpc/mm/ptdump/ptdump.c
@@ -417,12 +417,10 @@ void ptdump_check_wx(void)
 
 static int ptdump_init(void)
 {
-   struct dentry *debugfs_file;
-
populate_markers();
build_pgtable_complete_mask();
-   debugfs_file = debugfs_create_file("kernel_page_tables", 0400, NULL,
-   NULL, _fops);
-   return debugfs_file ? 0 : -ENOMEM;
+   debugfs_create_file("kernel_page_tables", 0400, NULL, NULL,
+   _fops);
+   return 0;
 }
 device_initcall(ptdump_init);
diff --git a/arch/powerpc/mm/ptdump/segment_regs.c 
b/arch/powerpc/mm/ptdump/segment_regs.c
index 501843664bb9..dde2fe8de4b2 100644
--- a/arch/powerpc/mm/ptdump/segment_regs.c
+++ b/arch/powerpc/mm/ptdump/segment_regs.c
@@ -55,10 +55,8 @@ static const struct file_operations sr_fops = {
 
 static int __init sr_init(void)
 {
-   struct dentry *debugfs_file;
-
-   debugfs_file = debugfs_create_file("segment_registers", 0400,
-  powerpc_debugfs_root, NULL, 
_fops);
-   return debugfs_file ? 0 : -ENOMEM;
+   debugfs_create_file("segment_registers", 0400, powerpc_debugfs_root,
+   NULL, _fops);
+   return 0;
 }
 device_initcall(sr_init);
-- 
2.25.0



[PATCH 3/6] powerpc: mm: book3s64: hash_utils: no need to check return value of debugfs_create functions

2020-02-09 Thread Greg Kroah-Hartman
When calling debugfs functions, there is no need to ever check the
return value.  The function can work or not, but the code logic should
never do something different based on this.

Cc: Benjamin Herrenschmidt 
Cc: Paul Mackerras 
Cc: Michael Ellerman 
Cc: "Aneesh Kumar K.V" 
Cc: Christophe Leroy 
Cc: Nicholas Piggin 
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Greg Kroah-Hartman 
---
 arch/powerpc/mm/book3s64/hash_utils.c | 7 ++-
 1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/mm/book3s64/hash_utils.c 
b/arch/powerpc/mm/book3s64/hash_utils.c
index 523d4d39d11e..7e5714a69a58 100644
--- a/arch/powerpc/mm/book3s64/hash_utils.c
+++ b/arch/powerpc/mm/book3s64/hash_utils.c
@@ -2018,11 +2018,8 @@ DEFINE_DEBUGFS_ATTRIBUTE(fops_hpt_order, hpt_order_get, 
hpt_order_set, "%llu\n")
 
 static int __init hash64_debugfs(void)
 {
-   if (!debugfs_create_file_unsafe("hpt_order", 0600, powerpc_debugfs_root,
-   NULL, _hpt_order)) {
-   pr_err("lpar: unable to create hpt_order debugsfs file\n");
-   }
-
+   debugfs_create_file("hpt_order", 0600, powerpc_debugfs_root, NULL,
+   _hpt_order);
return 0;
 }
 machine_device_initcall(pseries, hash64_debugfs);
-- 
2.25.0



[PATCH 2/6] powerpc: kvm: no need to check return value of debugfs_create functions

2020-02-09 Thread Greg Kroah-Hartman
When calling debugfs functions, there is no need to ever check the
return value.  The function can work or not, but the code logic should
never do something different based on this.

Because of this cleanup, we get to remove a few fields in struct
kvm_arch that are now unused.

Cc: Paul Mackerras 
Cc: Benjamin Herrenschmidt 
Cc: Michael Ellerman 
Cc: kvm-...@vger.kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Greg Kroah-Hartman 
---
 arch/powerpc/include/asm/kvm_host.h|  3 ---
 arch/powerpc/kvm/book3s_64_mmu_hv.c|  5 ++---
 arch/powerpc/kvm/book3s_64_mmu_radix.c |  5 ++---
 arch/powerpc/kvm/book3s_hv.c   |  9 ++---
 arch/powerpc/kvm/timing.c  | 13 +++--
 5 files changed, 9 insertions(+), 26 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_host.h 
b/arch/powerpc/include/asm/kvm_host.h
index 6e8b8ffd06ad..877f8aa2bc1e 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -308,8 +308,6 @@ struct kvm_arch {
pgd_t *pgtable;
u64 process_table;
struct dentry *debugfs_dir;
-   struct dentry *htab_dentry;
-   struct dentry *radix_dentry;
struct kvm_resize_hpt *resize_hpt; /* protected by kvm->lock */
 #endif /* CONFIG_KVM_BOOK3S_HV_POSSIBLE */
 #ifdef CONFIG_KVM_BOOK3S_PR_POSSIBLE
@@ -830,7 +828,6 @@ struct kvm_vcpu_arch {
struct kvmhv_tb_accumulator cede_time;  /* time napping inside guest */
 
struct dentry *debugfs_dir;
-   struct dentry *debugfs_timings;
 #endif /* CONFIG_KVM_BOOK3S_HV_EXIT_TIMING */
 };
 
diff --git a/arch/powerpc/kvm/book3s_64_mmu_hv.c 
b/arch/powerpc/kvm/book3s_64_mmu_hv.c
index 6c372f5c61b6..8b4eac0c9dcd 100644
--- a/arch/powerpc/kvm/book3s_64_mmu_hv.c
+++ b/arch/powerpc/kvm/book3s_64_mmu_hv.c
@@ -2138,9 +2138,8 @@ static const struct file_operations debugfs_htab_fops = {
 
 void kvmppc_mmu_debugfs_init(struct kvm *kvm)
 {
-   kvm->arch.htab_dentry = debugfs_create_file("htab", 0400,
-   kvm->arch.debugfs_dir, kvm,
-   _htab_fops);
+   debugfs_create_file("htab", 0400, kvm->arch.debugfs_dir, kvm,
+   _htab_fops);
 }
 
 void kvmppc_mmu_book3s_hv_init(struct kvm_vcpu *vcpu)
diff --git a/arch/powerpc/kvm/book3s_64_mmu_radix.c 
b/arch/powerpc/kvm/book3s_64_mmu_radix.c
index 803940d79b73..1d75ed684b53 100644
--- a/arch/powerpc/kvm/book3s_64_mmu_radix.c
+++ b/arch/powerpc/kvm/book3s_64_mmu_radix.c
@@ -1376,9 +1376,8 @@ static const struct file_operations debugfs_radix_fops = {
 
 void kvmhv_radix_debugfs_init(struct kvm *kvm)
 {
-   kvm->arch.radix_dentry = debugfs_create_file("radix", 0400,
-kvm->arch.debugfs_dir, kvm,
-_radix_fops);
+   debugfs_create_file("radix", 0400, kvm->arch.debugfs_dir, kvm,
+   _radix_fops);
 }
 
 int kvmppc_radix_init(void)
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index 2cefd071b848..33be4d93248a 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -2258,14 +2258,9 @@ static void debugfs_vcpu_init(struct kvm_vcpu *vcpu, 
unsigned int id)
struct kvm *kvm = vcpu->kvm;
 
snprintf(buf, sizeof(buf), "vcpu%u", id);
-   if (IS_ERR_OR_NULL(kvm->arch.debugfs_dir))
-   return;
vcpu->arch.debugfs_dir = debugfs_create_dir(buf, kvm->arch.debugfs_dir);
-   if (IS_ERR_OR_NULL(vcpu->arch.debugfs_dir))
-   return;
-   vcpu->arch.debugfs_timings =
-   debugfs_create_file("timings", 0444, vcpu->arch.debugfs_dir,
-   vcpu, _timings_ops);
+   debugfs_create_file("timings", 0444, vcpu->arch.debugfs_dir, vcpu,
+   _timings_ops);
 }
 
 #else /* CONFIG_KVM_BOOK3S_HV_EXIT_TIMING */
diff --git a/arch/powerpc/kvm/timing.c b/arch/powerpc/kvm/timing.c
index bfe4f106cffc..8e4791c6f2af 100644
--- a/arch/powerpc/kvm/timing.c
+++ b/arch/powerpc/kvm/timing.c
@@ -207,19 +207,12 @@ static const struct file_operations 
kvmppc_exit_timing_fops = {
 void kvmppc_create_vcpu_debugfs(struct kvm_vcpu *vcpu, unsigned int id)
 {
static char dbg_fname[50];
-   struct dentry *debugfs_file;
 
snprintf(dbg_fname, sizeof(dbg_fname), "vm%u_vcpu%u_timing",
 current->pid, id);
-   debugfs_file = debugfs_create_file(dbg_fname, 0666,
-   kvm_debugfs_dir, vcpu,
-   _exit_timing_fops);
-
-   if (!debugfs_file) {
-   printk(KERN_ERR"%s: error creating debugfs file %s\n",
-   __func__, dbg_fname);
-   return;
-   }
+   debugfs_create_file(dbg_fname, 0666, kvm_debugfs_dir, vcpu,
+   _exit_timing_fops);
+
 

Re: [PATCH] powerpc/hugetlb: Fix 8M hugepages on 8xx

2020-02-09 Thread Christophe Leroy




Le 06/02/2020 à 14:50, Christophe Leroy a écrit :

Commit 55c8fc3f4930 ("powerpc/8xx: reintroduce 16K pages with HW
assistance") redefined pte_t as a struct of 4 pte_basic_t, because
in 16K pages mode there are four identical entries in the page table.
But hugepd entries for 8k pages require only one entrie of size
pte_basic_t. So there is no point in creating a cache for 4 entries
page tables.

Also, with HW assistance the entries must be 4k aligned, the 8xx
drops the last 12 bits. Redefine HUGEPD_SHIFT_MASK to mask them out.

Calculate PTE_T_ORDER using the size of pte_basic_t instead of pte_t.

In 16k mode, define a specific set_huge_pte_at() function which writes
the pte in a single entry instead of using set_pte_at() which writes
4 identical entries. Define set_pte_filter() inline otherwise GCC
doesn't inline it anymore because it is now used twice, and that gives
a pretty suboptimal code because of pte_t being a struct of 4 entries.
This function is also used for 512k pages which only require one entry
as well allthough replicating it four times is harmless as 512k pages
entries are spread every 128 bytes in the table.


That's not enough. We also need to change huge_ptep_set_access_flags(), 
huge_pte_clear(), huge_ptep_set_wrprotect() and huge_ptep_get_and_clear()


Will leave it as is at the time being, in parallele I'm working on 
getting rid of CONFIG_PPC_MM_SLICES for the 8xx and that fix that.


Christophe



Fixes: 22569b881d37 ("powerpc/8xx: Enable 8M hugepage support with HW 
assistance")
Cc: sta...@vger.kernel.org
Signed-off-by: Christophe Leroy 
---
  arch/powerpc/include/asm/hugetlb.h |  5 +
  arch/powerpc/include/asm/page.h|  5 +
  arch/powerpc/mm/hugetlbpage.c  |  3 ++-
  arch/powerpc/mm/pgtable.c  | 19 ++-
  4 files changed, 30 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/include/asm/hugetlb.h 
b/arch/powerpc/include/asm/hugetlb.h
index bd6504c28c2f..f43cfbcf014f 100644
--- a/arch/powerpc/include/asm/hugetlb.h
+++ b/arch/powerpc/include/asm/hugetlb.h
@@ -64,6 +64,11 @@ static inline void arch_clear_hugepage_flags(struct page 
*page)
  {
  }
  
+#if defined(CONFIG_PPC_8xx) && defined(CONFIG_PPC_16K_PAGES)

+#define __HAVE_ARCH_HUGE_SET_HUGE_PTE_AT
+void set_huge_pte_at(struct mm_struct *mm, unsigned long addr, pte_t *ptep, 
pte_t pte);
+#endif
+
  #include 
  
  #else /* ! CONFIG_HUGETLB_PAGE */

diff --git a/arch/powerpc/include/asm/page.h b/arch/powerpc/include/asm/page.h
index 86332080399a..080a0bf8e54b 100644
--- a/arch/powerpc/include/asm/page.h
+++ b/arch/powerpc/include/asm/page.h
@@ -295,8 +295,13 @@ static inline bool pfn_valid(unsigned long pfn)
  /*
   * Some number of bits at the level of the page table that points to
   * a hugepte are used to encode the size.  This masks those bits.
+ * On 8xx, HW assistance requires 4k alignment for the hugepte.
   */
+#ifdef CONFIG_PPC_8xx
+#define HUGEPD_SHIFT_MASK 0xfff
+#else
  #define HUGEPD_SHIFT_MASK 0x3f
+#endif
  
  #ifndef __ASSEMBLY__
  
diff --git a/arch/powerpc/mm/hugetlbpage.c b/arch/powerpc/mm/hugetlbpage.c

index 73d4873fc7f8..c61032580185 100644
--- a/arch/powerpc/mm/hugetlbpage.c
+++ b/arch/powerpc/mm/hugetlbpage.c
@@ -30,7 +30,8 @@ bool hugetlb_disabled = false;
  
  #define hugepd_none(hpd)	(hpd_val(hpd) == 0)
  
-#define PTE_T_ORDER	(__builtin_ffs(sizeof(pte_t)) - __builtin_ffs(sizeof(void *)))

+#define PTE_T_ORDER(__builtin_ffs(sizeof(pte_basic_t)) - \
+__builtin_ffs(sizeof(void *)))
  
  pte_t *huge_pte_offset(struct mm_struct *mm, unsigned long addr, unsigned long sz)

  {
diff --git a/arch/powerpc/mm/pgtable.c b/arch/powerpc/mm/pgtable.c
index e3759b69f81b..7a38eaa6ca72 100644
--- a/arch/powerpc/mm/pgtable.c
+++ b/arch/powerpc/mm/pgtable.c
@@ -100,7 +100,7 @@ static pte_t set_pte_filter_hash(pte_t pte) { return pte; }
   * as we don't have two bits to spare for _PAGE_EXEC and _PAGE_HWEXEC so
   * instead we "filter out" the exec permission for non clean pages.
   */
-static pte_t set_pte_filter(pte_t pte)
+static inline pte_t set_pte_filter(pte_t pte)
  {
struct page *pg;
  
@@ -259,6 +259,23 @@ int huge_ptep_set_access_flags(struct vm_area_struct *vma,

return changed;
  #endif
  }
+
+#if defined(CONFIG_PPC_8xx) && defined(CONFIG_PPC_16K_PAGES)
+void set_huge_pte_at(struct mm_struct *mm, unsigned long addr, pte_t *ptep, 
pte_t pte)
+{
+   /*
+* Make sure hardware valid bit is not set. We don't do
+* tlb flush for this update.
+*/
+   VM_WARN_ON(pte_hw_valid(*ptep) && !pte_protnone(*ptep));
+
+   pte = pte_mkpte(pte);
+
+   pte = set_pte_filter(pte);
+
+   ptep->pte = pte_val(pte);
+}
+#endif
  #endif /* CONFIG_HUGETLB_PAGE */
  
  #ifdef CONFIG_DEBUG_VM