Re: [PATCH] powerpc/vdso: Separate vvar vma from vdso

2021-03-27 Thread Dmitry Safonov
Hi Christophe,

On 3/27/21 5:19 PM, Christophe Leroy wrote:
[..]
>> I opportunistically Cc stable on it: I understand that usually such
>> stuff isn't a stable material, but that will allow us in CRIU have
>> one workaround less that is needed just for one release (v5.11) on
>> one platform (ppc64), which we otherwise have to maintain.
> 
> Why is that a workaround, and why for one release only ? I think the
> solution proposed by Laurentto use the aux vector AT_SYSINFO_EHDR should
> work with any past and future release.

Yeah, I guess.
Previously, (before v5.11/power) all kernels had ELF start at "[vdso]"
VMA start, now we'll have to carry the offset in the VMA. Probably, not
the worst thing, but as it will be only for v5.11 release it can break,
so needs separate testing.
Kinda life was a bit easier without this additional code.

>> I wouldn't go as far as to say that the commit 511157ab641e is ABI
>> regression as no other userspace got broken, but I'd really appreciate
>> if it gets backported to v5.11 after v5.12 is released, so as not
>> to complicate already non-simple CRIU-vdso code. Thanks!
>>
>> Cc: Andrei Vagin 
>> Cc: Andy Lutomirski 
>> Cc: Benjamin Herrenschmidt 
>> Cc: Christophe Leroy 
>> Cc: Laurent Dufour 
>> Cc: Michael Ellerman 
>> Cc: Paul Mackerras 
>> Cc: linuxppc-dev@lists.ozlabs.org
>> Cc: sta...@vger.kernel.org # v5.11
>> [1]: https://github.com/checkpoint-restore/criu/issues/1417
>> Signed-off-by: Dmitry Safonov 
>> Tested-by: Christophe Leroy 
> 
> I tested it with sifreturn_vdso selftest and it worked, because that
> selftest doesn't involve VDSO data.

Thanks again on helping with testing it, I appreciate it!

> But if I do a mremap() on the VDSO text vma without remapping VVAR to
> keep the same distance between the two vmas, gettimeofday() crashes. The
> reason is that the code obtains the address of the data by calculating a
> fix difference from its own address with the below macro, the delta
> being resolved at link time:
> 
> .macro get_datapage ptr
> bcl    20, 31, .+4
> 999:
> mflr    \ptr
> #if CONFIG_PPC_PAGE_SHIFT > 14
> addis    \ptr, \ptr, (_vdso_datapage - 999b)@ha
> #endif
> addi    \ptr, \ptr, (_vdso_datapage - 999b)@l
> .endm
> 
> So the datapage needs to remain at the same distance from the code at
> all time.
> 
> Wondering how the other architectures do to have two independent VMAs
> and be able to move one independently of the other.

It's alright as far as I know. If userspace remaps vdso/vvar it should
be aware of this (CRIU keeps this in mind, also old vdso image is dumped
to compare on restore with the one that the host has).

Thanks,
  Dmitry


[PATCH] powerpc: Fix HAVE_HARDLOCKUP_DETECTOR_ARCH build configuration

2021-03-27 Thread Chen Huang
When compiling the powerpc with the SMP disabled, it shows the issue:

arch/powerpc/kernel/watchdog.c: In function ‘watchdog_smp_panic’:
arch/powerpc/kernel/watchdog.c:177:4: error: implicit declaration of function 
‘smp_send_nmi_ipi’; did you mean ‘smp_send_stop’? 
[-Werror=implicit-function-declaration]
  177 |smp_send_nmi_ipi(c, wd_lockup_ipi, 100);
  |^~~~
  |smp_send_stop
cc1: all warnings being treated as errors
make[2]: *** [scripts/Makefile.build:273: arch/powerpc/kernel/watchdog.o] Error 
1
make[1]: *** [scripts/Makefile.build:534: arch/powerpc/kernel] Error 2
make: *** [Makefile:1980: arch/powerpc] Error 2
make: *** Waiting for unfinished jobs

We found that powerpc used ipi to implement hardlockup watchdog, so the
HAVE_HARDLOCKUP_DETECTOR_ARCH should depend on the SMP.

Fixes: 2104180a5369 ("powerpc/64s: implement arch-specific hardlockup watchdog")
Reported-by: Hulk Robot 
Signed-off-by: Chen Huang 
---
 arch/powerpc/Kconfig | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 764df010baee..2d4f37b117ce 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -225,7 +225,7 @@ config PPC
select HAVE_LIVEPATCH   if HAVE_DYNAMIC_FTRACE_WITH_REGS
select HAVE_MOD_ARCH_SPECIFIC
select HAVE_NMI if PERF_EVENTS || (PPC64 && 
PPC_BOOK3S)
-   select HAVE_HARDLOCKUP_DETECTOR_ARCHif (PPC64 && PPC_BOOK3S)
+   select HAVE_HARDLOCKUP_DETECTOR_ARCHif PPC64 && PPC_BOOK3S && SMP
select HAVE_OPTPROBES   if PPC64
select HAVE_PERF_EVENTS
select HAVE_PERF_EVENTS_NMI if PPC64
-- 
2.17.1



Re: [PATCH 2/4] exec: remove compat_do_execve

2021-03-27 Thread Sergei Shtylyov
On 3/26/21 5:38 PM, Christoph Hellwig wrote:

> Just call compat_do_execve instead.

   compat_do_execveat(), maybe?

> Signed-off-by: Christoph Hellwig 
> ---
>  fs/exec.c | 17 +
>  1 file changed, 1 insertion(+), 16 deletions(-)
> 
> diff --git a/fs/exec.c b/fs/exec.c
> index b63fb020909075..06e07278b456fa 100644
> --- a/fs/exec.c
> +++ b/fs/exec.c
[...]
> @@ -2072,7 +2057,7 @@ COMPAT_SYSCALL_DEFINE3(execve, const char __user *, 
> filename,
>   const compat_uptr_t __user *, argv,
>   const compat_uptr_t __user *, envp)
>  {
> - return compat_do_execve(getname(filename), argv, envp);
> + return compat_do_execveat(AT_FDCWD, getname(filename), argv, envp, 0);
>  }
>  
>  COMPAT_SYSCALL_DEFINE5(execveat, int, fd,

MBR, Sergei


Re: [PATCH v3 37/41] powerpc/32s: Move KUEP locking/unlocking in C

2021-03-27 Thread Christophe Leroy




Le 12/03/2021 à 13:50, Christophe Leroy a écrit :

This can be done in C, do it.

Unrolling the loop gains approx. 15% performance.

 From now on, prepare_transfer_to_handler() is only for
interrupts from kernel.

Signed-off-by: Christophe Leroy 
---
  arch/powerpc/include/asm/book3s/32/kup.h | 31 ---
  arch/powerpc/include/asm/interrupt.h |  6 +++-
  arch/powerpc/include/asm/kup.h   |  8 +
  arch/powerpc/kernel/entry_32.S   | 16 --
  arch/powerpc/kernel/head_32.h|  3 ++
  arch/powerpc/kernel/head_booke.h |  3 ++
  arch/powerpc/kernel/interrupt.c  |  4 +++
  arch/powerpc/mm/book3s32/Makefile|  1 +
  arch/powerpc/mm/book3s32/kuep.c  | 38 
  9 files changed, 62 insertions(+), 48 deletions(-)
  create mode 100644 arch/powerpc/mm/book3s32/kuep.c




diff --git a/arch/powerpc/mm/book3s32/kuep.c b/arch/powerpc/mm/book3s32/kuep.c
new file mode 100644
index ..c70532568a28
--- /dev/null
+++ b/arch/powerpc/mm/book3s32/kuep.c
@@ -0,0 +1,38 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+#include 
+#include 
+#include 


Need to add  to make Robot happy (missing prototypes of kuep_lock() 
and kuep_unlock())


+
+#define KUEP_UPDATE_TWO_USER_SEGMENTS(n) do {  \
+   if (TASK_SIZE > ((n) << 28))   \
+   mtsr(val1, (n) << 28);\
+   if (TASK_SIZE > (((n) + 1) << 28)) \
+   mtsr(val2, ((n) + 1) << 28);  \
+   val1 = (val1 + 0x222) & 0xf0ff; \
+   val2 = (val2 + 0x222) & 0xf0ff; \
+} while (0)
+
+static __always_inline void kuep_update(u32 val)
+{
+   int val1 = val;
+   int val2 = (val + 0x111) & 0xf0ff;
+
+   KUEP_UPDATE_TWO_USER_SEGMENTS(0);
+   KUEP_UPDATE_TWO_USER_SEGMENTS(2);
+   KUEP_UPDATE_TWO_USER_SEGMENTS(4);
+   KUEP_UPDATE_TWO_USER_SEGMENTS(6);
+   KUEP_UPDATE_TWO_USER_SEGMENTS(8);
+   KUEP_UPDATE_TWO_USER_SEGMENTS(10);
+   KUEP_UPDATE_TWO_USER_SEGMENTS(12);
+   KUEP_UPDATE_TWO_USER_SEGMENTS(14);
+}
+
+void kuep_lock(void)
+{
+   kuep_update(mfsr(0) | SR_NX);
+}
+
+void kuep_unlock(void)
+{
+   kuep_update(mfsr(0) & ~SR_NX);
+}



Re: [PATCH] powerpc/vdso: Separate vvar vma from vdso

2021-03-27 Thread Christophe Leroy




Le 26/03/2021 à 20:17, Dmitry Safonov a écrit :

Since commit 511157ab641e ("powerpc/vdso: Move vdso datapage up front")
VVAR page is in front of the VDSO area. In result it breaks CRIU
(Checkpoint Restore In Userspace) [1], where CRIU expects that "[vdso]"
from /proc/../maps points at ELF/vdso image, rather than at VVAR data page.
Laurent made a patch to keep CRIU working (by reading aux vector).
But I think it still makes sence to separate two mappings into different
VMAs. It will also make ppc64 less "special" for userspace and as
a side-bonus will make VVAR page un-writable by debugger (which previously
would COW page and can be unexpected).

I opportunistically Cc stable on it: I understand that usually such
stuff isn't a stable material, but that will allow us in CRIU have
one workaround less that is needed just for one release (v5.11) on
one platform (ppc64), which we otherwise have to maintain.


Why is that a workaround, and why for one release only ? I think the solution proposed by Laurentto 
use the aux vector AT_SYSINFO_EHDR should work with any past and future release.



I wouldn't go as far as to say that the commit 511157ab641e is ABI
regression as no other userspace got broken, but I'd really appreciate
if it gets backported to v5.11 after v5.12 is released, so as not
to complicate already non-simple CRIU-vdso code. Thanks!

Cc: Andrei Vagin 
Cc: Andy Lutomirski 
Cc: Benjamin Herrenschmidt 
Cc: Christophe Leroy 
Cc: Laurent Dufour 
Cc: Michael Ellerman 
Cc: Paul Mackerras 
Cc: linuxppc-dev@lists.ozlabs.org
Cc: sta...@vger.kernel.org # v5.11
[1]: https://github.com/checkpoint-restore/criu/issues/1417
Signed-off-by: Dmitry Safonov 
Tested-by: Christophe Leroy 


I tested it with sifreturn_vdso selftest and it worked, because that selftest 
doesn't involve VDSO data.

But if I do a mremap() on the VDSO text vma without remapping VVAR to keep the same distance between 
the two vmas, gettimeofday() crashes. The reason is that the code obtains the address of the data by 
calculating a fix difference from its own address with the below macro, the delta being resolved at 
link time:


.macro get_datapage ptr
bcl 20, 31, .+4
999:
mflr\ptr
#if CONFIG_PPC_PAGE_SHIFT > 14
addis   \ptr, \ptr, (_vdso_datapage - 999b)@ha
#endif
addi\ptr, \ptr, (_vdso_datapage - 999b)@l
.endm

So the datapage needs to remain at the same distance from the code at all time.

Wondering how the other architectures do to have two independant VMAs and be able to move one 
independantly of the other.


Christophe


Re: [PATCH V2 1/5] powerpc/perf: Expose processor pipeline stage cycles using PERF_SAMPLE_WEIGHT_STRUCT

2021-03-27 Thread Michael Ellerman
Arnaldo  writes:
> On March 25, 2021 11:38:01 AM GMT-03:00, Peter Zijlstra 
>  wrote:
>>On Thu, Mar 25, 2021 at 10:01:35AM -0300, Arnaldo Carvalho de Melo
>>wrote:.
>>> > > Also for CPU_FTR_ARCH_31, capture the two cycle counter
>>information in
>>> > > two 16 bit fields of perf_sample_weight structure.
>>> > 
>>> > Changes looks fine to me.
>>> > 
>>> > Reviewed-by: Madhavan Srinivasan 
>>> 
>>> So who will process the kernel bits? I'm merging the tooling parts,
>>
>>I was sorta expecting these to go through the powerpc tree. Let me know
>>if you want them in tip/perf/core instead.
>
> Shouldn't matter by which tree it gets upstream, as long as it gets picked :-)

I plan to take them, just haven't got around to it yet :}

cheers