Re: [RFC PATCH 2/2] powerpc/64s: system call support for scv/rfscv instructions

2020-06-10 Thread Nicholas Piggin
Excerpts from Matheus Castanho's message of May 14, 2020 6:55 am:
> Hi Nicholas,
> 
> Small comment below:
> 
> On 4/30/20 1:02 AM, Nicholas Piggin wrote:
>> Add support for the scv instruction on POWER9 and later CPUs.
>> 
>> For now this implements the zeroth scv vector 'scv 0', as identical
>> to 'sc' system calls, with the exception that lr is not preserved, and
>> it is 64-bit only. There may yet be changes made to this ABI, so it's
>> for testing only.
>> 
>> rfscv is implemented to return from scv type system calls. It can not
>> be used to return from sc system calls because those are defined to
>> preserve lr.
>> 
>> In a comparison of getpid syscall, the test program had scv taking
>> about 3 more cycles in user mode (92 vs 89 for sc), due to lr handling.
>> getpid syscall throughput on POWER9 is improved by 33%, mostly due to
>> reducing mtmsr and mtspr.
>> 
>> Signed-off-by: Nicholas Piggin 
>> ---
>>  Documentation/powerpc/syscall64-abi.rst   |  42 --
> 
> [...]
> 
>> +Return value
>> +
>> +- For the sc instruction, both a return value and a return error code are
>> +  returned. cr0.SO is the return error code, and r3 is the return value or
>> +  error code. When cr0.SO is clear, the syscall succeeded and r3 is the return
>> +  value. When cr0.SO is set, the syscall failed and r3 is the error code that
>> +  generally corresponds to errno.
>> +
>> +- For the scv 0 instruction, there is a return value indicates failure if it
>> +  is >= -MAX_ERRNO (-4095) as an unsigned comparison, in which case it is the
>> +  negated return error code. Otherwise it is the successful return value.
> 
> I believe this last paragraph is a bit confusing (didn't quite get the
> unsigned comparison with negative values). So instead of cr0.SO to
> indicate failure, scv returns the negated error code, and positive in
> case of success?

Yes, it will behave like other major architectures: return values in the
range -4095..-1 indicate an error, with the error value equal to the
negated return value.
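
To illustrate the "unsigned comparison" wording from the patch, here is a
minimal C sketch of how a caller might decode the raw r3 value after
'scv 0'. The helper names scv_ret_is_error() and scv_ret_errno() are
hypothetical and not part of the patch; MAX_ERRNO is defined locally as
4095 to match the text above. This is only one way to express the check,
not the kernel's or libc's actual code.

```c
#include <stdbool.h>

/* Assumption: mirrors the MAX_ERRNO (4095) mentioned in the ABI text. */
#define MAX_ERRNO 4095UL

/*
 * Hypothetical helper: returns true when the raw r3 value encodes an error.
 * Negating an unsigned value wraps around, so -MAX_ERRNO is the smallest
 * bit pattern that reads as -4095 when interpreted as signed. Every value
 * at or above it corresponds to the signed range -4095..-1.
 */
static bool scv_ret_is_error(unsigned long r3)
{
    return r3 >= -MAX_ERRNO;
}

/*
 * Hypothetical helper: recovers the positive error code from a failing
 * return value. The subtraction is done in unsigned arithmetic so the
 * result is well defined before converting to long.
 */
static long scv_ret_errno(unsigned long r3)
{
    return (long)(0UL - r3);
}
```

With these helpers, a raw return of -2 (as unsigned) is reported as an
error with code 2, while 0 or any ordinary positive value is treated as a
successful result.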

I will try to make it a bit clearer.

Thanks,
Nick


Re: [RFC PATCH 2/2] powerpc/64s: system call support for scv/rfscv instructions

2020-05-13 Thread Matheus Castanho
Hi Nicholas,

Small comment below:

On 4/30/20 1:02 AM, Nicholas Piggin wrote:
> Add support for the scv instruction on POWER9 and later CPUs.
> 
> For now this implements the zeroth scv vector 'scv 0', as identical
> to 'sc' system calls, with the exception that lr is not preserved, and
> it is 64-bit only. There may yet be changes made to this ABI, so it's
> for testing only.
> 
> rfscv is implemented to return from scv type system calls. It can not
> be used to return from sc system calls because those are defined to
> preserve lr.
> 
> In a comparison of getpid syscall, the test program had scv taking
> about 3 more cycles in user mode (92 vs 89 for sc), due to lr handling.
> getpid syscall throughput on POWER9 is improved by 33%, mostly due to
> reducing mtmsr and mtspr.
> 
> Signed-off-by: Nicholas Piggin 
> ---
>  Documentation/powerpc/syscall64-abi.rst   |  42 --

[...]

> +Return value
> +
> +- For the sc instruction, both a return value and a return error code are
> +  returned. cr0.SO is the return error code, and r3 is the return value or
> +  error code. When cr0.SO is clear, the syscall succeeded and r3 is the return
> +  value. When cr0.SO is set, the syscall failed and r3 is the error code that
> +  generally corresponds to errno.
> +
> +- For the scv 0 instruction, there is a return value indicates failure if it
> +  is >= -MAX_ERRNO (-4095) as an unsigned comparison, in which case it is the
> +  negated return error code. Otherwise it is the successful return value.

I believe this last paragraph is a bit confusing (didn't quite get the
unsigned comparison with negative values). So instead of cr0.SO to
indicate failure, scv returns the negated error code, and positive in
case of success?

Thanks,
Matheus Castanho


Re: [RFC PATCH 2/2] powerpc/64s: system call support for scv/rfscv instructions

2020-05-05 Thread Nicholas Piggin
Excerpts from Segher Boessenkool's message of May 6, 2020 8:11 am:
> Hi!
> 
> On Thu, Apr 30, 2020 at 02:02:02PM +1000, Nicholas Piggin wrote:
>> Add support for the scv instruction on POWER9 and later CPUs.
> 
> Looks good to me in general :-)

Thanks for taking a look.

>> For now this implements the zeroth scv vector 'scv 0', as identical
>> to 'sc' system calls, with the exception that lr is not preserved, and
>> it is 64-bit only. There may yet be changes made to this ABI, so it's
>> for testing only.
> 
> What does it do with SF=0?  I don't see how it is obviously not a
> security hole currently (but I didn't look too closely).

Oh, that's an outdated comment. I have since decided it is better to keep
all the code common and handle 32-bit compat the same way as the existing
sc syscall.

Thanks,
Nick


Re: [RFC PATCH 2/2] powerpc/64s: system call support for scv/rfscv instructions

2020-05-05 Thread Segher Boessenkool
Hi!

On Thu, Apr 30, 2020 at 02:02:02PM +1000, Nicholas Piggin wrote:
> Add support for the scv instruction on POWER9 and later CPUs.

Looks good to me in general :-)

> For now this implements the zeroth scv vector 'scv 0', as identical
> to 'sc' system calls, with the exception that lr is not preserved, and
> it is 64-bit only. There may yet be changes made to this ABI, so it's
> for testing only.

What does it do with SF=0?  I don't see how it is obviously not a
security hole currently (but I didn't look too closely).


Segher


[RFC PATCH 2/2] powerpc/64s: system call support for scv/rfscv instructions

2020-04-29 Thread Nicholas Piggin
Add support for the scv instruction on POWER9 and later CPUs.

For now this implements the zeroth scv vector 'scv 0', as identical
to 'sc' system calls, with the exception that lr is not preserved, and
it is 64-bit only. There may yet be changes made to this ABI, so it's
for testing only.

rfscv is implemented to return from scv type system calls. It can not
be used to return from sc system calls because those are defined to
preserve lr.

In a comparison of getpid syscall, the test program had scv taking
about 3 more cycles in user mode (92 vs 89 for sc), due to lr handling.
getpid syscall throughput on POWER9 is improved by 33%, mostly due to
reducing mtmsr and mtspr.

Signed-off-by: Nicholas Piggin 
---
 Documentation/powerpc/syscall64-abi.rst   |  42 --
 arch/powerpc/include/asm/asm-prototypes.h |   2 +-
 arch/powerpc/include/asm/exception-64s.h  |   6 +
 arch/powerpc/include/asm/head-64.h        |   2 +-
 arch/powerpc/include/asm/ppc-opcode.h     |   2 +
 arch/powerpc/include/asm/ppc_asm.h        |   2 +
 arch/powerpc/include/asm/processor.h      |   2 +-
 arch/powerpc/include/asm/ptrace.h         |   8 +-
 arch/powerpc/include/asm/setup.h          |   4 +-
 arch/powerpc/include/asm/sstep.h          |   1 +
 arch/powerpc/include/asm/vdso.h           |   1 +
 arch/powerpc/kernel/cpu_setup_power.S     |   2 +-
 arch/powerpc/kernel/cputable.c            |   3 +-
 arch/powerpc/kernel/dt_cpu_ftrs.c         |   1 +
 arch/powerpc/kernel/entry_64.S            | 158 +-
 arch/powerpc/kernel/exceptions-64s.S      | 123 -
 arch/powerpc/kernel/process.c             |  10 +-
 arch/powerpc/kernel/setup_64.c            |   5 +-
 arch/powerpc/kernel/signal.c              |  19 ++-
 arch/powerpc/kernel/signal_64.c           |  28 +++-
 arch/powerpc/kernel/syscall_64.c          |  32 +++--
 arch/powerpc/kernel/vdso.c                |   2 +
 arch/powerpc/kernel/vdso64/sigtramp.S     |  34 -
 arch/powerpc/kernel/vdso64/vdso64.lds.S   |   1 +
 arch/powerpc/lib/sstep.c                  |  14 ++
 arch/powerpc/perf/callchain_64.c          |   9 +-
 arch/powerpc/platforms/pseries/setup.c    |   8 +-
 arch/powerpc/xmon/xmon.c                  |   1 +
 28 files changed, 468 insertions(+), 54 deletions(-)

diff --git a/Documentation/powerpc/syscall64-abi.rst b/Documentation/powerpc/syscall64-abi.rst
index e49f69f941b9..6f311ad37211 100644
--- a/Documentation/powerpc/syscall64-abi.rst
+++ b/Documentation/powerpc/syscall64-abi.rst
@@ -5,6 +5,15 @@ Power Architecture 64-bit Linux system call ABI
 syscall
 =======
 
+Invocation
+----------
+The syscall is made with the sc instruction, and returns with execution
+continuing at the instruction following the sc instruction.
+
+If PPC_FEATURE2_SCV appears in the AT_HWCAP2 ELF auxiliary vector, the
+scv 0 instruction is an alternative that may provide better performance,
+with some differences to calling sequence.
+
 syscall calling sequence\ [1]_ matches the Power Architecture 64-bit ELF ABI
 specification C function calling sequence, including register preservation
 rules, with the following differences.
@@ -12,16 +21,23 @@ rules, with the following differences.
 .. [1] Some syscalls (typically low-level management functions) may have
different calling sequences (e.g., rt_sigreturn).
 
-Parameters and return value

+Parameters
+----------
 The system call number is specified in r0.
 
 There is a maximum of 6 integer parameters to a syscall, passed in r3-r8.
 
-Both a return value and a return error code are returned. cr0.SO is the return
-error code, and r3 is the return value or error code. When cr0.SO is clear,
-the syscall succeeded and r3 is the return value. When cr0.SO is set, the
-syscall failed and r3 is the error code that generally corresponds to errno.
+Return value
+
+- For the sc instruction, both a return value and a return error code are
+  returned. cr0.SO is the return error code, and r3 is the return value or
+  error code. When cr0.SO is clear, the syscall succeeded and r3 is the return
+  value. When cr0.SO is set, the syscall failed and r3 is the error code that
+  generally corresponds to errno.
+
+- For the scv 0 instruction, there is a return value indicates failure if it
+  is >= -MAX_ERRNO (-4095) as an unsigned comparison, in which case it is the
+  negated return error code. Otherwise it is the successful return value.
 
 Stack
 -
@@ -34,22 +50,23 @@ Register preservation rules match the ELF ABI calling sequence with the
 following differences:
 
 === = 
+--- For the sc instruction ---
 r0  Volatile  (System call number.)
 r3  Volatile  (Parameter 1, and return value.)
 r4-r8   Volatile  (Parameters 2-6.)
-cr0 Volatile  (cr0.SO is the return error condition)
+cr0 Volatile  (cr0.SO is the return error condition.)
 cr1, cr5-7  Nonvolatile
 lr