On 4/19/19 8:40 AM, Mathieu Desnoyers wrote:
----- On Apr 19, 2019, at 10:17 AM, shuah [email protected] wrote:

On 4/19/19 7:48 AM, Mathieu Desnoyers wrote:
----- On Apr 19, 2019, at 9:42 AM, Mathieu Desnoyers
[email protected] wrote:

----- On Apr 19, 2019, at 8:55 AM, Mathieu Desnoyers
[email protected] wrote:

----- On Apr 19, 2019, at 8:41 AM, Mathieu Desnoyers
[email protected] wrote:

----- On Apr 19, 2019, at 6:38 AM, Ingo Molnar [email protected] wrote:

* Mathieu Desnoyers <[email protected]> wrote:

Running a test with 200 threads can take a long time on machines
with a small number of CPUs.

Detect the number of online cpus at test runtime, and multiply that
by 6 to have 6 rseq threads per cpu preempting each other.

Signed-off-by: Mathieu Desnoyers <[email protected]>
Cc: Shuah Khan <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Joel Fernandes <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Dave Watson <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: [email protected]
Cc: "H . Peter Anvin" <[email protected]>
Cc: Chris Lameter <[email protected]>
Cc: Russell King <[email protected]>
Cc: Michael Kerrisk <[email protected]>
Cc: "Paul E . McKenney" <[email protected]>
Cc: Paul Turner <[email protected]>
Cc: Boqun Feng <[email protected]>
Cc: Josh Triplett <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Ben Maurer <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Linus Torvalds <[email protected]>
---
   tools/testing/selftests/rseq/run_param_test.sh | 7 +++++--
   1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/tools/testing/selftests/rseq/run_param_test.sh b/tools/testing/selftests/rseq/run_param_test.sh
index 3acd6d75ff9f..e426304fd4a0 100755
--- a/tools/testing/selftests/rseq/run_param_test.sh
+++ b/tools/testing/selftests/rseq/run_param_test.sh
@@ -1,6 +1,8 @@
   #!/bin/bash
   # SPDX-License-Identifier: GPL-2.0+ or MIT
+NR_CPUS=`grep '^processor' /proc/cpuinfo | wc -l`
+
   EXTRA_ARGS=${@}
   OLDIFS="$IFS"
@@ -28,15 +30,16 @@ IFS="$OLDIFS"
   REPS=1000
   SLOW_REPS=100
+NR_THREADS=$((6*${NR_CPUS}))
   function do_tests()
   {
        local i=0
        while [ "$i" -lt "${#TEST_LIST[@]}" ]; do
                echo "Running test ${TEST_NAME[$i]}"
-               ./param_test ${TEST_LIST[$i]} -r ${REPS} ${@} ${EXTRA_ARGS} || exit 1
+               ./param_test ${TEST_LIST[$i]} -r ${REPS} -t ${NR_THREADS} ${@} ${EXTRA_ARGS} || exit 1
                echo "Running compare-twice test ${TEST_NAME[$i]}"
-               ./param_test_compare_twice ${TEST_LIST[$i]} -r ${REPS} ${@} ${EXTRA_ARGS} || exit 1
+               ./param_test_compare_twice ${TEST_LIST[$i]} -r ${REPS} -t ${NR_THREADS} ${@} ${EXTRA_ARGS} || exit 1
                let "i++"
        done
   }
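
As a side note, the thread-count computation introduced by the patch
above can be sketched on its own. This is only a minimal sketch and
not part of the patch: it assumes `getconf _NPROCESSORS_ONLN` is
available and otherwise falls back to counting /proc/cpuinfo entries
as the patch does. The helper name nr_online_cpus, the
THREADS_PER_CPU variable and the floor of one CPU are hypothetical.

```shell
#!/bin/bash
# Hypothetical helper (not in the patch): print the number of online CPUs.
nr_online_cpus()
{
	local n
	# Assumption: getconf exists on the test machine; errors are ignored.
	n=$(getconf _NPROCESSORS_ONLN 2>/dev/null)
	case "$n" in
	''|*[!0-9]*)
		# Fall back to the method used in the patch.
		n=$(grep -c '^processor' /proc/cpuinfo 2>/dev/null)
		;;
	esac
	# Assumption: never report fewer than one CPU.
	case "$n" in
	''|*[!0-9]*|0) n=1 ;;
	esac
	echo "$n"
}

# 6 rseq threads per CPU, as described in the commit message.
THREADS_PER_CPU=6
NR_CPUS=$(nr_online_cpus)
NR_THREADS=$((THREADS_PER_CPU * NR_CPUS))
echo "cpus=${NR_CPUS} threads=${NR_THREADS}"
```

The resulting NR_THREADS value would then be passed to the test
binaries with -t, as in the hunk above.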

BTW., when trying to build the rseq self-tests I get this build failure:

   dagon:~/tip/tools/testing/selftests/rseq> make
   gcc -O2 -Wall -g -I./ -I../../../../usr/include/ -L./ -Wl,-rpath=./ -shared -fPIC rseq.c -lpthread -o /home/mingo/tip/tools/testing/selftests/rseq/librseq.so
   gcc -O2 -Wall -g -I./ -I../../../../usr/include/ -L./ -Wl,-rpath=./ basic_test.c -lpthread -lrseq -o /home/mingo/tip/tools/testing/selftests/rseq/basic_test
   gcc -O2 -Wall -g -I./ -I../../../../usr/include/ -L./ -Wl,-rpath=./ basic_percpu_ops_test.c -lpthread -lrseq -o /home/mingo/tip/tools/testing/selftests/rseq/basic_percpu_ops_test
   /usr/bin/ld: /tmp/ccuHTWnZ.o: in function `rseq_cmpeqv_storev':
   /home/mingo/tip/tools/testing/selftests/rseq/./rseq-x86.h:84: undefined reference to `.L8'
   /usr/bin/ld: /home/mingo/tip/tools/testing/selftests/rseq/./rseq-x86.h:84: undefined reference to `.L49'
   /usr/bin/ld: /tmp/ccuHTWnZ.o: in function `rseq_cmpnev_storeoffp_load':
   /home/mingo/tip/tools/testing/selftests/rseq/./rseq-x86.h:141: undefined reference to `.L57'
   /usr/bin/ld: /tmp/ccuHTWnZ.o:(__rseq_failure+0x8): undefined reference to `.L8'
   /usr/bin/ld: /tmp/ccuHTWnZ.o:(__rseq_failure+0x14): undefined reference to `.L49'
   /usr/bin/ld: /tmp/ccuHTWnZ.o:(__rseq_failure+0x20): undefined reference to `.L55'
   collect2: error: ld returned 1 exit status
   make: *** [Makefile:22: /home/mingo/tip/tools/testing/selftests/rseq/basic_percpu_ops_test] Error 1

Is this a known problem, or am I perhaps missing something in my build
environment? Vanilla 64-bit Ubuntu 18.10 (Cosmic).

It works fine with gcc-7 (gcc version 7.3.0 (Ubuntu 7.3.0-16ubuntu3))
but indeed I get the same failure with gcc-8 (gcc version 8.0.1 20180414
(experimental) [trunk revision 259383] (Ubuntu 8-20180414-1ubuntu2)).

Thanks for reporting! I will investigate.

It looks like gcc-8 optimizes away the targets of asm goto labels when
there is more than one of them on x86-64. I'll try to come up with
a simpler reproducer.

It appears to be related to gcc-8 mishandling the combination of
asm goto and thread-local storage input operands on x86-64.
Here is a simple reproducer:

__thread int var;
static int fct(void)
{
         asm goto (      "jmp %l[testlabel]\n\t"
                         : : [var] "m" (var) : : testlabel);
         return 0;
testlabel:

FWIW, if I add an empty

     asm volatile ("");

here after the label, gcc-8 -O2 builds "something" which is
bogus assembly (an endless loop):

main:
.LFB24:
          .cfi_startproc
.L2:
          subq    $8, %rsp
          .cfi_def_cfa_offset 16
#APP
# 6 "test-asm-goto.c" 1
          jmp .L2

# 0 "" 2
#NO_APP
          movl    %fs:var@tpoff, %edx
          leaq    .LC0(%rip), %rsi
          movl    $1, %edi
          xorl    %eax, %eax
          call    __printf_chk@PLT
          xorl    %eax, %eax
          addq    $8, %rsp
          .cfi_def_cfa_offset 8
          ret
          .cfi_endproc

Thoughts?


Didn't see problems when I tested it before applying it to
linux-kselftest next.

I have gcc version 7.3.0 (Ubuntu 7.3.0-27ubuntu1~18.04)

It really appears to be an optimization bug in gcc-8. Considering that
bogus compilers are released in the wild, we can hardly justify using
the compiler feature that triggers the bogus behavior, even if it gets
fixed in the future.

I've prepared a patch that changes the way the __rseq_abi fields are
passed to the inline asm. I pass the address of the __rseq_abi TLS
variable as a register input operand rather than each individual field
as an "m" operand.

I will submit it in a separate thread.

By the way, it affects both x86-32 (building with gcc-8 -m32) and x86-64.


Should I drop this patch that is currently in linux-kselftest next? Just
confirming if your new patch is supposed to be applied on top of this
one or not?

thanks,
-- Shuah
