[Bug testsuite/66005] libgomp make check time is excessive

2023-06-28 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66005

--- Comment #30 from CVS Commits  ---
The releases/gcc-11 branch has been updated by Thomas Schwinge:

https://gcc.gnu.org/g:4506b349cf527834239554a03e43ae45237b315c

commit r11-10880-g4506b349cf527834239554a03e43ae45237b315c
Author: Thomas Schwinge 
Date:   Tue Apr 25 23:53:12 2023 +0200

Support parallel testing in libgomp, part II [PR66005]

..., and enable if 'flock' is available for serializing execution testing.

Regarding the default of 19 parallel slots, this turned out to be a local
minimum for wall time when testing this on:

$ uname -srvi
Linux 4.2.0-42-generic #49~14.04.1-Ubuntu SMP Wed Jun 29 20:22:11 UTC
2016 x86_64
$ grep '^model name' < /proc/cpuinfo | uniq -c
 32 model name  : Intel(R) Xeon(R) CPU E5-2640 v3 @ 2.60GHz

... in two configurations: case (a) standard configuration, no offloading
configured, case (b) offloading for GCN and nvptx configured but no devices
available.  For both cases, default plus '-m32' variant.

$ \time make check-target-libgomp
RUNTESTFLAGS="--target_board=unix\{,-m32\}"

Case (a), baseline:

6432.23user 332.38system 47:32.28elapsed 237%CPU (0avgtext+0avgdata
505044maxresident)k
6382.43user 319.21system 47:06.04elapsed 237%CPU (0avgtext+0avgdata
505172maxresident)k

This is what people have been complaining about, rightly so, in PR66005
"libgomp make check time is excessive" and elsewhere.

Case (a), parallelized:

-j12 GCC_TEST_PARALLEL_SLOTS=10
3088.49user 267.74system 6:43.82elapsed 831%CPU (0avgtext+0avgdata
505188maxresident)k
-j15 GCC_TEST_PARALLEL_SLOTS=15
3308.08user 294.79system 5:56.04elapsed 1011%CPU (0avgtext+0avgdata
505360maxresident)k
-j17 GCC_TEST_PARALLEL_SLOTS=17
3539.93user 298.99system 5:27.86elapsed 1170%CPU (0avgtext+0avgdata
505112maxresident)k
-j18 GCC_TEST_PARALLEL_SLOTS=18
3697.50user 317.18system 5:14.63elapsed 1275%CPU (0avgtext+0avgdata
505360maxresident)k
-j19 GCC_TEST_PARALLEL_SLOTS=19
3765.94user 324.27system 5:13.22elapsed 1305%CPU (0avgtext+0avgdata
505128maxresident)k
-j20 GCC_TEST_PARALLEL_SLOTS=20
3684.66user 312.32system 5:15.26elapsed 1267%CPU (0avgtext+0avgdata
505100maxresident)k
-j23 GCC_TEST_PARALLEL_SLOTS=23
4040.59user 347.10system 5:29.12elapsed 1333%CPU (0avgtext+0avgdata
505200maxresident)k
-j26 GCC_TEST_PARALLEL_SLOTS=26
3973.24user 377.96system 5:24.70elapsed 1340%CPU (0avgtext+0avgdata
505160maxresident)k
-j32 GCC_TEST_PARALLEL_SLOTS=32
4004.42user 346.10system 5:16.11elapsed 1376%CPU (0avgtext+0avgdata
505160maxresident)k

Yay!

Case (b), baseline; 2+ h:

7227.58user 700.54system 2:14:33elapsed 98%CPU (0avgtext+0avgdata
994264maxresident)k

Case (b), parallelized:

-j12 GCC_TEST_PARALLEL_SLOTS=10
7377.46user 777.52system 16:06.63elapsed 843%CPU (0avgtext+0avgdata
994344maxresident)k
-j15 GCC_TEST_PARALLEL_SLOTS=15
8019.18user 721.42system 12:13.56elapsed 1191%CPU (0avgtext+0avgdata
994228maxresident)k
-j17 GCC_TEST_PARALLEL_SLOTS=17
8530.11user 716.95system 10:45.92elapsed 1431%CPU (0avgtext+0avgdata
994176maxresident)k
-j18 GCC_TEST_PARALLEL_SLOTS=18
8776.79user 645.89system 10:27.20elapsed 1502%CPU (0avgtext+0avgdata
994248maxresident)k
-j19 GCC_TEST_PARALLEL_SLOTS=19
9332.37user 641.76system 10:15.09elapsed 1621%CPU (0avgtext+0avgdata
994260maxresident)k
-j20 GCC_TEST_PARALLEL_SLOTS=20
9609.54user 789.88system 10:26.94elapsed 1658%CPU (0avgtext+0avgdata
994284maxresident)k
-j23 GCC_TEST_PARALLEL_SLOTS=23
10362.40user 911.14system 10:44.47elapsed 1749%CPU (0avgtext+0avgdata
994208maxresident)k
-j26 GCC_TEST_PARALLEL_SLOTS=26
11159.44user 850.99system 11:09.25elapsed 1794%CPU (0avgtext+0avgdata
994256maxresident)k
-j32 GCC_TEST_PARALLEL_SLOTS=32
11453.50user 939.52system 11:00.38elapsed 1876%CPU (0avgtext+0avgdata
994240maxresident)k
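
The "local minimum at 19 slots" claim can be checked directly from the wall times listed above. The following is only a minimal sketch, not part of the commit: the helper names (elapsed_seconds, parse_elapsed, best_slots) are made up for illustration, and the figures are copied by hand from the case (a) and case (b) listings.

```python
import re


def elapsed_seconds(field):
    """Convert a GNU time elapsed field ('5:13.22' or '2:14:33') to seconds."""
    seconds = 0.0
    for part in field.split(":"):
        seconds = seconds * 60 + float(part)
    return seconds


def parse_elapsed(line):
    """Extract the elapsed wall time, in seconds, from one line of 'time' output."""
    match = re.search(r"([\d:.]+)elapsed", line)
    return elapsed_seconds(match.group(1))


# Wall times copied from the listings above, keyed by GCC_TEST_PARALLEL_SLOTS.
CASE_A = {10: "6:43.82", 15: "5:56.04", 17: "5:27.86", 18: "5:14.63",
          19: "5:13.22", 20: "5:15.26", 23: "5:29.12", 26: "5:24.70",
          32: "5:16.11"}
CASE_B = {10: "16:06.63", 15: "12:13.56", 17: "10:45.92", 18: "10:27.20",
          19: "10:15.09", 20: "10:26.94", 23: "10:44.47", 26: "11:09.25",
          32: "11:00.38"}


def best_slots(runs):
    """Return the slot count with the smallest wall time."""
    return min(runs, key=lambda slots: elapsed_seconds(runs[slots]))


baseline_a = parse_elapsed("6432.23user 332.38system 47:32.28elapsed 237%CPU")
baseline_b = parse_elapsed("7227.58user 700.54system 2:14:33elapsed 98%CPU")

for name, runs, baseline in (("a", CASE_A, baseline_a),
                             ("b", CASE_B, baseline_b)):
    slots = best_slots(runs)
    speedup = baseline / elapsed_seconds(runs[slots])
    print(f"case ({name}): fastest at {slots} slots, "
          f"{speedup:.1f}x faster than baseline")
```

Note that these are single runs per setting, so the differences between 18, 19 and 20 slots may well be within measurement noise.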

On my Dell Precision 7530 laptop:

$ uname -srvi
Linux 5.15.0-71-generic #78-Ubuntu SMP Tue Apr 18 09:00:29 UTC 2023
x86_64
$ grep '^model name' < /proc/cpuinfo | uniq -c
 12 model name  : Intel(R) Core(TM) i7-8850H CPU @ 2.60GHz
$ nvidia-smi -L
GPU 0: Quadro P1000 (UUID: GPU-e043973b-b52a-d02b-c066-a8fdbf64e8ea)

... in two configurations: case (c) standard configuration, no offloading
configured, case (d) offloading for nvptx configured and device available.
For both cases, only default variant, no '-m32'.

$ \time make check-target-libgomp

Case (c), baseline; roughly half of case (a) (just one variant):

1180.98user 110.80system 19:36.40elapsed 109%CPU (0avgtext+0avgdata
505148maxresident)k

[Bug testsuite/66005] libgomp make check time is excessive

2023-06-28 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66005

--- Comment #31 from CVS Commits  ---
The releases/gcc-11 branch has been updated by Thomas Schwinge:

https://gcc.gnu.org/g:91955e374e07dc8ee9111eeb49c137c5582ed674

commit r11-10881-g91955e374e07dc8ee9111eeb49c137c5582ed674
Author: Thomas Schwinge 
Date:   Mon May 15 20:00:07 2023 +0200

Support parallel testing in libgomp: fallback Perl 'flock' [PR66005]

Follow-up to commit 6c3b30ef9e0578509bdaf59c13da4a212fe6c2ba
"Support parallel testing in libgomp, part II [PR66005]"
("..., and enable if 'flock' is available for serializing execution
testing"),
where we saw:

> On my Dell Precision 7530 laptop:
>
> $ uname -srvi
> Linux 5.15.0-71-generic #78-Ubuntu SMP Tue Apr 18 09:00:29 UTC 2023 x86_64
> $ grep '^model name' < /proc/cpuinfo | uniq -c
>  12 model name  : Intel(R) Core(TM) i7-8850H CPU @ 2.60GHz
> $ nvidia-smi -L
> GPU 0: Quadro P1000 (UUID: GPU-e043973b-b52a-d02b-c066-a8fdbf64e8ea)
>
> ... [...]: case (c) standard configuration, no offloading
> configured, [...]

> $ \time make check-target-libgomp
>
> Case (c), baseline; [...]:
>
> 1180.98user 110.80system 19:36.40elapsed 109%CPU (0avgtext+0avgdata 505148maxresident)k
> 1133.22user 111.08system 19:35.75elapsed 105%CPU (0avgtext+0avgdata 505212maxresident)k
>
> Case (c), parallelized [using 'flock']:
>
> [...]
> -j12 GCC_TEST_PARALLEL_SLOTS=12
> 2591.04user 192.64system 4:44.98elapsed 976%CPU (0avgtext+0avgdata 505216maxresident)k
> 2581.23user 195.21system 4:47.51elapsed 965%CPU (0avgtext+0avgdata 505212maxresident)k

Quite the same when instead of 'flock' using this fallback Perl 'flock':

2565.23user 194.35system 4:46.77elapsed 962%CPU (0avgtext+0avgdata
505216maxresident)k
2549.38user 200.20system 4:46.08elapsed 961%CPU (0avgtext+0avgdata
505216maxresident)k

PR testsuite/66005
gcc/
* doc/install.texi: Document (optional) Perl usage for parallel
testing of libgomp.
libgomp/
* testsuite/lib/libgomp.exp: 'flock' through stdout.
* testsuite/flock: New.
* configure.ac (FLOCK): Point to that if no 'flock' available, but
'perl' is.
* configure: Regenerate.

(cherry picked from commit 04abe1944d30eb18a2060cfcd9695d085f7b4752)

[Bug testsuite/66005] libgomp make check time is excessive

2023-06-28 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66005

--- Comment #29 from CVS Commits  ---
The releases/gcc-11 branch has been updated by Thomas Schwinge:

https://gcc.gnu.org/g:e1bd4f5434d7989d723188e9f2b524ce234bc44d

commit r11-10879-ge1bd4f5434d7989d723188e9f2b524ce234bc44d
Author: Rainer Orth 
Date:   Thu May 7 13:26:57 2015 +0200

Support parallel testing in libgomp, part I [PR66005]

..., while still hard-coding the number of parallel slots to one.

PR testsuite/66005
libgomp/
* testsuite/Makefile.am (PWD_COMMAND): New variable.
(%/site.exp): New target.
(check_p_numbers0, check_p_numbers1, check_p_numbers2)
(check_p_numbers3, check_p_numbers4, check_p_numbers5)
(check_p_numbers6, check_p_numbers, gcc_test_parallel_slots)
(check_p_subdirs)
(check_DEJAGNU_libgomp_targets): New variables.
($(check_DEJAGNU_libgomp_targets)): New target.
($(check_DEJAGNU_libgomp_targets)): New dependency.
(check-DEJAGNU $(check_DEJAGNU_libgomp_targets)): New targets.
* testsuite/Makefile.in: Regenerate.
* testsuite/lib/libgomp.exp: For parallel testing,
'load_file ../libgomp-test-support.exp'.

Co-authored-by: Thomas Schwinge 
(cherry picked from commit e797db5c744f7b4e110f23a495fca8e6b8aebe83)

[Bug testsuite/66005] libgomp make check time is excessive

2023-06-28 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66005

--- Comment #28 from CVS Commits  ---
The releases/gcc-12 branch has been updated by Thomas Schwinge:

https://gcc.gnu.org/g:b4561b782427cdfe0fac1a869e79a49187817ffe

commit r12-9739-gb4561b782427cdfe0fac1a869e79a49187817ffe
Author: Thomas Schwinge 
Date:   Mon May 15 20:00:07 2023 +0200

Support parallel testing in libgomp: fallback Perl 'flock' [PR66005]

Follow-up to commit 6c3b30ef9e0578509bdaf59c13da4a212fe6c2ba
"Support parallel testing in libgomp, part II [PR66005]"
("..., and enable if 'flock' is available for serializing execution
testing"),
where we saw:

> On my Dell Precision 7530 laptop:
>
> $ uname -srvi
> Linux 5.15.0-71-generic #78-Ubuntu SMP Tue Apr 18 09:00:29 UTC 2023 x86_64
> $ grep '^model name' < /proc/cpuinfo | uniq -c
>  12 model name  : Intel(R) Core(TM) i7-8850H CPU @ 2.60GHz
> $ nvidia-smi -L
> GPU 0: Quadro P1000 (UUID: GPU-e043973b-b52a-d02b-c066-a8fdbf64e8ea)
>
> ... [...]: case (c) standard configuration, no offloading
> configured, [...]

> $ \time make check-target-libgomp
>
> Case (c), baseline; [...]:
>
> 1180.98user 110.80system 19:36.40elapsed 109%CPU (0avgtext+0avgdata 505148maxresident)k
> 1133.22user 111.08system 19:35.75elapsed 105%CPU (0avgtext+0avgdata 505212maxresident)k
>
> Case (c), parallelized [using 'flock']:
>
> [...]
> -j12 GCC_TEST_PARALLEL_SLOTS=12
> 2591.04user 192.64system 4:44.98elapsed 976%CPU (0avgtext+0avgdata 505216maxresident)k
> 2581.23user 195.21system 4:47.51elapsed 965%CPU (0avgtext+0avgdata 505212maxresident)k

Quite the same when instead of 'flock' using this fallback Perl 'flock':

2565.23user 194.35system 4:46.77elapsed 962%CPU (0avgtext+0avgdata
505216maxresident)k
2549.38user 200.20system 4:46.08elapsed 961%CPU (0avgtext+0avgdata
505216maxresident)k

PR testsuite/66005
gcc/
* doc/install.texi: Document (optional) Perl usage for parallel
testing of libgomp.
libgomp/
* testsuite/lib/libgomp.exp: 'flock' through stdout.
* testsuite/flock: New.
* configure.ac (FLOCK): Point to that if no 'flock' available, but
'perl' is.
* configure: Regenerate.

(cherry picked from commit 04abe1944d30eb18a2060cfcd9695d085f7b4752)

[Bug testsuite/66005] libgomp make check time is excessive

2023-06-28 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66005

--- Comment #27 from CVS Commits  ---
The releases/gcc-12 branch has been updated by Thomas Schwinge:

https://gcc.gnu.org/g:5c6515076f2ba55a31149085d3826e975c114fe5

commit r12-9738-g5c6515076f2ba55a31149085d3826e975c114fe5
Author: Thomas Schwinge 
Date:   Tue Apr 25 23:53:12 2023 +0200

Support parallel testing in libgomp, part II [PR66005]

..., and enable if 'flock' is available for serializing execution testing.

Regarding the default of 19 parallel slots, this turned out to be a local
minimum for wall time when testing this on:

$ uname -srvi
Linux 4.2.0-42-generic #49~14.04.1-Ubuntu SMP Wed Jun 29 20:22:11 UTC
2016 x86_64
$ grep '^model name' < /proc/cpuinfo | uniq -c
 32 model name  : Intel(R) Xeon(R) CPU E5-2640 v3 @ 2.60GHz

... in two configurations: case (a) standard configuration, no offloading
configured, case (b) offloading for GCN and nvptx configured but no devices
available.  For both cases, default plus '-m32' variant.

$ \time make check-target-libgomp
RUNTESTFLAGS="--target_board=unix\{,-m32\}"

Case (a), baseline:

6432.23user 332.38system 47:32.28elapsed 237%CPU (0avgtext+0avgdata
505044maxresident)k
6382.43user 319.21system 47:06.04elapsed 237%CPU (0avgtext+0avgdata
505172maxresident)k

This is what people have been complaining about, rightly so, in PR66005
"libgomp make check time is excessive" and elsewhere.

Case (a), parallelized:

-j12 GCC_TEST_PARALLEL_SLOTS=10
3088.49user 267.74system 6:43.82elapsed 831%CPU (0avgtext+0avgdata
505188maxresident)k
-j15 GCC_TEST_PARALLEL_SLOTS=15
3308.08user 294.79system 5:56.04elapsed 1011%CPU (0avgtext+0avgdata
505360maxresident)k
-j17 GCC_TEST_PARALLEL_SLOTS=17
3539.93user 298.99system 5:27.86elapsed 1170%CPU (0avgtext+0avgdata
505112maxresident)k
-j18 GCC_TEST_PARALLEL_SLOTS=18
3697.50user 317.18system 5:14.63elapsed 1275%CPU (0avgtext+0avgdata
505360maxresident)k
-j19 GCC_TEST_PARALLEL_SLOTS=19
3765.94user 324.27system 5:13.22elapsed 1305%CPU (0avgtext+0avgdata
505128maxresident)k
-j20 GCC_TEST_PARALLEL_SLOTS=20
3684.66user 312.32system 5:15.26elapsed 1267%CPU (0avgtext+0avgdata
505100maxresident)k
-j23 GCC_TEST_PARALLEL_SLOTS=23
4040.59user 347.10system 5:29.12elapsed 1333%CPU (0avgtext+0avgdata
505200maxresident)k
-j26 GCC_TEST_PARALLEL_SLOTS=26
3973.24user 377.96system 5:24.70elapsed 1340%CPU (0avgtext+0avgdata
505160maxresident)k
-j32 GCC_TEST_PARALLEL_SLOTS=32
4004.42user 346.10system 5:16.11elapsed 1376%CPU (0avgtext+0avgdata
505160maxresident)k

Yay!

Case (b), baseline; 2+ h:

7227.58user 700.54system 2:14:33elapsed 98%CPU (0avgtext+0avgdata
994264maxresident)k

Case (b), parallelized:

-j12 GCC_TEST_PARALLEL_SLOTS=10
7377.46user 777.52system 16:06.63elapsed 843%CPU (0avgtext+0avgdata
994344maxresident)k
-j15 GCC_TEST_PARALLEL_SLOTS=15
8019.18user 721.42system 12:13.56elapsed 1191%CPU (0avgtext+0avgdata
994228maxresident)k
-j17 GCC_TEST_PARALLEL_SLOTS=17
8530.11user 716.95system 10:45.92elapsed 1431%CPU (0avgtext+0avgdata
994176maxresident)k
-j18 GCC_TEST_PARALLEL_SLOTS=18
8776.79user 645.89system 10:27.20elapsed 1502%CPU (0avgtext+0avgdata
994248maxresident)k
-j19 GCC_TEST_PARALLEL_SLOTS=19
9332.37user 641.76system 10:15.09elapsed 1621%CPU (0avgtext+0avgdata
994260maxresident)k
-j20 GCC_TEST_PARALLEL_SLOTS=20
9609.54user 789.88system 10:26.94elapsed 1658%CPU (0avgtext+0avgdata
994284maxresident)k
-j23 GCC_TEST_PARALLEL_SLOTS=23
10362.40user 911.14system 10:44.47elapsed 1749%CPU (0avgtext+0avgdata
994208maxresident)k
-j26 GCC_TEST_PARALLEL_SLOTS=26
11159.44user 850.99system 11:09.25elapsed 1794%CPU (0avgtext+0avgdata
994256maxresident)k
-j32 GCC_TEST_PARALLEL_SLOTS=32
11453.50user 939.52system 11:00.38elapsed 1876%CPU (0avgtext+0avgdata
994240maxresident)k

On my Dell Precision 7530 laptop:

$ uname -srvi
Linux 5.15.0-71-generic #78-Ubuntu SMP Tue Apr 18 09:00:29 UTC 2023
x86_64
$ grep '^model name' < /proc/cpuinfo | uniq -c
 12 model name  : Intel(R) Core(TM) i7-8850H CPU @ 2.60GHz
$ nvidia-smi -L
GPU 0: Quadro P1000 (UUID: GPU-e043973b-b52a-d02b-c066-a8fdbf64e8ea)

... in two configurations: case (c) standard configuration, no offloading
configured, case (d) offloading for nvptx configured and device available.
For both cases, only default variant, no '-m32'.

$ \time make check-target-libgomp

Case (c), baseline; roughly half of case (a) (just one variant):

1180.98user 110.80system 19:36.40elapsed 109%CPU (0avgtext+0avgdata
505148maxresident)k

[Bug testsuite/66005] libgomp make check time is excessive

2023-06-28 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66005

--- Comment #26 from CVS Commits  ---
The releases/gcc-12 branch has been updated by Thomas Schwinge:

https://gcc.gnu.org/g:66df913899d32e7726f986afb61c5c5615eb2a36

commit r12-9737-g66df913899d32e7726f986afb61c5c5615eb2a36
Author: Rainer Orth 
Date:   Thu May 7 13:26:57 2015 +0200

Support parallel testing in libgomp, part I [PR66005]

..., while still hard-coding the number of parallel slots to one.

PR testsuite/66005
libgomp/
* testsuite/Makefile.am (PWD_COMMAND): New variable.
(%/site.exp): New target.
(check_p_numbers0, check_p_numbers1, check_p_numbers2)
(check_p_numbers3, check_p_numbers4, check_p_numbers5)
(check_p_numbers6, check_p_numbers, gcc_test_parallel_slots)
(check_p_subdirs)
(check_DEJAGNU_libgomp_targets): New variables.
($(check_DEJAGNU_libgomp_targets)): New target.
($(check_DEJAGNU_libgomp_targets)): New dependency.
(check-DEJAGNU $(check_DEJAGNU_libgomp_targets)): New targets.
* testsuite/Makefile.in: Regenerate.
* testsuite/lib/libgomp.exp: For parallel testing,
'load_file ../libgomp-test-support.exp'.

Co-authored-by: Thomas Schwinge 
(cherry picked from commit e797db5c744f7b4e110f23a495fca8e6b8aebe83)

[Bug testsuite/66005] libgomp make check time is excessive

2023-06-28 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66005

--- Comment #25 from CVS Commits  ---
The releases/gcc-13 branch has been updated by Thomas Schwinge:

https://gcc.gnu.org/g:09124b7ed7709721e86556b4083ef40925d7489b

commit r13-7495-g09124b7ed7709721e86556b4083ef40925d7489b
Author: Thomas Schwinge 
Date:   Mon May 15 20:00:07 2023 +0200

Support parallel testing in libgomp: fallback Perl 'flock' [PR66005]

Follow-up to commit 6c3b30ef9e0578509bdaf59c13da4a212fe6c2ba
"Support parallel testing in libgomp, part II [PR66005]"
("..., and enable if 'flock' is available for serializing execution
testing"),
where we saw:

> On my Dell Precision 7530 laptop:
>
> $ uname -srvi
> Linux 5.15.0-71-generic #78-Ubuntu SMP Tue Apr 18 09:00:29 UTC 2023 x86_64
> $ grep '^model name' < /proc/cpuinfo | uniq -c
>  12 model name  : Intel(R) Core(TM) i7-8850H CPU @ 2.60GHz
> $ nvidia-smi -L
> GPU 0: Quadro P1000 (UUID: GPU-e043973b-b52a-d02b-c066-a8fdbf64e8ea)
>
> ... [...]: case (c) standard configuration, no offloading
> configured, [...]

> $ \time make check-target-libgomp
>
> Case (c), baseline; [...]:
>
> 1180.98user 110.80system 19:36.40elapsed 109%CPU (0avgtext+0avgdata 505148maxresident)k
> 1133.22user 111.08system 19:35.75elapsed 105%CPU (0avgtext+0avgdata 505212maxresident)k
>
> Case (c), parallelized [using 'flock']:
>
> [...]
> -j12 GCC_TEST_PARALLEL_SLOTS=12
> 2591.04user 192.64system 4:44.98elapsed 976%CPU (0avgtext+0avgdata 505216maxresident)k
> 2581.23user 195.21system 4:47.51elapsed 965%CPU (0avgtext+0avgdata 505212maxresident)k

Quite the same when instead of 'flock' using this fallback Perl 'flock':

2565.23user 194.35system 4:46.77elapsed 962%CPU (0avgtext+0avgdata
505216maxresident)k
2549.38user 200.20system 4:46.08elapsed 961%CPU (0avgtext+0avgdata
505216maxresident)k

PR testsuite/66005
gcc/
* doc/install.texi: Document (optional) Perl usage for parallel
testing of libgomp.
libgomp/
* testsuite/lib/libgomp.exp: 'flock' through stdout.
* testsuite/flock: New.
* configure.ac (FLOCK): Point to that if no 'flock' available, but
'perl' is.
* configure: Regenerate.

(cherry picked from commit 04abe1944d30eb18a2060cfcd9695d085f7b4752)

[Bug testsuite/66005] libgomp make check time is excessive

2023-06-28 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66005

--- Comment #24 from CVS Commits  ---
The releases/gcc-13 branch has been updated by Thomas Schwinge:

https://gcc.gnu.org/g:3840d5ccf750b6a059258be7faa4a3fce85a6fa6

commit r13-7494-g3840d5ccf750b6a059258be7faa4a3fce85a6fa6
Author: Thomas Schwinge 
Date:   Tue Apr 25 23:53:12 2023 +0200

Support parallel testing in libgomp, part II [PR66005]

..., and enable if 'flock' is available for serializing execution testing.

Regarding the default of 19 parallel slots, this turned out to be a local
minimum for wall time when testing this on:

$ uname -srvi
Linux 4.2.0-42-generic #49~14.04.1-Ubuntu SMP Wed Jun 29 20:22:11 UTC
2016 x86_64
$ grep '^model name' < /proc/cpuinfo | uniq -c
 32 model name  : Intel(R) Xeon(R) CPU E5-2640 v3 @ 2.60GHz

... in two configurations: case (a) standard configuration, no offloading
configured, case (b) offloading for GCN and nvptx configured but no devices
available.  For both cases, default plus '-m32' variant.

$ \time make check-target-libgomp
RUNTESTFLAGS="--target_board=unix\{,-m32\}"

Case (a), baseline:

6432.23user 332.38system 47:32.28elapsed 237%CPU (0avgtext+0avgdata
505044maxresident)k
6382.43user 319.21system 47:06.04elapsed 237%CPU (0avgtext+0avgdata
505172maxresident)k

This is what people have been complaining about, rightly so, in PR66005
"libgomp make check time is excessive" and elsewhere.

Case (a), parallelized:

-j12 GCC_TEST_PARALLEL_SLOTS=10
3088.49user 267.74system 6:43.82elapsed 831%CPU (0avgtext+0avgdata
505188maxresident)k
-j15 GCC_TEST_PARALLEL_SLOTS=15
3308.08user 294.79system 5:56.04elapsed 1011%CPU (0avgtext+0avgdata
505360maxresident)k
-j17 GCC_TEST_PARALLEL_SLOTS=17
3539.93user 298.99system 5:27.86elapsed 1170%CPU (0avgtext+0avgdata
505112maxresident)k
-j18 GCC_TEST_PARALLEL_SLOTS=18
3697.50user 317.18system 5:14.63elapsed 1275%CPU (0avgtext+0avgdata
505360maxresident)k
-j19 GCC_TEST_PARALLEL_SLOTS=19
3765.94user 324.27system 5:13.22elapsed 1305%CPU (0avgtext+0avgdata
505128maxresident)k
-j20 GCC_TEST_PARALLEL_SLOTS=20
3684.66user 312.32system 5:15.26elapsed 1267%CPU (0avgtext+0avgdata
505100maxresident)k
-j23 GCC_TEST_PARALLEL_SLOTS=23
4040.59user 347.10system 5:29.12elapsed 1333%CPU (0avgtext+0avgdata
505200maxresident)k
-j26 GCC_TEST_PARALLEL_SLOTS=26
3973.24user 377.96system 5:24.70elapsed 1340%CPU (0avgtext+0avgdata
505160maxresident)k
-j32 GCC_TEST_PARALLEL_SLOTS=32
4004.42user 346.10system 5:16.11elapsed 1376%CPU (0avgtext+0avgdata
505160maxresident)k

Yay!

Case (b), baseline; 2+ h:

7227.58user 700.54system 2:14:33elapsed 98%CPU (0avgtext+0avgdata
994264maxresident)k

Case (b), parallelized:

-j12 GCC_TEST_PARALLEL_SLOTS=10
7377.46user 777.52system 16:06.63elapsed 843%CPU (0avgtext+0avgdata
994344maxresident)k
-j15 GCC_TEST_PARALLEL_SLOTS=15
8019.18user 721.42system 12:13.56elapsed 1191%CPU (0avgtext+0avgdata
994228maxresident)k
-j17 GCC_TEST_PARALLEL_SLOTS=17
8530.11user 716.95system 10:45.92elapsed 1431%CPU (0avgtext+0avgdata
994176maxresident)k
-j18 GCC_TEST_PARALLEL_SLOTS=18
8776.79user 645.89system 10:27.20elapsed 1502%CPU (0avgtext+0avgdata
994248maxresident)k
-j19 GCC_TEST_PARALLEL_SLOTS=19
9332.37user 641.76system 10:15.09elapsed 1621%CPU (0avgtext+0avgdata
994260maxresident)k
-j20 GCC_TEST_PARALLEL_SLOTS=20
9609.54user 789.88system 10:26.94elapsed 1658%CPU (0avgtext+0avgdata
994284maxresident)k
-j23 GCC_TEST_PARALLEL_SLOTS=23
10362.40user 911.14system 10:44.47elapsed 1749%CPU (0avgtext+0avgdata
994208maxresident)k
-j26 GCC_TEST_PARALLEL_SLOTS=26
11159.44user 850.99system 11:09.25elapsed 1794%CPU (0avgtext+0avgdata
994256maxresident)k
-j32 GCC_TEST_PARALLEL_SLOTS=32
11453.50user 939.52system 11:00.38elapsed 1876%CPU (0avgtext+0avgdata
994240maxresident)k

On my Dell Precision 7530 laptop:

$ uname -srvi
Linux 5.15.0-71-generic #78-Ubuntu SMP Tue Apr 18 09:00:29 UTC 2023
x86_64
$ grep '^model name' < /proc/cpuinfo | uniq -c
 12 model name  : Intel(R) Core(TM) i7-8850H CPU @ 2.60GHz
$ nvidia-smi -L
GPU 0: Quadro P1000 (UUID: GPU-e043973b-b52a-d02b-c066-a8fdbf64e8ea)

... in two configurations: case (c) standard configuration, no offloading
configured, case (d) offloading for nvptx configured and device available.
For both cases, only default variant, no '-m32'.

$ \time make check-target-libgomp

Case (c), baseline; roughly half of case (a) (just one variant):

1180.98user 110.80system 19:36.40elapsed 109%CPU (0avgtext+0avgdata
505148maxresident)k

[Bug testsuite/66005] libgomp make check time is excessive

2023-06-28 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66005

--- Comment #23 from CVS Commits  ---
The releases/gcc-13 branch has been updated by Thomas Schwinge:

https://gcc.gnu.org/g:2aa6135efb2d5fce93578592d91f8ce19a1b983b

commit r13-7493-g2aa6135efb2d5fce93578592d91f8ce19a1b983b
Author: Rainer Orth 
Date:   Thu May 7 13:26:57 2015 +0200

Support parallel testing in libgomp, part I [PR66005]

..., while still hard-coding the number of parallel slots to one.

PR testsuite/66005
libgomp/
* testsuite/Makefile.am (PWD_COMMAND): New variable.
(%/site.exp): New target.
(check_p_numbers0, check_p_numbers1, check_p_numbers2)
(check_p_numbers3, check_p_numbers4, check_p_numbers5)
(check_p_numbers6, check_p_numbers, gcc_test_parallel_slots)
(check_p_subdirs)
(check_DEJAGNU_libgomp_targets): New variables.
($(check_DEJAGNU_libgomp_targets)): New target.
($(check_DEJAGNU_libgomp_targets)): New dependency.
(check-DEJAGNU $(check_DEJAGNU_libgomp_targets)): New targets.
* testsuite/Makefile.in: Regenerate.
* testsuite/lib/libgomp.exp: For parallel testing,
'load_file ../libgomp-test-support.exp'.

Co-authored-by: Thomas Schwinge 
(cherry picked from commit e797db5c744f7b4e110f23a495fca8e6b8aebe83)

[Bug testsuite/66005] libgomp make check time is excessive

2023-06-23 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66005

--- Comment #22 from Jakub Jelinek  ---
Ok, but please do it sooner rather than later, so there is enough time before
13.2rc to note potential issues on the branch.

[Bug testsuite/66005] libgomp make check time is excessive

2023-06-23 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66005

--- Comment #21 from Thomas Schwinge  ---
Jakub, given that 'libgomp is usually the “long tail” of [...] testing' (GCC
IRC, 2023-06-05), Iain has asked that I backport to release branches the
changes implemented here:

  - commit r14-854-ge797db5c744f7b4e110f23a495fca8e6b8aebe83 "Support parallel
testing in libgomp, part I [PR66005]"
  - commit r14-855-g6c3b30ef9e0578509bdaf59c13da4a212fe6c2ba "Support parallel
testing in libgomp, part II [PR66005]"
  - commit r14-1490-g04abe1944d30eb18a2060cfcd9695d085f7b4752 "Support parallel
testing in libgomp: fallback Perl 'flock' [PR66005]"

(I haven't looked yet in detail, but there shouldn't be any non-trivial
prerequisite commits, if any at all.)

I've not had any reports about breakage or regressions, so that does seem safe
to me -- any objections?

[Bug testsuite/66005] libgomp make check time is excessive

2023-06-05 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66005

--- Comment #20 from Thomas Schwinge  ---
(In reply to Iain Sandoe from comment #19)
> (In reply to Thomas Schwinge from comment #18)
> > r14-1490-g04abe1944d30eb18a2060cfcd9695d085f7b4752 "Support parallel 
> > testing in libgomp: fallback Perl 'flock' [PR66005]"
> 
> [...] Perl is available so let's see.

Thanks for your confirmation in GCC IRC, 2023-06-02:

 tschwinge: 94mins => 15 (wallclock), so the Perl addition is also
working

[Bug testsuite/66005] libgomp make check time is excessive

2023-06-02 Thread iains at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66005

Iain Sandoe  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |iains at gcc dot gnu.org

--- Comment #19 from Iain Sandoe  ---
(In reply to Thomas Schwinge from comment #18)
> (In reply to Iain Sandoe from
> )
> > I am also somewhat puzzled by what conditions I need to take advantage of
> > the parallel running?
> > Darwin has /usr/bin/getconf and AFAICT the number of cpus is reported OK
> > both at runtime and during config
> 
> (That's not actually relevant for libgomp parallel testing.)
> 
> > but it seems to be determined to run a single process.
> 
> That's the fail-safe default if there's no 'flock' executable available --
> which I suspect is the case on your Darwin systems?  My recent commit
> r14-1490-g04abe1944d30eb18a2060cfcd9695d085f7b4752 "Support parallel testing
> in libgomp: fallback Perl 'flock' [PR66005]" should've addressed that case
> (if you have Perl).

Thanks.  Yes, 'flock' used to exist on Darwin but was removed some time ago
(10+ years), so a replacement is needed.  Perl is available, so let's see.

[Bug testsuite/66005] libgomp make check time is excessive

2023-06-02 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66005

--- Comment #18 from Thomas Schwinge  ---
(In reply to Iain Sandoe from
)
> I am also somewhat puzzled by what conditions I need to take advantage of
> the parallel running?
> Darwin has /usr/bin/getconf and AFAICT the number of cpus is reported OK
> both at runtime and during config

(That's not actually relevant for libgomp parallel testing.)

> but it seems to be determined to run a single process.

That's the fail-safe default if there's no 'flock' executable available --
which I suspect is the case on your Darwin systems?  My recent commit
r14-1490-g04abe1944d30eb18a2060cfcd9695d085f7b4752 "Support parallel testing in
libgomp: fallback Perl 'flock' [PR66005]" should've addressed that case (if you
have Perl).
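
The selection order described above (system 'flock', else the Perl fallback, else the fail-safe single-slot default) can be pictured as follows. This is only an illustrative sketch: the real decision is made at configure time in libgomp's configure.ac, and the function name pick_lock_tool and its return values are hypothetical.

```python
import shutil


def pick_lock_tool(which=shutil.which):
    """Return (mode, tool) following the order described in this comment.

    'which' is injectable so the logic can be exercised without depending
    on what happens to be installed on the current machine.
    """
    if which("flock"):
        # A system 'flock' executable is available: parallel testing is enabled.
        return ("parallel", "flock")
    if which("perl"):
        # No 'flock', but Perl is there: use the Perl-based fallback 'flock'.
        return ("parallel", "perl-fallback")
    # Neither is available: fail-safe default, a single process.
    return ("serial", None)


print(pick_lock_tool())
```

On a Darwin system without 'flock' but with Perl, this order would select the Perl fallback, which matches the case discussed here.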

[Bug testsuite/66005] libgomp make check time is excessive

2023-06-02 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66005

--- Comment #17 from CVS Commits  ---
The master branch has been updated by Thomas Schwinge:

https://gcc.gnu.org/g:04abe1944d30eb18a2060cfcd9695d085f7b4752

commit r14-1490-g04abe1944d30eb18a2060cfcd9695d085f7b4752
Author: Thomas Schwinge 
Date:   Mon May 15 20:00:07 2023 +0200

Support parallel testing in libgomp: fallback Perl 'flock' [PR66005]

Follow-up to commit 6c3b30ef9e0578509bdaf59c13da4a212fe6c2ba
"Support parallel testing in libgomp, part II [PR66005]"
("..., and enable if 'flock' is available for serializing execution
testing"),
where we saw:

> On my Dell Precision 7530 laptop:
>
> $ uname -srvi
> Linux 5.15.0-71-generic #78-Ubuntu SMP Tue Apr 18 09:00:29 UTC 2023 x86_64
> $ grep '^model name' < /proc/cpuinfo | uniq -c
>  12 model name  : Intel(R) Core(TM) i7-8850H CPU @ 2.60GHz
> $ nvidia-smi -L
> GPU 0: Quadro P1000 (UUID: GPU-e043973b-b52a-d02b-c066-a8fdbf64e8ea)
>
> ... [...]: case (c) standard configuration, no offloading
> configured, [...]

> $ \time make check-target-libgomp
>
> Case (c), baseline; [...]:
>
> 1180.98user 110.80system 19:36.40elapsed 109%CPU (0avgtext+0avgdata 505148maxresident)k
> 1133.22user 111.08system 19:35.75elapsed 105%CPU (0avgtext+0avgdata 505212maxresident)k
>
> Case (c), parallelized [using 'flock']:
>
> [...]
> -j12 GCC_TEST_PARALLEL_SLOTS=12
> 2591.04user 192.64system 4:44.98elapsed 976%CPU (0avgtext+0avgdata 505216maxresident)k
> 2581.23user 195.21system 4:47.51elapsed 965%CPU (0avgtext+0avgdata 505212maxresident)k

Quite the same when instead of 'flock' using this fallback Perl 'flock':

2565.23user 194.35system 4:46.77elapsed 962%CPU (0avgtext+0avgdata
505216maxresident)k
2549.38user 200.20system 4:46.08elapsed 961%CPU (0avgtext+0avgdata
505216maxresident)k

PR testsuite/66005
gcc/
* doc/install.texi: Document (optional) Perl usage for parallel
testing of libgomp.
libgomp/
* testsuite/lib/libgomp.exp: 'flock' through stdout.
* testsuite/flock: New.
* configure.ac (FLOCK): Point to that if no 'flock' available, but
'perl' is.
* configure: Regenerate.
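
For readers unfamiliar with what a 'flock'-style wrapper does when serializing execution testing, here is a minimal sketch of the idea in Python. It is not the Perl script added by this commit; the function name run_locked and the lock-file path are assumptions made for illustration, and it relies on the POSIX-only fcntl module.

```python
import fcntl
import os
import subprocess
import sys
import tempfile


def run_locked(lock_path, argv):
    """Run argv while holding an exclusive advisory lock on lock_path.

    Concurrent callers using the same lock_path run their commands one
    at a time; everything outside this call can still run in parallel.
    """
    with open(lock_path, "w") as lock_file:
        fcntl.flock(lock_file, fcntl.LOCK_EX)  # blocks until the lock is free
        try:
            return subprocess.run(argv).returncode
        finally:
            fcntl.flock(lock_file, fcntl.LOCK_UN)


# Usage: the child's exit status is passed through to the caller.
with tempfile.TemporaryDirectory() as tmp:
    status = run_locked(os.path.join(tmp, "lock"),
                        [sys.executable, "-c", "raise SystemExit(3)"])
print(status)
```

The point of this design is that compilation can proceed in parallel while only the execution of test programs takes the lock.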

[Bug testsuite/66005] libgomp make check time is excessive

2023-05-16 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66005

--- Comment #16 from Jakub Jelinek  ---
Another possibility would be to pick one runtest (e.g. the first one to create
some file using O_EXCL) and let it perform all executions from that point on
instead of doing the compilations, while the other runtest instances would feed
it what needs to be executed and later deleted, say through a pipe.  Reading
through a pipe would ensure that it is able to wait if there is no immediate
work for it.  Of course we have dg-set-env-var, which complicates things a
little bit; probably those would need to be transferred to the execution job
together with what program should be run, what options should be passed to it,
what LD_LIBRARY_PATH should be used, etc.  One issue is to make sure all the
executable names are unique even across all optimization levels; we can't have
target1.exe created more than once.
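
A rough sketch of the two building blocks of this idea, election via O_EXCL and a pipe as work queue, might look like the following. It runs in a single process purely for illustration; the function name try_become_executor and the marker file name are hypothetical, and nothing like this exists in the testsuite.

```python
import os
import tempfile


def try_become_executor(marker_path):
    """Return True for exactly one caller: the one that creates marker_path.

    O_CREAT | O_EXCL makes creation fail if the file already exists, so
    only the first caller succeeds.
    """
    try:
        fd = os.open(marker_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False
    os.close(fd)
    return True


with tempfile.TemporaryDirectory() as tmp:
    marker = os.path.join(tmp, "executor.marker")
    # Three would-be runtest instances race for the executor role.
    roles = [try_become_executor(marker) for _ in range(3)]

# The non-executors feed job descriptions through a pipe; reading from the
# pipe is what lets the executor wait when there is no immediate work.
read_fd, write_fd = os.pipe()
with os.fdopen(write_fd, "w") as feeder:
    feeder.write("target1.exe\n")
    feeder.write("target2.exe\n")
with os.fdopen(read_fd) as queue:
    jobs = [line.strip() for line in queue]

print(roles, jobs)
```

A real implementation would also have to carry environment settings, options and LD_LIBRARY_PATH along with each job, as noted above.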

[Bug testsuite/66005] libgomp make check time is excessive

2023-05-16 Thread ro at CeBiTec dot Uni-Bielefeld.DE via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66005

--- Comment #15 from ro at CeBiTec dot Uni-Bielefeld.DE  ---
> --- Comment #14 from Thomas Schwinge  ---
> (In reply to Eric Gallager from comment #12)
>> Note that there's a gnulib module for flock:
>> https://www.gnu.org/software/gnulib/manual/html_node/flock.html
>
> I'd seen that one -- but it also says: "the replacement function does not
> really work", so I don't think that's useful?

Besides, this only provides a replacement for the system call; we'd
still have to implement flock(1) ourselves and I'd rather not see us go
there.

> (In reply to Jakub Jelinek from comment #13)
>> And fcntl in tclx.
>
> Seen that, too -- but is TclX something that people actually have
> available/installed?  (Rainer?)

It's not available in packaged form on any of the targets I mentioned
(Solaris, macOS, AIX).  Besides, adding something like this feels quite
heavy-handed to me.

>> Anyway, I think choosing between flock(1) and some
>> python file locking would be better than using perl which is only needed in
>> maintainer mode and not otherwise.
>
> Rainer, would a 'python3' variant work for you?

Not really: python3 isn't available on older macOS systems, and again:
adding a python requirement (even for python2 in such a limited case)
seems to go overboard to me.

While I personally don't have a problem with requiring perl (it's needed
to support shared library versioning on Solaris), the same argument
applies.

My strong preference would be to use Tcl core means only, thus adding no
additional requirement.  I found a couple of suggestions on how to do
this:

https://wiki.tcl-lang.org/page/How+do+I+manage+lock+files+in+a+cross+platform+manner+in+Tcl
https://wiki.tcl-lang.org/page/Serializing+things+via+file+locks

effectively matching Jakub's suggestion.

[Bug testsuite/66005] libgomp make check time is excessive

2023-05-15 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66005

--- Comment #14 from Thomas Schwinge  ---
(In reply to Eric Gallager from comment #12)
> Note that there's a gnulib module for flock:
> https://www.gnu.org/software/gnulib/manual/html_node/flock.html

I'd seen that one -- but it also says: "the replacement function does not really
work", so I don't think that's useful?

(In reply to Jakub Jelinek from comment #13)
> And fcntl in tclx.

Seen that, too -- but is TclX something that people actually have
available/installed?  (Rainer?)

> Anyway, I think choosing between flock(1) and some
> python file locking would be better than using perl which is only needed in
> maintainer mode and not otherwise.

Rainer, would a 'python3' variant work for you?

[Bug testsuite/66005] libgomp make check time is excessive

2023-05-15 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66005

--- Comment #13 from Jakub Jelinek  ---
And fcntl in tclx.  Anyway, I think choosing between flock(1) and some python
file locking would be better than using perl which is only needed in maintainer
mode and not otherwise.

[Bug testsuite/66005] libgomp make check time is excessive

2023-05-15 Thread egallager at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66005

--- Comment #12 from Eric Gallager  ---
Note that there's a gnulib module for flock:
https://www.gnu.org/software/gnulib/manual/html_node/flock.html

[Bug testsuite/66005] libgomp make check time is excessive

2023-05-15 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66005

--- Comment #11 from Thomas Schwinge  ---
(In reply to myself from comment #10)
> Could we easily build a portable 'flock'-like using 'fcntl' locking
> primitives?

(, for
example.)

> (I've not yet looked.)


But simpler, is it OK to require Perl (Ick!) for parallelized
'check-target-libgomp'?  There's ,
and I've got that implemented as a fallback 'flock'.  (It's certainly not,
after two decades or so, my desire to write something in Perl, but I suppose
it's available "almost everywhere" and the fallback 'flock' is simple to
implement.)

[Bug testsuite/66005] libgomp make check time is excessive

2023-05-15 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66005

Thomas Schwinge  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|FIXED                       |---
             Status|RESOLVED                    |REOPENED

--- Comment #10 from Thomas Schwinge  ---
(In reply to r...@cebitec.uni-bielefeld.de from comment #8)
> > --- Comment #7 from Thomas Schwinge  ---
> > Resolved for GCC 14.  Not planning on backporting to release branches (but
> > could, if desired).
> 
> Unfortunately not: flock is completely unportable.  It's not in
> POSIX.1/XPG7, and none of Solaris, macOS, or AIX have it.

OK, indeed my approach depends on 'flock'.  Otherwise, we still serialize
'check-target-libgomp', as before.

(In reply to Jakub Jelinek from comment #9)
> r5-3553 uses if {![catch {open $path {RDWR CREAT EXCL} 0600} fd]} { to
> determine which make check invocation should be given a particular batch of
> tests (in an initially empty directory), could you use that instead?

We'd like something that blocks until the lock is available, and something that
works on file descriptors and unlocks implicitly upon 'close'/process exit (to
avoid stale locks).

Using something like what Jakub posted, with spinning, probably wastes too many
parallel slots here?  I'll try to experiment with that, though: in the long
tail at the end of overall parallel testing (that is, when all other parallel
slots are unused), it's still better than no parallelism at all?

Could we easily build a portable 'flock'-like using 'fcntl' locking primitives?
 (I've not yet looked.)
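
As a rough illustration of the properties asked for above (blocking until the
lock is available, working on a file descriptor, releasing implicitly on
close or process exit), here is a minimal Python sketch built on `fcntl`
record locks.  It is an assumption-laden sketch, not the fallback that was
later committed: the function name `run_locked` and the lock file path are
hypothetical, and it only applies to systems where Python's `fcntl` module is
available.

```python
import fcntl
import os
import subprocess
import sys


def run_locked(lock_path, argv):
    """Run argv while holding an exclusive lock on lock_path.

    Returns the exit status of the command.
    """
    # LOCK_EX via lockf needs a descriptor opened for writing.
    fd = os.open(lock_path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        # Blocks (no spinning) until no other process holds the lock.
        fcntl.lockf(fd, fcntl.LOCK_EX)
        return subprocess.call(argv)
    finally:
        # Closing the descriptor drops the lock; so does process exit,
        # which is what avoids stale locks.
        os.close(fd)
```

Something like `run_locked("serial.lock", ["./a.out"])` would then serialize
execution across callers.  One caveat: POSIX record locks belong to the
process rather than the descriptor, so they are dropped when the process
closes any descriptor for that file, and they are not inherited across fork.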

[Bug testsuite/66005] libgomp make check time is excessive

2023-05-15 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66005

--- Comment #9 from Jakub Jelinek  ---
r5-3553 uses if {![catch {open $path {RDWR CREAT EXCL} 0600} fd]} { to
determine which make check invocation should be given a particular batch of
tests (in an initially empty directory), could you use that instead?
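
For readers less familiar with Tcl, the claim-by-exclusive-create pattern
quoted above can be sketched in Python as follows.  This is an illustrative
equivalent, not the code from r5-3553; the name `try_claim` and the `batch-0`
marker file are hypothetical.

```python
import os
import tempfile


def try_claim(path):
    """Return True if this caller created path, False if it already existed."""
    try:
        # O_CREAT | O_EXCL fails if the file exists, so exactly one of
        # several racing callers succeeds.
        fd = os.open(path, os.O_RDWR | os.O_CREAT | os.O_EXCL, 0o600)
    except FileExistsError:
        return False
    os.close(fd)
    return True


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        marker = os.path.join(d, "batch-0")
        print(try_claim(marker))  # first caller wins: True
        print(try_claim(marker))  # later callers lose: False
```

Note that this only answers "who got there first"; it does not block, and the
marker stays behind until something removes it.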

[Bug testsuite/66005] libgomp make check time is excessive

2023-05-15 Thread ro at CeBiTec dot Uni-Bielefeld.DE via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66005

--- Comment #8 from ro at CeBiTec dot Uni-Bielefeld.DE  ---
> --- Comment #7 from Thomas Schwinge  ---
> Resolved for GCC 14.  Not planning on backporting to release branches (but
> could, if desired).

Unfortunately not: flock is completely unportable.  It's not in
POSIX.1/XPG7, and none of Solaris, macOS, or AIX have it.

[Bug testsuite/66005] libgomp make check time is excessive

2023-05-15 Thread tschwinge at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66005

Thomas Schwinge  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|ASSIGNED                    |RESOLVED
          Component|libgomp                     |testsuite
           Keywords|                            |openacc, openmp

--- Comment #7 from Thomas Schwinge  ---
Resolved for GCC 14.  Not planning on backporting to release branches (but
could, if desired).