[Bug libgomp/79784] Synchronization overhead is thrashing on Aarch64

2018-12-20 Thread rearnsha at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79784 Richard Earnshaw changed:
What|Removed|Added
Status|WAITING|UNCONFIRMED
Ever confirmed|1

[Bug libgomp/79784] Synchronization overhead is thrashing on Aarch64

2017-03-02 Thread cbz at baozis dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79784 --- Comment #10 from Chen Baozi --- I have attached the testcase I used to benchmark OpenMP synchronization on AArch64, which is extracted from the EPCC OpenMP micro-benchmark suite. The operating system I use is Ubuntu 16.04 with a 4.4.0 kernel.
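
For readers without the attachment, the following is a minimal sketch of how an EPCC-style synchronization measurement is usually structured: time a loop of small delays as a reference, time the same loop with a barrier after each delay, and report the difference per repetition. It is not the attached testcase. Every name here (now_seconds, delay, time_reference, time_barrier, overhead_us) is hypothetical, and the OpenMP pragmas only take effect when the file is built with -fopenmp.

```c
#include <time.h>

/* Wall-clock time in seconds, using C11 timespec_get. */
static double now_seconds(void)
{
    struct timespec ts;
    timespec_get(&ts, TIME_UTC);
    return (double)ts.tv_sec + (double)ts.tv_nsec * 1e-9;
}

/* Small busy loop standing in for per-iteration work. */
static void delay(int length)
{
    volatile double a = 0.0;
    for (int i = 0; i < length; i++)
        a += (double)i;
}

/* Reference: the delay loop with no synchronization at all. */
static double time_reference(int reps, int delay_len)
{
    double start = now_seconds();
    for (int i = 0; i < reps; i++)
        delay(delay_len);
    return now_seconds() - start;
}

/* Test: every thread runs the same loop, with a barrier after each delay. */
static double time_barrier(int reps, int delay_len)
{
    double start = now_seconds();
    #pragma omp parallel
    {
        for (int i = 0; i < reps; i++) {
            delay(delay_len);
            #pragma omp barrier
        }
    }
    return now_seconds() - start;
}

/* Overhead of one construct, in microseconds per repetition. */
static double overhead_us(double test_s, double ref_s, int reps)
{
    return (test_s - ref_s) * 1e6 / (double)reps;
}
```

A caller would compute overhead_us(time_barrier(reps, len), time_reference(reps, len), reps) and repeat the measurement several times, since a single run is noisy.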

[Bug libgomp/79784] Synchronization overhead is thrashing on Aarch64

2017-03-01 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79784 --- Comment #9 from Jakub Jelinek --- Or of course aarch64 could replace not just futex.h, but wait.h if it has something smarter. This needs to be done by somebody who knows the ISA though (i.e. not me).

[Bug libgomp/79784] Synchronization overhead is thrashing on Aarch64

2017-03-01 Thread pinskia at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79784 --- Comment #8 from Andrew Pinski --- WFE kind of works, but YIELD might be better. Though on most implementations YIELD is just a NOP. Even on the hyperthreaded (4 threads/core) CN99xx, it is a NOP.

[Bug libgomp/79784] Synchronization overhead is thrashing on Aarch64

2017-03-01 Thread cbz at baozis dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79784 --- Comment #7 from Chen Baozi --- Created attachment 40867 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=40867&action=edit synchronization test case

[Bug libgomp/79784] Synchronization overhead is thrashing on Aarch64

2017-03-01 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79784 --- Comment #6 from Jakub Jelinek --- BTW, aarch64 doesn't override cpu_relax. Does the HW have any instruction similar to __builtin_ia32_pause () that could be used for spinning?
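
To make the question concrete, here is a rough sketch of what a per-architecture spin hint and a bounded spin loop could look like. It is not libgomp's actual cpu_relax or wait code. The names spin_hint and spin_wait are hypothetical, GNU inline asm (GCC or Clang) is assumed, and the AArch64 branch uses YIELD only because it is the candidate named in comment #8, where it is also reported to be a NOP on most implementations.

```c
#include <stdatomic.h>

/* Hypothetical helper: tell the core we are in a busy-wait loop. */
static inline void spin_hint(void)
{
#if defined(__x86_64__) || defined(__i386__)
    __builtin_ia32_pause();                 /* x86 PAUSE instruction */
#elif defined(__aarch64__)
    __asm__ volatile("yield" ::: "memory"); /* may be a NOP on many cores */
#else
    __asm__ volatile("" ::: "memory");      /* compiler barrier only */
#endif
}

/*
 * Poll *flag up to max_spins times, issuing a spin hint between polls.
 * Returns 1 if the flag was observed nonzero, 0 if the budget ran out.
 * A real runtime would fall back to a blocking wait (for example a
 * futex wait on Linux) when this returns 0.
 */
static int spin_wait(atomic_int *flag, unsigned long max_spins)
{
    for (unsigned long i = 0; i < max_spins; i++) {
        if (atomic_load_explicit(flag, memory_order_acquire) != 0)
            return 1;
        spin_hint();
    }
    return atomic_load_explicit(flag, memory_order_acquire) != 0;
}
```

The spin budget is the knob that matters for the overhead discussed in this bug: too large and waiting threads burn cycles in user space, too small and every wait pays for a syscall.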

[Bug libgomp/79784] Synchronization overhead is thrashing on Aarch64

2017-03-01 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79784 --- Comment #5 from Jakub Jelinek --- (In reply to Andrew Pinski from comment #4) > Or the case the futex syscall is returning right away ... The spinning is completely configurable, see the documented OMP_WAIT_POLICY (standard) and
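
As a toy illustration of how a runtime might turn OMP_WAIT_POLICY into a spin budget, consider the sketch below. The OpenMP specification defines the values ACTIVE and PASSIVE for this variable. Everything else here is an assumption made for illustration: the enum, the function names, and especially the numeric budgets, which are not libgomp's real defaults.

```c
#include <ctype.h>
#include <stddef.h>

enum wait_policy { WAIT_DEFAULT, WAIT_ACTIVE, WAIT_PASSIVE };

/* Case-insensitive string equality for ASCII input. */
static int equals_ignore_case(const char *a, const char *b)
{
    while (*a != '\0' && *b != '\0') {
        if (tolower((unsigned char)*a) != tolower((unsigned char)*b))
            return 0;
        a++;
        b++;
    }
    return *a == '\0' && *b == '\0';
}

/* Map the environment value (possibly NULL) to a policy. */
static enum wait_policy parse_wait_policy(const char *value)
{
    if (value == NULL)
        return WAIT_DEFAULT;
    if (equals_ignore_case(value, "active"))
        return WAIT_ACTIVE;
    if (equals_ignore_case(value, "passive"))
        return WAIT_PASSIVE;
    return WAIT_DEFAULT; /* unknown values fall back to the default */
}

/* Hypothetical budgets: illustrative numbers, not libgomp's. */
static unsigned long spin_budget(enum wait_policy policy)
{
    switch (policy) {
    case WAIT_ACTIVE:  return 1000000UL; /* spin for a long time */
    case WAIT_PASSIVE: return 0UL;       /* block immediately */
    default:           return 1000UL;    /* brief spin, then block */
    }
}
```

A caller would evaluate spin_budget(parse_wait_policy(getenv("OMP_WAIT_POLICY"))) once at startup, with stdlib.h included for getenv. For the benchmark in this bug, comparing runs with the variable set to ACTIVE and to PASSIVE is a quick way to see whether user-space spinning is the source of the overhead.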

[Bug libgomp/79784] Synchronization overhead is thrashing on Aarch64

2017-03-01 Thread pinskia at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79784 --- Comment #4 from Andrew Pinski --- Or the case the futex syscall is returning right away ...

[Bug libgomp/79784] Synchronization overhead is thrashing on Aarch64

2017-03-01 Thread pinskia at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79784 --- Comment #3 from Andrew Pinski --- I wonder if it is the case that we are spinning too much in user space before hitting the futex syscall.

[Bug libgomp/79784] Synchronization overhead is thrashing on Aarch64

2017-03-01 Thread pinskia at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79784 --- Comment #2 from Andrew Pinski --- I should note I have seen some scalability issues with GOMP compared to LLVM's OpenMP implementation on ThunderX 2 CN99xx (and on ThunderX 1 CN88xx). I don't know if this is related to this or not because I

[Bug libgomp/79784] Synchronization overhead is thrashing on Aarch64

2017-03-01 Thread rearnsha at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79784 Richard Earnshaw changed:
What|Removed|Added
Status|UNCONFIRMED|WAITING
Last reconfirmed|