https://gcc.gnu.org/bugzilla/show_bug.cgi?id=125309
Bug ID: 125309
Summary: OpenMP barrier in `noinline` function loses cancellable context
Product: gcc
Version: 16.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: middle-end
Assignee: unassigned at gcc dot gnu.org
Reporter: matmal01 at gcc dot gnu.org
Target Milestone: ---
Created attachment 64458
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=64458&action=edit
Reproducer for bug
Running the program generated from the attached code hangs.
It hangs because the barrier in the `orphaned_barrier` function gets lowered to
`GOMP_barrier` instead of `GOMP_barrier_cancel`.
I'm a little doubtful that this is valid input: what is the enclosing parallel
region in a standalone function? However, I see a testcase in the testsuite
that looks similar to this (libgomp.c/omp-loop03.c), so I figure it's at least
worth asking the question.
Since the compiler lowers the barrier in `orphaned_barrier` to a call to
`GOMP_barrier`, all threads that wait on that barrier ignore any
`BAR_CANCELLED` flag that gets set.
The thread that cancels the region sets this `BAR_CANCELLED` flag and then
skips the call to `orphaned_barrier` because it knows the region is cancelled.
Hence all threads waiting in the barrier wait indefinitely for the last thread
to arrive.
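To make the reasoning above concrete, here is a rough toy model of the two release conditions. It is only a sketch: `struct toy_barrier`, `plain_wait_released`, `cancel_wait_released` and `reported_state` are hypothetical names invented for illustration, and none of this mirrors libgomp's real data structures or wait loops. It only assumes, as described above, that a plain barrier waits for every thread to arrive while a cancel-aware barrier also returns once the cancelled flag is set.

```c
#include <stdbool.h>

/* Toy model only: these names are hypothetical and do not correspond to
   libgomp's actual implementation.  */
struct toy_barrier
{
  unsigned total;    /* Threads in the team.  */
  unsigned arrived;  /* Threads that have reached the barrier.  */
  bool cancelled;    /* Stands in for the BAR_CANCELLED flag.  */
};

/* Models a waiter in a plain barrier: it is released only once every
   thread has arrived, whether or not the region was cancelled.  */
bool
plain_wait_released (const struct toy_barrier *bar)
{
  return bar->arrived == bar->total;
}

/* Models a waiter in a cancel-aware barrier: it is also released as soon
   as the cancelled flag is set.  */
bool
cancel_wait_released (const struct toy_barrier *bar)
{
  return bar->cancelled || bar->arrived == bar->total;
}

/* The state described in this report: four threads, one of which cancels
   the region and skips the barrier while the other three arrive.  */
struct toy_barrier
reported_state (void)
{
  struct toy_barrier bar = { 4, 3, true };
  return bar;
}
```

In the reported state the plain condition never becomes true, because the fourth thread never arrives, while the cancel-aware condition is already satisfied.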
Compiled & ran with the following:
```
vshcmd: > ~/repos/gcc-dir/gcc-install/bin/gcc -fopenmp temp-repro.c -o temp-repro
vshcmd: > LD_LIBRARY_PATH=/home/mmalcomson/repos/gcc-dir/gcc-install/lib64 \
vshcmd: > OMP_CANCELLATION=true \
vshcmd: > ./temp-repro
```
Inline testcase to avoid the need for download:
```
#include <omp.h>
#include <stdlib.h>

/* This intentionally uses an orphaned helper barrier.  Current lowering emits
   plain GOMP_barrier for that helper even when it is called from inside a
   cancellable parallel region.  tid == 1 just chooses one stable canceller.  */
__attribute__((noinline))
static void
orphaned_barrier (void)
{
  /* This barrier gets lowered to a call to `GOMP_barrier`.
     Since it's used inside a cancellable region the generation can have
     `BAR_CANCELLED` set on it.  The thread that cancels the region will not
     bother entering the barrier and hence all threads waiting on this
     barrier will never progress.  */
#pragma omp barrier
}

int
main (void)
{
  omp_set_dynamic (0);
  omp_set_num_threads (4);

#pragma omp parallel
  {
    int tid = omp_get_thread_num ();
    if (tid == 1)
      {
        /* This thread cancels the cancellable region.
           It knows that the region is cancelled, so it does not bother
           calling `orphaned_barrier`.
           Since that `orphaned_barrier` got compiled to use `GOMP_barrier`
           all other threads don't notice this region is cancelled and
           continue waiting for the final thread to arrive.  */
#pragma omp cancel parallel if (1)
      }
    orphaned_barrier ();
  }
  return 0;
}
```