https://gcc.gnu.org/bugzilla/show_bug.cgi?id=125309

            Bug ID: 125309
           Summary: When OpenMP barrier used in `noinline` function loses
                    cancellable context
           Product: gcc
           Version: 16.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: middle-end
          Assignee: unassigned at gcc dot gnu.org
          Reporter: matmal01 at gcc dot gnu.org
  Target Milestone: ---

Created attachment 64458
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=64458&action=edit
Reproducer for bug

Running the program generated from the attached code hangs.

It hangs because the barrier in the `orphaned_barrier` function gets lowered to
`GOMP_barrier` instead of `GOMP_barrier_cancel`.

I'm a little doubtful that this is valid input.  What is the enclosing parallel
region in a standalone function?  However, I see a testcase in the testsuite
that looks similar to this (libgomp.c/omp-loop03.c), so I figure it's at least
worth asking the question.


Since the compiler lowers the barrier in `orphaned_barrier` to a call to
`GOMP_barrier`, all threads which wait on that barrier ignore any
`BAR_CANCELLED` flag that gets set.

The thread which cancels the region sets this `BAR_CANCELLED` flag, and then
skips the call to `orphaned_barrier` because it knows the region is cancelled.

Hence all threads waiting in the barrier wait indefinitely for the last thread
to arrive.

Compiled & ran with the following:
```
vshcmd: > ~/repos/gcc-dir/gcc-install/bin/gcc -fopenmp temp-repro.c -o temp-repro
vshcmd: > LD_LIBRARY_PATH=/home/mmalcomson/repos/gcc-dir/gcc-install/lib64 \
vshcmd: > OMP_CANCELLATION=true \
vshcmd: >   ./temp-repro
```

Inline testcase to avoid the need for download:
```
#include <omp.h>
#include <stdlib.h>

/* This intentionally uses an orphaned helper barrier.  Current lowering emits
   plain GOMP_barrier for that helper even when it is called from inside a
   cancellable parallel region.  tid == 1 just chooses one stable canceller.
   */

__attribute__((noinline))
static void
orphaned_barrier (void)
{
    /* This barrier gets lowered to a call to `GOMP_barrier`.
       Since it's used inside a cancellable region the generation can have
       `BAR_CANCELLED` set on it.  The thread that cancels the region will not
       bother entering the barrier and hence all threads waiting on this
       barrier will never progress.  */
    #pragma omp barrier
}

int
main (void)
{
    omp_set_dynamic (0);
    omp_set_num_threads (4);

    #pragma omp parallel
    {
        int tid = omp_get_thread_num ();
        if (tid == 1)
        {
            /* This thread cancels the cancellable region.
               It knows that the region is cancelled, so it does not bother
               calling `orphaned_barrier`.
               Since that `orphaned_barrier` got compiled to use `GOMP_barrier`
               all other threads don't notice this region is cancelled and
               continue waiting for the final thread to arrive.  */
            #pragma omp cancel parallel if (1)
        }
        orphaned_barrier ();
    }

    return 0;
}
```
