[Bug target/105873] [amdgcn][OpenMP] task reductions fail with "team master not responding; slave thread aborting"

jakub at gcc dot gnu.org via Gcc-bugs Tue, 07 Jun 2022 10:32:03 -0700

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105873


--- Comment #3 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Though the first message in #c0 is different; that one comes from
static void
gomp_thread_start (struct gomp_thread_pool *pool)
{
  struct gomp_thread *thr = gomp_thread ();

  gomp_sem_init (&thr->release, 0);
  thr->thread_pool = pool;

  /* The loop exits only when "fn" is assigned "gomp_free_pool_helper",
     which contains "s_endpgm", or an infinite no-op loop is
     suspected (this happens when the thread master crashes).  */
  int nul_limit = 99;
  do
    {
      gomp_simple_barrier_wait (&pool->threads_dock);
      if (!thr->fn)
        {
          if (nul_limit-- > 0)
            continue;
          else
            {
              const char msg[] = ("team master not responding;"
                                  " slave thread aborting");
              write (2, msg, sizeof (msg)-1);
              abort();
            }
        }
      thr->fn (thr->data);
      thr->fn = NULL;

      struct gomp_task *task = thr->task;
      gomp_team_barrier_wait_final (&thr->ts.team->barrier);
      gomp_finish_task (task);
    }
  while (1);
}

I guess it can happen for very similar reasons if one uses parallel with
num_threads smaller than the actual number of threads in the hw.
The threads that participate in the work will have thr->fn non-NULL (the
outlined body of the parallel), but the other (idle) threads will have thr->fn
NULL.  I expect even those idle threads will hit the s_barrier, which also
waits for the other threads that actually do something useful, and they can do
this more than 100 times for various reasons, like spawning 100 short-lived
tasks from a busy single thread while some threads are idle, or say
#pragma omp target
for (int i = 0; i < 200; ++i)
#pragma omp parallel num_threads(2)
;
etc.  In the above case, whenever the number of created hw threads is > 2,
there will be 200 s_barrier encounters in each thread.  The first 2 threads
hit them while actually doing work (the second with thr->fn != NULL; in the
first the body is run from GOMP_parallel), but the idle threads will have
thr->fn == NULL 200 times.
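
To make the arithmetic concrete, here is a small stand-alone model of the
counter check in the loop quoted above.  It is only a sketch:
idle_abort_iteration is a hypothetical helper, not part of libgomp, and it
assumes that an idle thread sees thr->fn == NULL on every wake-up from the
dock barrier and that nul_limit is never reset, as in the quoted code.

```c
#include <assert.h>

/* Hypothetical model, not libgomp code.  Returns the 1-based number of the
   wake-up at which an idle thread would reach the abort path, or 0 if it
   survives all IDLE_WAKEUPS wake-ups.  Assumes thr->fn is NULL every time.  */
int
idle_abort_iteration (int nul_limit, int idle_wakeups)
{
  assert (nul_limit >= 0 && idle_wakeups >= 0);

  for (int i = 1; i <= idle_wakeups; ++i)
    {
      /* Same test as in gomp_thread_start for the thr->fn == NULL case.  */
      if (nul_limit-- > 0)
        continue;

      /* The real code writes the "team master not responding" message
         and calls abort here.  */
      return i;
    }

  return 0;
}
```

With nul_limit = 99 as in the quoted source, idle_abort_iteration (99, 200)
returns 100, so under these assumptions an idle thread in the 200-iteration
example would abort halfway through, while idle_abort_iteration (99, 99)
returns 0.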
