https://gcc.gnu.org/bugzilla/show_bug.cgi?id=124362

            Bug ID: 124362
           Summary: [16 Regression][OpenMP with GCN offload] Wrong-code
                    since r16-7825-gf23a339a686ed6 (Backpropagate more
                    equivalences in DOM)
           Product: gcc
           Version: 16.0
            Status: UNCONFIRMED
          Keywords: openmp, wrong-code
          Severity: normal
          Priority: P3
         Component: c++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: burnus at gcc dot gnu.org
                CC: ams at gcc dot gnu.org, law at gcc dot gnu.org
  Target Milestone: ---
            Target: gcn

This is about OpenMP offloading to an AMD GCN GPU (here: gfx90a/MI210),
using the omptests test program t-dpf,
https://github.com/doru1004/omptests/blob/main/t-dpf/test.cpp

Before commit r16-7825-gf23a339a686ed6, the testcase is successful.
After the commit, running the program fails with:

$ .../g++ -fopenmp -O3 t-dpf/test.cpp -foffload=amdgcn-amdhsa
$ LD_LIBRARY_PATH=... ./a.out

no schedule clauses
GCN Kernel Aborted


The change from success (17× "Succeeded") to 'GCN Kernel Aborted' appears to
have been introduced by


commit f23a339a686ed6cc6a4838459bc220e48ba901cb
Author: Jeff Law
Date:   Sat Feb 28 08:54:23 2026 -0700

    [PR tree-optimization/90036] Backpropagate more equivalences in DOM
...
            PR tree-optimization/90036
    gcc/
            * tree-ssa-dom.cc (back_propagate_equivalences): Accept new
            argument for available expression stack.  Lookup equivalences
            in the expression hash table too.  If an expression hash a
            known constant value, record it and recurse.
            (record_temporary_equivalences): Back propagate for a subset of
            edge equivalences.

* * *

Disclaimer: The problem seems to be some kind of race or uninitialized variable
or ..., because the point at which it fails varies from run to run.

- In the output above, it failed right away for the first subtest.
- Running it again, it succeeded for the first six tests and only failed for
the 7th test, namely:

no schedule clauses
Succeeded
schedule static no chunk
Succeeded
schedule static chunk
Succeeded
schedule dynamic no chunk
Succeeded
schedule dynamic chunk
Succeeded
dist_schedule static no chunk
Succeeded
dist_schedule static chunk
GCN Kernel Aborted
team master not responding; slave thread abortingteam master not responding;
slave thread abortingteam master not responding; slave thread abortingteam
master not responding; slave thread abortingteam master not responding; slave
thread abortingteam master not responding; slave thread abortingteam master not
responding; slave thread abortingteam master not responding; slave thread
abortingteam master not responding; slave thread abortingteam master not
responding; slave thread abortingteam master not responding; slave thread
abortingteam master not responding; slave thread abortingteam master not
responding; slave thread abortingteam master not responding; slave thread
abortingGCN Kernel Aborted
GCN Kernel Aborted
GCN Kernel Aborted
GCN Kernel Aborted


- Trying it again, it was successful for 16 subtests and only failed for the
17th.
- And running it again: 3× successful and then the 'GCN Kernel Aborted'.

* * *

The intermittent nature of the problem makes it harder to reduce the testcase.
I tried commenting out some chunks of the code and running it in a loop, but
that often made the whole program pass. :-/
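
When judging whether a reduced variant still shows the failure, it may help to
run the binary many times and tally which subtest aborts first. The following
is only a minimal sketch: the helper names first_failure and tally_runs, the
default binary ./a.out, and the default of 20 runs are assumptions for
illustration, and it relies on the output format shown above (subtest name on
one line, result on the next).

```shell
#!/bin/sh
# Hypothetical helper: read one run's output on stdin and print the name of
# the first subtest that ends in 'GCN Kernel Aborted', or "none" if no abort
# is seen. Assumes each subtest prints its name, then its result line.
first_failure() {
    awk '
        /^Succeeded$/        { prev = ""; next }
        /GCN Kernel Aborted/ { print (prev == "" ? "unknown" : prev)
                               found = 1; exit }
                             { prev = $0 }
        END                  { if (!found) print "none" }
    '
}

# Hypothetical driver: run the binary repeatedly and count how often each
# subtest was the first to fail. Arguments: binary (default ./a.out) and
# number of runs (default 20); both defaults are assumptions.
tally_runs() {
    bin=${1:-./a.out}
    runs=${2:-20}
    i=0
    while [ "$i" -lt "$runs" ]; do
        "$bin" 2>&1 | first_failure
        i=$((i + 1))
    done | sort | uniq -c | sort -rn
}

# Example invocation (not executed here): tally_runs ./a.out 50
```

LD_LIBRARY_PATH has to be set as for the manual run above. A variant that
reports only "none" over a large number of runs has probably lost the failure,
although with a problem this timing-dependent that is not a proof.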
