On 9/1/20 1:41 PM, Tobias Burnus wrote:
> Hi Tom, hello all,
>
> it turned out that the testcase fails on PowerPC (but not x86_64)
> as the nvptx lto complains: unresolved symbol
> __sync_val_compare_and_swap_16
>
> The testcase uses int128 – and that's the culprit, but I have no idea
> why it only fails with PowerPC and not with x86-64.
>
Well, I'm guessing the explanation is here in omp-expand.c:
...
/* Expand an GIMPLE_OMP_ATOMIC statement. We try to expand
using expand_omp_atomic_fetch_op. If it failed, we try to
call expand_omp_atomic_pipeline, and if it fails too, the
ultimate fallback is wrapping the operation in a mutex
(expand_omp_atomic_mutex). REGION is the atomic region built
by build_omp_regions_1(). */
static void
expand_omp_atomic (struct omp_region *region)
...
In the x86_64 case, when doing:
...
$ gcc-11 src/libgomp/testsuite/libgomp.c-c++-common/reduction-16.c \
    -fdump-tree-all-details -fopenmp
...
we get:
...
<bb 33> :
D.3382 = .omp_data_i->res;
GOMP_atomic_start ();
D.3383 = MEM[(__int128 * {ref-all})D.3382];
<bb 34> :
D.3384 = (_Bool) D.3383;
if (D.3384 != 0)
goto <bb 35>; [INV]
else
goto <bb 37>; [INV]
<bb 38> :
MEM[(__int128 * {ref-all})D.3382] = iftmp.80;
GOMP_atomic_end ();
...
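which means we're triggering the "expand_omp_atomic_mutex" case for x86_64.

Apparently we're triggering the "expand_omp_atomic_pipeline" case for
powerpc, which expands to a compare-and-swap loop and would explain the
reference to __sync_val_compare_and_swap_16.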
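
To illustrate the difference between the two fallbacks named above, here
is a rough source-level sketch of their shapes. It is hand-written, not
compiler output. The function names are made up, the update is a logical
OR (guessed from the dump), a 64-bit type is used so that the
compare-and-swap builtin is available inline on common hosts, and a
hypothetical spinlock stands in for GOMP_atomic_start/GOMP_atomic_end:

```c
#include <stdatomic.h>

/* Hypothetical stand-ins for GOMP_atomic_start/GOMP_atomic_end:
   a single global spinlock, so the sketch needs no libgomp.  */
static atomic_flag sketch_lock = ATOMIC_FLAG_INIT;

static void
sketch_atomic_start (void)
{
  while (atomic_flag_test_and_set_explicit (&sketch_lock,
                                            memory_order_acquire))
    ;  /* Spin until the lock is free.  */
}

static void
sketch_atomic_end (void)
{
  atomic_flag_clear_explicit (&sketch_lock, memory_order_release);
}

/* Mutex shape: load, compute and store while holding the lock.  */
void
or_update_mutex (long long *addr, long long val)
{
  sketch_atomic_start ();
  *addr = (*addr || val);
  sketch_atomic_end ();
}

/* Pipeline shape: compute the new value from a snapshot and retry
   until the compare-and-swap sees that the snapshot is still current.  */
void
or_update_cas_loop (long long *addr, long long val)
{
  long long oldval = *addr;
  for (;;)
    {
      long long newval = (oldval || val);
      long long seen = __sync_val_compare_and_swap (addr, oldval, newval);
      if (seen == oldval)
        break;          /* The swap happened.  */
      oldval = seen;    /* Someone else changed *addr; try again.  */
    }
}
```

With a 16-byte type, the second shape is the one that needs a 16-byte
compare-and-swap from the target.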
> Unless someone sees a good way to implement __sync_val_compare_and_swap_16,
Hmm, one could implement it in the compiler using calls to
GOMP_atomic_start/GOMP_atomic_end, but it feels somewhat hacky.
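
As a rough sketch of what such a lock-based emulation could look like at
the source level (the name cas16_with_lock and the spinlock helpers are
hypothetical; a real version would presumably call
GOMP_atomic_start/GOMP_atomic_end instead of the stand-in lock):

```c
#include <stdatomic.h>

typedef unsigned __int128 u128;

/* Hypothetical stand-in for the lock behind
   GOMP_atomic_start/GOMP_atomic_end.  */
static atomic_flag cas16_lock = ATOMIC_FLAG_INIT;

static void
cas16_lock_acquire (void)
{
  while (atomic_flag_test_and_set_explicit (&cas16_lock,
                                            memory_order_acquire))
    ;  /* Spin until the lock is free.  */
}

static void
cas16_lock_release (void)
{
  atomic_flag_clear_explicit (&cas16_lock, memory_order_release);
}

/* Same contract as __sync_val_compare_and_swap: return the value *PTR
   held before the call, and store NEWVAL only if that value equals
   OLDVAL.  */
u128
cas16_with_lock (volatile u128 *ptr, u128 oldval, u128 newval)
{
  cas16_lock_acquire ();
  u128 seen = *ptr;
  if (seen == oldval)
    *ptr = newval;
  cas16_lock_release ();
  return seen;
}
```

Note that this is only atomic with respect to other accesses that go
through the same lock.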
Thanks,
- Tom