https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104783

--- Comment #2 from Tom de Vries <vries at gcc dot gnu.org> ---
Hmm, the atom insn sets a register that is not used anywhere.  So the shuffle
communicating the result doesn't make much sense.

We can fix that by doing:
...
diff --git a/gcc/config/nvptx/nvptx.cc b/gcc/config/nvptx/nvptx.cc
index c6cec0c27c2..60d02c02452 100644
--- a/gcc/config/nvptx/nvptx.cc
+++ b/gcc/config/nvptx/nvptx.cc
@@ -3265,7 +3265,9 @@ static bool
 nvptx_unisimt_handle_set (rtx set, rtx_insn *insn, rtx master)
 {
   rtx reg;
-  if (GET_CODE (set) == SET && REG_P (reg = SET_DEST (set)))
+  if (GET_CODE (set) == SET
+      && REG_P (reg = SET_DEST (set))
+      && find_reg_note (insn, REG_UNUSED, reg) == NULL_RTX)
     {
       emit_insn_after (nvptx_gen_shuffle (reg, reg, master, SHUFFLE_IDX),
                       insn);
...

But that gives us a warp sync instead of a shuffle:
...
$L2:
ld.u64 %r29,[%r27];
@ %r33 atom.add.u32 %r30,[%r29],1;
bar.warp.sync 0xffffffff;
...
so the problem of the hang persists.

But if we roll back the recent change in commit 8e5c34ab45f ("[nvptx] Use
nvptx_warpsync / nvptx_uniform_warp_check for -muniform-simt"), the test-case
passes.
