https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95654

Tom de Vries <vries at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |rguenth at gcc dot gnu.org

--- Comment #11 from Tom de Vries <vries at gcc dot gnu.org> ---
So, at this point we know that duplicating the BB containing VOTE_ANY causes
problems at execution time.  But AFAIU, we do not know why.

Is VOTE_ANY not supposed to be duplicated by design? If so, is there any
documentation that explains that design?

At the nvptx level, VOTE_ANY translates to vote.ballot.b32, which does
cross-lane communication, but has defined behaviour in divergent mode AFAICT.
From that perspective at least, there's no problem with duplicating VOTE_ANY.

My guess at this point is that duplicating the block with VOTE_ANY has the
effect that the JIT compiler doesn't recognize the control flow divergence
before XCHG_IDX, and fails to insert the proper barrier.

And XCHG_IDX translates to shfl.idx.b32, which has undefined behaviour in
divergent mode.
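
To make the contrast between the two instructions concrete, here is a toy
host-side model in C. It is only a sketch under stated assumptions: it is not
taken from GCC or from the PTX toolchain, the names model_ballot and
model_shfl_idx are hypothetical, and the active mask is passed in explicitly
rather than coming from hardware. It assumes that an inactive lane contributes
0 to a ballot, and that a shuffle reading from an inactive lane is the
undefined case, which the model merely flags instead of returning a value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define WARP_SIZE 32

/* Toy model of vote.ballot.b32 (assumption: inactive lanes contribute 0).
   Bit i of the result is set iff lane i is active and its predicate holds,
   so the result is well defined for any active mask.  */
uint32_t model_ballot(uint32_t active_mask, const bool pred[WARP_SIZE])
{
    uint32_t result = 0;
    for (unsigned lane = 0; lane < WARP_SIZE; lane++)
        if (((active_mask >> lane) & 1u) && pred[lane])
            result |= UINT32_C(1) << lane;
    return result;
}

/* Toy model of shfl.idx.b32 as seen by one active lane.
   Returns true and stores the source lane's value in *out when the source
   lane is active.  Returns false, leaving *out untouched, when the source
   lane is inactive; that is the case modelled here as "undefined".  */
bool model_shfl_idx(uint32_t active_mask, const int32_t val[WARP_SIZE],
                    unsigned src_lane, int32_t *out)
{
    assert(src_lane < WARP_SIZE);
    if (!((active_mask >> src_lane) & 1u))
        return false;
    *out = val[src_lane];
    return true;
}
```

In this model, a ballot taken with only the lower 16 lanes active still yields
a defined mask, while a shuffle from lane 20 under the same active mask is
flagged as having no defined result. That matches the guess above only in
spirit; it says nothing about what the JIT compiler actually does.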
