https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95654

Tom de Vries <vries at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |rguenth at gcc dot gnu.org

--- Comment #11 from Tom de Vries <vries at gcc dot gnu.org> ---
So, at this point we know that duplicating the BB containing VOTE_ANY causes
problems in executing. But AFAIU, we do not know why.

Is VOTE_ANY not supposed to be duplicated by design? If so, is there any
documentation of that design, that explains that?

At the nvptx level, VOTE_ANY translates to vote.ballot.b32, which does
cross-lane communication, but has defined behaviour in divergent mode AFAICT.
From that perspective at least, there's no problem with duplicating VOTE_ANY.

My guess at this point, is that duplicating the block with VOTE_ANY has the
effect that the JIT compiler doesn't recognize control flow divergence before
XCHG_IDX, and fails to insert the proper barrier. And XCHG_IDX translates to
shfl.idx.b32, which has undefined behaviour in divergent mode.
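
To make the contrast between the two instructions concrete, here is a toy
host-side model in C++. It is only a sketch under assumptions: the names
(Warp, ballot, shfl_idx, kWarpSize) are hypothetical, this is not GCC or
nvptx code, and it does not model what the JIT actually does. It assumes that
a ballot only collects predicates from the currently active lanes, and that a
shuffle reading from an inactive lane has no defined result (modelled here as
an empty optional).

```cpp
#include <array>
#include <cstdint>
#include <optional>

// Hypothetical toy model of a 32-lane warp; not real nvptx or GCC code.
constexpr int kWarpSize = 32;

struct Warp {
    std::uint32_t active_mask;                   // bit i set: lane i is active
    std::array<std::uint32_t, kWarpSize> regs;   // one register value per lane
};

inline bool lane_active(const Warp& w, int lane) {
    return ((w.active_mask >> lane) & 1u) != 0;
}

// Ballot-like operation: bit i of the result is set iff lane i is active and
// its predicate is true. Inactive lanes simply contribute 0, so the result is
// well defined for any active mask (the assumption made in this sketch).
inline std::uint32_t ballot(const Warp& w,
                            const std::array<bool, kWarpSize>& pred) {
    std::uint32_t result = 0;
    for (int lane = 0; lane < kWarpSize; ++lane) {
        if (lane_active(w, lane) && pred[lane]) {
            result |= (1u << lane);
        }
    }
    return result;
}

// Shuffle-by-index-like operation: read the register of src_lane. If that
// lane is not active, this model returns no value, standing in for
// "undefined" in divergent mode.
inline std::optional<std::uint32_t> shfl_idx(const Warp& w, int src_lane) {
    if (!lane_active(w, src_lane)) {
        return std::nullopt;
    }
    return w.regs[src_lane];
}
```

In this model, with only the lower 16 lanes active, ballot still yields a
well-defined mask, while shfl_idx from lane 20 yields nothing. That is the
asymmetry the guess above relies on: a missing reconvergence point before
XCHG_IDX would matter, while one before VOTE_ANY would not.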