https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107096

--- Comment #4 from Andrew Stubbs <ams at gcc dot gnu.org> ---
I don't understand rgroups, but I can say that GCN masks are very simply
one-bit-one-lane. There are always 64 lanes, regardless of the type, so
V64QImode has fewer bytes and bits than V64DImode (when written to memory).
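
As a rough illustration of that size relationship (a sketch only, not code
from the GCN backend; the helper names vector_bytes and mask_bits are
hypothetical), the lane count stays fixed while the in-memory size scales
with the element width, and the mask width does not change at all:

```c
#include <assert.h>
#include <stddef.h>

/* Fixed lane count, as stated in the comment above.  */
enum { GCN_LANES = 64 };

/* Hypothetical helper: bytes occupied in memory by a 64-lane vector
   whose elements are ELEM_BYTES wide (1 for QImode, 8 for DImode).  */
static size_t
vector_bytes (size_t elem_bytes)
{
  assert (elem_bytes > 0);
  return GCN_LANES * elem_bytes;
}

/* Hypothetical helper: one mask bit per lane, so the mask width does
   not depend on the element type.  */
static size_t
mask_bits (size_t elem_bytes)
{
  (void) elem_bytes;  /* deliberately unused */
  return GCN_LANES;
}
```

Under these assumptions vector_bytes (1) gives 64 for V64QImode and
vector_bytes (8) gives 512 for V64DImode, while mask_bits is 64 in both cases.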

This is different to most other architectures where the bit-size remains the
same and the number of lanes varies with the inner type, and has caused us some
issues with invalid assumptions in GCC (e.g. "there's no need for sign-extends
in vector registers" is not true for GCN).

However, I think it's the same as you're describing for AVX512, at least in
this respect.

Incidentally I'm on the cusp of adding multiple "virtual" vector sizes in the
GCN backend (in lieu of implementing full mask support everywhere in the
middle-end and fixing all the cost assumptions), so these VIEW_CONVERT_EXPR
issues are getting worse. I have a bunch of vec_extract patterns that fix up
some of it. Within the backend, the V32, V16, V8, V4 and V2 vectors are all
really just 64-lane vectors with the mask preset, so the mask has to remain
DImode or register allocation becomes tricky.
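
A minimal sketch of the "64-lane vector with the mask preset" idea follows.
It is an illustration under stated assumptions, not the backend's
implementation: the names preset_mask and masked_add are hypothetical, and it
assumes that the active lanes of a narrower vector are the lowest-numbered
ones, which the comment above does not state.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: a 64-bit (DImode-sized) mask that enables only
   the low LANES lanes of a 64-lane vector.  Assumes the active lanes
   are the lowest-numbered ones.  */
static uint64_t
preset_mask (unsigned lanes)
{
  assert (lanes >= 1 && lanes <= 64);
  /* Shifting a 64-bit value by 64 is undefined, so handle it apart.  */
  return lanes == 64 ? UINT64_MAX : (UINT64_C (1) << lanes) - 1;
}

/* Hypothetical scalar model of a masked 64-lane add: lanes whose mask
   bit is clear keep their previous value in DST.  */
static void
masked_add (uint32_t *dst, const uint32_t *a, const uint32_t *b,
            uint64_t mask)
{
  for (unsigned lane = 0; lane < 64; lane++)
    if ((mask >> lane) & 1)
      dst[lane] = a[lane] + b[lane];
}
```

In this model a V2 operation is masked_add called with preset_mask (2) and a
V64 operation uses preset_mask (64); the mask keeps the same 64-bit width in
every case, which matches the point about it remaining DImode.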
