https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104112

--- Comment #8 from Richard Biener <rguenth at gcc dot gnu.org> ---
Oh, and we're also not verifying

      /* The target has to make sure we support lowpart/highpart
         extraction, either via direct vector extract or through
         an integer mode punning.  */

and an alternative would be to do the reduction in the wider mode
and only do a final lowpart extraction, but that would require
the support of some intermediate const permutes so we get

  { 0, 1, 2, 3, 4, 5, 6, 7 }
+ { 4, 5, 6, 7, /* dont-care */ }
+ { 2+6, 3+7, /* dont-care */ }

basically whole-vector shifts by half and a quarter of the vector
for one missing intermediate mode and then the appropriate lowpart
of the final vector mode.
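
To illustrate the scheme above, here is a minimal scalar sketch in C. It
is not GCC code: the lane count, the element type and the names
shift_down, add_lanes and reduce_wide_then_lowpart are all hypothetical.
It assumes an 8-lane wide mode, a 2-lane final mode, and a missing
4-lane intermediate mode. The don't-care lanes are filled with zero only
to keep the emulation deterministic.

```c
#include <assert.h>
#include <string.h>

enum { LANES = 8 };  /* hypothetical lane count of the wide vector mode */

/* Emulate a whole-vector shift towards lane 0 by BY lanes.  The lanes
   shifted in at the top are don't-care; they are zeroed here.  */
static void shift_down(const int in[LANES], int out[LANES], int by)
{
    assert(by >= 0 && by <= LANES);
    memset(out, 0, LANES * sizeof(int));
    memcpy(out, in + by, (LANES - by) * sizeof(int));
}

/* Lane-wise add in the wide mode: acc[i] += b[i].  */
static void add_lanes(int acc[LANES], const int b[LANES])
{
    for (int i = 0; i < LANES; i++)
        acc[i] += b[i];
}

/* Reduce 8 lanes to 2 partial sums without leaving the wide mode,
   then take the lowpart (the first LANES / 4 lanes) once at the end.  */
void reduce_wide_then_lowpart(const int v[LANES], int out[LANES / 4])
{
    int acc[LANES], tmp[LANES];

    memcpy(acc, v, sizeof acc);

    shift_down(acc, tmp, LANES / 2);  /* shift by half the vector */
    add_lanes(acc, tmp);              /* lanes 0..3 hold v[i] + v[i+4] */

    shift_down(acc, tmp, LANES / 4);  /* shift by a quarter of the vector */
    add_lanes(acc, tmp);              /* lanes 0..1 hold the partial sums */

    memcpy(out, acc, (LANES / 4) * sizeof(int));  /* final lowpart */
}
```

Only lanes 0 and 1 of the accumulator are meaningful after the second
add; the remaining lanes hold don't-care values and are dropped by the
final lowpart extraction.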

Code-gen wise, with the proposed patch we get no accumulator re-use,
while with the above scheme we might be able to re-use it (not sure
if SVE is capable of that or whether that would be profitable).

Without -msve-vector-bits=512 we simply get variable length vector code.
There doesn't seem to be -msve-vector-bits=512,256 or so to enable
both lengths (the compiler could set a static mask to "emulate" 256
with fixed 512 vectors?)
