https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63185

--- Comment #6 from Marc Glisse <glisse at gcc dot gnu.org> ---
In addition to the issues already described, we seem to generate better code
if I replace the VLAs with calls to alloca. We assume that alloca returns
16-byte-aligned memory, whereas the VLAs here become
__builtin_alloca_with_align(..., 64), whose alignment argument is in bits,
i.e. only 8 bytes. We don't seem to have code that turns this into
__builtin_alloca_with_align(..., 128), which would let us avoid all the loop
adjustment code.
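
For illustration, here is a minimal sketch of the three allocation forms being
compared. The function names and the sum-of-squares loop are made up for this
example and are not the testcase from this PR; the builtins are the GCC ones
named above. Whether the loop adjustment code actually goes away depends on the
GCC version, target and flags, so compare the generated assembly (e.g. with
-O3 -S) rather than taking the comments as a guarantee.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: fill buf with squares and return their sum. */
static inline int fill_and_sum(int *buf, size_t n)
{
    int sum = 0;
    for (size_t i = 0; i < n; i++)
        buf[i] = (int)(i * i);
    for (size_t i = 0; i < n; i++)
        sum += buf[i];
    return sum;
}

/* Form 1: a VLA. GCC lowers this to __builtin_alloca_with_align with the
   alignment of the declared array type (n must be non-zero). */
int sum_squares_vla(size_t n)
{
    int buf[n];
    return fill_and_sum(buf, n);
}

/* Form 2: plain alloca. The comment above says GCC assumes the result is
   16-byte aligned. */
int sum_squares_alloca(size_t n)
{
    int *buf = __builtin_alloca(n * sizeof *buf);
    return fill_and_sum(buf, n);
}

/* Form 3: explicit request for 128 bits (16 bytes). The second argument
   is in bits and must be a constant. */
int sum_squares_aligned(size_t n)
{
    int *buf = __builtin_alloca_with_align(n * sizeof *buf, 128);
    return fill_and_sum(buf, n);
}

/* Returns 1 if the explicitly aligned allocation is 16-byte aligned. */
int aligned_alloca_is_16(size_t n)
{
    void *p = __builtin_alloca_with_align(n, 128);
    return ((uintptr_t)p % 16) == 0;
}
```

All three functions compute the same value; only the alignment the compiler
may assume for buf differs.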
