https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115711
anlauf at gcc dot gnu.org changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |anlauf at gcc dot gnu.org
--- Comment #1 from anlauf at gcc dot gnu.org ---
Interesting.
At -Ofast on Skylake, the optimized tree for the second variant contains:

  A.20_11 = __builtin_alloca_with_align (32, 64);

and I do not see a call to malloc in the assembler output. At -O3 -mavx2,
however, it contains:

  _9 = __builtin_malloc (32);

and that call is visible in the assembler output.
Similar findings apply to

  subroutine foo3(a,b,n)
    integer, intent(in) :: n
    complex(kind(1d0)) :: a(n)
    real(kind(1d0))    :: b(2*n)
    b = transfer(a,b)
  end

So the questions are: which of the options implied by -Ofast enables the use
of alloca, and what can be done to merge the memcpy? Is the obstacle potential
aliasing/overlap?
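
For anyone who wants to reproduce the comparison, here is a minimal sketch of a
driver around foo3 from above. The program name, the file name repro.f90 and
the test values are hypothetical and not part of the report. The gfortran
manual documents -fstack-arrays as one of the options enabled at -Ofast, so it
looks like a plausible candidate for the alloca difference; that it is the
cause here is an assumption to be checked against the tree dumps, not a
confirmed result.

```fortran
! Hypothetical reproducer; compile it twice and compare the *.optimized dumps:
!   gfortran -O3 -mavx2 -fdump-tree-optimized repro.f90
!   gfortran -O3 -mavx2 -fstack-arrays -fdump-tree-optimized repro.f90
program transfer_check
  implicit none
  integer, parameter :: dp = kind(1d0)
  integer, parameter :: n = 2
  complex(dp) :: a(n)
  real(dp)    :: b(2*n)

  a = [cmplx(1d0, 2d0, kind=dp), cmplx(3d0, 4d0, kind=dp)]
  call foo3(a, b, n)

  ! A complex value is stored as (real part, imaginary part), so the
  ! bitwise copy done by transfer should give 1, 2, 3, 4.
  print *, b
  if (any(b /= [1d0, 2d0, 3d0, 4d0])) error stop "unexpected layout"
end program transfer_check

! External subroutine, kept as in the report so that n stays a run-time
! value inside foo3 and the temporary has unknown size.
subroutine foo3(a,b,n)
  integer, intent(in) :: n
  complex(kind(1d0)) :: a(n)
  real(kind(1d0))    :: b(2*n)
  b = transfer(a,b)
end
```

If the second command replaces the __builtin_malloc call in foo3 with
__builtin_alloca_with_align, that would answer the first question; the memcpy
question is independent of it.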