https://gcc.gnu.org/bugzilla/show_bug.cgi?id=122219
Andrew Pinski <pinskia at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Target|aarch64 |
Version|16.0 |unknown
--- Comment #3 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Actually, it is because in this case it is a partial write and not a full write
of the v256_t.
So my patch does not fix it.
Are you sure you didn't reduce this too far? The code as reduced has
undefined behavior in it.
Though initializing r to 0 still does not enable the optimization.
The issue is how LIM (loop invariant motion) deals with the clobber.
-fstack-reuse=none "fixes" it.
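
To make the full-write versus partial-write distinction concrete, here is a
minimal sketch. It is NOT the testcase from this report: the layout of v256_t
and the names sum_full and sum_partial are assumptions made up for
illustration. Both functions are well defined (r is fully initialized before
any read), and both declare r inside the loop body, so its scope ends on every
iteration, which is where GCC may place an end-of-scope clobber.

```c
#include <stdint.h>

/* Hypothetical 256-bit type; the real v256_t from the report is not shown. */
typedef struct { uint64_t u64[4]; } v256_t;

/* Full write: every lane of r is stored explicitly on each iteration. */
uint64_t sum_full(const uint64_t *src, int n)
{
    uint64_t acc = 0;
    for (int i = 0; i < n; i++) {
        v256_t r;            /* scope ends each iteration; GCC may clobber r here */
        r.u64[0] = src[i];
        r.u64[1] = 0;
        r.u64[2] = 0;
        r.u64[3] = 0;
        acc += r.u64[0] + r.u64[3];
    }
    return acc;
}

/* Partial write: r is zero-initialized, then only one lane is overwritten. */
uint64_t sum_partial(const uint64_t *src, int n)
{
    uint64_t acc = 0;
    for (int i = 0; i < n; i++) {
        v256_t r = {{0, 0, 0, 0}};
        r.u64[0] = src[i];   /* partial store into r */
        acc += r.u64[0] + r.u64[3];
    }
    return acc;
}
```

One way to check whether the clobber matters for a given testcase is to
compile it twice, once with -O2 and once with -O2 -fstack-reuse=none, and
compare the generated code for the loop.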