https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97043
Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |wrong-code
     Ever confirmed|0                           |1
             Blocks|                            |96522
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2020-09-14

--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
This blocks backporting the fix for PR96522, causing the
gcc.dg/vect/pr81410.c testcase to FAIL execution with an unaligned
access using an aligned load.

The trunk rev. that fixed this is gbc484e250990393e887f7239157cc85ce6fadcce

A pragmatic fix might be

diff --git a/gcc/tree-vect-slp.c b/gcc/tree-vect-slp.c
index f6331eeea86..3fdf56f9335 100644
--- a/gcc/tree-vect-slp.c
+++ b/gcc/tree-vect-slp.c
@@ -2309,9 +2309,8 @@ vect_analyze_slp_instance (vec_info *vinfo,
              /* The load requires permutation when unrolling exposes
                 a gap either because the group is larger than the SLP
                 group-size or because there is a gap between the groups.  */
-             && (known_eq (unrolling_factor, 1U)
-                 || (group_size == DR_GROUP_SIZE (first_stmt_info)
-                     && DR_GROUP_GAP (first_stmt_info) == 0)))
+             && group_size == DR_GROUP_SIZE (first_stmt_info)
+             && DR_GROUP_GAP (first_stmt_info) == 0)
            {
              SLP_TREE_LOAD_PERMUTATION (load_node).release ();
              continue;

with biggest effects eventually on load-lane targets (arm/aarch64) where
we then eventually prefer more of those.  For the testcase in question
we then generate the following, matching trunk

        movdqa  (%rdx), %xmm2
        movdqa  16(%rdx), %xmm0
        shufpd  $1, 32(%rdx), %xmm0

instead of

        movdqa  (%rdx), %xmm1
        addq    $48, %rdx
        movdqu  -24(%rdx), %xmm2

(or with the backport of PR96522 a wrong movdqa in place of the movdqu).


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96522
[Bug 96522] [9/10 Regression] Incorrect with with -O -fno-tree-pta
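
For illustration only, the change to the condition in the diff above can be sketched in standalone C. This is not GCC code: the struct, its field names and the two helper functions are hypothetical stand-ins, known_eq on the unrolling factor is modeled as a plain integer comparison, and the earlier part of the condition that the hunk does not show is not modeled.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the values consulted in
   vect_analyze_slp_instance; these are not GCC data structures.  */
struct load_group
{
  unsigned group_size;        /* SLP group size */
  unsigned dr_group_size;     /* models DR_GROUP_SIZE (first_stmt_info) */
  unsigned dr_group_gap;      /* models DR_GROUP_GAP (first_stmt_info) */
  unsigned unrolling_factor;  /* models unrolling_factor as a plain integer */
};

/* Condition before the proposed patch: the load permutation is
   released whenever no unrolling happens, even if the group has a gap
   or is larger than the SLP group size.  */
static bool
drop_permutation_old (const struct load_group *g)
{
  return g->unrolling_factor == 1
	 || (g->group_size == g->dr_group_size && g->dr_group_gap == 0);
}

/* Condition after the proposed patch: the load permutation is released
   only when the SLP group covers the whole group and there is no gap,
   regardless of the unrolling factor.  */
static bool
drop_permutation_new (const struct load_group *g)
{
  return g->group_size == g->dr_group_size && g->dr_group_gap == 0;
}
```

With made-up values such as group_size 2, dr_group_size 3, dr_group_gap 1 and unrolling_factor 1, the old condition releases the permutation while the new one keeps it. These values are illustrative and are not taken from gcc.dg/vect/pr81410.c.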