https://gcc.gnu.org/bugzilla/show_bug.cgi?id=125816
Bug ID: 125816
Summary: word-mode vectorization uses inefficient vector
construction
Product: gcc
Version: 17.0
Status: UNCONFIRMED
Keywords: missed-optimization
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: rguenth at gcc dot gnu.org
Blocks: 88670
Target Milestone: ---
void foo (int *p, int a, int b)
{
  p[0] = a | 2;
  p[1] = b | 3;
}
is vectorized (with -mno-sse -fno-vect-cost-model) to
  _9 = {a_3(D), b_7(D)};
  _10 = VIEW_CONVERT_EXPR<unsigned long>(_9);
  _12 = _10 | 12884901890;
  _13 = VIEW_CONVERT_EXPR<vector(2) int>(_12);
  MEM <vector(2) int> [(int *)p_5(D)] = _13;
which ends up allocating the vector on the stack:
foo:
.LFB0:
	.cfi_startproc
	movq	$0, -8(%rsp)
	movl	%esi, -8(%rsp)
	movq	-8(%rsp), %rax
	movq	%rax, -16(%rsp)
	movl	%edx, -12(%rsp)
	movabsq	$12884901890, %rax
	orq	-16(%rsp), %rax
	movq	%rax, (%rdi)
	ret
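instead of using a shift and OR on the zero-extended operands for the
vector construction, i.e. a | b << 32 (element 0 is the low half on
little-endian x86-64, matching the constant 12884901890 == 3 << 32 | 2).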
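
For illustration, here is a hand-written C sketch of the word-mode code one
might hope for. It is not what GCC emits or is expected to emit verbatim.
The names foo_ref and foo_word are hypothetical, and the sketch assumes a
little-endian target with 32-bit int, so that p[0] occupies the low 32 bits
of the 64-bit word.

```c
#include <stdint.h>
#include <string.h>

/* Reference: the scalar stores from the testcase above.  */
void foo_ref (int *p, int a, int b)
{
  p[0] = a | 2;
  p[1] = b | 3;
}

/* Hypothetical word-mode variant: build the two-element vector in a
   general-purpose register with zero-extension, shift and OR, then do
   a single 64-bit OR and a single 64-bit store.
   Assumes little-endian layout (element 0 in the low half).  */
void foo_word (int *p, int a, int b)
{
  uint64_t w = (uint64_t) (uint32_t) a | ((uint64_t) (uint32_t) b << 32);
  w |= UINT64_C (12884901890);   /* 3 << 32 | 2 */
  memcpy (p, &w, sizeof w);      /* one 8-byte store, no aliasing issues */
}
```

Both functions should store the same two ints on such a target; the
zero-extension of a matters, since a negative a would otherwise sign-extend
into the half that belongs to b.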
Referenced Bugs:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88670
[Bug 88670] [meta-bug] generic vector extension issues