https://gcc.gnu.org/bugzilla/show_bug.cgi?id=125816

            Bug ID: 125816
           Summary: word-mode vectorization uses inefficient vector
                    construction
           Product: gcc
           Version: 17.0
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: rguenth at gcc dot gnu.org
            Blocks: 88670
  Target Milestone: ---

void foo (int *p, int a, int b)
{
  p[0] = a | 2;
  p[1] = b | 3;
}

is vectorized (with -mno-sse -fno-vect-cost-model) to

  _9 = {a_3(D), b_7(D)};
  _10 = VIEW_CONVERT_EXPR<unsigned long>(_9);
  _12 = _10 | 12884901890;
  _13 = VIEW_CONVERT_EXPR<vector(2) int>(_12);
  MEM <vector(2) int> [(int *)p_5(D)] = _13;

which ends up allocating the vector on the stack:

foo:
.LFB0:
        .cfi_startproc
        movq    $0, -8(%rsp)
        movl    %esi, -8(%rsp)
        movq    -8(%rsp), %rax
        movq    %rax, -16(%rsp)
        movl    %edx, -12(%rsp)
        movabsq $12884901890, %rax
        orq     -16(%rsp), %rax
        movq    %rax, (%rdi)
        ret

instead of constructing the vector in a register with a shift and OR,
i.e. (unsigned long) b << 32 | (unsigned int) a on little-endian x86-64.
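
For illustration, a minimal C sketch of the shift-and-OR construction,
checked against the scalar stores.  It assumes a little-endian target with
32-bit int, so that p[0] occupies the low half of the 64-bit word; the
names foo_scalar and foo_wordmode are hypothetical and not part of the
testcase.  This is not what the vectorizer currently emits, only the shape
of the code one would hope for.

```c
#include <stdint.h>
#include <string.h>

/* Reference: the scalar stores from the testcase.  */
void foo_scalar (int *p, int a, int b)
{
  p[0] = a | 2;
  p[1] = b | 3;
}

/* Word-mode variant: build {a, b} in a 64-bit integer register.
   Assumption: little-endian, 32-bit int, so lane 0 (a) is the low half
   and lane 1 (b) is the high half.  */
void foo_wordmode (int *p, int a, int b)
{
  /* Zero-extend a so that its sign bits do not leak into lane 1.  */
  uint64_t v = ((uint64_t) (uint32_t) b << 32) | (uint32_t) a;
  /* 12884901890 == (3 << 32) | 2, both lane constants in one word.  */
  v |= UINT64_C (12884901890);
  /* memcpy avoids aliasing issues; it should become one 64-bit store.  */
  memcpy (p, &v, sizeof v);
}
```

Both functions should store the same two ints for any a and b, without a
round trip through the stack for the vector construction.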


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88670
[Bug 88670] [meta-bug] generic vector extension issues
