[Bug tree-optimization/53090] suboptimal ivopt

amker at gcc dot gnu.org Tue, 08 Aug 2017 09:22:21 -0700

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53090


amker at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|---                         |FIXED

--- Comment #10 from amker at gcc dot gnu.org ---
Hmm, it's not mentioned at which optimization level the original bug was
reported.  I suspect O2, because a vect_perm instruction is needed after
vectorization.  So the current status is:
After the ivopt rewrite, we generate the 8-instruction loop below at O2:
.L14:
        movl    (%r14,%rax,4), %ecx
        movl    (%r14,%rdx,4), %esi
        movl    %esi, (%r14,%rax,4)
        movl    %ecx, (%r14,%rdx,4)
        addq    $1, %rax
        subq    $1, %rdx
        cmpl    %eax, %edx
        jg      .L14

It's better than what was reported.
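
For readers without the original test case at hand, here is a minimal sketch
of a source loop that would be consistent with the O2 assembly above: two
indices walking toward each other and swapping 4-byte elements.  The function
name, signature and variable names are assumptions made for illustration;
this is not the test case from the report.

```c
/* Hypothetical reconstruction, not the reporter's test case:
 * reverse an array of ints in place by swapping from both ends. */
void reverse_ints(int *a, int n)
{
    int i = 0;      /* ascending index, plays the role of %rax  */
    int j = n - 1;  /* descending index, plays the role of %rdx */

    while (j > i) {     /* corresponds to cmpl %eax, %edx; jg */
        int lo = a[i];  /* movl (%r14,%rax,4), %ecx */
        int hi = a[j];  /* movl (%r14,%rdx,4), %esi */
        a[i] = hi;      /* movl %esi, (%r14,%rax,4) */
        a[j] = lo;      /* movl %ecx, (%r14,%rdx,4) */
        i++;            /* addq $1, %rax */
        j--;            /* subq $1, %rdx */
    }
}
```

Something like "gcc -O2 -S" on such a file should let one compare the
generated loop with the listing above, though register allocation and labels
will likely differ between compiler versions.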

At O3:
.L14:
        movdqu  (%rsi,%rdx), %xmm2
        movdqa  (%r12,%rax), %xmm0
        pshufd  $27, %xmm2, %xmm1
        pshufd  $27, %xmm0, %xmm0
        movaps  %xmm1, (%r12,%rax)
        addq    $16, %rax
        movups  %xmm0, (%rsi,%rdx)
        subq    $16, %rdx
        cmpq    %rax, %rdi
        jne     .L14

Consider this fixed.
