Hi Richard,

Thanks for the review!

The 07/23/2018 18:46, Richard Biener wrote:
> On July 23, 2018 7:01:23 PM GMT+02:00, Tamar Christina 
> <tamar.christ...@arm.com> wrote:
> >Hi All,
> >
> >This allows copy_blkmode_to_reg to perform larger copies when it is
> >safe to do so by calculating
> >the bitsize per iteration doing the maximum copy allowed that does not
> >read more
> >than the amount of bits left to copy.
> >
> >Strictly speaking, this copying is only done if:
> >
> >  1. the target supports fast unaligned access
> >  2. no padding is being used.
> >
> >This should avoid the issues of the first patch (PR85123) but still
> >work for targets that are safe
> >to do so.
> >
> >Original patch https://gcc.gnu.org/ml/gcc-patches/2017-11/msg01088.html
> >Previous respin
> >https://gcc.gnu.org/ml/gcc-patches/2018-04/msg00239.html
> >
> >
> >This produces for the copying of a 3 byte structure:
> >
> >fun3:
> >     adrp    x1, .LANCHOR0
> >     add     x1, x1, :lo12:.LANCHOR0
> >     mov     x0, 0
> >     sub     sp, sp, #16
> >     ldrh    w2, [x1, 16]
> >     ldrb    w1, [x1, 18]
> >     add     sp, sp, 16
> >     bfi     x0, x2, 0, 16
> >     bfi     x0, x1, 16, 8
> >     ret
> >
> >whereas before it was producing
> >
> >fun3:
> >     adrp    x0, .LANCHOR0
> >     add     x2, x0, :lo12:.LANCHOR0
> >     sub     sp, sp, #16
> >     ldrh    w1, [x0, #:lo12:.LANCHOR0]
> >     ldrb    w0, [x2, 2]
> >     strh    w1, [sp, 8]
> >     strb    w0, [sp, 10]
> >     ldr     w0, [sp, 8]
> >     add     sp, sp, 16
> >     ret
> >
> >Cross compiled and regtested on
> >  aarch64_be-none-elf
> >  armeb-none-eabi
> >and no issues
> >
> >Bootstrapped and regtested
> > aarch64-none-linux-gnu
> > x86_64-pc-linux-gnu
> > powerpc64-unknown-linux-gnu
> > arm-none-linux-gnueabihf
> >
> >and found no issues.
> >
> >OK for trunk?
> 
> How does this affect store-to-load forwarding when the source is initialized 
> piecewise? IMHO we should avoid larger loads but generate larger stores when 
> possible. 
> 
> How do non-x86 architectures behave with respect to STLF? 
>

I should have made it more explicit in my cover letter, but this only covers 
reg to reg copies.
So store-to-load forwarding shouldn't really come into play here, unless I'm 
missing something.

The example in my patch shows that the loads from mem are mostly unaffected.

For x86 the change is also quite significant, e.g. for a 5-byte struct load it 
used to generate

fun5:
        movl    foo5(%rip), %eax
        movl    %eax, %edi
        movzbl  %al, %edx
        movzbl  %ah, %eax
        movb    %al, %dh
        movzbl  foo5+2(%rip), %eax
        shrl    $24, %edi
        salq    $16, %rax
        movq    %rax, %rsi
        movzbl  %dil, %eax
        salq    $24, %rax
        movq    %rax, %rcx
        movq    %rdx, %rax
        movzbl  foo5+4(%rip), %edx
        orq     %rsi, %rax
        salq    $32, %rdx
        orq     %rcx, %rax
        orq     %rdx, %rax
        ret

whereas with the patch it now generates

fun5:
        movzbl  foo5+4(%rip), %eax
        salq    $32, %rax
        movq    %rax, %rdx
        movl    foo5(%rip), %eax
        orq     %rdx, %rax
        ret

so the loads themselves are unaffected.
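
To illustrate the per-iteration size selection described in the cover letter,
here is a rough standalone C model. It is only a sketch and not the code in
expr.c: the helper names chunk_bytes and pack_bytes are made up for this
example, it assumes a little-endian byte order and a 64-bit destination, and
it ignores the alignment, padding and target checks that the real patch
performs.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: largest power-of-two chunk, in bytes and capped
   at 8, that does not exceed the number of bytes still to be copied.  */
static unsigned chunk_bytes(unsigned remaining)
{
    unsigned size = 8;
    while (size > remaining)
        size >>= 1;
    return size;
}

/* Hypothetical model of the copy: pack LEN (at most 8) bytes from SRC
   into a 64-bit value, taking the largest allowed chunk each iteration.
   Bytes are combined explicitly in little-endian order so the result
   does not depend on the host.  */
static uint64_t pack_bytes(const unsigned char *src, unsigned len)
{
    uint64_t result = 0;
    unsigned offset = 0;

    assert(len <= 8);
    while (offset < len) {
        unsigned size = chunk_bytes(len - offset);
        uint64_t piece = 0;

        /* Models a single load of SIZE bytes.  */
        for (unsigned i = 0; i < size; i++)
            piece |= (uint64_t)src[offset + i] << (8 * i);

        /* Insert the piece at its bit position in the destination.  */
        result |= piece << (8 * offset);
        offset += size;
    }
    return result;
}
```

Under these assumptions a 5-byte struct is split into a 4-byte and a 1-byte
chunk, which lines up with the movl/movzbl pair in the shorter x86 listing,
and a 3-byte struct into a 2-byte and a 1-byte chunk, like the ldrh/ldrb pair
in the AArch64 output.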

Thanks,
Tamar
 
> Richard. 
> 
> >Thanks,
> >Tamar
> >
> >gcc/
> >2018-07-23  Tamar Christina  <tamar.christ...@arm.com>
> >
> >     * expr.c (copy_blkmode_to_reg): Perform larger copies when safe.
> 
