On Tue, May 15, 2018 at 08:00:49AM -0500, Wilco Dijkstra wrote:
> 
> ping

This seems like a fairly horrible hack around the register allocator
behaviour.

But, OK.

James

> This patch improves register allocation of fma by preferring to update the
> accumulator register.  This is done by adding fma insns with operand 1 as the
> accumulator.  The register allocator considers copy preferences only in operand
> order, so if the first operand is dead, it has the highest chance of being
> reused as the destination.  As a result code using fma often has a better
> register allocation.  Performance of SPECFP2017 improves by over 0.5% on some
> implementations, while it had no effect on other implementations.  Fma is more
> readable too; in a simple example we now generate:
> 
>         fmadd   s16, s2, s1, s16
>         fmadd   s7, s17, s16, s7
>         fmadd   s6, s16, s7, s6
>         fmadd   s5, s7, s6, s5
> 
> instead of:
> 
>         fmadd   s16, s16, s2, s1
>         fmadd   s7, s7, s16, s6
>         fmadd   s6, s6, s7, s5
>         fmadd   s5, s5, s6, s4
> 
> Bootstrap OK. OK for commit?
> 
> ChangeLog:
> 2018-01-04  Wilco Dijkstra  <wdijk...@arm.com>
> 
>     gcc/
>         * config/aarch64/aarch64.md (fma<mode>4): Change into expand pattern.
>         (fnma<mode>4): Likewise.
>         (fms<mode>4): Likewise.
>         (fnms<mode>4): Likewise.
>         (aarch64_fma<mode>4): Rename insn, reorder accumulator operand.
>         (aarch64_fnma<mode>4): Likewise.
>         (aarch64_fms<mode>4): Likewise.
>         (aarch64_fnms<mode>4): Likewise.
>         (aarch64_fnmadd<mode>4): Likewise.
      
