ppc: rework vmrg{l, h}{b, h, w} instructions to use Vsr* macros

David Gibson Mon, 28 Jan 2019 20:46:50 -0800

On Sun, Jan 27, 2019 at 10:07:12AM -0800, Richard Henderson wrote:
> On 1/27/19 9:45 AM, Mark Cave-Ayland wrote:
> >> I would expect the i < n/2 loop to be faster, because the assignments are
> >> unconditional.  FWIW.
> > 
> > Do you have any idea as to how much faster? Is it something that would show
> > up as significant within the context of QEMU?
> 
> I don't have any numbers on that, no.
> 
> > As well as eliminating the HI_IDX/LO_IDX constants I do find the updated
> > version much easier to read, so I would prefer to keep it if possible.
> > What about unrolling the loop into 2 separate ones...
> 
> I doubt that would be helpful.
> 
> I would think that
> 
> #define VMRG_DO(name, access, ofs)
> ...
>     int i, half = ARRAY_SIZE(r->access(0)) / 2;
> ...
>     for (i = 0; i < half; i++) {
>         result.access(2 * i + 0) = a->access(i + ofs);
>         result.access(2 * i + 1) = b->access(i + ofs);
>     }
> 
> where OFS = 0 for HI and half for LO is best.  I find it quite readable, and
> it avoids duplicating code between LO and HI as you're currently doing.
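
For reference, the loop Richard suggests can be sketched as standalone C.
This is only an illustration under stated assumptions: NELEM, merge_bytes()
and the plain uint8_t arrays are hypothetical stand-ins for the vector
register type and the Vsr* accessor macros, and element 0 is assumed to be
the "high" end of the vector. It is not the code from the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical: number of byte elements in a 128-bit vector. */
#define NELEM 16

/*
 * Interleave half of the elements of a and b into r, starting at
 * element ofs.  ofs = 0 selects the "high" merge and ofs = NELEM / 2
 * the "low" merge, so one loop body serves both variants.
 */
static void merge_bytes(uint8_t *r, const uint8_t *a, const uint8_t *b,
                        int ofs)
{
    uint8_t result[NELEM];
    int i, half = NELEM / 2;

    /* Only the two offsets described above are meaningful here. */
    assert(ofs == 0 || ofs == half);

    /* Both assignments are unconditional on every iteration. */
    for (i = 0; i < half; i++) {
        result[2 * i + 0] = a[i + ofs];
        result[2 * i + 1] = b[i + ofs];
    }

    /* Write through a temporary so that r may alias a or b. */
    memcpy(r, result, sizeof(result));
}
```

Calling merge_bytes(r, a, b, 0) gives the high merge and
merge_bytes(r, a, b, NELEM / 2) the low merge. The halfword and word
variants would differ only in element type and count.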


Mark, Richard, where are we at with this?

Should I wait on a revised version of this patch before applying the
series?

-- 
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson


Re: [Qemu-devel] [Qemu-ppc] [PATCH v3 2/8] target/ppc: rework vmrg{l, h}{b, h, w} instructions to use Vsr* macros