On 4/11/19 12:08 AM, David Hildenbrand wrote:
> +}
> +static DisasJumpType op_vavg(DisasContext *s, DisasOps *o)
> +{

Watch your spacing.  You want a blank line between the two functions.


> +    static const GVecGen3 g[4] = {
> +        { .fno = gen_helper_gvec_vavg8, },
> +        { .fno = gen_helper_gvec_vavg16, },
> +        { .fni4 = gen_avg_i32, },
> +        { .fni8 = gen_avg_i64, },
> +    };

Pondering possible vector expansions.  I think one possibility is

  t1 = (a >> 1) + (b >> 1);

We still have the two "0.5 bits" to add back in, plus we round up by adding
another 0.5.  Those three halves sum to at least 1 exactly when either lsb is
set, which gives a carry into the 1's bit.  So:

  t1 = t1 + ((a | b) & 1);
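
As a standalone sanity check (mine, not from the patch), the identity can be
verified exhaustively for the 8-bit case.  This assumes ">>" on negative
values is an arithmetic shift, as with gcc/clang, matching sari_vec:

  #include <assert.h>

  int main(void)
  {
      /* (a + b + 1) >> 1 == (a >> 1) + (b >> 1) + ((a | b) & 1)
         for all signed 8-bit inputs. */
      for (int a = -128; a <= 127; a++) {
          for (int b = -128; b <= 127; b++) {
              assert(((a + b + 1) >> 1)
                     == (a >> 1) + (b >> 1) + ((a | b) & 1));
          }
      }
      return 0;
  }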

In TCG terms, those two statements expand to

  tcg_gen_sari_vec(vece, t0, a, 1);     /* t0 = a >> 1 */
  tcg_gen_sari_vec(vece, t1, b, 1);     /* t1 = b >> 1 */
  tcg_gen_or_vec(vece, t2, a, b);
  tcg_gen_add_vec(vece, t0, t0, t1);    /* t0 = (a >> 1) + (b >> 1) */
  tcg_gen_dupi_vec(vece, t1, 1);
  tcg_gen_and_vec(vece, t2, t2, t1);    /* t2 = (a | b) & 1 */
  tcg_gen_add_vec(vece, t0, t0, t2);    /* add the rounding carry */

  { .fniv = gen_avg_vec,
    .fno = gen_helper_gvec_vavg8,
    .opc = INDEX_op_sari_vec },
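
Completely untested, but a full .fniv expander along those lines might look
like this; the temp management is my sketch, not from the patch:

  static void gen_avg_vec(unsigned vece, TCGv_vec d, TCGv_vec a, TCGv_vec b)
  {
      TCGv_vec t0 = tcg_temp_new_vec_matching(d);
      TCGv_vec t1 = tcg_temp_new_vec_matching(d);
      TCGv_vec t2 = tcg_temp_new_vec_matching(d);

      /* The same sequence as above ... */
      tcg_gen_sari_vec(vece, t0, a, 1);
      tcg_gen_sari_vec(vece, t1, b, 1);
      tcg_gen_or_vec(vece, t2, a, b);
      tcg_gen_add_vec(vece, t0, t0, t1);
      tcg_gen_dupi_vec(vece, t1, 1);
      tcg_gen_and_vec(vece, t2, t2, t1);
      /* ... with the final sum written directly to the destination.  */
      tcg_gen_add_vec(vece, d, t0, t2);

      tcg_temp_free_vec(t0);
      tcg_temp_free_vec(t1);
      tcg_temp_free_vec(t2);
  }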

But what you have here is correct, and the above is mere optimization, so:
Reviewed-by: Richard Henderson <richard.hender...@linaro.org>


r~
