On 4/11/19 12:08 AM, David Hildenbrand wrote:
> +}
> +static DisasJumpType op_vavg(DisasContext *s, DisasOps *o)
> +{

Watch your spacing.

> +    static const GVecGen3 g[4] = {
> +        { .fno = gen_helper_gvec_vavg8, },
> +        { .fno = gen_helper_gvec_vavg16, },
> +        { .fni4 = gen_avg_i32, },
> +        { .fni8 = gen_avg_i64, },
> +    };

Pondering possible vector expansions, I think one possibility is

  t1 = (a >> 1) + (b >> 1);

We still have the two "0.5 bits" to add back in, plus we round up by
adding another 0.5.  This means that if either lsb is set, there is a
carry into the 1's bit.  So:

  t1 = t1 + ((a | b) & 1);

Which leads to

    tcg_gen_sari_vec(vece, t0, a, 1);
    tcg_gen_sari_vec(vece, t1, b, 1);
    tcg_gen_or_vec(vece, t2, a, b);
    tcg_gen_add_vec(vece, t0, t0, t1);
    tcg_gen_dupi_vec(vece, t1, 1);
    tcg_gen_and_vec(vece, t2, t2, t1);
    tcg_gen_add_vec(vece, t0, t0, t2);

and

    { .fniv = gen_avg_vec,
      .fno = gen_helper_gvec_vavg8,
      .opc = INDEX_op_sari_vec },
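As a quick sanity check of the rounding identity above -- this is just a
throwaway test on the side, not something for the patch -- one can
exhaust all signed 8-bit operand pairs:

  /* Throwaway check: verify
   *   (a >> 1) + (b >> 1) + ((a | b) & 1) == (a + b + 1) >> 1
   * for every pair of signed 8-bit inputs.  The left-hand side is what
   * the expansion computes, and it never widens past the element type,
   * whereas the intermediate a + b + 1 can overflow 8 bits.
   */
  #include <assert.h>
  #include <stdio.h>

  int main(void)
  {
      int a, b;

      for (a = -128; a < 128; a++) {
          for (b = -128; b < 128; b++) {
              assert((a >> 1) + (b >> 1) + ((a | b) & 1)
                     == (a + b + 1) >> 1);
          }
      }
      printf("rounding identity holds for all int8 pairs\n");
      return 0;
  }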
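And, purely as a sketch of how that op sequence might be packaged as
the .fniv expander (the temporary handling and cleanup here are my
guesses, not something from the patch):

  static void gen_avg_vec(unsigned vece, TCGv_vec d, TCGv_vec a, TCGv_vec b)
  {
      TCGv_vec t0 = tcg_temp_new_vec_matching(d);
      TCGv_vec t1 = tcg_temp_new_vec_matching(d);
      TCGv_vec t2 = tcg_temp_new_vec_matching(d);

      tcg_gen_sari_vec(vece, t0, a, 1);     /* t0 = a >> 1 */
      tcg_gen_sari_vec(vece, t1, b, 1);     /* t1 = b >> 1 */
      tcg_gen_or_vec(vece, t2, a, b);
      tcg_gen_add_vec(vece, t0, t0, t1);
      tcg_gen_dupi_vec(vece, t1, 1);
      tcg_gen_and_vec(vece, t2, t2, t1);    /* t2 = (a | b) & 1 */
      /* Write d directly: a and b have already been consumed,
         so there is no aliasing hazard even if d overlaps them.  */
      tcg_gen_add_vec(vece, d, t0, t2);

      tcg_temp_free_vec(t0);
      tcg_temp_free_vec(t1);
      tcg_temp_free_vec(t2);
  }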
But what you have here is correct and the above is mere optimization so,

Reviewed-by: Richard Henderson <richard.hender...@linaro.org>

r~