On Fri, 8 May 2020 at 16:22, Richard Henderson
<richard.hender...@linaro.org> wrote:
>
> Create vectorized versions of handle_shri_with_rndacc
> for shift+round and shift+round+accumulate.  Add out-of-line
> helpers in preparation for longer vector lengths from SVE.
>
> Signed-off-by: Richard Henderson <richard.hender...@linaro.org>
> +    /* tszimm encoding produces immediates in the range [1..esize] */
> +    tcg_debug_assert(shift > 0);
> +    tcg_debug_assert(shift <= (8 << vece));
> +
> +    if (shift == (8 << vece)) {
> +        /*
> +         * Shifts larger than the element size are architecturally valid.
> +         * Signed results in all sign bits.  With rounding, this produces
> +         *   (-1 + 1) >> 1 == 0, or (0 + 1) >> 1 == 0.
> +         * I.e. always zero.
> +         */
> +        tcg_gen_gvec_dup_imm(vece, rd_ofs, opr_sz, max_sz, 0);

Knew I'd forgotten a review comment -- should this "dup_imm to clear
to zeroes" be using a fixed element size rather than 'vece' to avoid
the "dup_imm doesn't handle 128 bits" issue? (Similarly elsewhere.)

> +    } else {
> +        tcg_gen_gvec_2i(rd_ofs, rm_ofs, opr_sz, max_sz, shift, &ops[vece]);
> +    }
> +}

thanks
-- PMM
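
For reference, the "always zero" claim in the quoted comment can be
checked with a small standalone model. This is only a sketch, not QEMU
code: srshr8_ref and srshr8_full_shift_is_zero are hypothetical names,
it models a single 8-bit element (esize == 8), and it assumes the
compiler implements >> on negative signed values as an arithmetic shift.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical reference model (not a QEMU helper): signed rounding
 * shift right of one 8-bit element.  The rounding constant is added in
 * a wider type so the addition cannot overflow.
 * Assumption: >> on a negative int is an arithmetic shift.
 */
static int8_t srshr8_ref(int8_t x, unsigned shift)
{
    /* Same immediate range as the quoted asserts: [1..esize]. */
    assert(shift >= 1 && shift <= 8);

    int32_t rounded = (int32_t)x + (1 << (shift - 1));
    return (int8_t)(rounded >> shift);
}

/* Returns 1 if every int8_t input gives 0 when shift == esize (8). */
static int srshr8_full_shift_is_zero(void)
{
    for (int x = INT8_MIN; x <= INT8_MAX; x++) {
        if (srshr8_ref((int8_t)x, 8) != 0) {
            return 0;
        }
    }
    return 1;
}
```

With shift == 8 the rounded value x + 128 lies in [0, 255], so shifting
right by 8 leaves zero for every input, which is why the quoted code can
replace the operation with a store of zeroes.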