On Tue, 8 Jun 2021 at 23:10, Richard Henderson
<richard.hender...@linaro.org> wrote:
>
> On 6/7/21 9:57 AM, Peter Maydell wrote:
> > Implement the MVE VCLZ insn (and the necessary machinery
> > for MVE 1-input vector ops).
> >
> > Note that for non-load instructions predication is always performed
> > at a byte level granularity regardless of element size (R_ZLSJ),
> > and so the masking logic here differs from that used in the VLDR
> > and VSTR helpers.
> >
> > Signed-off-by: Peter Maydell <peter.mayd...@linaro.org>
> > +
> > +/*
> > + * Take the bottom bits of mask (which is 1 bit per lane) and
> > + * convert to a mask which has 1s in each byte which is predicated.
> > + */
> > +static uint8_t mask_to_bytemask1(uint16_t mask)
> > +{
> > +    return (mask & 1) ? 0xff : 0;
> > +}
> > +
> > +static uint16_t mask_to_bytemask2(uint16_t mask)
> > +{
> > +    static const uint16_t masks[] = { 0x0000, 0x00ff, 0xff00, 0xffff };
> > +    return masks[mask & 3];
> > +}
> > +
> > +static uint32_t mask_to_bytemask4(uint16_t mask)
> > +{
> > +    static const uint32_t masks[] = {
> > +        0x00000000, 0x000000ff, 0x0000ff00, 0x0000ffff,
> > +        0x00ff0000, 0x00ff00ff, 0x00ffff00, 0x00ffffff,
> > +        0xff000000, 0xff0000ff, 0xff00ff00, 0xff00ffff,
> > +        0xffff0000, 0xffff00ff, 0xffffff00, 0xffffffff,
> > +    };
>
> I'll note that
>
> (1) the values for the mask_to_bytemask2 array overlap the first 4 values of
> the mask_to_bytemask4 array, and
>
> (2) both of these overlap with the larger
>
>    static inline uint64_t expand_pred_b(uint8_t byte)
>
> from SVE.  It'd be nice to share the storage, whatever the actual functional
> interface into the array.

Yeah, I guess so. I didn't really feel like trying to
abstract that out...

> > +#define DO_1OP(OP, ESIZE, TYPE, H, FN)                              \
> > +    void HELPER(mve_##OP)(CPUARMState *env, void *vd, void *vm)     \
> > +    {                                                               \
> > +        TYPE *d = vd, *m = vm;                                      \
> > +        uint16_t mask = mve_element_mask(env);                      \
> > +        unsigned e;                                                 \
> > +        for (e = 0; e < 16 / ESIZE; e++, mask >>= ESIZE) {          \
> > +            TYPE r = FN(m[H(e)]);                                   \
> > +            uint64_t bytemask = mask_to_bytemask##ESIZE(mask);      \
>
> Why uint64_t and not TYPE?  Or uint32_t?

A later patch adds the mask_to_bytemask8(), so I wanted
a type that was definitely unsigned (so TYPE isn't any good)
and which was definitely big enough for 64 bits.

> > +    if (!mve_eci_check(s)) {
> > +        return true;
> > +    }
> > +
> > +    if (!vfp_access_check(s)) {
> > +        return true;
> > +    }
>
> Not the first instance, but is it worth saving 4 lines per and combining these
> into one IF?

Yes, I think so.

-- PMM
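
For readers following the thread, here is a minimal standalone sketch of the
byte-granular predication being discussed. It is not taken from the patch:
expand_bytemask(), merge_u32() and check_table_overlap() are hypothetical
names, and the merge step is an assumption about how a byte mask of this
kind would typically be applied, since the quoted macro is cut off before
that point. The expansion is computed with a loop rather than a lookup
table, which also makes it easy to confirm the observation that the 2-byte
table is a prefix of the 4-byte one.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not from the patch): expand the low `nbytes`
 * predicate bits, one bit per byte, into a mask with 0xff in every
 * predicated byte. uint64_t is wide enough for 8-byte elements.
 */
static uint64_t expand_bytemask(uint16_t mask, unsigned nbytes)
{
    uint64_t out = 0;
    for (unsigned i = 0; i < nbytes; i++) {
        if (mask & (1u << i)) {
            out |= (uint64_t)0xff << (8 * i);
        }
    }
    return out;
}

/*
 * Assumed merge step: bytes selected by bytemask take the new result r,
 * all other bytes keep the old destination value.
 */
static uint32_t merge_u32(uint32_t old, uint32_t r, uint64_t bytemask)
{
    return (uint32_t)((old & ~bytemask) | (r & bytemask));
}

/*
 * For predicate values 0..3 the 2-byte and 4-byte expansions agree,
 * which is why the two lookup tables in the patch overlap.
 */
static void check_table_overlap(void)
{
    for (uint16_t m = 0; m < 4; m++) {
        assert(expand_bytemask(m, 2) == expand_bytemask(m, 4));
    }
}
```

With predicate bits 0b0101 on a 4-byte element, only bytes 0 and 2 of the
destination are updated, regardless of the element size.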