On Fri, 30 Apr 2021 at 21:34, Richard Henderson
<richard.hender...@linaro.org> wrote:
>
> Signed-off-by: Richard Henderson <richard.hender...@linaro.org>
> ---
> v2: Shift values are always signed (laurent desnogues).
> ---
>  target/arm/helper-sve.h    | 54 ++++++++++++++++++++++++++
>  target/arm/sve.decode      | 17 +++++++++
>  target/arm/sve_helper.c    | 78 ++++++++++++++++++++++++++++++++++++++
>  target/arm/translate-sve.c | 18 +++++++++
>  4 files changed, 167 insertions(+)
>

>  /* Note that vector data is stored in host-endian 64-bit chunks,
> @@ -561,6 +562,83 @@ DO_ZPZZ(sve2_uadalp_zpzz_h, uint16_t, H1_2, do_uadalp_h)
>  DO_ZPZZ(sve2_uadalp_zpzz_s, uint32_t, H1_4, do_uadalp_s)
>  DO_ZPZZ_D(sve2_uadalp_zpzz_d, uint64_t, do_uadalp_d)
>
> +#define do_srshl_b(n, m)  do_sqrshl_bhs(n, m, 8, true, NULL)
> +#define do_srshl_h(n, m)  do_sqrshl_bhs(n, m, 16, true, NULL)
> +#define do_srshl_s(n, m)  do_sqrshl_bhs(n, m, 32, true, NULL)
> +#define do_srshl_d(n, m)  do_sqrshl_d(n, m, true, NULL)
> +
> +DO_ZPZZ(sve2_srshl_zpzz_b, int8_t, H1_2, do_srshl_b)
> +DO_ZPZZ(sve2_srshl_zpzz_h, int16_t, H1_2, do_srshl_h)
> +DO_ZPZZ(sve2_srshl_zpzz_s, int32_t, H1_4, do_srshl_s)
> +DO_ZPZZ_D(sve2_srshl_zpzz_d, int64_t, do_srshl_d)

Should the _b version really be using H1_2? Elsewhere the b/h/s/d
usage is H1/H1_2/H1_4/"". Running whatever tests you have on a
big-endian host would probably be a good idea. Similarly below.

> +
> +#define do_urshl_b(n, m)  do_uqrshl_bhs(n, (int8_t)m, 8, true, NULL)
> +#define do_urshl_h(n, m)  do_uqrshl_bhs(n, (int16_t)m, 16, true, NULL)
> +#define do_urshl_s(n, m)  do_uqrshl_bhs(n, m, 32, true, NULL)
> +#define do_urshl_d(n, m)  do_uqrshl_d(n, m, true, NULL)
> +
> +DO_ZPZZ(sve2_urshl_zpzz_b, uint8_t, H1_2, do_urshl_b)
> +DO_ZPZZ(sve2_urshl_zpzz_h, uint16_t, H1_2, do_urshl_h)
> +DO_ZPZZ(sve2_urshl_zpzz_s, uint32_t, H1_4, do_urshl_s)
> +DO_ZPZZ_D(sve2_urshl_zpzz_d, uint64_t, do_urshl_d)
> +
> +/* Unlike the NEON and AdvSIMD versions, there is no QC bit to set. */
> +#define do_sqshl_b(n, m) \
> +    ({ uint32_t discard; do_sqrshl_bhs(n, m, 8, false, &discard); })
> +#define do_sqshl_h(n, m) \
> +    ({ uint32_t discard; do_sqrshl_bhs(n, m, 16, false, &discard); })
> +#define do_sqshl_s(n, m) \
> +    ({ uint32_t discard; do_sqrshl_bhs(n, m, 32, false, &discard); })
> +#define do_sqshl_d(n, m) \
> +    ({ uint32_t discard; do_sqrshl_d(n, m, false, &discard); })

Why pass in &discard rather than just NULL? (Similarly below.)

thanks
-- PMM
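
To illustrate the H1 vs H1_2 concern for the _b helpers, here is a
minimal standalone sketch. It is not QEMU code: be_h1() and be_h1_2()
are hypothetical stand-ins that assume the macros expand to (x ^ 7) and
(x ^ 6) on a big-endian host, and store_chunk_be() just emulates how
such a host would lay out one 64-bit chunk in memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Assumed big-endian-host index adjustments (hypothetical names,
 * standing in for what H1 and H1_2 are expected to expand to).
 */
static size_t be_h1(size_t i)   { return i ^ 7; }  /* byte elements   */
static size_t be_h1_2(size_t i) { return i ^ 6; }  /* 16-bit elements */

/* Lay out one 64-bit chunk as a big-endian host would: MSB first. */
static void store_chunk_be(uint8_t mem[8], uint64_t chunk)
{
    for (int k = 0; k < 8; k++) {
        mem[k] = (uint8_t)(chunk >> (56 - 8 * k));
    }
}

/*
 * Fetch guest byte element i (element 0 is the least significant
 * byte of the chunk) through the index-adjust function h.
 */
static uint8_t load_byte_elem(const uint8_t mem[8], size_t i,
                              size_t (*h)(size_t))
{
    assert(i < 8);
    return mem[h(i)];
}

/* Convenience wrapper: byte element i of chunk as seen through h. */
static uint8_t byte_elem(uint64_t chunk, size_t i, size_t (*h)(size_t))
{
    uint8_t mem[8];

    store_chunk_be(mem, chunk);
    return load_byte_elem(mem, i, h);
}
```

With chunk 0x0706050403020100 (byte element i holds the value i),
byte_elem(chunk, 0, be_h1) returns 0, but byte_elem(chunk, 0, be_h1_2)
returns 1: the 16-bit adjustment lands on the neighbouring byte lane.
If the expansion macro pairs predicate bit i with data fetched via
H(i), which I am assuming here, that mismatch would only show up with
non-uniform predicates on a big-endian host.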