On Mon, 20 Jun 2022 at 19:20, Richard Henderson <richard.hender...@linaro.org> wrote:
>
> We can reuse the SVE functions for implementing moves to/from
> horizontal tile slices, but we need new ones for moves to/from
> vertical tile slices.
>
> Signed-off-by: Richard Henderson <richard.hender...@linaro.org>
> ---
>  target/arm/helper-sme.h    |  11 ++++
>  target/arm/helper-sve.h    |   2 +
>  target/arm/translate-a64.h |   9 +++
>  target/arm/translate.h     |   5 ++
>  target/arm/sme.decode      |  15 +++++
>  target/arm/sme_helper.c    | 110 ++++++++++++++++++++++++++++++++++++-
>  target/arm/sve_helper.c    |  12 ++++
>  target/arm/translate-a64.c |  19 +++++++
>  target/arm/translate-sme.c | 105 +++++++++++++++++++++++++++++++++++
>  9 files changed, 287 insertions(+), 1 deletion(-)
>
> diff --git a/target/arm/helper-sme.h b/target/arm/helper-sme.h
> index c4ee1f09e4..600346e08c 100644
> --- a/target/arm/helper-sme.h
> +++ b/target/arm/helper-sme.h
> @@ -21,3 +21,14 @@ DEF_HELPER_FLAGS_2(set_pstate_sm, TCG_CALL_NO_RWG, void, env, i32)
>  DEF_HELPER_FLAGS_2(set_pstate_za, TCG_CALL_NO_RWG, void, env, i32)
>
>  DEF_HELPER_FLAGS_3(sme_zero, TCG_CALL_NO_RWG, void, env, i32, i32)
> +
> +DEF_HELPER_FLAGS_4(sme_mova_avz_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
> +DEF_HELPER_FLAGS_4(sme_mova_zav_b, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)

What do the 'avz' and 'zav' stand for here? I thought that 'zav'
might mean "from the ZA storage to a Vector", but then what is 'avz'?

> +static TCGv_ptr get_tile_rowcol(DisasContext *s, int esz, int rs,
> +                                int tile_index, bool vertical)
> +{
> +    int tile = tile_index >> (4 - esz);
> +    int index = esz == MO_128 ? 0 : extract32(tile_index, 0, 4 - esz);
> +    int pos, len, offset;
> +    TCGv_i32 t_index;
> +    TCGv_ptr addr;
> +
> +    /* Resolve tile.size[index] to an untyped ZA slice index. */
> +    t_index = tcg_temp_new_i32();
> +    tcg_gen_trunc_tl_i32(t_index, cpu_reg(s, rs));
> +    tcg_gen_addi_i32(t_index, t_index, index);
> +
> +    len = ctz32(s->svl) - esz;
> +    pos = esz;
> +    offset = tile;
> +
> +    /*
> +     * Horizontal slice.  Index row N, column 0.
> +     * The helper will iterate by the element size.
> +     */
> +    if (!vertical) {
> +        pos += ctz32(sizeof(ARMVectorReg));
> +        offset *= sizeof(ARMVectorReg);
> +    }
> +    offset += offsetof(CPUARMState, zarray);
> +
> +    tcg_gen_deposit_z_i32(t_index, t_index, pos, len);
> +    tcg_gen_addi_i32(t_index, t_index, offset);
> +
> +    /*
> +     * Vertical tile slice.  Index row 0, column N.
> +     * The helper will iterate by the row spacing in the array.
> +     * Need to adjust addressing for elements smaller than uint64_t for BE.
> +     */
> +    if (HOST_BIG_ENDIAN && vertical && esz < MO_64) {
> +        tcg_gen_xori_i32(t_index, t_index, 8 - (1 << esz));
> +    }
> +
> +    addr = tcg_temp_new_ptr();
> +    tcg_gen_ext_i32_ptr(addr, t_index);
> +    tcg_temp_free_i32(t_index);
> +    tcg_gen_add_ptr(addr, addr, cpu_env);
> +
> +    return addr;
> +}

This is too confusing -- I spent half an hour looking at it and
couldn't figure out if it was correct or not. I can see roughly
what it's supposed to be doing but I don't really want to try to
reverse engineer the details from the sequence of operations.
E.g. the way we sometimes add in just the tile number and sometimes
the tile number * the size of a vector reg looks very strange. I
figured out that the deposit op is doing the equivalent of the
pseudocode's "MOD dim" on the slice index, but the code doesn't say
so, and the calculation of len and pos is kind of obscure to me.

Perhaps (a) more commentary and (b) separating out the horizontal
and vertical cases would help?

thanks
-- PMM
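
For reference, the "MOD dim" reading of the deposit op can be checked on the host with a small sketch. This is a hypothetical model, not QEMU code: deposit_z32(), mod_then_scale() and check_deposit_is_mod() are names made up for illustration. It assumes that tcg_gen_deposit_z_i32(ret, arg, pos, len) keeps the low 'len' bits of 'arg', shifts them left by 'pos' and zeroes everything else, and that dim is a power of two (dim == 1 << len).

```c
#include <assert.h>
#include <stdint.h>

/*
 * Host-side model (an assumption, see above) of the value produced by
 * tcg_gen_deposit_z_i32(ret, arg, pos, len): the low 'len' bits of
 * 'arg', shifted left by 'pos', with all other bits zero.
 * Requires pos + len <= 32.
 */
static uint32_t deposit_z32(uint32_t arg, unsigned pos, unsigned len)
{
    uint32_t mask = (len < 32) ? ((UINT32_C(1) << len) - 1) : UINT32_MAX;
    return (arg & mask) << pos;
}

/*
 * The same value written the way the pseudocode reads:
 * (index MOD dim), then scaled by 2^pos.
 */
static uint32_t mod_then_scale(uint32_t index, unsigned pos, uint32_t dim)
{
    return (index % dim) << pos;
}

/*
 * Assert that the two forms agree for every index in [0, 4 * dim),
 * where dim == 1 << len.  Intended for small len (len < 30).
 */
static void check_deposit_is_mod(unsigned pos, unsigned len)
{
    uint32_t dim = UINT32_C(1) << len;

    for (uint32_t i = 0; i < 4 * dim; i++) {
        assert(deposit_z32(i, pos, len) == mod_then_scale(i, pos, dim));
    }
}
```

For example, if s->svl were 32 and esz were 1, len would be ctz32(32) - 1 == 4, so dim is 16 and a slice index of 19 wraps to 3; with pos == 1 both forms give 6. This only illustrates the wrap-around step and says nothing about whether the rest of the address calculation is right.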