On 11/10/2018 22:52, Richard Henderson wrote:
> For a sequence of loads or stores from a single register,
> little-endian operations can be promoted to an 8-byte op.
> This can reduce the number of operations by a factor of 8.
>
> Signed-off-by: Richard Henderson <richard.hender...@linaro.org>

Reviewed-by: Philippe Mathieu-Daudé <phi...@redhat.com>

> ---
>  target/arm/translate.c | 10 ++++++++++
>  1 file changed, 10 insertions(+)
>
> diff --git a/target/arm/translate.c b/target/arm/translate.c
> index 12a744b3c3..09f2d648b7 100644
> --- a/target/arm/translate.c
> +++ b/target/arm/translate.c
> @@ -5011,6 +5011,16 @@ static int disas_neon_ls_insn(DisasContext *s,
> uint32_t insn)
>      if (size == 3 && (interleave | spacing) != 1) {
>          return 1;
>      }
> +    /* For our purposes, bytes are always little-endian.  */
> +    if (size == 0) {
> +        endian = MO_LE;
> +    }
> +    /* Consecutive little-endian elements from a single register
> +     * can be promoted to a larger little-endian operation.
> +     */
> +    if (interleave == 1 && endian == MO_LE) {
> +        size = 3;
> +    }
>      tmp64 = tcg_temp_new_i64();
>      addr = tcg_temp_new_i32();
>      tmp2 = tcg_const_i32(1 << size);
>
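
For anyone following along, here is a minimal standalone sketch (plain C,
not QEMU code) of why the promotion in the hunk above is safe. It assumes
a 64-bit D register filled from consecutive memory, lowest element first.
The helper names load_le(), load_reg_by_elements(), load_reg_promoted()
and check_promotion() are hypothetical and exist only for this
illustration. Filling the register from 1-, 2- or 4-byte little-endian
elements should give the same value as one 8-byte little-endian load.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: read one little-endian value of (1 << size) bytes. */
static uint64_t load_le(const uint8_t *p, int size)
{
    uint64_t v = 0;
    for (int i = 0; i < (1 << size); i++) {
        v |= (uint64_t)p[i] << (8 * i);
    }
    return v;
}

/* Unpromoted path: one memory access per element, lowest element first. */
static uint64_t load_reg_by_elements(const uint8_t *mem, int size)
{
    int esize = 1 << size;          /* element size in bytes */
    int count = 8 / esize;          /* elements per 64-bit register */
    uint64_t reg = 0;

    for (int e = 0; e < count; e++) {
        uint64_t elt = load_le(mem + e * esize, size);
        reg |= elt << (8 * esize * e);
    }
    return reg;
}

/* Promoted path: a single 8-byte little-endian access (size == 3). */
static uint64_t load_reg_promoted(const uint8_t *mem)
{
    return load_le(mem, 3);
}

/* Self-check: every element size must agree with the promoted load. */
static void check_promotion(const uint8_t *mem)
{
    for (int size = 0; size <= 3; size++) {
        assert(load_reg_by_elements(mem, size) == load_reg_promoted(mem));
    }
}
```

The same equivalence does not hold for big-endian elements wider than a
byte, because the bytes inside each element would be swapped relative to
one 8-byte little-endian load. That is presumably why the patch only
promotes when endian == MO_LE, and why byte-sized elements (which have no
byte order of their own) are forced to MO_LE first.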