On Sat, Aug 16, 2025 at 03:55:40AM +0800, Chao Liu wrote: > This patch fixes a critical bug in the RISC-V vector instruction > translation that caused incorrect data handling in strided load > operations (e.g., vlsseg8e32). > > Problem Description: > > The `get_log2` function in `trans_rvv.c.inc` returned a value 1 higher > than the actual log2 value. For example, get_log2(4) incorrectly > returned 3 instead of 2. > > This led to erroneous vector register offset calculations, resulting in > data overlap where bytes 32-47 were incorrectly copied to positions > 16-31 in ChaCha20 encryption code. > > rvv_test_func: > vsetivli zero, 1, e32, m1, ta, ma > li t0, 64 > > vlsseg8e32.v v0, (a0), t0 > addi a0, a0, 32 > vlsseg8e32.v v8, (a0), t0 > > vssseg8e32.v v0, (a1), t0 > addi a1, a1, 32 > vssseg8e32.v v8, (a1), t0 > ret > > Analysis: > > The original implementation counted the number of right shifts until > zero, including the final shift that reduced the value to zero: > > static inline uint32_t get_log2(uint32_t a) > { > uint32_t i = 0; > for (; a > 0;) { > a >>= 1; > i++; > } > return i; // Returns 3 for a=4 (0b100 → 0b10 → 0b1 → 0b0) > } > > Fix: > > The corrected function stops shifting when only the highest bit remains > and handles the special case of a=0: > > static inline uint32_t get_log2(uint32_t a) > { > uint32_t i = 0; > if (a == 0) { > return i; // Handle edge case > } > for (; a > 1; a >>= 1) { > i++; > } > return i; // Now returns 2 for a=4 > } > > Fixes: 28c12c1f2f ("Generate strided vector loads/stores with tcg nodes.") > > Signed-off-by: Chao Liu <chao....@yeah.net> > --- > target/riscv/insn_trans/trans_rvv.c.inc | 5 ++++- > 1 file changed, 4 insertions(+), 1 deletion(-) >
Tested-by: Eric Biggers <ebigg...@kernel.org> But, to get this to apply I had to re-apply the fixed commit (which was reverted), then resolve a conflict. You'll need to send out a new series which applies to the latest master branch. - Eric