On 6.6.19. 18:46, Richard Henderson wrote:
On 6/6/19 5:15 AM, Stefan Brankovic wrote:
+    tcg_gen_addi_i64(result, sh, 7);
+    for (i = 7; i >= 1; i--) {
+        tcg_gen_shli_i64(tmp, sh, i * 8);
+        tcg_gen_or_i64(result, result, tmp);
+        tcg_gen_addi_i64(sh, sh, 1);
+    }
Better to replicate sh into the 8 positions and then use one add.

     tcg_gen_muli_i64(sh, sh, 0x0101010101010101ull);
     tcg_gen_addi_i64(hi_result, sh, 0x0001020304050607ull);
     tcg_gen_addi_i64(lo_result, sh, 0x08090a0b0c0d0e0full);

and

     tcg_gen_subfi_i64(hi_result, 0x1011121314151617ull, sh);
     tcg_gen_subfi_i64(lo_result, 0x18191a1b1c1d1e1full, sh);

for lvsr.
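
As a sanity check on the arithmetic, here is a minimal host-side C sketch of the same idea, outside of TCG. The helper names (replicate_byte, lvsl_hi, bytes_from and so on) are made up for illustration and are not QEMU functions, and it assumes sh has already been masked to 0..15, so no byte lane can carry or borrow into its neighbour.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: broadcast sh (assumed 0..15) into all eight bytes. */
static inline uint64_t replicate_byte(uint64_t sh)
{
    return (sh & 0xf) * 0x0101010101010101ull;
}

/* lvsl: bytes sh, sh+1, ..., sh+15, most significant byte first. */
static inline uint64_t lvsl_hi(uint64_t sh)
{
    return replicate_byte(sh) + 0x0001020304050607ull;
}

static inline uint64_t lvsl_lo(uint64_t sh)
{
    return replicate_byte(sh) + 0x08090a0b0c0d0e0full;
}

/* lvsr: bytes 16-sh, 17-sh, ..., 31-sh, most significant byte first. */
static inline uint64_t lvsr_hi(uint64_t sh)
{
    return 0x1011121314151617ull - replicate_byte(sh);
}

static inline uint64_t lvsr_lo(uint64_t sh)
{
    return 0x18191a1b1c1d1e1full - replicate_byte(sh);
}

/* Reference: build eight consecutive byte values starting at 'first'. */
static inline uint64_t bytes_from(uint64_t first)
{
    uint64_t r = 0;
    for (int i = 0; i < 8; i++) {
        r = (r << 8) | ((first + i) & 0xff);
    }
    return r;
}

/* Compare the multiply-and-add form against the byte-wise reference. */
static inline void verify_all_shifts(void)
{
    for (uint64_t sh = 0; sh < 16; sh++) {
        assert(lvsl_hi(sh) == bytes_from(sh));
        assert(lvsl_lo(sh) == bytes_from(sh + 8));
        assert(lvsr_hi(sh) == bytes_from(16 - sh));
        assert(lvsr_lo(sh) == bytes_from(24 - sh));
    }
}
```

Calling verify_all_shifts() walks all 16 shift values. The largest byte produced is 0x1f and the smallest is 0x01, which is why a single 64-bit add or subtract is enough here.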

I think you are right, this is definitely a better way of implementing it. I will adopt your approach in v2.

Kind Regards,

Stefan

r~
