Richard Henderson <r...@twiddle.net> writes: > On 10/12/2016 07:21 PM, David Gibson wrote: >>> +static void gen_bswap32x4(TCGv_i64 outh, TCGv_i64 outl, >>> + TCGv_i64 inh, TCGv_i64 inl) >>> +{ >>> + TCGv_i64 hi = tcg_temp_new_i64(); >>> + TCGv_i64 lo = tcg_temp_new_i64(); >>> + >>> + tcg_gen_bswap64_i64(hi, inh); >>> + tcg_gen_bswap64_i64(lo, inl); >>> + tcg_gen_shri_i64(outh, hi, 32); >>> + tcg_gen_deposit_i64(outh, outh, hi, 32, 32); >>> + tcg_gen_shri_i64(outl, lo, 32); >>> + tcg_gen_deposit_i64(outl, outl, lo, 32, 32); >>> + >>> + tcg_temp_free_i64(hi); >>> + tcg_temp_free_i64(lo); >>> +} >> >> Is there actually any advantage to having this 128-bit operation >> working on two 64-bit "register"s, as opposed to having a bswap32x2 >> that operates on a single 64-bit register amd calling it twice? > > For this one, no particular advantage. > >>> + gen_bswap16x8(xth, xtl, xbh, xbl); >> >> Likewise for the 16x8 version, I guess, although that would mean >> changing the existing users. > > For this one, we have to build a 64-bit constant, 0x00ff00ff00ff00ff. On > some > hosts that's up to 6 insns. Being about to reuse that for both swaps is > useful. > >>> + tcg_gen_bswap64_i64(t0, xbl); >>> + tcg_gen_bswap64_i64(xtl, xbh); >>> + tcg_gen_bswap64_i64(xth, t0); >> >> This looks wrong. You swap xbl as you move it to t0, then swap it >> again as you put it back into xth. So it looks like you'll translate >> 0011223344556677 8899AABBCCDDEEFF >> to >> 8899AABBCCDDEEFF 7766554433221100 >> whereas it should become >> FFEEDDCCBBAA9977 7766554433221100 > > Indeed, the third line should be a move, not a swap.
Correct, will send updated patch. Regards Nikunj