Richard Henderson <r...@twiddle.net> writes:
> On 10/12/2016 07:21 PM, David Gibson wrote:
>>> +static void gen_bswap32x4(TCGv_i64 outh, TCGv_i64 outl,
>>> + TCGv_i64 inh, TCGv_i64 inl)
>>> + TCGv_i64 hi = tcg_temp_new_i64();
>>> + TCGv_i64 lo = tcg_temp_new_i64();
>>> + tcg_gen_bswap64_i64(hi, inh);
>>> + tcg_gen_bswap64_i64(lo, inl);
>>> + tcg_gen_shri_i64(outh, hi, 32);
>>> + tcg_gen_deposit_i64(outh, outh, hi, 32, 32);
>>> + tcg_gen_shri_i64(outl, lo, 32);
>>> + tcg_gen_deposit_i64(outl, outl, lo, 32, 32);
>>> + tcg_temp_free_i64(hi);
>>> + tcg_temp_free_i64(lo);
>> Is there actually any advantage to having this 128-bit operation
>> working on two 64-bit "register"s, as opposed to having a bswap32x2
>> that operates on a single 64-bit register amd calling it twice?
> For this one, no particular advantage.
>>> + gen_bswap16x8(xth, xtl, xbh, xbl);
>> Likewise for the 16x8 version, I guess, although that would mean
>> changing the existing users.
> For this one, we have to build a 64-bit constant, 0x00ff00ff00ff00ff. On
> hosts that's up to 6 insns. Being about to reuse that for both swaps is
>>> + tcg_gen_bswap64_i64(t0, xbl);
>>> + tcg_gen_bswap64_i64(xtl, xbh);
>>> + tcg_gen_bswap64_i64(xth, t0);
>> This looks wrong. You swap xbl as you move it to t0, then swap it
>> again as you put it back into xth. So it looks like you'll translate
>> 0011223344556677 8899AABBCCDDEEFF
>> 8899AABBCCDDEEFF 7766554433221100
>> whereas it should become
>> FFEEDDCCBBAA9977 7766554433221100
> Indeed, the third line should be a move, not a swap.
Correct, will send updated patch.