Richard Henderson <r...@twiddle.net> writes:

> On 10/12/2016 07:21 PM, David Gibson wrote:
>>> +static void gen_bswap32x4(TCGv_i64 outh, TCGv_i64 outl,
>>> +                          TCGv_i64 inh, TCGv_i64 inl)
>>> +{
>>> +    TCGv_i64 hi = tcg_temp_new_i64();
>>> +    TCGv_i64 lo = tcg_temp_new_i64();
>>> +
>>> +    tcg_gen_bswap64_i64(hi, inh);
>>> +    tcg_gen_bswap64_i64(lo, inl);
>>> +    tcg_gen_shri_i64(outh, hi, 32);
>>> +    tcg_gen_deposit_i64(outh, outh, hi, 32, 32);
>>> +    tcg_gen_shri_i64(outl, lo, 32);
>>> +    tcg_gen_deposit_i64(outl, outl, lo, 32, 32);
>>> +
>>> +    tcg_temp_free_i64(hi);
>>> +    tcg_temp_free_i64(lo);
>>> +}
>>
>> Is there actually any advantage to having this 128-bit operation
>> working on two 64-bit "register"s, as opposed to having a bswap32x2
>> that operates on a single 64-bit register amd calling it twice?
>
> For this one, no particular advantage.
>
>>> +    gen_bswap16x8(xth, xtl, xbh, xbl);
>>
>> Likewise for the 16x8 version, I guess, although that would mean
>> changing the existing users.
>
> For this one, we have to build a 64-bit constant, 0x00ff00ff00ff00ff.  On 
> some 
> hosts that's up to 6 insns.  Being about to reuse that for both swaps is 
> useful.
>
>>> +    tcg_gen_bswap64_i64(t0, xbl);
>>> +    tcg_gen_bswap64_i64(xtl, xbh);
>>> +    tcg_gen_bswap64_i64(xth, t0);
>>
>> This looks wrong.  You swap xbl as you move it to t0, then swap it
>> again as you put it back into xth.  So it looks like you'll translate
>>            0011223344556677 8899AABBCCDDEEFF
>> to
>>            8899AABBCCDDEEFF 7766554433221100
>> whereas it should become
>>         FFEEDDCCBBAA9977 7766554433221100
>
> Indeed, the third line should be a move, not a swap.

Correct, will send updated patch.

Regards
Nikunj


Reply via email to