On 11/24/20 9:28 AM, Richard Henderson wrote:
>> +        tcg_gen_extract_i64(t0, cpu_gpr[a->rt], 0, 16);
>> +        tcg_gen_deposit_i64(cpu_gpr[a->rd], cpu_gpr[a->rd], t0, 0, 16);
>> +        tcg_gen_deposit_i64(cpu_gpr[a->rd], cpu_gpr[a->rd], t0, 16, 16);
>> +        tcg_gen_deposit_i64(cpu_gpr[a->rd], cpu_gpr[a->rd], t0, 32, 16);
>> +        tcg_gen_deposit_i64(cpu_gpr[a->rd], cpu_gpr[a->rd], t0, 48, 16);
> 
> Actually, this would be better as
> 
>    tcg_gen_ext16u_i64(t0, cpu_gpr[rt]);
>    tcg_gen_muli_i64(cpu_gpr[a->rd], t0, dup_const(1, MO_16));

Hmm, while that's fine for 64-bit hosts (and ideal for x86_64), it's not ideal
for the 32-bit hosts we have left.

This can also be done with

  // replicate lower 16 bits, garbage in upper 32.
  tcg_gen_deposit_i64(cpu_gpr[a->rd], cpu_gpr[a->rt],
                      cpu_gpr[a->rt], 16, 48);
  // replicate lower 32 bits
  tcg_gen_deposit_i64(cpu_gpr[a->rd], cpu_gpr[a->rd],
                      cpu_gpr[a->rd], 32, 32);
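
As a sanity check, here is a minimal host-side sketch in plain C, not TCG code. It assumes deposit semantics of "replace LEN bits at offset OFS of the first operand with the low LEN bits of the second"; dep64(), rep16_mul() and rep16_dep() are hypothetical helper names used only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical model of a 64-bit deposit: return VALUE with the LEN bits
 * starting at bit OFS replaced by the low LEN bits of FIELD.
 * Assumes 1 <= len <= 64 and ofs + len <= 64.
 */
static uint64_t dep64(uint64_t value, int ofs, int len, uint64_t field)
{
    uint64_t mask = (~0ULL >> (64 - len)) << ofs;
    return (value & ~mask) | ((field << ofs) & mask);
}

/* Multiply form: zero-extend the low 16 bits, then multiply by 0x0001000100010001. */
static uint64_t rep16_mul(uint64_t rt)
{
    return (rt & 0xffff) * 0x0001000100010001ULL;
}

/* Two-deposit form, mirroring the sequence above. */
static uint64_t rep16_dep(uint64_t rt)
{
    /* replicate lower 16 bits into bits 16..31, garbage in upper 32 */
    uint64_t rd = dep64(rt, 16, 48, rt);
    /* replicate lower 32 bits into the upper 32 */
    return dep64(rd, 32, 32, rd);
}

/* Assert that both forms agree for a given input. */
static void check_rep16(uint64_t rt)
{
    assert(rep16_dep(rt) == rep16_mul(rt));
}
```

Calling check_rep16() on a few values with non-zero upper bits (for example 0x123456789abcdef0) exercises the "garbage in upper 32" step, since the second deposit has to overwrite whatever the first one left there.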


r~
