Re: [Qemu-devel] Re: [PATCH] target-ppc: ext32u instead of andi with constant

2011-03-22 Thread Aurelien Jarno
On Tue, Mar 22, 2011 at 08:36:26AM +0100, Alexander Graf wrote:
 
 On 22.03.2011, at 07:41, Aurelien Jarno wrote:
 
  Cc: Alexander Graf ag...@suse.de
  Signed-off-by: Aurelien Jarno aurel...@aurel32.net
  ---
  target-ppc/translate.c |8 
  1 files changed, 4 insertions(+), 4 deletions(-)
  
  diff --git a/target-ppc/translate.c b/target-ppc/translate.c
  index 3d265e3..49eab28 100644
  --- a/target-ppc/translate.c
  +++ b/target-ppc/translate.c
  @@ -6975,7 +6975,7 @@ static inline void gen_evmergelo(DisasContext *ctx)
  #if defined(TARGET_PPC64)
  TCGv t0 = tcg_temp_new();
  TCGv t1 = tcg_temp_new();
  -tcg_gen_andi_tl(t0, cpu_gpr[rB(ctx-opcode)], 0xLL);
  +tcg_gen_ext32u_tl(t0, cpu_gpr[rB(ctx-opcode)]);
  tcg_gen_shli_tl(t1, cpu_gpr[rA(ctx-opcode)], 32);
  tcg_gen_or_tl(cpu_gpr[rD(ctx-opcode)], t0, t1);
  tcg_temp_free(t0);
  @@ -6994,7 +6994,7 @@ static inline void gen_evmergehilo(DisasContext *ctx)
  #if defined(TARGET_PPC64)
  TCGv t0 = tcg_temp_new();
  TCGv t1 = tcg_temp_new();
  -tcg_gen_andi_tl(t0, cpu_gpr[rB(ctx-opcode)], 0xLL);
  +tcg_gen_ext32u_tl(t0, cpu_gpr[rB(ctx-opcode)]);
  tcg_gen_andi_tl(t1, cpu_gpr[rA(ctx-opcode)], 0x000ULL);
 
 Wouldn't deposit make sense here? But that can be a later optimization.

Indeed it makes sense here, the thing is that I don't really know how 
deposit is going. We have merged it, but we don't have the expected 
performance (ie no improvement).

  tcg_gen_or_tl(cpu_gpr[rD(ctx-opcode)], t0, t1);
  tcg_temp_free(t0);
  @@ -7083,14 +7083,14 @@ static inline void gen_evsel(DisasContext *ctx)
  tcg_gen_andi_i32(t0, cpu_crf[ctx-opcode  0x07], 1  2);
  tcg_gen_brcondi_i32(TCG_COND_EQ, t0, 0, l3);
  #if defined(TARGET_PPC64)
  -tcg_gen_andi_tl(t2, cpu_gpr[rA(ctx-opcode)], 0xULL);
  +tcg_gen_ext32u_tl(t2, cpu_gpr[rA(ctx-opcode)]);
  #else
  tcg_gen_mov_tl(cpu_gpr[rD(ctx-opcode)], cpu_gpr[rA(ctx-opcode)]);
  #endif
  tcg_gen_br(l4);
  gen_set_label(l3);
  #if defined(TARGET_PPC64)
  -tcg_gen_andi_tl(t2, cpu_gpr[rB(ctx-opcode)], 0xULL);
  +tcg_gen_ext32u_tl(t2, cpu_gpr[rB(ctx-opcode)]);
  #else
  tcg_gen_mov_tl(cpu_gpr[rD(ctx-opcode)], cpu_gpr[rB(ctx-opcode)]);
  #endif
 
 Acked-by: Alexander Graf ag...@suse.de
 
 
 Alex
 
 
 

-- 
Aurelien Jarno  GPG: 1024D/F1BCDB73
aurel...@aurel32.net http://www.aurel32.net



Re: [Qemu-devel] Re: [PATCH] target-ppc: ext32u instead of andi with constant

2011-03-22 Thread Alexander Graf

Am 22.03.2011 um 10:09 schrieb Aurelien Jarno aurel...@aurel32.net:

 On Tue, Mar 22, 2011 at 08:36:26AM +0100, Alexander Graf wrote:
 
 On 22.03.2011, at 07:41, Aurelien Jarno wrote:
 
 Cc: Alexander Graf ag...@suse.de
 Signed-off-by: Aurelien Jarno aurel...@aurel32.net
 ---
 target-ppc/translate.c |8 
 1 files changed, 4 insertions(+), 4 deletions(-)
 
 diff --git a/target-ppc/translate.c b/target-ppc/translate.c
 index 3d265e3..49eab28 100644
 --- a/target-ppc/translate.c
 +++ b/target-ppc/translate.c
 @@ -6975,7 +6975,7 @@ static inline void gen_evmergelo(DisasContext *ctx)
 #if defined(TARGET_PPC64)
TCGv t0 = tcg_temp_new();
TCGv t1 = tcg_temp_new();
 -tcg_gen_andi_tl(t0, cpu_gpr[rB(ctx-opcode)], 0xLL);
 +tcg_gen_ext32u_tl(t0, cpu_gpr[rB(ctx-opcode)]);
tcg_gen_shli_tl(t1, cpu_gpr[rA(ctx-opcode)], 32);
tcg_gen_or_tl(cpu_gpr[rD(ctx-opcode)], t0, t1);
tcg_temp_free(t0);
 @@ -6994,7 +6994,7 @@ static inline void gen_evmergehilo(DisasContext *ctx)
 #if defined(TARGET_PPC64)
TCGv t0 = tcg_temp_new();
TCGv t1 = tcg_temp_new();
 -tcg_gen_andi_tl(t0, cpu_gpr[rB(ctx-opcode)], 0xLL);
 +tcg_gen_ext32u_tl(t0, cpu_gpr[rB(ctx-opcode)]);
tcg_gen_andi_tl(t1, cpu_gpr[rA(ctx-opcode)], 0x000ULL);
 
 Wouldn't deposit make sense here? But that can be a later optimization.
 
 Indeed it makes sense here, the thing is that I don't really know how 
 deposit is going. We have merged it, but we don't have the expected 
 performance (ie no improvement).

Yup, as on x86 we have to special-case a lot. The current x86 deposit 
implementation doesn't implement too many of those :).

On ppc for example, the world looks different. And even without speedup, I like 
the cleanup it provides.

Btw, I do use deposit on the s390 target implementation that is finally able to 
run Debian guests as well. All I need to do to finish it up is to properly 
split it up into smaller readable patches. So please decide on deposit soon, 
otherwise we'll have a user merged :).


Alex