Re: [PATCH 06/12] target/ppc: Move VAVG[SU][BHW] to decodetree and use gvec

2022-09-24 Thread Richard Henderson

On 9/23/22 21:47, Lucas Mateus Castro(alqotel) wrote:

+TCGv_vec one = tcg_temp_new_vec_matching(t);
+tcg_gen_dupi_vec(vece, one, 1);
+tcg_gen_or_vec(vece, tmp, a, b);
+tcg_gen_and_vec(vece, tmp, tmp, one);


Use tcg_constant_vec_matching here instead.  With that,
Reviewed-by: Richard Henderson 


r~



[PATCH 06/12] target/ppc: Move VAVG[SU][BHW] to decodetree and use gvec

2022-09-23 Thread Lucas Mateus Castro(alqotel)
From: "Lucas Mateus Castro (alqotel)" 

Move the instructions VAVGUB, VAVGUH, VAVGUW, VAVGSB, VAVGSH and VAVGSW
to decodetree and use gvec with them. For these instructions the right
shift has to be done before the sum to avoid an overflow, so 1 is added
at the end if either operand element has its LSB set, to replicate the
"+ 1" that the ISA applies before the shift.
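
The identity behind this can be checked on plain scalars. The following is
a minimal sketch in standalone C, not the TCG code from the patch; the
function names are hypothetical and only the unsigned byte case is covered.
It compares a widened reference, (a + b + 1) >> 1, against the narrow form
that shifts first and then adds the rounding bit taken from (a | b) & 1.

```c
#include <stdint.h>

/* Reference: widen to 16 bits so that a + b + 1 cannot overflow. */
static uint8_t avg_u8_wide(uint8_t a, uint8_t b)
{
    return (uint8_t)(((uint16_t)a + (uint16_t)b + 1u) >> 1);
}

/*
 * Narrow form: shift each operand first, then add 1 if either operand
 * had its LSB set. The largest value is 127 + 127 + 1 = 255, so the
 * sum always fits in 8 bits.
 */
static uint8_t avg_u8_narrow(uint8_t a, uint8_t b)
{
    uint8_t round = (uint8_t)((a | b) & 1);
    return (uint8_t)((a >> 1) + (b >> 1) + round);
}

/* Compare both forms over all 65536 input pairs; returns the mismatch count. */
static unsigned avg_u8_mismatches(void)
{
    unsigned bad = 0;
    for (unsigned a = 0; a < 256; a++) {
        for (unsigned b = 0; b < 256; b++) {
            if (avg_u8_wide((uint8_t)a, (uint8_t)b) !=
                avg_u8_narrow((uint8_t)a, (uint8_t)b)) {
                bad++;
            }
        }
    }
    return bad;
}
```

The two forms agree because the carry into bit 1 of a0 + b0 + 1, where a0
and b0 are the low bits, is set exactly when at least one of them is 1. The
signed variants rely on the same identity with an arithmetic right shift.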

vavgub:
rept    loop    master      patch
8       12500   0,02616600  0,00754200 (-71.2%)
25      4000    0,0253      0,00637700 (-74.8%)
100     1000    0,02604600  0,00790100 (-69.7%)
500     200     0,03189300  0,01838400 (-42.4%)
2500    40      0,06006900  0,06851000 (+14.1%)
8000    12      0,13941000  0,20548500 (+47.4%)

vavguh:
rept    loop    master      patch
8       12500   0,01818200  0,00780600 (-57.1%)
25      4000    0,01789300  0,00641600 (-64.1%)
100     1000    0,01899100  0,00787200 (-58.5%)
500     200     0,02527200  0,01828400 (-27.7%)
2500    40      0,05361800  0,06773000 (+26.3%)
8000    12      0,12886600  0,20291400 (+57.5%)

vavguw:
rept    loop    master      patch
8       12500   0,01423100  0,00776600 (-45.4%)
25      4000    0,01780800  0,00638600 (-64.1%)
100     1000    0,02085500  0,00787000 (-62.3%)
500     200     0,02737100  0,01828800 (-33.2%)
2500    40      0,05572600  0,06774200 (+21.6%)
8000    12      0,13101700  0,20311600 (+55.0%)

vavgsb:
rept    loop    master      patch
8       12500   0,03006000  0,00788600 (-73.8%)
25      4000    0,02882200  0,00637800 (-77.9%)
100     1000    0,02958000  0,00791400 (-73.2%)
500     200     0,03548800  0,01860400 (-47.6%)
2500    40      0,0636      0,06850800 (+7.7%)
8000    12      0,13816500  0,20550300 (+48.7%)

vavgsh:
rept    loop    master      patch
8       12500   0,01965900  0,00776600 (-60.5%)
25      4000    0,01875400  0,00638700 (-65.9%)
100     1000    0,01952200  0,00786900 (-59.7%)
500     200     0,02562000  0,01760300 (-31.3%)
2500    40      0,05384300  0,06742800 (+25.2%)
8000    12      0,13240800  0,2033     (+53.5%)

vavgsw:
rept    loop    master      patch
8       12500   0,01407700  0,00775600 (-44.9%)
25      4000    0,01762300  0,0064     (-63.7%)
100     1000    0,02046500  0,00788500 (-61.5%)
500     200     0,02745600  0,01843000 (-32.9%)
2500    40      0,05375500  0,06820500 (+26.9%)
8000    12      0,13068300  0,20304900 (+55.4%)

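These results seem to indicate that with gvec translation is slower but
execution is faster: the patch is faster in the runs with few repetitions
and many loops, and slower in the runs with many repetitions and few loops.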
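
The percentages in parentheses appear to be the relative change of the
patch timing against master. A minimal sketch of that arithmetic in
standalone C follows, assuming the formula (patch - master) / master * 100;
the function name is hypothetical, and the decimal commas of the tables
become decimal points in C.

```c
/*
 * Relative change of the patched timing against master, in percent.
 * Negative values mean the patched build was faster.
 */
static double pct_change(double master, double patch)
{
    return (patch - master) / master * 100.0;
}
```

For the first vavgub row, pct_change(0.02616600, 0.00754200) gives roughly
-71.2, matching the table.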

Signed-off-by: Lucas Mateus Castro (alqotel) 
---
 target/ppc/helper.h |  12 +--
 target/ppc/insn32.decode|   9 +++
 target/ppc/int_helper.c |  32 
 target/ppc/translate/vmx-impl.c.inc | 109 +---
 target/ppc/translate/vmx-ops.c.inc  |   9 +--
 5 files changed, 130 insertions(+), 41 deletions(-)

diff --git a/target/ppc/helper.h b/target/ppc/helper.h
index 6a43e32ad3..f88d9d3996 100644
--- a/target/ppc/helper.h
+++ b/target/ppc/helper.h
@@ -143,15 +143,15 @@ DEF_HELPER_FLAGS_1(ftsqrt, TCG_CALL_NO_RWG_SE, i32, i64)
 #define dh_ctype_acc ppc_acc_t *
 #define dh_typecode_acc dh_typecode_ptr
 
-DEF_HELPER_FLAGS_3(vavgub, TCG_CALL_NO_RWG, void, avr, avr, avr)
-DEF_HELPER_FLAGS_3(vavguh, TCG_CALL_NO_RWG, void, avr, avr, avr)
-DEF_HELPER_FLAGS_3(vavguw, TCG_CALL_NO_RWG, void, avr, avr, avr)
+DEF_HELPER_FLAGS_4(VAVGUB, TCG_CALL_NO_RWG, void, avr, avr, avr, i32)
+DEF_HELPER_FLAGS_4(VAVGUH, TCG_CALL_NO_RWG, void, avr, avr, avr, i32)
+DEF_HELPER_FLAGS_4(VAVGUW, TCG_CALL_NO_RWG, void, avr, avr, avr, i32)
 DEF_HELPER_FLAGS_3(vabsdub, TCG_CALL_NO_RWG, void, avr, avr, avr)
 DEF_HELPER_FLAGS_3(vabsduh, TCG_CALL_NO_RWG, void, avr, avr, avr)
 DEF_HELPER_FLAGS_3(vabsduw, TCG_CALL_NO_RWG, void, avr, avr, avr)
-DEF_HELPER_FLAGS_3(vavgsb, TCG_CALL_NO_RWG, void, avr, avr, avr)
-DEF_HELPER_FLAGS_3(vavgsh, TCG_CALL_NO_RWG, void, avr, avr, avr)
-DEF_HELPER_FLAGS_3(vavgsw, TCG_CALL_NO_RWG, void, avr, avr, avr)
+DEF_HELPER_FLAGS_4(VAVGSB, TCG_CALL_NO_RWG, void, avr, avr, avr, i32)
+DEF_HELPER_FLAGS_4(VAVGSH, TCG_CALL_NO_RWG, void, avr, avr, avr, i32)
+DEF_HELPER_FLAGS_4(VAVGSW, TCG_CALL_NO_RWG, void, avr, avr, avr, i32)
 DEF_HELPER_4(vcmpeqfp, void, env, avr, avr, avr)
 DEF_HELPER_4(vcmpgefp, void, env, avr, avr, avr)
 DEF_HELPER_4(vcmpgtfp, void, env, avr, avr, avr)
diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
index aa4968e6b9..38458c01de 100644
--- a/target/ppc/insn32.decode
+++ b/target/ppc/insn32.decode
@@ -519,6 +519,15 @@ VCMPNEZW000100 . . . . 011111   @VC
 VCMPSQ  000100 ... -- . . 0010101   @VX_bf
 VCMPUQ  000100 ... -- . . 0010001   @VX_bf
 
+## Vector Integer Average