This patch adds a disassembler for the Loongson 2F architecture.
v2:
Fixed coding style problems.
Added comments related to licence and author.
Stefan Brankovic (1):
disas: mips: Add Loongson 2F disassembler
MAINTAINERS |1 +
configure |1 +
disas/Makefile.objs
On 3.7.20. 12:09, Thomas Huth wrote:
On 03/07/2020 11.49, Stefan Brankovic wrote:
On 3.7.20. 09:59, Thomas Huth wrote:
On 02/07/2020 21.42, Stefan Brankovic wrote:
Add disassembler for Loongson 2F instruction set.
Testing is done by comparing qemu disassembly output, obtained by
using -d
On 3.7.20. 09:59, Thomas Huth wrote:
On 02/07/2020 21.42, Stefan Brankovic wrote:
Add disassembler for Loongson 2F instruction set.
Testing is done by comparing qemu disassembly output, obtained by
using -d in_asm command line option, with appropriate objdump output.
Signed-off-by: Stefan
This patch adds a disassembler for the Loongson 2F instruction set.
Stefan Brankovic (1):
disas: mips: Add Loongson 2F disassembler
MAINTAINERS |1 +
configure |1 +
disas/Makefile.objs |1 +
disas/loongson2f.cpp| 8134
On 2.6.20. 10:52, Aleksandar Markovic wrote:
Stefan Brankovic wants to use his new email address for his future
work in QEMU.
CC: Stefan Brankovic
Signed-off-by: Aleksandar Markovic
Reviewed-by: Stefan Brankovic
---
.mailmap | 1 +
1 file changed, 1 insertion(+)
diff --git a/.mailmap
for the lower doubleword element of vB.
Signed-off-by: Stefan Brankovic
---
target/ppc/helper.h | 2 -
target/ppc/int_helper.c | 9 ---
target/ppc/translate/vmx-impl.inc.c | 132 +++-
3 files changed, 130 insertions(+), 13 deletions
iterations.
Signed-off-by: Stefan Brankovic
---
target/ppc/helper.h | 1 -
target/ppc/int_helper.c | 21 -
target/ppc/translate/vmx-impl.inc.c | 93 -
3 files changed, 92 insertions(+), 23 deletions(-)
diff --git a/target/ppc
.
V3:
Fixed problem during build.
V2:
Addressed Richard Henderson's suggestions.
Fixed problem during build on patch 2/8.
Rebased series to the latest qemu code.
Stefan Brankovic (3):
target/ppc: Optimize emulation of vclzh and vclzb instructions
target/ppc: Optimize emulation of vpkpx
variable 'result', that is later transferred
to the destination register. Inner 'for' loop does unpacking of pixels in
two iterations. Each iteration takes 16 bits from source register and
unpacks them into 32 bits of the destination register.
Signed-off-by: Stefan Brankovic
---
target/ppc
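The unpacking step described above can be sketched in plain C. This is only an illustration of the data movement, not the TCG code from the patch: the function names are hypothetical, and the assumption that the first pixel lands in the upper 32 bits of the doubleword is made for the example.

```c
#include <stdint.h>

/*
 * Unpack one 16-bit 1-5-5-5 pixel into a 32-bit 8-8-8-8 pixel.
 * The 1-bit field is replicated across the top byte; each 5-bit
 * field is placed in the low bits of its own byte.
 */
uint32_t unpack_pixel(uint16_t e)
{
    uint32_t a = (e >> 15) ? 0xffu : 0u;
    uint32_t r = (e >> 10) & 0x1fu;
    uint32_t g = (e >> 5) & 0x1fu;
    uint32_t b = e & 0x1fu;

    return (a << 24) | (r << 16) | (g << 8) | b;
}

/*
 * Two iterations, as in the inner loop of the description: each one
 * takes 16 bits and contributes 32 bits to the resulting doubleword.
 * Assumption: the first pixel ends up in the upper half.
 */
uint64_t unpack_two_pixels(uint16_t first, uint16_t second)
{
    const uint16_t src[2] = { first, second };
    uint64_t result = 0;

    for (int i = 0; i < 2; i++) {
        result = (result << 32) | unpack_pixel(src[i]);
    }
    return result;
}
```

Running the two-pixel routine once per doubleword of the destination would cover the whole 128-bit register.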
into 32 bits of the destination register.
Signed-off-by: Stefan Brankovic
---
target/ppc/helper.h | 2 -
target/ppc/int_helper.c | 20 -
target/ppc/translate/vmx-impl.inc.c | 82 -
3 files changed, 80 insertions(+), 24 deletions
for the lower doubleword element of vB.
Signed-off-by: Stefan Brankovic
---
target/ppc/helper.h | 2 -
target/ppc/int_helper.c | 9 ---
target/ppc/translate/vmx-impl.inc.c | 132 +++-
3 files changed, 130 insertions(+), 13 deletions
:
Addressed Richard Henderson's suggestions.
Fixed problem during build on patch 2/8.
Rebased series to the latest qemu code.
Stefan Brankovic (3):
target/ppc: Optimize emulation of vclzh and vclzb instructions
target/ppc: Optimize emulation of vpkpx instruction
target/ppc: Optimize emulation
iterations.
Signed-off-by: Stefan Brankovic
---
target/ppc/helper.h | 1 -
target/ppc/int_helper.c | 21 -
target/ppc/translate/vmx-impl.inc.c | 93 -
3 files changed, 92 insertions(+), 23 deletions(-)
diff --git a/target/ppc
Hello Aleksandar,
Thank you for taking a look at this patch. I will start working on
version 8 of the patch, where I will address all your suggestions.
Kind Regards,
Stefan
On 19.10.19. 22:40, Aleksandar Markovic wrote:
On Thursday, October 17, 2019, Stefan Brankovic
the same way.
It also stores the result of every iteration in a temporary register, which is later
transferred to the destination register. The inner 'for' loop unpacks the pixels
and forms the resulting doubleword 32 bits at a time.
Signed-off-by: Stefan Brankovic
---
target/ppc/helper.h | 2
-off-by: Stefan Brankovic
---
target/ppc/helper.h | 2 -
target/ppc/int_helper.c | 9 ---
target/ppc/translate/vmx-impl.inc.c | 136 +++-
3 files changed, 134 insertions(+), 13 deletions(-)
diff --git a/target/ppc/helper.h b/target/ppc
iterations, 1 for each pixel) and save the result in the tmp variable.
At the end of the outer for loop, the result is merged into the variable called
result and saved in the appropriate doubleword element of vD once the whole
doubleword is finished (every second iteration). The outer loop has 4
iterations.
Signed-off-by: Stefan
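The loop structure described above can be sketched in plain C. This is a model of the data flow only, not the TCG code from the patch: the function names, the flat array of eight source pixels, and the choice to put earlier pixels in more significant positions are assumptions made for the example.

```c
#include <stdint.h>

/*
 * Pack one 32-bit pixel into 16 bits: one bit from the top byte,
 * then the five most significant bits of each remaining byte.
 */
uint16_t pack_pixel(uint32_t e)
{
    return (uint16_t)(((e >> 9) & 0xfc00u) |
                      ((e >> 6) & 0x03e0u) |
                      ((e >> 3) & 0x001fu));
}

/*
 * Outer loop: 4 iterations, one per source doubleword (2 pixels each).
 * Inner loop: 2 iterations, one per pixel, accumulating into tmp.
 * Every second outer iteration a full destination doubleword is ready.
 */
void pack_pixels(const uint32_t src[8], uint64_t dst[2])
{
    uint64_t result = 0;

    for (int i = 0; i < 4; i++) {
        uint32_t tmp = 0;

        for (int j = 0; j < 2; j++) {
            tmp = (tmp << 16) | pack_pixel(src[2 * i + j]);
        }
        result = (result << 32) | tmp;
        if (i % 2 == 1) {
            dst[i / 2] = result;
            result = 0;
        }
    }
}
```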
problem during build on patch 2/8.
Rebased series to the latest qemu code.
Stefan Brankovic (3):
target/ppc: Optimize emulation of vclzh and vclzb instructions
target/ppc: Optimize emulation of vpkpx instruction
target/ppc: Optimize emulation of vupkhpx and vupklpx instructions
target/ppc
On 29.8.19. 17:31, Richard Henderson wrote:
On 8/29/19 6:34 AM, Stefan Brankovic wrote:
Then I ran my performance tests and got the following results (the test calls
vpkpx 10 times):
1) Current helper implementation: ~ 157 ms
2) helper implementation you suggested: ~94 ms
3) tcg
improvement compared to old helper implementation.
V1 of this patch was not sent to qemu-devel and I am now sending V2 to
appropriate email addresses.
Stefan Brankovic (1):
target/ppc: Fix for optimized vsl/vsr instructions
target/ppc/translate/vmx-impl.inc.c | 84
Suggested-by: Aleksandar Markovic
Signed-off-by: Stefan Brankovic
---
target/ppc/translate/vmx-impl.inc.c | 84 ++---
1 file changed, 40 insertions(+), 44 deletions(-)
diff --git a/target/ppc/translate/vmx-impl.inc.c
b/target/ppc/translate/vmx-impl.inc.c
index
Please take a look at the following patch
https://lists.nongnu.org/archive/html/qemu-ppc/2019-10/msg00133.html and
let me know if the problem is solved.
On 2.10.19. 16:08, Stefan Brankovic wrote:
Hi Mark,
Thank you for reporting this bug. I was away from the office for a couple of
days, so that's why
Hi Mark,
Thank you for reporting this bug. I was away from the office for a couple of
days, so that's why I am answering you a bit late, sorry about that. I
will start working on a solution and try to fix this problem in the next
couple of days.
On 1.10.19. 20:24, Mark Cave-Ayland wrote:
On
On 27.8.19. 20:52, Richard Henderson wrote:
On 8/27/19 2:37 AM, Stefan Brankovic wrote:
+for (i = 0; i < 4; i++) {
+switch (i) {
+case 0:
+/*
+ * Get high doubleword of vA to perform 6-5-5 pack of pixels
+ * 1 an
-off-by: Stefan Brankovic
---
target/ppc/helper.h | 2 -
target/ppc/int_helper.c | 9 ---
target/ppc/translate/vmx-impl.inc.c | 136 +++-
3 files changed, 134 insertions(+), 13 deletions(-)
diff --git a/target/ppc/helper.h b/target/ppc
, and second
one with a helper.
Signed-off-by: Stefan Brankovic
---
target/ppc/translate/vmx-impl.inc.c | 66 +
1 file changed, 37 insertions(+), 29 deletions(-)
diff --git a/target/ppc/translate/vmx-impl.inc.c
b/target/ppc/translate/vmx-impl.inc.c
index
iterations, 1 for each pixel) and save the result in the tmp variable.
At the end of the outer for loop, the result is merged into the variable called
result and saved in the appropriate doubleword element of vD once the whole
doubleword is finished (every second iteration). The outer loop has 4
iterations.
Signed-off-by: Stefan
) in tcg.
Implemented vector vmrgh and vmrgl instructions for i386.
Converted vmrgh and vmrgl instructions to vector operations.
V3:
Fixed problem during build.
V2:
Addressed Richard Henderson's suggestions.
Fixed problem during build on patch 2/8.
Rebased series to the latest qemu code.
Stefan
-off-by: Stefan Brankovic
---
target/ppc/helper.h | 2 -
target/ppc/int_helper.c | 9 ---
target/ppc/translate/vmx-impl.inc.c | 136 +++-
3 files changed, 134 insertions(+), 13 deletions(-)
diff --git a/target/ppc/helper.h b/target/ppc
(one for each
word element of source register vB). Every iteration consists of loading
appropriate word element from source register, counting leading zeros
with tcg_gen_clzi_i32, and saving the result in appropriate word element
of destination register.
Signed-off-by: Stefan Brankovic
Reviewed
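The per-word iteration described above can be modelled in plain C. This is a sketch under assumptions, not the patch itself: the function names are hypothetical, and the helper mimics the "count leading zeros, with a fixed result for a zero input" behaviour that the description attributes to tcg_gen_clzi_i32, using 32 as the zero-input result.

```c
#include <stdint.h>

/*
 * Count leading zeros of a 32-bit word, returning zero_val when the
 * input is zero (there is no set bit to stop at in that case).
 */
uint32_t clz32_or(uint32_t x, uint32_t zero_val)
{
    uint32_t n = 0;

    if (x == 0) {
        return zero_val;
    }
    while (!(x & 0x80000000u)) {
        x <<= 1;
        n++;
    }
    return n;
}

/*
 * One iteration per word element: load the source word, count its
 * leading zeros, store the count in the matching destination word.
 */
void count_leading_zeros_words(const uint32_t vb[4], uint32_t vd[4])
{
    for (int i = 0; i < 4; i++) {
        vd[i] = clz32_or(vb[i], 32);
    }
}
```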
instruction two times (once for each
doubleword element of source register vB) and placing result in
appropriate doubleword element of destination register vD.
Signed-off-by: Stefan Brankovic
Reviewed-by: Richard Henderson
---
target/ppc/helper.h | 1 -
target/ppc/int_helper.c
doubleword element of destination register vD.
Signed-off-by: Stefan Brankovic
Reviewed-by: Richard Henderson
---
target/ppc/helper.h | 1 -
target/ppc/int_helper.c | 276
target/ppc/translate/vmx-impl.inc.c | 77 +-
3 files
, and second
one with a helper.
Signed-off-by: Stefan Brankovic
---
target/ppc/translate/vmx-impl.inc.c | 66 +
1 file changed, 37 insertions(+), 29 deletions(-)
diff --git a/target/ppc/translate/vmx-impl.inc.c
b/target/ppc/translate/vmx-impl.inc.c
index
iterations, 1 for each pixel) and save the result in the tmp variable.
At the end of the outer for loop, the result is merged into the variable called
result and saved in the appropriate doubleword element of vD once the whole
doubleword is finished (every second iteration). The outer loop has 4
iterations.
Signed-off-by: Stefan
obtained is placed in lower doubleword element
of vD.
Signed-off-by: Stefan Brankovic
Reviewed-by: Richard Henderson
---
target/ppc/helper.h | 2 -
target/ppc/int_helper.c | 18 --
target/ppc/translate/vmx-impl.inc.c | 121 ++--
3
higher doubleword element, shift operation
is performed on lower doubleword element of vA, with replacement of
highest sh bits (that are now 0) with bits saved in shifted.
Signed-off-by: Stefan Brankovic
Reviewed-by: Richard Henderson
---
target/ppc/helper.h | 2 -
target/ppc
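The two-halves right shift described above can be sketched in plain C. This is an illustration only, not the TCG code from the patch: the vec128 type and function name are hypothetical, and the shift amount is assumed to be limited to 0..7 bits.

```c
#include <stdint.h>

/* Hypothetical 128-bit value held as two doublewords. */
typedef struct {
    uint64_t hi;
    uint64_t lo;
} vec128;

/*
 * Shift a 128-bit value right by sh bits (0..7). The sh bits that
 * fall off the bottom of the higher doubleword are saved in 'shifted'
 * and then placed into the highest sh bits of the lower doubleword,
 * which are zero after its own shift.
 */
vec128 shift_right_128(vec128 a, unsigned sh)
{
    vec128 r;
    uint64_t shifted;

    sh &= 7;
    /* Shifting a 64-bit value by 64 is undefined in C, so guard sh == 0. */
    shifted = sh ? (a.hi << (64 - sh)) : 0;
    r.hi = a.hi >> sh;
    r.lo = (a.lo >> sh) | shifted;
    return r;
}
```

The left-shift counterpart would mirror this, saving the top bits of the lower doubleword and merging them into the bottom of the higher one.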
instructions to vector operations.
V3:
Fixed problem during build.
V2:
Addressed Richard Henderson's suggestions.
Fixed problem during build on patch 2/8.
Rebased series to the latest qemu code.
Stefan Brankovic (8):
target/ppc: Optimize emulation of lvsl and lvsr instructions
target/ppc
Signed-off-by: Stefan Brankovic
---
tcg/i386/tcg-target.h | 2 +-
tcg/i386/tcg-target.inc.c | 10 ++
2 files changed, 11 insertions(+), 1 deletion(-)
diff --git a/tcg/i386/tcg-target.h b/tcg/i386/tcg-target.h
index e825324..d20d08f 100644
--- a/tcg/i386/tcg-target.h
+++ b/tcg/i386
Signed-off-by: Stefan Brankovic
---
accel/tcg/tcg-runtime-gvec.c | 42 ++
accel/tcg/tcg-runtime.h | 4
tcg/i386/tcg-target.h| 1 +
tcg/tcg-op-gvec.c| 23 +++
tcg/tcg-op-gvec.h| 3 +++
tcg/tcg
, and second
one with a helper.
Signed-off-by: Stefan Brankovic
---
target/ppc/translate/vmx-impl.inc.c | 66 +
1 file changed, 37 insertions(+), 29 deletions(-)
diff --git a/target/ppc/translate/vmx-impl.inc.c
b/target/ppc/translate/vmx-impl.inc.c
index
Signed-off-by: Stefan Brankovic
---
accel/tcg/tcg-runtime-gvec.c | 42 ++
accel/tcg/tcg-runtime.h | 4
tcg/i386/tcg-target.h| 1 +
tcg/tcg-op-gvec.c| 24
tcg/tcg-op-gvec.h| 2 ++
tcg/tcg
Signed-off-by: Stefan Brankovic
---
tcg/i386/tcg-target.h | 2 +-
tcg/i386/tcg-target.inc.c | 19 +++
2 files changed, 20 insertions(+), 1 deletion(-)
diff --git a/tcg/i386/tcg-target.h b/tcg/i386/tcg-target.h
index e11b22d..daae35f 100644
--- a/tcg/i386/tcg-target.h
+++ b
-off-by: Stefan Brankovic
---
target/ppc/helper.h | 2 -
target/ppc/int_helper.c | 9 ---
target/ppc/translate/vmx-impl.inc.c | 122 +++-
3 files changed, 120 insertions(+), 13 deletions(-)
diff --git a/target/ppc/helper.h b/target/ppc
Signed-off-by: Stefan Brankovic
---
target/ppc/helper.h | 3 ---
target/ppc/int_helper.c | 2 +-
target/ppc/translate/vmx-impl.inc.c | 6 +++---
3 files changed, 4 insertions(+), 7 deletions(-)
diff --git a/target/ppc/helper.h b/target/ppc/helper.h
index ac1a5bd
doubleword element of destination register vD.
Signed-off-by: Stefan Brankovic
Reviewed-by: Richard Henderson
---
target/ppc/helper.h | 1 -
target/ppc/int_helper.c | 276
target/ppc/translate/vmx-impl.inc.c | 77 +-
3 files
obtained is placed in lower doubleword element
of vD.
Signed-off-by: Stefan Brankovic
Reviewed-by: Richard Henderson
---
target/ppc/helper.h | 2 -
target/ppc/int_helper.c | 18 --
target/ppc/translate/vmx-impl.inc.c | 121 ++--
3
(one for each
word element of source register vB). Every iteration consists of loading
appropriate word element from source register, counting leading zeros
with tcg_gen_clzi_i32, and saving the result in appropriate word element
of destination register.
Signed-off-by: Stefan Brankovic
Reviewed
Signed-off-by: Stefan Brankovic
---
target/ppc/helper.h | 3 ---
target/ppc/int_helper.c | 9 -
target/ppc/translate/vmx-impl.inc.c | 6 +++---
3 files changed, 3 insertions(+), 15 deletions(-)
diff --git a/target/ppc/helper.h b/target/ppc/helper.h
index
instruction two times (once for each
doubleword element of source register vB) and placing result in
appropriate doubleword element of destination register vD.
Signed-off-by: Stefan Brankovic
Reviewed-by: Richard Henderson
---
target/ppc/helper.h | 1 -
target/ppc/int_helper.c
instructions for i386.
Converted vmrgh and vmrgl instructions to vector operations.
V3:
Fixed problem during build.
V2:
Addressed Richard Henderson's suggestions.
Fixed problem during build on patch 2/8.
Rebased series to the latest qemu code.
Stefan Brankovic (13):
target/ppc: Optimize emulation
higher doubleword element, shift operation
is performed on lower doubleword element of vA, with replacement of
highest sh bits (that are now 0) with bits saved in shifted.
Signed-off-by: Stefan Brankovic
Reviewed-by: Richard Henderson
---
target/ppc/helper.h | 2 -
target/ppc
of some Altivec
Date: Monday, June 24, 2019 13:20 CEST
From: Howard Spoelstra
To: Stefan Brankovic
CC: qemu-devel qemu-devel
References: <1561371065-3637-1-git-send-email-stefan.branko...@rt-rk.com>
<43c6-5d10a600-15-34dab4c0@176981179>
On Mon, Jun 24, 2019 at 12:28 PM Stef
Original Message
Subject: [PATCH v3 0/8] target/ppc: Optimize emulation of some Altivec
Date: Monday, June 24, 2019 12:10 CEST
From: Stefan Brankovic
To: stefan.branko...@rt-rk.com
Optimize emulation of ten Altivec instructions: lvsl, lvsr, vsl, vsr, vpkpx,
vgbbd, vclzb, vclzh, vclzw and vclzd
instruction two times (once for each
doubleword element of source register vB) and placing result in
appropriate doubleword element of destination register vD.
Signed-off-by: Stefan Brankovic
Reviewed-by: Richard Henderson
---
target/ppc/helper.h | 1 -
target/ppc/int_helper.c
-off-by: Stefan Brankovic
---
target/ppc/helper.h | 2 -
target/ppc/int_helper.c | 9 ---
target/ppc/translate/vmx-impl.inc.c | 122 +++-
3 files changed, 120 insertions(+), 13 deletions(-)
diff --git a/target/ppc/helper.h b/target/ppc
, and second
one with a helper.
Signed-off-by: Stefan Brankovic
---
target/ppc/translate/vmx-impl.inc.c | 66 +
1 file changed, 37 insertions(+), 29 deletions(-)
diff --git a/target/ppc/translate/vmx-impl.inc.c
b/target/ppc/translate/vmx-impl.inc.c
index
doubleword element of destination register vD.
Signed-off-by: Stefan Brankovic
---
target/ppc/helper.h | 1 -
target/ppc/int_helper.c | 276
target/ppc/translate/vmx-impl.inc.c | 77 +-
3 files changed, 76 insertions(+), 278
is presented in this series. The performance improvements are
significant in all cases.
V3:
Fixed problem during build.
V2:
Addressed Richard Henderson's suggestions.
Fixed problem during build on patch 2/8.
Rebased series to the latest qemu code.
Stefan Brankovic (8):
target/ppc: Optimize
obtained is placed in lower doubleword element
of vD.
Signed-off-by: Stefan Brankovic
---
target/ppc/helper.h | 2 -
target/ppc/int_helper.c | 18 -
target/ppc/translate/vmx-impl.inc.c | 129 +++-
3 files changed, 97 insertions(+), 52
iterations, 1 for each pixel) and save the result in the tmp variable.
At the end of the outer for loop, the result is merged into the variable called
result and saved in the appropriate doubleword element of vD once the whole
doubleword is finished (every second iteration). The outer loop has 4
iterations.
Signed-off-by: Stefan
higher doubleword element, shift operation
is performed on lower doubleword element of vA, with replacement of
highest sh bits (that are now 0) with bits saved in shifted.
Signed-off-by: Stefan Brankovic
---
target/ppc/helper.h | 2 -
target/ppc/int_helper.c | 35
(one for each
word element of source register vB). Every iteration consists of loading
appropriate word element from source register, counting leading zeros
with tcg_gen_clzi_i32, and saving the result in appropriate word element
of destination register.
Signed-off-by: Stefan Brankovic
---
target
(one for each
word element of source register vB). Every iteration consists of loading
appropriate word element from source register, counting leading zeros
with tcg_gen_clzi_i32, and saving the result in appropriate word element
of destination register.
Signed-off-by: Stefan Brankovic
---
target
is presented in this series. The performance improvements are
significant in all cases.
V2:
Addressed Richard Henderson's suggestions.
Fixed problem during build on patch 2/8.
Rebased series to the latest qemu code.
Stefan Brankovic (8):
target/ppc: Optimize emulation of lvsl and lvsr
-off-by: Stefan Brankovic
---
target/ppc/helper.h | 2 -
target/ppc/int_helper.c | 9 ---
target/ppc/translate/vmx-impl.inc.c | 122 +++-
3 files changed, 120 insertions(+), 13 deletions(-)
diff --git a/target/ppc/helper.h b/target/ppc
higher doubleword element, shift operation
is performed on lower doubleword element of vA, with replacement of
highest sh bits (that are now 0) with bits saved in shifted.
Signed-off-by: Stefan Brankovic
---
target/ppc/helper.h | 2 -
target/ppc/int_helper.c | 35
doubleword element of destination register vD.
Signed-off-by: Stefan Brankovic
---
target/ppc/helper.h | 1 -
target/ppc/int_helper.c | 276
target/ppc/translate/vmx-impl.inc.c | 77 +-
3 files changed, 76 insertions(+), 278
iterations, 1 for each pixel) and save the result in the tmp variable.
At the end of the outer for loop, the result is merged into the variable called
result and saved in the appropriate doubleword element of vD once the whole
doubleword is finished (every second iteration). The outer loop has 4
iterations.
Signed-off-by: Stefan
, and second
one with a helper.
Signed-off-by: Stefan Brankovic
---
target/ppc/translate/vmx-impl.inc.c | 66 +
1 file changed, 37 insertions(+), 29 deletions(-)
diff --git a/target/ppc/translate/vmx-impl.inc.c
b/target/ppc/translate/vmx-impl.inc.c
index
obtained is placed in lower doubleword element
of vD.
Signed-off-by: Stefan Brankovic
---
target/ppc/helper.h | 2 -
target/ppc/int_helper.c | 18 --
target/ppc/translate/vmx-impl.inc.c | 120 ++--
3 files changed, 88 insertions
instruction two times (once for each
doubleword element of source register vB) and placing result in
appropriate doubleword element of destination register vD.
Signed-off-by: Stefan Brankovic
Reviewed-by: Richard Henderson
---
target/ppc/helper.h | 1 -
target/ppc/int_helper.c
On 6.6.19. 20:19, Richard Henderson wrote:
On 6/6/19 5:15 AM, Stefan Brankovic wrote:
Optimize altivec instruction vgbbd (Vector Gather Bits by Bytes by Doubleword)
All ith bits (i in range 1 to 8) of each byte of doubleword element in
source register are concatenated and placed into ith byte
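The gather described above amounts to an 8x8 bit transpose within each doubleword, which can be sketched in plain C. This is a straightforward reference loop, not the optimized translation from the patch: the function name is hypothetical, indices are 0-based rather than 1 to 8, and bytes and bits are assumed to be numbered from the most significant end.

```c
#include <stdint.h>

/*
 * For one doubleword: bit i of source byte j becomes bit j of
 * destination byte i. Byte 0 is the most significant byte and
 * bit 0 the most significant bit of a byte (an assumption made
 * for this sketch).
 */
uint64_t gather_bits_by_bytes(uint64_t src)
{
    uint64_t dst = 0;

    for (int i = 0; i < 8; i++) {
        for (int j = 0; j < 8; j++) {
            uint64_t bit = (src >> (63 - (8 * j + i))) & 1u;

            dst |= bit << (63 - (8 * i + j));
        }
    }
    return dst;
}
```

Applying the routine twice returns the original value, which is a convenient sanity check when comparing against an optimized version.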
On 6.6.19. 20:34, Richard Henderson wrote:
On 6/6/19 5:15 AM, Stefan Brankovic wrote:
+for (i = 0; i < 2; i++) {
+if (i == 0) {
+/* Get high doubleword element of vB in avr. */
+get_avr64(avr, VB, true);
+} else {
+/* Get low doublew
On 6.6.19. 22:43, Richard Henderson wrote:
On 6/6/19 5:15 AM, Stefan Brankovic wrote:
+/*
+ * We use this macro if one instruction is realized with direct
+ * translation, and second one with helper.
+ */
+#define GEN_VXFORM_TRANS_DUAL(name0, flg0, flg2_0, name1, flg1, flg2_1)\
+static void
On 6.6.19. 22:38, Richard Henderson wrote:
On 6/6/19 5:15 AM, Stefan Brankovic wrote:
Optimize Altivec instruction vclzh (Vector Count Leading Zeros Halfword).
This instruction counts the number of leading zeros of each halfword element
in source register and places result in the appropriate
On 6.6.19. 19:03, Richard Henderson wrote:
On 6/6/19 5:15 AM, Stefan Brankovic wrote:
+tcg_gen_subi_i64(tmp, sh, 64);
+tcg_gen_neg_i64(tmp, tmp);
Better as
tcg_gen_subfi_i64(tmp, 64, sh);
I was aware there must be a way of doing it in a single tcg invocation,
but couldn't find
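The equivalence behind this suggestion can be checked with ordinary 64-bit arithmetic. The sketch below models the two variants in plain C rather than calling TCG; the function names are hypothetical. The first mirrors the subtract-then-negate pair, the second the single "subtract from immediate" form, which computes 64 - sh directly.

```c
#include <stdint.h>

/* Two operations: tmp = sh - 64, then tmp = -tmp. */
uint64_t complement_two_ops(uint64_t sh)
{
    uint64_t tmp = sh - 64;

    /* Unsigned negation, well defined modulo 2^64. */
    return (uint64_t)0 - tmp;
}

/* One operation: tmp = 64 - sh. */
uint64_t complement_one_op(uint64_t sh)
{
    return 64 - sh;
}
```

Both forms agree for every input because -(sh - 64) and 64 - sh are the same value modulo 2^64.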
On 6.6.19. 19:13, Richard Henderson wrote:
On 6/6/19 5:15 AM, Stefan Brankovic wrote:
Stefan Brankovic (8):
target/ppc: Optimize emulation of lvsl and lvsr instructions
target/ppc: Optimize emulation of vsl and vsr instructions
target/ppc: Optimize emulation of vpkpx instruction
On 6.6.19. 18:46, Richard Henderson wrote:
On 6/6/19 5:15 AM, Stefan Brankovic wrote:
+tcg_gen_addi_i64(result, sh, 7);
+for (i = 7; i >= 1; i--) {
+tcg_gen_shli_i64(tmp, sh, i * 8);
+tcg_gen_or_i64(result, result, tmp);
+tcg_gen_addi_i64(sh, sh
>
>
> Original Message
> Subject: Re: [Qemu-devel] [PATCH 0/8] Optimize emulation of ten Altivec
> instructions: lvsl,
> Date: Thursday, June 6, 2019 19:13 CEST
> From: Richard Henderson
> To: Stefan Brankovic , qemu-devel@nongnu.org
> CC:
. At the end of the outer for loop, we merge the result into the
variable called result and save it in the appropriate doubleword element
of vD once the whole doubleword is finished (every second iteration). The outer
loop has 4 iterations.
Signed-off-by: Stefan Brankovic
---
target/ppc/translate/vmx-impl.inc.c | 93
, and second one
with helper.
Signed-off-by: Stefan Brankovic
---
target/ppc/translate/vmx-impl.inc.c | 62 -
1 file changed, 33 insertions(+), 29 deletions(-)
diff --git a/target/ppc/translate/vmx-impl.inc.c
b/target/ppc/translate/vmx-impl.inc.c
index 8535a31
and result2 is placed in appropriate
doubleword element of vD. We repeat this 2 times.
Signed-off-by: Stefan Brankovic
---
target/ppc/translate/vmx-impl.inc.c | 99 -
1 file changed, 98 insertions(+), 1 deletion(-)
diff --git a/target/ppc/translate/vmx-impl.inc.c
b
.
Signed-off-by: Stefan Brankovic
---
target/ppc/translate/vmx-impl.inc.c | 57 -
1 file changed, 56 insertions(+), 1 deletion(-)
diff --git a/target/ppc/translate/vmx-impl.inc.c
b/target/ppc/translate/vmx-impl.inc.c
index 1c34908..7689739 100644
--- a/target/ppc
instruction two times (once for each
doubleword element of source register vB) and placing result in
appropriate doubleword element of destination register vD.
Signed-off-by: Stefan Brankovic
---
target/ppc/translate/vmx-impl.inc.c | 28 +++-
1 file changed, 27 insertions(+), 1
to ppc platform,
so relatively complex TCG translation (without direct mapping to host
instruction that is not possible in these cases) seems to be the best option,
and that approach is presented in this series. The performance improvements are
significant in all cases.
Stefan Brankovic (8):
target
it in
appropriate byte of variable result) and save them in higher
doubleword element of vD. We repeat this once again for lower
doubleword element of vD by creating bytes (24-sh):(32-sh) in
a for loop and saving result.
Signed-off-by: Stefan Brankovic
---
target/ppc/translate/vmx-impl.inc.c | 143
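The byte pattern mentioned above, with the lower doubleword holding bytes 24-sh up to 31-sh, can be sketched in plain C. This is a model of the resulting vector only, not the TCG code from the patch: the function names are hypothetical, sh is assumed to be a 4-bit value, and byte 0 is assumed to be the most significant byte of the higher doubleword.

```c
#include <stdint.h>

/*
 * Build the 16 result bytes: byte i holds 16 - sh + i, so the higher
 * doubleword covers 16-sh .. 23-sh and the lower one 24-sh .. 31-sh.
 */
void build_shift_right_vector(unsigned sh, uint8_t out[16])
{
    sh &= 0xf;
    for (int i = 0; i < 16; i++) {
        out[i] = (uint8_t)(16 - sh + i);
    }
}

/* Pack 8 consecutive bytes into one doubleword, first byte on top. */
uint64_t pack_doubleword(const uint8_t *bytes)
{
    uint64_t result = 0;

    for (int i = 0; i < 8; i++) {
        result = (result << 8) | bytes[i];
    }
    return result;
}
```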
result in appropriate doubleword element of destination
register vD. We repeat this once again for lower doubleword element of
vB.
Signed-off-by: Stefan Brankovic
---
target/ppc/translate/vmx-impl.inc.c | 122 +++-
1 file changed, 120 insertions(+), 2 deletions(-)
diff
element of vA and replace
highest sh bits (that are now 0) with bits saved in shifted.
Signed-off-by: Stefan Brankovic
---
target/ppc/translate/vmx-impl.inc.c | 101 +++-
1 file changed, 99 insertions(+), 2 deletions(-)
diff --git a/target/ppc/translate/vmx-impl.inc.c
Implement emulation of TILEGX instructions V1CMPLEU and V1CMPLTU
using TCG front end operations.
Signed-off-by: Stefan Brankovic
---
target/tilegx/translate.c | 62 ---
1 file changed, 58 insertions(+), 4 deletions(-)
diff --git a/target/tilegx
Implement emulation of TILE-Gx instructions V1CMPLEU, V1CMPLTU,
V2CMPLEU, and V2CMPLTU.
Stefan Brankovic (2):
target/tilegx: Implement emulation of TILEGX instructions V1CMPLEU and
V1CMPLTU
target/tilegx: Implement emulation of TILEGX instructions V2CMPLEU and
V2CMPLTU
target/tilegx
Implement emulation of TILEGX instructions V2CMPLEU and V2CMPLTU
using TCG front end operations.
Signed-off-by: Stefan Brankovic
---
target/tilegx/translate.c | 62 ---
1 file changed, 58 insertions(+), 4 deletions(-)
diff --git a/target/tilegx