[Qemu-devel] [Bug 1696773] Re: golang calls to exec crash user emulation
You will need to apply the patch from https://bugs.launchpad.net/qemu/+bug/1696353 to run this sample app on current master.
[Qemu-devel] [Bug 1696773] [NEW] golang calls to exec crash user emulation
Public bug reported:

An example program can be found here:

https://github.com/willnewton/qemucrash

This code starts a goroutine (thread) and calls exec repeatedly. This works ok natively but when run under ARM user emulation it segfaults (usually, there are occasionally other failures).

** Affects: qemu
   Importance: Undecided
   Status: New

https://bugs.launchpad.net/bugs/1696773
[Qemu-devel] [Bug 1696353] Re: golang binaries fail to start under linux-user
True, but it used to work albeit with slightly wrong semantics. It now fails hard even though the golang runtime doesn't make any use of Sys V semaphores so the presence of the flag is not noticeable by any normal user.
[Qemu-devel] [Bug 1696353] [NEW] golang binaries fail to start under linux-user
Public bug reported:

With current master golang binaries fail when run under linux-user, for example:

[will@localhost qemu]$ ./arm-linux-user/qemu-arm glide
runtime: failed to create new OS thread (have 2 already; errno=22)
fatal error: newosproc

runtime stack:
runtime.throw(0x45f879, 0x9)
        /usr/lib/golang/src/runtime/panic.go:566 +0x78
runtime.newosproc(0x1092c000, 0x1093bfe0)
        /usr/lib/golang/src/runtime/os_linux.go:160 +0x1b0
runtime.newm(0x4ae1e8, 0x0)
        /usr/lib/golang/src/runtime/proc.go:1572 +0x12c
runtime.main.func1()
        /usr/lib/golang/src/runtime/proc.go:126 +0x24
runtime.systemstack(0x5ef900)
        /usr/lib/golang/src/runtime/asm_arm.s:247 +0x80
runtime.mstart()
        /usr/lib/golang/src/runtime/proc.go:1079

goroutine 1 [running]:
runtime.systemstack_switch()
        /usr/lib/golang/src/runtime/asm_arm.s:192 +0x4 fp=0x109287ac sp=0x109287a8
runtime.main()
        /usr/lib/golang/src/runtime/proc.go:127 +0x5c fp=0x109287d4 sp=0x109287ac
runtime.goexit()
        /usr/lib/golang/src/runtime/asm_arm.s:998 +0x4 fp=0x109287d4 sp=0x109287d4

The reason for this is that the golang runtime does not pass the CLONE_SYSVSEM flag to clone, so the clone flags check fails:

https://github.com/golang/go/blob/master/src/runtime/os_linux.go#L155

The attached patch allows golang binaries to start under linux-user.

** Affects: qemu
   Importance: Undecided
   Status: New

** Patch added: "0001-linux-user-syscall.c-Loosen-clone-flag-check.patch"
   https://bugs.launchpad.net/bugs/1696353/+attachment/4890926/+files/0001-linux-user-syscall.c-Loosen-clone-flag-check.patch

https://bugs.launchpad.net/bugs/1696353
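The failure described in this report can be sketched in C roughly as follows. This is only an illustration, not QEMU's code: every `sk_` name is hypothetical, the flag values are copied from Linux's `<linux/sched.h>`, and the flag set attributed to the Go runtime is an assumption taken from the report (the usual thread flags without the Sys V semaphore flag, `CLONE_SYSVSEM` in the Linux headers). A strict check rejects that set, which would be consistent with the `errno=22` (EINVAL) in the log, while a loosened check accepts it.

```c
#include <stdbool.h>

/* Flag values as defined by Linux in <linux/sched.h>. The SK_ prefix marks
 * them as local to this sketch. */
#define SK_CLONE_VM      0x00000100UL
#define SK_CLONE_FS      0x00000200UL
#define SK_CLONE_FILES   0x00000400UL
#define SK_CLONE_SIGHAND 0x00000800UL
#define SK_CLONE_THREAD  0x00010000UL
#define SK_CLONE_SYSVSEM 0x00040000UL

/* Flags that any thread-style clone has to carry. */
#define SK_REQUIRED_THREAD_FLAGS \
    (SK_CLONE_VM | SK_CLONE_FS | SK_CLONE_FILES | \
     SK_CLONE_SIGHAND | SK_CLONE_THREAD)

/* Strict check: the Sys V semaphore flag is treated as mandatory too. */
static bool sk_strict_check(unsigned long flags)
{
    unsigned long want = SK_REQUIRED_THREAD_FLAGS | SK_CLONE_SYSVSEM;
    return (flags & want) == want;
}

/* Loosened check: the Sys V semaphore flag is accepted but not required. */
static bool sk_loose_check(unsigned long flags)
{
    return (flags & SK_REQUIRED_THREAD_FLAGS) == SK_REQUIRED_THREAD_FLAGS;
}

/* Assumed flag set of the Go runtime at the time: no CLONE_SYSVSEM. */
static const unsigned long sk_go_flags = SK_REQUIRED_THREAD_FLAGS;
```

With these definitions `sk_strict_check(sk_go_flags)` is false and `sk_loose_check(sk_go_flags)` is true, which mirrors the "used to work, now fails hard" behaviour discussed in the follow-up.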
[Qemu-devel] [Bug 1312561] Re: libstdc++-6.dll is missing from your computer
Also getting same error when running the following command on Windows 7 64 bit.

qemu-system-arm -cpu?

I also reinstalled qemu without any luck.

--
You received this bug notification because you are a member of qemu-devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1312561

Title:
  libstdc++-6.dll is missing from your computer

Status in QEMU:
  New

Bug description:
  qemu-w64-setup-20140418.exe

  Windows 7 64 bit PC.

  qemu-system-armw -kernel kernel-qemu -cpu arm1176 -m 256 -M versatilepb -no-reboot -serial stdio -append root=/dev/sda2 panic=1 rootfstype=ext4 rw -hda c:\11\rasimg\test.vhd

  qemu-system-armw.exe - System Error
  The program can't start because libstdc++-6.dll is missing from your computer. Try reinstalling the program to fix this problem.

  I tried reinstalling, but no change.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1312561/+subscriptions
[Qemu-devel] [PATCH v4 2/2] target-arm: Add support for AArch32 ARMv8 CRC32 instructions
Add support for AArch32 CRC32 and CRC32C instructions added in ARMv8
and add a CPU feature flag to enable these instructions.

The CRC32-C implementation used is the built-in qemu implementation
and The CRC-32 implementation is from zlib. This requires adding zlib
to LIBS to ensure it is linked for the linux-user binary.

Signed-off-by: Will Newton <will.new...@linaro.org>
---
 configure              |  2 +-
 target-arm/cpu.c       |  1 +
 target-arm/cpu.h       |  1 +
 target-arm/helper.c    | 39 +++
 target-arm/helper.h    |  3 +++
 target-arm/translate.c | 56 ++
 6 files changed, 101 insertions(+), 1 deletion(-)

Changes in v4:
 - Add feature flag for CRC instructions
 - Check for SBZ bits in ARM encoding
 - Use more accurate flags for the helper definition

diff --git a/configure b/configure
index 423f435..030da86 100755
--- a/configure
+++ b/configure
@@ -1657,7 +1657,7 @@ EOF
             "Make sure to have the zlib libs and headers installed."
     fi
 fi
-libs_softmmu="$libs_softmmu -lz"
+LIBS="$LIBS -lz"
 
 ##
 # libseccomp check
diff --git a/target-arm/cpu.c b/target-arm/cpu.c
index 6e7ce89..8231e57 100644
--- a/target-arm/cpu.c
+++ b/target-arm/cpu.c
@@ -922,6 +922,7 @@ static void arm_any_initfn(Object *obj)
     set_feature(&cpu->env, ARM_FEATURE_THUMB2EE);
     set_feature(&cpu->env, ARM_FEATURE_ARM_DIV);
     set_feature(&cpu->env, ARM_FEATURE_V7MP);
+    set_feature(&cpu->env, ARM_FEATURE_CRC);
 #ifdef TARGET_AARCH64
     set_feature(&cpu->env, ARM_FEATURE_AARCH64);
 #endif
diff --git a/target-arm/cpu.h b/target-arm/cpu.h
index 3c8a2db..c3aeb74 100644
--- a/target-arm/cpu.h
+++ b/target-arm/cpu.h
@@ -615,6 +615,7 @@ enum arm_features {
     ARM_FEATURE_AARCH64, /* supports 64 bit mode */
     ARM_FEATURE_V8_AES, /* implements AES part of v8 Crypto Extensions */
     ARM_FEATURE_CBAR, /* has cp15 CBAR */
+    ARM_FEATURE_CRC, /* ARMv8 CRC instructions */
 };
 
 static inline int arm_feature(CPUARMState *env, int feature)
diff --git a/target-arm/helper.c b/target-arm/helper.c
index 1b111b6..5e4f0fa 100644
--- a/target-arm/helper.c
+++ b/target-arm/helper.c
@@ -5,6 +5,8 @@
 #include "sysemu/arch_init.h"
 #include "sysemu/sysemu.h"
 #include "qemu/bitops.h"
+#include "qemu/crc32c.h"
+#include <zlib.h> /* For crc32 */
 
 #ifndef CONFIG_USER_ONLY
 static inline int get_phys_addr(CPUARMState *env, uint32_t address,
@@ -4392,3 +4394,40 @@ int arm_rmode_to_sf(int rmode)
     }
     return rmode;
 }
+
+static void crc_init_buffer(uint8_t *buf, uint32_t val, uint32_t bytes)
+{
+    memset(buf, 0, 4);
+
+    if (bytes == 1) {
+        buf[0] = val & 0xff;
+    } else if (bytes == 2) {
+        buf[0] = val & 0xff;
+        buf[1] = (val >> 8) & 0xff;
+    } else {
+        buf[0] = val & 0xff;
+        buf[1] = (val >> 8) & 0xff;
+        buf[2] = (val >> 16) & 0xff;
+        buf[3] = (val >> 24) & 0xff;
+    }
+}
+
+uint32_t HELPER(crc32)(uint32_t acc, uint32_t val, uint32_t bytes)
+{
+    uint8_t buf[4];
+
+    crc_init_buffer(buf, val, bytes);
+
+    /* zlib crc32 converts the accumulator and output to one's complement. */
+    return crc32(acc ^ 0xffffffff, buf, bytes) ^ 0xffffffff;
+}
+
+uint32_t HELPER(crc32c)(uint32_t acc, uint32_t val, uint32_t bytes)
+{
+    uint8_t buf[4];
+
+    crc_init_buffer(buf, val, bytes);
+
+    /* Linux crc32c converts the output to one's complement. */
+    return crc32c(acc, buf, bytes) ^ 0xffffffff;
+}
diff --git a/target-arm/helper.h b/target-arm/helper.h
index 19bd620..9738e49 100644
--- a/target-arm/helper.h
+++ b/target-arm/helper.h
@@ -497,6 +497,9 @@ DEF_HELPER_3(neon_qzip32, void, env, i32, i32)
 DEF_HELPER_4(crypto_aese, void, env, i32, i32, i32)
 DEF_HELPER_4(crypto_aesmc, void, env, i32, i32, i32)
 
+DEF_HELPER_FLAGS_3(crc32, TCG_CALL_NO_RWG_SE, i32, i32, i32, i32)
+DEF_HELPER_FLAGS_3(crc32c, TCG_CALL_NO_RWG_SE, i32, i32, i32, i32)
+
 #ifdef TARGET_AARCH64
 #include "helper-a64.h"
 #endif
diff --git a/target-arm/translate.c b/target-arm/translate.c
index 6ccf0ba..253d2a1 100644
--- a/target-arm/translate.c
+++ b/target-arm/translate.c
@@ -7561,6 +7561,36 @@ static void disas_arm_insn(CPUARMState * env, DisasContext *s)
             store_reg(s, 14, tmp2);
             gen_bx(s, tmp);
             break;
+        case 0x4:
+        {
+            /* crc32/crc32c */
+            uint32_t c = extract32(insn, 8, 4);
+
+            /* Check this CPU supports ARMv8 CRC instructions.
+             * op1 == 3 is UNPREDICTABLE but handle as UNDEFINED.
+             * Bits 8, 10 and 11 should be zero.
+             */
+            if (!arm_feature(env, ARM_FEATURE_CRC) || op1 == 0x3 ||
+                (c & 0xd) != 0) {
+                goto illegal_op;
+            }
+
+            rn = extract32(insn, 16, 4);
+            rd = extract32(insn, 12, 4);
+
+            tmp = load_reg(s, rn);
+            tmp2 = load_reg(s, rm);
+            tmp3 = tcg_const_i32(1 << op1);
+            if (c
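The decode checks in the v4 ARM hunk can be illustrated with a small standalone sketch. It is not the patch's code: the `sk_` names are hypothetical, `sk_extract32` is a local stand-in for QEMU's `extract32`, and three details are assumptions because the hunk does not show them: that `op1` is bits 22:21 of the instruction, that `rm` is bits 3:0, and that bit 9 selects CRC32C (as in the earlier revisions of the patch).

```c
#include <stdbool.h>
#include <stdint.h>

/* Extract `length` bits (1..32) of `value`, starting at bit `start`. */
static uint32_t sk_extract32(uint32_t value, int start, int length)
{
    return (value >> start) & (~0U >> (32 - length));
}

struct sk_crc_insn {
    unsigned rd, rn, rm; /* register numbers */
    unsigned bytes;      /* number of input bytes: 1, 2 or 4 */
    bool crc32c;         /* true for CRC32C, false for CRC32 */
};

/* Returns false in the cases where the patch branches to illegal_op. */
static bool sk_decode_arm_crc(uint32_t insn, bool cpu_has_crc,
                              struct sk_crc_insn *out)
{
    uint32_t op1 = sk_extract32(insn, 21, 2); /* size field (assumed position) */
    uint32_t c = sk_extract32(insn, 8, 4);    /* instruction bits 11:8 */

    /* Feature flag missing, size == 3, or any of bits 8, 10, 11 set. */
    if (!cpu_has_crc || op1 == 0x3 || (c & 0xd) != 0) {
        return false;
    }

    out->rn = sk_extract32(insn, 16, 4);
    out->rd = sk_extract32(insn, 12, 4);
    out->rm = sk_extract32(insn, 0, 4);       /* assumed position */
    out->bytes = 1u << op1;
    out->crc32c = (c & 0x2) != 0;             /* bit 9 (assumed selector) */
    return true;
}
```

Under these assumptions the word `0xE1410242` decodes to a 4-byte CRC32C with rd = 0, rn = 1, rm = 2, and the same word is rejected when the feature flag is off or when bit 8 is set.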
[Qemu-devel] [PATCH v4 1/2] include/qemu/crc32c.h: Rename include guards to match filename
Signed-off-by: Will Newton <will.new...@linaro.org>
Reviewed-by: Peter Maydell <peter.mayd...@linaro.org>
---
 include/qemu/crc32c.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

Changes in v4:
 - None

diff --git a/include/qemu/crc32c.h b/include/qemu/crc32c.h
index 56d1c3b..dafb6a1 100644
--- a/include/qemu/crc32c.h
+++ b/include/qemu/crc32c.h
@@ -25,8 +25,8 @@
  *
  */
 
-#ifndef QEMU_CRC32_H
-#define QEMU_CRC32_H
+#ifndef QEMU_CRC32C_H
+#define QEMU_CRC32C_H
 
 #include "qemu-common.h"
 
-- 
1.8.1.4
[Qemu-devel] [PATCH v4 0/2] target-arm: Add support for AArch32 ARMv8 CRC32 instructions
This series adds support for the AArch32 CRC32 instructions added in ARMv8.

Will Newton (2):
  include/qemu/crc32c.h: Rename include guards to match filename
  target-arm: Add support for AArch32 ARMv8 CRC32 instructions

 configure              |  2 +-
 include/qemu/crc32c.h  |  4 ++--
 target-arm/cpu.c       |  1 +
 target-arm/cpu.h       |  1 +
 target-arm/helper.c    | 39 +++
 target-arm/helper.h    |  3 +++
 target-arm/translate.c | 56 ++
 7 files changed, 103 insertions(+), 3 deletions(-)

-- 
1.8.1.4
[Qemu-devel] [PATCH v3 1/2] include/qemu/crc32c.h: Rename include guards to match filename
Signed-off-by: Will Newton <will.new...@linaro.org>
Reviewed-by: Peter Maydell <peter.mayd...@linaro.org>
---
 include/qemu/crc32c.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

Changes in v3:
 - None

diff --git a/include/qemu/crc32c.h b/include/qemu/crc32c.h
index 56d1c3b..dafb6a1 100644
--- a/include/qemu/crc32c.h
+++ b/include/qemu/crc32c.h
@@ -25,8 +25,8 @@
  *
  */
 
-#ifndef QEMU_CRC32_H
-#define QEMU_CRC32_H
+#ifndef QEMU_CRC32C_H
+#define QEMU_CRC32C_H
 
 #include "qemu-common.h"
 
-- 
1.8.1.4
[Qemu-devel] [PATCH v3 0/2] target-arm: Add support for AArch32 ARMv8 CRC32 instructions
This series adds support for the AArch32 CRC32 instructions added in ARMv8.

Will Newton (2):
  include/qemu/crc32c.h: Rename include guards to match filename
  target-arm: Add support for AArch32 ARMv8 CRC32 instructions

 configure              |  2 +-
 include/qemu/crc32c.h  |  4 ++--
 target-arm/helper.c    | 39 +++
 target-arm/helper.h    |  3 +++
 target-arm/translate.c | 48 
 5 files changed, 93 insertions(+), 3 deletions(-)

-- 
1.8.1.4
[Qemu-devel] [PATCH v3 2/2] target-arm: Add support for AArch32 ARMv8 CRC32 instructions
Add support for AArch32 CRC32 and CRC32C instructions added in ARMv8.

The CRC32-C implementation used is the built-in qemu implementation
and The CRC-32 implementation is from zlib. This requires adding zlib
to LIBS to ensure it is linked for the linux-user binary.

Signed-off-by: Will Newton <will.new...@linaro.org>
---
 configure              |  2 +-
 target-arm/helper.c    | 39 +++
 target-arm/helper.h    |  3 +++
 target-arm/translate.c | 48 
 4 files changed, 91 insertions(+), 1 deletion(-)

Changes in v3:
 - Use extract32 to get register fields from instruction

diff --git a/configure b/configure
index 4648117..822842c 100755
--- a/configure
+++ b/configure
@@ -1550,7 +1550,7 @@ EOF
             "Make sure to have the zlib libs and headers installed."
     fi
 fi
-libs_softmmu="$libs_softmmu -lz"
+LIBS="$LIBS -lz"
 
 ##
 # libseccomp check
diff --git a/target-arm/helper.c b/target-arm/helper.c
index 5ae08c9..294cfaf 100644
--- a/target-arm/helper.c
+++ b/target-arm/helper.c
@@ -5,6 +5,8 @@
 #include "sysemu/arch_init.h"
 #include "sysemu/sysemu.h"
 #include "qemu/bitops.h"
+#include "qemu/crc32c.h"
+#include <zlib.h> /* For crc32 */
 
 #ifndef CONFIG_USER_ONLY
 static inline int get_phys_addr(CPUARMState *env, uint32_t address,
@@ -4468,3 +4470,40 @@ int arm_rmode_to_sf(int rmode)
     }
     return rmode;
 }
+
+static void crc_init_buffer(uint8_t *buf, uint32_t val, uint32_t bytes)
+{
+    memset(buf, 0, 4);
+
+    if (bytes == 1) {
+        buf[0] = val & 0xff;
+    } else if (bytes == 2) {
+        buf[0] = val & 0xff;
+        buf[1] = (val >> 8) & 0xff;
+    } else {
+        buf[0] = val & 0xff;
+        buf[1] = (val >> 8) & 0xff;
+        buf[2] = (val >> 16) & 0xff;
+        buf[3] = (val >> 24) & 0xff;
+    }
+}
+
+uint32_t HELPER(crc32)(uint32_t acc, uint32_t val, uint32_t bytes)
+{
+    uint8_t buf[4];
+
+    crc_init_buffer(buf, val, bytes);
+
+    /* zlib crc32 converts the accumulator and output to one's complement. */
+    return crc32(acc ^ 0xffffffff, buf, bytes) ^ 0xffffffff;
+}
+
+uint32_t HELPER(crc32c)(uint32_t acc, uint32_t val, uint32_t bytes)
+{
+    uint8_t buf[4];
+
+    crc_init_buffer(buf, val, bytes);
+
+    /* Linux crc32c converts the output to one's complement. */
+    return crc32c(acc, buf, bytes) ^ 0xffffffff;
+}
diff --git a/target-arm/helper.h b/target-arm/helper.h
index 951e6ad..fb92e53 100644
--- a/target-arm/helper.h
+++ b/target-arm/helper.h
@@ -494,6 +494,9 @@ DEF_HELPER_3(neon_qzip32, void, env, i32, i32)
 DEF_HELPER_4(crypto_aese, void, env, i32, i32, i32)
 DEF_HELPER_4(crypto_aesmc, void, env, i32, i32, i32)
 
+DEF_HELPER_3(crc32, i32, i32, i32, i32)
+DEF_HELPER_3(crc32c, i32, i32, i32, i32)
+
 #ifdef TARGET_AARCH64
 #include "helper-a64.h"
 #endif
diff --git a/target-arm/translate.c b/target-arm/translate.c
index 782aab8..8e87869 100644
--- a/target-arm/translate.c
+++ b/target-arm/translate.c
@@ -7541,6 +7541,32 @@ static void disas_arm_insn(CPUARMState * env, DisasContext *s)
             store_reg(s, 14, tmp2);
             gen_bx(s, tmp);
             break;
+        case 0x4:
+        {
+            /* crc32/crc32c */
+            uint32_t c = extract32(insn, 9, 1);
+
+            /* size == 64 is UNPREDICTABLE but handle as UNDEFINED. */
+            if (op1 == 0x3) {
+                goto illegal_op;
+            }
+
+            rn = extract32(insn, 16, 4);
+            rd = extract32(insn, 12, 4);
+
+            tmp = load_reg(s, rn);
+            tmp2 = load_reg(s, rm);
+            tmp3 = tcg_const_i32(1 << op1);
+            if (c) {
+                gen_helper_crc32c(tmp, tmp, tmp2, tmp3);
+            } else {
+                gen_helper_crc32(tmp, tmp, tmp2, tmp3);
+            }
+            tcg_temp_free_i32(tmp2);
+            tcg_temp_free_i32(tmp3);
+            store_reg(s, rd, tmp);
+            break;
+        }
         case 0x5: /* saturating add/subtract */
             ARCH(5TE);
             rd = (insn >> 12) & 0xf;
@@ -9125,6 +9151,28 @@ static int disas_thumb2_insn(CPUARMState *env, DisasContext *s, uint16_t insn_hw
                 case 0x18: /* clz */
                     gen_helper_clz(tmp, tmp);
                     break;
+                case 0x20:
+                case 0x21:
+                case 0x22:
+                case 0x28:
+                case 0x29:
+                case 0x2a:
+                {
+                    /* crc32/crc32c */
+                    uint32_t sz = op & 0x3;
+                    uint32_t c = op & 0x8;
+
+                    tmp2 = load_reg(s, rm);
+                    tmp3 = tcg_const_i32(1 << sz);
+                    if (c) {
+                        gen_helper_crc32c(tmp, tmp, tmp2, tmp3);
+                    } else {
+                        gen_helper_crc32(tmp, tmp, tmp2, tmp3);
+                    }
+                    tcg_temp_free_i32(tmp2);
+                    tcg_temp_free_i32(tmp3
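The Thumb-2 hunk selects the operation purely from the `op` value: the low two bits give the size and bit 3 picks the polynomial. A minimal sketch of that mapping, using a hypothetical `sk_decode_thumb2_crc_op` helper that is not part of the patch, could look like this:

```c
#include <stdbool.h>
#include <stdint.h>

/* Maps the six op values handled by the Thumb-2 hunk to a byte count and a
 * polynomial selector. Returns false for every other op value. */
static bool sk_decode_thumb2_crc_op(uint32_t op, unsigned *bytes, bool *crc32c)
{
    switch (op) {
    case 0x20: case 0x21: case 0x22: /* CRC32, 1/2/4 bytes */
    case 0x28: case 0x29: case 0x2a: /* CRC32C, 1/2/4 bytes */
        *bytes = 1u << (op & 0x3);
        *crc32c = (op & 0x8) != 0;
        return true;
    default:
        return false;
    }
}
```

Note that 0x23 and 0x2b (size field 3) are not among the accepted cases, so they fall through to the existing illegal-op path.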
[Qemu-devel] [PATCH v2 1/2] include/qemu/crc32c.h: Rename include guards to match filename
Signed-off-by: Will Newton <will.new...@linaro.org>
Reviewed-by: Peter Maydell <peter.mayd...@linaro.org>
---
 include/qemu/crc32c.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/include/qemu/crc32c.h b/include/qemu/crc32c.h
index 56d1c3b..dafb6a1 100644
--- a/include/qemu/crc32c.h
+++ b/include/qemu/crc32c.h
@@ -25,8 +25,8 @@
  *
  */
 
-#ifndef QEMU_CRC32_H
-#define QEMU_CRC32_H
+#ifndef QEMU_CRC32C_H
+#define QEMU_CRC32C_H
 
 #include "qemu-common.h"
 
-- 
1.8.1.4
[Qemu-devel] [PATCH v2 0/2] target-arm: Add support for AArch32 ARMv8 CRC32 instructions
This series adds support for the AArch32 CRC32 instructions added in ARMv8.

Will Newton (2):
  include/qemu/crc32c.h: Rename include guards to match filename
  target-arm: Add support for AArch32 ARMv8 CRC32 instructions

 configure              |  2 +-
 include/qemu/crc32c.h  |  4 ++--
 target-arm/helper.c    | 39 +++
 target-arm/helper.h    |  3 +++
 target-arm/translate.c | 48 
 5 files changed, 93 insertions(+), 3 deletions(-)

-- 
1.8.1.4
[Qemu-devel] [PATCH v2 2/2] target-arm: Add support for AArch32 ARMv8 CRC32 instructions
Add support for AArch32 CRC32 and CRC32C instructions added in ARMv8.

The CRC32-C implementation used is the built-in qemu implementation
and The CRC-32 implementation is from zlib. This requires adding zlib
to LIBS to ensure it is linked for the linux-user binary.

Signed-off-by: Will Newton <will.new...@linaro.org>
---
 configure              |  2 +-
 target-arm/helper.c    | 39 +++
 target-arm/helper.h    |  3 +++
 target-arm/translate.c | 48 
 4 files changed, 91 insertions(+), 1 deletion(-)

diff --git a/configure b/configure
index 4648117..822842c 100755
--- a/configure
+++ b/configure
@@ -1550,7 +1550,7 @@ EOF
             "Make sure to have the zlib libs and headers installed."
     fi
 fi
-libs_softmmu="$libs_softmmu -lz"
+LIBS="$LIBS -lz"
 
 ##
 # libseccomp check
diff --git a/target-arm/helper.c b/target-arm/helper.c
index 5ae08c9..294cfaf 100644
--- a/target-arm/helper.c
+++ b/target-arm/helper.c
@@ -5,6 +5,8 @@
 #include "sysemu/arch_init.h"
 #include "sysemu/sysemu.h"
 #include "qemu/bitops.h"
+#include "qemu/crc32c.h"
+#include <zlib.h> /* For crc32 */
 
 #ifndef CONFIG_USER_ONLY
 static inline int get_phys_addr(CPUARMState *env, uint32_t address,
@@ -4468,3 +4470,40 @@ int arm_rmode_to_sf(int rmode)
     }
     return rmode;
 }
+
+static void crc_init_buffer(uint8_t *buf, uint32_t val, uint32_t bytes)
+{
+    memset(buf, 0, 4);
+
+    if (bytes == 1) {
+        buf[0] = val & 0xff;
+    } else if (bytes == 2) {
+        buf[0] = val & 0xff;
+        buf[1] = (val >> 8) & 0xff;
+    } else {
+        buf[0] = val & 0xff;
+        buf[1] = (val >> 8) & 0xff;
+        buf[2] = (val >> 16) & 0xff;
+        buf[3] = (val >> 24) & 0xff;
+    }
+}
+
+uint32_t HELPER(crc32)(uint32_t acc, uint32_t val, uint32_t bytes)
+{
+    uint8_t buf[4];
+
+    crc_init_buffer(buf, val, bytes);
+
+    /* zlib crc32 converts the accumulator and output to one's complement. */
+    return crc32(acc ^ 0xffffffff, buf, bytes) ^ 0xffffffff;
+}
+
+uint32_t HELPER(crc32c)(uint32_t acc, uint32_t val, uint32_t bytes)
+{
+    uint8_t buf[4];
+
+    crc_init_buffer(buf, val, bytes);
+
+    /* Linux crc32c converts the output to one's complement. */
+    return crc32c(acc, buf, bytes) ^ 0xffffffff;
+}
diff --git a/target-arm/helper.h b/target-arm/helper.h
index 951e6ad..fb92e53 100644
--- a/target-arm/helper.h
+++ b/target-arm/helper.h
@@ -494,6 +494,9 @@ DEF_HELPER_3(neon_qzip32, void, env, i32, i32)
 DEF_HELPER_4(crypto_aese, void, env, i32, i32, i32)
 DEF_HELPER_4(crypto_aesmc, void, env, i32, i32, i32)
 
+DEF_HELPER_3(crc32, i32, i32, i32, i32)
+DEF_HELPER_3(crc32c, i32, i32, i32, i32)
+
 #ifdef TARGET_AARCH64
 #include "helper-a64.h"
 #endif
diff --git a/target-arm/translate.c b/target-arm/translate.c
index 782aab8..9410b6a 100644
--- a/target-arm/translate.c
+++ b/target-arm/translate.c
@@ -7541,6 +7541,32 @@ static void disas_arm_insn(CPUARMState * env, DisasContext *s)
             store_reg(s, 14, tmp2);
             gen_bx(s, tmp);
             break;
+        case 0x4:
+        {
+            /* crc32/crc32c */
+            uint32_t c = extract32(insn, 9, 1);
+
+            /* size == 64 is UNPREDICTABLE but handle as UNDEFINED. */
+            if (op1 == 0x3) {
+                goto illegal_op;
+            }
+
+            rn = (insn >> 16) & 0xf;
+            rd = (insn >> 12) & 0xf;
+
+            tmp = load_reg(s, rn);
+            tmp2 = load_reg(s, rm);
+            tmp3 = tcg_const_i32(1 << op1);
+            if (c) {
+                gen_helper_crc32c(tmp, tmp, tmp2, tmp3);
+            } else {
+                gen_helper_crc32(tmp, tmp, tmp2, tmp3);
+            }
+            tcg_temp_free_i32(tmp2);
+            tcg_temp_free_i32(tmp3);
+            store_reg(s, rd, tmp);
+            break;
+        }
         case 0x5: /* saturating add/subtract */
             ARCH(5TE);
             rd = (insn >> 12) & 0xf;
@@ -9125,6 +9151,28 @@ static int disas_thumb2_insn(CPUARMState *env, DisasContext *s, uint16_t insn_hw
                 case 0x18: /* clz */
                     gen_helper_clz(tmp, tmp);
                     break;
+                case 0x20:
+                case 0x21:
+                case 0x22:
+                case 0x28:
+                case 0x29:
+                case 0x2a:
+                {
+                    /* crc32/crc32c */
+                    uint32_t sz = op & 0x3;
+                    uint32_t c = op & 0x8;
+
+                    tmp2 = load_reg(s, rm);
+                    tmp3 = tcg_const_i32(1 << sz);
+                    if (c) {
+                        gen_helper_crc32c(tmp, tmp, tmp2, tmp3);
+                    } else {
+                        gen_helper_crc32(tmp, tmp, tmp2, tmp3);
+                    }
+                    tcg_temp_free_i32(tmp2);
+                    tcg_temp_free_i32(tmp3);
+                    break;
+                }
                 default
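The one's complement comments in the helpers may be easier to follow with a standalone sketch. The code below is not the patch's implementation: all `sk_` names are hypothetical, the CRC is computed bit by bit instead of via zlib or a table, and the zlib-style wrapper only models the convention that the patch comment describes (accumulator inverted on entry, result inverted on exit). The point it illustrates is that wrapping such a function in two more XORs with 0xffffffff leaves a raw CRC update with no inversion at all, which is what the helper needs to return.

```c
#include <stddef.h>
#include <stdint.h>

#define SK_POLY_CRC32  0xedb88320u /* 0x04C11DB7, bit-reflected */
#define SK_POLY_CRC32C 0x82f63b78u /* 0x1EDC6F41, bit-reflected */

/* Raw CRC update: no inversion of the accumulator on the way in or out. */
static uint32_t sk_crc_raw(uint32_t acc, uint32_t poly,
                           const uint8_t *buf, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        acc ^= buf[i];
        for (int bit = 0; bit < 8; bit++) {
            acc = (acc >> 1) ^ ((acc & 1) ? poly : 0);
        }
    }
    return acc;
}

/* Models the zlib convention: invert the accumulator, update, invert again. */
static uint32_t sk_crc32_zlib_style(uint32_t crc, const uint8_t *buf,
                                    size_t len)
{
    return sk_crc_raw(crc ^ 0xffffffffu, SK_POLY_CRC32, buf, len)
           ^ 0xffffffffu;
}

/* Same shape as the crc32 helper: the extra XORs cancel the wrapper's. */
static uint32_t sk_helper_crc32(uint32_t acc, const uint8_t *buf, size_t len)
{
    return sk_crc32_zlib_style(acc ^ 0xffffffffu, buf, len) ^ 0xffffffffu;
}
```

For any accumulator and buffer, `sk_helper_crc32(acc, buf, len)` equals `sk_crc_raw(acc, SK_POLY_CRC32, buf, len)`. The CRC32C helper in the patch needs only the output XOR because, per its comment, that implementation inverts only the output.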
[Qemu-devel] [PATCH 1/3] include/qemu/crc32c.h: Rename include guards to match filename
Signed-off-by: Will Newton <will.new...@linaro.org>
---
 include/qemu/crc32c.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/include/qemu/crc32c.h b/include/qemu/crc32c.h
index 56d1c3b..dafb6a1 100644
--- a/include/qemu/crc32c.h
+++ b/include/qemu/crc32c.h
@@ -25,8 +25,8 @@
  *
  */
 
-#ifndef QEMU_CRC32_H
-#define QEMU_CRC32_H
+#ifndef QEMU_CRC32C_H
+#define QEMU_CRC32C_H
 
 #include "qemu-common.h"
 
-- 
1.8.1.4
[Qemu-devel] [PATCH 3/3] target-arm: Add support for AArch32 ARMv8 CRC32 instructions
Add support for AArch32 CRC32 and CRC32C instructions added in ARMv8.

Signed-off-by: Will Newton <will.new...@linaro.org>
---
 target-arm/helper.c    | 37 +
 target-arm/helper.h    |  3 +++
 target-arm/translate.c | 48 
 3 files changed, 88 insertions(+)

diff --git a/target-arm/helper.c b/target-arm/helper.c
index 5ae08c9..d773612 100644
--- a/target-arm/helper.c
+++ b/target-arm/helper.c
@@ -5,6 +5,8 @@
 #include "sysemu/arch_init.h"
 #include "sysemu/sysemu.h"
 #include "qemu/bitops.h"
+#include "qemu/crc32.h"
+#include "qemu/crc32c.h"
 
 #ifndef CONFIG_USER_ONLY
 static inline int get_phys_addr(CPUARMState *env, uint32_t address,
@@ -4468,3 +4470,38 @@ int arm_rmode_to_sf(int rmode)
     }
     return rmode;
 }
+
+static void crc_init_buffer(uint8_t *buf, uint32_t val, uint32_t bytes)
+{
+    memset(buf, 0, 4);
+
+    if (bytes == 1) {
+        buf[0] = val & 0xff;
+    } else if (bytes == 2) {
+        buf[0] = val & 0xff;
+        buf[1] = (val >> 8) & 0xff;
+    } else {
+        buf[0] = val & 0xff;
+        buf[1] = (val >> 8) & 0xff;
+        buf[2] = (val >> 16) & 0xff;
+        buf[3] = (val >> 24) & 0xff;
+    }
+}
+
+uint32_t HELPER(crc32)(uint32_t acc, uint32_t val, uint32_t bytes)
+{
+    uint8_t buf[4];
+
+    crc_init_buffer(buf, val, bytes);
+
+    return crc32(acc, buf, bytes) ^ 0xffffffff;
+}
+
+uint32_t HELPER(crc32c)(uint32_t acc, uint32_t val, uint32_t bytes)
+{
+    uint8_t buf[4];
+
+    crc_init_buffer(buf, val, bytes);
+
+    return crc32c(acc, buf, bytes) ^ 0xffffffff;
+}
diff --git a/target-arm/helper.h b/target-arm/helper.h
index 951e6ad..fb92e53 100644
--- a/target-arm/helper.h
+++ b/target-arm/helper.h
@@ -494,6 +494,9 @@ DEF_HELPER_3(neon_qzip32, void, env, i32, i32)
 DEF_HELPER_4(crypto_aese, void, env, i32, i32, i32)
 DEF_HELPER_4(crypto_aesmc, void, env, i32, i32, i32)
 
+DEF_HELPER_3(crc32, i32, i32, i32, i32)
+DEF_HELPER_3(crc32c, i32, i32, i32, i32)
+
 #ifdef TARGET_AARCH64
 #include "helper-a64.h"
 #endif
diff --git a/target-arm/translate.c b/target-arm/translate.c
index 782aab8..9410b6a 100644
--- a/target-arm/translate.c
+++ b/target-arm/translate.c
@@ -7541,6 +7541,32 @@ static void disas_arm_insn(CPUARMState * env, DisasContext *s)
             store_reg(s, 14, tmp2);
             gen_bx(s, tmp);
             break;
+        case 0x4:
+        {
+            /* crc32/crc32c */
+            uint32_t c = extract32(insn, 9, 1);
+
+            /* size == 64 is UNPREDICTABLE but handle as UNDEFINED. */
+            if (op1 == 0x3) {
+                goto illegal_op;
+            }
+
+            rn = (insn >> 16) & 0xf;
+            rd = (insn >> 12) & 0xf;
+
+            tmp = load_reg(s, rn);
+            tmp2 = load_reg(s, rm);
+            tmp3 = tcg_const_i32(1 << op1);
+            if (c) {
+                gen_helper_crc32c(tmp, tmp, tmp2, tmp3);
+            } else {
+                gen_helper_crc32(tmp, tmp, tmp2, tmp3);
+            }
+            tcg_temp_free_i32(tmp2);
+            tcg_temp_free_i32(tmp3);
+            store_reg(s, rd, tmp);
+            break;
+        }
         case 0x5: /* saturating add/subtract */
             ARCH(5TE);
             rd = (insn >> 12) & 0xf;
@@ -9125,6 +9151,28 @@ static int disas_thumb2_insn(CPUARMState *env, DisasContext *s, uint16_t insn_hw
                 case 0x18: /* clz */
                     gen_helper_clz(tmp, tmp);
                     break;
+                case 0x20:
+                case 0x21:
+                case 0x22:
+                case 0x28:
+                case 0x29:
+                case 0x2a:
+                {
+                    /* crc32/crc32c */
+                    uint32_t sz = op & 0x3;
+                    uint32_t c = op & 0x8;
+
+                    tmp2 = load_reg(s, rm);
+                    tmp3 = tcg_const_i32(1 << sz);
+                    if (c) {
+                        gen_helper_crc32c(tmp, tmp, tmp2, tmp3);
+                    } else {
+                        gen_helper_crc32(tmp, tmp, tmp2, tmp3);
+                    }
+                    tcg_temp_free_i32(tmp2);
+                    tcg_temp_free_i32(tmp3);
+                    break;
+                }
                 default:
                     goto illegal_op;
                 }
-- 
1.8.1.4
[Qemu-devel] [PATCH 0/3] target-arm: Add support for AArch32 ARMv8 CRC32 instructions
This series adds support for the AArch32 CRC32 instructions added in ARMv8.
The CRC-32 algorithm is added alongside the existing CRC-32C implementation
which requires a small fix to the crc32c.h header file.

Will Newton (3):
  include/qemu/crc32c.h: Rename include guards to match filename
  util/crc32.c: Add CRC-32 implementation
  target-arm: Add support for AArch32 ARMv8 CRC32 instructions

 include/qemu/crc32.h   | 15 
 include/qemu/crc32c.h  |  4 +--
 target-arm/helper.c    | 37 +++
 target-arm/helper.h    |  3 ++
 target-arm/translate.c | 48 +
 util/Makefile.objs     |  1 +
 util/crc32.c           | 98 ++
 7 files changed, 204 insertions(+), 2 deletions(-)
 create mode 100644 include/qemu/crc32.h
 create mode 100644 util/crc32.c

-- 
1.8.1.4
[Qemu-devel] [PATCH 2/3] util/crc32.c: Add CRC-32 implementation
Add a table-driven CRC-32 implementation similar in style to the existing CRC-32C implementation. Signed-off-by: Will Newton will.new...@linaro.org --- include/qemu/crc32.h | 15 util/Makefile.objs | 1 + util/crc32.c | 98 3 files changed, 114 insertions(+) create mode 100644 include/qemu/crc32.h create mode 100644 util/crc32.c diff --git a/include/qemu/crc32.h b/include/qemu/crc32.h new file mode 100644 index 000..c79daf5 --- /dev/null +++ b/include/qemu/crc32.h @@ -0,0 +1,15 @@ +/* + * CRC32C Checksum Algorithm + * + * Polynomial: 0x04C11DB7 + * + */ + +#ifndef QEMU_CRC32_H +#define QEMU_CRC32_H + +#include qemu-common.h + +uint32_t crc32(uint32_t crc, const uint8_t *data, unsigned int length); + +#endif diff --git a/util/Makefile.objs b/util/Makefile.objs index 937376b..7ce7432 100644 --- a/util/Makefile.objs +++ b/util/Makefile.objs @@ -14,3 +14,4 @@ util-obj-y += crc32c.o util-obj-y += throttle.o util-obj-y += getauxval.o util-obj-y += readline.o +util-obj-y += crc32.o diff --git a/util/crc32.c b/util/crc32.c new file mode 100644 index 000..d66243a --- /dev/null +++ b/util/crc32.c @@ -0,0 +1,98 @@ +/* + * CRC32 Checksum Algorithm + * + * Polynomial: 0x04C11DB7 + * + * This program is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License as published by the Free + * Software Foundation; either version 2 of the License, or (at your option) + * any later version. 
+ * + */ + +#include qemu-common.h +#include qemu/crc32.h + +/* + * This is the CRC-32 table + * Generated with: + * width = 32 bits + * poly = 0x04C11DB7 + * reflect input bytes = true + * reflect output bytes = true + */ + +static const uint32_t crc32_table[256] = { +0x, 0x77073096, 0xee0e612c, 0x990951ba, +0x076dc419, 0x706af48f, 0xe963a535, 0x9e6495a3, +0x0edb8832, 0x79dcb8a4, 0xe0d5e91e, 0x97d2d988, +0x09b64c2b, 0x7eb17cbd, 0xe7b82d07, 0x90bf1d91, +0x1db71064, 0x6ab020f2, 0xf3b97148, 0x84be41de, +0x1adad47d, 0x6ddde4eb, 0xf4d4b551, 0x83d385c7, +0x136c9856, 0x646ba8c0, 0xfd62f97a, 0x8a65c9ec, +0x14015c4f, 0x63066cd9, 0xfa0f3d63, 0x8d080df5, +0x3b6e20c8, 0x4c69105e, 0xd56041e4, 0xa2677172, +0x3c03e4d1, 0x4b04d447, 0xd20d85fd, 0xa50ab56b, +0x35b5a8fa, 0x42b2986c, 0xdbbbc9d6, 0xacbcf940, +0x32d86ce3, 0x45df5c75, 0xdcd60dcf, 0xabd13d59, +0x26d930ac, 0x51de003a, 0xc8d75180, 0xbfd06116, +0x21b4f4b5, 0x56b3c423, 0xcfba9599, 0xb8bda50f, +0x2802b89e, 0x5f058808, 0xc60cd9b2, 0xb10be924, +0x2f6f7c87, 0x58684c11, 0xc1611dab, 0xb6662d3d, +0x76dc4190, 0x01db7106, 0x98d220bc, 0xefd5102a, +0x71b18589, 0x06b6b51f, 0x9fbfe4a5, 0xe8b8d433, +0x7807c9a2, 0x0f00f934, 0x9609a88e, 0xe10e9818, +0x7f6a0dbb, 0x086d3d2d, 0x91646c97, 0xe6635c01, +0x6b6b51f4, 0x1c6c6162, 0x856530d8, 0xf262004e, +0x6c0695ed, 0x1b01a57b, 0x8208f4c1, 0xf50fc457, +0x65b0d9c6, 0x12b7e950, 0x8bbeb8ea, 0xfcb9887c, +0x62dd1ddf, 0x15da2d49, 0x8cd37cf3, 0xfbd44c65, +0x4db26158, 0x3ab551ce, 0xa3bc0074, 0xd4bb30e2, +0x4adfa541, 0x3dd895d7, 0xa4d1c46d, 0xd3d6f4fb, +0x4369e96a, 0x346ed9fc, 0xad678846, 0xda60b8d0, +0x44042d73, 0x33031de5, 0xaa0a4c5f, 0xdd0d7cc9, +0x5005713c, 0x270241aa, 0xbe0b1010, 0xc90c2086, +0x5768b525, 0x206f85b3, 0xb966d409, 0xce61e49f, +0x5edef90e, 0x29d9c998, 0xb0d09822, 0xc7d7a8b4, +0x59b33d17, 0x2eb40d81, 0xb7bd5c3b, 0xc0ba6cad, +0xedb88320, 0x9abfb3b6, 0x03b6e20c, 0x74b1d29a, +0xead54739, 0x9dd277af, 0x04db2615, 0x73dc1683, +0xe3630b12, 0x94643b84, 0x0d6d6a3e, 0x7a6a5aa8, +0xe40ecf0b, 
0x9309ff9d, 0x0a00ae27, 0x7d079eb1, +0xf00f9344, 0x8708a3d2, 0x1e01f268, 0x6906c2fe, +0xf762575d, 0x806567cb, 0x196c3671, 0x6e6b06e7, +0xfed41b76, 0x89d32be0, 0x10da7a5a, 0x67dd4acc, +0xf9b9df6f, 0x8ebeeff9, 0x17b7be43, 0x60b08ed5, +0xd6d6a3e8, 0xa1d1937e, 0x38d8c2c4, 0x4fdff252, +0xd1bb67f1, 0xa6bc5767, 0x3fb506dd, 0x48b2364b, +0xd80d2bda, 0xaf0a1b4c, 0x36034af6, 0x41047a60, +0xdf60efc3, 0xa867df55, 0x316e8eef, 0x4669be79, +0xcb61b38c, 0xbc66831a, 0x256fd2a0, 0x5268e236, +0xcc0c7795, 0xbb0b4703, 0x220216b9, 0x5505262f, +0xc5ba3bbe, 0xb2bd0b28, 0x2bb45a92, 0x5cb36a04, +0xc2d7ffa7, 0xb5d0cf31, 0x2cd99e8b, 0x5bdeae1d, +0x9b64c2b0, 0xec63f226, 0x756aa39c, 0x026d930a, +0x9c0906a9, 0xeb0e363f, 0x72076785, 0x05005713, +0x95bf4a82, 0xe2b87a14, 0x7bb12bae, 0x0cb61b38, +0x92d28e9b, 0xe5d5be0d, 0x7cdcefb7, 0x0bdbdf21, +0x86d3d2d4, 0xf1d4e242, 0x68ddb3f8, 0x1fda836e, +0x81be16cd, 0xf6b9265b, 0x6fb077e1, 0x18b74777, +0x88085ae6, 0xff0f6a70, 0x66063bca, 0x11010b5c, +0x8f659eff, 0xf862ae69, 0x616bffd3, 0x166ccf45, +0xa00ae278, 0xd70dd2ee, 0x4e048354, 0x3903b3c2, +0xa7672661, 0xd06016f7, 0x4969474d, 0x3e6e77db, +0xaed16a4a, 0xd9d65adc, 0x40df0b66, 0x37d83bf0, +0xa9bcae53, 0xdebb9ec5, 0x47b2cf7f, 0x30b5ffe9
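For readers following the table above: a minimal sketch, not the code from util/crc32.c, of how a table with these parameters can be generated and used. With reflected input and output, polynomial 0x04C11DB7 is processed in its bit-reversed form 0xEDB88320, which is also the value at index 128 of the table. The names crc32_sketch, crc32_sketch_init and crc32_sketch_table are made up for illustration, and the inversion of the CRC on entry and exit follows the zlib convention, which is an assumption because the body of the patch's crc32() is cut off in this message.

```c
#include <stddef.h>
#include <stdint.h>

/* Bit-reversed (reflected) form of polynomial 0x04C11DB7. */
#define CRC32_POLY_REFLECTED 0xEDB88320u

/* Hypothetical names; the patch uses a precomputed const table instead. */
static uint32_t crc32_sketch_table[256];
static int crc32_sketch_table_ready;

/* Build the 256-entry lookup table, one entry per possible input byte. */
static void crc32_sketch_init(void)
{
    for (uint32_t i = 0; i < 256; i++) {
        uint32_t c = i;
        for (int bit = 0; bit < 8; bit++) {
            c = (c & 1) ? (c >> 1) ^ CRC32_POLY_REFLECTED : c >> 1;
        }
        crc32_sketch_table[i] = c;
    }
    crc32_sketch_table_ready = 1;
}

/* Table-driven update: one lookup per input byte.
 * Pass 0 as the initial crc (zlib convention, assumed here). */
static uint32_t crc32_sketch(uint32_t crc, const uint8_t *data, size_t length)
{
    if (!crc32_sketch_table_ready) {
        crc32_sketch_init();
    }
    crc = ~crc;
    while (length--) {
        crc = crc32_sketch_table[(crc ^ *data++) & 0xff] ^ (crc >> 8);
    }
    return ~crc;
}
```

Generating the table at start-up trades a little initialisation time for not carrying 1 KiB of constants in the source; the patch takes the other option and embeds the table.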
[Qemu-devel] [PATCH v3] target-arm: Add support for AArch32 64bit VCVTB and VCVTT
Add support for the AArch32 floating-point half-precision to double- precision conversion VCVTB and VCVTT instructions. Signed-off-by: Will Newton will.new...@linaro.org --- target-arm/translate.c | 74 ++ 1 file changed, 56 insertions(+), 18 deletions(-) Changes in v3: - Fix comments as per review - Tidy up single precision source logic diff --git a/target-arm/translate.c b/target-arm/translate.c index e701c0f..9ca357c 100644 --- a/target-arm/translate.c +++ b/target-arm/translate.c @@ -3142,16 +3142,19 @@ static int disas_vfp_insn(CPUARMState * env, DisasContext *s, uint32_t insn) VFP_DREG_N(rn, insn); } -if (op == 15 (rn == 15 || ((rn 0x1c) == 0x18))) { -/* Integer or single precision destination. */ +if (op == 15 (rn == 15 || ((rn 0x1c) == 0x18) || + ((rn 0x1e) == 0x6))) { +/* Integer or single/half precision destination. */ rd = VFP_SREG_D(insn); } else { VFP_DREG_D(rd, insn); } if (op == 15 -(((rn 0x1c) == 0x10) || ((rn 0x14) == 0x14))) { -/* VCVT from int is always from S reg regardless of dp bit. - * VCVT with immediate frac_bits has same format as SREG_M +(((rn 0x1c) == 0x10) || ((rn 0x14) == 0x14) || + ((rn 0x1e) == 0x4))) { +/* VCVT from int or half precision is always from S reg + * regardless of dp bit. VCVT with immediate frac_bits + * has same format as SREG_M. */ rm = VFP_SREG_M(insn); } else { @@ -3241,12 +3244,19 @@ static int disas_vfp_insn(CPUARMState * env, DisasContext *s, uint32_t insn) case 5: case 6: case 7: -/* VCVTB, VCVTT: only present with the halfprec extension, - * UNPREDICTABLE if bit 8 is set (we choose to UNDEF) +/* VCVTB, VCVTT: only present with the halfprec extension + * UNPREDICTABLE if bit 8 is set prior to ARMv8 + * (we choose to UNDEF) */ -if (dp || !arm_feature(env, ARM_FEATURE_VFP_FP16)) { +if ((dp !arm_feature(env, ARM_FEATURE_V8)) || +!arm_feature(env, ARM_FEATURE_VFP_FP16)) { return 1; } +if (!extract32(rn, 1, 1)) { +/* Half precision source. 
*/ +gen_mov_F0_vreg(0, rm); +break; +} /* Otherwise fall through */ default: /* One source operand. */ @@ -3394,21 +3404,39 @@ static int disas_vfp_insn(CPUARMState * env, DisasContext *s, uint32_t insn) case 3: /* sqrt */ gen_vfp_sqrt(dp); break; -case 4: /* vcvtb.f32.f16 */ +case 4: /* vcvtb.f32.f16, vcvtb.f64.f16 */ tmp = gen_vfp_mrs(); tcg_gen_ext16u_i32(tmp, tmp); -gen_helper_vfp_fcvt_f16_to_f32(cpu_F0s, tmp, cpu_env); +if (dp) { +gen_helper_vfp_fcvt_f16_to_f64(cpu_F0d, tmp, + cpu_env); +} else { +gen_helper_vfp_fcvt_f16_to_f32(cpu_F0s, tmp, + cpu_env); +} tcg_temp_free_i32(tmp); break; -case 5: /* vcvtt.f32.f16 */ +case 5: /* vcvtt.f32.f16, vcvtt.f64.f16 */ tmp = gen_vfp_mrs(); tcg_gen_shri_i32(tmp, tmp, 16); -gen_helper_vfp_fcvt_f16_to_f32(cpu_F0s, tmp, cpu_env); +if (dp) { +gen_helper_vfp_fcvt_f16_to_f64(cpu_F0d, tmp, + cpu_env); +} else { +gen_helper_vfp_fcvt_f16_to_f32(cpu_F0s, tmp, + cpu_env); +} tcg_temp_free_i32(tmp); break; -case 6: /* vcvtb.f16.f32 */ +case 6: /* vcvtb.f16.f32, vcvtb.f16.f64 */ tmp = tcg_temp_new_i32(); -gen_helper_vfp_fcvt_f32_to_f16(tmp, cpu_F0s, cpu_env
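To illustrate what the f16-to-f64 path computes: a rough sketch, assuming plain IEEE half-precision input, of selecting the bottom (VCVTB) or top (VCVTT) half of a 32-bit S register and widening it to a double. The function names are hypothetical and this is not the QEMU helper; in particular it ignores the FPSCR alternative half-precision format and any NaN quieting or exception flags the real helper may apply.

```c
#include <stdint.h>
#include <string.h>

/* VCVTB reads bits [15:0] of the S register, VCVTT reads bits [31:16]. */
static uint16_t select_half(uint32_t sreg, int top)
{
    return top ? (uint16_t)(sreg >> 16) : (uint16_t)(sreg & 0xffff);
}

/* Widen IEEE half-precision bits to a double (hypothetical helper). */
static double f16_bits_to_double(uint16_t h)
{
    uint64_t sign = (uint64_t)(h >> 15) << 63;
    uint32_t exp = (h >> 10) & 0x1f;
    uint64_t frac = h & 0x3ff;
    uint64_t bits;
    double d;

    if (exp == 0x1f) {
        /* Infinity or NaN: all-ones exponent, fraction moved up. */
        bits = sign | (0x7ffULL << 52) | (frac << 42);
    } else if (exp != 0) {
        /* Normal number: rebias exponent from 15 to 1023. */
        bits = sign | ((uint64_t)(exp - 15 + 1023) << 52) | (frac << 42);
    } else if (frac == 0) {
        /* Signed zero. */
        bits = sign;
    } else {
        /* Subnormal: value is frac * 2^-24; normalise the fraction. */
        int e = -14;
        while (!(frac & 0x400)) {
            frac <<= 1;
            e--;
        }
        frac &= 0x3ff;
        bits = sign | ((uint64_t)(e + 1023) << 52) | (frac << 42);
    }
    memcpy(&d, &bits, sizeof(d));
    return d;
}
```

Every half-precision value is exactly representable as a double, so the widening direction needs no rounding; only the narrowing direction (f64 to f16) does.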
[Qemu-devel] [PATCH v2 02/11] target-arm: Add AArch32 FP VRINTA, VRINTN, VRINTP and VRINTM
Add support for AArch32 ARMv8 FP VRINTA, VRINTN, VRINTP and VRINTM instructions. Signed-off-by: Will Newton will.new...@linaro.org --- target-arm/translate.c | 54 ++ 1 file changed, 54 insertions(+) Changes in v2: - Add comment to fp_decode_rm lookup table diff --git a/target-arm/translate.c b/target-arm/translate.c index 8d240e1..2db6812 100644 --- a/target-arm/translate.c +++ b/target-arm/translate.c @@ -2759,6 +2759,56 @@ static int handle_vminmaxnm(uint32_t insn, uint32_t rd, uint32_t rn, return 0; } +static int handle_vrint(uint32_t insn, uint32_t rd, uint32_t rm, uint32_t dp, +int rounding) +{ +TCGv_ptr fpst = get_fpstatus_ptr(0); +TCGv_i32 tcg_rmode; + +tcg_rmode = tcg_const_i32(arm_rmode_to_sf(rounding)); +gen_helper_set_rmode(tcg_rmode, tcg_rmode, cpu_env); + +if (dp) { +TCGv_i64 tcg_op; +TCGv_i64 tcg_res; +tcg_op = tcg_temp_new_i64(); +tcg_res = tcg_temp_new_i64(); +tcg_gen_ld_f64(tcg_op, cpu_env, vfp_reg_offset(dp, rm)); +gen_helper_rintd(tcg_res, tcg_op, fpst); +tcg_gen_st_f64(tcg_res, cpu_env, vfp_reg_offset(dp, rd)); +tcg_temp_free_i64(tcg_op); +tcg_temp_free_i64(tcg_res); +} else { +TCGv_i32 tcg_op; +TCGv_i32 tcg_res; +tcg_op = tcg_temp_new_i32(); +tcg_res = tcg_temp_new_i32(); +tcg_gen_ld_f32(tcg_op, cpu_env, vfp_reg_offset(dp, rm)); +gen_helper_rints(tcg_res, tcg_op, fpst); +tcg_gen_st_f32(tcg_res, cpu_env, vfp_reg_offset(dp, rd)); +tcg_temp_free_i32(tcg_op); +tcg_temp_free_i32(tcg_res); +} + +gen_helper_set_rmode(tcg_rmode, tcg_rmode, cpu_env); +tcg_temp_free_i32(tcg_rmode); + +tcg_temp_free_ptr(fpst); +return 0; +} + + +/* Table for converting the most common AArch32 encoding of + * rounding mode to arm_fprounding order (which matches the + * common AArch64 order); see ARM ARM pseudocode FPDecodeRM(). 
+ */ +static const uint8_t fp_decode_rm[] = { +FPROUNDING_TIEAWAY, +FPROUNDING_TIEEVEN, +FPROUNDING_POSINF, +FPROUNDING_NEGINF, +}; + static int disas_vfp_v8_insn(CPUARMState *env, DisasContext *s, uint32_t insn) { uint32_t rd, rn, rm, dp = extract32(insn, 8, 1); @@ -2781,6 +2831,10 @@ static int disas_vfp_v8_insn(CPUARMState *env, DisasContext *s, uint32_t insn) return handle_vsel(insn, rd, rn, rm, dp); } else if ((insn 0x0fb00e10) == 0x0e800a00) { return handle_vminmaxnm(insn, rd, rn, rm, dp); +} else if ((insn 0x0fbc0ed0) == 0x0eb80a40) { +/* VRINTA, VRINTN, VRINTP, VRINTM */ +int rounding = fp_decode_rm[extract32(insn, 16, 2)]; +return handle_vrint(insn, rd, rm, dp, rounding); } return 1; } -- 1.8.1.4
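The decode step in this patch can be sketched in isolation as follows. This is a stand-alone illustration with hypothetical sketch_ names, not QEMU code: the enum values are arbitrary stand-ins for enum arm_fprounding, and the mask test reads the archived `(insn 0x0fbc0ed0)` as `(insn & 0x0fbc0ed0)`, on the assumption that the archive dropped the operator.

```c
#include <stdint.h>

/* Stand-in for enum arm_fprounding; numeric values are arbitrary here. */
enum sketch_fprounding {
    SK_TIEEVEN,
    SK_POSINF,
    SK_NEGINF,
    SK_ZERO,
    SK_TIEAWAY,
    SK_ODD
};

/* Extract 'length' bits starting at bit 'start' (length in 1..32). */
static uint32_t sketch_extract32(uint32_t value, int start, int length)
{
    return (value >> start) & (~0u >> (32 - length));
}

/* Same order as fp_decode_rm in the patch: A, N, P, M. */
static const uint8_t sketch_fp_decode_rm[] = {
    SK_TIEAWAY,
    SK_TIEEVEN,
    SK_POSINF,
    SK_NEGINF,
};

/* Return the rounding mode for VRINTA/N/P/M, or -1 if insn does not match. */
static int sketch_decode_vrint(uint32_t insn)
{
    if ((insn & 0x0fbc0ed0) != 0x0eb80a40) {
        return -1;
    }
    return sketch_fp_decode_rm[sketch_extract32(insn, 16, 2)];
}
```

Bits [17:16] are outside the mask, so all four rounding variants match the same pattern and differ only in the table index.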
[Qemu-devel] [PATCH v2 01/11] target-arm: Move arm_rmode_to_sf to a shared location.
This function will be needed for AArch32 ARMv8 support, so move it to helper.c where it can be used by both targets. Also moves the code out of line, but as it is quite a large function I don't believe this should be a significant performance impact. Signed-off-by: Will Newton will.new...@linaro.org Reviewed-by: Peter Maydell peter.mayd...@linaro.org --- target-arm/cpu.h | 2 ++ target-arm/helper.c| 28 target-arm/translate-a64.c | 28 3 files changed, 30 insertions(+), 28 deletions(-) diff --git a/target-arm/cpu.h b/target-arm/cpu.h index 198b6b8..383c582 100644 --- a/target-arm/cpu.h +++ b/target-arm/cpu.h @@ -496,6 +496,8 @@ enum arm_fprounding { FPROUNDING_ODD }; +int arm_rmode_to_sf(int rmode); + enum arm_cpu_mode { ARM_CPU_MODE_USR = 0x10, ARM_CPU_MODE_FIQ = 0x11, diff --git a/target-arm/helper.c b/target-arm/helper.c index c708f15..b1541b9 100644 --- a/target-arm/helper.c +++ b/target-arm/helper.c @@ -4418,3 +4418,31 @@ float64 HELPER(rintd)(float64 x, void *fp_status) return ret; } + +/* Convert ARM rounding mode to softfloat */ +int arm_rmode_to_sf(int rmode) +{ +switch (rmode) { +case FPROUNDING_TIEAWAY: +rmode = float_round_ties_away; +break; +case FPROUNDING_ODD: +/* FIXME: add support for TIEAWAY and ODD */ +qemu_log_mask(LOG_UNIMP, arm: unimplemented rounding mode: %d\n, + rmode); +case FPROUNDING_TIEEVEN: +default: +rmode = float_round_nearest_even; +break; +case FPROUNDING_POSINF: +rmode = float_round_up; +break; +case FPROUNDING_NEGINF: +rmode = float_round_down; +break; +case FPROUNDING_ZERO: +rmode = float_round_to_zero; +break; +} +return rmode; +} diff --git a/target-arm/translate-a64.c b/target-arm/translate-a64.c index cf80c46..8effbe2 100644 --- a/target-arm/translate-a64.c +++ b/target-arm/translate-a64.c @@ -3186,34 +3186,6 @@ static void disas_data_proc_reg(DisasContext *s, uint32_t insn) } } -/* Convert ARM rounding mode to softfloat */ -static inline int arm_rmode_to_sf(int rmode) -{ -switch (rmode) { -case FPROUNDING_TIEAWAY: -rmode = 
float_round_ties_away; -break; -case FPROUNDING_ODD: -/* FIXME: add support for TIEAWAY and ODD */ -qemu_log_mask(LOG_UNIMP, arm: unimplemented rounding mode: %d\n, - rmode); -case FPROUNDING_TIEEVEN: -default: -rmode = float_round_nearest_even; -break; -case FPROUNDING_POSINF: -rmode = float_round_up; -break; -case FPROUNDING_NEGINF: -rmode = float_round_down; -break; -case FPROUNDING_ZERO: -rmode = float_round_to_zero; -break; -} -return rmode; -} - static void handle_fp_compare(DisasContext *s, bool is_double, unsigned int rn, unsigned int rm, bool cmp_with_zero, bool signal_all_nans) -- 1.8.1.4
[Qemu-devel] [PATCH v2 03/11] target-arm: Add support for AArch32 FP VRINTR
Add support for the AArch32 floating-point VRINTR instruction.

Signed-off-by: Will Newton will.new...@linaro.org
---
 target-arm/translate.c | 11 +++
 1 file changed, 11 insertions(+)

Changes in v2:
 - Move code outside the arms of the if

diff --git a/target-arm/translate.c b/target-arm/translate.c
index 2db6812..2b3157c 100644
--- a/target-arm/translate.c
+++ b/target-arm/translate.c
@@ -3379,6 +3379,17 @@ static int disas_vfp_insn(CPUARMState * env, DisasContext *s, uint32_t insn)
     gen_vfp_F1_ld0(dp);
     gen_vfp_cmpe(dp);
     break;
+case 12: /* vrintr */
+{
+    TCGv_ptr fpst = get_fpstatus_ptr(0);
+    if (dp) {
+        gen_helper_rintd(cpu_F0d, cpu_F0d, fpst);
+    } else {
+        gen_helper_rints(cpu_F0s, cpu_F0s, fpst);
+    }
+    tcg_temp_free_ptr(fpst);
+    break;
+}
 case 15: /* single-double conversion */
     if (dp)
         gen_helper_vfp_fcvtsd(cpu_F0s, cpu_F0d, cpu_env);
--
1.8.1.4
[Qemu-devel] [PATCH v2 09/11] target-arm: Add AArch32 FP VCVTA, VCVTN, VCVTP and VCVTM
Add support for the AArch32 floating-point VCVTA, VCVTN, VCVTP and VCVTM instructions. Signed-off-by: Will Newton will.new...@linaro.org --- target-arm/translate.c | 61 ++ 1 file changed, 61 insertions(+) diff --git a/target-arm/translate.c b/target-arm/translate.c index 5a8ca24..0fcc159 100644 --- a/target-arm/translate.c +++ b/target-arm/translate.c @@ -2797,6 +2797,63 @@ static int handle_vrint(uint32_t insn, uint32_t rd, uint32_t rm, uint32_t dp, return 0; } +static int handle_vcvt(uint32_t insn, uint32_t rd, uint32_t rm, uint32_t dp, + int rounding) +{ +bool is_signed = extract32(insn, 7, 1); +TCGv_ptr fpst = get_fpstatus_ptr(0); +TCGv_i32 tcg_rmode, tcg_shift; + +tcg_shift = tcg_const_i32(0); + +tcg_rmode = tcg_const_i32(arm_rmode_to_sf(rounding)); +gen_helper_set_rmode(tcg_rmode, tcg_rmode, cpu_env); + +if (dp) { +TCGv_i64 tcg_double, tcg_res; +TCGv_i32 tcg_tmp; +/* Rd is encoded as a single precision register even when the source + * is double precision. + */ +rd = ((rd 1) 0x1e) | ((rd 4) 0x1); +tcg_double = tcg_temp_new_i64(); +tcg_res = tcg_temp_new_i64(); +tcg_tmp = tcg_temp_new_i32(); +tcg_gen_ld_f64(tcg_double, cpu_env, vfp_reg_offset(1, rm)); +if (is_signed) { +gen_helper_vfp_tosld(tcg_res, tcg_double, tcg_shift, fpst); +} else { +gen_helper_vfp_tould(tcg_res, tcg_double, tcg_shift, fpst); +} +tcg_gen_trunc_i64_i32(tcg_tmp, tcg_res); +tcg_gen_st_f32(tcg_tmp, cpu_env, vfp_reg_offset(0, rd)); +tcg_temp_free_i32(tcg_tmp); +tcg_temp_free_i64(tcg_res); +tcg_temp_free_i64(tcg_double); +} else { +TCGv_i32 tcg_single, tcg_res; +tcg_single = tcg_temp_new_i32(); +tcg_res = tcg_temp_new_i32(); +tcg_gen_ld_f32(tcg_single, cpu_env, vfp_reg_offset(0, rm)); +if (is_signed) { +gen_helper_vfp_tosls(tcg_res, tcg_single, tcg_shift, fpst); +} else { +gen_helper_vfp_touls(tcg_res, tcg_single, tcg_shift, fpst); +} +tcg_gen_st_f32(tcg_res, cpu_env, vfp_reg_offset(0, rd)); +tcg_temp_free_i32(tcg_res); +tcg_temp_free_i32(tcg_single); +} + +gen_helper_set_rmode(tcg_rmode, 
tcg_rmode, cpu_env); +tcg_temp_free_i32(tcg_rmode); + +tcg_temp_free_i32(tcg_shift); + +tcg_temp_free_ptr(fpst); + +return 0; +} /* Table for converting the most common AArch32 encoding of * rounding mode to arm_fprounding order (which matches the @@ -2835,6 +2892,10 @@ static int disas_vfp_v8_insn(CPUARMState *env, DisasContext *s, uint32_t insn) /* VRINTA, VRINTN, VRINTP, VRINTM */ int rounding = fp_decode_rm[extract32(insn, 16, 2)]; return handle_vrint(insn, rd, rm, dp, rounding); +} else if ((insn 0x0fbc0e50) == 0x0ebc0a40) { +/* VCVTA, VCVTN, VCVTP, VCVTM */ +int rounding = fp_decode_rm[extract32(insn, 16, 2)]; +return handle_vcvt(insn, rd, rm, dp, rounding); } return 1; } -- 1.8.1.4
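The register re-encoding in the double-precision arm is easier to follow with the shift and mask operators written out. A minimal sketch, assuming the archived `((rd 1) 0x1e) | ((rd 4) 0x1)` originally read `((rd << 1) & 0x1e) | ((rd >> 4) & 0x1)`; the function names are hypothetical. A D-register index keeps the extra D bit in bit 4 above the four Vd bits, while an S-register index keeps Vd in bits [4:1] with the D bit in bit 0.

```c
#include <stdint.h>

/* D-register style index: D bit in bit 4, Vd field in bits [3:0]. */
static uint32_t dreg_index(uint32_t d_bit, uint32_t vd)
{
    return ((d_bit & 1) << 4) | (vd & 0xf);
}

/* Re-encode as an S-register index: Vd in bits [4:1], D bit in bit 0.
 * Assumed reconstruction of the expression in handle_vcvt(). */
static uint32_t dreg_to_sreg_index(uint32_t rd)
{
    return ((rd << 1) & 0x1e) | ((rd >> 4) & 0x1);
}
```

For example, Vd = 5 with D = 1 decodes as D-register index 21, which re-encodes to S-register index 11.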
[Qemu-devel] [PATCH v2 04/11] target-arm: Add support for AArch32 FP VRINTZ
Add support for the AArch32 floating-point VRINTZ instruction.

Signed-off-by: Will Newton will.new...@linaro.org
---
 target-arm/translate.c | 16
 1 file changed, 16 insertions(+)

Changes in v2:
 - Move code outside the arms of the if

diff --git a/target-arm/translate.c b/target-arm/translate.c
index 2b3157c..9afb19f 100644
--- a/target-arm/translate.c
+++ b/target-arm/translate.c
@@ -3390,6 +3390,22 @@ static int disas_vfp_insn(CPUARMState * env, DisasContext *s, uint32_t insn)
     tcg_temp_free_ptr(fpst);
     break;
 }
+case 13: /* vrintz */
+{
+    TCGv_ptr fpst = get_fpstatus_ptr(0);
+    TCGv_i32 tcg_rmode;
+    tcg_rmode = tcg_const_i32(float_round_to_zero);
+    gen_helper_set_rmode(tcg_rmode, tcg_rmode, cpu_env);
+    if (dp) {
+        gen_helper_rintd(cpu_F0d, cpu_F0d, fpst);
+    } else {
+        gen_helper_rints(cpu_F0s, cpu_F0s, fpst);
+    }
+    gen_helper_set_rmode(tcg_rmode, tcg_rmode, cpu_env);
+    tcg_temp_free_i32(tcg_rmode);
+    tcg_temp_free_ptr(fpst);
+    break;
+}
 case 15: /* single-double conversion */
     if (dp)
         gen_helper_vfp_fcvtsd(cpu_F0s, cpu_F0d, cpu_env);
--
1.8.1.4
[Qemu-devel] [PATCH v2 08/11] target-arm: Add AArch32 SIMD VRINTA, VRINTN, VRINTP, VRINTM, VRINTZ
Add support for the AArch32 Advanced SIMD VRINTA, VRINTN, VRINTP VRINTM and VRINTZ instructions. Signed-off-by: Will Newton will.new...@linaro.org --- target-arm/translate.c | 40 +++- 1 file changed, 39 insertions(+), 1 deletion(-) Changes in v2: - Merge VRINTZ handling into the same block diff --git a/target-arm/translate.c b/target-arm/translate.c index c179817..5a8ca24 100644 --- a/target-arm/translate.c +++ b/target-arm/translate.c @@ -4709,9 +4709,14 @@ static const uint8_t neon_3r_sizes[] = { #define NEON_2RM_VMOVN 36 /* Includes VQMOVN, VQMOVUN */ #define NEON_2RM_VQMOVN 37 /* Includes VQMOVUN */ #define NEON_2RM_VSHLL 38 +#define NEON_2RM_VRINTN 40 #define NEON_2RM_VRINTX 41 +#define NEON_2RM_VRINTA 42 +#define NEON_2RM_VRINTZ 43 #define NEON_2RM_VCVT_F16_F32 44 +#define NEON_2RM_VRINTM 45 #define NEON_2RM_VCVT_F32_F16 46 +#define NEON_2RM_VRINTP 47 #define NEON_2RM_VRECPE 56 #define NEON_2RM_VRSQRTE 57 #define NEON_2RM_VRECPE_F 58 @@ -4725,7 +4730,9 @@ static int neon_2rm_is_float_op(int op) { /* Return true if this neon 2reg-misc op is float-to-float */ return (op == NEON_2RM_VABS_F || op == NEON_2RM_VNEG_F || -op == NEON_2RM_VRINTX || op = NEON_2RM_VRECPE_F); +(op = NEON_2RM_VRINTN op = NEON_2RM_VRINTZ) || +op == NEON_2RM_VRINTM || op == NEON_2RM_VRINTP || +op = NEON_2RM_VRECPE_F); } /* Each entry in this array has bit n set if the insn allows @@ -4769,9 +4776,14 @@ static const uint8_t neon_2rm_sizes[] = { [NEON_2RM_VMOVN] = 0x7, [NEON_2RM_VQMOVN] = 0x7, [NEON_2RM_VSHLL] = 0x7, +[NEON_2RM_VRINTN] = 0x4, [NEON_2RM_VRINTX] = 0x4, +[NEON_2RM_VRINTA] = 0x4, +[NEON_2RM_VRINTZ] = 0x4, [NEON_2RM_VCVT_F16_F32] = 0x2, +[NEON_2RM_VRINTM] = 0x4, [NEON_2RM_VCVT_F32_F16] = 0x2, +[NEON_2RM_VRINTP] = 0x4, [NEON_2RM_VRECPE] = 0x4, [NEON_2RM_VRSQRTE] = 0x4, [NEON_2RM_VRECPE_F] = 0x4, @@ -6482,6 +6494,32 @@ static int disas_neon_data_insn(CPUARMState * env, DisasContext *s, uint32_t ins } neon_store_reg(rm, pass, tmp2); break; +case NEON_2RM_VRINTN: +case 
NEON_2RM_VRINTA: +case NEON_2RM_VRINTM: +case NEON_2RM_VRINTP: +case NEON_2RM_VRINTZ: +{ +TCGv_i32 tcg_rmode; +TCGv_ptr fpstatus = get_fpstatus_ptr(1); +int rmode; + +if (op == NEON_2RM_VRINTZ) { +rmode = FPROUNDING_ZERO; +} else { +rmode = fp_decode_rm[((op 0x6) 1) ^ 1]; +} + +tcg_rmode = tcg_const_i32(arm_rmode_to_sf(rmode)); +gen_helper_set_neon_rmode(tcg_rmode, tcg_rmode, + cpu_env); +gen_helper_rints(cpu_F0s, cpu_F0s, fpstatus); +gen_helper_set_neon_rmode(tcg_rmode, tcg_rmode, + cpu_env); +tcg_temp_free_ptr(fpstatus); +tcg_temp_free_i32(tcg_rmode); +break; +} case NEON_2RM_VRINTX: { TCGv_ptr fpstatus = get_fpstatus_ptr(1); -- 1.8.1.4
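The index expression used for the SIMD variants is terse, so here is a stand-alone sketch of it. It assumes the archived `((op 0x6) 1) ^ 1` was `((op & 0x6) >> 1) ^ 1`, uses hypothetical SK_ names with arbitrary enum values, and copies the op numbers from the #defines in this patch.

```c
#include <stdint.h>

/* Stand-in for enum arm_fprounding; numeric values are arbitrary here. */
enum sketch_fprounding {
    SK_TIEEVEN,
    SK_POSINF,
    SK_NEGINF,
    SK_ZERO,
    SK_TIEAWAY
};

/* Op numbers as defined in the patch. */
#define SK_NEON_2RM_VRINTN 40
#define SK_NEON_2RM_VRINTA 42
#define SK_NEON_2RM_VRINTZ 43
#define SK_NEON_2RM_VRINTM 45
#define SK_NEON_2RM_VRINTP 47

/* Same order as fp_decode_rm: A, N, P, M. */
static const uint8_t sketch_fp_decode_rm[] = {
    SK_TIEAWAY,
    SK_TIEEVEN,
    SK_POSINF,
    SK_NEGINF,
};

/* Map a 2-reg-misc VRINT op number to its rounding mode. */
static int sketch_neon_vrint_rmode(int op)
{
    if (op == SK_NEON_2RM_VRINTZ) {
        return SK_ZERO;
    }
    /* Bits [2:1] of op select the mode; the XOR swaps each adjacent
     * pair so that the op order (N, A, M, P) lines up with the table. */
    return sketch_fp_decode_rm[((op & 0x6) >> 1) ^ 1];
}
```

Working through the four ops: 40 gives index 1 (ties to even), 42 gives index 0 (ties away), 45 gives index 3 (towards minus infinity) and 47 gives index 2 (towards plus infinity).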
[Qemu-devel] [PATCH v2 05/11] target-arm: Add support for AArch32 FP VRINTX
Add support for the AArch32 floating-point VRINTX instruction.

Signed-off-by: Will Newton will.new...@linaro.org
---
 target-arm/translate.c | 11 +++
 1 file changed, 11 insertions(+)

Changes in v2:
 - Move code outside the arms of the if

diff --git a/target-arm/translate.c b/target-arm/translate.c
index 9afb19f..9eb5b92 100644
--- a/target-arm/translate.c
+++ b/target-arm/translate.c
@@ -3406,6 +3406,17 @@ static int disas_vfp_insn(CPUARMState * env, DisasContext *s, uint32_t insn)
     tcg_temp_free_ptr(fpst);
     break;
 }
+case 14: /* vrintx */
+{
+    TCGv_ptr fpst = get_fpstatus_ptr(0);
+    if (dp) {
+        gen_helper_rintd_exact(cpu_F0d, cpu_F0d, fpst);
+    } else {
+        gen_helper_rints_exact(cpu_F0s, cpu_F0s, fpst);
+    }
+    tcg_temp_free_ptr(fpst);
+    break;
+}
 case 15: /* single-double conversion */
     if (dp)
         gen_helper_vfp_fcvtsd(cpu_F0s, cpu_F0d, cpu_env);
--
1.8.1.4
[Qemu-devel] [PATCH v2 10/11] target-arm: Add AArch32 SIMD VCVTA, VCVTN, VCVTP and VCVTM
Add support for the AArch32 Advanced SIMD VCVTA, VCVTN, VCVTP and VCVTM instructions. Signed-off-by: Will Newton will.new...@linaro.org --- target-arm/translate.c | 53 +- 1 file changed, 52 insertions(+), 1 deletion(-) diff --git a/target-arm/translate.c b/target-arm/translate.c index 0fcc159..e701c0f 100644 --- a/target-arm/translate.c +++ b/target-arm/translate.c @@ -4778,6 +4778,14 @@ static const uint8_t neon_3r_sizes[] = { #define NEON_2RM_VRINTM 45 #define NEON_2RM_VCVT_F32_F16 46 #define NEON_2RM_VRINTP 47 +#define NEON_2RM_VCVTAU 48 +#define NEON_2RM_VCVTAS 49 +#define NEON_2RM_VCVTNU 50 +#define NEON_2RM_VCVTNS 51 +#define NEON_2RM_VCVTPU 52 +#define NEON_2RM_VCVTPS 53 +#define NEON_2RM_VCVTMU 54 +#define NEON_2RM_VCVTMS 55 #define NEON_2RM_VRECPE 56 #define NEON_2RM_VRSQRTE 57 #define NEON_2RM_VRECPE_F 58 @@ -4792,7 +4800,8 @@ static int neon_2rm_is_float_op(int op) /* Return true if this neon 2reg-misc op is float-to-float */ return (op == NEON_2RM_VABS_F || op == NEON_2RM_VNEG_F || (op = NEON_2RM_VRINTN op = NEON_2RM_VRINTZ) || -op == NEON_2RM_VRINTM || op == NEON_2RM_VRINTP || +op == NEON_2RM_VRINTM || +(op = NEON_2RM_VRINTP op = NEON_2RM_VCVTMS) || op = NEON_2RM_VRECPE_F); } @@ -4845,6 +4854,14 @@ static const uint8_t neon_2rm_sizes[] = { [NEON_2RM_VRINTM] = 0x4, [NEON_2RM_VCVT_F32_F16] = 0x2, [NEON_2RM_VRINTP] = 0x4, +[NEON_2RM_VCVTAU] = 0x4, +[NEON_2RM_VCVTAS] = 0x4, +[NEON_2RM_VCVTNU] = 0x4, +[NEON_2RM_VCVTNS] = 0x4, +[NEON_2RM_VCVTPU] = 0x4, +[NEON_2RM_VCVTPS] = 0x4, +[NEON_2RM_VCVTMU] = 0x4, +[NEON_2RM_VCVTMS] = 0x4, [NEON_2RM_VRECPE] = 0x4, [NEON_2RM_VRSQRTE] = 0x4, [NEON_2RM_VRECPE_F] = 0x4, @@ -6588,6 +6605,40 @@ static int disas_neon_data_insn(CPUARMState * env, DisasContext *s, uint32_t ins tcg_temp_free_ptr(fpstatus); break; } +case NEON_2RM_VCVTAU: +case NEON_2RM_VCVTAS: +case NEON_2RM_VCVTNU: +case NEON_2RM_VCVTNS: +case NEON_2RM_VCVTPU: +case NEON_2RM_VCVTPS: +case NEON_2RM_VCVTMU: +case NEON_2RM_VCVTMS: +{ +bool is_signed = 
!extract32(insn, 7, 1); +TCGv_ptr fpst = get_fpstatus_ptr(1); +TCGv_i32 tcg_rmode, tcg_shift; +int rmode = fp_decode_rm[extract32(insn, 8, 2)]; + +tcg_shift = tcg_const_i32(0); +tcg_rmode = tcg_const_i32(arm_rmode_to_sf(rmode)); +gen_helper_set_neon_rmode(tcg_rmode, tcg_rmode, + cpu_env); + +if (is_signed) { +gen_helper_vfp_tosls(cpu_F0s, cpu_F0s, + tcg_shift, fpst); +} else { +gen_helper_vfp_touls(cpu_F0s, cpu_F0s, + tcg_shift, fpst); +} + +gen_helper_set_neon_rmode(tcg_rmode, tcg_rmode, + cpu_env); +tcg_temp_free_i32(tcg_rmode); +tcg_temp_free_i32(tcg_shift); +tcg_temp_free_ptr(fpst); +break; +} case NEON_2RM_VRECPE: gen_helper_recpe_u32(tmp, tmp, cpu_env); break; -- 1.8.1.4
[Qemu-devel] [PATCH v2 07/11] target-arm: Add set_neon_rmode helper
This helper sets the rounding mode in the standard_fp_status word to
allow NEON instructions to modify the rounding mode whilst using the
standard FPSCR values for everything else.

Signed-off-by: Will Newton will.new...@linaro.org
Reviewed-by: Peter Maydell peter.mayd...@linaro.org
---
 target-arm/helper.c | 17 +
 target-arm/helper.h | 1 +
 2 files changed, 18 insertions(+)

diff --git a/target-arm/helper.c b/target-arm/helper.c
index b1541b9..ca5b000 100644
--- a/target-arm/helper.c
+++ b/target-arm/helper.c
@@ -4048,6 +4048,23 @@ uint32_t HELPER(set_rmode)(uint32_t rmode, CPUARMState *env)
     return prev_rmode;
 }
 
+/* Set the current fp rounding mode in the standard fp status and return
+ * the old one. This is for NEON instructions that need to change the
+ * rounding mode but wish to use the standard FPSCR values for everything
+ * else. Always set the rounding mode back to the correct value after
+ * modifying it.
+ * The argument is a softfloat float_round_ value.
+ */
+uint32_t HELPER(set_neon_rmode)(uint32_t rmode, CPUARMState *env)
+{
+    float_status *fp_status = &env->vfp.standard_fp_status;
+
+    uint32_t prev_rmode = get_float_rounding_mode(fp_status);
+    set_float_rounding_mode(rmode, fp_status);
+
+    return prev_rmode;
+}
+
 /* Half precision conversions. */
 static float32 do_fcvt_f16_to_f32(uint32_t a, CPUARMState *env, float_status *s)
 {
diff --git a/target-arm/helper.h b/target-arm/helper.h
index 70872df..71b8411 100644
--- a/target-arm/helper.h
+++ b/target-arm/helper.h
@@ -149,6 +149,7 @@ DEF_HELPER_3(vfp_ultod, f64, i64, i32, ptr)
 DEF_HELPER_3(vfp_uqtod, f64, i64, i32, ptr)
 
 DEF_HELPER_FLAGS_2(set_rmode, TCG_CALL_NO_RWG, i32, i32, env)
+DEF_HELPER_FLAGS_2(set_neon_rmode, TCG_CALL_NO_RWG, i32, i32, env)
 
 DEF_HELPER_2(vfp_fcvt_f16_to_f32, f32, i32, env)
 DEF_HELPER_2(vfp_fcvt_f32_to_f16, i32, f32, env)
--
1.8.1.4
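Because the helper returns the previous mode, callers in this series pass the same temporary to it twice: the first call installs the new mode and leaves the old one in the temporary, the second call puts the old one back. A toy model of that swap, using a hypothetical sketch_fp_status struct in place of softfloat's float_status, might look like this; it is only meant to show the call pattern, not the real helper.

```c
#include <stdint.h>

/* Toy stand-in for softfloat's float_status (hypothetical). */
typedef struct {
    uint32_t rounding_mode;
} sketch_fp_status;

/* Install a new rounding mode and return the one it replaced. */
static uint32_t sketch_set_rmode(uint32_t rmode, sketch_fp_status *st)
{
    uint32_t prev = st->rounding_mode;
    st->rounding_mode = rmode;
    return prev;
}

/* Run one "operation" under a temporary rounding mode and report the
 * mode that was active while it ran. */
static uint32_t sketch_with_rmode(sketch_fp_status *st, uint32_t rmode)
{
    uint32_t tmp = rmode;
    uint32_t seen;

    tmp = sketch_set_rmode(tmp, st);   /* tmp now holds the old mode */
    seen = st->rounding_mode;          /* the operation would run here */
    tmp = sketch_set_rmode(tmp, st);   /* old mode restored */
    (void)tmp;
    return seen;
}
```

The same temporary serving as both input and output is what lets the generated code restore the mode without a second constant.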
[Qemu-devel] [PATCH v2 11/11] target-arm: Add support for AArch32 64bit VCVTB and VCVTT
Add support for the AArch32 floating-point half-precision to double- precision conversion VCVTB and VCVTT instructions. Signed-off-by: Will Newton will.new...@linaro.org --- target-arm/translate.c | 62 ++ 1 file changed, 48 insertions(+), 14 deletions(-) diff --git a/target-arm/translate.c b/target-arm/translate.c index e701c0f..dfda2c4 100644 --- a/target-arm/translate.c +++ b/target-arm/translate.c @@ -3142,14 +3142,16 @@ static int disas_vfp_insn(CPUARMState * env, DisasContext *s, uint32_t insn) VFP_DREG_N(rn, insn); } -if (op == 15 (rn == 15 || ((rn 0x1c) == 0x18))) { +if (op == 15 (rn == 15 || ((rn 0x1c) == 0x18) || + ((rn 0x1e) == 0x6))) { /* Integer or single precision destination. */ rd = VFP_SREG_D(insn); } else { VFP_DREG_D(rd, insn); } if (op == 15 -(((rn 0x1c) == 0x10) || ((rn 0x14) == 0x14))) { +(((rn 0x1c) == 0x10) || ((rn 0x14) == 0x14) || + ((rn 0x1e) == 0x4))) { /* VCVT from int is always from S reg regardless of dp bit. * VCVT with immediate frac_bits has same format as SREG_M */ @@ -3241,12 +3243,19 @@ static int disas_vfp_insn(CPUARMState * env, DisasContext *s, uint32_t insn) case 5: case 6: case 7: -/* VCVTB, VCVTT: only present with the halfprec extension, - * UNPREDICTABLE if bit 8 is set (we choose to UNDEF) +/* VCVTB, VCVTT: only present with the halfprec extension + * UNPREDICTABLE if bit 8 is set prior to ARMv8 + * (we choose to UNDEF) */ -if (dp || !arm_feature(env, ARM_FEATURE_VFP_FP16)) { +if ((dp !arm_feature(env, ARM_FEATURE_V8)) || +!arm_feature(env, ARM_FEATURE_VFP_FP16)) { return 1; } +if ((rn 0x1e) == 0x4) { +/* Single precision source */ +gen_mov_F0_vreg(0, rm); +break; +} /* Otherwise fall through */ default: /* One source operand. 
*/ @@ -3394,21 +3403,39 @@ static int disas_vfp_insn(CPUARMState * env, DisasContext *s, uint32_t insn) case 3: /* sqrt */ gen_vfp_sqrt(dp); break; -case 4: /* vcvtb.f32.f16 */ +case 4: /* vcvtb.f32.f16, vcvtb.f64.f16 */ tmp = gen_vfp_mrs(); tcg_gen_ext16u_i32(tmp, tmp); -gen_helper_vfp_fcvt_f16_to_f32(cpu_F0s, tmp, cpu_env); +if (dp) { +gen_helper_vfp_fcvt_f16_to_f64(cpu_F0d, tmp, + cpu_env); +} else { +gen_helper_vfp_fcvt_f16_to_f32(cpu_F0s, tmp, + cpu_env); +} tcg_temp_free_i32(tmp); break; -case 5: /* vcvtt.f32.f16 */ +case 5: /* vcvtt.f32.f16, vcvtt.f64.f16 */ tmp = gen_vfp_mrs(); tcg_gen_shri_i32(tmp, tmp, 16); -gen_helper_vfp_fcvt_f16_to_f32(cpu_F0s, tmp, cpu_env); +if (dp) { +gen_helper_vfp_fcvt_f16_to_f64(cpu_F0d, tmp, + cpu_env); +} else { +gen_helper_vfp_fcvt_f16_to_f32(cpu_F0s, tmp, + cpu_env); +} tcg_temp_free_i32(tmp); break; -case 6: /* vcvtb.f16.f32 */ +case 6: /* vcvtb.f16.f32, vcvtb.f16.f64 */ tmp = tcg_temp_new_i32(); -gen_helper_vfp_fcvt_f32_to_f16(tmp, cpu_F0s, cpu_env); +if (dp) { +gen_helper_vfp_fcvt_f64_to_f16(tmp, cpu_F0d, + cpu_env); +} else { +gen_helper_vfp_fcvt_f32_to_f16(tmp, cpu_F0s, + cpu_env); +} gen_mov_F0_vreg(0, rd
[Qemu-devel] [PATCH v2 06/11] target-arm: Add support for AArch32 SIMD VRINTX
Add support for the AArch32 Advanced SIMD VRINTX instruction.

Signed-off-by: Will Newton will.new...@linaro.org
Reviewed-by: Peter Maydell peter.mayd...@linaro.org
---
 target-arm/translate.c | 11 ++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/target-arm/translate.c b/target-arm/translate.c
index 9eb5b92..c179817 100644
--- a/target-arm/translate.c
+++ b/target-arm/translate.c
@@ -4709,6 +4709,7 @@ static const uint8_t neon_3r_sizes[] = {
 #define NEON_2RM_VMOVN 36 /* Includes VQMOVN, VQMOVUN */
 #define NEON_2RM_VQMOVN 37 /* Includes VQMOVUN */
 #define NEON_2RM_VSHLL 38
+#define NEON_2RM_VRINTX 41
 #define NEON_2RM_VCVT_F16_F32 44
 #define NEON_2RM_VCVT_F32_F16 46
 #define NEON_2RM_VRECPE 56
@@ -4724,7 +4725,7 @@ static int neon_2rm_is_float_op(int op)
 {
     /* Return true if this neon 2reg-misc op is float-to-float */
     return (op == NEON_2RM_VABS_F || op == NEON_2RM_VNEG_F ||
-            op >= NEON_2RM_VRECPE_F);
+            op == NEON_2RM_VRINTX || op >= NEON_2RM_VRECPE_F);
 }
 
 /* Each entry in this array has bit n set if the insn allows
@@ -4768,6 +4769,7 @@ static const uint8_t neon_2rm_sizes[] = {
     [NEON_2RM_VMOVN] = 0x7,
     [NEON_2RM_VQMOVN] = 0x7,
     [NEON_2RM_VSHLL] = 0x7,
+    [NEON_2RM_VRINTX] = 0x4,
     [NEON_2RM_VCVT_F16_F32] = 0x2,
     [NEON_2RM_VCVT_F32_F16] = 0x2,
     [NEON_2RM_VRECPE] = 0x4,
@@ -6480,6 +6482,13 @@ static int disas_neon_data_insn(CPUARMState * env, DisasContext *s, uint32_t ins
     }
     neon_store_reg(rm, pass, tmp2);
     break;
+case NEON_2RM_VRINTX:
+{
+    TCGv_ptr fpstatus = get_fpstatus_ptr(1);
+    gen_helper_rints_exact(cpu_F0s, cpu_F0s, fpstatus);
+    tcg_temp_free_ptr(fpstatus);
+    break;
+}
 case NEON_2RM_VRECPE:
     gen_helper_recpe_u32(tmp, tmp, cpu_env);
     break;
--
1.8.1.4
[Qemu-devel] [PATCH 4/9] target-arm: Add support for AArch32 FP VRINTZ
Add support for the AArch32 floating-point VRINTZ instruction. Signed-off-by: Will Newton will.new...@linaro.org --- target-arm/translate.c | 21 + 1 file changed, 21 insertions(+) diff --git a/target-arm/translate.c b/target-arm/translate.c index 73e0e8d..153d0e6 100644 --- a/target-arm/translate.c +++ b/target-arm/translate.c @@ -3385,6 +3385,27 @@ static int disas_vfp_insn(CPUARMState * env, DisasContext *s, uint32_t insn) tcg_temp_free_ptr(fpst); } break; +case 13: /* vrintz */ +if (dp) { +TCGv_ptr fpst = get_fpstatus_ptr(0); +TCGv_i32 tcg_rmode; +tcg_rmode = tcg_const_i32(float_round_to_zero); +gen_helper_set_rmode(tcg_rmode, tcg_rmode, cpu_env); +gen_helper_rintd(cpu_F0d, cpu_F0d, fpst); +gen_helper_set_rmode(tcg_rmode, tcg_rmode, cpu_env); +tcg_temp_free_i32(tcg_rmode); +tcg_temp_free_ptr(fpst); +} else { +TCGv_ptr fpst = get_fpstatus_ptr(0); +TCGv_i32 tcg_rmode; +tcg_rmode = tcg_const_i32(float_round_to_zero); +gen_helper_set_rmode(tcg_rmode, tcg_rmode, cpu_env); +gen_helper_rints(cpu_F0s, cpu_F0s, fpst); +gen_helper_set_rmode(tcg_rmode, tcg_rmode, cpu_env); +tcg_temp_free_i32(tcg_rmode); +tcg_temp_free_ptr(fpst); +} +break; case 15: /* single-double conversion */ if (dp) gen_helper_vfp_fcvtsd(cpu_F0s, cpu_F0d, cpu_env); -- 1.8.1.4
[Qemu-devel] [PATCH 2/9] target-arm: Add AArch32 FP VRINTA, VRINTN, VRINTP and VRINTM
Add support for AArch32 ARMv8 FP VRINTA, VRINTN, VRINTP and VRINTM instructions. Signed-off-by: Will Newton will.new...@linaro.org --- target-arm/translate.c | 49 + 1 file changed, 49 insertions(+) diff --git a/target-arm/translate.c b/target-arm/translate.c index 8d240e1..f688f6d 100644 --- a/target-arm/translate.c +++ b/target-arm/translate.c @@ -2759,6 +2759,51 @@ static int handle_vminmaxnm(uint32_t insn, uint32_t rd, uint32_t rn, return 0; } +static int handle_vrint(uint32_t insn, uint32_t rd, uint32_t rm, uint32_t dp, +int rounding) +{ +TCGv_ptr fpst = get_fpstatus_ptr(0); +TCGv_i32 tcg_rmode; + +tcg_rmode = tcg_const_i32(arm_rmode_to_sf(rounding)); +gen_helper_set_rmode(tcg_rmode, tcg_rmode, cpu_env); + +if (dp) { +TCGv_i64 tcg_op; +TCGv_i64 tcg_res; +tcg_op = tcg_temp_new_i64(); +tcg_res = tcg_temp_new_i64(); +tcg_gen_ld_f64(tcg_op, cpu_env, vfp_reg_offset(dp, rm)); +gen_helper_rintd(tcg_res, tcg_op, fpst); +tcg_gen_st_f64(tcg_res, cpu_env, vfp_reg_offset(dp, rd)); +tcg_temp_free_i64(tcg_op); +tcg_temp_free_i64(tcg_res); +} else { +TCGv_i32 tcg_op; +TCGv_i32 tcg_res; +tcg_op = tcg_temp_new_i32(); +tcg_res = tcg_temp_new_i32(); +tcg_gen_ld_f32(tcg_op, cpu_env, vfp_reg_offset(dp, rm)); +gen_helper_rints(tcg_res, tcg_op, fpst); +tcg_gen_st_f32(tcg_res, cpu_env, vfp_reg_offset(dp, rd)); +tcg_temp_free_i32(tcg_op); +tcg_temp_free_i32(tcg_res); +} + +gen_helper_set_rmode(tcg_rmode, tcg_rmode, cpu_env); +tcg_temp_free_i32(tcg_rmode); + +tcg_temp_free_ptr(fpst); +return 0; +} + +static const uint8_t fp_decode_rm[] = { +FPROUNDING_TIEAWAY, +FPROUNDING_TIEEVEN, +FPROUNDING_POSINF, +FPROUNDING_NEGINF, +}; + static int disas_vfp_v8_insn(CPUARMState *env, DisasContext *s, uint32_t insn) { uint32_t rd, rn, rm, dp = extract32(insn, 8, 1); @@ -2781,6 +2826,10 @@ static int disas_vfp_v8_insn(CPUARMState *env, DisasContext *s, uint32_t insn) return handle_vsel(insn, rd, rn, rm, dp); } else if ((insn 0x0fb00e10) == 0x0e800a00) { return handle_vminmaxnm(insn, rd, rn, rm, dp); 
+} else if ((insn 0x0fbc0ed0) == 0x0eb80a40) { +/* VRINTA, VRINTN, VRINTP, VRINTM */ +int rounding = fp_decode_rm[extract32(insn, 16, 2)]; +return handle_vrint(insn, rd, rm, dp, rounding); } return 1; } -- 1.8.1.4
[Qemu-devel] [PATCH 7/9] target-arm: Add set_neon_rmode helper
This helper sets the rounding mode in the standard_fp_status word to
allow NEON instructions to modify the rounding mode whilst using the
standard FPSCR values for everything else.

Signed-off-by: Will Newton will.new...@linaro.org
---
 target-arm/helper.c | 17 +
 target-arm/helper.h | 1 +
 2 files changed, 18 insertions(+)

diff --git a/target-arm/helper.c b/target-arm/helper.c
index b1541b9..ca5b000 100644
--- a/target-arm/helper.c
+++ b/target-arm/helper.c
@@ -4048,6 +4048,23 @@ uint32_t HELPER(set_rmode)(uint32_t rmode, CPUARMState *env)
     return prev_rmode;
 }
 
+/* Set the current fp rounding mode in the standard fp status and return
+ * the old one. This is for NEON instructions that need to change the
+ * rounding mode but wish to use the standard FPSCR values for everything
+ * else. Always set the rounding mode back to the correct value after
+ * modifying it.
+ * The argument is a softfloat float_round_ value.
+ */
+uint32_t HELPER(set_neon_rmode)(uint32_t rmode, CPUARMState *env)
+{
+    float_status *fp_status = &env->vfp.standard_fp_status;
+
+    uint32_t prev_rmode = get_float_rounding_mode(fp_status);
+    set_float_rounding_mode(rmode, fp_status);
+
+    return prev_rmode;
+}
+
 /* Half precision conversions. */
 static float32 do_fcvt_f16_to_f32(uint32_t a, CPUARMState *env, float_status *s)
 {
diff --git a/target-arm/helper.h b/target-arm/helper.h
index 70872df..71b8411 100644
--- a/target-arm/helper.h
+++ b/target-arm/helper.h
@@ -149,6 +149,7 @@ DEF_HELPER_3(vfp_ultod, f64, i64, i32, ptr)
 DEF_HELPER_3(vfp_uqtod, f64, i64, i32, ptr)
 
 DEF_HELPER_FLAGS_2(set_rmode, TCG_CALL_NO_RWG, i32, i32, env)
+DEF_HELPER_FLAGS_2(set_neon_rmode, TCG_CALL_NO_RWG, i32, i32, env)
 
 DEF_HELPER_2(vfp_fcvt_f16_to_f32, f32, i32, env)
 DEF_HELPER_2(vfp_fcvt_f32_to_f16, i32, f32, env)
--
1.8.1.4
[Qemu-devel] [PATCH 9/9] target-arm: Add AArch32 SIMD VRINTA, VRINTN, VRINTP and VRINTM
Add support for the AArch32 Advanced SIMD VRINTA, VRINTN, VRINTP and
VRINTM instructions.

Signed-off-by: Will Newton <will.new...@linaro.org>
---
 target-arm/translate.c | 29 ++++++++++++++++++++++++++++-
 1 file changed, 28 insertions(+), 1 deletion(-)

diff --git a/target-arm/translate.c b/target-arm/translate.c
index 16242d3..5564cb9 100644
--- a/target-arm/translate.c
+++ b/target-arm/translate.c
@@ -4709,10 +4709,14 @@ static const uint8_t neon_3r_sizes[] = {
 #define NEON_2RM_VMOVN 36 /* Includes VQMOVN, VQMOVUN */
 #define NEON_2RM_VQMOVN 37 /* Includes VQMOVUN */
 #define NEON_2RM_VSHLL 38
+#define NEON_2RM_VRINTN 40
 #define NEON_2RM_VRINTX 41
+#define NEON_2RM_VRINTA 42
 #define NEON_2RM_VRINTZ 43
 #define NEON_2RM_VCVT_F16_F32 44
+#define NEON_2RM_VRINTM 45
 #define NEON_2RM_VCVT_F32_F16 46
+#define NEON_2RM_VRINTP 47
 #define NEON_2RM_VRECPE 56
 #define NEON_2RM_VRSQRTE 57
 #define NEON_2RM_VRECPE_F 58
@@ -4726,7 +4730,8 @@ static int neon_2rm_is_float_op(int op)
 {
     /* Return true if this neon 2reg-misc op is float-to-float */
     return (op == NEON_2RM_VABS_F || op == NEON_2RM_VNEG_F ||
-            op == NEON_2RM_VRINTX || op == NEON_2RM_VRINTZ ||
+            (op >= NEON_2RM_VRINTN && op <= NEON_2RM_VRINTZ) ||
+            op == NEON_2RM_VRINTM || op == NEON_2RM_VRINTP ||
             op >= NEON_2RM_VRECPE_F);
 }
 
@@ -4771,10 +4776,14 @@ static const uint8_t neon_2rm_sizes[] = {
     [NEON_2RM_VMOVN] = 0x7,
     [NEON_2RM_VQMOVN] = 0x7,
     [NEON_2RM_VSHLL] = 0x7,
+    [NEON_2RM_VRINTN] = 0x4,
     [NEON_2RM_VRINTX] = 0x4,
+    [NEON_2RM_VRINTA] = 0x4,
     [NEON_2RM_VRINTZ] = 0x4,
     [NEON_2RM_VCVT_F16_F32] = 0x2,
+    [NEON_2RM_VRINTM] = 0x4,
     [NEON_2RM_VCVT_F32_F16] = 0x2,
+    [NEON_2RM_VRINTP] = 0x4,
     [NEON_2RM_VRECPE] = 0x4,
     [NEON_2RM_VRSQRTE] = 0x4,
     [NEON_2RM_VRECPE_F] = 0x4,
@@ -6485,6 +6494,24 @@ static int disas_neon_data_insn(CPUARMState * env, DisasContext *s, uint32_t ins
                     }
                     neon_store_reg(rm, pass, tmp2);
                     break;
+                case NEON_2RM_VRINTN:
+                case NEON_2RM_VRINTA:
+                case NEON_2RM_VRINTM:
+                case NEON_2RM_VRINTP:
+                {
+                    TCGv_i32 tcg_rmode;
+                    TCGv_ptr fpstatus = get_fpstatus_ptr(1);
+                    int rmode = fp_decode_rm[((op & 0x6) >> 1) ^ 1];
+                    tcg_rmode = tcg_const_i32(arm_rmode_to_sf(rmode));
+                    gen_helper_set_neon_rmode(tcg_rmode, tcg_rmode,
+                                              cpu_env);
+                    gen_helper_rints(cpu_F0s, cpu_F0s, fpstatus);
+                    gen_helper_set_neon_rmode(tcg_rmode, tcg_rmode,
+                                              cpu_env);
+                    tcg_temp_free_ptr(fpstatus);
+                    tcg_temp_free_i32(tcg_rmode);
+                    break;
+                }
                 case NEON_2RM_VRINTX:
                 {
                     TCGv_ptr fpstatus = get_fpstatus_ptr(1);
-- 
1.8.1.4
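The index expression used to pick the rounding mode in the SIMD case can be checked by hand against the four op numbers defined in this patch. The sketch below does that arithmetic in plain C. It assumes the fp_decode_rm[] table from patch 2/9 (not shown in this message) is ordered ties-away, ties-even, towards +inf, towards -inf, matching the "VRINTA, VRINTN, VRINTP, VRINTM" order of the FP encoding; the model_ names and RM_ constants are made up for the illustration.

```c
#include <stdint.h>

/* Assumed ordering of the rounding-mode table from patch 2/9:
 * index 0 = A (ties away), 1 = N (ties even), 2 = P (+inf), 3 = M (-inf). */
enum model_rm { RM_TIEAWAY, RM_TIEEVEN, RM_POSINF, RM_NEGINF };

static const uint8_t model_decode_rm[4] = {
    RM_TIEAWAY, RM_TIEEVEN, RM_POSINF, RM_NEGINF
};

/* 2-reg-misc op numbers as defined in the patch. */
#define MODEL_VRINTN 40
#define MODEL_VRINTA 42
#define MODEL_VRINTM 45
#define MODEL_VRINTP 47

/* Same expression as the translator: take op bits [2:1] and flip the
 * low bit of the result to land on the table index. */
static int model_neon_vrint_rm(int op)
{
    return model_decode_rm[((op & 0x6) >> 1) ^ 1];
}
```

Under that assumption op 40 maps to index 1 (ties even), 42 to index 0 (ties away), 45 to index 3 (towards -inf) and 47 to index 2 (towards +inf), which is consistent with the instruction names.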
[Qemu-devel] [PATCH 8/9] target-arm: Add support for AArch32 SIMD VRINTZ
Add support for the AArch32 Advanced SIMD VRINTZ instruction.

Signed-off-by: Will Newton <will.new...@linaro.org>
---
 target-arm/translate.c | 19 ++++++++++++++++++-
 1 file changed, 18 insertions(+), 1 deletion(-)

diff --git a/target-arm/translate.c b/target-arm/translate.c
index b6d11db..16242d3 100644
--- a/target-arm/translate.c
+++ b/target-arm/translate.c
@@ -4710,6 +4710,7 @@ static const uint8_t neon_3r_sizes[] = {
 #define NEON_2RM_VQMOVN 37 /* Includes VQMOVUN */
 #define NEON_2RM_VSHLL 38
 #define NEON_2RM_VRINTX 41
+#define NEON_2RM_VRINTZ 43
 #define NEON_2RM_VCVT_F16_F32 44
 #define NEON_2RM_VCVT_F32_F16 46
 #define NEON_2RM_VRECPE 56
@@ -4725,7 +4726,8 @@ static int neon_2rm_is_float_op(int op)
 {
     /* Return true if this neon 2reg-misc op is float-to-float */
     return (op == NEON_2RM_VABS_F || op == NEON_2RM_VNEG_F ||
-            op == NEON_2RM_VRINTX || op >= NEON_2RM_VRECPE_F);
+            op == NEON_2RM_VRINTX || op == NEON_2RM_VRINTZ ||
+            op >= NEON_2RM_VRECPE_F);
 }
 
 /* Each entry in this array has bit n set if the insn allows
@@ -4770,6 +4772,7 @@ static const uint8_t neon_2rm_sizes[] = {
     [NEON_2RM_VQMOVN] = 0x7,
     [NEON_2RM_VSHLL] = 0x7,
     [NEON_2RM_VRINTX] = 0x4,
+    [NEON_2RM_VRINTZ] = 0x4,
     [NEON_2RM_VCVT_F16_F32] = 0x2,
     [NEON_2RM_VCVT_F32_F16] = 0x2,
     [NEON_2RM_VRECPE] = 0x4,
@@ -6489,6 +6492,20 @@ static int disas_neon_data_insn(CPUARMState * env, DisasContext *s, uint32_t ins
                     tcg_temp_free_ptr(fpstatus);
                     break;
                 }
+                case NEON_2RM_VRINTZ:
+                {
+                    TCGv_i32 tcg_rmode;
+                    TCGv_ptr fpstatus = get_fpstatus_ptr(1);
+                    tcg_rmode = tcg_const_i32(float_round_to_zero);
+                    gen_helper_set_neon_rmode(tcg_rmode, tcg_rmode,
+                                              cpu_env);
+                    gen_helper_rints(cpu_F0s, cpu_F0s, fpstatus);
+                    gen_helper_set_neon_rmode(tcg_rmode, tcg_rmode,
+                                              cpu_env);
+                    tcg_temp_free_ptr(fpstatus);
+                    tcg_temp_free_i32(tcg_rmode);
+                    break;
+                }
                 case NEON_2RM_VRECPE:
                     gen_helper_recpe_u32(tmp, tmp, cpu_env);
                     break;
-- 
1.8.1.4
[Qemu-devel] [PATCH 5/9] target-arm: Add support for AArch32 FP VRINTX
Add support for the AArch32 floating-point VRINTX instruction.

Signed-off-by: Will Newton <will.new...@linaro.org>
---
 target-arm/translate.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/target-arm/translate.c b/target-arm/translate.c
index 153d0e6..5108f6b 100644
--- a/target-arm/translate.c
+++ b/target-arm/translate.c
@@ -3406,6 +3406,17 @@ static int disas_vfp_insn(CPUARMState * env, DisasContext *s, uint32_t insn)
                         tcg_temp_free_ptr(fpst);
                     }
                     break;
+                case 14: /* vrintx */
+                    if (dp) {
+                        TCGv_ptr fpst = get_fpstatus_ptr(0);
+                        gen_helper_rintd_exact(cpu_F0d, cpu_F0d, fpst);
+                        tcg_temp_free_ptr(fpst);
+                    } else {
+                        TCGv_ptr fpst = get_fpstatus_ptr(0);
+                        gen_helper_rints_exact(cpu_F0s, cpu_F0s, fpst);
+                        tcg_temp_free_ptr(fpst);
+                    }
+                    break;
                 case 15: /* single-double conversion */
                     if (dp)
                         gen_helper_vfp_fcvtsd(cpu_F0s, cpu_F0d, cpu_env);
-- 
1.8.1.4
[Qemu-devel] [PATCH 3/9] target-arm: Add support for AArch32 FP VRINTR
Add support for the AArch32 floating-point VRINTR instruction.

Signed-off-by: Will Newton <will.new...@linaro.org>
---
 target-arm/translate.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/target-arm/translate.c b/target-arm/translate.c
index f688f6d..73e0e8d 100644
--- a/target-arm/translate.c
+++ b/target-arm/translate.c
@@ -3374,6 +3374,17 @@ static int disas_vfp_insn(CPUARMState * env, DisasContext *s, uint32_t insn)
                     gen_vfp_F1_ld0(dp);
                     gen_vfp_cmpe(dp);
                     break;
+                case 12: /* vrintr */
+                    if (dp) {
+                        TCGv_ptr fpst = get_fpstatus_ptr(0);
+                        gen_helper_rintd(cpu_F0d, cpu_F0d, fpst);
+                        tcg_temp_free_ptr(fpst);
+                    } else {
+                        TCGv_ptr fpst = get_fpstatus_ptr(0);
+                        gen_helper_rints(cpu_F0s, cpu_F0s, fpst);
+                        tcg_temp_free_ptr(fpst);
+                    }
+                    break;
                 case 15: /* single-double conversion */
                     if (dp)
                         gen_helper_vfp_fcvtsd(cpu_F0s, cpu_F0d, cpu_env);
-- 
1.8.1.4
[Qemu-devel] [PATCH 6/9] target-arm: Add support for AArch32 SIMD VRINTX
Add support for the AArch32 Advanced SIMD VRINTX instruction.

Signed-off-by: Will Newton <will.new...@linaro.org>
---
 target-arm/translate.c | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/target-arm/translate.c b/target-arm/translate.c
index 5108f6b..b6d11db 100644
--- a/target-arm/translate.c
+++ b/target-arm/translate.c
@@ -4709,6 +4709,7 @@ static const uint8_t neon_3r_sizes[] = {
 #define NEON_2RM_VMOVN 36 /* Includes VQMOVN, VQMOVUN */
 #define NEON_2RM_VQMOVN 37 /* Includes VQMOVUN */
 #define NEON_2RM_VSHLL 38
+#define NEON_2RM_VRINTX 41
 #define NEON_2RM_VCVT_F16_F32 44
 #define NEON_2RM_VCVT_F32_F16 46
 #define NEON_2RM_VRECPE 56
@@ -4724,7 +4725,7 @@ static int neon_2rm_is_float_op(int op)
 {
     /* Return true if this neon 2reg-misc op is float-to-float */
     return (op == NEON_2RM_VABS_F || op == NEON_2RM_VNEG_F ||
-            op >= NEON_2RM_VRECPE_F);
+            op == NEON_2RM_VRINTX || op >= NEON_2RM_VRECPE_F);
 }
 
 /* Each entry in this array has bit n set if the insn allows
@@ -4768,6 +4769,7 @@ static const uint8_t neon_2rm_sizes[] = {
     [NEON_2RM_VMOVN] = 0x7,
     [NEON_2RM_VQMOVN] = 0x7,
     [NEON_2RM_VSHLL] = 0x7,
+    [NEON_2RM_VRINTX] = 0x4,
     [NEON_2RM_VCVT_F16_F32] = 0x2,
     [NEON_2RM_VCVT_F32_F16] = 0x2,
     [NEON_2RM_VRECPE] = 0x4,
@@ -6480,6 +6482,13 @@ static int disas_neon_data_insn(CPUARMState * env, DisasContext *s, uint32_t ins
                     }
                     neon_store_reg(rm, pass, tmp2);
                     break;
+                case NEON_2RM_VRINTX:
+                {
+                    TCGv_ptr fpstatus = get_fpstatus_ptr(1);
+                    gen_helper_rints_exact(cpu_F0s, cpu_F0s, fpstatus);
+                    tcg_temp_free_ptr(fpstatus);
+                    break;
+                }
                 case NEON_2RM_VRECPE:
                     gen_helper_recpe_u32(tmp, tmp, cpu_env);
                     break;
-- 
1.8.1.4
[Qemu-devel] [PATCH 1/9] target-arm: Move arm_rmode_to_sf to a shared location.
This function will be needed for AArch32 ARMv8 support, so move it to
helper.c where it can be used by both targets. Also moves the code out
of line, but as it is quite a large function I don't believe this
should be a significant performance impact.

Signed-off-by: Will Newton <will.new...@linaro.org>
---
 target-arm/cpu.h           |  2 ++
 target-arm/helper.c        | 28 ++++++++++++++++++++++++++++
 target-arm/translate-a64.c | 28 ----------------------------
 3 files changed, 30 insertions(+), 28 deletions(-)

diff --git a/target-arm/cpu.h b/target-arm/cpu.h
index 198b6b8..383c582 100644
--- a/target-arm/cpu.h
+++ b/target-arm/cpu.h
@@ -496,6 +496,8 @@ enum arm_fprounding {
     FPROUNDING_ODD
 };
 
+int arm_rmode_to_sf(int rmode);
+
 enum arm_cpu_mode {
   ARM_CPU_MODE_USR = 0x10,
   ARM_CPU_MODE_FIQ = 0x11,
diff --git a/target-arm/helper.c b/target-arm/helper.c
index c708f15..b1541b9 100644
--- a/target-arm/helper.c
+++ b/target-arm/helper.c
@@ -4418,3 +4418,31 @@ float64 HELPER(rintd)(float64 x, void *fp_status)
 
     return ret;
 }
+
+/* Convert ARM rounding mode to softfloat */
+int arm_rmode_to_sf(int rmode)
+{
+    switch (rmode) {
+    case FPROUNDING_TIEAWAY:
+        rmode = float_round_ties_away;
+        break;
+    case FPROUNDING_ODD:
+        /* FIXME: add support for TIEAWAY and ODD */
+        qemu_log_mask(LOG_UNIMP, "arm: unimplemented rounding mode: %d\n",
+                      rmode);
+    case FPROUNDING_TIEEVEN:
+    default:
+        rmode = float_round_nearest_even;
+        break;
+    case FPROUNDING_POSINF:
+        rmode = float_round_up;
+        break;
+    case FPROUNDING_NEGINF:
+        rmode = float_round_down;
+        break;
+    case FPROUNDING_ZERO:
+        rmode = float_round_to_zero;
+        break;
+    }
+    return rmode;
+}
diff --git a/target-arm/translate-a64.c b/target-arm/translate-a64.c
index cf80c46..8effbe2 100644
--- a/target-arm/translate-a64.c
+++ b/target-arm/translate-a64.c
@@ -3186,34 +3186,6 @@ static void disas_data_proc_reg(DisasContext *s, uint32_t insn)
     }
 }
 
-/* Convert ARM rounding mode to softfloat */
-static inline int arm_rmode_to_sf(int rmode)
-{
-    switch (rmode) {
-    case FPROUNDING_TIEAWAY:
-        rmode = float_round_ties_away;
-        break;
-    case FPROUNDING_ODD:
-        /* FIXME: add support for TIEAWAY and ODD */
-        qemu_log_mask(LOG_UNIMP, "arm: unimplemented rounding mode: %d\n",
-                      rmode);
-    case FPROUNDING_TIEEVEN:
-    default:
-        rmode = float_round_nearest_even;
-        break;
-    case FPROUNDING_POSINF:
-        rmode = float_round_up;
-        break;
-    case FPROUNDING_NEGINF:
-        rmode = float_round_down;
-        break;
-    case FPROUNDING_ZERO:
-        rmode = float_round_to_zero;
-        break;
-    }
-    return rmode;
-}
-
 static void handle_fp_compare(DisasContext *s, bool is_double,
                               unsigned int rn, unsigned int rm,
                               bool cmp_with_zero, bool signal_all_nans)
-- 
1.8.1.4
[Qemu-devel] [PATCH 0/9] target-arm: Add AArch32 ARMv8 VRINT instructions
This series adds support for the floating-point and Advanced SIMD
versions of the VRINT family of instructions.

Will Newton (9):
  target-arm: Move arm_rmode_to_sf to a shared location.
  target-arm: Add AArch32 FP VRINTA, VRINTN, VRINTP and VRINTM
  target-arm: Add support for AArch32 FP VRINTR
  target-arm: Add support for AArch32 FP VRINTZ
  target-arm: Add support for AArch32 FP VRINTX
  target-arm: Add support for AArch32 SIMD VRINTX
  target-arm: Add set_neon_rmode helper
  target-arm: Add support for AArch32 SIMD VRINTZ
  target-arm: Add AArch32 SIMD VRINTA, VRINTN, VRINTP and VRINTM

 target-arm/cpu.h           |   2 +
 target-arm/helper.c        |  45 ++
 target-arm/helper.h        |   1 +
 target-arm/translate-a64.c |  28 -
 target-arm/translate.c     | 145 +
 5 files changed, 193 insertions(+), 28 deletions(-)

-- 
1.8.1.4
[Qemu-devel] [PATCH] linux-user: Remove regs parameter load_elf_binary and load_flt_binary
The regs parameter is not used anywhere, so remove it.

Signed-off-by: Will Newton <will.new...@linaro.org>
---
 linux-user/elfload.c   | 3 +--
 linux-user/flatload.c  | 3 +--
 linux-user/linuxload.c | 4 ++--
 linux-user/qemu.h      | 6 ++----
 4 files changed, 6 insertions(+), 10 deletions(-)

diff --git a/linux-user/elfload.c b/linux-user/elfload.c
index 8dd424d..5902f16 100644
--- a/linux-user/elfload.c
+++ b/linux-user/elfload.c
@@ -1998,8 +1998,7 @@ give_up:
     free(syms);
 }
 
-int load_elf_binary(struct linux_binprm * bprm, struct target_pt_regs * regs,
-                    struct image_info * info)
+int load_elf_binary(struct linux_binprm *bprm, struct image_info *info)
 {
     struct image_info interp_info;
     struct elfhdr elf_ex;
diff --git a/linux-user/flatload.c b/linux-user/flatload.c
index ceb89bb..566a7a8 100644
--- a/linux-user/flatload.c
+++ b/linux-user/flatload.c
@@ -704,8 +704,7 @@ static int load_flat_shared_library(int id, struct lib_info *libs)
 
 #endif /* CONFIG_BINFMT_SHARED_FLAT */
 
-int load_flt_binary(struct linux_binprm * bprm, struct target_pt_regs * regs,
-                    struct image_info * info)
+int load_flt_binary(struct linux_binprm *bprm, struct image_info *info)
 {
     struct lib_info libinfo[MAX_SHARED_LIBS];
     abi_ulong p = bprm->p;
diff --git a/linux-user/linuxload.c b/linux-user/linuxload.c
index a1fe5ed..f2997c2 100644
--- a/linux-user/linuxload.c
+++ b/linux-user/linuxload.c
@@ -154,13 +154,13 @@ int loader_exec(int fdexec, const char *filename, char **argv, char **envp,
                 && bprm->buf[1] == 'E'
                 && bprm->buf[2] == 'L'
                 && bprm->buf[3] == 'F') {
-            retval = load_elf_binary(bprm, regs, infop);
+            retval = load_elf_binary(bprm, infop);
 #if defined(TARGET_HAS_BFLT)
         } else if (bprm->buf[0] == 'b'
                 && bprm->buf[1] == 'F'
                 && bprm->buf[2] == 'L'
                 && bprm->buf[3] == 'T') {
-            retval = load_flt_binary(bprm,regs,infop);
+            retval = load_flt_binary(bprm, infop);
 #endif
         } else {
             return -ENOEXEC;
diff --git a/linux-user/qemu.h b/linux-user/qemu.h
index e2717e0..c2f74f3 100644
--- a/linux-user/qemu.h
+++ b/linux-user/qemu.h
@@ -178,10 +178,8 @@ int loader_exec(int fdexec, const char *filename, char **argv, char **envp,
              struct target_pt_regs * regs, struct image_info *infop,
              struct linux_binprm *);
 
-int load_elf_binary(struct linux_binprm * bprm, struct target_pt_regs * regs,
-                    struct image_info * info);
-int load_flt_binary(struct linux_binprm * bprm, struct target_pt_regs * regs,
-                    struct image_info * info);
+int load_elf_binary(struct linux_binprm *bprm, struct image_info *info);
+int load_flt_binary(struct linux_binprm *bprm, struct image_info *info);
 
 abi_long memcpy_to_target(abi_ulong dest, const void *src,
                           unsigned long len);
-- 
1.8.1.4
Re: [Qemu-devel] arm-softmmu usb (or hci) passthrough possible?
On 10 December 2013 10:53, Jan Petrouš <jan.petr...@tieto.com> wrote:
> Hi Peter.
>
> On 10 December 2013 11:35, Peter Maydell <peter.mayd...@linaro.org> wrote:
>> On 10 December 2013 10:20, Jan Petrouš <jan.petr...@tieto.com> wrote:
>>> Hi, sorry for my ignorance but I can not find the way how to set up
>>> usb (or hci) passthrough for arm-softmmu target. Is it even possible?
>>
>> You don't say which ARM board model you're trying to use.
>> Many of them don't have an emulated USB controller...
>
> Well, for start I would be happy with any arm model working this way :)
> My intention is to fix that on RasPi model. But for that I would like
> to be sure it is even possible or if it need more work (what is also
> not problem for me). I simply would like to be sure to not duplicate
> someone's effort.
>
> So, if you can point me to any arm model which can do usb passthrough
> I would be very happy. Thanks.

Just a word of warning - the Raspberry Pi uses a non-standard (i.e.
non-EHCI) USB host controller so it may prove very difficult to get USB
working.

-- 
Will Newton
Toolchain Working Group, Linaro
[Qemu-devel] [PATCH v8 1/6] target-arm: Move call to disas_vfp_insn out of disas_coproc_insn.
Floating point is an extension to the instruction set rather than a
coprocessor, so call it directly from the ARM and Thumb decode
functions.

Signed-off-by: Will Newton <will.new...@linaro.org>
Reviewed-by: Peter Maydell <peter.mayd...@linaro.org>
---
 target-arm/translate.c | 32 +++++++++++++++++++++++++++-----
 1 file changed, 27 insertions(+), 5 deletions(-)

diff --git a/target-arm/translate.c b/target-arm/translate.c
index 5f003e7..f63e89d 100644
--- a/target-arm/translate.c
+++ b/target-arm/translate.c
@@ -2636,6 +2636,14 @@ static int disas_vfp_insn(CPUARMState * env, DisasContext *s, uint32_t insn)
             && rn != ARM_VFP_MVFR1 && rn != ARM_VFP_MVFR0)
             return 1;
     }
+
+    if (extract32(insn, 28, 4) == 0xf) {
+        /* Encodings with T=1 (Thumb) or unconditional (ARM):
+         * only used in v8 and above.
+         */
+        return 1;
+    }
+
     dp = ((insn & 0xf00) == 0xb00);
     switch ((insn >> 24) & 0xf) {
     case 0xe:
@@ -6296,9 +6304,6 @@ static int disas_coproc_insn(CPUARMState * env, DisasContext *s, uint32_t insn)
             return disas_dsp_insn(env, s, insn);
         }
         return 1;
-    case 10:
-    case 11:
-        return disas_vfp_insn (env, s, insn);
     default:
         break;
     }
@@ -6753,6 +6758,13 @@ static void disas_arm_insn(CPUARMState * env, DisasContext *s)
                 goto illegal_op;
             return;
         }
+        if ((insn & 0x0f000e10) == 0x0e000a00) {
+            /* VFP. */
+            if (disas_vfp_insn(env, s, insn)) {
+                goto illegal_op;
+            }
+            return;
+        }
         if (((insn & 0x0f30f000) == 0x0510f000) ||
             ((insn & 0x0f30f010) == 0x0710f000)) {
             if ((insn & (1 << 22)) == 0) {
@@ -8033,9 +8045,15 @@ static void disas_arm_insn(CPUARMState * env, DisasContext *s)
         case 0xc:
         case 0xd:
         case 0xe:
-            /* Coprocessor. */
-            if (disas_coproc_insn(env, s, insn))
+            if (((insn >> 8) & 0xe) == 10) {
+                /* VFP. */
+                if (disas_vfp_insn(env, s, insn)) {
+                    goto illegal_op;
+                }
+            } else if (disas_coproc_insn(env, s, insn)) {
+                /* Coprocessor. */
                 goto illegal_op;
+            }
             break;
         case 0xf:
             /* swi */
@@ -8765,6 +8783,10 @@ static int disas_thumb2_insn(CPUARMState *env, DisasContext *s, uint16_t insn_hw
             insn = (insn & 0xe2ffffff) | ((insn & (1 << 28)) >> 4) | (1 << 28);
             if (disas_neon_data_insn(env, s, insn))
                 goto illegal_op;
+        } else if (((insn >> 8) & 0xe) == 10) {
+            if (disas_vfp_insn(env, s, insn)) {
+                goto illegal_op;
+            }
         } else {
             if (insn & (1 << 28))
                 goto illegal_op;
-- 
1.8.1.4
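The two bit tests this patch relies on are easy to try in isolation: the coprocessor-number check that routes an instruction to the VFP decoder, and the "unconditional" check on the top nibble. The sketch below models both in plain C. The function names are invented for the example, and the test words are synthetic bit patterns rather than real instruction encodings.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bits [11:8] hold the coprocessor number. Masking with 0xe drops the
 * low bit, so both cp10 (0b1010) and cp11 (0b1011) compare equal to 10. */
static bool model_is_vfp_cp(uint32_t insn)
{
    return ((insn >> 8) & 0xe) == 10;
}

/* Equivalent to extract32(insn, 28, 4) == 0xf: the top four bits are
 * all ones, which is the unconditional space in the ARM encoding. */
static bool model_is_uncond(uint32_t insn)
{
    return ((insn >> 28) & 0xf) == 0xf;
}
```

A word with 0xa or 0xb in bits [11:8] passes model_is_vfp_cp(); 0x8 or 0xf in that field does not.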
[Qemu-devel] [PATCH v8 2/6] target-arm: Implement ARMv8 VSEL instruction.
This adds support for the VSEL floating point selection instruction
which was added in ARMv8.

Signed-off-by: Will Newton <will.new...@linaro.org>
Reviewed-by: Peter Maydell <peter.mayd...@linaro.org>
---
 target-arm/translate.c | 135 -
 1 file changed, 134 insertions(+), 1 deletion(-)

diff --git a/target-arm/translate.c b/target-arm/translate.c
index f63e89d..0a22ad8 100644
--- a/target-arm/translate.c
+++ b/target-arm/translate.c
@@ -2614,6 +2614,139 @@ static TCGv_i32 gen_load_and_replicate(DisasContext *s, TCGv_i32 addr, int size)
     return tmp;
 }
 
+static int handle_vsel(uint32_t insn, uint32_t rd, uint32_t rn, uint32_t rm,
+                       uint32_t dp)
+{
+    uint32_t cc = extract32(insn, 20, 2);
+
+    if (dp) {
+        TCGv_i64 frn, frm, dest;
+        TCGv_i64 tmp, zero, zf, nf, vf;
+
+        zero = tcg_const_i64(0);
+
+        frn = tcg_temp_new_i64();
+        frm = tcg_temp_new_i64();
+        dest = tcg_temp_new_i64();
+
+        zf = tcg_temp_new_i64();
+        nf = tcg_temp_new_i64();
+        vf = tcg_temp_new_i64();
+
+        tcg_gen_extu_i32_i64(zf, cpu_ZF);
+        tcg_gen_ext_i32_i64(nf, cpu_NF);
+        tcg_gen_ext_i32_i64(vf, cpu_VF);
+
+        tcg_gen_ld_f64(frn, cpu_env, vfp_reg_offset(dp, rn));
+        tcg_gen_ld_f64(frm, cpu_env, vfp_reg_offset(dp, rm));
+        switch (cc) {
+        case 0: /* eq: Z */
+            tcg_gen_movcond_i64(TCG_COND_EQ, dest, zf, zero,
+                                frn, frm);
+            break;
+        case 1: /* vs: V */
+            tcg_gen_movcond_i64(TCG_COND_LT, dest, vf, zero,
+                                frn, frm);
+            break;
+        case 2: /* ge: N == V -> N ^ V == 0 */
+            tmp = tcg_temp_new_i64();
+            tcg_gen_xor_i64(tmp, vf, nf);
+            tcg_gen_movcond_i64(TCG_COND_GE, dest, tmp, zero,
+                                frn, frm);
+            tcg_temp_free_i64(tmp);
+            break;
+        case 3: /* gt: !Z && N == V */
+            tcg_gen_movcond_i64(TCG_COND_NE, dest, zf, zero,
+                                frn, frm);
+            tmp = tcg_temp_new_i64();
+            tcg_gen_xor_i64(tmp, vf, nf);
+            tcg_gen_movcond_i64(TCG_COND_GE, dest, tmp, zero,
+                                dest, frm);
+            tcg_temp_free_i64(tmp);
+            break;
+        }
+        tcg_gen_st_f64(dest, cpu_env, vfp_reg_offset(dp, rd));
+        tcg_temp_free_i64(frn);
+        tcg_temp_free_i64(frm);
+        tcg_temp_free_i64(dest);
+
+        tcg_temp_free_i64(zf);
+        tcg_temp_free_i64(nf);
+        tcg_temp_free_i64(vf);
+
+        tcg_temp_free_i64(zero);
+    } else {
+        TCGv_i32 frn, frm, dest;
+        TCGv_i32 tmp, zero;
+
+        zero = tcg_const_i32(0);
+
+        frn = tcg_temp_new_i32();
+        frm = tcg_temp_new_i32();
+        dest = tcg_temp_new_i32();
+        tcg_gen_ld_f32(frn, cpu_env, vfp_reg_offset(dp, rn));
+        tcg_gen_ld_f32(frm, cpu_env, vfp_reg_offset(dp, rm));
+        switch (cc) {
+        case 0: /* eq: Z */
+            tcg_gen_movcond_i32(TCG_COND_EQ, dest, cpu_ZF, zero,
+                                frn, frm);
+            break;
+        case 1: /* vs: V */
+            tcg_gen_movcond_i32(TCG_COND_LT, dest, cpu_VF, zero,
+                                frn, frm);
+            break;
+        case 2: /* ge: N == V -> N ^ V == 0 */
+            tmp = tcg_temp_new_i32();
+            tcg_gen_xor_i32(tmp, cpu_VF, cpu_NF);
+            tcg_gen_movcond_i32(TCG_COND_GE, dest, tmp, zero,
+                                frn, frm);
+            tcg_temp_free_i32(tmp);
+            break;
+        case 3: /* gt: !Z && N == V */
+            tcg_gen_movcond_i32(TCG_COND_NE, dest, cpu_ZF, zero,
+                                frn, frm);
+            tmp = tcg_temp_new_i32();
+            tcg_gen_xor_i32(tmp, cpu_VF, cpu_NF);
+            tcg_gen_movcond_i32(TCG_COND_GE, dest, tmp, zero,
+                                dest, frm);
+            tcg_temp_free_i32(tmp);
+            break;
+        }
+        tcg_gen_st_f32(dest, cpu_env, vfp_reg_offset(dp, rd));
+        tcg_temp_free_i32(frn);
+        tcg_temp_free_i32(frm);
+        tcg_temp_free_i32(dest);
+
+        tcg_temp_free_i32(zero);
+    }
+
+    return 0;
+}
+
+static int disas_vfp_v8_insn(CPUARMState *env, DisasContext *s, uint32_t insn)
+{
+    uint32_t rd, rn, rm, dp = extract32(insn, 8, 1);
+
+    if (!arm_feature(env, ARM_FEATURE_V8)) {
+        return 1;
+    }
+
+    if (dp) {
+        VFP_DREG_D(rd, insn);
+        VFP_DREG_N(rn, insn);
+        VFP_DREG_M(rm, insn);
+    } else {
+        rd = VFP_SREG_D(insn);
+        rn = VFP_SREG_N(insn);
+        rm = VFP_SREG_M(insn);
+    }
+
+    if ((insn & 0x0f800e50) == 0x0e000a00) {
+        return handle_vsel(insn, rd, rn, rm, dp);
+    }
+    return 1;
+}
+
 /* Disassemble a VFP instruction. Returns nonzero if an error occurred
    (ie. an undefined instruction
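The four condition cases in handle_vsel() are easier to follow as ordinary boolean logic. The sketch below is a plain-C model of the selection rule only, taking the flags as booleans instead of QEMU's internal cpu_ZF/cpu_NF/cpu_VF representation (where, as the movcond comparisons in the patch imply, Z is "value equals zero" and N and V live in the sign bit). model_vsel is an illustrative name, not something defined by the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* cc is the 2-bit field the patch reads with extract32(insn, 20, 2):
 * 0 = EQ, 1 = VS, 2 = GE, 3 = GT.
 * Returns rn when the condition holds and rm otherwise. */
static uint32_t model_vsel(unsigned cc, bool n, bool z, bool v,
                           uint32_t rn, uint32_t rm)
{
    bool take_rn;

    switch (cc & 3) {
    case 0:  /* eq: Z */
        take_rn = z;
        break;
    case 1:  /* vs: V */
        take_rn = v;
        break;
    case 2:  /* ge: N == V */
        take_rn = (n == v);
        break;
    default: /* gt: !Z && N == V */
        take_rn = !z && (n == v);
        break;
    }
    return take_rn ? rn : rm;
}
```

The GT case shows why the patch needs two movcond operations: the first picks rn or rm on !Z, and the second keeps that result only if N == V, falling back to rm otherwise.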
[Qemu-devel] [PATCH v8 6/6] target-arm: Implement ARMv8 SIMD VMAXNM and VMINNM instructions.
This adds support for the ARMv8 Advanced SIMD VMAXNM and VMINNM
instructions.

Signed-off-by: Will Newton <will.new...@linaro.org>
---
 target-arm/translate.c | 31 ++++++++++++++++++++++---------
 1 file changed, 22 insertions(+), 9 deletions(-)

Changes in v8:
 - Use VFP helper instead of adding a NEON specific one
 - Drop size check

diff --git a/target-arm/translate.c b/target-arm/translate.c
index 9a8069e..73ed266 100644
--- a/target-arm/translate.c
+++ b/target-arm/translate.c
@@ -4553,7 +4553,7 @@ static void gen_neon_narrow_op(int op, int u, int size,
 #define NEON_3R_FLOAT_CMP 28 /* float VCEQ, VCGE, VCGT */
 #define NEON_3R_FLOAT_ACMP 29 /* float VACGE, VACGT, VACLE, VACLT */
 #define NEON_3R_FLOAT_MINMAX 30 /* float VMIN, VMAX */
-#define NEON_3R_VRECPS_VRSQRTS 31 /* float VRECPS, VRSQRTS */
+#define NEON_3R_FLOAT_MISC 31 /* float VRECPS, VRSQRTS, VMAXNM/MINNM */
 
 static const uint8_t neon_3r_sizes[] = {
     [NEON_3R_VHADD] = 0x7,
@@ -4586,7 +4586,7 @@ static const uint8_t neon_3r_sizes[] = {
     [NEON_3R_FLOAT_CMP] = 0x5, /* size bit 1 encodes op */
     [NEON_3R_FLOAT_ACMP] = 0x5, /* size bit 1 encodes op */
     [NEON_3R_FLOAT_MINMAX] = 0x5, /* size bit 1 encodes op */
-    [NEON_3R_VRECPS_VRSQRTS] = 0x5, /* size bit 1 encodes op */
+    [NEON_3R_FLOAT_MISC] = 0x5, /* size bit 1 encodes op */
 };
 
 /* Symbolic constants for op fields for Neon 2-register miscellaneous.
@@ -4847,8 +4847,9 @@ static int disas_neon_data_insn(CPUARMState * env, DisasContext *s, uint32_t ins
                 return 1;
             }
             break;
-        case NEON_3R_VRECPS_VRSQRTS:
-            if (u) {
+        case NEON_3R_FLOAT_MISC:
+            /* VMAXNM/VMINNM in ARMv8 */
+            if (u && !arm_feature(env, ARM_FEATURE_V8)) {
                 return 1;
             }
             break;
@@ -5137,11 +5138,23 @@ static int disas_neon_data_insn(CPUARMState * env, DisasContext *s, uint32_t ins
             tcg_temp_free_ptr(fpstatus);
             break;
         }
-        case NEON_3R_VRECPS_VRSQRTS:
-            if (size == 0)
-                gen_helper_recps_f32(tmp, tmp, tmp2, cpu_env);
-            else
-                gen_helper_rsqrts_f32(tmp, tmp, tmp2, cpu_env);
+        case NEON_3R_FLOAT_MISC:
+            if (u) {
+                /* VMAXNM/VMINNM */
+                TCGv_ptr fpstatus = get_fpstatus_ptr(1);
+                if (size == 0) {
+                    gen_helper_vfp_maxnms(tmp, tmp, tmp2, fpstatus);
+                } else {
+                    gen_helper_vfp_minnms(tmp, tmp, tmp2, fpstatus);
+                }
+                tcg_temp_free_ptr(fpstatus);
+            } else {
+                if (size == 0) {
+                    gen_helper_recps_f32(tmp, tmp, tmp2, cpu_env);
+                } else {
+                    gen_helper_rsqrts_f32(tmp, tmp, tmp2, cpu_env);
+                }
+            }
             break;
         case NEON_3R_VFM:
         {
-- 
1.8.1.4
[Qemu-devel] [PATCH v8 0/6] target-arm: Add support for VSEL and VMIN/MAXNM.
This series adds support for three new instructions added in ARMv8 -
VSEL, VMINNM and VMAXNM.

Will Newton (6):
  target-arm: Move call to disas_vfp_insn out of disas_coproc_insn.
  target-arm: Implement ARMv8 VSEL instruction.
  softfloat: Remove unused argument from MINMAX macro.
  softfloat: Add minNum() and maxNum() functions to softfloat.
  target-arm: Implement ARMv8 FP VMAXNM and VMINNM instructions.
  target-arm: Implement ARMv8 SIMD VMAXNM and VMINNM instructions.

 fpu/softfloat.c         |  38 ++--
 include/fpu/softfloat.h |   4 +
 target-arm/helper.c     |  25 +
 target-arm/helper.h     |   5 +
 target-arm/translate.c  | 246 +---
 5 files changed, 298 insertions(+), 20 deletions(-)

-- 
1.8.1.4
[Qemu-devel] [PATCH v8 5/6] target-arm: Implement ARMv8 FP VMAXNM and VMINNM instructions.
This adds support for the ARMv8 floating point VMAXNM and VMINNM
instructions.

Signed-off-by: Will Newton <will.new...@linaro.org>
Reviewed-by: Peter Maydell <peter.mayd...@linaro.org>
---
 target-arm/helper.c    | 25 +++++++++++++++++++++++++
 target-arm/helper.h    |  5 +++++
 target-arm/translate.c | 50 ++++++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 80 insertions(+)

diff --git a/target-arm/helper.c b/target-arm/helper.c
index 3445813..7507240 100644
--- a/target-arm/helper.c
+++ b/target-arm/helper.c
@@ -4079,3 +4079,28 @@ float64 VFP_HELPER(muladd, d)(float64 a, float64 b, float64 c, void *fpstp)
     float_status *fpst = fpstp;
     return float64_muladd(a, b, c, 0, fpst);
 }
+
+/* ARMv8 VMAXNM/VMINNM */
+float32 VFP_HELPER(maxnm, s)(float32 a, float32 b, void *fpstp)
+{
+    float_status *fpst = fpstp;
+    return float32_maxnum(a, b, fpst);
+}
+
+float64 VFP_HELPER(maxnm, d)(float64 a, float64 b, void *fpstp)
+{
+    float_status *fpst = fpstp;
+    return float64_maxnum(a, b, fpst);
+}
+
+float32 VFP_HELPER(minnm, s)(float32 a, float32 b, void *fpstp)
+{
+    float_status *fpst = fpstp;
+    return float32_minnum(a, b, fpst);
+}
+
+float64 VFP_HELPER(minnm, d)(float64 a, float64 b, void *fpstp)
+{
+    float_status *fpst = fpstp;
+    return float64_minnum(a, b, fpst);
+}
diff --git a/target-arm/helper.h b/target-arm/helper.h
index cac9564..d459a39 100644
--- a/target-arm/helper.h
+++ b/target-arm/helper.h
@@ -132,6 +132,11 @@ DEF_HELPER_2(neon_fcvt_f32_to_f16, i32, f32, env)
 DEF_HELPER_4(vfp_muladdd, f64, f64, f64, f64, ptr)
 DEF_HELPER_4(vfp_muladds, f32, f32, f32, f32, ptr)
 
+DEF_HELPER_3(vfp_maxnmd, f64, f64, f64, ptr)
+DEF_HELPER_3(vfp_maxnms, f32, f32, f32, ptr)
+DEF_HELPER_3(vfp_minnmd, f64, f64, f64, ptr)
+DEF_HELPER_3(vfp_minnms, f32, f32, f32, ptr)
+
 DEF_HELPER_3(recps_f32, f32, f32, f32, env)
 DEF_HELPER_3(rsqrts_f32, f32, f32, f32, env)
 DEF_HELPER_2(recpe_f32, f32, f32, env)
diff --git a/target-arm/translate.c b/target-arm/translate.c
index 0a22ad8..9a8069e 100644
--- a/target-arm/translate.c
+++ b/target-arm/translate.c
@@ -2723,6 +2723,54 @@ static int handle_vsel(uint32_t insn, uint32_t rd, uint32_t rn, uint32_t rm,
     return 0;
 }
 
+static int handle_vminmaxnm(uint32_t insn, uint32_t rd, uint32_t rn,
+                            uint32_t rm, uint32_t dp)
+{
+    uint32_t vmin = extract32(insn, 6, 1);
+    TCGv_ptr fpst = get_fpstatus_ptr(0);
+
+    if (dp) {
+        TCGv_i64 frn, frm, dest;
+
+        frn = tcg_temp_new_i64();
+        frm = tcg_temp_new_i64();
+        dest = tcg_temp_new_i64();
+
+        tcg_gen_ld_f64(frn, cpu_env, vfp_reg_offset(dp, rn));
+        tcg_gen_ld_f64(frm, cpu_env, vfp_reg_offset(dp, rm));
+        if (vmin) {
+            gen_helper_vfp_minnmd(dest, frn, frm, fpst);
+        } else {
+            gen_helper_vfp_maxnmd(dest, frn, frm, fpst);
+        }
+        tcg_gen_st_f64(dest, cpu_env, vfp_reg_offset(dp, rd));
+        tcg_temp_free_i64(frn);
+        tcg_temp_free_i64(frm);
+        tcg_temp_free_i64(dest);
+    } else {
+        TCGv_i32 frn, frm, dest;
+
+        frn = tcg_temp_new_i32();
+        frm = tcg_temp_new_i32();
+        dest = tcg_temp_new_i32();
+
+        tcg_gen_ld_f32(frn, cpu_env, vfp_reg_offset(dp, rn));
+        tcg_gen_ld_f32(frm, cpu_env, vfp_reg_offset(dp, rm));
+        if (vmin) {
+            gen_helper_vfp_minnms(dest, frn, frm, fpst);
+        } else {
+            gen_helper_vfp_maxnms(dest, frn, frm, fpst);
+        }
+        tcg_gen_st_f32(dest, cpu_env, vfp_reg_offset(dp, rd));
+        tcg_temp_free_i32(frn);
+        tcg_temp_free_i32(frm);
+        tcg_temp_free_i32(dest);
+    }
+
+    tcg_temp_free_ptr(fpst);
+    return 0;
+}
+
 static int disas_vfp_v8_insn(CPUARMState *env, DisasContext *s, uint32_t insn)
 {
     uint32_t rd, rn, rm, dp = extract32(insn, 8, 1);
@@ -2743,6 +2791,8 @@ static int disas_vfp_v8_insn(CPUARMState *env, DisasContext *s, uint32_t insn)
 
     if ((insn & 0x0f800e50) == 0x0e000a00) {
         return handle_vsel(insn, rd, rn, rm, dp);
+    } else if ((insn & 0x0fb00e10) == 0x0e800a00) {
+        return handle_vminmaxnm(insn, rd, rn, rm, dp);
     }
     return 1;
 }
-- 
1.8.1.4
[Qemu-devel] [PATCH v8 3/6] softfloat: Remove unused argument from MINMAX macro.
The nan_exp argument is not used, so remove it.

Signed-off-by: Will Newton <will.new...@linaro.org>
Reviewed-by: Peter Maydell <peter.mayd...@linaro.org>
---
 fpu/softfloat.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/fpu/softfloat.c b/fpu/softfloat.c
index 7ba51b6..97bf627 100644
--- a/fpu/softfloat.c
+++ b/fpu/softfloat.c
@@ -6706,7 +6706,7 @@ int float128_compare_quiet( float128 a, float128 b STATUS_PARAM )
  * 'compare and pick one input' because that would mishandle
  * NaNs and +0 vs -0.
  */
-#define MINMAX(s, nan_exp)                                              \
+#define MINMAX(s)                                                       \
 INLINE float ## s float ## s ## _minmax(float ## s a, float ## s b,     \
                                         int ismin STATUS_PARAM )        \
 {                                                                       \
@@ -6747,8 +6747,8 @@ float ## s float ## s ## _max(float ## s a, float ## s b STATUS_PARAM)  \
     return float ## s ## _minmax(a, b, 0 STATUS_VAR);                   \
 }
 
-MINMAX(32, 0xff)
-MINMAX(64, 0x7ff)
+MINMAX(32)
+MINMAX(64)
 
 
 /* Multiply A by 2 raised to the power N. */
-- 
1.8.1.4
[Qemu-devel] [PATCH v8 4/6] softfloat: Add minNum() and maxNum() functions to softfloat.
Add floatnn_minnum() and floatnn_maxnum() functions which are equivalent
to the minNum() and maxNum() functions from IEEE 754-2008. They are
similar to min() and max() but differ in the handling of QNaN arguments.

Signed-off-by: Will Newton <will.new...@linaro.org>
---
 fpu/softfloat.c         | 32 +++++++++++++++++++++++++++++---
 include/fpu/softfloat.h |  4 ++++
 2 files changed, 33 insertions(+), 3 deletions(-)

Changes in v8:
 - Use existing MINMAX macro to implement minnum and maxnum
 - Improve comment

diff --git a/fpu/softfloat.c b/fpu/softfloat.c
index 97bf627..dbda61b 100644
--- a/fpu/softfloat.c
+++ b/fpu/softfloat.c
@@ -6705,10 +6705,17 @@ int float128_compare_quiet( float128 a, float128 b STATUS_PARAM )
 /* min() and max() functions. These can't be implemented as
  * 'compare and pick one input' because that would mishandle
  * NaNs and +0 vs -0.
+ *
+ * minnum() and maxnum() functions. These are similar to the min()
+ * and max() functions but if one of the arguments is a QNaN and
+ * the other is numerical then the numerical argument is returned.
+ * minnum() and maxnum correspond to the IEEE 754-2008 minNum()
+ * and maxNum() operations. min() and max() are the typical min/max
+ * semantics provided by many CPUs which predate that specification.
  */
 #define MINMAX(s)                                                       \
 INLINE float ## s float ## s ## _minmax(float ## s a, float ## s b,     \
-                                        int ismin STATUS_PARAM )        \
+                                        int ismin, int isieee STATUS_PARAM) \
 {                                                                       \
     flag aSign, bSign;                                                  \
     uint ## s ## _t av, bv;                                             \
@@ -6716,6 +6723,15 @@ INLINE float ## s float ## s ## _minmax(float ## s a, float ## s b,     \
     b = float ## s ## _squash_input_denormal(b STATUS_VAR);             \
     if (float ## s ## _is_any_nan(a) ||                                 \
         float ## s ## _is_any_nan(b)) {                                 \
+        if (isieee) {                                                   \
+            if (float ## s ## _is_quiet_nan(a) &&                       \
+                !float ## s ##_is_any_nan(b)) {                         \
+                return b;                                               \
+            } else if (float ## s ## _is_quiet_nan(b) &&                \
+                       !float ## s ## _is_any_nan(a)) {                 \
+                return a;                                               \
+            }                                                           \
+        }                                                               \
         return propagateFloat ## s ## NaN(a, b STATUS_VAR);             \
     }                                                                   \
     aSign = extractFloat ## s ## Sign(a);                               \
@@ -6739,12 +6755,22 @@ INLINE float ## s float ## s ## _minmax(float ## s a, float ## s b,     \
                                                                         \
 float ## s float ## s ## _min(float ## s a, float ## s b STATUS_PARAM)  \
 {                                                                       \
-    return float ## s ## _minmax(a, b, 1 STATUS_VAR);                   \
+    return float ## s ## _minmax(a, b, 1, 0 STATUS_VAR);                \
 }                                                                       \
                                                                         \
 float ## s float ## s ## _max(float ## s a, float ## s b STATUS_PARAM)  \
 {                                                                       \
-    return float ## s ## _minmax(a, b, 0 STATUS_VAR);                   \
+    return float ## s ## _minmax(a, b, 0, 0 STATUS_VAR);                \
+}                                                                       \
+                                                                        \
+float ## s float ## s ## _minnum(float ## s a, float ## s b STATUS_PARAM) \
+{                                                                       \
+    return float ## s ## _minmax(a, b, 1, 1 STATUS_VAR);                \
+}                                                                       \
+                                                                        \
+float ## s float ## s ## _maxnum(float ## s a, float ## s b STATUS_PARAM) \
+{                                                                       \
+    return float ## s ## _minmax(a, b, 0, 1 STATUS_VAR);                \
 }
 
 MINMAX(32)
diff --git a/include/fpu/softfloat.h b/include/fpu/softfloat.h
index f3927e2..2365274 100644
--- a/include/fpu/softfloat.h
+++ b/include/fpu/softfloat.h
@@ -302,6 +302,8 @@ int float32_compare( float32, float32 STATUS_PARAM );
 int float32_compare_quiet( float32
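The difference between the min()/max() semantics and the minNum()/maxNum() semantics described in the comment comes down to what happens when exactly one operand is a NaN. The following is a deliberately simplified host-double model of that one difference. It does not distinguish quiet from signalling NaNs, ignores +0 versus -0, and does no denormal squashing, all of which the softfloat macro handles. The model_ function names are invented for the example.

```c
#include <math.h>    /* NAN macro only; no libm functions are called */
#include <stdbool.h>

/* A NaN is the only value that compares unequal to itself. */
static bool model_is_nan(double x)
{
    return x != x;
}

/* min()-style semantics: any NaN operand yields a NaN result. */
static double model_min(double a, double b)
{
    if (model_is_nan(a) || model_is_nan(b)) {
        return NAN;
    }
    return a < b ? a : b;
}

/* minNum()-style semantics: if exactly one operand is a NaN, return
 * the numerical operand; otherwise behave like model_min(). */
static double model_minnum(double a, double b)
{
    if (model_is_nan(a) && !model_is_nan(b)) {
        return b;
    }
    if (model_is_nan(b) && !model_is_nan(a)) {
        return a;
    }
    return model_min(a, b);
}
```

With these definitions model_min(NAN, 1.0) is a NaN while model_minnum(NAN, 1.0) is 1.0; both agree whenever neither operand is a NaN.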
[Qemu-devel] [PATCH v7 0/6] target-arm: Add support for VSEL and VMIN/MAXNM.
This series adds support for three new instructions added in ARMv8:
VSEL, VMINNM and VMAXNM.

Will Newton (6):
  target-arm: Move call to disas_vfp_insn out of disas_coproc_insn.
  target-arm: Implement ARMv8 VSEL instruction.
  softfloat: Remove unused argument from MINMAX macro.
  softfloat: Add minNum() and maxNum() functions to softfloat.
  target-arm: Implement ARMv8 FP VMAXNM and VMINNM instructions.
  target-arm: Implement ARMv8 SIMD VMAXNM and VMINNM instructions.

 fpu/softfloat.c          |  60 +++-
 include/fpu/softfloat.h  |   4 +
 target-arm/helper.c      |  25 +
 target-arm/helper.h      |   8 ++
 target-arm/neon_helper.c |  16
 target-arm/translate.c   | 244 ---
 6 files changed, 340 insertions(+), 17 deletions(-)

-- 
1.8.1.4
[Qemu-devel] [PATCH v7 1/6] target-arm: Move call to disas_vfp_insn out of disas_coproc_insn.
Floating point is an extension to the instruction set rather than a coprocessor, so call it directly from the ARM and Thumb decode functions. --- target-arm/translate.c | 32 +++- 1 file changed, 27 insertions(+), 5 deletions(-) Changes in v7: - Fix comment style - Fix brace style diff --git a/target-arm/translate.c b/target-arm/translate.c index 5f003e7..f63e89d 100644 --- a/target-arm/translate.c +++ b/target-arm/translate.c @@ -2636,6 +2636,14 @@ static int disas_vfp_insn(CPUARMState * env, DisasContext *s, uint32_t insn) rn != ARM_VFP_MVFR1 rn != ARM_VFP_MVFR0) return 1; } + +if (extract32(insn, 28, 4) == 0xf) { +/* Encodings with T=1 (Thumb) or unconditional (ARM): + * only used in v8 and above. + */ +return 1; +} + dp = ((insn 0xf00) == 0xb00); switch ((insn 24) 0xf) { case 0xe: @@ -6296,9 +6304,6 @@ static int disas_coproc_insn(CPUARMState * env, DisasContext *s, uint32_t insn) return disas_dsp_insn(env, s, insn); } return 1; -case 10: -case 11: - return disas_vfp_insn (env, s, insn); default: break; } @@ -6753,6 +6758,13 @@ static void disas_arm_insn(CPUARMState * env, DisasContext *s) goto illegal_op; return; } +if ((insn 0x0f000e10) == 0x0e000a00) { +/* VFP. */ +if (disas_vfp_insn(env, s, insn)) { +goto illegal_op; +} +return; +} if (((insn 0x0f30f000) == 0x0510f000) || ((insn 0x0f30f010) == 0x0710f000)) { if ((insn (1 22)) == 0) { @@ -8033,9 +8045,15 @@ static void disas_arm_insn(CPUARMState * env, DisasContext *s) case 0xc: case 0xd: case 0xe: -/* Coprocessor. */ -if (disas_coproc_insn(env, s, insn)) +if (((insn 8) 0xe) == 10) { +/* VFP. */ +if (disas_vfp_insn(env, s, insn)) { +goto illegal_op; +} +} else if (disas_coproc_insn(env, s, insn)) { +/* Coprocessor. 
*/ goto illegal_op; +} break; case 0xf: /* swi */ @@ -8765,6 +8783,10 @@ static int disas_thumb2_insn(CPUARMState *env, DisasContext *s, uint16_t insn_hw insn = (insn 0xe2ff) | ((insn (1 28)) 4) | (1 28); if (disas_neon_data_insn(env, s, insn)) goto illegal_op; +} else if (((insn 8) 0xe) == 10) { +if (disas_vfp_insn(env, s, insn)) { +goto illegal_op; +} } else { if (insn (1 28)) goto illegal_op; -- 1.8.1.4
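The decode tests in the patch above lost their operators in this archive copy. The sketch below assumes the missing operators are the usual `&` and `>>`; the function names are hypothetical and exist only to show what each test matches.

```c
#include <stdint.h>

/* Nonzero when bits [11:8] of a coprocessor-space encoding select
 * coprocessor 10 or 11, the two numbers used by VFP. Masking with 0xe
 * drops bit 8, so both values compare equal to 10. */
static int is_vfp_coproc(uint32_t insn)
{
    return ((insn >> 8) & 0xe) == 10;
}

/* Nonzero when the encoding matches the mask/value pair used in
 * disas_arm_insn to route an instruction to disas_vfp_insn. */
static int matches_vfp_dp_pattern(uint32_t insn)
{
    return (insn & 0x0f000e10) == 0x0e000a00;
}
```

Coprocessor numbers 10 (0xa) and 11 (0xb) differ only in bit 8, which is why a single comparison replaces the two `case` labels removed from disas_coproc_insn.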
[Qemu-devel] [PATCH v7 5/6] target-arm: Implement ARMv8 FP VMAXNM and VMINNM instructions.
This adds support for the ARMv8 floating point VMAXNM and VMINNM instructions. --- target-arm/helper.c| 25 + target-arm/helper.h| 5 + target-arm/translate.c | 50 ++ 3 files changed, 80 insertions(+) Changes in v7: - Break VMINNM/VMAXNM handling out into function - Use new softfloat routines for minnum/maxnum - Use extract32 to decode insn - Fix brace style diff --git a/target-arm/helper.c b/target-arm/helper.c index 3445813..7507240 100644 --- a/target-arm/helper.c +++ b/target-arm/helper.c @@ -4079,3 +4079,28 @@ float64 VFP_HELPER(muladd, d)(float64 a, float64 b, float64 c, void *fpstp) float_status *fpst = fpstp; return float64_muladd(a, b, c, 0, fpst); } + +/* ARMv8 VMAXNM/VMINNM */ +float32 VFP_HELPER(maxnm, s)(float32 a, float32 b, void *fpstp) +{ +float_status *fpst = fpstp; +return float32_maxnum(a, b, fpst); +} + +float64 VFP_HELPER(maxnm, d)(float64 a, float64 b, void *fpstp) +{ +float_status *fpst = fpstp; +return float64_maxnum(a, b, fpst); +} + +float32 VFP_HELPER(minnm, s)(float32 a, float32 b, void *fpstp) +{ +float_status *fpst = fpstp; +return float32_minnum(a, b, fpst); +} + +float64 VFP_HELPER(minnm, d)(float64 a, float64 b, void *fpstp) +{ +float_status *fpst = fpstp; +return float64_minnum(a, b, fpst); +} diff --git a/target-arm/helper.h b/target-arm/helper.h index cac9564..d459a39 100644 --- a/target-arm/helper.h +++ b/target-arm/helper.h @@ -132,6 +132,11 @@ DEF_HELPER_2(neon_fcvt_f32_to_f16, i32, f32, env) DEF_HELPER_4(vfp_muladdd, f64, f64, f64, f64, ptr) DEF_HELPER_4(vfp_muladds, f32, f32, f32, f32, ptr) +DEF_HELPER_3(vfp_maxnmd, f64, f64, f64, ptr) +DEF_HELPER_3(vfp_maxnms, f32, f32, f32, ptr) +DEF_HELPER_3(vfp_minnmd, f64, f64, f64, ptr) +DEF_HELPER_3(vfp_minnms, f32, f32, f32, ptr) + DEF_HELPER_3(recps_f32, f32, f32, f32, env) DEF_HELPER_3(rsqrts_f32, f32, f32, f32, env) DEF_HELPER_2(recpe_f32, f32, f32, env) diff --git a/target-arm/translate.c b/target-arm/translate.c index 0a22ad8..9a8069e 100644 --- a/target-arm/translate.c +++ 
b/target-arm/translate.c @@ -2723,6 +2723,54 @@ static int handle_vsel(uint32_t insn, uint32_t rd, uint32_t rn, uint32_t rm, return 0; } +static int handle_vminmaxnm(uint32_t insn, uint32_t rd, uint32_t rn, +uint32_t rm, uint32_t dp) +{ +uint32_t vmin = extract32(insn, 6, 1); +TCGv_ptr fpst = get_fpstatus_ptr(0); + +if (dp) { +TCGv_i64 frn, frm, dest; + +frn = tcg_temp_new_i64(); +frm = tcg_temp_new_i64(); +dest = tcg_temp_new_i64(); + +tcg_gen_ld_f64(frn, cpu_env, vfp_reg_offset(dp, rn)); +tcg_gen_ld_f64(frm, cpu_env, vfp_reg_offset(dp, rm)); +if (vmin) { +gen_helper_vfp_minnmd(dest, frn, frm, fpst); +} else { +gen_helper_vfp_maxnmd(dest, frn, frm, fpst); +} +tcg_gen_st_f64(dest, cpu_env, vfp_reg_offset(dp, rd)); +tcg_temp_free_i64(frn); +tcg_temp_free_i64(frm); +tcg_temp_free_i64(dest); +} else { +TCGv_i32 frn, frm, dest; + +frn = tcg_temp_new_i32(); +frm = tcg_temp_new_i32(); +dest = tcg_temp_new_i32(); + +tcg_gen_ld_f32(frn, cpu_env, vfp_reg_offset(dp, rn)); +tcg_gen_ld_f32(frm, cpu_env, vfp_reg_offset(dp, rm)); +if (vmin) { +gen_helper_vfp_minnms(dest, frn, frm, fpst); +} else { +gen_helper_vfp_maxnms(dest, frn, frm, fpst); +} +tcg_gen_st_f32(dest, cpu_env, vfp_reg_offset(dp, rd)); +tcg_temp_free_i32(frn); +tcg_temp_free_i32(frm); +tcg_temp_free_i32(dest); +} + +tcg_temp_free_ptr(fpst); +return 0; +} + static int disas_vfp_v8_insn(CPUARMState *env, DisasContext *s, uint32_t insn) { uint32_t rd, rn, rm, dp = extract32(insn, 8, 1); @@ -2743,6 +2791,8 @@ static int disas_vfp_v8_insn(CPUARMState *env, DisasContext *s, uint32_t insn) if ((insn 0x0f800e50) == 0x0e000a00) { return handle_vsel(insn, rd, rn, rm, dp); +} else if ((insn 0x0fb00e10) == 0x0e800a00) { +return handle_vminmaxnm(insn, rd, rn, rm, dp); } return 1; } -- 1.8.1.4
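The v7 patch decodes fields with extract32(insn, start, length) instead of open-coded shifts. The following is a minimal stand-alone sketch of that kind of helper, written for illustration under the name extract32_sketch; it is not copied from QEMU's headers and leaves out the argument assertions a production version would want.

```c
#include <stdint.h>

/* Return 'length' bits of 'value' starting at bit 'start'.
 * Assumes 0 < length <= 32 - start, so neither shift count reaches 32. */
static uint32_t extract32_sketch(uint32_t value, int start, int length)
{
    return (value >> start) & (~0u >> (32 - length));
}

/* Fields used by the VMAXNM/VMINNM handler in the patch. */
static uint32_t insn_is_double(uint32_t insn)
{
    return extract32_sketch(insn, 8, 1);   /* dp: bit 8 */
}

static uint32_t insn_is_min(uint32_t insn)
{
    return extract32_sketch(insn, 6, 1);   /* vmin: bit 6 */
}
```

Naming the start bit and width at the call site makes the decode easier to check against the encoding diagram than a bare shift and mask.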
[Qemu-devel] [PATCH v7 3/6] softfloat: Remove unused argument from MINMAX macro.
The nan_exp argument is not used, so remove it.
---
 fpu/softfloat.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

Changes in v7:
 - New patch

diff --git a/fpu/softfloat.c b/fpu/softfloat.c
index 7ba51b6..97bf627 100644
--- a/fpu/softfloat.c
+++ b/fpu/softfloat.c
@@ -6706,7 +6706,7 @@ int float128_compare_quiet( float128 a, float128 b STATUS_PARAM )
  * 'compare and pick one input' because that would mishandle
  * NaNs and +0 vs -0.
  */
-#define MINMAX(s, nan_exp)                                              \
+#define MINMAX(s)                                                       \
 INLINE float ## s float ## s ## _minmax(float ## s a, float ## s b,     \
                                         int ismin STATUS_PARAM )        \
 {                                                                       \
@@ -6747,8 +6747,8 @@ float ## s float ## s ## _max(float ## s a, float ## s b STATUS_PARAM) \
     return float ## s ## _minmax(a, b, 0 STATUS_VAR);                   \
 }
 
-MINMAX(32, 0xff)
-MINMAX(64, 0x7ff)
+MINMAX(32)
+MINMAX(64)
 
 
 /* Multiply A by 2 raised to the power N.  */
-- 
1.8.1.4
[Qemu-devel] [PATCH v7 2/6] target-arm: Implement ARMv8 VSEL instruction.
This adds support for the VSEL floating point selection instruction which was added in ARMv8. --- target-arm/translate.c | 135 - 1 file changed, 134 insertions(+), 1 deletion(-) Changes in v7: - Break out VSEL handling into a function - Properly sign extend VF and NF to 64bit - Use extract32 to decode insn - Fix brace style diff --git a/target-arm/translate.c b/target-arm/translate.c index f63e89d..0a22ad8 100644 --- a/target-arm/translate.c +++ b/target-arm/translate.c @@ -2614,6 +2614,139 @@ static TCGv_i32 gen_load_and_replicate(DisasContext *s, TCGv_i32 addr, int size) return tmp; } +static int handle_vsel(uint32_t insn, uint32_t rd, uint32_t rn, uint32_t rm, + uint32_t dp) +{ +uint32_t cc = extract32(insn, 20, 2); + +if (dp) { +TCGv_i64 frn, frm, dest; +TCGv_i64 tmp, zero, zf, nf, vf; + +zero = tcg_const_i64(0); + +frn = tcg_temp_new_i64(); +frm = tcg_temp_new_i64(); +dest = tcg_temp_new_i64(); + +zf = tcg_temp_new_i64(); +nf = tcg_temp_new_i64(); +vf = tcg_temp_new_i64(); + +tcg_gen_extu_i32_i64(zf, cpu_ZF); +tcg_gen_ext_i32_i64(nf, cpu_NF); +tcg_gen_ext_i32_i64(vf, cpu_VF); + +tcg_gen_ld_f64(frn, cpu_env, vfp_reg_offset(dp, rn)); +tcg_gen_ld_f64(frm, cpu_env, vfp_reg_offset(dp, rm)); +switch (cc) { +case 0: /* eq: Z */ +tcg_gen_movcond_i64(TCG_COND_EQ, dest, zf, zero, +frn, frm); +break; +case 1: /* vs: V */ +tcg_gen_movcond_i64(TCG_COND_LT, dest, vf, zero, +frn, frm); +break; +case 2: /* ge: N == V - N ^ V == 0 */ +tmp = tcg_temp_new_i64(); +tcg_gen_xor_i64(tmp, vf, nf); +tcg_gen_movcond_i64(TCG_COND_GE, dest, tmp, zero, +frn, frm); +tcg_temp_free_i64(tmp); +break; +case 3: /* gt: !Z N == V */ +tcg_gen_movcond_i64(TCG_COND_NE, dest, zf, zero, +frn, frm); +tmp = tcg_temp_new_i64(); +tcg_gen_xor_i64(tmp, vf, nf); +tcg_gen_movcond_i64(TCG_COND_GE, dest, tmp, zero, +dest, frm); +tcg_temp_free_i64(tmp); +break; +} +tcg_gen_st_f64(dest, cpu_env, vfp_reg_offset(dp, rd)); +tcg_temp_free_i64(frn); +tcg_temp_free_i64(frm); +tcg_temp_free_i64(dest); + 
+tcg_temp_free_i64(zf); +tcg_temp_free_i64(nf); +tcg_temp_free_i64(vf); + +tcg_temp_free_i64(zero); +} else { +TCGv_i32 frn, frm, dest; +TCGv_i32 tmp, zero; + +zero = tcg_const_i32(0); + +frn = tcg_temp_new_i32(); +frm = tcg_temp_new_i32(); +dest = tcg_temp_new_i32(); +tcg_gen_ld_f32(frn, cpu_env, vfp_reg_offset(dp, rn)); +tcg_gen_ld_f32(frm, cpu_env, vfp_reg_offset(dp, rm)); +switch (cc) { +case 0: /* eq: Z */ +tcg_gen_movcond_i32(TCG_COND_EQ, dest, cpu_ZF, zero, +frn, frm); +break; +case 1: /* vs: V */ +tcg_gen_movcond_i32(TCG_COND_LT, dest, cpu_VF, zero, +frn, frm); +break; +case 2: /* ge: N == V - N ^ V == 0 */ +tmp = tcg_temp_new_i32(); +tcg_gen_xor_i32(tmp, cpu_VF, cpu_NF); +tcg_gen_movcond_i32(TCG_COND_GE, dest, tmp, zero, +frn, frm); +tcg_temp_free_i32(tmp); +break; +case 3: /* gt: !Z N == V */ +tcg_gen_movcond_i32(TCG_COND_NE, dest, cpu_ZF, zero, +frn, frm); +tmp = tcg_temp_new_i32(); +tcg_gen_xor_i32(tmp, cpu_VF, cpu_NF); +tcg_gen_movcond_i32(TCG_COND_GE, dest, tmp, zero, +dest, frm); +tcg_temp_free_i32(tmp); +break; +} +tcg_gen_st_f32(dest, cpu_env, vfp_reg_offset(dp, rd)); +tcg_temp_free_i32(frn); +tcg_temp_free_i32(frm); +tcg_temp_free_i32(dest); + +tcg_temp_free_i32(zero); +} + +return 0; +} + +static int disas_vfp_v8_insn(CPUARMState *env, DisasContext *s, uint32_t insn) +{ +uint32_t rd, rn, rm, dp = extract32(insn, 8, 1); + +if (!arm_feature(env, ARM_FEATURE_V8)) { +return 1; +} + +if (dp) { +VFP_DREG_D(rd, insn); +VFP_DREG_N(rn, insn); +VFP_DREG_M(rm, insn); +} else { +rd = VFP_SREG_D(insn); +rn = VFP_SREG_N(insn); +rm = VFP_SREG_M(insn); +} + +if ((insn 0x0f800e50) == 0x0e000a00) { +return handle_vsel(insn, rd, rn, rm, dp); +} +return 1; +} + /* Disassemble a VFP instruction. Returns nonzero if an
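The four cases in handle_vsel correspond to the condition codes eq, vs, ge and gt. As a rough reference model, assuming plain boolean N, Z and V flags rather than QEMU's internal flag variables, the selection can be written as below; vsel_cond_holds and vsel_model are hypothetical names used only for this sketch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Evaluate the two-bit VSEL condition field against architectural flags. */
static bool vsel_cond_holds(uint32_t cc, bool n, bool z, bool v)
{
    switch (cc & 3) {
    case 0:                     /* eq: Z */
        return z;
    case 1:                     /* vs: V */
        return v;
    case 2:                     /* ge: N == V */
        return n == v;
    default:                    /* gt: !Z && N == V */
        return !z && n == v;
    }
}

/* VSEL writes Rn when the condition holds and Rm otherwise. */
static uint64_t vsel_model(uint32_t cc, bool n, bool z, bool v,
                           uint64_t frn, uint64_t frm)
{
    return vsel_cond_holds(cc, n, z, v) ? frn : frm;
}
```

The TCG code in the patch looks different because QEMU keeps the flags in a derived form: cpu_ZF is zero when Z is set, and N and V live in the sign bit of cpu_NF and cpu_VF. That is why eq compares ZF against zero with TCG_COND_EQ, vs tests VF with TCG_COND_LT, and the double-precision path has to sign extend NF and VF to 64 bits before comparing.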
[Qemu-devel] [PATCH v7 6/6] target-arm: Implement ARMv8 SIMD VMAXNM and VMINNM instructions.
This adds support for the ARMv8 Advanced SIMD VMAXNM and VMINNM instructions. Signed-off-by: Will Newton will.new...@linaro.org --- target-arm/helper.h | 3 +++ target-arm/neon_helper.c | 16 target-arm/translate.c | 31 ++- 3 files changed, 41 insertions(+), 9 deletions(-) Changes in v7: - Use new softfloat routines for minnum/maxnum - Rename NEON_3R_VRECPS_VRSQRTS to NEON_3R_FLOAT_MISC - Fix brace style diff --git a/target-arm/helper.h b/target-arm/helper.h index d459a39..3ecbbd2 100644 --- a/target-arm/helper.h +++ b/target-arm/helper.h @@ -355,6 +355,9 @@ DEF_HELPER_3(neon_cgt_f32, i32, i32, i32, ptr) DEF_HELPER_3(neon_acge_f32, i32, i32, i32, ptr) DEF_HELPER_3(neon_acgt_f32, i32, i32, i32, ptr) +DEF_HELPER_3(neon_maxnm_f32, i32, i32, i32, ptr) +DEF_HELPER_3(neon_minnm_f32, i32, i32, i32, ptr) + /* iwmmxt_helper.c */ DEF_HELPER_2(iwmmxt_maddsq, i64, i64, i64) DEF_HELPER_2(iwmmxt_madduq, i64, i64, i64) diff --git a/target-arm/neon_helper.c b/target-arm/neon_helper.c index b028cc2..06e6894 100644 --- a/target-arm/neon_helper.c +++ b/target-arm/neon_helper.c @@ -2008,3 +2008,19 @@ void HELPER(neon_zip16)(CPUARMState *env, uint32_t rd, uint32_t rm) env-vfp.regs[rm] = make_float64(m0); env-vfp.regs[rd] = make_float64(d0); } + +uint32_t HELPER(neon_maxnm_f32)(uint32_t a, uint32_t b, void *fpstp) +{ +float_status *fpst = fpstp; +float32 af = make_float32(a); +float32 bf = make_float32(b); +return float32_val(float32_maxnum(af, bf, fpst)); +} + +uint32_t HELPER(neon_minnm_f32)(uint32_t a, uint32_t b, void *fpstp) +{ +float_status *fpst = fpstp; +float32 af = make_float32(a); +float32 bf = make_float32(b); +return float32_val(float32_minnum(af, bf, fpst)); +} diff --git a/target-arm/translate.c b/target-arm/translate.c index 9a8069e..7d9213e 100644 --- a/target-arm/translate.c +++ b/target-arm/translate.c @@ -4553,7 +4553,7 @@ static void gen_neon_narrow_op(int op, int u, int size, #define NEON_3R_FLOAT_CMP 28 /* float VCEQ, VCGE, VCGT */ #define NEON_3R_FLOAT_ACMP 29 /* 
float VACGE, VACGT, VACLE, VACLT */ #define NEON_3R_FLOAT_MINMAX 30 /* float VMIN, VMAX */ -#define NEON_3R_VRECPS_VRSQRTS 31 /* float VRECPS, VRSQRTS */ +#define NEON_3R_FLOAT_MISC 31 /* float VRECPS, VRSQRTS, VMAXNM/MINNM */ static const uint8_t neon_3r_sizes[] = { [NEON_3R_VHADD] = 0x7, @@ -4586,7 +4586,7 @@ static const uint8_t neon_3r_sizes[] = { [NEON_3R_FLOAT_CMP] = 0x5, /* size bit 1 encodes op */ [NEON_3R_FLOAT_ACMP] = 0x5, /* size bit 1 encodes op */ [NEON_3R_FLOAT_MINMAX] = 0x5, /* size bit 1 encodes op */ -[NEON_3R_VRECPS_VRSQRTS] = 0x5, /* size bit 1 encodes op */ +[NEON_3R_FLOAT_MISC] = 0x5, /* size bit 1 encodes op */ }; /* Symbolic constants for op fields for Neon 2-register miscellaneous. @@ -4847,8 +4847,9 @@ static int disas_neon_data_insn(CPUARMState * env, DisasContext *s, uint32_t ins return 1; } break; -case NEON_3R_VRECPS_VRSQRTS: -if (u) { +case NEON_3R_FLOAT_MISC: +/* VMAXNM/VMINNM in ARMv8 */ +if (u (!arm_feature(env, ARM_FEATURE_V8) || (size 1))) { return 1; } break; @@ -5137,11 +5138,23 @@ static int disas_neon_data_insn(CPUARMState * env, DisasContext *s, uint32_t ins tcg_temp_free_ptr(fpstatus); break; } -case NEON_3R_VRECPS_VRSQRTS: -if (size == 0) -gen_helper_recps_f32(tmp, tmp, tmp2, cpu_env); -else -gen_helper_rsqrts_f32(tmp, tmp, tmp2, cpu_env); +case NEON_3R_FLOAT_MISC: +if (u) { +/* VMAXNM/VMINNM */ +TCGv_ptr fpstatus = get_fpstatus_ptr(1); +if (size == 0) { +gen_helper_neon_maxnm_f32(tmp, tmp, tmp2, fpstatus); +} else { +gen_helper_neon_minnm_f32(tmp, tmp, tmp2, fpstatus); +} +tcg_temp_free_ptr(fpstatus); +} else { +if (size == 0) { +gen_helper_recps_f32(tmp, tmp, tmp2, cpu_env); +} else { +gen_helper_rsqrts_f32(tmp, tmp, tmp2, cpu_env); + } +} break; case NEON_3R_VFM: { -- 1.8.1.4
[Qemu-devel] [PATCH v7 4/6] softfloat: Add minNum() and maxNum() functions to softfloat.
Add floatnn_minnum() and floatnn_maxnum() functions which are equivalent to the minNum() and maxNum() functions from IEEE 754-2008. They are similar to min() and max() but differ in the handling of QNaN arguments. --- fpu/softfloat.c | 54 + include/fpu/softfloat.h | 4 2 files changed, 58 insertions(+) Changes in v7: - New patch diff --git a/fpu/softfloat.c b/fpu/softfloat.c index 97bf627..9834927 100644 --- a/fpu/softfloat.c +++ b/fpu/softfloat.c @@ -6750,6 +6750,60 @@ float ## s float ## s ## _max(float ## s a, float ## s b STATUS_PARAM) \ MINMAX(32) MINMAX(64) +/* minnum() and maxnum() functions. These are similar to the min() + * and max() functions but if one of the arguments is a QNaN and + * the other is numerical then the numerical argument is returned. + */ +#define MINMAXNUM(s) \ +INLINE float ## s float ## s ## _minmaxnum(float ## s a, float ## s b, \ + int ismin STATUS_PARAM )\ +{ \ +flag aSign, bSign; \ +uint ## s ## _t av, bv;\ +a = float ## s ## _squash_input_denormal(a STATUS_VAR);\ +b = float ## s ## _squash_input_denormal(b STATUS_VAR);\ +if (float ## s ## _is_quiet_nan(a) \ +!float ## s ##_is_quiet_nan(b)) { \ +return b; \ +} else if (float ## s ## _is_quiet_nan(b)\ + !float ## s ## _is_quiet_nan(a)) { \ +return a; \ +} else if (float ## s ## _is_any_nan(a) || \ +float ## s ## _is_any_nan(b)) {\ +return propagateFloat ## s ## NaN(a, b STATUS_VAR);\ +} \ +aSign = extractFloat ## s ## Sign(a); \ +bSign = extractFloat ## s ## Sign(b); \ +av = float ## s ## _val(a);\ +bv = float ## s ## _val(b);\ +if (aSign != bSign) { \ +if (ismin) { \ +return aSign ? a : b; \ +} else { \ +return aSign ? b : a; \ +} \ +} else { \ +if (ismin) { \ +return (aSign ^ (av bv)) ? a : b;\ +} else { \ +return (aSign ^ (av bv)) ? 
b : a;\ +} \ +} \ +} \ + \ +float ## s float ## s ## _minnum(float ## s a, float ## s b STATUS_PARAM) \ +{ \ +return float ## s ## _minmaxnum(a, b, 1 STATUS_VAR); \ +} \ + \ +float ## s float ## s ## _maxnum(float ## s a, float ## s b STATUS_PARAM) \ +{ \ +return float ## s ## _minmaxnum(a, b, 0 STATUS_VAR); \ +} + +MINMAXNUM(32) +MINMAXNUM(64) + /* Multiply A by 2 raised to the power N. */ float32 float32_scalbn( float32 a, int n STATUS_PARAM ) diff --git a/include/fpu/softfloat.h b/include/fpu/softfloat.h index f3927e2..2365274 100644 --- a/include/fpu/softfloat.h +++ b/include/fpu/softfloat.h @@ -302,6 +302,8 @@ int float32_compare( float32, float32 STATUS_PARAM ); int float32_compare_quiet( float32, float32 STATUS_PARAM ); float32 float32_min(float32, float32 STATUS_PARAM); float32 float32_max(float32, float32 STATUS_PARAM); +float32 float32_minnum(float32, float32 STATUS_PARAM); +float32 float32_maxnum(float32, float32
Re: [Qemu-devel] [PATCH v6 1/4] target-arm: Move call to disas_vfp_insn out of disas_coproc_insn.
On 29 November 2013 15:26, Peter Maydell peter.mayd...@linaro.org wrote:
> On 28 November 2013 17:07, Will Newton will.new...@linaro.org wrote:
>> Floating point is an extension to the instruction set rather than a
>> coprocessor, so call it directly from the ARM and Thumb decode
>> functions.
>
> Hi; thanks for these patches. A minor process note: for future
> versions, can you make sure you send patches out properly threaded
> under a cover letter mail? That will help me since some tools I use
> to process patches rely on patchseries coming as a properly threaded
> set of mails.
>
>> Signed-off-by: Will Newton will.new...@linaro.org
>> ---
>>  target-arm/translate.c | 30 +-
>>  1 file changed, 25 insertions(+), 5 deletions(-)
>>
>> Changes in v6:
>>  - Add return after disas_vfp_insn call in disas_arm_insn
>>
>> diff --git a/target-arm/translate.c b/target-arm/translate.c
>> index 5f003e7..5a6c1ea 100644
>> --- a/target-arm/translate.c
>> +++ b/target-arm/translate.c
>> @@ -2636,6 +2636,13 @@ static int disas_vfp_insn(CPUARMState * env, DisasContext *s, uint32_t insn)
>>              && rn != ARM_VFP_MVFR1 && rn != ARM_VFP_MVFR0)
>>              return 1;
>>      }
>> +
>> +    if (extract32(insn, 28, 4) == 0xf) {
>> +        /* Encodings with T=1 (Thumb) or unconditional (ARM):
>> +           only used in v8 and above.  */
>
> Since I'm nitpicking: I prefer the multiline comment form
>   /* text text
>    * line 2
>    */
> though I know we're not always consistent in existing code.
>
>> +        return 1;
>> +    }
>> +
>>      dp = ((insn & 0xf00) == 0xb00);
>>      switch ((insn >> 24) & 0xf) {
>>      case 0xe:
>> @@ -6296,9 +6303,6 @@ static int disas_coproc_insn(CPUARMState * env, DisasContext *s, uint32_t insn)
>>              return disas_dsp_insn(env, s, insn);
>>          }
>>          return 1;
>> -    case 10:
>> -    case 11:
>> -        return disas_vfp_insn (env, s, insn);
>>      default:
>>          break;
>>      }
>> @@ -6753,6 +6757,12 @@ static void disas_arm_insn(CPUARMState * env, DisasContext *s)
>>                  goto illegal_op;
>>              return;
>>          }
>> +        if ((insn & 0x0f000e10) == 0x0e000a00) {
>> +            /* VFP.  */
>> +            if (disas_vfp_insn(env, s, insn))
>> +                goto illegal_op;
>
> Missing braces. (scripts/checkpatch.pl will catch this kind
> of nit.)

I had noticed the checkpatch warnings but as they seemed inconsistent
with the rest of the file I assumed they were a bug in checkpatch!

-- 
Will Newton
Toolchain Working Group, Linaro
[Qemu-devel] [PATCH v6 3/4] target-arm: Implement ARMv8 FP VMAXNM and VMINNM instructions.
This adds support for the ARMv8 floating point VMAXNM and VMINNM instructions. Signed-off-by: Will Newton will.new...@linaro.org --- target-arm/helper.c| 41 + target-arm/helper.h| 5 + target-arm/translate.c | 43 +++ 3 files changed, 89 insertions(+) Changes in v6: - New patch diff --git a/target-arm/helper.c b/target-arm/helper.c index 3445813..e5428d1 100644 --- a/target-arm/helper.c +++ b/target-arm/helper.c @@ -4079,3 +4079,44 @@ float64 VFP_HELPER(muladd, d)(float64 a, float64 b, float64 c, void *fpstp) float_status *fpst = fpstp; return float64_muladd(a, b, c, 0, fpst); } + +/* ARMv8 VMAXNM/VMINNM */ +float32 VFP_HELPER(maxnm, s)(float32 a, float32 b, void *fpstp) +{ +float_status *fpst = fpstp; +if (float32_is_quiet_nan(a) !float32_is_quiet_nan(b)) +return b; +else if (float32_is_quiet_nan(b) !float32_is_quiet_nan(a)) +return a; +return float32_max(a, b, fpst); +} + +float64 VFP_HELPER(maxnm, d)(float64 a, float64 b, void *fpstp) +{ +float_status *fpst = fpstp; +if (float64_is_quiet_nan(a) !float64_is_quiet_nan(b)) +return b; +else if (float64_is_quiet_nan(b) !float64_is_quiet_nan(a)) +return a; +return float64_max(a, b, fpst); +} + +float32 VFP_HELPER(minnm, s)(float32 a, float32 b, void *fpstp) +{ +float_status *fpst = fpstp; +if (float32_is_quiet_nan(a) !float32_is_quiet_nan(b)) +return b; +else if (float32_is_quiet_nan(b) !float32_is_quiet_nan(a)) +return a; +return float32_min(a, b, fpst); +} + +float64 VFP_HELPER(minnm, d)(float64 a, float64 b, void *fpstp) +{ +float_status *fpst = fpstp; +if (float64_is_quiet_nan(a) !float64_is_quiet_nan(b)) +return b; +else if (float64_is_quiet_nan(b) !float64_is_quiet_nan(a)) +return a; +return float64_min(a, b, fpst); +} diff --git a/target-arm/helper.h b/target-arm/helper.h index cac9564..d459a39 100644 --- a/target-arm/helper.h +++ b/target-arm/helper.h @@ -132,6 +132,11 @@ DEF_HELPER_2(neon_fcvt_f32_to_f16, i32, f32, env) DEF_HELPER_4(vfp_muladdd, f64, f64, f64, f64, ptr) DEF_HELPER_4(vfp_muladds, f32, f32, f32, 
f32, ptr) +DEF_HELPER_3(vfp_maxnmd, f64, f64, f64, ptr) +DEF_HELPER_3(vfp_maxnms, f32, f32, f32, ptr) +DEF_HELPER_3(vfp_minnmd, f64, f64, f64, ptr) +DEF_HELPER_3(vfp_minnms, f32, f32, f32, ptr) + DEF_HELPER_3(recps_f32, f32, f32, f32, env) DEF_HELPER_3(rsqrts_f32, f32, f32, f32, env) DEF_HELPER_2(recpe_f32, f32, f32, env) diff --git a/target-arm/translate.c b/target-arm/translate.c index 4e7077e..cac7668 100644 --- a/target-arm/translate.c +++ b/target-arm/translate.c @@ -2738,6 +2738,49 @@ static int disas_vfp_v8_insn(CPUARMState *env, DisasContext *s, uint32_t insn) } return 0; +} else if ((insn 0x0fb00e10) == 0x0e800a00) { +/* vmaxnm/vminnm */ +uint32_t vmin = (insn 6) 1; +TCGv_ptr fpst; +fpst = get_fpstatus_ptr(0); +if (dp) { +TCGv_i64 ftmp1, ftmp2, ftmp3; + +ftmp1 = tcg_temp_new_i64(); +ftmp2 = tcg_temp_new_i64(); +ftmp3 = tcg_temp_new_i64(); + +tcg_gen_ld_f64(ftmp1, cpu_env, vfp_reg_offset(dp, rn)); +tcg_gen_ld_f64(ftmp2, cpu_env, vfp_reg_offset(dp, rm)); +if (vmin) +gen_helper_vfp_minnmd(ftmp3, ftmp1, ftmp2, fpst); +else +gen_helper_vfp_maxnmd(ftmp3, ftmp1, ftmp2, fpst); +tcg_gen_st_f64(ftmp3, cpu_env, vfp_reg_offset(dp, rd)); +tcg_temp_free_i64(ftmp1); +tcg_temp_free_i64(ftmp2); +tcg_temp_free_i64(ftmp3); +} else { +TCGv_i32 ftmp1, ftmp2, ftmp3; + +ftmp1 = tcg_temp_new_i32(); +ftmp2 = tcg_temp_new_i32(); +ftmp3 = tcg_temp_new_i32(); + +tcg_gen_ld_f32(ftmp1, cpu_env, vfp_reg_offset(dp, rn)); +tcg_gen_ld_f32(ftmp2, cpu_env, vfp_reg_offset(dp, rm)); +if (vmin) +gen_helper_vfp_minnms(ftmp3, ftmp1, ftmp2, fpst); +else +gen_helper_vfp_maxnms(ftmp3, ftmp1, ftmp2, fpst); +tcg_gen_st_f32(ftmp3, cpu_env, vfp_reg_offset(dp, rd)); +tcg_temp_free_i32(ftmp1); +tcg_temp_free_i32(ftmp2); +tcg_temp_free_i32(ftmp3); +} + + tcg_temp_free_ptr(fpst); + return 0; } return 1; } -- 1.8.1.4
[Qemu-devel] [PATCH v6 1/4] target-arm: Move call to disas_vfp_insn out of disas_coproc_insn.
Floating point is an extension to the instruction set rather than a coprocessor, so call it directly from the ARM and Thumb decode functions. Signed-off-by: Will Newton will.new...@linaro.org --- target-arm/translate.c | 30 +- 1 file changed, 25 insertions(+), 5 deletions(-) Changes in v6: - Add return after disas_vfp_insn call in disas_arm_insn diff --git a/target-arm/translate.c b/target-arm/translate.c index 5f003e7..5a6c1ea 100644 --- a/target-arm/translate.c +++ b/target-arm/translate.c @@ -2636,6 +2636,13 @@ static int disas_vfp_insn(CPUARMState * env, DisasContext *s, uint32_t insn) rn != ARM_VFP_MVFR1 rn != ARM_VFP_MVFR0) return 1; } + +if (extract32(insn, 28, 4) == 0xf) { +/* Encodings with T=1 (Thumb) or unconditional (ARM): + only used in v8 and above. */ +return 1; +} + dp = ((insn 0xf00) == 0xb00); switch ((insn 24) 0xf) { case 0xe: @@ -6296,9 +6303,6 @@ static int disas_coproc_insn(CPUARMState * env, DisasContext *s, uint32_t insn) return disas_dsp_insn(env, s, insn); } return 1; -case 10: -case 11: - return disas_vfp_insn (env, s, insn); default: break; } @@ -6753,6 +6757,12 @@ static void disas_arm_insn(CPUARMState * env, DisasContext *s) goto illegal_op; return; } +if ((insn 0x0f000e10) == 0x0e000a00) { +/* VFP. */ +if (disas_vfp_insn(env, s, insn)) +goto illegal_op; +return; +} if (((insn 0x0f30f000) == 0x0510f000) || ((insn 0x0f30f010) == 0x0710f000)) { if ((insn (1 22)) == 0) { @@ -8033,9 +8043,15 @@ static void disas_arm_insn(CPUARMState * env, DisasContext *s) case 0xc: case 0xd: case 0xe: -/* Coprocessor. */ -if (disas_coproc_insn(env, s, insn)) +if (((insn 8) 0xe) == 10) { +/* VFP. */ +if (disas_vfp_insn(env, s, insn)) { +goto illegal_op; +} +} else if (disas_coproc_insn(env, s, insn)) { +/* Coprocessor. 
*/ goto illegal_op; +} break; case 0xf: /* swi */ @@ -8765,6 +8781,10 @@ static int disas_thumb2_insn(CPUARMState *env, DisasContext *s, uint16_t insn_hw insn = (insn 0xe2ff) | ((insn (1 28)) 4) | (1 28); if (disas_neon_data_insn(env, s, insn)) goto illegal_op; +} else if (((insn 8) 0xe) == 10) { +if (disas_vfp_insn(env, s, insn)) { +goto illegal_op; +} } else { if (insn (1 28)) goto illegal_op; -- 1.8.1.4
[Qemu-devel] [PATCH v6 2/4] target-arm: Implement ARMv8 VSEL instruction.
This adds support for the VSEL floating point selection instruction which was added in ARMv8. Signed-off-by: Will Newton will.new...@linaro.org --- target-arm/translate.c | 130 - 1 file changed, 129 insertions(+), 1 deletion(-) Changes in v6: - None diff --git a/target-arm/translate.c b/target-arm/translate.c index 5a6c1ea..4e7077e 100644 --- a/target-arm/translate.c +++ b/target-arm/translate.c @@ -2614,6 +2614,134 @@ static TCGv_i32 gen_load_and_replicate(DisasContext *s, TCGv_i32 addr, int size) return tmp; } +static int disas_vfp_v8_insn(CPUARMState *env, DisasContext *s, uint32_t insn) +{ +uint32_t rd, rn, rm, dp = (insn 8) 1; + +if (!arm_feature(env, ARM_FEATURE_V8)) { +return 1; +} + +if (dp) { +VFP_DREG_D(rd, insn); +VFP_DREG_N(rn, insn); +VFP_DREG_M(rm, insn); +} else { +rd = VFP_SREG_D(insn); +rn = VFP_SREG_N(insn); +rm = VFP_SREG_M(insn); +} + +if ((insn 0x0f800e50) == 0x0e000a00) { +/* vsel */ +uint32_t cc = (insn 20) 3; + +if (dp) { +TCGv_i64 ftmp1, ftmp2, ftmp3; +TCGv_i64 tmp, zero, zf, nf, vf; + +zero = tcg_const_i64(0); + +ftmp1 = tcg_temp_new_i64(); +ftmp2 = tcg_temp_new_i64(); +ftmp3 = tcg_temp_new_i64(); + +zf = tcg_temp_new_i64(); +nf = tcg_temp_new_i64(); +vf = tcg_temp_new_i64(); + +tcg_gen_extu_i32_i64(zf, cpu_ZF); +tcg_gen_extu_i32_i64(nf, cpu_NF); +tcg_gen_extu_i32_i64(vf, cpu_VF); + +tcg_gen_ld_f64(ftmp1, cpu_env, vfp_reg_offset(dp, rn)); +tcg_gen_ld_f64(ftmp2, cpu_env, vfp_reg_offset(dp, rm)); +switch (cc) { +case 0: /* eq: Z */ +tcg_gen_movcond_i64(TCG_COND_EQ, ftmp3, zf, zero, +ftmp1, ftmp2); +break; +case 1: /* vs: V */ +tcg_gen_movcond_i64(TCG_COND_LT, ftmp3, vf, zero, +ftmp1, ftmp2); +break; +case 2: /* ge: N == V - N ^ V == 0 */ +tmp = tcg_temp_new_i64(); +tcg_gen_xor_i64(tmp, vf, nf); +tcg_gen_movcond_i64(TCG_COND_GE, ftmp3, tmp, zero, +ftmp1, ftmp2); +tcg_temp_free_i64(tmp); +break; +case 3: /* gt: !Z N == V */ +tcg_gen_movcond_i64(TCG_COND_NE, ftmp3, zf, zero, +ftmp1, ftmp2); +tmp = tcg_temp_new_i64(); +tcg_gen_xor_i64(tmp, vf, 
nf); +tcg_gen_movcond_i64(TCG_COND_GE, ftmp3, tmp, zero, +ftmp3, ftmp2); +tcg_temp_free_i64(tmp); +break; +} +tcg_gen_st_f64(ftmp3, cpu_env, vfp_reg_offset(dp, rd)); +tcg_temp_free_i64(ftmp1); +tcg_temp_free_i64(ftmp2); +tcg_temp_free_i64(ftmp3); + +tcg_temp_free_i64(zf); +tcg_temp_free_i64(nf); +tcg_temp_free_i64(vf); + +tcg_temp_free_i64(zero); +} else { +TCGv_i32 ftmp1, ftmp2, ftmp3; +TCGv_i32 tmp, zero; + +zero = tcg_const_i32(0); + +ftmp1 = tcg_temp_new_i32(); +ftmp2 = tcg_temp_new_i32(); +ftmp3 = tcg_temp_new_i32(); +tcg_gen_ld_f32(ftmp1, cpu_env, vfp_reg_offset(dp, rn)); +tcg_gen_ld_f32(ftmp2, cpu_env, vfp_reg_offset(dp, rm)); +switch (cc) { +case 0: /* eq: Z */ +tcg_gen_movcond_i32(TCG_COND_EQ, ftmp3, cpu_ZF, zero, +ftmp1, ftmp2); +break; +case 1: /* vs: V */ +tcg_gen_movcond_i32(TCG_COND_LT, ftmp3, cpu_VF, zero, +ftmp1, ftmp2); +break; +case 2: /* ge: N == V - N ^ V == 0 */ +tmp = tcg_temp_new_i32(); +tcg_gen_xor_i32(tmp, cpu_VF, cpu_NF); +tcg_gen_movcond_i32(TCG_COND_GE, ftmp3, tmp, zero, +ftmp1, ftmp2); +tcg_temp_free_i32(tmp); +break; +case 3: /* gt: !Z N == V */ +tcg_gen_movcond_i32(TCG_COND_NE, ftmp3, cpu_ZF, zero, +ftmp1, ftmp2); +tmp = tcg_temp_new_i32(); +tcg_gen_xor_i32(tmp, cpu_VF, cpu_NF); +tcg_gen_movcond_i32(TCG_COND_GE, ftmp3, tmp, zero, +ftmp3, ftmp2); +tcg_temp_free_i32(tmp); +break; +} +tcg_gen_st_f32(ftmp3, cpu_env, vfp_reg_offset(dp, rd)); +tcg_temp_free_i32(ftmp1
[Qemu-devel] [PATCH v6 4/4] target-arm: Implement ARMv8 SIMD VMAXNM and VMINNM instructions.
This adds support for the ARMv8 Advanced SIMD VMAXNM and VMINNM
instructions.

Signed-off-by: Will Newton will.new...@linaro.org
---
 target-arm/helper.h      |  3 +++
 target-arm/neon_helper.c | 24
 target-arm/translate.c   | 23 +--
 3 files changed, 44 insertions(+), 6 deletions(-)

diff --git a/target-arm/helper.h b/target-arm/helper.h
index d459a39..3ecbbd2 100644
--- a/target-arm/helper.h
+++ b/target-arm/helper.h
@@ -355,6 +355,9 @@ DEF_HELPER_3(neon_cgt_f32, i32, i32, i32, ptr)
 DEF_HELPER_3(neon_acge_f32, i32, i32, i32, ptr)
 DEF_HELPER_3(neon_acgt_f32, i32, i32, i32, ptr)
 
+DEF_HELPER_3(neon_maxnm_f32, i32, i32, i32, ptr)
+DEF_HELPER_3(neon_minnm_f32, i32, i32, i32, ptr)
+
 /* iwmmxt_helper.c */
 DEF_HELPER_2(iwmmxt_maddsq, i64, i64, i64)
 DEF_HELPER_2(iwmmxt_madduq, i64, i64, i64)
diff --git a/target-arm/neon_helper.c b/target-arm/neon_helper.c
index b028cc2..cc55e83 100644
--- a/target-arm/neon_helper.c
+++ b/target-arm/neon_helper.c
@@ -2008,3 +2008,27 @@ void HELPER(neon_zip16)(CPUARMState *env, uint32_t rd, uint32_t rm)
     env->vfp.regs[rm] = make_float64(m0);
     env->vfp.regs[rd] = make_float64(d0);
 }
+
+uint32_t HELPER(neon_maxnm_f32)(uint32_t a, uint32_t b, void *fpstp)
+{
+    float_status *fpst = fpstp;
+    float32 af = make_float32(a);
+    float32 bf = make_float32(b);
+    if (float32_is_quiet_nan(af) && !float32_is_quiet_nan(bf))
+        return b;
+    else if (float32_is_quiet_nan(bf) && !float32_is_quiet_nan(af))
+        return a;
+    return float32_val(float32_max(af, bf, fpst));
+}
+
+uint32_t HELPER(neon_minnm_f32)(uint32_t a, uint32_t b, void *fpstp)
+{
+    float_status *fpst = fpstp;
+    float32 af = make_float32(a);
+    float32 bf = make_float32(b);
+    if (float32_is_quiet_nan(af) && !float32_is_quiet_nan(bf))
+        return b;
+    else if (float32_is_quiet_nan(bf) && !float32_is_quiet_nan(af))
+        return a;
+    return float32_val(float32_min(af, bf, fpst));
+}
diff --git a/target-arm/translate.c b/target-arm/translate.c
index cac7668..9a4e7f4 100644
--- a/target-arm/translate.c
+++ b/target-arm/translate.c
@@ -4540,7 +4540,7 @@ static void gen_neon_narrow_op(int op, int u, int size,
 #define NEON_3R_FLOAT_CMP 28 /* float VCEQ, VCGE, VCGT */
 #define NEON_3R_FLOAT_ACMP 29 /* float VACGE, VACGT, VACLE, VACLT */
 #define NEON_3R_FLOAT_MINMAX 30 /* float VMIN, VMAX */
-#define NEON_3R_VRECPS_VRSQRTS 31 /* float VRECPS, VRSQRTS */
+#define NEON_3R_VRECPS_VRSQRTS 31 /* float VRECPS, VRSQRTS, VMAXNM/MINNM */
 
 static const uint8_t neon_3r_sizes[] = {
     [NEON_3R_VHADD] = 0x7,
@@ -4835,7 +4835,8 @@ static int disas_neon_data_insn(CPUARMState * env, DisasContext *s, uint32_t ins
             }
             break;
         case NEON_3R_VRECPS_VRSQRTS:
-            if (u) {
+            /* Encoding shared with VMAXNM/VMINNM in ARMv8 */
+            if (u && (!arm_feature(env, ARM_FEATURE_V8) || (size & 1))) {
                 return 1;
             }
             break;
@@ -5125,10 +5126,20 @@ static int disas_neon_data_insn(CPUARMState * env, DisasContext *s, uint32_t ins
             break;
         }
         case NEON_3R_VRECPS_VRSQRTS:
-            if (size == 0)
-                gen_helper_recps_f32(tmp, tmp, tmp2, cpu_env);
-            else
-                gen_helper_rsqrts_f32(tmp, tmp, tmp2, cpu_env);
+            if (u) {
+                /* VMAXNM/VMINNM */
+                TCGv_ptr fpstatus = get_fpstatus_ptr(1);
+                if (size == 0)
+                    gen_helper_neon_maxnm_f32(tmp, tmp, tmp2, fpstatus);
+                else
+                    gen_helper_neon_minnm_f32(tmp, tmp, tmp2, fpstatus);
+                tcg_temp_free_ptr(fpstatus);
+            } else {
+                if (size == 0)
+                    gen_helper_recps_f32(tmp, tmp, tmp2, cpu_env);
+                else
+                    gen_helper_rsqrts_f32(tmp, tmp, tmp2, cpu_env);
+            }
             break;
         case NEON_3R_VFM:
         {
-- 
1.8.1.4
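For readers who want to poke at the selection rule in the two helpers above outside of QEMU, here is a rough stand-alone sketch in plain C. It is not the helper itself: it uses host floats and isnan() instead of softfloat's float32 and float_status, so it cannot tell quiet NaNs from signalling ones, and the names model_maxnm and model_minnm are made up for this illustration.

```c
#include <assert.h>
#include <math.h>

/* Hypothetical host-float model of the rule used by the helpers:
 * when exactly one operand is a NaN, return the other operand;
 * otherwise fall through to an ordinary max/min. */
static float model_maxnm(float a, float b)
{
    if (isnan(a) && !isnan(b)) {
        return b;               /* only a is NaN: prefer the number */
    } else if (isnan(b) && !isnan(a)) {
        return a;               /* only b is NaN: prefer the number */
    }
    /* Both are numbers, or both are NaN (the result is then NaN). */
    return (a > b) ? a : b;
}

static float model_minnm(float a, float b)
{
    if (isnan(a) && !isnan(b)) {
        return b;
    } else if (isnan(b) && !isnan(a)) {
        return a;
    }
    return (a < b) ? a : b;
}
```

Calling model_maxnm(NAN, 1.0f) gives back the number rather than the NaN, which is the difference from a plain max that propagates NaN.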
Re: [Qemu-devel] [PATCH v5 1/2] target-arm: Move call to disas_vfp_insn out of disas_coproc_insn.
On 15 October 2013 16:09, Will Newton will.new...@linaro.org wrote:
>
> Floating point is an extension to the instruction set rather than a
> coprocessor, so call it directly from the ARM and Thumb decode
> functions.
>
> Signed-off-by: Will Newton will.new...@linaro.org
> ---
>  target-arm/translate.c | 29 -
>  1 file changed, 24 insertions(+), 5 deletions(-)
>
> Changes in v5:
>  - Check for high bits set in disas_vfp_insn

Ping?

-- 
Will Newton
Toolchain Working Group, Linaro
[Qemu-devel] [PATCH v5 1/2] target-arm: Move call to disas_vfp_insn out of disas_coproc_insn.
Floating point is an extension to the instruction set rather than a
coprocessor, so call it directly from the ARM and Thumb decode
functions.

Signed-off-by: Will Newton will.new...@linaro.org
---
 target-arm/translate.c | 29 -
 1 file changed, 24 insertions(+), 5 deletions(-)

Changes in v5:
 - Check for high bits set in disas_vfp_insn

diff --git a/target-arm/translate.c b/target-arm/translate.c
index 5f003e7..c04d2cf 100644
--- a/target-arm/translate.c
+++ b/target-arm/translate.c
@@ -2636,6 +2636,13 @@ static int disas_vfp_insn(CPUARMState * env, DisasContext *s, uint32_t insn)
                 && rn != ARM_VFP_MVFR1 && rn != ARM_VFP_MVFR0)
                 return 1;
         }
+
+    if (extract32(insn, 28, 4) == 0xf) {
+        /* Encodings with T=1 (Thumb) or unconditional (ARM):
+           only used in v8 and above.  */
+        return 1;
+    }
+
     dp = ((insn & 0xf00) == 0xb00);
     switch ((insn >> 24) & 0xf) {
     case 0xe:
@@ -6296,9 +6303,6 @@ static int disas_coproc_insn(CPUARMState * env, DisasContext *s, uint32_t insn)
             return disas_dsp_insn(env, s, insn);
         }
         return 1;
-    case 10:
-    case 11:
-        return disas_vfp_insn (env, s, insn);
     default:
         break;
     }
@@ -6753,6 +6757,11 @@ static void disas_arm_insn(CPUARMState * env, DisasContext *s)
                 goto illegal_op;
             return;
         }
+        if ((insn & 0x0f000e10) == 0x0e000a00) {
+            /* VFP. */
+            if (disas_vfp_insn(env, s, insn))
+                goto illegal_op;
+        }
         if (((insn & 0x0f30f000) == 0x0510f000) ||
             ((insn & 0x0f30f010) == 0x0710f000)) {
             if ((insn & (1 << 22)) == 0) {
@@ -8033,9 +8042,15 @@ static void disas_arm_insn(CPUARMState * env, DisasContext *s)
         case 0xc:
         case 0xd:
         case 0xe:
-            /* Coprocessor. */
-            if (disas_coproc_insn(env, s, insn))
+            if (((insn >> 8) & 0xe) == 10) {
+                /* VFP. */
+                if (disas_vfp_insn(env, s, insn)) {
+                    goto illegal_op;
+                }
+            } else if (disas_coproc_insn(env, s, insn)) {
+                /* Coprocessor. */
                 goto illegal_op;
+            }
             break;
         case 0xf:
             /* swi */
@@ -8765,6 +8780,10 @@ static int disas_thumb2_insn(CPUARMState *env, DisasContext *s, uint16_t insn_hw
             insn = (insn & 0xe2ffffff) | ((insn & (1 << 28)) >> 4) | (1 << 28);
             if (disas_neon_data_insn(env, s, insn))
                 goto illegal_op;
+        } else if (((insn >> 8) & 0xe) == 10) {
+            if (disas_vfp_insn(env, s, insn)) {
+                goto illegal_op;
+            }
         } else {
             if (insn & (1 << 28))
                 goto illegal_op;
-- 
1.8.1.4
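The two bit tests this patch relies on can be tried in isolation. The sketch below is only an illustration of the masks: field32 is a local stand-in written for this example (it is not QEMU's extract32, though it is meant to behave the same for lengths 1 to 32), and the predicate names are invented.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Local stand-in: return 'length' bits of 'value' starting at bit 'start'.
 * Assumes 1 <= length <= 32. */
static uint32_t field32(uint32_t value, int start, int length)
{
    return (value >> start) & (~0u >> (32 - length));
}

/* Bits [11:8] hold the coprocessor number. Masking with 0xe drops bit 0,
 * so both 10 (0b1010) and 11 (0b1011) compare equal to 10. */
static bool is_vfp_coproc(uint32_t insn)
{
    return ((insn >> 8) & 0xe) == 10;
}

/* Top nibble 0xf: the case the v5 change rejects inside disas_vfp_insn. */
static bool has_high_bits_set(uint32_t insn)
{
    return field32(insn, 28, 4) == 0xf;
}
```

A single masked compare therefore replaces the old pair of case 10 / case 11 labels in disas_coproc_insn.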
[Qemu-devel] [PATCH v5 2/2] target-arm: Implement ARMv8 VSEL instruction.
This adds support for the VSEL floating point selection instruction which was added in ARMv8. Signed-off-by: Will Newton will.new...@linaro.org --- target-arm/translate.c | 130 - 1 file changed, 129 insertions(+), 1 deletion(-) Changes in v5: - Break out VSEL decode into separate disas_vfp_v8_insn function diff --git a/target-arm/translate.c b/target-arm/translate.c index c04d2cf..2b4020f 100644 --- a/target-arm/translate.c +++ b/target-arm/translate.c @@ -2614,6 +2614,134 @@ static TCGv_i32 gen_load_and_replicate(DisasContext *s, TCGv_i32 addr, int size) return tmp; } +static int disas_vfp_v8_insn(CPUARMState *env, DisasContext *s, uint32_t insn) +{ +uint32_t rd, rn, rm, dp = (insn 8) 1; + +if (!arm_feature(env, ARM_FEATURE_V8)) { +return 1; +} + +if (dp) { +VFP_DREG_D(rd, insn); +VFP_DREG_N(rn, insn); +VFP_DREG_M(rm, insn); +} else { +rd = VFP_SREG_D(insn); +rn = VFP_SREG_N(insn); +rm = VFP_SREG_M(insn); +} + +if ((insn 0x0f800e50) == 0x0e000a00) { +/* vsel */ +uint32_t cc = (insn 20) 3; + +if (dp) { +TCGv_i64 ftmp1, ftmp2, ftmp3; +TCGv_i64 tmp, zero, zf, nf, vf; + +zero = tcg_const_i64(0); + +ftmp1 = tcg_temp_new_i64(); +ftmp2 = tcg_temp_new_i64(); +ftmp3 = tcg_temp_new_i64(); + +zf = tcg_temp_new_i64(); +nf = tcg_temp_new_i64(); +vf = tcg_temp_new_i64(); + +tcg_gen_extu_i32_i64(zf, cpu_ZF); +tcg_gen_extu_i32_i64(nf, cpu_NF); +tcg_gen_extu_i32_i64(vf, cpu_VF); + +tcg_gen_ld_f64(ftmp1, cpu_env, vfp_reg_offset(dp, rn)); +tcg_gen_ld_f64(ftmp2, cpu_env, vfp_reg_offset(dp, rm)); +switch (cc) { +case 0: /* eq: Z */ +tcg_gen_movcond_i64(TCG_COND_EQ, ftmp3, zf, zero, +ftmp1, ftmp2); +break; +case 1: /* vs: V */ +tcg_gen_movcond_i64(TCG_COND_LT, ftmp3, vf, zero, +ftmp1, ftmp2); +break; +case 2: /* ge: N == V - N ^ V == 0 */ +tmp = tcg_temp_new_i64(); +tcg_gen_xor_i64(tmp, vf, nf); +tcg_gen_movcond_i64(TCG_COND_GE, ftmp3, tmp, zero, +ftmp1, ftmp2); +tcg_temp_free_i64(tmp); +break; +case 3: /* gt: !Z N == V */ +tcg_gen_movcond_i64(TCG_COND_NE, ftmp3, zf, zero, +ftmp1, 
ftmp2); +tmp = tcg_temp_new_i64(); +tcg_gen_xor_i64(tmp, vf, nf); +tcg_gen_movcond_i64(TCG_COND_GE, ftmp3, tmp, zero, +ftmp3, ftmp2); +tcg_temp_free_i64(tmp); +break; +} +tcg_gen_st_f64(ftmp3, cpu_env, vfp_reg_offset(dp, rd)); +tcg_temp_free_i64(ftmp1); +tcg_temp_free_i64(ftmp2); +tcg_temp_free_i64(ftmp3); + +tcg_temp_free_i64(zf); +tcg_temp_free_i64(nf); +tcg_temp_free_i64(vf); + +tcg_temp_free_i64(zero); +} else { +TCGv_i32 ftmp1, ftmp2, ftmp3; +TCGv_i32 tmp, zero; + +zero = tcg_const_i32(0); + +ftmp1 = tcg_temp_new_i32(); +ftmp2 = tcg_temp_new_i32(); +ftmp3 = tcg_temp_new_i32(); +tcg_gen_ld_f32(ftmp1, cpu_env, vfp_reg_offset(dp, rn)); +tcg_gen_ld_f32(ftmp2, cpu_env, vfp_reg_offset(dp, rm)); +switch (cc) { +case 0: /* eq: Z */ +tcg_gen_movcond_i32(TCG_COND_EQ, ftmp3, cpu_ZF, zero, +ftmp1, ftmp2); +break; +case 1: /* vs: V */ +tcg_gen_movcond_i32(TCG_COND_LT, ftmp3, cpu_VF, zero, +ftmp1, ftmp2); +break; +case 2: /* ge: N == V - N ^ V == 0 */ +tmp = tcg_temp_new_i32(); +tcg_gen_xor_i32(tmp, cpu_VF, cpu_NF); +tcg_gen_movcond_i32(TCG_COND_GE, ftmp3, tmp, zero, +ftmp1, ftmp2); +tcg_temp_free_i32(tmp); +break; +case 3: /* gt: !Z N == V */ +tcg_gen_movcond_i32(TCG_COND_NE, ftmp3, cpu_ZF, zero, +ftmp1, ftmp2); +tmp = tcg_temp_new_i32(); +tcg_gen_xor_i32(tmp, cpu_VF, cpu_NF); +tcg_gen_movcond_i32(TCG_COND_GE, ftmp3, tmp, zero, +ftmp3, ftmp2); +tcg_temp_free_i32(tmp); +break; +} +tcg_gen_st_f32(ftmp3, cpu_env, vfp_reg_offset(dp, rd
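As a reading aid for the movcond sequences in the single-precision path of this patch, here is a small plain-C model of the selection. It is a sketch under assumptions taken from the comparisons in the patch itself: Z counts as set when the ZF value is zero, and N and V are the top bit of the 32-bit NF and VF values. The function names are made up and nothing here is TCG code.

```c
#include <assert.h>
#include <stdint.h>

/* Top bit of a 32-bit flag word. */
static unsigned top_bit(uint32_t x)
{
    return x >> 31;
}

/* Model of VSEL: return vn when condition 'cc' holds, else vm.
 * cc: 0 = eq, 1 = vs, 2 = ge, 3 = gt (the order used in the patch). */
static uint32_t model_vsel(unsigned cc, uint32_t zf, uint32_t nf, uint32_t vf,
                           uint32_t vn, uint32_t vm)
{
    /* N == V exactly when the top bit of (VF ^ NF) is clear. */
    int n_eq_v = top_bit(vf ^ nf) == 0;

    switch (cc & 3) {
    case 0:                         /* eq: Z */
        return zf == 0 ? vn : vm;
    case 1:                         /* vs: V */
        return top_bit(vf) ? vn : vm;
    case 2:                         /* ge: N == V */
        return n_eq_v ? vn : vm;
    default:                        /* gt: !Z && N == V */
        return (zf != 0 && n_eq_v) ? vn : vm;
    }
}
```

The gt case needs two movcond operations in the patch because it combines two tests; in the model that is the && in the last branch.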
[Qemu-devel] [PATCH] target-arm: Implement ARMv8 VSEL instruction.
This adds support for the VSEL floating point selection instruction
which was added in ARMv8. It is based on the previous patch[1] from
Mans Rullgard, but attempts to address the feedback given on that
patch.

[1] http://lists.nongnu.org/archive/html/qemu-devel/2013-06/msg03117.html

Signed-off-by: Will Newton will.new...@linaro.org
---
 target-arm/translate.c | 121 +
 1 file changed, 121 insertions(+)

diff --git a/target-arm/translate.c b/target-arm/translate.c
index 998bde2..7bfd606 100644
--- a/target-arm/translate.c
+++ b/target-arm/translate.c
@@ -2617,6 +2617,114 @@ static TCGv_i32 gen_load_and_replicate(DisasContext *s, TCGv_i32 addr, int size)
     return tmp;
 }
 
+static int disas_v8vfp_insn(CPUARMState *env, DisasContext *s, uint32_t insn)
+{
+    uint32_t rd, rn, rm, dp = (insn >> 8) & 1;
+
+    if (!s->vfp_enabled)
+        return 1;
+
+    if (dp) {
+        VFP_DREG_D(rd, insn);
+        VFP_DREG_N(rn, insn);
+        VFP_DREG_M(rm, insn);
+    } else {
+        rd = VFP_SREG_D(insn);
+        rn = VFP_SREG_N(insn);
+        rm = VFP_SREG_M(insn);
+    }
+
+    if (((insn >> 23) & 1) == 0) {
+        /* vsel */
+        uint32_t cc = (insn >> 20) & 3;
+        TCGv_i32 tmp, zero;
+
+        zero = tcg_const_tl(0);
+
+        if (dp) {
+            TCGv_i64 ftmp1, ftmp2, ftmp3;
+
+            ftmp1 = tcg_temp_new_i64();
+            ftmp2 = tcg_temp_new_i64();
+            ftmp3 = tcg_temp_new_i64();
+            tcg_gen_ld_f64(ftmp1, cpu_env, vfp_reg_offset(dp, rn));
+            tcg_gen_ld_f64(ftmp2, cpu_env, vfp_reg_offset(dp, rm));
+            switch (cc) {
+            case 0: /* eq: Z */
+                tcg_gen_movcond_i64(TCG_COND_EQ, ftmp3, cpu_ZF, zero,
+                                    ftmp1, ftmp2);
+                break;
+            case 1: /* vs: V */
+                tcg_gen_movcond_i64(TCG_COND_LT, ftmp3, cpu_VF, zero,
+                                    ftmp1, ftmp2);
+                break;
+            case 2: /* ge: N == V -> N ^ V == 0 */
+                tmp = tcg_temp_new_i32();
+                tcg_gen_xor_i32(tmp, cpu_VF, cpu_NF);
+                tcg_gen_movcond_i64(TCG_COND_GE, ftmp3, tmp, zero,
+                                    ftmp1, ftmp2);
+                tcg_temp_free_i32(tmp);
+                break;
+            case 3: /* gt: !Z && N == V */
+                tcg_gen_movcond_i64(TCG_COND_NE, ftmp3, cpu_ZF, zero,
+                                    ftmp1, ftmp2);
+                tmp = tcg_temp_new_i32();
+                tcg_gen_xor_i32(tmp, cpu_VF, cpu_NF);
+                tcg_gen_movcond_i64(TCG_COND_GE, ftmp3, tmp, zero,
+                                    ftmp3, ftmp2);
+                tcg_temp_free_i32(tmp);
+                break;
+            }
+            tcg_gen_st_f64(ftmp3, cpu_env, vfp_reg_offset(dp, rd));
+            tcg_temp_free_i64(ftmp1);
+            tcg_temp_free_i64(ftmp2);
+            tcg_temp_free_i64(ftmp3);
+        } else {
+            TCGv_i32 ftmp1, ftmp2, ftmp3;
+
+            ftmp1 = tcg_temp_new_i32();
+            ftmp2 = tcg_temp_new_i32();
+            ftmp3 = tcg_temp_new_i32();
+            tcg_gen_ld_f32(ftmp1, cpu_env, vfp_reg_offset(dp, rn));
+            tcg_gen_ld_f32(ftmp2, cpu_env, vfp_reg_offset(dp, rm));
+            switch (cc) {
+            case 0: /* eq: Z */
+                tcg_gen_movcond_i32(TCG_COND_EQ, ftmp3, cpu_ZF, zero,
+                                    ftmp1, ftmp2);
+                break;
+            case 1: /* vs: V */
+                tcg_gen_movcond_i32(TCG_COND_LT, ftmp3, cpu_VF, zero,
+                                    ftmp1, ftmp2);
+                break;
+            case 2: /* ge: N == V -> N ^ V == 0 */
+                tmp = tcg_temp_new_i32();
+                tcg_gen_xor_i32(tmp, cpu_VF, cpu_NF);
+                tcg_gen_movcond_i32(TCG_COND_GE, ftmp3, tmp, zero,
+                                    ftmp1, ftmp2);
+                tcg_temp_free_i32(tmp);
+                break;
+            case 3: /* gt: !Z && N == V */
+                tcg_gen_movcond_i32(TCG_COND_NE, ftmp3, cpu_ZF, zero,
+                                    ftmp1, ftmp2);
+                tmp = tcg_temp_new_i32();
+                tcg_gen_xor_i32(tmp, cpu_VF, cpu_NF);
+                tcg_gen_movcond_i32(TCG_COND_GE, ftmp3, tmp, zero,
+                                    ftmp3, ftmp2);
+                tcg_temp_free_i32(tmp);
+                break;
+            }
+            tcg_gen_st_f32(ftmp3, cpu_env, vfp_reg_offset(dp, rd));
+            tcg_temp_free_i32(ftmp1);
+            tcg_temp_free_i32(ftmp2);
+            tcg_temp_free_i32(ftmp3);
+        }
+
+        return 0;
+    }
+
+    return 1;
+}
+
 /* Disassemble a VFP instruction.  Returns nonzero if an error occurred
    (ie. an undefined instruction).  */
 static int disas_vfp_insn(CPUARMState * env, DisasContext *s, uint32_t insn)
@@ -6864,6 +6864,13 @@ static void disas_arm_insn(CPUARMState * env, DisasContext *s)
             goto illegal_op;
         return
[Qemu-devel] [PATCH v2] target-arm: Implement ARMv8 VSEL instruction.
This adds support for the VSEL floating point selection instruction which was added in ARMv8. It is based on the previous patch[1] from Mans Rullgard, but attempts to address the feedback given on that patch. [1] http://lists.nongnu.org/archive/html/qemu-devel/2013-06/msg03117.html Signed-off-by: Will Newton will.new...@linaro.org --- target-arm/translate.c | 105 + 1 file changed, 105 insertions(+) Changes in v2: - Integrate vsel decoding into disas_vfp_insn diff --git a/target-arm/translate.c b/target-arm/translate.c index 998bde2..5e49334 100644 --- a/target-arm/translate.c +++ b/target-arm/translate.c @@ -2880,6 +2880,98 @@ static int disas_vfp_insn(CPUARMState * env, DisasContext *s, uint32_t insn) rm = VFP_SREG_M(insn); } +if ((insn 0x0f800e50) == 0x0e000a00) { +/* vsel */ +uint32_t cc = (insn 20) 3; +TCGv_i32 tmp, zero; + +/* ARMv8 VFP. */ +if (!arm_feature(env, ARM_FEATURE_V8)) +return 1; + +zero = tcg_const_tl(0); + +if (dp) { +TCGv_i64 ftmp1, ftmp2, ftmp3; + +ftmp1 = tcg_temp_new_i64(); +ftmp2 = tcg_temp_new_i64(); +ftmp3 = tcg_temp_new_i64(); +tcg_gen_ld_f64(ftmp1, cpu_env, vfp_reg_offset(dp, rn)); +tcg_gen_ld_f64(ftmp2, cpu_env, vfp_reg_offset(dp, rm)); +switch (cc) { +case 0: /* eq: Z */ +tcg_gen_movcond_i64(TCG_COND_EQ, ftmp3, cpu_ZF, zero, +ftmp1, ftmp2); +break; +case 1: /* vs: V */ +tcg_gen_movcond_i64(TCG_COND_LT, ftmp3, cpu_VF, zero, +ftmp1, ftmp2); +break; +case 2: /* ge: N == V - N ^ V == 0 */ +tmp = tcg_temp_new_i32(); +tcg_gen_xor_i32(tmp, cpu_VF, cpu_NF); +tcg_gen_movcond_i64(TCG_COND_GE, ftmp3, tmp, zero, +ftmp1, ftmp2); +tcg_temp_free_i32(tmp); +break; +case 3: /* gt: !Z N == V */ +tcg_gen_movcond_i64(TCG_COND_NE, ftmp3, cpu_ZF, zero, +ftmp1, ftmp2); +tmp = tcg_temp_new_i32(); +tcg_gen_xor_i32(tmp, cpu_VF, cpu_NF); +tcg_gen_movcond_i64(TCG_COND_GE, ftmp3, tmp, zero, +ftmp3, ftmp2); +tcg_temp_free_i32(tmp); +break; +} +tcg_gen_st_f64(ftmp3, cpu_env, vfp_reg_offset(dp, rd)); +tcg_temp_free_i64(ftmp1); +tcg_temp_free_i64(ftmp2); 
+tcg_temp_free_i64(ftmp3); +} else { +TCGv_i32 ftmp1, ftmp2, ftmp3; + +ftmp1 = tcg_temp_new_i32(); +ftmp2 = tcg_temp_new_i32(); +ftmp3 = tcg_temp_new_i32(); +tcg_gen_ld_f32(ftmp1, cpu_env, vfp_reg_offset(dp, rn)); +tcg_gen_ld_f32(ftmp2, cpu_env, vfp_reg_offset(dp, rm)); +switch (cc) { +case 0: /* eq: Z */ +tcg_gen_movcond_i32(TCG_COND_EQ, ftmp3, cpu_ZF, zero, +ftmp1, ftmp2); +break; +case 1: /* vs: V */ +tcg_gen_movcond_i32(TCG_COND_LT, ftmp3, cpu_VF, zero, +ftmp1, ftmp2); +break; +case 2: /* ge: N == V - N ^ V == 0 */ +tmp = tcg_temp_new_i32(); +tcg_gen_xor_i32(tmp, cpu_VF, cpu_NF); +tcg_gen_movcond_i32(TCG_COND_GE, ftmp3, tmp, zero, +ftmp1, ftmp2); +tcg_temp_free_i32(tmp); +break; +case 3: /* gt: !Z N == V */ +tcg_gen_movcond_i32(TCG_COND_NE, ftmp3, cpu_ZF, zero, +ftmp1, ftmp2); +tmp = tcg_temp_new_i32(); +tcg_gen_xor_i32(tmp, cpu_VF, cpu_NF); +tcg_gen_movcond_i32(TCG_COND_GE, ftmp3, tmp, zero, +ftmp3, ftmp2); +tcg_temp_free_i32(tmp); +break; +} +tcg_gen_st_f32(ftmp3, cpu_env, vfp_reg_offset(dp, rd
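The mask-and-compare that recognises VSEL in this version can be exercised on its own. The following is an illustrative sketch only: the function names are invented, the test words are built by hand from the mask rather than taken from an assembler, and the dp bit is read from bit 8 the way the stand-alone decoder versions of this patch do it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Same mask and value as the check added to disas_vfp_insn. */
static bool matches_vsel(uint32_t insn)
{
    return (insn & 0x0f800e50) == 0x0e000a00;
}

/* Two-bit condition selector in bits [21:20]: 0 eq, 1 vs, 2 ge, 3 gt. */
static unsigned vsel_cc(uint32_t insn)
{
    return (insn >> 20) & 3;
}

/* Bit 8 distinguishes double precision (1) from single precision (0). */
static unsigned vsel_dp(uint32_t insn)
{
    return (insn >> 8) & 1;
}
```

Because the mask covers bits 23, 6 and 4, a word with any of those bits set falls through to the rest of the VFP decoder instead of being treated as VSEL.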
Re: [Qemu-devel] [PATCH v2] target-arm: Implement ARMv8 VSEL instruction.
On 3 October 2013 13:59, Peter Maydell peter.mayd...@linaro.org wrote:
> On 3 October 2013 21:51, Will Newton will.new...@linaro.org wrote:
>>
>> This adds support for the VSEL floating point selection instruction
>> which was added in ARMv8. It is based on the previous patch[1] from
>> Mans Rullgard, but attempts to address the feedback given on that
>> patch.
>>
>> [1] http://lists.nongnu.org/archive/html/qemu-devel/2013-06/msg03117.html
>
> This sort of commentary about previous patch versions should go below
> the '---', not in the commit message.
>
>> Signed-off-by: Will Newton will.new...@linaro.org
>> ---
>>  target-arm/translate.c | 105 +
>>  1 file changed, 105 insertions(+)
>>
>> Changes in v2:
>>  - Integrate vsel decoding into disas_vfp_insn
>>
>> diff --git a/target-arm/translate.c b/target-arm/translate.c
>> index 998bde2..5e49334 100644
>> --- a/target-arm/translate.c
>> +++ b/target-arm/translate.c
>> @@ -2880,6 +2880,98 @@ static int disas_vfp_insn(CPUARMState * env, DisasContext *s, uint32_t insn)
>>          rm = VFP_SREG_M(insn);
>>      }
>> +    if ((insn & 0x0f800e50) == 0x0e000a00) {
>> +        /* vsel */
>> +        uint32_t cc = (insn >> 20) & 3;
>> +        TCGv_i32 tmp, zero;
>> +
>> +        /* ARMv8 VFP. */
>> +        if (!arm_feature(env, ARM_FEATURE_V8))
>> +            return 1;
>
> scripts/checkpatch.pl will tell you that omitting the braces is a
> coding style violation.

Ok, I'll fix that.

+ +zero = tcg_const_tl(0); + +if (dp) { +TCGv_i64 ftmp1, ftmp2, ftmp3; + +ftmp1 = tcg_temp_new_i64(); +ftmp2 = tcg_temp_new_i64(); +ftmp3 = tcg_temp_new_i64(); +tcg_gen_ld_f64(ftmp1, cpu_env, vfp_reg_offset(dp, rn)); +tcg_gen_ld_f64(ftmp2, cpu_env, vfp_reg_offset(dp, rm)); +switch (cc) { +case 0: /* eq: Z */ +tcg_gen_movcond_i64(TCG_COND_EQ, ftmp3, cpu_ZF, zero, +ftmp1, ftmp2); +break; +case 1: /* vs: V */ +tcg_gen_movcond_i64(TCG_COND_LT, ftmp3, cpu_VF, zero, +ftmp1, ftmp2); +break; +case 2: /* ge: N == V - N ^ V == 0 */ +tmp = tcg_temp_new_i32(); +tcg_gen_xor_i32(tmp, cpu_VF, cpu_NF); +tcg_gen_movcond_i64(TCG_COND_GE, ftmp3, tmp, zero, +ftmp1, ftmp2); +tcg_temp_free_i32(tmp); +break; +case 3: /* gt: !Z N == V */ +tcg_gen_movcond_i64(TCG_COND_NE, ftmp3, cpu_ZF, zero, +ftmp1, ftmp2); +tmp = tcg_temp_new_i32(); +tcg_gen_xor_i32(tmp, cpu_VF, cpu_NF); +tcg_gen_movcond_i64(TCG_COND_GE, ftmp3, tmp, zero, +ftmp3, ftmp2); +tcg_temp_free_i32(tmp); +break; +} +tcg_gen_st_f64(ftmp3, cpu_env, vfp_reg_offset(dp, rd)); +tcg_temp_free_i64(ftmp1); +tcg_temp_free_i64(ftmp2); +tcg_temp_free_i64(ftmp3); +} else { +TCGv_i32 ftmp1, ftmp2, ftmp3; + +ftmp1 = tcg_temp_new_i32(); +ftmp2 = tcg_temp_new_i32(); +ftmp3 = tcg_temp_new_i32(); +tcg_gen_ld_f32(ftmp1, cpu_env, vfp_reg_offset(dp, rn)); +tcg_gen_ld_f32(ftmp2, cpu_env, vfp_reg_offset(dp, rm)); +switch (cc) { +case 0: /* eq: Z */ +tcg_gen_movcond_i32(TCG_COND_EQ, ftmp3, cpu_ZF, zero, +ftmp1, ftmp2); +break; +case 1: /* vs: V */ +tcg_gen_movcond_i32(TCG_COND_LT, ftmp3, cpu_VF, zero, +ftmp1, ftmp2); +break; +case 2: /* ge: N == V - N ^ V == 0 */ +tmp = tcg_temp_new_i32(); +tcg_gen_xor_i32(tmp, cpu_VF, cpu_NF); +tcg_gen_movcond_i32(TCG_COND_GE, ftmp3, tmp, zero, +ftmp1, ftmp2); +tcg_temp_free_i32(tmp); +break; +case 3: /* gt: !Z N == V */ +tcg_gen_movcond_i32(TCG_COND_NE, ftmp3, cpu_ZF, zero
[Qemu-devel] [PATCH v3] target-arm: Implement ARMv8 VSEL instruction.
This adds support for the VSEL floating point selection instruction which was added in ARMv8. Signed-off-by: Will Newton will.new...@linaro.org --- target-arm/translate.c | 115 ++--- 1 file changed, 110 insertions(+), 5 deletions(-) Changes in v3: - Move calls to disas_vfp_insn out of disas_coproc_insn diff --git a/target-arm/translate.c b/target-arm/translate.c index 998bde2..10b4fac 100644 --- a/target-arm/translate.c +++ b/target-arm/translate.c @@ -2880,6 +2880,99 @@ static int disas_vfp_insn(CPUARMState * env, DisasContext *s, uint32_t insn) rm = VFP_SREG_M(insn); } +if ((insn 0x0f800e50) == 0x0e000a00) { +/* vsel */ +uint32_t cc = (insn 20) 3; +TCGv_i32 tmp, zero; + +/* ARMv8 VFP. */ +if (!arm_feature(env, ARM_FEATURE_V8)) { +return 1; +} + +zero = tcg_const_tl(0); + +if (dp) { +TCGv_i64 ftmp1, ftmp2, ftmp3; + +ftmp1 = tcg_temp_new_i64(); +ftmp2 = tcg_temp_new_i64(); +ftmp3 = tcg_temp_new_i64(); +tcg_gen_ld_f64(ftmp1, cpu_env, vfp_reg_offset(dp, rn)); +tcg_gen_ld_f64(ftmp2, cpu_env, vfp_reg_offset(dp, rm)); +switch (cc) { +case 0: /* eq: Z */ +tcg_gen_movcond_i64(TCG_COND_EQ, ftmp3, cpu_ZF, zero, +ftmp1, ftmp2); +break; +case 1: /* vs: V */ +tcg_gen_movcond_i64(TCG_COND_LT, ftmp3, cpu_VF, zero, +ftmp1, ftmp2); +break; +case 2: /* ge: N == V - N ^ V == 0 */ +tmp = tcg_temp_new_i32(); +tcg_gen_xor_i32(tmp, cpu_VF, cpu_NF); +tcg_gen_movcond_i64(TCG_COND_GE, ftmp3, tmp, zero, +ftmp1, ftmp2); +tcg_temp_free_i32(tmp); +break; +case 3: /* gt: !Z N == V */ +tcg_gen_movcond_i64(TCG_COND_NE, ftmp3, cpu_ZF, zero, +ftmp1, ftmp2); +tmp = tcg_temp_new_i32(); +tcg_gen_xor_i32(tmp, cpu_VF, cpu_NF); +tcg_gen_movcond_i64(TCG_COND_GE, ftmp3, tmp, zero, +ftmp3, ftmp2); +tcg_temp_free_i32(tmp); +break; +} +tcg_gen_st_f64(ftmp3, cpu_env, vfp_reg_offset(dp, rd)); +tcg_temp_free_i64(ftmp1); +tcg_temp_free_i64(ftmp2); +tcg_temp_free_i64(ftmp3); +} else { +TCGv_i32 ftmp1, ftmp2, ftmp3; + +ftmp1 = tcg_temp_new_i32(); +ftmp2 = tcg_temp_new_i32(); +ftmp3 = tcg_temp_new_i32(); 
+tcg_gen_ld_f32(ftmp1, cpu_env, vfp_reg_offset(dp, rn)); +tcg_gen_ld_f32(ftmp2, cpu_env, vfp_reg_offset(dp, rm)); +switch (cc) { +case 0: /* eq: Z */ +tcg_gen_movcond_i32(TCG_COND_EQ, ftmp3, cpu_ZF, zero, +ftmp1, ftmp2); +break; +case 1: /* vs: V */ +tcg_gen_movcond_i32(TCG_COND_LT, ftmp3, cpu_VF, zero, +ftmp1, ftmp2); +break; +case 2: /* ge: N == V - N ^ V == 0 */ +tmp = tcg_temp_new_i32(); +tcg_gen_xor_i32(tmp, cpu_VF, cpu_NF); +tcg_gen_movcond_i32(TCG_COND_GE, ftmp3, tmp, zero, +ftmp1, ftmp2); +tcg_temp_free_i32(tmp); +break; +case 3: /* gt: !Z N == V */ +tcg_gen_movcond_i32(TCG_COND_NE, ftmp3, cpu_ZF, zero, +ftmp1, ftmp2); +tmp = tcg_temp_new_i32(); +tcg_gen_xor_i32(tmp, cpu_VF, cpu_NF); +tcg_gen_movcond_i32(TCG_COND_GE, ftmp3, tmp, zero, +ftmp3, ftmp2); +tcg_temp_free_i32(tmp); +break; +} +tcg_gen_st_f32(ftmp3, cpu_env, vfp_reg_offset(dp, rd)); +tcg_temp_free_i32(ftmp1); +tcg_temp_free_i32(ftmp2); +tcg_temp_free_i32(ftmp3
Re: [Qemu-devel] [PATCH v2] target-arm: Implement ARMv8 VSEL instruction.
On 3 October 2013 15:34, Richard Henderson r...@twiddle.net wrote:
> On 10/03/2013 05:51 AM, Will Newton wrote:
>> +            case 0: /* eq: Z */
>> +                tcg_gen_movcond_i64(TCG_COND_EQ, ftmp3, cpu_ZF, zero,
>> +                                    ftmp1, ftmp2);
>> +                break;
>
> Does this compile when configured with --enable-debug?
>
> It shouldn't, since movcond_i64 takes 5 _i64 variables,
> and your comparison variables are _i32.

No, thanks for picking that up. I was wondering if that was valid and
the code seemed to work. What's the best way to work around the
problem? Just extend everything up to 64 bits?

-- 
Will Newton
Toolchain Working Group, Linaro
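On the question of extending to 64 bits, one detail may be worth checking: the vs and ge tests in the patch are signed comparisons against zero, and they only see bit 31 of a flag if the widening copies that bit upward. The sketch below shows the arithmetic in plain C; it is not TCG code and the function names are invented for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Zero-extension: the upper 32 bits become 0, so the result is never
 * negative when read as a signed 64-bit value. */
static int64_t widen_zero(uint32_t flag)
{
    return (int64_t)(uint64_t)flag;
}

/* Sign-extension: bit 31 is copied into bits 63..32. */
static int64_t widen_sign(uint32_t flag)
{
    return (flag & 0x80000000u) ? (int64_t)flag - 4294967296LL
                                : (int64_t)flag;
}

/* The signed "less than zero" test that a 64-bit LT comparison performs. */
static bool lt_zero_64(int64_t v)
{
    return v < 0;
}
```

Equality tests against zero give the same answer with either widening, so under these assumptions only the N and V based conditions depend on the choice.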
[Qemu-devel] [PATCH 1/2] target-arm: Move call to disas_vfp_insn out of disas_coproc_insn.
Floating point is an extension to the instruction set rather than a
coprocessor, so call it directly from the ARM and Thumb decode
functions.

Signed-off-by: Will Newton will.new...@linaro.org
---
 target-arm/translate.c | 22 +-
 1 file changed, 17 insertions(+), 5 deletions(-)

diff --git a/target-arm/translate.c b/target-arm/translate.c
index 998bde2..2c1458a 100644
--- a/target-arm/translate.c
+++ b/target-arm/translate.c
@@ -6299,9 +6299,6 @@ static int disas_coproc_insn(CPUARMState * env, DisasContext *s, uint32_t insn)
             return disas_dsp_insn(env, s, insn);
         }
         return 1;
-    case 10:
-    case 11:
-        return disas_vfp_insn (env, s, insn);
     default:
         break;
     }
@@ -6756,6 +6753,11 @@ static void disas_arm_insn(CPUARMState * env, DisasContext *s)
                 goto illegal_op;
             return;
         }
+        if ((insn & 0x0f000e10) == 0x0e000a00) {
+            /* VFP. */
+            if (disas_vfp_insn(env, s, insn))
+                goto illegal_op;
+        }
         if (((insn & 0x0f30f000) == 0x0510f000) ||
             ((insn & 0x0f30f010) == 0x0710f000)) {
             if ((insn & (1 << 22)) == 0) {
@@ -8036,9 +8038,15 @@ static void disas_arm_insn(CPUARMState * env, DisasContext *s)
         case 0xc:
         case 0xd:
         case 0xe:
-            /* Coprocessor. */
-            if (disas_coproc_insn(env, s, insn))
+            if (((insn >> 8) & 0xe) == 10) {
+                /* VFP. */
+                if (disas_vfp_insn(env, s, insn)) {
+                    goto illegal_op;
+                }
+            } else if (disas_coproc_insn(env, s, insn)) {
+                /* Coprocessor. */
                 goto illegal_op;
+            }
             break;
         case 0xf:
             /* swi */
@@ -8768,6 +8776,10 @@ static int disas_thumb2_insn(CPUARMState *env, DisasContext *s, uint16_t insn_hw
             insn = (insn & 0xe2ffffff) | ((insn & (1 << 28)) >> 4) | (1 << 28);
             if (disas_neon_data_insn(env, s, insn))
                 goto illegal_op;
+        } else if (((insn >> 8) & 0xe) == 10) {
+            if (disas_vfp_insn(env, s, insn)) {
+                goto illegal_op;
+            }
         } else {
             if (insn & (1 << 28))
                 goto illegal_op;
-- 
1.8.1.4
[Qemu-devel] [PATCHv4 2/2] target-arm: Implement ARMv8 VSEL instruction.
This adds support for the VSEL floating point selection instruction which was added in ARMv8. Signed-off-by: Will Newton will.new...@linaro.org --- target-arm/translate.c | 113 + 1 file changed, 113 insertions(+) Changes in v4: - Fix leak of temporaries - Extend condition values to 64bit in the DP case diff --git a/target-arm/translate.c b/target-arm/translate.c index 2c1458a..db2d862 100644 --- a/target-arm/translate.c +++ b/target-arm/translate.c @@ -2880,6 +2880,119 @@ static int disas_vfp_insn(CPUARMState * env, DisasContext *s, uint32_t insn) rm = VFP_SREG_M(insn); } +if ((insn 0x0f800e50) == 0x0e000a00) { +/* vsel */ +uint32_t cc = (insn 20) 3; + +/* ARMv8 VFP. */ +if (!arm_feature(env, ARM_FEATURE_V8)) { +return 1; +} + +if (dp) { +TCGv_i64 ftmp1, ftmp2, ftmp3; +TCGv_i64 tmp, zero, zf, nf, vf; + +zero = tcg_const_i64(0); + +ftmp1 = tcg_temp_new_i64(); +ftmp2 = tcg_temp_new_i64(); +ftmp3 = tcg_temp_new_i64(); + +zf = tcg_temp_new_i64(); +nf = tcg_temp_new_i64(); +vf = tcg_temp_new_i64(); + +tcg_gen_extu_i32_i64(zf, cpu_ZF); +tcg_gen_extu_i32_i64(nf, cpu_NF); +tcg_gen_extu_i32_i64(vf, cpu_VF); + +tcg_gen_ld_f64(ftmp1, cpu_env, vfp_reg_offset(dp, rn)); +tcg_gen_ld_f64(ftmp2, cpu_env, vfp_reg_offset(dp, rm)); +switch (cc) { +case 0: /* eq: Z */ +tcg_gen_movcond_i64(TCG_COND_EQ, ftmp3, zf, zero, +ftmp1, ftmp2); +break; +case 1: /* vs: V */ +tcg_gen_movcond_i64(TCG_COND_LT, ftmp3, vf, zero, +ftmp1, ftmp2); +break; +case 2: /* ge: N == V - N ^ V == 0 */ +tmp = tcg_temp_new_i64(); +tcg_gen_xor_i64(tmp, vf, nf); +tcg_gen_movcond_i64(TCG_COND_GE, ftmp3, tmp, zero, +ftmp1, ftmp2); +tcg_temp_free_i64(tmp); +break; +case 3: /* gt: !Z N == V */ +tcg_gen_movcond_i64(TCG_COND_NE, ftmp3, zf, zero, +ftmp1, ftmp2); +tmp = tcg_temp_new_i64(); +tcg_gen_xor_i64(tmp, vf, nf); +tcg_gen_movcond_i64(TCG_COND_GE, ftmp3, tmp, zero, +ftmp3, ftmp2); +tcg_temp_free_i64(tmp); +break; +} +tcg_gen_st_f64(ftmp3, cpu_env, vfp_reg_offset(dp, rd)); +tcg_temp_free_i64(ftmp1); 
+tcg_temp_free_i64(ftmp2); +tcg_temp_free_i64(ftmp3); + +tcg_temp_free_i64(zf); +tcg_temp_free_i64(nf); +tcg_temp_free_i64(vf); + +tcg_temp_free_i64(zero); +} else { +TCGv_i32 ftmp1, ftmp2, ftmp3; +TCGv_i32 tmp, zero; + +zero = tcg_const_i32(0); + +ftmp1 = tcg_temp_new_i32(); +ftmp2 = tcg_temp_new_i32(); +ftmp3 = tcg_temp_new_i32(); +tcg_gen_ld_f32(ftmp1, cpu_env, vfp_reg_offset(dp, rn)); +tcg_gen_ld_f32(ftmp2, cpu_env, vfp_reg_offset(dp, rm)); +switch (cc) { +case 0: /* eq: Z */ +tcg_gen_movcond_i32(TCG_COND_EQ, ftmp3, cpu_ZF, zero, +ftmp1, ftmp2); +break; +case 1: /* vs: V */ +tcg_gen_movcond_i32(TCG_COND_LT, ftmp3, cpu_VF, zero, +ftmp1, ftmp2); +break; +case 2: /* ge: N == V - N ^ V == 0 */ +tmp = tcg_temp_new_i32(); +tcg_gen_xor_i32(tmp, cpu_VF, cpu_NF); +tcg_gen_movcond_i32(TCG_COND_GE, ftmp3, tmp, zero, +ftmp1, ftmp2); +tcg_temp_free_i32(tmp); +break; +case 3: /* gt: !Z N == V */ +tcg_gen_movcond_i32(TCG_COND_NE, ftmp3, cpu_ZF, zero, +ftmp1