Re: [PATCH v2] target/i386: reimplement fpatan using floatx80 operations
On Tue, 23 Jun 2020, Paolo Bonzini wrote:

> On 23/06/20 02:01, Joseph Myers wrote:
> > The x87 fpatan emulation is currently based around conversion to
> > double. This is inherently unsuitable for a good emulation of any
> > floatx80 operation. Reimplement using the soft-float operations, as
> > for other such instructions.
> >
> > Signed-off-by: Joseph Myers
>
> Queued, thanks.
>
> Just one question: do recent processors still use the same CORDIC
> approximations as the 8087, and if so would it be better or simpler to
> do that instead of using a good implementation such as this one?

I don't know what approximations the processors use, but they're
definitely different for at least some instructions between Intel and AMD
processors (as shown by glibc test ulps baselines created on one processor
sometimes needing increasing to work on other processors; avoiding test
problems means the emulation needs to be at least as accurate as
hardware). (Whereas the AVX-512 approximation instructions have reference
implementations for their exact semantics.)

--
Joseph S. Myers
jos...@codesourcery.com
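For readers unfamiliar with the "ulps baselines" mentioned above: the glibc libm tests record the largest error they tolerate for each function, measured in units in the last place (ULPs). The sketch below shows one common way to measure the ULP distance between two finite doubles. The names `ordered_bits` and `ulp_distance` are made up for this illustration; this is not glibc's test code, and it uses double rather than floatx80 to stay short.

```c
#include <stdint.h>
#include <string.h>

/*
 * Map a finite double onto an integer scale that is monotonic in the
 * value, so that adjacent doubles map to adjacent integers.
 * (Hypothetical helper for this sketch.)
 */
static int64_t ordered_bits(double d)
{
    int64_t i;
    memcpy(&i, &d, sizeof(i));
    /* Sign-magnitude to two's complement: negative values become -magnitude. */
    return i < 0 ? INT64_MIN - i : i;
}

/* Number of representable doubles between a and b (0 if they are equal). */
static uint64_t ulp_distance(double a, double b)
{
    int64_t ia = ordered_bits(a);
    int64_t ib = ordered_bits(b);
    return ia > ib ? (uint64_t)ia - (uint64_t)ib
                   : (uint64_t)ib - (uint64_t)ia;
}
```

A test harness in this style would compare the emulated result against a higher-precision reference and fail if the distance exceeds the recorded baseline, which is why an emulation less accurate than hardware can break tests.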
Re: [PATCH v2] target/i386: reimplement fpatan using floatx80 operations
Patchew URL: https://patchew.org/QEMU/alpine.deb.2.21.200623340.24...@digraph.polyomino.org.uk/

Hi,

This series failed the asan build test. Please find the testing commands and
their output below. If you have Docker installed, you can probably reproduce it
locally.

=== TEST SCRIPT BEGIN ===
#!/bin/bash
export ARCH=x86_64
make docker-image-fedora V=1 NETWORK=1
time make docker-test-debug@fedora TARGET_LIST=x86_64-softmmu J=14 NETWORK=1
=== TEST SCRIPT END ===

GEN docs/interop/qemu-qmp-ref.7
CC qga/commands.o
CC qga/guest-agent-command-state.o
/usr/bin/ld: /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors_vfork.S.o): warning: common of `__interception::real_vfork' overridden by definition from /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors.cpp.o)
CC qga/main.o
CC qga/commands-posix.o
CC qga/channel-posix.o
---
AR libvhost-user.a
AS pc-bios/optionrom/multiboot.o
AS pc-bios/optionrom/linuxboot.o
/usr/bin/ld: /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors_vfork.S.o): warning: common of `__interception::real_vfork' overridden by definition from /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors.cpp.o)
CC pc-bios/optionrom/linuxboot_dma.o
GEN docs/interop/qemu-ga-ref.html
GEN docs/interop/qemu-ga-ref.txt
---
BUILD pc-bios/optionrom/pvh.raw
SIGN pc-bios/optionrom/pvh.bin
LINK qemu-ga
/usr/bin/ld: /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors_vfork.S.o): warning: common of `__interception::real_vfork' overridden by definition from /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors.cpp.o)
LINK qemu-keymap
LINK ivshmem-client
/usr/bin/ld: /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors_vfork.S.o): warning: common of `__interception::real_vfork' overridden by definition from /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors.cpp.o)
/usr/bin/ld: /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors_vfork.S.o): warning: common of `__interception::real_vfork' overridden by definition from /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors.cpp.o)
LINK ivshmem-server
LINK qemu-nbd
LINK qemu-storage-daemon
/usr/bin/ld: /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors_vfork.S.o): warning: common of `__interception::real_vfork' overridden by definition from /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors.cpp.o)
LINK qemu-img
/usr/bin/ld: /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors_vfork.S.o): warning: common of `__interception::real_vfork' overridden by definition from /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors.cpp.o)
/usr/bin/ld: /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors_vfork.S.o): warning: common of `__interception::real_vfork' overridden by definition from /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors.cpp.o)
LINK qemu-io
/usr/bin/ld: /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors_vfork.S.o): warning: common of `__interception::real_vfork' overridden by definition from /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors.cpp.o)
LINK qemu-edid
/usr/bin/ld: /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors_vfork.S.o): warning: common of `__interception::real_vfork' overridden by definition from /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors.cpp.o)
LINK fsdev/virtfs-proxy-helper
/usr/bin/ld: /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors_vfork.S.o): warning: common of `__interception::real_vfork' overridden by definition from /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors.cpp.o)
LINK scsi/qemu-pr-helper
/usr/bin/ld: /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors_vfork.S.o): warning: common of `__interception::real_vfork' overridden by definition from /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors.cpp.o)
LINK qemu-bridge-helper
/usr/bin/ld: /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors_vfork.S.o): warning: common of `__interception::real_vfork' overridden by definition from /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors.cpp.o)
LINK virtiofsd
/usr/bin/ld: /usr/lib64/clang/10.0.0/lib/linux/libclang_rt.asan-x86_64.a(asan_interceptors_vfork.S.o): warning: common of `__interception::real_vfork' overridden by definition from
Re: [PATCH v2] target/i386: reimplement fpatan using floatx80 operations
Patchew URL: https://patchew.org/QEMU/alpine.deb.2.21.200623340.24...@digraph.polyomino.org.uk/

Hi,

This series failed the docker-quick@centos7 build test. Please find the testing
commands and their output below. If you have Docker installed, you can probably
reproduce it locally.

=== TEST SCRIPT BEGIN ===
#!/bin/bash
make docker-image-centos7 V=1 NETWORK=1
time make docker-test-quick@centos7 SHOW_ENV=1 J=14 NETWORK=1
=== TEST SCRIPT END ===

CC aarch64-softmmu/target/arm/pauth_helper.o
GEN trace/generated-helpers.c
/tmp/qemu-test/src/target/i386/fpu_helper.c: In function 'helper_fpatan':
/tmp/qemu-test/src/target/i386/fpu_helper.c:1098:17: error: implicit declaration of function 'shift128Right' [-Werror=implicit-function-declaration]
    shift128Right(remsig0, remsig1, 1, , );
    ^
/tmp/qemu-test/src/target/i386/fpu_helper.c:1098:17: error: nested extern declaration of 'shift128Right' [-Werror=nested-externs]
/tmp/qemu-test/src/target/i386/fpu_helper.c:1101:13: error: implicit declaration of function 'estimateDiv128To64' [-Werror=implicit-function-declaration]
    xsig0 = estimateDiv128To64(remsig0, remsig1, den_sig);
    ^
/tmp/qemu-test/src/target/i386/fpu_helper.c:1101:13: error: nested extern declaration of 'estimateDiv128To64' [-Werror=nested-externs]
/tmp/qemu-test/src/target/i386/fpu_helper.c:1102:13: error: implicit declaration of function 'mul64To128' [-Werror=implicit-function-declaration]
    mul64To128(den_sig, xsig0, , );
    ^
/tmp/qemu-test/src/target/i386/fpu_helper.c:1102:13: error: nested extern declaration of 'mul64To128' [-Werror=nested-externs]
/tmp/qemu-test/src/target/i386/fpu_helper.c:1103:13: error: implicit declaration of function 'sub128' [-Werror=implicit-function-declaration]
    sub128(remsig0, remsig1, msig0, msig1, , );
    ^
/tmp/qemu-test/src/target/i386/fpu_helper.c:1103:13: error: nested extern declaration of 'sub128' [-Werror=nested-externs]
/tmp/qemu-test/src/target/i386/fpu_helper.c:1106:17: error: implicit declaration of function 'add128' [-Werror=implicit-function-declaration]
    add128(remsig0, remsig1, 0, den_sig, , );
    ^
/tmp/qemu-test/src/target/i386/fpu_helper.c:1106:17: error: nested extern declaration of 'add128' [-Werror=nested-externs]
/tmp/qemu-test/src/target/i386/fpu_helper.c:1143:33: error: implicit declaration of function 'shift128Left' [-Werror=implicit-function-declaration]
    shift128Left(ysig0, ysig1, shift,
    ^
/tmp/qemu-test/src/target/i386/fpu_helper.c:1143:33: error: nested extern declaration of 'shift128Left' [-Werror=nested-externs]
/tmp/qemu-test/src/target/i386/fpu_helper.c:1171:21: error: implicit declaration of function 'shift128RightJamming' [-Werror=implicit-function-declaration]
    shift128RightJamming(xsig0, xsig1, texp - xexp,
    ^
/tmp/qemu-test/src/target/i386/fpu_helper.c:1171:21: error: nested extern declaration of 'shift128RightJamming' [-Werror=nested-externs]
/tmp/qemu-test/src/target/i386/fpu_helper.c:1200:17: error: implicit declaration of function 'mul128By64To192' [-Werror=implicit-function-declaration]
    mul128By64To192(xsig0, xsig1, tsig, , , );
    ^
/tmp/qemu-test/src/target/i386/fpu_helper.c:1200:17: error: nested extern declaration of 'mul128By64To192' [-Werror=nested-externs]
/tmp/qemu-test/src/target/i386/fpu_helper.c:1218:17: error: implicit declaration of function 'sub192' [-Werror=implicit-function-declaration]
    sub192(remsig0, remsig1, remsig2, msig0, msig1, msig2,
    ^
/tmp/qemu-test/src/target/i386/fpu_helper.c:1218:17: error: nested extern declaration of 'sub192' [-Werror=nested-externs]
/tmp/qemu-test/src/target/i386/fpu_helper.c:1222:21: error: implicit declaration of function 'add192' [-Werror=implicit-function-declaration]
    add192(remsig0, remsig1, remsig2, 0, dsig0, dsig1,
    ^
/tmp/qemu-test/src/target/i386/fpu_helper.c:1222:21: error: nested extern declaration of 'add192' [-Werror=nested-externs]
/tmp/qemu-test/src/target/i386/fpu_helper.c:1237:17: error: implicit declaration of function 'mul128To256' [-Werror=implicit-function-declaration]
    mul128To256(zsig0, zsig1, zsig0, zsig1,
    ^
/tmp/qemu-test/src/target/i386/fpu_helper.c:1237:17: error: nested extern declaration of 'mul128To256' [-Werror=nested-externs]
cc1: all warnings being treated as errors
CC aarch64-softmmu/trace/control-target.o
make[1]: *** [target/i386/fpu_helper.o] Error 1
make[1]: *** Waiting for unfinished jobs
CC aarch64-softmmu/softmmu/main.o
CC aarch64-softmmu/target/arm/translate.o
CC aarch64-softmmu/gdbstub-xml.o
CC aarch64-softmmu/trace/generated-helpers.o
CC aarch64-softmmu/target/arm/translate-sve.o
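Every error in the log above is an implicit declaration of a multiword-arithmetic helper, which suggests that fpu_helper.c is missing the header that declares them. As far as I know these helpers live in include/fpu/softfloat-macros.h in the QEMU tree, but the log itself does not say so; treat that as an assumption. To show what one of them does, here is a standalone sketch of a 128-bit right shift with "jamming" (any nonzero bits shifted out are ORed into the least significant bit of the result so that later rounding still sees that the value was inexact). The name `shr128_jam` is invented for this example, and this is not QEMU's code.

```c
#include <stdint.h>

/*
 * Shift the 128-bit value a0:a1 (a0 = high word) right by count bits.
 * If any nonzero bits are shifted off, set the lowest bit of the result
 * ("jamming"). Hypothetical standalone version for illustration only;
 * count is assumed to be non-negative.
 */
static void shr128_jam(uint64_t a0, uint64_t a1, int count,
                       uint64_t *z0, uint64_t *z1)
{
    if (count == 0) {
        *z0 = a0;
        *z1 = a1;
    } else if (count < 64) {
        /* Bits lost from a1 are the low 'count' bits. */
        *z1 = (a0 << (64 - count)) | (a1 >> count) |
              ((a1 << (64 - count)) != 0);
        *z0 = a0 >> count;
    } else if (count == 64) {
        *z1 = a0 | (a1 != 0);
        *z0 = 0;
    } else if (count < 128) {
        /* All of a1 and the low (count - 64) bits of a0 are lost. */
        *z1 = (a0 >> (count - 64)) |
              (((a0 << (128 - count)) | a1) != 0);
        *z0 = 0;
    } else {
        *z1 = (a0 | a1) != 0;
        *z0 = 0;
    }
}
```

For example, shifting the value 5 right by one bit gives 3 rather than 2, because the discarded low bit is jammed back into bit 0.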
Re: [PATCH v2] target/i386: reimplement fpatan using floatx80 operations
Patchew URL: https://patchew.org/QEMU/alpine.deb.2.21.200623340.24...@digraph.polyomino.org.uk/

Hi,

This series failed the docker-mingw@fedora build test. Please find the testing
commands and their output below. If you have Docker installed, you can probably
reproduce it locally.

=== TEST SCRIPT BEGIN ===
#! /bin/bash
export ARCH=x86_64
make docker-image-fedora V=1 NETWORK=1
time make docker-test-mingw@fedora J=14 NETWORK=1
=== TEST SCRIPT END ===

CC aarch64-softmmu/target/arm/translate-sve.o
LINK aarch64-softmmu/qemu-system-aarch64w.exe
/tmp/qemu-test/src/target/i386/fpu_helper.c: In function 'helper_fpatan':
/tmp/qemu-test/src/target/i386/fpu_helper.c:1098:17: error: implicit declaration of function 'shift128Right' [-Werror=implicit-function-declaration]
 1098 | shift128Right(remsig0, remsig1, 1, , );
      | ^
/tmp/qemu-test/src/target/i386/fpu_helper.c:1098:17: error: nested extern declaration of 'shift128Right' [-Werror=nested-externs]
/tmp/qemu-test/src/target/i386/fpu_helper.c:1101:21: error: implicit declaration of function 'estimateDiv128To64' [-Werror=implicit-function-declaration]
 1101 | xsig0 = estimateDiv128To64(remsig0, remsig1, den_sig);
      | ^~
/tmp/qemu-test/src/target/i386/fpu_helper.c:1101:21: error: nested extern declaration of 'estimateDiv128To64' [-Werror=nested-externs]
/tmp/qemu-test/src/target/i386/fpu_helper.c:1102:13: error: implicit declaration of function 'mul64To128' [-Werror=implicit-function-declaration]
 1102 | mul64To128(den_sig, xsig0, , );
      | ^~
/tmp/qemu-test/src/target/i386/fpu_helper.c:1102:13: error: nested extern declaration of 'mul64To128' [-Werror=nested-externs]
/tmp/qemu-test/src/target/i386/fpu_helper.c:1103:13: error: implicit declaration of function 'sub128' [-Werror=implicit-function-declaration]
 1103 | sub128(remsig0, remsig1, msig0, msig1, , );
      | ^~
/tmp/qemu-test/src/target/i386/fpu_helper.c:1103:13: error: nested extern declaration of 'sub128' [-Werror=nested-externs]
/tmp/qemu-test/src/target/i386/fpu_helper.c:1106:17: error: implicit declaration of function 'add128' [-Werror=implicit-function-declaration]
 1106 | add128(remsig0, remsig1, 0, den_sig, , );
      | ^~
/tmp/qemu-test/src/target/i386/fpu_helper.c:1106:17: error: nested extern declaration of 'add128' [-Werror=nested-externs]
/tmp/qemu-test/src/target/i386/fpu_helper.c:1143:33: error: implicit declaration of function 'shift128Left' [-Werror=implicit-function-declaration]
 1143 | shift128Left(ysig0, ysig1, shift,
      | ^~~~
/tmp/qemu-test/src/target/i386/fpu_helper.c:1143:33: error: nested extern declaration of 'shift128Left' [-Werror=nested-externs]
/tmp/qemu-test/src/target/i386/fpu_helper.c:1171:21: error: implicit declaration of function 'shift128RightJamming' [-Werror=implicit-function-declaration]
 1171 | shift128RightJamming(xsig0, xsig1, texp - xexp,
      | ^~~~
/tmp/qemu-test/src/target/i386/fpu_helper.c:1171:21: error: nested extern declaration of 'shift128RightJamming' [-Werror=nested-externs]
/tmp/qemu-test/src/target/i386/fpu_helper.c:1200:17: error: implicit declaration of function 'mul128By64To192' [-Werror=implicit-function-declaration]
 1200 | mul128By64To192(xsig0, xsig1, tsig, , , );
      | ^~~
/tmp/qemu-test/src/target/i386/fpu_helper.c:1200:17: error: nested extern declaration of 'mul128By64To192' [-Werror=nested-externs]
/tmp/qemu-test/src/target/i386/fpu_helper.c:1218:17: error: implicit declaration of function 'sub192' [-Werror=implicit-function-declaration]
 1218 | sub192(remsig0, remsig1, remsig2, msig0, msig1, msig2,
      | ^~
/tmp/qemu-test/src/target/i386/fpu_helper.c:1218:17: error: nested extern declaration of 'sub192' [-Werror=nested-externs]
/tmp/qemu-test/src/target/i386/fpu_helper.c:1222:21: error: implicit declaration of function 'add192' [-Werror=implicit-function-declaration]
 1222 | add192(remsig0, remsig1, remsig2, 0, dsig0, dsig1,
      | ^~
/tmp/qemu-test/src/target/i386/fpu_helper.c:1222:21: error: nested extern declaration of 'add192' [-Werror=nested-externs]
/tmp/qemu-test/src/target/i386/fpu_helper.c:1237:17: error: implicit declaration of function 'mul128To256' [-Werror=implicit-function-declaration]
 1237 | mul128To256(zsig0, zsig1, zsig0, zsig1,
      | ^~~
/tmp/qemu-test/src/target/i386/fpu_helper.c:1237:17: error: nested extern declaration of 'mul128To256' [-Werror=nested-externs]
cc1: all warnings being treated as errors
make[1]: *** [/tmp/qemu-test/src/rules.mak:69: target/i386/fpu_helper.o] Error 1
Re: [PATCH v2] target/i386: reimplement fpatan using floatx80 operations
On 23/06/20 02:01, Joseph Myers wrote: > The x87 fpatan emulation is currently based around conversion to > double. This is inherently unsuitable for a good emulation of any > floatx80 operation. Reimplement using the soft-float operations, as > for other such instructions. > > Signed-off-by: Joseph Myers Queued, thanks. Just one question: do recent processors still use the same CORDIC approximations as the 8087, and if so would it be better or simpler to do that instead of using a good implementation such as this one? Thanks, Paolo
[PATCH v2] target/i386: reimplement fpatan using floatx80 operations
The x87 fpatan emulation is currently based around conversion to
double. This is inherently unsuitable for a good emulation of any
floatx80 operation. Reimplement using the soft-float operations, as
for other such instructions.

Signed-off-by: Joseph Myers

---

Changes in version 2: adjust the "Dividing ST1 by ST0 gives the
correct result." case to ensure correct exceptions, as well as a
correctly rounded result in non-to-nearest modes, when the division is
exact.

---
 target/i386/fpu_helper.c          |  487 -
 tests/tcg/i386/test-i386-fpatan.c | 1071 +
 2 files changed, 1554 insertions(+), 4 deletions(-)
 create mode 100644 tests/tcg/i386/test-i386-fpatan.c

diff --git a/target/i386/fpu_helper.c b/target/i386/fpu_helper.c
index 62820bc735..71cec3962f 100644
--- a/target/i386/fpu_helper.c
+++ b/target/i386/fpu_helper.c
@@ -1239,14 +1239,493 @@ void helper_fptan(CPUX86State *env)
     }
 }
 
+/* Values of pi/4, pi/2, 3pi/4 and pi, with 128-bit precision. */
+#define pi_4_exp 0x3ffe
+#define pi_4_sig_high 0xc90fdaa22168c234ULL
+#define pi_4_sig_low 0xc4c6628b80dc1cd1ULL
+#define pi_2_exp 0x3fff
+#define pi_2_sig_high 0xc90fdaa22168c234ULL
+#define pi_2_sig_low 0xc4c6628b80dc1cd1ULL
+#define pi_34_exp 0x4000
+#define pi_34_sig_high 0x96cbe3f9990e91a7ULL
+#define pi_34_sig_low 0x9394c9e8a0a5159dULL
+#define pi_exp 0x4000
+#define pi_sig_high 0xc90fdaa22168c234ULL
+#define pi_sig_low 0xc4c6628b80dc1cd1ULL
+
+/*
+ * Polynomial coefficients for an approximation to atan(x), with only
+ * odd powers of x used, for x in the interval [-1/16, 1/16]. (Unlike
+ * for some other approximations, no low part is needed for the first
+ * coefficient here to achieve a sufficiently accurate result, because
+ * the coefficient in this minimax approximation is very close to
+ * exactly 1.)
+ */
+#define fpatan_coeff_0 make_floatx80(0x3fff, 0x8000000000000000ULL)
+#define fpatan_coeff_1 make_floatx80(0xbffd, 0xaaaaaaaaaaaaaa43ULL)
+#define fpatan_coeff_2 make_floatx80(0x3ffc, 0xccccccccccbfe4f8ULL)
+#define fpatan_coeff_3 make_floatx80(0xbffc, 0x92492491fbab2e66ULL)
+#define fpatan_coeff_4 make_floatx80(0x3ffb, 0xe38e372881ea1e0bULL)
+#define fpatan_coeff_5 make_floatx80(0xbffb, 0xba2c0104bbdd0615ULL)
+#define fpatan_coeff_6 make_floatx80(0x3ffb, 0x9baf7ebf898b42efULL)
+
+struct fpatan_data {
+    /* High and low parts of atan(x). */
+    floatx80 atan_high, atan_low;
+};
+
+static const struct fpatan_data fpatan_table[9] = {
+    { floatx80_zero,
+      floatx80_zero },
+    { make_floatx80(0x3ffb, 0xfeadd4d5617b6e33ULL),
+      make_floatx80(0xbfb9, 0xdda19d8305ddc420ULL) },
+    { make_floatx80(0x3ffc, 0xfadbafc96406eb15ULL),
+      make_floatx80(0x3fbb, 0xdb8f3debef442fccULL) },
+    { make_floatx80(0x3ffd, 0xb7b0ca0f26f78474ULL),
+      make_floatx80(0xbfbc, 0xeab9bdba460376faULL) },
+    { make_floatx80(0x3ffd, 0xed63382b0dda7b45ULL),
+      make_floatx80(0x3fbc, 0xdfc88bd978751a06ULL) },
+    { make_floatx80(0x3ffe, 0x8f005d5ef7f59f9bULL),
+      make_floatx80(0x3fbd, 0xb906bc2ccb886e90ULL) },
+    { make_floatx80(0x3ffe, 0xa4bc7d1934f70924ULL),
+      make_floatx80(0x3fbb, 0xcd43f9522bed64f8ULL) },
+    { make_floatx80(0x3ffe, 0xb8053e2bc2319e74ULL),
+      make_floatx80(0xbfbc, 0xd3496ab7bd6eef0cULL) },
+    { make_floatx80(0x3ffe, 0xc90fdaa22168c235ULL),
+      make_floatx80(0xbfbc, 0xece675d1fc8f8cbcULL) },
+};
+
 void helper_fpatan(CPUX86State *env)
 {
-    double fptemp, fpsrcop;
+    uint8_t old_flags = save_exception_flags(env);
+    uint64_t arg0_sig = extractFloatx80Frac(ST0);
+    int32_t arg0_exp = extractFloatx80Exp(ST0);
+    bool arg0_sign = extractFloatx80Sign(ST0);
+    uint64_t arg1_sig = extractFloatx80Frac(ST1);
+    int32_t arg1_exp = extractFloatx80Exp(ST1);
+    bool arg1_sign = extractFloatx80Sign(ST1);
+
+    if (floatx80_is_signaling_nan(ST0, &env->fp_status)) {
+        float_raise(float_flag_invalid, &env->fp_status);
+        ST1 = floatx80_silence_nan(ST0, &env->fp_status);
+    } else if (floatx80_is_signaling_nan(ST1, &env->fp_status)) {
+        float_raise(float_flag_invalid, &env->fp_status);
+        ST1 = floatx80_silence_nan(ST1, &env->fp_status);
+    } else if (floatx80_invalid_encoding(ST0) ||
+               floatx80_invalid_encoding(ST1)) {
+        float_raise(float_flag_invalid, &env->fp_status);
+        ST1 = floatx80_default_nan(&env->fp_status);
+    } else if (floatx80_is_any_nan(ST0)) {
+        ST1 = ST0;
+    } else if (floatx80_is_any_nan(ST1)) {
+        /* Pass this NaN through. */
+    } else if (floatx80_is_zero(ST1) && !arg0_sign) {
+        /* Pass this zero through. */
+    } else if (((floatx80_is_infinity(ST0) && !floatx80_is_infinity(ST1)) ||
+                arg0_exp - arg1_exp >= 80) &&
+               !arg0_sign) {
+        /*
+         * Dividing ST1 by ST0 gives the correct result up to
+         * rounding, and avoids spurious underflow exceptions that
+         * might result from passing some small values through the
+         * polynomial