Re: [PATCH 00/72] Convert floatx80 and float128 to FloatParts
On 5/13/21 8:33 AM, Alex Bennée wrote:
>> Now add and mul columns are going down when the only change is to muladd? Is this just more noise?
> Running again more times I think it is a real effect:

I don't believe it. If source code for a given function is not changing then the generated code should not change (much, especially with FLATTEN), and thus the runtime should not change (much).

Are you absolutely sure that you're measuring what you think you are measuring? Is your compiler misbehaving somehow and not inlining stuff?

r~
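On the measurement question: in the quad benchmark script posted in this thread, the `add`, `mul` and `muladd` variables are all assigned from `fp-bench add -p quad`, so the three columns look like repeated runs of the same operation. A minimal sketch of a per-operation variant follows. It assumes `fp-bench` accepts `mul` and `muladd` with `-p quad` the same way it does for single and double precision; the function name `quad_bench_row` and the `FP_BENCH` override are made up for illustration.

```shell
#!/bin/sh
# Path to the benchmark binary as used in this thread.
# FP_BENCH is a hypothetical override so the function can be pointed elsewhere.
FP_BENCH=${FP_BENCH:-./tests/fp/fp-bench}

# Print one CSV row: commit,add,mul,muladd,desc
# $1 = commit description, $2 = commit subject
quad_bench_row() {
    commit=$1
    desc=$2
    row=$commit
    # Run each operation once, so every column measures a different op.
    for op in add mul muladd; do
        row="$row,$("$FP_BENCH" "$op" -p quad)"
    done
    printf '%s,%s\n' "$row" "$desc"
}
```

It could be called as `quad_bench_row "$(git describe)" "$(git log --format="format:%s" HEAD^..)"` to produce rows in the same shape as the original table.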
Re: [PATCH 00/72] Convert floatx80 and float128 to FloatParts
Richard Henderson writes:

> Reorg everything using QEMU_GENERIC and multiple inclusion to
> reduce the amount of code duplication between the formats.
>
> The use of QEMU_GENERIC means that we need to use pointers instead
> of structures, which means that even the smaller float formats
> need rearranging.
>
> I've carried it through to completion within fpu/, so that we don't
> have (much) of the legacy code remaining. There is some floatx80
> stuff in target/m68k and target/i386 that's still hanging around.

I'm going to take a break from reviewing this now I've been through about 2/3rds of the patches. Overall I think the series is in great shape and while the performance modulations are interesting they are not a blocker from my point of view. I'll happily take a small hit to performance for a more unified (and correct!) code base. However the frontend maintainers for those affected may take a different view.

--
Alex Bennée
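Repeated runs of the same commit in the quad table differ by up to a few MFlops, so a summary over several runs may be easier to compare than single samples. A minimal sketch, assuming `fp-bench` prints one `<value> MFlops` line per run (as in the results quoted in this thread); the helper name `summarise_mflops` is hypothetical:

```shell
#!/bin/sh
# Read "<value> MFlops" lines on stdin and print count, mean, min and max.
summarise_mflops() {
    awk '
        {
            v = $1 + 0              # first field is the numeric throughput
            sum += v
            n++
            if (n == 1 || v < min) min = v
            if (n == 1 || v > max) max = v
        }
        END {
            if (n > 0)
                printf "n=%d mean=%.2f min=%.2f max=%.2f\n", n, sum / n, min, max
        }
    '
}
```

For example, `for i in 1 2 3 4 5; do ./tests/fp/fp-bench add -p quad; done | summarise_mflops` would give one summary line per commit and operation.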
Re: [PATCH 00/72] Convert floatx80 and float128 to FloatParts
Richard Henderson writes:

> Reorg everything using QEMU_GENERIC and multiple inclusion to
> reduce the amount of code duplication between the formats.
>
> The use of QEMU_GENERIC means that we need to use pointers instead
> of structures, which means that even the smaller float formats
> need rearranging.
>
> I've carried it through to completion within fpu/, so that we don't
> have (much) of the legacy code remaining. There is some floatx80
> stuff in target/m68k and target/i386 that's still hanging around.

OK and here are some quad benchmarks. There is actual change above the noise but I think the biggest hit comes from the parts conversion but we do claw some of it back:

* Run Quad Benchmarks

#+name: run-quad-float-benchmarks
#+begin_src sh :results output table append
commit=$(git describe)
add=$(./tests/fp/fp-bench add -p quad)
mul=$(./tests/fp/fp-bench add -p quad)
muladd=$(./tests/fp/fp-bench add -p quad)
desc=$(git log --format="format:%s" HEAD^..)
echo "$commit,$add,$mul,$muladd,$desc"
#+end_src

#+RESULTS: run-quad-float-benchmarks
| pull-target-arm-20210510-1-91-g0fe775d52c | 90.28 MFlops | 90.15 MFlops | 90.75 MFlops | |
| pull-target-arm-20210510-1-92-gf7a6dabee2 | 90.80 MFlops | 89.92 MFlops | 90.66 MFlops | |
| pull-target-arm-20210510-1-93-gdb71c9fd28 | 88.93 MFlops | 89.10 MFlops | 87.32 MFlops | |
| pull-target-arm-20210510-1-93-gdb71c9fd28 | 88.85 MFlops | 88.83 MFlops | 88.53 MFlops | |
| pull-target-arm-20210510-1-94-g900ea1f79d | 87.10 MFlops | 88.02 MFlops | 88.22 MFlops | |
| pull-target-arm-20210510-1-95-gdb0bb2966f | 88.11 MFlops | 87.10 MFlops | 87.48 MFlops | softfloat: Tidy a * b + inf return |
| pull-target-arm-20210510-1-95-gdb0bb2966f | 87.27 MFlops | 84.86 MFlops | 87.99 MFlops | |
| pull-target-arm-20210510-1-95-gdb0bb2966f | 87.56 MFlops | 88.31 MFlops | 88.41 MFlops | softfloat: Tidy a * b + inf return |
| pull-target-arm-20210510-1-96-gec2be8ad0c | 88.12 MFlops | 88.88 MFlops | 89.09 MFlops | softfloat: Add float_cmask and constants |
| pull-target-arm-20210510-1-97-g2328f560a1 | 91.18 MFlops | 91.84 MFlops | 91.30 MFlops | softfloat: Use return_nan in float_to_float |
| pull-target-arm-20210510-1-97-g2328f560a1 | 90.07 MFlops | 91.16 MFlops | 91.14 MFlops | softfloat: Use return_nan in float_to_float |
| pull-target-arm-20210510-1-98-g89e2096c6f | 87.54 MFlops | 87.71 MFlops | 87.90 MFlops | softfloat: fix return_nan vs default_nan_mode |
| pull-target-arm-20210510-1-98-g89e2096c6f | 87.57 MFlops | 83.80 MFlops | 85.95 MFlops | softfloat: fix return_nan vs default_nan_mode |
| pull-target-arm-20210510-1-99-g67ceccacea | 89.29 MFlops | 87.46 MFlops | 87.40 MFlops | target/mips: Set set_default_nan_mode with set_snan_bit_is_one |
| pull-target-arm-20210510-1-99-g67ceccacea | 88.08 MFlops | 88.54 MFlops | 88.42 MFlops | target/mips: Set set_default_nan_mode with set_snan_bit_is_one |
| pull-target-arm-20210510-1-100-g8064a6d9d9 | 92.41 MFlops | 91.85 MFlops | 92.37 MFlops | softfloat: Do not produce a default_nan from parts_silence_nan |
| pull-target-arm-20210510-1-100-g8064a6d9d9 | 92.00 MFlops | 92.80 MFlops | 93.17 MFlops | softfloat: Do not produce a default_nan from parts_silence_nan |
| pull-target-arm-20210510-1-101-gc303832ddb | 92.27 MFlops | 91.76 MFlops | 91.56 MFlops | softfloat: Rename FloatParts to FloatParts64 |
| pull-target-arm-20210510-1-101-gc303832ddb | 92.64 MFlops | 92.73 MFlops | 92.54 MFlops | |
| pull-target-arm-20210510-1-110-g8c91cc4bfd | 94.34 MFlops | 93.50 MFlops | 94.00 MFlops | softfloat: Use pointers with parts_silence_nan |
| pull-target-arm-20210510-1-111-g039cab1333 | 94.72 MFlops | 95.36 MFlops | 94.67 MFlops | softfloat: Rearrange FloatParts64 |
| pull-target-arm-20210510-1-111-g039cab1333 | 94.55 MFlops | 94.99 MFlops | 95.13 MFlops | |
| pull-target-arm-20210510-1-111-g039cab1333 | 95.55 MFlops | 94.72 MFlops | 95.55 MFlops | |
| pull-target-arm-20210510-1-112-g5de6cec92b | 87.99 MFlops | 87.98 MFlops | 88.64 MFlops | softfloat: Convert float128_silence_nan to parts |
| pull-target-arm-20210510-1-112-g5de6cec92b | 87.20 MFlops | 88.26 MFlops | 88.04 MFlops | softfloat:
Re: [PATCH 00/72] Convert floatx80 and float128 to FloatParts
Richard Henderson writes:

> Reorg everything using QEMU_GENERIC and multiple inclusion to
> reduce the amount of code duplication between the formats.
>
> The use of QEMU_GENERIC means that we need to use pointers instead
> of structures, which means that even the smaller float formats
> need rearranging.

I did some basic benchmarks which show no issues (although I suspect hardfloat is hiding any true cost of the softfloat itself):

#+name: run-float-benchmarks
#+begin_src shell :results output :async
./fp-bench add -p single
./fp-bench add -p double
./fp-bench mul -p single
./fp-bench mul -p double
./fp-bench muladd -p single
./fp-bench muladd -p double
#+end_src

#+RESULTS: run-float-benchmarks-after
: 374.77 MFlops
: 287.58 MFlops
: 371.55 MFlops
: 281.48 MFlops
: 370.76 MFlops
: 287.39 MFlops

#+RESULTS: run-float-benchmarks-before
: 362.40 MFlops
: 278.65 MFlops
: 360.68 MFlops
: 280.92 MFlops
: 360.75 MFlops
: 280.76 MFlops

I guess what would be really telling is if an ext80 benchmark exhibited any slowdown.

>
> I've carried it through to completion within fpu/, so that we don't
> have (much) of the legacy code remaining. There is some floatx80
> stuff in target/m68k and target/i386 that's still hanging around.
>
>
> r~
>
>
> Alex Bennée (1):
>   tests/fp: add quad support to the benchmark utility
>
> Richard Henderson (71):
>   qemu/host-utils: Use __builtin_bitreverseN
>   qemu/host-utils: Add wrappers for overflow builtins
>   qemu/host-utils: Add wrappers for carry builtins
>   accel/tcg: Use add/sub overflow routines in tcg-runtime-gvec.c
>   softfloat: Move the binary point to the msb
>   softfloat: Inline float_raise
>   softfloat: Use float_raise in more places
>   softfloat: Tidy a * b + inf return
>   softfloat: Add float_cmask and constants
>   softfloat: Use return_nan in float_to_float
>   softfloat: fix return_nan vs default_nan_mode
>   target/mips: Set set_default_nan_mode with set_snan_bit_is_one
>   softfloat: Do not produce a default_nan from parts_silence_nan
>   softfloat: Rename FloatParts to FloatParts64
>   softfloat: Move type-specific pack/unpack routines
>   softfloat: Use pointers with parts_default_nan
>   softfloat: Use pointers with unpack_raw
>   softfloat: Use pointers with ftype_unpack_raw
>   softfloat: Use pointers with pack_raw
>   softfloat: Use pointers with ftype_pack_raw
>   softfloat: Use pointers with ftype_unpack_canonical
>   softfloat: Use pointers with ftype_round_pack_canonical
>   softfloat: Use pointers with parts_silence_nan
>   softfloat: Rearrange FloatParts64
>   softfloat: Convert float128_silence_nan to parts
>   softfloat: Convert float128_default_nan to parts
>   softfloat: Move return_nan to softfloat-parts.c.inc
>   softfloat: Move pick_nan to softfloat-parts.c.inc
>   softfloat: Move pick_nan_muladd to softfloat-parts.c.inc
>   softfloat: Move sf_canonicalize to softfloat-parts.c.inc
>   softfloat: Move round_canonical to softfloat-parts.c.inc
>   softfloat: Use uadd64_carry, usub64_borrow in softfloat-macros.h
>   softfloat: Move addsub_floats to softfloat-parts.c.inc
>   softfloat: Implement float128_add/sub via parts
>   softfloat: Move mul_floats to softfloat-parts.c.inc
>   softfloat: Move muladd_floats to softfloat-parts.c.inc
>   softfloat: Use mulu64 for mul64To128
>   softfloat: Use add192 in mul128To256
>   softfloat: Tidy mul128By64To192
>   softfloat: Introduce sh[lr]_double primitives
>   softfloat: Move div_floats to softfloat-parts.c.inc
>   softfloat: Split float_to_float
>   softfloat: Convert float-to-float conversions with float128
>   softfloat: Move round_to_int to softfloat-parts.c.inc
>   softfloat: Move rount_to_int_and_pack to softfloat-parts.c.inc
>   softfloat: Move rount_to_uint_and_pack to softfloat-parts.c.inc
>   softfloat: Move int_to_float to softfloat-parts.c.inc
>   softfloat: Move uint_to_float to softfloat-parts.c.inc
>   softfloat: Move minmax_flags to softfloat-parts.c.inc
>   softfloat: Move compare_floats to softfloat-parts.c.inc
>   softfloat: Move scalbn_decomposed to softfloat-parts.c.inc
>   softfloat: Move sqrt_float to softfloat-parts.c.inc
>   softfloat: Split out parts_uncanon_normal
>   softfloat: Reduce FloatFmt
>   softfloat: Introduce Floatx80RoundPrec
>   softfloat: Adjust parts_uncanon_normal for floatx80
>   tests/fp/fp-test: Reverse order of floatx80 precision tests
>   softfloat: Convert floatx80_add/sub to FloatParts
>   softfloat: Convert floatx80_mul to FloatParts
>   softfloat: Convert floatx80_div to FloatParts
>   softfloat: Convert floatx80_sqrt to FloatParts
>   softfloat: Convert floatx80_round to FloatParts
>   softfloat: Convert floatx80_round_to_int to FloatParts
>   softfloat: Convert integer to floatx80 to FloatParts
>   softfloat: Convert floatx80 float conversions to FloatParts
>   softfloat: Convert floatx80 to integer to FloatParts
>   softfloat: Convert floatx80_scalbn to FloatParts
>   softfloat: Convert floatx80 compare to FloatParts
>
Re: [PATCH 00/72] Convert floatx80 and float128 to FloatParts
On 5/12/21 6:22 AM, Alex Bennée wrote:
> I did note we are missing mulAdd tests but they seem to be missing from the underlying testfloat code as well. I guess we don't care that much for the 80bit code? Is it even used by any architectures?

It's not used by any architecture, so no point in implementing it.

r~
Re: [PATCH 00/72] Convert floatx80 and float128 to FloatParts
Richard Henderson writes:

> On 5/10/21 8:36 AM, Alex Bennée wrote:
>> Richard Henderson writes:
>>
>>> Reorg everything using QEMU_GENERIC and multiple inclusion to
>>> reduce the amount of code duplication between the formats.
>>>
>>> The use of QEMU_GENERIC means that we need to use pointers instead
>>> of structures, which means that even the smaller float formats
>>> need rearranging.
>>>
>>> I've carried it through to completion within fpu/, so that we don't
>>> have (much) of the legacy code remaining. There is some floatx80
>>> stuff in target/m68k and target/i386 that's still hanging around.
>> FWIW I could enable a few more tests...
>
> Ah, thanks for the reminder that these were disabled.
> I'll add this to my patch set for v2.
>
>
>> ...although extF80_lt_quiet still has some failures on equality tests:
>
> This turns out to be a trivial typo in the tester itself:
>
> diff --git a/tests/fp/wrap.c.inc b/tests/fp/wrap.c.inc
> index cb1bb77e4c..9ff884c140 100644
> --- a/tests/fp/wrap.c.inc
> +++ b/tests/fp/wrap.c.inc
> @@ -643,7 +643,7 @@ WRAP_CMP80(qemu_extF80M_eq, floatx80_eq_quiet)
>  WRAP_CMP80(qemu_extF80M_le, floatx80_le)
>  WRAP_CMP80(qemu_extF80M_lt, floatx80_lt)
>  WRAP_CMP80(qemu_extF80M_le_quiet, floatx80_le_quiet)
> -WRAP_CMP80(qemu_extF80M_lt_quiet, floatx80_le_quiet)
> +WRAP_CMP80(qemu_extF80M_lt_quiet, floatx80_lt_quiet)
>  #undef WRAP_CMP80
>
>  #define WRAP_CMP128(name, func)

\o/

I did note we are missing mulAdd tests but they seem to be missing from the underlying testfloat code as well. I guess we don't care that much for the 80bit code? Is it even used by any architectures?

--
Alex Bennée
Re: [PATCH 00/72] Convert floatx80 and float128 to FloatParts
On 5/10/21 8:36 AM, Alex Bennée wrote:
> Richard Henderson writes:
>
>> Reorg everything using QEMU_GENERIC and multiple inclusion to
>> reduce the amount of code duplication between the formats.
>>
>> The use of QEMU_GENERIC means that we need to use pointers instead
>> of structures, which means that even the smaller float formats
>> need rearranging.
>>
>> I've carried it through to completion within fpu/, so that we don't
>> have (much) of the legacy code remaining. There is some floatx80
>> stuff in target/m68k and target/i386 that's still hanging around.
> FWIW I could enable a few more tests...

Ah, thanks for the reminder that these were disabled.
I'll add this to my patch set for v2.

> ...although extF80_lt_quiet still has some failures on equality tests:

This turns out to be a trivial typo in the tester itself:

diff --git a/tests/fp/wrap.c.inc b/tests/fp/wrap.c.inc
index cb1bb77e4c..9ff884c140 100644
--- a/tests/fp/wrap.c.inc
+++ b/tests/fp/wrap.c.inc
@@ -643,7 +643,7 @@ WRAP_CMP80(qemu_extF80M_eq, floatx80_eq_quiet)
 WRAP_CMP80(qemu_extF80M_le, floatx80_le)
 WRAP_CMP80(qemu_extF80M_lt, floatx80_lt)
 WRAP_CMP80(qemu_extF80M_le_quiet, floatx80_le_quiet)
-WRAP_CMP80(qemu_extF80M_lt_quiet, floatx80_le_quiet)
+WRAP_CMP80(qemu_extF80M_lt_quiet, floatx80_lt_quiet)
 #undef WRAP_CMP80

 #define WRAP_CMP128(name, func)

r~
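The typo explains why the reported failures were all on equal operands: a less-than-or-equal comparison returns 1 for equal inputs, whereas less-than should return 0. A small illustration using awk's host floating-point comparisons as a stand-in (an assumption for illustration only; this is not floatx80 and does not call QEMU's softfloat, and `cmp_pair` is a made-up helper name):

```shell
#!/bin/sh
# Print the results of <= and < for two numeric operands.
# Uses awk's native floating-point numbers, not QEMU's floatx80 routines.
cmp_pair() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        x = a + 0               # force numeric interpretation
        y = b + 0
        printf "le=%d lt=%d\n", (x <= y), (x < y)
    }'
}
```

For equal operands, for instance `cmp_pair 1.5 1.5`, the two results differ, which is the "=> 1, expected 0" pattern a test harness would see if the less-than test were wired to the less-than-or-equal function.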
Re: [PATCH 00/72] Convert floatx80 and float128 to FloatParts
Richard Henderson writes:

> Reorg everything using QEMU_GENERIC and multiple inclusion to
> reduce the amount of code duplication between the formats.
>
> The use of QEMU_GENERIC means that we need to use pointers instead
> of structures, which means that even the smaller float formats
> need rearranging.
>
> I've carried it through to completion within fpu/, so that we don't
> have (much) of the legacy code remaining. There is some floatx80
> stuff in target/m68k and target/i386 that's still hanging around.

FWIW I could enable a few more tests although extF80_lt_quiet still has some failures on equality tests:

./tests/fp/fp-test -l 1 -r all extF80_lt_quiet
>> Testing extF80_lt_quiet
46464 tests total.
Errors found in extF80_lt_quiet:
+. +. => 1 . expected 0 .
+. -. => 1 . expected 0 .
+.0001 +.0001 => 1 . expected 0 .
+.7FFF +.7FFF => 1 . expected 0 .
+.7FFE +.7FFE => 1 . expected 0 .
+0001.8000 +0001.8000 => 1 . expected 0 .
+0001.8001 +0001.8001 => 1 . expected 0 .
+0001. +0001. => 1 . expected 0 .
+0001.FFFE +0001.FFFE => 1 . expected 0 .
+3FBF.8000 +3FBF.8000 => 1 . expected 0 .
+3FBF.8001 +3FBF.8001 => 1 . expected 0 .
+3FBF. +3FBF. => 1 . expected 0 .
+3FBF.FFFE +3FBF.FFFE => 1 . expected 0 .
+3FFD.8000 +3FFD.8000 => 1 . expected 0 .
+3FFD.8001 +3FFD.8001 => 1 . expected 0 .
+3FFD. +3FFD. => 1 . expected 0 .
+3FFD.FFFE +3FFD.FFFE => 1 . expected 0 .
+3FFE.8000 +3FFE.8000 => 1 . expected 0 .
+3FFE.8001 +3FFE.8001 => 1 . expected 0 .
+3FFE. +3FFE. => 1 . expected 0 .
9618 tests performed; 20 errors found.

However the rest can be enabled:

tests/fp: enable more tests

Signed-off-by: Alex Bennée

1 file changed, 6 insertions(+), 6 deletions(-)
tests/fp/meson.build | 12 ++--

modified tests/fp/meson.build
@@ -556,7 +556,9 @@ softfloat_conv_tests = {
 'extF80_to_f64 extF80_to_f128 ' +
 'f128_to_f16',
 'int-to-float': 'i32_to_f16 i64_to_f16 i32_to_f32 i64_to_f32 ' +
-'i32_to_f64 i64_to_f64 i32_to_f128 i64_to_f128',
+'i32_to_f64 i64_to_f64 ' +
+'i32_to_extF80 i64_to_extF80 ' +
+'i32_to_f128 i64_to_f128',
 'uint-to-float': 'ui32_to_f16 ui64_to_f16 ui32_to_f32 ui64_to_f32 ' +
 'ui32_to_f64 ui64_to_f64 ui64_to_f128 ' +
 'ui32_to_extF80 ui64_to_extF80',
@@ -581,7 +583,7 @@ softfloat_conv_tests = {
 'extF80_to_ui64 extF80_to_ui64_r_minMag ' +
 'f128_to_ui64 f128_to_ui64_r_minMag',
 'round-to-integer': 'f16_roundToInt f32_roundToInt ' +
-'f64_roundToInt f128_roundToInt'
+'f64_roundToInt extF80_roundToInt f128_roundToInt'
 }
 softfloat_tests = {
 'eq_signaling' : 'compare',
@@ -602,18 +604,16 @@ fptest_args = ['-s', '-l', '1']
 fptest_rounding_args = ['-r', 'all']

 # Conversion Routines:
-# FIXME: i32_to_extF80 (broken), i64_to_extF80 (broken)
-#extF80_roundToInt (broken)
 foreach k, v : softfloat_conv_tests
 test('fp-test-' + k, fptest,
 args: fptest_args + fptest_rounding_args + v.split(),
 suite: ['softfloat', 'softfloat-conv'])
 endforeach

-# FIXME: extF80_{lt_quiet, rem} (broken),
+# FIXME: extF80_{lt_quiet} (broken),
 #extF80_{mulAdd} (missing)
 foreach k, v : softfloat_tests
- extF80_broken = ['lt_quiet', 'rem'].contains(k)
+ extF80_broken = ['lt_quiet'].contains(k)
 test('fp-test-' + k, fptest,
 args: fptest_args + fptest_rounding_args +
 ['f16_' + k, 'f32_' + k, 'f64_' + k, 'f128_' + k] +

--
Alex Bennée
Re: [PATCH 00/72] Convert floatx80 and float128 to FloatParts
Patchew URL: https://patchew.org/QEMU/20210508014802.892561-1-richard.hender...@linaro.org/

Hi,

This series seems to have some coding style problems. See output below for more information:

Type: series
Message-id: 20210508014802.892561-1-richard.hender...@linaro.org
Subject: [PATCH 00/72] Convert floatx80 and float128 to FloatParts

=== TEST SCRIPT BEGIN ===
#!/bin/bash
git rev-parse base > /dev/null || exit 0
git config --local diff.renamelimit 0
git config --local diff.renames True
git config --local diff.algorithm histogram
./scripts/checkpatch.pl --mailback base..
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
From https://github.com/patchew-project/qemu
 * [new tag] patchew/20210508014802.892561-1-richard.hender...@linaro.org -> patchew/20210508014802.892561-1-richard.hender...@linaro.org
Switched to a new branch 'test'
2dcaa72 softfloat: Convert modrem operations to FloatParts
2956f6c softfloat: Move floatN_log2 to softfloat-parts.c.inc
b786038 softfloat: Convert float32_exp2 to FloatParts
01ab7b4 softfloat: Convert floatx80 compare to FloatParts
d55b655 softfloat: Convert floatx80_scalbn to FloatParts
0b1869b softfloat: Convert floatx80 to integer to FloatParts
6287e68 softfloat: Convert floatx80 float conversions to FloatParts
5c28030 softfloat: Convert integer to floatx80 to FloatParts
e686e7d softfloat: Convert floatx80_round_to_int to FloatParts
9e7a606 softfloat: Convert floatx80_round to FloatParts
d19c872 softfloat: Convert floatx80_sqrt to FloatParts
06addff softfloat: Convert floatx80_div to FloatParts
6974166 softfloat: Convert floatx80_mul to FloatParts
0ab65d7 softfloat: Convert floatx80_add/sub to FloatParts
5ae98cd tests/fp/fp-test: Reverse order of floatx80 precision tests
dc5eada softfloat: Adjust parts_uncanon_normal for floatx80
0f4d3ce softfloat: Introduce Floatx80RoundPrec
aecc38b softfloat: Reduce FloatFmt
d8342e2 softfloat: Split out parts_uncanon_normal
e8a3234 softfloat: Move sqrt_float to softfloat-parts.c.inc
db066f3 softfloat: Move scalbn_decomposed to softfloat-parts.c.inc
870c56c softfloat: Move compare_floats to softfloat-parts.c.inc
4895586 softfloat: Move minmax_flags to softfloat-parts.c.inc
c9e01de softfloat: Move uint_to_float to softfloat-parts.c.inc
0b58632 softfloat: Move int_to_float to softfloat-parts.c.inc
e408986 softfloat: Move rount_to_uint_and_pack to softfloat-parts.c.inc
1330f41 softfloat: Move rount_to_int_and_pack to softfloat-parts.c.inc
2d631f0 softfloat: Move round_to_int to softfloat-parts.c.inc
45abdca softfloat: Convert float-to-float conversions with float128
0ddbdfd softfloat: Split float_to_float
296ad69 softfloat: Move div_floats to softfloat-parts.c.inc
d5ff786 softfloat: Introduce sh[lr]_double primitives
d227707 softfloat: Tidy mul128By64To192
5a9c124 softfloat: Use add192 in mul128To256
86d8505 softfloat: Use mulu64 for mul64To128
1186174 softfloat: Move muladd_floats to softfloat-parts.c.inc
5e4de79 softfloat: Move mul_floats to softfloat-parts.c.inc
1450eec softfloat: Implement float128_add/sub via parts
ed27dea softfloat: Move addsub_floats to softfloat-parts.c.inc
3fe4a87 softfloat: Use uadd64_carry, usub64_borrow in softfloat-macros.h
5687e8b softfloat: Move round_canonical to softfloat-parts.c.inc
ddf7181 softfloat: Move sf_canonicalize to softfloat-parts.c.inc
49b211c softfloat: Move pick_nan_muladd to softfloat-parts.c.inc
d6fae5a softfloat: Move pick_nan to softfloat-parts.c.inc
ada8c0f softfloat: Move return_nan to softfloat-parts.c.inc
701d4e6 softfloat: Convert float128_default_nan to parts
735bd6f softfloat: Convert float128_silence_nan to parts
be031db softfloat: Rearrange FloatParts64
997cce0 softfloat: Use pointers with parts_silence_nan
759543e softfloat: Use pointers with ftype_round_pack_canonical
59155e0 softfloat: Use pointers with ftype_unpack_canonical
67f866d softfloat: Use pointers with ftype_pack_raw
ad2e600 softfloat: Use pointers with pack_raw
6725bec softfloat: Use pointers with ftype_unpack_raw
6fa54f0 softfloat: Use pointers with unpack_raw
36916c3 softfloat: Use pointers with parts_default_nan
e255c56 softfloat: Move type-specific pack/unpack routines
17cab05 softfloat: Rename FloatParts to FloatParts64
7973d6d softfloat: Do not produce a default_nan from parts_silence_nan
7267830 target/mips: Set set_default_nan_mode with set_snan_bit_is_one
8acc1a9 softfloat: fix return_nan vs default_nan_mode
5c700cd softfloat: Use return_nan in float_to_float
e6a00e2 softfloat: Add float_cmask and constants
689fe83 softfloat: Tidy a * b + inf return
75285a4 softfloat: Use float_raise in more places
a8ea262 softfloat: Inline float_raise
3be68a5 softfloat: Move the binary point to the msb
e2236e1 tests/fp: add quad support to the benchmark utility
be7703a accel/tcg: Use add/sub overflow routines in tcg-runtime-gvec.c
3731829 qemu/host-utils: Add wrappers for carry builtins
a1367bc qemu/host-utils: Add wrappers for overflow builtins
96efc5c qemu/host-utils: Use