Re: [PATCH 00/72] Convert floatx80 and float128 to FloatParts

2021-05-13 Thread Richard Henderson

On 5/13/21 8:33 AM, Alex Bennée wrote:

>> Now add and mul columns are going down when the only change is to
>> muladd?  Is this just more noise?
>
> Running again more times I think it is a real effect:


I don't believe it.  If the source code for a given function is not changing,
then the generated code should not change (much, especially with FLATTEN), and
thus the runtime should not change (much).


Are you absolutely sure that you're measuring what you think you are measuring?

Is your compiler mis-behaving somehow and not inlining stuff?
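
One way to separate a real effect from noise is to repeat each measurement and compare the spread with the difference being claimed. Repeated runs of a single commit in the quad table elsewhere in this thread already differ by several MFlops (for example 84.86 vs 88.31 for the same commit). The following is a minimal sketch, not part of the series; `repeat_bench` and `mflops_stats` are hypothetical helper names, and the input is assumed to be lines of the form `90.28 MFlops` as in the posted results.

```shell
#!/bin/sh
# Illustrative helpers; the names are hypothetical and not part of QEMU.

# repeat_bench N CMD...: run CMD N times, passing its output through.
repeat_bench() {
    runs=$1; shift
    i=0
    while [ "$i" -lt "$runs" ]; do
        "$@"
        i=$((i + 1))
    done
}

# mflops_stats: read lines such as "90.28 MFlops" on stdin and print the
# sample count, mean and sample standard deviation of the first field.
mflops_stats() {
    awk '
    NF { n++; sum += $1; sumsq += $1 * $1 }
    END {
        if (n == 0) exit 1
        mean = sum / n
        var = (n > 1) ? (sumsq - n * mean * mean) / (n - 1) : 0
        if (var < 0) var = 0    # guard against floating-point rounding
        printf "n=%d mean=%.2f sd=%.2f\n", n, mean, sqrt(var)
    }'
}
```

For example, `repeat_bench 10 ./tests/fp/fp-bench add -p quad | mflops_stats` reuses the invocation form from the thread. A difference between two commits that is smaller than a couple of standard deviations is hard to distinguish from noise.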


r~



Re: [PATCH 00/72] Convert floatx80 and float128 to FloatParts

2021-05-13 Thread Alex Bennée


Richard Henderson  writes:

> Reorg everything using QEMU_GENERIC and multiple inclusion to
> reduce the amount of code duplication between the formats.
>
> The use of QEMU_GENERIC means that we need to use pointers instead
> of structures, which means that even the smaller float formats
> need rearranging.
>
> I've carried it through to completion within fpu/, so that we don't
> have (much) of the legacy code remaining.  There is some floatx80
> stuff in target/m68k and target/i386 that's still hanging around.

I'm going to take a break from reviewing this now that I've been through
about two-thirds of the patches. Overall I think the series is in great
shape, and while the performance variations are interesting, they are not
a blocker from my point of view. I'll happily take a small hit to
performance for a more unified (and correct!) code base. However, the
maintainers of the affected frontends may take a different view.

-- 
Alex Bennée



Re: [PATCH 00/72] Convert floatx80 and float128 to FloatParts

2021-05-13 Thread Alex Bennée


Richard Henderson  writes:

> On 5/12/21 2:23 PM, Alex Bennée wrote:
>> Richard Henderson  writes:
>> 
>>> Reorg everything using QEMU_GENERIC and multiple inclusion to
>>> reduce the amount of code duplication between the formats.
>>>
>>> The use of QEMU_GENERIC means that we need to use pointers instead
>>> of structures, which means that even the smaller float formats
>>> need rearranging.
>>>
>>> I've carried it through to completion within fpu/, so that we don't
>>> have (much) of the legacy code remaining.  There is some floatx80
>>> stuff in target/m68k and target/i386 that's still hanging around.
>> OK, and here are some quad benchmarks. There is an actual change above
>> the noise; I think the biggest hit comes from the parts conversion, but
>> we do claw some of it back:
>> * Run Quad Benchmarks
>> #+name: run-quad-float-benchmarks
>> #+begin_src sh :results output table append
>>    commit=$(git describe)
>>    add=$(./tests/fp/fp-bench add -p quad)
>>    mul=$(./tests/fp/fp-bench add -p quad)
>>    muladd=$(./tests/fp/fp-bench add -p quad)
>>    desc=$(git log --format="format:%s" HEAD^..)
>>    echo "$commit,$add,$mul,$muladd,$desc"
>> #+end_src
>> #+RESULTS: run-quad-float-benchmarks
>> | commit | add | mul | muladd | subject |
>> |---+---+---+---+---|
>> | pull-target-arm-20210510-1-91-g0fe775d52c  | 90.28 MFlops | 90.15 MFlops | 90.75 MFlops | |
>> | pull-target-arm-20210510-1-92-gf7a6dabee2  | 90.80 MFlops | 89.92 MFlops | 90.66 MFlops | |
>> | pull-target-arm-20210510-1-93-gdb71c9fd28  | 88.93 MFlops | 89.10 MFlops | 87.32 MFlops | |
>> | pull-target-arm-20210510-1-93-gdb71c9fd28  | 88.85 MFlops | 88.83 MFlops | 88.53 MFlops | |
>> | pull-target-arm-20210510-1-94-g900ea1f79d  | 87.10 MFlops | 88.02 MFlops | 88.22 MFlops | |
>> | pull-target-arm-20210510-1-95-gdb0bb2966f  | 88.11 MFlops | 87.10 MFlops | 87.48 MFlops | softfloat: Tidy a * b + inf return |
>> | pull-target-arm-20210510-1-95-gdb0bb2966f  | 87.27 MFlops | 84.86 MFlops | 87.99 MFlops | |
>> | pull-target-arm-20210510-1-95-gdb0bb2966f  | 87.56 MFlops | 88.31 MFlops | 88.41 MFlops | softfloat: Tidy a * b + inf return |
>> | pull-target-arm-20210510-1-96-gec2be8ad0c  | 88.12 MFlops | 88.88 MFlops | 89.09 MFlops | softfloat: Add float_cmask and constants |
>> | pull-target-arm-20210510-1-97-g2328f560a1  | 91.18 MFlops | 91.84 MFlops | 91.30 MFlops | softfloat: Use return_nan in float_to_float |
>> | pull-target-arm-20210510-1-97-g2328f560a1  | 90.07 MFlops | 91.16 MFlops | 91.14 MFlops | softfloat: Use return_nan in float_to_float |
>> | pull-target-arm-20210510-1-98-g89e2096c6f  | 87.54 MFlops | 87.71 MFlops | 87.90 MFlops | softfloat: fix return_nan vs default_nan_mode |
>> | pull-target-arm-20210510-1-98-g89e2096c6f  | 87.57 MFlops | 83.80 MFlops | 85.95 MFlops | softfloat: fix return_nan vs default_nan_mode |
>> | pull-target-arm-20210510-1-99-g67ceccacea  | 89.29 MFlops | 87.46 MFlops | 87.40 MFlops | target/mips: Set set_default_nan_mode with set_snan_bit_is_one |
>> | pull-target-arm-20210510-1-99-g67ceccacea  | 88.08 MFlops | 88.54 MFlops | 88.42 MFlops | target/mips: Set set_default_nan_mode with set_snan_bit_is_one |
>> | pull-target-arm-20210510-1-100-g8064a6d9d9 | 92.41 MFlops | 91.85 MFlops | 92.37 MFlops | softfloat: Do not produce a default_nan from parts_silence_nan |
>> | pull-target-arm-20210510-1-100-g8064a6d9d9 | 92.00 MFlops | 92.80 MFlops | 93.17 MFlops | softfloat: Do not produce a default_nan from parts_silence_nan |
>> | pull-target-arm-20210510-1-101-gc303832ddb | 92.27 MFlops | 91.76 MFlops | 91.56 MFlops | softfloat: Rename FloatParts to FloatParts64 |
>> | pull-target-arm-20210510-1-101-gc303832ddb | 92.64 MFlops | 92.73 MFlops | 92.54 MFlops | |
>> | pull-target-arm-20210510-1-110-g8c91cc4bfd | 94.34 MFlops | 93.50 MFlops | 94.00 MFlops | softfloat: Use pointers with parts_silence_nan |
>> | pull-target-arm-20210510-1-111-g039cab1333 | 94.72 MFlops | 95.36 MFlops | 94.67 MFlops | softfloat: Rearrange FloatParts64 |
>> | pull-target-arm-20210510-1-111-g039cab1333 | 94.55 MFlops | 94.99 MFlops | 95.13 MFlops | |
>> | pull-target-arm-20210510-1-111-g039cab1333 | 95.55 

Re: [PATCH 00/72] Convert floatx80 and float128 to FloatParts

2021-05-13 Thread Richard Henderson

On 5/12/21 2:23 PM, Alex Bennée wrote:


> Richard Henderson  writes:
>
>> Reorg everything using QEMU_GENERIC and multiple inclusion to
>> reduce the amount of code duplication between the formats.
>>
>> The use of QEMU_GENERIC means that we need to use pointers instead
>> of structures, which means that even the smaller float formats
>> need rearranging.
>>
>> I've carried it through to completion within fpu/, so that we don't
>> have (much) of the legacy code remaining.  There is some floatx80
>> stuff in target/m68k and target/i386 that's still hanging around.
>
> OK, and here are some quad benchmarks. There is an actual change above
> the noise; I think the biggest hit comes from the parts conversion, but
> we do claw some of it back:
>
> * Run Quad Benchmarks
>
> #+name: run-quad-float-benchmarks
> #+begin_src sh :results output table append
>    commit=$(git describe)
>    add=$(./tests/fp/fp-bench add -p quad)
>    mul=$(./tests/fp/fp-bench add -p quad)
>    muladd=$(./tests/fp/fp-bench add -p quad)
>    desc=$(git log --format="format:%s" HEAD^..)
>    echo "$commit,$add,$mul,$muladd,$desc"
> #+end_src
>
> #+RESULTS: run-quad-float-benchmarks
> | commit | add | mul | muladd | subject |
> |---+---+---+---+---|
> | pull-target-arm-20210510-1-91-g0fe775d52c  | 90.28 MFlops | 90.15 MFlops | 90.75 MFlops | |
> | pull-target-arm-20210510-1-92-gf7a6dabee2  | 90.80 MFlops | 89.92 MFlops | 90.66 MFlops | |
> | pull-target-arm-20210510-1-93-gdb71c9fd28  | 88.93 MFlops | 89.10 MFlops | 87.32 MFlops | |
> | pull-target-arm-20210510-1-93-gdb71c9fd28  | 88.85 MFlops | 88.83 MFlops | 88.53 MFlops | |
> | pull-target-arm-20210510-1-94-g900ea1f79d  | 87.10 MFlops | 88.02 MFlops | 88.22 MFlops | |
> | pull-target-arm-20210510-1-95-gdb0bb2966f  | 88.11 MFlops | 87.10 MFlops | 87.48 MFlops | softfloat: Tidy a * b + inf return |
> | pull-target-arm-20210510-1-95-gdb0bb2966f  | 87.27 MFlops | 84.86 MFlops | 87.99 MFlops | |
> | pull-target-arm-20210510-1-95-gdb0bb2966f  | 87.56 MFlops | 88.31 MFlops | 88.41 MFlops | softfloat: Tidy a * b + inf return |
> | pull-target-arm-20210510-1-96-gec2be8ad0c  | 88.12 MFlops | 88.88 MFlops | 89.09 MFlops | softfloat: Add float_cmask and constants |
> | pull-target-arm-20210510-1-97-g2328f560a1  | 91.18 MFlops | 91.84 MFlops | 91.30 MFlops | softfloat: Use return_nan in float_to_float |
> | pull-target-arm-20210510-1-97-g2328f560a1  | 90.07 MFlops | 91.16 MFlops | 91.14 MFlops | softfloat: Use return_nan in float_to_float |
> | pull-target-arm-20210510-1-98-g89e2096c6f  | 87.54 MFlops | 87.71 MFlops | 87.90 MFlops | softfloat: fix return_nan vs default_nan_mode |
> | pull-target-arm-20210510-1-98-g89e2096c6f  | 87.57 MFlops | 83.80 MFlops | 85.95 MFlops | softfloat: fix return_nan vs default_nan_mode |
> | pull-target-arm-20210510-1-99-g67ceccacea  | 89.29 MFlops | 87.46 MFlops | 87.40 MFlops | target/mips: Set set_default_nan_mode with set_snan_bit_is_one |
> | pull-target-arm-20210510-1-99-g67ceccacea  | 88.08 MFlops | 88.54 MFlops | 88.42 MFlops | target/mips: Set set_default_nan_mode with set_snan_bit_is_one |
> | pull-target-arm-20210510-1-100-g8064a6d9d9 | 92.41 MFlops | 91.85 MFlops | 92.37 MFlops | softfloat: Do not produce a default_nan from parts_silence_nan |
> | pull-target-arm-20210510-1-100-g8064a6d9d9 | 92.00 MFlops | 92.80 MFlops | 93.17 MFlops | softfloat: Do not produce a default_nan from parts_silence_nan |
> | pull-target-arm-20210510-1-101-gc303832ddb | 92.27 MFlops | 91.76 MFlops | 91.56 MFlops | softfloat: Rename FloatParts to FloatParts64 |
> | pull-target-arm-20210510-1-101-gc303832ddb | 92.64 MFlops | 92.73 MFlops | 92.54 MFlops | |
> | pull-target-arm-20210510-1-110-g8c91cc4bfd | 94.34 MFlops | 93.50 MFlops | 94.00 MFlops | softfloat: Use pointers with parts_silence_nan |
> | pull-target-arm-20210510-1-111-g039cab1333 | 94.72 MFlops | 95.36 MFlops | 94.67 MFlops | softfloat: Rearrange FloatParts64 |
> | pull-target-arm-20210510-1-111-g039cab1333 | 94.55 MFlops | 94.99 MFlops | 95.13 MFlops | |
> | pull-target-arm-20210510-1-111-g039cab1333 | 95.55 MFlops | 94.72 MFlops | 95.55 MFlops | |
> | pull-target-arm-20210510-1-112-g5de6cec92b | 87.99 MFlops | 87.98 MFlops | 88.64 MFlops | softfloat: Convert float128_silence_nan to parts |
> | pull-target-arm-20210510-1-112-g5de6cec92b | 87.20 MFlops | 88.26 MFlops 

Re: [PATCH 00/72] Convert floatx80 and float128 to FloatParts

2021-05-12 Thread Alex Bennée


Richard Henderson  writes:

> Reorg everything using QEMU_GENERIC and multiple inclusion to
> reduce the amount of code duplication between the formats.
>
> The use of QEMU_GENERIC means that we need to use pointers instead
> of structures, which means that even the smaller float formats
> need rearranging.
>
> I've carried it through to completion within fpu/, so that we don't
> have (much) of the legacy code remaining.  There is some floatx80
> stuff in target/m68k and target/i386 that's still hanging around.

OK, and here are some quad benchmarks. There is an actual change above the
noise; I think the biggest hit comes from the parts conversion, but we
do claw some of it back:

* Run Quad Benchmarks

#+name: run-quad-float-benchmarks
#+begin_src sh :results output table append
  commit=$(git describe)
  add=$(./tests/fp/fp-bench add -p quad)
  mul=$(./tests/fp/fp-bench add -p quad)
  muladd=$(./tests/fp/fp-bench add -p quad)
  desc=$(git log --format="format:%s" HEAD^..)
  echo "$commit,$add,$mul,$muladd,$desc"
#+end_src
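
As written, all three variables in the block above capture the output of the same `fp-bench add -p quad` invocation, so the three result columns are repeated measurements of one operation. A minimal sketch of a variant that varies the operation per column follows. `BENCH` and `bench_row` are hypothetical names, not part of the QEMU tree. Passing the operation name positionally simply mirrors the original invocations; whether fp-bench honours it that way, rather than through a dedicated option, is an assumption to check against its usage output.

```shell
#!/bin/sh
# Sketch: emit one CSV row "commit,<one result per op>,subject" for HEAD.
# BENCH and bench_row are illustrative names, not part of the QEMU tree.
BENCH=${BENCH:-./tests/fp/fp-bench}

# bench_row PRECISION OP...: run the benchmark once per OP.
bench_row() {
    precision=$1; shift
    row=$(git describe 2>/dev/null || echo unknown)
    for op in "$@"; do
        # Operation passed positionally, as in the original script
        # (assumption: fp-bench selects the operation this way).
        row="$row,$("$BENCH" "$op" -p "$precision")"
    done
    desc=$(git log -1 --format=%s 2>/dev/null || true)
    printf '%s,%s\n' "$row" "$desc"
}
```

Calling `bench_row quad add mul muladd` yields a row with the same shape as the `echo` line above, with each column produced by a different operation name.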

#+RESULTS: run-quad-float-benchmarks
| commit | add | mul | muladd | subject |
|---+---+---+---+---|
| pull-target-arm-20210510-1-91-g0fe775d52c  | 90.28 MFlops | 90.15 MFlops | 90.75 MFlops | |
| pull-target-arm-20210510-1-92-gf7a6dabee2  | 90.80 MFlops | 89.92 MFlops | 90.66 MFlops | |
| pull-target-arm-20210510-1-93-gdb71c9fd28  | 88.93 MFlops | 89.10 MFlops | 87.32 MFlops | |
| pull-target-arm-20210510-1-93-gdb71c9fd28  | 88.85 MFlops | 88.83 MFlops | 88.53 MFlops | |
| pull-target-arm-20210510-1-94-g900ea1f79d  | 87.10 MFlops | 88.02 MFlops | 88.22 MFlops | |
| pull-target-arm-20210510-1-95-gdb0bb2966f  | 88.11 MFlops | 87.10 MFlops | 87.48 MFlops | softfloat: Tidy a * b + inf return |
| pull-target-arm-20210510-1-95-gdb0bb2966f  | 87.27 MFlops | 84.86 MFlops | 87.99 MFlops | |
| pull-target-arm-20210510-1-95-gdb0bb2966f  | 87.56 MFlops | 88.31 MFlops | 88.41 MFlops | softfloat: Tidy a * b + inf return |
| pull-target-arm-20210510-1-96-gec2be8ad0c  | 88.12 MFlops | 88.88 MFlops | 89.09 MFlops | softfloat: Add float_cmask and constants |
| pull-target-arm-20210510-1-97-g2328f560a1  | 91.18 MFlops | 91.84 MFlops | 91.30 MFlops | softfloat: Use return_nan in float_to_float |
| pull-target-arm-20210510-1-97-g2328f560a1  | 90.07 MFlops | 91.16 MFlops | 91.14 MFlops | softfloat: Use return_nan in float_to_float |
| pull-target-arm-20210510-1-98-g89e2096c6f  | 87.54 MFlops | 87.71 MFlops | 87.90 MFlops | softfloat: fix return_nan vs default_nan_mode |
| pull-target-arm-20210510-1-98-g89e2096c6f  | 87.57 MFlops | 83.80 MFlops | 85.95 MFlops | softfloat: fix return_nan vs default_nan_mode |
| pull-target-arm-20210510-1-99-g67ceccacea  | 89.29 MFlops | 87.46 MFlops | 87.40 MFlops | target/mips: Set set_default_nan_mode with set_snan_bit_is_one |
| pull-target-arm-20210510-1-99-g67ceccacea  | 88.08 MFlops | 88.54 MFlops | 88.42 MFlops | target/mips: Set set_default_nan_mode with set_snan_bit_is_one |
| pull-target-arm-20210510-1-100-g8064a6d9d9 | 92.41 MFlops | 91.85 MFlops | 92.37 MFlops | softfloat: Do not produce a default_nan from parts_silence_nan |
| pull-target-arm-20210510-1-100-g8064a6d9d9 | 92.00 MFlops | 92.80 MFlops | 93.17 MFlops | softfloat: Do not produce a default_nan from parts_silence_nan |
| pull-target-arm-20210510-1-101-gc303832ddb | 92.27 MFlops | 91.76 MFlops | 91.56 MFlops | softfloat: Rename FloatParts to FloatParts64 |
| pull-target-arm-20210510-1-101-gc303832ddb | 92.64 MFlops | 92.73 MFlops | 92.54 MFlops | |
| pull-target-arm-20210510-1-110-g8c91cc4bfd | 94.34 MFlops | 93.50 MFlops | 94.00 MFlops | softfloat: Use pointers with parts_silence_nan |
| pull-target-arm-20210510-1-111-g039cab1333 | 94.72 MFlops | 95.36 MFlops | 94.67 MFlops | softfloat: Rearrange FloatParts64 |
| pull-target-arm-20210510-1-111-g039cab1333 | 94.55 MFlops | 94.99 MFlops | 95.13 MFlops | |
| pull-target-arm-20210510-1-111-g039cab1333 | 95.55 MFlops | 94.72 MFlops | 95.55 MFlops | |
| pull-target-arm-20210510-1-112-g5de6cec92b | 87.99 MFlops | 87.98 MFlops | 88.64 MFlops | softfloat: Convert float128_silence_nan to parts |
| pull-target-arm-20210510-1-112-g5de6cec92b | 87.20 MFlops | 88.26 MFlops | 
88.04 MFlops | softfloat: 

Re: [PATCH 00/72] Convert floatx80 and float128 to FloatParts

2021-05-12 Thread Alex Bennée


Richard Henderson  writes:

> Reorg everything using QEMU_GENERIC and multiple inclusion to
> reduce the amount of code duplication between the formats.
>
> The use of QEMU_GENERIC means that we need to use pointers instead
> of structures, which means that even the smaller float formats
> need rearranging.

I did some basic benchmarks, which show no issues (although I
suspect hardfloat is hiding any true cost of the softfloat itself):

#+name: run-float-benchmarks
#+begin_src shell :results output :async
  ./fp-bench add -p single
  ./fp-bench add -p double
  ./fp-bench mul -p single
  ./fp-bench mul -p double
  ./fp-bench muladd -p single
  ./fp-bench muladd -p double
#+end_src

#+RESULTS: run-float-benchmarks-after
: 374.77 MFlops
: 287.58 MFlops
: 371.55 MFlops
: 281.48 MFlops
: 370.76 MFlops
: 287.39 MFlops

#+RESULTS: run-float-benchmarks-before
: 362.40 MFlops
: 278.65 MFlops
: 360.68 MFlops
: 280.92 MFlops
: 360.75 MFlops
: 280.76 MFlops
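
The before and after figures are easier to judge as a relative change. A minimal sketch, where `pct_change` is a hypothetical helper name and the arguments are the bare MFlops numbers from the two result blocks:

```shell
#!/bin/sh
# pct_change BEFORE AFTER: print the change of AFTER relative to BEFORE
# as a signed percentage. Illustrative helper, not part of QEMU.
pct_change() {
    awk -v before="$1" -v after="$2" 'BEGIN {
        printf "%+.2f%%\n", (after - before) / before * 100
    }'
}
```

For the single-precision add figures, `pct_change 362.40 374.77` prints `+3.41%`.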

I guess what would be really telling is whether an ext80 benchmark
exhibited any slowdown.

>
> I've carried it through to completion within fpu/, so that we don't
> have (much) of the legacy code remaining.  There is some floatx80
> stuff in target/m68k and target/i386 that's still hanging around.
>
>
> r~
>
>
> Alex Bennée (1):
>   tests/fp: add quad support to the benchmark utility
>
> Richard Henderson (71):
>   qemu/host-utils: Use __builtin_bitreverseN
>   qemu/host-utils: Add wrappers for overflow builtins
>   qemu/host-utils: Add wrappers for carry builtins
>   accel/tcg: Use add/sub overflow routines in tcg-runtime-gvec.c
>   softfloat: Move the binary point to the msb
>   softfloat: Inline float_raise
>   softfloat: Use float_raise in more places
>   softfloat: Tidy a * b + inf return
>   softfloat: Add float_cmask and constants
>   softfloat: Use return_nan in float_to_float
>   softfloat: fix return_nan vs default_nan_mode
>   target/mips: Set set_default_nan_mode with set_snan_bit_is_one
>   softfloat: Do not produce a default_nan from parts_silence_nan
>   softfloat: Rename FloatParts to FloatParts64
>   softfloat: Move type-specific pack/unpack routines
>   softfloat: Use pointers with parts_default_nan
>   softfloat: Use pointers with unpack_raw
>   softfloat: Use pointers with ftype_unpack_raw
>   softfloat: Use pointers with pack_raw
>   softfloat: Use pointers with ftype_pack_raw
>   softfloat: Use pointers with ftype_unpack_canonical
>   softfloat: Use pointers with ftype_round_pack_canonical
>   softfloat: Use pointers with parts_silence_nan
>   softfloat: Rearrange FloatParts64
>   softfloat: Convert float128_silence_nan to parts
>   softfloat: Convert float128_default_nan to parts
>   softfloat: Move return_nan to softfloat-parts.c.inc
>   softfloat: Move pick_nan to softfloat-parts.c.inc
>   softfloat: Move pick_nan_muladd to softfloat-parts.c.inc
>   softfloat: Move sf_canonicalize to softfloat-parts.c.inc
>   softfloat: Move round_canonical to softfloat-parts.c.inc
>   softfloat: Use uadd64_carry, usub64_borrow in softfloat-macros.h
>   softfloat: Move addsub_floats to softfloat-parts.c.inc
>   softfloat: Implement float128_add/sub via parts
>   softfloat: Move mul_floats to softfloat-parts.c.inc
>   softfloat: Move muladd_floats to softfloat-parts.c.inc
>   softfloat: Use mulu64 for mul64To128
>   softfloat: Use add192 in mul128To256
>   softfloat: Tidy mul128By64To192
>   softfloat: Introduce sh[lr]_double primitives
>   softfloat: Move div_floats to softfloat-parts.c.inc
>   softfloat: Split float_to_float
>   softfloat: Convert float-to-float conversions with float128
>   softfloat: Move round_to_int to softfloat-parts.c.inc
>   softfloat: Move rount_to_int_and_pack to softfloat-parts.c.inc
>   softfloat: Move rount_to_uint_and_pack to softfloat-parts.c.inc
>   softfloat: Move int_to_float to softfloat-parts.c.inc
>   softfloat: Move uint_to_float to softfloat-parts.c.inc
>   softfloat: Move minmax_flags to softfloat-parts.c.inc
>   softfloat: Move compare_floats to softfloat-parts.c.inc
>   softfloat: Move scalbn_decomposed to softfloat-parts.c.inc
>   softfloat: Move sqrt_float to softfloat-parts.c.inc
>   softfloat: Split out parts_uncanon_normal
>   softfloat: Reduce FloatFmt
>   softfloat: Introduce Floatx80RoundPrec
>   softfloat: Adjust parts_uncanon_normal for floatx80
>   tests/fp/fp-test: Reverse order of floatx80 precision tests
>   softfloat: Convert floatx80_add/sub to FloatParts
>   softfloat: Convert floatx80_mul to FloatParts
>   softfloat: Convert floatx80_div to FloatParts
>   softfloat: Convert floatx80_sqrt to FloatParts
>   softfloat: Convert floatx80_round to FloatParts
>   softfloat: Convert floatx80_round_to_int to FloatParts
>   softfloat: Convert integer to floatx80 to FloatParts
>   softfloat: Convert floatx80 float conversions to FloatParts
>   softfloat: Convert floatx80 to integer to FloatParts
>   softfloat: Convert floatx80_scalbn to FloatParts
>   softfloat: Convert floatx80 compare to FloatParts
> 

Re: [PATCH 00/72] Convert floatx80 and float128 to FloatParts

2021-05-12 Thread Richard Henderson

On 5/12/21 6:22 AM, Alex Bennée wrote:

> I did note we are missing mulAdd tests but they seem to be missing from
> the underlying testfloat code as well. I guess we don't care that much
> for the 80bit code? Is it even used by any architectures?


It's not used by any architecture, so no point in implementing it.


r~




Re: [PATCH 00/72] Convert floatx80 and float128 to FloatParts

2021-05-12 Thread Alex Bennée


Richard Henderson  writes:

> On 5/10/21 8:36 AM, Alex Bennée wrote:
>> Richard Henderson  writes:
>> 
>>> Reorg everything using QEMU_GENERIC and multiple inclusion to
>>> reduce the amount of code duplication between the formats.
>>>
>>> The use of QEMU_GENERIC means that we need to use pointers instead
>>> of structures, which means that even the smaller float formats
>>> need rearranging.
>>>
>>> I've carried it through to completion within fpu/, so that we don't
>>> have (much) of the legacy code remaining.  There is some floatx80
>>> stuff in target/m68k and target/i386 that's still hanging around.
>> FWIW I could enable a few more tests...
>
> Ah, thanks for the reminder that these were disabled.
> I'll add this to my patch set for v2.
>
>
>> ...although extF80_lt_quiet still has some failures on equality tests:
>
> This turns out to be a trivial typo in the tester itself:
>
> diff --git a/tests/fp/wrap.c.inc b/tests/fp/wrap.c.inc
> index cb1bb77e4c..9ff884c140 100644
> --- a/tests/fp/wrap.c.inc
> +++ b/tests/fp/wrap.c.inc
> @@ -643,7 +643,7 @@ WRAP_CMP80(qemu_extF80M_eq, floatx80_eq_quiet)
>  WRAP_CMP80(qemu_extF80M_le, floatx80_le)
>  WRAP_CMP80(qemu_extF80M_lt, floatx80_lt)
>  WRAP_CMP80(qemu_extF80M_le_quiet, floatx80_le_quiet)
> -WRAP_CMP80(qemu_extF80M_lt_quiet, floatx80_le_quiet)
> +WRAP_CMP80(qemu_extF80M_lt_quiet, floatx80_lt_quiet)
>  #undef WRAP_CMP80
>
>  #define WRAP_CMP128(name, func)

\o/

I did note we are missing mulAdd tests but they seem to be missing from
the underlying testfloat code as well. I guess we don't care that much
for the 80bit code? Is it even used by any architectures?

-- 
Alex Bennée



Re: [PATCH 00/72] Convert floatx80 and float128 to FloatParts

2021-05-11 Thread Richard Henderson

On 5/10/21 8:36 AM, Alex Bennée wrote:


> Richard Henderson  writes:
>
>> Reorg everything using QEMU_GENERIC and multiple inclusion to
>> reduce the amount of code duplication between the formats.
>>
>> The use of QEMU_GENERIC means that we need to use pointers instead
>> of structures, which means that even the smaller float formats
>> need rearranging.
>>
>> I've carried it through to completion within fpu/, so that we don't
>> have (much) of the legacy code remaining.  There is some floatx80
>> stuff in target/m68k and target/i386 that's still hanging around.
>
> FWIW I could enable a few more tests...

Ah, thanks for the reminder that these were disabled.
I'll add this to my patch set for v2.

> ...although extF80_lt_quiet still has some failures on equality tests:


This turns out to be a trivial typo in the tester itself:

diff --git a/tests/fp/wrap.c.inc b/tests/fp/wrap.c.inc
index cb1bb77e4c..9ff884c140 100644
--- a/tests/fp/wrap.c.inc
+++ b/tests/fp/wrap.c.inc
@@ -643,7 +643,7 @@ WRAP_CMP80(qemu_extF80M_eq, floatx80_eq_quiet)
 WRAP_CMP80(qemu_extF80M_le, floatx80_le)
 WRAP_CMP80(qemu_extF80M_lt, floatx80_lt)
 WRAP_CMP80(qemu_extF80M_le_quiet, floatx80_le_quiet)
-WRAP_CMP80(qemu_extF80M_lt_quiet, floatx80_le_quiet)
+WRAP_CMP80(qemu_extF80M_lt_quiet, floatx80_lt_quiet)
 #undef WRAP_CMP80

 #define WRAP_CMP128(name, func)
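
A mismatch of this kind, where an `lt` wrapper is bound to an `le` implementation, can be caught mechanically. The following is a minimal sketch, assuming the wrappers keep the `WRAP_CMP<N>(wrapper, impl)` shape shown in the diff and that wrapper names carry the operation after an `M_` infix. `check_wrap_cmp` is a hypothetical helper, and comparing only the leading token (eq, le, lt) is a heuristic chosen so that pairs such as `qemu_extF80M_eq` / `floatx80_eq_quiet` are not flagged.

```shell
#!/bin/sh
# Heuristic lint (illustrative, not part of QEMU): read C source on stdin
# and report WRAP_CMP<N>(wrapper, impl) lines whose wrapper and
# implementation disagree on the leading comparison token.
check_wrap_cmp() {
    awk '
    /^WRAP_CMP[0-9]*\(/ {
        line = $0
        sub(/^WRAP_CMP[0-9]*\(/, "", line)   # drop macro name and "("
        sub(/\).*$/, "", line)               # drop ")" and what follows
        if (split(line, arg, /, */) != 2) next
        wrapper = arg[1]; impl = arg[2]
        sub(/^.*M_/, "", wrapper)            # qemu_extF80M_lt_quiet -> lt_quiet
        sub(/^[^_]*_/, "", impl)             # floatx80_le_quiet -> le_quiet
        split(wrapper, w, "_"); split(impl, f, "_")
        if (w[1] != f[1]) { printf "%d: %s\n", NR, $0; bad = 1 }
    }
    END { exit (bad ? 1 : 0) }'
}
```

Run as `check_wrap_cmp < tests/fp/wrap.c.inc`, it prints each suspicious line with its line number and exits non-zero if any were found.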


r~



Re: [PATCH 00/72] Convert floatx80 and float128 to FloatParts

2021-05-10 Thread Alex Bennée


Richard Henderson  writes:

> Reorg everything using QEMU_GENERIC and multiple inclusion to
> reduce the amount of code duplication between the formats.
>
> The use of QEMU_GENERIC means that we need to use pointers instead
> of structures, which means that even the smaller float formats
> need rearranging.
>
> I've carried it through to completion within fpu/, so that we don't
> have (much) of the legacy code remaining.  There is some floatx80
> stuff in target/m68k and target/i386 that's still hanging around.

FWIW I could enable a few more tests although extF80_lt_quiet still has
some failures on equality tests:

./tests/fp/fp-test -l 1 -r all extF80_lt_quiet
>> Testing extF80_lt_quiet
46464 tests total.
Errors found in extF80_lt_quiet:
+.  +.  => 1 .  expected 0 .
+.  -.  => 1 .  expected 0 .
+.0001  +.0001  => 1 .  expected 0 .
+.7FFF  +.7FFF  => 1 .  expected 0 .
+.7FFE  +.7FFE  => 1 .  expected 0 .
+0001.8000  +0001.8000  => 1 .  expected 0 .
+0001.8001  +0001.8001  => 1 .  expected 0 .
+0001.  +0001.  => 1 .  expected 0 .
+0001.FFFE  +0001.FFFE  => 1 .  expected 0 .
+3FBF.8000  +3FBF.8000  => 1 .  expected 0 .
+3FBF.8001  +3FBF.8001  => 1 .  expected 0 .
+3FBF.  +3FBF.  => 1 .  expected 0 .
+3FBF.FFFE  +3FBF.FFFE  => 1 .  expected 0 .
+3FFD.8000  +3FFD.8000  => 1 .  expected 0 .
+3FFD.8001  +3FFD.8001  => 1 .  expected 0 .
+3FFD.  +3FFD.  => 1 .  expected 0 .
+3FFD.FFFE  +3FFD.FFFE  => 1 .  expected 0 .
+3FFE.8000  +3FFE.8000  => 1 .  expected 0 .
+3FFE.8001  +3FFE.8001  => 1 .  expected 0 .
+3FFE.  +3FFE.  => 1 .  expected 0 .
9618 tests performed; 20 errors found.

However the rest can be enabled:

tests/fp: enable more tests

Signed-off-by: Alex Bennée 

1 file changed, 6 insertions(+), 6 deletions(-)
tests/fp/meson.build | 12 ++--

modified   tests/fp/meson.build
@@ -556,7 +556,9 @@ softfloat_conv_tests = {
   'extF80_to_f64 extF80_to_f128 ' +
   'f128_to_f16',
 'int-to-float': 'i32_to_f16 i64_to_f16 i32_to_f32 i64_to_f32 ' +
-'i32_to_f64 i64_to_f64 i32_to_f128 i64_to_f128',
+'i32_to_f64 i64_to_f64 ' +
+'i32_to_extF80 i64_to_extF80 ' +
+'i32_to_f128 i64_to_f128',
 'uint-to-float': 'ui32_to_f16 ui64_to_f16 ui32_to_f32 ui64_to_f32 ' +
  'ui32_to_f64 ui64_to_f64 ui64_to_f128 ' +
  'ui32_to_extF80 ui64_to_extF80',
@@ -581,7 +583,7 @@ softfloat_conv_tests = {
  'extF80_to_ui64 extF80_to_ui64_r_minMag ' +
  'f128_to_ui64 f128_to_ui64_r_minMag',
 'round-to-integer': 'f16_roundToInt f32_roundToInt ' +
-'f64_roundToInt f128_roundToInt'
+'f64_roundToInt extF80_roundToInt f128_roundToInt'
 }
 softfloat_tests = {
 'eq_signaling' : 'compare',
@@ -602,18 +604,16 @@ fptest_args = ['-s', '-l', '1']
 fptest_rounding_args = ['-r', 'all']
 
 # Conversion Routines:
-# FIXME: i32_to_extF80 (broken), i64_to_extF80 (broken)
-#extF80_roundToInt (broken)
 foreach k, v : softfloat_conv_tests
   test('fp-test-' + k, fptest,
args: fptest_args + fptest_rounding_args + v.split(),
suite: ['softfloat', 'softfloat-conv'])
 endforeach
 
-# FIXME: extF80_{lt_quiet, rem} (broken),
+# FIXME: extF80_{lt_quiet} (broken),
 #extF80_{mulAdd} (missing)
 foreach k, v : softfloat_tests
-  extF80_broken = ['lt_quiet', 'rem'].contains(k)
+  extF80_broken = ['lt_quiet'].contains(k)
   test('fp-test-' + k, fptest,
args: fptest_args + fptest_rounding_args +
  ['f16_' + k, 'f32_' + k, 'f64_' + k, 'f128_' + k] +

-- 
Alex Bennée



Re: [PATCH 00/72] Convert floatx80 and float128 to FloatParts

2021-05-07 Thread no-reply
Patchew URL: https://patchew.org/QEMU/20210508014802.892561-1-richard.hender...@linaro.org/



Hi,

This series seems to have some coding style problems. See output below for
more information:

Type: series
Message-id: 20210508014802.892561-1-richard.hender...@linaro.org
Subject: [PATCH 00/72] Convert floatx80 and float128 to FloatParts

=== TEST SCRIPT BEGIN ===
#!/bin/bash
git rev-parse base > /dev/null || exit 0
git config --local diff.renamelimit 0
git config --local diff.renames True
git config --local diff.algorithm histogram
./scripts/checkpatch.pl --mailback base..
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
From https://github.com/patchew-project/qemu
 * [new tag]         patchew/20210508014802.892561-1-richard.hender...@linaro.org -> patchew/20210508014802.892561-1-richard.hender...@linaro.org
Switched to a new branch 'test'
2dcaa72 softfloat: Convert modrem operations to FloatParts
2956f6c softfloat: Move floatN_log2 to softfloat-parts.c.inc
b786038 softfloat: Convert float32_exp2 to FloatParts
01ab7b4 softfloat: Convert floatx80 compare to FloatParts
d55b655 softfloat: Convert floatx80_scalbn to FloatParts
0b1869b softfloat: Convert floatx80 to integer to FloatParts
6287e68 softfloat: Convert floatx80 float conversions to FloatParts
5c28030 softfloat: Convert integer to floatx80 to FloatParts
e686e7d softfloat: Convert floatx80_round_to_int to FloatParts
9e7a606 softfloat: Convert floatx80_round to FloatParts
d19c872 softfloat: Convert floatx80_sqrt to FloatParts
06addff softfloat: Convert floatx80_div to FloatParts
6974166 softfloat: Convert floatx80_mul to FloatParts
0ab65d7 softfloat: Convert floatx80_add/sub to FloatParts
5ae98cd tests/fp/fp-test: Reverse order of floatx80 precision tests
dc5eada softfloat: Adjust parts_uncanon_normal for floatx80
0f4d3ce softfloat: Introduce Floatx80RoundPrec
aecc38b softfloat: Reduce FloatFmt
d8342e2 softfloat: Split out parts_uncanon_normal
e8a3234 softfloat: Move sqrt_float to softfloat-parts.c.inc
db066f3 softfloat: Move scalbn_decomposed to softfloat-parts.c.inc
870c56c softfloat: Move compare_floats to softfloat-parts.c.inc
4895586 softfloat: Move minmax_flags to softfloat-parts.c.inc
c9e01de softfloat: Move uint_to_float to softfloat-parts.c.inc
0b58632 softfloat: Move int_to_float to softfloat-parts.c.inc
e408986 softfloat: Move rount_to_uint_and_pack to softfloat-parts.c.inc
1330f41 softfloat: Move rount_to_int_and_pack to softfloat-parts.c.inc
2d631f0 softfloat: Move round_to_int to softfloat-parts.c.inc
45abdca softfloat: Convert float-to-float conversions with float128
0ddbdfd softfloat: Split float_to_float
296ad69 softfloat: Move div_floats to softfloat-parts.c.inc
d5ff786 softfloat: Introduce sh[lr]_double primitives
d227707 softfloat: Tidy mul128By64To192
5a9c124 softfloat: Use add192 in mul128To256
86d8505 softfloat: Use mulu64 for mul64To128
1186174 softfloat: Move muladd_floats to softfloat-parts.c.inc
5e4de79 softfloat: Move mul_floats to softfloat-parts.c.inc
1450eec softfloat: Implement float128_add/sub via parts
ed27dea softfloat: Move addsub_floats to softfloat-parts.c.inc
3fe4a87 softfloat: Use uadd64_carry, usub64_borrow in softfloat-macros.h
5687e8b softfloat: Move round_canonical to softfloat-parts.c.inc
ddf7181 softfloat: Move sf_canonicalize to softfloat-parts.c.inc
49b211c softfloat: Move pick_nan_muladd to softfloat-parts.c.inc
d6fae5a softfloat: Move pick_nan to softfloat-parts.c.inc
ada8c0f softfloat: Move return_nan to softfloat-parts.c.inc
701d4e6 softfloat: Convert float128_default_nan to parts
735bd6f softfloat: Convert float128_silence_nan to parts
be031db softfloat: Rearrange FloatParts64
997cce0 softfloat: Use pointers with parts_silence_nan
759543e softfloat: Use pointers with ftype_round_pack_canonical
59155e0 softfloat: Use pointers with ftype_unpack_canonical
67f866d softfloat: Use pointers with ftype_pack_raw
ad2e600 softfloat: Use pointers with pack_raw
6725bec softfloat: Use pointers with ftype_unpack_raw
6fa54f0 softfloat: Use pointers with unpack_raw
36916c3 softfloat: Use pointers with parts_default_nan
e255c56 softfloat: Move type-specific pack/unpack routines
17cab05 softfloat: Rename FloatParts to FloatParts64
7973d6d softfloat: Do not produce a default_nan from parts_silence_nan
7267830 target/mips: Set set_default_nan_mode with set_snan_bit_is_one
8acc1a9 softfloat: fix return_nan vs default_nan_mode
5c700cd softfloat: Use return_nan in float_to_float
e6a00e2 softfloat: Add float_cmask and constants
689fe83 softfloat: Tidy a * b + inf return
75285a4 softfloat: Use float_raise in more places
a8ea262 softfloat: Inline float_raise
3be68a5 softfloat: Move the binary point to the msb
e2236e1 tests/fp: add quad support to the benchmark utility
be7703a accel/tcg: Use add/sub overflow routines in tcg-runtime-gvec.c
3731829 qemu/host-utils: Add wrappers for carry builtins
a1367bc qemu/host-utils: Add wrappers for overflow builtins
96efc5c qemu/host-utils: Use