Hi,

In my previous run at this I simply took the existing float32 functions and copied and pasted the code, changing the relevant constants. Apart from the usual typos and missed bits, there were sections where SoftFloat pulls tricks because it knows the exact bit positions of things. While I'm sure that is marginally faster, it makes the code rather impenetrable to anyone not familiar with how SoftFloat does things. One thing the last few months have taught me is that the world is not awash with experts on the finer implementation details of floating point maths.

After reviewing the last series, Richard Henderson suggested a different approach which pushes most of the code into common shared functions. The majority of the work on the fractional bits is done at 64 bit resolution, which leaves plenty of spare bits for rounding for everything from float16 to float64. This series is the result of that work and a coding sprint we did two weeks ago in Cambridge.
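To give a flavour of the approach, here is a minimal sketch of the sort of decomposed representation involved. The names and exact bit layout below are illustrative only, not the ones used in the series:

    #include <stdint.h>
    #include <stdbool.h>

    /*
     * Illustrative decomposed float. Because the fraction always
     * lives in a uint64_t, float16, float32 and float64 can share
     * the same arithmetic and rounding code, with plenty of spare
     * low bits for the rounding decision.
     */
    typedef struct {
        bool     sign;
        int32_t  exp;   /* unbiased exponent */
        uint64_t frac;  /* fraction, normalised to a fixed binary point */
    } parts_t;

    /* Common binary point: bit 62 carries the implicit leading 1. */
    #define FRAC_ONE (1ULL << 62)

    /* float16: 1 sign bit, 5 exponent bits (bias 15), 10 fraction bits. */
    static parts_t float16_decompose(uint16_t f)
    {
        parts_t p = {
            .sign = (f >> 15) & 1,
            .exp  = ((f >> 10) & 0x1f) - 15,
            .frac = (uint64_t)(f & 0x3ff) << (62 - 10),
        };
        /* zeros, denormals, Inf and NaN handling elided for brevity */
        p.frac |= FRAC_ONE; /* make the implicit leading 1 explicit */
        return p;
    }

Once a value is in this form, the shared arithmetic never needs to know which format it came from; only the decompose and repack steps are format specific, and rounding just examines the frac bits below the target format's precision.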
We've not touched anything that needs higher precision, which at the moment means float80 and the 128 bit quad precision operations. They would need similar decomposed routines to operate on the higher precision fractional parts. I suspect we'd need to beef up our Int128 wrapper in the process so it can be done efficiently with 128 bit maths.

This work is part of the larger chunk of adding half-precision ops to the ARM front-end. However I've split the series up to make for a less messy review. The tree can be found at:

  https://github.com/stsquad/qemu/tree/softfloat-refactor-and-fp16-v1

While I have been testing the half-precision stuff in the ARM specific tree, this series is all common code. It has however been tested with ARM RISU, which exercises the float32/64 code paths quite nicely. Any additional testing appreciated.

Series Breakdown
----------------

The first five patches add simple helper functions which are mostly inline and there for the benefit of architecture helper functions. This includes the float16 constants in the last of those patches.

The next two patches fix a bug in NaN propagation which only showed up when doing ARM "Reduction" operations in float16. Although the minmax code is totally replaced later on, I wanted to fix it in place first rather than add the fix when it was rewritten.

The next two patches start preparing the ground for the new decomposed functions and their public APIs. I've used macro expansion in a few places just to avoid the amount of repeated boilerplate for these APIs. Most of the work is done in the static decompose_foo functions.

As you can see in the diffstat, there is an overall code reduction even though we have also added float16 support. For reference, the previous attempt added 1258 lines of code to implement a subset of the float16 functions. I think the code is also a lot easier to follow and reason about.

Alex Bennée (19):
  fpu/softfloat: implement float16_squash_input_denormal
  include/fpu/softfloat: implement float16_abs helper
  include/fpu/softfloat: implement float16_chs helper
  include/fpu/softfloat: implement float16_set_sign helper
  include/fpu/softfloat: add some float16 constants
  fpu/softfloat: propagate signalling NaNs in MINMAX
  fpu/softfloat: improve comments on ARM NaN propagation
  fpu/softfloat: move the extract functions to the top of the file
  fpu/softfloat: define decompose structures
  fpu/softfloat: re-factor add/sub
  fpu/softfloat: re-factor mul
  fpu/softfloat: re-factor div
  fpu/softfloat: re-factor muladd
  fpu/softfloat: re-factor round_to_int
  fpu/softfloat: re-factor float to int/uint
  fpu/softfloat: re-factor int/uint to float
  fpu/softfloat: re-factor scalbn
  fpu/softfloat: re-factor minmax
  fpu/softfloat: re-factor compare

 fpu/softfloat-macros.h     |   44 +
 fpu/softfloat-specialize.h |  115 +-
 fpu/softfloat.c            | 6668 ++++++++++++++++++++------------------------
 include/fpu/softfloat.h    |   89 +-
 4 files changed, 3066 insertions(+), 3850 deletions(-)

--
2.15.1