Re: [Qemu-devel] [PATCH v4 13/22] fpu/softfloat: re-factor mul
Peter Maydell writes:

> On 6 February 2018 at 16:48, Alex Bennée wrote:
>> We can now add float16_mul and use the common decompose and
>> canonicalize functions to have a single implementation for
>> float16/32/64 versions.
>>
>> Signed-off-by: Alex Bennée
>> Signed-off-by: Richard Henderson
>>
>> ---
>> v3
>
>> +/*
>> + * Returns the result of multiplying the floating-point values `a' and
>> + * `b'. The operation is performed according to the IEC/IEEE Standard
>> + * for Binary Floating-Point Arithmetic.
>> + */
>> +
>> +static FloatParts mul_floats(FloatParts a, FloatParts b, float_status *s)
>> +{
>> +    bool sign = a.sign ^ b.sign;
>> +
>> +    if (a.cls == float_class_normal && b.cls == float_class_normal) {
>> +        uint64_t hi, lo;
>> +        int exp = a.exp + b.exp;
>> +
>> +        mul64To128(a.frac, b.frac, &hi, &lo);
>
> It seems a shame that we previously were able to use a
> 32x32->64 multiply for the float32 case, and now we have to
> do an expensive 64x64->128 multiply regardless...

Actually for mul the hit isn't too bad. When we do a div, however, you
do notice a bit of a gulf:

  https://i.imgur.com/KMWceo8.png

We could start passing &floatN_params to the functions, much like the
sqrt function does, and be smarter about how we do our multiply,
letting the compiler figure it out as we go. Another avenue worth
exploring is ensuring we use native Int128 support where we can, so
these wide operations can use wide registers where available.

However, both of these are future optimisations, given the difference
doesn't show up in dbt-bench timings.

> Regardless
> Reviewed-by: Peter Maydell
>
> thanks
> -- PMM

--
Alex Bennée
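[Editor's note: a minimal sketch of the "native Int128 where we can" idea mentioned above. `mul64to128_sketch` is a hypothetical stand-in for softfloat's `mul64To128()`: it uses the compiler's `unsigned __int128` when available so the wide multiply can use wide registers, and otherwise falls back to four 32x32->64 partial products.]

```c
#include <stdint.h>

/* Hypothetical sketch (not the QEMU implementation): a 64x64->128
 * multiply with the same out-parameter shape as mul64To128(). */
static void mul64to128_sketch(uint64_t a, uint64_t b,
                              uint64_t *hi, uint64_t *lo)
{
#if defined(__SIZEOF_INT128__)
    /* Let the compiler emit a single wide multiply. */
    unsigned __int128 r = (unsigned __int128)a * b;
    *hi = (uint64_t)(r >> 64);
    *lo = (uint64_t)r;
#else
    /* Portable fallback: four 32x32->64 partial products. */
    uint64_t a_lo = (uint32_t)a, a_hi = a >> 32;
    uint64_t b_lo = (uint32_t)b, b_hi = b >> 32;
    uint64_t p0 = a_lo * b_lo;
    uint64_t p1 = a_lo * b_hi;
    uint64_t p2 = a_hi * b_lo;
    uint64_t p3 = a_hi * b_hi;
    /* p1 + (p0 >> 32) + low32(p2) cannot overflow 64 bits. */
    uint64_t mid = p1 + (p0 >> 32) + (uint32_t)p2;
    *lo = (mid << 32) | (uint32_t)p0;
    *hi = p3 + (mid >> 32) + (p2 >> 32);
#endif
}
```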
Re: [Qemu-devel] [PATCH v4 13/22] fpu/softfloat: re-factor mul
On 02/13/2018 07:20 AM, Peter Maydell wrote:
>> +static FloatParts mul_floats(FloatParts a, FloatParts b, float_status *s)
>> +{
>> +    bool sign = a.sign ^ b.sign;
>> +
>> +    if (a.cls == float_class_normal && b.cls == float_class_normal) {
>> +        uint64_t hi, lo;
>> +        int exp = a.exp + b.exp;
>> +
>> +        mul64To128(a.frac, b.frac, &hi, &lo);
>
> It seems a shame that we previously were able to use a
> 32x32->64 multiply for the float32 case, and now we have to
> do an expensive 64x64->128 multiply regardless...

To be fair, I've proposed two different solutions addressing that --
C++ templates and glibc macros -- and you like neither. Is there a
third alternative that does not involve code duplication?

r~
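[Editor's note: a hedged sketch of what the "glibc macros" approach could look like. The macro stamps out one fraction-multiply helper per width, so a float32 path could keep its cheap 32x32->64 multiply while float64 uses the full wide one. All names here are illustrative, not QEMU's.]

```c
#include <stdint.h>

/* Hypothetical glibc-style width-generic helper: returns the high
 * half of the product, stores the low half through *lo. */
#define DEFINE_FRAC_MUL(NAME, FRAC_T, WIDE_T, SHIFT)              \
    static inline FRAC_T NAME(FRAC_T a, FRAC_T b, FRAC_T *lo)     \
    {                                                             \
        WIDE_T p = (WIDE_T)a * b;                                 \
        *lo = (FRAC_T)p;                                          \
        return (FRAC_T)(p >> SHIFT);                              \
    }

/* 32x32->64 for the narrow formats... */
DEFINE_FRAC_MUL(frac_mul32, uint32_t, uint64_t, 32)

/* ...and 64x64->128 where the compiler offers a wide type. */
#if defined(__SIZEOF_INT128__)
DEFINE_FRAC_MUL(frac_mul64, uint64_t, unsigned __int128, 64)
#endif
```

The trade-off Richard alludes to: the macro avoids duplicating the algorithm, at the cost of bodies that are harder to read and debug than plain functions.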
Re: [Qemu-devel] [PATCH v4 13/22] fpu/softfloat: re-factor mul
On 6 February 2018 at 16:48, Alex Bennée wrote:
> We can now add float16_mul and use the common decompose and
> canonicalize functions to have a single implementation for
> float16/32/64 versions.
>
> Signed-off-by: Alex Bennée
> Signed-off-by: Richard Henderson
>
> ---
> v3

> +/*
> + * Returns the result of multiplying the floating-point values `a' and
> + * `b'. The operation is performed according to the IEC/IEEE Standard
> + * for Binary Floating-Point Arithmetic.
> + */
> +
> +static FloatParts mul_floats(FloatParts a, FloatParts b, float_status *s)
> +{
> +    bool sign = a.sign ^ b.sign;
> +
> +    if (a.cls == float_class_normal && b.cls == float_class_normal) {
> +        uint64_t hi, lo;
> +        int exp = a.exp + b.exp;
> +
> +        mul64To128(a.frac, b.frac, &hi, &lo);

It seems a shame that we previously were able to use a
32x32->64 multiply for the float32 case, and now we have to
do an expensive 64x64->128 multiply regardless...

Regardless
Reviewed-by: Peter Maydell

thanks
-- PMM