On Fri, 3 Jun 2022, Ozkan Sezer wrote:
Why are we using x87 asm instead of sse2 intrinsics for lrint/lrintf? E.g.: why not do something like the following?

diff --git a/mingw-w64-crt/math/lrintl.c b/mingw-w64-crt/math/lrintl.c
index d710fac..9f1be51 100644
--- a/mingw-w64-crt/math/lrintl.c
+++ b/mingw-w64-crt/math/lrintl.c
@@ -5,10 +5,16 @@
  */
 #include <math.h>
 
+#if defined(_AMD64_) || defined(__x86_64__)
+#include <xmmintrin.h>
+#endif
+
 long lrintl (long double x)
 {
   long retval = 0l;
+#if defined(_AMD64_) || defined(__x86_64__)
+  retval = _mm_cvtsd_si64(_mm_load_sd(&x));
-#if defined(_AMD64_) || defined(__x86_64__) || defined(_X86_) || defined(__i386__)
+#elif defined(_X86_) || defined(__i386__)
   __asm__ __volatile__ ("fistpl %0" : "=m" (retval) : "t" (x) : "st");
For long double, I would presume that you'd still need to use x87, as SSE can't handle 80-bit long doubles, right? (Converting down to a plain double first and then using SSE would probably also work, but I guess that might also generate x87 instructions.)
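To make the "convert down to double first" idea concrete, here is a rough sketch. It is not a proposed patch: the name lrintl_via_double is made up, and it assumes a 32-bit long (as on Windows targets), so it uses _mm_cvtsd_si32 instead of the 64-bit variant. Where SSE2 isn't available, it just falls back to lrint().

```c
#include <math.h>
#if defined(__SSE2__)
#include <emmintrin.h>   /* SSE2: _mm_set_sd, _mm_cvtsd_si32 */
#endif

/* Hypothetical helper, not part of mingw-w64.
 * Assumes the result fits in 32 bits (long is 32-bit on Windows targets). */
static long lrintl_via_double(long double x)
{
    /* Narrow to double first; this rounds the value to 53 bits of precision. */
    double d = (double)x;
#if defined(__SSE2__)
    /* cvtsd2si: converts using the current SSE (MXCSR) rounding mode. */
    return (long)_mm_cvtsd_si32(_mm_set_sd(d));
#else
    /* Portable fallback when SSE2 intrinsics are unavailable. */
    return lrint(d);
#endif
}
```

Two caveats with this approach: with an 80-bit long double, the narrowing cast itself is typically done with x87 load/store instructions, and because the value is rounded twice (once to double, once to integer), inputs extremely close to a halfway point can end up rounded differently than a direct x87 conversion would round them.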
For the other functions, do the SSE intrinsics honor the rounding mode/direction that you've set with fesetround()?
// Martin
