Mark Dickinson <dicki...@gmail.com> added the comment:

[Raymond]
> Is there a way to use SSE when available and x86 when it's not.
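(As an aside: *detecting* the feature at runtime is the easy half of that question; the hard half is having both an SSE2 and an x87 code path to dispatch to. A hypothetical, Linux-only sketch of the detection part -- the `cpu_has_sse2` helper is purely illustrative, not anything CPython does:)

```python
def cpu_has_sse2():
    """Return True/False if the CPU flags can be read, else None.

    Hypothetical sketch: parses /proc/cpuinfo, so it only works on
    Linux; on other platforms it reports "unknown" rather than guess.
    """
    try:
        with open("/proc/cpuinfo") as f:
            for line in f:
                if line.startswith("flags"):
                    return "sse2" in line.split()
    except OSError:
        pass
    return None  # unknown: not Linux, or no "flags" line found
```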
I guess it's possible in theory, but I don't know of any way to do this in practice. I suppose one could trap the SIGILL generated by the attempted use of an SSE2 instruction on a non-supporting platform---is this how things used to work for 386s without the 387? That would make a sequence of floating-point instructions on non-SSE2 x86 horribly slow, though.

Antoine: as Raymond said, the advantage of SSE2 for numeric work is accuracy, predictability, and consistency across platforms. The SSE2 instructions finally put an end to all the problems arising from the mismatch between the precision of the x87 floating-point registers (64 bits) and the precision of a C double (53 bits). Those difficulties include (1) unpredictable rounding of intermediate values from 64-bit precision to 53-bit precision, due to spilling of temporaries from FPU registers to memory, and (2) double rounding. The arithmetic of Python itself is largely immune to the former, but not the latter. (And of course the register spilling still causes headaches for various bits of CPython.)

Those difficulties can be *mostly* dealt with by setting the x87 rounding precision to double (instead of extended), though this doesn't fix the exponent range, so one still ends up with double rounding on underflow. The catch is that one can't mess with the x87 state globally, as various library functions (especially libm functions) might depend on it being in whatever the OS considers to be the default state.

There's a very nice paper by David Monniaux that covers all this: definitely recommended reading after you've finished Goldberg's "What Every Computer Scientist...". It can currently be found at:

http://hal.archives-ouvertes.fr/hal-00128124/en/

An example: in Python (any version), try this:

>>> 1e16 + 2.9999
10000000000000002.0

On OS X, Windows and FreeBSD you'll get the answer above. (OS X gcc uses SSE2 by default; Windows and FreeBSD both set the default x87 rounding precision to 53 bits.)
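(You can verify that 10000000000000002.0 really is the correctly-rounded result by redoing the sum exactly with the fractions module; the variable names below are just illustrative:)

```python
from fractions import Fraction

x, y = 1e16, 2.9999
# Exact rational sum of the two doubles (Fraction(float) is exact).
exact = Fraction(x) + Fraction(y)

# The two neighbouring doubles around the exact sum; the spacing
# between consecutive doubles at this magnitude is 2.0.
lo, hi = 10000000000000002.0, 10000000000000004.0

# The exact sum is roughly ...002.9999, i.e. closer to lo than to hi,
# so round-to-nearest 53-bit arithmetic must give lo.
assert abs(exact - Fraction(lo)) < abs(exact - Fraction(hi))

print(x + y)  # lo on SSE2 / 53-bit-precision x87 builds
```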
On 32-bit Linux/x86 or Solaris/x86 you'll likely get the answer 10000000000000004.0 instead, because Linux doesn't (usually?) change the Intel default rounding precision of 64 bits. Using SSE2 instead of the x87 would have fixed this.

</standard x87 rant>

----------
_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue1580>
_______________________________________