On Thursday 01 July 2004 14:41, Tim Goetze wrote:
> [Ruben van Royen]
>
> >please note that SSE2 has support for 64bit floats (doubles) and contains
> > an instruction that truncates to int, irregardless of controlwords. A new
> > enough gcc with (-march=pentium4 or -msse2) and -mfpmath=sse will use sse
> > instead of the old fp unit. This has more advantages, since sse math uses
> > normal registers instead of the stack in the old fp unit.
> >
> >The disadvantage is of course that it does not run on older processors.
> > I'm also not sure what level of sse athlon currently supports. The last
> > time I looked, it only supported sse. This is also good, but it lacks
> > support for double precision floatingpoint.
>
> afaik, the athlon XP here only has SSE (not ~2), but the instruction
> set includes this (quote taken from the NASM documentation, section
> B.4):
>
>   CVTTSS2SI reg32,xmm/mem32 ; F3 0F 2C /r [KATMAI,SSE]

Yes, that one is part of sse. SSE2 adds a 64bit variant, so it also works
with doubles.

>
>   CVTTSS2SI converts a single-precision FP value in the source operand
>   to a signed doubleword in the destination operand. If the result is
>   inexact, it is truncated (rounded toward zero).
>
>   The destination operand is a general purpose register. The source can
>   be either an XMM register or a 32-bit memory location. If the source
>   is a register, the input value is in the low doubleword.
>
> -
>
> the operand requirements are quite different from "fistpl" so
> replacing one with the other requires some additional instructions
> to move the data around.

If you must first move the data from an FP register to an XMM register, it is
not very likely that you will get a performance improvement. The route to go
would be to do all calculation in SSE code.

>
> tim
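
For reference, a minimal sketch of the two truncating conversions discussed
above, written with compiler intrinsics instead of inline assembly. The helper
names trunc_float and trunc_double are made up for illustration. The sketch
assumes a compiler that defines __SSE2__ and ships <emmintrin.h> (gcc does when
SSE2 is enabled); without SSE2 it falls back to a plain C cast, which also
truncates toward zero.

```c
#if defined(__SSE2__)
#include <emmintrin.h>  /* SSE2 intrinsics; also pulls in the SSE ones */
#endif

/* Hypothetical helper: float -> int, truncating toward zero.
 * With SSE this maps to CVTTSS2SI and ignores the rounding mode. */
static int trunc_float(float x)
{
#if defined(__SSE2__)
    return _mm_cvttss_si32(_mm_set_ss(x));
#else
    return (int)x;  /* fallback: a C cast truncates toward zero as well */
#endif
}

/* Hypothetical helper: double -> int, truncating toward zero.
 * This is the SSE2 variant (CVTTSD2SI), so it needs SSE2, not just SSE. */
static int trunc_double(double x)
{
#if defined(__SSE2__)
    return _mm_cvttsd_si32(_mm_set_sd(x));
#else
    return (int)x;
#endif
}
```

Calling trunc_float(2.7f) gives 2 and trunc_double(-2.7) gives -2. Note that
the value is loaded straight into an XMM register here, so this only pays off
when the surrounding math is done in SSE as well.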
