Re: [linux-audio-dev] Traps in floating point code

Ruben van Royen Thu, 01 Jul 2004 09:22:24 -0700


On Thursday 01 July 2004 14:41, Tim Goetze wrote:
> [Ruben van Royen]
>
> >please note that SSE2 has support for 64-bit floats (doubles) and contains
> > an instruction that truncates to int, regardless of the control word. A new
> > enough gcc with (-march=pentium4 or -msse2) and -mfpmath=sse will use SSE
> > instead of the old fp unit. This has more advantages, since SSE math uses
> > normal registers instead of the stack in the old fp unit.
> >
> >The disadvantage is of course that it does not run on older processors.
> > I'm also not sure what level of SSE the Athlon currently supports. The
> > last time I looked, it only supported SSE. This is also good, but it lacks
> > support for double-precision floating point.
>
> afaik, the athlon XP here only has SSE (not ~2), but the instruction
> set includes this (quote taken from the NASM documentation, section
> B.4):
>
> CVTTSS2SI reg32,xmm/mem32      ; F3 0F 2C /r    [KATMAI,SSE]


Yes, that one is part of SSE. SSE2 adds a 64-bit variant (CVTTSD2SI), so it
also works with doubles.
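
To make the two variants concrete, here is a minimal sketch using the compiler
intrinsics that map to these instructions. The function names trunc_float and
trunc_double are made up for illustration, and the __SSE2__ guard with a plain
cast as fallback is my assumption about how one would keep it portable:

```c
#if defined(__SSE2__)
#include <emmintrin.h>   /* SSE2 intrinsics; also pulls in the SSE ones */
#endif

/* Hypothetical helper: float -> int, rounded toward zero.
 * With SSE available this maps to CVTTSS2SI. */
static int trunc_float(float x)
{
#if defined(__SSE2__)
    return _mm_cvttss_si32(_mm_set_ss(x));
#else
    return (int)x;       /* a C cast also truncates toward zero */
#endif
}

/* Hypothetical helper: double -> int, rounded toward zero.
 * This one needs SSE2 and maps to CVTTSD2SI. */
static int trunc_double(double x)
{
#if defined(__SSE2__)
    return _mm_cvttsd_si32(_mm_set_sd(x));
#else
    return (int)x;
#endif
}
```

Both conversions truncate by definition, so no rounding mode has to be
switched before and after the conversion.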
>
> CVTTSS2SI converts a single-precision FP value in the source operand
> to a signed doubleword in the destination operand. If the result is
> inexact, it is truncated (rounded toward zero).
>
> The destination operand is a general purpose register. The source can
> be either an XMM register or a 32-bit memory location. If the source
> is a register, the input value is in the low doubleword.
>
> -
>
> the operand requirements are quite different from "fistpl" so
> replacing one with the other requires some additional instructions
> to move the data around.

If you first have to move the data from an x87 FP register to an XMM register
(which has to go through memory), you are unlikely to see a performance
improvement. The way to go would be to do all of the calculation in SSE code.
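
As a rough sketch of what "all calculation in SSE" could look like, the loop
below scales a buffer and truncates it to integers without touching the x87
unit. The function scale_and_truncate and its parameters are hypothetical, not
taken from any existing code. Note that the packed conversion used here
(CVTTPS2DQ) is an SSE2 instruction, so on an SSE-only CPU such as the Athlon XP
only the scalar path would apply:

```c
#include <stddef.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Hypothetical helper: multiply n float samples by gain and truncate
 * each product toward zero into a 32-bit int. */
static void scale_and_truncate(const float *in, int *out, size_t n, float gain)
{
    size_t i = 0;
#if defined(__SSE2__)
    const __m128 g = _mm_set1_ps(gain);
    for (; i + 4 <= n; i += 4) {
        /* multiply four samples at once in an XMM register */
        __m128 v = _mm_mul_ps(_mm_loadu_ps(in + i), g);
        /* truncate all four to int32 (CVTTPS2DQ, SSE2) */
        __m128i t = _mm_cvttps_epi32(v);
        _mm_storeu_si128((__m128i *)(out + i), t);
    }
#endif
    /* scalar tail, and the whole loop when SSE2 is not available */
    for (; i < n; ++i)
        out[i] = (int)(in[i] * gain);
}
```

With gcc and -mfpmath=sse the scalar tail should also stay in XMM registers,
so no data has to move between the two register files.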




>
> tim
