On 07.31, Gwenole Beauchesne wrote:
> On Thu, 31 Jul 2003, J.A. Magallon wrote:
>
> > a) With pentium3, gcc spits sse instructions for moves, like movaps.
> > Sure it should be doing this ? If I build with -mno-sse, my code works
> > again. But -msse should only activate/deactivate the use of
> > __builtin_movaps() and friends. I am also developing the use of SSE
> > for my vector math, so I _need_ -msse. How can I just enable the builtins,
> > but prevent gcc to spit its own sse instructions ?
>
> Indeed, doco states that enables intrinsics. But I'd believe this changed
> over time. Please, can I get a testcase of that one too?
>
> > b) Automatic use of SSE would be ok if gcc is sure arguments are aligned,
> > or if it uses unaligned moves. But I have seen gcc do things like:
> >
> > movaps %xmm0, -120(%ebp)
> >
> > Who knows how is that aligned ? Even if stack is 16-byte aligned,
> > 120 is not a multiple of 16... Changing that to movups and compiling
> > the assembler, things work. So obviously it was not aligned.
>
> Can you please provide me with an appropriate testcase?
>
I got it !! I have killed all the const& and the optimized code, and gcc still
generates a movaps.

// bug.cc
class Vector
{
public:
    float f[4];
    Vector()
    {
        f[0]=f[1]=f[2]=f[3]=0;
    };
    Vector(float a,float b,float c,float d)
    {
        f[0]=a; f[1]=b; f[2]=c; f[3]=d;
    };
    Vector operator-() const
    {
        return Vector(-f[0],-f[1],-f[2],-f[3]);
    };
};

class Ray
{
public:
    Vector o,d;
};

class Env
{
public:
    Vector e,v,n,ng;
    void set(const Ray& r);
};

Vector fn()
{
    return Vector(0,0,0,0);
}

void Env::set(const Ray& r)
{
    e = r.o;
    v = -r.d;
    n = fn();
    ng = -n;
}

Build with
g++ -O2 -march=pentium3 -S bug.cc
and look at the code in Env::set():
...
.LC1:
.long -2147483648
.long 0
.long 0
.long 0
.text
.align 2
.p2align 4,,15
.globl _ZN3Env3setERK3Ray
.type _ZN3Env3setERK3Ray, @function
...
movss .LC1, %xmm0
...
movaps %xmm0, -56(%ebp)
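It looks as if, when gcc has to reuse the float constant -0.0 (.LC1, bit
pattern 0x80000000), it loads it from the data segment into an xmm register
and then puts it back onto the stack with an aligned SSE move, at an offset
(-56) that is not a multiple of 16.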
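
To make the alignment argument concrete, here is a minimal sketch (my own
illustration, not part of the test case) of the check that movaps effectively
demands of a memory operand: the effective address must be a multiple of 16.
The names is_aligned_16, slot, kEbp and check_examples are made up, and the
frame pointer value is an assumption chosen to be 16-byte aligned.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: true when addr is a multiple of 16,
// which is what movaps requires of a memory operand.
constexpr bool is_aligned_16(std::uintptr_t addr)
{
    return (addr & 15u) == 0;
}

// Address of a stack slot at the given (usually negative) offset
// from the frame pointer.
constexpr std::uintptr_t slot(std::uintptr_t ebp, std::intptr_t offset)
{
    return ebp + static_cast<std::uintptr_t>(offset);
}

// Assumption: an illustrative, 16-byte aligned frame pointer value.
constexpr std::uintptr_t kEbp = 0xbffff000u;

void check_examples()
{
    assert(is_aligned_16(kEbp));
    assert(!is_aligned_16(slot(kEbp, -56)));  // the slot from the listing
    assert(is_aligned_16(slot(kEbp, -64)));   // a slot movaps could use
}
```

So even in the best case, where %ebp itself is 16-byte aligned, -56(%ebp)
lands 8 bytes off a 16-byte boundary.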
Now that I know what's going on, I have to find a solution apart from
building with -O1... finding the individual -foptimization that causes this ?
Hope this helps.
Would you file the bug with gcc ? If so, please add me to the CCs. If not,
I will do it.
TIA
BTW: is there any #pragma or anything similar to say "this file must be
compiled at -O1" ?
--
J.A. Magallon <[EMAIL PROTECTED]>      \                 Software is like sex:
werewolf.able.es                         \           It's better when it's free
Mandrake Linux release 9.2 (Cooker) for i586
Linux 2.4.22-pre9-jam1m (gcc 3.3.1 (Mandrake Linux 9.2 3.3.1-0.7mdk))