On Sat, 6 May 2000, Dan Nelson wrote:

> In the last episode (May 05), Jean-Marc Zucconi said:
> > Here is something I don't understand:
> > 
> > $ sh -c  '/usr/bin/time  ./a.out'
> >         2.40 real         2.38 user         0.01 sys
> > $ /usr/bin/time  ./a.out
> >         7.19 real         7.19 user         0.00 sys

> It has to do with your stack.  Calling the program via /bin/sh sets up
> your environment differently, so your program's stack starts at a
> different place.  Try running this:

> Here are some bits from the gcc infopage explaining your options if you
> want consistent speed from programs using doubles:
> 
> `-mpreferred-stack-boundary=NUM'
>      Attempt to keep the stack boundary aligned to a 2 raised to NUM
>      byte boundary.  If `-mpreferred-stack-boundary' is not specified,
>      the default is 4 (16 bytes or 128 bits).
>      The stack is required to be aligned on a 4 byte boundary.  On
>      Pentium and PentiumPro, `double' and `long double' values should be
>      aligned to an 8 byte boundary (see `-malign-double') or suffer
>      significant run time performance penalties.  On Pentium III, the
>      Streaming SIMD Extensions (SSE) data type `__m128' suffers similar
>      penalties if it is not 16 byte aligned.
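
To check which case a given run falls into, one can look at the address
of a local double modulo 8.  A minimal sketch follows; the helper name
offset_from_boundary is made up for illustration and is not part of gcc
or FreeBSD:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: distance in bytes from addr back to the previous
 * align-byte boundary.  A result of 0 means addr is aligned.
 * align must be a nonzero power of two.
 */
static uintptr_t
offset_from_boundary(uintptr_t addr, uintptr_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (addr & (align - 1));
}
```

Passing (uintptr_t)&d for a local double d, with 8 as the alignment,
should give 0 in the fast case and 4 in the slow one on i386.  A
compiler that realigns the stack itself will probably always give 0.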

The default of 4 for -mpreferred-stack-boundary perfectly preserves
any initial misalignment of the stack.  Under FreeBSD the stack is
initially misaligned (for doubles) with a probability of 1/2.  There
was some discussion of fixing this when gcc-2.95 was imported, but
nothing was committed.  I use the following local hack:

diff -c2 kern_exec.c~ kern_exec.c
*** kern_exec.c~        Mon May  1 15:56:40 2000
--- kern_exec.c Mon May  1 15:56:42 2000
***************
*** 627,630 ****
--- 647,659 ----
                vectp = (char **)
                        (destp - (imgp->argc + imgp->envc + 2) * sizeof(char*));
+ 
+       /*
+        * Align stack to a multiple of 0x20.
+        * XXX vectp has the wrong type; we usually want a vm_offset_t;
+        * the suword() family takes a void *, but should take a vm_offset_t.
+        * XXX should align stack for signals too.
+        * XXX should do this more machine/compiler-independently.
+        */
+       vectp = (char **)(((vm_offset_t)vectp & ~(vm_offset_t)0x1F) - 4);
  
        /*
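
The arithmetic in the added line can be tried in userland.  This is only
a sketch: align_vectp is a made-up name, and uintptr_t stands in for the
kernel's vm_offset_t.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical userland copy of the expression applied to vectp:
 * round the address down to a multiple of 0x20, then back off 4 bytes.
 * The result always sits 4 bytes below a 32-byte boundary.
 */
static uintptr_t
align_vectp(uintptr_t addr)
{
	uintptr_t aligned = addr & ~(uintptr_t)0x1F;

	assert(aligned >= 4);	/* avoid wrapping below address 0 */
	return (aligned - 4);
}
```

For example, both 0x1000 and 0x101F map to 0xFFC, and 0x1020 maps to
0x101C, so the result no longer depends on how much space the argument
and environment strings took.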

Bruce


