On Wed, Feb 27, 2002 at 10:03:22AM -0800, ME wrote:
<snip>
> $ gcc gcc -funroll-all-loops -S sample.c
>
> When I inspect the above, I see loops included.
> -12(%ebp) (3 32-bit offset from %ebp) is set to 5 and -4(%ebp) is incl
> until it is cmpl to be no longer less than -12(%ebp).
>
> Labels even show loops when you watch it. I count about 3 when I quickly
> scan it.
>
> This would lead me to believe the generated asm, code is not unrolled if I
> understand the expectation of the unrolling process. (I would guess
<snip>
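
For reference, here is a rough sketch of what unrolling amounts to if you
do it by hand. The function names and the factor of 4 are made up for
illustration; they are not taken from sample.c, and gcc chooses its own
factor and code shape.

```c
#include <stddef.h>

/* Rolled form: one compare-and-branch per element. */
long sum_rolled(const int *a, size_t n)
{
    long total = 0;
    size_t i;

    for (i = 0; i < n; i++)
        total += a[i];
    return total;
}

/* Hand-unrolled by 4 (an arbitrary factor chosen for illustration):
 * one compare-and-branch per four elements. */
long sum_unrolled4(const int *a, size_t n)
{
    long total = 0;
    size_t i = 0;

    /* Main loop: four copies of the body per iteration. */
    for (; i + 4 <= n; i += 4) {
        total += a[i];
        total += a[i + 1];
        total += a[i + 2];
        total += a[i + 3];
    }

    /* Remainder loop: picks up the 0 to 3 elements left over. */
    for (; i < n; i++)
        total += a[i];

    return total;
}
```

Note that the unrolled version still contains loops and would still show
loop labels in the asm. What changes is that the body is repeated inside
each iteration, so counting labels alone won't tell you whether unrolling
happened.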
From the gcc manual: "-funroll-all-loops ... usually makes the program
run slower."

gcc doesn't unroll loops unless optimization is turned on: without an -O
level the unroll flags are ignored, so add something like -O2 or -O3 to
the command line. Also, unrolling a loop makes the code larger and can
make it too big to fit in the instruction cache, at which point it gets
very much slower because the processor has to deal with cache misses.
The break-even point is hard to predict, but intuitively a loop that
does tons of processing and loops maybe 3 times is not a good candidate,
while a loop that executes thousands of times and does diddly would see
a speedup.

ISTR that gcc unrolls loops intelligently where it can, so a loop that
executes n times won't end up with n copies of the code, generally
speaking.

You'll also want to look at -fmove-all-movables, -frerun-cse-after-loop,
-frerun-loop-opt, and -fexpensive-optimizations. They're all documented
in the gcc manual.

-- 
A watched process never cores.
_______________________________________________
vox-tech mailing list
http://lists.lugod.org/mailman/listinfo/vox-tech