On Fri, 15 Oct 1999, Gabriel Bouvigne wrote:

> > With modern compilers, you don't generally need to bother with replacing
> >
> >   for (i = 0; i < len; ++i)
> >     sum += array[i];
> >
> > with
> >
> >   for (p = array, endp = array + len; p < endp; ++p)
> >     sum += *p;
> >
> > Any compiler worth its salt will do it for you in a microsecond - or might
> > not, if it thinks it will be less efficient; performing the strength
> > reduction yourself restricts the compiler's options.
> 
> The following loop is faster than the array version, both on hppa
> processors using g++ and on pentium, celeron and pIII using both egcs
> and msvc++:
> 
> for (i=len,  p=(array+i); i--; )
>     sum+=*--p;
> 
> Descending loops using pointers have always proven faster than array
> indexing in my experience. I use them a lot in image processing.

I think this is true for a loop with a single pointer. For multiple (too
many) pointers I think Takehiro is right (see earlier postings): the
pointers can no longer all be allocated to registers, so some spill to
memory, and that slows things down. We could/should test this again on
multiple architectures, though.
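
To make the register-pressure point concrete, here is a minimal sketch
(the function and array names are made up for illustration, this is not
lame code):

    /* With three source pointers, one destination pointer and the
     * counter all live at once, a register-starved target like x86
     * may spill some of them to memory on every iteration. */
    static void mix3(float *dst, const float *a, const float *b,
                     const float *c, int len)
    {
        float *pd = dst + len;
        const float *pa = a + len, *pb = b + len, *pc = c + len;
        int i = len;

        while (i--)                 /* runs len times, i = len-1 .. 0 */
            *--pd = *--pa + *--pb + *--pc;
    }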
Anyway, I got this idea from a recent book discussing all sorts of
optimizations and how (good) recent compilers work; I can look up the
reference if you want it.
I was also looking into this kind of thing to change the lame code all
over the place (e.g. inside the subband filtering), as in the sketch
below.
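
A hedged sketch of the kind of rewrite I mean (this is not the actual
lame subband filter, just a typical windowing/dot-product inner loop):

    /* Indexed form, as the code looks now. */
    static float dot_indexed(const float *win, const float *x, int len)
    {
        float sum = 0.0f;
        int i;
        for (i = 0; i < len; i++)
            sum += win[i] * x[i];
        return sum;
    }

    /* Descending-pointer form, Gabriel's idiom. */
    static float dot_descending(const float *win, const float *x, int len)
    {
        const float *w = win + len, *s = x + len;
        float sum = 0.0f;
        int i = len;
        while (i--)               /* loop test is just "i nonzero" */
            sum += *--w * *--s;
        return sum;
    }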
(Part of) the reason for the speedup, I believe, is that you can replace
the "cmp against end value + bne/beq" assembler for 'i' with just the
"bne/beq" (branch if (not) equal): the decrement itself sets the
condition flags, so the separate compare disappears.
I never checked the actual gcc assembler output or the speed myself;
could/should be done for (too)lame? Just another thing I'll add to my
optim/sb-filter list I guess :( ;)

regards,
Patrick.

