--- Comment #15 from Don <clugd...@yahoo.com.au> 2012-05-02 15:12:18 PDT ---
(In reply to comment #14)
> (In reply to comment #13)
> > (In reply to comment #12)
> > > (In reply to comment #11)
> > > > Haven't done the special case optimizations for constant loading.
> > >
> > > No problem, I'm using GDC anyway which might detect those in the back end.
> > >
> > > An efficient implementation would certainly use at least an xor for 0
> > > initialisation, and the other tricks will get different mileage depending
> > > on the length of the pipeline surrounding. Not accessing memory is always
> > > better if there are pipeline cycles to soak up the latency.
> > The -1 trick is always worth doing, I think. Agner Fog has a nice list in
> > his optimisation manuals, but the only ones _always_ worth doing are the 0
> > and -1 integer cases, and the 0.0 floating point case (also using xor).
> If the compiler knows anything about the pipeline around the code, it should
> be able to make the best choice about the others.
My guess is that it's pretty rare that the alternative sequences are favoured
just on the basis of the pipeline, since MOVDQA only uses a load port, and
nothing else. Especially on Sandy Bridge or AMD, where there are two load
ports. So I doubt there's much benefit to be had.
By contrast, if there's _any_ chance of a cache miss, they'd be a huge win, but
unfortunately that's far beyond the compiler's capabilities.
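
For reference, here is a rough sketch of the three cases being discussed, written in C with SSE2 intrinsics rather than D. It is only an illustration, not what DMD or GDC actually emit: the function names and the table contents are made up, and the instruction named in each comment is what compilers typically generate, not a guarantee.

```c
#include <assert.h>
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>

/* All-zero vector. Typically compiled to PXOR reg,reg: no memory access. */
static __m128i const_zero(void) {
    return _mm_setzero_si128();
}

/* -1 in every lane. Comparing a register with itself for equality sets
   every bit; typically compiled to PCMPEQD reg,reg: no memory access. */
static __m128i const_minus_one(void) {
    __m128i z = _mm_setzero_si128();
    return _mm_cmpeq_epi32(z, z);
}

/* Any other constant (hypothetical values). Loaded from a 16-byte aligned
   table, i.e. an aligned load (MOVDQA) that can miss the cache. */
static __m128i const_from_memory(void) {
    static const _Alignas(16) int32_t table[4] = {1, 2, 3, 4};
    return _mm_load_si128((const __m128i *)table);
}

/* Helper: returns 1 if the four 32-bit lanes of v equal a, b, c, d. */
static int lanes_equal(__m128i v, int32_t a, int32_t b, int32_t c, int32_t d) {
    int32_t out[4];
    _mm_storeu_si128((__m128i *)out, v);
    return out[0] == a && out[1] == b && out[2] == c && out[3] == d;
}

/* Sanity check that all three routes produce the expected constants. */
static void check_constants(void) {
    assert(lanes_equal(const_zero(), 0, 0, 0, 0));
    assert(lanes_equal(const_minus_one(), -1, -1, -1, -1));
    assert(lanes_equal(const_from_memory(), 1, 2, 3, 4));
}
```

The first two never touch memory, which is why they are safe to prefer unconditionally. The third costs only a load port when the table is in cache, so replacing it with a longer register-only sequence only pays off when the load would miss.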