On Sat, 28 Sep 2002, Christoph Egger wrote:
> Ok, that's better than the #ifdef...#endif blocks. But you can't get rid
> of maintaining the same algorithm multiple times in multiple
> implementations.

Generally you only need to optimize the inner part of the algorithm,
where data is being manipulated at high rates in a tight loop, so the
amount of optimization code is not huge.  The optimized code will be
fairly large for the LibBuf Alpha default-* blending algorithms, but
only because that code is large to begin with: there are so many
blending formulas.

I will be including a 64-bit C SWAR implementation, so that compilers
like ICC can try to optimize the tight loops (plus it will give speed
gains on 64-bit processors whose own SIMD instructions have not been
given asm implementations yet.)  However, I don't expect much from
so-called MMX optimizing compilers... even without SIMD I have been
told GCC does some pretty stupid things sometimes.
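
For readers who have not met the term: SWAR is "SIMD within a
register", i.e. treating one ordinary 64-bit integer as eight packed
8-bit channels.  A minimal sketch of the idea, assuming a plain 50%
blend that rounds down; the names swar_avg8 and blend_half are
hypothetical and are not taken from LibBuf:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not LibBuf code: averages eight packed 8-bit
 * channels in one 64-bit operation, rounding down.
 *
 * Per lane, a + b == 2*(a & b) + (a ^ b), so floor((a + b) / 2) is
 * (a & b) + ((a ^ b) >> 1).  Clearing the low bit of every lane before
 * the shift stops bits from leaking into the lane below. */
static uint64_t swar_avg8(uint64_t a, uint64_t b)
{
    const uint64_t clear_lsb = UINT64_C(0xFEFEFEFEFEFEFEFE);
    return (a & b) + (((a ^ b) & clear_lsb) >> 1);
}

/* The tight loop: blend nwords 64-bit words of src into dst at 50%.
 * Assumes both buffers are already arranged as whole 64-bit words. */
static void blend_half(uint64_t *dst, const uint64_t *src, size_t nwords)
{
    assert(dst != NULL && src != NULL);
    for (size_t i = 0; i < nwords; i++)
        dst[i] = swar_avg8(dst[i], src[i]);
}
```

The loop body is plain C with no intrinsics, which is what leaves a
vectorizing compiler free to try its luck on it.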

It is a chore to have to maintain the duplicate parts, but with a
bit of code structure and macros the amount of duplicated
preludes/postludes around the optimized sections can be reduced.  Plus,
the sections where these optimizations are going to be used (default
renderers) don't usually require much maintenance.  Once they've been
made pixel-perfect there is not much more that ever needs to be done
to them (in fact, no one has even noticed that they aren't
pixel-perfect, or made any other changes to the actual crossblit
algorithms, in years.)
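
One way the macro approach could look, as a rough sketch only: a single
macro owns the loop structure and the leftover-byte postlude, and each
implementation supplies just its inner kernel.  Every name here
(DEFINE_BLEND_ROW, avg_byte, avg_word_generic, avg_word_swar) is made
up for illustration and the 50% blend is an assumed stand-in for a real
blending formula; none of it is LibBuf's actual code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Scalar reference formula: 50% blend of one channel, rounding down. */
static uint8_t avg_byte(uint8_t d, uint8_t s)
{
    return (uint8_t)(((unsigned)d + s) >> 1);
}

/* Generic C word kernel: applies the byte formula to each lane in turn. */
static uint64_t avg_word_generic(uint64_t d, uint64_t s)
{
    uint64_t out = 0;
    for (int lane = 0; lane < 8; lane++) {
        uint8_t db = (uint8_t)(d >> (8 * lane));
        uint8_t sb = (uint8_t)(s >> (8 * lane));
        out |= (uint64_t)avg_byte(db, sb) << (8 * lane);
    }
    return out;
}

/* SWAR word kernel: all eight lanes in one go. */
static uint64_t avg_word_swar(uint64_t d, uint64_t s)
{
    return (d & s) + (((d ^ s) & UINT64_C(0xFEFEFEFEFEFEFEFE)) >> 1);
}

/* Shared structure, written once.  WORD_KERNEL handles 8 bytes per
 * step in the tight loop; BYTE_KERNEL mops up the tail. */
#define DEFINE_BLEND_ROW(name, WORD_KERNEL, BYTE_KERNEL)               \
static void name(uint8_t *dst, const uint8_t *src, size_t n)          \
{                                                                      \
    size_t i = 0;                                                      \
    assert(dst != NULL && src != NULL);                                \
    for (; i + 8 <= n; i += 8) {      /* tight loop */                 \
        uint64_t d, s;                                                 \
        memcpy(&d, dst + i, 8);       /* alignment-safe loads */       \
        memcpy(&s, src + i, 8);                                        \
        d = WORD_KERNEL(d, s);                                         \
        memcpy(dst + i, &d, 8);                                        \
    }                                                                  \
    for (; i < n; i++)                /* postlude: leftover bytes */   \
        dst[i] = BYTE_KERNEL(dst[i], src[i]);                          \
}

DEFINE_BLEND_ROW(blend_row_generic, avg_word_generic, avg_byte)
DEFINE_BLEND_ROW(blend_row_swar,    avg_word_swar,    avg_byte)
```

Only the kernels are duplicated per implementation; a fix to the tail
handling lands in one place and every variant picks it up.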

--
Brian
