"José Fonseca" wrote: > > In these last few days I have been working on the Mesa software blending > and the existing MMX bug. I've made some progress. > > I made a small test program which calls the relevant functions directly as > Alex suggested. In the process I added comments to the assembly code > (which had none). The error is due to the fact that the inner loop blends > two pixels at the same time, so if the mask of the first element is zero > then both are skipped. I also spotted some errors in the runin section, > e.g., it ANDs with 4 and compares the result with 8 which is impossible... > I still have to study the x86 architecture optimization a little further > to know how to optimally fix both these situations. > > I also made two optimizations in blend_transparency(s_blend.c) which have > no effect in the result precision but that achieved a global speedup of > 30% in the function. These optimizations are in the C code and benefit all > architectures. > > The first was to avoid the repetition of the input variable in the DIV255. > At least my version of gcc (2.96) wasn't factoring the common code out > yelding to a 17% speedup. > > The second was to factor the equation of blending reducing in half the > number of multiplications. This optimization can be applied in other > places on this file as well.

Good work.  I'll review your changes and probably apply them to the Mesa
trunk (for version 4.1) later today.

> A third optimization that I'll try is the "double blend" trick (make two
> 8-bit multiplications at the same time in a 32-bit register) as documented
> by Michael Herf (http://www.stereopsis.com/doubleblend.html - a quite
> interesting site referred to me by Brian).

I was going to do that someday too.  Go for it.

> I would like to keep improving Mesa software rendering performance. I know
> that due to its versatility and power Mesa will never rival a
> dedicated and non-conformant software 3D engine such as the Unreal one;
> nevertheless I think that it's possible to make it useful for simple
> realtime rendering.
>
> Regards,

Despite the proliferation of 3D hardware, there'll always be applications
for software rendering.  For example, the 16-bit color channel feature is
being used by several animation houses.

-Brian

_______________________________________________
Dri-devel mailing list
https://lists.sourceforge.net/lists/listinfo/dri-devel
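
[Editor's illustration] The "double blend" idea mentioned in the thread, handling two 8-bit channels with one 32-bit multiplication, can be sketched as follows. This is a simplified variant written for this note, not the code from the linked page and not anything in Mesa. It assumes a 0x00RRGGBB pixel layout and a weight a in the range 0..256, so it divides by 256 rather than 255 and is therefore an approximation of the exact blend. The name blend_rgb_pair is hypothetical.

```c
#include <stdint.h>

/* Blend src over dst with weight a (0 = all dst, 256 = all src).
 * Red and blue sit 16 bits apart, so both are multiplied in one
 * 32-bit operation; each product fits in its own 16-bit lane because
 * 255 * 256 < 65536, so nothing carries from blue into red. */
static inline uint32_t blend_rgb_pair(uint32_t src, uint32_t dst, uint32_t a)
{
    uint32_t inv = 256u - a;

    /* Red and blue together. */
    uint32_t rb = ((src & 0x00FF00FFu) * a + (dst & 0x00FF00FFu) * inv) >> 8;

    /* Green on its own. */
    uint32_t g  = ((src & 0x0000FF00u) * a + (dst & 0x0000FF00u) * inv) >> 8;

    /* Mask off the low-order fraction bits left in each lane. */
    return (rb & 0x00FF00FFu) | (g & 0x0000FF00u);
}
```

For example, blend_rgb_pair(0x00FF8040, 0x00002060, 128) averages the two pixels channel by channel, using four multiplications where a per-channel loop would use six.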