On Mon, Feb 22, 2010 at 1:39 PM, Siarhei Siamashka < [email protected]> wrote:
> On Friday 19 February 2010, Luca Barbato wrote:
> > On 02/19/2010 12:57 PM, Siarhei Siamashka wrote:
> > > Adding small increments to the values at the end of loop iteration
> > > could be the biggest source of precision loss. Replacing this with
> > > explicit calculation like 'pdx = pdx0 + cx * n' should improve
> > > precision and maybe allow to use floats freely. And floats work
> > > better with SIMD on any platforms.
> >
> > And all the SIMD we are covering have a multiply-accumulate instruction
> > that would be in use, I'm a bit more concerned about the sqrt usage
> > though...
>
> The usage of sqrt is probably not a fatal performance problem.
>
> ARM11 VFP has a separate DS pipeline which can calculate divides or square
> roots simultaneously with the other operations. So it's only a matter of
> hiding very high square root calculation latency.
>
> ARM Cortex-A8 has special SIMD instructions intended to help calculating
> reciprocals and reciprocal square roots using Newton-Raphson method.
>
> SSE has SIMD instructions for calculating square roots.

Note that the fast SSE instructions for reciprocal square root and
reciprocal (rsqrtps, rcpps) are unfit for this task on their own, as they
have very little precision (about 12 bits). At least one Newton-Raphson
iteration must be done to get usable values.
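
To make the Newton-Raphson point concrete, here is a rough scalar sketch in C
of refining a low-precision reciprocal square root estimate with one
iteration. This is not pixman code; the helper names (rsqrt_estimate,
rsqrt_refine, sqrt_via_rsqrt) are made up for illustration. When SSE is not
available, the fallback path only emulates a ~12-bit estimate by truncating
the mantissa, which is an assumption of this sketch rather than how any real
hardware computes it.

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>
#include <string.h>

#if defined(__SSE__)
#include <xmmintrin.h>
#endif

/* Low-precision estimate of 1/sqrt(x), roughly 12 bits. */
static float rsqrt_estimate(float x)
{
#if defined(__SSE__)
    /* Scalar form of the SSE reciprocal square root approximation. */
    return _mm_cvtss_f32(_mm_rsqrt_ss(_mm_set_ss(x)));
#else
    /* Emulation only: compute the exact value, then clear the low 11 of
     * the 23 mantissa bits so about 12 bits of precision remain. */
    float y = 1.0f / sqrtf(x);
    uint32_t bits;
    memcpy(&bits, &y, sizeof bits);
    bits &= 0xFFFFF800u;
    memcpy(&y, &bits, sizeof bits);
    return y;
#endif
}

/* One Newton-Raphson step for f(y) = 1/(y*y) - x:
 *     y' = y * (1.5 - 0.5 * x * y * y)
 * Each step roughly doubles the number of correct bits. */
static float rsqrt_refine(float x, float y)
{
    return y * (1.5f - 0.5f * x * y * y);
}

/* sqrt(x) = x * (1/sqrt(x)), using the refined estimate. x must be > 0. */
static float sqrt_via_rsqrt(float x)
{
    assert(x > 0.0f);
    return x * rsqrt_refine(x, rsqrt_estimate(x));
}
```

A SIMD version would do the same arithmetic on four floats at once
(_mm_rsqrt_ps plus a few multiplies and a subtract), so the refinement step
costs little compared with a full-precision square root.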
_______________________________________________
Pixman mailing list
[email protected]
http://lists.freedesktop.org/mailman/listinfo/pixman
