On 22-Feb-10, at 10:35 AM, Rodrigo Kumpera wrote:



On Mon, Feb 22, 2010 at 1:39 PM, Siarhei Siamashka <[email protected] > wrote:
On Friday 19 February 2010, Luca Barbato wrote:
> On 02/19/2010 12:57 PM, Siarhei Siamashka wrote:
> > Adding small increments to the values at the end of loop iteration could
> > be the biggest source of precision loss. Replacing this with explicit
> > calculation like 'pdx = pdx0 + cx * n' should improve precision and maybe
> > allow to use floats freely. And floats work better with SIMD on any
> > platforms.
>
> And all the SIMD we are covering have a multiply-accumulate instruction
> that would be in use, I'm a bit more concerned about the sqrt usage
> though...
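
To illustrate the precision point quoted above, here is a minimal C sketch
comparing repeated small increments with the explicit form
'pdx = pdx0 + cx * n'. The helper names (step_incremental, step_explicit,
abs_error) are hypothetical and are not pixman code; the sketch only shows
how per-iteration rounding adds up in single precision.

```c
#include <assert.h>
#include <math.h>

/* Hypothetical helper: advance pdx by adding cx once per iteration.
   Every addition rounds to float, so the error grows with n. */
static float step_incremental(float pdx0, float cx, int n)
{
    assert(n >= 0);
    float pdx = pdx0;
    for (int i = 0; i < n; i++)
        pdx += cx;
    return pdx;
}

/* Hypothetical helper: explicit calculation, pdx = pdx0 + cx * n.
   Only the multiply and the add round, independent of n. */
static float step_explicit(float pdx0, float cx, int n)
{
    assert(n >= 0);
    return pdx0 + cx * (float)n;
}

/* Absolute error of a float result against a double-precision reference. */
static double abs_error(float value, float pdx0, float cx, int n)
{
    double exact = (double)pdx0 + (double)cx * (double)n;
    return fabs((double)value - exact);
}
```

With pdx0 = 0, cx = 0.1f and a large n, calling abs_error() on both results
should show the incremental version drifting much further from the reference
than the explicit one. The explicit form also maps directly onto a
multiply-accumulate instruction.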

The usage of sqrt is probably not a fatal performance problem.

ARM11 VFP has a separate DS pipeline which can calculate divides or square
roots simultaneously with other operations. So it's only a matter of
hiding the very high latency of the square root calculation.

ARM Cortex-A8 has special SIMD instructions intended to help calculate
reciprocals and reciprocal square roots using the Newton-Raphson method.

SSE has SIMD instructions for calculating square roots.


Note that the SSE functions for square root and reciprocal are unfit for
this task, as they have very little precision (12 bits). At least one
Newton-Raphson iteration must be done to get usable values.

rsqrt is an approximation; however, the regular square root instructions
are not, and they are fine replacements for the sqrt() function. In fact,
gcc on OS X already does this substitution.

-Jeff


_______________________________________________
Pixman mailing list
[email protected]
http://lists.freedesktop.org/mailman/listinfo/pixman
