On 22-Feb-10, at 10:35 AM, Rodrigo Kumpera wrote:
On Mon, Feb 22, 2010 at 1:39 PM, Siarhei Siamashka <[email protected]> wrote:
On Friday 19 February 2010, Luca Barbato wrote:
> On 02/19/2010 12:57 PM, Siarhei Siamashka wrote:
> > Adding small increments to the values at the end of each loop iteration
> > could be the biggest source of precision loss. Replacing this with an
> > explicit calculation like 'pdx = pdx0 + cx * n' should improve precision
> > and maybe allow floats to be used freely. And floats work better with
> > SIMD on any platform.
>
> And all the SIMD we are covering have a multiply-accumulate instruction
> that would be in use, I'm a bit more concerned about the sqrt usage
> though...
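To illustrate the precision argument above, here is a minimal C sketch that compares the incremental update with the explicit 'pdx = pdx0 + cx * n' form. The function names (fill_incremental, fill_explicit, max_error) and the sample values are assumptions made for this illustration and are not taken from pixman.

```c
#include <stddef.h>

/* Hypothetical helper: fill out[0..n-1] by adding the step cx once per
 * iteration. Each addition rounds, so the error can accumulate. */
static void fill_incremental(float *out, size_t n, float pdx0, float cx)
{
    float pdx = pdx0;
    for (size_t i = 0; i < n; i++) {
        out[i] = pdx;
        pdx += cx;
    }
}

/* Hypothetical helper: compute every value directly from the index,
 * pdx = pdx0 + cx * i, so rounding errors do not build on each other. */
static void fill_explicit(float *out, size_t n, float pdx0, float cx)
{
    for (size_t i = 0; i < n; i++)
        out[i] = pdx0 + cx * (float)i;
}

/* Largest absolute deviation from a double-precision reference value. */
static double max_error(const float *v, size_t n, float pdx0, float cx)
{
    double worst = 0.0;
    for (size_t i = 0; i < n; i++) {
        double ref = (double)pdx0 + (double)cx * (double)i;
        double err = (double)v[i] - ref;
        if (err < 0.0)
            err = -err;
        if (err > worst)
            worst = err;
    }
    return worst;
}
```

With, for example, pdx0 = 0.0f, cx = 0.1f and a few thousand elements, the incremental version should drift noticeably further from the reference than the explicit one on typical IEEE-754 hardware. The explicit form is also one multiply and one add per element, which is the shape a multiply-accumulate instruction expects.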
The usage of sqrt is probably not a fatal performance problem.
ARM11 VFP has a separate DS pipeline which can calculate divides or square
roots simultaneously with the other operations. So it's only a matter of
hiding the very high square root calculation latency.
ARM Cortex-A8 has special SIMD instructions intended to help calculate
reciprocals and reciprocal square roots using the Newton-Raphson method.
SSE has SIMD instructions for calculating square roots.
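As a rough scalar sketch of the Newton-Raphson approach mentioned above: start from a cheap estimate of 1/sqrt(x) and refine it with y' = y * (1.5 - 0.5 * x * y * y). The bit-level initial guess below only stands in for a hardware estimate instruction, and all function names are assumptions made for this example.

```c
#include <stdint.h>
#include <string.h>

/* Crude initial guess for 1/sqrt(x), valid for positive normal floats.
 * Assumption: this well-known bit trick replaces whatever estimate
 * instruction the target hardware provides. */
static float rsqrt_estimate(float x)
{
    uint32_t bits;
    float y;
    memcpy(&bits, &x, sizeof bits);
    bits = 0x5f3759dfu - (bits >> 1);
    memcpy(&y, &bits, sizeof y);
    return y;
}

/* One Newton-Raphson step for f(y) = 1/(y*y) - x. */
static float rsqrt_refine(float x, float y)
{
    return y * (1.5f - 0.5f * x * y * y);
}

/* Estimate followed by a caller-chosen number of refinement steps. */
static float rsqrt_nr(float x, int iterations)
{
    float y = rsqrt_estimate(x);
    for (int i = 0; i < iterations; i++)
        y = rsqrt_refine(x, y);
    return y;
}

/* For x > 0, sqrt(x) == x * (1/sqrt(x)). */
static float sqrt_nr(float x, int iterations)
{
    return x * rsqrt_nr(x, iterations);
}
```

Each step roughly doubles the number of correct bits, so the iteration count needed depends on how good the initial estimate is. Because the steps are only multiplies and subtractions, they vectorize easily and can overlap with other work.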
Note that SSE functions for square root and reciprocal are unfit for this
task as they have very little precision (12 bits). At least one
Newton-Raphson iteration must be done to have usable values.
rsqrt is an approximation; however, the regular square root
instructions are not, and they are fine replacements for the sqrt()
function. In fact, gcc on OS X already does this substitution.
-Jeff
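
For reference, the two SSE options discussed in this thread could look roughly like the sketch below: SQRTPS via _mm_sqrt_ps, and the RSQRTPS estimate via _mm_rsqrt_ps followed by one Newton-Raphson step. The wrapper names sqrt4_exact and sqrt4_approx are made up for this example, and the code only compiles on targets where SSE is enabled.

```c
#if defined(__SSE__)
#include <xmmintrin.h>

/* Square roots of four floats using the regular SQRTPS instruction. */
static void sqrt4_exact(const float in[4], float out[4])
{
    _mm_storeu_ps(out, _mm_sqrt_ps(_mm_loadu_ps(in)));
}

/* Square roots of four floats from the RSQRTPS estimate plus one
 * Newton-Raphson step, then sqrt(x) = x * rsqrt(x).
 * Inputs must be > 0: x == 0 would produce 0 * inf = NaN. */
static void sqrt4_approx(const float in[4], float out[4])
{
    const __m128 half = _mm_set1_ps(0.5f);
    const __m128 three_halves = _mm_set1_ps(1.5f);
    __m128 x = _mm_loadu_ps(in);
    __m128 y = _mm_rsqrt_ps(x);   /* low-precision estimate */

    /* y = y * (1.5 - 0.5 * x * y * y) */
    __m128 t = _mm_mul_ps(_mm_mul_ps(half, x), _mm_mul_ps(y, y));
    y = _mm_mul_ps(y, _mm_sub_ps(three_halves, t));

    _mm_storeu_ps(out, _mm_mul_ps(x, y));
}
#endif /* __SSE__ */
```

The refined estimate is close to full single precision but is not guaranteed to be correctly rounded, and it needs special handling for zero, so which variant is preferable depends on the precision and latency requirements of the caller.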
_______________________________________________
Pixman mailing list
[email protected]
http://lists.freedesktop.org/mailman/listinfo/pixman