Andrea Canciani wrote:
- wip/radial-float2 computes the discriminant in double precision
(instead of 128-bit fixed point) and should be both fast (almost two
times faster than wip/radial) and accurate. Does it look good?
I'm a little worried about the int64->double conversions in this
branch. Are there architectures where it might be a problem?
- in wip/radial-float I tried to keep all the variables of the inner
loop in floating point to see how much this affects the performance.
I get small speed improvements, but I don't think they justify this
branch (the accuracy guarantees in this case are very loose since
errors accumulate quadratically with the number of iterations).
I have found that "A*x+B" is almost as fast as "x+B" in floating point
and avoids the accumulated inaccuracy of repeatedly adding a value. The
FPU has an instruction for this operation (a fused multiply-add, where
the hardware provides one). My understanding is that the addition is
the slow part, and the FPU can merge part of it into the multiplication.
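
To make the comparison concrete, here is a rough sketch in C of the two
ways to step a linear value across a scanline. It is not taken from
either branch: the names fill_incremental, fill_direct and
max_abs_error are made up for illustration, and I am assuming that
A is the per-pixel step, x is the pixel index and B is the start value.

```c
#include <stddef.h>

/* Incremental form ("x+B" style): add the step once per pixel.
 * Every addition rounds, so the error can grow with the pixel count. */
static void fill_incremental(float *out, size_t n, float start, float step)
{
    float v = start;
    for (size_t i = 0; i < n; i++) {
        out[i] = v;
        v += step;
    }
}

/* Direct form ("A*x+B" style): recompute from the index each time.
 * The error stays bounded by a couple of roundings per pixel, and a
 * compiler may contract the expression into a fused multiply-add on
 * hardware that has one (assumption: this is the instruction meant). */
static void fill_direct(float *out, size_t n, float start, float step)
{
    for (size_t i = 0; i < n; i++)
        out[i] = step * (float)i + start;
}

/* Largest absolute deviation from a double-precision reference. */
static double max_abs_error(const float *v, size_t n, float start, float step)
{
    double worst = 0.0;
    for (size_t i = 0; i < n; i++) {
        double ref = (double)step * (double)i + (double)start;
        double err = (double)v[i] - ref;
        if (err < 0.0)
            err = -err;
        if (err > worst)
            worst = err;
    }
    return worst;
}
```

Filling 100000 entries with start 0 and step 0.1f and comparing the two
results with max_abs_error should show the incremental version drifting
while the direct version stays within rounding of the reference.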
I am unsure if this will help for higher-order functions, but it seems
that C*(A*x+B)+D has two of these operations and thus will have the
same speed relative to numerical integration.
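
As a sketch of what I mean for the second-order case, assuming the
nested expression is read as a Horner-style evaluation of a quadratic
a*x*x + b*x + c (that reading, and the names quad_nested and
quad_forward_diff, are only illustrative): the nested form costs two
multiply-adds per pixel, while the incremental form costs two additions
per pixel.

```c
#include <stddef.h>

/* Nested form: two multiply-add shaped operations per evaluation. */
static float quad_nested(float a, float b, float c, float x)
{
    return x * (a * x + b) + c;
}

/* Incremental form (forward differences) at x = 0, 1, 2, ...:
 * two additions per pixel, but rounding errors carry over from one
 * pixel to the next. */
static void quad_forward_diff(float *out, size_t n, float a, float b, float c)
{
    float y = c;          /* value at x = 0 */
    float d = a + b;      /* first difference: p(1) - p(0) */
    float dd = 2.0f * a;  /* constant second difference */
    for (size_t i = 0; i < n; i++) {
        out[i] = y;
        y += d;
        d += dd;
    }
}
```

With coefficients that are exact in binary (for example 0.5, 0.25 and
1) the two forms agree on the first few pixels; they only start to
differ once rounding comes into play.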
In any case if you are getting twice the performance by switching to
float, I suspect that a numerically stable floating point version will
still be a win.
_______________________________________________
Pixman mailing list
[email protected]
http://lists.freedesktop.org/mailman/listinfo/pixman