On Tue, Feb 28, 2012 at 4:32 AM, Christophe Gisquet
<[email protected]> wrote:
> Hi,
>
> in my profiling, ff_sbr_apply accounts for ~12% of the decoding time,
> but it has a lot of inlining. The divisions I modify seem to account
> for ~5% of that, so precomputing them is interesting. Disassembly
> shows that gcc 4.6.2 on x86_64 with -funsafe-math-optimizations does
> not do this, while the attached patch does remove 2 such divisions. Not
> sure what part of these explanations is interesting to put in the log
> message.

A possible improvement to this on x86 would be to use rcpps (fast
approximate reciprocal, about 12 bits of precision); if extra precision
is needed, one added iteration of Newton's method will achieve ~22 bits
out of the 24 bits of expected single-precision accuracy.
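
To make the suggestion concrete, here is a minimal sketch using the SSE
intrinsics from <xmmintrin.h>. The helper names (rcp_nr_ps,
rcp_nr_rel_error) are hypothetical and not part of the patch or of the
SBR code; this only illustrates the refinement step r' = r * (2 - x * r)
and assumes an x86 target with SSE enabled.

```c
#include <xmmintrin.h>  /* SSE intrinsics: _mm_rcp_ps, _mm_mul_ps, ... */

/* Hypothetical helper: approximate 1/x for four packed floats.
 * rcpps alone gives roughly 12 bits; one Newton-Raphson step,
 * r' = r * (2 - x * r), roughly doubles the number of correct bits. */
static inline __m128 rcp_nr_ps(__m128 x)
{
    const __m128 two = _mm_set1_ps(2.0f);
    __m128 r = _mm_rcp_ps(x);                 /* initial estimate of 1/x */
    return _mm_mul_ps(r, _mm_sub_ps(two, _mm_mul_ps(x, r)));
}

/* Hypothetical helper for checking accuracy: relative error of the
 * refined reciprocal for a single nonzero input, evaluated in double. */
static double rcp_nr_rel_error(float x)
{
    /* Broadcast x to all lanes so no lane computes 1/0. */
    float approx = _mm_cvtss_f32(rcp_nr_ps(_mm_set1_ps(x)));
    double err = (double)approx * (double)x - 1.0;
    return err < 0.0 ? -err : err;
}
```

A division a / b would then become a multiplication by rcp_nr_ps(b).
Whether this beats the precomputed divisions in the patch would have to
be measured.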

Jason
_______________________________________________
libav-devel mailing list
[email protected]
https://lists.libav.org/mailman/listinfo/libav-devel
