On Wed, Nov 25, 2015 at 9:03 PM, Timothy Gu <timothyg...@gmail.com> wrote:
> On Wed, Nov 25, 2015 at 05:17:29PM -0500, Ganesh Ajjanagadde wrote:
>> +        double f = value * cbrt_lut[value] * pow(2, (exponent - 400) * 0.25 + FRAC_BITS + 5) / IMDCT_SCALAR;
>
> While at it, you could change pow(2 to exp2(, which has a libm.h shim
> and is easily 4 times faster than pow() on my machine (glibc 2.19, Haswell).

Thanks for the suggestion. However, it won't yield much of a gain, since IIRC this second loop is only ~10% of the net cost with the proposed patch. Furthermore, I have an even better way of doing this in mind, which I might do in a separate patch (provided all the cross-compiling annoyances get resolved); I wanted that cleared up before proceeding on these things.

The better way is to notice that the pow is always 2^n * something from pow2_lut, where n is an integer, positive or negative. Thus 2^n can be computed iteratively, likely with no accuracy issues since 2^n is exact in double precision, and then multiplied by the appropriate entry from pow2_lut depending on the exponent % 4.

The shim issue is that we can't use avutil/libm IIUC; it needs some other method, likely ifdefry. The above idea scores better in that respect as well. As it is a little trickier than the current changes, which are trivial low-hanging fruit, it belongs in a separate patch IMHO. Feel free to try it out if interested :).

>
> Timothy
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> http://ffmpeg.org/mailman/listinfo/ffmpeg-devel