On Fri, Nov 27, 2015 at 5:35 AM, Rostislav Pehlivanov <atomnu...@gmail.com> wrote:
> LGTM, but could you leave (just comment it out) the old code in there
> so it's a little easier to follow?
>> //ff_aac_pow2sf_tab[i] = pow(2, (i - POW_SF2_ZERO) / 4.0);
>> //ff_aac_pow34sf_tab[i] = pow(ff_aac_pow2sf_tab[i], 3.0/4.0);
>
> The accuracy increase is always nice.

Done and pushed. Thanks. BTW, do you or others think that the new
performance figures are sufficient to justify getting rid of
config_hardcoded_tables and the associated ifdefry and C file here?

>
> On Thu, 2015-11-26 at 16:31 -0500, Ganesh Ajjanagadde wrote:
>> This speeds up aac_tablegen to a ludicrous degree (~97%), i.e. to the
>> point where it can be argued that runtime initialization can always be
>> done instead of hard-coded tables. The only cost is essentially a
>> trivial increase in the stack size.
>>
>> Even if one does not care about this, the patch also improves accuracy
>> as detailed below.
>>
>> Performance:
>> Benchmark obtained by looping 10^4 times over ff_aac_tableinit.
>>
>> Sample benchmark (x86-64, Haswell, GNU/Linux):
>> old:
>> 1295292 decicycles in ff_aac_tableinit, 512 runs, 0 skips
>> 1275981 decicycles in ff_aac_tableinit, 1024 runs, 0 skips
>> 1272932 decicycles in ff_aac_tableinit, 2048 runs, 0 skips
>> 1262164 decicycles in ff_aac_tableinit, 4096 runs, 0 skips
>> 1256720 decicycles in ff_aac_tableinit, 8192 runs, 0 skips
>>
>> new:
>> 25691 decicycles in ff_aac_tableinit, 505 runs, 7 skips
>> 25130 decicycles in ff_aac_tableinit, 1016 runs, 8 skips
>> 25973 decicycles in ff_aac_tableinit, 2036 runs, 12 skips
>> 25911 decicycles in ff_aac_tableinit, 4078 runs, 18 skips
>> 25816 decicycles in ff_aac_tableinit, 8154 runs, 38 skips
>>
>> Accuracy:
>> The previous code needlessly lost accuracy because pow was called
>> twice in succession, with the second call taking the already rounded
>> result of the first. As an illustration of this:
>> ff_aac_pow34sf_tab[3]
>> old : 0.000000000007598092294225
>> new : 0.000000000007598091426864
>> real: 0.000000000007598091778545
>>
>> truncated to float
>> old : 0.000000000007598092294225
>> new : 0.000000000007598091426864
>> real: 0.000000000007598091426864
>>
>> showing that the old value was not correctly rounded. This affects a
>> large number of elements of the array.
>>
>> Patch tested with FATE.
>>
>> Signed-off-by: Ganesh Ajjanagadde <gajjanaga...@gmail.com>
>> ---
>>  libavcodec/aac_tablegen.h | 38 ++++++++++++++++++++++++++++++++++++--
>>  1 file changed, 36 insertions(+), 2 deletions(-)
>>
>> diff --git a/libavcodec/aac_tablegen.h b/libavcodec/aac_tablegen.h
>> index 8b223f9..255723b 100644
>> --- a/libavcodec/aac_tablegen.h
>> +++ b/libavcodec/aac_tablegen.h
>> @@ -35,9 +35,43 @@ float ff_aac_pow34sf_tab[428];
>>  av_cold void ff_aac_tableinit(void)
>>  {
>>      int i;
>> +
>> +    /* 2^(i/16) for 0 <= i <= 15 */
>> +    const double exp2_lut[] = {
>> +        1.00000000000000000000,
>> +        1.04427378242741384032,
>> +        1.09050773266525765921,
>> +        1.13878863475669165370,
>> +        1.18920711500272106672,
>> +        1.24185781207348404859,
>> +        1.29683955465100966593,
>> +        1.35425554693689272830,
>> +        1.41421356237309504880,
>> +        1.47682614593949931139,
>> +        1.54221082540794082361,
>> +        1.61049033194925430818,
>> +        1.68179283050742908606,
>> +        1.75625216037329948311,
>> +        1.83400808640934246349,
>> +        1.91520656139714729387,
>> +    };
>> +    double t1 = 8.8817841970012523233890533447265625e-16; // 2^(-50)
>> +    double t2 = 3.63797880709171295166015625e-12; // 2^(-38)
>> +    int t1_inc_cur, t2_inc_cur;
>> +    int t1_inc_prev = 0;
>> +    int t2_inc_prev = 8;
>> +
>>      for (i = 0; i < 428; i++) {
>> -        ff_aac_pow2sf_tab[i] = pow(2, (i - POW_SF2_ZERO) / 4.0);
>> -        ff_aac_pow34sf_tab[i] = pow(ff_aac_pow2sf_tab[i], 3.0/4.0);
>> +        t1_inc_cur = 4 * (i % 4);
>> +        t2_inc_cur = (8 + 3*i) % 16;
>> +        if (t1_inc_cur < t1_inc_prev)
>> +            t1 *= 2;
>> +        if (t2_inc_cur < t2_inc_prev)
>> +            t2 *= 2;
>> +        ff_aac_pow2sf_tab[i] = t1 * exp2_lut[t1_inc_cur];
>> +        ff_aac_pow34sf_tab[i] = t2 * exp2_lut[t2_inc_cur];
>> +        t1_inc_prev = t1_inc_cur;
>> +        t2_inc_prev = t2_inc_cur;
>>      }
>>  }
>>  #endif /* CONFIG_HARDCODED_TABLES */
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel