On Wed, Nov 25, 2015 at 6:29 AM, Michael Niedermayer <michae...@gmx.at> wrote:
> On Tue, Nov 24, 2015 at 10:13:22PM -0500, Ganesh Ajjanagadde wrote:
>> This is a trivial rewrite of the loops that results in better
>> prefetching and associated cache efficiency. Essentially, the problem is
>> that modern prefetching logic is based on finite state Markov memory, a
>> reasonable assumption that is used elsewhere in CPU's in for instance
>> branch predictors.
>>
>> Surrounding loops all iterate forward through the array, making the
>> predictor think of prefetching in the forward direction, but the
>> intermediate loop is unnecessarily in the backward direction.
>>
>> Speedup is nontrivial. Benchmarks obtained by 10^6 iterations within
>> solve_lls, with START/STOP_TIMER. File is
>> tests/data/fate/flac-16-lpc-cholesky.err.
>> Hardware: x86-64, Haswell, GNU/Linux.
>>
>> new:
>> 17291 decicycles in solve_lls, 2096706 runs, 446 skips
>> 17255 decicycles in solve_lls, 4193657 runs, 647 skips
>> 17231 decicycles in solve_lls, 8384997 runs, 3611 skips
>> 17189 decicycles in solve_lls,16771010 runs, 6206 skips
>> 17132 decicycles in solve_lls,33544757 runs, 9675 skips
>> 17092 decicycles in solve_lls,67092404 runs, 16460 skips
>> 17058 decicycles in solve_lls,134188213 runs, 29515 skips
>>
>> old:
>> 18009 decicycles in solve_lls, 2096665 runs, 487 skips
>> 17805 decicycles in solve_lls, 4193320 runs, 984 skips
>> 17779 decicycles in solve_lls, 8386855 runs, 1753 skips
>> 18289 decicycles in solve_lls,16774280 runs, 2936 skips
>> 18158 decicycles in solve_lls,33548104 runs, 6328 skips
>> 18420 decicycles in solve_lls,67091793 runs, 17071 skips
>> 18310 decicycles in solve_lls,134187219 runs, 30509 skips
>>
>> Reviewed-by: Michael Niedermayer <mich...@niedermayer.cc>
>> Signed-off-by: Ganesh Ajjanagadde <gajjanaga...@gmail.com>
>> ---
>> libavutil/lls.c | 4 ++--
>> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> LGTM
>
> thx
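
For archive readers: the quoted description boils down to flipping the
direction of one inner loop so that it walks memory the same way as the
loops around it. The actual lls.c hunk is not reproduced in this thread,
so the following is only a minimal sketch under assumed names
(residual_backward, residual_forward and their parameters are
hypothetical, not libavutil API). It shows the two traversal orders for
the same accumulation.

```c
#include <assert.h>

/* Hypothetical helper, NOT taken from libavutil/lls.c.
 * Subtracts a[k] * b[k] from `start`, walking k from n-1 down to 0,
 * i.e. against the direction of forward-iterating outer loops. */
double residual_backward(const double *a, const double *b, int n, double start)
{
    double sum = start;
    assert(n >= 0);
    for (int k = n - 1; k >= 0; k--)
        sum -= a[k] * b[k];
    return sum;
}

/* Same accumulation, but walking k from 0 up to n-1, so the memory
 * access direction matches forward-iterating neighbours. */
double residual_forward(const double *a, const double *b, int n, double start)
{
    double sum = start;
    assert(n >= 0);
    for (int k = 0; k < n; k++)
        sum -= a[k] * b[k];
    return sum;
}
```

Both functions subtract the same set of products. With floating-point
inputs the different summation order can change the rounding of the
result slightly, so the two are only guaranteed to agree exactly when
every intermediate value is exactly representable.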

pushed, thanks

>
> [...]
> --
> Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
>
> Those who are best at talking, realize last or never when they are wrong.
>
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> http://ffmpeg.org/mailman/listinfo/ffmpeg-devel