On Sun, May 06, 2018 at 04:53:54PM +0200, Moritz Barsnick wrote:
> On Sun, May 06, 2018 at 13:40:58 +0200, Clément Bœsch wrote:
> > Overall speed appears to be 1.1x faster with no noticeable quality impact.
>
> Probably platform dependent?
>
> >  struct weighted_avg {
> > -    double total_weight;
> > -    double sum;
> > +    float total_weight;
> > +    float sum;
> >  };
>
> I believe these calculations in nlmeans_plane() will promote to double
> before being cast back to float:
>
>         // Also weight the centered pixel
>         wa->total_weight += 1.0;
>         wa->sum += 1.0 * src[y*src_linesize + x];
>
> (At least the second one. The first one - just an assignment of a
> constant - is covered by the preprocessor, IIUC.) They need to use
> "1.0f".
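
For reference, the promotion described above can be checked at compile time.
What follows is only a minimal sketch, assuming a C11 compiler (for _Generic)
and assuming src points to uint8_t samples. The IS_DOUBLE macro and the three
function names are hypothetical helpers made up for illustration, not code
from the patch:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: 1 if the expression has type double, 0 otherwise.
 * _Generic only inspects the type, the operand is not evaluated. */
#define IS_DOUBLE(expr) _Generic((expr), double: 1, default: 0)

struct weighted_avg {
    float total_weight;
    float sum;
};

/* 1.0 is a double constant, so the product has type double. */
int product_is_double_with_double_constant(uint8_t px)
{
    return IS_DOUBLE(1.0 * px);
}

/* 1.0f is a float constant, so the product has type float. */
int product_is_double_with_float_constant(uint8_t px)
{
    return IS_DOUBLE(1.0f * px);
}

/* Centered-pixel weighting written with float constants only
 * (hypothetical standalone version of the two quoted statements). */
void add_centered_pixel(struct weighted_avg *wa, const uint8_t *src,
                        int src_linesize, int x, int y)
{
    assert(wa && src);
    wa->total_weight += 1.0f;
    wa->sum          += 1.0f * src[y*src_linesize + x];
}
```

With the double constant, the multiplication and the addition to wa->sum are
done in double and the result is converted back to float on assignment. With
"1.0f" the expression has type float throughout.
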

It doesn't really matter here actually, in "lavfi/nlmeans: move final
weighted averaging out of nlmeans_plane" you can see that this code
represents 0.24% of the CPU time. I fixed it locally anyway, thanks.

> (There are others, but only in init(), which don't matter for
> performance.)

Yeah, I left these to double on purpose.

-- 
Clément B.
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel