2019-01-20 23:37 GMT+01:00, Michael Niedermayer <mich...@niedermayer.cc>:
> On Sun, Jan 20, 2019 at 10:33:26PM +0100, Carl Eugen Hoyos wrote:
>> 2019-01-20 22:22 GMT+01:00, Michael Niedermayer <g...@videolan.org>:
>> > ffmpeg | branch: master | Michael Niedermayer <mich...@niedermayer.cc> | Thu Jan 17 22:35:10 2019 +0100| [12b1338be376a3e5fb606d9fe41b58dc4a9e62c7] | committer: Michael Niedermayer
>> >
>> > avutil/mem: Optimize fill32() by unrolling and using 64bit
>> >
>> > Reviewed-by: Marton Balint <c...@passwd.hu>
>> > Signed-off-by: Michael Niedermayer <mich...@niedermayer.cc>
>> >
>> http://git.videolan.org/gitweb.cgi/ffmpeg.git/?a=commit;h=12b1338be376a3e5fb606d9fe41b58dc4a9e62c7
>> > ---
>> >
>> >  libavutil/mem.c | 12 ++++++++++++
>> >  1 file changed, 12 insertions(+)
>> >
>> > diff --git a/libavutil/mem.c b/libavutil/mem.c
>> > index 6149755a6b..88fe09b179 100644
>> > --- a/libavutil/mem.c
>> > +++ b/libavutil/mem.c
>> > @@ -399,6 +399,18 @@ static void fill32(uint8_t *dst, int len)
>> >  {
>> >      uint32_t v = AV_RN32(dst - 4);
>> >
>> > +#if HAVE_FAST_64BIT
>>
>> I suspect this should be !X86_32
>>
>> > +    uint64_t v2= v + ((uint64_t)v<<32);
>> > +    while (len >= 32) {
>> > +        AV_WN64(dst   , v2);
>> > +        AV_WN64(dst+ 8, v2);
>> > +        AV_WN64(dst+16, v2);
>> > +        AV_WN64(dst+24, v2);
>> > +        dst += 32;
>> > +        len -= 32;
>> > +    }
>>
>> How can I test the performance of this function?
>
> with the testcase from the fuzzer (it should be substantially
> faster in this case with the next commit)
> it should also be possible to test it with some fate tests
> as this is used by some.

I cannot measure any speed difference for the (lengthened) nuv
and cscd fate samples with your patch, so I don't think this
question warrants further investigation.

Thank you for the abort() suggestion, Carl Eugen
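
For checking a change like this outside of FATE, a minimal standalone sketch (not the libavutil code) can compare an unrolled 64-bit fill against a byte-wise reference. The helpers rn32(), wn32() and wn64() are hypothetical stand-ins for AV_RN32/AV_WN32/AV_WN64, the function names are made up for illustration, and the 4-byte and 1-byte tail loops are an assumption, since the quoted hunk only shows the 64-bit part.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for AV_RN32 / AV_WN32 / AV_WN64:
 * native-endian accesses that are safe for unaligned pointers. */
static uint32_t rn32(const uint8_t *p) { uint32_t v; memcpy(&v, p, 4); return v; }
static void wn32(uint8_t *p, uint32_t v) { memcpy(p, &v, 4); }
static void wn64(uint8_t *p, uint64_t v) { memcpy(p, &v, 8); }

/* Byte-wise reference: repeat the 4 bytes that precede dst. */
static void fill32_ref(uint8_t *dst, int len)
{
    while (len-- > 0) {
        *dst = dst[-4];
        dst++;
    }
}

/* Sketch of the unrolled variant: the 32-byte loop mirrors the quoted
 * hunk; the two tail loops are assumed, not taken from the patch. */
static void fill32_unrolled(uint8_t *dst, int len)
{
    uint32_t v  = rn32(dst - 4);
    uint64_t v2 = v + ((uint64_t)v << 32); /* the 32-bit pattern, twice */

    while (len >= 32) {
        wn64(dst,      v2);
        wn64(dst +  8, v2);
        wn64(dst + 16, v2);
        wn64(dst + 24, v2);
        dst += 32;
        len -= 32;
    }
    while (len >= 4) {
        wn32(dst, v);
        dst += 4;
        len -= 4;
    }
    while (len-- > 0) {
        *dst = dst[-4];
        dst++;
    }
}

/* Returns 1 if both variants produce identical buffers for every
 * length from 0 to max_len (capped at 256), otherwise 0. */
static int fill32_variants_agree(int max_len)
{
    static const uint8_t seed[4] = { 0x11, 0x22, 0x33, 0x44 };
    uint8_t a[4 + 256], b[4 + 256];

    if (max_len > 256)
        max_len = 256;
    for (int len = 0; len <= max_len; len++) {
        memset(a, 0, sizeof(a));
        memset(b, 0, sizeof(b));
        memcpy(a, seed, 4);
        memcpy(b, seed, 4);
        fill32_ref(a + 4, len);
        fill32_unrolled(b + 4, len);
        if (memcmp(a, b, sizeof(a)))
            return 0;
    }
    return 1;
}
```

Calling fill32_variants_agree(256) only checks correctness; for timing, one could wrap each variant in a loop over a large buffer and compare wall-clock time, which measures the function in isolation rather than its effect on a whole decode.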