2017-11-27 17:59 GMT+01:00 Henrik Gramner <hen...@gramner.com>:

> On Sat, Nov 25, 2017 at 9:53 PM, Martin Vignali
> <martin.vign...@gmail.com> wrote:
> > Hello,
> >
> > In attach patch to convert pb_bswap32 to ymm constant
> > and remove the vbroadcasti128 part
> >
> > Speed seems to be similar to me
>
> This just wastes cache for no reason. A tiny amount, sure, but minor
> things tends to add up eventually.
>
> 128-bit broadcasts are the same speed as 256-bit loads on Intel CPU:s
> and twice as fast as 256-bit loads on AMD CPU:s.
>
> A better solution if you want to avoid ifdeffery would be to create a
> macro that uses vbroadcasti128 when mmsize == 32 and mova otherwise.

Hello,
Thanks for your comments.

Do you have a suggestion for the name of this macro?

Regarding the earlier, similar patch that is still under discussion:
avcodec/x86/exrdsp : use ymm constant for pb_80 instead of vbroadcasti128

Do you think we should avoid the YMM constant there too (it is declared
in constants.h/c), and instead convert the constants to XMM in this file
and load them with vbroadcasti128?

Martin
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
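
For illustration, the macro Henrik describes could look roughly like the
sketch below. It assumes the usual x86inc environment (mmsize set by
INIT_XMM/INIT_YMM, mova available). The name LOAD_CONST128 is hypothetical,
chosen only for this example; it is not taken from the patch or the tree.

```asm
; Hypothetical helper (name is an assumption, not from the patch).
; Loads a 16-byte constant into a full-width register:
;   %1 = destination register (xmm or ymm, per the current INIT_*)
;   %2 = memory operand pointing at a 128-bit constant
%macro LOAD_CONST128 2
%if mmsize == 32
    vbroadcasti128 %1, %2   ; ymm: duplicate the 128-bit constant into both lanes
%else
    mova           %1, %2   ; xmm: plain aligned 128-bit load
%endif
%endmacro
```

With such a macro the constant can stay 16 bytes in memory, and the same
line, for example `LOAD_CONST128 m0, [pb_bswap32]`, would work in both the
XMM and YMM versions of a function without an %if at each call site.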