On 2017-11-27 17:50, Henrik Gramner wrote:
> On Sun, Nov 26, 2017 at 11:51 PM, James Darnley <james.darn...@gmail.com>
> wrote:
>> -pd_0_int_min: times 2 dd 0, -2147483648
>> -pq_int_min: times 2 dq -2147483648
>> -pq_int_max: times 2 dq 2147483647
>> +pd_0_int_min: times 4 dd 0, -2147483648
>> +pq_int_min: times 4 dq -2147483648
>> +pq_int_max: times 4 dq 2147483647
>
> Using 128-bit broadcasts is preferable over duplicating the constants
> to 256-bit unless there's a good reason for doing so since it wastes
> less cache and is faster on AMD CPU:s.

At first I thought this sounded like a possible candidate for an
x86-64-only optimisation; I have run out of registers on x86. However,
that is in the inner loop, and these constants are used in the outer
loop or just once, so I have some room.

Do you want to block the patch set while I change this, or could it be
left for another time?

Thanks for the suggestion anyway.
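
For reference, a rough and untested sketch of what the suggested change
might look like. It assumes AVX2 (for vbroadcasti128) and plain NASM
syntax; the register numbers and the placement of the loads are
hypothetical and not taken from the patch set.

```
; Sketch only: AVX2 assumed, register numbers are hypothetical.
SECTION .rodata align=16
pd_0_int_min: times 2 dd 0, -2147483648   ; 16 bytes, left at 128 bits
pq_int_min:   times 2 dq -2147483648      ; 16 bytes
pq_int_max:   times 2 dq  2147483647      ; 16 bytes

SECTION .text
    ; Load each 128-bit constant once, outside the inner loop, and
    ; replicate it into both 128-bit lanes of a ymm register.
    vbroadcasti128 ymm5, [pd_0_int_min]
    vbroadcasti128 ymm6, [pq_int_min]
    vbroadcasti128 ymm7, [pq_int_max]
```

Each constant then occupies 16 bytes of read-only data instead of 32,
at the cost of holding a register for the broadcast value.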
_______________________________________________
ffmpeg-devel mailing list
email@example.com
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel