On 18/08/14 5:01 AM, Pierre Edouard Lepere wrote:
> Hi,
> here's the new version of the patch. Sorry for the delay.
> James, I have not done 8-bit AVX versions because it requires unpacks that
> are done differently in AVX.
Aren't you thinking of AVX2 with 256-bit wide registers?
With AVX I mean an xmm version of the function that takes advantage of the three-operand instruction format, which should be essentially the same as the SSE2 version except for the many saved movdqa between registers.

Anyway, LGTM and this can be committed as is.
There's some room for improvement (AVX version, or making all the functions use 8 or fewer xmm registers), but that can be done later instead of sending the same patch a fourth time.

Not sure if someone else wants to comment (Ronald?).

> Thanks for the feedback!
>
> -Pierre-Edouard Lepere
>
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
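
To illustrate the three-operand point, here is a minimal C sketch using SSE2 intrinsics. It is not code from the patch, and `interleave_bytes` is a hypothetical helper made up for this example. Both inputs are still needed after the first unpack. With the legacy SSE2 encoding, punpcklbw overwrites its destination, so a register copy (movdqa) is needed first. When the same 128-bit code is VEX-encoded (for instance built with -mavx on GCC or Clang), vpunpcklbw writes to a separate destination register and that copy goes away. The 256-bit AVX2 unpacks, by contrast, work within each 128-bit lane, which is the case where unpacks really do behave differently.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>

/*
 * Hypothetical helper (not from the patch): interleave 16 bytes of a
 * with 16 bytes of b, giving a0 b0 a1 b1 ... a15 b15 in out[0..31].
 */
static void interleave_bytes(const uint8_t *a, const uint8_t *b, uint8_t *out)
{
    __m128i va = _mm_loadu_si128((const __m128i *)a);
    __m128i vb = _mm_loadu_si128((const __m128i *)b);

    /*
     * va is needed again by the second unpack. Legacy SSE2 encoding:
     * punpcklbw destroys its destination, so va has to be copied first.
     * VEX encoding: vpunpcklbw dst, src1, src2 leaves va untouched.
     */
    __m128i lo = _mm_unpacklo_epi8(va, vb);  /* a0 b0 ... a7 b7 */
    __m128i hi = _mm_unpackhi_epi8(va, vb);  /* a8 b8 ... a15 b15 */

    _mm_storeu_si128((__m128i *)out, lo);
    _mm_storeu_si128((__m128i *)(out + 16), hi);
}
```

The source is identical for both targets. Only the instruction encoding chosen at build time changes, which is why an xmm AVX version should cost little beyond the SSE2 one.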