Re: [FFmpeg-devel] [PATCH] x86/vf_blend: Add SSE4.1 optimization for divide

2016-02-13 Thread Timothy Gu
I've already answered these on IRC but for the sake of completion I'll include the answers here as well.
On Sat, Feb 13, 2016 at 10:26:58PM -0300, James Almer wrote:
> On 2/13/2016 9:27 PM, Timothy Gu wrote:
> > ---
> >
> > The reason why this function uses SSE4.1 is the roundps instruction. Woul

Re: [FFmpeg-devel] [PATCH] x86/vf_blend: Add SSE4.1 optimization for divide

2016-02-13 Thread James Almer
On 2/13/2016 9:27 PM, Timothy Gu wrote:
> ---
>
> The reason why this function uses SSE4.1 is the roundps instruction. Would
> love to find a way to truncate a float to integer in SSE2.
>
> ---
> libavfilter/x86/vf_blend.asm | 32
> libavfilter/x86/vf_blend_in

[FFmpeg-devel] [PATCH] x86/vf_blend: Add SSE4.1 optimization for divide

2016-02-13 Thread Timothy Gu
---
The reason why this function uses SSE4.1 is the roundps instruction. Would love to find a way to truncate a float to integer in SSE2.
---
 libavfilter/x86/vf_blend.asm    | 32
 libavfilter/x86/vf_blend_init.c |  6 ++
 2 files changed, 38 insertions(+)
dif
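
On the open question of truncating a float to an integer in SSE2: one common approach is to round-trip through a 32-bit integer with cvttps2dq followed by cvtdq2ps, both of which are SSE2 instructions. The following is a minimal sketch in C intrinsics, not code from the patch (the patch body is cut off in this archive). The helper name trunc4_sse2 is hypothetical, and the sketch assumes every input fits in a signed 32-bit integer; out-of-range values and NaN convert to 0x80000000.

```c
#include <emmintrin.h>  /* SSE2 intrinsics (also pulls in the SSE ones) */

/* Hypothetical helper, not part of FFmpeg: truncate four floats toward
 * zero and store them back as floats, using SSE2 instructions only.
 * Assumes each input is within the signed 32-bit integer range. */
static void trunc4_sse2(const float *src, float *dst)
{
    __m128  v = _mm_loadu_ps(src);       /* unaligned load of 4 floats */
    __m128i i = _mm_cvttps_epi32(v);     /* cvttps2dq: convert with truncation */
    __m128  t = _mm_cvtepi32_ps(i);      /* cvtdq2ps: back to float */
    _mm_storeu_ps(dst, t);               /* unaligned store of 4 floats */
}
```

If the result is only needed as an integer (for example, before packing down to bytes), the conversion back to float can be dropped and the __m128i value used directly. Whether this fits the divide kernel depends on the rest of the patch, which is not visible here.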