Re: [FFmpeg-devel] avfilter/x86/vf_blend : add avx2 version for 8b func (WIP)

2017-12-18 Martin Vignali
2017-12-17 19:41 GMT+01:00 Henrik Gramner:
> On Thu, Dec 14, 2017 at 11:16 AM, Martin Vignali wrote:
> > 2017-12-13 17:37 GMT+01:00 Henrik Gramner:
> >> You could also do vextracti128 + 128-bit packuswb instead of 256-bit
> >>

Re: [FFmpeg-devel] avfilter/x86/vf_blend : add avx2 version for 8b func (WIP)

2017-12-17 Henrik Gramner
On Thu, Dec 14, 2017 at 11:16 AM, Martin Vignali wrote:
> 2017-12-13 17:37 GMT+01:00 Henrik Gramner:
>> You could also do vextracti128 + 128-bit packuswb instead of 256-bit
>> packuswb + vpermq.
>>
> Sorry don't understand this part
> do you mean 128
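The two instruction sequences compared in this message can be sketched with a scalar C model. This is only an illustration, not code from the patch: it assumes a 256-bit register holding 16 signed words that must be packed down to 16 unsigned bytes, and it assumes the 256-bit packuswb uses the same register for both sources. All function names here are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Unsigned saturation as done by packuswb: signed word -> unsigned byte. */
static uint8_t sat_u8(int16_t v)
{
    return v < 0 ? 0 : v > 255 ? 255 : (uint8_t)v;
}

/* Scalar model of a 128-bit packuswb: 8 words of a, then 8 words of b. */
static void packuswb128(uint8_t out[16], const int16_t a[8], const int16_t b[8])
{
    for (int i = 0; i < 8; i++) {
        out[i]     = sat_u8(a[i]);
        out[i + 8] = sat_u8(b[i]);
    }
}

/* Path A: 256-bit packuswb (which works per 128-bit lane) + vpermq. */
static void pack_path_a(uint8_t out[16], const int16_t w[16])
{
    uint8_t packed[32];
    packuswb128(packed,      w,     w);     /* lane 0 holds bytes 0..7 twice  */
    packuswb128(packed + 16, w + 8, w + 8); /* lane 1 holds bytes 8..15 twice */
    /* vpermq: move qword 0 and qword 2 into the low 128 bits. */
    memcpy(out,     packed,      8);
    memcpy(out + 8, packed + 16, 8);
}

/* Path B: vextracti128 of the high lane + 128-bit packuswb. */
static void pack_path_b(uint8_t out[16], const int16_t w[16])
{
    int16_t hi[8];
    memcpy(hi, w + 8, sizeof(hi)); /* vextracti128: copy the high lane out */
    packuswb128(out, w, hi);       /* low words, then high words, in order */
}

/* Returns 1 if both paths give the same 16 bytes for a test vector
 * containing negative values and values above 255. */
int pack_paths_agree(void)
{
    int16_t w[16];
    uint8_t a[16], b[16];
    for (int i = 0; i < 16; i++)
        w[i] = (int16_t)(i * 40 - 100);
    pack_path_a(a, w);
    pack_path_b(b, w);
    return memcmp(a, b, 16) == 0;
}
```

In this model both paths take two steps and give the same bytes. Path B differs in that the pack itself is 128-bit and needs no cross-lane permute afterwards.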

Re: [FFmpeg-devel] avfilter/x86/vf_blend : add avx2 version for 8b func (WIP)

2017-12-14 Martin Vignali
2017-12-13 17:37 GMT+01:00 Henrik Gramner:
> On Sat, Dec 9, 2017 at 1:11 PM, Martin Vignali wrote:
> > the idea in AVX2 is to load 128bits of data (2x 64 bits)
> > then shuffle across lane, the two 64 bits in the low part of each lane, to
> >

Re: [FFmpeg-devel] avfilter/x86/vf_blend : add avx2 version for 8b func (WIP)

2017-12-13 Henrik Gramner
On Sat, Dec 9, 2017 at 1:11 PM, Martin Vignali wrote:
> the idea in AVX2 is to load 128bits of data (2x 64 bits)
> then shuffle across lane, the two 64 bits in the low part of each lane, to
> keep the rest of the process similar
> to the sse version

What about using

[FFmpeg-devel] avfilter/x86/vf_blend : add avx2 version for 8b func (WIP)

2017-12-09 Martin Vignali
Hello,

Attached are patches to add an AVX2 version for each 8b func (except divide).

001 : avutil : add ABS2 for avx2
002 : avfilter : add AVX2 version

For most of the funcs, the AVX2 version is a simple modification:
VBROADCASTI128 for constant loading, when the process stays in 8 bits.

when the process use