Re: [FFmpeg-devel] swscale/rgb2rgb : add X86_64 SIMD (SSSE3 and AVX2) for shuffly_bytes func

2018-03-24 Thread Martin Vignali
> the code moving should be in a separate patch ideally, no big issue here as
> this is not a very complex patch
>
> otherwise patches should be ok
>
> also tested on linux x86-32/64, mingw32/64, linux qemu mips
> thanks

patch split (code move/asm) and pushed

Thanks for review and

Re: [FFmpeg-devel] swscale/rgb2rgb : add X86_64 SIMD (SSSE3 and AVX2) for shuffly_bytes func

2018-03-20 Thread Michael Niedermayer
On Tue, Mar 20, 2018 at 09:06:53PM +0100, Martin Vignali wrote:
> Hello,
>
> New patches attached, without the AVX2 version
[...]
> rgb2rgb.c | 23 ---
> rgb2rgb.h | 6 +++---
> rgb2rgb_template.c | 21 +
> x86/rgb2rgb.c | 6

Re: [FFmpeg-devel] swscale/rgb2rgb : add X86_64 SIMD (SSSE3 and AVX2) for shuffly_bytes func

2018-03-20 Thread Paul B Mahol
On 3/20/18, Martin Vignali wrote:
> Hello,
>
> New patches attached, without the AVX2 version
>
> Martin

probably ok. michaelni?
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org

Re: [FFmpeg-devel] swscale/rgb2rgb : add X86_64 SIMD (SSSE3 and AVX2) for shuffly_bytes func

2018-03-20 Thread Martin Vignali
Hello,

New patches attached, without the AVX2 version

Martin

0001-swscale-rgb-add-X86-SIMD-SSSE3-for.patch
Description: Binary data
0002-swscale-rgb-add-X86-SIMD-SSSE3-for.patch
Description: Binary data
0003-checkasm-swscale-add-test-for-rgb-shuffle_bytes-func.patch
Description: Binary data

Re: [FFmpeg-devel] swscale/rgb2rgb : add X86_64 SIMD (SSSE3 and AVX2) for shuffly_bytes func

2018-03-18 Thread Carl Eugen Hoyos
2018-03-18 19:04 GMT+01:00, Paul B Mahol : > On 3/18/18, Carl Eugen Hoyos wrote: >> 2018-03-18 18:20 GMT+01:00, Paul B Mahol : >>> On 3/18/18, Carl Eugen Hoyos wrote: 2018-03-18 17:46 GMT+01:00, Martin Vignali

Re: [FFmpeg-devel] swscale/rgb2rgb : add X86_64 SIMD (SSSE3 and AVX2) for shuffly_bytes func

2018-03-18 Thread Paul B Mahol
On 3/18/18, Carl Eugen Hoyos wrote: > 2018-03-18 18:20 GMT+01:00, Paul B Mahol : >> On 3/18/18, Carl Eugen Hoyos wrote: >>> 2018-03-18 17:46 GMT+01:00, Martin Vignali : 2018-03-18 17:37 GMT+01:00 Paul B

Re: [FFmpeg-devel] swscale/rgb2rgb : add X86_64 SIMD (SSSE3 and AVX2) for shuffly_bytes func

2018-03-18 Thread Carl Eugen Hoyos
2018-03-18 18:20 GMT+01:00, Paul B Mahol : > On 3/18/18, Carl Eugen Hoyos wrote: >> 2018-03-18 17:46 GMT+01:00, Martin Vignali : >>> 2018-03-18 17:37 GMT+01:00 Paul B Mahol : >>> On 3/18/18, Nicolas George

Re: [FFmpeg-devel] swscale/rgb2rgb : add X86_64 SIMD (SSSE3 and AVX2) for shuffly_bytes func

2018-03-18 Thread Nicolas George
Paul B Mahol (2018-03-18):
> You are almost always wrong.

And you, just now, were rude.

Regards,

--
Nicolas George

signature.asc
Description: Digital signature

Re: [FFmpeg-devel] swscale/rgb2rgb : add X86_64 SIMD (SSSE3 and AVX2) for shuffly_bytes func

2018-03-18 Thread Paul B Mahol
On 3/18/18, Carl Eugen Hoyos wrote:
> 2018-03-18 17:46 GMT+01:00, Martin Vignali :
>> 2018-03-18 17:37 GMT+01:00 Paul B Mahol :
>>
>>> On 3/18/18, Nicolas George wrote:
>>> > Martin Vignali (2018-03-18):
>>> >> I

Re: [FFmpeg-devel] swscale/rgb2rgb : add X86_64 SIMD (SSSE3 and AVX2) for shuffly_bytes func

2018-03-18 Thread Carl Eugen Hoyos
2018-03-18 17:46 GMT+01:00, Martin Vignali :
> 2018-03-18 17:37 GMT+01:00 Paul B Mahol :
>
>> On 3/18/18, Nicolas George wrote:
>> > Martin Vignali (2018-03-18):
>> >> I ran the test again with a bigger width (512 instead of 128)
>> >>

Re: [FFmpeg-devel] swscale/rgb2rgb : add X86_64 SIMD (SSSE3 and AVX2) for shuffly_bytes func

2018-03-18 Thread James Almer
On 3/18/2018 1:28 PM, Nicolas George wrote:
> Martin Vignali (2018-03-18):
>> I ran the test again with a bigger width (512 instead of 128)
>> This is my result:
>> shuffle_bytes_0321_c: 128.6
>> shuffle_bytes_0321_ssse3: 41.6
>> shuffle_bytes_0321_avx2: 23.4
>
> IIUC, these benchmarks are

Re: [FFmpeg-devel] swscale/rgb2rgb : add X86_64 SIMD (SSSE3 and AVX2) for shuffly_bytes func

2018-03-18 Thread Martin Vignali
2018-03-18 17:37 GMT+01:00 Paul B Mahol :
> On 3/18/18, Nicolas George wrote:
> > Martin Vignali (2018-03-18):
> >> I ran the test again with a bigger width (512 instead of 128)
> >> This is my result:
> >> shuffle_bytes_0321_c: 128.6
> >>

Re: [FFmpeg-devel] swscale/rgb2rgb : add X86_64 SIMD (SSSE3 and AVX2) for shuffly_bytes func

2018-03-18 Thread Paul B Mahol
On 3/18/18, Nicolas George wrote:
> Martin Vignali (2018-03-18):
>> I ran the test again with a bigger width (512 instead of 128)
>> This is my result:
>> shuffle_bytes_0321_c: 128.6
>> shuffle_bytes_0321_ssse3: 41.6
>> shuffle_bytes_0321_avx2: 23.4
>
> IIUC, these benchmarks

Re: [FFmpeg-devel] swscale/rgb2rgb : add X86_64 SIMD (SSSE3 and AVX2) for shuffly_bytes func

2018-03-18 Thread Nicolas George
Martin Vignali (2018-03-18):
> I ran the test again with a bigger width (512 instead of 128)
> This is my result:
> shuffle_bytes_0321_c: 128.6
> shuffle_bytes_0321_ssse3: 41.6
> shuffle_bytes_0321_avx2: 23.4

IIUC, these benchmarks are expressed in CPU cycles. But what James says is that it can
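
For readers comparing the figures quoted in this message, a minimal sketch of how a relative speedup can be derived from two cycle counts. The helper name `speedup` is hypothetical and not part of FFmpeg or checkasm; it only divides the baseline count by the optimized count.

```c
#include <assert.h>

/* Hypothetical helper: ratio of baseline cycles to optimized cycles.
 * A result of 2.0 means the optimized version needs half the cycles. */
static double speedup(double baseline_cycles, double optimized_cycles)
{
    assert(optimized_cycles > 0.0); /* avoid dividing by zero */
    return baseline_cycles / optimized_cycles;
}
```

With the width-512 numbers above, `speedup(128.6, 41.6)` is roughly 3.1 for SSSE3 and `speedup(128.6, 23.4)` is roughly 5.5 for AVX2. This ratio is in cycles only and says nothing about clock-frequency effects.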

Re: [FFmpeg-devel] swscale/rgb2rgb : add X86_64 SIMD (SSSE3 and AVX2) for shuffly_bytes func

2018-03-18 Thread James Almer
On 3/18/2018 1:23 PM, Martin Vignali wrote: > 2018-03-18 16:49 GMT+01:00 James Almer : > >> On 3/18/2018 12:08 PM, Martin Vignali wrote: >>> 2018-03-03 18:20 GMT+01:00 Martin Vignali : >>> Hello, Patch in attach add SIMD for the 5

Re: [FFmpeg-devel] swscale/rgb2rgb : add X86_64 SIMD (SSSE3 and AVX2) for shuffly_bytes func

2018-03-18 Thread Martin Vignali
2018-03-18 16:49 GMT+01:00 James Almer :
> On 3/18/2018 12:08 PM, Martin Vignali wrote:
> > 2018-03-03 18:20 GMT+01:00 Martin Vignali :
> >
> >> Hello,
> >>
> >> The attached patch adds SIMD for the 5 shuffle_bytes functions of rgb2rgb.
> >> The new SIMD code is

Re: [FFmpeg-devel] swscale/rgb2rgb : add X86_64 SIMD (SSSE3 and AVX2) for shuffly_bytes func

2018-03-18 Thread James Almer
On 3/18/2018 12:08 PM, Martin Vignali wrote:
> 2018-03-03 18:20 GMT+01:00 Martin Vignali :
>
>> Hello,
>>
>> The attached patch adds SIMD for the 5 shuffle_bytes functions of rgb2rgb.
>> The new SIMD code is written using external ASM.
>>
>> It also adds a checkasm test for these functions.
>>

Re: [FFmpeg-devel] swscale/rgb2rgb : add X86_64 SIMD (SSSE3 and AVX2) for shuffly_bytes func

2018-03-18 Thread Carl Eugen Hoyos
2018-03-18 16:08 GMT+01:00, Martin Vignali :
> shuffle_bytes_0321_c: 51.4
> shuffle_bytes_0321_ssse3: 18.7
> shuffle_bytes_0321_avx2: 12.7
> shuffle_bytes_1230_c: 126.9
> shuffle_bytes_1230_ssse3: 16.7
> shuffle_bytes_1230_avx2: 12.9
> shuffle_bytes_2103_c: 52.4
>

Re: [FFmpeg-devel] swscale/rgb2rgb : add X86_64 SIMD (SSSE3 and AVX2) for shuffly_bytes func

2018-03-18 Thread Martin Vignali
2018-03-03 18:20 GMT+01:00 Martin Vignali :
> Hello,
>
> The attached patch adds SIMD for the 5 shuffle_bytes functions of rgb2rgb.
> The new SIMD code is written using external ASM.
>
> It also adds a checkasm test for these functions.
> Restricted to x86_64, because the scalar part doesn't

[FFmpeg-devel] swscale/rgb2rgb : add X86_64 SIMD (SSSE3 and AVX2) for shuffly_bytes func

2018-03-03 Thread Martin Vignali
Hello,

The attached patch adds SIMD for the 5 shuffle_bytes functions of rgb2rgb.
The new SIMD code is written using external ASM.

It also adds a checkasm test for these functions.
Restricted to x86_64, because the scalar part doesn't compile on x86_32.

For the scalar part, I assume that the src_size value is a multiple
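
To illustrate what a shuffle_bytes function does, here is a minimal scalar sketch in C. It is not the patch's code. It assumes that the digits in a name such as `0321` give, for each output byte of a 4-byte pixel, the index of the source byte to copy, and that `src_size` is a multiple of 4 (the message above is cut off before stating which multiple the author relies on). The names `shuffle_bytes_generic` and `shuffle_bytes_0321_sketch` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Scalar sketch: for every 4-byte pixel, output byte k is input byte idx[k].
 * Assumption: src_size is a multiple of 4 and src/dst do not overlap. */
static void shuffle_bytes_generic(const uint8_t *src, uint8_t *dst,
                                  int src_size, const uint8_t idx[4])
{
    assert(src_size >= 0 && src_size % 4 == 0);
    for (int i = 0; i < src_size; i += 4) {
        dst[i + 0] = src[i + idx[0]];
        dst[i + 1] = src[i + idx[1]];
        dst[i + 2] = src[i + idx[2]];
        dst[i + 3] = src[i + idx[3]];
    }
}

/* "0321" variant: keeps bytes 0 and 2 in place, swaps bytes 1 and 3. */
static void shuffle_bytes_0321_sketch(const uint8_t *src, uint8_t *dst,
                                      int src_size)
{
    static const uint8_t idx[4] = { 0, 3, 2, 1 };
    shuffle_bytes_generic(src, dst, src_size, idx);
}
```

A SIMD version would apply the same per-pixel permutation to a whole vector register at once (SSSE3 provides the `pshufb` byte-shuffle instruction for this), leaving any remaining bytes to a scalar loop like the one above.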