Re: [FFmpeg-devel] libavcodec/bswapdsp : add AVX2 for bswap_buf
On 2017-10-28 17:57, Martin Vignali wrote:
>>> For this, I followed Henrik Gramner's code, in the discussion
>>> libavcodec/exr : add x86 SIMD for predictor.
>>
>> That's fine. I will look at that on my own time. I don't want to hold
>> up applying the patch. Would you like me to push it for you?
>
> Yes please push it if you can.
> Thanks!
>
> Martin

Done, at last, with only a ~22 hour delay.

___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] libavcodec/bswapdsp : add AVX2 for bswap_buf
>> For this, I followed Henrik Gramner's code, in the discussion
>> libavcodec/exr : add x86 SIMD for predictor.
>
> That's fine. I will look at that on my own time. I don't want to hold
> up applying the patch. Would you like me to push it for you?

Yes please push it if you can.
Thanks!

Martin
Re: [FFmpeg-devel] libavcodec/bswapdsp : add AVX2 for bswap_buf
On 2017-10-28 16:14, Martin Vignali wrote:
> 2017-10-28 15:47 GMT+02:00 James Darnley:
>
>> On 2017-10-22 19:14, Martin Vignali wrote:
>>> @@ -86,16 +101,20 @@ SECTION .text
>>>
>>>  ; void ff_bswap_buf(uint32_t *dst, const uint32_t *src, int w);
>>>  %macro BSWAP32_BUF 0
>>> -%if cpuflag(ssse3)
>>> +%if cpuflag(ssse3)||cpuflag(avx2)
>>>  cglobal bswap32_buf, 3,4,3
>>>      mov r3, r1
>>> +%if cpuflag(avx2)
>>> +    vbroadcasti128 m2, [pb_bswap32]
>>> +%else
>>>      mova m2, [pb_bswap32]
>>> +%endif
>>
>> Why don't you increase the size of the constant? You wouldn't need a
>> preprocessor condition.
>>
>> I do admit I have never considered using a broadcast-128 instruction in
>> similar circumstances so I have no idea which might be "better".
>
> For this, I followed Henrik Gramner's code, in the discussion
> libavcodec/exr : add x86 SIMD for predictor.

That's fine. I will look at that on my own time. I don't want to hold up
applying the patch. Would you like me to push it for you?
Re: [FFmpeg-devel] libavcodec/bswapdsp : add AVX2 for bswap_buf
2017-10-28 15:47 GMT+02:00 James Darnley:

> On 2017-10-22 19:14, Martin Vignali wrote:
> > @@ -86,16 +101,20 @@ SECTION .text
> >
> >  ; void ff_bswap_buf(uint32_t *dst, const uint32_t *src, int w);
> >  %macro BSWAP32_BUF 0
> > -%if cpuflag(ssse3)
> > +%if cpuflag(ssse3)||cpuflag(avx2)
> >  cglobal bswap32_buf, 3,4,3
> >      mov r3, r1
> > +%if cpuflag(avx2)
> > +    vbroadcasti128 m2, [pb_bswap32]
> > +%else
> >      mova m2, [pb_bswap32]
> > +%endif
>
> Why don't you increase the size of the constant? You wouldn't need a
> preprocessor condition.
>
> I do admit I have never considered using a broadcast-128 instruction in
> similar circumstances so I have no idea which might be "better".

For this, I followed Henrik Gramner's code, in the discussion
libavcodec/exr : add x86 SIMD for predictor.

Martin
Re: [FFmpeg-devel] libavcodec/bswapdsp : add AVX2 for bswap_buf
On 2017-10-22 19:14, Martin Vignali wrote:
> @@ -86,16 +101,20 @@ SECTION .text
>
>  ; void ff_bswap_buf(uint32_t *dst, const uint32_t *src, int w);
>  %macro BSWAP32_BUF 0
> -%if cpuflag(ssse3)
> +%if cpuflag(ssse3)||cpuflag(avx2)
>  cglobal bswap32_buf, 3,4,3
>      mov r3, r1
> +%if cpuflag(avx2)
> +    vbroadcasti128 m2, [pb_bswap32]
> +%else
>      mova m2, [pb_bswap32]
> +%endif

Why don't you increase the size of the constant? You wouldn't need a
preprocessor condition.

I do admit I have never considered using a broadcast-128 instruction in
similar circumstances so I have no idea which might be "better".
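To make the two options in this question concrete, here is a minimal C sketch, not taken from the patch, that emulates a 256-bit byte shuffle lane by lane. It compares a 16-byte mask broadcast to both lanes (what vbroadcasti128 does) with a 32-byte mask stored in full. The mask values are an assumption about what pb_bswap32 contains, and every name ending in _sketch, along with all helper functions, is hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumed contents of pb_bswap32: reverse the bytes of each 32-bit word. */
static const uint8_t pb_bswap32_sketch[16] = {
    3, 2, 1, 0, 7, 6, 5, 4, 11, 10, 9, 8, 15, 14, 13, 12
};

/* Alternative: a 32-byte constant with the same pattern stored twice. */
static const uint8_t pb_bswap32_wide_sketch[32] = {
    3, 2, 1, 0, 7, 6, 5, 4, 11, 10, 9, 8, 15, 14, 13, 12,
    3, 2, 1, 0, 7, 6, 5, 4, 11, 10, 9, 8, 15, 14, 13, 12
};

/* Emulates pshufb on one 128-bit lane: a set top bit in a mask byte
 * zeroes the output byte, otherwise the low 4 bits select a source byte. */
static void shuffle_lane(uint8_t dst[16], const uint8_t src[16],
                         const uint8_t mask[16])
{
    uint8_t tmp[16];
    for (int i = 0; i < 16; i++)
        tmp[i] = (mask[i] & 0x80) ? 0 : src[mask[i] & 0x0f];
    memcpy(dst, tmp, 16);
}

/* Emulates the 256-bit form: each 128-bit lane is shuffled independently
 * using its own 16 mask bytes. */
static void shuffle_256(uint8_t dst[32], const uint8_t src[32],
                        const uint8_t mask[32])
{
    shuffle_lane(dst, src, mask);
    shuffle_lane(dst + 16, src + 16, mask + 16);
}

/* Emulates a 128-bit broadcast load: copy 16 bytes into both lanes. */
static void broadcast128(uint8_t out[32], const uint8_t in[16])
{
    memcpy(out, in, 16);
    memcpy(out + 16, in, 16);
}

/* Checks that both ways of building the mask give the same result. */
static void check_both_options(void)
{
    uint8_t src[32], mask[32], out_a[32], out_b[32];
    for (int i = 0; i < 32; i++)
        src[i] = (uint8_t)i;

    broadcast128(mask, pb_bswap32_sketch);           /* broadcast at run time */
    shuffle_256(out_a, src, mask);
    shuffle_256(out_b, src, pb_bswap32_wide_sketch); /* full-width constant */

    assert(memcmp(out_a, out_b, 32) == 0);
    assert(out_a[0] == 3 && out_a[3] == 0);    /* first word reversed */
    assert(out_a[16] == 19 && out_a[19] == 16); /* same pattern in upper lane */
}
```

Under these assumptions both approaches produce identical output, so the choice comes down to constant size versus an extra preprocessor branch. The sketch says nothing about which is faster on real hardware.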
Re: [FFmpeg-devel] libavcodec/bswapdsp : add AVX2 for bswap_buf
2017-10-25 9:39 GMT+02:00 Paul B Mahol:

> On 10/22/17, Martin Vignali wrote:
> > Hello,
> >
> > Attached is a patch that adds an AVX2 version of bswap_buf
> > (swaps uint32 values in a buffer).
> >
> > Checkasm result (Kaby Lake 10.12)
> > bswap_buf_c: 122.8
> > bswap_buf_sse2: 67.8
> > bswap_buf_ssse3: 34.3
> > bswap_buf_avx2: 21.0
> >
> > It passes checkasm and the FATE tests for me.
> >
> > Martin
> > Jokyo Images
>
> probably ok

Ping for applying the patch.
Re: [FFmpeg-devel] libavcodec/bswapdsp : add AVX2 for bswap_buf
On 10/22/17, Martin Vignali wrote:
> Hello,
>
> Attached is a patch that adds an AVX2 version of bswap_buf
> (swaps uint32 values in a buffer).
>
> Checkasm result (Kaby Lake 10.12)
> bswap_buf_c: 122.8
> bswap_buf_sse2: 67.8
> bswap_buf_ssse3: 34.3
> bswap_buf_avx2: 21.0
>
> It passes checkasm and the FATE tests for me.
>
> Martin
> Jokyo Images

probably ok
[FFmpeg-devel] libavcodec/bswapdsp : add AVX2 for bswap_buf
Hello,

Attached is a patch that adds an AVX2 version of bswap_buf
(swaps uint32 values in a buffer).

Checkasm result (Kaby Lake 10.12)
bswap_buf_c: 122.8
bswap_buf_sse2: 67.8
bswap_buf_ssse3: 34.3
bswap_buf_avx2: 21.0

It passes checkasm and the FATE tests for me.

Martin
Jokyo Images

0001-libavcodec-bswapdsp-add-AVX2-func-for-bswap_buf.patch
Description: Binary data
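For reference, the operation being accelerated can be sketched in scalar C as follows. This is an illustrative reimplementation written for this thread, not the FFmpeg source. The names bswap32_sketch, bswap_buf_sketch and bswap_buf_selfcheck are hypothetical, and only the (dst, src, w) parameter order mirrors the ff_bswap_buf prototype quoted in the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Reverse the byte order of one 32-bit value. */
static uint32_t bswap32_sketch(uint32_t x)
{
    return (x >> 24) |
           ((x >> 8) & 0x0000ff00u) |
           ((x << 8) & 0x00ff0000u) |
           (x << 24);
}

/* Byte-swap w 32-bit words from src into dst.
 * Assumes w >= 0 and that both buffers hold at least w words. */
static void bswap_buf_sketch(uint32_t *dst, const uint32_t *src, int w)
{
    for (int i = 0; i < w; i++)
        dst[i] = bswap32_sketch(src[i]);
}

/* Small usage example: swapping twice restores the input. */
static void bswap_buf_selfcheck(void)
{
    const uint32_t in[3] = { 0x11223344u, 0x000000ffu, 0xdeadbeefu };
    uint32_t once[3], twice[3];

    bswap_buf_sketch(once, in, 3);
    assert(once[0] == 0x44332211u);
    assert(once[1] == 0xff000000u);
    assert(once[2] == 0xefbeaddeu);

    bswap_buf_sketch(twice, once, 3);
    for (int i = 0; i < 3; i++)
        assert(twice[i] == in[i]);
}
```

The SIMD versions benchmarked above perform the same per-word byte reversal on several words per instruction. The checkasm numbers compare those versions against a C loop of this kind.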