Re: [FFmpeg-devel] libavcodec/bswapdsp : add AVX2 for bswap_buf

2017-10-29 Thread James Darnley
On 2017-10-28 17:57, Martin Vignali wrote:

>>> For this, I followed Henrik Gramner's code from the discussion
>>> libavcodec/exr : add x86 SIMD for predictor.
>>
>> That's fine.  I will look at that on my own time.  I don't want to hold
>> up applying the patch.  Would you like me to push it for you?
>>
>>
> Yes, please push it if you can.
> Thanks!
> 
> Martin

Done, at last, with only a ~22 hour delay.




signature.asc
Description: OpenPGP digital signature
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel


Re: [FFmpeg-devel] libavcodec/bswapdsp : add AVX2 for bswap_buf

2017-10-28 Thread Martin Vignali
> > For this, I followed Henrik Gramner's code from the discussion
> > libavcodec/exr : add x86 SIMD for predictor.
>
> That's fine.  I will look at that on my own time.  I don't want to hold
> up applying the patch.  Would you like me to push it for you?
>
>
Yes, please push it if you can.
Thanks!

Martin


Re: [FFmpeg-devel] libavcodec/bswapdsp : add AVX2 for bswap_buf

2017-10-28 Thread James Darnley
On 2017-10-28 16:14, Martin Vignali wrote:
> 2017-10-28 15:47 GMT+02:00 James Darnley :
> 
>> On 2017-10-22 19:14, Martin Vignali wrote:
>>> @@ -86,16 +101,20 @@ SECTION .text
>>>
>>>  ; void ff_bswap_buf(uint32_t *dst, const uint32_t *src, int w);
>>>  %macro BSWAP32_BUF 0
>>> -%if cpuflag(ssse3)
>>> +%if cpuflag(ssse3)||cpuflag(avx2)
>>>  cglobal bswap32_buf, 3,4,3
>>>  mov  r3, r1
>>> +%if cpuflag(avx2)
>>> +vbroadcasti128  m2, [pb_bswap32]
>>> +%else
>>>  mova m2, [pb_bswap32]
>>> +%endif
>>
>> Why don't you increase the size of the constant?  You wouldn't need a
>> preprocessor condition.
>>
>> I do admit I have never considered using a broadcast-128 instruction in
>> similar circumstances so I have no idea which might be "better".
>>
>>
> For this, I followed Henrik Gramner's code from the discussion
> libavcodec/exr : add x86 SIMD for predictor.

That's fine.  I will look at that on my own time.  I don't want to hold
up applying the patch.  Would you like me to push it for you?






Re: [FFmpeg-devel] libavcodec/bswapdsp : add AVX2 for bswap_buf

2017-10-28 Thread Martin Vignali
2017-10-28 15:47 GMT+02:00 James Darnley :

> On 2017-10-22 19:14, Martin Vignali wrote:
> > @@ -86,16 +101,20 @@ SECTION .text
> >
> >  ; void ff_bswap_buf(uint32_t *dst, const uint32_t *src, int w);
> >  %macro BSWAP32_BUF 0
> > -%if cpuflag(ssse3)
> > +%if cpuflag(ssse3)||cpuflag(avx2)
> >  cglobal bswap32_buf, 3,4,3
> >  mov  r3, r1
> > +%if cpuflag(avx2)
> > +vbroadcasti128  m2, [pb_bswap32]
> > +%else
> >  mova m2, [pb_bswap32]
> > +%endif
>
> Why don't you increase the size of the constant?  You wouldn't need a
> preprocessor condition.
>
> I do admit I have never considered using a broadcast-128 instruction in
> similar circumstances so I have no idea which might be "better".
>
>
For this, I followed Henrik Gramner's code from the discussion
libavcodec/exr : add x86 SIMD for predictor.

Martin


Re: [FFmpeg-devel] libavcodec/bswapdsp : add AVX2 for bswap_buf

2017-10-28 Thread James Darnley
On 2017-10-22 19:14, Martin Vignali wrote:
> @@ -86,16 +101,20 @@ SECTION .text
>  
>  ; void ff_bswap_buf(uint32_t *dst, const uint32_t *src, int w);
>  %macro BSWAP32_BUF 0
> -%if cpuflag(ssse3)
> +%if cpuflag(ssse3)||cpuflag(avx2)
>  cglobal bswap32_buf, 3,4,3
>  mov  r3, r1
> +%if cpuflag(avx2)
> +vbroadcasti128  m2, [pb_bswap32]
> +%else
>  mova m2, [pb_bswap32]
> +%endif

Why don't you increase the size of the constant?  You wouldn't need a
preprocessor condition.

I do admit I have never considered using a broadcast-128 instruction in
similar circumstances so I have no idea which might be "better".
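
To make the two options concrete, here is a rough C model, not the actual assembly. It assumes pb_bswap32 holds the 16 byte indices shown below and that the 256-bit byte shuffle works on each 128-bit lane independently. The helper names (shuffle_bytes_256, broadcast_128, pb_bswap32_wide) are invented for this illustration only.

```c
#include <stdint.h>
#include <string.h>

/* Assumed contents of pb_bswap32: reverse the bytes of each 32-bit group. */
static const uint8_t pb_bswap32[16] = {
    3, 2, 1, 0, 7, 6, 5, 4, 11, 10, 9, 8, 15, 14, 13, 12
};

/* Option B (hypothetical): the same pattern stored twice, so that a plain
 * 32-byte load would be enough and no preprocessor condition is needed. */
static const uint8_t pb_bswap32_wide[32] = {
    3, 2, 1, 0, 7, 6, 5, 4, 11, 10, 9, 8, 15, 14, 13, 12,
    3, 2, 1, 0, 7, 6, 5, 4, 11, 10, 9, 8, 15, 14, 13, 12
};

/* Option A: model of a 128-bit broadcast, copying one 16-byte constant
 * into both halves of a 32-byte register. */
static void broadcast_128(uint8_t dst[32], const uint8_t src[16])
{
    memcpy(dst, src, 16);
    memcpy(dst + 16, src, 16);
}

/* Model of a 256-bit byte shuffle: each 16-byte lane is shuffled on its
 * own. The low 4 bits of a mask byte pick a source byte inside the same
 * lane; a set top bit zeroes the output byte. */
static void shuffle_bytes_256(uint8_t dst[32], const uint8_t src[32],
                              const uint8_t mask[32])
{
    uint8_t tmp[32];

    for (int lane = 0; lane < 2; lane++) {
        for (int i = 0; i < 16; i++) {
            uint8_t m = mask[lane * 16 + i];
            tmp[lane * 16 + i] = (m & 0x80) ? 0 : src[lane * 16 + (m & 0x0f)];
        }
    }
    memcpy(dst, tmp, 32);
}
```

In this model both options leave the same 32-byte mask in the register, so the shuffle result is identical. The difference is only in how the mask is loaded and how many bytes of constant data are stored.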






Re: [FFmpeg-devel] libavcodec/bswapdsp : add AVX2 for bswap_buf

2017-10-28 Thread Martin Vignali
2017-10-25 9:39 GMT+02:00 Paul B Mahol :

> On 10/22/17, Martin Vignali  wrote:
> > Hello,
> >
> > Attached is a patch that adds an AVX2 version of bswap_buf
> > (byte-swaps each uint32 in a buffer).
> >
> > Checkasm result (Kaby Lake 10.12)
> > bswap_buf_c: 122.8
> > bswap_buf_sse2: 67.8
> > bswap_buf_ssse3: 34.3
> > bswap_buf_avx2: 21.0
> >
> >
> > It passes checkasm and the FATE tests for me.
> >
> >
> > Martin
> > Jokyo Images
> >
>
> probably ok
>
>
Ping: could someone apply this?


Re: [FFmpeg-devel] libavcodec/bswapdsp : add AVX2 for bswap_buf

2017-10-25 Thread Paul B Mahol
On 10/22/17, Martin Vignali  wrote:
> Hello,
>
> Attached is a patch that adds an AVX2 version of bswap_buf
> (byte-swaps each uint32 in a buffer).
>
> Checkasm result (Kaby Lake 10.12)
> bswap_buf_c: 122.8
> bswap_buf_sse2: 67.8
> bswap_buf_ssse3: 34.3
> bswap_buf_avx2: 21.0
>
>
> It passes checkasm and the FATE tests for me.
>
>
> Martin
> Jokyo Images
>

probably ok


[FFmpeg-devel] libavcodec/bswapdsp : add AVX2 for bswap_buf

2017-10-22 Thread Martin Vignali
Hello,

Attached is a patch that adds an AVX2 version of bswap_buf
(byte-swaps each uint32 in a buffer).
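
For readers unfamiliar with the function, a minimal scalar sketch of the operation follows. It is written against the prototype quoted elsewhere in this thread (dst, src, w) and is not the actual FFmpeg C implementation. The names bswap32 and bswap_buf_ref are made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Reverse the byte order of one 32-bit value. */
static uint32_t bswap32(uint32_t x)
{
    return (x >> 24) |
           ((x >> 8) & 0x0000ff00u) |
           ((x << 8) & 0x00ff0000u) |
           (x << 24);
}

/* Scalar model of bswap_buf: write the byte-swapped form of each of the
 * w 32-bit words in src to the matching position in dst. */
static void bswap_buf_ref(uint32_t *dst, const uint32_t *src, int w)
{
    assert(w >= 0); /* the element count is assumed to be non-negative */

    for (int i = 0; i < w; i++)
        dst[i] = bswap32(src[i]);
}
```

A SIMD version does the same work on several words per instruction, which is where the checkasm differences come from.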

Checkasm result (Kaby Lake 10.12)
bswap_buf_c: 122.8
bswap_buf_sse2: 67.8
bswap_buf_ssse3: 34.3
bswap_buf_avx2: 21.0


It passes checkasm and the FATE tests for me.


Martin
Jokyo Images


0001-libavcodec-bswapdsp-add-AVX2-func-for-bswap_buf.patch
Description: Binary data