Re: [FFmpeg-devel] [PATCH 1/6] avcodec/h264: mmx2, sse2, avx 10-bit h chroma deblock/loop filter

2016-12-01 Thread James Darnley
On 2016-12-02 00:31, Carl Eugen Hoyos wrote:
> 2016-12-01 17:57 GMT+01:00 James Darnley:
>> Yorkfield:
>> - mmx2: 2.44x faster (278 vs. 114 cycles)
>> - sse2: 3.35x faster (278 vs. 83 cycles)
>>
>> Skylake:
>> - mmx2: 1.69x faster (169 vs. 100 cycles)
>> - sse2: 2.34x faster

Re: [FFmpeg-devel] [PATCH 1/6] avcodec/h264: mmx2, sse2, avx 10-bit h chroma deblock/loop filter

2016-12-01 Thread James Darnley
On 2016-12-01 23:16, Michael Niedermayer wrote:
> On Thu, Dec 01, 2016 at 05:57:44PM +0100, James Darnley wrote:
>> Yorkfield:
>> - mmx2: 2.44x faster (278 vs. 114 cycles)
>> - sse2: 3.35x faster (278 vs. 83 cycles)
>>
>> Skylake:
>> - mmx2: 1.69x faster (169 vs. 100 cycles)
>> - sse2: 2.34x

Re: [FFmpeg-devel] [PATCH 1/6] avcodec/h264: mmx2, sse2, avx 10-bit h chroma deblock/loop filter

2016-12-01 Thread Carl Eugen Hoyos
2016-12-01 17:57 GMT+01:00 James Darnley:
> Yorkfield:
> - mmx2: 2.44x faster (278 vs. 114 cycles)
> - sse2: 3.35x faster (278 vs. 83 cycles)
>
> Skylake:
> - mmx2: 1.69x faster (169 vs. 100 cycles)
> - sse2: 2.34x faster (169 vs. 72 cycles)
Is it expected (or possible)

Re: [FFmpeg-devel] [PATCH 1/6] avcodec/h264: mmx2, sse2, avx 10-bit h chroma deblock/loop filter

2016-12-01 Thread Michael Niedermayer
On Thu, Dec 01, 2016 at 05:57:44PM +0100, James Darnley wrote:
> Yorkfield:
> - mmx2: 2.44x faster (278 vs. 114 cycles)
> - sse2: 3.35x faster (278 vs. 83 cycles)
>
> Skylake:
> - mmx2: 1.69x faster (169 vs. 100 cycles)
> - sse2: 2.34x faster (169 vs. 72 cycles)
> - avx: 2.32x faster (169