Re: [FFmpeg-devel] [PATCH] avcodec/h264: enable sse2 chroma deblock/loop filter functions

2017-02-27 Thread James Darnley
On 2017-02-27 12:13, Paul B Mahol wrote: > On 2/27/17, James Darnley <jdarn...@obe.tv> wrote: >> >> Does anyone have any comments on the patch set? For example: should I >> merge this sse2 patch into the others? > > probably not, just commit. Will do. I have

Re: [FFmpeg-devel] [PATCH] avcodec/h264: enable sse2 chroma deblock/loop filter functions

2017-02-27 Thread James Darnley
On 2017-02-22 01:27, James Darnley wrote: > --- > libavcodec/x86/h264_deblock.asm | 1 + > libavcodec/x86/h264dsp_init.c | 10 ++ > 2 files changed, 11 insertions(+) > > diff --git a/libavcodec/x86/h264_deblock.asm b/libavcodec/x86/h264_deblock.asm > index 32

[FFmpeg-devel] [PATCH] avcodec/h264: enable sse2 chroma deblock/loop filter functions

2017-02-21 Thread James Darnley
--- libavcodec/x86/h264_deblock.asm | 1 + libavcodec/x86/h264dsp_init.c | 10 ++ 2 files changed, 11 insertions(+) diff --git a/libavcodec/x86/h264_deblock.asm b/libavcodec/x86/h264_deblock.asm index 32aa3d3..6702ae9 100644 --- a/libavcodec/x86/h264_deblock.asm +++

[FFmpeg-devel] enabling sse2

2017-02-21 Thread James Darnley
libavcodec/x86/h264_deblock.asm | 1 + libavcodec/x86/h264dsp_init.c | 10 ++ 2 files changed, 11 insertions(+) Okay, enabling sse2 gets me the results below. It turns out I should allow sse2 despite some previous testing. Should I leave avx? Sometimes it is a few percentage

[FFmpeg-devel] [PATCH 3/6] avcodec/h264: add avx 8-bit 4:2:2 chroma h deblock/loop filter

2017-02-20 Thread James Darnley
~1.21x faster (68 vs. 56 cycles) compared with mmxext function --- libavcodec/x86/h264_deblock.asm | 27 +++ libavcodec/x86/h264dsp_init.c | 2 ++ 2 files changed, 29 insertions(+) diff --git a/libavcodec/x86/h264_deblock.asm b/libavcodec/x86/h264_deblock.asm index
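For readers checking the arithmetic: the speedup figures quoted in these patch descriptions appear to be the baseline cycle count divided by the new function's cycle count. A minimal sketch, assuming that reading; `speedup` is a hypothetical helper and is not part of FFmpeg or of the patch. Some figures elsewhere in the series do not match the rounded cycle counts exactly, presumably because they were computed from unrounded measurements.

```c
/* Hypothetical helper (not from the patch): ratio of baseline cycles to
 * the cycles taken by the new function. Assumes new_cycles is non-zero. */
double speedup(unsigned baseline_cycles, unsigned new_cycles)
{
    return (double)baseline_cycles / (double)new_cycles;
}
```

With the counts quoted above, `speedup(68, 56)` is a little over 1.21.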

[FFmpeg-devel] [PATCH 1/6] avcodec/h264: add avx 8-bit chroma v deblock/loop filter

2017-02-20 Thread James Darnley
~1.24x faster (101 vs. 81 cycles) compared with mmxext function --- libavcodec/x86/h264_deblock.asm | 38 ++ libavcodec/x86/h264dsp_init.c | 2 ++ 2 files changed, 40 insertions(+) diff --git a/libavcodec/x86/h264_deblock.asm

[FFmpeg-devel] [PATCH 5/6] avcodec/h264: add avx 8-bit 4:2:0 chroma h intra deblock/loop filter

2017-02-20 Thread James Darnley
~1.10x faster (69 vs. 63 cycles) compared to mmxext function --- libavcodec/x86/h264_deblock.asm | 9 + libavcodec/x86/h264dsp_init.c | 1 + 2 files changed, 10 insertions(+) diff --git a/libavcodec/x86/h264_deblock.asm b/libavcodec/x86/h264_deblock.asm index 1e6d822..2197608 100644

[FFmpeg-devel] [PATCH 2/6] avcodec/h264: add avx 8-bit 4:2:0 chroma h deblock/loop filter

2017-02-20 Thread James Darnley
~1.14x faster (93 vs. 81 cycles) compared with mmxext function --- libavcodec/x86/h264_deblock.asm | 70 + libavcodec/x86/h264dsp_init.c | 3 ++ 2 files changed, 73 insertions(+) diff --git a/libavcodec/x86/h264_deblock.asm

[FFmpeg-devel] [PATCH 6/6] avcodec/h264: add avx 8-bit 4:2:2 chroma h intra deblock/loop filter

2017-02-20 Thread James Darnley
~1.37x faster (147 vs. 108 cycles) compared to mmxext function --- libavcodec/x86/h264_deblock.asm | 18 ++ libavcodec/x86/h264dsp_init.c | 1 + 2 files changed, 19 insertions(+) diff --git a/libavcodec/x86/h264_deblock.asm b/libavcodec/x86/h264_deblock.asm index

[FFmpeg-devel] [PATCH 4/6] avcodec/h264: add avx 8-bit chroma v intra deblock/loop filter

2017-02-20 Thread James Darnley
~1.14x faster (90 vs 78 cycles) compared with mmxext --- libavcodec/x86/h264_deblock.asm | 33 + libavcodec/x86/h264dsp_init.c | 1 + 2 files changed, 34 insertions(+) diff --git a/libavcodec/x86/h264_deblock.asm b/libavcodec/x86/h264_deblock.asm index

[FFmpeg-devel] [PATCH 0/6] avx functions for h264 chroma deblocking

2017-02-20 Thread James Darnley
6 more functions which eke out a little more speed. James Darnley (6): avcodec/h264: add avx 8-bit chroma v deblock/loop filter avcodec/h264: add avx 8-bit 4:2:0 chroma h deblock/loop filter avcodec/h264: add avx 8-bit 4:2:2 chroma h deblock/loop filter avcodec/h264: add avx 8-bit chroma

Re: [FFmpeg-devel] [PATCH 1/4] avcodec/x86: deduplicate PASS8ROWS macro

2017-02-17 Thread James Darnley
On 2017-02-16 14:11, James Darnley wrote: > Four patches Does anyone else have any more comments about this patch series? Yea or nay from anyone? ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel

[FFmpeg-devel] [PATCH 1/4] avcodec/x86: deduplicate PASS8ROWS macro

2017-02-16 Thread James Darnley
--- libavcodec/x86/h264_deblock.asm | 5 - libavcodec/x86/h264_deblock_10bit.asm | 5 - libavcodec/x86/hevc_deblock.asm | 5 - libavutil/x86/x86util.asm | 5 + 4 files changed, 5 insertions(+), 15 deletions(-) diff --git a/libavcodec/x86/h264_deblock.asm

[FFmpeg-devel] [PATCH 2/4] avcodec/h264: add named parameters to x86 function

2017-02-16 Thread James Darnley
--- libavcodec/x86/h264_deblock.asm | 32 1 file changed, 16 insertions(+), 16 deletions(-) diff --git a/libavcodec/x86/h264_deblock.asm b/libavcodec/x86/h264_deblock.asm index 435c8be..509a0db 100644 --- a/libavcodec/x86/h264_deblock.asm +++

[FFmpeg-devel] [PATCH 4/4] avcodec/h264: sse2, avx h luma mbaff deblock/loop filter

2017-02-16 Thread James Darnley
x86-64 only Yorkfield: - sse2: ~2.17x (434 vs. 200 cycles) Nehalem: - sse2: ~2.94x (409 vs. 139 cycles) Skylake: - sse2: ~3.10x (370 vs. 119 cycles) - avx: ~3.29x (370 vs. 112 cycles) --- libavcodec/x86/h264_deblock.asm | 89 +

Re: [FFmpeg-devel] [PATCH] avfilter: implement halve filter

2017-02-15 Thread James Darnley
On 2017-02-14 22:25, Mark Thompson wrote: > On 14/02/17 19:44, Daniel Oberhoff wrote: >> filter strictly “halves” the image efficiently, which is often exactly what >> is needed >> likely much faster than using scale > > Did you benchmark this? How? > > $ time ./ffmpeg -f lavfi -i allyuv -vf

Re: [FFmpeg-devel] [PATCH 3/4] x86util: import MOVHL macro

2017-02-15 Thread James Darnley
On 2017-02-14 17:21, Henrik Gramner wrote: > On Mon, Feb 13, 2017 at 1:44 PM, James Darnley <jdarn...@obe.tv> wrote: >> Originally committed to x264 in 1637239a by Henrik Gramner who has >> agreed to re-license it as LGPL. Original commit message follows. >> >>

[FFmpeg-devel] [PATCH 3/4] x86util: import MOVHL macro

2017-02-13 Thread James Darnley
Originally committed to x264 in 1637239a by Henrik Gramner who has agreed to re-license it as LGPL. Original commit message follows. x86: Avoid some bypass delays and false dependencies A bypass delay of 1-3 clock cycles may occur on some CPUs when transitioning between int and

[FFmpeg-devel] [PATCH 2/4] avcodec/h264: add named parameters to x86 function

2017-02-13 Thread James Darnley
--- libavcodec/x86/h264_deblock.asm | 32 1 file changed, 16 insertions(+), 16 deletions(-) diff --git a/libavcodec/x86/h264_deblock.asm b/libavcodec/x86/h264_deblock.asm index 435c8be56f..509a0dbe0c 100644 --- a/libavcodec/x86/h264_deblock.asm +++

[FFmpeg-devel] [PATCH 1/4] avcodec/x86: deduplicate PASS8ROWS macro

2017-02-13 Thread James Darnley
--- libavcodec/x86/h264_deblock.asm | 5 - libavcodec/x86/h264_deblock_10bit.asm | 5 - libavcodec/x86/hevc_deblock.asm | 5 - libavutil/x86/x86util.asm | 5 + 4 files changed, 5 insertions(+), 15 deletions(-) diff --git a/libavcodec/x86/h264_deblock.asm

[FFmpeg-devel] [PATCH 4/4] avcodec/h264: sse2, avx h luma mbaff deblock/loop filter

2017-02-13 Thread James Darnley
x86-64 only Yorkfield: - sse2: 2.16x (434 vs. 201 cycles) Skylake: - sse2: 3.04x (378 vs. 124 cycles) - avx: 3.29x (378 vs. 115 cycles) --- libavcodec/x86/h264_deblock.asm | 119 libavcodec/x86/h264dsp_init.c | 10 2 files changed, 129

Re: [FFmpeg-devel] [PATCH 2/2] imdct15: replace the FFT with a faster PFA FFT algorithm

2017-01-04 Thread James Darnley
On 2017-01-04 13:17, Rostislav Pehlivanov wrote: > Forgot to check the return value here, changed locally to: > > if (ff_fft_init(>ptwo_fft, N - 1, 1) < 0); > goto fail; I hope you have not changed it to that, with that semicolon at the end of the line.
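The problem with the stray semicolon can be shown in isolation. This is only a sketch: `fake_init`, `setup_buggy` and `setup_fixed` are hypothetical names standing in for the real call and its caller. In the buggy version the `if` controls an empty statement, so the `goto` runs whether or not the call failed. Most compilers warn about the buggy form when warnings are enabled.

```c
/* Hypothetical stand-in for an init call that returns a negative value
 * on failure and 0 on success. */
int fake_init(int ok)
{
    return ok ? 0 : -1;
}

int setup_buggy(int ok)
{
    if (fake_init(ok) < 0);  /* stray ';': the if guards an empty statement */
        goto fail;           /* runs unconditionally */
    return 0;                /* never reached */
fail:
    return -1;
}

int setup_fixed(int ok)
{
    if (fake_init(ok) < 0)   /* no semicolon: the goto is conditional */
        goto fail;
    return 0;
fail:
    return -1;
}
```

Here `setup_buggy` reports failure even when the init call succeeds, while `setup_fixed` fails only when the call does.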

[FFmpeg-devel] [PATCH] avcodec/h264: resolve assert being triggered when stack is not aligned

2016-12-07 Thread James Darnley
32-bit msvc. --- libavcodec/x86/h264_deblock_10bit.asm | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/libavcodec/x86/h264_deblock_10bit.asm b/libavcodec/x86/h264_deblock_10bit.asm index 56cf4d6..c295364 100644 --- a/libavcodec/x86/h264_deblock_10bit.asm +++

Re: [FFmpeg-devel] [PATCH 0/4] More H.264 assembly (the sequel) [version 2]

2016-12-06 Thread James Darnley
On 2016-12-05 19:32, James Darnley wrote: > Fixed the problem Michael highlighted. Dropped the intra functions until it > becomes clear why their performance is unexpected. Updated the benchmarks with > results from a Nehalem and used (slightly) more accurate data. > > Regarding

[FFmpeg-devel] [PATCH 3/4] avcodec/h264: mmx2, sse2, avx 10-bit h chroma deblock/loop filter

2016-12-05 Thread James Darnley
Yorkfield: - mmx2: 2.45x (279 vs. 114 cycles) - sse2: 3.36x (279 vs. 83 cycles) Nehalem: - mmx2: 2.10x (192 vs. 92 cycles) - sse2: 2.84x (192 vs. 68 cycles) Skylake: - mmx2: 1.75x (170 vs. 97 cycles) - sse2: 2.47x (170 vs. 69 cycles) - avx: 2.47x (170 vs. 69 cycles) ---

[FFmpeg-devel] [PATCH 4/4] avcodec/h264: mmx2, sse2, avx 10-bit 4:2:2 h chroma deblock/loop filter

2016-12-05 Thread James Darnley
Yorkfield: - mmx2: 2.53x (504 vs. 199 cycles) - sse2: 3.83x (504 vs. 131 cycles) Nehalem: - mmx2: 2.42x (365 vs. 151 cycles) - sse2: 3.56x (365 vs. 103 cycles) Skylake: - mmx2: 1.81x (308 vs. 170 cycles) - sse2: 2.84x (308 vs. 108 cycles) - avx: 2.93x (308 vs. 105 cycles) ---

[FFmpeg-devel] [PATCH 1/4] avcodec/h264: clean up and expand x86 function definitions

2016-12-05 Thread James Darnley
--- libavcodec/x86/h264dsp_init.c | 9 ++--- 1 file changed, 6 insertions(+), 3 deletions(-) diff --git a/libavcodec/x86/h264dsp_init.c b/libavcodec/x86/h264dsp_init.c index c6c643a..7cc0655 100644 --- a/libavcodec/x86/h264dsp_init.c +++ b/libavcodec/x86/h264dsp_init.c @@ -110,6 +110,8 @@

[FFmpeg-devel] [PATCH 2/4] whitespace changes after last commit

2016-12-05 Thread James Darnley
--- libavcodec/x86/h264dsp_init.c | 44 +-- 1 file changed, 22 insertions(+), 22 deletions(-) diff --git a/libavcodec/x86/h264dsp_init.c b/libavcodec/x86/h264dsp_init.c index 7cc0655..7e16dca 100644 --- a/libavcodec/x86/h264dsp_init.c +++

[FFmpeg-devel] [PATCH 0/4] More H.264 assembly (the sequel) [version 2]

2016-12-05 Thread James Darnley
to remove it I will keep the code. However, I will probably not write any more going forward. James Darnley (4): avcodec/h264: clean up and expand x86 function definitions whitespace changes after last commit avcodec/h264: mmx2, sse2, avx 10-bit h chroma deblock/loop filter avcodec/h264: mmx2

Re: [FFmpeg-devel] [PATCH 1/6] avcodec/h264: mmx2, sse2, avx 10-bit h chroma deblock/loop filter

2016-12-01 Thread James Darnley
On 2016-12-02 00:31, Carl Eugen Hoyos wrote: > 2016-12-01 17:57 GMT+01:00 James Darnley <jdarn...@obe.tv>: >> Yorkfield: >> - mmx2: 2.44x faster (278 vs. 114 cycles) >> - sse2: 3.35x faster (278 vs. 83 cycles) >> >> Skylake: >> - mmx2: 1.69x faster

Re: [FFmpeg-devel] [PATCH 1/6] avcodec/h264: mmx2, sse2, avx 10-bit h chroma deblock/loop filter

2016-12-01 Thread James Darnley
On 2016-12-01 23:16, Michael Niedermayer wrote: > On Thu, Dec 01, 2016 at 05:57:44PM +0100, James Darnley wrote: >> Yorkfield: >> - mmx2: 2.44x faster (278 vs. 114 cycles) >> - sse2: 3.35x faster (278 vs. 83 cycles) >> >> Skylake: >> - mmx2: 1.69x faster

[FFmpeg-devel] [PATCH 3/6] whitespace changes after last commit

2016-12-01 Thread James Darnley
--- libavcodec/x86/h264dsp_init.c | 44 +-- 1 file changed, 22 insertions(+), 22 deletions(-) diff --git a/libavcodec/x86/h264dsp_init.c b/libavcodec/x86/h264dsp_init.c index 3d35f59..ab270da 100644 --- a/libavcodec/x86/h264dsp_init.c +++

[FFmpeg-devel] [PATCH 5/6] avcodec/h264: mmx2, sse2, avx 10-bit h chroma intra deblock/loop filter

2016-12-01 Thread James Darnley
Yorkfield: - mmx2: 0.99x faster (180 vs. 181 cycles) - sse2: 1.05x faster (180 vs. 170 cycles) Skylake: - mmx2: 1.21x faster (125 vs. 103 cycles) - sse2: 1.54x faster (125 vs. 81 cycles) - avx: 1.29x faster (125 vs. 97 cycles) --- libavcodec/x86/h264_deblock_10bit.asm | 29

[FFmpeg-devel] [PATCH 4/6] avcodec/h264: mmx2, sse2, avx 10-bit 4:2:2 h chroma deblock/loop filter

2016-12-01 Thread James Darnley
Yorkfield: - mmx2: 2.54x faster (500 vs. 197 cycles) - sse2: 3.82x faster (500 vs. 131 cycles) Skylake: - mmx2: 1.80x faster (317 vs. 176 cycles) - sse2: 2.81x faster (317 vs. 113 cycles) - avx: 2.85x faster (317 vs. 111 cycles) --- libavcodec/x86/h264_deblock_10bit.asm | 39

[FFmpeg-devel] [PATCH 2/6] avcodec/h264: clean up and expand x86 function definitions

2016-12-01 Thread James Darnley
--- libavcodec/x86/h264dsp_init.c | 9 ++--- 1 file changed, 6 insertions(+), 3 deletions(-) diff --git a/libavcodec/x86/h264dsp_init.c b/libavcodec/x86/h264dsp_init.c index c568762..3d35f59 100644 --- a/libavcodec/x86/h264dsp_init.c +++ b/libavcodec/x86/h264dsp_init.c @@ -110,6 +110,8 @@

[FFmpeg-devel] [PATCH 0/6] More H.264 assembly (the sequel)

2016-12-01 Thread James Darnley
will definitely try benchmarking it on my Nehalem after sending these emails. Suggestions greatly appreciated. James Darnley (6): avcodec/h264: mmx2, sse2, avx 10-bit h chroma deblock/loop filter avcodec/h264: clean up and expand x86 function definitions whitespace changes after last commit

[FFmpeg-devel] [PATCH 1/6] avcodec/h264: mmx2, sse2, avx 10-bit h chroma deblock/loop filter

2016-12-01 Thread James Darnley
Yorkfield: - mmx2: 2.44x faster (278 vs. 114 cycles) - sse2: 3.35x faster (278 vs. 83 cycles) Skylake: - mmx2: 1.69x faster (169 vs. 100 cycles) - sse2: 2.34x faster (169 vs. 72 cycles) - avx: 2.32x faster (169 vs. 73 cycles) --- libavcodec/x86/h264_deblock_10bit.asm | 118

Re: [FFmpeg-devel] [PATCH 3/3] avcodec/h264: sse2 and avx 4:2:2 idct add8 10-bit functions

2016-11-30 Thread James Darnley
On 2016-11-30 13:57, Ronald S. Bultje wrote: > On Wed, Nov 30, 2016 at 7:10 AM, James Darnley <jdarn...@obe.tv> wrote: >>> Nehalem: >>> - sse2: >>>- complex: 4.13x faster (1514 vs. 367 cycles) >>>- simple: 4.38x fas

Re: [FFmpeg-devel] [PATCH 3/3] avcodec/h264: sse2 and avx 4:2:2 idct add8 10-bit functions

2016-11-30 Thread James Darnley
On 2016-11-29 21:09, Carl Eugen Hoyos wrote: > 2016-11-29 17:14 GMT+01:00 James Darnley <jdarn...@obe.tv>: >> On 2016-11-29 15:30, Carl Eugen Hoyos wrote: >>> 2016-11-29 12:52 GMT+01:00 James Darnley <jdarn...@obe.tv>: >>>> sse2: >>>> complex:

Re: [FFmpeg-devel] [PATCH 3/3] avcodec/h264: sse2 and avx 4:2:2 idct add8 10-bit functions

2016-11-29 Thread James Darnley
On 2016-11-29 21:09, Carl Eugen Hoyos wrote: > 2016-11-29 17:14 GMT+01:00 James Darnley <jdarn...@obe.tv>: >> On 2016-11-29 15:30, Carl Eugen Hoyos wrote: >>> 2016-11-29 12:52 GMT+01:00 James Darnley <jdarn...@obe.tv>: >>>> sse2: >>>> complex:

Re: [FFmpeg-devel] [PATCH 3/3] avcodec/h264: sse2 and avx 4:2:2 idct add8 10-bit functions

2016-11-29 Thread James Darnley
On 2016-11-29 15:30, Carl Eugen Hoyos wrote: > 2016-11-29 12:52 GMT+01:00 James Darnley <jdarn...@obe.tv>: >> sse2: >> complex: 4.13x faster (1514 vs. 367 cycles) >> simple: 4.38x faster (1836 vs. 419 cycles) >> >> avx: >> complex: 1.07x faster (260

[FFmpeg-devel] [PATCH 1/3] avcodec/h264: mmxext 4:2:2 chroma intra deblock/loop filter

2016-11-29 Thread James Darnley
2.1 times faster (401 vs. 194 cycles) --- libavcodec/x86/h264_deblock.asm | 14 ++ libavcodec/x86/h264dsp_init.c | 2 ++ 2 files changed, 16 insertions(+) diff --git a/libavcodec/x86/h264_deblock.asm b/libavcodec/x86/h264_deblock.asm index 4aabbc0..fe0ab20 100644 ---

[FFmpeg-devel] [PATCH 3/3] avcodec/h264: sse2 and avx 4:2:2 idct add8 10-bit functions

2016-11-29 Thread James Darnley
sse2: complex: 4.13x faster (1514 vs. 367 cycles) simple: 4.38x faster (1836 vs. 419 cycles) avx: complex: 1.07x faster (260 vs. 244 cycles) simple: 1.03x faster (284 vs. 274 cycles) --- libavcodec/x86/h264_idct_10bit.asm | 53 ++

[FFmpeg-devel] [PATCH 2/3] avcodec/h264: mmx 4:2:2 idct add8 function

2016-11-29 Thread James Darnley
2.87 times faster (1830 vs. 638 cycles) --- libavcodec/x86/h264_idct.asm | 32 libavcodec/x86/h264dsp_init.c | 7 ++- 2 files changed, 38 insertions(+), 1 deletion(-) diff --git a/libavcodec/x86/h264_idct.asm b/libavcodec/x86/h264_idct.asm index

[FFmpeg-devel] [PATCH 0/3] Some new H.264 4:2:2 assembly

2016-11-29 Thread James Darnley
As the title says: new assembly for the H.264 decoder. Many thanks to the authors of the 4:2:0 functions. They were fairly easy to adapt after I saw the pattern in the C, I just had to find it in the asm. James Darnley (3): avcodec/h264: mmxext 4:2:2 chroma intra deblock/loop filter avcodec

[FFmpeg-devel] Help with adding new adpcm decoder

2016-11-22 Thread James Darnley
I want to add a decoder for a game's music, specifically Falcom's Xanadu Next. I think the audio could be decompressed by adpcm_ms but the problem comes from the rest of the format. The file starts with a riff wave header that lies about being pcm and other values, but I can force the decoder

Re: [FFmpeg-devel] [Vote] Code of Conduct

2016-05-21 Thread James Darnley
On 2016-05-18 20:40, Michael Niedermayer wrote: > This is the version I had in my pending branch and should be the last > version of the Code of Conduct from March, IIRC there were no further > comments on the last version, so I am calling everyone to vote on this. > Everyone because it should

[FFmpeg-devel] [WIP] vc2 encoder simd assembly

2016-03-01 Thread James Darnley
Hello I've been working on assembly for the vc2 encoder and have reached an impasse. My code results in very visible errors, very obvious vertical streaks in the bottom-right half of the image and some low-frequency effect (I think). I cannot see the problem in my code so I need some fresh eyes

Re: [FFmpeg-devel] [Consulting] Converting an Audio Visualization application to CLI using FFMpeg

2016-02-27 Thread James Darnley
On 2016-02-27 04:09, Ryan Schott wrote: > Hello, > > I am not sure if this is the right page to post this, but your consulting > page recommended I use this list. I recently built an audio visualization app > using html5. I'm currently using that app with xsplit to stream music to an > RTMP

Re: [FFmpeg-devel] [patch] gdigrab-mouse-dpi-awareness

2016-02-11 Thread James Darnley
On 2016-02-11 23:19, Γιώργος Μεταξάκης wrote: > Subject: [PATCH] mouse dpi awareness > > --- > libavdevice/gdigrab.c | 28 +++- > 1 file changed, 15 insertions(+), 13 deletions(-) > > diff --git a/libavdevice/gdigrab.c b/libavdevice/gdigrab.c > index 4428a34..60f184e

Re: [FFmpeg-devel] [PATCH] avcodec/h264: Fix segfault in 4:2:2 chroma deblock with 32-bit msvc

2016-02-05 Thread James Darnley
On 2016-02-05 21:20, Henrik Gramner wrote: > Using rNm and x86inc's stack allocation with a negative value at the same > time isn't supported, and caused the original stack pointer to be clobbered > when using a compiler that doesn't support stack alignment. > --- >

Re: [FFmpeg-devel] [PATCH] cmdutils: realign for some additional filters with very long name

2016-02-05 Thread James Darnley
On 2016-02-05 21:20, Paul B Mahol wrote: > diff --git a/cmdutils.c b/cmdutils.c > index e0d2807..03a4836 100644 > --- a/cmdutils.c > +++ b/cmdutils.c > @@ -1625,7 +1625,7 @@ int show_filters(void *optctx, const char *opt, const > char *arg) >( i &&

Re: [FFmpeg-devel] [RFC][WIP][PATCH] avfilter: add luascript filter

2016-02-05 Thread James Darnley
On 2016-02-04 19:40, Paul B Mahol wrote: > +#define FN_ENTRY(name) {#name, script_ ## name} > +struct fn_entry { > +const char *name; > +int (*fn)(lua_State *L); > +}; > + > +static const struct fn_entry main_fns[] = { > +FN_ENTRY(log), > +FN_ENTRY(frame_count), > +

Re: [FFmpeg-devel] [RFC][WIP][PATCH] avfilter: add luascript filter

2016-02-03 Thread James Darnley
On 2016-02-02 23:25, Paul B Mahol wrote: > Hi, > > patch attached. Nice. I look forward to reading it. Firstly: why limit it to Lua 5.1? I think it should also support LuaJIT. While it is ABI compatible with 5.1 this patch would require its headers to be in "lua-5.1". My suggestion would be

[FFmpeg-devel] [PATCH] avcodec/h264: mmxext 4:2:2 chroma deblock/loop filter

2016-02-01 Thread James Darnley
2.6 times faster (366 vs. 142 cycles) --- Changes since last patch: - name changed to follow 420 version. - use one less reg by using r4 more (James Almer's suggestion) - don't require aligned space in the stack, use a negative value as the cglobal argument. (perhaps unnecessary now that r6

Re: [FFmpeg-devel] [PATCH] docs: explain properly how -fs works

2016-01-31 Thread James Darnley
On 2016-01-31 16:58, Umair Khan wrote: > Hi, > Thanks for reply. I did a lot of searching but couldn't get what is > the proper way to resend the patch with amended commit. > Should I just do git send-email again ? > And should I send it to this thread only ? If yes, how ? Yes. Run send-email

Re: [FFmpeg-devel] [PATCH] ffmpeg: extend -benchmark_all option to show elapsed time

2016-01-27 Thread James Darnley
On 2016-01-27 19:27, Stefano Sabatini wrote: > On date Wednesday 2016-01-27 13:56:38 +0100, James Darnley encoded: >> On 2016-01-11 18:21, Stefano Sabatini wrote: >>> +This option shows the following information for each processing step, >>> +in this order: the user pr

Re: [FFmpeg-devel] [PATCH] ffmpeg: replace "flush Media" with "flush_media" in benchmark_all output

2016-01-27 Thread James Darnley
On 2016-01-27 13:09, Stefano Sabatini wrote: > Simplify parsing and consistency. Fine. (Ha. It looks like I forgot to press send on this before going out.)

Re: [FFmpeg-devel] [PATCH] ffmpeg: extend -benchmark_all option to show elapsed time

2016-01-27 Thread James Darnley
On 2016-01-11 18:21, Stefano Sabatini wrote: > +This option shows the following information for each processing step, > +in this order: the user process time (in microseconds), the elapsed > +relative time (in microseconds), the processing step type, and the > +relative stream. What is "relative

Re: [FFmpeg-devel] [PATCH] doc/filters: add an example to scale

2016-01-26 Thread James Darnley
On 2016-01-22 14:44, Michael Niedermayer wrote: > On Fri, Jan 22, 2016 at 03:53:10AM +0100, James Darnley wrote: >> Someone on IRC asked for a scale that would fit in a given box. This is the >> answer. I couldn't see it in the existing examples so I thought I would add >&g

Re: [FFmpeg-devel] [PATCH] build: fix MSVC under cygwin

2016-01-23 Thread James Darnley
On 2016-01-24 00:24, James Darnley wrote: > I will try and find out how old the option is. --mixed was added in 2002: > https://cygwin.com/git/gitweb.cgi?p=newlib-cygwin.git;a=commit;h=1050e57c9afee171480510d3277877aca29c0f96

Re: [FFmpeg-devel] [PATCH] build: fix msvc build

2016-01-23 Thread James Darnley
On 2016-01-23 22:11, charlie.arn...@gmail.com wrote: > +if enabled msvc; then > +dst_path=$(pwd -W) > +else > +dst_path=$(pwd) > +fi > + If using MSVC through Cygwin is supported this would fail. Its pwd command does not have the -W option. Most people probably don't use both.

[FFmpeg-devel] [PATCH] doc/filters: add an example to scale

2016-01-21 Thread James Darnley
Someone on IRC asked for a scale that would fit in a given box. This is the answer. I couldn't see it in the existing examples so I thought I would add it. --- doc/filters.texi | 6 ++ 1 file changed, 6 insertions(+) diff --git a/doc/filters.texi b/doc/filters.texi index dd1f203..56236c6
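The arithmetic behind fitting a frame inside a box while keeping its aspect ratio can be sketched as follows. This is not the filter expression from the patch, which is cut off above; `struct size` and `fit_in_box` are hypothetical names, and the sketch assumes all dimensions are positive.

```c
/* Hypothetical helper (not the expression from the patch): scale w x h to
 * the largest size that fits inside box_w x box_h with the same aspect
 * ratio. Assumes all four arguments are positive. */
struct size {
    int w;
    int h;
};

struct size fit_in_box(int w, int h, int box_w, int box_h)
{
    struct size out;

    /* Compare aspect ratios by cross-multiplying, avoiding floating point. */
    if ((long long)w * box_h > (long long)h * box_w) {
        /* Input is relatively wider than the box: width is the limit. */
        out.w = box_w;
        out.h = (int)((long long)h * box_w / w);
    } else {
        /* Input is relatively taller (or equal): height is the limit. */
        out.h = box_h;
        out.w = (int)((long long)w * box_h / h);
    }
    return out;
}
```

For example, a 1920x1080 input fitted into a 1280x1280 box becomes 1280x720. The integer division truncates, so a real filter chain may additionally need to round to even dimensions.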

Re: [FFmpeg-devel] [PATCH 0/7] x86inc: Sync changes from x264

2016-01-17 Thread James Darnley
On 2016-01-17 23:59, Henrik Gramner wrote: > The following patches were recently pushed to x264. > > Geza Lore (1): > x86inc: Add debug symbols indicating sizes of compiled functions > > Henrik Gramner (6): > x86inc: Be more verbose in assertion failures > x86inc: Improve FMA instruction

Re: [FFmpeg-devel] [PATCH 1/4] fate: add 10-bit v210 encoder tests

2016-01-17 Thread James Darnley
On 2016-01-17 03:11, James Darnley wrote: > On 2016-01-15 20:07, James Darnley wrote: >> ... > > If nobody has further comments about the patches I will probably push > these after I wake up. > A little later than planned but now pushed.

Re: [FFmpeg-devel] [PATCH 1/4] fate: add 10-bit v210 encoder tests

2016-01-16 Thread James Darnley
On 2016-01-15 20:07, James Darnley wrote: > ... If nobody has further comments about the patches I will probably push these after I wake up.

[FFmpeg-devel] [PATCH 2/4] avcodec/v210: add avx2 version of the 8-bit line encoder

2016-01-15 Thread James Darnley
Around 35% faster than the avx version. Signed-off-by: Henrik Gramner --- The only changes here are the ones suggested by Henrik and a whitespace change for alignment at the function definition in v210enc_init.c --- libavcodec/v210enc.c | 5 +++--

[FFmpeg-devel] [PATCH 4/4] avcodec/v210: document the requirement for sample_factor

2016-01-15 Thread James Darnley
The sample factor must be the same for both 8- and 10-bit functions chosen otherwise the output will be incorrect. --- Should I squash this one too? --- libavcodec/v210enc.h | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/libavcodec/v210enc.h b/libavcodec/v210enc.h index

[FFmpeg-devel] [PATCH 3/4] avcodec/v210: add avx2 version of the 10-bit line encoder

2016-01-15 Thread James Darnley
Around 25% faster than the ssse3 version. --- New patch. Should I squash this into the previous patch before committing? --- libavcodec/v210enc.c | 11 +-- libavcodec/x86/constants.c| 3 ++- libavcodec/x86/constants.h| 2 +- libavcodec/x86/v210enc.asm| 20

[FFmpeg-devel] [PATCH 1/4] fate: add 10-bit v210 encoder tests

2016-01-15 Thread James Darnley
--- Is the name I chose for the 10-bit tests (v210-10) okay? --- tests/fate/vcodec.mak| 3 ++- tests/ref/vsynth/vsynth1-v210-10 | 4 tests/ref/vsynth/vsynth2-v210-10 | 4 tests/ref/vsynth/vsynth3-v210-10 | 4 tests/ref/vsynth/vsynth_lena-v210-10 | 4

Re: [FFmpeg-devel] [PATCH] avcodec/h264: mmxext 4:2:2 chroma deblock/loop filter

2016-01-15 Thread James Darnley
On 2016-01-15 04:21, Ronald S. Bultje wrote: > If you don't need r%dm (looks like you don't, but didn't check > exhaustively), you can also use a negative stack size (0 - mmsize - > ARCH_X86_64 * 2 * mmsize), then it will not create a stack pointer. I am already using r[0-3]m for storage. (A

Re: [FFmpeg-devel] [PATCH] avcodec/h264: mmxext 4:2:2 chroma deblock/loop filter

2016-01-15 Thread James Darnley
On 2016-01-15 03:55, James Almer wrote: > On 1/14/2016 11:05 PM, James Darnley wrote: >> diff --git a/libavcodec/x86/h264_deblock.asm >> b/libavcodec/x86/h264_deblock.asm >> index 5151f3c..20f0814 100644 >> --- a/libavcodec/x86/h264_deblock.asm >> +++

Re: [FFmpeg-devel] [PATCH] avcodec/h264: mmxext 4:2:2 chroma deblock/loop filter

2016-01-15 Thread James Darnley
On 2016-01-15 21:55, James Almer wrote: > On 1/15/2016 5:00 PM, James Darnley wrote: >> On 2016-01-15 03:55, James Almer wrote: >>> On 1/14/2016 11:05 PM, James Darnley wrote: >>>> diff --git a/libavcodec/x86/h264_deblock.asm >>>> b/libavcodec/x86/h2

[FFmpeg-devel] [PATCH] avcodec/h264: mmxext 4:2:2 chroma deblock/loop filter

2016-01-14 Thread James Darnley
2.6 times faster --- I have one question now. Should I make the function name match the assembly existing deblock/loop filter functions? I took the current name from the C (as I was originally trying to use a gather instruction but that didn't offer any benefit). ---

Re: [FFmpeg-devel] [PATCH] avcodec/v210: add avx2 version of the line encoder

2016-01-14 Thread James Darnley
On 2016-01-14 21:42, Henrik Gramner wrote: > On Thu, Jan 14, 2016 at 9:27 PM, James Darnley <james.darn...@gmail.com> > wrote: >> On 2016-01-14 20:21, Henrik Gramner wrote: >>> xmN can be used unconditionally which gets rid of the %else. E.g. >>>

Re: [FFmpeg-devel] [PATCH] avcodec/v210: add avx2 version of the line encoder

2016-01-14 Thread James Darnley
On 2016-01-14 20:21, Henrik Gramner wrote: > On Wed, Jan 13, 2016 at 4:55 PM, James Darnley <james.darn...@gmail.com> > wrote: >> diff --git a/libavcodec/x86/v210enc.asm b/libavcodec/x86/v210enc.asm >> index 859e2d9..a8f3d3c 100644 >> --- a/libavcodec/x86/v210e

[FFmpeg-devel] [PATCH] avcodec/v210: add avx2 version of the line encoder

2016-01-13 Thread James Darnley
Around 35% faster than the avx version. --- libavcodec/v210enc.c | 5 ++-- libavcodec/v210enc.h | 1 + libavcodec/x86/v210enc.asm| 53 +++ libavcodec/x86/v210enc_init.c | 7 ++ 4 files changed, 49 insertions(+), 17 deletions(-)

Re: [FFmpeg-devel] [libav-devel] [RFC] Cineform HD questions

2015-12-31 Thread James Darnley
On 2015-12-31 07:02, Kieran Kunhya wrote: >> Apart from that, again from a quick glance, there are a ton of >> mallocs/frees. Can these somehow get consolidated? > > Yes, that's what I don't know how to solve easily. They should of > course be a single allocated buffer that's reused. Forgive me

Re: [FFmpeg-devel] Ideas to replace the options system

2015-12-04 Thread James Darnley
On 2015-12-04 15:33, Nicolas George wrote: > Why do we need a new options system? > > Most importantly: escaping hell OMG yes! I have seen several times the amount of backslashes Windows users are forced to use to provide a path to some of the filters. You raise a lot of good points that

Re: [FFmpeg-devel] overlay filter option/alternative

2015-12-04 Thread James Darnley
On 2015-12-04 06:29, Ryan Williams wrote: > EDIT: Fixed errors in syntax. > > TLDR, Would you consider an 'underlay' filter or perhaps an option on the > 'overlay' filter that reverses the order of the input labels? > > Consider the following shorthand syntax "[input][a] overlay, [b] overlay,

Re: [FFmpeg-devel] [PATCH] all: use M_SQRT1_2, M_SQRT2, M_PI

2015-11-19 Thread James Darnley
On 2015-11-19 13:52, Ganesh Ajjanagadde wrote: > diff --git a/libavfilter/af_dynaudnorm.c b/libavfilter/af_dynaudnorm.c >> index 8f0c2d0..62a2653 100644 >> --- a/libavfilter/af_dynaudnorm.c >> +++ b/libavfilter/af_dynaudnorm.c >> @@ -227,8 +227,6 @@ static int cqueue_pop(cqueue *q) >> return

Re: [FFmpeg-devel] [PATCHv3] avutil/common: add av_rint64_clip

2015-11-13 Thread James Darnley
On 2015-11-13 15:23, Ganesh Ajjanagadde wrote: > diff --git a/libavutil/version.h b/libavutil/version.h > index 909f9a6..ea10ff0 100644 > --- a/libavutil/version.h > +++ b/libavutil/version.h > @@ -56,8 +56,8 @@ > */ > > #define LIBAVUTIL_VERSION_MAJOR 55 > -#define LIBAVUTIL_VERSION_MINOR

Re: [FFmpeg-devel] [PATCH] avcodec: disallow hwaccel with frame threads

2015-10-23 Thread James Darnley
On 2015-10-23 13:54, Hendrik Leppkes wrote: > The only reason the combination of frame threads and HWAccel was > considered useful is to allow a seamless fallback to multi-threaded > software decoding if the HWAccel is not available, however the issues > outlined above far outweigh this.

Re: [FFmpeg-devel] forcing ints to be 64 bits, possible new FATE client idea

2015-10-21 Thread James Darnley
On 2015-10-21 12:18, wm4 wrote: > with size_t/ptrdiff_t > being 128 bit, and a new "long long long int" type (I swear, they will > do it, even if that type name looks horrible). Please no! Just require a C99 style uint128_t/int128_t type.

Re: [FFmpeg-devel] forcing ints to be 64 bits, possible new FATE client idea

2015-10-21 Thread James Darnley
On 2015-10-21 14:44, Clément Bœsch wrote: > On Wed, Oct 21, 2015 at 06:00:21AM -0400, Ganesh Ajjanagadde wrote: > [...] >> why don't you spend 5 minutes trying to outline to beginners like me >> what is "actually important" in your view? >> > > According to the first 100 answers of the survey,

Re: [FFmpeg-devel] [PATCH] avutil/mathematics: speed up av_gcd by using Stein's binary GCD algorithm

2015-10-10 Thread James Darnley
On 2015-10-10 23:06, Ganesh Ajjanagadde wrote: > ... Is the greatest common denominator (yes, I had to look that up) actually used anywhere that is slow and needs to be fast? All the uses of 'av_gcd' found by grep appear to be dealing with timing. I see framerate, timebase, scale. I do see uses
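
For readers unfamiliar with the algorithm named in the subject line, here is a minimal sketch of Stein's binary GCD in C. It is not the patch under discussion and not FFmpeg's av_gcd; the function name binary_gcd and the use of unsigned 64-bit operands are assumptions made for illustration only. The idea is to replace the divisions of the Euclidean algorithm with shifts and subtractions.

```c
#include <stdint.h>

/* Hypothetical helper for illustration; NOT FFmpeg's av_gcd.
 * Stein's binary GCD on unsigned 64-bit values. */
static uint64_t binary_gcd(uint64_t a, uint64_t b)
{
    int shift = 0;

    /* gcd(0, x) == x by convention. */
    if (a == 0)
        return b;
    if (b == 0)
        return a;

    /* Strip the factors of two common to both operands and remember them. */
    while (((a | b) & 1) == 0) {
        a >>= 1;
        b >>= 1;
        shift++;
    }

    /* Any remaining factors of two in a are not common, so drop them. */
    while ((a & 1) == 0)
        a >>= 1;

    /* Invariant: a is odd. Make b odd, order the pair, then subtract. */
    while (b != 0) {
        while ((b & 1) == 0)
            b >>= 1;
        if (a > b) {
            uint64_t t = a;
            a = b;
            b = t;
        }
        b -= a; /* difference of two odd numbers is even, or zero when done */
    }

    /* Restore the common power of two. */
    return a << shift;
}
```

With timing-style inputs such as the two audio sample rates 48000 and 44100, binary_gcd(48000, 44100) yields 300. Whether this is measurably faster than a Euclidean loop for such small values is exactly the question raised in the message above.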

Re: [FFmpeg-devel] [PATCH] gitignore: ignore object file temporaries

2015-10-10 Thread James Darnley
On 2015-10-10 00:43, Ganesh Ajjanagadde wrote: > During a build, a lot of *.o-hash files are created - had not noticed > this as they are usually dumped in tmpfs on Linux. However, they > sometimes are present during a long build in the project directory, making it > annoying to commit while the

Re: [FFmpeg-devel] [RFC][PATCH] ffmpeg: add option to transform metadata using iconv

2015-10-10 Thread James Darnley
On 2015-10-09 14:46, Nicolas George wrote: > Le quartidi 4 vendémiaire, an CCXXIV, James Darnley a écrit : >> I can. You should find it attached to this email. I cleaned it up and >> put two test cases of data into the file. You will need Lua and the >> Lua-iconv mod

Re: [FFmpeg-devel] [PATCH 0/7] [RFC] x86 assembly constants

2015-10-03 Thread James Darnley
On 2015-10-03 04:08, Ronald S. Bultje wrote: > Hi, > > On Fri, Oct 2, 2015 at 4:58 PM, Hendrik Leppkes <h.lepp...@gmail.com> wrote: > >> On Fri, Oct 2, 2015 at 7:16 PM, Timothy Gu <timothyg...@gmail.com> wrote: >>> On Fri, Oct 2, 2015 at 10:08 AM James Darn

[FFmpeg-devel] [PATCH 4/7] avcodec: use new constants in assembly

2015-10-02 Thread James Darnley
--- libavcodec/x86/ac3dsp.asm | 2 +- libavcodec/x86/bswapdsp.asm | 3 +-- libavcodec/x86/diracdsp_yasm.asm| 6 +- libavcodec/x86/dwt_yasm.asm | 6 +- libavcodec/x86/h263_loopfilter.asm | 2 +- libavcodec/x86/h264_chromamc.asm

[FFmpeg-devel] [PATCH 6/7] swscale: use new constants in assembly

2015-10-02 Thread James Darnley
--- libswscale/x86/Makefile | 1 + libswscale/x86/constants.asm | 1 + libswscale/x86/output.asm| 5 + tests/ref/fate/source| 1 + 4 files changed, 4 insertions(+), 4 deletions(-) create mode 100644 libswscale/x86/constants.asm diff --git a/libswscale/x86/Makefile

[FFmpeg-devel] [PATCH 7/7] fixup! avfilter: use new constants in assembly

2015-10-02 Thread James Darnley
--- tests/ref/fate/source | 1 + 1 file changed, 1 insertion(+) diff --git a/tests/ref/fate/source b/tests/ref/fate/source index 781f4cd..c1383dd 100644 --- a/tests/ref/fate/source +++ b/tests/ref/fate/source @@ -9,6 +9,7 @@ libavcodec/reverse.c libavcodec/x86/constants.asm

Re: [FFmpeg-devel] [PATCH 0/7] [RFC] x86 assembly constants

2015-10-02 Thread James Darnley
On 2015-10-02 19:16, Timothy Gu wrote: > On Fri, Oct 2, 2015 at 10:08 AM James Darnley <james.darn...@gmail.com> > wrote: > >> The third patch uses them in the remaining inline assembly. >> > > That's the crux of the problem: inline asm uses these constants, but

[FFmpeg-devel] [PATCH 1/7] avutil: add shared assembly constants

2015-10-02 Thread James Darnley
--- So here is the test file I was working on with the thoughts I had. --- ; This section is intended to possibly be included in x86inc.asm ; Align all constant to 32 bytes whether they are used in AVX code or not. %assign constant_align 32 ; Value to be used as padding to achieve alignment.

[FFmpeg-devel] [PATCH 3/7] avcodec: use new constants in C inline assembly

2015-10-02 Thread James Darnley
--- libavcodec/x86/cavsdsp.c| 2 +- libavcodec/x86/constants.h | 66 - libavcodec/x86/inline_asm.h | 2 +- libavcodec/x86/vc1dsp_mmx.c | 2 +- 4 files changed, 3 insertions(+), 69 deletions(-) delete mode 100644 libavcodec/x86/constants.h diff

[FFmpeg-devel] [PATCH 5/7] avfilter: use new constants in assembly

2015-10-02 Thread James Darnley
--- libavfilter/x86/Makefile | 2 ++ libavfilter/x86/af_volume.asm | 3 +-- libavfilter/x86/constants.asm | 1 + libavfilter/x86/vf_fspp.asm| 3 +-- libavfilter/x86/vf_removegrain.asm | 3 +-- libavfilter/x86/vf_ssim.asm| 2 +- libavfilter/x86/vf_yadif.asm

[FFmpeg-devel] [PATCH 2/7] avcodec: replace old C file with new assembly constants

2015-10-02 Thread James Darnley
--- libavcodec/x86/Makefile | 2 +- libavcodec/x86/constants.asm | 1 + libavcodec/x86/constants.c | 81 tests/ref/fate/source| 1 + 4 files changed, 3 insertions(+), 82 deletions(-) create mode 100644 libavcodec/x86/constants.asm

[FFmpeg-devel] [PATCH 0/7] [RFC] x86 assembly constants

2015-10-02 Thread James Darnley
that it would eliminate those almost pointless files. -- James Darnley (7): avutil: add shared assembly constants avcodec: replace old C file with new assembly constants avcodec: use new constants in C inline assembly avcodec: use new constants in assembly avfilter: use new constants in assembly

Re: [FFmpeg-devel] [PATCH] avfilter/vf_maskedmerge: add SIMD for maskedmerge with 8 bit depth input

2015-10-01 Thread James Darnley
On 2015-10-01 19:25, Paul B Mahol wrote: > +cglobal maskedmerge8, 10, 11, 3, 0, bsrc, blinesize, osrc, olinesize, msrc, > mlinesize, dst, dlinesize, w, h You need a guard to prevent this being compiled on x86. > +lea bsrcq, [bsrcq+blinesizeq] > +lea osrcq, [osrcq+olinesizeq] > +lea

Re: [FFmpeg-devel] [PATCH 2/2] tests/fate/source-check.sh: Check for common typos

2015-09-29 Thread James Darnley
On 2015-09-29 21:56, Clément Bœsch wrote: > On Tue, Sep 29, 2015 at 09:21:53PM +0200, Hendrik Leppkes wrote: >> I agree, we have patchcheck for typo checking. > > A lot of people do not run patchcheck (I personally never do, and given > that we fix typo on a regular basis I'm probably not the
