Re: [libav-devel] [PATCH 1/2] hevc/x86: Add add_residual

2016-10-16 Thread Luca Barbato
On 10/13/16 16:02, Alexandra Hájková wrote: From: Pierre Edouard Lepere Initially written by Pierre Edouard Lepere, extended by James Almer. Signed-off-by: Alexandra Hájková

Re: [libav-devel] [PATCH] swscale: Properly load alpha for planar rgb

2016-10-16 Thread Sean McGovern
Hi, On Oct 14, 2016 17:28, "Luca Barbato" wrote: > > On 14/10/2016 23:25, Vittorio Giovara wrote: > > From: Michael Niedermayer > > > > Signed-off-by: Vittorio Giovara > > --- > > This should fix ppc and sun fate tests. > >

[libav-devel] [PATCHv3] arm: vp9: Add NEON loop filters

2016-10-16 Thread Martin Storsjö
This work is sponsored by, and copyright, Google. The implementation tries to have smart handling of cases where no pixels need the full filtering for the 8/16 width filters, skipping both calculation and writeback of the unmodified pixels in those cases. The actual effect of this is hard to test

[libav-devel] [PATCH 09/12] avcodec/huffyuvdsp: Change w to intptr in add_hfyu_median_pred() and add_hfyu_left_pred()

2016-10-16 Thread Janne Grunau
From: Michael Niedermayer This avoids potential issues with the high 32bits being random in x86-64 asm Signed-off-by: Michael Niedermayer Signed-off-by: Janne Grunau --- libavcodec/huffyuvdsp.h | 4 ++--
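For context, a minimal sketch of why the width type matters, not the patch itself. On x86-64, a 32-bit `int` argument only defines the low half of its register, so hand-written asm that uses the full 64-bit register as a counter or offset can pick up undefined upper bits. Declaring the width as `intptr_t` makes the argument register-wide. The function name `add_left_pred_ref` and its body are hypothetical, written as a plain C reference for left prediction under that assumption:

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical C reference for left prediction (illustrative name, not the
 * libavcodec symbol). Each output byte is the running sum of the input
 * bytes, seeded with `left`, wrapping modulo 256. The width `w` is
 * intptr_t so that an asm version may safely use the whole 64-bit
 * register that carries it. */
static int add_left_pred_ref(uint8_t *dst, const uint8_t *src,
                             intptr_t w, int left)
{
    uint8_t acc = (uint8_t)left;

    for (intptr_t i = 0; i < w; i++) {
        acc = (uint8_t)(acc + src[i]); /* wraps at 256 by design */
        dst[i] = acc;
    }
    return acc; /* last predicted value, carried to the next call */
}
```

The C version behaves the same with either type; the difference only shows up once an asm implementation reads the argument as a 64-bit quantity.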

[libav-devel] [PATCH 11/12] x86util: add and use RSHIFT/LSHIFT macros

2016-10-16 Thread Janne Grunau
From: Christophe Gisquet Those macros take a byte number as shift argument, as this argument differs between MMX and SSE2 instructions. Signed-off-by: Michael Niedermayer Signed-off-by: Janne Grunau ---

[libav-devel] [PATCH 01/12] yadif: x86 assembly for 16-bit samples

2016-10-16 Thread Janne Grunau
From: James Darnley This is a fairly dumb copy of the assembly for 8-bit samples but it works and produces identical output to the C version. The options have been tested on an Athlon64 and a Core2Quad. Athlon64: 1810385 decicycles in C, 32726 runs, 42 skips 1080744

[libav-devel] [PATCH 02/12] yadif: x86 assembly for 9 to 14-bit samples

2016-10-16 Thread Janne Grunau
From: James Darnley These smaller samples do not need to be unpacked to double words allowing the code to process more pixels every iteration (still 2 in MMX but 6 in SSE2). It also avoids emulating the missing double word instructions on older instruction sets. Like

[libav-devel] [PATCH 05/12] yadif: remove an 'm' from the LOAD macro definition

2016-10-16 Thread Janne Grunau
From: James Darnley Signed-off-by: Michael Niedermayer Signed-off-by: Janne Grunau --- libavfilter/x86/vf_yadif.asm | 28 ++-- libavfilter/x86/yadif-10.asm | 26 +-

[libav-devel] [PATCH 12/12] x86/yadif-10: remove duplicate ABS macro

2016-10-16 Thread Janne Grunau
From: James Almer And use the x86util ones instead, which are optimized for mmxext/sse2. About ~1% increase in performance on pre SSSE3 processors. Signed-off-by: James Almer Signed-off-by: Michael Niedermayer Signed-off-by: Janne Grunau

[libav-devel] [PATCH 07/12] x86: huffyuvdsp: port add_bytes to yasm

2016-10-16 Thread Janne Grunau
From: Christophe Gisquet C MMX SSE2 Cycles: 2972 587 302 Signed-off-by: Michael Niedermayer Signed-off-by: Janne Grunau --- libavcodec/huffyuvdsp.h | 2 +- libavcodec/huffyuvdsp.c

[libav-devel] [PATCH 04/12] yadif: remove repeated check on width

2016-10-16 Thread Janne Grunau
From: James Darnley The filter already checks that width (and height) are greater than 3. Signed-off-by: Michael Niedermayer Signed-off-by: Janne Grunau --- libavfilter/x86/vf_yadif.asm | 2 -- libavfilter/x86/yadif-10.asm |

[libav-devel] [PATCH 08/12] x86: huffyuvdsp: add SSE2 median prediction

2016-10-16 Thread Janne Grunau
From: Christophe Gisquet From 5010c to 4566 on lagarith YUY2. Signed-off-by: Michael Niedermayer Signed-off-by: Janne Grunau --- libavcodec/x86/huffyuvdsp_init.c | 4 ++ libavcodec/x86/huffyuvdsp.asm| 98

[libav-devel] [PATCH 10/12] x86: huffyuvdsp: fewer functions for x86_64

2016-10-16 Thread Janne Grunau
From: Christophe Gisquet When there are 2 functions that are <= SSE2, only one is needed for x86_64. Signed-off-by: Michael Niedermayer Signed-off-by: Janne Grunau --- libavcodec/x86/huffyuvdsp_init.c | 10 ++

[libav-devel] Sync yadif high bit depth and huffyuv x86 asm changes from FFmpeg

2016-10-16 Thread Janne Grunau
Hi, This series brings in the yadif high bit depth x86 asm from FFmpeg. Since '[PATCH 11/12] x86util: add and use RSHIFT/LSHIFT macros' depends on huffyuv asm changes and those looked self-contained I brought them over as well. I tried to keep the changes to the original patches as small as

[libav-devel] [PATCH 03/12] fate: yadif: add >8 bit tests

2016-10-16 Thread Janne Grunau
From: Christophe Gisquet Signed-off-by: Janne Grunau --- tests/fate/filter-video.mak | 6 ++ tests/ref/fate/filter-yadif10 | 31 +++ tests/ref/fate/filter-yadif16 | 31 +++ 3

[libav-devel] [PATCH 06/12] Change license of yadif from GPL to LGPL

2016-10-16 Thread Janne Grunau
From: Robert Krüger Signed-off-by: Robert Krüger Signed-off-by: Michael Niedermayer Signed-off-by: Janne Grunau --- libavfilter/x86/yadif-10.asm | 18 +- libavfilter/x86/yadif-16.asm | 18

[libav-devel] [PATCH] vaapi_h265: Include header for slice types

2016-10-16 Thread Mark Thompson
The include was changed correctly in 4abe3b049d987420eb891f74a35af2cebbf52144, but then mistakenly changed back by c359d624d3efc3fd1d83210d78c4152bd329b765 (it's not just the NAL unit types which are used). --- Building is currently broken with libva supporting H.265, this fixes it.

Re: [libav-devel] [PATCH 06/12] Change license of yadif from GPL to LGPL

2016-10-16 Thread Luca Barbato
On 16/10/2016 20:10, Janne Grunau wrote: > From: Robert Krüger > > Signed-off-by: Robert Krüger > Signed-off-by: Michael Niedermayer > Signed-off-by: Janne Grunau > --- > libavfilter/x86/yadif-10.asm | 18

Re: [libav-devel] [PATCHv3] arm: vp9: Add NEON loop filters

2016-10-16 Thread Luca Barbato
On 16/10/2016 23:23, Martin Storsjö wrote: > On Sun, 16 Oct 2016, Luca Barbato wrote: > >> On 16/10/2016 22:18, Martin Storsjö wrote: >>> >>> Now the comparison to libvpx is much more close; we're rarely slower >>> at all, and even much faster in some cases. >> >> Probably you could update the

Re: [libav-devel] [PATCH 05/12] yadif: remove an 'm' from the LOAD macro definition

2016-10-16 Thread Luca Barbato
On 16/10/2016 20:10, Janne Grunau wrote: > From: James Darnley > > Signed-off-by: Michael Niedermayer > Signed-off-by: Janne Grunau > --- > libavfilter/x86/vf_yadif.asm | 28 ++-- >

Re: [libav-devel] [PATCHv3] arm: vp9: Add NEON loop filters

2016-10-16 Thread Luca Barbato
On 16/10/2016 22:18, Martin Storsjö wrote: > > Now the comparison to libvpx is much more close; we're rarely slower > at all, and even much faster in some cases. Probably you could update the statement in the commit and push it then =) lu ___

Re: [libav-devel] [PATCH 04/12] yadif: remove repeated check on width

2016-10-16 Thread Luca Barbato
On 16/10/2016 20:10, Janne Grunau wrote: > From: James Darnley > > The filter already checks that width (and height) are greater than 3. > > Signed-off-by: Michael Niedermayer > Signed-off-by: Janne Grunau > --- >

Re: [libav-devel] [PATCHv3] arm: vp9: Add NEON loop filters

2016-10-16 Thread Martin Storsjö
On Sun, 16 Oct 2016, Luca Barbato wrote: On 16/10/2016 22:18, Martin Storsjö wrote: Now the comparison to libvpx is much more close; we're rarely slower at all, and even much faster in some cases. Probably you could update the statement in the commit and push it then =) I've already

Re: [libav-devel] [PATCH] vaapi_h265: Include header for slice types

2016-10-16 Thread Luca Barbato
On 17/10/2016 01:03, Mark Thompson wrote: > The include was changed correctly in 4abe3b049d987420eb891f74a35af2cebbf52144, > but then mistakenly changed back by c359d624d3efc3fd1d83210d78c4152bd329b765 > (it's not just the NAL unit types which are used). > --- > Building is currently broken with