Re: [libav-devel] [PATCH] fate: Add tests for mov display matrix

2016-10-17 Thread Diego Biurrun
On Mon, Oct 17, 2016 at 05:57:24PM -0400, Vittorio Giovara wrote:
> Rotation, aspect ratio and pure matrix export.
> 
> Signed-off-by: Vittorio Giovara 
> ---
> Updated according to Diego's review.
> I haven't split the AR tests, I think it makes sense to have it in a single
> one, but if people strongly prefer the split version, I'll change it.
> Vittorio

Please split it.

Diego
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [PATCH 1/2] hevc/x86: Add add_residual

2016-10-17 Thread Luca Barbato
On 13/10/2016 20:04, Diego Biurrun wrote:
> This does not match the conditions in the .asm file.

Something along those lines seems to work on x86_32 according to
checkasm. Folded in my github tree.

diff --git a/libavcodec/x86/hevcdsp_init.c b/libavcodec/x86/hevcdsp_init.c
index 73279c2..d60ae5e 100644
--- a/libavcodec/x86/hevcdsp_init.c
+++ b/libavcodec/x86/hevcdsp_init.c
@@ -337,6 +337,9 @@ void ff_hevc_dsp_init_x86(HEVCDSPContext *c, const int bit_depth)
             c->add_residual[2] = ff_hevc_add_residual_16_8_avx;
             c->add_residual[3] = ff_hevc_add_residual_32_8_avx;
         }
+        if (EXTERNAL_AVX2(cpu_flags)) {
+            c->add_residual[3] = ff_hevc_add_residual_32_8_avx2;
+        }
     } else if (bit_depth == 10) {
         if (EXTERNAL_MMXEXT(cpu_flags)) {
             c->idct_dc[0] = ff_hevc_idct_4x4_dc_10_mmxext;
@@ -370,6 +373,10 @@ void ff_hevc_dsp_init_x86(HEVCDSPContext *c, const int bit_depth)
             c->idct[0] = ff_hevc_idct_4x4_10_avx;
             c->idct[1] = ff_hevc_idct_8x8_10_avx;
         }
+        if (EXTERNAL_AVX2(cpu_flags)) {
+            c->add_residual[2] = ff_hevc_add_residual_16_10_avx2;
+            c->add_residual[3] = ff_hevc_add_residual_32_10_avx2;
+        }
     }

 #if ARCH_X86_64
@@ -401,8 +408,6 @@ void ff_hevc_dsp_init_x86(HEVCDSPContext *c, const int bit_depth)
         if (EXTERNAL_AVX2(cpu_flags)) {
             c->idct_dc[2] = ff_hevc_idct_16x16_dc_8_avx2;
             c->idct_dc[3] = ff_hevc_idct_32x32_dc_8_avx2;
-
-            c->add_residual[3] = ff_hevc_add_residual_32_8_avx2;
         }
     } else if (bit_depth == 10) {
         if (EXTERNAL_SSE2(cpu_flags)) {
@@ -434,9 +439,6 @@ void ff_hevc_dsp_init_x86(HEVCDSPContext *c, const int bit_depth)
         if (EXTERNAL_AVX2(cpu_flags)) {
             c->idct_dc[2] = ff_hevc_idct_16x16_dc_10_avx2;
             c->idct_dc[3] = ff_hevc_idct_32x32_dc_10_avx2;
-
-            c->add_residual[2] = ff_hevc_add_residual_16_10_avx2;
-            c->add_residual[3] = ff_hevc_add_residual_32_10_avx2;
         }
     }
 #endif /* ARCH_X86_64 */
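
The diff above moves the AVX2 add_residual assignments out of the x86_64-only section and places them after the AVX block. The ordering matters because each later assignment overwrites the earlier one. A minimal sketch of that dispatch pattern follows; the DEMO_* flags, DemoDSPContext and the empty demo_add_res_32_* functions are hypothetical stand-ins, not Libav's actual definitions.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical CPU feature bits for this sketch; not Libav's AV_CPU_FLAG_* values. */
#define DEMO_CPU_AVX  (1 << 0)
#define DEMO_CPU_AVX2 (1 << 1)

typedef void (*add_residual_fn)(uint8_t *dst, int16_t *res, ptrdiff_t stride);

/* Hypothetical context: index 0..3 stands for 4x4, 8x8, 16x16, 32x32 blocks. */
typedef struct DemoDSPContext {
    add_residual_fn add_residual[4];
} DemoDSPContext;

/* Empty placeholders standing in for the C, AVX and AVX2 implementations. */
static void demo_add_res_32_c(uint8_t *dst, int16_t *res, ptrdiff_t stride)
{
    (void)dst; (void)res; (void)stride;
}

static void demo_add_res_32_avx(uint8_t *dst, int16_t *res, ptrdiff_t stride)
{
    (void)dst; (void)res; (void)stride;
}

static void demo_add_res_32_avx2(uint8_t *dst, int16_t *res, ptrdiff_t stride)
{
    (void)dst; (void)res; (void)stride;
}

/* Checks run from least to most capable, so the best available version
 * is the last one written into the table. */
static void demo_dsp_init(DemoDSPContext *c, int cpu_flags)
{
    c->add_residual[3] = demo_add_res_32_c;
    if (cpu_flags & DEMO_CPU_AVX)
        c->add_residual[3] = demo_add_res_32_avx;
    if (cpu_flags & DEMO_CPU_AVX2)
        c->add_residual[3] = demo_add_res_32_avx2;
}
```

With both flags set, the AVX2 pointer ends up in the table; with only AVX set, the AVX one does.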


Re: [libav-devel] [PATCH] fate: Add tests for mov display matrix

2016-10-17 Thread Diego Biurrun
Looking at this in more detail now that the first round of review is over.

On Mon, Oct 17, 2016 at 11:54:04AM -0400, Vittorio Giovara wrote:
> --- a/tests/fate-run.sh
> +++ b/tests/fate-run.sh
> @@ -76,6 +76,11 @@ probefmt(){
>  
> +probear(){
> +run avprobe -show_stream_entry sample_aspect_ratio -v 0 "$@"
> +run avprobe -show_stream_entry display_aspect_ratio -v 0 "$@"
> +}
> +
> --- a/tests/fate/probe.mak
> +++ b/tests/fate/probe.mak
> @@ -16,3 +16,16 @@ fate-probe-format: $(FATE_PROBE_FORMAT)
>  $(FATE_PROBE_FORMAT): avprobe$(EXESUF)
> +$(FATE_MOV): avprobe$(EXESUF)

You are duplicating the dependency list, don't.

> +FATE_MOV += fate-mov-display-matrix
> +fate-mov-display-matrix: CMD = run avprobe -v 0 -show_stream_entry matrix $(TARGET_SAMPLES)/mov/displaymatrix.mov
> +
> +FATE_MOV += fate-mov-rotation
> +fate-mov-rotation: CMD = run avprobe -v 0 -show_stream_entry rotation $(TARGET_SAMPLES)/mov/displaymatrix.mov
> +
> +FATE_MOV += fate-mov-ar
> +fate-mov-ar: CMD = probear $(TARGET_SAMPLES)/mov/displaymatrix.mov

This is inconsistent.  You added a convenience function for the last
test, but not for the first two.  You could simply run the last test
"manually" and then separate the tests for sar and dar - and separating
tests is always a good thing IMO.  Or you could add a convenience
function that can be used in all tests.

Diego


Re: [libav-devel] [PATCH] fate: Add tests for mov display matrix

2016-10-17 Thread Vittorio Giovara
On Mon, Oct 17, 2016 at 1:30 PM, Diego Biurrun  wrote:
> Looking at this in more detail now that the first round of review is over.
>
> On Mon, Oct 17, 2016 at 11:54:04AM -0400, Vittorio Giovara wrote:
>> --- a/tests/fate-run.sh
>> +++ b/tests/fate-run.sh
>> @@ -76,6 +76,11 @@ probefmt(){
>>
>> +probear(){
>> +run avprobe -show_stream_entry sample_aspect_ratio -v 0 "$@"
>> +run avprobe -show_stream_entry display_aspect_ratio -v 0 "$@"
>> +}
>> +
>> --- a/tests/fate/probe.mak
>> +++ b/tests/fate/probe.mak
>> @@ -16,3 +16,16 @@ fate-probe-format: $(FATE_PROBE_FORMAT)
>>  $(FATE_PROBE_FORMAT): avprobe$(EXESUF)
>> +$(FATE_MOV): avprobe$(EXESUF)
>
> You are duplicating the dependency list, don't.

How do you mean? Is it enough to do
$(FATE_PROBE_FORMAT) += $(FATE_MOV)

>> +FATE_MOV += fate-mov-display-matrix
>> +fate-mov-display-matrix: CMD = run avprobe -v 0 -show_stream_entry matrix $(TARGET_SAMPLES)/mov/displaymatrix.mov
>> +
>> +FATE_MOV += fate-mov-rotation
>> +fate-mov-rotation: CMD = run avprobe -v 0 -show_stream_entry rotation $(TARGET_SAMPLES)/mov/displaymatrix.mov
>> +
>> +FATE_MOV += fate-mov-ar
>> +fate-mov-ar: CMD = probear $(TARGET_SAMPLES)/mov/displaymatrix.mov
>
> This is inconsistent.  You added a convenience function for the last
> test, but not for the first two.  You could simply run the last test
> "manually" and then separate the tests for sar and dar - and separating
> tests is always a good thing IMO.  Or you could add a convenience
> function that can be used in all tests.

I know it's inconsistent, but I moved the AR tests into a separate
function because I thought it might be usable in other tests, while
rotation and displaymatrix are meaningful only for
mov.
-- 
Vittorio


Re: [libav-devel] [PATCH] fate: Add tests for mov display matrix

2016-10-17 Thread Diego Biurrun
On Mon, Oct 17, 2016 at 01:47:32PM -0400, Vittorio Giovara wrote:
> On Mon, Oct 17, 2016 at 1:30 PM, Diego Biurrun  wrote:
> > Looking at this in more detail now that the first round of review is over.
> >
> > On Mon, Oct 17, 2016 at 11:54:04AM -0400, Vittorio Giovara wrote:
> >> --- a/tests/fate-run.sh
> >> +++ b/tests/fate-run.sh
> >> @@ -76,6 +76,11 @@ probefmt(){
> >>
> >> +probear(){
> >> +run avprobe -show_stream_entry sample_aspect_ratio -v 0 "$@"
> >> +run avprobe -show_stream_entry display_aspect_ratio -v 0 "$@"
> >> +}
> >> +
> >> --- a/tests/fate/probe.mak
> >> +++ b/tests/fate/probe.mak
> >> @@ -16,3 +16,16 @@ fate-probe-format: $(FATE_PROBE_FORMAT)
> >>  $(FATE_PROBE_FORMAT): avprobe$(EXESUF)
> >> +$(FATE_MOV): avprobe$(EXESUF)
> >
> > You are duplicating the dependency list, don't.
> 
> How do you mean? Is it enough to do
> $(FATE_PROBE_FORMAT) += $(FATE_MOV)

$(FATE_PROBE_FORMAT) $(FATE_MOV): avprobe$(EXESUF)

> >> +FATE_MOV += fate-mov-display-matrix
> >> +fate-mov-display-matrix: CMD = run avprobe -v 0 -show_stream_entry matrix $(TARGET_SAMPLES)/mov/displaymatrix.mov
> >> +
> >> +FATE_MOV += fate-mov-rotation
> >> +fate-mov-rotation: CMD = run avprobe -v 0 -show_stream_entry rotation $(TARGET_SAMPLES)/mov/displaymatrix.mov
> >> +
> >> +FATE_MOV += fate-mov-ar
> >> +fate-mov-ar: CMD = probear $(TARGET_SAMPLES)/mov/displaymatrix.mov
> >
> > This is inconsistent.  You added a convenience function for the last
> > test, but not for the first two.  You could simply run the last test
> > "manually" and then separate the tests for sar and dar - and separating
> > tests is always a good thing IMO.  Or you could add a convenience
> > function that can be used in all tests.
> 
> I know it's inconsistent, but I moved the AR tests into a separate
> function because I thought it might be usable in other tests, while
> rotation and displaymatrix are meaningful only for
> mov.

Just create one helper function for -show_stream_entry.

Diego


[libav-devel] [PATCH 2/2] hevc: x86: Add add_residual optimizations

2016-10-17 Thread Luca Barbato
From: Pierre Edouard Lepere 

Initially written by Pierre Edouard Lepere,
extended by James Almer.

Signed-off-by: Alexandra Hájková 
Signed-off-by: Luca Barbato 
---
 libavcodec/x86/Makefile |   3 +-
 libavcodec/x86/hevc_add_res.asm | 391 
 libavcodec/x86/hevcdsp_init.c   |  40 
 3 files changed, 433 insertions(+), 1 deletion(-)
 create mode 100644 libavcodec/x86/hevc_add_res.asm

diff --git a/libavcodec/x86/Makefile b/libavcodec/x86/Makefile
index a38535b..7574085 100644
--- a/libavcodec/x86/Makefile
+++ b/libavcodec/x86/Makefile
@@ -117,7 +117,8 @@ YASM-OBJS-$(CONFIG_DCA_DECODER)+= x86/dcadsp.o
 YASM-OBJS-$(CONFIG_DNXHD_ENCODER)  += x86/dnxhdenc.o
 YASM-OBJS-$(CONFIG_HEVC_DECODER)   += x86/hevc_deblock.o\
   x86/hevc_mc.o \
-  x86/hevc_idct.o
+  x86/hevc_idct.o   \
+  x86/hevc_add_res.o
 YASM-OBJS-$(CONFIG_PNG_DECODER)+= x86/pngdsp.o
 YASM-OBJS-$(CONFIG_PRORES_DECODER) += x86/proresdsp.o
 YASM-OBJS-$(CONFIG_RV40_DECODER)   += x86/rv40dsp.o
diff --git a/libavcodec/x86/hevc_add_res.asm b/libavcodec/x86/hevc_add_res.asm
new file mode 100644
index 000..0e6706b
--- /dev/null
+++ b/libavcodec/x86/hevc_add_res.asm
@@ -0,0 +1,391 @@
+; *
+; * Provide SIMD optimizations for add_residual functions for HEVC decoding
+; * Copyright (c) 2014 Pierre-Edouard LEPERE
+; *
+; * This file is part of Libav.
+; *
+; * Libav is free software; you can redistribute it and/or
+; * modify it under the terms of the GNU Lesser General Public
+; * License as published by the Free Software Foundation; either
+; * version 2.1 of the License, or (at your option) any later version.
+; *
+; * Libav is distributed in the hope that it will be useful,
+; * but WITHOUT ANY WARRANTY; without even the implied warranty of
+; * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+; * Lesser General Public License for more details.
+; *
+; * You should have received a copy of the GNU Lesser General Public
+; * License along with Libav; if not, write to the Free Software
+; * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+; 
**
+
+%include "libavutil/x86/x86util.asm"
+
+SECTION_RODATA 32
+max_pixels_10:  times 16  dw ((1 << 10)-1)
+
+SECTION .text
+
+; the add_res macros and functions were largely inspired by x264 project's code in the h264_idct.asm file
+%macro ADD_RES_MMX_4_8 0
+mova  m2, [r1]
+mova  m4, [r1+8]
+pxor  m3, m3
+psubw m3, m2
+packuswb  m2, m2
+packuswb  m3, m3
+pxor  m5, m5
+psubw m5, m4
+packuswb  m4, m4
+packuswb  m5, m5
+
+movh  m0, [r0 ]
+movh  m1, [r0+r2  ]
+paddusb   m0, m2
+paddusb   m1, m4
+psubusb   m0, m3
+psubusb   m1, m5
+movh   [r0 ], m0
+movh   [r0+r2  ], m1
+%endmacro
+
+
+INIT_MMX mmxext
+; void ff_hevc_add_residual_4_8_mmxext(uint8_t *dst, int16_t *coeffs, ptrdiff_t stride)
+cglobal hevc_add_residual_4_8, 3, 4, 6
+ADD_RES_MMX_4_8
+add   r1, 16
+lea   r0, [r0+r2*2]
+ADD_RES_MMX_4_8
+RET
+
+%macro ADD_RES_SSE_8_8 0
+pxor  m3, m3
+mova  m4, [r1]
+mova  m6, [r1+16]
+mova  m0, [r1+32]
+mova  m2, [r1+48]
+psubw m5, m3, m4
+psubw m7, m3, m6
+psubw m1, m3, m0
+packuswb  m4, m0
+packuswb  m5, m1
+psubw m3, m2
+packuswb  m6, m2
+packuswb  m7, m3
+
+movqm0, [r0 ]
+movqm1, [r0+r2  ]
+movhps  m0, [r0+r2*2]
+movhps  m1, [r0+r3  ]
+paddusb m0, m4
+paddusb m1, m6
+psubusb m0, m5
+psubusb m1, m7
+movq [r0 ], m0
+movq [r0+r2  ], m1
+movhps   [r0+2*r2], m0
+movhps   [r0+r3  ], m1
+%endmacro
+
+%macro ADD_RES_SSE_16_32_8 3
+mova xm2, [r1+%1   ]
+mova xm6, [r1+%1+16]
+%if cpuflag(avx2)
+vinserti128   m2, m2, [r1+%1+32], 1
+vinserti128   m6, m6, [r1+%1+48], 1
+%endif
+%if cpuflag(avx)
+psubw m1, m0, m2
+psubw m5, m0, m6
+%else
+mova  m1, m0
+mova
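
The 8-bit macros quoted above appear to clip without widening the pixels: the residual and its negation are each packed to unsigned bytes, then added and subtracted with saturation. The scalar model below is one reading of that sequence, not the author's reference code; sat_u8, add_res_saturating and add_res_reference are names invented for this sketch. It uses plain int arithmetic, so it does not model 16-bit wraparound when negating -32768.

```c
#include <stdint.h>

/* Clamp an int to the 0..255 range of an 8-bit pixel. */
static uint8_t sat_u8(int v)
{
    return v < 0 ? 0 : v > 255 ? 255 : (uint8_t)v;
}

/* Scalar model of the packuswb / paddusb / psubusb sequence. */
static uint8_t add_res_saturating(uint8_t dst, int16_t res)
{
    uint8_t pos = sat_u8(res);       /* packuswb on res: negatives become 0   */
    uint8_t neg = sat_u8(-res);      /* packuswb on (0 - res): positives -> 0 */
    uint8_t sum = sat_u8(dst + pos); /* paddusb: saturating unsigned add      */
    return sat_u8(sum - neg);        /* psubusb: saturating unsigned subtract */
}

/* Straightforward definition: add in a wider type, then clip. */
static uint8_t add_res_reference(uint8_t dst, int16_t res)
{
    return sat_u8(dst + res);
}
```

Only one of pos and neg is ever non-zero, which is why the two saturating steps together give the same result as a single clipped add.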

[libav-devel] [PATCH] fate: Add tests for mov display matrix

2016-10-17 Thread Vittorio Giovara
Rotation, aspect ratio and pure matrix export.

Signed-off-by: Vittorio Giovara 
---
Updated according to Diego's review.
I haven't split the AR tests, I think it makes sense to have it in a single
one, but if people strongly prefer the split version, I'll change it.
Vittorio

 tests/fate-run.sh |  4 
 tests/fate/probe.mak  | 14 +-
 tests/ref/fate/mov-ar |  2 ++
 tests/ref/fate/mov-display-matrix |  9 +
 tests/ref/fate/mov-rotation   |  1 +
 5 files changed, 29 insertions(+), 1 deletion(-)
 create mode 100644 tests/ref/fate/mov-ar
 create mode 100644 tests/ref/fate/mov-display-matrix
 create mode 100644 tests/ref/fate/mov-rotation

diff --git a/tests/fate-run.sh b/tests/fate-run.sh
index d11ca3c..b1b299a 100755
--- a/tests/fate-run.sh
+++ b/tests/fate-run.sh
@@ -76,6 +76,10 @@ probefmt(){
 run avprobe -show_format_entry format_name -v 0 "$@"
 }
 
+probestream(){
+run avprobe -show_stream_entry "$1" -v 0 "$2"
+}
+
 avconv(){
 dec_opts="-hwaccel $hwaccel -threads $threads -thread_type $thread_type"
 avconv_args="-nostats -cpuflags $cpuflags"
diff --git a/tests/fate/probe.mak b/tests/fate/probe.mak
index 376dfdd..8b9b5dd 100644
--- a/tests/fate/probe.mak
+++ b/tests/fate/probe.mak
@@ -13,6 +13,18 @@ fate-probe-format-roundup2015: REF = dv
 FATE_SAMPLES-$(CONFIG_AVPROBE) += $(FATE_PROBE_FORMAT)
 fate-probe-format: $(FATE_PROBE_FORMAT)
 
-$(FATE_PROBE_FORMAT): avprobe$(EXESUF)
+FATE_MOV += fate-mov-display-matrix
+fate-mov-display-matrix: CMD = probestream matrix $(TARGET_SAMPLES)/mov/displaymatrix.mov
+
+FATE_MOV += fate-mov-rotation
+fate-mov-rotation: CMD = probestream rotation $(TARGET_SAMPLES)/mov/displaymatrix.mov
+
+FATE_MOV += fate-mov-ar
+fate-mov-ar: CMD = probestream sample_aspect_ratio $(TARGET_SAMPLES)/mov/displaymatrix.mov && probestream display_aspect_ratio $(TARGET_SAMPLES)/mov/displaymatrix.mov
+
+FATE_SAMPLES-$(call ALLYES, AVPROBE MOV_DEMUXER) += $(FATE_MOV)
+fate-mov: $(FATE_MOV)
+
+$(FATE_PROBE_FORMAT) $(FATE_MOV): avprobe$(EXESUF)
 $(FATE_PROBE_FORMAT): CMP = oneline
 fate-probe-format-%: CMD = probefmt $(TARGET_SAMPLES)/probe-format/$(@:fate-probe-format-%=%)
diff --git a/tests/ref/fate/mov-ar b/tests/ref/fate/mov-ar
new file mode 100644
index 000..13cbd9c
--- /dev/null
+++ b/tests/ref/fate/mov-ar
@@ -0,0 +1,2 @@
+9:2
+3:1
diff --git a/tests/ref/fate/mov-display-matrix b/tests/ref/fate/mov-display-matrix
new file mode 100644
index 000..64c9599
--- /dev/null
+++ b/tests/ref/fate/mov-display-matrix
@@ -0,0 +1,9 @@
+0
+65536
+0
+-65536
+0
+0
+47185920
+0
+1073741824
diff --git a/tests/ref/fate/mov-rotation b/tests/ref/fate/mov-rotation
new file mode 100644
index 000..64ded27
--- /dev/null
+++ b/tests/ref/fate/mov-rotation
@@ -0,0 +1 @@
+-90
-- 
2.10.0
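
The nine reference values in mov-display-matrix read naturally as a row-major 3x3 matrix whose first two columns are 16.16 fixed point and whose third column is 2.30 fixed point. The sketch below decodes them under that assumption. The helper names are invented here, and the sign convention of quarter_turn_degrees is taken only from the two reference files in this patch (this matrix is paired with a rotation of -90); it is not a restatement of avprobe's implementation.

```c
#include <stdint.h>

/* Decode a 16.16 fixed-point value. */
static double fixed16_to_double(int32_t v)
{
    return v / 65536.0;
}

/* Decode a 2.30 fixed-point value. */
static double fixed30_to_double(int32_t v)
{
    return v / 1073741824.0;
}

/* Classify pure quarter-turn matrices. Returns 0 and stores the angle on
 * success, -1 for anything else. The signs of the 90 degree cases are an
 * assumption derived from the reference files, as noted above. */
static int quarter_turn_degrees(const int32_t m[9], int *degrees)
{
    const int32_t one = 1 << 16;
    int32_t a = m[0], b = m[1], c = m[3], d = m[4];

    if (a == one && b == 0 && c == 0 && d == one)
        *degrees = 0;
    else if (a == 0 && b == one && c == -one && d == 0)
        *degrees = -90;
    else if (a == -one && b == 0 && c == 0 && d == -one)
        *degrees = 180;
    else if (a == 0 && b == -one && c == one && d == 0)
        *degrees = 90;
    else
        return -1;
    return 0;
}
```

Read this way, the reference matrix has a translation of 720 in x (47185920 / 65536) and a w term of exactly 1.0 (1073741824 is 2^30).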



Re: [libav-devel] [PATCH] configure: Print warnings after all other output

2016-10-17 Thread Luca Barbato
On 17/10/2016 21:50, Diego Biurrun wrote:
> ---
> 
> Back in the day I would have added "10l to elenril" to the log msg.

Fine even w/out that much cola.



Re: [libav-devel] [PATCH 1/2] checkasm: Add a test for HEVC add_residual

2016-10-17 Thread Martin Storsjö

On Mon, 17 Oct 2016, Luca Barbato wrote:

> From: Alexandra Hájková
>
> ---
> tests/checkasm/Makefile       |  2 +-
> tests/checkasm/checkasm.c     |  1 +
> tests/checkasm/checkasm.h     |  1 +
> tests/checkasm/hevc_add_res.c | 84 +++
> 4 files changed, 87 insertions(+), 1 deletion(-)
> create mode 100644 tests/checkasm/hevc_add_res.c

So, what changed compared to the last version that Alexandra submitted
herself? Didn't we all agree that listing that is the least we can do,
when resubmitting someone (who is still active and present) else's patch
in a modified form?


// Martin

Re: [libav-devel] [PATCH 1/2] checkasm: Add a test for HEVC add_residual

2016-10-17 Thread Luca Barbato
On 17/10/2016 23:45, Martin Storsjö wrote:
> So, what changed compared to the last version that Alexandra submitted
> herself? Didn't we all agree that listing that is the least we can do,
> when resubmitting someone (who is still active and present) else's patch
> in a modified form.

I changed what I found to be wrong (6 instead of 16) and what looked
strange (the filename) as mentioned before.

lu

[libav-devel] [PATCH 1/2] checkasm: Add a test for HEVC add_residual

2016-10-17 Thread Luca Barbato
From: Alexandra Hájková 

---
 tests/checkasm/Makefile   |  2 +-
 tests/checkasm/checkasm.c |  1 +
 tests/checkasm/checkasm.h |  1 +
 tests/checkasm/hevc_add_res.c | 84 +++
 4 files changed, 87 insertions(+), 1 deletion(-)
 create mode 100644 tests/checkasm/hevc_add_res.c

diff --git a/tests/checkasm/Makefile b/tests/checkasm/Makefile
index 9b3df55..ac3e97e 100644
--- a/tests/checkasm/Makefile
+++ b/tests/checkasm/Makefile
@@ -12,7 +12,7 @@ AVCODECOBJS-$(CONFIG_VP8DSP)+= vp8dsp.o
 
 # decoders/encoders
 AVCODECOBJS-$(CONFIG_DCA_DECODER)   += dcadsp.o synth_filter.o
-AVCODECOBJS-$(CONFIG_HEVC_DECODER)  += hevc_mc.o hevc_idct.o
+AVCODECOBJS-$(CONFIG_HEVC_DECODER)  += hevc_mc.o hevc_idct.o hevc_add_res.o
 AVCODECOBJS-$(CONFIG_V210_ENCODER)  += v210enc.o
 AVCODECOBJS-$(CONFIG_VP9_DECODER)   += vp9dsp.o
 
diff --git a/tests/checkasm/checkasm.c b/tests/checkasm/checkasm.c
index 040c4eb..623bbce 100644
--- a/tests/checkasm/checkasm.c
+++ b/tests/checkasm/checkasm.c
@@ -90,6 +90,7 @@ static const struct {
 { "h264qpel", checkasm_check_h264qpel },
 #endif
 #if CONFIG_HEVC_DECODER
+{ "hevc_add_res", checkasm_check_hevc_add_res },
 { "hevc_mc", checkasm_check_hevc_mc },
 { "hevc_idct", checkasm_check_hevc_idct },
 #endif
diff --git a/tests/checkasm/checkasm.h b/tests/checkasm/checkasm.h
index 5a4c056..bacd6f4 100644
--- a/tests/checkasm/checkasm.h
+++ b/tests/checkasm/checkasm.h
@@ -39,6 +39,7 @@ void checkasm_check_fmtconvert(void);
 void checkasm_check_h264dsp(void);
 void checkasm_check_h264pred(void);
 void checkasm_check_h264qpel(void);
+void checkasm_check_hevc_add_res(void);
 void checkasm_check_hevc_idct(void);
 void checkasm_check_hevc_mc(void);
 void checkasm_check_huffyuvdsp(void);
diff --git a/tests/checkasm/hevc_add_res.c b/tests/checkasm/hevc_add_res.c
new file mode 100644
index 000..c242c8c
--- /dev/null
+++ b/tests/checkasm/hevc_add_res.c
@@ -0,0 +1,84 @@
+/*
+ * Copyright (c) 2016 Alexandra Hájková
+ *
+ * This file is part of Libav.
+ *
+ * Libav is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * Libav is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License along
+ * with Libav; if not, write to the Free Software Foundation, Inc.,
+ * 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
+ */
+
+#include <string.h>
+
+#include "libavutil/intreadwrite.h"
+
+#include "libavcodec/hevcdsp.h"
+
+#include "checkasm.h"
+
+#define randomize_buffers(buf, size)            \
+    do {                                        \
+        int j;                                  \
+        for (j = 0; j < size; j++) {            \
+            int16_t r = rnd();                  \
+            AV_WN16A(buf + j, r >> 3);          \
+        }                                       \
+    } while (0)
+
+#define randomize_buffers2(buf, size)             \
+    do {                                          \
+        int j;                                    \
+        for (j = 0; j < size; j++)                \
+            AV_WN16A(buf + j * 2, rnd() & 0x3FF); \
+    } while (0)
+
+static void check_add_res(HEVCDSPContext h, int bit_depth)
+{
+    int i;
+    LOCAL_ALIGNED(32, int16_t, res0, [32 * 32]);
+    LOCAL_ALIGNED(32, int16_t, res1, [32 * 32]);
+    LOCAL_ALIGNED(32, uint8_t, dst0, [32 * 32 * 2]);
+    LOCAL_ALIGNED(32, uint8_t, dst1, [32 * 32 * 2]);
+
+    for (i = 2; i <= 5; i++) {
+        int block_size = 1 << i;
+        int size = block_size * block_size;
+        declare_func_emms(AV_CPU_FLAG_MMX, void, uint8_t *dst, int16_t *res, ptrdiff_t stride);
+
+        randomize_buffers(res0, size);
+        randomize_buffers2(dst0, size * 2);
+        memcpy(res1, res0, sizeof(*res0) * size);
+        memcpy(dst1, dst0, size * 2);
+
+        if (check_func(h.add_residual[i - 2], "add_res_%dx%d_%d", block_size, block_size, bit_depth)) {
+            call_ref(dst0, res0, block_size * 2);
+            call_new(dst1, res1, block_size * 2);
+            if (memcmp(dst0, dst1, size * 2))
+                fail();
+            bench_new(dst1, res1, block_size);
+        }
+    }
+}
+
+void checkasm_check_hevc_add_res(void)
+{
+    int bit_depth;
+
+    for (bit_depth = 8; bit_depth <= 10; bit_depth++) {
+        HEVCDSPContext h;
+
+        ff_hevc_dsp_init(&h, bit_depth);
+        check_add_res(h, bit_depth);
+    }
+    report("add_residual");
+}
-- 
2.9.2
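
The test above runs the reference and the optimized function on identical inputs and compares the destination buffers with memcmp. For orientation, here is a sketch of what a scalar 8-bit add_residual would compute, assuming the operation is a clipped per-pixel add of the residual into the destination block. clip_u8 and add_residual_8bit are names invented for this sketch, and the explicit size parameter is an addition for illustration; the functions in the patch encode the block size in the function name instead.

```c
#include <stddef.h>
#include <stdint.h>

/* Clamp an int to the 0..255 range of an 8-bit pixel. */
static uint8_t clip_u8(int v)
{
    return v < 0 ? 0 : v > 255 ? 255 : (uint8_t)v;
}

/* Add a size x size block of residuals to dst. Rows of dst are stride
 * bytes apart; the residuals are stored contiguously, row after row. */
static void add_residual_8bit(uint8_t *dst, const int16_t *res,
                              ptrdiff_t stride, int size)
{
    for (int y = 0; y < size; y++) {
        for (int x = 0; x < size; x++)
            dst[x] = clip_u8(dst[x] + res[x]);
        dst += stride;
        res += size;
    }
}
```

Bytes between the block width and the stride are left untouched, which is the kind of difference a whole-buffer memcmp would catch if an optimized version wrote past the block.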


Re: [libav-devel] [PATCH] vaapi_h265: Include header for slice types

2016-10-17 Thread Diego Biurrun
On Mon, Oct 17, 2016 at 01:51:53AM +0200, Luca Barbato wrote:
> On 17/10/2016 01:03, Mark Thompson wrote:
> > --- a/libavcodec/vaapi_encode_h265.c
> > +++ b/libavcodec/vaapi_encode_h265.c
> > @@ -25,7 +25,7 @@
> > 
> >  #include "avcodec.h"
> > -#include "hevc.h"
> > +#include "hevcdec.h"
> >  #include "internal.h"
> >  #include "put_bits.h"
> >  #include "vaapi_encode.h"
> > 
> 
> Ok... I wonder if I can setup some of the fate oracles to build with vaapi.

Ping me about it on IRC later please.

Diego


Re: [libav-devel] [PATCH] x86: videodsp: Add parentheses to expression to work around warning

2016-10-17 Thread Luca Barbato
On 17/10/2016 16:07, Diego Biurrun wrote:
> Also visible only on NASM, not YASM.

And it should be reported to the NASM developers, since it seems strange.

lu


[libav-devel] [PATCH] x86: videodsp: Add parentheses to expression to work around warning

2016-10-17 Thread Diego Biurrun
libavcodec/x86/videodsp.asm:128: warning: signed dword value exceeds bounds
---

Also visible only on NASM, not YASM.

 libavcodec/x86/videodsp.asm | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/libavcodec/x86/videodsp.asm b/libavcodec/x86/videodsp.asm
index 53b9e82..b22e0fe 100644
--- a/libavcodec/x86/videodsp.asm
+++ b/libavcodec/x86/videodsp.asm
@@ -110,7 +110,7 @@ cglobal emu_edge_hvar, 5, 6, 1, dst, dst_stride, start_x, n_words, h, w
 .x_loop:                                    ;   do {
     movu    [dstq+wq*2], m0                 ;     write($reg, $mmsize)
     add       wq, mmsize/2                  ;     w -= $mmsize/2
-    cmp       wq, -mmsize/2                 ;   } while (w > $mmsize/2)
+    cmp       wq, -(mmsize/2)               ;   } while (w > $mmsize/2)
     jl .x_loop
     movu  [dstq-mmsize], m0                 ;   write($reg, $mmsize)
     add     dstq, dst_strideq               ;   dst += dst_stride
-- 
2.1.4
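
One plausible explanation for the warning, offered as a sketch rather than a diagnosis: the NASM manual documents `/` as unsigned division (with `//` as the signed form) and gives unary minus higher precedence than division. If `-mmsize/2` is evaluated that way in 64 bits, the negated value is divided as a huge unsigned number and no longer fits a signed dword. The C model below reproduces that arithmetic; the function names and the example value 16 for mmsize are assumptions made for illustration.

```c
#include <stdint.h>

/* Model of "-mmsize/2" when the minus is applied first and the division
 * is unsigned on 64-bit values. */
static uint64_t neg_then_unsigned_div(int64_t mmsize)
{
    return (uint64_t)(-mmsize) / 2u;
}

/* Model of "-(mmsize/2)": divide first, then negate. */
static int64_t div_then_neg(int64_t mmsize)
{
    return -(mmsize / 2);
}

/* Does the value fit a signed 32-bit immediate? */
static int fits_signed_dword(int64_t v)
{
    return v >= INT32_MIN && v <= INT32_MAX;
}
```

For mmsize = 16 the first form yields 0x7FFFFFFFFFFFFFF8, while the parenthesized form yields -8, which is consistent with the warning disappearing after the patch.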



Re: [libav-devel] [PATCH] x86: Add missing colons after assembly labels

2016-10-17 Thread Diego Biurrun
On Thu, Oct 13, 2016 at 10:16:16PM +0200, Diego Biurrun wrote:
> This fixes many warnings of the sort
> warning: label alone on a line without a colon might be in error
> ---
>  libavcodec/x86/audiodsp.asm|  2 +-
>  libavcodec/x86/dcadsp.asm  |  2 +-
>  libavcodec/x86/h264_qpel_10bit.asm |  2 +-
>  libavcodec/x86/h264_weight.asm |  4 ++--
>  libavcodec/x86/hevc_mc.asm | 12 ++--
>  libavcodec/x86/v210enc.asm |  4 ++--
>  libavfilter/x86/vf_interlace.asm   |  2 +-
>  libavutil/x86/imgutils.asm |  4 ++--
>  8 files changed, 16 insertions(+), 16 deletions(-)

ping

Diego


Re: [libav-devel] [PATCH] x86: Add missing colons after assembly labels

2016-10-17 Thread Luca Barbato
On 17/10/2016 15:59, Diego Biurrun wrote:
> On Thu, Oct 13, 2016 at 10:16:16PM +0200, Diego Biurrun wrote:
>> This fixes many warnings of the sort
>> warning: label alone on a line without a colon might be in error
>> ---
>>  libavcodec/x86/audiodsp.asm|  2 +-
>>  libavcodec/x86/dcadsp.asm  |  2 +-
>>  libavcodec/x86/h264_qpel_10bit.asm |  2 +-
>>  libavcodec/x86/h264_weight.asm |  4 ++--
>>  libavcodec/x86/hevc_mc.asm | 12 ++--
>>  libavcodec/x86/v210enc.asm |  4 ++--
>>  libavfilter/x86/vf_interlace.asm   |  2 +-
>>  libavutil/x86/imgutils.asm |  4 ++--
>>  8 files changed, 16 insertions(+), 16 deletions(-)
> 
> ping
> 

Seems not problematic.



Re: [libav-devel] [PATCH 1/3] avprobe: Add -show_stream_entry to get a single stream property

2016-10-17 Thread Vittorio Giovara
On Thu, Oct 13, 2016 at 6:23 PM, Vittorio Giovara
 wrote:
> This is needed for improved fate testing and it is modeled after
> -show_format_entry. The main behavioral difference is that when a print
> function is called with an empty key, rather than discarding it, the
> closest key in the hierarchy is used instead.
>
> Signed-off-by: Vittorio Giovara 
> ---
>  avprobe.c | 56 
>  1 file changed, 56 insertions(+)

ping
-- 
Vittorio


Re: [libav-devel] [PATCH 3/3] mov: Evaluate the movie display matrix

2016-10-17 Thread Vittorio Giovara
On Thu, Oct 13, 2016 at 6:26 PM, Vittorio Giovara
 wrote:
> On Thu, Oct 13, 2016 at 6:23 PM, Vittorio Giovara
>  wrote:
>> This matrix needs to be applied after all others have (currently only
>> display matrix from trak), but cannot be handled in movie box, since
>> streams are not allocated yet.
>>
>> So store it in main context and if not identity, apply it when appropriate,
>> handling the case when trak display matrix is identity and when it is not.
>
> Sent the email with the old commit log, the new one should be
>
> "This matrix needs to be applied after all others have (currently only
> display matrix from trak), but cannot be handled in movie box, since
> streams are not allocated yet. So store it in main context, and apply
> it when appropriate, that is after parsing the tkhd one."

ping
-- 
Vittorio
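
The commit message describes applying the movie matrix after the track (tkhd) matrix. The patch itself is not shown in this thread, so the following is only a sketch of what such a composition could look like. It assumes the QuickTime row-vector convention, where a point [x y 1] is multiplied on the left of the matrix, so "track first, then movie" becomes track x movie. It also uses doubles instead of the fixed-point storage format, and compose_matrices is a name invented here.

```c
#include <stdint.h>

/* out = first * second. With row vectors, p * first * second applies
 * "first" and then "second". out must not alias either input. */
static void compose_matrices(const double first[3][3],
                             const double second[3][3],
                             double out[3][3])
{
    for (int i = 0; i < 3; i++) {
        for (int j = 0; j < 3; j++) {
            double sum = 0.0;
            for (int k = 0; k < 3; k++)
                sum += first[i][k] * second[k][j];
            out[i][j] = sum;
        }
    }
}
```

If either matrix is the identity the result is simply the other one, which matches the old commit log's distinction between identity and non-identity cases.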


[libav-devel] [PATCH] fate: Add tests for mov display matrix

2016-10-17 Thread Vittorio Giovara
Rotation, aspect ratio and pure matrix export.

Signed-off-by: Vittorio Giovara 
---
 tests/fate-run.sh |  5 +
 tests/fate/probe.mak  | 13 +
 tests/ref/fate/mov-ar |  2 ++
 tests/ref/fate/mov-display-matrix |  9 +
 tests/ref/fate/mov-rotation   |  1 +
 5 files changed, 30 insertions(+)
 create mode 100644 tests/ref/fate/mov-ar
 create mode 100644 tests/ref/fate/mov-display-matrix
 create mode 100644 tests/ref/fate/mov-rotation

diff --git a/tests/fate-run.sh b/tests/fate-run.sh
index d11ca3c..d1b4aef 100755
--- a/tests/fate-run.sh
+++ b/tests/fate-run.sh
@@ -76,6 +76,11 @@ probefmt(){
 run avprobe -show_format_entry format_name -v 0 "$@"
 }
 
+probear(){
+run avprobe -show_stream_entry sample_aspect_ratio -v 0 "$@"
+run avprobe -show_stream_entry display_aspect_ratio -v 0 "$@"
+}
+
 avconv(){
 dec_opts="-hwaccel $hwaccel -threads $threads -thread_type $thread_type"
 avconv_args="-nostats -cpuflags $cpuflags"
diff --git a/tests/fate/probe.mak b/tests/fate/probe.mak
index 376dfdd..4659a9a 100644
--- a/tests/fate/probe.mak
+++ b/tests/fate/probe.mak
@@ -16,3 +16,16 @@ fate-probe-format: $(FATE_PROBE_FORMAT)
 $(FATE_PROBE_FORMAT): avprobe$(EXESUF)
 $(FATE_PROBE_FORMAT): CMP = oneline
 fate-probe-format-%: CMD = probefmt $(TARGET_SAMPLES)/probe-format/$(@:fate-probe-format-%=%)
+
+FATE_MOV += fate-mov-display-matrix
+fate-mov-display-matrix: CMD = run avprobe -v 0 -show_stream_entry matrix $(TARGET_SAMPLES)/mov/displaymatrix.mov
+
+FATE_MOV += fate-mov-rotation
+fate-mov-rotation: CMD = run avprobe -v 0 -show_stream_entry rotation $(TARGET_SAMPLES)/mov/displaymatrix.mov
+
+FATE_MOV += fate-mov-ar
+fate-mov-ar: CMD = probear $(TARGET_SAMPLES)/mov/displaymatrix.mov
+
+$(FATE_MOV): avprobe$(EXESUF)
+FATE_SAMPLES-$(call ALLYES, AVPROBE MOV_DEMUXER) += $(FATE_MOV)
+fate-mov: $(FATE_MOV)
diff --git a/tests/ref/fate/mov-ar b/tests/ref/fate/mov-ar
new file mode 100644
index 000..13cbd9c
--- /dev/null
+++ b/tests/ref/fate/mov-ar
@@ -0,0 +1,2 @@
+9:2
+3:1
diff --git a/tests/ref/fate/mov-display-matrix b/tests/ref/fate/mov-display-matrix
new file mode 100644
index 000..64c9599
--- /dev/null
+++ b/tests/ref/fate/mov-display-matrix
@@ -0,0 +1,9 @@
+0
+65536
+0
+-65536
+0
+0
+47185920
+0
+1073741824
diff --git a/tests/ref/fate/mov-rotation b/tests/ref/fate/mov-rotation
new file mode 100644
index 000..64ded27
--- /dev/null
+++ b/tests/ref/fate/mov-rotation
@@ -0,0 +1 @@
+-90
-- 
2.10.0
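
The mov-ar reference holds two lines because probear prints the sample aspect ratio first (9:2) and the display aspect ratio second (3:1). The two are usually related by DAR = SAR x width / height, reduced to lowest terms. The sketch below shows that relation; Ratio, gcd64 and display_aspect_ratio are names invented here, and the 480x720 frame size in the usage note is made up, chosen only because a 2:3 frame is what the two reference values imply.

```c
#include <stdint.h>

typedef struct Ratio {
    int64_t num, den;
} Ratio;

/* Greatest common divisor via Euclid's algorithm. */
static int64_t gcd64(int64_t a, int64_t b)
{
    while (b) {
        int64_t t = a % b;
        a = b;
        b = t;
    }
    return a < 0 ? -a : a;
}

/* DAR = (width * sar.num) : (height * sar.den), reduced. */
static Ratio display_aspect_ratio(int width, int height, Ratio sar)
{
    Ratio dar = { (int64_t)width * sar.num, (int64_t)height * sar.den };
    int64_t g = gcd64(dar.num, dar.den);

    if (g) {
        dar.num /= g;
        dar.den /= g;
    }
    return dar;
}
```

For a hypothetical 480x720 frame with SAR 9:2 this gives 4320:1440, which reduces to 3:1.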



Re: [libav-devel] [PATCH 12/12] x86/yadif-10: remove duplicate ABS macro

2016-10-17 Thread Vittorio Giovara
On Sun, Oct 16, 2016 at 2:11 PM, Janne Grunau  wrote:
> From: James Almer 
>
> And use the x86util ones instead, which are optimized for mmxext/sse2.
> About ~1% increase in performance on pre SSSE3 processors.

Question unrelated to this patch: is there any other place in the tree
where it would make sense to do this? A quick grep shows there is a
duplicate macro in x86/ac3dsp.asm, but I don't know asm well enough to
judge.
Thanks
-- 
Vittorio


Re: [libav-devel] [PATCH] swscale: Properly load alpha for planar rgb

2016-10-17 Thread Vittorio Giovara
On Sun, Oct 16, 2016 at 2:07 PM, Sean McGovern  wrote:
> Hi,
>
> On Oct 14, 2016 17:28, "Luca Barbato"  wrote:
>>
>> On 14/10/2016 23:25, Vittorio Giovara wrote:
>> > From: Michael Niedermayer 
>> >
>> > Signed-off-by: Vittorio Giovara 
>> > ---
>> > This should fix ppc and sun fate tests.
>> > Vittorio
>>
>
> Has this had a spin on oracle to confirm?

It was run on Luca's ppc box and it seemed to work.
-- 
Vittorio


Re: [libav-devel] [PATCH 1/2] hevc/x86: Add add_residual

2016-10-17 Thread Luca Barbato

On 10/16/16 14:00, Luca Barbato wrote:
> On 10/13/16 16:02, Alexandra Hájková wrote:
>> From: Pierre Edouard Lepere

If nobody has a say I'd push it with the mentioned changes.

lu


Re: [libav-devel] [PATCH 1/2] hevc/x86: Add add_residual

2016-10-17 Thread Diego Biurrun
On Mon, Oct 17, 2016 at 09:18:19PM +0200, Luca Barbato wrote:
> On 10/16/16 14:00, Luca Barbato wrote:
> >On 10/13/16 16:02, Alexandra Hájková wrote:
> >>From: Pierre Edouard Lepere 
> 
> If nobody has a say I'd push it with the mentioned changes.

This is much too fuzzy IMO. If you have updated patches, send them.

Diego

[libav-devel] [PATCH] configure: Print warnings after all other output

2016-10-17 Thread Diego Biurrun
---

Back in the day I would have added "10l to elenril" to the log msg.

 configure | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/configure b/configure
index 4c164da..5c224ca 100755
--- a/configure
+++ b/configure
@@ -5395,8 +5395,6 @@ echo "#endif /* AVUTIL_AVCONFIG_H */" >> $TMPH
 
 cp_if_changed $TMPH libavutil/avconfig.h
 
-test -n "$WARNINGS" && printf "\n$WARNINGS"
-
 # generate the lists of enabled components
 print_enabled_components(){
 file=$1
@@ -5414,6 +5412,8 @@ print_enabled_components(){
 print_enabled_components libavcodec/bsf_list.c AVBitStreamFilter bitstream_filters $BSF_LIST
 print_enabled_components libavformat/protocol_list.c URLProtocol url_protocols $PROTOCOL_LIST
 
+test -n "$WARNINGS" && printf "\n$WARNINGS"
+
 # build pkg-config files
 
 lib_version(){
-- 
2.1.4



[libav-devel] [PATCH] aarch64: vp9: Add NEON optimizations of VP9 MC functions

2016-10-17 Thread Martin Storsjö
This work is sponsored by, and copyright, Google.

These are ported from the ARM version; it is essentially a 1:1
port with no extra added features, but with some hand tuning
(especially for the plain copy/avg functions). The ARM version
isn't very register starved to begin with, so there's not much
to be gained from having more spare registers here - we only
avoid having to clobber callee-saved registers.

Examples of runtimes vs the 32 bit version, on a Cortex A53:
                                     ARM   AArch64
vp9_avg4_neon:                      31.2      23.7
vp9_avg8_neon:                      57.0      53.7
vp9_avg16_neon:                    169.4     167.4
vp9_avg32_neon:                    717.3     716.1
vp9_avg64_neon:                   2475.5    2514.2
vp9_avg_8tap_smooth_4h_neon:       131.2     123.0
vp9_avg_8tap_smooth_4hv_neon:      505.3     458.3
vp9_avg_8tap_smooth_4v_neon:       135.0     112.2
vp9_avg_8tap_smooth_8h_neon:       238.2     231.0
vp9_avg_8tap_smooth_8hv_neon:      731.6     679.2
vp9_avg_8tap_smooth_8v_neon:       270.0     246.5
vp9_avg_8tap_smooth_64h_neon:    11503.6   11487.3
vp9_avg_8tap_smooth_64hv_neon:   25299.4   25277.8
vp9_avg_8tap_smooth_64v_neon:    13614.7   13605.8
vp9_put4_neon:                      17.2      16.5
vp9_put8_neon:                      39.2      37.8
vp9_put16_neon:                     98.6      99.2
vp9_put32_neon:                    405.1     307.4
vp9_put64_neon:                   1627.1    1107.6
vp9_put_8tap_smooth_4h_neon:       124.2     116.2
vp9_put_8tap_smooth_4hv_neon:      493.2     445.3
vp9_put_8tap_smooth_4v_neon:       122.0      99.5
vp9_put_8tap_smooth_8h_neon:       227.2     219.3
vp9_put_8tap_smooth_8hv_neon:      698.3     650.2
vp9_put_8tap_smooth_8v_neon:       240.0     218.2
vp9_put_8tap_smooth_64h_neon:    10885.1   10846.1
vp9_put_8tap_smooth_64hv_neon:   23377.7   23338.6
vp9_put_8tap_smooth_64v_neon:    11679.4   11682.5

These are generally about as fast as the corresponding ARM
routines on the same CPU (at least on the A53), in most cases
marginally faster.

The speedup vs C code is pretty much the same as for the 32 bit
case; on the A53 it's around 2-13x - slightly varying since the
C versions generally don't end up exactly as fast as on 32 bit.
---
I'm still open for comments on the ARM version as well; most comments
to that one will be applied to this one in sync, or vice versa.
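
To put the "marginally faster" remark from the commit message into
numbers, the ARM/AArch64 runtime ratio can be worked out for a few of
the rows quoted there. This is just a quick sketch: the speedup helper
is a made-up name for illustration, and the four rows are picked by
hand from the figures in the commit message.

```shell
#!/bin/sh
# Ratio of ARM runtime to AArch64 runtime; above 1.00 means AArch64 is faster.
speedup(){
    # $1 = function name, $2 = ARM runtime, $3 = AArch64 runtime
    awk -v n="$1" -v a="$2" -v b="$3" 'BEGIN { printf "%s %.2f\n", n, a / b }'
}

speedup vp9_avg4_neon                 31.2   23.7
speedup vp9_put16_neon                98.6   99.2
speedup vp9_put32_neon               405.1  307.4
speedup vp9_put_8tap_smooth_4v_neon  122.0   99.5
```

Ratios close to 1.00 correspond to the "about as fast" cases; the
plain copy/avg functions are where the hand tuning shows most.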
---
 libavcodec/aarch64/Makefile  |   2 +
 libavcodec/aarch64/vp9dsp_init_aarch64.c | 139 ++
 libavcodec/aarch64/vp9mc_neon.S  | 748 +++
 libavcodec/vp9.h |   1 +
 libavcodec/vp9dsp.c  |   2 +
 5 files changed, 892 insertions(+)
 create mode 100644 libavcodec/aarch64/vp9dsp_init_aarch64.c
 create mode 100644 libavcodec/aarch64/vp9mc_neon.S

diff --git a/libavcodec/aarch64/Makefile b/libavcodec/aarch64/Makefile
index 764bedc..6f1227a 100644
--- a/libavcodec/aarch64/Makefile
+++ b/libavcodec/aarch64/Makefile
@@ -17,6 +17,7 @@ OBJS-$(CONFIG_DCA_DECODER)  += aarch64/dcadsp_init.o
 OBJS-$(CONFIG_RV40_DECODER) += aarch64/rv40dsp_init_aarch64.o
 OBJS-$(CONFIG_VC1_DECODER)  += aarch64/vc1dsp_init_aarch64.o
 OBJS-$(CONFIG_VORBIS_DECODER)   += aarch64/vorbisdsp_init.o
+OBJS-$(CONFIG_VP9_DECODER)  += aarch64/vp9dsp_init_aarch64.o
 
 # ARMv8 optimizations
 
@@ -43,3 +44,4 @@ NEON-OBJS-$(CONFIG_MPEGAUDIODSP)+= aarch64/mpegaudiodsp_neon.o
 NEON-OBJS-$(CONFIG_DCA_DECODER) += aarch64/dcadsp_neon.o   \
                                    aarch64/synth_filter_neon.o
 NEON-OBJS-$(CONFIG_VORBIS_DECODER)  += aarch64/vorbisdsp_neon.o
+NEON-OBJS-$(CONFIG_VP9_DECODER) += aarch64/vp9mc_neon.o
diff --git a/libavcodec/aarch64/vp9dsp_init_aarch64.c b/libavcodec/aarch64/vp9dsp_init_aarch64.c
new file mode 100644
index 000..2f772ed
--- /dev/null
+++ b/libavcodec/aarch64/vp9dsp_init_aarch64.c
@@ -0,0 +1,139 @@
+/*
+ * Copyright (c) 2016 Google Inc.
+ *
+ * This file is part of Libav.
+ *
+ * Libav is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * Libav is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with Libav; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include 
+
+#include "libavutil/attributes.h"
+#include "libavutil/aarch64/cpu.h"
+#include "libavcodec/vp9.h"
+
+#define declare_fpel(type, sz)