Re: [libav-devel] [PATCH 2/2] checkasm: Add a test for HEVC add_residual

2016-10-12 Thread Diego Biurrun
On Wed, Oct 12, 2016 at 06:24:40PM +0200, Alexandra Hájková wrote:
> --- a/tests/checkasm/checkasm.c
> +++ b/tests/checkasm/checkasm.c
> @@ -92,6 +92,7 @@ static const struct {
>  #if CONFIG_HEVC_DECODER
>  { "hevc_mc", checkasm_check_hevc_mc },
>  { "hevc_idct", checkasm_check_hevc_idct },
> +{ "hevc_add_res", checkasm_check_hevc_add_res },

order

> --- /dev/null
> +++ b/tests/checkasm/hevc_add_res.c
> @@ -0,0 +1,84 @@
> +
> +#define randomize_buffers(buf, size)\
> +do {\
> +int j;  \
> +for (j = 0; j < size; j++) {\
> +int16_t r = rnd();  \
> +AV_WN16A(buf + j, r >> 3);  \
> +}   \
> +} while (0)

We should stop duplicating these between checkasm modules some day.
You're welcome to help me refactor.

> +#define randomize_buffers2(buf, size) \
> +do { \
> +int j;   \
> +for (j = 0; j < size; j++)   \
> +AV_WN16A(buf + j * 2, (rnd() & 0xFF));   \

pointless (), align the \

What is the reason for writing 16 bits and throwing the upper half away?
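
To make the question concrete, here is a minimal sketch of what the quoted macro appears to do. `store16` is a hypothetical stand-in for AV_WN16A (assumed to be a native-endian 16-bit store) and `fake_rnd` is a hypothetical stand-in for checkasm's rnd(); neither is taken from the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for AV_WN16A: store 16 bits in native byte order. */
static void store16(uint8_t *p, uint16_t v)
{
    memcpy(p, &v, sizeof(v));
}

/* Read a 16-bit slot back in native byte order. */
static uint16_t load16(const uint8_t *p)
{
    uint16_t v;
    memcpy(&v, p, sizeof(v));
    return v;
}

/* Hypothetical stand-in for checkasm's rnd(): a small LCG for this sketch. */
static uint32_t fake_rnd(uint32_t *state)
{
    *state = *state * 1664525u + 1013904223u;
    return *state;
}

/* Mirrors the quoted randomize_buffers2(): every 16-bit slot receives an
 * 8-bit value, so the upper byte of each slot is always zero. */
static void fill_16bit_slots_with_8bit_values(uint8_t *buf, int size,
                                              uint32_t seed)
{
    int j;

    assert(size >= 0);
    for (j = 0; j < size; j++)
        store16(buf + j * 2, (uint16_t)(fake_rnd(&seed) & 0xFF));
}
```

Under these assumptions, only 8 of the 16 bits in each slot ever carry random data, which is what the question is about.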

> +void checkasm_check_hevc_add_res(void)
> +{
> +int bit_depth;
> +
> +for (bit_depth = 8; bit_depth <= 10; bit_depth++) {
> +HEVCDSPContext h;
> +
> +ff_hevc_dsp_init(&h, bit_depth);
> +check_add_res(h, bit_depth);
> +}

I didn't see you add 9-bit versions of the assembly functions, why do
you test 9 bits?
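
One possible way to restrict the loop to the depths that have assembly, sketched under the assumption that only 8-bit and 10-bit versions exist. The helper name `tested_bit_depths` is hypothetical and only serves to show which depths such a loop would visit.

```c
#include <assert.h>

/* Collect the bit depths visited when stepping by 2 from 8 to 10,
 * i.e. 8 and 10 but not 9. Returns the number of depths written. */
static int tested_bit_depths(int *out, int max_out)
{
    int n = 0, bit_depth;

    for (bit_depth = 8; bit_depth <= 10; bit_depth += 2) {
        assert(n < max_out);
        out[n++] = bit_depth;
    }
    return n;
}
```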

Diego
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel

Re: [libav-devel] [PATCH 1/2] hevc/x86: Add add_residual

2016-10-12 Thread Diego Biurrun
On Wed, Oct 12, 2016 at 06:24:39PM +0200, Alexandra Hájková wrote:
> --- /dev/null
> +++ b/libavcodec/x86/hevc_res_add.asm
> @@ -0,0 +1,391 @@
> +; /*

Drop the /, this is not C.

> +; * Provide SIMD optimizations for add_residual functions for HEVC decoding

s/Provide//

> +; * Copyright (c) 2014 Pierre-Edouard LEPERE
> +; *
> +; * This file is part of Libav.
> +; *
> +; * FFmpeg is free software; you can redistribute it and/or

This is not FFmpeg.

> +; * modify it under the terms of the GNU Lesser General Public
> +; * License as published by the Free Software Foundation; either
> +; * version 2.1 of the License, or (at your option) any later version.
> +; *
> +; * FFmpeg is distributed in the hope that it will be useful,
> +; * but WITHOUT ANY WARRANTY; without even the implied warranty of
> +; * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +; * Lesser General Public License for more details.
> +; *
> +; * You should have received a copy of the GNU Lesser General Public
> +; * License along with FFmpeg; if not, write to the Free Software
> +; * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
> +; */
> +%include "libavutil/x86/x86util.asm"

Drop the / and add an empty line.

> +;-
> +; void ff_hevc_add_residual__10(pixel *dst, int16_t *block, int stride)
> +;-
> +%macro ADD_RES_SSE_8_10 4

I don't think this function uses an int stride, stray double underscore.

> +;-
> +; void ff_hevc_add_residual__10(pixel *dst, int16_t *block, int stride)
> +;-

same

> +%if HAVE_AVX2_EXTERNAL
> +INIT_YMM avx2
> +
> +cglobal hevc_add_residual_16_10,3,5,6
> +%endif ;HAVE_AVX_EXTERNAL

The %if and the %endif comment do not match.

Diego

Re: [libav-devel] [PATCH 7/9] qsv{dec, enc}: always use an internal mfxFrameSurface1

2016-10-12 Thread Diego Biurrun
On Tue, Oct 11, 2016 at 09:34:33PM +0200, Anton Khirnov wrote:
> For encoding, this avoids modifying the input surface, which we are not
> allowed to do.
> This will also be useful in the following commits.
> ---
>  libavcodec/qsv_internal.h |  5 ++---
>  libavcodec/qsvdec.c   | 32 ++--
>  libavcodec/qsvenc.c   | 32 
>  3 files changed, 36 insertions(+), 33 deletions(-)

Just "qsv:" is the more standard way to prefix log messages IMO.

Diego


Re: [libav-devel] [PATCH 2/2] hevcdec: move parameter set parsing into a separate header

2016-10-12 Thread Diego Biurrun
On Wed, Oct 12, 2016 at 10:26:32AM +0200, Anton Khirnov wrote:
> This code is independent from the decoder, so it makes more sense for it
> to have its own header.
> ---
>  libavcodec/hevc.h|   5 +
>  libavcodec/hevc_ps.c |  14 +--
>  libavcodec/hevc_ps.h | 320 +++
>  libavcodec/hevc_ps_enc.c |   2 +-
>  libavcodec/hevc_refs.c   |   2 +-
>  libavcodec/hevcdec.c |   2 +-
>  libavcodec/hevcdec.h | 296 +--
>  7 files changed, 339 insertions(+), 302 deletions(-)
>  create mode 100644 libavcodec/hevc_ps.h

probably OK

Diego


Re: [libav-devel] [PATCH 1/2] hevcdec: split ff_hevc_diag_scan* declarations into a separate header

2016-10-12 Thread Diego Biurrun
On Wed, Oct 12, 2016 at 10:26:31AM +0200, Anton Khirnov wrote:
> This will be useful in the following commits.
> ---
>  libavcodec/dxva2_hevc.c |  1 +
>  libavcodec/hevc_data.h  | 31 +++
>  libavcodec/hevc_ps.c|  1 +
>  libavcodec/hevcdec.c|  1 +
>  libavcodec/hevcdec.h|  5 -
>  libavcodec/vdpau_hevc.c |  1 +
>  6 files changed, 35 insertions(+), 5 deletions(-)
>  create mode 100644 libavcodec/hevc_data.h

hevc_data.c needs to #include this new header, as said before.

OK with that changed.

Diego


[libav-devel] [PATCH 1/3] checkasm: add vp9dsp.itxfm_add tests.

2016-10-12 Thread Martin Storsjö
From: "Ronald S. Bultje" 

This includes fixes by Henrik Gramner.
---
 tests/checkasm/vp9dsp.c | 272 
 1 file changed, 272 insertions(+)

diff --git a/tests/checkasm/vp9dsp.c b/tests/checkasm/vp9dsp.c
index f0cc2a7..690e0cf 100644
--- a/tests/checkasm/vp9dsp.c
+++ b/tests/checkasm/vp9dsp.c
@@ -18,13 +18,16 @@
  * 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
  */
 
+#include 
 #include 
 
 #include "libavutil/common.h"
 #include "libavutil/internal.h"
 #include "libavutil/intreadwrite.h"
+#include "libavutil/mathematics.h"
 
 #include "libavcodec/vp9.h"
+#include "libavcodec/vp9data.h"
 
 #include "checkasm.h"
 
@@ -33,6 +36,274 @@ static const uint32_t pixel_mask[3] = { 0x, 0x03ff03ff, 0x0fff0fff };
 #define BIT_DEPTH 8
 #define SIZEOF_PIXEL ((BIT_DEPTH + 7) / 8)
 
+#define randomize_buffers() \
+do { \
+uint32_t mask = pixel_mask[(BIT_DEPTH - 8) >> 1];  \
+for (y = 0; y < sz; y++) { \
+for (x = 0; x < sz * SIZEOF_PIXEL; x += 4) {   \
+uint32_t r = rnd() & mask; \
+AV_WN32A(dst + y * sz * SIZEOF_PIXEL + x, r);  \
+AV_WN32A(src + y * sz * SIZEOF_PIXEL + x, rnd() & mask);   \
+}  \
+for (x = 0; x < sz; x++) { \
+if (BIT_DEPTH == 8) {  \
+coef[y * sz + x] = src[y * sz + x] - dst[y * sz + x];  \
+} else {   \
+((int32_t *) coef)[y * sz + x] =   \
+((uint16_t *) src)[y * sz + x] -   \
+((uint16_t *) dst)[y * sz + x];\
+}  \
+}  \
+}  \
+} while(0)
+
+// wht function copied from libvpx
+static void fwht_1d(double *out, const double *in, int sz)
+{
+double t0 = in[0] + in[1];
+double t3 = in[3] - in[2];
+double t4 = trunc((t0 - t3) * 0.5);
+double t1 = t4 - in[1];
+double t2 = t4 - in[2];
+
+out[0] = t0 - t2;
+out[1] = t2;
+out[2] = t3 + t1;
+out[3] = t1;
+}
+
+// standard DCT-II
+static void fdct_1d(double *out, const double *in, int sz)
+{
+int k, n;
+
+for (k = 0; k < sz; k++) {
+out[k] = 0.0;
+for (n = 0; n < sz; n++)
+out[k] += in[n] * cos(M_PI * (2 * n + 1) * k / (sz * 2.0));
+}
+out[0] *= M_SQRT1_2;
+}
+
+// see "Towards jointly optimal spatial prediction and adaptive transform in
+// video/image coding", by J. Han, A. Saxena, and K. Rose
+// IEEE Proc. ICASSP, pp. 726-729, Mar. 2010.
+static void fadst4_1d(double *out, const double *in, int sz)
+{
+int k, n;
+
+for (k = 0; k < sz; k++) {
+out[k] = 0.0;
+for (n = 0; n < sz; n++)
+out[k] += in[n] * sin(M_PI * (n + 1) * (2 * k + 1) / (sz * 2.0 + 1.0));
+}
+}
+
+// see "A Butterfly Structured Design of The Hybrid Transform Coding Scheme",
+// by Jingning Han, Yaowu Xu, and Debargha Mukherjee
+// http://static.googleusercontent.com/media/research.google.com/en//pubs/archive/41418.pdf
+static void fadst_1d(double *out, const double *in, int sz)
+{
+int k, n;
+
+for (k = 0; k < sz; k++) {
+out[k] = 0.0;
+for (n = 0; n < sz; n++)
+out[k] += in[n] * sin(M_PI * (2 * n + 1) * (2 * k + 1) / (sz * 4.0));
+}
+}
+
+typedef void (*ftx1d_fn)(double *out, const double *in, int sz);
+static void ftx_2d(double *out, const double *in, enum TxfmMode tx,
+   enum TxfmType txtp, int sz)
+{
+static const double scaling_factors[5][4] = {
+{ 4.0, 16.0 * M_SQRT1_2 / 3.0, 16.0 * M_SQRT1_2 / 3.0, 32.0 / 9.0 },
+{ 2.0, 2.0, 2.0, 2.0 },
+{ 1.0, 1.0, 1.0, 1.0 },
+{ 0.25 },
+{ 4.0 }
+};
+static const ftx1d_fn ftx1d_tbl[5][4][2] = {
+{
+{ fdct_1d, fdct_1d },
+{ fadst4_1d, fdct_1d },
+{ fdct_1d, fadst4_1d },
+{ fadst4_1d, fadst4_1d },
+}, {
+{ fdct_1d, fdct_1d },
+{ fadst_1d, fdct_1d },
+{ fdct_1d, fadst_1d },
+{ fadst_1d, fadst_1d },
+}, {
+{ fdct_1d, fdct_1d },
+{ fadst_1d, fdct_1d },
+{ fdct_1d, fadst_1d },
+{ fadst_1d, fadst_1d },
+}, {
+{ fdct_1d, fdct_1d },
+}, {
+{ fwht_1d, fwht_1d },
+},
+};
+double temp[1024];
+double scaling_factor = 
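
The quoted randomize_buffers() builds its coefficients as the difference between two random pixel blocks, so that adding the residual back onto dst should reproduce src. A minimal sketch of that idea for the 8-bit case follows; the function names are hypothetical, and the use of int16_t for the 8-bit coefficients is an assumption rather than something visible in the truncated patch.

```c
#include <assert.h>
#include <stdint.h>

/* Compute the residual block coef = src - dst for 8-bit pixels, as the
 * quoted macro does before the forward transform is applied. */
static void residual_8bit(int16_t *coef, const uint8_t *src,
                          const uint8_t *dst, int sz)
{
    int x, y;

    assert(sz > 0);
    for (y = 0; y < sz; y++)
        for (x = 0; x < sz; x++)
            coef[y * sz + x] = (int16_t)(src[y * sz + x] - dst[y * sz + x]);
}

/* Adding the residual back onto dst must give src exactly. An ideal
 * forward/inverse transform pair in between would preserve this. */
static int reconstructs_source(const int16_t *coef, const uint8_t *src,
                               const uint8_t *dst, int sz)
{
    int i;

    for (i = 0; i < sz * sz; i++)
        if (dst[i] + coef[i] != src[i])
            return 0;
    return 1;
}
```

This is presumably why the test can compare the output of the inverse transform against a known target instead of arbitrary values.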

[libav-devel] [PATCH 3/3] arm: vp9: Add NEON loop filters

2016-10-12 Thread Martin Storsjö
This work is sponsored by, and copyright, Google.

The implementation tries to have smart handling of cases
where no pixels need the full filtering for the 8/16 width
filters, skipping both calculation and writeback of the
unmodified pixels in those cases. The actual effect of this
is hard to test with checkasm though, since it tests the
full filtering, and the benefit depends on how many filtered
blocks use the shortcut.
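
The shortcut described above can be modelled in scalar C roughly as follows. This is only a sketch: the flatness test and the 3-tap filter are hypothetical placeholders, not the VP9 criteria or the NEON implementation.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical flatness test: a position qualifies for the wide filter
 * when both neighbours differ from it by at most thresh. */
static int needs_wide_filter(const uint8_t *p, int thresh)
{
    return abs(p[-1] - p[0]) <= thresh && abs(p[1] - p[0]) <= thresh;
}

/* Placeholder for the expensive wide filter: a rounded 3-tap average. */
static uint8_t wide_filter(const uint8_t *p)
{
    return (uint8_t)((p[-1] + 2 * p[0] + p[1] + 2) >> 2);
}

/* Process n positions of row (row[-1] and row[n] must be readable).
 * Returns the number of filtered positions; 0 means the early exit was
 * taken and neither the calculation nor the writeback happened. */
static int filter_row_with_early_exit(uint8_t *row, int n, int thresh)
{
    uint8_t out[64];
    int i, any = 0, filtered = 0;

    assert(n > 0 && n <= 64);
    for (i = 0; i < n; i++)
        any |= needs_wide_filter(row + i, thresh);
    if (!any)
        return 0; /* shortcut: nothing to compute or store */

    /* Filter from the unmodified input, then write everything back. */
    for (i = 0; i < n; i++) {
        if (needs_wide_filter(row + i, thresh)) {
            out[i] = wide_filter(row + i);
            filtered++;
        } else {
            out[i] = row[i];
        }
    }
    for (i = 0; i < n; i++)
        row[i] = out[i];
    return filtered;
}
```

How much the shortcut helps depends on how often the mask comes out empty, which is why a benchmark that always exercises the full filter cannot show it.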

Examples of relative speedup compared to the C version, from checkasm:
                              Cortex     A7     A8     A9    A53
vp9_loop_filter_h_4_8_neon:            2.26   2.36   2.13   2.84
vp9_loop_filter_h_8_8_neon:            2.04   2.16   1.87   2.66
vp9_loop_filter_h_16_8_neon:           2.14   2.18   1.88   2.52
vp9_loop_filter_h_16_16_neon:          2.43   2.48   2.16   3.05
vp9_loop_filter_mix2_h_44_16_neon:     2.25   2.39   2.06   2.76
vp9_loop_filter_mix2_h_48_16_neon:     2.53   2.54   2.18   3.12
vp9_loop_filter_mix2_h_84_16_neon:     2.11   2.23   1.86   2.73
vp9_loop_filter_mix2_h_88_16_neon:     2.04   2.24   1.86   2.76
vp9_loop_filter_mix2_v_44_16_neon:     3.34   3.54   3.03   4.35
vp9_loop_filter_mix2_v_48_16_neon:     3.41   3.54   3.08   4.74
vp9_loop_filter_mix2_v_84_16_neon:     3.58   3.64   3.21   4.83
vp9_loop_filter_mix2_v_88_16_neon:     3.22   3.55   3.13   4.89
vp9_loop_filter_v_4_8_neon:            3.42   3.26   3.08   4.57
vp9_loop_filter_v_8_8_neon:            3.45   3.65   3.11   5.10
vp9_loop_filter_v_16_8_neon:           3.91   3.88   3.59   5.47
vp9_loop_filter_v_16_16_neon:          4.81   4.57   4.32   6.59

The speedup vs C code is around 2-6x. The numbers are quite
inconclusive though, since the checkasm test runs multiple filterings
on top of each other, so later rounds might end up with different
codepaths (different decisions on which filter to apply, based
on input pixel differences). Disabling the early-exit in the asm
doesn't give a fair comparison either though, since the C code
only does the necessary calculations for each row.

This is pretty similar in runtime to the corresponding routines
in libvpx. (This is comparing vpx_lpf_vertical_16_neon,
vpx_lpf_horizontal_edge_8_neon and vpx_lpf_horizontal_edge_16_neon
to vp9_loop_filter_h_16_8_neon, vp9_loop_filter_v_16_8_neon
and vp9_loop_filter_v_16_16_neon - note that the naming of horizontal
and vertical is flipped between the libraries.)

In order to have stable, comparable numbers, the early exits in both
asm versions were disabled, forcing the full filtering
codepath.

                                  Cortex      A7      A8      A9     A53
vp9_loop_filter_h_16_8_neon:               657.0   460.4   497.1   413.5
libvpx vpx_lpf_vertical_16_neon:           626.0   464.5   470.7   445.0
vp9_loop_filter_v_16_8_neon:               600.4   425.7   447.0   332.2
libvpx vpx_lpf_horizontal_edge_8_neon:     586.5   414.5   415.6   383.2
vp9_loop_filter_v_16_16_neon:             1216.4   868.5   893.2   680.0
libvpx vpx_lpf_horizontal_edge_16_neon:   1060.2   751.7   743.5   685.2

The libvpx functions are marginally faster on A7, A8 and A9, while
ours is marginally faster on A53 (which is where it has been tuned
during development). Overall they are pretty similar.
---
 libavcodec/arm/Makefile  |   1 +
 libavcodec/arm/vp9dsp_init_arm.c |  68 
 libavcodec/arm/vp9lpf_neon.S | 696 +++
 3 files changed, 765 insertions(+)
 create mode 100644 libavcodec/arm/vp9lpf_neon.S

diff --git a/libavcodec/arm/Makefile b/libavcodec/arm/Makefile
index 01630ac..77452b1 100644
--- a/libavcodec/arm/Makefile
+++ b/libavcodec/arm/Makefile
@@ -140,4 +140,5 @@ NEON-OBJS-$(CONFIG_RV40_DECODER)   += arm/rv34dsp_neon.o\
 NEON-OBJS-$(CONFIG_VORBIS_DECODER) += arm/vorbisdsp_neon.o
 NEON-OBJS-$(CONFIG_VP6_DECODER)+= arm/vp6dsp_neon.o
 NEON-OBJS-$(CONFIG_VP9_DECODER)+= arm/vp9itxfm_neon.o   \
+  arm/vp9lpf_neon.o \
   arm/vp9mc_neon.o
diff --git a/libavcodec/arm/vp9dsp_init_arm.c b/libavcodec/arm/vp9dsp_init_arm.c
index b2d91d9..9653100 100644
--- a/libavcodec/arm/vp9dsp_init_arm.c
+++ b/libavcodec/arm/vp9dsp_init_arm.c
@@ -182,8 +182,76 @@ static av_cold void vp9dsp_itxfm_init_arm(VP9DSPContext *dsp)
 }
 }
 
+#define define_loop_filter(dir, wd) \
+void ff_vp9_loop_filter_##dir##_##wd##_8_neon(uint8_t *dst, ptrdiff_t stride, int E, int I, int H)
+
+#define define_loop_filters(wd) \
+define_loop_filter(h, wd);  \
+define_loop_filter(v, wd)
+
+define_loop_filters(4);
+define_loop_filters(8);
+define_loop_filters(16);
+
+#define lf_16_fn(dir, stridea)\
+static void loop_filter_##dir##_16_16_neon(uint8_t *dst,  \
+   ptrdiff_t stride,  \
+   int E, int I, int H)   \
+{

[libav-devel] [PATCH 2/3] arm: vp9: Add NEON itxfm routines

2016-10-12 Thread Martin Storsjö
This work is sponsored by, and copyright, Google.

For the transforms up to 8x8, we can fit all the data (including
temporaries) in registers and just do a straightforward transform
of all the data. For 16x16, we do a transform of 4x16 pixels in
4 slices, using a temporary buffer. For 32x32, we transform 4x32
pixels at a time, in two steps of 4x16 pixels each.
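
The slicing scheme for the 16x16 case can be illustrated with a scalar sketch. The 1-D transform here is a hypothetical placeholder (a prefix sum), chosen only so that the result is easy to check; the structure of two passes over 4-wide slices through a temporary buffer is the point, not the arithmetic.

```c
#include <assert.h>
#include <stdint.h>

#define N     16 /* transform size */
#define SLICE 4  /* rows or columns handled per slice */

/* Placeholder 1-D transform (a prefix sum) standing in for the real
 * inverse DCT/ADST; stride is the distance between consecutive inputs. */
static void tx_1d(int32_t *out, const int32_t *in, int stride)
{
    int i;
    int32_t acc = 0;

    assert(stride > 0);
    for (i = 0; i < N; i++) {
        acc += in[i * stride];
        out[i] = acc;
    }
}

/* Two-pass 2-D transform: pass 1 handles SLICE rows at a time and
 * stores into a temporary buffer, pass 2 handles SLICE columns at a
 * time and writes the final result. */
static void tx_2d_sliced(int32_t *dst, const int32_t *src)
{
    int32_t tmp[N * N], line[N];
    int s, r, c, i;

    for (s = 0; s < N; s += SLICE)       /* pass 1: 4 slices of 4x16 */
        for (r = s; r < s + SLICE; r++) {
            tx_1d(line, src + r * N, 1);
            for (i = 0; i < N; i++)
                tmp[r * N + i] = line[i];
        }

    for (s = 0; s < N; s += SLICE)       /* pass 2: columns, in slices */
        for (c = s; c < s + SLICE; c++) {
            tx_1d(line, tmp + c, N);
            for (i = 0; i < N; i++)
                dst[i * N + c] = line[i];
        }
}
```

In the NEON code a slice presumably corresponds to what fits in registers at once, which is why the smaller transforms need no temporary buffer.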

Examples of relative speedup compared to the C version, from checkasm:
                            Cortex     A7     A8     A9    A53
vp9_inv_adst_adst_4x4_add_neon:      3.25   5.38   3.88   3.62
vp9_inv_adst_adst_8x8_add_neon:      3.84   4.79   4.25   4.18
vp9_inv_adst_adst_16x16_add_neon:    3.43   4.49   4.34   4.22
vp9_inv_dct_dct_4x4_add_neon:        3.72   5.76   4.22   4.34
vp9_inv_dct_dct_8x8_add_neon:        4.58   6.16   5.42   5.11
vp9_inv_dct_dct_16x16_add_neon:      3.42   3.70   3.65   3.91
vp9_inv_dct_dct_32x32_add_neon:      4.08   3.59   3.91   4.54
vp9_inv_wht_wht_4x4_add_neon:        3.22   5.08   3.68   3.75

Thus, the speedup vs C code is around 3-5x.

This is marginally faster than the corresponding routines
in libvpx on most cores, tested with their 32x32 idct (compared to
vpx_idct32x32_1024_add_neon). These numbers are slightly in libvpx's
favour since their version doesn't clear the input buffer like ours
does (although the effect of that on the total runtime is probably
negligible).

                              Cortex       A7       A8       A9      A53
vp9_inv_dct_dct_32x32_add_neon:       18663.4  17071.3  14221.8  12183.5
libvpx vpx_idct32x32_1024_add_neon:   20789.0  13344.3  15049.9  13030.5

Only on the Cortex A8, the libvpx function is faster. On the other cores,
ours is slightly faster even though ours has got source block clearing
integrated.
---
Suggestions very much welcome on names for the macros - no idea if
the current ones make sense or what one commonly would call these
combinations.

I'm a bit reluctant to expand the macros (to be able to schedule
instructions better), in order to keep things readable. (Although,
I guess this is kinda write-only code, which nobody ever touches
afterwards).
---
 libavcodec/arm/Makefile  |3 +-
 libavcodec/arm/vp9dsp_init_arm.c |   51 +-
 libavcodec/arm/vp9itxfm_neon.S   | 1151 ++
 3 files changed, 1203 insertions(+), 2 deletions(-)
 create mode 100644 libavcodec/arm/vp9itxfm_neon.S

diff --git a/libavcodec/arm/Makefile b/libavcodec/arm/Makefile
index 2638230..01630ac 100644
--- a/libavcodec/arm/Makefile
+++ b/libavcodec/arm/Makefile
@@ -139,4 +139,5 @@ NEON-OBJS-$(CONFIG_RV40_DECODER)   += arm/rv34dsp_neon.o\
   arm/rv40dsp_neon.o
 NEON-OBJS-$(CONFIG_VORBIS_DECODER) += arm/vorbisdsp_neon.o
 NEON-OBJS-$(CONFIG_VP6_DECODER)+= arm/vp6dsp_neon.o
-NEON-OBJS-$(CONFIG_VP9_DECODER)+= arm/vp9mc_neon.o
+NEON-OBJS-$(CONFIG_VP9_DECODER)+= arm/vp9itxfm_neon.o   \
+  arm/vp9mc_neon.o
diff --git a/libavcodec/arm/vp9dsp_init_arm.c b/libavcodec/arm/vp9dsp_init_arm.c
index db8c683..b2d91d9 100644
--- a/libavcodec/arm/vp9dsp_init_arm.c
+++ b/libavcodec/arm/vp9dsp_init_arm.c
@@ -94,7 +94,7 @@ define_8tap_2d_funcs(8)
 define_8tap_2d_funcs(4)
 
 
-av_cold void ff_vp9dsp_init_arm(VP9DSPContext *dsp)
+static av_cold void vp9dsp_mc_init_arm(VP9DSPContext *dsp)
 {
 int cpu_flags = av_get_cpu_flags();
 
@@ -138,3 +138,52 @@ av_cold void ff_vp9dsp_init_arm(VP9DSPContext *dsp)
 init_mc_funcs_dirs(4, 4);
 }
 }
+
+#define define_itxfm(type_a, type_b, sz)   \
+void ff_vp9_##type_a##_##type_b##_##sz##x##sz##_add_neon(uint8_t *_dst,\
+ ptrdiff_t stride, \
+ int16_t *_block, int eob)
+
+#define define_itxfm_funcs(sz)  \
+define_itxfm(idct,  idct,  sz); \
+define_itxfm(iadst, idct,  sz); \
+define_itxfm(idct,  iadst, sz); \
+define_itxfm(iadst, iadst, sz)
+
+define_itxfm_funcs(4);
+define_itxfm_funcs(8);
+define_itxfm_funcs(16);
+define_itxfm(idct, idct, 32);
+define_itxfm(iwht, iwht, 4);
+
+
+static av_cold void vp9dsp_itxfm_init_arm(VP9DSPContext *dsp)
+{
+int cpu_flags = av_get_cpu_flags();
+
+if (have_neon(cpu_flags)) {
+#define init_itxfm(tx, sz) \
+dsp->itxfm_add[tx][DCT_DCT]   = ff_vp9_idct_idct_##sz##_add_neon;  \
+dsp->itxfm_add[tx][DCT_ADST]  = ff_vp9_iadst_idct_##sz##_add_neon; \
+dsp->itxfm_add[tx][ADST_DCT]  = ff_vp9_idct_iadst_##sz##_add_neon; \
+dsp->itxfm_add[tx][ADST_ADST] = ff_vp9_iadst_iadst_##sz##_add_neon
+
+#define init_idct(tx, nm)   \
+dsp->itxfm_add[tx][DCT_DCT]   = \
+dsp->itxfm_add[tx][ADST_DCT]  = \
+dsp->itxfm_add[tx][DCT_ADST]  = \
+dsp->itxfm_add[tx][ADST_ADST] = ff_vp9_##nm##_add_neon
+
+init_itxfm(TX_4X4, 4x4);
+init_itxfm(TX_8X8, 8x8);
+ 

Re: [libav-devel] [PATCH] swscale: Add the GBRAP12 output

2016-10-12 Thread Vittorio Giovara
On Wed, Oct 12, 2016 at 3:32 PM, Luca Barbato  wrote:
> ---

seems ok, maybe just `swscale: Add GBRAP12 output support`
-- 
Vittorio


[libav-devel] [PATCH] swscale: Add the GBRAP12 output

2016-10-12 Thread Luca Barbato
---

Used to test Kieran's patch

 libswscale/output.c | 2 ++
 libswscale/utils.c  | 4 ++--
 tests/ref/fate/filter-pixdesc-gbrap12be | 1 +
 tests/ref/fate/filter-pixdesc-gbrap12le | 1 +
 tests/ref/fate/filter-pixfmts-copy  | 2 ++
 tests/ref/fate/filter-pixfmts-null  | 2 ++
 tests/ref/fate/filter-pixfmts-scale | 2 ++
 tests/ref/fate/filter-pixfmts-vflip | 2 ++
 8 files changed, 14 insertions(+), 2 deletions(-)
 create mode 100644 tests/ref/fate/filter-pixdesc-gbrap12be
 create mode 100644 tests/ref/fate/filter-pixdesc-gbrap12le

diff --git a/libswscale/output.c b/libswscale/output.c
index a8845f5..d0c303c 100644
--- a/libswscale/output.c
+++ b/libswscale/output.c
@@ -1457,6 +1457,8 @@ av_cold void ff_sws_init_output_funcs(SwsContext *c,
 case AV_PIX_FMT_GBRP10LE:
 case AV_PIX_FMT_GBRP12BE:
 case AV_PIX_FMT_GBRP12LE:
+case AV_PIX_FMT_GBRAP12BE:
+case AV_PIX_FMT_GBRAP12LE:
 case AV_PIX_FMT_GBRP16BE:
 case AV_PIX_FMT_GBRP16LE:
 case AV_PIX_FMT_GBRAP:
diff --git a/libswscale/utils.c b/libswscale/utils.c
index 5c7d631..1c3bbb3 100644
--- a/libswscale/utils.c
+++ b/libswscale/utils.c
@@ -189,8 +189,8 @@ static const FormatEntry format_entries[AV_PIX_FMT_NB] = {
 [AV_PIX_FMT_GBRP16LE]= { 1, 0 },
 [AV_PIX_FMT_GBRP16BE]= { 1, 0 },
 [AV_PIX_FMT_GBRAP]   = { 1, 1 },
-[AV_PIX_FMT_GBRAP12LE]   = { 1, 0 },
-[AV_PIX_FMT_GBRAP12BE]   = { 1, 0 },
+[AV_PIX_FMT_GBRAP12LE]   = { 1, 1 },
+[AV_PIX_FMT_GBRAP12BE]   = { 1, 1 },
 [AV_PIX_FMT_GBRAP16LE]   = { 1, 0 },
 [AV_PIX_FMT_GBRAP16BE]   = { 1, 0 },
 [AV_PIX_FMT_XYZ12BE] = { 0, 0, 1 },
diff --git a/tests/ref/fate/filter-pixdesc-gbrap12be b/tests/ref/fate/filter-pixdesc-gbrap12be
new file mode 100644
index 000..6e27632
--- /dev/null
+++ b/tests/ref/fate/filter-pixdesc-gbrap12be
@@ -0,0 +1 @@
+pixdesc-gbrap12be   a8ce4f5a7578f260399f86f92ae2a7be
diff --git a/tests/ref/fate/filter-pixdesc-gbrap12le b/tests/ref/fate/filter-pixdesc-gbrap12le
new file mode 100644
index 000..b6537a5
--- /dev/null
+++ b/tests/ref/fate/filter-pixdesc-gbrap12le
@@ -0,0 +1 @@
+pixdesc-gbrap12le   c14ff0058c8b1ccdb880386d7f9804e5
diff --git a/tests/ref/fate/filter-pixfmts-copy b/tests/ref/fate/filter-pixfmts-copy
index 97acc09..d9a5166 100644
--- a/tests/ref/fate/filter-pixfmts-copy
+++ b/tests/ref/fate/filter-pixfmts-copy
@@ -13,6 +13,8 @@ bgr565le            6a0d182c7165103b2613d1805c822f9f
 bgr8                36b9ef72c87da36ac547202d85a5805f
 bgra                56e6e1bfde40aaa27473e01b46345c82
 gbrap               57cb1a02d6f015a4329fe367f3bdfe49
+gbrap12be           df4b550099df0702f602a8b305702a8c
+gbrap12le           f947c43e494ab87410dfb2547e7e22f2
 gbrp                d5f73b5d3ba7f6cadbc9b4ecbc161005
 gbrp10be            eb19bda60ab7f893198364dff21342d6
 gbrp10le            546146efb36ad2605e9f74ee5e4c2a36
diff --git a/tests/ref/fate/filter-pixfmts-null b/tests/ref/fate/filter-pixfmts-null
index 97acc09..d9a5166 100644
--- a/tests/ref/fate/filter-pixfmts-null
+++ b/tests/ref/fate/filter-pixfmts-null
@@ -13,6 +13,8 @@ bgr565le            6a0d182c7165103b2613d1805c822f9f
 bgr8                36b9ef72c87da36ac547202d85a5805f
 bgra                56e6e1bfde40aaa27473e01b46345c82
 gbrap               57cb1a02d6f015a4329fe367f3bdfe49
+gbrap12be           df4b550099df0702f602a8b305702a8c
+gbrap12le           f947c43e494ab87410dfb2547e7e22f2
 gbrp                d5f73b5d3ba7f6cadbc9b4ecbc161005
 gbrp10be            eb19bda60ab7f893198364dff21342d6
 gbrp10le            546146efb36ad2605e9f74ee5e4c2a36
diff --git a/tests/ref/fate/filter-pixfmts-scale b/tests/ref/fate/filter-pixfmts-scale
index b6f38b4..142b637 100644
--- a/tests/ref/fate/filter-pixfmts-scale
+++ b/tests/ref/fate/filter-pixfmts-scale
@@ -13,6 +13,8 @@ bgr565le            34438643c183ff1748cf7d71453f981c
 bgr8                e731ba3dbec294e1daa7313e08e88034
 bgra                6e1f417ae41636f631de1cfe39ce1778
 gbrap               eefdbfd1426765ce5e9790022533db0d
+gbrap12be           c676f72b634c77b08a00ab12dc21c5dc
+gbrap12le           90ca5271960dc1ebd6ebe14189223e36
 gbrp                5d14768d2ab6cbf3879966b5d5c6befb
 gbrp10be            4192c246f4a52ec7a37919665190cce9
 gbrp10le            170189b2c2dd46f31165d8fa6cadef0a
diff --git a/tests/ref/fate/filter-pixfmts-vflip b/tests/ref/fate/filter-pixfmts-vflip
index 6f633c6..93a292d 100644
--- a/tests/ref/fate/filter-pixfmts-vflip
+++ b/tests/ref/fate/filter-pixfmts-vflip
@@ -13,6 +13,8 @@ bgr565le            6f98ccb05e608863ef0912b9a6fd960b
 bgr8                1f916a75563e6be42c056e7d973a7356
 bgra                dd8eaea69683884ea45bf2fb635ce415
 gbrap               38e04cbd4dc5566586d58ffed0c6b20d
+gbrap12be           c53126e45593f2e49451c9c9f58cffac
+gbrap12le           6d5b3a8f8aae74f3542a63bcd1179a6c
 gbrp                37954476d089b5b74b06891e64ad6b9e
 gbrp10be

Re: [libav-devel] [PATCH 2/2] checkasm: Add a test for HEVC add_residual

2016-10-12 Thread Martin Storsjö

On Wed, 12 Oct 2016, Martin Storsjö wrote:

> On Wed, 12 Oct 2016, Alexandra Hájková wrote:
>
>> ---
>> tests/checkasm/Makefile   |  2 +-
>> tests/checkasm/checkasm.c |  1 +
>> tests/checkasm/checkasm.h |  1 +
>> tests/checkasm/hevc_add_res.c | 84 +++
>> 4 files changed, 87 insertions(+), 1 deletion(-)
>> create mode 100644 tests/checkasm/hevc_add_res.c
>>
>> diff --git a/tests/checkasm/Makefile b/tests/checkasm/Makefile
>> index 9b3df55..ac3e97e 100644
>> --- a/tests/checkasm/Makefile
>> +++ b/tests/checkasm/Makefile
>> @@ -12,7 +12,7 @@ AVCODECOBJS-$(CONFIG_VP8DSP)+= vp8dsp.o
>>
>> # decoders/encoders
>> AVCODECOBJS-$(CONFIG_DCA_DECODER)   += dcadsp.o synth_filter.o
>> -AVCODECOBJS-$(CONFIG_HEVC_DECODER)  += hevc_mc.o hevc_idct.o
>> +AVCODECOBJS-$(CONFIG_HEVC_DECODER)  += hevc_mc.o hevc_idct.o hevc_add_res.o
>> AVCODECOBJS-$(CONFIG_V210_ENCODER)  += v210enc.o
>> AVCODECOBJS-$(CONFIG_VP9_DECODER)   += vp9dsp.o
>
> Can you explain why this change is necessary? It feels like it's either
> misplaced (should be in a different commit) or like something else is
> wrong already (should be in a separate commit), or I might be missing
> something (then it should probably be explained in the commit message, or
> at least explained verbally in a reply to this mail).

Nevermind - sorry, I misread - this objection is dropped.

// Martin

Re: [libav-devel] [PATCH 2/2] checkasm: Add a test for HEVC add_residual

2016-10-12 Thread Martin Storsjö

On Wed, 12 Oct 2016, Alexandra Hájková wrote:

> ---
> tests/checkasm/Makefile   |  2 +-
> tests/checkasm/checkasm.c |  1 +
> tests/checkasm/checkasm.h |  1 +
> tests/checkasm/hevc_add_res.c | 84 +++
> 4 files changed, 87 insertions(+), 1 deletion(-)
> create mode 100644 tests/checkasm/hevc_add_res.c
>
> diff --git a/tests/checkasm/Makefile b/tests/checkasm/Makefile
> index 9b3df55..ac3e97e 100644
> --- a/tests/checkasm/Makefile
> +++ b/tests/checkasm/Makefile
> @@ -12,7 +12,7 @@ AVCODECOBJS-$(CONFIG_VP8DSP)+= vp8dsp.o
>
> # decoders/encoders
> AVCODECOBJS-$(CONFIG_DCA_DECODER)   += dcadsp.o synth_filter.o
> -AVCODECOBJS-$(CONFIG_HEVC_DECODER)  += hevc_mc.o hevc_idct.o
> +AVCODECOBJS-$(CONFIG_HEVC_DECODER)  += hevc_mc.o hevc_idct.o hevc_add_res.o
> AVCODECOBJS-$(CONFIG_V210_ENCODER)  += v210enc.o
> AVCODECOBJS-$(CONFIG_VP9_DECODER)   += vp9dsp.o

Can you explain why this change is necessary? It feels like it's either
misplaced (should be in a different commit) or like something else is
wrong already (should be in a separate commit), or I might be missing
something (then it should probably be explained in the commit message, or
at least explained verbally in a reply to this mail).


// Martin

Re: [libav-devel] [PATCH] allow to pass custom sidedata through decoders

2016-10-12 Thread Vittorio Giovara
On Wed, Oct 12, 2016 at 11:43 AM, Francois Cartegnie  wrote:
> Le 12/10/2016 à 16:50, Vittorio Giovara a écrit :
>> On Wed, Oct 12, 2016 at 8:50 AM, Francois Cartegnie  wrote:
>>
>> I'm not really sure about this.
>> If it can be handled by libavcodec, support for it should be added, if
>> it cannot it's not hard to record the presentation timestamp and use a
>> queue in the app itself.
>
> How's that different from setting and grabbing any other supported
> sidedata that is passed through decoder when using avcodec only ?

Because the other supported ones are known to work and are mapped to
something that exists.

Plus I'm pretty sure you would be able to use only one custom data at
a time: since side data is appended last to a buffer and retrieved
first, if you use two AV_PKT_DATA_CUSTOM you would be able to retrieve
only the first one added, and things could get even worse when you map
it to AV_FRAME_SIDE_DATA_CUSTOM.
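
The concern can be illustrated with a deliberately simplified model. The structures and function names below are hypothetical and are not libavcodec's actual side-data implementation; the sketch only assumes that lookup is done by type and returns a single match.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical side-data types for this sketch only. */
enum { SD_CUSTOM = 1, SD_OTHER = 2 };

struct side_data {
    int type;
    const char *payload;
};

struct packet {
    struct side_data sd[8];
    int nb_sd;
};

static void packet_init(struct packet *pkt)
{
    pkt->nb_sd = 0;
}

static void add_side_data(struct packet *pkt, int type, const char *payload)
{
    assert(pkt->nb_sd < 8);
    pkt->sd[pkt->nb_sd].type    = type;
    pkt->sd[pkt->nb_sd].payload = payload;
    pkt->nb_sd++;
}

/* Lookup by type returns the first match only, so a second entry of
 * the same type can never be reached through this interface. */
static const char *get_side_data(const struct packet *pkt, int type)
{
    int i;

    for (i = 0; i < pkt->nb_sd; i++)
        if (pkt->sd[i].type == type)
            return pkt->sd[i].payload;
    return NULL;
}
```

With a single generic "custom" type, two independent users of the same packet would collide in exactly this way.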

> Plus AV_NOPTS_VALUE is not uncommon.

Sure, but that can be handled again by the app itself, like it is done now.
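
A minimal sketch of the application-side approach mentioned earlier in this thread (record the presentation timestamp and keep a queue in the app). All names here are hypothetical. Packets without a timestamp (AV_NOPTS_VALUE) would need a different key, for example a counter maintained by the app, which this sketch does not cover.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define QUEUE_SIZE 32

/* Hypothetical application-side store mapping a presentation
 * timestamp to custom per-packet data. */
struct pts_entry {
    int64_t pts;
    void *data;
    int used;
};

struct pts_queue {
    struct pts_entry e[QUEUE_SIZE];
};

static void pts_queue_init(struct pts_queue *q)
{
    int i;

    assert(q != NULL);
    for (i = 0; i < QUEUE_SIZE; i++)
        q->e[i].used = 0;
}

/* Remember data for the packet with timestamp pts.
 * Returns 0 on success, -1 if the queue is full. */
static int pts_queue_put(struct pts_queue *q, int64_t pts, void *data)
{
    int i;

    for (i = 0; i < QUEUE_SIZE; i++)
        if (!q->e[i].used) {
            q->e[i].pts  = pts;
            q->e[i].data = data;
            q->e[i].used = 1;
            return 0;
        }
    return -1;
}

/* Fetch and release the data stored for pts, or NULL if none exists. */
static void *pts_queue_take(struct pts_queue *q, int64_t pts)
{
    int i;

    for (i = 0; i < QUEUE_SIZE; i++)
        if (q->e[i].used && q->e[i].pts == pts) {
            q->e[i].used = 0;
            return q->e[i].data;
        }
    return NULL;
}
```

The app would call pts_queue_put() before sending a packet to the decoder and pts_queue_take() with the timestamp of each returned frame, so frame reordering inside the decoder does not matter.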
-- 
Vittorio
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel

Re: [libav-devel] [PATCH 2/2] checkasm: Add a test for HEVC add_residual

2016-10-12 Thread Luca Barbato
On 12/10/2016 18:24, Alexandra Hájková wrote:
> ---
>  tests/checkasm/Makefile   |  2 +-
>  tests/checkasm/checkasm.c |  1 +
>  tests/checkasm/checkasm.h |  1 +
>  tests/checkasm/hevc_add_res.c | 84 +++
>  4 files changed, 87 insertions(+), 1 deletion(-)
>  create mode 100644 tests/checkasm/hevc_add_res.c
> 

Sure.


Re: [libav-devel] [PATCH 5/9] hwcontext_qsv: transfer data through the child context when VPP fails

2016-10-12 Thread Luca Barbato
On 11/10/2016 21:34, Anton Khirnov wrote:
> Uploading/downloading data through VPP may not work for some formats, in
> that case we can still try to call av_hwframe_transfer_data() on the
> child context.
> ---
>  libavutil/hwcontext_qsv.c | 40 
>  1 file changed, 40 insertions(+)

Possibly Ok.



Re: [libav-devel] [PATCH 9/9] qsv{enc, dec}: extend the internal frame allocator

2016-10-12 Thread Luca Barbato
On 11/10/2016 21:34, Anton Khirnov wrote:
> Handle the internal frame requests, which is required by the HEVC
> encoding plugin.
> ---
>  libavcodec/qsv.c  | 265 
> --
>  libavcodec/qsv_internal.h |  13 ++-
>  libavcodec/qsvdec.c   |   3 +-
>  libavcodec/qsvenc.c   |   3 +-
>  4 files changed, 249 insertions(+), 35 deletions(-)
> 

You might add a note on the resp->NumFrameActual + 1 usage, since it is a
little confusing; besides that, it seems fine.



[libav-devel] [PATCH 1/2] hevc/x86: Add add_residual

2016-10-12 Thread Alexandra Hájková
From: Pierre Edouard Lepere 

Initially written by Pierre Edouard Lepere,
extended by James Almer.

Signed-off-by: Alexandra Hájková 
---
 libavcodec/x86/Makefile |   3 +-
 libavcodec/x86/hevc_res_add.asm | 391 
 libavcodec/x86/hevcdsp_init.c   |  40 
 3 files changed, 433 insertions(+), 1 deletion(-)
 create mode 100644 libavcodec/x86/hevc_res_add.asm

diff --git a/libavcodec/x86/Makefile b/libavcodec/x86/Makefile
index a38535b..aa93e67 100644
--- a/libavcodec/x86/Makefile
+++ b/libavcodec/x86/Makefile
@@ -117,7 +117,8 @@ YASM-OBJS-$(CONFIG_DCA_DECODER)+= x86/dcadsp.o
 YASM-OBJS-$(CONFIG_DNXHD_ENCODER)  += x86/dnxhdenc.o
 YASM-OBJS-$(CONFIG_HEVC_DECODER)   += x86/hevc_deblock.o\
   x86/hevc_mc.o \
-  x86/hevc_idct.o
+  x86/hevc_idct.o   \
+  x86/hevc_res_add.o
 YASM-OBJS-$(CONFIG_PNG_DECODER)+= x86/pngdsp.o
 YASM-OBJS-$(CONFIG_PRORES_DECODER) += x86/proresdsp.o
 YASM-OBJS-$(CONFIG_RV40_DECODER)   += x86/rv40dsp.o
diff --git a/libavcodec/x86/hevc_res_add.asm b/libavcodec/x86/hevc_res_add.asm
new file mode 100644
index 000..1e3bfc2
--- /dev/null
+++ b/libavcodec/x86/hevc_res_add.asm
@@ -0,0 +1,391 @@
+; /*
+; * Provide SIMD optimizations for add_residual functions for HEVC decoding
+; * Copyright (c) 2014 Pierre-Edouard LEPERE
+; *
+; * This file is part of Libav.
+; *
+; * FFmpeg is free software; you can redistribute it and/or
+; * modify it under the terms of the GNU Lesser General Public
+; * License as published by the Free Software Foundation; either
+; * version 2.1 of the License, or (at your option) any later version.
+; *
+; * FFmpeg is distributed in the hope that it will be useful,
+; * but WITHOUT ANY WARRANTY; without even the implied warranty of
+; * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+; * Lesser General Public License for more details.
+; *
+; * You should have received a copy of the GNU Lesser General Public
+; * License along with FFmpeg; if not, write to the Free Software
+; * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+; */
+%include "libavutil/x86/x86util.asm"
+
+SECTION_RODATA 32
+max_pixels_10:  times 16  dw ((1 << 10)-1)
+
+SECTION .text
+
+; the add_res macros and functions were largely inspired by x264 project's code in the h264_idct.asm file
+%macro ADD_RES_MMX_4_8 0
+mova  m2, [r1]
+mova  m4, [r1+8]
+pxor  m3, m3
+psubw m3, m2
+packuswb  m2, m2
+packuswb  m3, m3
+pxor  m5, m5
+psubw m5, m4
+packuswb  m4, m4
+packuswb  m5, m5
+
+movh  m0, [r0 ]
+movh  m1, [r0+r2  ]
+paddusb   m0, m2
+paddusb   m1, m4
+psubusb   m0, m3
+psubusb   m1, m5
+movh   [r0 ], m0
+movh   [r0+r2  ], m1
+%endmacro
+
+
+INIT_MMX mmxext
+; void ff_hevc_add_residual_4_8_mmxext(uint8_t *dst, int16_t *coeffs, ptrdiff_t stride)
+cglobal hevc_add_residual_4_8, 3, 4, 6
+ADD_RES_MMX_4_8
+add   r1, 16
+lea   r0, [r0+r2*2]
+ADD_RES_MMX_4_8
+RET
+
+%macro ADD_RES_SSE_8_8 0
+pxor  m3, m3
+mova  m4, [r1]
+mova  m6, [r1+16]
+mova  m0, [r1+32]
+mova  m2, [r1+48]
+psubw m5, m3, m4
+psubw m7, m3, m6
+psubw m1, m3, m0
+packuswb  m4, m0
+packuswb  m5, m1
+psubw m3, m2
+packuswb  m6, m2
+packuswb  m7, m3
+
+movqm0, [r0 ]
+movqm1, [r0+r2  ]
+movhps  m0, [r0+r2*2]
+movhps  m1, [r0+r3  ]
+paddusb m0, m4
+paddusb m1, m6
+psubusb m0, m5
+psubusb m1, m7
+movq [r0 ], m0
+movq [r0+r2  ], m1
+movhps   [r0+2*r2], m0
+movhps   [r0+r3  ], m1
+%endmacro
+
+%macro ADD_RES_SSE_16_32_8 3
+mova xm2, [r1+%1   ]
+mova xm6, [r1+%1+16]
+%if cpuflag(avx2)
+vinserti128   m2, m2, [r1+%1+32], 1
+vinserti128   m6, m6, [r1+%1+48], 1
+%endif
+%if cpuflag(avx)
+psubw m1, m0, m2
+psubw m5, m0, m6
+%else
+mova  m1, m0
+mova  m5, m0
+psubw m1, m2
+psubw m5, m6
+%endif
+packuswb  m2, m6
+packuswb  m1, m5
+
+mova xm4, [r1+%1+mmsize*2   ]
+mova 

[libav-devel] [PATCH 2/2] checkasm: Add a test for HEVC add_residual

2016-10-12 Thread Alexandra Hájková
---
 tests/checkasm/Makefile   |  2 +-
 tests/checkasm/checkasm.c |  1 +
 tests/checkasm/checkasm.h |  1 +
 tests/checkasm/hevc_add_res.c | 84 +++
 4 files changed, 87 insertions(+), 1 deletion(-)
 create mode 100644 tests/checkasm/hevc_add_res.c

diff --git a/tests/checkasm/Makefile b/tests/checkasm/Makefile
index 9b3df55..ac3e97e 100644
--- a/tests/checkasm/Makefile
+++ b/tests/checkasm/Makefile
@@ -12,7 +12,7 @@ AVCODECOBJS-$(CONFIG_VP8DSP)+= vp8dsp.o
 
 # decoders/encoders
 AVCODECOBJS-$(CONFIG_DCA_DECODER)   += dcadsp.o synth_filter.o
-AVCODECOBJS-$(CONFIG_HEVC_DECODER)  += hevc_mc.o hevc_idct.o
+AVCODECOBJS-$(CONFIG_HEVC_DECODER)  += hevc_mc.o hevc_idct.o hevc_add_res.o
 AVCODECOBJS-$(CONFIG_V210_ENCODER)  += v210enc.o
 AVCODECOBJS-$(CONFIG_VP9_DECODER)   += vp9dsp.o
 
diff --git a/tests/checkasm/checkasm.c b/tests/checkasm/checkasm.c
index 040c4eb..d0dc525 100644
--- a/tests/checkasm/checkasm.c
+++ b/tests/checkasm/checkasm.c
@@ -92,6 +92,7 @@ static const struct {
 #if CONFIG_HEVC_DECODER
 { "hevc_mc", checkasm_check_hevc_mc },
 { "hevc_idct", checkasm_check_hevc_idct },
+{ "hevc_add_res", checkasm_check_hevc_add_res },
 #endif
 #if CONFIG_HUFFYUVDSP
 { "huffyuvdsp", checkasm_check_huffyuvdsp },
diff --git a/tests/checkasm/checkasm.h b/tests/checkasm/checkasm.h
index 5a4c056..bacd6f4 100644
--- a/tests/checkasm/checkasm.h
+++ b/tests/checkasm/checkasm.h
@@ -39,6 +39,7 @@ void checkasm_check_fmtconvert(void);
 void checkasm_check_h264dsp(void);
 void checkasm_check_h264pred(void);
 void checkasm_check_h264qpel(void);
+void checkasm_check_hevc_add_res(void);
 void checkasm_check_hevc_idct(void);
 void checkasm_check_hevc_mc(void);
 void checkasm_check_huffyuvdsp(void);
diff --git a/tests/checkasm/hevc_add_res.c b/tests/checkasm/hevc_add_res.c
new file mode 100644
index 000..fcc47c1
--- /dev/null
+++ b/tests/checkasm/hevc_add_res.c
@@ -0,0 +1,84 @@
+/*
+ * Copyright (c) 2016 Alexandra Hájková
+ *
+ * This file is part of Libav.
+ *
+ * Libav is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * Libav is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License along
+ * with Libav; if not, write to the Free Software Foundation, Inc.,
+ * 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
+ */
+
+#include <string.h>
+
+#include "libavutil/intreadwrite.h"
+
+#include "libavcodec/hevcdsp.h"
+
+#include "checkasm.h"
+
+#define randomize_buffers(buf, size)\
+do {\
+int j;  \
+for (j = 0; j < size; j++) {\
+int16_t r = rnd();  \
+AV_WN16A(buf + j, r >> 3);  \
+}   \
+} while (0)
+
+#define randomize_buffers2(buf, size) \
+do { \
+int j;   \
+for (j = 0; j < size; j++)   \
+AV_WN16A(buf + j * 2, (rnd() & 0xFF));   \
+} while (0)
+
+static void check_add_res(HEVCDSPContext h, int bit_depth)
+{
+int i;
+LOCAL_ALIGNED(32, int16_t, res0, [32 * 32]);
+LOCAL_ALIGNED(32, int16_t, res1, [32 * 32]);
+LOCAL_ALIGNED(32, uint8_t, dst0, [32 * 32 * 2]);
+LOCAL_ALIGNED(32, uint8_t, dst1, [32 * 32 * 2]);
+
+for (i = 2; i <= 5; i++) {
+int block_size = 1 << i;
+int size = block_size * block_size;
+declare_func_emms(AV_CPU_FLAG_MMX, void, uint8_t *dst, int16_t *res, ptrdiff_t stride);
+
+randomize_buffers(res0, size);
+randomize_buffers2(dst0, size * 2);
+memcpy(res1, res0, sizeof(*res0) * size);
+memcpy(dst1, dst0, size * 2);
+
+if (check_func(h.add_residual[i - 2], "add_res_%dx%d_%d", block_size, block_size, bit_depth)) {
+call_ref(dst0, res0, block_size * 2);
+call_new(dst1, res1, block_size * 2);
+if (memcmp(dst0, dst1, size * 2))
+fail();
+bench_new(dst1, res1, block_size);
+}
+}
+}
+
+void checkasm_check_hevc_add_res(void)
+{
+int bit_depth;
+
+for (bit_depth = 8; bit_depth <= 10; bit_depth++) {
+HEVCDSPContext h;
+
+ff_hevc_dsp_init(&h, bit_depth);
+check_add_res(h, bit_depth);
+}
+report("add_residual");
+}
-- 
2.1.4

___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel
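
As an aid to reading the test above, here is a minimal plain-C sketch of what an 8-bit add_residual is expected to compute: each pixel plus a signed 16-bit residual, clamped to 0..255. This is an illustration under stated assumptions, not the Libav C reference; clip_u8, add_residual_ref and add_residual_satpair are hypothetical names. The last function mirrors the paddusb/psubusb pairing used in the x86 patch: the residual is split into a non-negative part and a negated part, each saturated to 8 bits, then applied with a saturating add followed by a saturating subtract.

```c
#include <stddef.h>
#include <stdint.h>

/* Clamp an int to the 8-bit pixel range. */
static uint8_t clip_u8(int v)
{
    return v < 0 ? 0 : v > 255 ? 255 : (uint8_t)v;
}

/* Hypothetical reference: dst[x] = clip(dst[x] + res[x]) over a
 * size x size block. dst advances by stride, res is packed. */
static void add_residual_ref(uint8_t *dst, const int16_t *res,
                             ptrdiff_t stride, int size)
{
    for (int y = 0; y < size; y++) {
        for (int x = 0; x < size; x++)
            dst[x] = clip_u8(dst[x] + res[x]);
        dst += stride;
        res += size;
    }
}

/* Same value computed the way the SIMD code pairs its operations. */
static uint8_t add_residual_satpair(uint8_t pixel, int16_t res)
{
    uint8_t pos = clip_u8(res);          /* 0 when res is negative */
    uint8_t neg = clip_u8(-res);         /* 0 when res is positive */
    uint8_t sum = clip_u8(pixel + pos);  /* saturating add, like paddusb */
    return clip_u8(sum - neg);           /* saturating subtract, like psubusb */
}
```

In this C form both functions give the same value for any pixel and residual pair, since at most one of pos and neg is non-zero.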

Re: [libav-devel] [PATCH 8/9] qsv{dec, enc}: use a struct as a memory id with internal memory allocator

2016-10-12 Thread Luca Barbato
On 11/10/2016 21:34, Anton Khirnov wrote:
> This will allow implementing the allocator more fully, which is needed
> by the HEVC encoder plugin with video memory input.
> ---
>  libavcodec/qsv.c  | 26 +++---
>  libavcodec/qsv_internal.h | 10 --
>  libavcodec/qsvdec.c   |  8 
>  libavcodec/qsvenc.c   |  8 
>  4 files changed, 47 insertions(+), 5 deletions(-)
> 

Possibly ok.

lu



Re: [libav-devel] [PATCH 7/9] qsv{dec, enc}: always use an internal mfxFrameSurface1

2016-10-12 Thread Luca Barbato
On 11/10/2016 21:34, Anton Khirnov wrote:
> For encoding, this avoids modifying the input surface, which we are not
> allowed to do.
> This will also be useful in the following commits.
> ---
>  libavcodec/qsv_internal.h |  5 ++---
>  libavcodec/qsvdec.c   | 32 ++--
>  libavcodec/qsvenc.c   | 32 
>  3 files changed, 36 insertions(+), 33 deletions(-)

Probably ok.




Re: [libav-devel] [PATCH 6/9] hwcontext_qsv: support frame mapping

2016-10-12 Thread Luca Barbato
On 11/10/2016 21:34, Anton Khirnov wrote:
> +AVHWFramesContext *child_frames_ctx = (AVHWFramesContext*)s->child_frames_ref->data;

You dereference data from it here.

> +mfxFrameSurface1 *surf = (mfxFrameSurface1*)src->data[3];
> +
> +AVFrame *dummy;
> +int ret = 0;
> +
> +if (!s->child_frames_ref)
> +return AVERROR(ENOSYS);

You check if it is null here.

I guess the context should be dereferenced after the check.
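
A minimal sketch of the reordering, assuming simplified stand-in types (BufferRef, FramesCtx, QSVState and map_frame are hypothetical, not the hwcontext API) and -ENOSYS standing in for AVERROR(ENOSYS):

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the real hwcontext types. */
typedef struct BufferRef { void *data; } BufferRef;
typedef struct FramesCtx { int width, height; } FramesCtx;
typedef struct QSVState  { BufferRef *child_frames_ref; } QSVState;

/* Returns 0 on success, -ENOSYS when no child context is available. */
static int map_frame(const QSVState *s, int *out_width)
{
    const FramesCtx *child_frames_ctx;

    /* Validate the reference first ... */
    if (!s->child_frames_ref)
        return -ENOSYS;

    /* ... and only then read through it. */
    child_frames_ctx = (const FramesCtx *)s->child_frames_ref->data;
    *out_width = child_frames_ctx->width;
    return 0;
}
```

The declaration stays at the top of the function; only the initialisation moves below the NULL check.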


Re: [libav-devel] [PATCH 4/9] hwcontext_qsv: do not fail when download/upload VPP session creation fails

2016-10-12 Thread Luca Barbato
On 11/10/2016 21:34, Anton Khirnov wrote:
> Certain pixel formats (e.g. P8) might not be supported for
> download/upload through VPP operations, but can still be used otherwise.
> ---
>  libavutil/hwcontext_qsv.c | 16 ++--
>  1 file changed, 14 insertions(+), 2 deletions(-)
> 

Ok.



Re: [libav-devel] [PATCH 3/9] hwcontext_qsv: add support for the P8 format

2016-10-12 Thread Luca Barbato
On 11/10/2016 21:34, Anton Khirnov wrote:
> This format is used internally by the QSV encoder to store the encoded
> bitstream.
> ---
>  libavutil/hwcontext_qsv.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/libavutil/hwcontext_qsv.c b/libavutil/hwcontext_qsv.c
> index f2c8086..3679dc0 100644
> --- a/libavutil/hwcontext_qsv.c
> +++ b/libavutil/hwcontext_qsv.c
> @@ -91,6 +91,7 @@ static const struct {
>  } supported_pixel_formats[] = {
>  { AV_PIX_FMT_NV12, MFX_FOURCC_NV12 },
>  { AV_PIX_FMT_P010, MFX_FOURCC_P010 },
> +{ AV_PIX_FMT_PAL8, MFX_FOURCC_P8   },
>  };
>  
>  static int qsv_device_init(AVHWDeviceContext *ctx)
> 

Sure.


Re: [libav-devel] [PATCH] allow to pass custom sidedata through decoders

2016-10-12 Thread Luca Barbato
On 12/10/2016 14:50, Francois Cartegnie wrote:
> Because not every side data is handled or extracted
> internally by libav, a libavcodec user might need
> to attach some extradata and retrieve it in
> presentation order after decoding.
> 
> Ex: active format data,
> unhandled closed captions format
> ---
>  libavcodec/avcodec.h  | 7 +++
>  libavcodec/utils.c| 1 +
>  libavfilter/vf_showinfo.c | 3 +++
>  libavformat/dump.c| 3 +++
>  libavutil/frame.h | 8 
>  5 files changed, 22 insertions(+)
> 

Sounds like a good idea for a number of use cases indeed.

Probably we could extend the format for custom SEI embedded data, but
that could be a separate side data in itself.

lu




Re: [libav-devel] [PATCH] allow to pass custom sidedata through decoders

2016-10-12 Thread Francois Cartegnie
Le 12/10/2016 à 16:50, Vittorio Giovara a écrit :
> On Wed, Oct 12, 2016 at 8:50 AM, Francois Cartegnie  wrote:
> 
> I'm not really sure about this.
> If it can be handled by libavcodec, support for it should be added, if
> it cannot it's not hard to record the presentation timestamp and use a
> queue in the app itself.

How is that different from setting and retrieving any other supported
side data that is passed through the decoder when using only avcodec?

Plus AV_NOPTS_VALUE is not uncommon.
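
The app-side queue from the quoted suggestion could look roughly like the sketch below. It is only an illustration: PendingTable, pending_put, pending_take and NO_PTS are hypothetical names, and NO_PTS merely stands in for AV_NOPTS_VALUE. It also shows the limitation raised here: a packet without a timestamp cannot be keyed at all.

```c
#include <stddef.h>
#include <stdint.h>

#define NO_PTS      INT64_MIN  /* hypothetical stand-in for AV_NOPTS_VALUE */
#define MAX_PENDING 32

typedef struct PendingEntry {
    int64_t pts;
    void   *user_data;
    int     used;
} PendingEntry;

typedef struct PendingTable {
    PendingEntry e[MAX_PENDING];
} PendingTable;

/* Remember user_data for the packet with this pts (decode order).
 * Returns -1 if the packet has no pts or the table is full. */
static int pending_put(PendingTable *t, int64_t pts, void *user_data)
{
    if (pts == NO_PTS)
        return -1;
    for (size_t i = 0; i < MAX_PENDING; i++) {
        if (!t->e[i].used) {
            t->e[i].pts       = pts;
            t->e[i].user_data = user_data;
            t->e[i].used      = 1;
            return 0;
        }
    }
    return -1;
}

/* Fetch and release the data stored for a decoded frame's pts
 * (presentation order). Returns NULL if nothing matches. */
static void *pending_take(PendingTable *t, int64_t pts)
{
    if (pts == NO_PTS)
        return NULL;
    for (size_t i = 0; i < MAX_PENDING; i++) {
        if (t->e[i].used && t->e[i].pts == pts) {
            t->e[i].used = 0;
            return t->e[i].user_data;
        }
    }
    return NULL;
}
```

Entries go in as packets are sent and come out as frames are received, so reordering inside the decoder is absorbed by the lookup rather than by the order of insertion.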

Francois

Re: [libav-devel] [PATCH 1/4] Add GBRP12 pixel format support

2016-10-12 Thread Vittorio Giovara
On Mon, Oct 10, 2016 at 4:35 PM, Luca Barbato  wrote:
> From: Michael Niedermayer 
>
> Signed-off-by: Vittorio Giovara 
> ---
>  doc/APIchanges  |  3 +++
>  libavutil/pixdesc.c | 25 +
>  libavutil/pixfmt.h  |  4 
>  libavutil/version.h |  2 +-
>  4 files changed, 33 insertions(+), 1 deletion(-)
>
> diff --git a/doc/APIchanges b/doc/APIchanges
> index 655783e..6b984f4 100644
> --- a/doc/APIchanges
> +++ b/doc/APIchanges
> @@ -13,6 +13,9 @@ libavutil: 2015-08-28
>
>  API changes, most recent first:
>
> +2016-xx-xx - xxx - lavu 55.24.0 - pixfmt.h
> +  Add AV_PIX_FMT_GBRP12(LE/BE).
> +
>  2016-xx-xx - xxx - lavu 55.23.0 - hwcontext_vaapi.h
>Add AV_VAAPI_DRIVER_QUIRK_ATTRIB_MEMTYPE.
>
> diff --git a/libavutil/pixdesc.c b/libavutil/pixdesc.c
> index 78fbdbc..38c9908 100644
> --- a/libavutil/pixdesc.c
> +++ b/libavutil/pixdesc.c
> @@ -1514,6 +1514,30 @@ static const AVPixFmtDescriptor av_pix_fmt_descriptors[AV_PIX_FMT_NB] = {
>  },
>  .flags = AV_PIX_FMT_FLAG_BE | AV_PIX_FMT_FLAG_PLANAR | AV_PIX_FMT_FLAG_RGB,
>  },
> +[AV_PIX_FMT_GBRP12LE] = {
> +.name = "gbrp12le",
> +.nb_components = 3,
> +.log2_chroma_w = 0,
> +.log2_chroma_h = 0,
> +.comp = {
> +{ 2, 2, 0, 0, 12, 1, 11, 1 },/* R */
> +{ 0, 2, 0, 0, 12, 1, 11, 1 },/* G */
> +{ 1, 2, 0, 0, 12, 1, 11, 1 },/* B */
> +},
> +.flags = AV_PIX_FMT_FLAG_PLANAR | AV_PIX_FMT_FLAG_RGB,
> +},
> +[AV_PIX_FMT_GBRP12BE] = {
> +.name = "gbrp12be",
> +.nb_components = 3,
> +.log2_chroma_w = 0,
> +.log2_chroma_h = 0,
> +.comp = {
> +{ 2, 2, 0, 0, 12, 1, 11, 1 },/* R */
> +{ 0, 2, 0, 0, 12, 1, 11, 1 },/* G */
> +{ 1, 2, 0, 0, 12, 1, 11, 1 },/* B */
> +},
> +.flags = AV_PIX_FMT_FLAG_BE | AV_PIX_FMT_FLAG_PLANAR | AV_PIX_FMT_FLAG_RGB,
> +},
>  [AV_PIX_FMT_GBRP16LE] = {
>  .name = "gbrp16le",
>  .nb_components = 3,
> @@ -1923,6 +1947,7 @@ enum AVPixelFormat av_pix_fmt_swap_endianness(enum AVPixelFormat pix_fmt)
>
>  PIX_FMT_SWAP_ENDIANNESS(GBRP9);
>  PIX_FMT_SWAP_ENDIANNESS(GBRP10);
> +PIX_FMT_SWAP_ENDIANNESS(GBRP12);
>  PIX_FMT_SWAP_ENDIANNESS(GBRP16);
>  PIX_FMT_SWAP_ENDIANNESS(YUVA420P9);
>  PIX_FMT_SWAP_ENDIANNESS(YUVA422P9);
> diff --git a/libavutil/pixfmt.h b/libavutil/pixfmt.h
> index 3e040fb..473f53c 100644
> --- a/libavutil/pixfmt.h
> +++ b/libavutil/pixfmt.h
> @@ -239,6 +239,9 @@ enum AVPixelFormat {
>  AV_PIX_FMT_YUV444P12BE, ///< planar YUV 4:4:4, 36bpp, (1 Cr & Cb sample per 1x1 Y), big-endian
>  AV_PIX_FMT_YUV444P12LE, ///< planar YUV 4:4:4, 36bpp, (1 Cr & Cb sample per 1x1 Y), little-endian
>
> +AV_PIX_FMT_GBRP12BE,  ///< planar GBR 4:4:4 36bpp, big-endian
> +AV_PIX_FMT_GBRP12LE,  ///< planar GBR 4:4:4 36bpp, little-endian
> +
>  AV_PIX_FMT_NB,///< number of pixel formats, DO NOT USE THIS if you want to link with shared libav* because the number of formats might differ between versions
>  };
>
> @@ -281,6 +284,7 @@ enum AVPixelFormat {
>
>  #define AV_PIX_FMT_GBRP9 AV_PIX_FMT_NE(GBRP9BE ,GBRP9LE)
>  #define AV_PIX_FMT_GBRP10AV_PIX_FMT_NE(GBRP10BE,GBRP10LE)
> +#define AV_PIX_FMT_GBRP12AV_PIX_FMT_NE(GBRP12BE,GBRP12LE)
>  #define AV_PIX_FMT_GBRP16AV_PIX_FMT_NE(GBRP16BE,GBRP16LE)
>
>  #define AV_PIX_FMT_GBRAP16   AV_PIX_FMT_NE(GBRAP16BE,   GBRAP16LE)
> diff --git a/libavutil/version.h b/libavutil/version.h
> index 73de00e..7f7da80 100644
> --- a/libavutil/version.h
> +++ b/libavutil/version.h
> @@ -54,7 +54,7 @@
>   */
>
>  #define LIBAVUTIL_VERSION_MAJOR 55
> -#define LIBAVUTIL_VERSION_MINOR 23
> +#define LIBAVUTIL_VERSION_MINOR 24
>  #define LIBAVUTIL_VERSION_MICRO  0
>
>  #define LIBAVUTIL_VERSION_INT   AV_VERSION_INT(LIBAVUTIL_VERSION_MAJOR, \
> --

the set looks good to me
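
To make the "36bpp" and LE/BE naming concrete, here is a small sketch under assumptions: three planes, 12 significant bits per sample, each sample stored in the low bits of a 16-bit word (matching the step of 2 and shift of 0 in the descriptors above). The enum and function names are hypothetical and not part of libavutil.

```c
#include <stdint.h>

enum {
    GBRP12_COMPONENTS = 3,   /* G, B and R planes */
    GBRP12_DEPTH      = 12,  /* significant bits per sample */
    GBRP12_STEP       = 2    /* bytes between samples in a plane */
};

/* Significant bits per pixel: 3 * 12 = 36, hence "36bpp". */
static int gbrp12_bits_per_pixel(void)
{
    return GBRP12_COMPONENTS * GBRP12_DEPTH;
}

/* Bytes actually occupied per pixel across the three planes. */
static int gbrp12_stored_bytes_per_pixel(void)
{
    return GBRP12_COMPONENTS * GBRP12_STEP;
}

/* Read one sample from a little-endian plane (the LE variant).
 * The mask is defensive; valid data has the top 4 bits clear. */
static unsigned read_sample_le(const uint8_t *p)
{
    return (unsigned)(p[0] | (p[1] << 8)) & 0x0FFFu;
}

/* Read one sample from a big-endian plane (the BE variant). */
static unsigned read_sample_be(const uint8_t *p)
{
    return (unsigned)((p[0] << 8) | p[1]) & 0x0FFFu;
}
```

The unsuffixed AV_PIX_FMT_GBRP12 name then simply selects whichever of the two byte orders matches the host.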

-- 
Vittorio


Re: [libav-devel] [PATCH] APIchanges: Expand the name of recently added pixel formats

2016-10-12 Thread Vittorio Giovara
On Thu, Oct 6, 2016 at 6:29 PM, Luca Barbato  wrote:
> On 07/10/16 00:27, Vittorio Giovara wrote:
>> This makes them easier to search for.
>> ---
>>  doc/APIchanges | 4 ++--
>>  1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/doc/APIchanges b/doc/APIchanges
>> index 655783e..ef25a98 100644
>> --- a/doc/APIchanges
>> +++ b/doc/APIchanges
>> @@ -20,7 +20,7 @@ API changes, most recent first:
>>Add AVIO_SEEKABLE_TIME flag.
>>
>>  2016-xx-xx - xxx - lavu 55.22.0 - pixfmt.h
>> -  Add AV_PIX_FMT_YUV(420,422,444)P12.
>> +  Add AV_PIX_FMT_YUV420P12, AV_PIX_FMT_YUV422P12, and AV_PIX_FMT_YUV444P12.
>>
>>  2016-xx-xx - xxx - lavc 57.27.0 - avcodec.h
>>Add FF_PROFILE_HEVC_REXT, the extended pixel format profile for HEVC.
>> @@ -41,7 +41,7 @@ API changes, most recent first:
>>members AV_VAAPI_DRIVER_QUIRK_* to represent its values.
>>
>>  2016-07-02 - b7c5f88 - lavu 55.18.0 - pixfmt.h
>> -  Add AV_PIX_FMT_P010(LE/BE).
>> +  Add AV_PIX_FMT_P010LE and AV_PIX_FMT_P010BE.

actually this one is fine as is, I'll drop this chunk.
-- 
Vittorio


Re: [libav-devel] [PATCH] lavc: Document BSF in/out codecparam alloc and init process

2016-10-12 Thread Vittorio Giovara
On Mon, Oct 10, 2016 at 1:38 PM, Anton Khirnov  wrote:
> Quoting Vittorio Giovara (2016-10-04 17:59:53)
>> ---
>>  libavcodec/avcodec.h | 7 +--
>>  1 file changed, 5 insertions(+), 2 deletions(-)
>>
>> diff --git a/libavcodec/avcodec.h b/libavcodec/avcodec.h
>> index 167525d..02f391c 100644
>> --- a/libavcodec/avcodec.h
>> +++ b/libavcodec/avcodec.h
>> @@ -5017,12 +5017,15 @@ typedef struct AVBSFContext {
>>  void *priv_data;
>>
>>  /**
>> - * Parameters of the input stream. Set by the caller before av_bsf_init().
>> + * Parameters of the input stream. This field is allocated in
>> + * av_bsf_alloc(),needs to be initialized by the caller before
>
> nit: I would use "filled" instead of initialized, since the fields are
> guaranteed to be initialized to proper "invalid/unset" values, the
> callers just fills those that are known/relevant.

ok, I'll also slightly modify the title to the usual formatting

lavc: bsf: Document input/output codecparam alloc/init process
-- 
Vittorio


Re: [libav-devel] [PATCH 2/9] hwcontext_dxva2: add support for the P8 format

2016-10-12 Thread Vittorio Giovara
On Wed, Oct 12, 2016 at 3:58 AM, Anton Khirnov  wrote:
> Quoting Hendrik Leppkes (2016-10-11 22:33:43)
>> On Tue, Oct 11, 2016 at 9:34 PM, Anton Khirnov  wrote:
>> > This format is used internally by the QSV encoder to store the encoded
>> > bitstream.
>>
>> What does that even mean?
>> It smells like some evil hackery going into the nice clean hwcontext.
>
> Moderately evil.
> If you want to use gpu surfaces with QSV, you need to supply a frame
> allocator, which will be invoked to pass surface pools to it. For
> encoding, this allocator gets invoked not only for the pool of input
> frames, but also for a separate pool of (I assume) reconstructed frames
> and another pool of MFX_FOURCC_P8, which on Windows needs to return
> D3DFMT_P8 d3d surfaces. I think those are used to store the encoded
> bitstream on the GPU.
>
> In any case, while using P8 surfaces for this purpose is indeed rather
> strange, there's nothing especially wrong about supporting them in the
> DXVA2 hwcontext, it's just another surface format. The only hacky part
> is the actual palette itself -- I didn't find a way to retrieve the
> palette from a surface, so it's "emulated" by a dummy zero-filled
> buffer. The actual contents are irrelevant for my specific use case (and
> I don't expect it will be used for anything else).
>
> --

Can we have this detailed explanation in the commit log please?
-- 
Vittorio


Re: [libav-devel] [PATCH] allow to pass custom sidedata through decoders

2016-10-12 Thread Vittorio Giovara
On Wed, Oct 12, 2016 at 8:50 AM, Francois Cartegnie  wrote:
> Because not every side data is handled or extracted
> internally by libav, a libavcodec user might need
> to attach some extradata and retrieve it in
> presentation order after decoding.
> Ex: active format data,
> unhandled closed captions format
> ---

I'm not really sure about this.
If it can be handled by libavcodec, support for it should be added, if
it cannot it's not hard to record the presentation timestamp and use a
queue in the app itself.
-- 
Vittorio


[libav-devel] [PATCH] allow to pass custom sidedata through decoders

2016-10-12 Thread Francois Cartegnie
Because not every side data is handled or extracted
internally by libav, a libavcodec user might need
to attach some extradata and retrieve it in
presentation order after decoding.

Ex: active format data,
unhandled closed captions format
---
 libavcodec/avcodec.h  | 7 +++
 libavcodec/utils.c| 1 +
 libavfilter/vf_showinfo.c | 3 +++
 libavformat/dump.c| 3 +++
 libavutil/frame.h | 8 
 5 files changed, 22 insertions(+)

diff --git a/libavcodec/avcodec.h b/libavcodec/avcodec.h
index 167525d..8e5408b 100644
--- a/libavcodec/avcodec.h
+++ b/libavcodec/avcodec.h
@@ -1289,6 +1289,13 @@ enum AVPacketSideDataType {
  * This side data corresponds to the AVCPBProperties struct.
  */
 AV_PKT_DATA_CPB_PROPERTIES,
+
+/**
+ * This side data format is free to the user.
+ * It will be accessible through AV_FRAME_DATA_CUSTOM on
+ * a corresponding frame in presentation order after decoding.
+ */
+AV_PKT_DATA_CUSTOM,
 };
 
 typedef struct AVPacketSideData {
diff --git a/libavcodec/utils.c b/libavcodec/utils.c
index 837edbd..fd05d17 100644
--- a/libavcodec/utils.c
+++ b/libavcodec/utils.c
@@ -543,6 +543,7 @@ int ff_decode_frame_props(AVCodecContext *avctx, AVFrame *frame)
 { AV_PKT_DATA_DISPLAYMATRIX, AV_FRAME_DATA_DISPLAYMATRIX },
 { AV_PKT_DATA_STEREO3D,  AV_FRAME_DATA_STEREO3D },
 { AV_PKT_DATA_AUDIO_SERVICE_TYPE, AV_FRAME_DATA_AUDIO_SERVICE_TYPE },
+{ AV_PKT_DATA_CUSTOM,AV_FRAME_DATA_CUSTOM },
 };
 
 frame->color_primaries = avctx->color_primaries;
diff --git a/libavfilter/vf_showinfo.c b/libavfilter/vf_showinfo.c
index 204ff7a..4bdb3ca 100644
--- a/libavfilter/vf_showinfo.c
+++ b/libavfilter/vf_showinfo.c
@@ -127,6 +127,9 @@ static int filter_frame(AVFilterLink *inlink, AVFrame *frame)
 case AV_FRAME_DATA_AFD:
 av_log(ctx, AV_LOG_INFO, "afd: value of %"PRIu8, sd->data[0]);
 break;
+case AV_FRAME_DATA_CUSTOM:
+av_log(ctx, AV_LOG_INFO, "custom unknown side data");
+break;
 default:
 av_log(ctx, AV_LOG_WARNING, "unknown side data type %d (%d bytes)",
sd->type, sd->size);
diff --git a/libavformat/dump.c b/libavformat/dump.c
index 3b50f5d..9891c4c 100644
--- a/libavformat/dump.c
+++ b/libavformat/dump.c
@@ -354,6 +354,9 @@ static void dump_sidedata(void *ctx, AVStream *st, const char *indent)
 av_log(ctx, AV_LOG_INFO, "cpb: ");
 dump_cpb(ctx, );
 break;
+case AV_PKT_DATA_CUSTOM:
+av_log(ctx, AV_LOG_INFO, "custom unknown side data");
+break;
 default:
 av_log(ctx, AV_LOG_WARNING,
"unknown side data type %d (%d bytes)", sd.type, sd.size);
diff --git a/libavutil/frame.h b/libavutil/frame.h
index 12624d7..f6a365e 100644
--- a/libavutil/frame.h
+++ b/libavutil/frame.h
@@ -92,6 +92,14 @@ enum AVFrameSideDataType {
  * enum AVAudioServiceType defined in avcodec.h.
  */
 AV_FRAME_DATA_AUDIO_SERVICE_TYPE,
+
+/**
+ * This side data corresponds to the custom/private data passed to the decoder
+ * through AV_PKT_DATA_CUSTOM. It allows to stick custom data to a compressed
+ * frame and receive it in presentation order after decoding.
+ * It is not used by the library itself.
+ */
+AV_FRAME_DATA_CUSTOM,
 };
 
 enum AVActiveFormatDescription {
-- 
2.7.4



Re: [libav-devel] [PATCH 1/2] hevcdec: split ff_hevc_diag_scan* declarations into a separate header

2016-10-12 Thread Luca Barbato
On 12/10/2016 10:26, Anton Khirnov wrote:
> This will be useful in the following commits.
> ---
>  libavcodec/dxva2_hevc.c |  1 +
>  libavcodec/hevc_data.h  | 31 +++
>  libavcodec/hevc_ps.c|  1 +
>  libavcodec/hevcdec.c|  1 +
>  libavcodec/hevcdec.h|  5 -
>  libavcodec/vdpau_hevc.c |  1 +
>  6 files changed, 35 insertions(+), 5 deletions(-)
>  create mode 100644 libavcodec/hevc_data.h
> 

Sure, why not?

lu



Re: [libav-devel] [PATCH] vaapi_encode: Write sequence header as extradata

2016-10-12 Thread Mark Thompson
On 12/10/16 10:38, Anton Khirnov wrote:
> Quoting Mark Thompson (2016-10-10 21:54:35)
>> Only works if packed headers are supported, where we can know the
>> output before generating the first frame.
>> ---
>> Added padding; fail harder; informative comment in header.
>>
>> Not sure how to do this in the non-packed-header case - we could just
>> invoke this anyway, but the result is unlikely to precisely match what
>> the encoder then produces.
> 
> I guess we shouldn't then -- creating corrupted files is evil. At least
> in avconv we can now use the extract_extradata bitstream filter (once it
> goes in) to get the extradata from the first packet.

Ok.  (Later, the bitstream filter could be applied automatically by the encoder
when it knows it doesn't have packed header support?)

>>
>>  libavcodec/vaapi_encode.c | 22 ++
>>  libavcodec/vaapi_encode.h |  2 ++
>>  2 files changed, 24 insertions(+)
>>
>> diff --git a/libavcodec/vaapi_encode.c b/libavcodec/vaapi_encode.c
>> index b600a00..4dc1f50 100644
>> --- a/libavcodec/vaapi_encode.c
>> +++ b/libavcodec/vaapi_encode.c
>> @@ -1399,6 +1399,28 @@ av_cold int ff_vaapi_encode_init(AVCodecContext *avctx)
>>  // where it actually overlaps properly, though.)
>>  ctx->issue_mode = ISSUE_MODE_MAXIMISE_THROUGHPUT;
>>
>> +if (ctx->va_packed_headers & VA_ENC_PACKED_HEADER_SEQUENCE &&
>> +ctx->codec->write_sequence_header) {
>> +char data[MAX_PARAM_BUFFER_SIZE];
>> +size_t bit_len = 8 * sizeof(data);
>> +
>> +err = ctx->codec->write_sequence_header(avctx, data, &bit_len);
>> +if (err < 0) {
>> +av_log(avctx, AV_LOG_ERROR, "Failed to write sequence header "
>> +   "for extradata: %d.\n", err);
>> +goto fail;
>> +} else {
>> +avctx->extradata_size = bit_len / 8;
> 
> Round up? Or is this guaranteed to be byte-aligned (in which case why is
> it not in bytes in the first place)?

I'll add the round up.

It is byte-aligned in all current cases; the length is in bits to be consistent
with the other header-writing functions, which can produce non-byte-aligned 
output.
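
The round-up mentioned above is the usual ceiling division. A minimal sketch, with bits_to_bytes as a hypothetical helper name:

```c
#include <stddef.h>

/* Number of whole bytes needed to hold bit_len bits. */
static size_t bits_to_bytes(size_t bit_len)
{
    return (bit_len + 7) / 8;
}
```

For lengths that are already byte-aligned this gives the same result as bit_len / 8, so it is safe to use in both cases.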

Thanks,

- Mark



Re: [libav-devel] [PATCH] vaapi_encode: Write sequence header as extradata

2016-10-12 Thread Anton Khirnov
Quoting Mark Thompson (2016-10-10 21:54:35)
> Only works if packed headers are supported, where we can know the
> output before generating the first frame.
> ---
> Added padding; fail harder; informative comment in header.
> 
> Not sure how to do this in the non-packed-header case - we could just
> invoke this anyway, but the result is unlikely to precisely match what
> the encoder then produces.

I guess we shouldn't then -- creating corrupted files is evil. At least
in avconv we can now use the extract_extradata bitstream filter (once it
goes in) to get the extradata from the first packet.

> 
>  libavcodec/vaapi_encode.c | 22 ++
>  libavcodec/vaapi_encode.h |  2 ++
>  2 files changed, 24 insertions(+)
> 
> diff --git a/libavcodec/vaapi_encode.c b/libavcodec/vaapi_encode.c
> index b600a00..4dc1f50 100644
> --- a/libavcodec/vaapi_encode.c
> +++ b/libavcodec/vaapi_encode.c
> @@ -1399,6 +1399,28 @@ av_cold int ff_vaapi_encode_init(AVCodecContext *avctx)
>  // where it actually overlaps properly, though.)
>  ctx->issue_mode = ISSUE_MODE_MAXIMISE_THROUGHPUT;
> 
> +if (ctx->va_packed_headers & VA_ENC_PACKED_HEADER_SEQUENCE &&
> +ctx->codec->write_sequence_header) {
> +char data[MAX_PARAM_BUFFER_SIZE];
> +size_t bit_len = 8 * sizeof(data);
> +
> +err = ctx->codec->write_sequence_header(avctx, data, &bit_len);
> +if (err < 0) {
> +av_log(avctx, AV_LOG_ERROR, "Failed to write sequence header "
> +   "for extradata: %d.\n", err);
> +goto fail;
> +} else {
> +avctx->extradata_size = bit_len / 8;

Round up? Or is this guaranteed to be byte-aligned (in which case why is
it not in bytes in the first place)?

-- 
Anton Khirnov


Re: [libav-devel] GBRP12 support

2016-10-12 Thread Luca Barbato
On 12/10/2016 06:57, Diego Biurrun wrote:
> Thanks, I was not seeing the forest for the trees...

And that's why we should devote some more time to the poor neglected
avscale =) (mine and Vittorio's branches are updated every month =p)

lu


[libav-devel] [PATCH 2/2] hevcdec: move parameter set parsing into a separate header

2016-10-12 Thread Anton Khirnov
This code is independent from the decoder, so it makes more sense for it
to have its own header.
---
 libavcodec/hevc.h|   5 +
 libavcodec/hevc_ps.c |  14 +--
 libavcodec/hevc_ps.h | 320 +++
 libavcodec/hevc_ps_enc.c |   2 +-
 libavcodec/hevc_refs.c   |   2 +-
 libavcodec/hevcdec.c |   2 +-
 libavcodec/hevcdec.h | 296 +--
 7 files changed, 339 insertions(+), 302 deletions(-)
 create mode 100644 libavcodec/hevc_ps.h

diff --git a/libavcodec/hevc.h b/libavcodec/hevc.h
index 66816b8..9536608 100644
--- a/libavcodec/hevc.h
+++ b/libavcodec/hevc.h
@@ -62,4 +62,9 @@ enum HEVCNALUnitType {
 #define HEVC_MAX_SHORT_TERM_RPS_COUNT 64
 #define HEVC_MAX_CU_SIZE 128
 
+#define HEVC_MAX_REFS 16
+#define HEVC_MAX_DPB_SIZE 16 // A.4.1
+
+#define HEVC_MAX_LOG2_CTB_SIZE 6
+
 #endif /* AVCODEC_HEVC_H */
diff --git a/libavcodec/hevc_ps.c b/libavcodec/hevc_ps.c
index 44db326..6a8cfeb 100644
--- a/libavcodec/hevc_ps.c
+++ b/libavcodec/hevc_ps.c
@@ -26,8 +26,8 @@
 #include "libavutil/imgutils.h"
 
 #include "golomb.h"
-#include "hevcdec.h"
 #include "hevc_data.h"
+#include "hevc_ps.h"
 
 static const uint8_t default_scaling_list_intra[] = {
 16, 16, 16, 16, 17, 18, 21, 24,
@@ -200,8 +200,8 @@ int ff_hevc_decode_short_term_rps(GetBitContext *gb, AVCodecContext *avctx,
 rps->num_negative_pics = get_ue_golomb_long(gb);
 nb_positive_pics   = get_ue_golomb_long(gb);
 
-if (rps->num_negative_pics >= MAX_REFS ||
-nb_positive_pics >= MAX_REFS) {
+if (rps->num_negative_pics >= HEVC_MAX_REFS ||
+nb_positive_pics >= HEVC_MAX_REFS) {
 av_log(avctx, AV_LOG_ERROR, "Too many refs in a short term RPS.\n");
 return AVERROR_INVALIDDATA;
 }
@@ -406,7 +406,7 @@ int ff_hevc_decode_nal_vps(GetBitContext *gb, AVCodecContext *avctx,
 vps->vps_num_reorder_pics[i]  = get_ue_golomb_long(gb);
 vps->vps_max_latency_increase[i]  = get_ue_golomb_long(gb) - 1;
 
-if (vps->vps_max_dec_pic_buffering[i] > MAX_DPB_SIZE) {
+if (vps->vps_max_dec_pic_buffering[i] > HEVC_MAX_DPB_SIZE) {
 av_log(avctx, AV_LOG_ERROR, "vps_max_dec_pic_buffering_minus1 out of range: %d\n",
vps->vps_max_dec_pic_buffering[i] - 1);
 goto err;
@@ -794,7 +794,7 @@ int ff_hevc_parse_sps(HEVCSPS *sps, GetBitContext *gb, unsigned int *sps_id,
 sps->temporal_layer[i].max_dec_pic_buffering = get_ue_golomb_long(gb) + 1;
 sps->temporal_layer[i].num_reorder_pics  = get_ue_golomb_long(gb);
 sps->temporal_layer[i].max_latency_increase  = get_ue_golomb_long(gb) - 1;
-if (sps->temporal_layer[i].max_dec_pic_buffering > MAX_DPB_SIZE) {
+if (sps->temporal_layer[i].max_dec_pic_buffering > HEVC_MAX_DPB_SIZE) {
 av_log(avctx, AV_LOG_ERROR, "sps_max_dec_pic_buffering_minus1 out of range: %d\n",
sps->temporal_layer[i].max_dec_pic_buffering - 1);
 ret = AVERROR_INVALIDDATA;
@@ -804,7 +804,7 @@ int ff_hevc_parse_sps(HEVCSPS *sps, GetBitContext *gb, unsigned int *sps_id,
 av_log(avctx, AV_LOG_WARNING, "sps_max_num_reorder_pics out of range: %d\n",
sps->temporal_layer[i].num_reorder_pics);
 if (avctx->err_recognition & AV_EF_EXPLODE ||
-sps->temporal_layer[i].num_reorder_pics > MAX_DPB_SIZE - 1) {
+sps->temporal_layer[i].num_reorder_pics > HEVC_MAX_DPB_SIZE - 1) {
 ret = AVERROR_INVALIDDATA;
 goto err;
 }
@@ -955,7 +955,7 @@ int ff_hevc_parse_sps(HEVCSPS *sps, GetBitContext *gb, unsigned int *sps_id,
 goto err;
 }
 
-if (sps->log2_ctb_size > MAX_LOG2_CTB_SIZE) {
+if (sps->log2_ctb_size > HEVC_MAX_LOG2_CTB_SIZE) {
 av_log(avctx, AV_LOG_ERROR, "CTB size out of range: 2^%d\n", sps->log2_ctb_size);
 goto err;
 }
diff --git a/libavcodec/hevc_ps.h b/libavcodec/hevc_ps.h
new file mode 100644
index 000..d95aa51
--- /dev/null
+++ b/libavcodec/hevc_ps.h
@@ -0,0 +1,320 @@
+/*
+ * HEVC parameter set parsing
+ *
+ * This file is part of Libav.
+ *
+ * Libav is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * Libav is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with Libav; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#ifndef 

[libav-devel] [PATCH 1/2] hevcdec: split ff_hevc_diag_scan* declarations into a separate header

2016-10-12 Thread Anton Khirnov
This will be useful in the following commits.
---
 libavcodec/dxva2_hevc.c |  1 +
 libavcodec/hevc_data.h  | 31 +++
 libavcodec/hevc_ps.c|  1 +
 libavcodec/hevcdec.c|  1 +
 libavcodec/hevcdec.h|  5 -
 libavcodec/vdpau_hevc.c |  1 +
 6 files changed, 35 insertions(+), 5 deletions(-)
 create mode 100644 libavcodec/hevc_data.h

diff --git a/libavcodec/dxva2_hevc.c b/libavcodec/dxva2_hevc.c
index 53fd638..673fada 100644
--- a/libavcodec/dxva2_hevc.c
+++ b/libavcodec/dxva2_hevc.c
@@ -22,6 +22,7 @@
 
 #include "libavutil/avassert.h"
 
+#include "hevc_data.h"
 #include "hevcdec.h"
 
 // The headers above may include w32threads.h, which uses the original
diff --git a/libavcodec/hevc_data.h b/libavcodec/hevc_data.h
new file mode 100644
index 000..d1d2c33
--- /dev/null
+++ b/libavcodec/hevc_data.h
@@ -0,0 +1,31 @@
+/*
+ * HEVC shared data tables
+ *
+ * This file is part of Libav.
+ *
+ * Libav is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * Libav is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with Libav; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#ifndef AVCODEC_HEVC_DATA_H
+#define AVCODEC_HEVC_DATA_H
+
+#include <stdint.h>
+
+extern const uint8_t ff_hevc_diag_scan4x4_x[16];
+extern const uint8_t ff_hevc_diag_scan4x4_y[16];
+extern const uint8_t ff_hevc_diag_scan8x8_x[64];
+extern const uint8_t ff_hevc_diag_scan8x8_y[64];
+
+#endif /* AVCODEC_HEVC_DATA_H */
diff --git a/libavcodec/hevc_ps.c b/libavcodec/hevc_ps.c
index 520017b..44db326 100644
--- a/libavcodec/hevc_ps.c
+++ b/libavcodec/hevc_ps.c
@@ -27,6 +27,7 @@
 
 #include "golomb.h"
 #include "hevcdec.h"
+#include "hevc_data.h"
 
 static const uint8_t default_scaling_list_intra[] = {
 16, 16, 16, 16, 17, 18, 21, 24,
diff --git a/libavcodec/hevcdec.c b/libavcodec/hevcdec.c
index 1da4334..93eb95a 100644
--- a/libavcodec/hevcdec.c
+++ b/libavcodec/hevcdec.c
@@ -37,6 +37,7 @@
 #include "cabac_functions.h"
 #include "golomb.h"
 #include "hevc.h"
+#include "hevc_data.h"
 #include "hevcdec.h"
 #include "profiles.h"
 
diff --git a/libavcodec/hevcdec.h b/libavcodec/hevcdec.h
index 9566223..b4502c0 100644
--- a/libavcodec/hevcdec.h
+++ b/libavcodec/hevcdec.h
@@ -966,9 +966,4 @@ extern const uint8_t ff_hevc_qpel_extra_before[4];
 extern const uint8_t ff_hevc_qpel_extra_after[4];
 extern const uint8_t ff_hevc_qpel_extra[4];
 
-extern const uint8_t ff_hevc_diag_scan4x4_x[16];
-extern const uint8_t ff_hevc_diag_scan4x4_y[16];
-extern const uint8_t ff_hevc_diag_scan8x8_x[64];
-extern const uint8_t ff_hevc_diag_scan8x8_y[64];
-
 #endif /* AVCODEC_HEVCDEC_H */
diff --git a/libavcodec/vdpau_hevc.c b/libavcodec/vdpau_hevc.c
index 9f2baa7..8299456 100644
--- a/libavcodec/vdpau_hevc.c
+++ b/libavcodec/vdpau_hevc.c
@@ -24,6 +24,7 @@
 
 #include "avcodec.h"
 #include "internal.h"
+#include "hevc_data.h"
 #include "hevcdec.h"
 #include "vdpau.h"
 #include "vdpau_internal.h"
-- 
2.0.0



Re: [libav-devel] [PATCH 2/9] hwcontext_dxva2: add support for the P8 format

2016-10-12 Thread Anton Khirnov
Quoting Hendrik Leppkes (2016-10-11 22:33:43)
> On Tue, Oct 11, 2016 at 9:34 PM, Anton Khirnov wrote:
> > This format is used internally by the QSV encoder to store the encoded
> > bitstream.
> 
> What does that even mean?
> It smells like some evil hackery going into the nice clean hwcontext.

Moderately evil.
If you want to use gpu surfaces with QSV, you need to supply a frame
allocator, which will be invoked to pass surface pools to it. For
encoding, this allocator gets invoked not only for the pool of input
frames, but also for a separate pool of (I assume) reconstructed frames
and another pool of MFX_FOURCC_P8, which on Windows needs to return
D3DFMT_P8 d3d surfaces. I think those are used to store the encoded
bitstream on the GPU.
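
The format-selection step of such an allocator could be sketched roughly
as below. This is not the code from the patch: sketch_fourcc_to_d3d() and
every SKETCH_* name are made up for illustration, and the numeric values
are assumptions standing in for what the Media SDK and D3D9 headers
define, so the block compiles without either SDK.

```c
#include <stdint.h>

/* Hypothetical stand-ins; the real definitions live in the Media SDK and
 * D3D9 headers. The values below are assumptions made for this sketch. */
#define SKETCH_MAKEFOURCC(a, b, c, d)                     \
    ((uint32_t)(a)         | ((uint32_t)(b) << 8) |       \
     ((uint32_t)(c) << 16) | ((uint32_t)(d) << 24))

#define SKETCH_FOURCC_NV12    SKETCH_MAKEFOURCC('N', 'V', '1', '2')
#define SKETCH_FOURCC_P8      41u /* assumed value of the P8 fourcc */

#define SKETCH_D3DFMT_UNKNOWN 0u
#define SKETCH_D3DFMT_P8      41u /* assumed value of the P8 surface format */

/* Pick the surface format for a pool requested with the given fourcc.
 * Returns SKETCH_D3DFMT_UNKNOWN for anything the sketch does not handle. */
static uint32_t sketch_fourcc_to_d3d(uint32_t fourcc)
{
    switch (fourcc) {
    case SKETCH_FOURCC_NV12:
        /* Input frames: assumed to use the same fourcc code as the
         * surface format. */
        return SKETCH_FOURCC_NV12;
    case SKETCH_FOURCC_P8:
        /* The extra pool described above, which needs P8 surfaces. */
        return SKETCH_D3DFMT_P8;
    default:
        return SKETCH_D3DFMT_UNKNOWN;
    }
}
```

A real allocator callback would then create a pool of surfaces in the
returned format, or fail the request if the format is unknown.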

In any case, while using P8 surfaces for this purpose is indeed rather
strange, there's nothing especially wrong about supporting them in the
DXVA2 hwcontext, it's just another surface format. The only hacky part
is the actual palette itself -- I didn't find a way to retrieve the
palette from a surface, so it's "emulated" by a dummy zero-filled
buffer. The actual contents are irrelevant for my specific use case (and
I don't expect it will be used for anything else).
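
The "emulated" palette amounts to something like the following minimal
sketch. SketchMapping and sketch_map_p8() are hypothetical names, not the
ones used in the patch. The layout assumed here is a paletted frame that
carries the 8-bit indices in plane 0 and a 256-entry, 32-bit palette in
plane 1.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_PALETTE_ENTRIES 256 /* one entry per possible 8-bit index */

/* Hypothetical per-mapping state: plane pointers plus the dummy palette. */
typedef struct SketchMapping {
    uint8_t *data[2];     /* [0] = index plane, [1] = palette */
    int      linesize[2];
    uint32_t palette_dummy[SKETCH_PALETTE_ENTRIES];
} SketchMapping;

/* Expose a locked P8 surface (bits/pitch) as a paletted frame. Since the
 * real palette cannot be read back, plane 1 points at a zero-filled
 * buffer owned by the mapping. */
static void sketch_map_p8(SketchMapping *m, uint8_t *bits, int pitch)
{
    assert(m && bits);

    m->data[0]     = bits;
    m->linesize[0] = pitch;

    memset(m->palette_dummy, 0, sizeof(m->palette_dummy));
    m->data[1]     = (uint8_t *)m->palette_dummy;
    m->linesize[1] = 0;
}
```

Anything that consumes plane 1 sees an all-zero palette, which is
harmless as long as the surface contents are only used as an opaque
buffer, as in the use case described above.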

-- 
Anton Khirnov