Re: [libav-devel] [PATCH] checkasm: aarch64: Specify alignment for the register_init const array

2017-05-04 Thread Luca Barbato
On 5/4/17 8:46 PM, Martin Storsjö wrote:
> Loads from this don't strictly require alignment, but specify it
> just for consistency with the arm version.
> ---
>  tests/checkasm/aarch64/checkasm.S | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/tests/checkasm/aarch64/checkasm.S b/tests/checkasm/aarch64/checkasm.S
> index bc5ed9ea09..327dfc0802 100644
> --- a/tests/checkasm/aarch64/checkasm.S
> +++ b/tests/checkasm/aarch64/checkasm.S
> @@ -22,7 +22,7 @@
>  
>  #include "libavutil/aarch64/asm.S"
>  
> -const register_init
> +const register_init, align=4
>  .quad 0x21f86d66c8ca00ce
>  .quad 0x75b6ba21077c48ad
>  .quad 0xed56bb2dcb3c7736
> 

Sounds like a good idea.
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel

[libav-devel] [PATCH] checkasm: aarch64: Specify alignment for the register_init const array

2017-05-04 Thread Martin Storsjö
Loads from this don't strictly require alignment, but specify it
just for consistency with the arm version.
---
 tests/checkasm/aarch64/checkasm.S | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tests/checkasm/aarch64/checkasm.S b/tests/checkasm/aarch64/checkasm.S
index bc5ed9ea09..327dfc0802 100644
--- a/tests/checkasm/aarch64/checkasm.S
+++ b/tests/checkasm/aarch64/checkasm.S
@@ -22,7 +22,7 @@
 
 #include "libavutil/aarch64/asm.S"
 
-const register_init
+const register_init, align=4
 .quad 0x21f86d66c8ca00ce
 .quad 0x75b6ba21077c48ad
 .quad 0xed56bb2dcb3c7736
-- 
2.11.0 (Apple Git-81)


Re: [libav-devel] [PATCH 2/2] hevc: Add NEON 32x32 IDCT

2017-05-04 Thread Martin Storsjö

On Thu, 4 May 2017, Alexandra Hájková wrote:


---
libavcodec/arm/hevc_idct.S| 311 +++---
libavcodec/arm/hevcdsp_init_arm.c |   4 +
2 files changed, 294 insertions(+), 21 deletions(-)


My main issues with it have been taken care of, so I don't see it as too 
bad any longer. I haven't read the code in detail, but the overall 
structure is more or less sound at least, so I'm ok with it going in for 
now. The speedup vs C code is 8-18x, so it's clearly worthwhile at least.


Will push.

// Martin

Re: [libav-devel] [PATCH 1/2] hevc: 16x16 NEON idct: Use the right element size for stores.

2017-05-04 Thread Martin Storsjö

On Thu, 4 May 2017, Alexandra Hájková wrote:


This doesn't change the actual behaviour of the code but improves
readability.
---
libavcodec/arm/hevc_idct.S | 16 
1 file changed, 8 insertions(+), 8 deletions(-)


OK

// Martin

[libav-devel] [GASPP PATCH] Support converting more instructions to their thumb equivalent

2017-05-04 Thread Martin Storsjo
---
These are used for supporting building x264 for windows/arm with
msvc/armasm (currently in the x264 sandbox repo).
---
 gas-preprocessor.pl | 14 ++
 1 file changed, 14 insertions(+)

diff --git a/gas-preprocessor.pl b/gas-preprocessor.pl
index 35d201d..afdfc9e 100755
--- a/gas-preprocessor.pl
+++ b/gas-preprocessor.pl
@@ -951,6 +951,20 @@ sub handle_serialized_line {
 $line =~ s/stm(?:db|fd)\s+sp!\s*,\s*\{([^,-]+)\}/str $1, [sp, #-4]!/g;
 $line =~ s/ldm(?:ia|fd)?\s+sp!\s*,\s*\{([^,-]+)\}/ldr $1, [sp], #4/g;
 
+# Convert muls into mul+cmp
+$line =~ s/muls\s+(\w+),\s*(\w+)\,\s*(\w+)/mul $1, $2, $3\n\tcmp $1, #0/g;
+
+# Convert "and r0, sp, #xx" into "mov r0, sp", "and r0, r0, #xx"
+$line =~ s/and\s+(\w+),\s*(sp|r13)\,\s*#(\w+)/mov $1, $2\n\tand $1, $1, #$3/g;
+
+# Convert "ldr r0, [r0, r1, lsl #6]" where the shift is >3 (which
+# can't be handled in thumb) into "add r0, r0, r1, lsl #6",
+# "ldr r0, [r0]", for the special case where the same address is
+# used as base and target for the ldr.
+if ($line =~ /(ldr[bh]?)\s+(\w+),\s*\[\2,\s*(\w+),\s*lsl\s*#(\w+)\]/ and $4 > 3) {
+$line =~ s/(ldr[bh]?)\s+(\w+),\s*\[\2,\s*(\w+),\s*lsl\s*#(\w+)\]/add $2, $2, $3, lsl #$4\n\t$1 $2, [$2]/;
+}
+
 $line =~ s/\.arm/.thumb/x;
 }
 
-- 
2.11.0 (Apple Git-81)


Re: [libav-devel] [PATCH 1/5] lavu: add new D3D11 pixfmt and hwcontext

2017-05-04 Thread wm4
On Thu,  4 May 2017 08:44:04 +0200
wm4 wrote:

> +AV_PIX_FMT_D3D11, ///< HW decoding through Direct3D11 via new API, Picture.data[0] contains a ID3D11Texture2D pointer, and data[1] contains the texture array index of the frame as intptr_t if the ID3D11Texture2D is an array texture (or 0 if it's a normal texture)
> +
>  AV_PIX_FMT_NB, ///< number of pixel formats, DO NOT USE THIS if you want to link with shared libav*

By the way, there is probably no strict need for a new pixfmt.
With the "new" D3D11 hwaccel, we need it to carry different objects (a
texture handle instead of a decoder view). So the semantics change,
which would warrant a new pixfmt.

On the other hand, there's no hard technical reason to use a new
pixfmt. We could just change the definition of AV_PIX_FMT_D3D11VA_VLD
to depend on which API is used.

Pro (for new, separate AV_PIX_FMT_D3D11):
- cleaner
- avoids confusion
- chance that the old API is deprecated, and AV_PIX_FMT_D3D11VA_VLD is
  removed, also removing the problem

Contra:
- libavcodec dxva2 code needs tons of changes to deal with both d3d11
  formats
- separate AVHWAccels needed just because of the pixfmt

Which should it be?

Or maybe I'm missing something big here due to sleep deprivation.
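
For reference, the frame layout described in the quoted AV_PIX_FMT_D3D11
comment (data[0] holds the texture pointer, data[1] holds the array index
stored as intptr_t) could be sketched roughly as below. This is only an
illustration under stated assumptions: SketchFrame, SketchD3D11Ref and the
sketch_* helpers are hypothetical stand-ins and not the real AVFrame or any
libav API, and the texture is kept as a plain void pointer so the sketch
builds without the Windows headers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the two AVFrame fields the pixfmt comment
 * mentions. This is NOT the real AVFrame. */
typedef struct SketchFrame {
    uint8_t *data[8];
} SketchFrame;

/* Hypothetical unpacked view of such a frame. */
typedef struct SketchD3D11Ref {
    void    *texture; /* would be an ID3D11Texture2D pointer on Windows */
    intptr_t index;   /* array slice, or 0 for a non-array texture */
} SketchD3D11Ref;

/* Store the texture pointer in data[0] and the slice index in data[1].
 * The integer-to-pointer cast is implementation-defined in C, but it
 * round-trips on the platforms this layout targets. */
static void sketch_pack(SketchFrame *f, void *texture, intptr_t index)
{
    assert(index >= 0);
    f->data[0] = (uint8_t *)texture;
    f->data[1] = (uint8_t *)index;
}

/* Recover the texture pointer and slice index from the frame. */
static SketchD3D11Ref sketch_unpack(const SketchFrame *f)
{
    SketchD3D11Ref ref;
    ref.texture = f->data[0];
    ref.index   = (intptr_t)f->data[1];
    return ref;
}
```

With the old pixfmt the same slot would carry a decoder output view instead,
which is the semantic difference being weighed in this message.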

Re: [libav-devel] [PATCH 4/5] dxva: move d3d11 locking/unlocking to functions

2017-05-04 Thread wm4
On Thu, 4 May 2017 11:15:42 +0200
Hendrik Leppkes wrote:

> On Thu, May 4, 2017 at 8:44 AM, wm4 wrote:
> > I want to make it non-mandatory to set a mutex in the D3D11 device
> > context, and replacing it with user callbacks seems like the best
> > solution. This is preparation for it. Also makes the code slightly more
> > readable.
> >  
> 
> With recent frame-mt hwaccel changes, a user that needs this locking
> could just do it externally around the decode function. Maybe we
> should just get rid of it in the "new" d3d11 hwaccel?

Yeah, I'm not sure how this should be handled, or if this sort of
locking is even required. Note that for sane refcounting and hwframes
functionality in a multithreaded setting there would need to be a
locking mechanism in the hwcontext - but only if this kind of locking
is needed at all. It's not clear to me whether it's needed. It could be
cargo-cult.
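
To make the "user callbacks" idea from the patch description a bit more
concrete, here is a minimal sketch of an optional lock/unlock callback pair.
Everything in it is an assumption made for illustration: the
SketchLockCallbacks struct and the sketch_cb_* helpers are hypothetical names,
not part of the proposed API, and the example "lock" is just a depth counter
rather than a real mutex.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback pair; the names are illustrative only. */
typedef struct SketchLockCallbacks {
    void (*lock)(void *opaque);
    void (*unlock)(void *opaque);
    void *opaque;
} SketchLockCallbacks;

/* Take the lock if the user supplied callbacks; otherwise do nothing. */
static void sketch_cb_lock(const SketchLockCallbacks *cb)
{
    /* Either both callbacks are set or neither is. */
    assert(!cb || (cb->lock == NULL) == (cb->unlock == NULL));
    if (cb && cb->lock)
        cb->lock(cb->opaque);
}

/* Release the lock if the user supplied callbacks; otherwise do nothing. */
static void sketch_cb_unlock(const SketchLockCallbacks *cb)
{
    assert(!cb || (cb->lock == NULL) == (cb->unlock == NULL));
    if (cb && cb->unlock)
        cb->unlock(cb->opaque);
}

/* Example user-side "lock": a plain depth counter standing in for a mutex. */
static void sketch_count_lock(void *opaque)   { ++*(int *)opaque; }
static void sketch_count_unlock(void *opaque) { --*(int *)opaque; }
```

A user who needs no locking would simply leave both callbacks NULL, which is
the non-mandatory behaviour the patch description asks for.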

Re: [libav-devel] [PATCH] arm: Check for the .arch directive in configure

2017-05-04 Thread Luca Barbato
On 5/4/17 10:45 AM, Martin Storsjö wrote:
> is used), as suggested by Janne on irc.

Looks fine.

Re: [libav-devel] [PATCH 4/5] dxva: move d3d11 locking/unlocking to functions

2017-05-04 Thread Hendrik Leppkes
On Thu, May 4, 2017 at 8:44 AM, wm4 wrote:
> I want to make it non-mandatory to set a mutex in the D3D11 device
> context, and replacing it with user callbacks seems like the best
> solution. This is preparation for it. Also makes the code slightly more
> readable.
>

With recent frame-mt hwaccel changes, a user that needs this locking
could just do it externally around the decode function. Maybe we
should just get rid of it in the "new" d3d11 hwaccel?

- Hendrik

[libav-devel] [PATCH 2/2] hevc: Add NEON 32x32 IDCT

2017-05-04 Thread Alexandra Hájková
---
 libavcodec/arm/hevc_idct.S| 311 +++---
 libavcodec/arm/hevcdsp_init_arm.c |   4 +
 2 files changed, 294 insertions(+), 21 deletions(-)

diff --git a/libavcodec/arm/hevc_idct.S b/libavcodec/arm/hevc_idct.S
index eeb81e3..79799b2 100644
--- a/libavcodec/arm/hevc_idct.S
+++ b/libavcodec/arm/hevc_idct.S
@@ -28,6 +28,10 @@ const trans, align=4
 .short 89, 75, 50, 18
 .short 90, 87, 80, 70
 .short 57, 43, 25, 9
+.short 90, 90, 88, 85
+.short 82, 78, 73, 67
+.short 61, 54, 46, 38
+.short 31, 22, 13, 4
 endconst
 
 .macro clip10 in1, in2, c1, c2
@@ -509,7 +513,7 @@ endfunc
 vsub.s32 \tmp_m, \e, \o
 .endm
 
-.macro tr16_8x4 in0, in1, in2, in3, in4, in5, in6, in7
+.macro tr16_8x4 in0, in1, in2, in3, in4, in5, in6, in7, offset
 tr_4x4_8 \in0, \in2, \in4, \in6, q8, q9, q10, q11, q12, q13, q14, q15
 
 vmull.s16   q12, \in1, \in0[0]
@@ -535,7 +539,7 @@ endfunc
 butterfly   q9,  q13, q1, q6
 butterfly   q10, q14, q2, q5
 butterfly   q11, q15, q3, q4
-add r4,  sp,  #512
+add r4,  sp,  #\offset
 vst1.s32 {q0-q1}, [r4, :128]!
 vst1.s32 {q2-q3}, [r4, :128]!
 vst1.s32 {q4-q5}, [r4, :128]!
@@ -575,15 +579,15 @@ endfunc
 vsub.s32 \in6, \in6, \in7
 .endm
 
-.macro store16 in0, in1, in2, in3, in4, in5, in6, in7
+.macro store16 in0, in1, in2, in3, in4, in5, in6, in7, rx
 vst1.s16 \in0, [r1, :64], r2
-vst1.s16 \in1, [r3, :64], r4
+vst1.s16 \in1, [r3, :64], \rx
 vst1.s16 \in2, [r1, :64], r2
-vst1.s16 \in3, [r3, :64], r4
+vst1.s16 \in3, [r3, :64], \rx
 vst1.s16 \in4, [r1, :64], r2
-vst1.s16 \in5, [r3, :64], r4
+vst1.s16 \in5, [r3, :64], \rx
 vst1.s16 \in6, [r1, :64], r2
-vst1.s16 \in7, [r3, :64], r4
+vst1.s16 \in7, [r3, :64], \rx
 .endm
 
 .macro scale out0, out1, out2, out3, out4, out5, out6, out7, in0, in1, in2, in3, in4, in5, in6, in7, shift
@@ -597,19 +601,35 @@ endfunc
 vqrshrn.s32 \out7, \in7, \shift
 .endm
 
-.macro tr_16x4 name, shift
+@stores in0, in2, in4, in6 ascending from off1 and
+@stores in1, in3, in5, in7 descending from off2
+.macro store_to_stack off1, off2, in0, in2, in4, in6, in7, in5, in3, in1
+add r1, sp, #\off1
+add r3, sp, #\off2
+mov r2, #-16
+vst1.s32 {\in0}, [r1, :128]!
+vst1.s32 {\in1}, [r3, :128], r2
+vst1.s32 {\in2}, [r1, :128]!
+vst1.s32 {\in3}, [r3, :128], r2
+vst1.s32 {\in4}, [r1, :128]!
+vst1.s32 {\in5}, [r3, :128], r2
+vst1.s32 {\in6}, [r1, :128]
+vst1.s32 {\in7}, [r3, :128]
+.endm
+
+.macro tr_16x4 name, shift, offset, step
 function func_tr_16x4_\name
 mov r1,  r5
-add r3,  r5, #64
-mov r2,  #128
+add r3, r5, #(\step * 64)
+mov r2, #(\step * 128)
 load16  d0, d1, d2, d3, d4, d5, d6, d7
 movrel  r1, trans
 
-tr16_8x4 d0, d1, d2, d3, d4, d5, d6, d7
+tr16_8x4 d0, d1, d2, d3, d4, d5, d6, d7, \offset
 
-add r1,  r5, #32
-add r3,  r5, #(64 + 32)
-mov r2,  #128
+add r1,  r5, #(\step * 32)
+add r3,  r5, #(\step * 3 *32)
+mov r2,  #(\step * 128)
 load16  d8, d9, d2, d3, d4, d5, d6, d7
 movrel  r1, trans + 16
 vld1.s16 {q0}, [r1, :128]
@@ -630,11 +650,12 @@ function func_tr_16x4_\name
 add_member  d6, d1[2], d0[3], d0[0], d0[2], d1[1], d1[3], d1[0], d0[1], +, -, +, -, +, +, -, +
 add_member  d7, d1[3], d1[2], d1[1], d1[0], d0[3], d0[2], d0[1], d0[0], +, -, +, -, +, -, +, -
 
-add r4, sp, #512
+add r4, sp, #\offset
 vld1.s32 {q0-q1}, [r4, :128]!
 vld1.s32 {q2-q3}, [r4, :128]!
 
 butterfly16 q0, q5, q1, q6, q2, q7, q3, q8
+.if \shift > 0
 scale   d26, d27, d28, d29, d30, d31, d16, d17, q4, q0, q5, q1, q6, q2, q7, q3, \shift
 transpose8_4x4  d26, d28, d30, d16
 transpose8_4x4  d17, d31, d29, d27
@@ -642,12 +663,16 @@ function func_tr_16x4_\name
 add r3, r6, #(24 +3*32)
 mov r2, #32
 mov r4, #-32
-store16 d26, d27, d28, d29, d30, d31, d16, d17
+store16 d26, d27, d28, d29, d30, d31, d16, d17, r4
+.else
+store_to_stack  \offset, (\offset + 240), q4, q5, q6, q7, q3, q2, q1, q0
+.endif
 
-add   

[libav-devel] [PATCH 1/2] hevc: 16x16 NEON idct: Use the right element size for stores.

2017-05-04 Thread Alexandra Hájková
This doesn't change the actual behaviour of the code but improves
readability.
---
 libavcodec/arm/hevc_idct.S | 16 
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/libavcodec/arm/hevc_idct.S b/libavcodec/arm/hevc_idct.S
index fac5758..eeb81e3 100644
--- a/libavcodec/arm/hevc_idct.S
+++ b/libavcodec/arm/hevc_idct.S
@@ -536,10 +536,10 @@ endfunc
 butterfly   q10, q14, q2, q5
 butterfly   q11, q15, q3, q4
 add r4,  sp,  #512
-vst1.s16 {q0-q1}, [r4, :128]!
-vst1.s16 {q2-q3}, [r4, :128]!
-vst1.s16 {q4-q5}, [r4, :128]!
-vst1.s16 {q6-q7}, [r4, :128]
+vst1.s32 {q0-q1}, [r4, :128]!
+vst1.s32 {q2-q3}, [r4, :128]!
+vst1.s32 {q4-q5}, [r4, :128]!
+vst1.s32 {q6-q7}, [r4, :128]
 .endm
 
 .macro load16 in0, in1, in2, in3, in4, in5, in6, in7
@@ -631,8 +631,8 @@ function func_tr_16x4_\name
 add_member  d7, d1[3], d1[2], d1[1], d1[0], d0[3], d0[2], d0[1], d0[0], +, -, +, -, +, -, +, -
 
 add r4, sp, #512
-vld1.s16 {q0-q1}, [r4, :128]!
-vld1.s16 {q2-q3}, [r4, :128]!
+vld1.s32 {q0-q1}, [r4, :128]!
+vld1.s32 {q2-q3}, [r4, :128]!
 
 butterfly16 q0, q5, q1, q6, q2, q7, q3, q8
 scale   d26, d27, d28, d29, d30, d31, d16, d17, q4, q0, q5, q1, q6, q2, q7, q3, \shift
@@ -645,8 +645,8 @@ function func_tr_16x4_\name
 store16 d26, d27, d28, d29, d30, d31, d16, d17
 
 add r4, sp, #576
-vld1.s16 {q0-q1}, [r4, :128]!
-vld1.s16 {q2-q3}, [r4, :128]
+vld1.s32 {q0-q1}, [r4, :128]!
+vld1.s32 {q2-q3}, [r4, :128]
 butterfly16 q0, q9, q1, q10, q2, q11, q3, q12
 scale   d26, d27, d28, d29, d30, d31, d8, d9, q4, q0, q9, q1, q10, q2, q11, q3, \shift
 transpose8_4x4  d26, d28, d30, d8
-- 
2.10.2


[libav-devel] [PATCH] arm: Check for the .arch directive in configure

2017-05-04 Thread Martin Storsjö
When targeting windows, the .arch directive isn't available.

So far, when building for windows, we've always used gas-preprocessor,
both when using msvc's armasm and when using clang. Lately, clang/llvm
has implemented the last missing piece (altmacro support) for building
our assembly without gas-preprocessor. This means that we now build
for arm/windows with clang without any extra compatibility layer.
---
Updated to use a plain ifdef guard around the block instead of introducing
a line prefix as for e.g. ELF (since this is the only place where .arch
is used), as suggested by Janne on irc.
---
 configure   | 4 
 libavutil/arm/asm.S | 2 ++
 2 files changed, 6 insertions(+)

diff --git a/configure b/configure
index c7d0363..029ae9e 100755
--- a/configure
+++ b/configure
@@ -1661,6 +1661,7 @@ SYSTEM_FUNCS="
 "
 
 TOOLCHAIN_FEATURES="
+as_arch_directive
 as_dn_directive
 as_fpu_directive
 as_func
@@ -4372,6 +4373,9 @@ EOF
 
 check_inline_asm asm_mod_q '"add r0, %Q0, %R0" :: "r"((long long)0)'
 
+check_as 

[libav-devel] [PATCH] arm: Check for the .arch directive in configure

2017-05-04 Thread Martin Storsjö
When targeting windows, the .arch directive isn't available.

So far, when building for windows, we've always used gas-preprocessor,
both when using msvc's armasm and when using clang. Lately, clang/llvm
has implemented the last missing piece (altmacro support) for building
our assembly without gas-preprocessor. This means that we now build
for arm/windows with clang without any extra compatibility layer.
---
 configure   |  4 
 libavutil/arm/asm.S | 14 ++
 2 files changed, 14 insertions(+), 4 deletions(-)

diff --git a/configure b/configure
index c7d0363..029ae9e 100755
--- a/configure
+++ b/configure
@@ -1661,6 +1661,7 @@ SYSTEM_FUNCS="
 "
 
 TOOLCHAIN_FEATURES="
+as_arch_directive
 as_dn_directive
 as_fpu_directive
 as_func
@@ -4372,6 +4373,9 @@ EOF
 
 check_inline_asm asm_mod_q '"add r0, %Q0, %R0" :: "r"((long long)0)'
 
+check_as 

[libav-devel] [PATCH 4/5] dxva: move d3d11 locking/unlocking to functions

2017-05-04 Thread wm4
I want to make it non-mandatory to set a mutex in the D3D11 device
context, and replacing it with user callbacks seems like the best
solution. This is preparation for it. Also makes the code slightly more
readable.
---
And yes, only because INVALID_HANDLE_VALUE != NULL
---
 libavcodec/dxva2.c | 46 --
 1 file changed, 28 insertions(+), 18 deletions(-)

diff --git a/libavcodec/dxva2.c b/libavcodec/dxva2.c
index 1cb79fe294..0d4effd228 100644
--- a/libavcodec/dxva2.c
+++ b/libavcodec/dxva2.c
@@ -29,6 +29,28 @@
 #include "avcodec.h"
 #include "dxva2_internal.h"
 
+static void ff_dxva2_lock(AVCodecContext *avctx)
+{
+#if CONFIG_D3D11VA
+if (ff_dxva2_is_d3d11(avctx)) {
+AVDXVAContext *ctx = DXVA_CONTEXT(avctx);
+if (D3D11VA_CONTEXT(ctx)->context_mutex != INVALID_HANDLE_VALUE)
+WaitForSingleObjectEx(D3D11VA_CONTEXT(ctx)->context_mutex, INFINITE, FALSE);
+}
+#endif
+}
+
+static void ff_dxva2_unlock(AVCodecContext *avctx)
+{
+#if CONFIG_D3D11VA
+if (ff_dxva2_is_d3d11(avctx)) {
+AVDXVAContext *ctx = DXVA_CONTEXT(avctx);
+if (D3D11VA_CONTEXT(ctx)->context_mutex != INVALID_HANDLE_VALUE)
+ReleaseMutex(D3D11VA_CONTEXT(ctx)->context_mutex);
+}
+#endif
+}
+
 static void *get_surface(const AVFrame *frame)
 {
 return frame->data[3];
@@ -153,14 +175,12 @@ int ff_dxva2_common_end_frame(AVCodecContext *avctx, AVFrame *frame,
 unsigned type;
 
 do {
+ff_dxva2_lock(avctx);
 #if CONFIG_D3D11VA
-if (ff_dxva2_is_d3d11(avctx)) {
-if (D3D11VA_CONTEXT(ctx)->context_mutex != INVALID_HANDLE_VALUE)
-WaitForSingleObjectEx(D3D11VA_CONTEXT(ctx)->context_mutex, INFINITE, FALSE);
+if (ff_dxva2_is_d3d11(avctx))
 hr = ID3D11VideoContext_DecoderBeginFrame(D3D11VA_CONTEXT(ctx)->video_context, D3D11VA_CONTEXT(ctx)->decoder,
   get_surface(frame),
   0, NULL);
-}
 #endif
 #if CONFIG_DXVA2
 if (avctx->pix_fmt == AV_PIX_FMT_DXVA2_VLD)
@@ -170,21 +190,13 @@ int ff_dxva2_common_end_frame(AVCodecContext *avctx, AVFrame *frame,
 #endif
 if (hr != E_PENDING || ++runs > 50)
 break;
-#if CONFIG_D3D11VA
-if (ff_dxva2_is_d3d11(avctx))
-if (D3D11VA_CONTEXT(ctx)->context_mutex != INVALID_HANDLE_VALUE)
-ReleaseMutex(D3D11VA_CONTEXT(ctx)->context_mutex);
-#endif
+ff_dxva2_unlock(avctx);
 av_usleep(2000);
 } while(1);
 
 if (FAILED(hr)) {
 av_log(avctx, AV_LOG_ERROR, "Failed to begin frame: 0x%x\n", hr);
-#if CONFIG_D3D11VA
-if (ff_dxva2_is_d3d11(avctx))
-if (D3D11VA_CONTEXT(ctx)->context_mutex != INVALID_HANDLE_VALUE)
-ReleaseMutex(D3D11VA_CONTEXT(ctx)->context_mutex);
-#endif
+ff_dxva2_unlock(avctx);
 return -1;
 }
 
@@ -284,16 +296,14 @@ int ff_dxva2_common_end_frame(AVCodecContext *avctx, AVFrame *frame,
 
 end:
 #if CONFIG_D3D11VA
-if (ff_dxva2_is_d3d11(avctx)) {
+if (ff_dxva2_is_d3d11(avctx))
 hr = ID3D11VideoContext_DecoderEndFrame(D3D11VA_CONTEXT(ctx)->video_context, D3D11VA_CONTEXT(ctx)->decoder);
-if (D3D11VA_CONTEXT(ctx)->context_mutex != INVALID_HANDLE_VALUE)
-ReleaseMutex(D3D11VA_CONTEXT(ctx)->context_mutex);
-}
 #endif
 #if CONFIG_DXVA2
 if (avctx->pix_fmt == AV_PIX_FMT_DXVA2_VLD)
 hr = IDirectXVideoDecoder_EndFrame(DXVA2_CONTEXT(ctx)->decoder, NULL);
 #endif
+ff_dxva2_unlock(avctx);
 if (FAILED(hr)) {
 av_log(avctx, AV_LOG_ERROR, "Failed to end frame: 0x%x\n", hr);
 result = -1;
-- 
2.11.0
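
The control flow this patch leaves in the begin-frame loop (lock, try, unlock
only when retrying, and keep the lock on success until the end-of-frame path
releases it) can be sketched in isolation as below. This is a simplified model
under assumptions, not the dxva2.c code: SketchDecoder, the SKETCH_* result
codes and the sketch_retry_* helpers are hypothetical, the "lock" is a depth
counter, and the sleep between attempts is left out.

```c
#include <assert.h>

/* Hypothetical result codes standing in for S_OK / E_PENDING / failures. */
enum { SKETCH_OK = 0, SKETCH_PENDING = 1 };

/* Hypothetical decoder model used only by this sketch. */
typedef struct SketchDecoder {
    int pending_left; /* how many more attempts report "pending" */
    int lock_depth;   /* > 0 while the lock is held */
    int attempts;     /* number of begin-frame calls made */
} SketchDecoder;

static void sketch_retry_lock(SketchDecoder *d)
{
    d->lock_depth++;
}

static void sketch_retry_unlock(SketchDecoder *d)
{
    assert(d->lock_depth > 0);
    d->lock_depth--;
}

/* Stub for the begin-frame call: pending a few times, then OK. */
static int sketch_begin_frame(SketchDecoder *d)
{
    d->attempts++;
    if (d->pending_left > 0) {
        d->pending_left--;
        return SKETCH_PENDING;
    }
    return SKETCH_OK;
}

/* Returns 0 with the lock still held on success, or -1 with the lock
 * released if the call is still pending after max_runs retries. */
static int sketch_begin_with_retry(SketchDecoder *d, int max_runs)
{
    int runs = 0;
    int hr;

    for (;;) {
        sketch_retry_lock(d);
        hr = sketch_begin_frame(d);
        if (hr != SKETCH_PENDING || ++runs > max_runs)
            break;               /* leave the loop with the lock held */
        sketch_retry_unlock(d);  /* release before waiting and retrying */
        /* the patch sleeps here (av_usleep(2000)); omitted in this sketch */
    }

    if (hr != SKETCH_OK) {
        sketch_retry_unlock(d);
        return -1;
    }
    return 0;                    /* caller unlocks after end-of-frame */
}
```

Keeping every exit path paired with exactly one unlock is what the two helper
functions in the patch make easier to see.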


[libav-devel] [PATCH 0/5] New D3D hwaccel API stuff

2017-05-04 Thread wm4
Radically rebased, and omits a few in-between commits that are
unnecessary for the end result. avconv_dxva2.c should probably
also be deleted, but for now it'd only inflate the diff. As
part of the rebase I've also removed Steve Lhomme as author
name - let me know whether I should set his name back on the
two relevant commits (first and last one), or how this should
be correctly handled.

As far as I'm concerned, this is pretty much finished. Please review or
merge.

wm4 (5):
  lavu: add new D3D11 pixfmt and hwcontext
  lavc: set avctx->hwaccel before init
  dxva: preparations for new hwaccel API
  dxva: move d3d11 locking/unlocking to functions
  dxva: add support for new dxva2 and d3d11 hwaccel APIs

 Changelog  |   1 +
 avtools/avconv.h   |   2 +
 avtools/avconv_opt.c   |   8 +-
 configure  |  18 +-
 doc/APIchanges |   9 +
 libavcodec/allcodecs.c |   5 +
 libavcodec/decode.c|   4 +-
 libavcodec/dxva2.c | 723 +++--
 libavcodec/dxva2_h264.c|  36 +-
 libavcodec/dxva2_hevc.c|  32 +-
 libavcodec/dxva2_internal.h|  63 +++-
 libavcodec/dxva2_mpeg2.c   |  32 +-
 libavcodec/dxva2_vc1.c |  54 ++-
 libavcodec/h264_slice.c|   3 +-
 libavcodec/hevcdec.c   |   3 +-
 libavcodec/mpeg12dec.c |   1 +
 libavcodec/vc1dec.c|   1 +
 libavcodec/version.h   |   4 +-
 libavutil/Makefile |   3 +
 libavutil/hwcontext.c  |   4 +
 libavutil/hwcontext.h  |   1 +
 libavutil/hwcontext_d3d11va.c  | 488 +++
 libavutil/hwcontext_d3d11va.h  | 158 +
 libavutil/hwcontext_dxva2.h|   3 +
 libavutil/hwcontext_internal.h |   1 +
 libavutil/pixdesc.c|   4 +
 libavutil/pixfmt.h |   4 +-
 libavutil/version.h|   4 +-
 28 files changed, 1597 insertions(+), 72 deletions(-)
 create mode 100644 libavutil/hwcontext_d3d11va.c
 create mode 100644 libavutil/hwcontext_d3d11va.h

-- 
2.11.0


[libav-devel] [PATCH 2/5] lavc: set avctx->hwaccel before init

2017-05-04 Thread wm4
So a hwaccel can access avctx->hwaccel in init for whatever reason. This
is for the new d3d hwaccel API. We could create separate entrypoints for
each of the 3 hwaccel types (dxva2, d3d11va, new d3d11va), but this
seems nicer.
---
 libavcodec/decode.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/libavcodec/decode.c b/libavcodec/decode.c
index 8aa27095b6..f7cb05851d 100644
--- a/libavcodec/decode.c
+++ b/libavcodec/decode.c
@@ -740,16 +740,16 @@ static int setup_hwaccel(AVCodecContext *avctx,
 return AVERROR(ENOMEM);
 }
 
+avctx->hwaccel = hwa;
 if (hwa->init) {
 ret = hwa->init(avctx);
 if (ret < 0) {
 av_freep(&avctx->internal->hwaccel_priv_data);
+avctx->hwaccel = NULL;
 return ret;
 }
 }
 
-avctx->hwaccel = hwa;
-
 return 0;
 }
 
-- 
2.11.0
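
The ordering change in this patch (publish the hwaccel pointer before calling
init, and clear it again if init fails) can be sketched on its own as follows.
This is a hedged illustration with made-up types: SketchCodecCtx,
SketchHWAccel and the sketch_* functions are hypothetical and only mirror the
shape of setup_hwaccel(), not its real signature or the private-data handling.

```c
#include <assert.h>
#include <stddef.h>

struct SketchCodecCtx;

/* Hypothetical hwaccel descriptor with an optional init callback. */
typedef struct SketchHWAccel {
    int (*init)(struct SketchCodecCtx *ctx);
} SketchHWAccel;

/* Hypothetical codec context holding the active hwaccel. */
typedef struct SketchCodecCtx {
    const SketchHWAccel *hwaccel;
    int seen_in_init; /* set by init: was ctx->hwaccel visible? */
} SketchCodecCtx;

/* Example init callbacks that record whether ctx->hwaccel was already set. */
static int sketch_init_ok(SketchCodecCtx *ctx)
{
    ctx->seen_in_init = ctx->hwaccel != NULL;
    return 0;
}

static int sketch_init_fail(SketchCodecCtx *ctx)
{
    ctx->seen_in_init = ctx->hwaccel != NULL;
    return -1;
}

/* Set ctx->hwaccel first so init can inspect it; undo on failure. */
static int sketch_setup_hwaccel(SketchCodecCtx *ctx, const SketchHWAccel *hwa)
{
    assert(ctx && hwa);
    ctx->hwaccel = hwa;
    if (hwa->init) {
        int ret = hwa->init(ctx);
        if (ret < 0) {
            ctx->hwaccel = NULL;
            return ret;
        }
    }
    return 0;
}
```

The reset on the error path matters because callers would otherwise see a
hwaccel pointer for an accelerator that never finished initializing.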


[libav-devel] [PATCH 3/5] dxva: preparations for new hwaccel API

2017-05-04 Thread wm4
The actual hwaccel code will need to access an internal context instead
of avctx->hwaccel_context, so add a new DXVA_CONTEXT() macro, that will
dispatch between the "old" external and the new internal context.

Also, the new API requires a new D3D11 pixfmt, so all places which check
for the pixfmt need to be adjusted. Introduce a ff_dxva2_is_d3d11()
function, which does the check.
---
 libavcodec/dxva2.c  | 33 +
 libavcodec/dxva2_h264.c | 14 +++---
 libavcodec/dxva2_hevc.c | 10 +-
 libavcodec/dxva2_internal.h | 22 +-
 libavcodec/dxva2_mpeg2.c| 10 +-
 libavcodec/dxva2_vc1.c  | 10 +-
 6 files changed, 56 insertions(+), 43 deletions(-)

diff --git a/libavcodec/dxva2.c b/libavcodec/dxva2.c
index b0452b6a9a..1cb79fe294 100644
--- a/libavcodec/dxva2.c
+++ b/libavcodec/dxva2.c
@@ -71,7 +71,7 @@ int ff_dxva2_commit_buffer(AVCodecContext *avctx,
 HRESULT hr;
 
 #if CONFIG_D3D11VA
-if (avctx->pix_fmt == AV_PIX_FMT_D3D11VA_VLD)
+if (ff_dxva2_is_d3d11(avctx))
 hr = ID3D11VideoContext_GetDecoderBuffer(D3D11VA_CONTEXT(ctx)->video_context,
  D3D11VA_CONTEXT(ctx)->decoder,
  type,
@@ -91,7 +91,7 @@ int ff_dxva2_commit_buffer(AVCodecContext *avctx,
 memcpy(dxva_data, data, size);
 
 #if CONFIG_D3D11VA
-if (avctx->pix_fmt == AV_PIX_FMT_D3D11VA_VLD) {
+if (ff_dxva2_is_d3d11(avctx)) {
 D3D11_VIDEO_DECODER_BUFFER_DESC *dsc11 = dsc;
 memset(dsc11, 0, sizeof(*dsc11));
 dsc11->BufferType   = type;
@@ -116,7 +116,7 @@ int ff_dxva2_commit_buffer(AVCodecContext *avctx,
 }
 
 #if CONFIG_D3D11VA
-if (avctx->pix_fmt == AV_PIX_FMT_D3D11VA_VLD)
+if (ff_dxva2_is_d3d11(avctx))
 hr = ID3D11VideoContext_ReleaseDecoderBuffer(D3D11VA_CONTEXT(ctx)->video_context, D3D11VA_CONTEXT(ctx)->decoder, type);
 #endif
 #if CONFIG_DXVA2
@@ -139,7 +139,7 @@ int ff_dxva2_common_end_frame(AVCodecContext *avctx, AVFrame *frame,
   DECODER_BUFFER_DESC *bs,
   DECODER_BUFFER_DESC *slice))
 {
-AVDXVAContext *ctx = avctx->hwaccel_context;
+AVDXVAContext *ctx = DXVA_CONTEXT(avctx);
 unsigned   buffer_count = 0;
 #if CONFIG_D3D11VA
 D3D11_VIDEO_DECODER_BUFFER_DESC buffer11[4];
@@ -154,7 +154,7 @@ int ff_dxva2_common_end_frame(AVCodecContext *avctx, AVFrame *frame,
 
 do {
 #if CONFIG_D3D11VA
-if (avctx->pix_fmt == AV_PIX_FMT_D3D11VA_VLD) {
+if (ff_dxva2_is_d3d11(avctx)) {
 if (D3D11VA_CONTEXT(ctx)->context_mutex != INVALID_HANDLE_VALUE)
 WaitForSingleObjectEx(D3D11VA_CONTEXT(ctx)->context_mutex, INFINITE, FALSE);
 hr = ID3D11VideoContext_DecoderBeginFrame(D3D11VA_CONTEXT(ctx)->video_context, D3D11VA_CONTEXT(ctx)->decoder,
@@ -171,7 +171,7 @@ int ff_dxva2_common_end_frame(AVCodecContext *avctx, AVFrame *frame,
 if (hr != E_PENDING || ++runs > 50)
 break;
 #if CONFIG_D3D11VA
-if (avctx->pix_fmt == AV_PIX_FMT_D3D11VA_VLD)
+if (ff_dxva2_is_d3d11(avctx))
 if (D3D11VA_CONTEXT(ctx)->context_mutex != INVALID_HANDLE_VALUE)
 ReleaseMutex(D3D11VA_CONTEXT(ctx)->context_mutex);
 #endif
@@ -181,7 +181,7 @@ int ff_dxva2_common_end_frame(AVCodecContext *avctx, AVFrame *frame,
 if (FAILED(hr)) {
 av_log(avctx, AV_LOG_ERROR, "Failed to begin frame: 0x%x\n", hr);
 #if CONFIG_D3D11VA
-if (avctx->pix_fmt == AV_PIX_FMT_D3D11VA_VLD)
+if (ff_dxva2_is_d3d11(avctx))
 if (D3D11VA_CONTEXT(ctx)->context_mutex != INVALID_HANDLE_VALUE)
 ReleaseMutex(D3D11VA_CONTEXT(ctx)->context_mutex);
 #endif
@@ -189,7 +189,7 @@ int ff_dxva2_common_end_frame(AVCodecContext *avctx, AVFrame *frame,
 }
 
 #if CONFIG_D3D11VA
-if (avctx->pix_fmt == AV_PIX_FMT_D3D11VA_VLD) {
+if (ff_dxva2_is_d3d11(avctx)) {
 buffer = &buffer11[buffer_count];
 type = D3D11_VIDEO_DECODER_BUFFER_PICTURE_PARAMETERS;
 }
@@ -212,7 +212,7 @@ int ff_dxva2_common_end_frame(AVCodecContext *avctx, AVFrame *frame,
 
 if (qm_size > 0) {
 #if CONFIG_D3D11VA
-if (avctx->pix_fmt == AV_PIX_FMT_D3D11VA_VLD) {
+if (ff_dxva2_is_d3d11(avctx)) {
 buffer = &buffer11[buffer_count];
 type = D3D11_VIDEO_DECODER_BUFFER_INVERSE_QUANTIZATION_MATRIX;
 }
@@ -235,7 +235,7 @@ int ff_dxva2_common_end_frame(AVCodecContext *avctx, AVFrame *frame,
 }
 
 #if CONFIG_D3D11VA
-if (avctx->pix_fmt == AV_PIX_FMT_D3D11VA_VLD) {
+if (ff_dxva2_is_d3d11(avctx)) {
 buffer   = &buffer11[buffer_count + 0];
 buffer_slice = &buffer11[buffer_count + 1];
 }
@@ -262,7 +262,7 @@ int ff_dxva2_common_end_frame(AVCodecContext *avctx, AVFrame *frame,
 assert(buffer_count == 1 + (qm_size 

[libav-devel] [PATCH 5/5] dxva: add support for new dxva2 and d3d11 hwaccel APIs

2017-05-04 Thread wm4
This also adds support to avconv (which is trivial due to the new
hwaccel API being generic enough). For now, this keeps avconv_dxva2.c as
"dxva2-old", although it doesn't work as avconv.c can't handle multiple
hwaccels with the same pixfmt.

The new decoder setup code in dxva2.c is significantly based on work by
Steve Lhomme, but with heavy changes/rewrites.
---
 Changelog   |   1 +
 avtools/avconv.h|   2 +
 avtools/avconv_opt.c|   8 +-
 configure   |  12 +-
 doc/APIchanges  |   6 +
 libavcodec/allcodecs.c  |   5 +
 libavcodec/dxva2.c  | 654 +++-
 libavcodec/dxva2_h264.c |  22 ++
 libavcodec/dxva2_hevc.c |  22 ++
 libavcodec/dxva2_internal.h |  43 ++-
 libavcodec/dxva2_mpeg2.c|  22 ++
 libavcodec/dxva2_vc1.c  |  44 +++
 libavcodec/h264_slice.c |   3 +-
 libavcodec/hevcdec.c|   3 +-
 libavcodec/mpeg12dec.c  |   1 +
 libavcodec/vc1dec.c |   1 +
 libavcodec/version.h|   4 +-
 libavutil/hwcontext_dxva2.h |   3 +
 18 files changed, 844 insertions(+), 12 deletions(-)

diff --git a/Changelog b/Changelog
index 6fd30fddb9..e44df54c93 100644
--- a/Changelog
+++ b/Changelog
@@ -15,6 +15,7 @@ version :
 - VP9 superframe split/merge bitstream filters
 - FM Screen Capture Codec decoder
 - ClearVideo decoder (I-frames only)
+- support for decoding through D3D11VA in avconv
 
 
 version 12:
diff --git a/avtools/avconv.h b/avtools/avconv.h
index 3354c50444..fe2bb313b7 100644
--- a/avtools/avconv.h
+++ b/avtools/avconv.h
@@ -54,9 +54,11 @@ enum HWAccelID {
 HWACCEL_AUTO,
 HWACCEL_VDPAU,
 HWACCEL_DXVA2,
+HWACCEL_DXVA2_OLD,
 HWACCEL_VDA,
 HWACCEL_QSV,
 HWACCEL_VAAPI,
+HWACCEL_D3D11VA,
 };
 
 typedef struct HWAccel {
diff --git a/avtools/avconv_opt.c b/avtools/avconv_opt.c
index 9839a2269e..e2599bd4d8 100644
--- a/avtools/avconv_opt.c
+++ b/avtools/avconv_opt.c
@@ -60,8 +60,14 @@ const HWAccel hwaccels[] = {
 { "vdpau", hwaccel_decode_init, HWACCEL_VDPAU, AV_PIX_FMT_VDPAU,
   AV_HWDEVICE_TYPE_VDPAU },
 #endif
+#if HAVE_D3D11VA_LIB
+{ "d3d11va", hwaccel_decode_init, HWACCEL_D3D11VA, AV_PIX_FMT_D3D11,
+  AV_HWDEVICE_TYPE_D3D11VA },
+#endif
 #if HAVE_DXVA2_LIB
-{ "dxva2", dxva2_init, HWACCEL_DXVA2, AV_PIX_FMT_DXVA2_VLD,
+{ "dxva2", hwaccel_decode_init, HWACCEL_DXVA2, AV_PIX_FMT_DXVA2_VLD,
+  AV_HWDEVICE_TYPE_DXVA2},
+{ "dxva2-old", dxva2_init, HWACCEL_DXVA2_OLD, AV_PIX_FMT_DXVA2_VLD,
   AV_HWDEVICE_TYPE_NONE },
 #endif
 #if CONFIG_VDA
diff --git a/configure b/configure
index c3ccf69730..2183d23bde 100755
--- a/configure
+++ b/configure
@@ -2168,7 +2168,7 @@ zmbv_encoder_deps="zlib"
 # hardware accelerators
 d3d11va_deps="d3d11_h dxva_h ID3D11VideoDecoder"
 d3d11va_lib_deps="d3d11va"
-dxva2_deps="dxva2api_h DXVA2_ConfigPictureDecode"
+dxva2_deps="dxva2api_h DXVA2_ConfigPictureDecode ole32"
 dxva2_lib_deps="dxva2"
 vda_deps="VideoDecodeAcceleration_VDADecoder_h blocks_extension pthreads"
 vda_extralibs="-framework CoreFoundation -framework VideoDecodeAcceleration -framework QuartzCore"
@@ -2177,6 +2177,8 @@ h263_vaapi_hwaccel_deps="vaapi"
 h263_vaapi_hwaccel_select="h263_decoder"
 h264_d3d11va_hwaccel_deps="d3d11va"
 h264_d3d11va_hwaccel_select="h264_decoder"
+h264_d3d11va2_hwaccel_deps="d3d11va"
+h264_d3d11va2_hwaccel_select="h264_decoder"
 h264_dxva2_hwaccel_deps="dxva2"
 h264_dxva2_hwaccel_select="h264_decoder"
 h264_mmal_hwaccel_deps="mmal"
@@ -2191,6 +2193,8 @@ h264_vdpau_hwaccel_deps="vdpau"
 h264_vdpau_hwaccel_select="h264_decoder"
 hevc_d3d11va_hwaccel_deps="d3d11va DXVA_PicParams_HEVC"
 hevc_d3d11va_hwaccel_select="hevc_decoder"
+hevc_d3d11va2_hwaccel_deps="d3d11va DXVA_PicParams_HEVC"
+hevc_d3d11va2_hwaccel_select="hevc_decoder"
 hevc_dxva2_hwaccel_deps="dxva2 DXVA_PicParams_HEVC"
 hevc_dxva2_hwaccel_select="hevc_decoder"
 hevc_qsv_hwaccel_deps="libmfx"
@@ -2202,6 +2206,8 @@ mpeg1_vdpau_hwaccel_deps="vdpau"
 mpeg1_vdpau_hwaccel_select="mpeg1video_decoder"
 mpeg2_d3d11va_hwaccel_deps="d3d11va"
 mpeg2_d3d11va_hwaccel_select="mpeg2video_decoder"
+mpeg2_d3d11va2_hwaccel_deps="d3d11va"
+mpeg2_d3d11va2_hwaccel_select="mpeg2video_decoder"
 mpeg2_dxva2_hwaccel_deps="dxva2"
 mpeg2_dxva2_hwaccel_select="mpeg2video_decoder"
 mpeg2_mmal_hwaccel_deps="mmal"
@@ -2216,6 +2222,8 @@ mpeg4_vdpau_hwaccel_deps="vdpau"
 mpeg4_vdpau_hwaccel_select="mpeg4_decoder"
 vc1_d3d11va_hwaccel_deps="d3d11va"
 vc1_d3d11va_hwaccel_select="vc1_decoder"
+vc1_d3d11va2_hwaccel_deps="d3d11va"
+vc1_d3d11va2_hwaccel_select="vc1_decoder"
 vc1_dxva2_hwaccel_deps="dxva2"
 vc1_dxva2_hwaccel_select="vc1_decoder"
 vc1_mmal_hwaccel_deps="mmal"
@@ -2228,6 +2236,7 @@ vp8_qsv_hwaccel_deps="libmfx"
 vp8_vaapi_hwaccel_deps="vaapi VAPictureParameterBufferVP8"
 vp8_vaapi_hwaccel_select="vp8_decoder"
 wmv3_d3d11va_hwaccel_select="vc1_d3d11va_hwaccel"
+wmv3_d3d11va2_hwaccel_select="vc1_d3d11va2_hwaccel"
 

[libav-devel] [PATCH 1/5] lavu: add new D3D11 pixfmt and hwcontext

2017-05-04 Thread wm4
To be used with the new d3d11 hwaccel decode API.

With the new hwaccel API, we don't want surfaces to depend on the
decoder (other than the required dimension and format). The old D3D11VA
pixfmt uses ID3D11VideoDecoderOutputView pointers, which include the
decoder configuration, and thus is incompatible with the new hwaccel
API. This patch introduces AV_PIX_FMT_D3D11, which uses ID3D11Texture2D
and an index. It's simpler and compatible with the new hwaccel API.

The introduced hwcontext supports only the new pixfmt.

Significantly based on work by Steve Lhomme, but with
heavy changes/rewrites.
---
Somewhat sketchy: if initial_pool_size is set, the pool is assumed to
be static.
---
 configure  |   6 +
 doc/APIchanges |   3 +
 libavutil/Makefile |   3 +
 libavutil/hwcontext.c  |   4 +
 libavutil/hwcontext.h  |   1 +
 libavutil/hwcontext_d3d11va.c  | 488 +
 libavutil/hwcontext_d3d11va.h  | 158 +
 libavutil/hwcontext_internal.h |   1 +
 libavutil/pixdesc.c|   4 +
 libavutil/pixfmt.h |   4 +-
 libavutil/version.h|   4 +-
 11 files changed, 673 insertions(+), 3 deletions(-)
 create mode 100644 libavutil/hwcontext_d3d11va.c
 create mode 100644 libavutil/hwcontext_d3d11va.h

diff --git a/configure b/configure
index 6f696c9ab5..c3ccf69730 100755
--- a/configure
+++ b/configure
@@ -1712,6 +1712,7 @@ HAVE_LIST="
 $THREADS_LIST
 $TOOLCHAIN_FEATURES
 $TYPES_LIST
+d3d11va_lib
 dos_paths
 dxva2_lib
 libc_msvcrt
@@ -2166,6 +2167,7 @@ zmbv_encoder_deps="zlib"
 
 # hardware accelerators
 d3d11va_deps="d3d11_h dxva_h ID3D11VideoDecoder"
+d3d11va_lib_deps="d3d11va"
 dxva2_deps="dxva2api_h DXVA2_ConfigPictureDecode"
 dxva2_lib_deps="dxva2"
 vda_deps="VideoDecodeAcceleration_VDADecoder_h blocks_extension pthreads"
@@ -4861,6 +4863,10 @@ if enabled libxcb; then
 check_pkg_config libxcb_xfixes xcb-xfixes xcb/xfixes.h xcb_xfixes_get_cursor_image
 fi
 
+enabled d3d11va && 
+check_type "windows.h d3d11.h" ID3D11VideoDevice &&
+enable d3d11va_lib
+
 enabled dxva2 &&
 check_lib dxva2_lib windows.h CoTaskMemFree -lole32
 
diff --git a/doc/APIchanges b/doc/APIchanges
index a251c4ca82..a81e41833d 100644
--- a/doc/APIchanges
+++ b/doc/APIchanges
@@ -13,6 +13,9 @@ libavutil: 2017-03-23
 
 API changes, most recent first:
 
+2017-xx-xx - xxx - lavu 56.2.0 - hwcontext.h
+  Add AV_HWDEVICE_TYPE_D3D11VA and AV_PIX_FMT_D3D11.
+
 2017-04-30 - xxx - lavu 56.1.1 - hwcontext.h
   av_hwframe_ctx_create_derived() now takes some AV_HWFRAME_MAP_* combination
   as its flags argument (which was previously unused).
diff --git a/libavutil/Makefile b/libavutil/Makefile
index 60e180c79d..6fb24db678 100644
--- a/libavutil/Makefile
+++ b/libavutil/Makefile
@@ -27,6 +27,7 @@ HEADERS = adler32.h \
   hmac.h\
   hwcontext.h   \
   hwcontext_cuda.h  \
+  hwcontext_d3d11va.h   \
   hwcontext_dxva2.h \
   hwcontext_qsv.h   \
   hwcontext_vaapi.h \
@@ -112,6 +113,7 @@ OBJS = adler32.o \
xtea.o   \
 
 OBJS-$(CONFIG_CUDA) += hwcontext_cuda.o
+OBJS-$(CONFIG_D3D11VA)  += hwcontext_d3d11va.o
 OBJS-$(CONFIG_DXVA2)+= hwcontext_dxva2.o
 OBJS-$(CONFIG_LIBMFX)   += hwcontext_qsv.o
 OBJS-$(CONFIG_LZO)  += lzo.o
@@ -121,6 +123,7 @@ OBJS-$(CONFIG_VDPAU)+= hwcontext_vdpau.o
 OBJS += $(COMPAT_OBJS:%=../compat/%)
 
 SKIPHEADERS-$(CONFIG_CUDA) += hwcontext_cuda.h
+SKIPHEADERS-$(CONFIG_D3D11VA)  += hwcontext_d3d11va.h
 SKIPHEADERS-$(CONFIG_DXVA2)+= hwcontext_dxva2.h
 SKIPHEADERS-$(CONFIG_LIBMFX)   += hwcontext_qsv.h
 SKIPHEADERS-$(CONFIG_VAAPI)+= hwcontext_vaapi.h
diff --git a/libavutil/hwcontext.c b/libavutil/hwcontext.c
index 360b01205c..d82df56abf 100644
--- a/libavutil/hwcontext.c
+++ b/libavutil/hwcontext.c
@@ -32,6 +32,9 @@ static const HWContextType * const hw_table[] = {
 #if CONFIG_CUDA
 &ff_hwcontext_type_cuda,
 #endif
+#if CONFIG_D3D11VA
+&ff_hwcontext_type_d3d11va,
+#endif
 #if CONFIG_DXVA2
 &ff_hwcontext_type_dxva2,
 #endif
@@ -50,6 +53,7 @@ static const HWContextType * const hw_table[] = {
 const char *hw_type_names[] = {
 [AV_HWDEVICE_TYPE_CUDA]   = "cuda",
 [AV_HWDEVICE_TYPE_DXVA2]  = "dxva2",
+[AV_HWDEVICE_TYPE_D3D11VA] =