PR #23584 opened by Diego de Souza (ddesouza)
URL: https://code.ffmpeg.org/FFmpeg/FFmpeg/pulls/23584
Patch URL: https://code.ffmpeg.org/FFmpeg/FFmpeg/pulls/23584.patch

NVDEC and CUVID now output AV_PIX_FMT_P012 (12-bit 4:2:0), AV_PIX_FMT_P212
(12-bit 4:2:2) and AV_PIX_FMT_YUV444P10MSB / AV_PIX_FMT_YUV444P12MSB
(10/12-bit 4:4:4) for high-bit-depth content, but these CUDA filters
rejected the formats in their supported-format lists, breaking pipelines
such as "-hwaccel cuda ... -vf scale_cuda" on 12-bit input.

These formats use 16-bit sample storage, and the filters select their CUDA
kernel by byte-storage size and plane layout, not by the number of valid
bits, so they can reuse the existing 16-bit kernels:

- scale_cuda: P012/P212 -> "semiplanar16", YUV444P10MSB/YUV444P12MSB ->
  "planar16".
- transpose_cuda: the ushort/ushort2 kernels are chosen from the pixel
  descriptor (byte size + channel count); just allow the new formats.
- thumbnail_cuda: P012 reuses the P010/P016 path and the MSB 4:4:4 formats
  reuse the YUV444P16 path; P012 is added to the 4:2:0 chroma-histogram
  scaling as well. (thumbnail has no 4:2:2 path, so P212 is not added.)
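
The mapping in the list above can be pictured as a lookup keyed on storage
layout rather than on the number of valid bits. What follows is a minimal,
self-contained sketch and not the code in vf_scale_cuda.c: the names
sketch_fmt, sketch_entry, sketch_table and sketch_kernel_suffix are
hypothetical and exist only for illustration. Only the format-to-suffix
pairs are taken from the patch.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the AVPixelFormat values named in the patch. */
enum sketch_fmt {
    SKETCH_P010,
    SKETCH_P012,
    SKETCH_P212,
    SKETCH_P016,
    SKETCH_YUV444P10MSB,
    SKETCH_YUV444P12MSB,
    SKETCH_YUV444P16,
    SKETCH_UNKNOWN
};

/* One table row: a pixel format and the kernel-name suffix it selects. */
struct sketch_entry {
    enum sketch_fmt fmt;
    const char     *suffix;
};

/* Pairs copied from the scale_cuda hunk; the 12-bit and MSB formats share
 * the suffix of the 16-bit format with the same plane layout. */
static const struct sketch_entry sketch_table[] = {
    { SKETCH_YUV444P10MSB, "planar16"     },
    { SKETCH_YUV444P12MSB, "planar16"     },
    { SKETCH_YUV444P16,    "planar16"     },
    { SKETCH_P010,         "semiplanar10" },
    { SKETCH_P012,         "semiplanar16" },
    { SKETCH_P212,         "semiplanar16" },
    { SKETCH_P016,         "semiplanar16" },
};

/* Return the kernel suffix for fmt, or NULL if fmt is not in the table. */
static const char *sketch_kernel_suffix(enum sketch_fmt fmt)
{
    size_t i;

    for (i = 0; i < sizeof(sketch_table) / sizeof(sketch_table[0]); i++)
        if (sketch_table[i].fmt == fmt)
            return sketch_table[i].suffix;
    return NULL;
}
```

In this sketch, sketch_kernel_suffix(SKETCH_P012) and
sketch_kernel_suffix(SKETCH_P016) return the same string, which is why no
new kernel is needed for the 12-bit formats.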

The CUDA deinterlacers (bwdif_cuda, yadif_cuda) already accept any format
with <= 2 bytes per sample and <= 2 channels, so they need no change. The
8-bit-only filters (overlay_cuda, pad_cuda, bilateral_cuda, chromakey_cuda,
colorspace_cuda) do not support high bit depths and are left untouched.
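
The deinterlacer rule quoted above (at most 2 bytes per sample and at most
2 channels) can be written as a small predicate. This is a sketch under
stated assumptions, not the code in the deinterlacers: struct sketch_comp
is a hypothetical, trimmed-down view of a pixel-descriptor component, the
rounded-up byte size is my own choice, and the P012 values reflect my
understanding of its layout (12 valid bits in the high end of a 16-bit
word, interleaved chroma).

```c
#include <stddef.h>

/* Hypothetical, trimmed-down view of one pixel-descriptor component. */
struct sketch_comp {
    int step;   /* bytes between two horizontally adjacent samples */
    int shift;  /* low bits to shift away to reach the value       */
    int depth;  /* number of valid bits                            */
};

/* Return 1 if every component fits in <= 2 bytes and <= 2 channels. */
static int sketch_deint_accepts(const struct sketch_comp *comp, size_t nb_comp)
{
    size_t i;

    for (i = 0; i < nb_comp; i++) {
        /* Storage size in bytes, rounded up (assumption of this sketch). */
        int bytes    = (comp[i].depth + comp[i].shift + 7) / 8;
        /* Interleaved samples sharing the plane. */
        int channels = comp[i].step / bytes;

        if (bytes > 2 || channels > 2)
            return 0;
    }
    return 1;
}

/* P012 as assumed here: 16-bit luma plane, interleaved 16-bit chroma. */
static const struct sketch_comp sketch_p012[3] = {
    { 2, 4, 12 }, { 4, 4, 12 }, { 4, 4, 12 },
};

/* A packed 8-bit layout with four interleaved channels, for contrast. */
static const struct sketch_comp sketch_packed4[1] = {
    { 4, 0, 8 },
};
```

Under these assumptions sketch_deint_accepts() accepts sketch_p012 (2 bytes,
1 or 2 channels per plane) and rejects sketch_packed4 (4 channels), matching
the statement that the deinterlacers need no change for the new formats.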

Signed-off-by: Diego de Souza <[email protected]>


From 74999bbbd9b110a42bd8b50396fb3f20475002c7 Mon Sep 17 00:00:00 2001
From: Diego de Souza <[email protected]>
Date: Wed, 24 Jun 2026 10:24:55 +0200
Subject: [PATCH] avfilter/cuda: support P012/P212 and MSB 4:4:4 in
 scale/transpose/thumbnail

NVDEC and CUVID now output AV_PIX_FMT_P012 (12-bit 4:2:0), AV_PIX_FMT_P212
(12-bit 4:2:2) and AV_PIX_FMT_YUV444P10MSB / AV_PIX_FMT_YUV444P12MSB
(10/12-bit 4:4:4) for high-bit-depth content, but these CUDA filters
rejected the formats in their supported-format lists, breaking pipelines
such as "-hwaccel cuda ... -vf scale_cuda" on 12-bit input.

These formats use 16-bit sample storage, and the filters select their CUDA
kernel by byte-storage size and plane layout, not by the number of valid
bits, so they can reuse the existing 16-bit kernels:

- scale_cuda: P012/P212 -> "semiplanar16", YUV444P10MSB/YUV444P12MSB ->
  "planar16".
- transpose_cuda: the ushort/ushort2 kernels are chosen from the pixel
  descriptor (byte size + channel count); just allow the new formats.
- thumbnail_cuda: P012 reuses the P010/P016 path and the MSB 4:4:4 formats
  reuse the YUV444P16 path; P012 is added to the 4:2:0 chroma-histogram
  scaling as well. (thumbnail has no 4:2:2 path, so P212 is not added.)

The CUDA deinterlacers (bwdif_cuda, yadif_cuda) already accept any format
with <= 2 bytes per sample and <= 2 channels, so they need no change. The
8-bit-only filters (overlay_cuda, pad_cuda, bilateral_cuda, chromakey_cuda,
colorspace_cuda) do not support high bit depths and are left untouched.

Signed-off-by: Diego de Souza <[email protected]>
---
 libavfilter/vf_scale_cuda.c     | 4 ++++
 libavfilter/vf_thumbnail_cuda.c | 9 ++++++++-
 libavfilter/vf_transpose_cuda.c | 4 ++++
 3 files changed, 16 insertions(+), 1 deletion(-)

diff --git a/libavfilter/vf_scale_cuda.c b/libavfilter/vf_scale_cuda.c
index edeecbb1c7..2a7dc300f5 100644
--- a/libavfilter/vf_scale_cuda.c
+++ b/libavfilter/vf_scale_cuda.c
@@ -57,11 +57,15 @@ static const struct format_entry supported_formats[] = {
     {AV_PIX_FMT_YUV420P10,"planar10"},
     {AV_PIX_FMT_YUV422P10,"planar10"},
     {AV_PIX_FMT_YUV444P10,"planar10"},
+    {AV_PIX_FMT_YUV444P10MSB,"planar16"},
+    {AV_PIX_FMT_YUV444P12MSB,"planar16"},
     {AV_PIX_FMT_YUV444P16,"planar16"},
     {AV_PIX_FMT_NV12,     "semiplanar8"},
     {AV_PIX_FMT_NV16,     "semiplanar8"},
     {AV_PIX_FMT_P010,     "semiplanar10"},
     {AV_PIX_FMT_P210,     "semiplanar10"},
+    {AV_PIX_FMT_P012,     "semiplanar16"},
+    {AV_PIX_FMT_P212,     "semiplanar16"},
     {AV_PIX_FMT_P016,     "semiplanar16"},
     {AV_PIX_FMT_P216,     "semiplanar16"},
     {AV_PIX_FMT_0RGB32,   "bgr0"},
diff --git a/libavfilter/vf_thumbnail_cuda.c b/libavfilter/vf_thumbnail_cuda.c
index 121274de11..95c46eefdb 100644
--- a/libavfilter/vf_thumbnail_cuda.c
+++ b/libavfilter/vf_thumbnail_cuda.c
@@ -44,7 +44,10 @@ static const enum AVPixelFormat supported_formats[] = {
     AV_PIX_FMT_YUV420P,
     AV_PIX_FMT_YUV444P,
     AV_PIX_FMT_P010,
+    AV_PIX_FMT_P012,
     AV_PIX_FMT_P016,
+    AV_PIX_FMT_YUV444P10MSB,
+    AV_PIX_FMT_YUV444P12MSB,
     AV_PIX_FMT_YUV444P16,
 };
 
@@ -226,12 +229,15 @@ static int thumbnail(AVFilterContext *ctx, int *histogram, AVFrame *in)
             histogram + 512, in->data[2], in->width, in->height, in->linesize[2], 1);
         break;
     case AV_PIX_FMT_P010LE:
+    case AV_PIX_FMT_P012LE:
     case AV_PIX_FMT_P016LE:
         thumbnail_kernel(ctx, s->cu_func_ushort, 1,
             histogram, in->data[0], in->width, in->height, in->linesize[0], 2);
         thumbnail_kernel(ctx, s->cu_func_ushort2, 2,
             histogram + 256, in->data[1], in->width / 2, in->height / 2, in->linesize[1], 2);
         break;
+    case AV_PIX_FMT_YUV444P10MSB:
+    case AV_PIX_FMT_YUV444P12MSB:
     case AV_PIX_FMT_YUV444P16:
         thumbnail_kernel(ctx, s->cu_func_ushort2, 1,
             histogram, in->data[0], in->width, in->height, in->linesize[0], 2);
@@ -284,7 +290,8 @@ static int filter_frame(AVFilterLink *inlink, AVFrame *frame)
         return ret;
 
     if (hw_frames_ctx->sw_format == AV_PIX_FMT_NV12 || hw_frames_ctx->sw_format == AV_PIX_FMT_YUV420P ||
-        hw_frames_ctx->sw_format == AV_PIX_FMT_P010LE || hw_frames_ctx->sw_format == AV_PIX_FMT_P016LE)
+        hw_frames_ctx->sw_format == AV_PIX_FMT_P010LE || hw_frames_ctx->sw_format == AV_PIX_FMT_P012LE ||
+        hw_frames_ctx->sw_format == AV_PIX_FMT_P016LE)
     {
         int i;
         for (i = 256; i < HIST_SIZE; i++)
diff --git a/libavfilter/vf_transpose_cuda.c b/libavfilter/vf_transpose_cuda.c
index c34223f27c..d604135f63 100644
--- a/libavfilter/vf_transpose_cuda.c
+++ b/libavfilter/vf_transpose_cuda.c
@@ -47,11 +47,15 @@ static const enum AVPixelFormat supported_formats[] = {
     AV_PIX_FMT_YUV420P10,
     AV_PIX_FMT_YUV422P10,
     AV_PIX_FMT_YUV444P10,
+    AV_PIX_FMT_YUV444P10MSB,
+    AV_PIX_FMT_YUV444P12MSB,
     AV_PIX_FMT_YUV444P16,
     AV_PIX_FMT_NV12,
     AV_PIX_FMT_NV16,
     AV_PIX_FMT_P010,
     AV_PIX_FMT_P210,
+    AV_PIX_FMT_P012,
+    AV_PIX_FMT_P212,
     AV_PIX_FMT_P016,
     AV_PIX_FMT_P216,
     AV_PIX_FMT_0RGB32,
-- 
2.52.0
