With an RTX 5070 Ti OC on Linux, transcoding from 6k x 5k 16-bit DPX to
null/FFV1 using the FFV1 Vulkan encoder:
- DPX Software and files on SSD, ffmpeg (reference): 7.2 fps
- DPX Software and repeated frame, libavcodec (reference): 7.2 fps
- DPX Vulkan and files on SSD, ffmpeg: 11.5 fps (same as ffmpeg -c copy
-f null - >nul)
- DPX Vulkan and files in RAM, ffmpeg: 16.5 fps (same as ffmpeg -c copy
-f null - >nul)
- DPX Vulkan and repeated frame, libavcodec: 23.2 fps
It is very useful.
After this patch, image2 (used by ffmpeg but not by the libavcodec test)
seems to be the new bottleneck: it struggles to feed libavcodec
quickly enough (here, 24 fps = 4.5 GB/s).
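For reference, the 4.5 GB/s figure can be reproduced with a small calculation. This is only a sketch: the exact dimensions (6144 x 5120) and the three 16-bit components per pixel are assumptions, since the figures above only say "6k5k16bit", and the function names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Size in bytes of one uncompressed frame. */
static uint64_t frame_bytes(uint32_t width, uint32_t height,
                            uint32_t components, uint32_t bytes_per_component)
{
    assert(components > 0 && bytes_per_component > 0);
    return (uint64_t)width * height * components * bytes_per_component;
}

/* Sustained read rate (bytes per second) needed to deliver `fps` frames. */
static double required_bytes_per_sec(uint64_t bytes_per_frame, double fps)
{
    return (double)bytes_per_frame * fps;
}
```

With the assumed 6144 x 5120 RGB frame at 2 bytes per component, `frame_bytes(6144, 5120, 3, 2)` gives 188743680 bytes, and at 24 fps that is roughly 4.53e9 bytes per second, in line with the number quoted above.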
Some DPX files are not decoded correctly: RGB and Y at 12/16 bit (for 12-bit
the problem seems to come from the FFV1 Vulkan encoder, which we tested at
the same time).
On 23/11/2025 at 01:20, Lynne via ffmpeg-devel wrote:
PR #21000 opened by Lynne
URL: https://code.ffmpeg.org/FFmpeg/FFmpeg/pulls/21000
Patch URL: https://code.ffmpeg.org/FFmpeg/FFmpeg/pulls/21000.patch
The format is raw and uncompressed; however, unpacking is a memcpy, and with
big 6k or even 12k frames that are several hundred megabytes each, the
bandwidth can be far too much for CPUs.
Also, most DPX files cannot be directly uploaded without a software conversion
since almost nothing supports 3-component image formats like RGB48, and 10 and
12-bit formats are stored either tightly packed or in 32-bit dwords.
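To illustrate the kind of software conversion meant here, below is a minimal CPU-side sketch, not the code from this PR. It assumes the common DPX layout where three 10-bit components sit in the upper 30 bits of a host-order 32-bit word with 2 padding bits at the bottom, and it expands 3-component 16-bit pixels to 4 components with an opaque alpha. The function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Split one host-order 32-bit word into three 10-bit components.
 * Assumed layout: c0 in bits 31..22, c1 in bits 21..12, c2 in bits 11..2,
 * and 2 padding bits in bits 1..0. */
static void unpack_10in32(uint32_t word, uint16_t out[3])
{
    out[0] = (word >> 22) & 0x3FF;
    out[1] = (word >> 12) & 0x3FF;
    out[2] = (word >>  2) & 0x3FF;
}

/* Expand tightly packed 3-component 16-bit pixels (RGB48) into
 * 4-component pixels (RGBA64) with alpha set to fully opaque. */
static void rgb48_to_rgba64(const uint16_t *src, uint16_t *dst, size_t pixels)
{
    assert(src && dst);
    for (size_t i = 0; i < pixels; i++) {
        dst[4 * i + 0] = src[3 * i + 0];
        dst[4 * i + 1] = src[3 * i + 1];
        dst[4 * i + 2] = src[3 * i + 2];
        dst[4 * i + 3] = 0xFFFF;
    }
}
```

Both loops touch every sample of the frame once, which is why this step becomes bandwidth-bound on the CPU for frames of several hundred megabytes.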
Speedup over software is around 3x for Intel, 6x for AMD, and 185x (!) for
Nvidia. You can use the [small program I
wrote](https://github.com/cyanreg/dec_tx_test) to benchmark it, as well as
encoding.
Nvidia hardware seems to really suck at uploading or downloading anything, but
host image copies seem to be the only fast way to do so, so we exploit them a
bit here. They're not yet stable enough for all uploads, but we'll get there.
From 41c50176c70ae80b9e49ddb89555acf4da6256cf Mon Sep 17 00:00:00 2001
From: Lynne <[email protected]>
Date: Thu, 13 Nov 2025 12:09:11 +0100
Subject: [PATCH 01/11] hwcontext_vulkan: enable runtime descriptor sizing
We were already using this in places, but it seems validation
layers finally got support to detect it.
---
libavutil/hwcontext_vulkan.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/libavutil/hwcontext_vulkan.c b/libavutil/hwcontext_vulkan.c
index a6bf9a590b..0408b9c117 100644
--- a/libavutil/hwcontext_vulkan.c
+++ b/libavutil/hwcontext_vulkan.c
@@ -307,6 +307,7 @@ static void device_features_copy_needed(VulkanDeviceFeatures *dst, VulkanDeviceF
     COPY_VAL(vulkan_1_2.vulkanMemoryModel);
     COPY_VAL(vulkan_1_2.vulkanMemoryModelDeviceScope);
     COPY_VAL(vulkan_1_2.uniformBufferStandardLayout);
+    COPY_VAL(vulkan_1_2.runtimeDescriptorArray);
     COPY_VAL(vulkan_1_3.dynamicRendering);
     COPY_VAL(vulkan_1_3.maintenance4);
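
For readers unfamiliar with the pattern in the hunk above, here is a self-contained sketch of an allowlist-style feature copy. The struct is a hypothetical, cut-down stand-in, `unusedFeature` is an invented field, and the `COPY_VAL` definition is an assumption about how the macro behaves rather than a copy of the one in libavutil.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, cut-down stand-in for the real feature structs. */
typedef struct DemoFeatures {
    struct {
        bool uniformBufferStandardLayout;
        bool runtimeDescriptorArray;
        bool unusedFeature; /* invented field that is never requested */
    } vulkan_1_2;
} DemoFeatures;

/* Assumed behaviour: copy a single named field from src to dst. */
#define COPY_VAL(VAL) do { dst->VAL = src->VAL; } while (0)

/* Copy only the features the code needs. Anything not listed here stays
 * disabled in dst even if the device reports support for it. */
static void demo_features_copy_needed(DemoFeatures *dst, const DemoFeatures *src)
{
    assert(dst && src);
    COPY_VAL(vulkan_1_2.uniformBufferStandardLayout);
    COPY_VAL(vulkan_1_2.runtimeDescriptorArray);
}

#undef COPY_VAL
```

Under that assumption, a feature missing from the list is never requested at device creation, which would explain why a validation layer can flag its use; adding one `COPY_VAL` line is then the whole fix.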