#11515: Consider NV12 / P010 output pixel format support
-------------------------------------+-------------------------------------
             Reporter:  Robert Mader |                     Type:  defect
               Status:  new          |                 Priority:  normal
            Component:  avcodec      |                  Version:  unspecified
             Keywords:  nv12, p010   |               Blocked By:
             Blocking:               |  Reproduced by developer:  0
Analyzed by developer:  0            |
-------------------------------------+-------------------------------------
 The currently used pixel formats like yuv420 and especially yuv420_10 are
 unfortunately not well suited for existing hardware (like display engines)
 or APIs (like Vulkan, (Linux) KMS, GL), forcing clients to do extra copies
 either on the CPU or GPU, impacting real-world playback performance.

 If libav-decoders optionally supported formats that are typically used
 by hardware decoders, such as P010 and NV12, a large share of existing
 consumer devices that don't have a HW-AV1 decoder could achieve the
 optimal number of copies - up to zero-copy in many cases. This would allow
 bringing overall playback performance close to its theoretical optimum -
 across platforms / OSs.

 From my (limited) understanding of decoders like libav, using p010 instead
 of yuv420_10 would be mostly free from a computational side, only changing
 the place where data is saved and doing a bit-shift - the main burden
 would likely come from increased code complexity. Therefore I'd like to ask
 whether the libav project would accept patches for new formats - and
 whether contributors already have plans for that.
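
 A minimal sketch of the layout difference described above, written as a
 standalone conversion pass. It assumes the yuv420_10 samples sit in the
 low 10 bits of each 16-bit word and that P010 expects them in the high 10
 bits, with U and V interleaved in a single plane. The function name and
 its parameters are hypothetical and not part of any libav or dav1d API; a
 decoder with native P010 output would presumably fold the shift and the
 interleaving into the stage that writes its output instead of running a
 second pass like this one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, for illustration only.
 * Input:  4:2:0 planar, 10-bit samples in the LOW bits of 16-bit words.
 * Output: P010-style layout, 10-bit samples in the HIGH bits of 16-bit
 *         words, with chroma interleaved as U V U V ...
 * All strides are given in 16-bit elements, not bytes. */
void yuv420p10_to_p010(const uint16_t *src_y, size_t src_y_stride,
                       const uint16_t *src_u, const uint16_t *src_v,
                       size_t src_c_stride,
                       uint16_t *dst_y, size_t dst_y_stride,
                       uint16_t *dst_uv, size_t dst_uv_stride,
                       size_t width, size_t height)
{
    /* Keep the sketch simple: only even dimensions are handled. */
    assert(width % 2 == 0 && height % 2 == 0);

    /* Luma: same position, value moved into the upper 10 bits. */
    for (size_t y = 0; y < height; y++)
        for (size_t x = 0; x < width; x++)
            dst_y[y * dst_y_stride + x] =
                (uint16_t)(src_y[y * src_y_stride + x] << 6);

    /* Chroma: two half-resolution planes become one interleaved plane. */
    for (size_t y = 0; y < height / 2; y++) {
        for (size_t x = 0; x < width / 2; x++) {
            dst_uv[y * dst_uv_stride + 2 * x] =
                (uint16_t)(src_u[y * src_c_stride + x] << 6);
            dst_uv[y * dst_uv_stride + 2 * x + 1] =
                (uint16_t)(src_v[y * src_c_stride + x] << 6);
        }
    }
}
```

 For a 2x2 frame, a full-scale luma sample of 1023 becomes 65472
 (1023 << 6), and the single U/V pair ends up side by side in the UV plane.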

 ---

 Context:

 Over the last few years, a number of developments have changed the
 landscape for what is possible with software video decoding:
 1. Consumer devices are increasingly converging to SoCs with display
 engines with IOMMUs, making it easy for clients/OSs to allocate buffers
 that can be used with fixed-function hardware to display videos. This
 includes pretty much all laptops and phones, but also small/embedded
 devices like the Raspberry Pi 5.
 2. Similarly, an increasing amount of hardware supports displaying HDR10
 content properly.
 3. Across platforms, clients increasingly support zero-copy video playback
 paths for hardware decoded video - examples include Firefox and Chromium.
 4. The Wayland/Android/Linux ecosystem in particular sees a lot of
 development in this area, along with evolving HDR APIs, and toolkits
 like GTK, combined with e.g. GStreamer, allow apps to have zero-copy
 playback with little effort.

 I'm part of a group of GStreamer developers investigating ways to enable
 efficient video playback with sw-decoding in the Wayland ecosystem. In
 particular we try to allow clients to reuse hardware decoding paths as
 much as possible. We are already looking into steps like:
 1. Using "dmabuf" allocators like "udmabuf", so buffers can be directly
 imported/used by GPUs and display engines (will add link once
 https://gitlab.freedesktop.org/ is back).
 2. Using options to pre-allocate buffers for decoders to avoid copies - in
 case of dav1d: https://github.com/rust-av/dav1d-rs/pull/107
 3. Plumbing formats/shaders in GPU drivers (Mesa), compositors etc. to
 allow usage of formats like yuv420_10 as directly as possible (will add
 link once https://gitlab.freedesktop.org/ is back).

 Experiments with the steps above already give us much better performance
 than what we had previously (or what players like `mpv` currently allow).
 However, the yuv420_10 format in particular causes some unfortunate issues
 (apart from missing plumbing in places like Mesa, DRM etc.):
 1. It's not compatible with existing Vulkan formats.
 `VK_FORMAT_G16_B16_R16_3PLANE_420_UNORM` would be a strong contender, but
 the 6-bit padding is on the "wrong" side (please correct me if there
 actually is a matching VK format already). Formats like P010, P012 and
 P016 have the padding on the other side, allowing GPUs to just treat them
 all the same and use a single shader for all of them.
 2. AFAICS there's not a single display engine supporting this format.
 yuv420 is also uncommon compared to NV12, but at least some hardware like
 the RPi4/5 and some qcom devices support it. Support in the display engine
 is needed for zero-copy playback and allows the GPU to be powered down.
 3. Just doing a CPU conversion adds a significant burden in cases where
 the CPU or bandwidth is already the limiting factor.
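
 The padding argument in point 1 can be illustrated numerically. The sketch
 below assumes a sampler that reads each 16-bit word as UNORM (raw / 65535)
 and compares MSB-aligned storage, as used by P010, P012 and P016, with
 LSB-aligned 10-bit storage as in yuv420_10. The helper names are made up
 for illustration and do not correspond to any Vulkan or libav API.

```c
#include <assert.h>
#include <stdint.h>

/* How a 16-bit UNORM sampler interprets a raw word. */
double unorm16(uint16_t raw)
{
    return raw / 65535.0;
}

/* MSB-aligned storage: the sample occupies the top `depth` bits and the
 * padding sits in the low bits (P010: depth 10, P012: 12, P016: 16). */
uint16_t pack_msb_aligned(uint16_t sample, unsigned depth)
{
    assert(depth >= 1 && depth <= 16);
    return (uint16_t)(sample << (16 - depth));
}

/* LSB-aligned storage: the sample occupies the low bits and the padding
 * sits in the high bits (the layout assumed here for yuv420_10). */
uint16_t pack_lsb_aligned(uint16_t sample)
{
    return sample;
}
```

 Full-scale samples stored MSB-aligned read back close to 1.0 at every bit
 depth (65472 / 65535 for 10 bit), so a single shader can handle all of
 them. The same 10-bit sample stored LSB-aligned reads back as roughly
 0.0156 and needs a format-specific rescale.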

 Adding P010 (and potentially NV12 and 422/444 variants) support to libav
 thus looks like a very promising improvement to me. Thanks for considering
 :)

 ---

 Related dav1d issue: https://code.videolan.org/videolan/dav1d/-/issues/454
-- 
Ticket URL: <https://trac.ffmpeg.org/ticket/11515>
FFmpeg <https://ffmpeg.org>
FFmpeg issue tracker
_______________________________________________
FFmpeg-trac mailing list
FFmpeg-trac@avcodec.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-trac