Going NV12->YUV420P->RGB24 instead of NV12->RGB24 made a material difference, thanks!
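
For anyone finding this in the archive, here is a minimal sketch of what the
NV12->YUV420P step amounts to. It is not the libswscale code:
nv12_to_yuv420p is a hypothetical helper written for illustration, and it
assumes even width and height with strides given in bytes. The Y plane has
the same layout in both formats, so only the interleaved UV plane has to be
split, which is presumably why this step is cheap compared to a scalar
yuv2rgb path.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not an FFmpeg API: convert one NV12 frame to
 * planar YUV420P. Assumes even width and height; strides are in bytes. */
static void nv12_to_yuv420p(const uint8_t *src_y, int src_y_stride,
                            const uint8_t *src_uv, int src_uv_stride,
                            uint8_t *dst_y, int dst_y_stride,
                            uint8_t *dst_u, int dst_u_stride,
                            uint8_t *dst_v, int dst_v_stride,
                            int width, int height)
{
    assert(width % 2 == 0 && height % 2 == 0);

    /* Luma is laid out identically in both formats: copy row by row. */
    for (int row = 0; row < height; row++)
        memcpy(dst_y + (size_t)row * dst_y_stride,
               src_y + (size_t)row * src_y_stride, (size_t)width);

    /* Chroma: NV12 stores U0 V0 U1 V1 ... in one plane; split it into
     * two planes of (width / 2) x (height / 2) samples each. */
    for (int row = 0; row < height / 2; row++) {
        const uint8_t *uv = src_uv + (size_t)row * src_uv_stride;
        uint8_t *u = dst_u + (size_t)row * dst_u_stride;
        uint8_t *v = dst_v + (size_t)row * dst_v_stride;
        for (int col = 0; col < width / 2; col++) {
            u[col] = uv[2 * col];
            v[col] = uv[2 * col + 1];
        }
    }
}
```

In practice one would let libswscale do both steps (one SwsContext for
NV12->YUV420P and another for YUV420P->RGB24) so that its SIMD paths are
used; the loop above only shows how little work the first step involves.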

On 5/20/20 3:10 PM, Carl Eugen Hoyos wrote:
On Wed, May 20, 2020 at 20:36, Alex <a...@sighthound.com> wrote:
I've been working on implementing hardware decoder integration. While
things generally are working, there's not much gain: the color
conversion from NV12 seems to be dramatically more expensive than from
the YUV420P produced by the software decoder.

When profiling, I'm seeing YUV420P->RGB24 use yuv420_rgb24_mmxext
(which barely registers in the profiler), while NV12->RGB24 uses
yuv2rgb24_X_c, which takes a fairly dramatic amount of time.

Is that something that is expected? A known problem? Or perhaps
something is wrong on my side of things?

Very, very generally, hardware decoding is very useful if you want to
display on the same GPU where the decoding takes place and if you
want to encode on the same GPU where decoding happens.

If you have to download the decoded frame from graphics memory
to CPU memory, some of the possible gains are always lost.
If you, in addition, need a colour space conversion (in your case, a
conversion that rarely happens when using a software decoder and
is therefore not so well accelerated), even more time is lost.

That being said: Did you try to first convert nv12->yuv420p
using ff_nv12ToUV_avx, and then convert to RGB24 using the above
yuv420_rgb24_mmxext?
I would have expected nv12ToUV to be much faster than any yuv2rgb
conversion.

Carl Eugen
_______________________________________________
Libav-user mailing list
Libav-user@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/libav-user

To unsubscribe, visit link above, or email
libav-user-requ...@ffmpeg.org with subject "unsubscribe".