Re: [FFmpeg-devel] [PATCH] avcodec/cuviddec: update amount of decoder surfaces from within sequence decode callback

2023-08-08 Thread Michael Diesel

Hello ,
after this patch I get stutter in the output video when using h264_cuvid 
as decoder


With "surfaces" as -1 will result :
CUVIDPARSERPARAMS::ulMaxNumDecodeSurfaces = 5 = 
CUVID_DEFAULT_NUM_SURFACES
CUVIDDECODECREATEINFO::ulNumDecodeSurfaces = 8 = 
format->min_num_decode_surfaces + 3


As per NVDEC Video Decoder API Programming Guide / Creating a parser , 
the two values should be equal.


cuvid_handle_video_sequence/pfnSequenceCallback should return a value :
">1: succeeded, and driver should override 
CUVIDPARSERPARAMS::ulMaxNumDecodeSurfaces with this return value"


Something like:
if(ctx->cuparseinfo.ulMaxNumDecodeSurfaces != 
cuinfo.ulNumDecodeSurfaces)

return cuinfo.ulNumDecodeSurfaces;
else
return 1;

This solves partially the problem , there's still occasional stutter , a 
value of 8 is not sufficient.

On my test system 12 is the magic number.

Timo Rothenpieler wrote:

applied with minor amendments.

Turned the if-statements around, we don't do them that way around in 
ffmpeg code.
And switched the surfaces option to have a default of -1, to not rely on 
a magic value of 25. That could surprise users who happen to request 25 
surfaces.


___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".


Re: [FFmpeg-devel] [PATCH] avcodec/cuviddec: update amount of decoder surfaces from within sequence decode callback

2023-06-06 Thread Timo Rothenpieler

On 05.06.2023 14:29, Roman Arzumanyan wrote:

Thanks for the review, Anton. Please find the updated patch attached.
BTW - what do you mean by "you should also forward the actual error code"?
Within the cuvid_handle_video_sequence() function in case of error,
function still returns 0 and sets internal context error, so I kept
behavior intact. Do I miss something?


applied with minor amendments.

Turned the if-statements around, we don't do them that way around in 
ffmpeg code.
And switched the surfaces option to have a default of -1, to not rely on 
a magic value of 25. That could surprise users who happen to request 25 
surfaces.

___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".


Re: [FFmpeg-devel] [PATCH] avcodec/cuviddec: update amount of decoder surfaces from within sequence decode callback

2023-06-05 Thread Roman Arzumanyan
Thanks for the review, Anton. Please find the updated patch attached.
BTW - what do you mean by "you should also forward the actual error code"?
Within the cuvid_handle_video_sequence() function in case of error,
function still returns 0 and sets internal context error, so I kept
behavior intact. Do I miss something?

пн, 5 июн. 2023 г. в 13:19, Anton Khirnov :

> Quoting Roman Arzumanyan (2023-06-05 09:30:07)
> > Hello,
> >
> > This patch reduces vRAM usage by cuvid decoder implementation.
> > The number of surfaces used for decoding is updated within the parser
> > sequence decode callback.
> > Also the "surfaces" AVDictionary option specific to cuvid was removed in
> > favor of "extra_hw_surfaces".
>
> This can break existing workflows, you should deprecated the option
> instead and only remove it after some time has passed.
>
> >
> > vRAM consumption was tested on various videos and savings are between 1%
> > for 360p resolution up to 21% for some 1080p H.264 videos.
> > Decoding performance was tested on various H.264 and H.265 videos in
> > different resolutions from 360p and higher, no performance penalty was
> > found.
> >
> > From 32a1b016e88fa40b983318d4583750ef250a78d9 Mon Sep 17 00:00:00 2001
> > From: Roman Arzumanyan 
> > Date: Thu, 1 Jun 2023 11:17:39 +0300
> > Subject: [PATCH] libavcodec/cuviddec: determine DPB size from within
> cuvid
> >  parser
> >
> > ---
> >  libavcodec/cuviddec.c | 29 +++--
> >  1 file changed, 27 insertions(+), 2 deletions(-)
> >
> > diff --git a/libavcodec/cuviddec.c b/libavcodec/cuviddec.c
> > index 3d43bbd466..759ed49870 100644
> > --- a/libavcodec/cuviddec.c
> > +++ b/libavcodec/cuviddec.c
> > @@ -115,6 +115,12 @@ typedef struct CuvidParsedFrame
> >
> >  #define CHECK_CU(x) FF_CUDA_CHECK_DL(avctx, ctx->cudl, x)
> >
> > +// NV recommends [2;4] range
> > +#define CUVID_MAX_DISPLAY_DELAY (4)
> > +
> > +// Actual DPB size will be determined by parser.
> > +#define CUVID_DEFAULT_NUM_SURFACES (CUVID_MAX_DISPLAY_DELAY + 1)
> > +
> >  static int CUDAAPI cuvid_handle_video_sequence(void *opaque,
> CUVIDEOFORMAT* format)
> >  {
> >  AVCodecContext *avctx = opaque;
> > @@ -309,6 +315,25 @@ static int CUDAAPI cuvid_handle_video_sequence(void
> *opaque, CUVIDEOFORMAT* form
> >  return 0;
> >  }
> >
> > +if (ctx->nb_surfaces < format->min_num_decode_surfaces + 3)
> > +ctx->nb_surfaces = format->min_num_decode_surfaces + 3;
>
> FFMAX()
>
> > +
> > +if (avctx->extra_hw_frames > 0)
> > +ctx->nb_surfaces += avctx->extra_hw_frames;
> > +
> > +if (0 > av_fifo_realloc2(ctx->frame_queue, ctx->nb_surfaces *
> sizeof(CuvidParsedFrame))) {
>
> this is the old deprecated AVFifoBuffer API, you cannot use it with
> AVFifo objects
>
> you should also forward the actual error code
>
> > +av_log(avctx, AV_LOG_ERROR, "Failed to recreate frame queue on
> video sequence callback\n");
> > +ctx->internal_error = AVERROR(EINVAL);
> > +return 0;
> > +}
> > +
> > +ctx->key_frame = av_realloc_array(ctx->key_frame, ctx->nb_surfaces,
> sizeof(int));
> > +if (!ctx->key_frame) {
> > +av_log(avctx, AV_LOG_ERROR, "Failed to recreate key frame queue
> on video sequence callback\n");
> > +ctx->internal_error = AVERROR(EINVAL);
>
> Leaks key_frame on failure and should be ENOMEM.
>
> --
> Anton Khirnov
> ___
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
>
> To unsubscribe, visit link above, or email
> ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".
>
From 4aeff36c9d79c046e726bcc18a311754ace5c47e Mon Sep 17 00:00:00 2001
From: Roman Arzumanyan 
Date: Thu, 1 Jun 2023 11:17:39 +0300
Subject: [PATCH] libavcodec/cuviddec: determine amount of decoded surfaces
 from within cuvid parser

---
 libavcodec/cuviddec.c | 34 --
 1 file changed, 32 insertions(+), 2 deletions(-)

diff --git a/libavcodec/cuviddec.c b/libavcodec/cuviddec.c
index 3d43bbd466..ef25c1d470 100644
--- a/libavcodec/cuviddec.c
+++ b/libavcodec/cuviddec.c
@@ -115,6 +115,12 @@ typedef struct CuvidParsedFrame
 
 #define CHECK_CU(x) FF_CUDA_CHECK_DL(avctx, ctx->cudl, x)
 
+// NV recommends [2;4] range
+#define CUVID_MAX_DISPLAY_DELAY (4)
+
+// Actual pool size will be determined by parser.
+#define CUVID_DEFAULT_NUM_SURFACES (CUVID_MAX_DISPLAY_DELAY + 1)
+
 static int CUDAAPI cuvid_handle_video_sequence(void *opaque, CUVIDEOFORMAT* format)
 {
 AVCodecContext *avctx = opaque;
@@ -124,6 +130,7 @@ static int CUDAAPI cuvid_handle_video_sequence(void *opaque, CUVIDEOFORMAT* form
 CUVIDDECODECREATEINFO cuinfo;
 int surface_fmt;
 int chroma_444;
+int fifo_size_inc;
 
 int old_width = avctx->width;
 int old_height = avctx->height;
@@ -309,6 +316,25 @@ static int CUDAAPI cuvid_handle_video_sequence(void *opaque, CUVIDEOFORMAT* form
 return 0;
 }
 
+

Re: [FFmpeg-devel] [PATCH] avcodec/cuviddec: update amount of decoder surfaces from within sequence decode callback

2023-06-05 Thread Anton Khirnov
Quoting Roman Arzumanyan (2023-06-05 09:30:07)
> Hello,
> 
> This patch reduces vRAM usage by cuvid decoder implementation.
> The number of surfaces used for decoding is updated within the parser
> sequence decode callback.
> Also the "surfaces" AVDictionary option specific to cuvid was removed in
> favor of "extra_hw_surfaces".

This can break existing workflows, you should deprecated the option
instead and only remove it after some time has passed.

> 
> vRAM consumption was tested on various videos and savings are between 1%
> for 360p resolution up to 21% for some 1080p H.264 videos.
> Decoding performance was tested on various H.264 and H.265 videos in
> different resolutions from 360p and higher, no performance penalty was
> found.
> 
> From 32a1b016e88fa40b983318d4583750ef250a78d9 Mon Sep 17 00:00:00 2001
> From: Roman Arzumanyan 
> Date: Thu, 1 Jun 2023 11:17:39 +0300
> Subject: [PATCH] libavcodec/cuviddec: determine DPB size from within cuvid
>  parser
> 
> ---
>  libavcodec/cuviddec.c | 29 +++--
>  1 file changed, 27 insertions(+), 2 deletions(-)
> 
> diff --git a/libavcodec/cuviddec.c b/libavcodec/cuviddec.c
> index 3d43bbd466..759ed49870 100644
> --- a/libavcodec/cuviddec.c
> +++ b/libavcodec/cuviddec.c
> @@ -115,6 +115,12 @@ typedef struct CuvidParsedFrame
>  
>  #define CHECK_CU(x) FF_CUDA_CHECK_DL(avctx, ctx->cudl, x)
>  
> +// NV recommends [2;4] range
> +#define CUVID_MAX_DISPLAY_DELAY (4)
> +
> +// Actual DPB size will be determined by parser.
> +#define CUVID_DEFAULT_NUM_SURFACES (CUVID_MAX_DISPLAY_DELAY + 1)
> +
>  static int CUDAAPI cuvid_handle_video_sequence(void *opaque, CUVIDEOFORMAT* 
> format)
>  {
>  AVCodecContext *avctx = opaque;
> @@ -309,6 +315,25 @@ static int CUDAAPI cuvid_handle_video_sequence(void 
> *opaque, CUVIDEOFORMAT* form
>  return 0;
>  }
>  
> +if (ctx->nb_surfaces < format->min_num_decode_surfaces + 3)
> +ctx->nb_surfaces = format->min_num_decode_surfaces + 3;

FFMAX()

> +
> +if (avctx->extra_hw_frames > 0)
> +ctx->nb_surfaces += avctx->extra_hw_frames;
> +
> +if (0 > av_fifo_realloc2(ctx->frame_queue, ctx->nb_surfaces * 
> sizeof(CuvidParsedFrame))) {

this is the old deprecated AVFifoBuffer API, you cannot use it with
AVFifo objects

you should also forward the actual error code

> +av_log(avctx, AV_LOG_ERROR, "Failed to recreate frame queue on video 
> sequence callback\n");
> +ctx->internal_error = AVERROR(EINVAL);
> +return 0;
> +}
> +
> +ctx->key_frame = av_realloc_array(ctx->key_frame, ctx->nb_surfaces, 
> sizeof(int));
> +if (!ctx->key_frame) {
> +av_log(avctx, AV_LOG_ERROR, "Failed to recreate key frame queue on 
> video sequence callback\n");
> +ctx->internal_error = AVERROR(EINVAL);

Leaks key_frame on failure and should be ENOMEM.

-- 
Anton Khirnov
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".