On Sun, 19 Feb 2017 11:09:49 +0100
Anton Khirnov <[email protected]> wrote:

> Some parts of the code are based on a patch by
> Timo Rothenpieler <[email protected]>
> ---
> Compared to the ffmpeg patch which implements cuvid as a separate decoder
> using the higher-level parser API (nvcuvid.h), I did it as a classic hwaccel
> using the lower-level decoder API (cuviddec.h).
> IMO, this has a number of advantages:
>  - integrates much better with the existing acceleration infrastructure/APIs
>  - supports stream parameters changes
>  - the code is much simpler
>  - software fallback
>  - various features from h264dec, such as handling weird invalid streams or
>    exporting metadata from SEIs
> 
> One question to be resolved is retrieving the frames. The way the API works is
> that the decoder maintains an internal pool of frames, to which the caller
> refers by their indices. When you want the data, you map the frame, which
> allows you to copy its contents to a normal CUDA frame. To get optimal
> performance, this map+copy needs to be delayed wrt decoding by a few frames,
> so the question is how this should be done. The options I see are:
>  - introduce a new pixel format, AV_PIX_FMT_CUVID, which wraps the frame index
>    and allows transfer to CUDA via av_hwframe_transfer_data(). Then either
>    * Return those PIX_FMT_CUVID frames to the caller and let them do the copy
>      manually. This is most flexible, but more work for the caller and might
>      mean synchronization problems, so we'd need to add locks (perhaps to the
>      CUVID frames context).
>    * Handle delay+map+copy somewhere else in lavc. The question is where
>      would the right place be. Janne suggested at FOSDEM to add a dummy
>      decoder, h264_cuvid wrapping h264dec, which would do the delay and copy.
>      That should work, but isn't very elegant.
>  - we could also add some sort of a "postprocess" stage to AVHWaccel, run
>    before returning a frame from decode(), or perhaps invoked separately by
>    the lavc generic code.
> This issue might be relevant to other future hwaccels as well (VT?), so
> ideally the solution would be generic. Comments and further suggestions very
> much welcome.

What is with all this complexity? Is this about the final read-back if
you want to decode to system RAM? In that case, let the API user do it,
as any decent API user already does, and as your first point suggests.
(This means you need to hack avconv.c.) Not sure why "locks" would be
needed for this.

(But certainly I insist that av_hwframe_transfer_data() can be called
in any thread - everything else is insane.)
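
For concreteness, here is a minimal sketch of how an API user could do the
delay on their side: a small FIFO of decoder-pool frame indices that hands
back an index only once a fixed number of newer frames have been queued.
All names (DelayQueue, delay_queue_push, ...) are hypothetical and not libav
API; the actual map+copy for the returned index is left out, and under the
AV_PIX_FMT_CUVID proposal it would presumably be an
av_hwframe_transfer_data() call.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of libav: delays frame indices by N frames. */
#define DELAY_QUEUE_CAP 8

typedef struct DelayQueue {
    int    idx[DELAY_QUEUE_CAP]; /* queued decoder-pool indices, ring buffer */
    size_t head;                 /* position of the oldest entry */
    size_t count;                /* number of queued entries */
    size_t delay;                /* frames to hold back before map+copy */
} DelayQueue;

static void delay_queue_init(DelayQueue *q, size_t delay)
{
    /* At most delay + 1 entries are ever queued, so this bounds the ring. */
    assert(delay < DELAY_QUEUE_CAP);
    q->head  = 0;
    q->count = 0;
    q->delay = delay;
}

/* Remove and return the oldest index, or -1 if the queue is empty. */
static int delay_queue_pop(DelayQueue *q)
{
    int out;

    if (q->count == 0)
        return -1;
    out     = q->idx[q->head];
    q->head = (q->head + 1) % DELAY_QUEUE_CAP;
    q->count--;
    return out;
}

/*
 * Queue a freshly decoded frame index. Returns the index that is now old
 * enough to be mapped and copied, or -1 while the queue is still filling.
 */
static int delay_queue_push(DelayQueue *q, int frame_idx)
{
    assert(frame_idx >= 0);
    q->idx[(q->head + q->count) % DELAY_QUEUE_CAP] = frame_idx;
    q->count++;
    if (q->count <= q->delay)
        return -1;
    return delay_queue_pop(q);
}
```

The caller would call delay_queue_push() after each decoded frame and, at
end of stream, call delay_queue_pop() until it returns -1. Nothing here is
shared between threads, which is part of why I do not see where locks come
in.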

VT is a whole different question.

_______________________________________________
libav-devel mailing list
[email protected]
https://lists.libav.org/mailman/listinfo/libav-devel