On 19/02/2017 11:09, Anton Khirnov wrote:
> Some parts of the code are based on a patch by
> Timo Rothenpieler <[email protected]>
> ---
> Compared to the ffmpeg patch which implements cuvid as a separate decoder
> using the higher-level parser API (nvcuvid.h), I did it as a classic hwaccel
> using the lower-level decoder API (cuviddec.h).
> IMO, this has a number of advantages:
> - integrates much better with the existing acceleration infrastructure/APIs
> - supports stream parameter changes
> - the code is much simpler
> - software fallback
> - various features from h264dec, such as handling weird invalid streams or
>   exporting metadata from SEIs
>
> One question to be resolved is retrieving the frames. The way the API works is
> that the decoder maintains an internal pool of frames, to which the caller
> refers by their indices. When you want the data, you map the frame, which
> allows you to copy its contents to a normal CUDA frame. To get optimal
> performance, this map+copy needs to be delayed wrt decoding by a few frames,
> so the question is how this should be done. The options I see are:
> - introduce a new pixel format, AV_PIX_FMT_CUVID, which wraps the frame index
>   and allows transfer to CUDA via av_hwframe_transfer_data(). Then either
>     * Return those PIX_FMT_CUVID frames to the caller and let them do the copy
>       manually. This is most flexible, but more work for the caller and might
>       mean synchronization problems, so we'd need to add locks (perhaps to the
>       CUVID frames context).
>     * Handle delay+map+copy somewhere else in lavc. The question is where
>       would the right place be. Janne suggested at FOSDEM to add a dummy
>       decoder, h264_cuvid wrapping h264dec, which would do the delay and copy.
>       That should work, but isn't very elegant.
> - we could also add some sort of a "postprocess" stage to AVHWaccel, run
>   before returning a frame from decode(), or perhaps invoked separately by the
>   lavc generic code.
>
> This issue might be relevant to other future hwaccels as well (VT?), so
> ideally the solution would be generic. Comments and further suggestions very
> much welcome.
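
To make the "delay the map+copy by a few frames" part concrete, here is a
rough sketch of the bookkeeping only, wherever it ends up living. It is not
taken from the patch: DELAY, DelayQueue, delay_push() and delay_flush() are
made-up names, and the actual map and copy calls are left out entirely. The
idea is just a small FIFO of decoder pool indices, where pushing a newly
decoded index hands back the one that is now old enough to be mapped.

```c
#define DELAY 4  /* hypothetical: how many frames the map+copy lags decoding */

typedef struct DelayQueue {
    int idx[DELAY];  /* decoder pool indices waiting to be mapped */
    int head;        /* position of the oldest queued entry */
    int count;       /* number of queued entries */
} DelayQueue;

/* Queue a freshly decoded pool index. Returns the index that is now old
 * enough to map+copy, or -1 while the queue is still filling up. */
int delay_push(DelayQueue *q, int pool_idx)
{
    int ready = -1;

    if (q->count == DELAY) {
        /* queue is full: release the oldest entry to make room */
        ready   = q->idx[q->head];
        q->head = (q->head + 1) % DELAY;
        q->count--;
    }
    q->idx[(q->head + q->count) % DELAY] = pool_idx;
    q->count++;

    return ready;
}

/* Drain at end of stream or on a parameter change. Returns the oldest
 * queued index, or -1 once the queue is empty. */
int delay_flush(DelayQueue *q)
{
    int ready;

    if (!q->count)
        return -1;
    ready   = q->idx[q->head];
    q->head = (q->head + 1) % DELAY;
    q->count--;

    return ready;
}
```

With a zero-initialized DelayQueue, the first DELAY calls to delay_push()
return -1 and every later call returns the index pushed DELAY calls earlier.
Whoever owns this queue (the caller, a wrapper decoder, or a postprocess
stage) would map and copy the returned index and call delay_flush() until it
returns -1 when draining.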

I'd land the code as-is and refactor the memory mapping from there.

Did you look at how the deinterlacer can be wired in, btw?

lu

_______________________________________________
libav-devel mailing list
[email protected]
https://lists.libav.org/mailman/listinfo/libav-devel
