Re: [FFmpeg-devel] [PATCH v2 00/15] YUV colorspace filter negotiation
[AMD Official Use Only - General] I found out this change caused page fault as well. 8c7934f73ab6c568acaa47c821a6833f9145fdbb is the first bad commit commit 8c7934f73ab6c568acaa47c821a6833f9145fdbb Author: Niklas Haas Date: Sun Dec 31 13:35:03 2023 -0800 avfilter: add negotiation API for color space/range Motivated by YUVJ removal. This change will allow full negotiation between color ranges and matrices as needed. By default, all ranges and matrices are marked as supported. Because grayscale formats are currently handled very inconsistently (and in particular, assumed as forced full-range by swscale), we exclude them from negotiation altogether for the time being, to get this API merged. After filter negotiation is available, we can relax the grayscale-is-forced-jpeg restriction again, when it will be more feasible to do so without breaking a million test cases. Note that this commit updates one FATE test as a consequence of the sanity fallback for non-YUV formats. In particular, the test case now writes rgb24(pc, gbr/unspecified/unspecified) to the matroska file, instead of rgb24(unspecified/unspecified/unspecified) as before. doc/APIchanges | 3 + libavfilter/avfilter.c | 17 - libavfilter/avfilter.h | 28 +++ libavfilter/avfiltergraph.c | 173 +++- libavfilter/formats.c | 121 +++ libavfilter/formats.h | 54 ++ libavfilter/internal.h | 6 ++ libavfilter/vaapi_vpp.c | 4 + libavfilter/version.h | 4 +- libavfilter/video.c | 2 + tests/ref/fate/shortest-sub | 4 +- 11 files changed, 406 insertions(+), 10 deletions(-) ffmpeg -v trace -hwaccel vaapi -vaapi_device /dev/dri/renderD128 -i ./example_720p30.y4m -vf 'format=nv12|vaapi, hwupload' -c:v h264_vaapi ./out_720p30.264 -y Successfully opened the file. Stream mapping: Stream #0:0 -> #0:0 (rawvideo (native) -> h264 (h264_vaapi)) [vost#0:0/h264_vaapi @ 0x55a86a145c70] Starting thread... [vf#0:0 @ 0x55a86a104290] Starting thread... [vist#0:0/rawvideo @ 0x55a86a105cc0] Starting thread... [in#0/yuv4mpegpipe @ 0x55a86a11d250] Starting thread... Press [q] to stop, [?] for help [rawvideo @ 0x55a86a144d40] PACKET SIZE: 1382400, STRIDE: 1920 [AVFilterGraph @ 0x7fa5eff0] Setting 'pix_fmts' to value 'nv12|vaapi' detected 24 logical cores [graph 0 input from stream 0:0 @ 0x7fa5e0003ab0] Setting 'video_size' to value '1280x720' [graph 0 input from stream 0:0 @ 0x7fa5e0003ab0] Setting 'pix_fmt' to value '0' [graph 0 input from stream 0:0 @ 0x7fa5e0003ab0] Setting 'time_base' to value '1/30' [graph 0 input from stream 0:0 @ 0x7fa5e0003ab0] Setting 'pixel_aspect' to value '0/1' [graph 0 input from stream 0:0 @ 0x7fa5e0003ab0] Setting 'colorspace' to value 'unknown' [graph 0 input from stream 0:0 @ 0x7fa5e0003ab0] Setting 'range' to value 'unknown' [graph 0 input from stream 0:0 @ 0x7fa5e0003ab0] Setting 'frame_rate' to value '30/1' [rawvideo @ 0x55a86a144d40] PACKET SIZE: 1382400, STRIDE: 1920 [graph 0 input from stream 0:0 @ 0x7fa5e0003ab0] w:1280 h:720 pixfmt:yuv420p tb:1/30 fr:30/1 sar:0/1 csp:unknown range:unknown [format @ 0x7fa5e0004390] Setting 'pix_fmts' to value 'vaapi' [auto_scale_0 @ 0x7fa5e00055d0] w:iw h:ih flags:'' interl:0 [Parsed_format_0 @ 0x7fa5e0003450] auto-inserting filter 'auto_scale_0' between the filter 'graph 0 input from stream 0:0' and the filter 'Parsed_format_0' [AVFilterGraph @ 0x7fa5eff0] query_formats: 5 queried, 9 merged, 3 already done, 0 delayed Segmentation fault (core dumped) -Original Message- From: ffmpeg-devel On Behalf Of Niklas Haas Sent: Sunday, December 31, 2023 4:51 PM To: ffmpeg-devel@ffmpeg.org Subject: Re: [FFmpeg-devel] [PATCH v2 00/15] YUV colorspace filter negotiation On Sun, 24 Dec 2023 10:35:22 -0800 Niklas Haas wrote: > On Wed, 13 Dec 2023 14:11:57 +0100 Niklas Haas wrote: > > Split off from my YUVJ removal series. This implements all of the > > libavfilter changes needed to fully deprecate YUVJ, but does not yet > > remove YUVJ, nor add an AVCodec API for advertising colorspace support. > > > > Update includes all of the feedback that was brought up by Anton. > > the major change from v1 is that YUV metadata configured on a link > > is no longer strictly required to be consistent with the frames, > > simply to avoid breaking users that don't set this link metadata. > > Effectively, this allows us to have a grace period where the field > > functions more like a hint than an authoritative field. > > Ping, any further feedback? If not, I will merge this sometime after > Christmas. Merged as 1b0ca477..44a33fccd ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-requ...@ffmpeg.org with
Re: [FFmpeg-devel] [PATCH v3 6/6] lavc/vaapi_encode: Add VAAPI AV1 encoder
[AMD Official Use Only - General] -Original Message- From: Wang, Fei W Sent: Wednesday, August 9, 2023 10:55 PM To: ffmpeg-devel@ffmpeg.org Cc: Dong, Ruijing Subject: Re: [FFmpeg-devel] [PATCH v3 6/6] lavc/vaapi_encode: Add VAAPI AV1 encoder On Mon, 2023-08-07 at 22:21 +0100, Mark Thompson wrote: > On 03/08/2023 07:01, fei.w.wang-at-intel@ffmpeg.org wrote: > > From: Fei Wang > > > > Signed-off-by: Fei Wang > > --- > > Changelog |1 + > > configure |3 + > > doc/encoders.texi | 13 + > > libavcodec/Makefile |1 + > > libavcodec/allcodecs.c|1 + > > libavcodec/vaapi_encode.c | 125 +++- > > libavcodec/vaapi_encode.h | 12 + > > libavcodec/vaapi_encode_av1.c | 1229 > > + > > libavcodec/version.h |2 +- > > 9 files changed, 1368 insertions(+), 19 deletions(-) > > create mode 100644 libavcodec/vaapi_encode_av1.c > > I assume this is tested on Intel hardware. Is it tested on AMD as > well? (Apparently working in recent Mesa.) Yep, tested on AMD as well, working. AMD tested the patchset months ago: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22585 This patch changed a little compare with the version in Cartwheel, @Ruijing, Could you help to review this version in ML? To avoid the diffs break you. Thanks. > > > > diff --git a/Changelog b/Changelog > > index bbda4f4fd4..e86f742cd3 100644 > > --- a/Changelog > > +++ b/Changelog > > @@ -27,6 +27,7 @@ version : > > - Bitstream filter for converting VVC from MP4 to Annex B > > - scale_vt filter for videotoolbox > > - transpose_vt filter for videotoolbox > > +- VAAPI AV1 encoder > > > > version 6.0: > > - Radiance HDR image support > > diff --git a/configure b/configure > > index 99388e7664..68a238a819 100755 > > --- a/configure > > +++ b/configure > > @@ -3322,6 +3322,8 @@ av1_qsv_decoder_select="qsvdec" > > av1_qsv_encoder_select="qsvenc" > > av1_qsv_encoder_deps="libvpl" > > av1_amf_encoder_deps="amf" > > +av1_vaapi_encoder_deps="VAEncPictureParameterBufferAV1" > > +av1_vaapi_encoder_select="cbs_av1 vaapi_encode" > > > > # parsers > > aac_parser_select="adts_header mpeg4audio" > > @@ -7106,6 +7108,7 @@ if enabled vaapi; then > > check_type "va/va.h va/va_enc_jpeg.h" > > "VAEncPictureParameterBufferJPEG" > > check_type "va/va.h > > va/va_enc_vp8.h" "VAEncPictureParameterBufferVP8" > > check_type "va/va.h > > va/va_enc_vp9.h" "VAEncPictureParameterBufferVP9" > > +check_type "va/va.h > > va/va_enc_av1.h" "VAEncPictureParameterBufferAV1" > > fi > > > > if enabled_all opencl libdrm ; then > > diff --git a/doc/encoders.texi b/doc/encoders.texi > > index 25d6b7f09e..fb331ebd8e 100644 > > --- a/doc/encoders.texi > > +++ b/doc/encoders.texi > > @@ -3991,6 +3991,19 @@ Average variable bitrate. > > Each encoder also has its own specific options: > > @table @option > > > > +@item av1_vaapi > > +@option{profile} sets the value of @emph{seq_profile}. > > +@option{tier} sets the value of @emph{seq_tier}. > > +@option{level} sets the value of @emph{seq_level_idx}. > > + > > +@table @option > > +@item tiles > > +Set the number of tiles to encode the input video with, as columns > > x rows. > > +(default is 1x1). > > Probably needs some clarification that large resolutions must be > split into tiles? Maybe an "auto" value for this (meaning use as few > as possible), and let explicit "1x1" fail if the resolution is too > large. > > > +@item tile_groups > > +Set tile groups number (default is 1). > > Meaning what? It splits into tile group OBUs containing equal > numbers of tiles, or of equal size in bits, or something else? > > > +@end table > > + > > @item h264_vaapi > > @option{profile} sets the value of @emph{profile_idc} and the > > @emph{constraint_set*_flag}s. > > @option{level} sets the value of @emph{level_idc}. > > diff --git a/libavcodec/Makefile b/libavcodec/Makefile > > index a6b2ecbb22..473afb4471 100644 > > --- a/libavcodec/Makefile > > +++ b/libavcodec/Makefile > > @@ -258,6 +258,7 @@ OBJS-$(CONFIG_AV1_MEDIACODEC_DECODER) += > > mediacodecdec.o > > OBJS-$(CONFIG_AV1_MEDIACODEC_ENCODER) +
Re: [FFmpeg-devel] [PATCH] lavc/qsvenc_h264: don't support P010 format
[AMD Official Use Only - General] We would like to add that in Mesa first to let vaapi path accept 10bit input and HW to convert it to 8bit if it makes sense to do so. Then get a chance to apply it to ffmpeg. Thanks Ruijing -Original Message- From: ffmpeg-devel On Behalf Of James Almer Sent: Friday, November 25, 2022 6:35 PM To: ffmpeg-devel@ffmpeg.org Subject: Re: [FFmpeg-devel] [PATCH] lavc/qsvenc_h264: don't support P010 format On 11/25/2022 8:26 PM, Dong, Ruijing wrote: > [AMD Official Use Only - General] > > Will it make sense to accept P010 input, however encode to h264 8bit? If it works (the encoder accepts the 10 bit input even if it encodes it as 8bit), then i don't see why not. I assume it would also be faster than using swscale to convert said 10bit input to nv12 before passing that to the encoder. Removing support for a pixel format as input in an encoder needs a reason other than "It's rarely used", more so when it's a single line. It either needs to not work, or somehow get in the way of further improvements. > > Thanks, > Ruijing > > -Original Message- > From: ffmpeg-devel On Behalf Of > James Almer > Sent: Friday, November 25, 2022 6:25 PM > To: ffmpeg-devel@ffmpeg.org > Subject: Re: [FFmpeg-devel] [PATCH] lavc/qsvenc_h264: don't support > P010 format > > On 11/25/2022 8:20 PM, Soft Works wrote: >> >> >>> -Original Message- >>> From: ffmpeg-devel On Behalf Of >>> James Almer >>> Sent: Friday, November 25, 2022 8:48 PM >>> To: ffmpeg-devel@ffmpeg.org >>> Subject: Re: [FFmpeg-devel] [PATCH] lavc/qsvenc_h264: don't support >>> P010 format >>> >>> On 11/25/2022 3:03 PM, Soft Works wrote: >>>> >>>> >>>>> -Original Message- >>>>> From: ffmpeg-devel On Behalf Of >>>>> Anton Khirnov >>>>> Sent: Friday, November 25, 2022 2:46 PM >>>>> To: FFmpeg development discussions and patches >>>> de...@ffmpeg.org> >>>>> Cc: Haihao Xiang >>>>> Subject: Re: [FFmpeg-devel] [PATCH] lavc/qsvenc_h264: don't >>> support >>>>> P010 format >>>>> >>>>> Why? >>>> >>>> It's rarely used. There's not even a hwaccel that can decode this. >>> >>> The dxva2/d3d11 hwaccels seemingly do >> >> No, only for HEVC and VP9. Probably for AV1 as well in the near future. > > Well, then that is a good reason to not remove support in this encoder for a > pixel format that not only our software scaler can generate, but decoders > using hwaccel backends can output too. > ___ > ffmpeg-devel mailing list > ffmpeg-devel@ffmpeg.org > https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fffmp > eg.org%2Fmailman%2Flistinfo%2Fffmpeg-develdata=05%7C01%7Cruijing. > dong%40amd.com%7Cbcfc4e8b332b4a42691908dacf3d87e7%7C3dd8961fe4884e608e > 11a82d994e183d%7C0%7C0%7C638050160442866421%7CUnknown%7CTWFpbGZsb3d8ey > JWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C300 > 0%7C%7C%7Csdata=8UN5OD1Yt2fmSkiHi4cLtnDryBlwpa8DF4Oi%2FMhjxPk%3D& > amp;reserved=0 > > To unsubscribe, visit link above, or email ffmpeg-devel-requ...@ffmpeg.org > with subject "unsubscribe". > > > ___ > ffmpeg-devel mailing list > ffmpeg-devel@ffmpeg.org > https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fffmp > eg.org%2Fmailman%2Flistinfo%2Fffmpeg-develdata=05%7C01%7Cruijing. > dong%40amd.com%7Cbcfc4e8b332b4a42691908dacf3d87e7%7C3dd8961fe4884e608e > 11a82d994e183d%7C0%7C0%7C638050160442866421%7CUnknown%7CTWFpbGZsb3d8ey > JWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C300 > 0%7C%7C%7Csdata=8UN5OD1Yt2fmSkiHi4cLtnDryBlwpa8DF4Oi%2FMhjxPk%3D& > amp;reserved=0 > > To unsubscribe, visit link above, or email > ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe". ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fffmpeg.org%2Fmailman%2Flistinfo%2Fffmpeg-develdata=05%7C01%7Cruijing.dong%40amd.com%7Cbcfc4e8b332b4a42691908dacf3d87e7%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C638050160442866421%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7Csdata=8UN5OD1Yt2fmSkiHi4cLtnDryBlwpa8DF4Oi%2FMhjxPk%3Dreserved=0 To unsubscribe, visit link above, or email ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe". <>___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".
Re: [FFmpeg-devel] [PATCH] lavc/qsvenc_h264: don't support P010 format
[AMD Official Use Only - General] Will it make sense to accept P010 input, however encode to h264 8bit? Thanks, Ruijing -Original Message- From: ffmpeg-devel On Behalf Of James Almer Sent: Friday, November 25, 2022 6:25 PM To: ffmpeg-devel@ffmpeg.org Subject: Re: [FFmpeg-devel] [PATCH] lavc/qsvenc_h264: don't support P010 format On 11/25/2022 8:20 PM, Soft Works wrote: > > >> -Original Message- >> From: ffmpeg-devel On Behalf Of >> James Almer >> Sent: Friday, November 25, 2022 8:48 PM >> To: ffmpeg-devel@ffmpeg.org >> Subject: Re: [FFmpeg-devel] [PATCH] lavc/qsvenc_h264: don't support >> P010 format >> >> On 11/25/2022 3:03 PM, Soft Works wrote: >>> >>> -Original Message- From: ffmpeg-devel On Behalf Of Anton Khirnov Sent: Friday, November 25, 2022 2:46 PM To: FFmpeg development discussions and patches >>> de...@ffmpeg.org> Cc: Haihao Xiang Subject: Re: [FFmpeg-devel] [PATCH] lavc/qsvenc_h264: don't >> support P010 format Why? >>> >>> It's rarely used. There's not even a hwaccel that can decode this. >> >> The dxva2/d3d11 hwaccels seemingly do > > No, only for HEVC and VP9. Probably for AV1 as well in the near future. Well, then that is a good reason to not remove support in this encoder for a pixel format that not only our software scaler can generate, but decoders using hwaccel backends can output too. ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fffmpeg.org%2Fmailman%2Flistinfo%2Fffmpeg-develdata=05%7C01%7Cruijing.dong%40amd.com%7Cf449421fd89948ad96cc08dacf3c324b%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C638050154742304108%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7Csdata=xFeOvsaJw%2F9BlPifrqx33Pv8GurJ1N2lVyuUYHIOClo%3Dreserved=0 To unsubscribe, visit link above, or email ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe". <>___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".
Re: [FFmpeg-devel] [PATCH v2] avcodec/av1_vaapi: add direct film grain mode
[AMD Official Use Only - General] Thanks for the input. I will try this way. Ruijing -Original Message- From: Mark Thompson Sent: Thursday, November 24, 2022 4:10 PM To: Dong, Ruijing ; FFmpeg development discussions and patches Subject: Re: [FFmpeg-devel] [PATCH v2] avcodec/av1_vaapi: add direct film grain mode On 24/11/2022 15:27, Dong, Ruijing wrote: > [AMD Official Use Only - General] > > I might have misunderstood some of the questions, and I would like to explain > more about the issue from my perspective, please correct me if anything wrong. Are you perhaps missing that this is intended to be a common API between multiple implementations and users? The important feature is that once the API has specified the behaviour then both the driver implementers (such as Mesa, but also others) and the users (such as FFmpeg, but also others) don't need to know what is going on on the other side, because the other side always behaves in the same way. In this case, the API specifies that there are two output surfaces - one is written with the reference (pre-grain) frame, one is written with the display (post-grain) frame. The Mesa driver is incorrect because it writes the display frame to the reference surface, and doesn't bother to write the display surface at all. This is then very visible when any user tries to display the display surface, because it hasn't been written. > This patch is NOT a hack, like @Mark Thompson mentioned. Your proposed patch is trying to crystalise an alternative not-quite-VAAPI which uses the same library and appears compatible, but actually isn't because it differs in critical ways. I don't think it is unreasonable to characterise this as a hack. > Video codec, especially decoders will need to meet the requirements of > video codecs first, if the reference picture management (DPB) was > implemented wrongly, then it could not meet the fundamental decoder criteria. > From this point of view, different hardware will need to follow the same > standard for the implementation so that the decoders can generate the > conformance outputs. > > The DPB is always an internal part of the decoder, the detail > implementation could be differed with different benefits, if DPB is > managed by the application, it can be more flexible and easily > maintained, the other way is the DPB is managed by the driver and hardware > itself, it could have more space for the optimization, for example the > reference frame access, where the format of reference frames is NOT used for > display output, and the display output cannot be used for the reference > frames neither because reference frames could use a different format, which > is more efficient for reference access however not good for display. > From my point of view VAAPI supports both of the above two ideas, and > it is not necessary to force one to follow another, because that is limited > by the initial design Idea. In this case, there is no pre-grain output in > the former decoder, it has only one display output. Fair enough. Luckily, the API clearly states where that display output should be written: <https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fintel%2Flibva%2Fblob%2Fmaster%2Fva%2Fva_dec_av1.h%23L300-L304data=05%7C01%7CRuijing.Dong%40amd.com%7C632f4b36198b472f479e08dace604aaa%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C638049210211836459%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7Csdata=vDQ87%2BLUym3W7UsO6JJI6o9uGLAxCenCA7K2eRQHtgI%3Dreserved=0>. You can get away with not writing the reference frame because it won't be displayed in this case, but you must write the display frame because that is the one the API says will be displayed to the user. The unwritten reference frame will still get used in references because that is how the API is defined, but since Mesa deals with this by attaching hidden metadata to the frames there is no problem. > From the other side, most of the AMD AV1 decoding issues are resolved > from the community, the film grain problem becomes more noticeable. And > generally speaking it is usually a flexible part of the post processing phase > after video decoding, and here it is strictly defined in AV1 spec, and it is > part of the decoding standard. > It is not practical to make changes in the DPB design idea for > resolving this issue from AMD decoder side. And naturally output the applied > firm grain is just another film grain process mode, I called it "direct film > grain mode". > > I have asked the community to inspire me to have a better idea, and > eventually I found out there is no good way other than to have the > external user choice or detecting AMD driver. I understand doing the string > match to choose AMD driver is
Re: [FFmpeg-devel] [PATCH v2] avcodec/av1_vaapi: add direct film grain mode
[AMD Official Use Only - General] I might have misunderstood some of the questions, and I would like to explain more about the issue from my perspective, please correct me if anything wrong. This patch is NOT a hack, like @Mark Thompson mentioned. Video codec, especially decoders will need to meet the requirements of video codecs first, if the reference picture management (DPB) was implemented wrongly, then it could not meet the fundamental decoder criteria. From this point of view, different hardware will need to follow the same standard for the implementation so that the decoders can generate the conformance outputs. The DPB is always an internal part of the decoder, the detail implementation could be differed with different benefits, if DPB is managed by the application, it can be more flexible and easily maintained, the other way is the DPB is managed by the driver and hardware itself, it could have more space for the optimization, for example the reference frame access, where the format of reference frames is NOT used for display output, and the display output cannot be used for the reference frames neither because reference frames could use a different format, which is more efficient for reference access however not good for display. >From my point of view VAAPI supports both of the above two ideas, and it is >not necessary to force one to follow another, because that is limited by the >initial design Idea. In this case, there is no pre-grain output in the former decoder, it has only one display output. >From the other side, most of the AMD AV1 decoding issues are resolved from the >community, the film grain problem becomes more noticeable. And generally >speaking it is usually a flexible part of the post processing phase after video decoding, and here it is strictly defined in AV1 spec, and it is part of the decoding standard. It is not practical to make changes in the DPB design idea for resolving this issue from AMD decoder side. And naturally output the applied firm grain is just another film grain process mode, I called it "direct film grain mode". I have asked the community to inspire me to have a better idea, and eventually I found out there is no good way other than to have the external user choice or detecting AMD driver. I understand doing the string match to choose AMD driver is not a perfect idea, but we really need to have a method to resolve this issue. Please let me know If a better way come to your mind, which can resolve this issue and in the meanwhile not affecting other AV1 implementation path. Appreciate for the help and comments! Ruijing -Original Message- From: ffmpeg-devel On Behalf Of Dong, Ruijing Sent: Tuesday, November 22, 2022 9:43 PM To: FFmpeg development discussions and patches Subject: Re: [FFmpeg-devel] [PATCH v2] avcodec/av1_vaapi: add direct film grain mode [AMD Official Use Only - General] [AMD Official Use Only - General] Hi Mark, Just got the ffmpeg email, please see my answer below in [rdong]. Thanks, Ruijing -Original Message- From: ffmpeg-devel On Behalf Of Mark Thompson Sent: Tuesday, November 22, 2022 6:34 PM To: ffmpeg-devel@ffmpeg.org Subject: Re: [FFmpeg-devel] [PATCH v2] avcodec/av1_vaapi: add direct film grain mode On 22/11/2022 20:59, Mark Thompson wrote: > On 22/11/2022 20:26, Mark Thompson wrote: >> On 22/11/2022 19:18, Dong, Ruijing wrote: >>> [AMD Official Use Only - General] >>> >>> Hi Mark, >>> >>> Sorry for being late to reply to you. >>> >>> Your understanding is correct, and I have sent a new patch [v4] for >>> addressing the current issue and to use driver quirk mechanism to specify >>> only AMD VAAPI driver has this behavior, then this could be more specific. >>> >>> For AMD hardware, it allocates GPU memory internally for the DPB >>> management, the output is always the final one with or without applied >>> film-grain. >> >> I don't see why this requires you to write the output to the wrong surface. >> Why not write it to the correct one instead? > > Indeed, this seems to be a trivial fix in Mesa: > <https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgitlab.freedesktop.org%2Fmesa%2Fmesa%2F-%2Fmerge_requests%2F19938data=05%7C01%7Cruijing.dong%40amd.com%7Cff9fa5ba1fdc47293fcb08daccfc817a%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C638047682135727840%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7Csdata=cPnxOAo32wz6YY%2Fcm8lMauIaecI%2BXmy0KgtiIB8e4E4%3Dreserved=0>. > > It would be helpful if someone with suitable hardware could test that. This was too naive, the Mesa driver doesn't make this easy. It is only set up to write to a single surface, which is the one provided to vaBeginPicture(). Howeve
Re: [FFmpeg-devel] [PATCH v2] avcodec/av1_vaapi: add direct film grain mode
[AMD Official Use Only - General] Hi Mark, Just got the ffmpeg email, please see my answer below in [rdong]. Thanks, Ruijing -Original Message- From: ffmpeg-devel On Behalf Of Mark Thompson Sent: Tuesday, November 22, 2022 6:34 PM To: ffmpeg-devel@ffmpeg.org Subject: Re: [FFmpeg-devel] [PATCH v2] avcodec/av1_vaapi: add direct film grain mode On 22/11/2022 20:59, Mark Thompson wrote: > On 22/11/2022 20:26, Mark Thompson wrote: >> On 22/11/2022 19:18, Dong, Ruijing wrote: >>> [AMD Official Use Only - General] >>> >>> Hi Mark, >>> >>> Sorry for being late to reply to you. >>> >>> Your understanding is correct, and I have sent a new patch [v4] for >>> addressing the current issue and to use driver quirk mechanism to specify >>> only AMD VAAPI driver has this behavior, then this could be more specific. >>> >>> For AMD hardware, it allocates GPU memory internally for the DPB >>> management, the output is always the final one with or without applied >>> film-grain. >> >> I don't see why this requires you to write the output to the wrong surface. >> Why not write it to the correct one instead? > > Indeed, this seems to be a trivial fix in Mesa: > <https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgitlab.freedesktop.org%2Fmesa%2Fmesa%2F-%2Fmerge_requests%2F19938data=05%7C01%7Cruijing.dong%40amd.com%7C0896eeffcf674092aa8408dacce228f3%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C638047570363839738%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7Csdata=3Be3k9TQdRjGJv2KYNsssFjl3ORaNXzdSBqei%2BgYSBg%3Dreserved=0>. > > It would be helpful if someone with suitable hardware could test that. This was too naive, the Mesa driver doesn't make this easy. It is only set up to write to a single surface, which is the one provided to vaBeginPicture(). However, VAAPI does not work that way - it wants you to write to both the pre-grain and the post-grain surfaces, where the pre-grain surface is the primary target and gets passed the vaBeginPicture() and the post-grain surface is supplied in the parameters. So that's the first problem: the render target which is given as the pre-grain surface needs to be replaced by post-grain surface if we want to only write a single surface. [rdong] Well, the render surface targets to output the displayable surface, and film grain has many ways to be carried on, one way is to let application doing next step, the other way is the decoder who can directly output the final applied grain result. The VAAPI interface should accommodate as much as possible hardware instead of limiting the only way for the implementation. AMD decoder for the performance optimization (tiled formats and swizzle modes) and other considerations, it will need to manage the reference frames internally, and there is no point to implement a new logic to output both pre-grain and post-grain surfaces. If grain_apply is set, then the decoder will only output the applied grain result. Is that enough? Well, no. The Mesa driver is also messing with the reference frames. The VAAPI model is that the pre-grain surfaces are passed back into the driver on subsequent frames when they are used as reference frames. However, the Mesa driver has hidden the pre-grain surface internally and only written the post-grain surface. Therefore, when writing a post-grain output, it magically associates with the target surface information about the pre-grain surface which was written internally at the same time. Then, when you later give it that surface as a reference frame it ignores the actual content of the frame and looks at the associated data to find what to use internally as the reference. That's the second problem: if the post-grain surface were actually the render target then the magic internal reference gets associated with that, and when we pass the real reference frame (the pre-grain surface) in later then it won't recognise it because it never wrote to that surface. [rdong] In my understanding the target surface should be the final output, and the reason AMD decoder cannot output the reference frames I have already described above, in fact, the reference frame buffer pointers are used as reference to indicate how the reference picture buffer will managed internally, please understand there are some concept differences there. How should it be fixed, then? The best way would be to stop hiding the internal information about reference frames: if the real reference frames were visible in VAAPI then everything would just work and none of the magic internal references would be needed. [rdong] Well, it has many difficulties to map the well designed AMD decoder to the VAAPI interface, however we could not expect everything is perfect. If we
Re: [FFmpeg-devel] [PATCH v2] avcodec/av1_vaapi: add direct film grain mode
[AMD Official Use Only - General] Hi Mark, Sorry for being late to reply to you. Your understanding is correct, and I have sent a new patch [v4] for addressing the current issue and to use driver quirk mechanism to specify only AMD VAAPI driver has this behavior, then this could be more specific. For AMD hardware, it allocates GPU memory internally for the DPB management, the output is always the final one with or without applied film-grain. Thanks for your comments and insights! Ruijing -Original Message- From: ffmpeg-devel On Behalf Of Mark Thompson Sent: Sunday, November 20, 2022 11:44 AM To: ffmpeg-devel@ffmpeg.org Subject: Re: [FFmpeg-devel] [PATCH v2] avcodec/av1_vaapi: add direct film grain mode On 20/11/2022 02:59, Ruijing Dong wrote: > Adding direct film grain mode for av1 decoder, which outputs alongside > film grain. > > AV_HWACCEL_FLAG_DIRECT_FILM_GRAIN is the new flag introduced to enable > this path. > > issue: > By using AMD av1 decoder via VAAPI, when used with film grain content, > the output displays black screen with incorrect frame order. > > The issue being discussed in here: > https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgitl > ab.freedesktop.org%2Fmesa%2Fmesa%2F-%2Fissues%2F6903%23note_1613807 > p;data=05%7C01%7Cruijing.dong%40amd.com%7C6f75be6a4f8044fe037d08dacb16 > 8529%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C638045595864762041%7 > CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1 > haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7Csdata=Ms7PCNEjOs09JVQd2KB46St5 > w3V8Idbc2shZC80VefI%3Dreserved=0 > > example: > This flag can be used in ffmpeg command: > > ffmpeg -hwaccel vaapi -vaapi_device /dev/dri/renderD128 >-hwaccel_flags 8 >-i input_with_film_gram.obu >output_with_film_grain.yuv > > Signed-off-by: Ruijing Dong > --- > update: add new option direct_film_grain in optons_table.h > > libavcodec/avcodec.h | 17 + > libavcodec/options_table.h | 1 + > libavcodec/vaapi_av1.c | 6 -- > 3 files changed, 22 insertions(+), 2 deletions(-) My understanding of this is as follows, please correct me if any of this is wrong: * For AV1 with film grain enabled, VAAPI decode is specified with two output surfaces: one for pre-grain (reference) output and one for post-grain (display) output. * The current driver in Mesa always writes to the pre-grain surface and ignores the post-grain surface entirely. * To fix this, you intend to modify the VAAPI code in libavcodec to allow the user to manually override the expected VAAPI behaviour and instead assume that the post-grain output has been written to the pre-grain surface. Is that right? [rdong]: If it is, could you perhaps explain why this manual option is preferable to the more obvious solution of Mesa being fixed to write the post-grain output to the post-grain surface? Thanks, - Mark > diff --git a/libavcodec/avcodec.h b/libavcodec/avcodec.h index > 3edd8e2636..9420e7820d 100644 > --- a/libavcodec/avcodec.h > +++ b/libavcodec/avcodec.h > @@ -2253,6 +2253,23 @@ typedef struct AVHWAccel { >*/ > #define AV_HWACCEL_FLAG_ALLOW_PROFILE_MISMATCH (1 << 2) > > +/** > + * The film grain synthesis could be seperate from decoding process. > + * The downstream device would apply the film grain parameters seperately. > + * The desired film grain parameters can be found in SEI section in > +H264 > + * or H265 bitstreams. > + * > + * In AV1, film grain is mandatorily specified, av1 decoders like AMD > + * av1 decoder process film grain content internally, and the output > + * includes applied film grain. For the purpose of supporting these > +av1 > + * decoders, this flag needs to be utilized. > + * > + * @warning If the stream has no film grain content, this flag will > + * be ignored in the supported av1 decoders. It is advised > + * that this flag should only be used in av1 decoders > + * that support it. > + */ > +#define AV_HWACCEL_FLAG_DIRECT_FILM_GRAIN (1 << 3) > /** >* @} >*/ > diff --git a/libavcodec/options_table.h b/libavcodec/options_table.h > index cd02f5096f..0302f89280 100644 > --- a/libavcodec/options_table.h > +++ b/libavcodec/options_table.h > @@ -399,6 +399,7 @@ static const AVOption avcodec_options[] = { > {"ignore_level", "ignore level even if the codec level used is unknown or > higher than the maximum supported level reported by the hardware driver", 0, > AV_OPT_TYPE_CONST, { .i64 = AV_HWACCEL_FLAG_IGNORE_LEVEL }, INT_MIN, INT_MAX, > V | D, "hwaccel_flags" }, > {"allow_high_depth", "allow to output YUV pixel formats with a different > chroma sampling than 4:2:0 and/or other than 8 bits per component", 0, > AV_OPT_TYPE_CONST, {.i64 = AV_HWACCEL_FLAG_ALLOW_HIGH_DEPTH }, INT_MIN, > INT_MAX, V | D, "hwaccel_flags"}, > {"allow_profile_mismatch", "attempt to decode anyway if HW > accelerated decoder's supported profiles do not exactly match the > stream", 0,