Feb 11, 2024, 12:58 by s...@jkqxz.net: > On 10/02/2024 22:39, Lynne wrote: > >> 11 files changed, 303 insertions(+), 669 deletions(-) >> delete mode 100644 libavcodec/vulkan_video_codec_av1std.h >> delete mode 100644 libavcodec/vulkan_video_codec_av1std_decode.h >> > > You need something in configure to detect whether the AV1 header is available > so it can build. (Currently this passes configure and errors out at build > time for me with 1.3.275 headers.) >
This affects multiple files, and it's a bit too much to add an ifdef, so I've added a version check (1.3.277). >> - StdVideoAV1MESATile tiles[MAX_TILES]; >> - StdVideoAV1MESATileList tile_list; >> - const uint32_t *tile_offsets; >> + uint32_t tile_count; >> + uint32_t tile_offsets_s[MAX_TILES]; >> + uint32_t tile_sizes[MAX_TILES]; >> > > I don't see any check before writing things to this array. What happens if > you get a non-conforming stream with more than 256 tiles? > Fixed, added a check in vk_av1_decode_slice(). >> + >> + for (int i = 0; i < AV1_TOTAL_REFS_PER_FRAME; i++) { >> + vkav1_std_ref->SavedOrderHints[i] = cbs_ctx->order_hints[i]; >> > > "SavedOrderHints is interpreted as defined in section 7.20 of the AV1 > Specification." > > This doesn't look right for frames in the DPB. SavedOrderHints in 7.20 is a > two-dimensional array containing OrderHints when that frame was decoded, not > the reference order hints of the current frame. > SavedOrderHints is a stack that is simply pushed at the time of decoding, isn't it? I think this is correct, as these are derived from the bitstream values before decoding. >> + vkav1_std_ref->RefFrameSignBias |= cbs_ctx->ref_frame_sign_bias[i] >> << i; >> + } >> + >> + *vkav1_ref = (VkVideoDecodeAV1DpbSlotInfoKHR) { >> + .sType = VK_STRUCTURE_TYPE_VIDEO_DECODE_AV1_DPB_SLOT_INFO_KHR, >> + .pStdReferenceInfo = vkav1_std_ref, >> }; >> >> for (unsigned i = 0; i < 7; i++) { >> const int idx = pic->raw_frame_header->ref_frame_idx[i]; >> - vkav1_ref->ref_order_hint[i] = >> pic->raw_frame_header->ref_order_hint[idx]; >> + vkav1_std_ref->SavedOrderHints[i] = >> pic->raw_frame_header->ref_order_hint[idx]; >> > > Is this overwriting some of the above? > Yes, removed. >> + for (int i = 0; i < STD_VIDEO_AV1_MAX_SEGMENTS; i++) { >> + for (int j = 0; j < STD_VIDEO_AV1_SEG_LVL_MAX; j++) { >> + ap->segmentation.FeatureEnabled[i] |= >> frame_header->feature_enabled[i][j] << j; >> > > "the elements of FeatureEnabled are bitmasks where bit index i of element j > corresponds to FeatureEnabled[i][j] as defined in section 6.8.13 of the AV1 > Specification" > > This looks like you have i and j swapped? > The header define it as: uint8_t FeatureEnabled[STD_VIDEO_AV1_MAX_SEGMENTS]; We define it as: uint8_t feature_enabled[AV1_MAX_SEGMENTS][AV1_SEG_LVL_MAX]; So FeatureEnabled[i] |= feature_enabled[i][j] << j; should be correct. >> + ap->segmentation.FeatureData[i][j] = >> frame_header->feature_value[i][j]; >> + } >> + } >> + >> + ap->loop_filter = (StdVideoAV1LoopFilter) { >> + .flags = (StdVideoAV1LoopFilterFlags) { >> + .loop_filter_delta_enabled = >> frame_header->loop_filter_delta_enabled, >> + .loop_filter_delta_update = >> frame_header->loop_filter_delta_update, >> + }, >> + .loop_filter_sharpness = frame_header->loop_filter_sharpness, >> + }; >> + >> + for (int i = 0; i < STD_VIDEO_AV1_MAX_LOOP_FILTER_STRENGTHS; i++) >> + ap->loop_filter.loop_filter_level[i] = >> frame_header->loop_filter_level[i]; >> + for (int i = 0; i < STD_VIDEO_AV1_LOOP_FILTER_ADJUSTMENTS; i++) >> + ap->loop_filter.loop_filter_mode_deltas[i] = >> frame_header->loop_filter_mode_deltas[i]; >> + >> + ap->cdef = (StdVideoAV1CDEF) { >> + .cdef_damping_minus_3 = frame_header->cdef_damping_minus_3, >> + .cdef_bits = frame_header->cdef_bits, >> + }; >> + >> + uses_lr = frame_header->lr_type[0] || frame_header->lr_type[1] || >> frame_header->lr_type[2], >> + ap->loop_restoration = (StdVideoAV1LoopRestoration) { >> + .FrameRestorationType[0] = remap_lr_type[frame_header->lr_type[0]], >> + .FrameRestorationType[1] = remap_lr_type[frame_header->lr_type[1]], >> + .FrameRestorationType[2] = remap_lr_type[frame_header->lr_type[2]], >> + .LoopRestorationSize[0] = 1 + frame_header->lr_unit_shift, >> + .LoopRestorationSize[1] = 1 + frame_header->lr_unit_shift - >> frame_header->lr_uv_shift, >> + .LoopRestorationSize[2] = 1 + frame_header->lr_unit_shift - >> frame_header->lr_uv_shift, >> > > The standard has: > > LoopRestorationSize[ 0 ] = RESTORATION_TILESIZE_MAX >> (2 - lr_unit_shift) > LoopRestorationSize[ 1 ] = LoopRestorationSize[ 0 ] >> lr_uv_shift > LoopRestorationSize[ 2 ] = LoopRestorationSize[ 0 ] >> lr_uv_shift > > The values here look like log2 of these rather than what they are meant to be? > Correct, fixed. Added an AV1_RESTORATION_TILESIZE_MAX constant to av1.h and calculated them as the spec requires. >> + }; >> + >> + ap->film_grain = (StdVideoAV1FilmGrain) { >> + .flags = (StdVideoAV1FilmGrainFlags) { >> + .chroma_scaling_from_luma = >> film_grain->chroma_scaling_from_luma, >> + .overlap_flag = film_grain->overlap_flag, >> + .clip_to_restricted_range = >> film_grain->clip_to_restricted_range, >> + }, >> + .grain_scaling_minus_8 = film_grain->grain_scaling_minus_8, >> + .ar_coeff_lag = film_grain->ar_coeff_lag, >> + .ar_coeff_shift_minus_6 = film_grain->ar_coeff_shift_minus_6, >> + .grain_scale_shift = film_grain->grain_scale_shift, >> + .grain_seed = film_grain->grain_seed, >> > > .film_grain_params_ref_idx got lost here. > Fixed. >> >> if (apply_grain) { >> for (int i = 0; i < 14; i++) { >> > > The Vulkan headers provide some nice constants for you to use here, use them > rather than magic numbers. > > (Maybe we should add our own versions of these in av1.h?) > Fixed. Did not add our own versions, used the header's. >> >> - ap->av1_frame_header.film_grain.ar_coeffs_cb_plus_128[24] = >> film_grain->ar_coeffs_cb_plus_128[24]; >> - ap->av1_frame_header.film_grain.ar_coeffs_cr_plus_128[24] = >> film_grain->ar_coeffs_cr_plus_128[24]; >> + ap->film_grain.ar_coeffs_cb_plus_128[24] = >> film_grain->ar_coeffs_cb_plus_128[24]; >> + ap->film_grain.ar_coeffs_cr_plus_128[24] = >> film_grain->ar_coeffs_cr_plus_128[24]; >> } >> >> /* Workaround for a spec issue. */ >> > > Is whatever this is still present? > Yes. Removed the comment, added a small TODO at the top. Sad, all the hardware had the same limitations of requiring a unique index for each frame, despite supporting image addresses for the output image. I think it's fixable with us as the indices should correspond to e.g. LAST_FRAME, but since we don't have threading on right now, we can fix it later. >> @@ -480,26 +528,22 @@ static int vk_av1_decode_slice(AVCodecContext *avctx, >> FFVulkanDecodePicture *vp = &ap->vp; >> >> for (int i = s->tg_start; i <= s->tg_end; i++) { >> - ap->tiles[ap->tile_list.nb_tiles] = (StdVideoAV1MESATile) { >> - .size = s->tile_group_info[i].tile_size, >> - .offset = s->tile_group_info[i].tile_offset, >> - .row = s->tile_group_info[i].tile_row, >> - .column = s->tile_group_info[i].tile_column, >> - .tg_start = s->tg_start, >> - .tg_end = s->tg_end, >> - }; >> + >> + ap->tile_offsets_s[ap->tile_count] = >> s->tile_group_info[i].tile_offset; >> + ap->tile_sizes[ap->tile_count] = s->tile_group_info[i].tile_size; >> > > I'm confused by why this is indexed by ap->tile_count rather than, say, i - > s->tg_start? Isn't this repeatedly overwriting one entry in the array and > never touching the others? > No, the call to ff_vk_decode_add_slice() modifies ap->tile_count. It's how it's handled in H26x as well. The logic is tiles are only added if the function call succeeds. >> >> err = ff_vk_decode_add_slice(avctx, vp, >> data + s->tile_group_info[i].tile_offset, >> s->tile_group_info[i].tile_size, 0, >> - &ap->tile_list.nb_tiles, >> + &ap->tile_count, >> &ap->tile_offsets); >> if (err < 0) >> return err; >> >> - ap->tiles[ap->tile_list.nb_tiles - 1].offset = >> ap->tile_offsets[ap->tile_list.nb_tiles - 1]; >> + ap->tile_offsets_s[ap->tile_count - 1] = >> ap->tile_offsets[ap->tile_count - 1]; >> > > And it's not at all clear what this is doing. > Leftover code. Removed, along with tile_offsets_s. >> av_log(avctx, AV_LOG_VERBOSE, "Decoder capabilities for %s profile >> \"%s\":\n", >> @@ -911,7 +911,7 @@ static int vulkan_decode_get_profile(AVCodecContext >> *avctx, AVBufferRef *frames_ >> /* TODO: make dedicated_dpb tunable */ >> dec->dedicated_dpb = !(dec_caps->flags & >> VK_VIDEO_DECODE_CAPABILITY_DPB_AND_OUTPUT_COINCIDE_BIT_KHR); >> dec->layered_dpb = !(caps->flags & >> VK_VIDEO_CAPABILITY_SEPARATE_REFERENCE_IMAGES_BIT_KHR); >> - dec->external_fg = av1_caps.flags & >> VK_VIDEO_DECODE_AV1_CAPABILITY_EXTERNAL_FILM_GRAIN_MESA; >> + // dec->external_fg = av1_caps.flags & >> VK_VIDEO_DECODE_AV1_CAPABILITY_EXTERNAL_FILM_GRAIN_KHR; >> > > What does this mean for the film grain capabilities of the current setup? > No film grain on Intel. Drivers will likely error out on profiles with filmGrainSupport = VK_TRUE, so avctx->export_side_data must have AV_CODEC_EXPORT_DATA_FILM_GRAIN for filmGrainSupport to be VK_FALSE, and the application will have to support applying film grain by itself to be compliant. We'll have to copy libplacebo's film grain generation eventually, but I think erroring out is fine for now. We could force side data to be exported in case the hardware doesn't support it in a later patch. >> +int ff_vk_av1_level_to_av(StdVideoAV1Level level) >> +{ >> + switch (level) { >> + case STD_VIDEO_AV1_LEVEL_2_0: return 20; >> + case STD_VIDEO_AV1_LEVEL_2_1: return 21; >> + case STD_VIDEO_AV1_LEVEL_2_2: return 22; >> + case STD_VIDEO_AV1_LEVEL_2_3: return 23; >> + case STD_VIDEO_AV1_LEVEL_3_0: return 30; >> + case STD_VIDEO_AV1_LEVEL_3_1: return 31; >> + case STD_VIDEO_AV1_LEVEL_3_2: return 32; >> + case STD_VIDEO_AV1_LEVEL_3_3: return 33; >> + case STD_VIDEO_AV1_LEVEL_4_0: return 40; >> + case STD_VIDEO_AV1_LEVEL_4_1: return 41; >> + case STD_VIDEO_AV1_LEVEL_4_2: return 42; >> + case STD_VIDEO_AV1_LEVEL_4_3: return 43; >> + case STD_VIDEO_AV1_LEVEL_5_0: return 50; >> + case STD_VIDEO_AV1_LEVEL_5_1: return 51; >> + case STD_VIDEO_AV1_LEVEL_5_2: return 52; >> + case STD_VIDEO_AV1_LEVEL_5_3: return 53; >> + case STD_VIDEO_AV1_LEVEL_6_0: return 60; >> + case STD_VIDEO_AV1_LEVEL_6_1: return 61; >> + case STD_VIDEO_AV1_LEVEL_6_2: return 62; >> + case STD_VIDEO_AV1_LEVEL_6_3: return 63; >> + case STD_VIDEO_AV1_LEVEL_7_0: return 70; >> + case STD_VIDEO_AV1_LEVEL_7_1: return 71; >> + case STD_VIDEO_AV1_LEVEL_7_2: return 72; >> + default: >> + case STD_VIDEO_AV1_LEVEL_7_3: return 73; >> + } >> +} >> > > Er, what? Where have the numbers on the RHS of this come from? The vulkan > header shows the numbers on the left as correctly matching the spec. > It's how lavc represents levels for H26x codecs, so I thought it's the same for AV1. But checking that now, it isn't the case, so removed it. Thanks for the review, this caught a lot of issues. Will send v4 to the ML shortly. _______________________________________________ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".