PR #23616 opened by Torbjörn Einarsson (tobbee)
URL: https://code.ffmpeg.org/FFmpeg/FFmpeg/pulls/23616
Patch URL: https://code.ffmpeg.org/FFmpeg/FFmpeg/pulls/23616.patch
# Summary of changes

Unlike H.264, where field coding (PAFF/MBAFF) is part of the bitstream and the decoder already returns a complementary field pair as one full-height frame, HEVC has no field coding at all: interlaced content is just a sequence of independent half-height field pictures with distinct POCs, flagged only via the PTL source-scan flags, VUI field_seq_flag and the picture timing SEI. A plain decoder therefore emits half-height pictures at field rate. Currently, FFmpeg does not properly handle interlaced HEVC.

This PR makes the decoder weave the two coded fields of a pair into one full-height frame, bringing field-coded HEVC to parity with H.264 PAFF: a 1080i50 stream decodes to 1920x1080 @ 25fps, flagged INTERLACED with TOP_FIELD_FIRST set for top-field-first content and cleared for bottom-field-first.

A previous attempt (Jose Santiago, V1..V7, Oct-Nov 2024) was not merged as too complex for "just weave inside a decoder"; this is a deliberately minimal take:

- No copy: the leader (first field in decode order) allocates one full-height buffer; both fields are per-plane views into it with doubled linesize (top->even lines, bottom->odd). Reconstruction is unchanged; inter-field prediction works through the normal POC-based DPB.
- No new subsystem: no construction context, no extra FIFO, no mutex. Pairing is a small leader/follower state finalized at the second field's frame_start before ff_thread_finish_setup(), so it is correct under frame threading via the existing thread-context propagation. Output is byte-identical single- vs multi-threaded.
- Contained scope: SW decoding only; hwaccel is untouched (opaque surfaces can't be woven via strides), re-checked after get_format().

Field order for bare pic_struct 1/2 (no pairing hint) is anchored on IRAP pictures; the explicit 9..12 hints are handled directly.
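The "No copy" point can be illustrated with a minimal sketch. It assumes a single 8-bit plane and uses a hypothetical `PlaneView` struct and hypothetical helpers `field_view()` and `fill_view()`; none of these are FFmpeg types or functions, and the real patch sets up `AVFrame` `data[]`/`linesize[]` per plane instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical illustration type, not FFmpeg's AVFrame. */
typedef struct PlaneView {
    uint8_t  *data;     /* first sample seen through the view */
    ptrdiff_t linesize; /* bytes between consecutive rows of the view */
    int       width;    /* samples per row */
    int       height;   /* rows visible through the view */
} PlaneView;

/*
 * Build a field view into a full-height plane without copying:
 * a top field maps the even frame lines, a bottom field the odd lines
 * (start offset by one frame line), and the stride is doubled so that
 * stepping one row in the view skips the other field's line.
 */
static PlaneView field_view(uint8_t *frame, ptrdiff_t frame_linesize,
                            int width, int frame_height, int bottom)
{
    PlaneView v;

    assert(frame_height % 2 == 0); /* a woven frame holds two equal fields */

    v.data     = frame + (bottom ? frame_linesize : 0);
    v.linesize = frame_linesize * 2;
    v.width    = width;
    v.height   = frame_height / 2;
    return v;
}

/* Stand-in for "decoding" a field: write one value to every visible sample. */
static void fill_view(const PlaneView *v, uint8_t value)
{
    for (int y = 0; y < v->height; y++)
        memset(v->data + y * v->linesize, value, (size_t)v->width);
}
```

Filling the top view and the bottom view of the same buffer with different values leaves the buffer with alternating lines, which is the woven frame; any code that only takes a pointer and a stride works on such a view unchanged.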
Testing: full FATE passes after update; fate-hevc-paired-fields (10-bit 4:2:2, explicit hints) is updated, and a new fate-hevc-paired-fields-420 (8-bit 4:2:0, bare pic_struct 1/2) exercises the IRAP-anchored path. The new test needs the attached sample.

```fate-samples
hevc/paired_fields_420.hevc
```

From 684914b7b1565fe96ab0c7dc40725b9a19b041c0 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Torbjo=CC=88rn=20Einarsson?= <[email protected]>
Date: Tue, 9 Jun 2026 15:20:50 +0200
Subject: [PATCH 1/4] avcodec/hevc: store raw pic_struct and source_scan_type from pic timing SEI
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

When frame_field_info_present_flag is set, the picture timing SEI carries
pic_struct, source_scan_type and duplicate_flag (H.265 D.3.3), but only the
derived picture_struct was kept and the latter two were left unread.

Store the raw pic_struct and source_scan_type, flag the timing info as
present, and read the previously skipped source_scan_type and duplicate_flag.
A following commit uses these to report the field order.

Signed-off-by: Torbjörn Einarsson <[email protected]>
---
 libavcodec/hevc/sei.c | 4 ++++
 libavcodec/hevc/sei.h | 3 +++
 2 files changed, 7 insertions(+)

diff --git a/libavcodec/hevc/sei.c b/libavcodec/hevc/sei.c
index 83c726a217..70b0a6b17f 100644
--- a/libavcodec/hevc/sei.c
+++ b/libavcodec/hevc/sei.c
@@ -60,6 +60,8 @@ static int decode_nal_sei_pic_timing(HEVCSEI *s, GetBitContext *gb,
 
     if (sps->vui.frame_field_info_present_flag) {
         int pic_struct = get_bits(gb, 4);
+        h->present = 1;
+        h->pic_struct = pic_struct;
         h->picture_struct = AV_PICTURE_STRUCTURE_UNKNOWN;
         if (pic_struct == 2 || pic_struct == 10 || pic_struct == 12) {
             av_log(logctx, AV_LOG_DEBUG, "BOTTOM Field\n");
@@ -74,6 +76,8 @@ static int decode_nal_sei_pic_timing(HEVCSEI *s, GetBitContext *gb,
             av_log(logctx, AV_LOG_DEBUG, "Frame/Field Tripling\n");
             h->picture_struct = HEVC_SEI_PIC_STRUCT_FRAME_TRIPLING;
         }
+        h->source_scan_type = get_bits(gb, 2);
+        skip_bits1(gb); // duplicate_flag
     }
 
     return 0;
diff --git a/libavcodec/hevc/sei.h b/libavcodec/hevc/sei.h
index 59bd2b45f8..55f03e0ede 100644
--- a/libavcodec/hevc/sei.h
+++ b/libavcodec/hevc/sei.h
@@ -52,6 +52,9 @@ typedef struct HEVCSEIFramePacking {
 
 typedef struct HEVCSEIPictureTiming {
     int picture_struct;
+    int pic_struct; ///< raw pic_struct (H.265 Table D.2), valid if present
+    int source_scan_type; ///< 0: interlaced, 1: progressive, 2: unknown
+    int present; ///< a pic_timing SEI with frame_field_info was parsed
 } HEVCSEIPictureTiming;
 
 typedef struct HEVCSEIAlternativeTransfer {
--
2.52.0

From 426a5c6af0b6b2748172759c46df8ce8f1f9c844 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Torbjo=CC=88rn=20Einarsson?= <[email protected]>
Date: Tue, 9 Jun 2026 15:20:50 +0200
Subject: [PATCH 2/4] avcodec/hevc: set field_order from picture timing SEI in the parser
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

hevc_parse_slice_header assigned the SEI picture_struct, an
AV_PICTURE_STRUCTURE_* value, directly to AVCodecParserContext.field_order
which is an enum AVFieldOrder. The enums are unrelated, so a top field was
reported as "progressive" and a coded frame as "bb".

Derive the field order from the picture timing SEI pic_struct, falling back
to the profile_tier_level source-scan flags, mirroring the H.264 parser.
ffprobe now reports interlaced HEVC as tt/bb instead of progressive.

Signed-off-by: Torbjörn Einarsson <[email protected]>
---
 libavcodec/hevc/parser.c | 37 ++++++++++++++++++++++++++++++++++++-
 1 file changed, 36 insertions(+), 1 deletion(-)

diff --git a/libavcodec/hevc/parser.c b/libavcodec/hevc/parser.c
index 47a1ac70d0..508d093807 100644
--- a/libavcodec/hevc/parser.c
+++ b/libavcodec/hevc/parser.c
@@ -53,6 +53,40 @@ typedef struct HEVCParserContext {
     int pocTid0;
 } HEVCParserContext;
 
+/*
+ * Derive the stream field order. HEVC has no field coding, so interlace is
+ * carried as metadata: primarily by the pic_struct of the picture timing SEI
+ * (H.265 Table D.2), with the profile_tier_level source-scan flags as a
+ * fallback. This mirrors the H.264 parser (h264_parser.c). Top-field-first
+ * pic_struct values are {1,3,5,9,11}, bottom-field-first {2,4,6,10,12}.
+ */
+static enum AVFieldOrder hevc_field_order(const HEVCSPS *sps,
+                                          const HEVCSEIPictureTiming *pt)
+{
+    const PTLCommon *ptl = &sps->ptl.general_ptl;
+
+    if (pt->present) {
+        switch (pt->pic_struct) {
+        case 1: case 3: case 5: case 9: case 11:
+            return AV_FIELD_TT;
+        case 2: case 4: case 6: case 10: case 12:
+            return AV_FIELD_BB;
+        case 0: case 7: case 8:
+            return AV_FIELD_PROGRESSIVE;
+        }
+        if (pt->source_scan_type == 1)
+            return AV_FIELD_PROGRESSIVE;
+    }
+
+    /* No usable pic_struct: fall back to the profile_tier_level flags. */
+    if (ptl->progressive_source_flag && !ptl->interlaced_source_flag)
+        return AV_FIELD_PROGRESSIVE;
+    if (ptl->interlaced_source_flag && !ptl->progressive_source_flag)
+        return AV_FIELD_TT; /* interlaced; order not signalled, assume top first */
+
+    return AV_FIELD_UNKNOWN;
+}
+
 static int hevc_parse_slice_header(AVCodecParserContext *s, H2645NAL *nal,
                                    AVCodecContext *avctx)
 {
@@ -70,7 +104,6 @@ static int hevc_parse_slice_header(AVCodecParserContext *s, H2645NAL *nal,
 
     first_slice_in_pic_flag = get_bits1(gb);
     s->picture_structure = sei->picture_timing.picture_struct;
-    s->field_order = sei->picture_timing.picture_struct;
 
     if (IS_IRAP_NAL(nal)) {
         s->key_frame = 1;
@@ -85,6 +118,8 @@ static int hevc_parse_slice_header(AVCodecParserContext *s, H2645NAL *nal,
     pps = ps->pps_list[pps_id];
     sps = pps->sps;
 
+    s->field_order = hevc_field_order(sps, &sei->picture_timing);
+
     ow = &sps->output_window;
 
     s->coded_width = sps->width;
--
2.52.0

From 7faa4fb6d5df9ec6483d55617bf619b27c44d76a Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Torbjo=CC=88rn=20Einarsson?= <[email protected]>
Date: Fri, 12 Jun 2026 00:12:55 +0200
Subject: [PATCH 3/4] avcodec/hevc: combine interlaced field pairs into full frames
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

HEVC has no field coding; broadcast-style interlaced content is coded as a
sequence of half-height field pictures, signalled only through the
profile_tier_level source-scan flags, VUI field_seq_flag and the picture
timing SEI. Decoders that present those pictures as-is yield half-height
output at field rate, unlike H.264 PAFF where the two fields of a pair are
returned as one full-height frame.

Decode the two fields of a complementary pair into one shared full-height
buffer and output it once as a woven frame carrying the INTERLACED and
TOP_FIELD_FIRST flags, so field-coded HEVC is presented like H.264 PAFF: a
1080i50 stream decodes to 1920x1080 frames at 25 fps.

There is no copying: the first field of a pair (the leader, in decode order)
allocates the full-height buffer and both fields' frames become per-plane
views into it with doubled linesize, the top field on even and the bottom
field on odd lines, independently of which one leads, so both TFF and BFF
streams work. The reconstruction code takes strides as parameters throughout
and needs no changes; inter-field prediction works through the regular
POC-based DPB since the fields keep distinct POCs.

Pairing follows the pic_struct pairing hints (Table D.2): 11/12 lead, 9/10
follow; bare 1/2, which carry no pairing hint, take their role from the field
order locked at the first IRAP field of the sequence (an IRAP starts a frame,
so it leads its pair and its parity gives the field order), falling back to
decode order before the first IRAP.

Combining engages only for software decoding of streams that signal
interlaced_source_flag and not progressive_source_flag, set VUI
field_seq_flag, and carry picture timing SEI marking field pictures;
everything else, including hwaccel decoding, is unchanged (the geometry is
re-exported after get_format() so it follows the hwaccel decision). When
combining, the exported coded/display height is doubled and the frame rate
halved so callers see the woven geometry.

The pair is finalized at the second field's frame_start: the leader gets its
OUTPUT flag, and the woven frame its flags, doubled crop and duration, and the
leader's side data, all before ff_thread_finish_setup(). This makes the scheme
work under frame threading: the DPB state every later worker snapshots is
consistent, so completed pairs are emitted at sequence boundaries and at EOF.
The woven frame cannot reach the caller before both fields are decoded,
because frames are returned by the decode call that bumps them from the DPB
and the caller receives a worker's output only after that worker and all
earlier ones (including the leader's) have finished; the pending leader is
carried across worker contexts in update_thread_context(). Output is
byte-identical to single-threaded decoding and deterministic over repeated
runs, verified for TFF and BFF streams in 8-bit 4:2:0 and 10-bit 4:2:2.

An unpaired field is dropped with a warning. Film grain synthesis is not
applied to woven frames. verify_md5() uses the frame's own (field) dimensions
instead of the doubled avctx coded size.

fate-hevc-paired-fields now outputs two full-height frames instead of four
fields; extend it to also check the frame dimensions. Add
fate-hevc-paired-fields-420, an 8-bit 4:2:0 stream signalling bare pic_struct
1/2, which exercises field-order inference from the IRAP anchor; it needs the
new paired_fields_420.hevc sample.

Signed-off-by: Torbjörn Einarsson <[email protected]>
---
 libavcodec/hevc/hevcdec.c             | 274 +++++++++++++++++++++++++-
 libavcodec/hevc/hevcdec.h             |  36 ++++
 libavcodec/hevc/refs.c                |  75 ++++++-
 tests/fate/hevc.mak                   |   5 +-
 tests/ref/fate/hevc-paired-fields     |  12 +-
 tests/ref/fate/hevc-paired-fields-420 |  14 ++
 6 files changed, 397 insertions(+), 19 deletions(-)
 create mode 100644 tests/ref/fate/hevc-paired-fields-420

diff --git a/libavcodec/hevc/hevcdec.c b/libavcodec/hevc/hevcdec.c
index b4c2d82e8d..a0215dec65 100644
--- a/libavcodec/hevc/hevcdec.c
+++ b/libavcodec/hevc/hevcdec.c
@@ -329,18 +329,40 @@ static int decode_lt_rps(const HEVCSPS *sps, LongTermRPS *rps,
     return 0;
 }
 
+/*
+ * combine_fields: whether the woven full-height geometry should be advertised
+ * for this SPS. Fields are only combined in software decoding of an
+ * interlaced-source stream whose coded pictures are individual fields
+ * (VUI field_seq_flag, D.3.3); a frame-coded interlaced stream must not have
+ * its geometry doubled. Advertising doubled dimensions while emitting
+ * per-field frames would corrupt buffer allocation, so the predicate must
+ * match the runtime combine gate (combine_decide_role() additionally requires
+ * per-picture timing SEI).
+ */
+static int combine_geometry_active(const HEVCContext *s, const HEVCSPS *sps)
+{
+    const PTLCommon *ptl = &sps->ptl.general_ptl;
+
+    return !s->avctx->hwaccel && sps->vui.field_seq_flag &&
+           ptl->interlaced_source_flag && !ptl->progressive_source_flag;
+}
+
 static void export_stream_params(HEVCContext *s, const HEVCSPS *sps)
 {
     AVCodecContext *avctx = s->avctx;
     const HEVCVPS *vps = sps->vps;
     const HEVCWindow *ow = &sps->output_window;
     unsigned int num = 0, den = 0;
+    /* combine_fields weaves two field pictures into one frame of twice the
+     * height at half the field rate; advertise that geometry to the caller. */
+    const int combine = combine_geometry_active(s, sps);
+    const int vmul = combine ? 2 : 1;
 
     avctx->pix_fmt = sps->pix_fmt;
     avctx->coded_width = sps->width;
-    avctx->coded_height = sps->height;
+    avctx->coded_height = sps->height * vmul;
     avctx->width = sps->width - ow->left_offset - ow->right_offset;
-    avctx->height = sps->height - ow->top_offset - ow->bottom_offset;
+    avctx->height = (sps->height - ow->top_offset - ow->bottom_offset) * vmul;
     avctx->has_b_frames = sps->temporal_layer[sps->max_sub_layers - 1].num_reorder_pics;
     avctx->profile = sps->ptl.general_ptl.profile_idc;
     avctx->level = sps->ptl.general_ptl.level_idc;
@@ -381,8 +403,11 @@ static void export_stream_params(HEVCContext *s, const HEVCSPS *sps)
     }
 
     if (num > 0 && den > 0)
+        /* num is num_units_in_tick (the framerate denominator), den is
+         * time_scale (the numerator); combining two fields into one frame
+         * halves the rate, i.e. doubles the tick count per output frame. */
         av_reduce(&avctx->framerate.den, &avctx->framerate.num,
-                  num, den, 1 << 30);
+                  num * vmul, den, 1 << 30);
 }
 
 static int export_stream_params_from_sei(HEVCContext *s)
@@ -3194,9 +3219,159 @@ static int find_finish_setup_nal(const HEVCContext *s)
     return nal_idx;
 }
 
+/*
+ * combine_fields: decide the pairing role of the field picture about to be
+ * started. Returns HEVC_COMBINE_LEADER for the first field of a pair (which
+ * allocates the full-height buffer), HEVC_COMBINE_FOLLOWER for the second field
+ * (which completes a pending pair), or HEVC_COMBINE_NONE otherwise (combining
+ * disabled, progressive content, hwaccel, non-base layer, or no usable picture
+ * timing SEI). On a non-NONE return *bottom is set to the field's spatial
+ * parity: 1 if it occupies the bottom (odd) lines of the woven frame, 0 for the
+ * top (even) lines. The leader/follower role follows decode order and is
+ * independent of parity, so both top-field-first (leader = top) and
+ * bottom-field-first (leader = bottom) streams are handled.
+ *
+ * The pairing direction comes from the picture timing pic_struct (H.265 Table
+ * D.2): values 11/12 are a top/bottom field paired with the *next* field in
+ * output order (the pair leader), 9/10 a top/bottom field paired with the
+ * *previous* field (the follower). Bare top/bottom fields (1/2) carry no
+ * pairing hint; their role is taken from the field order locked at the first
+ * IRAP field of the sequence (an IRAP starts a frame, hence leads its pair, so
+ * its parity reveals whether the stream is top- or bottom-field-first). Before
+ * the first IRAP the order is provisionally seeded from decode order.
+ */
+static int combine_decide_role(HEVCContext *s, const HEVCLayerContext *l,
+                               int *bottom)
+{
+    const HEVCSEIPictureTiming *pt = &s->sei.picture_timing;
+
+    *bottom = 0;
+
+    if (l != &s->layers[0] || !combine_geometry_active(s, l->sps))
+        return HEVC_COMBINE_NONE;
+
+    if (!pt->present || pt->source_scan_type == 1 /* progressive */)
+        return HEVC_COMBINE_NONE;
+
+    switch (pt->pic_struct) {
+    case 11: /* top field, paired with next bottom field -> leader, top */
+        *bottom = 0;
+        return HEVC_COMBINE_LEADER;
+    case 12: /* bottom field, paired with next top field -> leader, bottom */
+        *bottom = 1;
+        return HEVC_COMBINE_LEADER;
+    case 9: /* top field, paired with previous bottom field -> follower, top */
+        *bottom = 0;
+        return s->combine_leader ? HEVC_COMBINE_FOLLOWER : HEVC_COMBINE_NONE;
+    case 10: /* bottom field, paired with previous top field -> follower, bottom */
+        *bottom = 1;
+        return s->combine_leader ? HEVC_COMBINE_FOLLOWER : HEVC_COMBINE_NONE;
+    case 1: /* bare top field -> role from the locked field order */
+    case 2: /* bare bottom field -> role from the locked field order */
+    {
+        int parity = pt->pic_struct == 2; /* spatial parity: 0 top, 1 bottom */
+        *bottom = parity;
+
+        /* An IRAP field necessarily starts a frame, so it is a pair leader,
+         * and its parity fixes the field order for the sequence (top -> TFF,
+         * bottom -> BFF). Real interlaced streams place the IRAP on the first
+         * field only, so this seeds and re-anchors the pairing phase, also
+         * recovering from any earlier dropped field. */
+        if (IS_IRAP(s)) {
+            s->combine_lead_bottom = parity;
+            return HEVC_COMBINE_LEADER;
+        }
+
+        /* Field order known: a field matching the leading parity opens a pair,
+         * the opposite parity closes the pending one. */
+        if (s->combine_lead_bottom >= 0) {
+            if (parity == s->combine_lead_bottom)
+                return HEVC_COMBINE_LEADER;
+            return s->combine_leader ? HEVC_COMBINE_FOLLOWER : HEVC_COMBINE_NONE;
+        }
+
+        /* Field order not yet established (entry before the first IRAP): with
+         * no leader pending this field opens the pair and provisionally seeds
+         * the order; the next IRAP corrects it if the guess was wrong. */
+        if (!s->combine_leader) {
+            s->combine_lead_bottom = parity;
+            return HEVC_COMBINE_LEADER;
+        }
+        return HEVC_COMBINE_FOLLOWER;
+    }
+    }
+
+    return HEVC_COMBINE_NONE;
+}
+
+/*
+ * combine_fields: the second field of the pair has started decoding into the
+ * shared buffer, so finalize the full-height "combined" frame the leader owns
+ * and flag the leader for output. The combined frame already carries pts/
+ * duration/SAR/colorimetry from its own get_buffer() at leader frame start;
+ * here we set the per-frame properties the decoder normally fills and fix up
+ * the geometry that differs from a single field.
+ */
+static int combine_finalize(HEVCFrame *leader, int tff)
+{
+    AVFrame *combined = leader->combined;
+
+    combined->pict_type = leader->f->pict_type;
+    combined->flags |= AV_FRAME_FLAG_INTERLACED;
+    if (tff)
+        combined->flags |= AV_FRAME_FLAG_TOP_FIELD_FIRST;
+    else
+        combined->flags &= ~AV_FRAME_FLAG_TOP_FIELD_FIRST;
+    if (leader->f->flags & AV_FRAME_FLAG_KEY)
+        combined->flags |= AV_FRAME_FLAG_KEY;
+
+    /* one field line crops to two frame lines; columns are unaffected */
+    combined->crop_top = leader->f->crop_top * 2;
+    combined->crop_bottom = leader->f->crop_bottom * 2;
+    combined->crop_left = leader->f->crop_left;
+    combined->crop_right = leader->f->crop_right;
+
+    if (combined->duration > 0)
+        combined->duration *= 2;
+
+    /* per-frame SEI side data (A53 CC, HDR metadata, ...) was attached to the
+     * leader's field view in set_side_data(); the woven frame only has the
+     * packet-level properties from get_buffer() */
+    for (int i = 0; i < leader->f->nb_side_data; i++) {
+        int ret = av_frame_side_data_clone(&combined->side_data,
+                                           &combined->nb_side_data,
+                                           leader->f->side_data[i],
+                                           AV_FRAME_SIDE_DATA_FLAG_UNIQUE);
+        if (ret < 0)
+            return ret;
+    }
+
+    leader->flags |= HEVC_FRAME_FLAG_OUTPUT;
+
+    return 0;
+}
+
+/*
+ * combine_fields: drop a pending pair leader whose second field can no longer
+ * arrive (sequence boundary or another leader showing up). The half-filled
+ * woven buffer is never output; the field itself stays in the DPB as a
+ * reference until it ages out.
+ */
+static void combine_drop_leader(HEVCContext *s)
+{
+    if (s->combine_leader) {
+        av_log(s->avctx, AV_LOG_WARNING,
+               "combine_fields: dropping unpaired field with POC %d\n",
+               s->combine_leader->poc);
+        s->combine_leader = NULL;
+    }
+}
+
 static int hevc_frame_start(HEVCContext *s, HEVCLayerContext *l,
                             unsigned nal_idx)
 {
+    int combine_role = HEVC_COMBINE_NONE;
+    int combine_bottom = 0;
     const HEVCPPS *const pps = s->ps.pps_list[s->sh.pps_id];
     const HEVCSPS *const sps = pps->sps;
     int pic_size_in_ctb = ((sps->width >> sps->log2_min_cb_size) + 1) *
@@ -3248,6 +3423,7 @@ static int hevc_frame_start(HEVCContext *s, HEVCLayerContext *l,
             }
         }
 
+        combine_drop_leader(s);
         ff_hevc_clear_refs(l);
 
         ret = set_sps(s, l, sps);
@@ -3263,6 +3439,16 @@ static int hevc_frame_start(HEVCContext *s, HEVCLayerContext *l,
                 return ret;
             }
 
+            /* get_format() may have enabled a hwaccel, which disables field
+             * combining; re-export so the advertised geometry matches the
+             * per-field frames that will be produced, keeping the negotiated
+             * pixel format */
+            if (s->avctx->hwaccel) {
+                enum AVPixelFormat pix_fmt = s->avctx->pix_fmt;
+                export_stream_params(s, sps);
+                s->avctx->pix_fmt = pix_fmt;
+            }
+
             new_sequence = 1;
         }
     }
@@ -3273,8 +3459,10 @@ static int hevc_frame_start(HEVCContext *s, HEVCLayerContext *l,
     memset(l->is_pcm, 0, (sps->min_pu_width + 1) * (sps->min_pu_height + 1));
     memset(l->tab_slice_address, -1, pic_size_in_ctb * sizeof(*l->tab_slice_address));
 
-    if (IS_IDR(s))
+    if (IS_IDR(s)) {
+        combine_drop_leader(s);
         ff_hevc_clear_refs(l);
+    }
 
     s->slice_idx = 0;
     s->first_nal_type = s->nal_unit_type;
@@ -3319,10 +3507,41 @@ static int hevc_frame_start(HEVCContext *s, HEVCLayerContext *l,
     if (ret < 0)
         return ret;
 
+    combine_role = combine_decide_role(s, l, &combine_bottom);
+    s->combine_field_role = combine_role;
+    s->combine_field_bottom = combine_bottom;
     ret = ff_hevc_set_new_ref(s, l, s->poc);
+    s->combine_field_role = HEVC_COMBINE_NONE;
+    s->combine_field_bottom = 0;
     if (ret < 0)
         goto fail;
 
+    if (combine_role == HEVC_COMBINE_LEADER) {
+        /* a still-pending leader means the previous pair never received its
+         * second field */
+        combine_drop_leader(s);
+        /* hold the leader's output until the matching second field arrives */
+        s->combine_leader = s->cur_frame;
+        s->combine_leader_bottom = combine_bottom;
+        s->cur_frame->flags &= ~HEVC_FRAME_FLAG_OUTPUT;
+    } else if (combine_role == HEVC_COMBINE_FOLLOWER) {
+        /* The second field completes the pair: release the woven frame for
+         * output, ordered by the leader's POC. Its samples are only fully
+         * decoded once this field finishes, but it cannot reach the caller
+         * earlier: output frames are returned by the decode call that bumps
+         * them from the DPB, and under frame threading the caller receives a
+         * worker's frames only after that worker (and all earlier ones,
+         * including the leader's) fully finished. Finalizing here, before
+         * ff_thread_finish_setup(), keeps the DPB state that later workers
+         * snapshot consistent, so the pair is also emitted when a sequence
+         * boundary or EOF follows immediately. */
+        s->cur_frame->flags &= ~HEVC_FRAME_FLAG_OUTPUT;
+        ret = combine_finalize(s->combine_leader, !s->combine_leader_bottom);
+        s->combine_leader = NULL;
+        if (ret < 0)
+            goto fail;
+    }
+
     ret = ff_hevc_frame_rps(s, l);
     if (ret < 0) {
         av_log(s->avctx, AV_LOG_ERROR, "Error constructing the frame RPS.\n");
@@ -3340,6 +3559,11 @@ static int hevc_frame_start(HEVCContext *s, HEVCLayerContext *l,
                              !(s->avctx->export_side_data & AV_CODEC_EXPORT_DATA_FILM_GRAIN) &&
                              !s->avctx->hwaccel;
 
+    /* film grain would have to be applied to the woven buffer, not the field
+     * views; not supported when combining fields */
+    if (combine_role != HEVC_COMBINE_NONE)
+        s->cur_frame->needs_fg = 0;
+
     ret = set_side_data(s);
     if (ret < 0)
         goto fail;
@@ -3395,6 +3619,8 @@ static int hevc_frame_start(HEVCContext *s, HEVCLayerContext *l,
     return 0;
 
 fail:
+    if (s->combine_leader == l->cur_frame)
+        s->combine_leader = NULL;
     if (l->cur_frame)
         ff_hevc_unref_frame(l->cur_frame, ~0);
     l->cur_frame = NULL;
@@ -3429,8 +3655,10 @@ static int verify_md5(HEVCContext *s, AVFrame *frame)
 
     msg_buf[0] = '\0';
     for (i = 0; frame->data[i]; i++) {
-        int width = s->avctx->coded_width;
-        int height = s->avctx->coded_height;
+        /* the frame's own dimensions, not avctx->coded_*: when combining
+         * fields the checksum covers the half-height field view */
+        int width = frame->width;
+        int height = frame->height;
         int w = (i == 1 || i == 2) ? (width >> desc->log2_chroma_w) : width;
         int h = (i == 1 || i == 2) ? (height >> desc->log2_chroma_h) : height;
         uint8_t md5[16];
@@ -3899,6 +4127,24 @@ static int hevc_ref_frame(HEVCFrame *dst, const HEVCFrame *src)
         dst->needs_fg = 1;
     }
 
+    /* combine_fields: the leader's full-height woven buffer must travel with the
+     * frame across thread contexts (the follower writes into it and it is the
+     * frame finally output). It is a plain refcounted AVFrame, so give each DPB
+     * entry its own reference rather than sharing the pointer (which would
+     * double-free in ff_hevc_unref_frame). */
+    if (src->combined) {
+        dst->combined = av_frame_alloc();
+        if (!dst->combined) {
+            ff_hevc_unref_frame(dst, ~0);
+            return AVERROR(ENOMEM);
+        }
+        ret = av_frame_ref(dst->combined, src->combined);
+        if (ret < 0) {
+            ff_hevc_unref_frame(dst, ~0);
+            return ret;
+        }
+    }
+
     dst->pps = av_refstruct_ref_c(src->pps);
     dst->tab_mvf = av_refstruct_ref(src->tab_mvf);
     dst->rpl_tab = av_refstruct_ref(src->rpl_tab);
@@ -3999,6 +4245,7 @@ static av_cold int hevc_init_context(AVCodecContext *avctx)
     s->dovi_ctx.logctx = avctx;
 
     s->eos = 0;
+    s->combine_lead_bottom = -1;
 
     ff_hevc_reset_sei(&s->sei);
 
@@ -4048,6 +4295,19 @@ static int hevc_update_thread_context(AVCodecContext *dst,
     s->eos = s0->eos;
     s->no_rasl_output_flag = s0->no_rasl_output_flag;
 
+    /* combine_fields: carry the pending pair leader to the worker that will
+     * decode the follower. The DPB was copied index-for-index above, so the
+     * leader lives at the same slot; re-point into this context's base-layer
+     * DPB. */
+    s->combine_leader_bottom = s0->combine_leader_bottom;
+    s->combine_lead_bottom = s0->combine_lead_bottom;
+    if (s0->combine_leader) {
+        size_t idx = s0->combine_leader - s0->layers[0].DPB;
+        s->combine_leader = &s->layers[0].DPB[idx];
+    } else {
+        s->combine_leader = NULL;
+    }
+
     s->is_nalff = s0->is_nalff;
     s->nal_length_size = s0->nal_length_size;
     s->layers_active_decode = s0->layers_active_decode;
@@ -4187,6 +4447,8 @@ static av_cold int hevc_decode_init(AVCodecContext *avctx)
 static av_cold void hevc_decode_flush(AVCodecContext *avctx)
 {
     HEVCContext *s = avctx->priv_data;
+    s->combine_leader = NULL;
+    s->combine_lead_bottom = -1;
     ff_hevc_flush_dpb(s);
     ff_hevc_reset_sei(&s->sei);
     ff_dovi_ctx_flush(&s->dovi_ctx);
diff --git a/libavcodec/hevc/hevcdec.h b/libavcodec/hevc/hevcdec.h
index 8394740c4b..21460e19b3 100644
--- a/libavcodec/hevc/hevcdec.h
+++ b/libavcodec/hevc/hevcdec.h
@@ -357,6 +357,17 @@ typedef struct DBParams {
 #define HEVC_FRAME_FLAG_UNAVAILABLE (1 << 3)
 #define HEVC_FRAME_FLAG_CORRUPT (1 << 4)

+/* combine_fields: pairing role of the field picture currently being started,
+ * used to decide how its decode buffer is allocated (see alloc_frame()). The
+ * role reflects decode order (leader is the first field of the pair), which is
+ * independent of spatial parity: for top-field-first content the leader is the
+ * top field, for bottom-field-first content it is the bottom field. The spatial
+ * parity (which lines of the woven frame a field writes) is carried separately
+ * by combine_field_bottom. */
+#define HEVC_COMBINE_NONE 0
+#define HEVC_COMBINE_LEADER 1 ///< first field of the pair: owns the full-height buffer
+#define HEVC_COMBINE_FOLLOWER 2 ///< second field of the pair: a view into the leader's buffer
+
 typedef struct HEVCFrame {
     union {
         struct {
@@ -376,6 +387,15 @@ typedef struct HEVCFrame {
     RefPicListTab *rpl; ///< RefStruct reference
     int nb_rpl_elems;
 
+    /**
+     * combine_fields: for the leader (the first field, in decode order) of a
+     * combined field pair, this is the full-height AVFrame that both fields are
+     * decoded into and which is output once as a single full-height frame. f is
+     * a doubled-stride view into this buffer. NULL for followers and for
+     * normally-decoded frames.
+     */
+    AVFrame *combined;
+
     void *hwaccel_picture_private; ///< RefStruct reference
 
     // for secondary-layer frames, this is the DPB index of the base-layer frame
@@ -561,6 +581,22 @@ typedef struct HEVCContext {
                             ///< as a format defined in 14496-15
     int apply_defdispwin;
 
+    /// while mid-pair, the leader (first) field frame the follower combines into
+    HEVCFrame *combine_leader;
+    /// spatial parity of the held leader: 1 if it writes the bottom (odd) lines,
+    /// i.e. the pair is bottom-field-first; used to set TOP_FIELD_FIRST
+    int combine_leader_bottom;
+    /// transient HEVC_COMBINE_* role for the frame currently being allocated
+    int combine_field_role;
+    /// transient spatial parity (1 = bottom/odd lines) of the frame being allocated
+    int combine_field_bottom;
+    /// combine_fields: locked field order for bare top/bottom fields (pic_struct
+    /// 1/2, which carry no pairing hint). Parity of the field that leads each
+    /// pair, matching combine_leader_bottom: -1 unknown, 0 top-field-first,
+    /// 1 bottom-field-first. Set from the first IRAP field of the sequence (an
+    /// IRAP starts a frame, hence leads its pair) and re-anchored at every IRAP.
+    int combine_lead_bottom;
+
     // multi-layer AVOptions
     int *view_ids;
     unsigned nb_view_ids;
diff --git a/libavcodec/hevc/refs.c b/libavcodec/hevc/refs.c
index 2acffd72d9..17b9810ef8 100644
--- a/libavcodec/hevc/refs.c
+++ b/libavcodec/hevc/refs.c
@@ -39,6 +39,7 @@ void ff_hevc_unref_frame(HEVCFrame *frame, int flags)
         frame->flags = 0;
     if (!frame->flags) {
         ff_progress_frame_unref(&frame->tf);
+        av_frame_free(&frame->combined);
         av_frame_unref(frame->frame_grain);
         frame->needs_fg = 0;
 
@@ -107,6 +108,63 @@ static int replace_alpha_plane(AVFrame *alpha, AVFrame *base)
     return AVERROR_BUG;
 }
 
+/*
+ * combine_fields: set up frame->f as a doubled-stride field view into a
+ * full-height "combined" buffer that holds both fields of an interlaced pair.
+ *
+ * For the pair leader (the first field in decode order) a fresh full-height
+ * (2*sps->height) AVFrame is allocated and stored in frame->combined; for the
+ * follower the leader's combined buffer is shared. Independently of which field
+ * leads, the spatial parity selects the lines written: a top field maps the
+ * even lines, a bottom field the odd lines (data offset by one line). In both
+ * cases f keeps the field's logical dimensions (sps->width x sps->height); only
+ * data[] and linesize[] describe the interleaved view, so reconstruction, SAO,
+ * deblocking and motion compensation all operate in field geometry and
+ * naturally write every other line of the shared frame.
+ */
+static int combine_setup_field_view(HEVCContext *s, HEVCLayerContext *l,
+                                    HEVCFrame *frame, int is_leader, int bottom)
+{
+    AVFrame *f = frame->f;
+    AVFrame *combined;
+    int ret, i;
+
+    if (is_leader) {
+        combined = av_frame_alloc();
+        if (!combined)
+            return AVERROR(ENOMEM);
+        combined->format = s->avctx->pix_fmt;
+        combined->width = l->sps->width;
+        combined->height = l->sps->height * 2;
+        ret = ff_thread_get_buffer(s->avctx, combined, AV_GET_BUFFER_FLAG_REF);
+        if (ret < 0) {
+            av_frame_free(&combined);
+            return ret;
+        }
+        frame->combined = combined;
+    } else {
+        if (!s->combine_leader || !s->combine_leader->combined)
+            return AVERROR_BUG;
+        combined = s->combine_leader->combined;
+    }
+
+    f->format = combined->format;
+    f->width = l->sps->width;
+    f->height = l->sps->height;
+    for (i = 0; i < FF_ARRAY_ELEMS(f->buf) && combined->buf[i]; i++) {
+        f->buf[i] = av_buffer_ref(combined->buf[i]);
+        if (!f->buf[i])
+            return AVERROR(ENOMEM);
+    }
+    for (i = 0; i < FF_ARRAY_ELEMS(f->data) && combined->data[i]; i++) {
+        f->data[i] = combined->data[i] + (bottom ? combined->linesize[i] : 0);
+        f->linesize[i] = combined->linesize[i] * 2;
+    }
+    f->extended_data = f->data;
+
+    return 0;
+}
+
 static HEVCFrame *alloc_frame(HEVCContext *s, HEVCLayerContext *l)
 {
     const HEVCVPS *vps = l->sps->vps;
@@ -157,9 +215,17 @@ static HEVCFrame *alloc_frame(HEVCContext *s, HEVCLayerContext *l)
             }
         }
 
-        ret = ff_thread_get_buffer(s->avctx, frame->f, AV_GET_BUFFER_FLAG_REF);
-        if (ret < 0)
-            goto fail;
+        if (s->combine_field_role != HEVC_COMBINE_NONE) {
+            ret = combine_setup_field_view(s, l, frame,
+                                           s->combine_field_role == HEVC_COMBINE_LEADER,
+                                           s->combine_field_bottom);
+            if (ret < 0)
+                goto fail;
+        } else {
+            ret = ff_thread_get_buffer(s->avctx, frame->f, AV_GET_BUFFER_FLAG_REF);
+            if (ret < 0)
+                goto fail;
+        }
 
         size_t rpl_bytes;
         if (av_size_mult(s->pkt.nb_nals, sizeof(*frame->rpl), &rpl_bytes) < 0)
@@ -303,7 +369,8 @@ int ff_hevc_output_frames(HEVCContext *s,
             (nb_output && (nb_dpb[0] > max_dpb || nb_dpb[1] > max_dpb))) {
             HEVCFrame *frame = &s->layers[min_layer].DPB[min_idx];
-            AVFrame *f = frame->needs_fg ? frame->frame_grain : frame->f;
+            AVFrame *f = frame->needs_fg ? frame->frame_grain :
+                         frame->combined ? frame->combined : frame->f;
             int output = !discard && (layers_active_output & (1 << min_layer));
 
             if (output) {
diff --git a/tests/fate/hevc.mak b/tests/fate/hevc.mak
index 7ce9ff403b..8045c99c8e 100644
--- a/tests/fate/hevc.mak
+++ b/tests/fate/hevc.mak
@@ -266,9 +266,12 @@ FATE_HEVC-$(call FRAMEMD5, HEVC, HEVC, HEVC_PARSER) += fate-hevc-skiploopfilter
 FATE_HEVC-$(call FRAMEMD5, MOV, HEVC, SCALE_FILTER) += fate-hevc-extradata-reload
 fate-hevc-extradata-reload: CMD = framemd5 -i $(TARGET_SAMPLES)/hevc/extradata-reload-multi-stsd.mov -sws_flags bitexact
 
-fate-hevc-paired-fields: CMD = probeframes -show_entries frame=interlaced_frame,top_field_first $(TARGET_SAMPLES)/hevc/paired_fields.hevc
+fate-hevc-paired-fields: CMD = probeframes -show_entries frame=width,height,interlaced_frame,top_field_first $(TARGET_SAMPLES)/hevc/paired_fields.hevc
 FATE_HEVC_FFPROBE-$(call DEMDEC, HEVC, HEVC) += fate-hevc-paired-fields
 
+fate-hevc-paired-fields-420: CMD = probeframes -show_entries frame=width,height,pix_fmt,interlaced_frame,top_field_first $(TARGET_SAMPLES)/hevc/paired_fields_420.hevc
+FATE_HEVC_FFPROBE-$(call DEMDEC, HEVC, HEVC) += fate-hevc-paired-fields-420
+
 fate-hevc-monochrome-crop: CMD = probeframes -show_entries frame=width,height:stream=width,height $(TARGET_SAMPLES)/hevc/hevc-monochrome.hevc
 FATE_HEVC_FFPROBE-$(call PARSERDEMDEC, HEVC, HEVC, HEVC) += fate-hevc-monochrome-crop
 
diff --git a/tests/ref/fate/hevc-paired-fields b/tests/ref/fate/hevc-paired-fields
index f2223e770b..ace522cc9b 100644
--- a/tests/ref/fate/hevc-paired-fields
+++ b/tests/ref/fate/hevc-paired-fields
@@ -1,16 +1,12 @@
 [FRAME]
+width=1920
+height=1080
 interlaced_frame=1
 top_field_first=1
 [/FRAME]
 [FRAME]
-interlaced_frame=1
-top_field_first=0
-[/FRAME]
-[FRAME]
+width=1920
+height=1080
 interlaced_frame=1
 top_field_first=1
 [/FRAME]
-[FRAME]
-interlaced_frame=1
-top_field_first=0
-[/FRAME]
diff --git a/tests/ref/fate/hevc-paired-fields-420 b/tests/ref/fate/hevc-paired-fields-420
new file mode 100644
index 0000000000..d667e26079 --- /dev/null +++ b/tests/ref/fate/hevc-paired-fields-420 @@ -0,0 +1,14 @@ +[FRAME] +width=1920 +height=1080 +pix_fmt=yuv420p +interlaced_frame=1 +top_field_first=1 +[/FRAME] +[FRAME] +width=1920 +height=1080 +pix_fmt=yuv420p +interlaced_frame=1 +top_field_first=1 +[/FRAME] -- 2.52.0 From 370460d5a79be5632f45eee51af28e5c4e4ca474 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Torbjo=CC=88rn=20Einarsson?= <[email protected]> Date: Fri, 12 Jun 2026 00:04:59 +0200 Subject: [PATCH 4/4] avcodec/hevc: mark HEVC as field-based and report field-unit repeat_pict MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Add AV_CODEC_PROP_FIELDS to the HEVC codec descriptor and make the parser report repeat_pict in field periods minus one (0 field, 1 frame, 2 frame with a repeated field, 3 frame doubling, 5 frame tripling) and avctx->framerate as the frame rate: the HEVC VUI tick is per coded picture, which for field_seq streams is the field rate, so it is halved for field pictures. This mirrors the H.264 parser. libavformat then derives correct packet durations for field-coded streams and estimates e.g. avg_frame_rate 25/1 with r_frame_rate 50/1 for a 1080i50 stream in MPEG-TS, the same presentation as for interlaced H.264, instead of reporting the field rate as the frame rate. 
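For illustration only, the repeat_pict mapping and the frame-rate halving described above can be sketched in standalone C. This is a minimal sketch, not the parser code: the sketch_* names and the SketchRational type are hypothetical, and the pic_struct values are taken from the list in this message.

```c
#include <stdint.h>

/* Hypothetical helper, not part of FFmpeg: picture duration in field
 * periods minus one, following the mapping listed in this message. */
static int sketch_repeat_pict(int pt_present, int pic_struct)
{
    if (!pt_present)
        return 1;                 /* no picture timing SEI: assume a frame */
    switch (pic_struct) {
    case 1: case 2: case 9: case 10: case 11: case 12:
        return 0;                 /* individual field */
    case 5: case 6:
        return 2;                 /* field pair with one field repeated */
    case 7:
        return 3;                 /* frame doubling */
    case 8:
        return 5;                 /* frame tripling */
    default:
        return 1;                 /* frame */
    }
}

/* Hypothetical rational type; the value is left unreduced. */
typedef struct SketchRational {
    int64_t num, den;
} SketchRational;

/* The VUI tick is per coded picture, so a field picture doubles the
 * number of ticks per frame and thereby halves the reported rate. */
static SketchRational sketch_frame_rate(uint32_t num_units_in_tick,
                                        uint32_t time_scale, int is_field)
{
    SketchRational r;
    r.num = time_scale;
    r.den = (int64_t)num_units_in_tick * (is_field ? 2 : 1);
    return r;
}
```

With num_units_in_tick = 1 and time_scale = 50, a field picture gives 50/2, i.e. 25 frames per second, which matches the 1080i50 example above.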
Signed-off-by: Torbjörn Einarsson <[email protected]> --- libavcodec/codec_desc.c | 3 ++- libavcodec/hevc/parser.c | 41 ++++++++++++++++++++++++++++++++++++---- 2 files changed, 39 insertions(+), 5 deletions(-) diff --git a/libavcodec/codec_desc.c b/libavcodec/codec_desc.c index 81c095bea7..46c01ca14c 100644 --- a/libavcodec/codec_desc.c +++ b/libavcodec/codec_desc.c @@ -1268,7 +1268,8 @@ static const AVCodecDescriptor codec_descriptors[] = { .type = AVMEDIA_TYPE_VIDEO, .name = "hevc", .long_name = NULL_IF_CONFIG_SMALL("H.265 / HEVC (High Efficiency Video Coding)"), - .props = AV_CODEC_PROP_LOSSY | AV_CODEC_PROP_LOSSLESS | AV_CODEC_PROP_REORDER, + .props = AV_CODEC_PROP_LOSSY | AV_CODEC_PROP_LOSSLESS | + AV_CODEC_PROP_REORDER | AV_CODEC_PROP_FIELDS, .profiles = NULL_IF_CONFIG_SMALL(ff_hevc_profiles), }, { diff --git a/libavcodec/hevc/parser.c b/libavcodec/hevc/parser.c index 508d093807..51c605cb4b 100644 --- a/libavcodec/hevc/parser.c +++ b/libavcodec/hevc/parser.c @@ -87,6 +87,18 @@ static enum AVFieldOrder hevc_field_order(const HEVCSPS *sps, return AV_FIELD_UNKNOWN; } +/* whether the current picture is an individual field (H.265 Table D.2) */ +static int hevc_pic_struct_is_field(const HEVCSEIPictureTiming *pt) +{ + if (!pt->present) + return 0; + switch (pt->pic_struct) { + case 1: case 2: case 9: case 10: case 11: case 12: + return 1; + } + return 0; +} + static int hevc_parse_slice_header(AVCodecParserContext *s, H2645NAL *nal, AVCodecContext *avctx) { @@ -139,8 +151,12 @@ static int hevc_parse_slice_header(AVCodecParserContext *s, H2645NAL *nal, } if (num > 0 && den > 0) + /* Report the frame rate: the VUI tick is per coded picture, so for + * field pictures the tick rate is the field rate and one frame + * spans two ticks (AV_CODEC_PROP_FIELDS semantics, as for H.264). */ av_reduce(&avctx->framerate.den, &avctx->framerate.num, - num, den, 1 << 30); + num * (int64_t)(hevc_pic_struct_is_field(&sei->picture_timing) ? 
2 : 1), + den, 1 << 30); if (!first_slice_in_pic_flag) { unsigned int slice_segment_addr; @@ -271,10 +287,27 @@ static int parse_nal_units(AVCodecParserContext *s, const uint8_t *buf, case HEVC_NAL_RADL_R: case HEVC_NAL_RASL_N: case HEVC_NAL_RASL_R: - if (ctx->sei.picture_timing.picture_struct == HEVC_SEI_PIC_STRUCT_FRAME_DOUBLING) { + /* repeat_pict is the picture's duration in field periods minus + * one (AV_CODEC_PROP_FIELDS semantics, as in the H.264 parser) */ + if (ctx->sei.picture_timing.present) { + switch (ctx->sei.picture_timing.pic_struct) { + case 1: case 2: case 9: case 10: case 11: case 12: + s->repeat_pict = 0; /* individual field */ + break; + case 5: case 6: + s->repeat_pict = 2; /* field pair with one field repeated */ + break; + case 7: + s->repeat_pict = 3; /* frame doubling */ + break; + case 8: + s->repeat_pict = 5; /* frame tripling */ + break; + default: + s->repeat_pict = 1; /* frame */ + } + } else { s->repeat_pict = 1; - } else if (ctx->sei.picture_timing.picture_struct == HEVC_SEI_PIC_STRUCT_FRAME_TRIPLING) { - s->repeat_pict = 2; } ret = hevc_parse_slice_header(s, nal, avctx); if (ret) -- 2.52.0
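As a reading aid for the combine_setup_field_view() hunk earlier in this series, the doubled-stride field view can be sketched outside FFmpeg. This is a minimal single-plane sketch under stated assumptions: the sketch_* names, the SketchFieldView type and the 8x4 toy frame are hypothetical, and no AVFrame or buffer-pool handling is modelled.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical single-plane view: base pointer plus line stride. */
typedef struct SketchFieldView {
    uint8_t  *data;
    ptrdiff_t linesize;
} SketchFieldView;

/* A top field starts at line 0 and a bottom field at line 1 of the
 * full-height plane; both step two full-height lines per field line. */
static SketchFieldView sketch_field_view(uint8_t *full, ptrdiff_t full_linesize,
                                         int bottom)
{
    SketchFieldView v;
    v.data     = full + (bottom ? full_linesize : 0);
    v.linesize = full_linesize * 2;
    return v;
}

/* Stand-in for reconstruction: writes in field geometry only. */
static void sketch_fill_field(SketchFieldView v, int width, int field_height,
                              uint8_t value)
{
    for (int y = 0; y < field_height; y++)
        memset(v.data + y * v.linesize, value, (size_t)width);
}

enum { SKETCH_W = 8, SKETCH_FIELD_H = 2 };

/* Weave a toy 8x4 frame from two 8x2 fields and return the first
 * sample of full-height line y (0x10 = top field, 0x20 = bottom field). */
static int sketch_woven_line_value(int y)
{
    uint8_t full[SKETCH_W * SKETCH_FIELD_H * 2] = { 0 };

    sketch_fill_field(sketch_field_view(full, SKETCH_W, 0),
                      SKETCH_W, SKETCH_FIELD_H, 0x10);
    sketch_fill_field(sketch_field_view(full, SKETCH_W, 1),
                      SKETCH_W, SKETCH_FIELD_H, 0x20);
    return full[y * SKETCH_W];
}
```

Neither fill call knows about the full-height layout; the interleaving comes entirely from the base offset and the doubled stride, which is the property that lets the field pictures be reconstructed unchanged.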
