[FFmpeg-devel] [PR] hevc-interlace (PR #23616)

Torbjörn Einarsson via ffmpeg-devel Sat, 27 Jun 2026 08:16:37 -0700

PR #23616 opened by Torbjörn Einarsson (tobbee)
URL: https://code.ffmpeg.org/FFmpeg/FFmpeg/pulls/23616
Patch URL: https://code.ffmpeg.org/FFmpeg/FFmpeg/pulls/23616.patch


# Summary of changes

Unlike H.264, where field coding (PAFF/MBAFF) is part of the bitstream and
the decoder already returns a complementary field pair as one full-height
frame, HEVC has no field coding at all: interlaced content is just a
sequence of independent half-height field pictures with distinct POCs,
flagged only via the PTL source-scan flags, VUI field_seq_flag and the
picture timing SEI. A plain decoder therefore emits half-height pictures
at field rate. Currently, FFmpeg does not properly handle interlaced
HEVC. This PR makes the decoder weave the two coded fields of a pair into
one full-height frame, bringing field-coded HEVC to parity with H.264
PAFF: a 1080i50 stream decodes to 1920x1080 @ 25fps, flagged INTERLACED
with TOP_FIELD_FIRST set for top-field-first content and cleared for
bottom-field-first.
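
As a rough illustration of the geometry change described above, the following sketch doubles the field height and halves the field rate. `SketchRational`, `SketchGeometry` and `sketch_woven_geometry()` are hypothetical names used only for this example; the patch itself achieves the same effect in export_stream_params() by scaling the height and num_units_in_tick by a factor of 2.

```c
#include <assert.h>

/* Hypothetical rational type for this sketch; FFmpeg itself uses AVRational. */
typedef struct {
    int num, den;
} SketchRational;

/* Hypothetical description of what a caller sees: size and picture rate. */
typedef struct {
    int width, height;
    SketchRational rate; /* pictures per second */
} SketchGeometry;

/* Map per-field geometry to the geometry of the woven frame:
 * same width, twice the height, half the rate. */
static SketchGeometry sketch_woven_geometry(SketchGeometry field)
{
    SketchGeometry frame = field;

    assert(field.rate.num > 0 && field.rate.den > 0);

    frame.height = field.height * 2;
    if (field.rate.num % 2 == 0)
        frame.rate.num = field.rate.num / 2; /* e.g. 50/1 -> 25/1 */
    else
        frame.rate.den = field.rate.den * 2; /* odd numerator: double the denominator */

    return frame;
}
```

For a 1080i50 stream the input would be 1920x540 at 50/1 and the result 1920x1080 at 25/1.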

A previous attempt (Jose Santiago, V1..V7, Oct-Nov 2024) was not merged
because it was considered too complex for "just weave inside a decoder";
this is a deliberately minimal take:

- No copy: the leader (first field in decode order) allocates one full-height
  buffer; both fields are per-plane views into it with doubled linesize
  (top->even lines, bottom->odd). Reconstruction is unchanged; inter-field
  prediction works through the normal POC-based DPB.
- No new subsystem: no construction context, no extra FIFO, no mutex. Pairing
  is a small leader/follower state finalized at the second field's frame_start
  before ff_thread_finish_setup(), so it is correct under frame threading via
  the existing thread-context propagation. Output is byte-identical single- vs
  multi-threaded.
- Contained scope: SW decoding only; hwaccel is untouched (opaque surfaces
  can't be woven via strides), and the hwaccel decision is re-checked after
  get_format().
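
The "no copy" point can be sketched for a single plane as follows. This is a minimal illustration, not the code from the patch: `SketchPlane`, `sketch_field_view()` and `sketch_fill_plane()` are hypothetical names, and the fill routine merely stands in for any reconstruction code that takes the stride as a parameter.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical single-plane descriptor used only in this sketch. */
typedef struct {
    uint8_t  *data;     /* first sample of the plane */
    ptrdiff_t linesize; /* bytes from one line to the next */
    int       height;   /* number of lines */
} SketchPlane;

/* Build a field view into a full-height plane: the top field addresses
 * the even lines, the bottom field the odd lines. No samples are copied. */
static SketchPlane sketch_field_view(SketchPlane full, int bottom)
{
    SketchPlane view;

    assert(full.height % 2 == 0);

    view.data     = full.data + (bottom ? full.linesize : 0);
    view.linesize = full.linesize * 2; /* skip the other field's lines */
    view.height   = full.height / 2;
    return view;
}

/* Stand-in for reconstruction: writes 'width' samples per line using
 * whatever stride the plane carries, unaware of the interleaving. */
static void sketch_fill_plane(SketchPlane p, int width, uint8_t value)
{
    for (int y = 0; y < p.height; y++)
        for (int x = 0; x < width; x++)
            p.data[y * p.linesize + x] = value;
}
```

Filling the top view and then the bottom view of the same buffer leaves the two fields interleaved line by line, which is the woven frame.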

Field order for bare pic_struct 1/2 (no pairing hint) is anchored on IRAP
pictures; the explicit 9..12 hints are handled directly.
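
The pairing rule can be condensed into a small state machine. The sketch below is a simplification under stated assumptions: the `Sketch*` names are hypothetical, and the pending-leader bookkeeping is folded into the same function, whereas the patch keeps it in hevc_frame_start() and also drops an unpaired leader with a warning.

```c
#include <assert.h>

enum SketchRole { SKETCH_NONE, SKETCH_LEADER, SKETCH_FOLLOWER };

/* Hypothetical pairing state for this sketch. */
typedef struct {
    int lead_bottom;    /* -1 unknown, 0 top-field-first, 1 bottom-field-first */
    int leader_pending; /* nonzero while a first field waits for its partner */
} SketchPairState;

/* Decide the role of a field from pic_struct (H.265 Table D.2). */
static enum SketchRole sketch_decide_role(SketchPairState *st,
                                          int pic_struct, int is_irap)
{
    enum SketchRole role = SKETCH_NONE;
    int parity;

    switch (pic_struct) {
    case 11: case 12: /* paired with the next field: leader */
        role = SKETCH_LEADER;
        break;
    case 9: case 10:  /* paired with the previous field: follower */
        role = st->leader_pending ? SKETCH_FOLLOWER : SKETCH_NONE;
        break;
    case 1: case 2:   /* bare field: no pairing hint */
        parity = pic_struct == 2; /* 0 top, 1 bottom */
        if (is_irap) {
            /* an IRAP field leads its pair and anchors the field order */
            st->lead_bottom = parity;
            role = SKETCH_LEADER;
        } else if (st->lead_bottom >= 0) {
            if (parity == st->lead_bottom)
                role = SKETCH_LEADER;
            else
                role = st->leader_pending ? SKETCH_FOLLOWER : SKETCH_NONE;
        } else if (!st->leader_pending) {
            /* before the first IRAP: provisionally seed from decode order */
            st->lead_bottom = parity;
            role = SKETCH_LEADER;
        } else {
            role = SKETCH_FOLLOWER;
        }
        break;
    }

    if (role == SKETCH_LEADER)
        st->leader_pending = 1;
    else if (role == SKETCH_FOLLOWER)
        st->leader_pending = 0;
    return role;
}
```

Starting from `{ -1, 0 }`, an IRAP bottom field (pic_struct 2) locks bottom-field-first order, after which bare fields alternate between leader and follower.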

Testing: full FATE passes with the updated references. The reference for
fate-hevc-paired-fields (10-bit 4:2:2, explicit hints) is updated,
and a new fate-hevc-paired-fields-420 (8-bit 4:2:0, bare pic_struct 1/2)
exercises the IRAP-anchored path. The new test needs the
attached sample.

```fate-samples
hevc/paired_fields_420.hevc
```


From 684914b7b1565fe96ab0c7dc40725b9a19b041c0 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Torbjo=CC=88rn=20Einarsson?= <[email protected]>
Date: Tue, 9 Jun 2026 15:20:50 +0200
Subject: [PATCH 1/4] avcodec/hevc: store raw pic_struct and source_scan_type
 from pic timing SEI
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

When frame_field_info_present_flag is set, the picture timing SEI
carries pic_struct, source_scan_type and duplicate_flag (H.265 D.3.3),
but only the derived picture_struct was kept and the latter two were
left unread. Store the raw pic_struct and source_scan_type, flag the
timing info as present, and read the previously skipped
source_scan_type and duplicate_flag. A following commit uses these to
report the field order.

Signed-off-by: Torbjörn Einarsson <[email protected]>
---
 libavcodec/hevc/sei.c | 4 ++++
 libavcodec/hevc/sei.h | 3 +++
 2 files changed, 7 insertions(+)

diff --git a/libavcodec/hevc/sei.c b/libavcodec/hevc/sei.c
index 83c726a217..70b0a6b17f 100644
--- a/libavcodec/hevc/sei.c
+++ b/libavcodec/hevc/sei.c
@@ -60,6 +60,8 @@ static int decode_nal_sei_pic_timing(HEVCSEI *s, GetBitContext *gb,
 
     if (sps->vui.frame_field_info_present_flag) {
         int pic_struct = get_bits(gb, 4);
+        h->present = 1;
+        h->pic_struct = pic_struct;
         h->picture_struct = AV_PICTURE_STRUCTURE_UNKNOWN;
         if (pic_struct == 2 || pic_struct == 10 || pic_struct == 12) {
             av_log(logctx, AV_LOG_DEBUG, "BOTTOM Field\n");
@@ -74,6 +76,8 @@ static int decode_nal_sei_pic_timing(HEVCSEI *s, GetBitContext *gb,
             av_log(logctx, AV_LOG_DEBUG, "Frame/Field Tripling\n");
             h->picture_struct = HEVC_SEI_PIC_STRUCT_FRAME_TRIPLING;
         }
+        h->source_scan_type = get_bits(gb, 2);
+        skip_bits1(gb); // duplicate_flag
     }
 
     return 0;
diff --git a/libavcodec/hevc/sei.h b/libavcodec/hevc/sei.h
index 59bd2b45f8..55f03e0ede 100644
--- a/libavcodec/hevc/sei.h
+++ b/libavcodec/hevc/sei.h
@@ -52,6 +52,9 @@ typedef struct HEVCSEIFramePacking {
 
 typedef struct HEVCSEIPictureTiming {
     int picture_struct;
+    int pic_struct;        ///< raw pic_struct (H.265 Table D.2), valid if present
+    int source_scan_type;  ///< 0: interlaced, 1: progressive, 2: unknown
+    int present;           ///< a pic_timing SEI with frame_field_info was parsed
 } HEVCSEIPictureTiming;
 
 typedef struct HEVCSEIAlternativeTransfer {
-- 
2.52.0


From 426a5c6af0b6b2748172759c46df8ce8f1f9c844 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Torbjo=CC=88rn=20Einarsson?= <[email protected]>
Date: Tue, 9 Jun 2026 15:20:50 +0200
Subject: [PATCH 2/4] avcodec/hevc: set field_order from picture timing SEI in
 the parser
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

hevc_parse_slice_header assigned the SEI picture_struct, an
AV_PICTURE_STRUCTURE_* value, directly to
AVCodecParserContext.field_order which is an enum AVFieldOrder. The
enums are unrelated, so a top field was reported as "progressive" and
a coded frame as "bb". Derive the field order from the picture timing
SEI pic_struct, falling back to the profile_tier_level source-scan
flags, mirroring the H.264 parser. ffprobe now reports interlaced HEVC
as tt/bb instead of progressive.

Signed-off-by: Torbjörn Einarsson <[email protected]>
---
 libavcodec/hevc/parser.c | 37 ++++++++++++++++++++++++++++++++++++-
 1 file changed, 36 insertions(+), 1 deletion(-)

diff --git a/libavcodec/hevc/parser.c b/libavcodec/hevc/parser.c
index 47a1ac70d0..508d093807 100644
--- a/libavcodec/hevc/parser.c
+++ b/libavcodec/hevc/parser.c
@@ -53,6 +53,40 @@ typedef struct HEVCParserContext {
     int pocTid0;
 } HEVCParserContext;
 
+/*
+ * Derive the stream field order. HEVC has no field coding, so interlace is
+ * carried as metadata: primarily by the pic_struct of the picture timing SEI
+ * (H.265 Table D.2), with the profile_tier_level source-scan flags as a
+ * fallback. This mirrors the H.264 parser (h264_parser.c). Top-field-first
+ * pic_struct values are {1,3,5,9,11}, bottom-field-first {2,4,6,10,12}.
+ */
+static enum AVFieldOrder hevc_field_order(const HEVCSPS *sps,
+                                          const HEVCSEIPictureTiming *pt)
+{
+    const PTLCommon *ptl = &sps->ptl.general_ptl;
+
+    if (pt->present) {
+        switch (pt->pic_struct) {
+        case 1: case 3: case 5: case 9: case 11:
+            return AV_FIELD_TT;
+        case 2: case 4: case 6: case 10: case 12:
+            return AV_FIELD_BB;
+        case 0: case 7: case 8:
+            return AV_FIELD_PROGRESSIVE;
+        }
+        if (pt->source_scan_type == 1)
+            return AV_FIELD_PROGRESSIVE;
+    }
+
+    /* No usable pic_struct: fall back to the profile_tier_level flags. */
+    if (ptl->progressive_source_flag && !ptl->interlaced_source_flag)
+        return AV_FIELD_PROGRESSIVE;
+    if (ptl->interlaced_source_flag && !ptl->progressive_source_flag)
+        return AV_FIELD_TT; /* interlaced; order not signalled, assume top first */
+
+    return AV_FIELD_UNKNOWN;
+}
+
 static int hevc_parse_slice_header(AVCodecParserContext *s, H2645NAL *nal,
                                    AVCodecContext *avctx)
 {
@@ -70,7 +104,6 @@ static int hevc_parse_slice_header(AVCodecParserContext *s, H2645NAL *nal,
 
     first_slice_in_pic_flag = get_bits1(gb);
     s->picture_structure = sei->picture_timing.picture_struct;
-    s->field_order = sei->picture_timing.picture_struct;
 
     if (IS_IRAP_NAL(nal)) {
         s->key_frame = 1;
@@ -85,6 +118,8 @@ static int hevc_parse_slice_header(AVCodecParserContext *s, H2645NAL *nal,
     pps = ps->pps_list[pps_id];
     sps = pps->sps;
 
+    s->field_order = hevc_field_order(sps, &sei->picture_timing);
+
     ow  = &sps->output_window;
 
     s->coded_width  = sps->width;
-- 
2.52.0


From 7faa4fb6d5df9ec6483d55617bf619b27c44d76a Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Torbjo=CC=88rn=20Einarsson?= <[email protected]>
Date: Fri, 12 Jun 2026 00:12:55 +0200
Subject: [PATCH 3/4] avcodec/hevc: combine interlaced field pairs into full
 frames
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

HEVC has no field coding; broadcast-style interlaced content is coded as
a sequence of half-height field pictures, signalled only through the
profile_tier_level source-scan flags, VUI field_seq_flag and the picture
timing SEI. Decoders that present those pictures as-is yield half-height
output at field rate, unlike H.264 PAFF where the two fields of a pair
are returned as one full-height frame.

Decode the two fields of a complementary pair into one shared
full-height buffer and output it once as a woven frame carrying the
INTERLACED and TOP_FIELD_FIRST flags, so field-coded HEVC is presented
like H.264 PAFF: a 1080i50 stream decodes to 1920x1080 frames at 25 fps.
There is no copying: the first field of a pair (the leader, in decode
order) allocates the full-height buffer and both fields' frames become
per-plane views into it with doubled linesize, the top field on even and
the bottom field on odd lines, independently of which one leads, so both
TFF and BFF streams work. The reconstruction code takes strides as
parameters throughout and needs no changes; inter-field prediction works
through the regular POC-based DPB since the fields keep distinct POCs.
Pairing follows the pic_struct pairing hints (Table D.2): 11/12 lead,
9/10 follow; bare 1/2, which carry no pairing hint, take their role from
the field order locked at the first IRAP field of the sequence (an IRAP
starts a frame, so it leads its pair and its parity gives the field
order), falling back to decode order before the first IRAP.

Combining engages only for software decoding of streams that signal
interlaced_source_flag and not progressive_source_flag, set VUI
field_seq_flag, and carry picture timing SEI marking field pictures;
everything else, including hwaccel decoding, is unchanged (the geometry
is re-exported after get_format() so it follows the hwaccel decision).
When combining, the exported coded/display height is doubled and the
frame rate halved so callers see the woven geometry.

The pair is finalized at the second field's frame_start: the leader
gets its OUTPUT flag, and the woven frame its flags, doubled crop and
duration, and the leader's side data, all before
ff_thread_finish_setup(). This makes the scheme work under frame
threading: the DPB state every later worker snapshots is consistent, so
completed pairs are emitted at sequence boundaries and at EOF. The woven
frame cannot reach the caller before both fields are decoded, because
frames are returned by the decode call that bumps them from the DPB and
the caller receives a worker's output only after that worker and all
earlier ones (including the leader's) have finished; the pending leader
is carried across worker contexts in update_thread_context(). Output is
byte-identical to single-threaded decoding and deterministic over
repeated runs, verified for TFF and BFF streams in 8-bit 4:2:0 and
10-bit 4:2:2.

An unpaired field is dropped with a warning. Film grain synthesis is
not applied to woven frames. verify_md5() uses the frame's own (field)
dimensions instead of the doubled avctx coded size.

fate-hevc-paired-fields now outputs two full-height frames instead of
four fields; extend it to also check the frame dimensions. Add
fate-hevc-paired-fields-420, an 8-bit 4:2:0 stream signalling bare
pic_struct 1/2, which exercises field-order inference from the IRAP
anchor; it needs the new paired_fields_420.hevc sample.

Signed-off-by: Torbjörn Einarsson <[email protected]>
---
 libavcodec/hevc/hevcdec.c             | 274 +++++++++++++++++++++++++-
 libavcodec/hevc/hevcdec.h             |  36 ++++
 libavcodec/hevc/refs.c                |  75 ++++++-
 tests/fate/hevc.mak                   |   5 +-
 tests/ref/fate/hevc-paired-fields     |  12 +-
 tests/ref/fate/hevc-paired-fields-420 |  14 ++
 6 files changed, 397 insertions(+), 19 deletions(-)
 create mode 100644 tests/ref/fate/hevc-paired-fields-420

diff --git a/libavcodec/hevc/hevcdec.c b/libavcodec/hevc/hevcdec.c
index b4c2d82e8d..a0215dec65 100644
--- a/libavcodec/hevc/hevcdec.c
+++ b/libavcodec/hevc/hevcdec.c
@@ -329,18 +329,40 @@ static int decode_lt_rps(const HEVCSPS *sps, LongTermRPS *rps,
     return 0;
 }
 
+/*
+ * combine_fields: whether the woven full-height geometry should be advertised
+ * for this SPS. Fields are only combined in software decoding of an
+ * interlaced-source stream whose coded pictures are individual fields
+ * (VUI field_seq_flag, D.3.3); a frame-coded interlaced stream must not have
+ * its geometry doubled. Advertising doubled dimensions while emitting
+ * per-field frames would corrupt buffer allocation, so the predicate must
+ * match the runtime combine gate (combine_decide_role() additionally requires
+ * per-picture timing SEI).
+ */
+static int combine_geometry_active(const HEVCContext *s, const HEVCSPS *sps)
+{
+    const PTLCommon *ptl = &sps->ptl.general_ptl;
+
+    return !s->avctx->hwaccel && sps->vui.field_seq_flag &&
+           ptl->interlaced_source_flag && !ptl->progressive_source_flag;
+}
+
 static void export_stream_params(HEVCContext *s, const HEVCSPS *sps)
 {
     AVCodecContext *avctx = s->avctx;
     const HEVCVPS    *vps = sps->vps;
     const HEVCWindow *ow = &sps->output_window;
     unsigned int num = 0, den = 0;
+    /* combine_fields weaves two field pictures into one frame of twice the
+     * height at half the field rate; advertise that geometry to the caller. */
+    const int combine = combine_geometry_active(s, sps);
+    const int vmul     = combine ? 2 : 1;
 
     avctx->pix_fmt             = sps->pix_fmt;
     avctx->coded_width         = sps->width;
-    avctx->coded_height        = sps->height;
+    avctx->coded_height        = sps->height * vmul;
     avctx->width               = sps->width  - ow->left_offset - ow->right_offset;
-    avctx->height              = sps->height - ow->top_offset  - ow->bottom_offset;
+    avctx->height              = (sps->height - ow->top_offset  - ow->bottom_offset) * vmul;
     avctx->has_b_frames        = sps->temporal_layer[sps->max_sub_layers - 1].num_reorder_pics;
     avctx->profile             = sps->ptl.general_ptl.profile_idc;
     avctx->level               = sps->ptl.general_ptl.level_idc;
@@ -381,8 +403,11 @@ static void export_stream_params(HEVCContext *s, const HEVCSPS *sps)
     }
 
     if (num > 0 && den > 0)
+        /* num is num_units_in_tick (the framerate denominator), den is
+         * time_scale (the numerator); combining two fields into one frame
+         * halves the rate, i.e. doubles the tick count per output frame. */
         av_reduce(&avctx->framerate.den, &avctx->framerate.num,
-                  num, den, 1 << 30);
+                  num * vmul, den, 1 << 30);
 }
 
 static int export_stream_params_from_sei(HEVCContext *s)
@@ -3194,9 +3219,159 @@ static int find_finish_setup_nal(const HEVCContext *s)
     return nal_idx;
 }
 
+/*
+ * combine_fields: decide the pairing role of the field picture about to be
+ * started. Returns HEVC_COMBINE_LEADER for the first field of a pair (which
+ * allocates the full-height buffer), HEVC_COMBINE_FOLLOWER for the second field
+ * (which completes a pending pair), or HEVC_COMBINE_NONE otherwise (combining
+ * disabled, progressive content, hwaccel, non-base layer, or no usable picture
+ * timing SEI). On a non-NONE return *bottom is set to the field's spatial
+ * parity: 1 if it occupies the bottom (odd) lines of the woven frame, 0 for the
+ * top (even) lines. The leader/follower role follows decode order and is
+ * independent of parity, so both top-field-first (leader = top) and
+ * bottom-field-first (leader = bottom) streams are handled.
+ *
+ * The pairing direction comes from the picture timing pic_struct (H.265 Table
+ * D.2): values 11/12 are a top/bottom field paired with the *next* field in
+ * output order (the pair leader), 9/10 a top/bottom field paired with the
+ * *previous* field (the follower). Bare top/bottom fields (1/2) carry no
+ * pairing hint; their role is taken from the field order locked at the first
+ * IRAP field of the sequence (an IRAP starts a frame, hence leads its pair, so
+ * its parity reveals whether the stream is top- or bottom-field-first). Before
+ * the first IRAP the order is provisionally seeded from decode order.
+ */
+static int combine_decide_role(HEVCContext *s, const HEVCLayerContext *l,
+                               int *bottom)
+{
+    const HEVCSEIPictureTiming *pt = &s->sei.picture_timing;
+
+    *bottom = 0;
+
+    if (l != &s->layers[0] || !combine_geometry_active(s, l->sps))
+        return HEVC_COMBINE_NONE;
+
+    if (!pt->present || pt->source_scan_type == 1 /* progressive */)
+        return HEVC_COMBINE_NONE;
+
+    switch (pt->pic_struct) {
+    case 11: /* top field, paired with next bottom field    -> leader, top */
+        *bottom = 0;
+        return HEVC_COMBINE_LEADER;
+    case 12: /* bottom field, paired with next top field     -> leader, bottom */
+        *bottom = 1;
+        return HEVC_COMBINE_LEADER;
+    case 9:  /* top field, paired with previous bottom field -> follower, top */
+        *bottom = 0;
+        return s->combine_leader ? HEVC_COMBINE_FOLLOWER : HEVC_COMBINE_NONE;
+    case 10: /* bottom field, paired with previous top field -> follower, bottom */
+        *bottom = 1;
+        return s->combine_leader ? HEVC_COMBINE_FOLLOWER : HEVC_COMBINE_NONE;
+    case 1:  /* bare top field    -> role from the locked field order */
+    case 2:  /* bare bottom field -> role from the locked field order */
+    {
+        int parity = pt->pic_struct == 2; /* spatial parity: 0 top, 1 bottom */
+        *bottom = parity;
+
+        /* An IRAP field necessarily starts a frame, so it is a pair leader,
+         * and its parity fixes the field order for the sequence (top -> TFF,
+         * bottom -> BFF). Real interlaced streams place the IRAP on the first
+         * field only, so this seeds and re-anchors the pairing phase, also
+         * recovering from any earlier dropped field. */
+        if (IS_IRAP(s)) {
+            s->combine_lead_bottom = parity;
+            return HEVC_COMBINE_LEADER;
+        }
+
+        /* Field order known: a field matching the leading parity opens a pair,
+         * the opposite parity closes the pending one. */
+        if (s->combine_lead_bottom >= 0) {
+            if (parity == s->combine_lead_bottom)
+                return HEVC_COMBINE_LEADER;
+            return s->combine_leader ? HEVC_COMBINE_FOLLOWER : HEVC_COMBINE_NONE;
+        }
+
+        /* Field order not yet established (entry before the first IRAP): with
+         * no leader pending this field opens the pair and provisionally seeds
+         * the order; the next IRAP corrects it if the guess was wrong. */
+        if (!s->combine_leader) {
+            s->combine_lead_bottom = parity;
+            return HEVC_COMBINE_LEADER;
+        }
+        return HEVC_COMBINE_FOLLOWER;
+    }
+    }
+
+    return HEVC_COMBINE_NONE;
+}
+
+/*
+ * combine_fields: the second field of the pair has started decoding into the
+ * shared buffer, so finalize the full-height "combined" frame the leader owns
+ * and flag the leader for output. The combined frame already carries pts/
+ * duration/SAR/colorimetry from its own get_buffer() at leader frame start;
+ * here we set the per-frame properties the decoder normally fills and fix up
+ * the geometry that differs from a single field.
+ */
+static int combine_finalize(HEVCFrame *leader, int tff)
+{
+    AVFrame *combined = leader->combined;
+
+    combined->pict_type = leader->f->pict_type;
+    combined->flags    |= AV_FRAME_FLAG_INTERLACED;
+    if (tff)
+        combined->flags |= AV_FRAME_FLAG_TOP_FIELD_FIRST;
+    else
+        combined->flags &= ~AV_FRAME_FLAG_TOP_FIELD_FIRST;
+    if (leader->f->flags & AV_FRAME_FLAG_KEY)
+        combined->flags |= AV_FRAME_FLAG_KEY;
+
+    /* one field line crops to two frame lines; columns are unaffected */
+    combined->crop_top    = leader->f->crop_top    * 2;
+    combined->crop_bottom = leader->f->crop_bottom * 2;
+    combined->crop_left   = leader->f->crop_left;
+    combined->crop_right  = leader->f->crop_right;
+
+    if (combined->duration > 0)
+        combined->duration *= 2;
+
+    /* per-frame SEI side data (A53 CC, HDR metadata, ...) was attached to the
+     * leader's field view in set_side_data(); the woven frame only has the
+     * packet-level properties from get_buffer() */
+    for (int i = 0; i < leader->f->nb_side_data; i++) {
+        int ret = av_frame_side_data_clone(&combined->side_data,
+                                           &combined->nb_side_data,
+                                           leader->f->side_data[i],
+                                           AV_FRAME_SIDE_DATA_FLAG_UNIQUE);
+        if (ret < 0)
+            return ret;
+    }
+
+    leader->flags |= HEVC_FRAME_FLAG_OUTPUT;
+
+    return 0;
+}
+
+/*
+ * combine_fields: drop a pending pair leader whose second field can no longer
+ * arrive (sequence boundary or another leader showing up). The half-filled
+ * woven buffer is never output; the field itself stays in the DPB as a
+ * reference until it ages out.
+ */
+static void combine_drop_leader(HEVCContext *s)
+{
+    if (s->combine_leader) {
+        av_log(s->avctx, AV_LOG_WARNING,
+               "combine_fields: dropping unpaired field with POC %d\n",
+               s->combine_leader->poc);
+        s->combine_leader = NULL;
+    }
+}
+
 static int hevc_frame_start(HEVCContext *s, HEVCLayerContext *l,
                             unsigned nal_idx)
 {
+    int combine_role = HEVC_COMBINE_NONE;
+    int combine_bottom = 0;
     const HEVCPPS *const pps = s->ps.pps_list[s->sh.pps_id];
     const HEVCSPS *const sps = pps->sps;
     int pic_size_in_ctb  = ((sps->width  >> sps->log2_min_cb_size) + 1) *
@@ -3248,6 +3423,7 @@ static int hevc_frame_start(HEVCContext *s, HEVCLayerContext *l,
             }
         }
 
+        combine_drop_leader(s);
         ff_hevc_clear_refs(l);
 
         ret = set_sps(s, l, sps);
@@ -3263,6 +3439,16 @@ static int hevc_frame_start(HEVCContext *s, HEVCLayerContext *l,
                 return ret;
             }
 
+            /* get_format() may have enabled a hwaccel, which disables field
+             * combining; re-export so the advertised geometry matches the
+             * per-field frames that will be produced, keeping the negotiated
+             * pixel format */
+            if (s->avctx->hwaccel) {
+                enum AVPixelFormat pix_fmt = s->avctx->pix_fmt;
+                export_stream_params(s, sps);
+                s->avctx->pix_fmt = pix_fmt;
+            }
+
             new_sequence = 1;
         }
     }
@@ -3273,8 +3459,10 @@ static int hevc_frame_start(HEVCContext *s, HEVCLayerContext *l,
     memset(l->is_pcm,        0, (sps->min_pu_width + 1) * (sps->min_pu_height + 1));
     memset(l->tab_slice_address, -1, pic_size_in_ctb * sizeof(*l->tab_slice_address));
 
-    if (IS_IDR(s))
+    if (IS_IDR(s)) {
+        combine_drop_leader(s);
         ff_hevc_clear_refs(l);
+    }
 
     s->slice_idx         = 0;
     s->first_nal_type    = s->nal_unit_type;
@@ -3319,10 +3507,41 @@ static int hevc_frame_start(HEVCContext *s, HEVCLayerContext *l,
     if (ret < 0)
         return ret;
 
+    combine_role = combine_decide_role(s, l, &combine_bottom);
+    s->combine_field_role   = combine_role;
+    s->combine_field_bottom = combine_bottom;
     ret = ff_hevc_set_new_ref(s, l, s->poc);
+    s->combine_field_role   = HEVC_COMBINE_NONE;
+    s->combine_field_bottom = 0;
     if (ret < 0)
         goto fail;
 
+    if (combine_role == HEVC_COMBINE_LEADER) {
+        /* a still-pending leader means the previous pair never received its
+         * second field */
+        combine_drop_leader(s);
+        /* hold the leader's output until the matching second field arrives */
+        s->combine_leader        = s->cur_frame;
+        s->combine_leader_bottom = combine_bottom;
+        s->cur_frame->flags &= ~HEVC_FRAME_FLAG_OUTPUT;
+    } else if (combine_role == HEVC_COMBINE_FOLLOWER) {
+        /* The second field completes the pair: release the woven frame for
+         * output, ordered by the leader's POC. Its samples are only fully
+         * decoded once this field finishes, but it cannot reach the caller
+         * earlier: output frames are returned by the decode call that bumps
+         * them from the DPB, and under frame threading the caller receives a
+         * worker's frames only after that worker (and all earlier ones,
+         * including the leader's) fully finished. Finalizing here, before
+         * ff_thread_finish_setup(), keeps the DPB state that later workers
+         * snapshot consistent, so the pair is also emitted when a sequence
+         * boundary or EOF follows immediately. */
+        s->cur_frame->flags &= ~HEVC_FRAME_FLAG_OUTPUT;
+        ret = combine_finalize(s->combine_leader, !s->combine_leader_bottom);
+        s->combine_leader = NULL;
+        if (ret < 0)
+            goto fail;
+    }
+
     ret = ff_hevc_frame_rps(s, l);
     if (ret < 0) {
         av_log(s->avctx, AV_LOG_ERROR, "Error constructing the frame RPS.\n");
@@ -3340,6 +3559,11 @@ static int hevc_frame_start(HEVCContext *s, HEVCLayerContext *l,
         !(s->avctx->export_side_data & AV_CODEC_EXPORT_DATA_FILM_GRAIN) &&
         !s->avctx->hwaccel;
 
+    /* film grain would have to be applied to the woven buffer, not the field
+     * views; not supported when combining fields */
+    if (combine_role != HEVC_COMBINE_NONE)
+        s->cur_frame->needs_fg = 0;
+
     ret = set_side_data(s);
     if (ret < 0)
         goto fail;
@@ -3395,6 +3619,8 @@ static int hevc_frame_start(HEVCContext *s, HEVCLayerContext *l,
     return 0;
 
 fail:
+    if (s->combine_leader == l->cur_frame)
+        s->combine_leader = NULL;
     if (l->cur_frame)
         ff_hevc_unref_frame(l->cur_frame, ~0);
     l->cur_frame = NULL;
@@ -3429,8 +3655,10 @@ static int verify_md5(HEVCContext *s, AVFrame *frame)
 
     msg_buf[0] = '\0';
     for (i = 0; frame->data[i]; i++) {
-        int width  = s->avctx->coded_width;
-        int height = s->avctx->coded_height;
+        /* the frame's own dimensions, not avctx->coded_*: when combining
+         * fields the checksum covers the half-height field view */
+        int width  = frame->width;
+        int height = frame->height;
         int w = (i == 1 || i == 2) ? (width  >> desc->log2_chroma_w) : width;
         int h = (i == 1 || i == 2) ? (height >> desc->log2_chroma_h) : height;
         uint8_t md5[16];
@@ -3899,6 +4127,24 @@ static int hevc_ref_frame(HEVCFrame *dst, const HEVCFrame *src)
         dst->needs_fg = 1;
     }
 
+    /* combine_fields: the leader's full-height woven buffer must travel with the
+     * frame across thread contexts (the follower writes into it and it is the
+     * frame finally output). It is a plain refcounted AVFrame, so give each DPB
+     * entry its own reference rather than sharing the pointer (which would
+     * double-free in ff_hevc_unref_frame). */
+    if (src->combined) {
+        dst->combined = av_frame_alloc();
+        if (!dst->combined) {
+            ff_hevc_unref_frame(dst, ~0);
+            return AVERROR(ENOMEM);
+        }
+        ret = av_frame_ref(dst->combined, src->combined);
+        if (ret < 0) {
+            ff_hevc_unref_frame(dst, ~0);
+            return ret;
+        }
+    }
+
     dst->pps     = av_refstruct_ref_c(src->pps);
     dst->tab_mvf = av_refstruct_ref(src->tab_mvf);
     dst->rpl_tab = av_refstruct_ref(src->rpl_tab);
@@ -3999,6 +4245,7 @@ static av_cold int hevc_init_context(AVCodecContext *avctx)
 
     s->dovi_ctx.logctx = avctx;
     s->eos = 0;
+    s->combine_lead_bottom = -1;
 
     ff_hevc_reset_sei(&s->sei);
 
@@ -4048,6 +4295,19 @@ static int hevc_update_thread_context(AVCodecContext *dst,
     s->eos        = s0->eos;
     s->no_rasl_output_flag = s0->no_rasl_output_flag;
 
+    /* combine_fields: carry the pending pair leader to the worker that will
+     * decode the follower. The DPB was copied index-for-index above, so the
+     * leader lives at the same slot; re-point into this context's base-layer
+     * DPB. */
+    s->combine_leader_bottom = s0->combine_leader_bottom;
+    s->combine_lead_bottom   = s0->combine_lead_bottom;
+    if (s0->combine_leader) {
+        size_t idx = s0->combine_leader - s0->layers[0].DPB;
+        s->combine_leader = &s->layers[0].DPB[idx];
+    } else {
+        s->combine_leader = NULL;
+    }
+
     s->is_nalff        = s0->is_nalff;
     s->nal_length_size = s0->nal_length_size;
     s->layers_active_decode = s0->layers_active_decode;
@@ -4187,6 +4447,8 @@ static av_cold int hevc_decode_init(AVCodecContext *avctx)
 static av_cold void hevc_decode_flush(AVCodecContext *avctx)
 {
     HEVCContext *s = avctx->priv_data;
+    s->combine_leader = NULL;
+    s->combine_lead_bottom = -1;
     ff_hevc_flush_dpb(s);
     ff_hevc_reset_sei(&s->sei);
     ff_dovi_ctx_flush(&s->dovi_ctx);
diff --git a/libavcodec/hevc/hevcdec.h b/libavcodec/hevc/hevcdec.h
index 8394740c4b..21460e19b3 100644
--- a/libavcodec/hevc/hevcdec.h
+++ b/libavcodec/hevc/hevcdec.h
@@ -357,6 +357,17 @@ typedef struct DBParams {
 #define HEVC_FRAME_FLAG_UNAVAILABLE (1 << 3)
 #define HEVC_FRAME_FLAG_CORRUPT (1 << 4)
 
+/* combine_fields: pairing role of the field picture currently being started,
+ * used to decide how its decode buffer is allocated (see alloc_frame()). The
+ * role reflects decode order (leader is the first field of the pair), which is
+ * independent of spatial parity: for top-field-first content the leader is the
+ * top field, for bottom-field-first content it is the bottom field. The spatial
+ * parity (which lines of the woven frame a field writes) is carried separately
+ * by combine_field_bottom. */
+#define HEVC_COMBINE_NONE     0
+#define HEVC_COMBINE_LEADER   1  ///< first field of the pair: owns the full-height buffer
+#define HEVC_COMBINE_FOLLOWER 2  ///< second field of the pair: a view into the leader's buffer
+
 typedef struct HEVCFrame {
     union {
         struct {
@@ -376,6 +387,15 @@ typedef struct HEVCFrame {
     RefPicListTab *rpl;            ///< RefStruct reference
     int nb_rpl_elems;
 
+    /**
+     * combine_fields: for the leader (the first field, in decode order) of a
+     * combined field pair, this is the full-height AVFrame that both fields are
+     * decoded into and which is output once as a single full-height frame. f is
+     * a doubled-stride view into this buffer. NULL for followers and for
+     * normally-decoded frames.
+     */
+    AVFrame *combined;
+
     void *hwaccel_picture_private; ///< RefStruct reference
 
     // for secondary-layer frames, this is the DPB index of the base-layer frame
@@ -561,6 +581,22 @@ typedef struct HEVCContext {
                             ///< as a format defined in 14496-15
     int apply_defdispwin;
 
+    /// while mid-pair, the leader (first) field frame the follower combines into
+    HEVCFrame *combine_leader;
+    /// spatial parity of the held leader: 1 if it writes the bottom (odd) lines,
+    /// i.e. the pair is bottom-field-first; used to set TOP_FIELD_FIRST
+    int combine_leader_bottom;
+    /// transient HEVC_COMBINE_* role for the frame currently being allocated
+    int combine_field_role;
+    /// transient spatial parity (1 = bottom/odd lines) of the frame being 
allocated
+    int combine_field_bottom;
+    /// combine_fields: locked field order for bare top/bottom fields 
(pic_struct
+    /// 1/2, which carry no pairing hint). Parity of the field that leads each
+    /// pair, matching combine_leader_bottom: -1 unknown, 0 top-field-first,
+    /// 1 bottom-field-first. Set from the first IRAP field of the sequence (an
+    /// IRAP starts a frame, hence leads its pair) and re-anchored at every 
IRAP.
+    int combine_lead_bottom;
+
     // multi-layer AVOptions
     int         *view_ids;
     unsigned  nb_view_ids;
diff --git a/libavcodec/hevc/refs.c b/libavcodec/hevc/refs.c
index 2acffd72d9..17b9810ef8 100644
--- a/libavcodec/hevc/refs.c
+++ b/libavcodec/hevc/refs.c
@@ -39,6 +39,7 @@ void ff_hevc_unref_frame(HEVCFrame *frame, int flags)
         frame->flags = 0;
     if (!frame->flags) {
         ff_progress_frame_unref(&frame->tf);
+        av_frame_free(&frame->combined);
         av_frame_unref(frame->frame_grain);
         frame->needs_fg = 0;
 
@@ -107,6 +108,63 @@ static int replace_alpha_plane(AVFrame *alpha, AVFrame *base)
     return AVERROR_BUG;
 }
 
+/*
+ * combine_fields: set up frame->f as a doubled-stride field view into a
+ * full-height "combined" buffer that holds both fields of an interlaced pair.
+ *
+ * For the pair leader (the first field in decode order) a fresh full-height
+ * (2*sps->height) AVFrame is allocated and stored in frame->combined; for the
+ * follower the leader's combined buffer is shared. Independently of which field
+ * leads, the spatial parity selects the lines written: a top field maps the
+ * even lines, a bottom field the odd lines (data offset by one line). In both
+ * cases f keeps the field's logical dimensions (sps->width x sps->height); only
+ * data[] and linesize[] describe the interleaved view, so reconstruction, SAO,
+ * deblocking and motion compensation all operate in field geometry and
+ * naturally write every other line of the shared frame.
+ */
+static int combine_setup_field_view(HEVCContext *s, HEVCLayerContext *l,
+                                    HEVCFrame *frame, int is_leader, int bottom)
+{
+    AVFrame *f = frame->f;
+    AVFrame *combined;
+    int ret, i;
+
+    if (is_leader) {
+        combined = av_frame_alloc();
+        if (!combined)
+            return AVERROR(ENOMEM);
+        combined->format = s->avctx->pix_fmt;
+        combined->width  = l->sps->width;
+        combined->height = l->sps->height * 2;
+        ret = ff_thread_get_buffer(s->avctx, combined, AV_GET_BUFFER_FLAG_REF);
+        if (ret < 0) {
+            av_frame_free(&combined);
+            return ret;
+        }
+        frame->combined = combined;
+    } else {
+        if (!s->combine_leader || !s->combine_leader->combined)
+            return AVERROR_BUG;
+        combined = s->combine_leader->combined;
+    }
+
+    f->format = combined->format;
+    f->width  = l->sps->width;
+    f->height = l->sps->height;
+    for (i = 0; i < FF_ARRAY_ELEMS(f->buf) && combined->buf[i]; i++) {
+        f->buf[i] = av_buffer_ref(combined->buf[i]);
+        if (!f->buf[i])
+            return AVERROR(ENOMEM);
+    }
+    for (i = 0; i < FF_ARRAY_ELEMS(f->data) && combined->data[i]; i++) {
+        f->data[i]     = combined->data[i] + (bottom ? combined->linesize[i] : 0);
+        f->linesize[i] = combined->linesize[i] * 2;
+    }
+    f->extended_data = f->data;
+
+    return 0;
+}
+
 static HEVCFrame *alloc_frame(HEVCContext *s, HEVCLayerContext *l)
 {
     const HEVCVPS *vps = l->sps->vps;
@@ -157,9 +215,17 @@ static HEVCFrame *alloc_frame(HEVCContext *s, HEVCLayerContext *l)
             }
         }
 
-        ret = ff_thread_get_buffer(s->avctx, frame->f, AV_GET_BUFFER_FLAG_REF);
-        if (ret < 0)
-            goto fail;
+        if (s->combine_field_role != HEVC_COMBINE_NONE) {
+            ret = combine_setup_field_view(s, l, frame,
+                                           s->combine_field_role == HEVC_COMBINE_LEADER,
+                                           s->combine_field_bottom);
+            if (ret < 0)
+                goto fail;
+        } else {
+            ret = ff_thread_get_buffer(s->avctx, frame->f, AV_GET_BUFFER_FLAG_REF);
+            if (ret < 0)
+                goto fail;
+        }
 
         size_t rpl_bytes;
         if (av_size_mult(s->pkt.nb_nals, sizeof(*frame->rpl), &rpl_bytes) < 0)
@@ -303,7 +369,8 @@ int ff_hevc_output_frames(HEVCContext *s,
             (nb_output &&
              (nb_dpb[0] > max_dpb || nb_dpb[1] > max_dpb))) {
             HEVCFrame *frame = &s->layers[min_layer].DPB[min_idx];
-            AVFrame *f = frame->needs_fg ? frame->frame_grain : frame->f;
+            AVFrame *f = frame->needs_fg ? frame->frame_grain :
+                         frame->combined ? frame->combined : frame->f;
             int output = !discard && (layers_active_output & (1 << min_layer));
 
             if (output) {
diff --git a/tests/fate/hevc.mak b/tests/fate/hevc.mak
index 7ce9ff403b..8045c99c8e 100644
--- a/tests/fate/hevc.mak
+++ b/tests/fate/hevc.mak
@@ -266,9 +266,12 @@ FATE_HEVC-$(call FRAMEMD5, HEVC, HEVC, HEVC_PARSER) += fate-hevc-skiploopfilter
 FATE_HEVC-$(call FRAMEMD5, MOV, HEVC, SCALE_FILTER) += fate-hevc-extradata-reload
 fate-hevc-extradata-reload: CMD = framemd5 -i $(TARGET_SAMPLES)/hevc/extradata-reload-multi-stsd.mov -sws_flags bitexact
 
-fate-hevc-paired-fields: CMD = probeframes -show_entries frame=interlaced_frame,top_field_first $(TARGET_SAMPLES)/hevc/paired_fields.hevc
+fate-hevc-paired-fields: CMD = probeframes -show_entries frame=width,height,interlaced_frame,top_field_first $(TARGET_SAMPLES)/hevc/paired_fields.hevc
 FATE_HEVC_FFPROBE-$(call DEMDEC, HEVC, HEVC) += fate-hevc-paired-fields
 
+fate-hevc-paired-fields-420: CMD = probeframes -show_entries frame=width,height,pix_fmt,interlaced_frame,top_field_first $(TARGET_SAMPLES)/hevc/paired_fields_420.hevc
+FATE_HEVC_FFPROBE-$(call DEMDEC, HEVC, HEVC) += fate-hevc-paired-fields-420
+
 fate-hevc-monochrome-crop: CMD = probeframes -show_entries frame=width,height:stream=width,height $(TARGET_SAMPLES)/hevc/hevc-monochrome.hevc
 FATE_HEVC_FFPROBE-$(call PARSERDEMDEC, HEVC, HEVC, HEVC) += fate-hevc-monochrome-crop
 
diff --git a/tests/ref/fate/hevc-paired-fields b/tests/ref/fate/hevc-paired-fields
index f2223e770b..ace522cc9b 100644
--- a/tests/ref/fate/hevc-paired-fields
+++ b/tests/ref/fate/hevc-paired-fields
@@ -1,16 +1,12 @@
 [FRAME]
+width=1920
+height=1080
 interlaced_frame=1
 top_field_first=1
 [/FRAME]
 [FRAME]
-interlaced_frame=1
-top_field_first=0
-[/FRAME]
-[FRAME]
+width=1920
+height=1080
 interlaced_frame=1
 top_field_first=1
 [/FRAME]
-[FRAME]
-interlaced_frame=1
-top_field_first=0
-[/FRAME]
diff --git a/tests/ref/fate/hevc-paired-fields-420 b/tests/ref/fate/hevc-paired-fields-420
new file mode 100644
index 0000000000..d667e26079
--- /dev/null
+++ b/tests/ref/fate/hevc-paired-fields-420
@@ -0,0 +1,14 @@
+[FRAME]
+width=1920
+height=1080
+pix_fmt=yuv420p
+interlaced_frame=1
+top_field_first=1
+[/FRAME]
+[FRAME]
+width=1920
+height=1080
+pix_fmt=yuv420p
+interlaced_frame=1
+top_field_first=1
+[/FRAME]
-- 
2.52.0
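
For readers following the zero-copy weave in combine_setup_field_view() above, the doubled-stride idea can be illustrated outside FFmpeg. This is only a minimal sketch under stated assumptions: PlaneView, field_view() and fill_view() are hypothetical names invented for this note, not FFmpeg API, and a single 8-bit plane stands in for an AVFrame. It shows that two half-height views with twice the linesize, the bottom one offset by one line, interleave into one full-height plane without any copy.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for one plane of an AVFrame; not an FFmpeg type. */
typedef struct PlaneView {
    uint8_t  *data;     /* first byte of the first line of the view */
    ptrdiff_t linesize; /* distance in bytes between consecutive lines */
    int       width;    /* visible bytes per line */
    int       height;   /* number of lines in the view */
} PlaneView;

/* Build a field view into a full-height plane: a top field starts at line 0,
 * a bottom field at line 1, and both step two full-height lines at a time. */
static PlaneView field_view(const PlaneView *full, int bottom)
{
    PlaneView v;

    assert(full->height % 2 == 0);
    v.data     = full->data + (bottom ? full->linesize : 0);
    v.linesize = full->linesize * 2;
    v.width    = full->width;
    v.height   = full->height / 2;
    return v;
}

/* Fill every line of a view with a constant, as a stand-in for decoding
 * one field in field geometry. */
static void fill_view(const PlaneView *v, uint8_t value)
{
    for (int y = 0; y < v->height; y++)
        memset(v->data + y * v->linesize, value, (size_t)v->width);
}
```

Filling the top view and then the bottom view of a 4-line plane leaves lines 0 and 2 holding the top value and lines 1 and 3 holding the bottom value, which is the woven layout the patch outputs through frame->combined.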


From 370460d5a79be5632f45eee51af28e5c4e4ca474 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Torbjo=CC=88rn=20Einarsson?= <[email protected]>
Date: Fri, 12 Jun 2026 00:04:59 +0200
Subject: [PATCH 4/4] avcodec/hevc: mark HEVC as field-based and report
 field-unit repeat_pict
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Add AV_CODEC_PROP_FIELDS to the HEVC codec descriptor and make the
parser report repeat_pict in field periods minus one (0 field, 1 frame,
2 frame with a repeated field, 3 frame doubling, 5 frame tripling) and
avctx->framerate as the frame rate: the HEVC VUI tick is per coded
picture, which for field_seq streams is the field rate, so it is halved
for field pictures. This mirrors the H.264 parser.

libavformat then derives correct packet durations for field-coded
streams and estimates e.g. avg_frame_rate 25/1 with r_frame_rate 50/1
for a 1080i50 stream in MPEG-TS, the same presentation as for interlaced
H.264, instead of reporting the field rate as the frame rate.

Signed-off-by: Torbjörn Einarsson <[email protected]>
---
 libavcodec/codec_desc.c  |  3 ++-
 libavcodec/hevc/parser.c | 41 ++++++++++++++++++++++++++++++++++++----
 2 files changed, 39 insertions(+), 5 deletions(-)
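
As an illustration of the semantics described in the commit message, the following standalone sketch restates the pic_struct to repeat_pict mapping and the halving of the VUI tick rate for field pictures. It is a simplified model under stated assumptions, not the parser code: repeat_pict_from_pic_struct(), frame_rate_from_vui() and the Rational type are hypothetical names used only here, and a plain GCD reduction stands in for av_reduce().

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper mirroring the switch in this patch: duration of a
 * picture in field periods minus one, from the picture timing SEI. */
static int repeat_pict_from_pic_struct(int present, int pic_struct)
{
    if (!present)
        return 1;   /* no picture timing SEI: treat as a plain frame */
    switch (pic_struct) {
    case 1: case 2: case 9: case 10: case 11: case 12:
        return 0;   /* individual field: 1 field period */
    case 5: case 6:
        return 2;   /* field pair with one field repeated: 3 field periods */
    case 7:
        return 3;   /* frame doubling: 4 field periods */
    case 8:
        return 5;   /* frame tripling: 6 field periods */
    default:
        return 1;   /* frame: 2 field periods */
    }
}

/* Hypothetical rational type; FFmpeg itself uses AVRational. */
typedef struct Rational {
    int64_t num, den;
} Rational;

static int64_t gcd64(int64_t a, int64_t b)
{
    while (b) {
        int64_t t = a % b;
        a = b;
        b = t;
    }
    return a;
}

/* The VUI tick is per coded picture, so for field pictures one frame spans
 * two ticks: frame rate = time_scale / (num_units_in_tick * 2). */
static Rational frame_rate_from_vui(int64_t num_units_in_tick,
                                    int64_t time_scale, int is_field)
{
    Rational r;
    int64_t g;

    assert(num_units_in_tick > 0 && time_scale > 0);
    r.num = time_scale;
    r.den = num_units_in_tick * (is_field ? 2 : 1);
    g     = gcd64(r.num, r.den);
    r.num /= g;
    r.den /= g;
    return r;
}
```

With num_units_in_tick = 1 and time_scale = 50, a field-coded stream yields 25/1 in this model, matching the 1080i50 example in the commit message.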

diff --git a/libavcodec/codec_desc.c b/libavcodec/codec_desc.c
index 81c095bea7..46c01ca14c 100644
--- a/libavcodec/codec_desc.c
+++ b/libavcodec/codec_desc.c
@@ -1268,7 +1268,8 @@ static const AVCodecDescriptor codec_descriptors[] = {
         .type      = AVMEDIA_TYPE_VIDEO,
         .name      = "hevc",
         .long_name = NULL_IF_CONFIG_SMALL("H.265 / HEVC (High Efficiency Video Coding)"),
-        .props     = AV_CODEC_PROP_LOSSY | AV_CODEC_PROP_LOSSLESS | AV_CODEC_PROP_REORDER,
+        .props     = AV_CODEC_PROP_LOSSY | AV_CODEC_PROP_LOSSLESS |
+                     AV_CODEC_PROP_REORDER | AV_CODEC_PROP_FIELDS,
         .profiles  = NULL_IF_CONFIG_SMALL(ff_hevc_profiles),
     },
     {
diff --git a/libavcodec/hevc/parser.c b/libavcodec/hevc/parser.c
index 508d093807..51c605cb4b 100644
--- a/libavcodec/hevc/parser.c
+++ b/libavcodec/hevc/parser.c
@@ -87,6 +87,18 @@ static enum AVFieldOrder hevc_field_order(const HEVCSPS *sps,
     return AV_FIELD_UNKNOWN;
 }
 
+/* whether the current picture is an individual field (H.265 Table D.2) */
+static int hevc_pic_struct_is_field(const HEVCSEIPictureTiming *pt)
+{
+    if (!pt->present)
+        return 0;
+    switch (pt->pic_struct) {
+    case 1: case 2: case 9: case 10: case 11: case 12:
+        return 1;
+    }
+    return 0;
+}
+
 static int hevc_parse_slice_header(AVCodecParserContext *s, H2645NAL *nal,
                                    AVCodecContext *avctx)
 {
@@ -139,8 +151,12 @@ static int hevc_parse_slice_header(AVCodecParserContext *s, H2645NAL *nal,
     }
 
     if (num > 0 && den > 0)
+        /* Report the frame rate: the VUI tick is per coded picture, so for
+         * field pictures the tick rate is the field rate and one frame
+         * spans two ticks (AV_CODEC_PROP_FIELDS semantics, as for H.264). */
         av_reduce(&avctx->framerate.den, &avctx->framerate.num,
-                  num, den, 1 << 30);
+                  num * (int64_t)(hevc_pic_struct_is_field(&sei->picture_timing) ? 2 : 1),
+                  den, 1 << 30);
 
     if (!first_slice_in_pic_flag) {
         unsigned int slice_segment_addr;
@@ -271,10 +287,27 @@ static int parse_nal_units(AVCodecParserContext *s, const uint8_t *buf,
         case HEVC_NAL_RADL_R:
         case HEVC_NAL_RASL_N:
         case HEVC_NAL_RASL_R:
-            if (ctx->sei.picture_timing.picture_struct == HEVC_SEI_PIC_STRUCT_FRAME_DOUBLING) {
+            /* repeat_pict is the picture's duration in field periods minus
+             * one (AV_CODEC_PROP_FIELDS semantics, as in the H.264 parser) */
+            if (ctx->sei.picture_timing.present) {
+                switch (ctx->sei.picture_timing.pic_struct) {
+                case 1: case 2: case 9: case 10: case 11: case 12:
+                    s->repeat_pict = 0; /* individual field */
+                    break;
+                case 5: case 6:
+                    s->repeat_pict = 2; /* field pair with one field repeated */
+                    break;
+                case 7:
+                    s->repeat_pict = 3; /* frame doubling */
+                    break;
+                case 8:
+                    s->repeat_pict = 5; /* frame tripling */
+                    break;
+                default:
+                    s->repeat_pict = 1; /* frame */
+                }
+            } else {
                 s->repeat_pict = 1;
-            } else if (ctx->sei.picture_timing.picture_struct == HEVC_SEI_PIC_STRUCT_FRAME_TRIPLING) {
-                s->repeat_pict = 2;
             }
             ret = hevc_parse_slice_header(s, nal, avctx);
             if (ret)
-- 
2.52.0
