I've incorporated the earlier comments and added separate limits for
the comment header.  The new text clarifies that comment header
packets exceeding 61,440 octets may be partially processed, but may
not be rejected entirely unless they also exceed a much larger size.
That larger size was chosen somewhat arbitrarily to be 120 * 2^20
(125,829,120) octets, which should be sufficient for any reasonable
album art in addition to other comments.

In Section 5.1.1.5 replace:

   An Ogg Opus player MUST play any Ogg Opus stream with a channel
   mapping family of 0 or 1, even if the number of channels does not
   match the physically connected audio hardware.

with:

   An Ogg Opus player MUST play any valid Ogg Opus stream with a
   channel mapping family of 0 or 1 that contains a comment header
   no larger than 125,829,120 octets (see Section 5.2), and no audio
   data packet larger than 61,440 octets (see Section 6), even if the
   number of channels does not match the physically connected audio
   hardware.

In Section 5.2, immediately before Section 5.2.1, insert:

   The comment header can be arbitrarily large and might be spread
   over a large number of Ogg pages.  Decoders SHOULD avoid attempting
   to allocate excessive amounts of memory when presented with a very
   large comment header.  To accomplish this, decoders MAY reject a
   comment header larger than 125,829,120 octets, and MAY ignore
   individual comments that are not fully contained within the first
   61,440 octets of the comment header or that would otherwise have
   no impact.
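
As an illustration only, and not part of the proposed text, the two
limits above could be applied roughly as follows.  This Python sketch
assumes the comment header layout defined in Section 5.2 (the
"OpusTags" magic, then a vendor string and a list of user comments,
each with a 32-bit little-endian length prefix).  The function name
kept_comments and its prefix/total_size parameters are hypothetical.

```python
import struct

MAX_HEADER_SIZE = 125_829_120  # 120 * 2**20 octets; larger headers may be rejected
PREFIX_LIMIT = 61_440          # comments ending beyond this offset may be ignored


def kept_comments(prefix: bytes, total_size: int) -> list:
    """Return the user comments fully contained in the buffered prefix.

    prefix     -- the first min(total_size, PREFIX_LIMIT) octets of the packet
    total_size -- full packet size, counted without buffering the rest
    (Both parameters are assumptions made for this sketch.)
    """
    if total_size > MAX_HEADER_SIZE:
        raise ValueError("comment header too large")
    buf = prefix[:PREFIX_LIMIT]
    if buf[:8] != b"OpusTags":
        raise ValueError("not a comment header")

    def read_u32(pos):
        # Return None when the field is not fully inside the prefix.
        if pos + 4 > len(buf):
            return None
        return struct.unpack_from("<I", buf, pos)[0]

    pos = 8
    vendor_len = read_u32(pos)
    if vendor_len is None or pos + 4 + vendor_len > len(buf):
        return []  # vendor string runs past the prefix; nothing to keep
    pos += 4 + vendor_len
    count = read_u32(pos)
    if count is None:
        return []
    pos += 4
    comments = []
    for _ in range(count):
        length = read_u32(pos)
        if length is None or pos + 4 + length > len(buf):
            break  # this comment and all later ones are ignored
        comments.append(buf[pos + 4:pos + 4 + length])
        pos += 4 + length
    return comments
```

The caller is assumed to stop buffering once PREFIX_LIMIT octets have
been collected and merely count the remaining octets, so memory use
stays bounded no matter how large the comment header is.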

Replace Section 6 with:

6. Audio Data Packet Size Limits

   Technically, valid audio data packets can be arbitrarily large due
   to the padding format, although the amount of non-padding data they
   can contain is bounded.  These packets might be spread over a
   similarly enormous number of Ogg pages.  Encoders SHOULD limit the
   use of padding in audio data packets to no more than is necessary
   to make a variable bitrate (VBR) Ogg Opus stream constant bitrate
   (CBR).  Decoders SHOULD reject audio data packets larger than
   61,440 octets per Opus stream; such packets necessarily contain
   more padding than needed for this purpose.  Decoders SHOULD avoid
   attempting to allocate excessive amounts of memory when presented
   with a very large packet.  Decoders MAY reject or partially process
   audio data packets larger than 61,440 octets in an Ogg Opus stream
   with channel mapping family 1, or in any Ogg Opus stream if the
   packet is also larger than 7,680 octets per Opus stream.  The
   presence of an extremely large packet in the stream could indicate
   a memory exhaustion attack or stream corruption.
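
To make the interaction of these thresholds concrete, here is a small
Python sketch of one possible reading of the paragraph above.  It is
not part of the proposed text; the function name classify_packet and
its three return labels are hypothetical, and "may_reject" stands for
"MAY reject or partially process".

```python
PACKET_LIMIT = 61_440     # octets any implementation is expected to handle
PER_STREAM_LIMIT = 7_680  # PACKET_LIMIT / 8, applied per Opus stream


def classify_packet(size: int, n_streams: int, mapping_family: int) -> str:
    """Return 'accept', 'may_reject' or 'should_reject' for a packet size.

    size           -- audio data packet size in octets
    n_streams      -- number of Opus streams (N) in the Ogg Opus stream
    mapping_family -- channel mapping family from the ID header
    """
    # More than 61,440 octets per Opus stream: necessarily more padding
    # than is needed to make a VBR stream CBR.
    if size > PACKET_LIMIT * n_streams:
        return "should_reject"
    # Above 61,440 octets in total, rejection is allowed for mapping
    # family 1, or for any family once 7,680 octets per stream is exceeded.
    if size > PACKET_LIMIT and (
        mapping_family == 1 or size > PER_STREAM_LIMIT * n_streams
    ):
        return "may_reject"
    return "accept"
```

For example, under this reading a 100,000-octet packet in a 20-stream
mapping family 255 stream is accepted, because it stays below 7,680
octets per Opus stream, while the same packet in a mapping family 1
stream falls under the MAY clause.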

   In an Ogg Opus stream, the largest possible valid audio data packet
   that does not use padding has a size of (61,298*N - 2) octets, where
   N is the number of Opus streams.  With 255 Opus streams, this is
   15,630,988 octets and can span up to 61,298 Ogg pages, all but one
   of which will have a granule position of -1.  This is of course a
   very extreme packet, consisting of 255 Opus streams, each
   containing 120 ms of audio encoded as 2.5 ms frames, each frame
   using the maximum possible number of octets (1275) and stored in
   the least efficient manner allowed (a VBR code 3 Opus packet).
   Even in such a packet, most of the data will be zeros as 2.5 ms
   frames cannot actually use all 1275 octets.
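
The figures in the paragraph above can be reproduced with a short
calculation.  The octet accounting in this Python sketch is
reconstructed from the Opus framing rules in RFC 6716 (one TOC octet,
one frame count octet, and a two-octet length per 1275-octet frame,
with the last stream omitting its final length) and is not spelled out
in the text itself; the function names are hypothetical.

```python
MAX_FRAME = 1275  # largest Opus frame, in octets
LACING_MAX = 255  # octets covered by one Ogg lacing value (segment)


def max_stream_octets(frames: int) -> int:
    """Size of one self-delimited code 3 VBR packet with maximal frames."""
    toc_and_count = 2     # TOC octet + frame count octet
    lengths = 2 * frames  # a 1275-octet frame needs a two-octet length
    return toc_and_count + lengths + MAX_FRAME * frames


def max_packet_octets(frames: int, n_streams: int) -> int:
    """Largest unpadded audio data packet carrying n_streams Opus streams."""
    # The last stream is not self-delimited, so it omits the two-octet
    # length of its final frame.
    return max_stream_octets(frames) * n_streams - 2


def max_ogg_pages(packet_octets: int) -> int:
    """Worst-case page count: every Ogg page carries a single segment."""
    # A packet is laced as full 255-octet segments plus one shorter
    # (possibly empty) terminating segment.
    return packet_octets // LACING_MAX + 1
```

With 48 frames of 2.5 ms per 120 ms packet, max_stream_octets(48) is
61,298, max_packet_octets(48, 255) is 15,630,988, and
max_ogg_pages(15630988) is 61,298.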

   The largest audio data packet consisting of entirely useful data is
   (15,326*N - 2) octets.  This corresponds to 120 ms of audio encoded
   as 10 ms frames in either SILK or Hybrid mode, but at a data rate
   of over 1 Mbps, which makes little sense for the quality achieved.

   A more reasonable audio data packet size limit is (7,664*N - 2)
   octets.  This corresponds to 120 ms of audio encoded as 20 ms
   stereo CELT mode frames, with a total bitrate just under 511 kbps
   (not counting the Ogg encapsulation overhead).  For mapping family
   1, N=8 provides a reasonable upper bound, as it allows for each of
   the 8 possible output channels to be decoded from a separate stereo
   Opus stream.  This gives a size of 61,310 octets, which is rounded
   up to a multiple of 1024 octets to yield the audio data packet size
   of 61,440 octets that any implementation is expected to be able to
   process successfully.
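
The arithmetic behind the 61,440-octet figure and the quoted bitrates
can be checked with the following Python sketch.  It only restates the
formulas given above; the helper names are hypothetical, and the
bitrates are computed for a single Opus stream (N=1) over a 120 ms
packet.

```python
def packet_octets(per_stream: int, n_streams: int) -> int:
    """Evaluate the (per_stream*N - 2) size formula from the text."""
    return per_stream * n_streams - 2


def bitrate_bps(octets: int, duration_ms: int = 120) -> float:
    """Bitrate of a packet of the given size covering duration_ms of audio."""
    return octets * 8 * 1000 / duration_ms


def round_up(value: int, multiple: int) -> int:
    """Round value up to the next multiple (ceiling division)."""
    return -(-value // multiple) * multiple


# Reasonable limit for mapping family 1 with N=8 stereo streams.
limit = round_up(packet_octets(7664, 8), 1024)

# Single-stream bitrates for the 20 ms CELT and 10 ms SILK/Hybrid cases.
celt_bps = bitrate_bps(packet_octets(7664, 1))
silk_bps = bitrate_bps(packet_octets(15326, 1))
```

Here packet_octets(7664, 8) is 61,310 and limit is 61,440, which is
8 * 7,680.  celt_bps comes to 510,800 (just under 511 kbps) and
silk_bps to 1,021,600 (over 1 Mbps).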

In Section 14.2 replace:
   https://www.xiph.org/vorbis/doc/Vorbis_I_spec.html#x1-800004.3.9
with:
   https://www.xiph.org/vorbis/doc/Vorbis_I_spec.html#x1-810004.3.9

Replace:
   https://xiph.org/vorbis/doc/Vorbis_I_spec.html#x1-130000A.2
with:
   https://www.xiph.org/vorbis/doc/Vorbis_I_spec.html#x1-132000A.2

 - Mark
