On Thu, Dec 13, 2012 at 10:07:19AM +0100, Tina le Grand wrote:
> Hi,
> 
> I have reviewed draft-ietf-codec-oggopus-00, and I have some questions and
> comments. The Ogg format is new to me, so some of my comments might not
> require any changes to the specification.
> 
> Section 3:
> - In RFC 3533 "packet" is defined to be "created by the encoder of the
> logical bitstream and represent meaningful entities for the encoder only",
> but in this section it is said that "the first packet in the logical Ogg
> bitstream MUST contain the identification (ID) header...". This header is
> not part of the regular Opus bitstream, and is defined for the Ogg Opus
> format only. does "packet" have a different meaning in this specification?
> Or should it say "the first page.."?

It's actually probably "encoder" that might have the different meaning
there :)  Here it means the thing encoding the "ogg bitstream" which
includes both the codec (Opus) packets and the metadata in the bitstream.

Ogg pages contain packets, and all payload is contained in packets,
including metadata not produced by the codec itself.

If there's some simple text we could add to clarify this though, please
do suggest something.


> Section 5:
> - Would be great if you could mention the name of the two header packets in
> the first section.

They are introduced and named in Section 3.  But the "tell them what you
are going to tell them" principle means repeating that in the Section 5
intro probably isn't a bad idea.

> Section 5.1, list number 4:
> - What is a cropped Ogg Opus stream?

That's mentioned in Section 4.4.  Ogg streams can generally be shortened by
trimming packets off the beginning or end of the stream without needing to
rewrite the entire stream portion that is retained.

If a few more words to explain this are warranted, then please suggest
some extra text (and the best place to put it) for that too.

> Section 5.1, list number 5 (page 13):
> - What does this mean "The original sample rate of the encoder input is not
> preserved by the lossy compression"? You don't get perfect reconstruction,
> of course, but you'll get the same audio bandwidth except for the 48 kHz
> mode, or is there any other filtering affecting the bandwidth?

Once the stream is encoded as Opus, there is no longer any concept of
sample rate inherent in the data.  It is entirely a decision of the
decoder configuration what sample rate it will be reconstructed with.

Bandwidth will be preserved if it chooses a sufficiently high sample
rate, but it could also take a full-band 48k original signal and
reconstruct that at 8kHz.  It would just throw away extra information
that wasn't actually lost by the encoder.

If you want to know exactly what the sample rate of the original was,
(regardless of the actual bandwidth of the audio it contained), then
we need to store that as metadata, because Opus doesn't know it.

> Section 5.1, list number 5 (page 13):
> - There is a new numbered list within list number 5, which makes
> the document hard to read.

That's not really a numbered list, it's a 4 step procedure.
That said, that procedure itself is also under review, since it is
arguably not actually the optimal decision for quality at present.

> - List number 3: Can be more clear: "...decode at the highest supported
> rate above the hardware's sample rate and resample."
> - Right after the list of 4 ways of choosing decode rate there is a fifth,
> that is not in the list. I think it should be merged into the list as one
> option.

That's not really a 5th step in the procedure for selecting a _playback_
sample rate.  It's the rationale for having this metadata field.

As the first sentence says:

 This field is _not_ the sample rate to use for playback of the
 encoded data.

But if there is confusion over this section, it would be good to figure
out where and clear that up too.  It does sort of sidetrack a little
before getting to the actual use of the field, but the sidetrack is
related to when a decoder should use the field and when it should just
ignore it completely.


> Section 5.1.1, page 17:
> - Family 1: Would be better if the Vorbis channel order was described in
> this spec as well.

We had discussed that, but there was no clear guidance yet on whether it
was better as an external reference or should be repeated again here.

> Section 5.2, list number 3:
> - NUL -> NULL? or "null" as in section 5.1 list number 2, page 12.

NUL is the traditional name for the 0 octet in a character string,
as distinct from a NULL pointer value etc. -- though it's true that
it is also called a "null terminator" when it's not actually a part
of the string content ...

http://en.wikipedia.org/wiki/Null_character


Thanks for your feedback on all this.

  Ron


_______________________________________________
codec mailing list
codec@ietf.org
https://www.ietf.org/mailman/listinfo/codec

Reply via email to