On Thu, Dec 13, 2012 at 10:07:19AM +0100, Tina le Grand wrote: > Hi, > > I have reviewed draft-ietf-codec-oggopus-00, and I have some questions and > comments. The Ogg format is new to me, so some of my comments might not > require any changes to the specification. > > Section 3: > - In RFC 3533 "packet" is defined to be "created by the encoder of the > logical bitstream and represent meaningful entities for the encoder only", > but in this section it is said that "the first packet in the logical Ogg > bitstream MUST contain the identification (ID) header...". This header is > not part of the regular Opus bitstream, and is defined for the Ogg Opus > format only. does "packet" have a different meaning in this specification? > Or should it say "the first page.."?
It's actually probably "encoder" that might have the different meaning there :) Here it means the thing encoding the "ogg bitstream" which includes both the codec (Opus) packets and the metadata in the bitstream. Ogg pages contain packets, and all payload is contained in packets, including metadata not produced by the codec itself. If there's some simple text we could add to clarify this though, please do suggest something. > Section 5: > - Would be great if you could mention the name of the two header packets in > the first section. They are introduced and named in Section 3. But the "tell them what you are going to tell them" principle means repeating that in the Section 5 intro probably isn't a bad idea. > Section 5.1, list number 4: > - What is a cropped Ogg Opus stream? That's mentioned in Section 4.4. Ogg streams can generally be shortened by trimming packets off the beginning or end of the stream without needing to rewrite the entire stream portion that is retained. If a few more words to explain this are warranted, then please suggest some extra text (and the best place to put it) for that too. > Section 5.1, list number 5 (page 13): > - What does this mean "The original sample rate of the encoder input is not > preserved by the lossy compression"? You don't get perfect reconstruction, > of course, but you'll get the same audio bandwidth except for the 48 kHz > mode, or is there any other filtering affecting the bandwidth? Once the stream is encoded as Opus, there is no longer any concept of sample rate inherent in the data. It is entirely a decision of the decoder configuration what sample rate it will be reconstructed with. Bandwidth will be preserved if it chooses a sufficiently high sample rate, but it could also take a full-band 48k original signal and reconstruct that at 8kHz. It would just throw away extra information that wasn't actually lost by the encoder. If you want to know exactly what the sample rate of the original was, (regardless of the actual bandwidth of the audio it contained), then we need to store that as metadata, because Opus doesn't know it. > Section 5.1, list number 5 (page 13): > - There is a new numbered list within list number 5, which makes > the document hard to read. That's not really a numbered list, it's a 4 step procedure. That said, that procedure itself is also under review, since it is arguably not actually the optimal decision for quality at present. > - List number 3: Can be more clear: "...decode at the highest supported > rate above the hardware's sample rate and resample." > - Right after the list of 4 ways of choosing decode rate there is a fifth, > that is not in the list. I think it should be merged into the list as one > option. That's not really a 5th step in the procedure for selecting a _playback_ sample rate. It's the rationale for having this metadata field. As the first sentence says: This field is _not_ the sample rate to use for playback of the encoded data. But if there is confusion over this section, it would be good to figure out where and clear that up too. It does sort of sidetrack a little before getting to the actual use of the field, but the sidetrack is related to when a decoder should use the field and when it should just ignore it completely. > Section 5.1.1, page 17: > - Family 1: Would be better if the Vorbis channel order was described in > this spec as well. We had discussed that, but there was no clear guidance yet on whether it was better as an external reference or should be repeated again here. > Section 5.2, list number 3: > - NUL -> NULL? or "null" as in section 5.1 list number 2, page 12. NUL is the traditional name for the 0 octet in a character string, as distinct from a NULL pointer value etc. -- though it's true that it is also called a "null terminator" when it's not actually a part of the string content ... http://en.wikipedia.org/wiki/Null_character Thanks for your feedback on all this. Ron _______________________________________________ codec mailing list codec@ietf.org https://www.ietf.org/mailman/listinfo/codec