david <da...@rowetel.com> writes:

> Here is first version of some Codec 2 algorithm documentation:
>
> https://github.com/drowe67/codec2/blob/main/doc/codec2.pdf
>
> It has a description aimed at Hams (sort of thing that could go in a
> Ham magazine), plus a maths/signal processing section. It pulls
> together a bunch of work I've done since the start of the
> project. Kindly funded by our ARDC grant.
>
> I'm interested in review comments:
>
> 1/ Any typos/spelling, wording issues. A GitHub PR would be best for these
>    if you are comfortable with Git.
> 2/ Anything that is not explained clearly? Other topics that you would like
>    to see covered?

I'm coming at this as someone who worked in network protocols (sort of
IETF adjacent) and has a hazy understanding of speech coding. I've been
lurking forever, silently cheering you on.

Some meta comments about issues that belong in a spec. These are partly
about my impressions over the years, and a bit about the document.

First, I think "codec2" is a family of codecs. There are various modes
described by bitrate. A decoder for one mode probably does not process
another. Or maybe it does and I'm confused. The first point under the
introduction says most of this, but it doesn't address interop.

I realize "a codec with modes" versus "a family of codecs" is perhaps
just a wording choice, but to me the key point is that with the second
phrasing there is no expectation that a 700C decoder will decode a 700A
stream, and even less so that it will decode a 3200 stream.

I have long been unclear on whether codec2 is stable, in the sense that
if one builds a device that uses a currently defined mode, it can be
expected to remain interoperable over the long term. I'm getting the
impression that each mode designation is stable by fiat, but that some
modes get retired, and that "implementing codec2" must therefore be done
in such a way that a slightly different algorithm can be loaded later,
with the expectation that the processing requirements will not
substantially change.

It now seems that "700C" is a fixed definition, at least for the
decoder, and that it is guaranteed to remain. But there might be a 700D.
(And yes, I get it that 700 is much harder than 2400 to make sound OK,
and hence there are more revisions.)

I have the impression that for a given mode, say 2400, there is a single
correct way to decode, but that encoding might have valid choices within
the spec, resulting in better estimated parameters, with no need for the
decoder to be different.

I think there should be a document that specifies each mode. That can be
one doc for all modes, or one per mode referring to a common doc.
How those documents are organized is not important; the point is that
someone who wishes to implement a mode from scratch can assemble a pile
of paper and implement from that, without reading your source code, and
come up with an interoperable implementation. This is the IETF way, and
I think it has great merit. It's also a huge amount of work.

For codec2, the theory and the "why are we doing it this way" seem to be
the hardest part, and "this is what the bitstream means" seems simpler.
The document seems to fall short on the latter: it does not explain the
bit formats.

I have downloaded the file to really read it.

I think it would be great if there were tools to unpack the bitstream
to, say, JSON, to make writing random analysis code easier, and to have
something that works like tcpdump.

Hope this helps...

73 de n1dam

_______________________________________________
Freetel-codec2 mailing list
Freetel-codec2@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/freetel-codec2