Hi,

Following Bruce's email I took a look at https://github.com/drowe67/codec2/blob/main/doc/codec2.pdf to see how the speech synthesis in codec2 actually works. This is because I suspect that the NN synth (LPCNet?) is doing something similar to what CELT (one half of Opus) does, namely filling the off-peak parts of the spectrum with noise.
Filling the spectrum with noise makes the output sound a lot less robot-y. In the CELT world, robot-y sound is known to come from collapsing all the energy in a Bark band into a single spectral bin which, looking at equation (10), is *precisely what the codec2 synth does*! Or, sort of: for low F0 there may be multiple peaks in some bands.

One way to make the output less robot-y could be to convolve Ŝw with a suitable function that smooths out the peaks. In CELT this is done via clever use of rotations, if I remember correctly. A floor of comfort noise might also be a good idea.

Oh, and on the topic of errors in the paper, there's a spelling mistake on page 2: "anlaysed" should be "analysed".

/Tomas
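
P.S. To make the smoothing idea a bit more concrete, here is a rough numpy sketch of what I have in mind. It is not taken from codec2 or from CELT (and it is not the rotation trick CELT uses). The function name `smooth_and_fill`, the Hann kernel, its width, the -40 dB floor, and the toy 8 kHz / 512-point line spectrum standing in for Ŝw are all assumptions I picked for illustration. The power spectrum is convolved with a unit-sum kernel so that total energy stays roughly the same, and a small random floor is added on top.

```python
import numpy as np

def smooth_and_fill(mag, kernel_width=9, floor_db=-40.0, rng=None):
    """Spread each spectral line over nearby bins and add a noise floor.

    mag          : one-sided magnitude spectrum (hypothetical stand-in for Ŝw)
    kernel_width : odd number of bins each peak is spread over (assumed value)
    floor_db     : noise floor relative to the mean power per bin (assumed value)
    """
    rng = np.random.default_rng(0) if rng is None else rng
    power = np.asarray(mag, dtype=float) ** 2

    # Hann-shaped kernel without its zero end points, normalised to sum to 1
    # so that smoothing the power spectrum preserves total energy (for peaks
    # that are not right at the edges of the spectrum).
    kernel = np.hanning(kernel_width + 2)[1:-1]
    kernel /= kernel.sum()
    smoothed = np.convolve(power, kernel, mode="same")

    # Random floor: exponentially distributed power per bin, which is what
    # the squared magnitude of complex Gaussian noise looks like.
    floor = power.mean() * 10.0 ** (floor_db / 10.0)
    noise = floor * rng.exponential(1.0, size=power.shape)

    return np.sqrt(smoothed + noise)

# Toy input: one unit-magnitude bin per harmonic of F0, everything else zero.
fs, nfft, f0 = 8000, 512, 200.0
line_mag = np.zeros(nfft // 2 + 1)
harmonics = np.arange(1, int((fs / 2) // f0))
bins = np.round(harmonics * f0 * nfft / fs).astype(int)
line_mag[bins] = 1.0

filled = smooth_and_fill(line_mag)
```

Whether a fixed-width kernel is the right choice is an open question. The width probably ought to scale with F0 or with the Bark band so that neighbouring harmonics do not smear into each other.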