On Fri, 2024-04-19 at 13:29 +0200, Tomas Härdin wrote:
> Hi
>
> Following Bruce's email I took a look at
> https://github.com/drowe67/codec2/blob/main/doc/codec2.pdf to see how
> the speech synthesis in codec2 actually works. This is because I
> suspect that NN synth (LPCnet?) is doing something similar to what
> CELT (one half of Opus) does, namely filling off-peak parts of the
> spectrum with noise.
>
> Filling the spectrum with noise makes the output sound a lot less
> robot-y, and robot-y sound is known in the CELT world to be due to
> collapsing all energy into a single spectral bin per Bark band which,
> looking at equation (10), is *precisely what the codec2 synth does*!
> Or, sort of. For low F0 there may be multiple peaks in some bands.
>
> One way to make the output less robot-y could be to convolve Ŝw with
> a suitable function that smooths out the peaks. In CELT this is done
> via clever use of rotations, if I remember correctly. A floor of
> comfort noise might also be a good idea.
>
> Oh, and on the topic of errors in the paper, there's a spelling
> mistake on page 2: anlaysed.

Replying to myself here to say that I tried this with 700C, setting
off-peak entries in the spectrum to fractions of the peak amplitudes.
This sadly didn't seem to help much.

But it did give me an idea: what if 700C had unique codebooks for each
fundamental frequency, each entry being of size FFT_DEC? That way the
"average" spectral shape is preserved, encoded in the tables. This
blows up the size of the binary of course (k=20 -> k=512 and 64x the
tables), and probably requires a larger corpus to prevent overfitting,
but it wouldn't require changing the bitstream.

An "in-between" solution could be to have just one high-resolution
table (+ refinement) and then scale its entries according to Wo. That
way you're dealing with 2 MiB of tables rather than 128 MiB.

/Tomas

_______________________________________________
Freetel-codec2 mailing list
Freetel-codec2@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/freetel-codec2
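
For anyone who wants to experiment with the off-peak fill described above, here is a minimal Python sketch of the idea. It is not the code used in the 700C experiment and does not touch the codec2 sources. The function name fill_off_peak, the nfft default of 512 (standing in for FFT_DEC) and the fraction of 0.25 are all assumptions made for illustration. Each harmonic amplitude is placed on its nearest FFT bin, and every other bin gets a fixed fraction of the nearest peak's amplitude.

```python
import math


def fill_off_peak(amps, wo, nfft=512, fraction=0.25):
    """Build a one-sided magnitude spectrum from harmonic amplitudes.

    amps     -- amplitudes of harmonics 1..L (hypothetical input layout)
    wo       -- fundamental frequency in radians per sample
    nfft     -- FFT size (assumed 512 here, standing in for FFT_DEC)
    fraction -- share of the nearest peak given to off-peak bins (assumed)
    """
    nbins = nfft // 2 + 1
    spectrum = [0.0] * nbins

    # Map harmonic m to its nearest FFT bin, dropping any above Nyquist.
    peaks = []
    for m, amp in enumerate(amps, start=1):
        b = round(m * wo * nfft / (2.0 * math.pi))
        if b < nbins:
            peaks.append((b, amp))
    if not peaks:
        return spectrum

    for k in range(nbins):
        # Nearest peak wins; ties go to the lower harmonic.
        b, amp = min(peaks, key=lambda p: abs(p[0] - k))
        spectrum[k] = amp if b == k else fraction * amp
    return spectrum


if __name__ == "__main__":
    wo = 2.0 * math.pi * 8 / 512  # harmonics land on bins 8, 16, ...
    s = fill_off_peak([1.0, 0.5], wo)
    print(s[8], s[16], s[13])
```

A smoother variant would replace the flat fraction with a weight that decays with distance from the peak, which is closer to the convolution idea in the quoted mail.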