On Fri, 2024-04-19 at 13:29 +0200, Tomas Härdin wrote:
> Hi
> 
> Following Bruce's email I took a look at
> https://github.com/drowe67/codec2/blob/main/doc/codec2.pdf to see how
> the speech synthesis in codec2 actually works. This is because I
> suspect that NN synth (LPCnet?) is doing something similar to what
> CELT (one half of Opus) does, namely filling off-peak parts of the
> spectrum with noise.
> 
> Filling the spectrum with noise makes the output sound a lot less
> robot-y, and robot-y sound is known in the CELT world to be due to
> collapsing all energy into a single spectral bin per Bark band which,
> looking at equation (10), is *precisely what the codec2 synth does*!
> Or, sort of. For low F0 there may be multiple peaks in some bands.
> 
> One way to make the output less robot-y could be to convolve Ŝw with
> a suitable function that smooths out the peaks. In CELT this is done
> via clever use of rotations if I remember correctly. A floor of
> comfort noise might also be a good idea.
> 
> Oh and on the topic of errors in the paper, there's a spelling
> mistake on page 2: anlaysed.

Replying to myself here to say that I tried this with 700C, setting
off-peak entries in the spectrum to fractions of the peak amplitudes.
This sadly didn't seem to help much. But it did give me an idea: what
if 700C had unique codebooks for each fundamental frequency, each entry
being of size FFT_DEC? That way the "average" spectral shape is
preserved, encoded in the tables. This blows up the size of the binary
of course (k=20 -> k=512 and 64x the tables), and probably requires a
larger corpus to prevent overfitting, but it wouldn't require changing
the bitstream.
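
To make the off-peak fill concrete, here is a minimal sketch of that kind
of operation on a toy magnitude spectrum. It is not the actual patch to
700C: the function name fill_off_peak, the "nearest peak" rule and the
fraction 0.25 are all assumptions picked for illustration.

```python
def fill_off_peak(mags, fraction=0.25):
    """Return a copy of mags where every empty (off-peak) bin is set to
    `fraction` times the amplitude of the nearest non-empty (peak) bin.

    Hypothetical helper: the nearest-peak rule is an assumption, not
    something taken from codec2.
    """
    peaks = [i for i, a in enumerate(mags) if a > 0.0]
    if not peaks:
        return list(mags)  # nothing to copy from
    out = list(mags)
    for i, a in enumerate(mags):
        if a > 0.0:
            continue  # keep the harmonic peaks untouched
        # On a tie, min() keeps the first (lower-frequency) peak.
        nearest = min(peaks, key=lambda p: abs(p - i))
        out[i] = fraction * mags[nearest]
    return out


# Toy spectrum: harmonics at bins 2 and 6, everything else empty.
spectrum = [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.5, 0.0]
filled = fill_off_peak(spectrum, fraction=0.25)
print(filled)
```

A real version would work on the FFT_DEC-sized synthesis spectrum and
would probably want random phases for the filled bins, which this sketch
leaves out.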

An "in-between" solution could be to have just one high-resolution table
(+ refinement) and then scale its entries according to Wo. That way
you're dealing with 2 MiB of tables rather than 128 MiB.
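
The table sizes can be checked with a few lines of arithmetic. This is a
sketch under assumptions that are not spelled out above: a two-stage
codebook with 512 entries per stage, values stored as 4-byte floats, and
64 distinct fundamental frequencies. The helper name table_bytes is
hypothetical. These particular numbers are chosen because they reproduce
the 2 MiB and 128 MiB figures.

```python
MIB = 1024 * 1024


def table_bytes(entry_len, entries=512, stages=2, bytes_per_value=4, tables=1):
    """Total codebook storage in bytes.

    All defaults are assumptions (two stages, 512 entries each, 32-bit
    floats), not values read out of the codec2 sources.
    """
    return tables * stages * entries * entry_len * bytes_per_value


current = table_bytes(entry_len=20)             # k=20, one table
one_hires = table_bytes(entry_len=512)          # k=512, one table
per_f0 = table_bytes(entry_len=512, tables=64)  # k=512, one table per F0

print(current / 1024, "KiB")
print(one_hires / MIB, "MiB")
print(per_f0 / MIB, "MiB")
```

Under these assumptions the existing k=20 tables come to 80 KiB, the
single high-resolution table to 2 MiB, and the per-F0 variant to 128 MiB.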

/Tomas


_______________________________________________
Freetel-codec2 mailing list
Freetel-codec2@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/freetel-codec2
