Hi Sumek,
Wow you are getting into the hard core DSP - well done :-)
1) Codec 2 uses the sinusoidal model to synthesise the speech, i.e. many
sine wave oscillators, each with their own amplitude A(m).
So we need to estimate the amplitude of each sine wave from the LPC
coefficients, which describe an all-pole filter. We do this by sampling
the all-pole LPC spectrum in the frequency domain using aks_to_M2().
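
As a rough illustration of that sampling step, here is a small Python
sketch (not the Codec 2 C code). The function names, the convention
A(z) = 1 + sum of a_k z^-k, and the energy term are assumptions made
for this example. It evaluates the all-pole power spectrum at each
harmonic m*w0 and takes the square root to get an amplitude:

```python
import cmath
import math

def lpc_power_spectrum(aks, energy, w):
    """Power of the all-pole filter at frequency w (radians/sample).

    Assumed convention: aks = [a_1 .. a_p], A(z) = 1 + sum a_k z^-k,
    and P(w) = energy / |A(e^{jw})|^2.
    """
    A = 1.0 + sum(a * cmath.exp(-1j * w * k)
                  for k, a in enumerate(aks, start=1))
    return energy / abs(A) ** 2

def sample_amplitudes(aks, energy, w0):
    """One amplitude per harmonic m*w0 below pi, by direct sampling."""
    L = int(math.pi / w0)  # number of harmonics that fit below pi
    return [math.sqrt(lpc_power_spectrum(aks, energy, m * w0))
            for m in range(1, L + 1)]

# Hypothetical example: a single pole at z = 0.9 (low-pass shape),
# fundamental w0 = 0.3 rad/sample.
amps = sample_amplitudes([-0.9], 1.0, 0.3)
```

The time domain alternative would drive the same filter with an
excitation signal; the sketch above only reads the filter's magnitude
response at the harmonic frequencies.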
As you suggest, other synthesis methods are possible (e.g. time domain
synthesis using the LPC synthesis filter directly) - and would make an
interesting masters or honors project.
IIRC there is a chapter/section/plot in the thesis that explains how we
get A(m) from the all-pole LPC spectrum by averaging over a band rather
than by direct sampling.
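
The band idea can be sketched as follows, again in Python rather than
the real C code. This is a sketch under assumptions: the names, the
512 point grid, and the band edges at (m - 0.5)*w0 and (m + 0.5)*w0
are chosen for the example. It sums the power in the band around each
harmonic, so the result scales with the band width; divide by the
number of bins if you want a true average.

```python
import cmath
import math

def power_spectrum_grid(aks, energy, nfft=512):
    """energy / |A(e^{jw})|^2 at w = 2*pi*i/nfft, i = 0 .. nfft/2 - 1.

    Assumed convention: A(z) = 1 + sum a_k z^-k with aks = [a_1 .. a_p].
    """
    grid = []
    for i in range(nfft // 2):
        w = 2.0 * math.pi * i / nfft
        A = 1.0 + sum(a * cmath.exp(-1j * w * k)
                      for k, a in enumerate(aks, start=1))
        grid.append(energy / abs(A) ** 2)
    return grid

def band_amplitudes(aks, energy, w0, nfft=512):
    """Amplitude per harmonic from the power summed over its band."""
    Pw = power_spectrum_grid(aks, energy, nfft)
    r = 2.0 * math.pi / nfft           # radians per grid bin
    L = int(math.pi / w0)              # number of harmonics below pi
    amps = []
    for m in range(1, L + 1):
        lo = int((m - 0.5) * w0 / r + 0.5)                 # first bin in band
        hi = min(int((m + 0.5) * w0 / r + 0.5), len(Pw))   # one past last bin
        amps.append(math.sqrt(sum(Pw[lo:hi])))
    return amps

# Hypothetical check: a flat spectrum (no LPC coefficients) with
# 8 grid bins per harmonic gives sqrt(8) for the first harmonic.
flat = band_amplitudes([], 1.0, math.pi / 32)
```

Compared with reading a single point, using the whole band makes the
amplitude less sensitive to exactly where a harmonic falls relative to
a sharp formant peak.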
Also some blog posts around 2009:
http://www.rowetel.com/?p=130
BTW I have just added an archive to my blog:
http://www.rowetel.com/?page_id=6172
2) I have found the post filter is very important to speech quality,
here is a blog post:
http://www.rowetel.com/?p=2661
If you install the FreeDV GUI program, and run it in loop back in the
FreeDV 1600 mode, you can adjust the post filter in real time at the
top of the Tools-Filter menu.
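
To give a feel for what a post filter of this kind does, here is a
rough Python sketch of a frequency domain LPC post filter that deepens
the valleys between formants and then restores the original energy. It
is an illustration only, not a copy of lpc_post_filter(); the function
names, the 512 point grid, and the beta and gamma values are
assumptions chosen for the example.

```python
import cmath
import math

def a_mag_sq(aks, w, gamma=1.0):
    """|A(e^{jw})|^2 with each a_k scaled by gamma**k.

    gamma < 1 moves the roots toward the origin (bandwidth expansion).
    Assumed convention: A(z) = 1 + sum a_k z^-k.
    """
    A = 1.0 + sum(a * gamma ** k * cmath.exp(-1j * w * k)
                  for k, a in enumerate(aks, start=1))
    return abs(A) ** 2

def post_filter(aks, beta=0.2, gamma=0.5, nfft=512):
    """Return (before, after) power spectra on a half-spectrum grid."""
    ws = [2.0 * math.pi * i / nfft for i in range(nfft // 2)]
    before = [1.0 / a_mag_sq(aks, w) for w in ws]
    shaped = []
    for w, p in zip(ws, before):
        # R(w) = |A(e^{jw}/gamma)| / |A(e^{jw})| is large near formants
        Rw = math.sqrt(a_mag_sq(aks, w, gamma) * p)
        shaped.append(p * (Rw ** beta) ** 2)
    # Scale so the total energy matches the unfiltered spectrum
    gain = sum(before) / sum(shaped)
    return before, [gain * p for p in shaped]

# Hypothetical single formant: pole pair at radius 0.95, angle pi/4.
r, theta = 0.95, math.pi / 4
aks = [-2.0 * r * math.cos(theta), r * r]
before, after = post_filter(aks)
```

With these assumed settings the peak to valley ratio of the spectrum
goes up while the total energy stays the same, which is the general
effect you can listen for when moving the controls.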
On Codec 2 700C (used in the FreeDV 700D mode), a post filter is the key
to the (relatively) high quality at such low bit rates. The way it
works is a little mysterious - and also a good topic for research.
Thomas, a Masters student, recently contacted me and is doing some work
in this area. He may like to comment :-)
Cheers,
David
On 06/06/18 04:06, sumek.wi via Freetel-codec2 wrote:
Hi,
I am a noob in voice coding. Now I want to do a project about Codec2, so
I went over some theories of voice coding, LPC etc. Also I downloaded
the reference C-code and David's Thesis.
There are some points I don't understand, I hope you guys can help me on
these.
1. In the decoder block, I almost get the idea of how it works (thanks
to David's PhD thesis). However, I don't understand why, after we get
the LPC coefficients ak[i] from the LSP2LPC process, we have to use an
FFT (the aks_to_M2 function) on them. As far as I know, LPC
coefficients can synthesize speech by passing an excitation through an
all-pole filter, so I wonder why this algorithm applies an FFT directly
to ak[i] to get Aw[]. How do the LPC coefficients relate to the speech
power spectrum? If you guys know any good reference, please let me
know.
2. How important is lpc_post_filter()? Looking at the code, it is a
fairly computationally expensive process, with an FFT and power
estimation. I tried decoding with and without the post filter at
1200 bps, and I don't hear much difference. What are the criteria for
using or not using the post filter?
Thank you in advance for your help.
Sumek.
------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
_______________________________________________
Freetel-codec2 mailing list
Freetel-codec2@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/freetel-codec2