Thanks David and Gregory for explaining.
I'm very curious: if this works in combination with LDPC, where does it fit 
within Shannon-Hartley? It could be one heap of coding gain to be able to send 
8 kHz audio information on a 2 kbit/s channel.

Is this even the right way to categorize it?
I guess I'm a bit apprehensive because I don't understand the math behind it, 
unlike regular Codec2 where even if I'm not a specialist the principles are 
clear to me.
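
For reference, the Shannon-Hartley limit mentioned above is C = B * log2(1 + S/N). What follows is only a minimal sketch of that formula. The 2400 Hz channel bandwidth and the SNR values are illustrative assumptions that do not come from this thread, and "2 kbit" is read as 2 kbit/s.

```python
import math


def shannon_capacity_bps(bandwidth_hz, snr_db):
    """Shannon-Hartley capacity C = B * log2(1 + S/N), in bit/s."""
    snr_linear = 10 ** (snr_db / 10)
    return bandwidth_hz * math.log2(1 + snr_linear)


def min_snr_db(bit_rate_bps, bandwidth_hz):
    """SNR in dB at which the capacity equals bit_rate_bps."""
    return 10 * math.log10(2 ** (bit_rate_bps / bandwidth_hz) - 1)


# Assumption: an SSB-like channel width, chosen only for illustration.
BANDWIDTH_HZ = 2400.0
# Assumption: "2 kbit" in the message above read as 2 kbit/s.
BIT_RATE_BPS = 2000.0

if __name__ == "__main__":
    for snr_db in (0, 5, 10):  # illustrative SNR values
        capacity = shannon_capacity_bps(BANDWIDTH_HZ, snr_db)
        print(f"SNR {snr_db:2d} dB: capacity {capacity:7.0f} bit/s")
    limit = min_snr_db(BIT_RATE_BPS, BANDWIDTH_HZ)
    print(f"{BIT_RATE_BPS:.0f} bit/s needs at least {limit:.2f} dB SNR")
```

The bound only limits how many bits per second the channel can carry. How good the speech sounds at a given bit rate depends on the codec (source coding), which is a separate matter from the coding gain that a channel code such as LDPC provides.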

Best regards,
Adrian

On March 4, 2019 8:18:17 PM UTC, David Rowe <da...@rowetel.com> wrote:
>Hi Adrian,
>
>The idea is that if you use a large enough speech database to train, it
>then tends to work OK on general speech from outside the training
>database.  I used about 13 hours of speech for training.  It's available
>for free download.
>
>The samples on the blog post were from outside the training database, so
>it's working OK on general speech signals.  However, I imagine we'll find
>some corner cases where the speech is still a bit rough.
>
>As Gregory says, synthesis now runs on regular CPUs.  Those samples were
>generated on a 2012 model laptop (with AVX instruction set) at about 3x
>real time.
>
>All an end-user needs is a decent CPU (e.g. a modern smartphone or
>desktop/laptop) and a few MBytes of RAM (the synthesis executable is
>10 MBytes unstripped).
>
>Instructions for trying it are on the LPCNet page.
>
>- David
>
>On 05/03/19 04:22, Adrian Musceac wrote:
>> Hi David,
>> 
>> I read your article with interest. I'm curious about your efforts and
>> how it scales for a normal person as opposed to a huge internet company.
>> If I read the conclusions correctly, in order to achieve voice
>> faithfulness similar to 8 kHz PCM, it seems that one needs to have a huge
>> amount of training data to represent possible words and utterances. I
>> admit I know nothing of this neural network thing, but to me it looks
>> like there is no way to achieve similar quality for every possible
>> amateur radio person who might decide to use it unless all the other
>> users have recordings of the same thing he is trying to say.
>> Now this approach might work reasonably well for Wavenet since they
>> have access one way or another to a huge amount of voice samples to
>> train their network.
>> 
>> How does this scale for normal hams though? Does each of us need a huge
>> GPU and voice samples to even be able to use this? I am afraid that
>> this might have even less success with hams than current FreeDV
>> releases.
>> 
>> Best regards,
>> Adrian
>> 
>> 
>> _______________________________________________
>> Freetel-codec2 mailing list
>> Freetel-codec2@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/freetel-codec2
>> 
>
>