Thanks David and Gregory for explaining. I'm very curious: if this works in combination with LDPC, where does it fit within the Shannon-Hartley limit? It would be one heap of coding gain to be able to send 8 kHz audio information over a 2 kbit/s channel.
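[One way to frame the numbers: Shannon-Hartley bounds the bit rate the *channel* can carry, while getting speech down to 2 kbit/s is the vocoder's job (source coding); LDPC is channel coding that helps approach the bound. A quick sanity-check sketch, with a hypothetical 2.5 kHz SSB-style channel and illustrative SNR values (neither is specified in the thread):]

```python
import math

def shannon_capacity_bps(bandwidth_hz: float, snr_db: float) -> float:
    """Shannon-Hartley limit: C = B * log2(1 + S/N)."""
    snr_linear = 10 ** (snr_db / 10)
    return bandwidth_hz * math.log2(1 + snr_linear)

# Hypothetical 2.5 kHz voice channel at a few illustrative SNRs.
# At 0 dB (S/N = 1), log2(2) = 1, so capacity is exactly 2500 bit/s:
# a 2 kbit/s vocoder stream fits under the bound around 0 dB SNR.
for snr_db in (-3, 0, 3, 10):
    c = shannon_capacity_bps(2500, snr_db)
    print(f"SNR {snr_db:+3d} dB -> capacity ~{c / 1000:.2f} kbit/s")
```

[So under these assumed numbers, the channel itself can carry ~2 kbit/s near 0 dB SNR; the 32:1 reduction from 64 kbit/s PCM (8 kHz, 8-bit) to 2 kbit/s comes entirely from the codec, not from coding gain in the Shannon sense.]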
Is this even the right way to categorize it? I guess I'm a bit apprehensive because I don't understand the math behind it, unlike regular Codec2, where even if I'm not a specialist the principles are clear to me.

Best regards,
Adrian

On March 4, 2019 8:18:17 PM UTC, David Rowe <da...@rowetel.com> wrote:
>Hi Adrian,
>
>The idea is that if you use a large enough speech database to train, it
>then tends to work OK on general speech from outside the training
>database. I used about 13 hours of speech for training. It's available
>for free download.
>
>The samples on the blog post were from outside the training database, so
>it's working OK on general speech signals. However I imagine we'll find
>some corner cases where the speech is still a bit rough.
>
>As Gregory says, synthesis now runs on regular CPUs. Those samples were
>generated on a 2012 model laptop (with AVX instruction set) at about 3x
>real time.
>
>All an end-user needs is a decent CPU (e.g. a modern smartphone or
>desktop/laptop) and a few MBytes of RAM (the synthesis executable is
>10 Mbytes unstripped).
>
>Instructions for trying it are on the LPCNet page.
>
>- David
>
>On 05/03/19 04:22, Adrian Musceac wrote:
>> Hi David,
>>
>> I read your article with interest. I'm curious about your efforts and
>> how they scale for a normal person as opposed to a huge internet company.
>> If I read the conclusions correctly, in order to achieve voice
>> faithfulness similar to 8 kHz PCM, it seems that one needs a huge
>> amount of training data to represent possible words and utterances. I
>> admit I know nothing of this neural network thing, but to me it looks
>> like there is no way to achieve similar quality for every possible
>> amateur radio person who might decide to use it unless all the other
>> users have recordings of the same thing he is trying to say.
>>
>> Now this approach might work reasonably well for Wavenet, since they
>> have access one way or another to a huge amount of voice samples to
>> train their network.
>>
>> How does this scale for normal hams, though? Does each of us need a huge
>> GPU and voice samples to even be able to use this? I am afraid that
>> this might have even less success with hams than current FreeDV
>> releases.
>>
>> Best regards,
>> Adrian
>>
>>
>> _______________________________________________
>> Freetel-codec2 mailing list
>> Freetel-codec2@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/freetel-codec2