Hello Jeroen,

The experimental quantisation I'm working on is flexible: you can trade off the number of VQ stages against the frame rate (10, 20, 30, ... ms).
In order to get a FreeDV mode on the air, I've settled on 52 bits every 30 ms (1733 bits/s). You'll need synchronisation and maybe FEC on top of that. The work-in-progress code is here: https://github.com/mozilla/LPCNet/tree/dr_exp_quant

- David

On 25/02/19 03:31, Jeroen Vreeken wrote:
> Hi David,
>
> How 'fixed' is the 'around 2000 bits/s' number? And do you have some
> idea of the frame size you are going to use (e.g. 40 ms or something
> different)?
> I really liked the sound of the new modes and would like to test them on
> UHF with a mode based on 2400B.
> If we take the current 2400B frame and strip the padding (not needed if
> not doing TDMA) and protocol bits (can be done in the data channel with
> the alternate UW), you get 80 bits of useful data per 40 ms frame, which
> translates nicely to 2000 bits/s.
>
> If you end up on something different it can probably still be done by
> changing the framing some more (e.g. a larger frame than just 96 bits). In
> that case I would like to prepare the fmfsk code and framing code for it
> and start testing a bit.
>
> Regards,
> Jeroen
>
> On 02/24/2019 05:08 AM, David Rowe wrote:
>> Hi Mike,
>>
>> Unfortunately the masking model work didn't lead to a viable Codec 2 mode.
>>
>> I think an LPCNet codec might suit your application well; I'll have a
>> release in the next few weeks for you to play with (around 2000 bits/s).
>>
>> Jean-Marc Valin (the author) has ported the code to C to run on
>> general purpose CPUs, and we have done some optimisation for
>> ARM NEON. It's real time on a modern smart phone, and has scope for
>> further optimisation (help wanted here!).
>>
>> You don't need any special libraries, and it doesn't (really) need
>> training for specific speakers, although you could re-train if you
>> wanted to.
>>
>> Cheers,
>> David
>>
>> On 23/02/19 19:40, Mike Dawson wrote:
>>> Hi Codec2 list,
>>>
>>> I'm working together with Samih, looking at shrinking Khan Academy and
>>> other educational content for our offline library app. I've been trying
>>> to figure out optimal codec2 encoding/decoding parameters.
>>>
>>> We know who the speaker is in each clip. As far as I can understand, the
>>> best approach for us to achieve optimal results with a fixed speaker
>>> set, with access to the original, would be using the masking model
>>> outlined here: https://www.rowetel.com/?p=4454. Is this masking model
>>> per speaker, or per clip?
>>>
>>> I haven't managed to get the masking model running yet, but I made a
>>> basic script
>>> (https://gist.github.com/mikedawson/1d66a1d35bd1538b2a9950246ef061a2) to
>>> generate comparison tables using a basket of clips and different
>>> parameter combinations. The audio from 4 Khan Academy clips with
>>> different codec2 settings is here:
>>>
>>> https://www.ustadmobile.com/files/codec2/out/
>>>
>>> Using VP9 compression, the video in a 3.5 min clip can be shrunk to just
>>> under 100 kB. If we used 2.4 kbps codec2 for the audio, we could get the
>>> audio to around 70 kB. As there are around 15,000 videos (in English
>>> alone), codec2 could save a huge amount of space and bandwidth. That
>>> makes it around 60-70% smaller than the smallest 'mobile friendly' mp4
>>> version from Khan Academy.
>>>
>>> On the LPCNet topic: this is definitely interesting, but will need
>>> further investigation. The examples from the masking model sounded
>>> pretty good. One obstacle I can see is the size of the training file.
>>> The app has to work offline and we have to keep the app size itself as
>>> small as possible. Perhaps with a limited speaker set, and no need to
>>> work on untrained files, this would not be so bad. We would also need to
>>> get the model to work with TensorFlow Lite.
>>> Finally, in many places
>>> where low bandwidth and device space are an issue, the phones themselves
>>> often have limited capacity (Android 4.4 is still very much alive).
>>>
>>> Any further suggestions on what would be the current recommended/optimal
>>> approach for a fixed set of speakers would be much appreciated!
>>> We're very excited about the potential of this to make this educational
>>> content more accessible.
>>>
>>> Thanks!
>>>
>>> -Mike
>>>
>>> CEO
>>> Ustad Mobile
>>>
>>> Email: m...@ustadmobile.com
>>> Web: www.ustadmobile.com
>>> Twitter: @ustadmobile
>>> Facebook: www.facebook.com/Ustad.Mobile
>>>
>>> _______________________________________________
>>> Freetel-codec2 mailing list
>>> Freetel-codec2@lists.sourceforge.net
>>> https://lists.sourceforge.net/lists/listinfo/freetel-codec2
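
[Editor's note: the bit-budget arithmetic discussed in this thread can be checked with a few lines of Python. None of the figures below are new; the helper function is only for illustration, and the storage figure is a back-of-envelope check of Mike's estimate.]

```python
def bits_per_second(bits_per_frame, frame_ms):
    """Payload rate implied by a given frame size and frame period."""
    return bits_per_frame * 1000 / frame_ms

# David's experimental LPCNet quantiser: 52 bits every 30 ms
print(bits_per_second(52, 30))   # ~1733.3 bit/s

# Jeroen's stripped-down 2400B framing: 80 useful bits every 40 ms
print(bits_per_second(80, 40))   # 2000.0 bit/s

# Mike's storage estimate: 2400 bit/s Codec 2 audio for a 3.5 minute clip
clip_seconds = 3.5 * 60
audio_kB = 2400 * clip_seconds / 8 / 1000
print(round(audio_kB))           # 63 kB, consistent with the "around 70 kB" figure
```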