Hello Jeroen,

The experimental quantisation I'm working on is flexible: you can trade
off the number of VQ stages against the frame rate (10, 20, 30 ... ms).

In order to get a FreeDV mode on the air, I've settled on 52 bits every
30ms (1733 bits/s).  You'll need synchronisation and maybe FEC on top of
that.
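The bit-rate figure above is just payload bits divided by frame period. A
minimal sketch of that arithmetic; the 52-bit / 30 ms payload is the only
figure taken from this mail, other frame sizes would plug in the same way:

```python
# Bit rate as a function of frame payload and frame period.
def bit_rate(bits_per_frame, frame_ms):
    """Return the bit rate in bits/s for a given payload and frame period."""
    return bits_per_frame * 1000 / frame_ms

# The proposed FreeDV payload: 52 bits every 30 ms.
print(round(bit_rate(52, 30)))  # -> 1733 bit/s
```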

The work-in-progress code is here:

  https://github.com/mozilla/LPCNet/tree/dr_exp_quant

- David

On 25/02/19 03:31, Jeroen Vreeken wrote:
> Hi David,
> 
> How 'fixed' is the 'around 2000 bits/s' number? And do you have some
> idea of the frame size you are going to use?
> (e.g. 40ms or something different?)
> I really liked the sound of the new modes and would like to test them on
> UHF with a mode based on 2400B.
> If we take the current 2400B frame and strip the padding (not needed if
> not doing TDMA) and protocol bits (which can be done in the data channel
> with the alternate UW), you get 80 bits of useful data per 40ms frame,
> which translates nicely to 2000 bits/s.
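A quick sketch of the framing arithmetic above. The 96-bit frame, 80-bit
payload and 40 ms period are from the mail; the 16-bit overhead is simply
inferred as 96 - 80, not taken from the 2400B spec:

```python
# Strip padding + protocol bits from the 2400B frame, then convert the
# remaining payload to a bit rate.
FRAME_BITS = 96     # current 2400B frame size
PAYLOAD_BITS = 80   # useful data after removing padding/protocol bits
FRAME_MS = 40       # frame period

overhead = FRAME_BITS - PAYLOAD_BITS    # 16 bits per frame (inferred)
rate = PAYLOAD_BITS * 1000 // FRAME_MS  # payload bit rate in bits/s
print(rate)  # -> 2000
```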
> 
> If you end up on something different it can probably still be done by
> changing the framing some more (e.g. larger frame than just 96 bits). In
> that case I would like to prepare the fmfsk code and framing code for it
> and start testing a bit.
> 
> Regards,
> Jeroen
> 
> On 02/24/2019 05:08 AM, David Rowe wrote:
>> Hi Mike,
>>
>> Unfortunately the masking model work didn't lead to a viable Codec 2 mode.
>>
>> I think a LPCNet codec might suit your application well, I'll have a
>> release in the next few weeks for you to play with (around 2000 bits/s).
>>
>> Jean-Marc Valin (the author) has ported the code to C so it runs on
>> general-purpose CPUs, and we have done some optimisation for
>> ARM NEON.  It's real time on a modern smartphone, and has scope for
>> further optimisation (help wanted here!)
>>
>> You don't need any special libraries, and it doesn't (really) need
>> training for specific speakers, although you could re-train if you
>> wanted to.
>>
>> Cheers,
>> David
>>
>> On 23/02/19 19:40, Mike Dawson wrote:
>>> Hi Codec2 list,
>>>
>>> I'm working together with Samih, looking at shrinking Khan Academy and
>>> other educational content for our offline library app. I've been trying
>>> to figure out optimal codec2 encoding / decoding parameters.
>>>
>>> We know who the speaker is in each clip. As far as I can understand,
>>> the best approach for us to achieve optimal results with a fixed
>>> speaker set, given that we have access to the original audio, would be
>>> to use the masking model outlined here: https://www.rowetel.com/?p=4454.
>>> Is this masking model per speaker, or per clip?
>>>
>>> I haven't managed to get the masking model running yet, but I made a
>>> basic script (
>>> https://gist.github.com/mikedawson/1d66a1d35bd1538b2a9950246ef061a2 ) to
>>> generate comparison tables using a basket of clips and different
>>> parameter combinations. The audio from 4 Khan Academy clips with
>>> different codec2 settings is here:
>>>
>>> https://www.ustadmobile.com/files/codec2/out/
>>>
>>> Using VP9 compression, the video in a 3.5 min clip can be shrunk to just
>>> under 100kB. If we used 2.4kbps codec2 for the audio, we could get the
>>> audio to around 70kB. As there are around 15,000 videos (only in
>>> English), codec2 could save a huge amount of space and bandwidth. That
>>> makes it around 60-70% smaller than the smallest 'mobile friendly' mp4
>>> version from Khan Academy.
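A back-of-envelope check of the audio size estimate above. The clip length
and bit rate are from the mail; the ~70kB figure presumably adds some
framing or container overhead on top of the raw payload computed here:

```python
# Raw Codec 2 payload for a 3.5 minute clip at 2400 bit/s.
CLIP_SECONDS = 3.5 * 60   # 210 s
BIT_RATE = 2400           # bits per second

raw_kb = CLIP_SECONDS * BIT_RATE / 8 / 1000  # bits -> bytes -> kB
print(f"{raw_kb:.0f} kB")  # -> 63 kB raw payload
```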
>>>
>>> On the LPCNet topic: this is definitely interesting, but will need
>>> further investigation. The examples from the masking model sounded
>>> pretty good. One obstacle I can see is the size of the training file.
>>> The app has to work offline and we have to keep the app size itself as
>>> small as possible. Perhaps with a limited speaker set, and no need to
>>> work on untrained files, this would not be so bad. We would also need to
>>> get the model to work with TensorFlow Lite. Finally, in many places
>>> where low bandwidth and device storage are an issue, the phones
>>> themselves often have limited capacity (Android 4.4 is still very much
>>> alive).
>>>
>>> Any further suggestion on what would be the current recommended /
>>> optimal approach for a fixed set of speakers would be much appreciated!
>>> We're very excited about the potential of this to make this education
>>> content more accessible.
>>>
>>> Thanks!
>>>
>>> -Mike
>>>
>>> CEO
>>> Ustad Mobile
>>>
>>> Email: m...@ustadmobile.com
>>> Web: www.ustadmobile.com
>>> Twitter: @ustadmobile
>>> Facebook: www.facebook.com/Ustad.Mobile
>>>
>>>
>>> _______________________________________________
>>> Freetel-codec2 mailing list
>>> Freetel-codec2@lists.sourceforge.net
>>> https://lists.sourceforge.net/lists/listinfo/freetel-codec2
>>>
>>
> 
> 
> 
> 

