On 12/09/2021 03:33, Greg Maxwell wrote:
https://speechbot.github.io/resynthesis/

https://ai.facebook.com/blog/textless-nlp-generating-expressive-speech-from-raw-audio/

The 365 bps figure is not entirely comparable to more
traditional codecs, because it presumes a per-speaker
embedding is sent once up front.
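If I follow that correctly, the one-time embedding cost gets amortized over the session, so the effective rate depends on how long you talk. A rough back-of-the-envelope sketch (the embedding size below is just an assumed placeholder, not a figure from the paper or blog post):

  # How a one-time speaker embedding shifts the effective bitrate
  # relative to the steady-state 365 bps figure.
  def effective_bps(steady_bps, embedding_bits, duration_s):
      """Steady-state rate plus the one-time embedding amortized over the call."""
      return steady_bps + embedding_bits / duration_s

  STEADY_BPS = 365          # quoted steady-state figure
  EMBEDDING_BITS = 8 * 256  # assumption: 256-byte speaker embedding, sent once

  for seconds in (10, 60, 600):
      rate = effective_bps(STEADY_BPS, EMBEDDING_BITS, seconds)
      print("%4d s call: %7.1f bps effective" % (seconds, rate))

For short transmissions the embedding dominates; for long ones it washes out, which is presumably why the comparison only holds under that "sent once" assumption.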

Does that embedding data also depend on the language the speaker is using? That is, will new data be required if the speaker changes languages? I've encountered this issue before with codecs that require up-front voice parameters, especially with tonal languages such as Cantonese.

Regards,

Steve
