On Mon, 2021-09-13 at 07:24 +0000, Greg Maxwell wrote:
> On Mon, Sep 13, 2021 at 7:05 AM Random via Freetel-codec2
> <freetel-codec2@lists.sourceforge.net> wrote:
> > Is it speaker-independent ?
> 
> It's speaker independent with the additional per-speaker data
> mentioned in my post.
> 

That sounds like speaker dependence to me.

I encountered this with the early LPCNet work as well (as used in
FreeDV 2020): the quality dropped off significantly for about 10% of
voices (including mine!).  However, I haven't tried the latest version
of LPCNet from Jean-Marc; he's been steadily improving his NN model and
codec.

The Lyra paper mentions some specific work in this area, so I'm sure it
will be addressed in time.  High quality, speaker-independent speech
coding at sub 1000 bit/s certainly feels possible.

Another issue to address is robustness to bit errors.  In Codec 2 I
avoid inter-frame coding (i.e. coding differences between frames) to
keep some tolerance to high bit error rates.  This costs a few bits/s
compared to a super efficient approach.
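
To illustrate why coding differences hurts under bit errors, here is a
toy sketch in Python.  It is not Codec 2 code: the frame values, the
function names and the one-integer-per-frame "parameter" are all made up
for the example.  It flips a single bit in the third frame and counts
how many decoded frames come out wrong with each scheme.

```python
def encode_differential(values):
    """Send each frame as the difference from the previous frame."""
    prev = 0
    codes = []
    for v in values:
        codes.append(v - prev)
        prev = v
    return codes


def decode_differential(codes):
    """Rebuild frames by accumulating the received differences."""
    prev = 0
    out = []
    for c in codes:
        prev += c
        out.append(prev)
    return out


def flip_bit(codes, index, bit):
    """Return a copy of codes with one bit flipped in one frame."""
    corrupted = list(codes)
    corrupted[index] ^= (1 << bit)
    return corrupted


def count_wrong_frames(original, decoded):
    """Number of frames whose decoded value differs from the original."""
    return sum(1 for a, b in zip(original, decoded) if a != b)


# Hypothetical quantised parameter, one integer per frame.
frames = [20, 22, 23, 23, 21, 19, 18, 18]

# Absolute coding: each frame is sent as-is, so the codes are the frames.
abs_rx = flip_bit(frames, 2, 1)

# Differential coding: the same single bit error hits the third code.
diff_rx = decode_differential(flip_bit(encode_differential(frames), 2, 1))

print("absolute:    ", count_wrong_frames(frames, abs_rx), "frame(s) wrong")
print("differential:", count_wrong_frames(frames, diff_rx), "frame(s) wrong")
```

With absolute coding the error stays in the frame it hit.  With
differential coding every frame after the error is also wrong, until
something resets the decoder state.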

I figure tolerance to bit errors might be something we can train for in
NN codecs.
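
One way this might be approached (an assumption on my part, not
something taken from LPCNet or Lyra) is to corrupt the quantised bits
with random errors before they reach the decoder network during
training.  A minimal sketch of the corruption step, with a hypothetical
helper name and a bit error rate chosen only for the example:

```python
import random


def inject_bit_errors(bits, ber, rng):
    """Flip each bit independently with probability ber.

    bits: list of 0/1 ints for one encoded frame (hypothetical layout)
    ber:  bit error rate, between 0.0 and 1.0
    rng:  a random.Random instance, so runs can be made repeatable
    """
    return [b ^ 1 if rng.random() < ber else b for b in bits]


rng = random.Random(1234)
frame_bits = [1, 0, 1, 1, 0, 0, 1, 0]

# During training the decoder would be fed noisy_bits instead of
# frame_bits, so it learns to cope with channel errors.
noisy_bits = inject_bit_errors(frame_bits, 0.05, rng)
print(frame_bits)
print(noisy_bits)
```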

- David




_______________________________________________
Freetel-codec2 mailing list
Freetel-codec2@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/freetel-codec2
