On Mon, 2021-09-13 at 07:24 +0000, Greg Maxwell wrote:
> On Mon, Sep 13, 2021 at 7:05 AM Random via Freetel-codec2
> <freetel-codec2@lists.sourceforge.net> wrote:
> > Is it speaker-independent ?
>
> It's speaker independent with the additional per-speaker data
> mentioned in my post.
>
That sounds like speaker dependence to me. I encountered this with the early LPCNet work as well (as used in FreeDV 2020): the quality dropped off significantly for about 10% of voices (including mine!). However, I haven't tried the latest version of LPCNet from Jean-Marc; he's been steadily improving his NN model and codec. The Lyra paper mentions some specific work in this area, so I'm sure it will be addressed in time. High quality, speaker-independent speech coding at sub-1000 bit/s certainly feels possible.

Another issue to address is robustness to bit errors. In Codec 2 I avoid inter-frame coding (i.e. coding differences between frames) to keep some tolerance to high bit error rates. This costs a few bits/s compared to a super efficient approach. I figure tolerance to bit errors might be something we can train for in NN codecs.

- David

_______________________________________________
Freetel-codec2 mailing list
Freetel-codec2@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/freetel-codec2
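
As a rough illustration of the inter-frame coding point above, here is a toy Python sketch. It is not Codec 2's quantiser: the parameter values, the function names and the single flipped bit are all made up for the example. It only shows why coding differences between frames saves bits but lets one bit error spread to every later frame, while coding each frame on its own confines the damage to that frame.

```python
# Toy model: one quantised parameter (an integer index) per frame.
# All values and names here are hypothetical, chosen only for illustration.

def encode_absolute(values):
    """Each frame carries its own value, so frames are independent."""
    return list(values)

def decode_absolute(codes):
    return list(codes)

def encode_differential(values):
    """First frame is absolute; later frames carry the change from the previous frame."""
    codes = [values[0]]
    for prev, cur in zip(values, values[1:]):
        codes.append(cur - prev)
    return codes

def decode_differential(codes):
    """Rebuild each frame by accumulating the received differences."""
    out = [codes[0]]
    for delta in codes[1:]:
        out.append(out[-1] + delta)
    return out

def flip_bit(codes, frame, bit):
    """Return a copy of codes with one bit flipped in one frame (a simulated channel error)."""
    damaged = list(codes)
    damaged[frame] ^= 1 << bit
    return damaged

def corrupted_frames(original, decoded):
    """Count frames whose decoded value differs from the original."""
    return sum(1 for a, b in zip(original, decoded) if a != b)

# A slowly rising parameter track over 10 frames (made-up data).
frames = [20, 21, 23, 24, 24, 26, 27, 29, 30, 31]

abs_codes = encode_absolute(frames)
diff_codes = encode_differential(frames)

# Bits needed with fixed-width fields: the differences are small, so they need fewer bits.
abs_bits = len(frames) * max(frames).bit_length()
diff_bits = frames[0].bit_length() + (len(frames) - 1) * max(diff_codes[1:]).bit_length()

# Same single-bit error (bit 1 of frame 3) applied to both streams.
abs_bad = corrupted_frames(frames, decode_absolute(flip_bit(abs_codes, 3, 1)))
diff_bad = corrupted_frames(frames, decode_differential(flip_bit(diff_codes, 3, 1)))

print("absolute:     bits =", abs_bits, " corrupted frames =", abs_bad)
print("differential: bits =", diff_bits, " corrupted frames =", diff_bad)
```

With these made-up values the differential stream needs fewer bits, but the single flipped bit damages every frame from the error onward, whereas the absolute stream loses only the one frame. A real differential scheme would usually limit this with periodic absolute frames, which is a design choice not covered here.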