The problem is that all the so-called human voices are spliced together from 
syllables and word fragments. So you get emphasis on the wrong parts of the 
sentence, pauses in the wrong places, and so on. If they devoted more 
machine-learning effort to proper text-to-speech rendering instead of 
censorship and other nonsense, we might get somewhere.
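To illustrate what I mean, here is a rough sketch only (not how any particular 
engine actually implements it): naive concatenative synthesis amounts to gluing 
pre-recorded fragments end to end, so whatever stress and timing happen to be 
baked into each fragment win out over what the sentence actually calls for. 
The fragment names and waveforms below are made up for the example.

    # Illustrative sketch of naive concatenative synthesis.
    # Each "unit" is a fragment recorded in isolation, carrying its own
    # baked-in stress and timing; joining them ignores sentence context.
    import numpy as np

    # Hypothetical unit inventory (stand-in waveforms, not real recordings)
    unit_inventory = {
        "re":   np.random.randn(1600),
        "cord": np.random.randn(2400),
    }

    def synthesize(units, pause_samples=400):
        """Concatenate fragments with a fixed silent gap between them.
        No prosody model: the word comes out with the same emphasis
        whether it is the noun 'REcord' or the verb 'reCORD'."""
        silence = np.zeros(pause_samples)
        pieces = []
        for u in units:
            pieces.append(unit_inventory[u])
            pieces.append(silence)
        return np.concatenate(pieces[:-1])  # drop the trailing gap

    audio = synthesize(["re", "cord"])  # identical output in both contexts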

----- Original Message -----
From: Linux for blind general discussion <[email protected]>
To: [email protected]
Date: Sun, 18 Apr 2021 00:42:25 +0000
Subject: Re: Formatting - was Would you be interested in having natural 
sounding TTS voices by Readspeaker on Linux? demo link included

> Don't get me wrong, more natural-sounding TTS with proper inflection
> would be great, and for me the holy grail would be TTS capable of
> reading a digitized novel or reading subtitles on foreign media in
> real time while being indistinguishable from a human cast recording
> an audio dramatization or dubbed vocal track... but unless there have
> been massive improvements in recent years I'm unaware of, the natural
> voices are at that point where they almost sound human but fail in a
> subtle, unsettling way that's hard to pin down, and until we get over
> that hurdle, I'll take the obviously robotic monotone over the
> almost, but not quite, passes-for-a-human reader voices for daily
> work.


_______________________________________________
Blinux-list mailing list
[email protected]
https://listman.redhat.com/mailman/listinfo/blinux-list
