Hi Mike, I couldn't agree with Klaatu or yourself more; this comment deserves its own show. You should contact Jonathan Duddington and see if you can get an interview.
However, from my own listening to this show, I understood that the context was the use case of the HPR introduction only, and in that context many of your points do not apply. His walk-through of the "state of the art" is something that I did myself some years ago, and I am glad he revisited it.

His finding, unfortunately (and this is again about the HPR introduction only), is that the natural-sounding voices available on Linux have not improved. Back then they sounded old and dated, and now even more so when compared with Amazon Echo, Apple's Siri, or Google Assistant. I had hoped to train MaryTTS to have an HPR voice, but that was beyond my skill level. I also contacted the Mycroft team to see how they encoded Popey's voice, and unfortunately it was closed source. So we are still in the situation of having no "natural voices" available for the many use cases where they would be useful.

I'm including Jeroen on the bcc and he can address your criticisms directly if he wishes to.

--
Regards,

Ken Fallon
http://kenfallon.com
http://hackerpublicradio.org/correspondents.php?hostid=30

On 2019-04-16 04:34, Klaatu wrote:
> An HPR episode on this topic would be amazing. If you're too busy, I'd
> be happy to read your email into a recorder and release the show in
> your name.
>
> On 16 April 2019 1:10:29 PM NZST, Mike Ray <[email protected]> wrote:
>
> I got an error when I tried to post a comment about the latest podcast
> about TTS. I don't have the error code now, as I pasted it into an
> email to admin and the email bounced. I suspect it was because my
> comment was too long. Here it is:
>
> Condescending and sarcastic. Oh, isn't text-to-speech such a laugh?
>
> I get really, really annoyed when people criticise eSpeak. Anybody who
> complains about it sounding robotic obviously was not around thirty
> years ago.
>
> eSpeak's author, Jonathan Duddington, in my humble opinion, deserves a
> Nobel prize.
> He has probably done more for blind and visually impaired computer
> users, like me, all over the world, than any other individual or
> organisation. In fact, it is hard to name any single person who has
> had such an impact on any group of users, apart perhaps from Linus
> Torvalds and Richard Stallman.
>
> 1. It is Open Source.
> 2. It is tiny; the memory footprint is small.
> 3. It is snappy and can speak really fast, which is what we (blind
> people) use once we get used to it, at speeds that would make your
> hair curl.
> 4. It probably has more language support than any other free and Open
> Source synthesiser.
> 5. It can run in a mode where it returns rendered speech, in the form
> of PCM, to a calling program, so it can be embedded in other programs.
> I don't think any other synth can do this.
> 6. It can even return phonemes, a mode which I have used more than
> once to provide a kind of 'fuzzy search' in a database.
>
> I regularly write and maintain library and application code in C, C++
> and/or Python, as well as Perl, and many of these libraries have in
> excess of 100k lines. That includes, incidentally, a library which
> used a combination of eSpeak and OMX to render TTS directly on the GPU
> of a Raspberry Pi when 'they' broke the ALSA driver on the Pi, which
> made the speech stutter and crash the kernel, and refused to fix it
> for about four years.
>
> If I spent all my time bitching about how robotic eSpeak is, I would
> never get any work done. How much time do you spend, when you should
> be writing code, worrying about your wallpaper or the colour of your
> screen's background? Or do you just not notice it after a while?
>
> Well, after spending years writing code when I can't even see the Sun
> when I stare directly at it, I can tell you I never notice what eSpeak
> sounds like. I would probably be equally at home working with flite,
> festival, or svox pico (which you missed).
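Point 6 above, using eSpeak's phoneme output for a kind of 'fuzzy search', can be sketched in a few lines of Python. In real use the phoneme strings would come from eSpeak itself (the `espeak -q -x "word"` invocation prints phoneme mnemonics without speaking); here a small hand-made table of hypothetical phoneme renderings stands in for that call, so the sketch is self-contained:

```python
# Sketch of a phoneme-based 'fuzzy search'. The PHONEMES table is a
# hypothetical stand-in for eSpeak output (e.g. `espeak -q -x "smith"`).
from difflib import SequenceMatcher

# Hypothetical phoneme renderings, standing in for real eSpeak output.
PHONEMES = {
    "smith":   "smIT",
    "smythe":  "smaID",
    "schmidt": "SmIt",
    "jones":   "dZ@Unz",
}

def fuzzy_search(query_phonemes, table, threshold=0.5):
    """Return (name, score) pairs whose phoneme string resembles the query."""
    hits = []
    for name, ph in table.items():
        score = SequenceMatcher(None, query_phonemes, ph).ratio()
        if score >= threshold:
            hits.append((name, round(score, 2)))
    return sorted(hits, key=lambda t: -t[1])

# A query pronounced like "smith" surfaces the similarly-pronounced
# spellings and leaves "jones" out entirely.
print(fuzzy_search("smIT", PHONEMES))
```

The point of matching on phonemes rather than spelling is that names which look nothing alike on paper ("smythe", "schmidt") still score as near-neighbours when they sound alike.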
> In addition, eSpeak is in use in NVDA, the free and Open Source
> Windows screen reader, which is currently giving the
> multi-hundred-pound commercial offerings a real problem and providing
> cash-strapped blind users a chance. Although Windows Narrator is now
> catching up, I still prefer NVDA and eSpeak.
>
> MaryTTS is bloated. There was some excitement around it a few years
> ago, but it has more or less faded away in the minds of the blind and
> VI community, since it is so bloated and, as far as I know, nobody has
> ever made a successful screen reader from it. Even if there were one,
> it would probably make a Raspberry Pi choke, whereas eSpeak runs
> snappily and happily on a first-generation Raspberry Pi with 256 MB of
> RAM.
>
> The 'holy trinity' of the Linux GUI, as far as blind and VI users are
> concerned, is:
>
> 1. Orca, the GTK screen reader, written in Python, and a work of art.
> 2. speech-dispatcher, written in C, a TTS 'server' program which Orca
> connects to, to send text and get speech back.
> 3. eSpeak. Although there are speech-dispatcher modules also for flite
> and festival, eSpeak is the best one IMHO.
>
> In the console:
>
> 1. SpeakUp, kernel modules (speakup and speakup_soft) which make up a
> console-mode screen reader.
> 2. espeakup, the SpeakUp-to-eSpeak connector.
> 3. eSpeak.
>
> eSpeak is gold dust.
>
> _______________________________________________
> Hpr mailing list
> [email protected]
> http://hackerpublicradio.org/mailman/listinfo/hpr_hackerpublicradio.org
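The 'holy trinity' stack described in the quoted email (a screen reader such as Orca hands text to the speech-dispatcher 'server', which forwards it to a pluggable synthesiser module such as eSpeak or flite) can be illustrated with a minimal in-process sketch. To be clear, this is not speech-dispatcher's actual API; the class and method names are stand-ins, and the 'output modules' are stubs where a real one would invoke the synthesiser:

```python
# Illustrative sketch of the stack: client -> 'server' -> synth module.
# All names here are stand-ins, not the real speech-dispatcher API.
from typing import Callable, Dict, Optional

class SpeechServer:
    """Routes client text to a pluggable synthesiser backend."""

    def __init__(self) -> None:
        self.modules: Dict[str, Callable[[str], str]] = {}
        self.active: Optional[str] = None

    def register(self, name: str, synth: Callable[[str], str]) -> None:
        self.modules[name] = synth
        if self.active is None:
            self.active = name  # first registered module becomes default

    def speak(self, text: str) -> str:
        if self.active is None:
            raise RuntimeError("no output module registered")
        return self.modules[self.active](text)

# Stub 'output modules'; a real module would call the synthesiser.
server = SpeechServer()
server.register("espeak", lambda t: "[espeak] " + t)
server.register("flite", lambda t: "[flite] " + t)

print(server.speak("Hello, Hacker Public Radio"))  # routed to espeak
```

The design point this mirrors is the one the email makes: because the screen reader only talks to the 'server', the synthesiser behind it can be swapped (eSpeak, flite, festival) without the client changing at all.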
