I've had some experience with TTS from developing Read Etexts. Originally, I used speech-dispatcher, which provided a way of writing apps that used TTS without knowing which TTS engine was being used underneath. There were a couple of problems with that. First, since more than one engine was supported, the RPM for speech-dispatcher required all of them to be installed. Second, you needed to configure it.
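
As a rough illustration of that engine independence, here is a minimal sketch using the `speechd` Python bindings that ship with speech-dispatcher. It assumes the bindings are installed and the speech-dispatcher daemon is running and configured. The `say` helper, its `client_factory` parameter, and the client name "tts-demo" are made up for this example and are not taken from Read Etexts.

```python
def say(text, module=None, client_factory=None):
    """Speak `text` through speech-dispatcher.

    module=None leaves the engine choice to speech-dispatcher's own
    configuration, so the caller never names a TTS engine.
    client_factory is a hypothetical hook so the helper can be
    exercised without a running daemon.
    """
    if client_factory is None:
        # Imported lazily so the module loads even without the bindings.
        import speechd
        client_factory = speechd.SSIPClient

    client = client_factory("tts-demo")  # arbitrary client name
    try:
        if module is not None:
            # Optional override, e.g. "espeak" or "festival".
            client.set_output_module(module)
        # Queues the message with the daemon.
        client.speak(text)
    finally:
        client.close()
```

Calling `say("Hello from the XO")` would speak through whichever engine the daemon is configured to use; passing `module="espeak"` would pin the engine explicitly.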

Aleksey Lim came up with a gstreamer plugin for espeak that needed no configuration, and that's what we've been using ever since. One problem we have with TTS is highlighting. An XO laptop is not fast enough to make the highlighted word keep up with the word being spoken. (The gstreamer plugin does callbacks just before it speaks a word, and these callbacks are used to highlight the words.) A slightly faster computer is enough to resolve the problem. If Festival needed more horsepower to run than espeak, it would make a bad situation worse.

James Simmons

On Tue, Jun 21, 2011 at 8:43 AM, Paul Fox <[email protected]> wrote:
> sridhar wrote:
> > I'm wondering if there's anything we can do to make TTS sound more
> > 'human'. We'd like to be able to use the XOs to teach English
> > literacy, but the espeak voices are very robotic.
> >
> > My understanding is that espeak is optimised for low-power devices
> > (great for XOs) and clear (if robotic) speech. Would it be feasible to
> > switch to something else, like festival?
>
> i've run festival as part of my home automation system for many many
> years, including the last 3 or so on an XO-1 (debxo) which acts as my
> current HA server.
>
> the first secret is to run it in client/server mode, to avoid the
> server startup latency on every enunciation. but even after that, i
> think the latency will be too high for your application. i just
> tested it: given a moderate english sentence, it took 3 seconds to
> produce output. (i hide this on my system by caching utterances --
> that's more feasible in a menuing system than when teaching literacy.)
> http://dev.laptop.org/~pgf/junk/festival_out.wav (5 seconds on XO-1)
>
> flite is a lower cost version of festival that might be appropriate.
> it seems to reduce the conversion time to about half a second.
> but the quality suffers as well.
> http://dev.laptop.org/~pgf/junk/flite_out.wav (.5 seconds on XO-1)
>
> fyi, current festival server process footprint:
> root 999 0.0 9.4 26668 20004 ? Ss Jun06 10:03
> /usr/bin/festival --server /usr/local/etc/nosil.scm
>
> i haven't used espeak -- i suspect there are API interfaces that are
> far richer than what i'm doing from the shell commandline. i don't
> know how one might access festival at that level.
>
> paul
>
> >
> > This is some food for thought:
> > http://braille.uwo.ca/pipermail/speakup/2008-July/046755.html
> >
> > Sridhar
> >
> >
> > Sridhar Dhanapalan
> > Technical Manager
> > One Laptop per Child Australia
> > M: +61 425 239 701
> > E: [email protected]
> > A: G.P.O. Box 731
> > Sydney, NSW 2001
> > W: www.laptop.org.au
> > _______________________________________________
> > Devel mailing list
> > [email protected]
> > http://lists.laptop.org/listinfo/devel
>
> =---------------------
> paul fox, [email protected]
> _______________________________________________
> IAEP -- It's An Education Project (not a laptop project!)
> [email protected]
> http://lists.sugarlabs.org/listinfo/iaep
