----- Original Message -----
From: "Jon Stockill" <[EMAIL PROTECTED]>
To: "FlightGear developers discussions" <[EMAIL PROTECTED]>
Sent: Friday, September 17, 2004 11:54 AM
Subject: Re: [Flightgear-devel] A voice for FG

> John Wojnaroski wrote:
> > Hi,
> >
> > The last month or so I've been working with adding synthetic speech and
> > voice recognition to my 747 project. The results have been quite good;
> > unfortunately it's kind of hard to demonstrate or display the results.
> >
> > Jim Brennan is preparing a corpus of messages and ATC phrases which will
> > be used to create an LM (Language Model) for speech recognition, and the
> > synthetic speech voices come from a variety of sources -- most notably, the
> > FestVox folks at CMU, MBROLA, and the OGI-Festival project at CSLU.
> I was working with the pre-release of festival 2.0 at work last week,
> and the new synthesis methods and voices that are available in that
> release sound particularly impressive. I did think of the possibility of
> using it for air traffic control, if not "live" then as an easy method
> to generate a batch of samples for use in a similar way to the way ATIS
> works at the moment.
The approach that I've taken is to start a festival server on a networked
machine and run a small client program that receives a text message as a
string, stuffs it into a festival protocol wrapper, and calls the
festivalStringToWave() method. This also allows you to send control
commands and files to the server to change voices, LMs, etc.
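
A rough sketch of the wrapping step, under stated assumptions: a festival
server reads Scheme expressions from its socket, so one simple wrapper is a
(SayText "...") expression with quotes and backslashes escaped. The helper
names escape_scheme_string() and make_say_command() are hypothetical, and this
is not the festivalStringToWave() code path itself, only an illustration of
turning a plain string into a command the server can evaluate.

```cpp
#include <string>

// Hypothetical helper: escape a message so it can sit inside a
// Scheme string literal (double quotes and backslashes need a backslash).
std::string escape_scheme_string(const std::string& text) {
    std::string out;
    out.reserve(text.size());
    for (char c : text) {
        if (c == '"' || c == '\\') {
            out.push_back('\\');
        }
        out.push_back(c);
    }
    return out;
}

// Hypothetical helper: wrap the message in a (SayText "...") expression,
// terminated by a newline so it can be written to the server socket as-is.
std::string make_say_command(const std::string& text) {
    return "(SayText \"" + escape_scheme_string(text) + "\")\n";
}
```

For example, make_say_command("this is a test") yields the line
(SayText "this is a test") followed by a newline.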

"../bin/festival --server loopback"  starts the server and any client on the
local machine can connect by default. Connections over a LAN require a small
Scheme script to add users to the festival_access_list as part of the
argument list.

The client program then has a few lines of socket code to connect to FG. On
the FG side all you need is something to send a text string over the socket.
Something like FGVoice::fg_say_mesg("this is a test"); would do. There are a
couple of good examples in the /examples/ directory which I used to create an
"atc_net_demo.cpp" application.
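
As an illustration of "something to send a text string over the socket", here
is a minimal POSIX sketch. It assumes the descriptor is already connected and
that messages are framed by a trailing newline; both the framing and the name
send_text() are assumptions for this example, not part of FG or festival.

```cpp
#include <cerrno>
#include <string>
#include <unistd.h>

// Write the whole message plus a trailing newline to an already-connected
// descriptor, retrying on partial writes and on EINTR.
// Returns true only if every byte was written.
bool send_text(int fd, const std::string& msg) {
    const std::string line = msg + "\n";
    std::size_t sent = 0;
    while (sent < line.size()) {
        ssize_t n = ::write(fd, line.data() + sent, line.size() - sent);
        if (n < 0) {
            if (errno == EINTR) {
                continue;  // interrupted by a signal, try again
            }
            return false;  // real error, give up
        }
        if (n == 0) {
            return false;  // no progress, avoid spinning forever
        }
        sent += static_cast<std::size_t>(n);
    }
    return true;
}
```

A method along the lines of fg_say_mesg() could then be a thin wrapper that
calls send_text() on the client connection.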

The voice recognition is just as easy (actually easier to set up), but
training the language model, building the acoustic model, and preparing the
dictionary plus any special phones is a little more involved. If you don't
mind a bit of a delay (around 2-3 sec) to decode the audio, you can use the
existing models and get pretty good results. The resultant text string is sent
to the AI controller, where it is parsed into tokens and analyzed (compiled?).
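
The token-parsing step might look roughly like the sketch below. It assumes
the recognizer hands back a plain text string and that the controller wants
upper-case words with punctuation stripped; the function name tokenize() and
those normalization rules are assumptions for illustration, not the actual AI
controller code.

```cpp
#include <cctype>
#include <sstream>
#include <string>
#include <vector>

// Split a recognized utterance into upper-case word tokens.
// Anything that is not a letter or digit is treated as a separator.
std::vector<std::string> tokenize(const std::string& utterance) {
    std::string cleaned;
    cleaned.reserve(utterance.size());
    for (unsigned char c : utterance) {
        if (std::isalnum(c)) {
            cleaned.push_back(static_cast<char>(std::toupper(c)));
        } else {
            cleaned.push_back(' ');
        }
    }

    std::istringstream in(cleaned);
    std::vector<std::string> tokens;
    std::string word;
    while (in >> word) {
        tokens.push_back(word);
    }
    return tokens;
}
```

The resulting token list can then be matched against whatever phrase grammar
the controller uses.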

I'm not sure how all of this would fit into FG. I suspect the easiest way
would be to create a voice object and a few methods and leave it up to the
individual user whether they want to set up the TTS festival package or ASR.

John W

Flightgear-devel mailing list
