For the last few months, we here at Cepstral have been working on
building speech input and output applications using Festival, Sphinx,
Perl, and POE (of course).  Now, it seems, the modules underlying them
are finally ready to be released into the unsuspecting world.  Behold
our works, ye mighty, and despair:

POE::Component::Festival - POE component for speech synthesis

This is a servlet component which allows you to communicate with a
Festival server in event-driven fashion.  It is essentially an adaptor
for the Festival::Client::Async module, which can also be found at the
site below.  You must have Festival
(http://www.speech.cs.cmu.edu/festival/) installed and running as a
server to use it.

POE::Component::SPX - POE component for speech recognition

This is a servlet component which interfaces to the Sphinx-II speech
recognition system.  It is an adaptor for Speech::Recognizer::SPX,
which, again, can also be found on our site.  Currently, you must have
a recent version of Sphinx-II (http://www.speech.cs.cmu.edu/sphinx/)
from CVS (see http://www.sourceforge.net/projects/cmusphinx/) - the
0.2 release is too old.

POE::Component::Audio - POE component for audio input/output

This is a servlet component which provides an asynchronous,
event-driven, quasi-real-time interface to OSS audio devices (such as
those on Linux or *BSD).  It requires our Audio::OSS module.

POE::Component::SilenceFilter - POE component for audio silence filtering

This is a servlet component which sits between PoCo::Audio::Input and
PoCo::SPX, removing silent regions from the audio input and triggering
the start and end of utterance processing.  It adapts
Audio::SPX::Continuous, which is part of the Speech::Recognizer::SPX
distribution.

All of these modules are released under the same license as Perl
itself, and the underlying software (Sphinx-II and Festival) is
released under BSD-like licenses.  So yes...  it's all free software!

You can obtain all these modules, and their dependencies, from our
developers' site at

Some caveats:

Currently this stuff has only been tested under GNU/Linux on the i386
platform.  Because the audio code depends on OSS audio (or OSS
emulation, such as in ALSA), it is likely to work only on Linux and
*BSD systems with OSS drivers.  The synthesis and recognition
functions, however, should be sufficiently platform-independent.

The modules and components appear to be quite stable, though there are
many features remaining to be implemented.

All modules and components are documented, and all except
PoCo::SilenceFilter have test suites.  However, test suite and
documentation coverage is not complete, and there should be more
example code.

Please send suggestions, bug reports, success and failure stories, and
questions to me, <[EMAIL PROTECTED]>.

A bit more about us:

Cepstral LLC was founded in 2000 by Kevin A. Lenzo and Alan W Black,
leading researchers in speech recognition and synthesis, to develop,
market, and support open-source speech technology.  We specialize in
high-quality characteristic speech output systems, based on proven,
scalable open-source engines.

David Huggins-Daines    |  [EMAIL PROTECTED]
Toolsmith               |  http://www.cepstral.com/
Cepstral LLC            |  We Build Voices
