David Huggins-Daines
Wed, 28 Mar 2001 07:57:15 -0800
Hi,

For the last few months, we here at Cepstral have been working on building speech input and output applications using Festival, Sphinx, Perl, and POE (of course). Now, it seems, the modules underlying them are finally ready to be released into the unsuspecting world. Behold our works, ye mighty, and despair:

POE::Component::Festival - POE component for speech synthesis
-------------------------------------------------------------

This is a servlet component which allows you to communicate with a Festival server in event-driven fashion. It is essentially an adaptor for the Festival::Client::Async module, which can also be found at the site below. You must have Festival (http://www.speech.cs.cmu.edu/festival/) installed and running as a server to use it.

POE::Component::SPX - POE component for speech recognition
----------------------------------------------------------

This is a servlet component which interfaces to the Sphinx-II speech recognition system. It is an adaptor for Speech::Recognizer::SPX, which, again, can also be found on our site. Currently, you must have a recent version of Sphinx-II (http://www.speech.cs.cmu.edu/sphinx/) from CVS (see http://www.sourceforge.net/projects/cmusphinx/) - the 0.2 release is too old.

POE::Component::Audio - POE component for audio input/output
------------------------------------------------------------

This is a servlet component which provides an asynchronous, event-driven, quasi-real-time interface to OSS audio devices (such as on Linux or *BSD). It requires our Audio::OSS module.

POE::Component::SilenceFilter - POE component for audio silence filtering
-------------------------------------------------------------------------

This is a servlet component which is interposed between PoCo::Audio::Input and PoCo::SPX in order to remove silence regions from audio input and trigger the starting and stopping of utterance processing.
It adapts Audio::SPX::Continuous, which is part of the Speech::Recognizer::SPX distribution.

All of these modules are released under the same license as Perl itself, and the underlying software (Sphinx-II and Festival) is released under BSD-like licenses. So yes... it's all free software!

You can obtain all these modules, and their dependencies, from our developers' site at http://www.cepstral.com/website/pages/cep_developers.html

Some caveats:

Currently this stuff has only been tested under GNU/Linux on the i386 platform. Due to the dependence on OSS audio (or OSS emulation, such as in ALSA), the audio code is only likely to work on Linux and *BSD systems with OSS drivers. The synthesis and recognition functions, however, should be sufficiently platform-independent.

The modules and components appear to be quite stable, though there are many features remaining to be implemented.

All modules and components are documented, and all except PoCo::SilenceFilter have test suites. However, test suite and documentation coverage is not complete, and there should be more example code.

Please send suggestions, bug reports, success and failure stories, and questions to me, <[EMAIL PROTECTED]>.

A bit more about us:

Cepstral LLC was founded in 2000 by Kevin A. Lenzo and Alan W Black, leading researchers in speech recognition and synthesis, to develop, market, and support open-source speech technology. We specialize in high-quality characteristic speech output systems, based on proven, scalable open-source engines.

--
David Huggins-Daines | [EMAIL PROTECTED]
Toolsmith            | http://www.cepstral.com/
Cepstral LLC         | We Build Voices