message from Alan W Black <[email protected]> to festival-talk
= = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = =
Christoph Nuscheler wrote:
message from Christoph Nuscheler <[email protected]> to festival-talk
= = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = =
Hi,

This is good news, thanks for doing this.

If you are happy we'd like to include this in the standard system
and replace the ancient SAPI code that is there ...


I'm currently trying to port the Flite 1.4 voices to the Microsoft Speech API. First, I got all the voices to work in the "Flite way" on Win32, i.e., including flite.h in Windows applications and synthesizing speech directly through Flite functions works just fine. Check out http://www.student.uni-augsburg.de/~nuschech/FliteForWindows.html if you're interested. ;-)

As a next step, I want to create the MS SAPI interfaces to all the voices. Flite 1.4 sources already contain some SAPI code for the kal voice (FliteTTSEngineObj etc.). This code is _very_ outdated, using Visual C++ 6.0 projects and such, but finally I got it to compile on Visual Studio 2010. Installing the kal voice in MS SAPI works flawlessly. Finally, I created projects for all the other voices.

Now that each voice compiles and installs successfully as a SAPI voice, I realised that all voices except kal and kal16 sound strange when synthesizing through MS SAPI.

The synthesis output of MS SAPI:
http://www.student.uni-augsburg.de/~nuschech/speech-samples-flite-sapi.mp3

For comparison, the synthesis output when synthesizing directly through Flite (using the same Win32 DLLs, but without using FliteTTSEngineObj): http://www.student.uni-augsburg.de/~nuschech/speech-samples-flite-direct.mp3

Note that awb already sounds a bit strange, while rms and slt both sound very very awkward...

I suspect a problem within the FliteTTSEngineObj code. While sapi/README states that this code can be used to SAPI-enable other voices as well, I think there might be something wrong with some of the constant values in this class that will not work for voices beyond kal.

Any suggestions? Perhaps someone can identify the bug by listening to the MP3s... ;-)

So clearly rms and slt are at the wrong sample rate. But kal is 8 kHz and kal16 is 16 kHz, so that part is working. awb, rms and slt are 16 kHz, which admittedly might not be ideal for some versions of Windows, and there may be some silly missing resampling going on in the Windows drivers themselves.
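
To make the "wrong sample rate" symptom concrete, here is a minimal arithmetic sketch in C. It is not taken from the Flite or SAPI sources; the function names are hypothetical, and the 16 kHz versus 8 kHz pairing in the usage note is only an assumed example of a mismatch, not a diagnosis of this particular bug.

```c
#include <assert.h>

/* Hypothetical helper: factor by which pitch and tempo shift when audio
   synthesized at synth_hz is played back by a device running at device_hz.
   1.0 means the rates match; 0.5 means half speed (one octave lower). */
double playback_speed_factor(int synth_hz, int device_hz)
{
    assert(synth_hz > 0 && device_hz > 0);
    return (double)device_hz / (double)synth_hz;
}

/* Hypothetical helper: seconds that num_samples take to play at device_hz. */
double played_duration_s(long num_samples, int device_hz)
{
    assert(num_samples >= 0 && device_hz > 0);
    return (double)num_samples / (double)device_hz;
}
```

Under that assumed mismatch, playback_speed_factor(16000, 8000) gives 0.5, and one second of 16 kHz audio (16000 samples) would take played_duration_s(16000, 8000) = 2.0 seconds to play. The reverse mismatch would double the speed instead.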

Did you generate all these in the order we are listening to them?  I wonder
if a sample rate value isn't being changed when you change voice.

Do you get the same result if you play rms immediately after a reboot?
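
One way to rule out a stale rate is to derive the output format from each synthesized wave rather than from a value stored once. The following is only a sketch under assumptions: demo_wave and demo_format are hypothetical stand-ins (not Flite's or SAPI's real types), and 16-bit PCM is assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a synthesized waveform (not Flite's real type). */
typedef struct {
    int sample_rate;   /* Hz, as reported by the voice that produced it */
    int num_channels;
} demo_wave;

/* Hypothetical stand-in for a PCM output format description. */
typedef struct {
    uint32_t samples_per_sec;
    uint16_t channels;
    uint16_t bits_per_sample;
    uint16_t block_align;        /* bytes per sample frame */
    uint32_t avg_bytes_per_sec;
} demo_format;

/* Build the format from the wave itself, so that switching voices
   cannot leave a previous voice's rate behind. Assumes 16-bit PCM. */
demo_format format_for_wave(const demo_wave *w)
{
    demo_format f;
    assert(w != NULL && w->sample_rate > 0 && w->num_channels > 0);
    f.samples_per_sec = (uint32_t)w->sample_rate;
    f.channels = (uint16_t)w->num_channels;
    f.bits_per_sample = 16;
    f.block_align = (uint16_t)(f.channels * f.bits_per_sample / 8);
    f.avg_bytes_per_sec = f.samples_per_sec * f.block_align;
    return f;
}
```

With this shape, a mono 8 kHz wave (as for kal) and a mono 16 kHz wave (as for slt) each get their own rate, regardless of the order in which they are synthesized.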

Alan



Best regards,

Christoph

= = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = =
=    University of Edinburgh's Festival Speech Synthesis System       =
= http://festvox.org/festival      Sent Via [email protected] =
=                           To unsubscribe mail [email protected] =
= = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = =



_______________________________________________
Festlang-talk mailing list
[email protected]
https://lists.berlios.de/mailman/listinfo/festlang-talk
