message from Alan W Black <[email protected]> to festival-talk
= = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = =
Christoph Nuscheler wrote:
message from Christoph Nuscheler <[email protected]> to
festival-talk
= = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = =
Hi,
This is good news, thanks for doing this.
If you are happy we'd like to include this in the standard system
and replace the ancient SAPI code that is there ...
I'm currently trying to port the Flite 1.4 voices to the Microsoft
Speech API. First, I got all the voices to work in the "Flite way" on
Win32, i.e., including flite.h in Windows applications and synthesizing
speech directly through Flite functions works just fine. Check out
http://www.student.uni-augsburg.de/~nuschech/FliteForWindows.html if
you're interested. ;-)
As a next step, I want to create the MS SAPI interfaces to all the
voices. Flite 1.4 sources already contain some SAPI code for the kal
voice (FliteTTSEngineObj etc.). This code is _very_ outdated, using
Visual C++ 6.0 projects and such, but finally I got it to compile on
Visual Studio 2010. Installing the kal voice in MS SAPI works
flawlessly. Finally, I created projects for all the other voices.
Now that each voice compiles and installs successfully as a SAPI voice,
I realised that all voices except kal and kal16 sound strange when
synthesizing through MS SAPI.
The synthesis output of MS SAPI:
http://www.student.uni-augsburg.de/~nuschech/speech-samples-flite-sapi.mp3
For comparison, the synthesis output when synthesizing directly through
Flite (using the same Win32 DLLs, but without using FliteTTSEngineObj):
http://www.student.uni-augsburg.de/~nuschech/speech-samples-flite-direct.mp3
Note that awb already sounds a bit strange, while rms and slt both sound
very very awkward...
I suspect a problem within the FliteTTSEngineObj code. While sapi/README
states that this code can be used to SAPI-enable other voices as well, I
think there might be something wrong with some of the constant values in
this class that will not work for voices beyond kal.
Any suggestions? Perhaps someone can identify the bug by listening to
the MP3s... ;-)
So clearly rms and slt are at the wrong sample rate. But kal is 8 kHz
and kal16 is 16 kHz, so that part is working. awb, rms and slt are 16 kHz,
which admittedly might not be ideal for some versions of Windows, and
there may be some silly missing resampling going on in the Windows
drivers themselves.
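To make the sample-rate point concrete, here is a minimal sketch of the
arithmetic only. It does not use the Flite or SAPI APIs, and the function
names are made up for illustration. It shows how long a buffer sounds when
the output device is configured for a different rate than the one the
voice was synthesized at.

```c
#include <assert.h>

/* Hypothetical helper: factor by which playback is stretched (> 1.0) or
   compressed (< 1.0) when samples synthesized at actual_hz are played by
   a device configured for declared_hz. Pitch shifts by the inverse. */
double playback_duration_ratio(int actual_hz, int declared_hz)
{
    assert(actual_hz > 0 && declared_hz > 0);
    return (double)actual_hz / (double)declared_hz;
}

/* Hypothetical helper: duration in seconds as the listener hears it,
   which depends only on the rate the device believes it is playing. */
double heard_duration_sec(long num_samples, int declared_hz)
{
    assert(num_samples >= 0 && declared_hz > 0);
    return (double)num_samples / (double)declared_hz;
}
```

For example, two seconds of 16 kHz speech (32000 samples) sent to an
output opened at 8 kHz would last four seconds and sound an octave lower.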
Did you generate all these in the order we are listening to them? I wonder
if a sample rate value isn't being changed when you change voice.
Do you get the same result if you play rms immediately after a reboot?
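The kind of stale value I have in mind can be sketched as follows. This
is only an illustration of the suspected pattern, not the actual
FliteTTSEngineObj code: voice_info, engine_state and the three functions
are hypothetical stand-ins, not Flite or SAPI types.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for a voice and for the engine's output state. */
typedef struct { const char *name; int sample_rate; } voice_info;
typedef struct { int output_rate; const voice_info *voice; } engine_state;

/* Suspected buggy pattern: the voice is switched, but the output rate
   chosen for an earlier voice is left in place. */
void set_voice_stale(engine_state *e, const voice_info *v)
{
    assert(e != NULL && v != NULL);
    e->voice = v;
}

/* Intended pattern: the output rate is refreshed on every voice change. */
void set_voice(engine_state *e, const voice_info *v)
{
    assert(e != NULL && v != NULL);
    e->voice = v;
    e->output_rate = v->sample_rate;
}

/* Returns 1 when the output rate agrees with the current voice. */
int rates_match(const engine_state *e)
{
    assert(e != NULL);
    return e->voice != NULL && e->output_rate == e->voice->sample_rate;
}
```

If something like set_voice_stale is what happens, the result would depend
on which voice was used first, which is why the order of generation and
the reboot test matter.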
Alan
Best regards,
Christoph
= = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = =
= University of Edinburgh's Festival Speech Synthesis System =
= http://festvox.org/festival Sent Via [email protected] =
= To unsubscribe mail [email protected] =
= = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = =
_______________________________________________
Festlang-talk mailing list
[email protected]
https://lists.berlios.de/mailman/listinfo/festlang-talk