[python-win32] How Do You Make Your Speech or SAPI 5 Voices Portable?

FT Thu, 26 Jun 2008 05:09:17 -0700

Hi!

    I am a visually impaired programmer and have just joined the Win32 list
to learn how to make my own screen reader, so I have to learn the ins and
outs of Windows. Attached is a simple version of my voice package and 2
different test modules. I have not learned all the list comprehension
approaches yet, but may try them in this simple package, since they would
come in handy in the Create function at the end of the module.


    I am asking whether anyone has used the built-in Microsoft SAPI 5 voices
and TTS engine, or tried to make the voices portable so that the user is not
required to install them?

    I also have the eSpeak voices installed, and they did not get bundled
into the py2exe package either.

    I found out when sending the test to a friend that the voices were not
there, except for the built-in voices on their machine. It would be nice to
bundle them into a package so that person does not have to hunt them down.
Granted, they could be placed on a CD or a web site for download, but I have
not gotten that far yet, and it would be nice to do it all in one.

    You can test my module and see if you like it or can use it. The only
method I have not tested yet is the bookmark method. I have not needed it
yet, but will test it soon. Voice2.py is a wxPython test module that lets
you adjust the Volume, Rate, and Pitch of the voice or voices installed.

NOTE:
    If you make an executable from Voice2.py, you may get an error after
compiling unless msvcp71.dll, mfc71.dll, and gdiplus.dll are copied by your
setup.py file. In other words, add a copy command that copies the DLLs from
your system32 folder. If you do not, the program will fail on computers that
do not have those DLLs installed. It appears that they are not copied
automatically.

    I am assuming that most of you already have all the C DLLs and such and
may not need this note, but those who do not will need the DLLs.

#FORCE THE NEEDED DLL'S OR COMMENT THEM OUT!
import os, shutil #NEEDED FOR THE COPY COMMANDS BELOW
shutil.copy("c:/windows/system32/msvcp71.dll",
            os.path.join(os.getcwd(), "dist", "msvcp71.dll"))
shutil.copy("c:/windows/system32/mfc71.dll",
            os.path.join(os.getcwd(), "dist", "mfc71.dll"))
shutil.copy("c:/windows/system32/gdiplus.dll",
            os.path.join(os.getcwd(), "dist", "gdiplus.dll"))
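
The three copy commands above could also be folded into a small helper that
loops over the DLL names and reports any that are missing instead of raising
an error. This is only a sketch: the function name copy_runtime_dlls, its
parameters, and the choice to skip missing files are assumptions made for
illustration, not part of py2exe.

```python
import os
import shutil

# Hypothetical helper (not part of py2exe): copy the named DLLs from
# source_dir into dest_dir and report which ones could not be found.
def copy_runtime_dlls(names, source_dir, dest_dir):
    copied, missing = [], []
    if not os.path.isdir(dest_dir):
        os.makedirs(dest_dir)
    for name in names:
        src = os.path.join(source_dir, name)
        if os.path.isfile(src):
            # Keep the same file name in the destination folder.
            shutil.copy(src, os.path.join(dest_dir, name))
            copied.append(name)
        else:
            missing.append(name)
    return copied, missing
```

In setup.py it might be called after setup() has run, for example
copy_runtime_dlls(["msvcp71.dll", "mfc71.dll", "gdiplus.dll"],
"c:/windows/system32", os.path.join(os.getcwd(), "dist")). Any names returned
in the missing list would then have to be found elsewhere.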

    I hope someone has tried to make a portable voice package, but knowing
that more than 95% of programmers are sighted, I may have to search a lot.
If anyone knows how, please let me know.

        Bruce

#DRIVERS FOR SAPI 5 AND VOICES!
#NOTE THE CONSTANTS AND, IN THE SPEAK FUNCTION, THE ADDING/OR-ING OF THE VALUES.
from comtypes.client import CreateObject
import _winreg

class constants4tts:
    Wait = -1
    Sync = 0
    Async = 1
    Purge = 2
    Is_filename = 4
    XML = 8
    Not_XML = 16
    Persist = 32
    Punc = 64

class SynthDriver():
    name="sapi5"
    description="Microsoft Speech API version 5 (sapi.SPVoice)"
    _voice = 0
    _pitch = 0
    _voices = []
    _wait = -1 #WAIT INDEFINITELY
    _sync = 0 #WAIT UNTIL SPEECH IS DONE.
    _async = 1 #DO NOT WAIT FOR SPEECH
    _purge = 2 #CLEAR SPEAKING BUFFER
    _is_filename = 4 #OPEN WAV FILE TO SPEAK OR SAVE TO WAV FILE
    _xml = 8 #XML COMMANDS, PRONUNCIATION AND GRAMMAR.
    _not_xml = 16 #NO XML COMMANDS
    _persist_xml = 32 #Changes made in one speak command persist to other calls to Speak.
    _punc = 64 #PRONOUNCE ALL PUNCTUATION!
    def check(self):
        try:
            r=_winreg.OpenKey(_winreg.HKEY_CLASSES_ROOT,"SAPI.SPVoice")
            r.Close()
            return True
        except:
            return False
#INITIALIZE ENGINE!
    def init(self):
        try:
            self.tts = CreateObject( 'sapi.SPVoice')
            self._voice=0
            voices = self.tts.GetVoices()
            self._voiceCount = len(voices)
            #PER-INSTANCE LIST SO INSTANCES DO NOT SHARE THE CLASS-LEVEL _voices!
            self._voices = [voices[v] for v in range(self._voiceCount)]
            return True
        except:
            return False
#TERMINATE INSTANCE OF ENGINE!
    def terminate(self):
        del self.tts
#NUMBER OF VOICES FOR ENGINE!
    def getVoiceCount(self):
        return len(self.tts.GetVoices())
#NAME OF A VOICE BY NUMBER!
    def getVoiceNameByNum(self, num):
        return self.tts.GetVoices()[ num].GetDescription()
#NAME OF A VOICE!
    def getVoiceName(self):
        return self.tts.GetVoices()[ self._voice].GetDescription()
#WHAT IS VOICE RATE?
    def getRate(self):
        "MICROSOFT SAPI 5 RATE IS -10 TO 10"
        return (self.tts.rate)
#WHAT IS THE VOICE PITCH?
    def getPitch(self):
        "PITCH FOR MICROSOFT SAPI 5 IS AN XML COMMAND!"
        return self._pitch
#GET THE ENGINE VOLUME!
    def getVolume(self):
        "MICROSOFT SAPI 5 VOLUME IS 0 TO 100"
        return self.tts.volume
#GET THE VOICE NUMBER!
    def getVoiceNum(self):
        return self._voice
#SET A VOICE BY NAME!
    def setVoiceByName(self, name):
        "VOICE IS SET BY NAME!"
        for i in range( self._voiceCount):
            if self.tts.GetVoices()[ i].GetDescription().find( name) >= 0:
                self.tts.Voice = self._voices[i]
#                self.tts.Speak( "%s Set!" % name)
                self._voice=i
                break
        else:
            #LOOP ENDED WITHOUT A BREAK, SO NO VOICE MATCHED!
            self.tts.Speak( "%s Not Found!" % name)
#USED FOR BOOKMARKING AND USE LATER!
    def _get_lastIndex(self):
        bookmark=self.tts.status.LastBookmark
        if bookmark!="" and bookmark is not None:
            return int(bookmark)
        else:
            return -1
#NOW SET ENGINE PARMS!
#SET THE VOICE RATE!
    def setRate(self, rate):
        "MICROSOFT SAPI 5 RATE IS -10 TO 10"
        if rate > 10: rate = 10
        if rate < -10: rate = -10
        self.tts.Rate = rate
#SET PITCH OF THE VOICE!
    def setPitch(self, value):
        "MICROSOFT SAPI 5 PITCH IS REALLY CONTROLLED WITH XML AROUND SPEECH TEXT AND IS -10 TO 10"
        if value > 10: value = 10
        if value < -10: value = -10
        self._pitch=value
#SET THE VOICE VOLUME!
    def setVolume(self, value):
        "MICROSOFT SAPI 5 VOLUME IS 0 TO 100"
        self.tts.Volume = value
#CREATE ANOTHER INSTANCE OF A VOICE!
    def createVoice(self, name):
        num = 0
        for i in range( self._voiceCount):
            if self.tts.GetVoices()[ i].GetDescription().find( name) >= 0:
                num=i
                break
        new_tts = CreateObject( 'sapi.SPVoice')
        new_tts.Voice = self._voices[ num]
        return (new_tts)
#SPEAKING TEXT!
#SPEAK TEXT USING BOOKMARKS AND PITCH!
    def SpeakText(self, text, wait=False, index=None):
        "SPEAK TEXT AND XML FOR PITCH MUST REPLACE ANY <> SYMBOLS BEFORE USING XML BRACKETED TEXT"
        text = text.replace( "<", "&lt;")
        pitch = self._pitch #ALREADY LIMITED TO -10..10 BY setPitch
        if isinstance(index, int):
            #NOTE \" FOR XML FORMAT CONVERSION!
            bookmarkXML = "<Bookmark Mark = \"%d\" />" % index
        else:
            bookmarkXML = ""
        flags = constants4tts.XML
        if wait is False:
            flags += constants4tts.Async
        self.tts.Speak( "<pitch absmiddle = \"%s\">%s%s</pitch>" % (pitch,
            bookmarkXML, text), flags)
#CANCEL SPEAK IN PROGRESS!
    def cancel(self):
        #if self.tts.Status.RunningState == 2:
        self.tts.Speak(None, 1|constants4tts.Purge)
#SET AUDIO STREAM FOR OUTPUT TO A FILE!
    def SpeakToWav(self, filename, text, voice):
        """THIS METHOD ASSUMES THE IMPORT OF COMTYPES.CLIENT CreateObject SO
            A VOICE AND FILE STREAM OBJECT ARE CREATED WITH THE PASSING IN OF
            3 STRINGS: THE FILE NAME TO SAVE THE VOICE INTO, THE TEXT, AND THE
            VOICE SPOKEN IN. ONCE THE TEXT IS SPOKEN INTO THE FILE, IT IS
            CLOSED AND THE OBJECTS DESTROYED!"""
        num = 0
        for i in range( self._voiceCount):
            if self.tts.GetVoices()[ i].GetDescription().find( voice) >= 0:
                num=i
                break
        stream = CreateObject("SAPI.SpFileStream")
        tts4file = CreateObject( 'sapi.SPVoice')
        tts4file.Voice = self._voices[ num]
        from comtypes.gen import SpeechLib
        stream.Open( filename, SpeechLib.SSFMCreateForWrite)
        tts4file.AudioOutputStream = stream
        tts4file.Speak( text, 0)
        stream.Close()
        del tts4file
        del stream
#NOW SPEAK THE WAV FILE SAVED!
    def SpeakFromWav(self, filename, sync=0, async=0, purge=0):
        "SPEAKING A WAV FILE ONLY!"
        self.tts.Speak( filename, sync |async |purge |self._is_filename)
#SPEAK TEXT!
    def Speak(self, text, wait=0, sync=0, async=0, purge=0, isfile=0, xml=0,
              not_xml=0, persist=0, punc=0):
        "SAPI 5 HAS NO PITCH SO HAS TO BE IN TEXT SPOKEN!"
        pitch=self._pitch
        flags = wait |sync |async |purge |isfile |xml |not_xml |persist |punc
        self.tts.Speak( "<pitch absmiddle='%s'/>%s" % (pitch, text), flags)
#SPEAK TEXT WITHOUT PITCH!
    def Speaking(self, text, wait=0, sync=0, async=0, purge=0, isfile=0, xml=0,
                 not_xml=0, persist=0, punc=0):
        "SPEAKING A FILE WITHOUT PITCH"
        flags = wait |sync |async |purge |isfile |xml |not_xml |persist |punc
        self.tts.Speak( text, flags)
#SET THE VOICE BY VALUE!
    def setVoice(self, value):
        """SET VOICE BY NUMBER OR VALUE!"""
        if value >= self._voiceCount:
            value = self._voiceCount-1
        if value < 1:
            value=0
        self.tts.Voice = self._voices[ value]
#        vd = self.tts.GetVoices()[ value].GetDescription()
#        self.tts.Speak( vd[ vd.find(" ")+1:])
        self._voice=value
#READ ALL THE VOICES IN THE ENGINE!
    def read_Voices(self):
        self.tts.Speak( "Voices are:")
        for i in range( self._voiceCount):
            print "%d) %s" % (i, self.getVoiceNameByNum(i))
            self.tts.Voice = self.tts.GetVoices()[i]
            vd = self.tts.GetVoices()[ i].GetDescription()
            self.tts.Speak( "%d) %s" % (i, vd[ vd.find(" ")+1:]))

def Create( vs={}):
    "CREATE A SAPI VOICE INSTANCE!"
    vp = {"name":"Sam", "volume":100, "rate":0, "pitch":0}
    for i in vs: vp[i] = vs[i]
    newVoice = SynthDriver()
    if newVoice.check():
        newVoice.init()
        newVoice.setVoiceByName( vp["name"])
        newVoice.setVolume( vp["volume"])
        newVoice.setRate( vp["rate"])
        newVoice.setPitch( vp["pitch"])
        return newVoice
    else:
        print "SAPI Engine Is Not Installed On This Computer!"
        return None
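
The Speak flags in constants4tts above are bit values, so several of them can
be combined into one number with a bitwise OR, and the pitch is sent as XML
placed in front of the text. The sketch below shows both ideas without
needing a SAPI engine. combine_flags and pitch_xml are hypothetical helper
names used only for illustration, and the flag values are copied from the
class above.

```python
# Flag values copied from constants4tts above.
ASYNC = 1
PURGE = 2
XML = 8

def combine_flags(*flags):
    "Hypothetical helper: OR any number of flag values into one integer."
    result = 0
    for flag in flags:
        result |= flag
    return result

def pitch_xml(text, pitch):
    "Hypothetical helper: limit pitch to -10..10 and put a pitch tag before text."
    pitch = max(-10, min(10, pitch))
    # Escape the characters that would otherwise be read as XML markup.
    safe = text.replace("&", "&amp;").replace("<", "&lt;")
    return "<pitch absmiddle='%d'/>%s" % (pitch, safe)

print(combine_flags(ASYNC, PURGE))   # 3
print(pitch_xml("Hello", 25))        # <pitch absmiddle='10'/>Hello
```

Because each flag uses its own bit, OR-ing them never loses information, which
is why the Speak method can accept the flags in any argument position.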

Description="""
Widgets communicate
The test below gives a voice to the buttons and the text that is changed.
Moving the mouse quickly stops the speech of the previous button.
Also, speech is allowed to continue and does not hold up the background
screen changes.
It is important to know how widgets can communicate in an application.
Follow the example."""
#!/usr/bin/python
#VoiceFrame.py
#HOW TO COMMUNICATE INSIDE A FRAME PANELS!
import wx
import Sapi5
tts = Sapi5.Create()
purge = tts._purge
async = tts._async
class LeftPanel(wx.Panel):
    def __init__(self, parent, id):
        wx.Panel.__init__(self, parent, id, style=wx.BORDER_SUNKEN)
        self.text = parent.GetParent().rightPanel.text
        self.textNum = 0
        button1 = wx.Button(self, -1, 'Plus', (10, 10))
        button2 = wx.Button(self, -1, 'Minus', (10, 40))
        button3 = wx.Button(self, -1, 'Voices', (10, 70))
        button4 = wx.Button(self, -1, 'Rate', (10, 90))
        button5 = wx.Button(self, -1, 'Pitch', (10, 120))
        button1.Bind(wx.EVT_ENTER_WINDOW, self.OnPlusSpeak) #, id=button1.GetId())
        button2.Bind(wx.EVT_ENTER_WINDOW, self.OnMinusSpeak) #, id=button2.GetId())
        button3.Bind(wx.EVT_ENTER_WINDOW, self.OnVoiceSpeak) #, id=button3.GetId())
        button4.Bind(wx.EVT_ENTER_WINDOW, self.OnRateSpeak) #, id=button4.GetId())
        button5.Bind(wx.EVT_ENTER_WINDOW, self.OnPitchSpeak) #, id=button5.GetId())
        self.Bind(wx.EVT_BUTTON, self.OnPlus, id=button1.GetId())
        self.Bind(wx.EVT_BUTTON, self.OnMinus, id=button2.GetId())
        self.Bind(wx.EVT_BUTTON, self.OnVoice, id=button3.GetId())
        self.Bind(wx.EVT_BUTTON, self.OnRate, id=button4.GetId())
        self.Bind(wx.EVT_BUTTON, self.OnPitch, id=button5.GetId())
    def OnPlusSpeak(self, event):
        self.text.SetLabel( str( self.textNum))
        text = "Plus Button " +self.text.GetLabel()
        tts.Speak(text, async, purge)
    def OnMinusSpeak(self, event):
        self.text.SetLabel( str( self.textNum))
        text = "Minus Button " +self.text.GetLabel()
        tts.Speak(text, async, purge)
    def OnVoiceSpeak(self, event):
        num = tts.getVoiceNum()
        text = "Voices Button To Change " +tts.getVoiceNameByNum( num)
        tts.Speak(text, async, purge)
    def OnRateSpeak(self, event):
        text = "Rate Button " +str( tts.getRate())
        tts.Speak(text, async, purge)
    def OnPitchSpeak(self, event):
        text = "Pitch Button " +str( tts.getPitch())
        tts.Speak(text, async, purge)
    def OnPlus(self, event):
        self.textNum = int(self.text.GetLabel()) +1
        tts.Speak( self.textNum, async)
        self.text.SetLabel( str( self.textNum))
    def OnMinus(self, event):
        self.textNum = int(self.text.GetLabel()) -1
        tts.Speak( self.textNum, async)
        self.text.SetLabel( str( self.textNum))
    def OnVoice(self, event):
        value=tts.getVoiceNum() + 1
        if value >= tts.getVoiceCount(): value=0
        tts.setVoice( value)
        msg = "The voice is now: %d) %s" % (tts.getVoiceNum(), 
tts.getVoiceName())
        tts.Speak( msg, async, purge)
        self.text.SetLabel( msg)
    def OnRate(self, event):
        value = tts.getRate() + 1
        if value > 10: value = -10
        tts.setRate( value)
        msg = "The Rate is now: %d" % value
        tts.Speak( msg, async, purge)
        self.text.SetLabel( msg)
    def OnPitch(self, event):
        value = tts.getPitch() + 1
        if value > 10: value = -10
        tts.setPitch( value)
        msg = "The Pitch is now: %d" % value
        tts.Speak( msg, async, purge)
        self.text.SetLabel( msg)
class RightPanel(wx.Panel):
    def __init__(self, parent, id):
        wx.Panel.__init__(self, parent, id, style=wx.BORDER_SUNKEN)
        self.text = wx.StaticText(self, -1, '0', (40, 60))
class Communicate(wx.Frame):
    def __init__(self, parent, id, title):
        wx.Frame.__init__(self, parent, id, title, size=(280, 200))
        panel = wx.Panel(self, -1)
        self.rightPanel = RightPanel(panel, -1)
        leftPanel = LeftPanel(panel, -1)
        hbox = wx.BoxSizer()
        hbox.Add(leftPanel, 1, wx.EXPAND | wx.ALL, 5)
        hbox.Add(self.rightPanel, 1, wx.EXPAND | wx.ALL, 5)
        panel.SetSizer(hbox)
        self.Centre()
        self.Show(True)
app = wx.App(0)
Communicate(None, -1, 'widgets communicate')
app.MainLoop()
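
The OnVoice, OnRate, and OnPitch handlers above all step a value up by one
and wrap around to the low end when it passes the top. That wrap-around could
be written once as a small function. step_and_wrap is a hypothetical name,
and this is only a sketch of the same logic, not code from the module.

```python
def step_and_wrap(value, low, high, step=1):
    "Hypothetical helper: add step to value, wrapping to low once past high."
    value += step
    if value > high:
        value = low
    return value

print(step_and_wrap(10, -10, 10))   # -10
print(step_and_wrap(0, -10, 10))    # 1
```

For the voice button the same call would use 0 as the low end and the voice
count minus 1 as the high end.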

import Sapi5, time, os

av = Sapi5.Create()
Mike = Sapi5.Create( {"name":"Mike", "pitch":1, "rate":1})
Mary = Sapi5.Create( {"name":"Mary", "pitch":1, "rate":-1})
SYNC = av._sync
ASYNC = av._async
PURGE = av._purge
ISFILE = av._is_filename
XML = av._xml
NOT_XML = av._not_xml
PERSIST = av._persist_xml
PUNC = av._punc
WAIT = av._wait

av.Speak("Hello!")
av.Speak( "I am speaking in the default voice!")
av.Speak( "Number of voices is: %d" % av.getVoiceCount())
av.read_Voices()
Mike.Speak( "Hello! Now saying the punctuation in this sentence.", PUNC)
time.sleep(.5)
Mary.Speak( "Mary and Mike are saving a wav file!")
time.sleep(.5)
av.SpeakToWav( 'spain.wav', 'Mary says, The rain in Spain falls mainly on the plain.', 'Mary')
av.SpeakFromWav( 'spain.wav', ASYNC)
av.SpeakToWav( "test.wav", "Mike says, Hello To The World!", "Mike")
av.SpeakFromWav( "test.wav", SYNC)
time.sleep(.5)
av.setVoiceByName( "Mary")
Mike.setPitch(5)
Mike.setVolume( 100)
Mike.Speak( " The Volume Is Set At Speaker Volume!")
av.setRate( 5)
av.setPitch(5)
av.Speak( "Rate At 75%, and Pitch 5 or 75% of maximum pitch!")
time.sleep(1)
av.setPitch(0)
av.setRate(0)
av.Speak( "The rate and pitch are now set for normal!")
#av.setVoiceByName( "eSpeak-EN+M2")
#av.Speak("Hello! I am a eSpeak man!")
av.Speaking( "Spell This! <spell> Spell This! </spell>", XML)
time.sleep(1)
av.setVoiceByName( "Mary")
av.Speak( "Hit enter key to stop speaking!")
Mike.Speak( "Hey! Hit enter key!")
av.setPitch(10)
av.Speak("Pitch 100%", ASYNC)
av.setPitch(5)
av.Speak("Pitch 75%", ASYNC)
av.setPitch(0)
av.Speak("Pitch 50%", ASYNC)
av.setPitch( -5)
av.Speak("Pitch 25%", ASYNC)
av.setPitch( -10)
av.Speak("Pitch 0%", ASYNC)
av.setPitch(0)
Mike.Speak( "Hit enter key!", ASYNC)
hit = raw_input("Hit enter key >")
av.Speak(" Mary Says, Now Good bye", ASYNC, PURGE)
Mike.Speak(" Mike Says, Hey! Good bye", ASYNC, PERSIST)
av.setVoiceByName( "Sam")
av.Speak( "<volume level='50'/> Sam says goodbye")
Mike.Speak( "<volume level='50'/> goodbye")
Mary.Speak( "<volume level='50'/> goodbye")
Mary.terminate()
Mike.terminate()
av.terminate()
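
The test script above speaks the pitch as a percentage: -10 is called 0%, 0
is 50%, 5 is 75%, and 10 is 100%. Assuming that mapping is linear, it can be
sketched as below. to_percent and from_percent are hypothetical names, not
functions in the Sapi5 module.

```python
def to_percent(value, low=-10, high=10):
    "Hypothetical helper: map a -10..10 setting onto 0..100 percent."
    return (value - low) * 100.0 / (high - low)

def from_percent(percent, low=-10, high=10):
    "Hypothetical helper: map 0..100 percent back to the nearest whole setting."
    return int(round(low + percent * (high - low) / 100.0))

for setting in (-10, -5, 0, 5, 10):
    print("%d -> %d%%" % (setting, to_percent(setting)))
```

The same two functions would work for the rate, since it uses the same -10 to
10 range.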

_______________________________________________
python-win32 mailing list
python-win32@python.org
http://mail.python.org/mailman/listinfo/python-win32
