Why is that, Louis? I noticed a fair number of typos, but nothing else.

Chip
 

> -----Original Message-----
> From: Louis [mailto:[email protected]] 
> Sent: Monday, June 18, 2012 8:02 AM
> To: 'David'; 'WE English mailing list'
> Subject: RE: Part 3B of 3, Getting To Know Your Computer - 
> Speech Synthesizer
> 
> I'd be curious to know which language that document was 
> translated from, or which language the writer speaks as his 
> mother tongue.
> 
> Louis Gosselin
> 
> 
> -----Original Message-----
> From: David [mailto:[email protected]]
> Sent: Sunday, June 17, 2012 2:56 PM
> To: WE English mailing list
> Subject: Part 3B of 3, Getting To Know Your Computer - Speech 
> Synthesizer
> 
> (C) Copyright, David (No) - June 2012
>  
> ---This is the final part of the article.---
>  
>  
> Finalizing Your Speech Synthesizer
> When you have decided on the exact technique of speech 
> production, built the whole sound library needed, created all 
> the rules for pronunciation and modulation, and constructed 
> the software to handle all of this, you are the happy owner 
> of a new synthetic voice. As you have learned, it could be 
> fully synthetic, or built partly from pre-recorded human 
> sounds. You can now put the voice up for sale on the market. 
> Well, almost.
> There still remain a couple of decisions to be made. Should 
> you let the voice out the door as a single-voice product, or 
> should you bundle it up with a few more voices?
> Many manufacturers decide to bundle several voices together. 
> Typically, in such a bundle, you would find at least one 
> female and one male voice. Many times, you might even find 
> several versions of the two. A few manufacturers even offer 
> a child voice. And at least one has a voice made up of dog 
> barks, should you ever want such a thing in your projects.
> How do you go about making more voices? That depends greatly 
> on the technique you decided to go for. If your synthetic 
> voice is a digitized one, you will need more people to 
> narrate all the words in your synthetic voice's vocabulary. 
> You would then have a female narrator record the whole set 
> of words, and a male narrator do the exact same job. Your 
> software then selects between the sound libraries, according 
> to the request from the end-user.
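As a rough illustration of the selection step just described, a digitized voice can be thought of as a lookup from words into per-narrator sound libraries. All names and filenames in this sketch are invented; a real engine would stream and splice audio rather than return filenames, and would need a fallback for words missing from its vocabulary.

```python
# Toy model of a digitized voice: one pre-recorded sound library per
# narrator, and a selector that picks recordings according to the
# voice the end-user requested. Filenames stand in for audio data.

LIBRARIES = {
    "female": {"hello": "female_hello.wav", "world": "female_world.wav"},
    "male":   {"hello": "male_hello.wav",   "world": "male_world.wav"},
}

def recordings_for(text, voice):
    """Return the recordings to play back, in order, for `text`."""
    library = LIBRARIES[voice]
    return [library[word] for word in text.lower().split()]

print(recordings_for("Hello world", "male"))
```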
> Did you decide to go for the fully electronic voice? This is 
> the easiest to multiply. A good bit of tweaking - adjusting 
> the speed, duration, volume and pitch of the many individual 
> tones included in your sound library - will readily make 
> your voice sound female, male or even childish. You can 
> quite quickly have a deep voice, a thin one, and a really 
> shouty one. And don't forget to include your first one, 
> robotic as it is. :)
> If you, on the other hand, decided to go for the hybrid 
> technique, there could be a couple of ways of building new 
> voices. Again, you could have several human narrators read a 
> text, and then fragment the recordings into the word 
> fractions you need for each voice. You could further make 
> several adjustments to each of the fractional recordings, 
> which would quickly make a voice sound slightly different. 
> Any reader old enough to have played with a tape recorder 
> with speed adjustment will know that it is quite easy to 
> make Mom's voice turn really deep-sounding. Or you could 
> have Daddy sound like a little boy, simply by speeding up 
> the playback. Similarly, in your laboratory, you can perform 
> a load of adjustments on the recorded voices to make them 
> sound different.
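The tape-recorder effect described above can be sketched numerically. This is a toy illustration, not a real audio pipeline: we generate a 100 Hz sine tone, "speed up" playback by keeping every second sample, and check that the measured pitch roughly doubles (one octave up).

```python
import math

RATE = 8000  # samples per second

def sine(freq, seconds):
    """A plain sine tone, one float per sample."""
    n = int(RATE * seconds)
    return [math.sin(2 * math.pi * freq * i / RATE) for i in range(n)]

def speed_up(samples, factor):
    """Crude resampling: keep every `factor`-th sample, so the
    recording plays back `factor` times faster at the same RATE."""
    return samples[::factor]

def estimate_freq(samples):
    """Estimate pitch by counting upward zero crossings per second."""
    crossings = sum(1 for a, b in zip(samples, samples[1:]) if a < 0 <= b)
    return crossings / (len(samples) / RATE)

original = sine(100, 1.0)
faster = speed_up(original, 2)

print(estimate_freq(original))  # close to 100 Hz
print(estimate_freq(faster))    # close to 200 Hz
```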
> Building A Speech Synthesizer
> The term "Speech Synthesizer" is basically another term for 
> a bundle of synthetic voices. You would typically give the 
> different voices human names. Your synthesizer, on the other 
> hand, you would name after your company or project. For 
> instance, Microsoft has built a speech synthesizer. It is 
> called Microsoft.
> It holds several voices, like Mary, Mike and Sam. Another 
> company, AT&T, built a speech synthesizer named "AT&T 
> Natural Voices". It holds voices like Crystal, Mike, Mel, 
> Julia, and Ray. Nuance names its synthesizer ScanSoft, and 
> it holds voices like Samantha, Daniel, Tom, Nora and Nanna. 
> Another manufacturer, NeoSpeech, has voices like Kate and 
> Paul included in its synthesizer. To distinguish the many 
> voices, synthesizers and manufacturers, we often refer to 
> them as "Microsoft Mike", "AT&T Crystal" or "Eloquence 
> Sandy". This way, we easily know whether the voice in 
> question is the Mike voice from Microsoft or the Mike voice 
> from AT&T. Since any manufacturer can name their voices 
> whatever they want, anyone could have a voice named Mike. 
> But if you listen to Microsoft Mike and AT&T Mike, you will 
> hear right away that they are two totally different voices.
> Interfacing Your Voice
> Is your product ready for shipping now? Hang on for a 
> moment. There is just one small techie thing left. So far, 
> we have made a synthesizer and voices, but we haven't yet 
> given the end-user any way of "communicating" with the 
> voice. We need to provide a way for the end-user to choose 
> which of the voices in our synthesizer she wants to listen 
> to. Further, a way to send text to the selected voice for 
> narration. Also, the user should at least be offered the 
> chance to slow down or speed up the narration, and maybe 
> alter the pitch. Correctly done, our voice should signal to 
> the computer when it is ready to receive text and commands, 
> and when it should not be disturbed. And, maybe quite 
> importantly, it should offer the user a way to stop the 
> speech at any time. All of this controlling is what we call 
> the "interface" of the speech synthesizer.
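The control features listed above can be summarized in a small sketch. Every name here is hypothetical, not a real engine's API: the object tracks the selected voice, rate and pitch, reports when it is busy, and can be stopped at any time.

```python
# Minimal sketch of a speech synthesizer "interface" with invented
# names: select a voice, send text, adjust the rate, signal
# busy/ready, and stop speech on demand.

class SynthesizerInterface:
    def __init__(self, voices):
        self.voices = voices      # available voice names
        self.voice = voices[0]    # currently selected voice
        self.rate = 1.0           # 1.0 = normal speed
        self.pitch = 1.0
        self.busy = False         # True while narrating
        self.log = []             # what has been "spoken"

    def select_voice(self, name):
        if name not in self.voices:
            raise ValueError(f"unknown voice: {name}")
        self.voice = name

    def speak(self, text):
        if self.busy:
            raise RuntimeError("synthesizer is busy; wait until ready")
        self.busy = True
        # A real engine would render audio here; we just record it.
        self.log.append((self.voice, text, self.rate, self.pitch))
        self.busy = False

    def stop(self):
        # Users must be able to silence speech at any time.
        self.busy = False

synth = SynthesizerInterface(["Mary", "Mike"])
synth.select_voice("Mike")
synth.rate = 1.5
synth.speak("Hello there")
print(synth.log)
```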
> There are many ways of interfacing a speech synthesizer. 
> The most commonly used standard is called SAPI. Letting 
> your speech synthesizer meet the requirements of the SAPI 
> interface will ensure that it can be used by numerous 
> pieces of software on the market.
> SAPI voices come in two main flavors: SAPI 4, and SAPI 5.
> Some voices come in both versions, others only in one of 
> them. The SAPI 4 interface offered an extensive range of 
> adjustments for volume, speed and pitch. SAPI 5 voices have 
> somewhat more limited adjustment of each of these features. 
> Unfortunately, you might often find that adjusting a 
> feature in SAPI 5 results in either too much or too little. 
> Say you have set your volume to 5. Adjusting it down to 4, 
> it becomes hard to hear. Setting it to 6, your eardrums are 
> blown. And since you are not offered anything in between 4 
> and 5, or 5 and 6, you end up not using the volume 
> adjustment too often. The same goes for the other 
> parameters. When it comes to the listening experience, it 
> is all a matter of personal taste and preference. Some will 
> claim that the sound has got a bit clearer in SAPI 5 
> voices, but that the SAPI 4 voices had better modulation. 
> Some SAPI 5 versions are too eager in modulating their 
> narration. The modulation issue might not really be related 
> to the SAPI version in itself. Rather, the manufacturers 
> who upgraded their SAPI 4 voices might have thought they 
> would do a bit of maintenance on their product, since they 
> had to meddle with it anyway. Unfortunately, such upgrading 
> has in some cases resulted in a voice that is less 
> pleasant-sounding for long-term narrating.
> Yet, not all manufacturers want to 'expose' their speech 
> synthesizer to just any software or user interaction. A 
> manufacturer could instead offer a "dedicated" speech 
> synthesizer. This kind of synthesizer can only be reached 
> from the software for which it has been dedicated. So, 
> whilst installing a SAPI voice in general means that you 
> can reach the voice from any software that supports SAPI, a 
> dedicated speech synthesizer can only be reached from a 
> given piece of software (like a screen reader). As always 
> in the computer world, there is a chance of exceptions. 
> Even some SAPI voices on the market are somehow 'locked' to 
> a given piece of software. The voices from Acapela are one 
> example of this. Acapela voices can basically only be used 
> with the software they were bought with, unless you buy a 
> special license that opens them up for use with other 
> software on your computer. Still, we might not necessarily 
> refer to this kind of SAPI voice as a real dedicated voice.
> A manufacturer might include extra controlling capabilities 
> in a dedicated speech synthesizer - controls that would 
> need more technical handling than is possible, or generally 
> accepted, in the SAPI standard. Or he might offer a 
> better-quality, faster-responding, or otherwise modified 
> version of his voice in the dedicated version. He might 
> offer his voices as SAPI voices, as dedicated ones, or as 
> both. Since a dedicated voice cannot be reached from 
> anything other than the hosting software, there is less 
> chance of interference from other processes on your 
> computer. But then again, you are locked to using the 
> voices provided in the synthesizer from within the hosting 
> software. Therefore, most voices on the market are SAPI 
> voices, leaving them open to receive input from any 
> software that supports this interface.
> Hardware Or Software Synthesizers
> So far, we have been discussing the construction and 
> interfacing of voices and synthesizers directly on the 
> computer. They are stored on the hard disk, and reached 
> directly inside your computer. In older times (a decade or 
> so ago), when hard disk space was limited and computers ran 
> slowly, the computer simply did not have enough resources 
> to run a speech synthesizer of fair quality. Manufacturers 
> therefore dropped the whole synthesizer - including voices, 
> pronunciation rules, exception dictionaries and interfacing 
> software - onto an electronic chipset. This chipset in turn 
> was enclosed in a small unit that the user could connect to 
> his computer. Since a good amount of hardware was included 
> in these kinds of units, we call them "hardware 
> synthesizers". They can still be had, but are rather rarely 
> seen.
> Modern computers have enough processor power, memory (RAM) 
> and hard disk space to run a fair, or even good, sounding 
> speech synthesizer. Furthermore, today's computers have 
> built-in sound cards that are capable of handling speech 
> and music from multiple sources, with high quality and 
> precision. As such, hardware synthesizers are no longer 
> needed. And excluding them makes your computer system far 
> more portable, especially in the case of a laptop. Since 
> most of the speech synthesizer is now handled inside your 
> computer, and little hardware is included in the process, 
> we call the modern speech synthesizers "software 
> synthesizers". One big benefit of a software synthesizer is 
> that you can add on as many voices as you want. And you can 
> have a collection of voices from several manufacturers, 
> meaning that you can have different voices for different 
> tasks on your computer. They also might be more responsive 
> in many cases, since things happen far more quickly inside 
> your computer than most external connections could ever 
> manage.
> Speech Synthesizers In Combination With A Screen Reader 
> Whether you have now manufactured your own synthetic voice, 
> or you took the shortcut and bought one of the many 
> available on the market, you are only halfway to making 
> real use of it. It is like buying a flute, and then having 
> no one to play it. Several processes on your computer could 
> make use of the new voice. We do presume that the processes 
> are handled by software that supports, or communicates 
> with, the interface of your speech synthesizer. One piece 
> of software might act as a calculator. It might send any of 
> the numbers you enter, as well as the results, to your 
> synthesizer, making the whole thing into a talking 
> calculator. Another piece of software is a game of some 
> kind. Whenever certain things happen in the game, a phrase 
> or two are sent to your synthesizer for voicing. Or maybe 
> you have got hold of one of those programs that retrieve 
> the current time at given intervals, then send the 
> retrieved numbers to the synthesizer, and you will hear the 
> computer telling you the time of day, every single hour. 
> Still other software packages are created so as to retrieve 
> weather reports and forecasts from the internet, then send 
> the results to the synthesizer, and you will have them read 
> out to you at given intervals. All of these examples are 
> what we call Self-Voicing Software. They establish direct 
> contact between themselves and the speech synthesizer.
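The talking-clock example above can be sketched as follows. Here `speak` is a stand-in for the call a real self-voicing program would make into the synthesizer's interface, and the phrase format is invented for the sketch.

```python
import datetime

# Sketch of a self-voicing talking clock: format the current time as
# a phrase and hand it to the synthesizer. `speak` only records the
# phrase here; a real program would send it to the speech engine.

spoken = []

def speak(phrase):
    spoken.append(phrase)

def announce_time(now):
    phrase = f"The time is {now.hour} {now.minute:02d}"
    speak(phrase)

# A real program would call this every hour with the current time.
announce_time(datetime.datetime(2012, 6, 17, 14, 56))
print(spoken)
```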
> Sometimes, we want a bit more use of our synthesizer. Maybe 
> you are a student, or simply love to read a good novel. Or 
> do you have huge amounts of documents you need to read in 
> your job? In such cases, a TTS (Text-To-Speech) package 
> could be the thing for you. The job of a TTS is to send a 
> larger amount of text to your synthesizer. There is really 
> a good amount of processing involved in this. For one 
> thing, the software should watch for when your synthesizer 
> is 'ready to receive text', and then send a chunk over. 
> Then it would patiently wait for the synthesizer to narrate 
> the piece of text, before a new block of your document is 
> sent to the synthesizer. TTS software might also hold 
> features for setting up correct pausing at the beginning of 
> each paragraph and chapter in the document. It might 
> further offer you extended control when it comes to the 
> pronunciation of characters, words and phrases. Many TTS 
> packages even offer you the chance of turning the narrated 
> document into sound files, like MP3, that you can play back 
> on your portable player. A cheap and very good TTS is 
> TextAloud, manufactured by Nextup.com.
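The chunking-and-waiting behaviour described above might look like this in outline. The sentence splitter and the `DummySynth` stand-in are assumptions made for the sketch, not how any particular TTS product actually works.

```python
import re

# Sketch of TTS behaviour: split a long document into sentence-sized
# chunks and feed each one to the synthesizer only when it reports
# that it is ready to receive text.

def chunk_text(document):
    """Split after sentence endings; a real TTS would be smarter."""
    parts = re.split(r"(?<=[.!?])\s+", document.strip())
    return [p for p in parts if p]

class DummySynth:
    """Stand-in engine that is always ready and records what it gets."""
    def __init__(self):
        self.received = []
    def ready(self):
        return True
    def send(self, chunk):
        self.received.append(chunk)

def narrate(document, synth):
    for chunk in chunk_text(document):
        while not synth.ready():
            pass  # a real TTS would wait on an event, not spin
        synth.send(chunk)

engine = DummySynth()
narrate("First sentence. Second one! Third?", engine)
print(engine.received)
```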
> But your possibilities don't end here. Whether you are 
> dyslexic, or maybe you don't have enough sight to read the 
> computer screen, your new speech synthesizer can be a 
> 'helping hand' from here on. Well, if you get a screen 
> reader.
> As the term indicates, this kind of software will act as a 
> reader, or a pair of eyes, on the computer screen. It will 
> keep track of any changes that take effect on the screen, 
> and whenever a change is detected, the information about it 
> is sent to the speech synthesizer, which will read it out. 
> Thereby, you can hear what is going on on your screen at 
> any time. A screen reader offers you a long list of 
> controlling features. You can decide which part of the 
> screen should be watched and narrated, as well as when to 
> have anything sent to the synthesizer. Further, the more 
> sophisticated screen readers offer the user full control as 
> to how he wants to be informed about the things happening 
> on the computer. He can, for instance, decide whether he 
> wants a word read as a word, or spelled out letter by 
> letter. Or he can decide whether he wants to hear every 
> tiny change that is being made in a window on the screen, 
> or only when given things take place.
> A very basic screen reader is included with Windows itself. 
> Any Windows user can go to the Run dialog, type Narrator, 
> and press the Enter key. Microsoft has even included a SAPI 
> voice, which will immediately jump into action and let you 
> hear some of the things taking place on the computer screen.
> As mentioned, this is a very basic screen reader, enough to 
> give you a taste of what it is all about. Should you want 
> more control and feedback, the free NVDA screen reader 
> could be a choice. Or you could buy a fully functional 
> screen reader like Window-Eyes, Jaws, Blindows or SuperNova.
> This article intended to let you have a peek behind the 
> scenes, when it comes to some of the technology that makes 
> so much equipment in our modern lives speak.
> I do hope that you have got some answers to questions you 
> might have had in this regard. Also, I do hope you have got 
> a better grasp of some of the terms that you might come 
> across when looking for a new voice for your computer. If 
> you have feedback on the material provided here, you are 
> welcome to contact me at:
> [email protected] <mailto:[email protected]> 
> Any manufacturer or product name mentioned in this article, 
> is the property of the respective owners. None of them have 
> been mentioned for advertising reasons, merely to inform the 
> reader of some of the available products on the market.
> June 07, 2012
> (C) Copyright, David (Norway) - All rights reserved ---End Of Article---
> 
> If you reply to this message it will be delivered to the 
> original sender only. If your reply would benefit others on 
> the list and your message is related to GW Micro, then please 
> consider sending your message to [email protected] so the 
> entire list will receive it.
> 
> GW-Info messages are archived at 
> http://www.gwmicro.com/gwinfo. You can manage your list 
> subscription at http://www.gwmicro.com/listserv.
> 