Re: [Audyssey] Why SAPI Output is Superior

Philip Bennefall Sat, 04 May 2013 09:40:31 -0700

Hi Thomas,

I agree with all your points here. As I said in my prior post, if you don't want to invest a lot of money in voices for the game then sapi is definitely the way forward. However, one thing confuses me. You say you have already written such wrapper classes, and given that, why is sapi any easier in terms of coding? Sure, there is a small amount of time required initially, but it's a trivial implementation that benefits you a great deal throughout the entire coding process, especially when you have cross platform in mind. What I mean is, sapi forces you to do less work, but if you've already done that work, that is no longer an issue.

All in all, I think you have made the right decision given the parameters: no money spent and maximum time efficiency.


Kind regards,

Philip Bennefall

----- Original Message -----
From: "Thomas Ward" <thomasward1...@gmail.com>
To: <phi...@blastbay.com>; "Gamers Discussion list" <gamers@audyssey.org>
Sent: Saturday, May 04, 2013 6:30 PM
Subject: Re: [Audyssey] Why SAPI Output is Superior

Hi Philip,

When it comes to overall atmosphere you are right. A human voice is
always superior, but I haven't been doing that with any of my games.
With my original draft of STFC I used NeoSpeech Kate, then RealSpeak
Karen, and in many of the betas of MOTA I was using Acapela Heather.
Since all of these games were using prerecorded clips of SAPI voices
anyway, I was doing a lot of unnecessary work recording, editing, and
then using the wav files of synth voices when I could have used them
directly.

Now, my voice isn't very good, and due to some physical impairments it
is impossible for me to speak clearly, so I would never use it for a
professional game. Unfortunately, that means I would have to hire
someone else to do the voice clips, and I don't really want to spend
a great deal of money on a game like Mysteries of the Ancients just to
get realistic human speech. Perhaps if the game sells well I could
consider reinvesting some of that money in voice acting, etc., but for
a preliminary release SAPI is definitely the best way to go.

As far as wrapper classes go, that's pretty much what I was doing from
betas 1 to 22. I had a function called SpeakNumber() that would take a
number and do all the loading and processing of number files, which
works pretty well. However, that required some extra work developing
those wrapper classes in order to get the same functionality I have
with SAPI right away.
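
To illustrate the kind of work a helper like SpeakNumber() has to do,
here is a minimal Python sketch. It is not the code from the MOTA betas.
The clip naming scheme (0.wav to 20.wav, 30.wav to 90.wav, hundred.wav),
the function name, and the 0 to 999 range are all assumptions made for
the example. It only works out which files to play, in order; loading
and playback are left to whatever audio library the game uses.

```python
from typing import List


def number_to_clips(n: int) -> List[str]:
    """Return the clip files to play, in order, for 0 <= n <= 999.

    Assumes hypothetical clips named 0.wav .. 20.wav, 30.wav .. 90.wav
    and hundred.wav exist in the game's sound folder.
    """
    if not 0 <= n <= 999:
        raise ValueError("this sketch only covers 0 to 999")

    clips: List[str] = []
    hundreds, rest = divmod(n, 100)
    if hundreds:
        clips += [f"{hundreds}.wav", "hundred.wav"]

    # Speak the remainder, or "zero" when the whole number is 0.
    if rest or not clips:
        if rest <= 20:
            clips.append(f"{rest}.wav")
        else:
            tens, ones = divmod(rest, 10)
            clips.append(f"{tens * 10}.wav")
            if ones:
                clips.append(f"{ones}.wav")
    return clips
```

With SAPI the same result is a single call that passes the number as
text, which is the saving being described here.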

There is one other issue you didn't mention, and that is cross-platform
support. There really aren't any good cross-platform TTS solutions.
eSpeak, Festival, etc. are all pretty bad, and using prerecorded speech
clips is a great way to support Windows, Linux, Mac, iOS, etc., all
with the same speech. From that perspective it is easier to deal with
than a specific API like SAPI.

Cheers!

On 5/4/13, Philip Bennefall <phi...@blastbay.com> wrote:

Hi Thomas,

I just wanted to respond with a few spontaneous thoughts based on my own
experiences with sapi versus prerecorded audio.

Mostly I agree with your points. But for my own part I find that using
synthetic speech, whether it be prerecorded or generated on the fly by a
sapi controlled engine, greatly detracts from the atmosphere. Imagine if you
will, that you are in a shadowy temple with a faint sound of wind, distant
screams from the depths of a dungeon far below, haunting music with
reverberating strings, and an equally dramatic narrative performed by
Microsoft Mike. A bit of a contrast.

To take a less extreme example, in my upcoming title the atmosphere is
generally quite dramatic and with a dark undertone most of the time.
Therefore, when I recorded my speech files I spoke in a very low voice even
for such trivial things as numbers and status messages. I knew in what
context my voice was going to be played, and was able to adapt it to really
blend in with the rest of the scene that I was trying to create. A speech
engine could never achieve that. So I think that in terms of atmosphere and
dramatic effect, prerecorded speech is far superior to a tts engine. There
are, of course, numerous cases where the advantages of tts output greatly
outweigh the pros of prerecorded material, such as when there is absolutely
no way for you to know what content may need to be spoken. But if, as
in the case of MOTA, you are using prerecorded speech files that just
contain tts generated content anyway, then these have absolutely no
advantage, so I would not disagree with your decision to use sapi if you
aren't interested in getting a voice talent to record your files. In short,
I feel that prerecorded voice files are only advantageous if you actually
have a real voice for the game.

As for ease of coding, that has never really bothered me. It is trivial to
design wrapper classes that can take a number, or a filename etc. using
overloaded methods, and render that with the available speech files
asynchronously in the same thread so that one doesn't have to care about
synchronization. So with a careful design you can get it down to a one-liner
just like you would with sapi. What I find annoying with sapi is that I can
never be sure what the final output will actually sound like. Many of the
commercial voices have quite a bit of delay during their initial buffering
for each new phrase. When you speak the same phrase for the second time they
are usually faster, but the fact that there is so much variation between
voices makes it difficult to judge in advance how the game will actually
perform, if that makes sense. I tend to use sapi extensively for prototyping
a game, but once I reach a stage where I start to consider a potential
release date my first priority is to rip out sapi and replace it with real
human speech.
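
A rough Python sketch of the wrapper idea described above might look like
the following. It is only an illustration under stated assumptions, not
the design used in any actual game: the class name ClipSpeaker, the
play and is_busy callbacks, and the digit-by-digit clip names (0.wav to
9.wav) are all hypothetical. Python has no overloaded methods, so a
single say() method checks the argument type instead. Clips are queued
and started from the game loop via update(), so everything stays in one
thread and no locking is needed.

```python
from collections import deque
from typing import Callable, Deque, Union


class ClipSpeaker:
    """Queue prerecorded clips and start them from the game loop."""

    def __init__(self, play: Callable[[str], None],
                 is_busy: Callable[[], bool]) -> None:
        # play(filename) starts a clip without blocking; is_busy() reports
        # whether a clip is still playing. Both are supplied by the
        # (hypothetical) audio backend.
        self._play = play
        self._is_busy = is_busy
        self._pending: Deque[str] = deque()

    def say(self, item: Union[int, str]) -> None:
        """Queue a non-negative number (read digit by digit) or a clip file."""
        if isinstance(item, bool):
            raise TypeError("expected an int or a filename")
        if isinstance(item, int):
            if item < 0:
                raise ValueError("this sketch only covers non-negative numbers")
            self._pending.extend(f"{digit}.wav" for digit in str(item))
        elif isinstance(item, str):
            self._pending.append(item)
        else:
            raise TypeError("expected an int or a filename")

    def update(self) -> None:
        """Call once per frame: start the next clip when the last one ends."""
        if self._pending and not self._is_busy():
            self._play(self._pending.popleft())
```

In game code this reduces to a one-liner such as speaker.say(42) or
speaker.say("door_open.wav"), plus one speaker.update() call per frame.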

In summary, I think that if you want to get something out quickly and
without spending any money then sapi is definitely the right way to go. If
on the other hand you are interested in maintaining a great atmosphere and
you want to make sure the game sounds and performs the same everywhere, I'd
recommend a voice actor. Best of luck, and sorry about the extremely lengthy
ramble!

Kind regards,

Philip Bennefall



