Speak Up, a Computer Is Listening
By DAVID POGUE
Of all the high-tech fantasies that sci-fi movies tantalize their
escapist audiences with, surely that bit about giving your computer
spoken orders is one of the most alluring. Ever since "Star Trek,"
we've dreamed of being able to say, "Computer, display all known
sources of dilithium crystals in the Kraxon Nebula!"
So far, the closest we can get is strapping on a headset and
dictating, using a program like Dragon NaturallySpeaking to do the
typing. This software is great for anyone who can't type or doesn't
like to. And it lets you speak the names of menu commands and "click"
links on a Web page.
But that's not the same as telling the computer what to do in
conversational English.
NaturallySpeaking 10, available Thursday, takes some baby steps in the
right direction. It doesn't turn your computer into the "Star Trek"
mainframe; it doesn't know what you mean by, for example, "Make this
document shorter and funnier." But in its timid, conservative way, it
takes voice control unmistakably closer to that holy grail of
computing.
NatSpeak's principal mission, though, is to type out, into any Windows
program, whatever you say. And in version 10, its maker, Nuance,
claims to have eked out yet another 20 percent accuracy improvement.
I installed the program, donned the included headset and clicked "Skip
initial training." (In the early days of speech recognition, you had
to read a 45-minute sample script to train the program to recognize
your voice. Today, the software is so good, you can skip the training
altogether.)
As a quick test, I read aloud the first 1,000 words of "Freakonomics"
into Microsoft Word. Impressively enough, NatSpeak effortlessly
transcribed words like "Ku Klux Klan" and "Punic war." It did,
however, mistype seven easier words ("addition" instead of "edition,"
for example, and "per trail" instead of "portrayal"). Accuracy tally
with no training: 99.3 percent. Not too shabby.
Then I tried a second test: I read one of the five-minute training
scripts (a Kennedy speech), which is recommended for even better
initial accuracy. I again read the first 1,000 words of
"Freakonomics," and the program mistyped five words. Accuracy this
time: 99.5 percent.
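
For the arithmetically inclined, those tallies are simply the correctly transcribed words divided by the words read aloud. Here is a minimal sketch in Python; the function name word_accuracy is my own label for illustration, not anything from Nuance, and it assumes every mistyped word counts as exactly one error:

```python
def word_accuracy(total_words: int, misrecognized: int) -> float:
    """Return the percentage of words transcribed correctly."""
    if total_words <= 0:
        raise ValueError("total_words must be positive")
    if not 0 <= misrecognized <= total_words:
        raise ValueError("misrecognized must be between 0 and total_words")
    return 100.0 * (total_words - misrecognized) / total_words


# Figures from the two tests described above (1,000 words each).
untrained = word_accuracy(1000, 7)  # no training: seven wrong words
trained = word_accuracy(1000, 5)    # after the five-minute script: five wrong words

print(f"No training:   {untrained:.1f} percent")
print(f"With training: {trained:.1f} percent")
```

By this measure, five minutes of training bought two fewer errors per 1,000 words.
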
In both cases, the number of spelling mistakes was zero. People who
use NaturallySpeaking never make typos, only wordos.
As you correct the mistakes with your voice -- a speedy, streamlined
procedure -- the program learns. Whether you skip initial training or
not, accuracy inches toward perfection over time.
One way that Nuance has improved accuracy is by acknowledging, for the
first time, that not everyone speaks alike. Version 10 recognizes
eight accents: general (none), Australian, British, Indian, Great
Lakes (Buffalo to Chicago), Southeast Asian, Southern United States
and Spanish. If you don't specify, the program will identify you
automatically.
Isn't that somehow politically incorrect? Should a software program
treat you differently depending on how you sound?
Ah, the heck with it. It's dictation software. A little stereotyping
can go a long way.
Speed is another virtue in version 10. The program still waits for a
pause in your talking before it types, so that it can use context to
choose, for example, the correct homonym (there/they're/their). But
that waiting period has been halved; text appears almost
instantaneously at each pause.
The program also understands more "natural language" commands -- and
here's where things start to get Star Trekky.
For example, italicizing something you've already typed, say, the
phrase "gas prices," used to require three separate commands. First,
"Select gas prices." Then, "Italicize that." Finally, to move your
insertion point back where you stopped, "Go to end of document."
In version 10, a single command does the trick: "Italicize 'gas
prices.'" The program makes the change and returns to where you
stopped, all in a blink. The same trick also works with the verbs
"bold," "underline," "delete," "cut" and "copy." (Yes, "bold" is a
verb now.)
You can speak a series of new Search commands, beginning with "Search
computer for ...," "Search the Web for ...," "Search e-mail for ..."
and so on.
For example: "Search maps for Chinese restaurants near Hoboken." Or
"Search Wikipedia for Bay of Pigs." Or "Search images for
Gwyneth Paltrow." These shortcuts work 100 percent reliably and do
truly save you time and typing. Next version: more of them, please.
And now, the NatSpeak Frequently Asked Questions:
"Does NaturallySpeaking work on a Mac?" Yes, but only when the Mac is
running Windows and you're using a U.S.B. headset adapter. It works
fantastically in Boot Camp and fast enough in VMware Fusion, a
virtualization program.
Of course, it might be simpler just to buy MacSpeech Dictate, a Mac
program that uses the same Dragon recognition technology. The current
version is fast and accurate, but it lags behind NatSpeak in features
and power; it doesn't even let you make corrections by voice, and
therefore the accuracy never improves. But a 1.2 version, with voice
correction and voice spelling, is in testing now.
"Can I transcribe interviews with it?" No. NatSpeak knows only one
person's voice: yours. It also requires a clean audio signal, like the
one from a headset mike half an inch from your mouth.
"Can I dictate with a wireless Bluetooth earpiece?" Yes. In fact,
version 10 greatly expands the number of compatible earpiece models
(18 so far, listed at nuance.com). Accuracy may take a hit,
though.
"Can I dictate into a pocket recorder and transcribe it later?" Yes.
The setup is more involved, though: only some recorders are
compatible, and you have to record 15 minutes of training.
"Doesn't Windows Vista come with speech recognition?" Yes, and it's
really good -- quite similar to NatSpeak, actually. But Nuance says
that, oddly enough, Vista has had virtually no effect on NatSpeak
sales.
I'm guessing that obscurity is part of the reason; most people aren't
even aware that Vista offers such a feature. Vista doesn't come with
the required headset, either. Nor does the Vista version offer the
same accuracy, features or power as NatSpeak, and it isn't available
in other languages (French, Italian, German, Spanish, Dutch and so
on).
NatSpeak is available in a number of versions. The Standard edition
($100) has the same accuracy as the others, but it's just for
bare-bones dictation.
To get the more advanced goodies described in this review -- the
natural-language commands, Bluetooth mikes and recorders -- you need
the Preferred edition ($200). It also lets you set up voice macros
that type out boilerplate text. For example, you can say, "Buzz off,"
and it will type: "Thanks for thinking of me! Unfortunately, I'm
afraid I'm unable to accept your kind offer at this time."
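
Conceptually, a voice macro like that is just a lookup table from a spoken trigger phrase to a block of canned text. The toy sketch below, in Python, shows the idea only; the names MACROS and expand are hypothetical, and nothing here reflects how NaturallySpeaking actually stores or matches its macros:

```python
# Hypothetical macro table: spoken trigger phrase -> boilerplate text.
MACROS = {
    "buzz off": (
        "Thanks for thinking of me! Unfortunately, I'm afraid I'm "
        "unable to accept your kind offer at this time."
    ),
}


def expand(utterance: str) -> str:
    """Return the boilerplate for a known trigger, else the utterance itself."""
    # Normalize so "Buzz off." and "buzz off" hit the same entry.
    key = utterance.strip().lower().rstrip(".!?,")
    return MACROS.get(key, utterance)


print(expand("Buzz off"))
print(expand("Make this document shorter and funnier"))
```

Anything that isn't a registered trigger passes through untouched, which is the behavior you'd want from ordinary dictation.
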
There are also medical and legal editions ($1,600 and $1,200, yikes),
as well as a Professional edition ($900) for corporate administrators
who want to manage many NatSpeak installations from a central server.
The Pro version also recognizes natural-language commands for
Microsoft Outlook, like "Send e-mail to Mom" or "Schedule a meeting
with Barack Obama and John McCain."
Apart from Vista, NatSpeak really has no competition. Philips has
dropped out of the American market. I.B.M.'s own ViaVoice hasn't
been updated since 2003, and its sole distributor is, get this,
Nuance.
Maybe that's why Nuance makes only small, confident changes from one
version of NatSpeak to the next. Without any rivals, why add bells and
whistles that risk mucking up the program's virtues?
As a result, existing NaturallySpeaking owners can usually afford to
skip a generation between upgrades. Version 10 is a healthy leap ahead
of version 8, but version 9 owners shouldn't feel compelled to
upgrade.
And now, if you'll excuse me, I have some real work to do: "Search
maps for dilithium crystals near New York City. ..."