Speak Up, a Computer Is Listening

    By DAVID POGUE

    Of all the high-tech fantasies that sci-fi movies tantalize their
    escapist audiences with, surely that bit about giving your computer
    spoken orders is one of the most alluring. Ever since "Star Trek,"
    we've dreamed of being able to say, "Computer, display all known
    sources of dilithium crystals in the Kraxon Nebula!"

    So far, the closest we can get is strapping on a headset and
    dictating, using a program like Dragon NaturallySpeaking to do the
    typing. This software is great for anyone who can't type or doesn't
    like to. And it lets you speak the names of menu commands and "click"
    links on a Web page.

    But that's not the same as telling the computer what to do in
    conversational English.

    NaturallySpeaking 10, available Thursday, takes some baby steps in the
    right direction. It doesn't turn your computer into the "Star Trek"
    mainframe; it doesn't know what you mean by, for example, "Make this
    document shorter and funnier." But in its timid, conservative way, it
    takes voice control unmistakably closer to that holy grail of
    computing.

    NatSpeak's principal mission, though, is to type out, into any Windows
    program, whatever you say. And in version 10, its maker, Nuance,
    claims to have eked out yet another 20 percent accuracy improvement.

    I installed the program, donned the included headset and clicked "Skip
    initial training." (In the early days of speech recognition, you had
    to read a 45-minute sample script to train the program to recognize
    your voice. Today, the software is so good, you can skip the training
    altogether.)

    As a quick test, I read aloud the first 1,000 words of "Freakonomics"
    into Microsoft Word. Impressively enough, NatSpeak effortlessly
    transcribed words like "Ku Klux Klan" and "Punic war." It did,
    however, mistype seven easier words ("addition" instead of "edition,"
    for example, and "per trail" instead of "portrayal"). Accuracy tally
    with no training: 99.3 percent. Not too shabby.

    Then I tried a second test: I read one of the five-minute training
    scripts (a Kennedy speech), which is recommended for even better
    initial accuracy. I again read the first 1,000 words of
    "Freakonomics," and the program mistyped five words. Accuracy this
    time: 99.5 percent.
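
    For readers who want to check the arithmetic, here is a minimal
    sketch of how the two tallies above work out. It assumes accuracy
    simply means the share of the 1,000 dictated words that came out
    right; the function name word_accuracy is made up for this
    illustration and is not part of NaturallySpeaking.

```python
def word_accuracy(total_words: int, mistyped: int) -> float:
    """Return the percentage of dictated words transcribed correctly.

    Assumption: every mistyped word counts as exactly one error.
    """
    if total_words <= 0:
        raise ValueError("total_words must be positive")
    if not 0 <= mistyped <= total_words:
        raise ValueError("mistyped must be between 0 and total_words")
    return 100.0 * (total_words - mistyped) / total_words


if __name__ == "__main__":
    # The two tests from the review: 1,000 words each, 7 and 5 errors.
    for label, errors in [("no training", 7), ("five-minute training", 5)]:
        print(f"{label}: {word_accuracy(1000, errors):.1f} percent")
```

    Put another way, the five-minute script cut the errors from seven
    per thousand words to five.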

    In both cases, the number of spelling mistakes was zero. People who
    use NaturallySpeaking never make typos, only wordos.

    As you correct the mistakes with your voice -- a speedy, streamlined
    procedure -- the program learns. Whether you skip initial training or
    not, accuracy inches toward perfection over time.

    One way that Nuance has improved accuracy is by acknowledging, for the
    first time, that not everyone speaks alike. Version 10 recognizes
    eight accents: general (none), Australian, British, Indian, Great
    Lakes (Buffalo to Chicago), Southeast Asian, Southern United States
    and Spanish. If you don't specify, the program will identify you
    automatically.

    Isn't that somehow politically incorrect? Should a software program
    treat you differently depending on how you sound?

    Ah, the heck with it. It's dictation software. A little stereotyping
    can go a long way.

    Speed is another virtue in version 10. The program still waits for a
    pause in your talking before it types, so that it can use context to
    choose, for example, the correct homophone (there/they're/their). But
    that waiting period has been halved; text appears almost
    instantaneously at each pause.

    Second -- and here's where things start to get Star Trekky -- the
    program understands more "natural language" commands.

    For example, italicizing something you've already typed, say, the
    phrase "gas prices," used to require three separate commands. First,
    "Select gas prices." Then, "Italicize that." Finally, to move your
    insertion point back where you stopped, "Go to end of document."

    In version 10, a single command does the trick: "italicize 'gas
    prices.'" The program makes the change and returns to where you
    stopped, all in a blink. The same trick also works with the verbs
    "bold," "underline," "delete," "cut" and "copy." (Yes, "bold" is a
    verb now.)

    You can speak a series of new Search commands, beginning with "Search
    computer for ...," "Search the Web for ...," "Search e-mail for ..."
    and so on.

    For example: "Search maps for Chinese restaurants near Hoboken." Or
    "Search Wikipedia for Bay of Pigs." Or "Search images for
    Gwyneth Paltrow." These shortcuts work 100 percent reliably and do
    truly save you time and typing. Next version: more of them, please.

    And now, the NatSpeak Frequently Asked Questions:

    "Does NaturallySpeaking work on a Mac?" Yes, but only when the Mac is
    running Windows and you're using a U.S.B. headset adapter. It works
    fantastically in Boot Camp and fast enough in VMware Fusion, a
    virtualization program.

    Of course, it might be simpler just to buy MacSpeech Dictate, a Mac
    program that uses the same Dragon recognition technology. The current
    version is fast and accurate, but it lags behind NatSpeak in features
    and power; it doesn't even let you make corrections by voice, and
    therefore the accuracy never improves. But a 1.2 version, with voice
    correction and voice spelling, is in testing now.

    "Can I transcribe interviews with it?" No. NatSpeak knows only one
    person's voice: yours. It also requires a clean audio signal, like the
    one from a headset mike half an inch from your mouth.

    "Can I dictate with a wireless Bluetooth earpiece?" Yes. In fact,
    version 10 greatly expands the number of compatible earpiece models
    (18 so far, listed at nuance.com). Accuracy may take a hit,
    though.

    "Can I dictate into a pocket recorder and transcribe it later?" Yes.
    The setup is more involved, though: only some recorders are
    compatible, and you have to record 15 minutes of training.

    "Doesn't Windows Vista come with speech recognition?" Yes, and it's
    really good -- quite similar to NatSpeak, actually. But Nuance says
    that, oddly enough, Vista has had virtually no effect on NatSpeak
    sales.

    I'm guessing that obscurity is part of the reason; most people aren't
    even aware that Vista offers such a feature. Vista doesn't come with
    the required headset, either. Nor does the Vista version offer the
    same accuracy, features or power as NatSpeak, and it isn't available
    in other languages (French, Italian, German, Spanish, Dutch and so
    on).

    NatSpeak is available in a number of versions. The Standard edition
    ($100) has the same accuracy as the others, but it's just for
    bare-bones dictation.

    To get the more advanced goodies described in this review -- the
    natural-language commands, Bluetooth mikes and recorders -- you need
    the Preferred edition ($200). It also lets you set up voice macros
    that type out boilerplate text. For example, you can say, "Buzz off,"
    and it will type: "Thanks for thinking of me! Unfortunately, I'm
    afraid I'm unable to accept your kind offer at this time."

    There are also medical and legal editions ($1,600 and $1,200, yikes),
    as well as a Professional edition ($900) for corporate administrators
    who want to manage many NatSpeak installations from a central server.
    The Pro version also recognizes natural-language commands for
    Microsoft Outlook, like "Send e-mail to Mom" or "Schedule a meeting
    with Barack Obama and John McCain."

    Apart from Vista, NatSpeak really has no competition. Philips has
    dropped out of the American market. I.B.M.'s own ViaVoice hasn't
    been updated since 2003, and its sole distributor is, get this,
    Nuance.

    Maybe that's why Nuance makes only small, confident changes from one
    version of NatSpeak to the next. Without any rivals, why add bells and
    whistles that risk mucking up the program's virtues?

    As a result, existing NaturallySpeaking owners can usually afford to
    skip a generation between upgrades. Version 10 is a healthy leap ahead
    of version 8, but version 9 owners shouldn't feel compelled to
    upgrade.

    And now, if you'll excuse me, I have some real work to do: "Search
    maps for dilithium crystals near New York City. ..."


