 By Suelette Dreyfus
 Special Correspondent
 CyberWire Dispatch

 "Semantic Forests" doesn't mean much to the average person. But if
 you say it in concert with the words "automatic voice telephone
 interception" and "U.S. National Security Agency" to a computational
 linguist, you might just witness the physical manifestations of the word

 Words are funny things, often so imprecise. Two people can have a
 telephone conversation about sex, without ever mentioning the word.
 And when the artist formerly known as Prince sang a song about
 "cream," he wasn't talking about a dairy product.

 All this linguistic imprecision has largely protected our voice
 conversations from the prying ears of governments. Until now.

 Or, more particularly, it protected us until 15 April, 1997 - the
 date the NSA lodged a secret patent application at the US Patent
 Office. Of course, the content of the NSA patent was not made public for
 two years, since the Patent Office keeps patent applications secret until
 they are approved, which in this case was August 10, 1999.

 What is so worrying about patent number 5,937,422? The NSA is
 believed to be the largest and by far the best-funded spy agency in the
 world, a Microsoft of Spookdom. This document provides the first hard
 evidence that the NSA appears to be well on its way to creating
 eavesdropping software capable of listening to millions of international
 telephone calls a day. Automatically.

 Patents are sometimes simply ambit claims, legal handcuffs on what
 often amounts to little more than theory. Not in this case. This is
 real. The U.S. Department of Defense has developed the NSA's patent
 ideas into a real software program, called "Semantic Forests," which it
 has been lab testing for at least two years.

 Two important reports to the European Parliament, in 1998 and 1999,
 and Nicky Hager's 1996 book "Secret Power" reveal that the NSA
 intercepts international faxes and emails. At the time, this
 revelation upset a great number of people, no doubt including the
 European companies which lost competitive tenders to American
 corporations not long after the NSA found its post-Cold War "new
 economy" calling: economic espionage.

 Voice telephone calls, however, well, that is another story. Not
 even the world's most technically advanced spy agency has the
 ability to do massive telephone interception and automatically
 massage the content looking for particular words, and presumably
 topics. Or so said a comprehensive recent report to the European
 Parliament.

 In April 1999, a report commissioned by the Parliament's Office of
 Scientific and Technological Options Assessment (STOA) concluded
 that "effective voice 'wordspotting' systems do not exist" and "are
 not in use".

 The tricky bit there is "do not exist". Maybe these systems haven't
 been deployed en masse, but it is looking increasingly like they do
 actually exist, probably in some form which may be closer to the
 more powerful topic spotting.

 Do The Math

 There are two new pieces of evidence to support this, and added
 together, they raise some fairly explosive questions about exactly
 what the NSA is doing with the millions of international phone calls it
 intercepts every day in its electronic eavesdropping web commonly known
 as Echelon.

 First. The NSA's shiny new patent describes a method of
 "automatically generating a topic description for text and sorting
 text by topic." Sound like a sophisticated web search engine? That's
 because it is.

 This is a search engine designed to trawl through "machine
 transcribed speech," in the words of the patent application. Think
 computers automatically typing up words falling from human lips. Now
 think of a powerful search engine trawling through those words.

 Now sweat...

 Maybe the spy agency only wants to transcribe the BBC Radio World
 News, but I don't think so. The patent contains a few more
 linguistic clues about the NSA's intent - little golden Easter eggs
 buried in the legal long grass. The "Background to the Invention"
 section of every patent application is the place where the
 intellectual property lawyers desperately try to wave away everyone
 else's right to claim anything even remotely touching on the patent.

 In this section, the NSA attorneys observed there has been "growing
 interest" in automatically identifying topics in "unconstrained
 speech."

 Only a lawyer could make talking sound so painful. "Unconstrained
 speech" means human conversation. Maybe it's been "unconstrained" by the
 likelihood of being automatically transcribed for real time topic

 Here's the part where the imprecision of words - particularly spoken
 words - comes in. Machine transcribed conversations are raw, and very
 hard to analyze automatically with software. Many experts thought the NSA
 couldn't go driftnet fishing in the content of everyone's international
 phone calls because the technology to transcribe and analyze those calls
 was too young.

 However, if the NSA didn't have the technology to do automatic
 transcription of speech, why would it have patented a sifting method
 which, by its very own words, is aimed at transcripts of human speech?

 As Australian cryptographer Julian Assange, who discovered the DoD
 and patent papers while investigating NSA capabilities, observed:
 "Why make tires if you don't have a car? Maybe we haven't seen the
 car yet, but we can infer that it exists by all the tires and

 One of the top American cryptographers, Bruce Schneier, also
 believes the NSA already has machine transcription capability. "One
 of the Holy Grails of the NSA is the ability to automatically search
 through voice traffic," Schneier said. "They would have expended
 considerable effort on this capability, and this research indicates at
 least some of it has been fruitful."

 Second, two Department of Defense academic papers show the U.S.
 developed a real software program, called "Semantic Forests," to
 implement the patented method.

 Published as part of the Text REtrieval Conference (TREC) in 1997
 and 1998, the Semantic Forests papers show the program has one main
 purpose: "performing retrieval on the output of automatic
 speech-to-text (speech recognition) systems." In other words, the
 U.S. built this software *specifically* to sift through
 computer-transcribed human speech.

 If that doesn't send a chill down your spine, read on.

 The DoD's second prime purpose for Semantic Forests was to "explore
 rapid prototyping" of this information retrieval system. That
 statement was written in 1997.

 There's also an unambiguous link between Semantic Forests and the
 NSA patent: it's human, and its name is Patrick Schone.

 Schone appears on the NSA patent documents as an inventor and on the
 Semantic Forests papers as an author, and he works at Ft. Meade,
 NSA's headquarters.

 Specifically, he works in the DoD's "Speech Research Branch" which
 just happens to be located at, you guessed it, Ft. Meade.

 Very Clever Fish

 The NSA and the DoD refused to comment on the patent or Semantic
 Forests respectively. Not surprising, really, but no matter, since the
 Semantic Forests papers speak for themselves. The papers reveal a software
 program which, while somewhat raw a year ago, was advancing quickly in
 its ability to fish relevant data out of various document pools,
 including those based on speech.

 For example, in one set of tests, the scientists increased the
 average precision rate for finding relevant documents per query from 19%
 to 27% in just one year, from 1997 to 1998. Tests in 1998 on another set
 of documents, in the "Spoken Document Retrieval" pool, were turning up
 similar stats of around 20-23%. The team also discovered that a
 little hand-fiddling in the software reaped large rewards.

 According to the 1998 TREC paper: "When we supplemented the topic
 lists for all the queries (by hand) to contain additional words from the
 relevant documents, our average precision at the number of relevant
 documents went from 28% to 50%."
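
 For readers who want to see what that figure measures: "precision at
 the number of relevant documents" is commonly called R-precision in
 information retrieval. If a query has R relevant documents in the pool,
 you look at the top R results the system returned and ask what fraction
 of them are relevant. The sketch below is a minimal illustration in
 Python, not the Semantic Forests code; the function names and the
 document ids are made up for the example.

```python
def precision_at_r(ranked_ids, relevant_ids):
    """Fraction of the top-R ranked documents that are relevant,
    where R is the number of relevant documents for the query."""
    relevant = set(relevant_ids)
    r = len(relevant)
    if r == 0:
        return 0.0  # no relevant documents: define the score as zero
    top_r = ranked_ids[:r]
    hits = sum(1 for doc_id in top_r if doc_id in relevant)
    return hits / r


def mean_precision_at_r(runs):
    """Average the per-query score over (ranked_ids, relevant_ids) pairs."""
    scores = [precision_at_r(ranked, relevant) for ranked, relevant in runs]
    return sum(scores) / len(scores)


# Hypothetical query: four relevant documents exist in the pool, and
# three of them appear among the system's top four results.
ranked = ["d3", "d7", "d1", "d9", "d4"]
relevant = {"d1", "d3", "d5", "d9"}
print(precision_at_r(ranked, relevant))  # 0.75
```

 On this measure, the jump from 28% to 50% means that, averaged over
 the queries, about half of the top R results were relevant after the
 topic lists were supplemented by hand, up from a little over a quarter.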

 The truth is that Schone and his colleagues have created a truly
 clever invention. They have done some impressive research. What a
 shame all this creativity and laborious testing is going to be used
 for such dark, Orwellian purposes.

 Let's work on the mental image of that dark landscape. The NSA sucks down
 phone calls, emails - all sorts of communications to its satellite bases.
 Its computers sift through the data looking for information which might
 interest the U.S. or, if the Americans happen to be feeling generous that
 day, their allies.

 Now, whenever NSA agents want to find out about you, they pull up a
 slew of details about you on their database. And not just the
 run-of-the-mill gumshoe detective stuff like your social security
 number and address, but the telephone number of every person you call
 regularly, and everything you have said when making those calls to
 1-900-Lick-Me from your hotel room on those stopovers in Cleveland.

 And here's the real scary stuff:

 The NSA likely already has a file on many of us. It's not a
 traditional manila file with your name typed neatly on the front.
 It's the ability to reference you, or anyone who matches your
 patterns of behavior and contacts, in the NSA's databases. Now, or
 in the near future, this file may not just include who you are, but
 what you *say*.

 British Member of the European Parliament Glyn Ford is one of the
 few politicians around who is truly concerned with the individual's
 right to privacy. A driving force behind the European Parliament's
 STOA panel's two-year investigation into electronic communications,
 Ford is worried that the NSA possesses technologies that are
 "potentially very dangerous" to privacy and yet is subject to no
 controls over its activities.

 The Australian Aboriginal activist and lawyer Noel Pearson once said
 that the British gave three great things to the world: tea, cricket and
 common law. If unchecked, the NSA and its sister spy agencies in the
 UKUSA agreement may use this technology to lead an assault on the most
 important of those gifts, and the common law tenet "innocent until proven
 guilty" may be the first casualty.

 How ironic: one Blair wrote '1984' as fiction, and another is
 helping to make it fact.

 = = = = = = = = = = = = = = = =

 An Australian-American writer, Suelette Dreyfus was educated in the
 UK and US, studied at Oxford University and Columbia University in
 New York, where she won the prestigious Teichmann Prize for
 excellence and originality in writing. She is the author of
 Underground, the first book about Australian computer hacking,
 available at

 = = = = = = = = = = = = = = = = =

 EDITOR'S NOTE: CyberWire Dispatch, with an Internet circulation
 estimated at more than 600,000, is now developing plans for a
 once-a-week e-mail publication. Every week, one of five well-known
 investigative reporters will file for CWD. If you think your company or
 organization would be interested in more information about establishing
 a sponsorship relationship with CyberWire Dispatch, please contact Lewis
