-Caveat Lector-

see also: http://mediafilter.org/caq/cryptogate/

Dave Hartley
http://www.Asheville-Computer.com/dave

 US5937422: Automatically generating a topic description for text and
searching and sorting text by topic using the same
 http://www.patents.ibm.com/details?&pn=US05937422__
Inventor(s): Nelson; Douglas J., Columbia, MD
Schone; Patrick John, Elkridge, MD
Bates; Richard Michael, Greenbelt, MD

 Applicant(s): The United States of America as represented by the National
Security Agency, Washington, DC

Issued/Filed Dates: Aug. 10, 1999 / April 15, 1997

Application Number: US1997000834263

IPC Class: G06F 017/30;

Class: 707/531; 707/004; 707/532; 707/535; 707/512;

Field of Search: 704/010 707/512,532,535,531,3-5,7

Abstract: A method of automatically generating a topical description of text
by receiving the text containing input words; stemming each input word to
its root form; assigning a user-definable part-of-speech score to each input
word; assigning a language salience score to each input word; assigning an
input-word score to each input word; creating a tree structure under each
input word, where each tree structure contains the definition of the
corresponding input word; assigning a definition-word score to each
definition word; collapsing each tree structure to a corresponding tree-word
list; assigning a tree-word-list score to each entry in each tree-word list;
combining the tree-word lists into a final word list; assigning each word in
the final word list a final-word-list score; and choosing the top N scoring
words in the final word list as the topic description of the input text.
Document searching and sorting may be accomplished by performing the method
described above on each document in a database and then comparing the
similarity of the resulting topical descriptions.
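
The abstract does not say how the similarity of two topical descriptions is
measured. As a minimal sketch only, the following assumes each description is a
plain list of topic words and uses Jaccard overlap as a stand-in measure. The
function names topic_similarity and rank_documents are hypothetical and do not
come from the patent.

```python
def topic_similarity(topics_a, topics_b):
    """Jaccard overlap between two topic-word lists (an assumed measure)."""
    a, b = set(topics_a), set(topics_b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def rank_documents(query_topics, doc_topics):
    """Sort document ids from most to least similar to the query topics.

    doc_topics maps a document id to its list of topic words.
    Ties are broken by document id so the ordering is deterministic.
    """
    return sorted(
        doc_topics,
        key=lambda doc_id: (-topic_similarity(query_topics, doc_topics[doc_id]), doc_id),
    )
```

Any set or vector similarity could be swapped in here; Jaccard is used only
because it needs nothing beyond the top-N word lists themselves.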

Attorney, Agent, or Firm: Morelli; Robert D.;

Primary/Assistant Examiners: Amsbury; Wayne; Channavajjala; Srirama

U.S. References: (No patents reference this one)

Patent     Issued   Inventor(s)      Title
US4965763  10/1990  Zamora           Computer method for automatic extraction
                                     of commonly specified information from
                                     business correspondence
US5371673  12/1994  Fan              Information processing analysis system
                                     for sorting and scoring text
US5384703  1/1995   Withgott et al.  Method and apparatus for summarizing
                                     documents according to theme
US5434962  7/1995   Kyojima et al.   Method and system for automatically
                                     generating logical structures of
                                     electronic documents
US5619410  4/1997   Emori et al.     Keyword extraction apparatus for
                                     Japanese texts
US5845278  12/1998  Kirsch et al.    Method for automatically selecting
                                     collections to search in full text
                                     searches
US5873660  2/1999   Walsh et al.     Morphological search and replace

First Claim (of 31 claims):
What is claimed is:
    1. A method of automatically generating a topical description of text,
comprising the steps of:
a) receiving the text, where the text consists of one or more input words;
b) stemming each input word to its root form;
c) assigning a user-definable part-of-speech score βi to each input word;
d) assigning a language salience score Si to each input word;
e) assigning an input-word score to each input word that is a function of
the corresponding input word's part-of-speech score βi, language salience
score Si, and the number of times the corresponding input word appears in
the text;
f) creating a tree structure under each input word, where each tree
structure contains the definition of the corresponding input word, where
each definition word may be further defined to a user-definable number of
levels;
g) assigning a definition-word score Ai,t [j] to each definition word in
each tree structure based on the definition word's part-of-speech score βj,
the language salience score of the word the definition word defines, a
relational salience score Rk,j, and a user-definable factor W;
h) collapsing each tree structure to a corresponding tree-word list, where
each tree-word list contains the unique words contained in the corresponding
tree structure;
i) assigning a tree-word-list score to each word in each tree-word list,
where each tree-word-list score is a function of the scores of the
corresponding word that existed in the corresponding uncollapsed tree
structure;
j) combining the tree-word lists into a final word list, where the final
word list contains the unique words contained in the tree-word lists;
k) assigning a final-word-list score Afi [j] to each word in the final word
list, where Afi [j] is a function of the corresponding word's dictionary
salience and tree-word-list scores; and
l) choosing the top N scoring words in the final word list as the topic
description of the input text, where the value N may be defined by the user.
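
Claim 1 names the inputs to each score but not the formulas, so the following
is only a rough sketch of steps a) through l) under stated assumptions. The
dictionary, the score tables, the stemmer, and every formula (simple products
for the input-word and definition-word scores, a sum when collapsing a tree,
relational salience fixed at 1.0, and language salience standing in for
dictionary salience) are invented for illustration and are not taken from the
patent.

```python
from collections import Counter

# Hypothetical toy resources; the claim does not specify any of these values.
DICTIONARY = {
    "dog": ["domestic", "animal"],
    "cat": ["small", "domestic", "animal"],
    "animal": ["living", "organism"],
}
POS_SCORE = {}                               # beta per word, default 1.0
SALIENCE = {"animal": 2.0, "organism": 2.0}  # S per word, default 1.0
W = 0.5          # user-definable factor, applied once per definition level
MAX_LEVELS = 2   # user-definable definition depth


def stem(word):
    """Stand-in stemmer: lowercase and strip a trailing plural 's'."""
    word = word.lower()
    return word[:-1] if word.endswith("s") and len(word) > 3 else word


def beta(word):
    return POS_SCORE.get(word, 1.0)


def salience(word):
    return SALIENCE.get(word, 1.0)


def expand(word, parent_score, level, out):
    """Steps f-i: walk the definition tree and sum scores per unique word."""
    if level > MAX_LEVELS:
        return
    for d in DICTIONARY.get(word, []):
        # Assumed form: beta of the definition word, times the salience of
        # the word being defined, times W, times the parent's score.
        # Relational salience R is taken as 1.0 here.
        score = beta(d) * salience(word) * W * parent_score
        out[d] += score
        expand(d, score, level + 1, out)


def topic_description(text, n=3):
    words = [stem(w) for w in text.split()]       # steps a-b
    final = Counter()
    for w, count in Counter(words).items():
        input_score = beta(w) * salience(w) * count   # steps c-e (assumed product)
        tree = Counter({w: input_score})
        expand(w, input_score, 1, tree)               # steps f-i
        final.update(tree)                            # step j: merge tree-word lists
    # Step k: weight each merged score by salience (assumed stand-in
    # for the claim's "dictionary salience").
    scored = {w: s * salience(w) for w, s in final.items()}
    ranked = sorted(scored.items(), key=lambda kv: (-kv[1], kv[0]))
    return [w for w, _ in ranked[:n]]                 # step l: top N words
```

With these toy tables, topic_description("dogs cats dog") returns
["animal", "organism", "dog"]: the shared definition words outrank the input
words because their scores accumulate across both trees and are then boosted
by the higher salience assigned to them.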

DECLARATION & DISCLAIMER
==========
CTRL is a discussion and informational exchange list. Proselytizing
propagandistic screeds are not allowed. Substance, not soapboxing! These are
sordid matters, and 'conspiracy theory', with its many half-truths,
misdirections and outright frauds, is used politically by different groups
with major and minor effects spread throughout the spectrum of time and
thought. That being said, CTRL gives no endorsement to the validity of posts,
and always suggests to readers: be wary of what you read. CTRL gives no
credence to Holocaust denial, and Nazis need not apply.

Let us please be civil and as always, Caveat Lector.
========================================================================
Archives Available at:
http://home.ease.lsoft.com/archives/CTRL.html

========================================================================
To subscribe to Conspiracy Theory Research List[CTRL] send email:
SUBSCRIBE CTRL [to:] [EMAIL PROTECTED]

To UNsubscribe to Conspiracy Theory Research List[CTRL] send email:
SIGNOFF CTRL [to:] [EMAIL PROTECTED]

Om
