Hi!

This is a little off-topic; I hope you won't mind.

I'm looking for information about generating keyword lists for search
engines.

The "base" approach is the following:
        - all words of a document are taken
        - words in a "stop list" of general words are ignored
        - the roots of the words are determined (similar to htfuzzy word2root)
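
To make the three steps above concrete, here is a minimal sketch in Python.
It is only an illustration: the stop list is a tiny made-up sample, and
crude_root() is a hypothetical suffix stripper standing in for real root
detection. It is not what htfuzzy does.

```python
import re

# Hypothetical, deliberately tiny stop list; a real one would be much longer.
STOP_WORDS = {"a", "an", "and", "are", "for", "in", "is", "of", "the", "to"}

# Suffixes tried in this order; an assumption for illustration only.
SUFFIXES = ("ing", "es", "ed", "s")


def crude_root(word):
    """Strip the first matching suffix, keeping at least three letters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def keyword_list(text):
    """Take all words, drop stop words, reduce the rest to crude roots."""
    words = re.findall(r"[a-z]+", text.lower())
    return sorted({crude_root(w) for w in words if w not in STOP_WORDS})


print(keyword_list("Searching searches the searched documents"))
# prints ['document', 'search']
```

Because the roots are collected in a set, every distinct root appears only
once in the list, however often its variants occur in the document.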

Now, I'm looking for ways to improve this (i.e. to reduce the size of the
list without losing much information).

Any hints on approaches and/or pointers to information on this topic are welcome!

Regards
-- jochen


-- 
Smoking Prohibited.  Absolutely no ifs, ands, or butts.

  [This is a signature virus, please copy me into your signature file!]


_______________________________________________
htdig-general mailing list <[EMAIL PROTECTED]>
To unsubscribe, send a message to <[EMAIL PROTECTED]> with a 
subject of unsubscribe
FAQ: http://htdig.sourceforge.net/FAQ.html
