There are just about as many ways of doing it as there are papers on automatic relevance feedback. Many require domain-specific reference documents that are full of facts and therefore good sources of related words. Some people use WordNet. Some of these techniques can add 400-500 terms to a query if they are searching long documents and using reference documents that are equally long. The technique matters mainly when searching long documents and is almost irrelevant for very short ones.
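To make the idea concrete, here is a minimal sketch of one simple way to pull expansion terms out of reference documents. It is not taken from any particular paper: the function names, the tiny stopword list, and the ranking rule (prefer terms that occur in more reference documents, then terms that occur more often) are all assumptions chosen for illustration. The max_terms cap is there because expansion can otherwise produce hundreds of terms.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption; real lists are much longer).
STOPWORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "are", "for", "that", "it"}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def expansion_terms(query, reference_docs, max_terms=20):
    """Return up to max_terms candidate terms to add to the query.

    Hypothetical ranking rule: terms found in more reference documents
    come first, ties are broken by total count, then alphabetically.
    Stopwords and terms already in the query are skipped.
    """
    query_terms = set(tokenize(query))
    doc_freq = Counter()   # number of reference docs containing the term
    term_freq = Counter()  # total occurrences across all reference docs
    for doc in reference_docs:
        tokens = [t for t in tokenize(doc)
                  if t not in STOPWORDS and t not in query_terms]
        term_freq.update(tokens)
        doc_freq.update(set(tokens))
    ranked = sorted(term_freq, key=lambda t: (-doc_freq[t], -term_freq[t], t))
    return ranked[:max_terms]
```

Calling expansion_terms("lucene search", docs, max_terms=50) with a list of reference document strings returns at most 50 new terms that could be ORed into the original query. Keeping the cap explicit makes it easy to compare short and long expansions on the same collection.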
Herb....

-----Original Message-----
From: Karl Koch [mailto:[EMAIL PROTECTED]
Sent: Wednesday, January 21, 2004 11:09 AM
To: Lucene Users List
Subject: Re: setMaxClauseCount ??

That sounds interesting to me. I refer to a paper written by NIST about relevance feedback which ran tests with 20 - 200 words. This is why I thought it might be good to be able to use all non-stopwords of a document for that and see what happens. Do you know of good papers about strategies for selecting keywords effectively, beyond the scope of stopword lists and stemming?