So even an educated calculation won't do it because you'd need to know how many documents the word occurs in (you could do a search, but that would be overkill and impractical).
Cool -----Original Message----- From: Ype Kingma [mailto:[EMAIL PROTECTED] Sent: Thursday, April 29, 2004 10:57 AM To: Lucene Users List Subject: Re: Count for a keyword occurance in a file On Thursday 29 April 2004 08:14, Nader S. Henein wrote: > Tricky, scoring has to do with the frequency of the occurrence of the > word as opposed to the amount of words in the file in general > (Somebody correct me if I'm wrong) , so short of an educated > approximation, you could hack Lucene uses two frequencies for a term: the nr. of docs in which it occurs in an index (basis for IDF), and the nr of times a term occurs in a document. > the indexer to dynamically store the frequency of a word (oh so > unadvisable). Personally I recommend the educated approximation, > because you could index the document with the number of words in it ( > you would have to make sure you're not using Stop Word Analyzer or > Port Stemmer) and then based on the score reverse engineer the result > you want. > > Nader Henein > > -----Original Message----- > From: hemal bhatt [mailto:[EMAIL PROTECTED] > Sent: Wednesday, April 28, 2004 5:50 PM > To: Lucene Users List > Subject: Count for a keyword occurance in a file > > > Hi, > > How can I get a count of the score given by Hits.Score(). > i.e I want to know how many times a keyword occurs in a file. Any help > on this would be appreciated. The easiest way is to use IndexReader. I don't know what you mean by file (index or document), but you can have both frequencies I mentioned above from an IndexReader, evt. using skipTo() to go to the document. The methods are docFreq(Term) and termDocs(Term). Regards, Ype --------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED] --------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
