Computing the total occurrence count of a term across all of the documents in the collection via the TermDocs route is costly if you do it at runtime for a probabilistic retrieval model. However, this computation can be taken offline: you can create a new index that has one Document for each term in the original index, with a stored field holding the occurrence count calculated by the offline process. This could save you a lot of runtime computation, and it also gives you a place to store collection-level statistics about a term.
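
As a rough sketch of the offline pass, here is the summation in plain Java with no Lucene calls. The class and method names are made up for illustration, the per-document term frequencies are assumed to have been extracted already, and the in-memory map only stands in for the secondary index with its stored count field.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class CollectionFrequencySketch {

    // Offline pass: add up each term's within-document frequency over all documents.
    // Each input map is one document: term -> number of occurrences in that document.
    static Map<String, Long> computeCollectionFrequencies(List<Map<String, Integer>> docTermFreqs) {
        Map<String, Long> totals = new HashMap<>();
        for (Map<String, Integer> doc : docTermFreqs) {
            for (Map.Entry<String, Integer> e : doc.entrySet()) {
                totals.merge(e.getKey(), e.getValue().longValue(), Long::sum);
            }
        }
        return totals;
    }

    // Runtime lookup: a single read of the precomputed total (0 for unseen terms).
    static long lookup(Map<String, Long> totals, String term) {
        return totals.getOrDefault(term, 0L);
    }

    public static void main(String[] args) {
        // Hypothetical two-document collection.
        List<Map<String, Integer>> docs = List.of(
            Map.of("lucene", 2, "index", 1),
            Map.of("lucene", 1, "term", 4));
        Map<String, Long> totals = computeCollectionFrequencies(docs);
        System.out.println(lookup(totals, "lucene")); // prints 3
    }
}
```

In the real setup each entry of the map would become one Document in the new index, so the runtime cost per term is one lookup instead of a walk over every posting.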
- Niranjan

Niranjan Balasubramanian
Software Engineer
Center For Natural Language Processing (http://cnlp.syr.edu)
Syracuse University

>>> [EMAIL PROTECTED] 8/4/2004 11:34:40 AM >>>
On Aug 4, 2004, at 8:25 AM, ABDOU Samir wrote:
> What about the frequency of any given term in the whole collection!?

IndexReader.docFreq(Term t)

> Calculate this at runtime may affect considerably performance!

It's computed during indexing! :)

Erik

>
> Thanks,
>
> -----Original Message-----
> From: Erik Hatcher [mailto:[EMAIL PROTECTED]
> Sent: Wednesday, 4 August 2004 12:25
> To: Lucene Developers List
> Subject: Re: Term Collection Frequency?
>
> The new term vector feature will give you this exact information for a
> particular document or field.
>
> Erik
>
> On Aug 4, 2004, at 3:59 AM, ABDOU Samir wrote:
>
>> Hi,
>>
>> In order to implement a new search model within Lucene (probabilistic),
>> I need a collection frequency of each term (the number of occurrences of
>> a term within a collection). So, what would be the best way to implement
>> this?
>>
>> Any suggestions, ideas... are welcome.
>>
>> Thanks,
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: [EMAIL PROTECTED]
>> For additional commands, e-mail: [EMAIL PROTECTED]