On Thursday 14 Apr 2005 15:15, Pablo Gomes Ludermir wrote:
> Hello all,
>
> I would like to get the following information from the index:
>
> 1. Given a term, how many times the term occurs in each document.
> Something like a triple:
> < Term, Doc1, Freq> , <Term, Doc2, Freq>, <Term2, Docx, Freq>, ...
>
> Is possible to do that?
>
>
> Regards,
> Pablo
Off the top of my head... assuming you have an IndexReader (or MultiReader)
called reader:
TermEnum te = reader.terms();
while (te.next()) {
Term currentTerm = te.term();
TermDocs docs = reader.termDocs(currentTerm);
int docCounter = 1;
while (docs.next()) {
System.out.println(currentTerm.text() + ", doc" + docCount + ",
" + docs.freq());
docCounter++;
}
}
HTH,
Andy
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]