> while (termDocs.next()) {
> termDocs.next();
> }
For one, this loop calls next() twice in each iteration,
so it looks like every second document is skipped.
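
To illustrate, here is a minimal sketch of the loop with a single next() per
iteration. It does not use Lucene itself: DocCursor and ArrayCursor are
hypothetical stand-ins I made up so the snippet runs without an index, and
they only mimic the three TermDocs methods used in the thread (next(), doc(),
freq()). The names collect and docNums are likewise just for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class TermDocsLoop {
    /** Hypothetical stand-in for the TermDocs methods used in the thread. */
    interface DocCursor {
        boolean next();
        int doc();
        int freq();
    }

    /** Array-backed cursor so the sketch runs without a Lucene index. */
    static final class ArrayCursor implements DocCursor {
        private final int[] docs;
        private final int[] freqs;
        private int pos = -1;

        ArrayCursor(int[] docs, int[] freqs) {
            this.docs = docs;
            this.freqs = freqs;
        }

        public boolean next() {
            if (pos + 1 >= docs.length) {
                return false;
            }
            pos++;
            return true;
        }

        public int doc() { return docs[pos]; }
        public int freq() { return freqs[pos]; }
    }

    /** Advances the cursor exactly once per iteration, in the loop condition. */
    static int collect(DocCursor cursor, List<Integer> docNums) {
        int totalFreq = 0;
        while (cursor.next()) {
            docNums.add(cursor.doc());
            totalFreq += cursor.freq();
            // no second next() here: the while condition already advances
        }
        return totalFreq;
    }

    public static void main(String[] args) {
        List<Integer> docNums = new ArrayList<>();
        int total = collect(
                new ArrayCursor(new int[] {3, 7}, new int[] {2, 5}), docNums);
        System.out.println(docNums + " total freq = " + total);
    }
}
```

With the two made-up postings above, both documents are collected and the
frequencies sum to 7. With the extra next() inside the body, only the first
document would be collected.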
"chris.b" <[EMAIL PROTECTED]> wrote on 10/12/2007 12:58:15:
>
> Here goes,
> I'm developing an application using Lucene which will evaluate the
> representativeness of a list of keywords within a collection of documents.
> I'm doing this by indexing the documents, then loading the list of
> keywords and using the IndexReader class and DefaultSimilarity to retrieve
> the average tf of each word (where the tf is obtained through
> TermDocs.freq() and the average is the sum of tf's divided by the number
> of documents) and the idf for each word, and printing the output in an HTML
> document, together with the documents in which they appear, and others.
>
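
A side note on the averaging described above, as a rough sketch only. I am
assuming the sum of freq() values and the document count are both ints; if
so, dividing them directly truncates (1 / 2 is 0 in Java), which is one
possible, unconfirmed reason for a tf of 0. The idf formula below is how I
recall DefaultSimilarity computing it, log(numDocs / (docFreq + 1)) + 1, so
please check it against the Similarity source for your version. TermStats,
averageTf and idf are hypothetical names for this example.

```java
public class TermStats {
    /** Average raw term frequency; the cast avoids integer truncation. */
    static double averageTf(int totalFreq, int docCount) {
        if (docCount == 0) {
            return 0.0;
        }
        return (double) totalFreq / docCount;
    }

    /** idf as recalled from DefaultSimilarity (an assumption, see above). */
    static double idf(int docFreq, int numDocs) {
        return Math.log(numDocs / (double) (docFreq + 1)) + 1.0;
    }

    public static void main(String[] args) {
        System.out.println("int division:    " + (1 / 2));
        System.out.println("double division: " + averageTf(1, 2));
        System.out.println("idf(2, 10):      " + idf(2, 10));
    }
}
```
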
> At this point, I have found two problems.
> I have documents in which I know the word appears, but the tf still comes
> out as '0' (even though the number of documents says 2),
> and it doesn't print a list of all the documents (i.e., it says there are 2
> documents which contain the word, but only one of them is printed).
>
> I don't know if what I'm doing is correct, but to obtain the tf, I'm doing
> the following:
>
> while (termDocs.next()) {
> listaDocNums.add(termDocs.doc());
> tf += termDocs.freq();
> termDocs.next();
> }
>
> where termDocs is an enumeration of the documents which contain the word.
>
> and for the document names I'm doing the following:
>
> for (int f = 0; f < listaDocNums.size(); f++) {
>     outrstream.write(reader.document(listaDocNums.get(f)).get("filename"));
> }
>
> where listaDocNums is an ArrayList which contains the numbers of the
> documents.
> I must also mention that when I try printing the list of numbers, it also
> doesn't contain all the documents.
>
> That's it, I think I wrote all that was needed.
>
> Thanks in advance for any help/guidelines :)
>
> Chris
> --
> View this message in context:
> http://www.nabble.com/Problem-with-termdocs.freq-and-other-tp14250898p14250898.html
> Sent from the Lucene - Java Users mailing list archive at Nabble.com.
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
>