> while (termDocs.next()) {
>     termDocs.next();
> }

For one, this loop calls next() twice in each iteration, so every second document is skipped. That would explain why only one of the two matching documents is listed.

"chris.b" <[EMAIL PROTECTED]> wrote on 10/12/2007 12:58:15:

> Here goes,
> I'm developing an application using Lucene which will evaluate the
> representativeness of a list of keywords within a collection of documents.
> I'm doing this by indexing the documents and then loading the list of
> keywords and, using the IndexReader class and DefaultSimilarity, retrieving
> the average tf of each word (where the tf is obtained through
> TermDocs.freq() and the average is the sum of tf's divided by the number of
> documents) and the idf for each word, and printing the output in an HTML
> document, together with the documents in which they appear, and others.
>
> At this point, I have found two problems:
> 1. I have documents in which I know the word appears, but still the tf comes
>    out as '0' (even though the number of documents says 2).
> 2. It doesn't print a list of all the documents (i.e. it says there are 2
>    documents which contain the word, but only one of them is printed).
>
> I don't know if what I'm doing is correct, but to obtain the tf, I'm doing
> the following:
>
> while (termDocs.next()) {
>     listaDocNums.add(termDocs.doc());
>     tf += termDocs.freq();
>     termDocs.next();
> }
>
> where termDocs is an enumeration of the documents which contain the word.
>
> And for the document names I'm doing the following:
>
> for (int f = 0; f < listaDocNums.size(); f++) {
>     outrstream.write(reader.document(listaDocNums.get(f)).get("filename"));
> }
>
> where listaDocNums is an ArrayList which contains the numbers for the
> documents.
> I must also mention that when I try printing the list of numbers, it also
> doesn't contain all the documents.
>
> That's it, I think I wrote all that was needed.
>
> Thanks in advance for any help/guidelines :)
>
> Chris
> --
> View this message in context: http://www.nabble.com/Problem-with-termdocs.freq-and-other-tp14250898p14250898.html
> Sent from the Lucene - Java Users mailing list archive at Nabble.com.
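
To make the effect of the double next() call concrete, here is a minimal sketch that runs without Lucene. The Postings interface and the ArrayPostings class are hypothetical stand-ins invented for this example; they only mimic the three TermDocs methods used in the quoted loop (next(), doc(), freq()) and are not the Lucene API. The document numbers and frequencies are made up as well.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the TermDocs methods used in the thread.
// This is NOT the Lucene interface, only an in-memory imitation.
interface Postings {
    boolean next(); // advance to the next posting; false when exhausted
    int doc();      // document number of the current posting
    int freq();     // term frequency in the current document
}

// Hypothetical in-memory postings list backed by two parallel arrays.
class ArrayPostings implements Postings {
    private final int[] docs;
    private final int[] freqs;
    private int pos = -1; // positioned before the first posting

    ArrayPostings(int[] docs, int[] freqs) {
        this.docs = docs;
        this.freqs = freqs;
    }

    public boolean next() { return ++pos < docs.length; }
    public int doc() { return docs[pos]; }
    public int freq() { return freqs[pos]; }
}

class TermDocsLoopSketch {
    // Collected document numbers and the summed term frequency.
    static final class Stats {
        final List<Integer> docNums = new ArrayList<>();
        int totalFreq;
    }

    // Same shape as the loop in the original message: the extra next()
    // at the end of the body skips every second posting.
    static Stats collectSkipping(Postings p) {
        Stats s = new Stats();
        while (p.next()) {
            s.docNums.add(p.doc());
            s.totalFreq += p.freq();
            p.next(); // extra advance: the following posting is never read
        }
        return s;
    }

    // Advance only in the loop condition, so every posting is visited once.
    static Stats collectAll(Postings p) {
        Stats s = new Stats();
        while (p.next()) {
            s.docNums.add(p.doc());
            s.totalFreq += p.freq();
        }
        return s;
    }

    public static void main(String[] args) {
        int[] docs = {3, 7};  // made-up document numbers
        int[] freqs = {1, 4}; // made-up term frequencies

        Stats skipped = collectSkipping(new ArrayPostings(docs, freqs));
        Stats all = collectAll(new ArrayPostings(docs, freqs));

        System.out.println(skipped.docNums + " " + skipped.totalFreq); // [3] 1
        System.out.println(all.docNums + " " + all.totalFreq);         // [3, 7] 5
    }
}
```

With two matching documents, the first version records only one of them, which matches the symptom described above. Whether the reported tf of '0' has the same cause cannot be told from the thread, since the code that computes the average is not shown; if that division is done on int values, it would be worth checking separately.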