That's relatively easy, but it isn't available out of the box. Something like:

    import java.io.IOException;
    import java.util.TreeMap;
    import org.apache.lucene.index.IndexReader;
    import org.apache.lucene.index.Term;
    import org.apache.lucene.index.TermFreqVector;

    private TreeMap<Double, String> getTFIDF(String index, int documentId, String field) {
        TreeMap<Double, String> tfIdfs = new TreeMap<Double, String>();
        try {
            IndexReader ir = IndexReader.open(index);
            try {
                // Requires the field to have been indexed with term vectors.
                TermFreqVector tv = ir.getTermFreqVector(documentId, field);
                String[] terms = tv.getTerms();
                double[] tf = getTermFreqs(tv); // helper, not shown here
                for (int i = 0; i < tv.size(); i++) {
                    int docFreq = ir.docFreq(new Term(field, terms[i]));
                    // Divide as doubles; integer division would truncate the ratio.
                    double n = (double) ir.numDocs() / docFreq;
                    double score = tf[i] * (Math.log(n) / Math.log(2));
                    // Note: terms with identical scores overwrite each other in the map.
                    tfIdfs.put(Double.valueOf(score), terms[i]);
                }
            } finally {
                ir.close();
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
        return tfIdfs;
    }

Searching the mailing list might help as well:
http://mail-archives.apache.org/mod_mbox/lucene-java-user/200506.mbox/[EMAIL PROTECTED]

And see also:
http://www.alias-i.com/lingpipe/demos/tutorial/interestingPhrases/read-me.html

Edgar

> -----Original message-----
> From: thanh nguyen [mailto:[EMAIL PROTECTED]
> Sent: Wednesday, March 22, 2006 6:31 PM
> To: java-user@lucene.apache.org
> Subject: Repeat Second time: Extract important terms by programming??
>
> Can anyone help me?
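
P.S. In case the scoring arithmetic is the unclear part, here is a rough sketch of it without any Lucene calls. The class and method names (TfIdfSketch, tfIdf, rank) and the toy numbers are made up for illustration; in practice the term frequencies, document frequencies and document count would come from the index. It keeps the ranking in a list rather than a map keyed by score, so terms with equal scores are not lost.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

final class TfIdfSketch {

    /** Computes tf * log2(numDocs / docFreq), doing the division in floating point. */
    static double tfIdf(double tf, int numDocs, int docFreq) {
        double ratio = (double) numDocs / docFreq;
        return tf * (Math.log(ratio) / Math.log(2));
    }

    /** Scores every term and returns them from highest to lowest score; ties are kept. */
    static List<Map.Entry<String, Double>> rank(String[] terms, double[] tfs,
                                                int[] docFreqs, int numDocs) {
        List<Map.Entry<String, Double>> scored = new ArrayList<>();
        for (int i = 0; i < terms.length; i++) {
            scored.add(Map.entry(terms[i], tfIdf(tfs[i], numDocs, docFreqs[i])));
        }
        // Stable sort, descending by score.
        scored.sort(Map.Entry.<String, Double>comparingByValue().reversed());
        return scored;
    }

    public static void main(String[] args) {
        // Toy numbers, not taken from a real index.
        String[] terms = {"lucene", "the", "index"};
        double[] tfs = {3, 10, 3};
        int[] docFreqs = {2, 8, 2};
        int numDocs = 8;
        for (Map.Entry<String, Double> e : rank(terms, tfs, docFreqs, numDocs)) {
            System.out.println(e.getKey() + "\t" + e.getValue());
        }
    }
}
```

A term that occurs in every document gets a score of zero however often it appears, which is why very common words drop to the bottom of the list.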