Hi,

If I understand correctly what you are trying to do to get corpusTF, you might want to look at the implementation of the "-t" flag in org.apache.lucene.misc/HighFreqTerms.java in contrib.
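One thing that stands out in the commented-out loop in your message: it calls freq() before the first nextDoc(), so the enum is not yet positioned on a document at that point. I can't say for certain that this is the error you are seeing, but the usual pattern is to advance first and read freq() afterwards. Below is a minimal sketch of that loop order in plain Java. The Postings interface and the inMemory helper are hypothetical stand-ins so the example runs without an index; they are not Lucene classes. Only the names nextDoc(), freq() and NO_MORE_DOCS mirror DocsEnum.

```java
class CorpusTfSketch {

    /** Hypothetical stand-in for the parts of DocsEnum that the loop needs. */
    interface Postings {
        int NO_MORE_DOCS = Integer.MAX_VALUE; // same sentinel value Lucene uses

        /** Advances to the next document; returns its id or NO_MORE_DOCS. */
        int nextDoc();

        /** Term frequency in the current document; only valid after nextDoc(). */
        int freq();
    }

    /** Sums the per-document frequencies; a null enum counts as "term not found". */
    static long totalTermFreq(Postings postings) {
        if (postings == null) {
            return 0L;
        }
        long total = 0L;
        // Advance first, then read freq() for the document we are positioned on.
        while (postings.nextDoc() != Postings.NO_MORE_DOCS) {
            total += postings.freq();
        }
        return total;
    }

    /** Hypothetical in-memory postings list, used only to exercise the loop. */
    static Postings inMemory(final int[] docIds, final int[] freqs) {
        return new Postings() {
            private int pos = -1; // unpositioned until the first nextDoc()

            public int nextDoc() {
                pos++;
                return pos < docIds.length ? docIds[pos] : NO_MORE_DOCS;
            }

            public int freq() {
                if (pos < 0 || pos >= docIds.length) {
                    throw new IllegalStateException("not positioned on a document");
                }
                return freqs[pos];
            }
        };
    }

    public static void main(String[] args) {
        Postings postings = inMemory(new int[] {0, 3, 7}, new int[] {2, 1, 4});
        System.out.println(totalTermFreq(postings)); // prints 7
    }
}
```

With a real reader you would obtain the enum for the term and run the same while loop over it. The null check is kept from your code, on the assumption that a null enum means the field or term is not in the index.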
Take a look at the getTotalTermFreq method in trunk:
http://svn.apache.org/viewvc/lucene/dev/trunk/lucene/contrib/misc/src/java/org/apache/lucene/misc/HighFreqTerms.java?view=markup

3.x version here:
http://svn.apache.org/viewvc/lucene/dev/branches/branch_3x/lucene/contrib/misc/src/java/org/apache/lucene/misc/HighFreqTerms.java?view=markup

Tom
http://www.hathitrust.org/blogs/large-scale-search

-----Original Message-----
From: nitinhardeniya [mailto:nitinharden...@gmail.com]
Sent: Tuesday, March 22, 2011 1:57 PM
To: java-user@lucene.apache.org
Subject: TermDoc to TermDocsEnum

Hi,

I have code that works fine with Lucene 3.2, where I used TermDocs to find the corpusTF. Here is the code:

public void calculateCorpusTF(IndexReader reader) throws IOException {
    // TODO Auto-generated method stub
    Iterator<String> it = word.iterator();
    Iterator<wordProp> iwp = word_prop.iterator();
    wordProp wp;
    Term ta = null;
    TermDocs tds;
    // DocsEnum tds;
    String text;
    tfDoc tfcoll;
    long freq=0;
    OpenBitSet skipDocs = null;
    skipDocs = new OpenBitSet(0);
    //System.out.println("Length: "+skipDocs.length());
    try {
        while(it.hasNext())
        {
            text=it.next();
            wp=iwp.next();
            System.out.println("Word is "+text);
            ta= new Term("content",text);
            //BytesRef term = new BytesRef(text.toCharArray(),0,text.length());
            tfcoll = new tfDoc();
            freq=0;
            tds=reader.termDocs(ta);
            // tds=reader.terms("content");
            if(tds!=null)
            {
                while(tds.next())
                {
                    freq+=tds.freq();
                    // System.out.print( text +" "+ freq);
                }
            }
            // New Code -->
            // tds = reader.termDocsEnum(skipDocs, "content", term);
            // if(tds!=null)
            // {
            //     while(true) {
            //         freq += tds.freq();
            //         final int docID = tds.nextDoc();
            //         if (docID == DocsEnum.NO_MORE_DOCS) {
            //             break;
            //         }
            //     }
            // }
            //
            // New code Ends <--
            tfcoll.tfA=freq;
            System.out.print( text +" "+ freq);
            if(tfcoll.totalTF()==0)
            {
                //System.out.println(" "+tfcoll.tfA+" "+tfcoll.tfD+" "+tfcoll.tfC);
                System.out.println("Text "+text+ " Freq "+freq);
            }
            wp.tfColl=tfcoll;
        }
    } catch (Exception e) {
        // TODO: handle exception
        e.printStackTrace();
    }
}

But now I have to use TermDocsEnum, because I am using the Lucene 4.0 dev version, which does not have the termDocs method. I was trying to change my code. Please refer to the new code (commented out above) and tell me how to use this method properly. If you can provide an example, that would be great.

tds = reader.termDocsEnum(skipDocs, "content", term);

I have tried passing null as skipDocs because I don't want to skip anything, but it throws an error.

Please help.

--
View this message in context: http://lucene.472066.n3.nabble.com/TermDoc-to-TermDocsEnum-tp2716046p2716046.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org