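Hi there,

I'm not sure whether the performance will be acceptable for you, but you could try summing the per-document frequencies yourself:

    TermDocs termDocs = this.reader.termDocs(term);
    int count = 0;
    try {
        // Each call to next() advances to the next document containing the term.
        while (termDocs.next()) {
            // freq() is the number of occurrences of the term in the current document.
            count += termDocs.freq();
        }
    } finally {
        termDocs.close();
    }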
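To show what the loop above computes, here is a minimal sketch using a hypothetical in-memory postings list. It is not Lucene's API: the class `TermCountSketch` and its methods are names I made up for illustration. The point is that the document count is just the number of postings, while the total occurrence count needs a pass over every posting to add up the per-document frequencies.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory model of a postings list; this is NOT Lucene's API.
public class TermCountSketch {

    // One posting: a document id and how often the term occurs in that document.
    static final class Posting {
        final int doc;
        final int freq;

        Posting(int doc, int freq) {
            this.doc = doc;
            this.freq = freq;
        }
    }

    private final Map<String, List<Posting>> postings = new HashMap<>();

    // Record that `term` occurs `freq` times in document `doc`.
    void add(String term, int doc, int freq) {
        postings.computeIfAbsent(term, k -> new ArrayList<>()).add(new Posting(doc, freq));
    }

    // Number of documents containing the term: one posting per document.
    int docFreq(String term) {
        return postings.getOrDefault(term, new ArrayList<>()).size();
    }

    // Total occurrences: sum the per-document frequencies, as the loop above does.
    int totalTermFreq(String term) {
        int count = 0;
        for (Posting p : postings.getOrDefault(term, new ArrayList<>())) {
            count += p.freq;
        }
        return count;
    }

    public static void main(String[] args) {
        TermCountSketch index = new TermCountSketch();
        index.add("lucene", 0, 3);
        index.add("lucene", 4, 1);
        index.add("lucene", 7, 2);
        System.out.println(index.docFreq("lucene"));       // prints 3
        System.out.println(index.totalTermFreq("lucene")); // prints 6
    }
}
```

In this toy model both numbers could be cached once they are computed; whether that is worthwhile for a real index depends on how often the index changes.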
simon

On Mon, Aug 24, 2009 at 6:14 PM, Ivan Vasilev<ivasi...@sirma.bg> wrote:
> Hi All,
>
> We use faceting in our app, but it is very slow for the indexes that our
> clients use.
> First, let me say what I mean by faceting: for each term of a certain
> field, obtaining 1. the number of docs that contain it, and 2. the total
> number of occurrences of the term in the index.
> Here is what we currently use to obtain this information:
>
> ...
> some code for obtaining the terms on which we will do faceting
> ...
>
> Term[] retTerms = new Term[terms.size()];
> int[] retFreqs = new int[retTerms.length];
> int[] retDocs = new int[retTerms.length];
> TermPositions tp = mSearcher.getIndexReader().termPositions();
> int i = 0;
> try {
>     for (Iterator<Term> iter = terms.iterator(); iter.hasNext(); i++) {
>         retTerms[i] = iter.next();
>         tp.seek(retTerms[i]);
>         while (tp.next()) {
>             // tp.read(new int[]{}, new int[]{});
>             // tp.doc();
>             retFreqs[i] += tp.freq();
>             retDocs[i]++;
>         }
>     }
> } finally {
>     // Close once, after all terms have been processed.
>     if (tp != null) {
>         tp.close();
>     }
> }
>
> I have now discovered something that is much faster for obtaining the
> number of docs that contain each term.
>
> ...
> the same code for obtaining the terms on which we will do faceting
> ...
>
> Term[] retTerms = new Term[terms.size()];
> int[] retFreqs = new int[retTerms.length];
> int i = 0;
> long t1 = System.currentTimeMillis();
> for (Term currTerm : terms) {
>     retTerms[i] = currTerm;
>     retFreqs[i] = mSearcher.docFreq(currTerm);
>     i++;
> }
>
> I tested the two code versions on 1 237 390 term facets. The second
> version was 10 times faster. I know that this is because the Lucene index
> stores, for each term, the number of docs that contain it.
>
> My question: is there some way to obtain the total number of occurrences of
> the term in the index in a similarly fast way?
>
> Best Regards,
> Ivan
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> For additional commands, e-mail: java-user-h...@lucene.apache.org