RE: fastest way to gather simple terms that match documents?

Uwe Schindler Mon, 05 Apr 2010 11:50:31 -0700

Alternatively, index your documents with term vectors enabled for the field:


http://lucene.apache.org/java/3_0_1/api/all/org/apache/lucene/document/Field.TermVector.html

And then use IndexReader.getTermFreqVector() with the matching doc ID:

http://lucene.apache.org/java/3_0_1/api/all/org/apache/lucene/index/IndexReader.html#getTermFreqVector(int, java.lang.String)
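
As a rough illustration of the per-document lookup: once you have the sorted
terms of a matching document (in Lucene these would come from the
TermFreqVector returned by getTermFreqVector()), finding out which of your
query terms occur in it is one binary search per term. The sketch below is a
plain-Java stand-in, not Lucene code; the class TermVectorSketch, the method
matchedTerms and the map that plays the role of the stored term vectors are
all made up for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

// Plain-Java stand-in, NOT Lucene code: the map plays the role of the stored
// term vectors, keyed by doc ID, each value holding that document's terms in
// ascending order.
final class TermVectorSketch {

    // Returns the query terms that occur in the given document.
    static List<String> matchedTerms(Map<Integer, String[]> termVectors,
                                     int docId, List<String> queryTerms) {
        List<String> matched = new ArrayList<String>();
        String[] docTerms = termVectors.get(docId);
        if (docTerms == null) {
            return matched; // no term vector stored for this document
        }
        for (String term : queryTerms) {
            // The document's terms are sorted, so a binary search is enough.
            if (Arrays.binarySearch(docTerms, term) >= 0) {
                matched.add(term);
            }
        }
        return matched;
    }
}
```

With a real index, the String[] would presumably be taken from
TermFreqVector.getTerms() for each hit instead of from a map, and a null
vector would mean the field was indexed without term vectors.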

Uwe

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de


> -----Original Message-----
> From: Chris Hostetter [mailto:hossman_luc...@fucit.org]
> Sent: Monday, April 05, 2010 8:24 PM
> To: java-user@lucene.apache.org
> Subject: Re: fastest way to gather simple terms that match documents?
> 
> 
> :     After I've run a query I need to know which terms matched each
> : result document (ie doc termfrequency>0).
>       ...
> : I don't care how many were found or what position or anything else.
> : just which ones matched.
> 
> if all you care about is simple "which terms does it have" you can take
> your list of terms, and your list of docids, sort both lists and then use
> termDocs to loop over the terms and over the docs.  (the sorting is key
> for performance, because it allows you to always skip forward, w/o
> needing to restart the termDocs)
> 
> something like...
> 
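> TermDocs iter = indexReader.termDocs();
> try {
>   for (Term t : myTerms) {          // myTerms: sorted
>     iter.seek(t);
>     if (!iter.next()) continue;     // term has no (undeleted) docs
>     for (int docid : myDocs) {      // myDocs: sorted ascending
>       // skipTo() always moves past the current doc, so only call it
>       // when we are still behind docid; stop once the term is exhausted
>       if (iter.doc() < docid && !iter.skipTo(docid)) break;
>       if (iter.doc() == docid) {
>         doSomethingWith(t, docid);
>       }
>     }
>   }
> } finally {
>   iter.close();
> }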
> 
> 
> 
> -Hoss
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> For additional commands, e-mail: java-user-h...@lucene.apache.org


