Thanks, but TermsEnum has two methods that return frequency-related information, and both are corpus-level rather than document-specific:

- docFreq(): returns the number of documents containing the current term.
- totalTermFreq(): returns the total number of occurrences of this term across all documents (the sum of freq() over every document that has this term).

However, I need the document-specific frequency, i.e., the frequency of term A in each of Doc 1, 2, ..., N.

Thanks

On 20/09/2015 15:07, Uwe Schindler wrote:
Hi,

With the TermsEnum you can iterate over all terms. Each one returns its term
frequency. Of course, you need to enable term vectors during indexing. Examples
of how to use TermsEnum can be found in various places in the Lucene source
code. It is a very expert API, but it is the way to go here.
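
One way to see why iterating a term vector can give per-document numbers: Lucene's javadocs describe a term vector as acting like a single-document inverted index, so statistics computed over it only ever cover that one document. The sketch below is a plain-Java toy model of that idea, not Lucene code. ToyIndex and its methods are hypothetical names that merely mirror docFreq(), totalTermFreq() and freq().

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

/** Toy stand-in for an inverted index (hypothetical, not a Lucene class). */
final class ToyIndex {
    // term -> (docId -> number of occurrences of the term in that document)
    private final Map<String, Map<Integer, Integer>> postings = new TreeMap<>();

    ToyIndex(List<String> docs) {
        for (int docId = 0; docId < docs.size(); docId++) {
            for (String term : docs.get(docId).toLowerCase().split("\\s+")) {
                if (term.isEmpty()) {
                    continue; // skip the empty token produced by blank input
                }
                postings.computeIfAbsent(term, t -> new TreeMap<>())
                        .merge(docId, 1, Integer::sum);
            }
        }
    }

    private Map<Integer, Integer> postingsFor(String term) {
        Map<Integer, Integer> p = postings.get(term);
        return p == null ? Collections.<Integer, Integer>emptyMap() : p;
    }

    /** All distinct terms in this index, in sorted order. */
    Set<String> terms() {
        return postings.keySet();
    }

    /** Number of documents in this index that contain the term. */
    int docFreq(String term) {
        return postingsFor(term).size();
    }

    /** Occurrences of the term summed over every document in this index. */
    long totalTermFreq(String term) {
        long sum = 0;
        for (int f : postingsFor(term).values()) {
            sum += f;
        }
        return sum;
    }

    /** Occurrences of the term in one specific document. */
    int freq(String term, int docId) {
        Integer f = postingsFor(term).get(docId);
        return f == null ? 0 : f;
    }

    public static void main(String[] args) {
        List<String> corpus = Arrays.asList("a b a", "a c", "b b b");

        // Index over the whole corpus: the statistics are corpus-level.
        ToyIndex whole = new ToyIndex(corpus);
        System.out.println("corpus: docFreq(a)=" + whole.docFreq("a")
                + ", totalTermFreq(a)=" + whole.totalTermFreq("a"));

        // Index over document 0 only, playing the role of its term vector:
        // the same "total" statistic now covers just that one document.
        ToyIndex vectorOfDoc0 = new ToyIndex(corpus.subList(0, 1));
        for (String term : vectorOfDoc0.terms()) {
            System.out.println("doc 0: " + term + " -> " + vectorOfDoc0.totalTermFreq(term));
        }
    }
}
```

Under this model, totalTermFreq() on the one-document index equals freq() for that document in the full index, which is the document-specific TF asked about in this thread. Whether the real TermsEnum obtained from getTermVector() behaves the same way should be checked against the javadocs for the Lucene version in use.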

Uwe

On 20 September 2015 15:35:40 CEST, Ziqi Zhang
<ziqi.zh...@sheffield.ac.uk> wrote:
Hi

Is it possible to get a list of terms within a document, and also the TF of
each of these terms *in that document only*? (Lucene 5.3)

IndexReader has a method "Terms getTermVector(int docID, String field)",
which gives me a "Terms" object, on which I can get a TermsEnum. But I
do not know where to go from there.

thanks
--
Uwe Schindler
H.-H.-Meier-Allee 63, 28213 Bremen
http://www.thetaphi.de


--
Ziqi Zhang
Research Associate
Department of Computer Science
University of Sheffield


