Dan Climan wrote:
TermEnum terms = ir.terms();
int numTerms = 0;
while (terms.next())
{
    Term t = terms.term();
    if (t.field().equals("FullText"))
        numTerms++;
}
// lengthNorm is defined as 1/sqrt(numTerms) by default
double lengthNorm = 1.0 / Math.sqrt(numTerms);

Here numTerms is not the number of unique words in the collection; it is the number of tokens in that field of the document in question. So, to re-create the value externally, you could re-tokenize the field's text and count the tokens.
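
A minimal sketch of that re-tokenize-and-count approach, in plain Java with no Lucene dependency. The class LengthNormSketch and its tokenize method are hypothetical stand-ins: the tokenizer here simply lowercases and splits on non-alphanumeric characters, which is only an assumption about how the field was analyzed. For the count to match what was indexed, you would need to run the same analyzer that was used at index time.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Lucene.
final class LengthNormSketch {

    // Stand-in tokenizer (assumption): lowercase, split on anything that is
    // not a letter or digit. Replace with the analyzer used at index time.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String piece : text.toLowerCase().split("[^\\p{L}\\p{N}]+")) {
            if (!piece.isEmpty()) {
                tokens.add(piece);
            }
        }
        return tokens;
    }

    // Counts every token in the field text, repeats included, and applies
    // the default formula quoted above: 1 / sqrt(numTokens).
    static double lengthNorm(String fieldText) {
        int numTokens = tokenize(fieldText).size();
        if (numTokens == 0) {
            return 0.0; // sketch-only choice for empty fields (assumption)
        }
        return 1.0 / Math.sqrt(numTokens);
    }
}
```

For example, LengthNormSketch.lengthNorm("the quick brown fox") counts four tokens, and so does "a a a a", because repeated tokens are counted each time they occur.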


Doug
