Re: Question about lengthNorm(numTerms)

2013-06-18 Thread jiangwen jiang
I got it, thanks, Jack


Question about lengthNorm(numTerms)

2013-06-17 Thread jiangwen jiang
Hi, guys:

Is it appropriate to send questions to this mailing list? I have a question
about numTerms.

This website describes Lucene scoring:
http://www.lucenetutorial.com/advanced-topics/scoring.html

*4. lengthNorm*
Implementation: 1/sqrt(numTerms)
Implication: a term matched in fields with less terms have a higher score
Rationale: a term in a field with less terms is more important than one with more
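
To make the quoted formula concrete, here is a minimal sketch of 1/sqrt(numTerms) for a few field lengths. The class and method names (LengthNormSketch, lengthNorm) are made up for illustration and are not Lucene API; the sketch also ignores any boost that Lucene may fold into the same value.

```java
public class LengthNormSketch {
    // The formula quoted from the tutorial: 1/sqrt(numTerms).
    // Hypothetical helper, not a Lucene method.
    public static float lengthNorm(int numTerms) {
        return (float) (1.0 / Math.sqrt(numTerms));
    }

    public static void main(String[] args) {
        // Shorter fields get a larger factor, so a match in them scores higher.
        for (int numTerms : new int[] {1, 4, 25, 100}) {
            System.out.println("numTerms=" + numTerms + " -> lengthNorm=" + lengthNorm(numTerms));
        }
    }
}
```

A 4-term field gets a factor of 0.5, while a 100-term field gets about 0.1, which is the "fewer terms, higher score" effect described above.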


I think the numTerms mentioned here means the number of terms in a field, per
document, but the Lucene file format page doesn't mention it:

http://lucene.apache.org/core/3_6_2/fileformats.html

Does numTerms really exist in the Lucene index? If so, how can I get it?


Regards


Re: Question about lengthNorm(numTerms)

2013-06-17 Thread Jack Krupansky
The length normalization gets compressed down to a single byte “norm”, stored 
in the “.nrm” files.

See:
norm(t,d)
http://lucene.apache.org/core/4_3_0/core/org/apache/lucene/search/similarities/TFIDFSimilarity.html
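
To illustrate what "compressed down to a single byte" implies, here is a rough sketch of a lossy float-to-byte encoding. It is modeled from memory on the scheme in Lucene's SmallFloat helper, so treat the exact bit layout as an assumption and check it against the Lucene source; the class and method names (NormByteSketch, lengthNorm, encode, decode) are hypothetical and not Lucene API.

```java
public class NormByteSketch {
    // Offset that maps the retained exponent range onto byte values 0..255 (assumed layout).
    private static final int ZERO_POINT = (63 - 15) << 3;

    // 1/sqrt(numTerms), as in the tutorial quoted in the question.
    public static float lengthNorm(int numTerms) {
        return (float) (1.0 / Math.sqrt(numTerms));
    }

    // Lossy encode: keep only the top bits of the float (exponent plus two mantissa bits).
    public static byte encode(float f) {
        int bits = Float.floatToRawIntBits(f);
        int small = bits >> 21;
        if (small <= ZERO_POINT) {
            // Zero or negative input maps to 0; tiny positive input maps to the smallest code.
            return (bits <= 0) ? (byte) 0 : (byte) 1;
        }
        if (small >= ZERO_POINT + 0x100) {
            return (byte) -1; // too large: clamp to the largest code
        }
        return (byte) (small - ZERO_POINT);
    }

    // Decode the byte back to an approximate float.
    public static float decode(byte b) {
        if (b == 0) {
            return 0.0f;
        }
        int bits = (b & 0xff) << 21;
        bits += (63 - 15) << 24;
        return Float.intBitsToFloat(bits);
    }

    public static void main(String[] args) {
        for (int numTerms = 7; numTerms <= 10; numTerms++) {
            float norm = lengthNorm(numTerms);
            byte stored = encode(norm);
            System.out.println(numTerms + " terms: norm=" + norm
                    + " byte=" + (stored & 0xff) + " decoded=" + decode(stored));
        }
    }
}
```

Under this sketch, fields of 8, 9, and 10 terms all end up with the same stored byte, so the exact numTerms cannot be read back from the norm alone; only an approximate length factor survives in the index.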

-- Jack Krupansky

From: jiangwen jiang 
Sent: Tuesday, June 18, 2013 12:35 AM
To: dev@lucene.apache.org 
Subject: Question about lengthNorm(numTerms)

Hi, guys: 

Is it appropriate to send questions to this mailing list? I have a question
about numTerms.

This website describes Lucene scoring:
http://www.lucenetutorial.com/advanced-topics/scoring.html

4. lengthNorm
Implementation: 1/sqrt(numTerms)
Implication: a term matched in fields with less terms have a higher score
Rationale: a term in a field with less terms is more important than one with more

I think the numTerms mentioned here means the number of terms in a field, per
document, but the Lucene file format page doesn't mention it:

http://lucene.apache.org/core/3_6_2/fileformats.html

Does numTerms really exist in the Lucene index? If so, how can I get it?

Regards