Hi,

You may not need to change the length norm at all. If you want to support 
*additional* statistics, add a docvalues field to your index and store that 
information there, alongside Lucene's default statistics. You can then use it 
for scoring through a function query. This also lets you choose a different 
data type for the statistics value. Norms in Lucene are themselves already 
handled internally as docvalues fields.
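
To illustrate the idea only, here is a minimal plain-Java sketch. It does not 
use any Lucene API: the RAW_LENGTH array is a hypothetical stand-in for a 
per-document numeric docvalues column holding the raw term count, and the 
class, field, and method names are made up for this example. It shows that 
once the raw length is available at scoring time, the norm function becomes 
something you can swap freely.

```java
import java.util.function.LongToDoubleFunction;

final class QueryTimeLengthNorm {
    // Hypothetical stand-in for a numeric docvalues column:
    // RAW_LENGTH[docId] is the number of terms in the field of that document.
    static final long[] RAW_LENGTH = {4, 9, 100};

    // Two candidate norm functions to experiment with (assumptions, not Lucene defaults).
    static final LongToDoubleFunction INV_SQRT = n -> 1.0 / Math.sqrt(n);
    static final LongToDoubleFunction INV_LOG = n -> 1.0 / (1.0 + Math.log(n));

    // Applies the chosen norm to a base score, reading the raw length at "query time".
    static double score(double baseScore, int docId, LongToDoubleFunction norm) {
        return baseScore * norm.applyAsDouble(RAW_LENGTH[docId]);
    }

    public static void main(String[] args) {
        for (int doc = 0; doc < RAW_LENGTH.length; doc++) {
            System.out.printf("doc=%d len=%d invSqrt=%.4f invLog=%.4f%n",
                    doc, RAW_LENGTH[doc],
                    score(1.0, doc, INV_SQRT),
                    score(1.0, doc, INV_LOG));
        }
    }
}
```

In a real index the array lookup would be replaced by reading the docvalues 
field for the current document, and the multiplication would live in the 
function query or a custom scorer.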

On the other hand, if you want to modify the lengthNorm and you use a non-float 
value, you also have to modify the encodeNorm/decodeNorm methods of the 
similarity. The default uses a very lossy float-to-one-byte transformation.
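
To show how lossy a one-byte float encoding is, here is a self-contained 
sketch modeled on the scheme I understand Lucene's SmallFloat "315" encoding 
to use. Treat it as an illustration under that assumption, not as the 
library's code; the class and method names (NormByteSketch, encode, decode, 
lengthNorm) are made up for this example, and lengthNorm assumes the classic 
1/sqrt(numTerms) formula.

```java
final class NormByteSketch {
    // Float bits are shifted right by 21, so only the exponent and the top
    // two mantissa bits survive in the byte.
    private static final int SHIFT = 21;
    // Offset that maps the representable exponent range onto 0..255.
    private static final int FZERO = (63 - 15) << 3;

    // Assumed classic length norm: 1 / sqrt(number of terms).
    static float lengthNorm(int numTerms) {
        return (float) (1.0 / Math.sqrt(numTerms));
    }

    // Encodes a non-negative float into one byte by truncation.
    static byte encode(float f) {
        int bits = Float.floatToRawIntBits(f);
        int small = bits >> SHIFT;
        if (small <= FZERO) {
            // Zero or negative maps to 0; tiny positive values map to 1.
            return (bits <= 0) ? (byte) 0 : (byte) 1;
        }
        if (small >= FZERO + 0x100) {
            return (byte) -1; // overflow: largest representable value
        }
        return (byte) (small - FZERO);
    }

    // Decodes the byte back into a float; the dropped mantissa bits are gone.
    static float decode(byte b) {
        if (b == 0) {
            return 0.0f;
        }
        int bits = (b & 0xff) << SHIFT;
        bits += (63 - 15) << 24;
        return Float.intBitsToFloat(bits);
    }

    public static void main(String[] args) {
        for (int len = 7; len <= 16; len++) {
            float norm = lengthNorm(len);
            System.out.printf("len=%d norm=%.5f decoded=%.5f%n",
                    len, norm, decode(encode(norm)));
        }
    }
}
```

With this encoding, field lengths 8 through 10 all decode to 0.3125 and 
lengths 11 through 16 all decode to 0.25, so the raw term count cannot be 
recovered from the stored norm.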

Uwe

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de


> -----Original Message-----
> From: Nalini Kartha [mailto:nalinikar...@gmail.com]
> Sent: Thursday, June 19, 2014 7:14 PM
> To: java-user@lucene.apache.org
> Subject: Changing field lengthnorm to store length
> 
> Hi,
> 
> We're interested in having access to the number of terms in the fields for a
> document vs the pre-calculated lengthnorm at scoring time - we want to
> experiment with different lengthnorm functions so it seems like storing the
> raw length and then doing the norm calculation at query time would work.
> 
> Is changing the lengthnorm method on Similarity class to return the raw
> number of terms the right way to go for this? We realize this will result
> in taking up more than a byte to store the value but we're OK with this.
> Will this break anything else under the hood?
> 
> Thanks,
> Nalini

