Ok, makes sense. Thanks for the info!
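
For anyone reading the thread later, here is a rough sketch of the idea we
ended up with: keep the raw term count as the norm (a long) and apply the
lengthnorm function at query time. It is plain Java and does not use Lucene at
all. The class RawLengthNorms and both of its methods are hypothetical names
for illustration only; in a real Similarity subclass the index-time half would
presumably live in computeNorm(FieldInvertState state).

```java
import java.util.function.LongToDoubleFunction;

/** Hypothetical stand-in, not a Lucene class: keeps the raw term count as the norm. */
public final class RawLengthNorms {
    private RawLengthNorms() {}

    /** Index time: return the field's term count unchanged instead of a lossy byte. */
    public static long computeNorm(int fieldLength) {
        return fieldLength;
    }

    /** Query time: turn the stored raw length into a normalization factor. */
    public static double lengthNorm(long storedNorm, LongToDoubleFunction normFunction) {
        return normFunction.applyAsDouble(storedNorm);
    }

    public static void main(String[] args) {
        long stored = computeNorm(16);
        // Two candidate lengthnorm functions; switching between them needs no reindexing.
        LongToDoubleFunction inverseSqrt = len -> 1.0 / Math.sqrt(len);
        LongToDoubleFunction inverseLog = len -> 1.0 / (1.0 + Math.log(len));
        System.out.println(lengthNorm(stored, inverseSqrt));
        System.out.println(lengthNorm(stored, inverseLog));
    }
}
```

Because the stored value is the untouched length, different lengthnorm
functions can be tried against the same index.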

On Thu, Jun 19, 2014 at 3:05 PM, Robert Muir <rcm...@gmail.com> wrote:

> Don't extend that: extend Similarity.
>
> Some of those implementations actually rely on, and optimize for, the fact
> that it's a byte, building lookup tables and so on.
>
> On Thu, Jun 19, 2014 at 6:03 PM, Nalini Kartha <nalinikar...@gmail.com> wrote:
> > Sorry, I meant the encodeNormValue and decodeNormValue methods on the
> > TFIDFSimilarity class -
> >
> > public byte encodeNormValue(float f)
> > public float decodeNormValue(byte b)
> >
> >
> > On Thu, Jun 19, 2014 at 12:08 PM, Robert Muir <rcm...@gmail.com> wrote:
> >
> >> No, they do not. The method is:
> >>
> >>   public abstract long computeNorm(FieldInvertState state);
> >>
> >>
> >>
> >> On Thu, Jun 19, 2014 at 1:54 PM, Nalini Kartha <nalinikar...@gmail.com> wrote:
> >> > Thanks for the info!
> >> >
> >> > We're more interested in changing the lengthnorm function vs using
> >> > additional stats for scoring so option 2 seems like the right way.
> >> >
> >> > It looks like the encode and decode methods deal with bytes right now -
> >> > would changing those APIs to deal with longs instead be a good idea? It
> >> > looks like the byte returned from encode is always being cast to long and
> >> > the byte passed into decode is always a long to begin with. If we make
> >> > this change, would it be useful to submit a patch for it?
> >> >
> >> > Thanks,
> >> > Nalini
> >> >
> >> >
> >> > On Thu, Jun 19, 2014 at 10:28 AM, Uwe Schindler <u...@thetaphi.de> wrote:
> >> >
> >> >> Hi,
> >> >>
> >> >> You may not need to change the length-norm at all: If you want to support
> >> >> *additional* statistics, add a docvalues field to your index where you can
> >> >> store that information in addition to the Lucene-Default statistics. Based
> >> >> on a function query you can then use it for scoring. In fact, you can then
> >> >> also use a different data type for the statistics value. The norms in
> >> >> Lucene are already internally handled as docvalues fields, too.
> >> >>
> >> >> On the other hand, if you want to modify the lengthNorm and you use a
> >> >> non-float value, you have to also modify the encodeNorm/decodeNorm methods
> >> >> of the similarity. The default uses a very lossy float->1byte
> >> >> transformation.
> >> >>
> >> >> Uwe
> >> >>
> >> >> -----
> >> >> Uwe Schindler
> >> >> H.-H.-Meier-Allee 63, D-28213 Bremen
> >> >> http://www.thetaphi.de
> >> >> eMail: u...@thetaphi.de
> >> >>
> >> >>
> >> >> > -----Original Message-----
> >> >> > From: Nalini Kartha [mailto:nalinikar...@gmail.com]
> >> >> > Sent: Thursday, June 19, 2014 7:14 PM
> >> >> > To: java-user@lucene.apache.org
> >> >> > Subject: Changing field lengthnorm to store length
> >> >> >
> >> >> > Hi,
> >> >> >
> >> >> > We're interested in having access to the number of terms in the fields for a
> >> >> > document vs the pre-calculated lengthnorm at scoring time - we want to
> >> >> > experiment with different lengthnorm functions so it seems like storing the
> >> >> > raw length and then doing the norm calculation at query time would work.
> >> >> >
> >> >> > Is changing the lengthnorm method on the Similarity class to return the raw
> >> >> > number of terms the right way to go for this? We realize this will result in
> >> >> > taking up more than a byte to store the value but we're OK with this. Will this
> >> >> > break anything else under the hood?
> >> >> >
> >> >> > Thanks,
> >> >> > Nalini
> >> >>
> >> >>
> >> >> ---------------------------------------------------------------------
> >> >> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> >> >> For additional commands, e-mail: java-user-h...@lucene.apache.org
> >> >>
> >> >>
>
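
To make the point about byte-sized norms and lookup tables concrete, here is a
small sketch. It is NOT Lucene's actual encoding: it uses a made-up linear
quantization of values in [0, 1] to 256 levels, purely to show why a one-byte
norm is lossy and why a 256-entry decode table is a natural optimization. The
class name ByteNormSketch is hypothetical; the two method names only mirror the
signatures quoted above.

```java
/** Hypothetical illustration, not Lucene code: a lossy float-to-byte norm with a decode table. */
public final class ByteNormSketch {
    // A byte has only 256 possible values, so every decoded float can be precomputed.
    private static final float[] DECODE_TABLE = new float[256];

    static {
        for (int i = 0; i < 256; i++) {
            DECODE_TABLE[i] = i / 255f;
        }
    }

    private ByteNormSketch() {}

    /** Assumed scheme: clamp to [0, 1], then round to the nearest of 256 levels. */
    public static byte encodeNormValue(float f) {
        float clamped = Math.max(0f, Math.min(1f, f));
        return (byte) Math.round(clamped * 255f);
    }

    /** Decoding is a single array lookup; the mask maps the signed byte to 0..255. */
    public static float decodeNormValue(byte b) {
        return DECODE_TABLE[b & 0xFF];
    }

    public static void main(String[] args) {
        // Fields of 1000 and 1100 terms, using 1/sqrt(length) as the norm.
        byte a = encodeNormValue((float) (1.0 / Math.sqrt(1000)));
        byte b = encodeNormValue((float) (1.0 / Math.sqrt(1100)));
        System.out.println(a == b); // the two lengths can no longer be told apart
    }
}
```

Code written against a table like this assumes the norm fits in a byte, which
is why a wider norm belongs in a class that extends Similarity directly.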
