Re: Scoring question

Marcel Reutegger Mon, 08 Sep 2008 08:24:41 -0700

Hi,

Jackrabbit 1.5 will allow you to configure a custom Similarity implementation
through the similarityClass parameter of the search index.
See: http://wiki.apache.org/jackrabbit/Search
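
A minimal sketch of what that configuration might look like, assuming the
parameter is set on the SearchIndex element of the workspace configuration.
The class name com.example.TermCountSimilarity is hypothetical, and the path
value is only a commonly used default:

```xml
<SearchIndex class="org.apache.jackrabbit.core.query.lucene.SearchIndex">
  <param name="path" value="${wsp.home}/index"/>
  <!-- hypothetical class name; it has to be on Jackrabbit's classpath -->
  <param name="similarityClass" value="com.example.TermCountSimilarity"/>
</SearchIndex>
```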


For details on how to implement a Similarity class, see the Lucene documentation.
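
To pin down the target behaviour before touching Similarity, here is a small
plain-Java sketch of the scoring asked for in the quoted message: the score is
simply the number of times any query term occurs in the content. It does not
use the Lucene or Jackrabbit API at all. The class TermCountScore and its
simple tokenizer are hypothetical stand-ins for whatever analyzer the index
really uses.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

public class TermCountScore {

    // Lower-case the text and split on anything that is not a letter or digit.
    // This is only a stand-in for a real analyzer.
    public static String[] tokenize(String text) {
        return Arrays.stream(text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+"))
                .filter(t -> !t.isEmpty())
                .toArray(String[]::new);
    }

    // Score = total number of content tokens that match any query term.
    // No length normalization, no inverse document frequency.
    public static int score(String content, String query) {
        Set<String> queryTerms = new HashSet<>(Arrays.asList(tokenize(query)));
        int hits = 0;
        for (String token : tokenize(content)) {
            if (queryTerms.contains(token)) {
                hits++;
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        String content = "This is a test document of jackrabbit scoring mechanism, "
                + "just a test document";
        System.out.println(score(content, "test scoring")); // prints 3
    }
}
```

In a custom Similarity the same effect would have to come from neutralizing
the factors that go beyond the raw term frequency. Which methods that involves
depends on the Lucene version in use, so check it against the Lucene
documentation.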

regards
 marcel

flopsi73 wrote:
> Hi everybody,
> 
> I have a question regarding custom scoring:
> I want to implement a scoring so that the score of a document is just equal
> to the number of occurrences of the search terms in the document. No special
> rules about term length, occurrences in other documents etc.
> 
> defining that only jcr:content/@jcr:data is indexed, e.g. a document with
> content
> 'This is a test document of jackrabbit scoring mechanism, just a test
> document'
> should always get a score of 3
> with a search
> 'test scoring'
> 
> Does anyone have an idea on how to achieve this most easily? Is there
> already anything? Or if not, which classes need to be subclassed? Just Scorer
> and Weight? I think Similarity is not necessary (see MatchAllScorer)?!? Or
> maybe even Query?
> 
> I thought about something like this (in a new 'HitScorer' class):
> 
>       public float score() throws IOException {
>               TermFreqVector tfv = reader.getTermFreqVector(nextDoc, "jcr:content");
>               int[] freqs = tfv.getTermFrequencies();
>               int sum = 0;
>               for (int i = 0; i < freqs.length; i++)
>                       sum += freqs[i];
>               return sum;
>       }
> 
> But what should I do in Weight.getSumOfSquaredWeights and Weight.normalize?
> Just return 1.0f? And is the property name correct? I admit I am a bit
> confused about the DefaultSimilarity formula(s)...
> 
> Thanks a lot, best regards
> Flo
>
