Hi,
Jackrabbit 1.5 will allow you to configure a custom Similarity implementation.
See the similarityClass parameter described at http://wiki.apache.org/jackrabbit/Search
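As a rough sketch, the parameter goes into the SearchIndex element of the workspace configuration. The class name com.example.TermCountSimilarity is a made-up placeholder for your own implementation, and the path parameter is just the usual default:

```xml
<SearchIndex class="org.apache.jackrabbit.core.query.lucene.SearchIndex">
  <param name="path" value="${wsp.home}/index"/>
  <!-- hypothetical class name; replace with your own Similarity subclass -->
  <param name="similarityClass" value="com.example.TermCountSimilarity"/>
</SearchIndex>
```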
For details on how to implement a Similarity class, see the Lucene documentation.
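To pin down the target behaviour before touching Lucene, here is a small plain-Java sketch of the score from your example. It does not use the Lucene or Jackrabbit APIs at all: the class TermCountScore and its tokenize/score methods are made up for illustration, and the tokenizer is a naive stand-in for a real analyzer. It counts only occurrences of the query terms, which is what I assume you want a custom similarity to reproduce.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

final class TermCountScore {

    // Naive tokenizer: lower-cases the text and splits on anything that is
    // not a letter or digit. A stand-in for a real analyzer (assumption).
    static String[] tokenize(String text) {
        return Arrays.stream(text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+"))
                .filter(t -> !t.isEmpty())
                .toArray(String[]::new);
    }

    // Score = total number of occurrences of the query terms in the document.
    static int score(String document, String query) {
        Set<String> queryTerms = new HashSet<>(Arrays.asList(tokenize(query)));
        int sum = 0;
        for (String token : tokenize(document)) {
            if (queryTerms.contains(token)) {
                sum++;
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        String doc = "This is a test document of jackrabbit scoring mechanism, "
                + "just a test document";
        System.out.println(score(doc, "test scoring")); // prints 3
    }
}
```

For your document and the query 'test scoring' this gives 2 (test) + 1 (scoring) = 3. Note that summing every entry of a term frequency vector would count all terms in the field, not only the ones searched for.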
regards
marcel
flopsi73 wrote:
> Hi everybody,
>
> I have a question regarding custom scoring:
> I want to implement a scoring scheme in which the score of a document simply
> equals the number of occurrences of the search terms in the document, with no
> special rules about term length, occurrences in other documents, etc.
>
> Assuming that only jcr:content/@jcr:data is indexed, a document with the
> content
> 'This is a test document of jackrabbit scoring mechanism, just a test
> document'
> should always get a score of 3
> for the search
> 'test scoring'
>
> Does anyone have an idea how to achieve this most easily? Is there already
> something that does this? If not, which classes would I need to subclass?
> Just Scorer and Weight? I think Similarity is not necessary (see
> MatchAllScorer)? Or maybe even Query?
>
> I thought about something like this (in a new 'HitScorer' class):
>
> public float score() throws IOException {
>     TermFreqVector tfv = reader.getTermFreqVector(nextDoc, "jcr:content");
>     int[] freqs = tfv.getTermFrequencies();
>     int sum = 0;
>     for (int i = 0; i < freqs.length; i++) {
>         sum += freqs[i];
>     }
>     return sum;
> }
>
> But what should Weight.sumOfSquaredWeights and Weight.normalize do? Just
> use 1.0f? And is the property name correct? I admit I am a bit confused about
> the DefaultSimilarity formula(s)...
>
> Thanks a lot, best regards
> Flo
>