Re: custom similarity based on tf but greater than 1.0

Otis Gospodnetic Fri, 19 Jan 2007 02:55:36 -0800

Jumping in at this point and not having read other responses, I think the 
function that Vangelis is looking for is coord method in Similarity - that's 
for document terms/query terms overlap, I believe.


Otis

----- Original Message ----
From: Mark Miller <[EMAIL PROTECTED]>
To: [email protected]
Sent: Thursday, January 18, 2007 5:36:21 PM
Subject: Re: custom similarity based on tf but greater than 1.0

I just did the same thing. If you search the list you'll find the thread 
where Hoss gave me the info you need. It really comes down to makeing a 
FakeNormsIndexReader. The problem you are having is a result of the 
field size normalization.

- mark

Vagelis Kotsonis wrote:
> Hi all.
> I am trying to make some experiments in an algorithm that scores results by
> counting how many words of the query submited are in a document.
>
> For example if i enter the query 
>
> A B D A
>
> The similarities I want to get for the documents follows:
>
> A A C F D (2-found A and D)
> A B D S S A (3 - found A, B and D)
> D D D (1 - only found D)
>
> I built a Similarity that actually sets everything's price as 1.0f except tf
>
> The tf functions returns 1.0f if freq>0 and 0.0f else.
>
> I think that this change does count what I want, but when it comes to show
> the score, all are normalized. So, the greater similarity is equal to 1.0f
> and the others are lower than 1.0f
>
> How can I "deactivate" the score normalization?
>
> Thank you!
>
> I want to 
>   

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]





---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

Re: custom similarity based on tf but greater than 1.0

Reply via email to