Thanks, I got it by doing something like this:
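using Lucene.Net.Index;
using Lucene.Net.Search.Similarities;

// Each matching query term contributes 1 / (number of terms in the field).
public class PartialSimilarity : DefaultSimilarity
{
    // Ignore how rare the term is across the index.
    public override float Idf(long docFreq, long docCount)
    {
        return 1.0f;
    }

    // Ignore how often the term occurs in the field.
    public override float Tf(float freq)
    {
        return 1.0f;
    }

    // Use the raw number of terms in the field as the length norm.
    public override float LengthNorm(FieldInvertState state)
    {
        int numTerms;
        if (m_discountOverlaps)
        {
            numTerms = state.Length - state.NumOverlap;
        }
        else
        {
            numTerms = state.Length;
        }
        return (float)numTerms;
    }

    // Store the term count as-is instead of a lossy encoded value.
    public override long ComputeNorm(FieldInvertState state)
    {
        float normValue = LengthNorm(state);
        return (long)normValue;
    }

    // No query-level normalization.
    public override float QueryNorm(float sumOfSquaredWeights)
    {
        return 1.0f;
    }

    // At search time the stored term count becomes the factor 1 / numTerms.
    public override float DecodeNormValue(long norm)
    {
        return 1.0f / (float)norm;
    }

    // No bonus for matching more of the query terms.
    public override float Coord(int overlap, int maxOverlap)
    {
        return 1.0f;
    }
}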
A slight variation on this:
If it’s only a partial match, how can I return a score of 0? For example, if the
query is “A B C” and the field contains “B D”, I want the score to be 0. This
requires knowing the sum of the scores of all terms, which I am not sure how to
access.
My hunch is that I would need to create a specialized type of query, but it’s
not clear to me what it should be. Any suggestions?
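To make the intended rule concrete, here is a minimal sketch in plain C#. It uses no Lucene.NET types, and the names AllOrNothingScoring and Score are hypothetical. It assumes one reading of the rule: the score is 1 only when every term in the field also appears in the query, and 0 otherwise. It says nothing about how to wire this into a query or similarity.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Illustration only: hypothetical names, whitespace tokenization,
// no Lucene.NET types involved.
public static class AllOrNothingScoring
{
    // Splits on whitespace; a real analyzer may tokenize differently.
    private static string[] Tokenize(string text)
    {
        return text.Split((char[])null, StringSplitOptions.RemoveEmptyEntries);
    }

    public static float Score(string query, string field)
    {
        var queryTerms = new HashSet<string>(Tokenize(query));
        string[] fieldTerms = Tokenize(field);

        // An empty field cannot match anything.
        if (fieldTerms.Length == 0)
        {
            return 0f;
        }

        // Count the field terms that are covered by the query.
        int covered = fieldTerms.Count(t => queryTerms.Contains(t));

        // Full coverage scores 1; any uncovered field term drops the score to 0.
        return covered == fieldTerms.Length ? 1f : 0f;
    }
}
```

With the examples above, Score("A B C", "B C") gives 1 and Score("A B C", "B D") gives 0, since “D” is not in the query.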
Best,
Georgios
From: Adrien Grand <[email protected]>
Sent: Wednesday, May 22, 2024 12:20 AM
To: [email protected]
Subject: [EXTERNAL] Re: Question about extending Similarity
Hi Georgios,
This is possible. You need to create a similarity that stores the number of
terms as a norm, and then produce scores that are equal to freq/norm at search
time.
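The arithmetic behind this can be sketched in plain C#, without any Lucene.NET types. The names FractionScoring and Score are hypothetical, tokenization is a simple whitespace split, and each distinct matching query term is assumed to count once, so the result is (matched terms) / (number of terms in the field).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Illustration only: hypothetical names, whitespace tokenization,
// no Lucene.NET types involved.
public static class FractionScoring
{
    // Splits on whitespace; a real analyzer may tokenize differently.
    private static string[] Tokenize(string text)
    {
        return text.Split((char[])null, StringSplitOptions.RemoveEmptyEntries);
    }

    public static float Score(string query, string field)
    {
        var queryTerms = new HashSet<string>(Tokenize(query));
        string[] fieldTerms = Tokenize(field);
        var fieldSet = new HashSet<string>(fieldTerms);

        // The "norm" is the number of terms in the field, known at index time.
        int norm = fieldTerms.Length;
        if (norm == 0)
        {
            return 0f;
        }

        // Each distinct query term found in the field contributes 1 / norm.
        int matched = queryTerms.Count(t => fieldSet.Contains(t));
        return (float)matched / norm;
    }
}
```

For the query “A B C”, a field containing “B C” scores 2/2 = 1 and a field containing “B D” scores 1/2 = 0.5.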
On Tue, May 21, 2024 at 8:02 PM Georgios Georgiadis <[email protected]> wrote:
Hi,
I would like to extend Similarity to have the following functionality: if the
query is “A B C” and a field contains “B C” then I would like to call that a
“match” and return a score of 1 (2/2). If the query is “A B C” and the field
contains “B D” then I would like to call that a partial match and give a score
of 0.5 (1/2). Is this possible?
Best,
Georgios
--
Adrien