that's a pretty good idea, using 'delta score'

 Dennis Gearon


Signature Warning
----------------
It is always a good idea to learn from your own mistakes. It is usually a
better idea to learn from others’ mistakes, so you do not have to make them
yourself.
from 'http://blogs.techrepublic.com.com/security/?p=4501&tag=nl.e036'


EARTH has a Right To Life,
otherwise we all die.



----- Original Message ----
From: Toke Eskildsen <t...@statsbiblioteket.dk>
To: "solr-user@lucene.apache.org" <solr-user@lucene.apache.org>
Sent: Thu, January 20, 2011 11:31:48 PM
Subject: Re: pruning search result with search score gradient

On Tue, 2011-01-11 at 12:12 +0100, Julien Piquot wrote:
> I would like to be able to prune my search result by removing the less 
> relevant documents. I'm thinking about using the search score : I use 
> the search scores of the document set (I assume they are sorted in 
> descending order), normalise them (0 would be the lowest value and 1 
> the greatest value) and then calculate the gradient of the normalised 
> scores. The documents with a gradient below a threshold value would be 
> rejected.

As part of experimenting with federated search, this is one approach
we'll be trying out to determine which results to discard when merging.

> If the scores are linearly decreasing, then no document is rejected. 
> However, if there is a brutal score drop, then the documents below the 
> drop are rejected.

So if we have the scores
1.0, 0.9, 0.2, 0.15, 0.1, 0.05
then the slopes will be
0.05, 0.35, 0.025, 0.025, 0.025
and with a slope threshold of 0.1, we would discard everything from
score 0.2 and below.
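
A minimal sketch of that cut-off in Python, for anyone who wants to try it.
The names normalise and keep_count are made up for illustration; nothing
here is Solr or Lucene API. The slopes in the example look like half the
difference between neighbouring scores (for instance (0.9 - 0.2) / 2 = 0.35
for the large drop), so the sketch divides by an assumed spacing of 2. That
is a guess at the convention, not something stated in the thread.

```python
from typing import List


def normalise(scores: List[float]) -> List[float]:
    """Min-max scale scores so the lowest becomes 0 and the highest 1."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        # All scores equal: there is no drop to detect.
        return [1.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def keep_count(scores: List[float], threshold: float, spacing: float = 2.0) -> int:
    """Return how many leading documents survive the pruning.

    scores must be sorted in descending order. The slope between two
    neighbours is their difference divided by `spacing` (assumed 2.0 to
    match the figures in the example). Everything after the first slope
    above `threshold` is discarded.
    """
    for i in range(len(scores) - 1):
        slope = (scores[i] - scores[i + 1]) / spacing
        if slope > threshold:
            return i + 1
    return len(scores)


scores = [1.0, 0.9, 0.2, 0.15, 0.1, 0.05]
n = keep_count(scores, threshold=0.1)
print(scores[:n])  # [1.0, 0.9]
```

Running the same call on normalise(scores) cuts at the same place for this
particular list, but in general the normalisation changes the slopes, so the
threshold has to be chosen with that in mind.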

It makes sense if the scores are linear with the relevance (a document
with score 0.8 is twice as relevant as one with 0.4). I don't know
if they are, so experiments are needed, and I fear that this is another
demonstration of the inherent problem with quantifying quality.

- Toke
