Query-independent means that the threshold should carry the same meaning for 
all queries, and that found docs scoring below it are discarded. The current 
scoring implementation gives no guarantee that, say, two documents found by two 
different queries, each with a score of 0.5, are of the same quality.

I don't want to prevent docs from being indexed, no. But I want to be sure that 
two docs with the same score in two different queries have the same quality 
(they contain the same set of found terms, have the same length, etc.).
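
To make the requirement concrete, here is a minimal sketch of one possible query-comparable measure: the fraction of distinct query terms that a document contains. This is only an assumption about what "same quality" could mean, not what Lucene's scoring does; the class TermCoverage and its methods are hypothetical names, and document length is left out.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper, plain Java; it does not use any Lucene classes.
final class TermCoverage {

    // Fraction of distinct query terms that occur in the document, in [0, 1].
    // An empty query is treated as matching nothing.
    static double coverage(Set<String> queryTerms, Set<String> docTerms) {
        if (queryTerms.isEmpty()) {
            return 0.0;
        }
        Set<String> found = new HashSet<>(queryTerms);
        found.retainAll(docTerms); // keep only the query terms present in the doc
        return (double) found.size() / queryTerms.size();
    }

    // True when the document reaches the confidence level; the same
    // threshold value can be used for every query.
    static boolean passes(Set<String> queryTerms, Set<String> docTerms, double threshold) {
        return coverage(queryTerms, docTerms) >= threshold;
    }
}
```

With such a measure, a value of 0.5 means "half of the query terms were found" regardless of which query produced it, which is the property the raw score lacks.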

Alexander

-----Original Message-----
From: Andrzej Bialecki <[EMAIL PROTECTED]>
To: java-dev@lucene.apache.org
Date: Thu, 07 Aug 2008 22:44:46 +0200
Subject: Re: lucene scoring


Александр Аристов wrote:
> I want to implement searching with the ability to set a so-called
> confidence level below which I would treat documents as garbage. I
> cannot define the level per query, as the level should be relevant for
> all documents.

Hmm .. I'm not sure if I understand it properly - if the level is 
query-independent, then it's a constant factor, which you can put in a 
field during the index creation - and then you could use a Filter or 
FunctionQuery to exclude documents with this factor below the threshold.
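
The idea above can be sketched in plain Java, with no Lucene classes involved: a per-document factor is stored once at indexing time, and hits whose factor falls below the threshold are dropped at search time. The class StaticQualityFilter, its methods, and the factor values are hypothetical; in Lucene the factor would live in a document field and the exclusion would be done by a Filter or FunctionQuery.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for an index that keeps one constant factor per document.
final class StaticQualityFilter {

    // Factor computed once, at index creation; it does not depend on any query.
    private final Map<String, Double> qualityByDocId = new LinkedHashMap<>();

    void index(String docId, double quality) {
        qualityByDocId.put(docId, quality);
    }

    // Keeps only hits whose stored factor is at or above the threshold,
    // preserving the order of the incoming hit list. Hits with no stored
    // factor are dropped.
    List<String> filter(List<String> hits, double threshold) {
        List<String> kept = new ArrayList<>();
        for (String docId : hits) {
            Double quality = qualityByDocId.get(docId);
            if (quality != null && quality >= threshold) {
                kept.add(docId);
            }
        }
        return kept;
    }
}
```

Because the factor is fixed at indexing time, the same threshold applies to every query, but it cannot reflect how well a document matches a particular query.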

-- 
Best regards,
Andrzej Bialecki     <><
  ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com


---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]


