But I don't want the frequency of each term in the doc.
What I want is 1 if the term exists in the doc and 0 if it doesn't. After
that, I want all these 1s and 0s summed up to give me a number I can use as
a score.
If I set the TF value to 1 or 0, as described above, I get the right
number, but it is then normalized down to 1.0 or smaller.
It is this normalization that I want to avoid.
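
To make this concrete, here is roughly the kind of Similarity I have been
experimenting with. The class name and the choice of overrides are my own
sketch, nothing official; the idea is just that tf() is binary and every
other factor is forced to 1, so a document's score becomes a plain count of
matching query terms:

import org.apache.lucene.search.DefaultSimilarity;

public class MatchCountSimilarity extends DefaultSimilarity {
    public float tf(float freq) {
        return freq > 0 ? 1.0f : 0.0f;   // term present -> 1, absent -> 0
    }
    public float idf(int docFreq, int numDocs) {
        return 1.0f;                      // ignore how rare a term is
    }
    public float queryNorm(float sumOfSquaredWeights) {
        return 1.0f;                      // don't scale scores down toward 1.0
    }
    public float lengthNorm(String fieldName, int numTokens) {
        // note: length norms are baked into the index at indexing time,
        // which is why OMIT_NORMS or a fake-norms reader is still needed
        return 1.0f;
    }
    public float coord(int overlap, int maxOverlap) {
        return 1.0f;                      // don't rescale by fraction of query terms matched
    }
    public float sloppyFreq(int distance) {
        return 1.0f;                      // sloppy/phrase matches count as 1 as well
    }
}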
Thanks again!
Vagelis
markrmiller wrote:
>
> Don't return 1 for tf... just return the tf straight, with no
> changes... return freq. For everything else return 1. After that,
> OMIT_NORMS should work. If you want to try a custom reader:
>
> public class FakeNormsIndexReader extends FilterIndexReader {
>     // fake norms: one byte per document, so stored field norms are ignored
>     byte[] ones = SegmentReader.createFakeNorms(maxDoc());
>
>     public FakeNormsIndexReader(IndexReader in) {
>         super(in);
>     }
>
>     public synchronized byte[] norms(String field) throws IOException {
>         System.out.println("returning fake norms...");
>         return ones;
>     }
>
>     public synchronized void norms(String field, byte[] result, int offset) {
>         System.out.println("writing fake norms...");
>         System.arraycopy(ones, 0, result, offset, maxDoc());
>     }
> }
>
> The beauty of this reader is that you can flip between it (with your
> custom similarity) and Lucene's default implementations, all against the
> same index.
>
> - Mark
>
>
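
P.S. For my own notes, this is how I understand the reader above would be
wired in. The index path is just a placeholder, and MatchCountSimilarity is
the binary-tf sketch from earlier in this mail:

import java.io.IOException;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.search.IndexSearcher;

public class FakeNormsSearchExample {
    public static void main(String[] args) throws IOException {
        // open the existing index and wrap it so all norms read as fakes
        IndexReader reader =
            new FakeNormsIndexReader(IndexReader.open("/path/to/index"));
        IndexSearcher searcher = new IndexSearcher(reader);

        // plug in the binary-tf similarity so scores become match counts
        searcher.setSimilarity(new MatchCountSimilarity());

        // ... run queries here; scores should now come out as plain
        // match counts (assuming no query or field boosts) ...

        searcher.close();
        reader.close();
    }
}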