Hi Srikant,

Thank you very much for your reply; it's very interesting.
I have to say I am confused now...
I do not know what I can do to make this unit test pass...

I agree with you, it may be an issue of computing relevance.
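
To make the tie concrete, here is a minimal sketch of the arithmetic for the two test documents. It does not use Lucene and it is not the FairSimilarity code (that class is not shown in this thread). The class and method names are made up for illustration, and the second method assumes the classic default weighting, tf = sqrt(freq) and lengthNorm = 1/sqrt(numTerms), which FairSimilarity may well override.

```java
// Hypothetical helper, for illustration only; not part of Lucene or FairSimilarity.
class ScoreTieSketch {

    // Relative term frequency: occurrences of the term divided by the field length.
    static double relativeTf(int freq, int numTerms) {
        return (double) freq / numTerms;
    }

    // Assumption: classic default weighting, tf = sqrt(freq) and
    // lengthNorm = 1 / sqrt(numTerms), multiplied together.
    static double classicWeight(int freq, int numTerms) {
        return Math.sqrt(freq) * (1.0 / Math.sqrt(numTerms));
    }

    public static void main(String[] args) {
        // SHORT_SUITE: "x" occurs once in 10 terms.
        // BIG_SUITE:   "x" occurs twice in 20 terms.
        System.out.println("relative TF, short: " + relativeTf(1, 10));
        System.out.println("relative TF, big:   " + relativeTf(2, 20));
        System.out.println("classic weight, short: " + classicWeight(1, 10));
        System.out.println("classic weight, big:   " + classicWeight(2, 20));
    }
}
```

Under these assumptions both documents come out with the same relative TF and, up to floating-point rounding, the same tf * lengthNorm product, so any ordering between them would have to come from something else. One known place where Lucene loses precision is the field norm, which is stored in a single byte; whether that explains the order seen in this test is not something the thread establishes.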

Fabrice


Srikant Jakilinki-3 wrote:
> 
> OK, got it to work. Thanks.
> 
> By a quick scoring comparison, I got the same scores for both hits. 
> Maybe there is a loss of precision somewhere. Or, when scores are equal, 
> Lucene is doing something unintended or overlooked and thus putting shorter 
> documents higher, since the experiment is a special case where the TF of a 
> queried term is equal (for both suites, the TF of x = 10%), which happens 
> very rarely. Or maybe the IDF factor is kicking in in some strange way, 
> although it shouldn't. There are a number of possible reasons, but to the 
> naked eye, there isn't much to see.
> 
> However, that said, length normalization is not a science but an art, and 
> the simple scheme we have here in the FairSimilarity will not always 
> perform as expected in real-world scenarios. Maybe I am missing something 
> or have forgotten my basics, but that is not to say your observation is
> trivial.
> 
> Rather, on the contrary. I hope there will be more activity on this topic, 
> because it is an issue of computing relevance, which is the core of search.
> 
> Cheers,
> Srikant
> 
> Fabrice Robini wrote:
>> Oooops sorry, bad cut/paste...
>>
>> Here is the right one :-)
>>
>>     public void testFairSimilarity()
>>             throws CorruptIndexException, IOException, ParseException
>>     {
>>         Directory theDirectory = new RAMDirectory();
>>         Analyzer theAnalyzer = new StandardAnalyzer();
>>         
>>         IndexWriter theIndexWriter = new IndexWriter(theDirectory, theAnalyzer);
>>         theIndexWriter.setSimilarity(new FairSimilarity());
>>         
>>         Document doc1 = new Document();
>>         Field name1 = new Field("NAME", "SHORT_SUITE",
>>                 Field.Store.YES, Field.Index.UN_TOKENIZED);
>>         Field content1 = new Field("CONTENT", "x 2 3 4 5 6 7 8 9 10",
>>                 Field.Store.NO, Field.Index.TOKENIZED);
>>         doc1.add(name1);
>>         doc1.add(content1);        
>>         theIndexWriter.addDocument(doc1);
>>         
>>         Document doc2 = new Document();
>>         Field name2 = new Field("NAME", "BIG_SUITE",
>>                 Field.Store.YES, Field.Index.UN_TOKENIZED);
>>         Field content2 = new Field("CONTENT",
>>                 "x x 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20",
>>                 Field.Store.NO, Field.Index.TOKENIZED);
>>         doc2.add(name2);
>>         doc2.add(content2);        
>>         theIndexWriter.addDocument(doc2);
>>         
>>         theIndexWriter.close();
>>         
>>         Searcher searcher = new IndexSearcher(theDirectory);
>>         searcher.setSimilarity(new FairSimilarity());
>>
>>         QueryParser queryParser = new QueryParser("CONTENT", theAnalyzer);
>>
>>         Hits hits = searcher.search(queryParser.parse("x"));
>>
>>         assertEquals(2, hits.length());
>>         assertEquals("BIG_SUITE", hits.doc(0).get("NAME"));
>>         assertEquals("SHORT_SUITE", hits.doc(1).get("NAME"));
>>     }
>>     
>>
>>
>>
>> Srikant Jakilinki-3 wrote:
>>   
>>> Well, I can't seem to even get past the assertions of this code.
>>>
>>> The first assertion is failing in that I get 0 hits. I am using 
>>> SimpleAnalyzer since I do not have a FrenchAnalyzer.
>>>
>>> Any thoughts?
>>> Srikant
>>>
> 

-- 
View this message in context: 
http://www.nabble.com/Is-Fair-Similarity-working-with-lucene-2.2---tp15001250p15026214.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.


---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
