Re: A question about scoring function in Lucene

2004-12-15 Thread Doug Cutting
Chuck Williams wrote: I believe the biggest problem with Lucene's approach relative to the pure vector space model is that Lucene does not properly normalize. The pure vector space model implements a cosine in the strictly positive sector of the coordinate space. This is guaranteed intrinsically
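
For readers unfamiliar with the cosine measure mentioned above, here is a minimal sketch of it. This is not Lucene's scoring code. The function name and the dict-of-term-weights representation are assumptions made for illustration. With non-negative term weights the result always falls between 0 and 1.

```python
import math


def cosine_similarity(query_weights, doc_weights):
    """Cosine of the angle between two sparse term-weight vectors.

    Both arguments are hypothetical dicts mapping term -> non-negative weight.
    """
    # Dot product over the terms the query contains.
    dot = sum(w * doc_weights.get(term, 0.0) for term, w in query_weights.items())
    # Euclidean lengths of the query and document vectors.
    norm_q = math.sqrt(sum(w * w for w in query_weights.values()))
    norm_d = math.sqrt(sum(w * w for w in doc_weights.values()))
    if norm_q == 0.0 or norm_d == 0.0:
        return 0.0  # an empty vector matches nothing
    return dot / (norm_q * norm_d)
```

Dividing by both norm_q and norm_d is what bounds the score; dropping either factor removes that guarantee.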

Re: A question about scoring function in Lucene

2004-12-15 Thread Chris Hostetter
: I question whether such scores are more meaningful. Yes, such scores would be guaranteed to be between zero and one, but would 0.8 really be meaningful? I don't think so. Do you have pointers to research which demonstrates this? E.g., when such a scoring method is used, that

Re: A question about scoring function in Lucene

2004-12-15 Thread Otis Gospodnetic
There is one case that I can think of where this 'constant' scoring would be useful, and I think Chuck already mentioned this 1-2 months ago. For instance, having such scores would allow one to create alert applications where queries run by some scheduler would trigger an alert whenever the score

RE: A question about scoring function in Lucene

2004-12-15 Thread Chuck Williams
Chris Hostetter wrote: For example, using the current scoring equation, if i do a search for Doug Cutting and the results/scores i get back are... 1: 0.9 2: 0.3 3: 0.21 4: 0.21 5: 0.1

RE: A question about scoring function in Lucene

2004-12-15 Thread Nhan Nguyen Dang
. Chuck -Original Message- From: Vikas Gupta [mailto:[EMAIL PROTECTED] Sent: Tuesday, December 14, 2004 9:32 PM To: Lucene Users List Subject: Re: A question about scoring function in Lucene Lucene uses the vector space model. To understand

RE: A question about scoring function in Lucene

2004-12-15 Thread Chuck Williams
15, 2004 1:18 AM To: Lucene Users List Subject: RE: A question about scoring function in Lucene Thanks for your answer. In the Lucene scoring function, they use only norm_q, but for one query, norm_q is the same for all documents. So norm_q does not actually affect the score
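
The point about norm_q can be checked with a small sketch. The raw scores and query weights below are made-up illustrative values, not output from Lucene. Dividing every document's score by the same positive query norm changes the score values but not the order of the results.

```python
import math


def rank(scores):
    """Return document ids ordered from highest to lowest score."""
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical unnormalized scores for three documents against one query.
raw = {"doc1": 2.7, "doc2": 0.9, "doc3": 0.63}

# Hypothetical query term weights; norm_q depends only on the query.
query_weights = [1.0, 2.0]
norm_q = math.sqrt(sum(w * w for w in query_weights))

# Apply the same positive divisor to every document's score.
normalized = {doc: score / norm_q for doc, score in raw.items()}

# The ordering is unchanged; only the score scale differs.
assert rank(raw) == rank(normalized)
```

The ordering within one query is unaffected, but the factor does matter when scores from different queries are compared, since each query has its own norm_q.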

Re: A question about scoring function in Lucene

2004-12-15 Thread Doug Cutting
Otis Gospodnetic wrote: There is one case that I can think of where this 'constant' scoring would be useful, and I think Chuck already mentioned this 1-2 months ago. For instance, having such scores would allow one to create alert applications where queries run by some scheduler would trigger an

Re: A question about scoring function in Lucene

2004-12-15 Thread Doug Cutting
Chris Hostetter wrote: For example, using the current scoring equation, if i do a search for Doug Cutting and the results/scores i get back are... 1: 0.9 2: 0.3 3: 0.21 4: 0.21 5: 0.1 ...then there are at least two meaningful pieces of data I can glean:

RE: A question about scoring function in Lucene

2004-12-14 Thread Chuck Williams
] Sent: Tuesday, December 14, 2004 9:32 PM To: Lucene Users List Subject: Re: A question about scoring function in Lucene Lucene uses the vector space model. To understand that: -Read section 2.1 of Space optimizations for Total Ranking paper (Linked here http

Re: A question about scoring function in Lucene

2004-12-14 Thread Vikas Gupta
Lucene uses the vector space model. To understand that:
- Read section 2.1 of Space optimizations for Total Ranking paper (Linked here http://lucene.sourceforge.net/publications.html)
- Read section 6 to 6.4 of http://www.csee.umbc.edu/cadip/readings/IR.report.120600.book.pdf
- Read section 1 of