Thanks for the clarification, Hoss; that is a helpful explanation. I am still unsure how it is ever possible to display score bars, for which you need some normalization... but that's for another day.
I feel that indicating match quality is still a science that has not blossomed yet. Sorting by score is, however, in very good shape.

paul

On 25 Apr 2011 at 22:53, Chris Hostetter wrote:

> : All I found was:
> : http://search.lucidimagination.com/search/document/9d06882d97db5c59/a_question_about_solr_score
> :
> : where Hoss suggests to normalize depending on the maxScore.
>
> to be clear, i do not (nor have i ever) suggested that someone normalize
> based on maxScore.
>
> my point there was that when people *insist* on providing some sort of
> normalization, the maxScore is always available if they want to use it
>
> : I am not comfortable with that since, at least, I want a search for
> : "the wombats" in a directory of mathematical concepts to display that
> : all scores are pretty bad, and not display 1.0 for matches that are only
> : on the word "the".
>
> the crux of the problem is in deciding what you want to normalize relative
> to -- the "ideal" solution is to normalize relative to the maximum *possible*
> score for *any* query against your corpus, but that's not something that's
> generally feasible to do (and based on experiments i tried once, it didn't
> seem like it would be very useful anyway)
>
> : It seems that the strategy would be to normalize by maxScore if the
> : maxScore is bigger than 1.0.
> : Can you confirm that?
> : Aren't there going to be similar edge cases as above?
> :
> : I remember a time when Lucene results' scores were always normalized.
> : That does not seem to be the case in Solr, does it?
>
> once upon a time, lucene's most "beginner friendly" api did provide
> normalized scores, using the approach you described (divide by max score
> if max score greater than 1.0) and it had all of the problems you might
> expect -- but some people liked it because they had an irrational dislike
> for scores greater than 1.
>
> Solr has never supported those pseudo-normalized scores, and lucene's java
> API eventually got rid of them.
>
> -Hoss
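
For readers who want to see the rule discussed above (divide by the max score only if it is greater than 1.0), here is a minimal sketch in Python. It is not Solr or Lucene code: the function name `legacy_normalize` is hypothetical, the scores are made-up floats, and it only illustrates the arithmetic and the edge case raised in the thread.

```python
def legacy_normalize(scores):
    """Divide every score by the max score, but only when the max exceeds 1.0.

    Hypothetical helper for illustration; not part of any Solr/Lucene API.
    """
    if not scores:
        return []
    max_score = max(scores)
    if max_score > 1.0:
        return [s / max_score for s in scores]
    # Max score is at most 1.0: scores are left untouched.
    return list(scores)


# Made-up raw scores for a query whose best hit is only a weak match.
# Because the top raw score is above 1.0, that hit is reported as 1.0.
weak_but_above_one = legacy_normalize([2.0, 1.0, 0.5])

# Made-up raw scores below 1.0: nothing is rescaled.
below_one = legacy_normalize([0.5, 0.25])
```

The first call shows the problem with score bars built this way: the top hit always gets a full bar once any raw score exceeds 1.0, regardless of how good the match actually is.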