> However ... I still think that if you really want
> a length norm that takes into account the average
> length of the docs, you want one that rewards docs
> for being near the average ...
... like SweetSpotSimilarity (SSS)

> it doesn't seem to make a lot of sense to me to say
> that a doc whose length is N% longer than the
> average length is significantly worse than docs whose
> length is N% shorter than the average length.

I don't understand why a doc should be punished just for having a length
different from the average (i.e. regardless of whether it is longer or shorter).

The (evolving) way I understand it:
(a) Very long docs are likely to contain everything, so let's punish them
    to offset this;
(b) This is what the original doc-length-norm actually does;
(c) But then very short docs might be rewarded too much;
(d) Now we might get stupid (or erroneous) few-word docs as top results;
(e) To solve this, pivoted doc-length-norm punishes docs that are too long
    (longer than the average) but only slightly rewards docs that are
    shorter than the average.

It makes sense to me (IR'ishly, if I may say so).
The SSS way does not make sense to me in that way.
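To make (a)-(e) concrete, here is a rough sketch comparing the three shapes of length norm. Everything in it is an assumption for illustration, not the actual Lucene code: the function names are made up, the classic norm is rescaled so that an average-length doc scores 1.0, the pivoted norm uses the textbook form with a hypothetical slope of 0.25, and the sweet-spot norm follows the plateau formula as I recall it from the SweetSpotSimilarity Javadoc, with min = max = average.

```python
import math


def classic_norm(length, avg_len):
    # 1/sqrt(length), rescaled so a doc of average length gets 1.0.
    # Short docs are rewarded without bound as length shrinks.
    return math.sqrt(avg_len / length)


def pivoted_norm(length, avg_len, slope=0.25):
    # Textbook pivoted normalization with the pivot at the average length.
    # slope=0.25 is an illustrative value, not a recommendation.
    # The reward for short docs is capped at 1/(1 - slope).
    return 1.0 / ((1.0 - slope) + slope * (length / avg_len))


def sweet_spot_norm(length, lo, hi, steepness=0.5):
    # Plateau-shaped norm: 1.0 inside [lo, hi], decaying symmetrically
    # with the absolute distance from the plateau on either side.
    distance = abs(length - lo) + abs(length - hi) - (hi - lo)
    return 1.0 / math.sqrt(steepness * distance + 1.0)


if __name__ == "__main__":
    avg = 100
    print("length  classic  pivoted  sweetspot")
    for length in (10, 50, 100, 200, 1000):
        print("%6d  %7.3f  %7.3f  %9.3f" % (
            length,
            classic_norm(length, avg),
            pivoted_norm(length, avg),
            sweet_spot_norm(length, avg, avg),
        ))
```

With these assumptions, the classic norm keeps growing as docs get shorter, the pivoted norm gives short docs only a bounded boost while still penalizing long ones, and the sweet-spot norm penalizes both sides of the average alike, which is the behavior I am questioning above.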