On Sun, Nov 13, 2005 at 12:04:41AM +0100, Karl Koch wrote:
> My aim is to combine these two scores. The Lucene score is normalised
> to between 0.0 and 1.0 (if the score exceeded 1.0 at some point) or is
> less than 1.0 (if it did not). The user model looks the same in this
> respect - although based on different data - a 1.0 means the maximum of
> relevance and a 0.0 a minimum of relevance. At the moment I am
> multiplying the Lucene score with the score produced by the user model.
> This means the resulting, combined score is a number between 0.0 and 1.0
> and represents the merged view from both models - the IR view and the
> view of the user model.

I came across that question too recently; it seems to be a rather under-researched topic in the literature. I used multiplication in the end, because it's simple, it produces reasonable results, it has no tunable parameters, and it's invariant to normalization: rescaling either score by a positive constant leaves the ranking unchanged. (Don't make a model with tunable parameters if you don't know how to tune them ...)

The most helpful paper I came across was this:

http://trec.nist.gov/pubs/trec13/papers/microsoft-cambridge.web.hard.pdf

It's about combining PageRank with a relevance score, but it contains a good description of how they arrived at their scoring formula. They use a linear combination of the two measures and transform them to have a roughly similar distribution. They then tuned the parameters using a test corpus (which may be difficult or impossible for your application). Their system was one of the best at TREC-13.

Regards, Sebastian

-- 
Sebastian Kirsch <[EMAIL PROTECTED]> [http://www.sebastian-kirsch.org/]
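
As a footnote to the point about multiplication being invariant to normalization, here is a minimal Python sketch of what that means for ranking. Everything in it is made up for illustration: the document ids, the scores, the function names, and the 0.5 weight in the linear combination. It is not Lucene code and not the formula from the paper. It only compares a plain product with an untuned weighted sum when one of the two scores is rescaled.

```python
# Hypothetical scores per document: (ir_score, user_score), both in [0, 1].
DOCS = {
    "a": (0.9, 0.2),
    "b": (0.5, 0.5),
    "c": (0.3, 0.9),
}


def rank_by_product(docs):
    """Return document ids ordered by ir_score * user_score, best first."""
    return sorted(docs, key=lambda d: docs[d][0] * docs[d][1], reverse=True)


def rank_by_linear(docs, weight=0.5):
    """Return document ids ordered by a weighted sum of the two scores.

    `weight` is an assumed, untuned parameter, not a recommended value.
    """
    return sorted(
        docs,
        key=lambda d: weight * docs[d][0] + (1.0 - weight) * docs[d][1],
        reverse=True,
    )


def rescale_ir(docs, factor):
    """Simulate a different normalization by scaling every IR score."""
    return {d: (ir * factor, user) for d, (ir, user) in docs.items()}


if __name__ == "__main__":
    halved = rescale_ir(DOCS, 0.5)
    print("product:", rank_by_product(DOCS), "->", rank_by_product(halved))
    print("linear: ", rank_by_linear(DOCS), "->", rank_by_linear(halved))
```

With these made-up numbers, halving the IR scores leaves the product ranking untouched, while the fixed-weight linear ranking swaps two documents. That is why a linear combination needs its weights (or a distribution-matching transform) tuned against test data, and the product does not.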