I have a recommender that I would like to evaluate. The Absolute evaluator doesn't work because it compares preference values, and the recommender and its DataModel operate in different numerical spaces with no way to normalize the two. That leaves comparing the relative order of the items as ranked by the DataModel vs. the Recommender, but there is no order-comparing evaluator.

What's a good strategy for this problem? Order comparison seems the right approach, but what are "intellectually defendable" formulae?

This is what I've got: for each user, I get the item preferences from the DataModel. Then I get preferences for the same items from the recommender. These are stored in matching arrays. I've tried a couple of measurements:

1) Do a bubble sort of one prefs list against the other, counting the number of swaps needed to make the two match.
2) For each item in one prefs list, find its position in the other prefs list and save the distance.

For both of these measures, I've tried various combinations of division and square roots to get a useful comparison score. Taking a square root compresses the larger distances, so differences among the nearer ones count for relatively more.

Comments? Technical references?

Thanks,

-- 
Lance Norskog
[email protected]
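
P.S. To make the two measures concrete, here is a rough sketch of them. It is in Python for brevity and is not my actual code. All function names are made up for illustration, ties are broken by array index (an assumption), and the divisors in the usage lines are only one possible normalization: n(n-1)/2 is the most swaps a list of n items can need, and floor(n*n/2) is the largest possible total displacement.

```python
from typing import Sequence


def rank_order(prefs: Sequence[float]) -> list[int]:
    """Return item indices ordered from highest to lowest preference."""
    # sorted() is stable, so tied preferences keep their index order
    # (an assumption; any consistent tie-break would do).
    return sorted(range(len(prefs)), key=lambda i: -prefs[i])


def swap_count(model_prefs: Sequence[float], rec_prefs: Sequence[float]) -> int:
    """Measure 1: adjacent swaps a bubble sort needs to make the orders match."""
    pos_in_model = {item: pos for pos, item in enumerate(rank_order(model_prefs))}
    # Express the recommender's order in terms of the model's positions;
    # sorting this sequence turns one ordering into the other.
    seq = [pos_in_model[item] for item in rank_order(rec_prefs)]
    swaps = 0
    for end in range(len(seq) - 1, 0, -1):
        for i in range(end):
            if seq[i] > seq[i + 1]:
                seq[i], seq[i + 1] = seq[i + 1], seq[i]
                swaps += 1
    return swaps


def total_displacement(model_prefs: Sequence[float], rec_prefs: Sequence[float]) -> int:
    """Measure 2: sum over items of the distance between their two positions."""
    rec_pos = {item: pos for pos, item in enumerate(rank_order(rec_prefs))}
    return sum(
        abs(pos - rec_pos[item])
        for pos, item in enumerate(rank_order(model_prefs))
    )


# Preferences for the same four items, in two unrelated numerical spaces.
model_prefs = [5.0, 3.0, 4.0, 1.0]
rec_prefs = [0.9, 0.8, 0.1, 0.4]
n = len(model_prefs)

swaps = swap_count(model_prefs, rec_prefs)
displacement = total_displacement(model_prefs, rec_prefs)

# Assumed normalization: divide by the worst case so 0 = same order, 1 = reversed.
swap_score = swaps / (n * (n - 1) // 2)
displacement_score = displacement / (n * n // 2)
```

Both scores depend only on the orderings, so the scale of the preference values doesn't matter.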
