Here's the thing: my derivation creates complete recommendations, so it is both a data model and a recommender itself.
Ok, so I should separate the GroupLens data into training and test sets, and
then transform the training set. I will run a SlopeOne recommender built from
the original training set, and my recommender built from the transformed one,
and evaluate both against the same test set. Then I will compare the
recommendations... somehow.

Sean, you are correct, the derivation gives ranking values in a different
space than the original data model, so I'm comparing the order of
recommendations. I'm trying a normal-scores approach because it's easy:
http://comp9.psych.cornell.edu/Darlington/normscor.htm

However, at the moment I'm testing SlopeOne against random data to get a
baseline.

Lance

On Fri, Oct 22, 2010 at 12:58 PM, Federico Castanedo <[email protected]> wrote:
> Hi Lance,
>
> IMHO the best way to measure how much information you are losing through
> your derivation function is to perform a cross-validation scheme on both
> the original data set and the derived data set.
>
> But be sure to compare against the same validation set for the two data
> sets (the original and the derived one): if you use an 80%/20%
> training/validation split with a 5-fold cross-validation scheme, make sure
> you are comparing the same subsets of the two sets.
>
> Regards,
> Federico
>
> 2010/10/22 Sean Owen <[email protected]>:
>> Yah, I still think held-out data is the best thing, if you want to use
>> this built-in evaluation mechanism. Hold out the same data from both
>> models and run the same test.
>>
>> There is another approach which doesn't necessarily require held-out
>> data. On the original, full model, just compute recommendations for any
>> users you like. Assume these are "correct". Then do the same for the
>> derived model.
>>
>> It will return estimated preferences in both cases. You could use the
>> deltas as a measure of "error" (unless your derived model has quite a
>> different rating space).
>>
>> Or simply use the difference in rankings -- compute some metric that
>> penalizes having recommendations in different places in the ordering.
>>
>> I'll say I don't know which of these is most mathematically sound, and
>> interpreting the results may be hard. But any of these should give a
>> notion of "better" and "worse".
>>
>> Assuming the original model's recommendations are "correct" is a
>> reasonably big assumption. For example, the whole point of an SVD
>> recommender is to modify the model (reduce its dimension, really) in
>> order to be able to recommend items that should be recommended, but
>> weren't before due to model sparseness. There, transforming the data in
>> theory gives better results; that it's different doesn't necessarily
>> mean worse.
>>
>> But maybe that's not an issue for your use case, I don't know.
>>
>> On Fri, Oct 22, 2010 at 5:39 AM, Lance Norskog <[email protected]> wrote:
>>
>>> Here is my use case: I have two data models.
>>> 1) The original data, for example GroupLens.
>>> 2) The derivative: a second data model which is derived from the
>>> original with a one-way function.
>>>
>>> I wish to measure how much information is lost in the derivation
>>> function. There is some entropy, so the derived data model cannot
>>> supply recommendations as good as the original data. But how much
>>> worse?
>>>
>>> My naive method is to make recommendations using the master model and
>>> the derived model, and compare them. If the recommendations from the
>>> derived model are, say, 90% as good as those from the original data,
>>> then the derivation function is ok.
>>>
>>> Now, obviously, the gold standard for recommendations is the data in
>>> the original model. So, I make recommendations from the original and
>>> the derived models, for the user/item prefs given in the original data.
>>> I don't really care about what the user gave as preferences: I want to
>>> know what the recommender algorithm itself thinks. But the recommenders
>>> just parrot back the data model instead of giving me their own opinion.
>>> Thus, the point of this whole thread. How recommender algorithms work
>>> is a side issue; I'm trying to use them as an indirect measurement of
>>> something else.
>>>
>>> What is another way to test what I'm trying to test? What is another
>>> way to evaluate the quality of my derivation function?
>>>
>>> On Wed, Oct 20, 2010 at 11:41 PM, Sebastian Schelter <[email protected]>
>>> wrote:
>>> > Hi Lance,
>>> >
>>> > When evaluating a recommender you should split your dataset into a
>>> > training part and a test part. Only data from the training part
>>> > should be included in your DataModel, and you only measure the
>>> > accuracy of predicting ratings that are included in the test part
>>> > (which is not known by your recommender). If you structure things
>>> > this way, the current implementation should work fine for you.
>>> >
>>> > --sebastian
>>> >
>>> > On 21.10.2010 04:56, Lance Norskog wrote:
>>> >>
>>> >> Since this is Recommender day, here is another kvetch:
>>> >>
>>> >> The recommender implementations with algorithms all do this in
>>> >> Recommender.estimatePreference():
>>> >>
>>> >>   public float estimatePreference(long userID, long itemID) throws TasteException {
>>> >>     DataModel model = getDataModel();
>>> >>     Float actualPref = model.getPreferenceValue(userID, itemID);
>>> >>     if (actualPref != null) {
>>> >>       return actualPref;
>>> >>     }
>>> >>     return doEstimatePreference(userID, itemID);
>>> >>   }
>>> >>
>>> >> Meaning: "if I told you something, just parrot it back to me."
>>> >> Otherwise, make a guess.
>>> >>
>>> >> I am doing head-to-head comparisons of the dataModel preferences vs.
>>> >> the Recommender. This code makes it impossible to directly compare
>>> >> what the recommender thinks vs. the actual preference. If I wanted
>>> >> to know what I told it, I already know that. I want to know what the
>>> >> recommender thinks.
>>> >>
>>> >> If this design decision is something y'all have argued about and
>>> >> settled on, never mind. If it is just something that seemed like a
>>> >> good idea at the time, can we change the recommenders, and the
>>> >> Recommender "contract", to always use their own algorithm?
>>> >>
>>> >
>>> >
>>>
>>> --
>>> Lance Norskog
>>> [email protected]
>>>
>>
>
--
Lance Norskog
[email protected]
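
A minimal sketch of the ranking comparison discussed in the thread above,
assuming the Mahout 0.4-era Taste API. The class name, file names, and the
rankAgreement helper are illustrative only; for simplicity both data sets are
fed to SlopeOne, whereas in Lance's setup the derived side would be his own
recommender, and he proposes a normal-scores transformation rather than the
Spearman-style position comparison used here.

import java.io.File;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

import org.apache.mahout.cf.taste.impl.model.file.FileDataModel;
import org.apache.mahout.cf.taste.impl.recommender.slopeone.SlopeOneRecommender;
import org.apache.mahout.cf.taste.model.DataModel;
import org.apache.mahout.cf.taste.recommender.RecommendedItem;
import org.apache.mahout.cf.taste.recommender.Recommender;

public class RankingComparison {

  public static void main(String[] args) throws Exception {
    // Original ratings and the one-way-transformed copy; file names are placeholders.
    DataModel original = new FileDataModel(new File("grouplens-train.csv"));
    DataModel derived  = new FileDataModel(new File("grouplens-train-derived.csv"));

    // For illustration both models are fed to SlopeOne; the derived side would
    // really be the recommender built from the transformed data.
    Recommender originalRec = new SlopeOneRecommender(original);
    Recommender derivedRec  = new SlopeOneRecommender(derived);

    long userID = 1L;
    int howMany = 20;
    double agreement = rankAgreement(originalRec.recommend(userID, howMany),
                                     derivedRec.recommend(userID, howMany));
    System.out.println("user " + userID + " rank agreement = " + agreement);
  }

  // Spearman-style correlation over the items both recommenders returned;
  // items appearing in only one top-N list are ignored, so this is a rough
  // agreement score rather than a textbook rank correlation.
  static double rankAgreement(List<RecommendedItem> a, List<RecommendedItem> b) {
    Map<Long, Integer> rankInB = new HashMap<Long, Integer>();
    for (int i = 0; i < b.size(); i++) {
      rankInB.put(b.get(i).getItemID(), i);
    }
    int n = 0;
    double sumSquaredDiff = 0.0;
    for (int i = 0; i < a.size(); i++) {
      Integer j = rankInB.get(a.get(i).getItemID());
      if (j != null) {
        double d = i - j;
        sumSquaredDiff += d * d;
        n++;
      }
    }
    if (n < 2) {
      return Double.NaN; // not enough overlap to say anything about ordering
    }
    return 1.0 - 6.0 * sumSquaredDiff / (n * (n * n - 1.0));
  }
}

Averaging the per-user score over many users, and over several random
train/test splits, would give a single number for how much the derived model
reorders recommendations relative to the original.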

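A similarly hedged sketch of the held-out evaluation that Sebastian, Sean, and
Federico describe, again assuming the Mahout 0.4-era Taste API; file names are
placeholders. The fixed seed keeps the held-out preferences the same across
runs, and, as Sean notes, the average-absolute-difference score is only
directly comparable between the two models if they share roughly the same
rating space.

import java.io.File;

import org.apache.mahout.cf.taste.common.TasteException;
import org.apache.mahout.cf.taste.eval.RecommenderBuilder;
import org.apache.mahout.cf.taste.eval.RecommenderEvaluator;
import org.apache.mahout.cf.taste.impl.eval.AverageAbsoluteDifferenceRecommenderEvaluator;
import org.apache.mahout.cf.taste.impl.model.file.FileDataModel;
import org.apache.mahout.cf.taste.impl.recommender.slopeone.SlopeOneRecommender;
import org.apache.mahout.cf.taste.model.DataModel;
import org.apache.mahout.cf.taste.recommender.Recommender;
import org.apache.mahout.common.RandomUtils;

public class HeldOutEvaluation {

  public static void main(String[] args) throws Exception {
    // Fix the random seed so the same preferences are held out on each run,
    // which is the "compare the same validation set" point Federico makes.
    RandomUtils.useTestSeed();

    RecommenderBuilder slopeOneBuilder = new RecommenderBuilder() {
      public Recommender buildRecommender(DataModel model) throws TasteException {
        return new SlopeOneRecommender(model);
      }
    };

    RecommenderEvaluator evaluator = new AverageAbsoluteDifferenceRecommenderEvaluator();

    // 80% of each user's prefs for training, 20% held out, evaluated over all users.
    DataModel original = new FileDataModel(new File("grouplens.csv"));
    double originalScore = evaluator.evaluate(slopeOneBuilder, null, original, 0.8, 1.0);

    DataModel derived = new FileDataModel(new File("grouplens-derived.csv"));
    double derivedScore = evaluator.evaluate(slopeOneBuilder, null, derived, 0.8, 1.0);

    System.out.println("average absolute difference, original: " + originalScore);
    System.out.println("average absolute difference, derived:  " + derivedScore);
  }
}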