Here's the thing: my derivation creates complete recommendations, so
it is both a data model and a recommender itself.

Ok, so I should separate the original data (GroupLens) into training
and test sets. Then, I should transform the training set. I will run a
SlopeOne recommender against the original test set, and my recommender
against the same test set. Then, I will compare the recommendations...
somehow.
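
A minimal sketch of such a split, in plain Java with no Mahout
dependency. The class name HoldOutSplit, the Pref holder, and the
hash-based rule are all assumptions made for illustration; none of them
come from Mahout. The point is only that test-set membership depends on
the (userID, itemID) pair alone, so the same pairs are held out
whichever model is being evaluated.

```java
import java.util.ArrayList;
import java.util.List;

class HoldOutSplit {

    /** Hypothetical stand-in for one (user, item, value) row of the data set. */
    public static final class Pref {
        public final long userID;
        public final long itemID;
        public final float value;

        public Pref(long userID, long itemID, float value) {
            this.userID = userID;
            this.itemID = itemID;
            this.value = value;
        }
    }

    /**
     * Decides whether a (user, item) pair is held out for testing.
     * The answer depends only on the two IDs, so it is identical for
     * the original data and for anything derived from it.
     */
    public static boolean isHeldOut(long userID, long itemID, int testPercent) {
        long h = userID * 31L + itemID;
        // Mix the bits so that consecutive IDs do not land in the same bucket.
        h ^= (h >>> 33);
        h *= 0xff51afd7ed558ccdL;
        h ^= (h >>> 33);
        return Math.floorMod(h, 100L) < testPercent;
    }

    /** Returns the held-out rows (heldOut == true) or the training rows (false). */
    public static List<Pref> select(List<Pref> all, int testPercent, boolean heldOut) {
        List<Pref> out = new ArrayList<>();
        for (Pref p : all) {
            if (isHeldOut(p.userID, p.itemID, testPercent) == heldOut) {
                out.add(p);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Pref> all = new ArrayList<>();
        for (long user = 1; user <= 10; user++) {
            for (long item = 1; item <= 10; item++) {
                all.add(new Pref(user, item, 3.0f));
            }
        }
        System.out.println("training rows: " + select(all, 20, false).size());
        System.out.println("test rows:     " + select(all, 20, true).size());
    }
}
```

Only the training rows would be fed to the derivation; the test rows
stay in the original rating space and are used to score both
recommenders.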

Sean, you are correct, the derivation gives ranking values in a
different space than the original data model. So, I'm comparing the
order of recommendations. I'm trying a normal-scores thing because
it's easy.

http://comp9.psych.cornell.edu/Darlington/normscor.htm
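
One way to compare the order of two recommendation lists, sketched in
plain Java. Note that this is NOT the normal-scores method from the
link above; it is plain Spearman rank correlation, swapped in because
it needs nothing beyond the standard library. The class and method
names are hypothetical. It assumes each list holds an item at most
once, and it only looks at items that appear in both lists.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class RankingComparison {

    /**
     * Spearman rank correlation between two ordered lists of item IDs.
     * Items missing from either list are ignored and the remaining
     * items are re-ranked 0..n-1 within each list.
     * Returns 1.0 for identical order and -1.0 for reversed order.
     */
    public static double spearman(List<Long> a, List<Long> b) {
        Set<Long> inA = new HashSet<>(a);

        // Rank of each common item within list b.
        Map<Long, Integer> rankB = new HashMap<>();
        for (Long item : b) {
            if (inA.contains(item) && !rankB.containsKey(item)) {
                rankB.put(item, rankB.size());
            }
        }

        // Walk list a, ranking common items as they are met.
        long sumSquaredDiff = 0;
        int rankA = 0;
        for (Long item : a) {
            Integer rb = rankB.get(item);
            if (rb == null) {
                continue; // not recommended by the other model
            }
            long d = rankA - rb;
            sumSquaredDiff += d * d;
            rankA++;
        }

        int n = rankA;
        if (n < 2) {
            throw new IllegalArgumentException("need at least two common items");
        }
        return 1.0 - (6.0 * sumSquaredDiff) / ((double) n * ((double) n * n - 1.0));
    }

    public static void main(String[] args) {
        List<Long> original = Arrays.asList(1L, 2L, 3L, 4L);
        List<Long> derived = Arrays.asList(2L, 1L, 3L, 4L);
        System.out.println("rho = " + spearman(original, derived));
    }
}
```

Because only the order matters, this sidesteps the fact that the two
models score items in different rating spaces.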

However, at the moment I'm testing SlopeOne against random data to
get a baseline.

Lance

On Fri, Oct 22, 2010 at 12:58 PM, Federico Castanedo
<[email protected]> wrote:
> Hi Lance,
>
> IMHO the best way to measure how much information you are losing in
> your derivative function is to run a cross-validation scheme on both
> the original data set and the derivative data set.
>
> But be sure to compare the same validation set of the two sets (the
> original and the derivative). I mean, if you use an 80%-20%
> training/validation split with a 5-fold cross-validation scheme, be
> sure you are comparing the same subset of your two sets.
>
> Regards,
> Federico
>
> 2010/10/22 Sean Owen <[email protected]>:
>> Yah I still think held-out data is the best thing, if you want to use this
>> built-in evaluation mechanism. Hold out the same data from both models and
>> run the same test.
>>
>> There is another approach which doesn't necessarily require held-out data.
>> On the original, full model, just compute recommendations for any users you
>> like. Assume these are "correct". Then do the same for the derived model.
>>
>> It will return to you estimated preferences in both cases. You could use the
>> deltas as a measure of "error" (unless your derived model has quite a
>> different rating space).
>>
>> Or simply use the difference in rankings -- compute some metric that
>> penalizes having recommendations in different places in the ordering.
>>
>> I'll say I don't know which of these is most mathematically sound.
>> Interpreting the results may be hard. But, any of these should give a notion
>> of "better" and "worse".
>>
>>
>> Assuming the original model's recommendations are "correct" is a reasonably
>> big assumption. For example, the whole point of an SVD recommender is to
>> modify the model (reduce its dimension really) in order to be able to
>> recommend items that should be recommended, but weren't before due to model
>> sparseness. There, transforming the data in theory gives better results.
>> That it's different doesn't necessarily mean worse.
>>
>> But maybe that's not an issue for your use case, don't know.
>>
>>
>> On Fri, Oct 22, 2010 at 5:39 AM, Lance Norskog <[email protected]> wrote:
>>
>>> Here is my use case: I have two data models.
>>> 1) the original data, for example GroupLens
>>> 2) the derivative. This is a second data model which is derived from
>>> the original. It is made with a one-way function from the master.
>>>
>>> I wish to measure how much information is lost in the derivation
>>> function. There is some entropy, so the derived data model
>>> cannot supply recommendations as good as the original data. But how
>>> much worse?
>>>
>>> My naive method is to make recommendations using the master model, and
>>> the derived model, and compare them. If the recommendations from the
>>> derived model are, say, 90% as good as from the original data, then
>>> the derivation function is ok.
>>>
>>> Now, obviously, the gold standard for recommendations is the data in
>>> the original model. So, I make recommendations from the original, and
>>> the derived, from the user/item prefs given in the original data. I
>>> don't really care about what the user gave as preferences: I want to
>>> know what the recommender algorithm itself thinks. But the
>>> recommenders just parrot back the data model instead of giving me
>>> their own opinion. Thus, the point of this whole thread. But how
>>> recommender algorithms work is a side issue. I'm trying to use them as
>>> an indirect measurement of something else.
>>>
>>> What is another way to test what I'm trying to test? What is another
>>> way to evaluate the quality of my derivation function?
>>>
>>> On Wed, Oct 20, 2010 at 11:41 PM, Sebastian Schelter <[email protected]>
>>> wrote:
>>> > Hi Lance,
>>> >
>>> > When evaluating a recommender you should split your dataset into a
>>> > training and a test part. Only data from the training part should be
>>> > included in your DataModel, and you only measure the accuracy of
>>> > predicting ratings that are included in the test part (which is not
>>> > known by your recommender). If you structure things this way, the
>>> > current implementation should work fine for you.
>>> >
>>> > --sebastian
>>> >
>>> > On 21.10.2010 04:56, Lance Norskog wrote:
>>> >>
>>> >> Since this is Recommender day, here is another kvetch:
>>> >>
>>> >> The recommender implementations with algorithms all do this in
>>> >> Recommender.estimatePreference():
>>> >>   public float estimatePreference(long userID, long itemID)
>>> >>       throws TasteException {
>>> >>     DataModel model = getDataModel();
>>> >>     Float actualPref = model.getPreferenceValue(userID, itemID);
>>> >>     if (actualPref != null) {
>>> >>       return actualPref;
>>> >>     }
>>> >>     return doEstimatePreference(userID, itemID);
>>> >>   }
>>> >>
>>> >> Meaning: "if I told you something, just parrot it back to me."
>>> >> Otherwise, make a guess.
>>> >>
>>> >> I am doing head-to-head comparisons of the dataModel preferences vs.
>>> >> the Recommender. This code makes it impossible to directly compare
>>> >> what the recommender thinks vs. the actual preference. If I wanted to
>>> >> know what I told it, I already know that. I want to know what the
>>> >> recommender thinks.
>>> >>
>>> >> If this design decision is something y'all have argued about and
>>> >> settled on, never mind. If it is just something that seemed like a
>>> >> good idea at the time, can we change the recommenders, and the
>>> >> Recommender "contract", to always use their own algorithm?
>>> >>
>>> >>
>>> >
>>> >
>>>
>>>
>>>
>>> --
>>> Lance Norskog
>>> [email protected]
>>>
>>
>



-- 
Lance Norskog
[email protected]
