Put simply: the vector of recommendations is the sum of the similarity vectors of all preferred items. In each preferred item's similarity vector, the entry for that item itself is set to NaN.
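A minimal sketch of that summation, assuming plain double arrays in place of Mahout's Vector classes and boolean preferences (so no weighting by prefValue). The class and method names are made up for illustration; this is not the actual Mahout code:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for the mapper/reducer pair; not Mahout code.
final class NanMaskSketch {

    /** Sums the similarity vectors of all preferred items, masking each item's own entry with NaN. */
    static double[] sumSimilarityVectors(double[][] similarity, int[] preferredItems) {
        int n = similarity.length;
        double[] scores = new double[n];
        for (int item : preferredItems) {
            double[] row = similarity[item].clone();
            row[item] = Double.NaN;      // "remove self similarity"
            for (int j = 0; j < n; j++) {
                scores[j] += row[j];     // NaN plus anything stays NaN
            }
        }
        return scores;
    }

    /** Drops the NaN entries; what remains are the items the user has not preferred yet. */
    static List<Integer> recommendable(double[] scores) {
        List<Integer> result = new ArrayList<>();
        for (int j = 0; j < scores.length; j++) {
            if (!Double.isNaN(scores[j])) {
                result.add(j);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up symmetric item-item similarities for four items.
        double[][] similarity = {
            {1.0, 0.5, 0.2, 0.0},
            {0.5, 1.0, 0.4, 0.3},
            {0.2, 0.4, 1.0, 0.6},
            {0.0, 0.3, 0.6, 1.0},
        };
        int[] preferred = {0, 2};        // the user already prefers items 0 and 2
        double[] scores = sumSimilarityVectors(similarity, preferred);
        System.out.println(Arrays.toString(scores));
        System.out.println(recommendable(scores)); // prints [1, 3]
    }
}
```

With items 0 and 2 preferred, entries 0 and 2 of the sum come out as NaN, so only items 1 and 3 remain as candidates.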
That means that in the recommendation vector the entries for all preferred items will be NaN. It's a neat trick that is unfortunately very hard to see in the code.

--sebastian

On 20.10.2011 18:36, WangRamon wrote:
>
> Hi Sebastian
> "But as the entry for the item itself is set to NaN in its similarity vector
> and NaN plus something stays always NaN, the predicted preference for an item
> that was already preferred is NaN. And the NaN entries are dropped later."
> Wait a minute here, I can understand NaN plus something stays always NaN,
> but how do you explain "the predicted preference for an item that was
> already preferred is NaN"? Where do you put the code to check an item that
> was already preferred? The only thing about NaN in
> SimilarityMatrixRowWrapperMapper is to say that an item's similarity to
> itself (A to A) is NaN, am I right?
> Thanks
> Ramon
>> Date: Thu, 20 Oct 2011 17:04:20 +0200
>> From: [email protected]
>> To: [email protected]
>> Subject: Re: Recommend result contains item which user has already given
>> preference, is that correct?
>>
>> On 20.10.2011 16:57, WangRamon wrote:
>>>
>>> Hi Sebastian and Sean
>>> Thanks for your help.
>>>
>>> I re-read the code again (debugging seems very difficult for me, as I
>>> would have to set up the environment) and found this line in
>>> SimilarityMatrixRowWrapperMapper. I paste it below with its comment:
>>> /* remove self similarity */
>>> similarityMatrixRow.set(key.get(), Double.NaN);
>>> I think the meaning is to mark the similarity between Item X and Item X
>>> (the identical one) as NaN, but it doesn't exclude Item X from
>>> recommendation. Then AggregateAndRecommendReducer uses
>>> simColumn.times(prefValue) as part of the formula to calculate the
>>> preferences for all items that are similar to Item i (it could be Item X
>>> or some other item), and returns the top 10 (default) for a user.
>>> During this process, I cannot see any code that excludes an item which
>>> the user has already given a preference from recommendation.
>>
>> It's a little bit hidden :) For each preferred item, a vector of all its
>> similarities is added:
>>
>> numerators = numerators == null
>>     ? prefValue == BOOLEAN_PREF_VALUE ? simColumn.clone() : simColumn.times(prefValue)
>>     : numerators.plus(prefValue == BOOLEAN_PREF_VALUE ? simColumn
>>         : simColumn.times(prefValue));
>>
>> But as the entry for the item itself is set to NaN in its similarity
>> vector and NaN plus something stays always NaN, the predicted preference
>> for an item that was already preferred is NaN. And the NaN entries are
>> dropped later.
>>
>> --sebastian
>>
>>
>>> Correct me if I missed something, thank you guys.
>>> Cheers Ramon
>>>> Date: Thu, 20 Oct 2011 13:59:28 +0100
>>>> Subject: Re: Recommend result contains item which user has already given
>>>> preference, is that correct?
>>>> From: [email protected]
>>>> To: [email protected]
>>>>
>>>> Ah OK, figured as much. WangRamon does that answer your question
>>>> and/or can you debug to see if this is happening, not happening for
>>>> you in your use case?
>>>>
>>>> On Thu, Oct 20, 2011 at 1:42 PM, Sebastian Schelter <[email protected]>
>>>> wrote:
>>>>> It's still included in SimilarityMatrixRowWrapperMapper. We also have a
>>>>> unit test that checks whether a user is only recommended unknown items
>>>>> which still works.
>>>
>>
>
