Re: Recommend result contains item which user has already given preference, is that correct?

Sebastian Schelter Thu, 20 Oct 2011 09:41:14 -0700

To put it simply:

The vector of recommendations is the sum of the similarity vectors for
all preferred items. In each similarity vector for a preferred item the
entry for that particular item is set to NaN.


That means that in the recommendation vector the entries for all
preferred items will be NaN.

It's a neat trick that is unfortunately very hard to see in the code.
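
A minimal sketch of the trick, using plain Java arrays rather than Mahout's
Vector classes. The class NanMaskSketch, its methods and the similarity values
are hypothetical, made up for illustration; this is not the actual Mahout code,
and preference weighting is left out (boolean preferences assumed):

```java
import java.util.ArrayList;
import java.util.List;

final class NanMaskSketch {

    /**
     * Sums the similarity rows of all preferred items. Before a row is
     * added, its entry for the item itself is set to NaN.
     */
    static double[] predict(double[][] similarity, int[] preferredItems) {
        int n = similarity.length;
        double[] sum = new double[n];
        for (int item : preferredItems) {
            double[] row = similarity[item].clone();
            row[item] = Double.NaN;      // "remove self similarity"
            for (int j = 0; j < n; j++) {
                sum[j] += row[j];        // NaN plus anything stays NaN
            }
        }
        return sum;
    }

    /** Keeps only the item indexes whose predicted value is not NaN. */
    static List<Integer> candidates(double[] predictions) {
        List<Integer> result = new ArrayList<>();
        for (int j = 0; j < predictions.length; j++) {
            if (!Double.isNaN(predictions[j])) {
                result.add(j);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical item-item similarities for four items.
        double[][] similarity = {
            {1.0, 0.5, 0.2, 0.0},
            {0.5, 1.0, 0.3, 0.4},
            {0.2, 0.3, 1.0, 0.6},
            {0.0, 0.4, 0.6, 1.0},
        };
        int[] preferred = {0, 2};        // the user already prefers items 0 and 2
        double[] predictions = predict(similarity, preferred);
        System.out.println(candidates(predictions)); // prints [1, 3]
    }
}
```

Items 0 and 2 end up as NaN in the summed vector because each of them
contributed a NaN at its own position, so only items 1 and 3 remain as
candidates. No explicit "already preferred?" check is needed.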

--sebastian

On 20.10.2011 18:36, WangRamon wrote:
> 
> Hi Sebastian
> "But as the entry for the item itself is set to NaN in its similarity vector 
> and NaN plus something stays always NaN, the predicted preference for an item 
> that was already preferred is NaN. And the NaN entries are dropped later."
> Wait a minute, I can understand that NaN plus something always stays NaN, 
> but how do you explain "the predicted preference for an item that was 
> already preferred is NaN"? Where is the code that checks whether an item 
> was already preferred? The only thing involving NaN in 
> SimilarityMatrixRowWrapperMapper says that an item's similarity to itself 
> (A to A) is NaN, am I right?
> Thanks
> Ramon
>> Date: Thu, 20 Oct 2011 17:04:20 +0200
>> From: [email protected]
>> To: [email protected]
>> Subject: Re: Recommend result contains item which user has already given 
>> preference, is that correct?
>>
>> On 20.10.2011 16:57, WangRamon wrote:
>>>
>>> Hi Sebastian and Sean 
>>> Thanks for your help. 
>>>
>>> I re-read the code (setting up an environment for debugging seems very 
>>> difficult for me) and found this line in SimilarityMatrixRowWrapperMapper. 
>>> I paste it below with its comment: 
>>>     /* remove self similarity */ 
>>>     similarityMatrixRow.set(key.get(), Double.NaN); 
>>> I think the meaning is to mark the similarity between Item X and Item X 
>>> (the identical one) as NaN, but that doesn't exclude Item X from the 
>>> recommendations. Then AggregateAndRecommendReducer uses 
>>> simColumn.times(prefValue) as part of the formula to calculate the 
>>> preferences for all items that are similar to Item i (it could be Item X 
>>> or some other item), and returns the top 10 (the default) for a user. 
>>> During this process, I cannot see any code that excludes an item for 
>>> which the user has already given a preference from the recommendations. 
>>
>> It's a little bit hidden :) For each preferred item, a vector of all its
>> similarities is added:
>>
>>       numerators = numerators == null
>>           ? prefValue == BOOLEAN_PREF_VALUE ? simColumn.clone() : simColumn.times(prefValue)
>>           : numerators.plus(prefValue == BOOLEAN_PREF_VALUE ? simColumn : simColumn.times(prefValue));
>>
>> But as the entry for the item itself is set to NaN in its similarity
>> vector and NaN plus something stays always NaN, the predicted preference
>> for an item that was already preferred is NaN. And the NaN entries are
>> dropped later.
>>
>> --sebastian
>>
>>
>>> Correct me if I missed something. Thank you, guys. 
>>> Cheers Ramon
>>>> Date: Thu, 20 Oct 2011 13:59:28 +0100
>>>> Subject: Re: Recommend result contains item which user has already given 
>>>> preference, is that correct?
>>>> From: [email protected]
>>>> To: [email protected]
>>>>
>>>> Ah OK, figured as much. WangRamon does that answer your question
>>>> and/or can you debug to see if this is happening, not happening for
>>>> you in your use case?
>>>>
>>>> On Thu, Oct 20, 2011 at 1:42 PM, Sebastian Schelter <[email protected]> 
>>>> wrote:
>>>>> It's still included in SimilarityMatrixRowWrapperMapper. We also have a
>>>>> unit test that checks whether a user is only recommended unknown items
>>>>> which still works.
>>>
>>
>
