It's true that already preferred items might be looked at in
AggregateAndRecommendReducer but the prediction for them will always be
NaN so they will be filtered out.
--sebastian
On 17.01.2011 16:57, han henry wrote:
Hi Sebastian,
I looked at the code today.
Assume that the output of the partialMultiply job is as follows:
context.write(key, vectorAndPrefs);
ItemA --> (([itemB,0.9],[itemC,0.1]), (user1,user2), (10,1))
ItemB --> (([itemA,0.9]), (user1,user2), (5,1))
This means that user1 already has ItemA and ItemB, yet the job may
still recommend ItemA or ItemB to user1.
Am I right?
Best Regards,
--Henry Han
2011/1/14 Sebastian Schelter <s...@apache.org <mailto:s...@apache.org>>
Hi Han,
It's hard to see from the sources how the users' already preferred
items (#3) are excluded from the final results but it's definitely done.
I'll walk you through the code:
In SimilarityMatrixRowWrapperMapper.map() we map all similar items
for each item as a vector. Notice that the similarity value of each
item to itself is set to NaN here.
When AggregateAndRecommendReducer computes the final recommendations, it
receives a PrefAndSimilarityColumnWritable for each item preferred
by the user. Those similarity vectors and preference values are used
to compute the weighted sum that gives the prediction value for each
item to recommend.
For each item the user has already preferred, we can be sure that
the NaN value from above is added to its sum, which makes the sum
NaN too. Finally, all NaN predictions are explicitly filtered out
in AggregateAndRecommendReducer.writeRecommendedItems().
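To make the NaN trick concrete, here is a minimal standalone sketch. It is not the Mahout code: the class NaNFilterSketch and its methods predict(), recommend() and demo() are hypothetical names, plain maps stand in for the vectors and writables, and the weighted-average formula is an assumption made for illustration. The data loosely follows the ItemA/ItemB example from this thread, with a NaN self-similarity added to each column.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; none of these names exist in Mahout.
final class NaNFilterSketch {

    // Weighted sum per candidate item: sum(pref * sim) / sum(sim).
    // Each similarity column carries NaN for the item's similarity to itself,
    // so any item the user already prefers ends up with a NaN prediction.
    static Map<String, Double> predict(Map<String, Double> prefs,
                                       Map<String, Map<String, Double>> columns) {
        Map<String, Double> numerators = new LinkedHashMap<>();
        Map<String, Double> denominators = new LinkedHashMap<>();
        for (Map.Entry<String, Double> pref : prefs.entrySet()) {
            Map<String, Double> column = columns.get(pref.getKey());
            if (column == null) {
                continue; // no similarity data for this preferred item
            }
            for (Map.Entry<String, Double> sim : column.entrySet()) {
                // NaN plus anything stays NaN, so one NaN poisons the sum.
                numerators.merge(sim.getKey(), pref.getValue() * sim.getValue(), Double::sum);
                denominators.merge(sim.getKey(), sim.getValue(), Double::sum);
            }
        }
        Map<String, Double> predictions = new LinkedHashMap<>();
        for (String item : numerators.keySet()) {
            predictions.put(item, numerators.get(item) / denominators.get(item));
        }
        return predictions;
    }

    // Keep only items whose prediction is a real number.
    static List<String> recommend(Map<String, Double> predictions) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Double> e : predictions.entrySet()) {
            if (!Double.isNaN(e.getValue())) {
                result.add(e.getKey());
            }
        }
        return result;
    }

    // user1 prefers ItemA (10) and ItemB (5); ItemC is the only new item.
    static List<String> demo() {
        Map<String, Double> prefs = new LinkedHashMap<>();
        prefs.put("ItemA", 10.0);
        prefs.put("ItemB", 5.0);

        Map<String, Double> columnA = new LinkedHashMap<>();
        columnA.put("ItemA", Double.NaN); // self-similarity set to NaN
        columnA.put("ItemB", 0.9);
        columnA.put("ItemC", 0.1);

        Map<String, Double> columnB = new LinkedHashMap<>();
        columnB.put("ItemB", Double.NaN); // self-similarity set to NaN
        columnB.put("ItemA", 0.9);

        Map<String, Map<String, Double>> columns = new LinkedHashMap<>();
        columns.put("ItemA", columnA);
        columns.put("ItemB", columnB);

        return recommend(predict(prefs, columns));
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

In this sketch ItemA and ItemB both pick up a NaN term from their own column, so only ItemC survives the final filter.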
--sebastian
On 14.01.2011 11:19, han henry wrote:
Hi Sean and Sebastian,
We have two types of preferences:
1) Preferences for items that the user does not want to see; we
store those preferences in filterFile.
2) All preferences (including those in 1); this data can also be
used to calculate similarity.
We cannot recommend the following items to a user:
#1 Invalid or expired items; we store those items in itemSFile.
#2 Items the user is not interested in; we store those (user, item)
pairs in filterFile.
#3 Items the user already has (items already in the user's
preferences).
ItemFilterAsVectorAndPrefsReducer seems to cause those items to be
skipped in the last step.
So #1 and #2 are handled in the last step
(AggregateAndRecommendReducer.java<http://svn.apache.org/repos/asf/mahout/trunk/core/src/main/java/org/apache/mahout/cf/taste/hadoop/item/AggregateAndRecommendReducer.java>),
but I have not found the logic that skips #3.
Am I right?
Best Regards,
2011/1/14 han henry<huiwen....@gmail.com
<mailto:huiwen....@gmail.com>>
Thank you Sean and sebastian :)
2011/1/14 Sean Owen<sro...@gmail.com <mailto:sro...@gmail.com>>
Look at ItemFilterAsVectorAndPrefsReducer. This does what you are
looking for.
On Fri, Jan 14, 2011 at 9:17 AM, han henry<huiwen....@gmail.com
<mailto:huiwen....@gmail.com>> wrote:
Hi Sebastian,
Because my data is in production and is very large, I am sorry
that I cannot give you the input data.
But we can try to review the code.
The initial version of the co-occurrence algorithm had logic to
skip the user's existing items.
Best Regards,