Hello,

I have some questions about the SlopeOneRecommender and distributed
Slope One.

The first is an old question that someone posted before:

 I'm confused: how can I read the users' profiles and the diff matrix at
the same time (they are at different locations in my HDFS) to predict a
specific user's ratings? I've already checked the Mahout implementation of
Slope One with Hadoop, but it only calculates the diff matrix and includes
no prediction part. Can anyone help? How do I read two kinds of data in a
Hadoop program at the same time?

Second, when I run
org.apache.mahout.cf.taste.hadoop.slopeone.SlopeOneAverageDiffsJob on
Hadoop, it ends up generating a diff file with 6 columns, like:

    1 18 -0.55439756  1967 -0.55439756  3739.179461
    1 19 -1.310974583 5941 -1.310974583 10706.22446
    1 20 -1.184633028 1308 -1.184633028 1933.661124
    1 21 -0.407834403 7633 -0.407834403 9899.411503

The first two columns are the itemA/itemB pair, the third is the diff, and
the fourth is the count. What do the last two mean? The stdDev?

If possible, could you please explain a little about
SlopeOneDiffsToAveragesReducer? The PrefsToDiffs step is easy to understand:
it processes each user to generate item-item difference pairs. But what does
SlopeOneDiffsToAveragesReducer do, and why is it so slow on Hadoop? And why
is the final DiffStorage just one single table like the one above?
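
To show what I mean by the per-user step, here is how I understand it, as a
minimal single-machine sketch in Java. This is not the Mahout code: the names
SlopeOneDiffSketch, DiffStats and addUser are made up for illustration, and
the sign convention (prefB - prefA) is my assumption. It only covers the
diff, count and average per item pair, not the last two columns I am asking
about.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class SlopeOneDiffSketch {

    // Running count and sum of diffs for one (itemA, itemB) pair (hypothetical helper).
    static final class DiffStats {
        int count;
        double sum;

        double average() {
            return count == 0 ? 0.0 : sum / count;
        }
    }

    // Key for an item pair, always built with itemA < itemB.
    static String key(long itemA, long itemB) {
        return itemA + "," + itemB;
    }

    // For one user, add prefB - prefA to the running stats of every pair
    // of items that user rated (assumed sign convention).
    static void addUser(Map<Long, Double> prefs, Map<String, DiffStats> diffs) {
        List<Long> items = new ArrayList<>(prefs.keySet());
        Collections.sort(items);
        for (int i = 0; i < items.size(); i++) {
            for (int j = i + 1; j < items.size(); j++) {
                long itemA = items.get(i);
                long itemB = items.get(j);
                double diff = prefs.get(itemB) - prefs.get(itemA);
                DiffStats stats = diffs.computeIfAbsent(key(itemA, itemB), k -> new DiffStats());
                stats.count++;
                stats.sum += diff;
            }
        }
    }

    public static void main(String[] args) {
        Map<String, DiffStats> diffs = new HashMap<>();

        Map<Long, Double> user1 = new HashMap<>();
        user1.put(1L, 5.0);
        user1.put(2L, 3.0);
        addUser(user1, diffs);

        Map<Long, Double> user2 = new HashMap<>();
        user2.put(1L, 4.0);
        user2.put(2L, 3.0);
        user2.put(3L, 1.0);
        addUser(user2, diffs);

        // One line per item pair: itemA,itemB  average diff  count
        for (Map.Entry<String, DiffStats> e : diffs.entrySet()) {
            System.out.println(e.getKey() + " " + e.getValue().average() + " " + e.getValue().count);
        }
    }
}
```

With the two toy users above, the pair (1, 2) gets the diffs -2 and -1, so its
count is 2 and its average is -1.5.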

Thanks,

Steven
