Re: Mahout 0.4 seems recommend user's existed items to user.

Sebastian Schelter Mon, 17 Jan 2011 08:55:47 -0800

It's true that already preferred items might be looked at in AggregateAndRecommendReducer, but the prediction for them will always be NaN, so they will be filtered out.
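
As a minimal sketch of this argument (not Mahout's actual code), the Java below computes a plain weighted sum per candidate item and drops NaN results. It assumes, as explained in the quoted walk-through further down, that each item's similarity to itself is stored as NaN. The class and method names are hypothetical, the preference and similarity values are the ones from Henry's example for user1, and any normalization of the sum is left out.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only; none of these names exist in Mahout.
class NaNFilterSketch {

    // user1's preferences from the example: itemA -> 10, itemB -> 5.
    static Map<String, Double> examplePrefs() {
        Map<String, Double> prefs = new HashMap<>();
        prefs.put("itemA", 10.0);
        prefs.put("itemB", 5.0);
        return prefs;
    }

    // Similarity columns from the example, plus the assumed NaN self-similarity.
    static Map<String, Map<String, Double>> exampleColumns() {
        Map<String, Double> colA = new HashMap<>();
        colA.put("itemA", Double.NaN);
        colA.put("itemB", 0.9);
        colA.put("itemC", 0.1);
        Map<String, Double> colB = new HashMap<>();
        colB.put("itemB", Double.NaN);
        colB.put("itemA", 0.9);
        Map<String, Map<String, Double>> columns = new HashMap<>();
        columns.put("itemA", colA);
        columns.put("itemB", colB);
        return columns;
    }

    // Sums preference * similarity per candidate item, then drops NaN sums.
    static Map<String, Double> recommend(Map<String, Double> userPrefs,
                                         Map<String, Map<String, Double>> columns) {
        Map<String, Double> sums = new TreeMap<>();
        for (Map.Entry<String, Double> pref : userPrefs.entrySet()) {
            Map<String, Double> column = columns.get(pref.getKey());
            if (column == null) {
                continue; // no similarity data for this preferred item
            }
            for (Map.Entry<String, Double> sim : column.entrySet()) {
                // x * NaN and x + NaN are both NaN, so a single NaN term
                // makes the whole sum for that candidate NaN.
                sums.merge(sim.getKey(), pref.getValue() * sim.getValue(), Double::sum);
            }
        }
        Map<String, Double> recommendations = new TreeMap<>();
        for (Map.Entry<String, Double> entry : sums.entrySet()) {
            if (!Double.isNaN(entry.getValue())) {
                recommendations.put(entry.getKey(), entry.getValue());
            }
        }
        return recommendations;
    }

    public static void main(String[] args) {
        System.out.println(recommend(examplePrefs(), exampleColumns()));
    }
}
```

With these inputs only itemC survives: itemA and itemB each pick up a NaN term from their own similarity column, so their sums are NaN and are filtered.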


--sebastian


On 17.01.2011 16:57, han henry wrote:

Hi Sebastian,

I looked at the code today.

Assume the output of the partialMultiply job is as follows:

context.write(key, vectorAndPrefs);

ItemA --> (([itemB,0.9],[itemC,0.1]), (user1,user2), (10,1))
ItemB --> (([itemA,0.9]), (user1,user2), (5,1))

It means that user1 already has itemA and itemB, yet the job may still
recommend itemA or itemB to user1.

Am I right?

Best Regards,

--Henry Han


2011/1/14 Sebastian Schelter <s...@apache.org <mailto:s...@apache.org>>

    Hi Han,

    It's hard to see from the sources how the users' already preferred
    items (#3) are excluded from the final results but it's definitely done.

    I'll walk you through the code:

    In SimilarityMatrixRowWrapperMapper.map() we map all similar items
    for each item as a vector. Notice that the similarity value of each
    item to itself is set to NaN here.

    When AggregateAndRecommendReducer computes the final recommendations, it
    receives a PrefAndSimilarityColumnWritable for each item preferred
    by the user. Those similarity vectors and preference values are used
    to compute the weighted sum that gives the prediction value for each
    item to recommend.

    For each item that has already been preferred by the user, we can be
    sure that the NaN value from above is added to its sum, which makes
    the sum NaN too. Finally, all NaN predictions are explicitly
    filtered in AggregateAndRecommendReducer.writeRecommendedItems().


    --sebastian






    On 14.01.2011 11:19, han henry wrote:

        Hi Sean and Sebastian,

        We have two types of preferences:

        1) Preferences for items that the user does not want to see. We
        store these preferences in filterFile.
        2) All preferences (including those in 1), which can also be
        used to calculate similarity.

        We must not recommend the following items to a user:

        #1, Invalid or expired items. We store those items in
        itemSFile.
        #2, Items the user is not interested in. We store those
        (user, item) pairs in filterFile.
        #3, Items the user already has (items already in the user's
        preferences).
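
The three exclusion rules above can be sketched as a simple post-filter over candidate items. This is only an illustration under assumed in-memory inputs, not how the Mahout job implements any of them; the class, method, and parameter names are hypothetical, and the sets stand in for the contents of itemSFile, filterFile, and the user's preference data.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only; none of these names exist in Mahout.
class ExclusionFilterSketch {

    static List<String> filterCandidates(String user,
                                         List<String> candidates,
                                         Set<String> invalidItems,
                                         Map<String, Set<String>> notInterested,
                                         Map<String, Set<String>> existingPrefs) {
        Set<String> unwanted = notInterested.getOrDefault(user, Collections.<String>emptySet());
        Set<String> owned = existingPrefs.getOrDefault(user, Collections.<String>emptySet());
        List<String> kept = new ArrayList<>();
        for (String item : candidates) {
            if (invalidItems.contains(item)) {
                continue; // #1: invalid or expired item
            }
            if (unwanted.contains(item)) {
                continue; // #2: (user, item) pair marked as not interested
            }
            if (owned.contains(item)) {
                continue; // #3: item already in the user's preferences
            }
            kept.add(item);
        }
        return kept;
    }

    public static void main(String[] args) {
        // Made-up example data, for illustration only.
        List<String> candidates = Arrays.asList("itemA", "itemB", "itemC", "itemD", "itemE");
        Set<String> invalid = new HashSet<>(Arrays.asList("itemD"));
        Map<String, Set<String>> notInterested = Collections.singletonMap(
                "user1", (Set<String>) new HashSet<>(Arrays.asList("itemE")));
        Map<String, Set<String>> existing = Collections.singletonMap(
                "user1", (Set<String>) new HashSet<>(Arrays.asList("itemA", "itemB")));
        System.out.println(filterCandidates("user1", candidates, invalid, notInterested, existing));
    }
}
```

In this made-up example only itemC is left for user1, since every other candidate falls under one of the three rules.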

        ItemFilterAsVectorAndPrefsReducer seems to make it possible to
        skip those items in the last step,

        so we handle #1 and #2 in the last step
        (AggregateAndRecommendReducer.java
        <http://svn.apache.org/repos/asf/mahout/trunk/core/src/main/java/org/apache/mahout/cf/taste/hadoop/item/AggregateAndRecommendReducer.java>),

        but I have not found any logic to skip #3.

        Am I right?

        Best Regards,

        2011/1/14 han henry<huiwen....@gmail.com
        <mailto:huiwen....@gmail.com>>

            Thank you Sean and sebastian :)

            2011/1/14 Sean Owen<sro...@gmail.com <mailto:sro...@gmail.com>>

                Look at ItemFilterAsVectorAndPrefsReducer. This does what
                you are looking for.

                On Fri, Jan 14, 2011 at 9:17 AM, han
                henry<huiwen....@gmail.com
                <mailto:huiwen....@gmail.com>>  wrote:

                    Hi Sebastian,

                    Because my data is in production and is very
                    large, I'm sorry that I cannot give you the input
                    data.

                    But we can try to review the code.

                    The initial version of the co-occurrence algorithm
                    has logic to skip the user's existing items.

                    Best Regards,
