Hi Sansam,
You could try to precompute the similarity matrix and use a
FileItemSimilarity to load it into memory. Have a look at "bin/mahout
itemsimilarity".
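
As a rough illustration of the idea (not Mahout's own code), the sketch below loads precomputed pairs into an in-memory table, so that a lookup replaces the expensive computation at request time. The class name PrecomputedSimilarity, the method fromLines, and the "itemID1,itemID2,similarity" line layout are assumptions made for this example; check the FileItemSimilarity javadoc for the format it actually expects.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a precomputed item-item similarity table.
// This is NOT Mahout's FileItemSimilarity, only a sketch of the same idea.
class PrecomputedSimilarity {
    private final Map<String, Double> table = new HashMap<>();

    // Order-independent key, because similarity(a, b) equals similarity(b, a).
    private static String key(long a, long b) {
        return Math.min(a, b) + ":" + Math.max(a, b);
    }

    // Parses lines of the assumed form "itemID1,itemID2,similarity".
    static PrecomputedSimilarity fromLines(Iterable<String> lines) {
        PrecomputedSimilarity sim = new PrecomputedSimilarity();
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue; // skip blank lines
            }
            String[] parts = trimmed.split(",");
            long a = Long.parseLong(parts[0].trim());
            long b = Long.parseLong(parts[1].trim());
            double value = Double.parseDouble(parts[2].trim());
            sim.table.put(key(a, b), value);
        }
        return sim;
    }

    // Returns the stored value, or NaN when the pair was never precomputed.
    double itemSimilarity(long a, long b) {
        Double value = table.get(key(a, b));
        return value == null ? Double.NaN : value;
    }
}
```

With the table built once at startup, each request only pays for a map lookup instead of a pass over the rating data.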
Hope that helps.
--sebastian
On 13.07.2010 07:41, samsam wrote:
> Hi again. I'm having a really hard time using Taste with the large 10M
> dataset. With FileDataModel, recomputing the similarity matrix takes too long.
> I'm assuming a new user has just arrived (not in the user data or item data)
> and starts voting. Refreshing the matrix takes over 25 seconds, which is a
> long time for a website, for instance.
>
> So I found this class. I tried to use it just as shown in the javadocs:
>
> PlusAnonymousUserDataModel plusmodel = new PlusAnonymousUserDataModel(datamodel);
> PearsonCorrelationSimilarity sim = new PearsonCorrelationSimilarity(datamodel);
> Recommender recommender = new GenericItemBasedRecommender(datamodel, sim);
> PreferenceArray pref = new GenericUserPreferenceArray(10);
> pref.setUserID(0, PlusAnonymousUserDataModel.TEMP_USER_ID);
> for (int i = 0; i < 10; i++) {
>     pref.setItemID(i, votes[i][0]);
>     pref.setValue(i, votes[i][1]);
> }
> synchronized (pref) {
>     plusmodel.setTempPrefs(pref);
>     recommender.recommend(PlusAnonymousUserDataModel.TEMP_USER_ID, 10);
> }
>
> But this causes a NoSuchUserException in the recommender.
>
> I also tried using plusmodel instead of the real model for both the similarity
> and the recommender, but that did not work either.
>
>
>
> On Mon, Jul 5, 2010 at 11:52 PM, Sean Owen <[email protected]> wrote:
>
>
>> Pre-compute the similarity based on what information? You mention that
>> you don't want to use Pearson and mention item attributes.
>>
>> If you are trying to use domain-specific attributes of items, then
>> it's up to you to write that logic. If you want to say books have a
>> "0.5" similarity when they are within the same genre, and "0.9" when
>> by the same author, you can just write that logic. That's not part of
>> the framework.
>>
>> The hook into the framework comes when you implement ItemSimilarity
>> with logic like that. Then just use that ItemSimilarity instead of one
>> of the given implementations. That's all.
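
A minimal sketch of such domain logic in plain Java, using the genre and author rule described above. The Book class and the BookAttributeSimilarity helper are hypothetical names introduced for illustration; in Mahout, a rule like this would sit inside the itemSimilarity(long, long) method of your own ItemSimilarity implementation, after looking up each item's attributes by ID.

```java
// Hypothetical item attributes; not part of Mahout.
final class Book {
    final long id;
    final String author;
    final String genre;

    Book(long id, String author, String genre) {
        this.id = id;
        this.author = author;
        this.genre = genre;
    }
}

// Hand-written similarity rule based only on item attributes.
class BookAttributeSimilarity {
    static double similarity(Book a, Book b) {
        if (a.id == b.id) {
            return 1.0; // an item is fully similar to itself
        }
        if (a.author != null && a.author.equals(b.author)) {
            return 0.9; // same author
        }
        if (a.genre != null && a.genre.equals(b.genre)) {
            return 0.5; // same genre, different author
        }
        return 0.0; // nothing in common
    }
}
```

The author check comes first so that two books sharing both author and genre get the higher value.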
>>
>> On Mon, Jul 5, 2010 at 4:32 PM, samsam <[email protected]> wrote:
>>
>>> About the second question: I do not have the similarity data. What I want to
>>> know is how to pre-compute the item similarity.
>>>
>>> On Mon, Jul 5, 2010 at 11:20 PM, Sean Owen <[email protected]> wrote:
>>>
>>>
>>>> 1) Good question. One answer is to make these "anonymous" users real
>>>> users in your data model, at least temporarily. That is, they need not
>>>> be anonymous to the recommender, even if they're not yet a registered
>>>> user as far as your site is concerned.
>>>>
>>>> There's a class called PlusAnonymousUserDataModel that helps you do
>>>> this. It wraps a DataModel and lets you quickly add a temporary user,
>>>> recommend, then un-add that user. It may be the easiest thing to try.
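
The wrapping idea can be sketched in plain Java as follows. This is not Mahout's implementation: SimpleModel, PlusTempUserModel, and the TEMP_USER_ID value used here are all hypothetical, chosen only to show how a wrapper can answer for one temporary user without touching the wrapped data.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical minimal data model: user ID -> (item ID -> rating).
class SimpleModel {
    private final Map<Long, Map<Long, Double>> prefs = new HashMap<>();

    void setPreference(long userId, long itemId, double value) {
        prefs.computeIfAbsent(userId, k -> new HashMap<>()).put(itemId, value);
    }

    Map<Long, Double> preferencesFromUser(long userId) {
        Map<Long, Double> p = prefs.get(userId);
        if (p == null) {
            throw new IllegalArgumentException("no such user: " + userId);
        }
        return p;
    }
}

// Wrapper that adds one temporary user on top of the wrapped model.
class PlusTempUserModel {
    static final long TEMP_USER_ID = Long.MIN_VALUE; // assumed sentinel ID

    private final SimpleModel delegate;
    private Map<Long, Double> tempPrefs; // null when no temporary user is set

    PlusTempUserModel(SimpleModel delegate) {
        this.delegate = delegate;
    }

    void setTempPrefs(Map<Long, Double> prefs) {
        this.tempPrefs = prefs;
    }

    void clearTempPrefs() {
        this.tempPrefs = null;
    }

    Map<Long, Double> preferencesFromUser(long userId) {
        if (userId == TEMP_USER_ID && tempPrefs != null) {
            return tempPrefs; // served by the wrapper, never stored in the delegate
        }
        return delegate.preferencesFromUser(userId);
    }
}
```

In this sketch the temporary user exists only in the wrapper, so any code that should see that user has to read through the wrapper rather than the wrapped model.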
>>>>
>>>> (BTW the book Mahout in Action covers this in section 5.4, in the
>>>> current MEAP draft.)
>>>>
>>>> 2) Not sure I fully understand. You already have some external,
>>>> pre-computed notion of item similarity? Then just feed that into
>>>> GenericItemSimilarity and use it from there.
>>>>
>>>> Sean
>>>>
>>>> On Mon, Jul 5, 2010 at 1:52 PM, samsam <[email protected]> wrote:
>>>>
>>>>> Hello all,
>>>>> I want to build a recommendation engine with Apache Mahout. I have read
>>>>> some reading material, but I still have some questions.
>>>>>
>>>>> 1) How to recommend for anonymous users
>>>>> I think the recommendation engine should return recommendations given an
>>>>> item ID. For example, an anonymous user reviews some items and tells the
>>>>> recommender what he reviewed, and it computes recommendations from that
>>>>> review history.
>>>>>
>>>>> 2) How to compute the item similarity dataset
>>>>> Without an item similarity dataset, we can build an ItemBasedRecommender
>>>>> with PearsonCorrelationSimilarity, but we need to make recommendations
>>>>> using extra attributes of items, so we should use an item similarity
>>>>> dataset. How to build that dataset is the key point.
>>>>> --
>>>>> I'm samsam.
>>>>>
>>>>>
>>>>
>>>
>>>
>>> --
>>> I'm samsam.
>>>
>>>
>>
>
>
>