Sebastian,
The current recommender implementation does not distinguish between a
'browsed' item and a 'purchased' item when calculating similarity.
So that can be done as a post-processing step: from the similar items returned
for a given item, remove those that were not purchased.
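
A minimal sketch of that post step, in plain Java with no Mahout calls. It
assumes you already have the similar item IDs in similarity order and a set of
purchased item IDs; the class and method names here are made up for
illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class PurchasedFilter {

    // Keeps only the candidate item IDs that were purchased,
    // preserving the similarity order of the input list.
    static List<Long> keepPurchased(List<Long> similarItemIds, Set<Long> purchasedItemIds) {
        List<Long> kept = new ArrayList<>();
        for (Long id : similarItemIds) {
            if (purchasedItemIds.contains(id)) {
                kept.add(id);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Hypothetical data: IDs as they might come back from a similarity lookup.
        List<Long> similar = Arrays.asList(42L, 7L, 13L, 99L);
        Set<Long> purchased = new HashSet<>(Arrays.asList(7L, 99L));
        System.out.println(keepPurchased(similar, purchased)); // prints [7, 99]
    }
}
```

Since filtering shrinks the list, you would probably want to ask for more
similar items than you finally need.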

The second option is to extend the 'Preference' interface with a method that
returns the type information. You will then also need to provide an appropriate
implementation (the default is GenericPreference). You would then add a
doMostSimilarPurchasedItems() method to GenericItemBasedRecommender, along with
a few other changes. Obviously this is more work.
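
Roughly, the shape of that extension could look like the sketch below. It is
self-contained and does not compile against Mahout: the Preference interface
here is a local stand-in, and ActivityType, TypedPreference and
GenericTypedPreference are hypothetical names, not existing classes.

```java
final class TypedPreferenceSketch {

    // Hypothetical activity types; extend as needed (searched, rated, ...).
    enum ActivityType { BROWSED, PURCHASED }

    // Local stand-in for a preference, NOT the Mahout interface itself.
    interface Preference {
        long getUserID();
        long getItemID();
        float getValue();
    }

    // The extension: a preference that also reports its activity type.
    interface TypedPreference extends Preference {
        ActivityType getType();
    }

    // Simple immutable implementation of the extended interface.
    static final class GenericTypedPreference implements TypedPreference {
        private final long userID;
        private final long itemID;
        private final float value;
        private final ActivityType type;

        GenericTypedPreference(long userID, long itemID, float value, ActivityType type) {
            this.userID = userID;
            this.itemID = itemID;
            this.value = value;
            this.type = type;
        }

        @Override public long getUserID() { return userID; }
        @Override public long getItemID() { return itemID; }
        @Override public float getValue() { return value; }
        @Override public ActivityType getType() { return type; }
    }

    public static void main(String[] args) {
        TypedPreference p = new GenericTypedPreference(1L, 7L, 1.0f, ActivityType.PURCHASED);
        System.out.println(p.getItemID() + " " + p.getType());
    }
}
```

A doMostSimilarPurchasedItems() method would then only consider preferences
whose type is PURCHASED when collecting candidates.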

With the FP mining algorithm, the simplest thing is to just retain itemsets
that contain purchased items instead of modifying the algorithm itself. This
may yield interesting frequent itemsets in which two different types of items
were browsed and purchased together.
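
A minimal sketch of that filtering, again in plain Java. It assumes the
activity type is encoded into each item token before mining (for example
"browse:123" and "purchase:456"); that encoding and the names below are my
assumptions, not something FP-Growth prescribes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class ItemsetFilter {

    // Hypothetical token prefix marking a purchased item.
    static final String PURCHASE_PREFIX = "purchase:";

    // True if at least one item in the itemset is a purchased item.
    static boolean containsPurchase(List<String> itemset) {
        for (String item : itemset) {
            if (item.startsWith(PURCHASE_PREFIX)) {
                return true;
            }
        }
        return false;
    }

    // Retains only the itemsets that contain at least one purchased item.
    static List<List<String>> retainWithPurchase(List<List<String>> itemsets) {
        List<List<String>> kept = new ArrayList<>();
        for (List<String> itemset : itemsets) {
            if (containsPurchase(itemset)) {
                kept.add(itemset);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Hypothetical mined itemsets.
        List<List<String>> mined = Arrays.asList(
            Arrays.asList("browse:1", "browse:2"),
            Arrays.asList("browse:1", "purchase:9"),
            Arrays.asList("purchase:3", "purchase:9"));
        // Drops the first itemset, which contains only browsed items.
        System.out.println(retainWithPurchase(mined));
    }
}
```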

-...@nkur

On 4/15/10 5:40 PM, "Sebastian Feher" <sfe...@crossview.com> wrote:

Robin, Sebastian, Sean, thanks for your responses.

Yes, that is exactly what I am looking for: computing frequent item sets based
on co-browse, co-purchase, co-search, user-item ratings and other user-item
activities, and then using these frequent item sets to provide recommendations
for an active item and/or an active user.

Regarding GenericItemBasedRecommender.mostSimilarItems(), I've used both
Tanimoto and a custom similarity function that works the same way as my
current custom-coded frequent item sets algorithm, which I'm trying to
replace and test with Mahout.

There are a few questions that I'm not able to answer:
- Do you support cross-type frequent item sets? For example: people who
browsed this item ended up purchasing these items. In this case the item
pairs are generated by taking one item from the Browse space and the other from
the Purchase space. Is this something that can be achieved with the current
algorithms (GenericItemBasedRecommender.mostSimilarItems(), FP-Growth) in their
existing form? If not, is there an extension mechanism that allows me to do
that in a clean fashion, or do I have to modify the algorithm code?

Thanks

On Apr 14, 2010, at 11:46 AM, Sebastian Schelter wrote:

> Hi Sebastian,
>
> I can only help you with what
> GenericItemBasedRecommender.mostSimilarItems() does. It's basically what
> you know from amazon.com: "People who like this item also like the
> following items". Mathematically speaking, you have a matrix of the
> preferences of users towards items, and mostSimilarItems() searches for
> the highest-ranking item vectors using some similarity function (usually
> cosine or Pearson correlation).
>
> A good overview of how item-based collaborative filtering works and
> what the most similar items are can be found in this paper (it helped me
> understand the whole issue):
> http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.144.9927&rep=rep1&type=pdf
>
> Regards,
> Sebastian
>
> Sebastian Feher schrieb:
>> Hi All,
>>
>> I'm looking at extracting association rules with Mahout. If I understand it
>> correctly, both GenericItemBasedRecommender.mostSimilarItems() and Parallel
>> FP-Growth seem to provide support for doing that. Is this true? If not, what
>> are the major differences between the two (including scalability and
>> performance)? Thanks.
>>
>> Sebastian
>

