OK, that would be great, Owen, if you could point me to the book 'Mahout in
Action'. I'm quite interested to know more about the possibilities available
with Mahout and also the right usage of similarities, recommenders, etc. I'll
spend more time on the javadoc to get my understanding of the RecommenderJob
class crystal clear, and then convert my code into one usable in a Hadoop
distributed environment. I'll get back to you in case I need more details on
the RecommenderJob class.
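
To check my own understanding of similarities on boolean data (user id, item id
only), here is a minimal sketch in plain Java. It is not Mahout code: the class
name TanimotoSketch and the method tanimoto are names I made up for
illustration, and it only assumes the usual definition of the Tanimoto
coefficient as the size of the intersection divided by the size of the union of
two users' item sets. It suggests why a user with very few items may have
little or no overlap with any neighbour.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical illustration only; this class is not part of Mahout. */
public class TanimotoSketch {

    /**
     * Tanimoto coefficient of two item-ID sets: |A intersect B| / |A union B|.
     * Returns 0.0 when both sets are empty (a choice made for this sketch).
     */
    public static double tanimoto(Set<Long> a, Set<Long> b) {
        Set<Long> intersection = new HashSet<>(a);
        intersection.retainAll(b);
        int unionSize = a.size() + b.size() - intersection.size();
        return unionSize == 0 ? 0.0 : (double) intersection.size() / unionSize;
    }

    public static void main(String[] args) {
        // Two users sharing 2 items out of 4 distinct items in total.
        Set<Long> userA = new HashSet<>(Arrays.asList(1L, 2L, 3L));
        Set<Long> userB = new HashSet<>(Arrays.asList(2L, 3L, 4L));
        // A sparse user whose single item nobody else in this example has.
        Set<Long> sparseUser = new HashSet<>(Arrays.asList(9L));

        System.out.println("A vs B:      " + tanimoto(userA, userB));
        System.out.println("A vs sparse: " + tanimoto(userA, sparseUser));
    }
}
```

In this toy example the sparse user has zero similarity to user A, so there is
no neighbour to borrow items from, which matches what I am seeing for the users
with only a few preferences.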
Thanks a lot for your clarifications and support.
Thanks and Regards
Bejoy.K.S
> Date: Fri, 12 Nov 2010 13:33:01 +0000
> Subject: Re: Mahout - Help needed - files with no preferences and integarting
> mahout with Hadoop
> From: [email protected]
> To: [email protected]
>
> Yes, if you have no data for a user, you can't make recommendations. If you
> have a little data, you can make only a few, if any, weak recommendations.
> The framework won't return very weak recommendations.
>
> For RecommenderJob -- just read the javadoc, which will tell you how to run
> it.
>
> I'll also point you to the book Mahout In Action. I wrote up a detailed
> treatment of the Hadoop-based recommender job and also the other
> non-distributed code. You don't need the book to use the code -- the javadoc
> explains it all -- but if you're deeply interested it will add a lot of
> value to you.
>
> On Fri, Nov 12, 2010 at 12:31 PM, bejoy ks <[email protected]> wrote:
>
> >
> > Thanks Owen. Thanks a lot.
> > You were right, I have close to 5000 <User Id>,<Item Id> pairs, and my
> > code is producing recommendations for some User Ids whereas for others it
> > doesn't.
> > The User Ids that didn't get any recommendations had only a few
> > preferences compared to others, but all users do have some preferences.
> > I guess I need more data to get more accurate recommendations for all users.
> > Are there any better implementations of this code? Any suggestions from your
> > end?
> >
> > Also, I'd like to get this same code running on my Hadoop cluster. Could
> > you please help me out with some documents or simple code snippets that
> > could lead my way? I tried web help, but most blogs refer me to
> > org.apache.mahout.cf.taste.hadoop.item.RecommenderJob, the code available
> > with the Mahout source. Unfortunately I'm not able to get much from the code,
> > as it is a bit complicated for a starter like me. (I have done a couple of
> > MapReduce projects on text file processing before; that is my Hadoop MR
> > experience.)
> >
> > It would be very helpful if you could share some informative tutorials
> > that could make my initial steps in Mahout comfortable before I start
> > exploring in depth. I'm from India and I'm eagerly waiting for
> > the Indian edition of 'Mahout In Action', which is not in stores yet.
> >
> > Steven,
> > Thank you for your support.
> >
> > Thanks and Regards
> > Bejoy.K.S
> >
> >
> >
> >
> > > Date: Fri, 12 Nov 2010 12:06:47 +0000
> > > Subject: Re: Mahout - Help needed - files with no preferences and
> > integarting mahout with Hadoop
> > > From: [email protected]
> > > To: [email protected]
> > >
> > > I think you'd have to debug to get more insight. To me it looks OK. Do you
> > > have enough data? Maybe your user has few or no prefs, which means nothing
> > > can be recommended.
> > >
> > >
> > > On Fri, Nov 12, 2010 at 12:02 PM, bejoy ks <[email protected]> wrote:
> > >
> > > >
> > > > Hi Steven
> > > > I tried my User Similarity recommendation using
> > > > GenericBooleanPrefUserBasedRecommender as you suggested, but unfortunately
> > > > it is not resolving my issue. I can see in the logs that it is processing
> > > > the entire file (<UserId, ItemId>), but then in a few seconds the
> > > > application terminates without giving me any recommendations. The code I
> > > > tried is:
> > > >
> > > > FileDataModel dataModel = new FileDataModel(new File(recsFile));
> > > > UserSimilarity userSimilarity = new TanimotoCoefficientSimilarity(dataModel);
> > > > UserNeighborhood neighborhood =
> > > >     new NearestNUserNeighborhood(neighbourhoodSize, userSimilarity, dataModel);
> > > > Recommender recommender =
> > > >     new GenericBooleanPrefUserBasedRecommender(dataModel, neighborhood, userSimilarity);
> > > > List<RecommendedItem> recommendations =
> > > >     recommender.recommend(userId, noOfRecommendations);
> > > > System.out.println(recommendations);
> > > >
> > > > I tried the same code with PearsonCorrelationSimilarity as well. Still it
> > > > is not giving me any recommendations.
> > > > Is there anything else I should change in my code to get
> > > > recommendations?
> > > >
> > > > Thanks and Regards
> > > > Bejoy.K.S
> > > >
> > > >
> > > >
> > > >
> > > > > Date: Fri, 12 Nov 2010 10:28:16 +0000
> > > > > Subject: Re: FW: Mahout - Help needed - files with no preferences and
> > > > integarting mahout with Hadoop
> > > > > From: [email protected]
> > > > > To: [email protected]
> > > > >
> > > > > Try using the generic boolean pref user based recommender instead of the
> > > > > generic user based recommender when using binary data.
> > > > >
> > > > > http://mahout.apache.org/javadoc/core/org/apache/mahout/cf/taste/impl/recommender/GenericBooleanPrefUserBasedRecommender.html
> > > > >
> > > > > On Fri, Nov 12, 2010 at 9:57 AM, bejoy ks <[email protected]> wrote:
> > > > >
> > > > > >
> > > > > > Hi Mahout Experts,
> > > > > >
> > > > > >
> > > > > > I'm totally new to Mahout and badly in need of some expert opinions. I
> > > > > > wanted to implement a recommendation engine using Mahout and Hadoop. As
> > > > > > the initial stage I just tried a recommender class with Mahout alone,
> > > > > > which worked fine. I'm using Eclipse for my development, and to test I
> > > > > > just ran the code from Eclipse like a simple Java class with the
> > > > > > required jars already added to the project build path. The code I used
> > > > > > for testing User Similarity recommendation is as follows:
> > > > > >
> > > > > > public static void main(String[] args) throws TasteException {
> > > > > >     String recsFile = "D://mahoutFiles//SampleData.txt";
> > > > > >     int neighbourhoodSize = 7;
> > > > > >     long ssoId = 206008129;
> > > > > >     int noOfRecommendations = 5;
> > > > > >     try {
> > > > > >         FileDataModel dataModel = new FileDataModel(new File(recsFile));
> > > > > >         UserSimilarity userSimilarity = new PearsonCorrelationSimilarity(dataModel);
> > > > > >         UserNeighborhood neighborhood =
> > > > > >             new NearestNUserNeighborhood(neighbourhoodSize, userSimilarity, dataModel);
> > > > > >         Recommender recommender =
> > > > > >             new GenericUserBasedRecommender(dataModel, neighborhood, userSimilarity);
> > > > > >         List<RecommendedItem> recommendations =
> > > > > >             recommender.recommend(ssoId, noOfRecommendations);
> > > > > >         for (RecommendedItem recommendedItem : recommendations) {
> > > > > >             System.out.println(recommendedItem.getItemID());
> > > > > >             System.out.print(" Item: " + dataModel.getItemIDAsString(recommendedItem.getItemID()));
> > > > > >             System.out.println(" and value: " + recommendedItem.getValue());
> > > > > >         }
> > > > > >     } catch (IOException e) {
> > > > > >         e.printStackTrace();
> > > > > >     } catch (TasteException e) {
> > > > > >         e.printStackTrace();
> > > > > >     }
> > > > > > }
> > > > > >
> > > > > >
> > > > > > The code I used for testing Item Similarity recommendation is as follows:
> > > > > >
> > > > > > public static void main(String[] args) {
> > > > > >     String recsFile = "D://mahoutFiles//SampleData.txt";
> > > > > >     long ssoId = 206008129;
> > > > > >     int noOfRecommendations = 5;
> > > > > >     String itemId = "GE-CORPPROG-IMLP-PTOOLS";
> > > > > >     try {
> > > > > >         // item similarity recommendation based on User
> > > > > >         FileDataModel dataModel = new FileDataModel(new File(recsFile));
> > > > > >         ItemSimilarity itemSimilarity = new LogLikelihoodSimilarity(dataModel);
> > > > > >         ItemBasedRecommender recommender =
> > > > > >             new GenericItemBasedRecommender(dataModel, itemSimilarity);
> > > > > >         List<RecommendedItem> recommendations =
> > > > > >             recommender.recommend(ssoId, noOfRecommendations);
> > > > > >         for (RecommendedItem recommendedItem : recommendations) {
> > > > > >             System.out.println(recommendedItem);
> > > > > >         }
> > > > > >
> > > > > >         // item similarity recommendation based on item
> > > > > >         recommender = new GenericItemBasedRecommender(dataModel, itemSimilarity);
> > > > > >         recommendations = recommender.mostSimilarItems(
> > > > > >             dataModel.readItemIDFromString(itemId), noOfRecommendations);
> > > > > >         System.out.println("Item recommendations based on Item");
> > > > > >         for (RecommendedItem recommendedItem : recommendations) {
> > > > > >             System.out.println(recommendedItem);
> > > > > >         }
> > > > > >     } catch (IOException e) {
> > > > > >         // TODO Auto-generated catch block
> > > > > >         e.printStackTrace();
> > > > > >     } catch (TasteException e) {
> > > > > >         // TODO Auto-generated catch block
> > > > > >         e.printStackTrace();
> > > > > >     }
> > > > > > }
> > > > > >
> > > > > >
> > > > > > The issues or concerns I have are as follows.
> > > > > >
> > > > > >
> > > > > >
> > > > > > Problem 1:
> > > > > >
> > > > > > This code produces relevant recommendations when I give my file in the
> > > > > > format <user id>, <item id>, <preference value>. But my actual sample
> > > > > > file has just the user id and item id, with no preference value. From
> > > > > > what I have read, since this is a kind of Boolean preference, I tried
> > > > > > using PearsonCorrelationSimilarity as well as
> > > > > > TanimotoCoefficientSimilarity, but neither gives me any
> > > > > > recommendations. The actual file I want to process would be in the
> > > > > > format <user id>, <item id>. I can see it is processing all users in my
> > > > > > file (from my Eclipse console logs), but no recommendations are being
> > > > > > produced as output.
> > > > > >
> > > > > > Problem 2:
> > > > > >
> > > > > > I want to use my Mahout recommender in a Hadoop distributed environment,
> > > > > > but I don't have a clue how. I tried web help, but most blogs refer me
> > > > > > to org.apache.mahout.cf.taste.hadoop.item.RecommenderJob, the code
> > > > > > available with the Mahout source. Unfortunately I'm not able to get much
> > > > > > from the code, as it is a bit complicated for a starter like me. (I have
> > > > > > done a couple of MapReduce projects on text file processing before.)
> > > > > > Could someone please help me out with very basic, simple code snippets
> > > > > > to run my User Similarity and Item Similarity recommendation code in a
> > > > > > Hadoop environment?
> > > > > >
> > > > > > It would be very helpful if you could share some informative tutorials
> > > > > > that could make my initial steps in Mahout comfortable before I start
> > > > > > exploring in depth.
> > > > > >
> > > > > > Thank you
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > > Thanks and Regards
> > > > > > Bejoy.K.S