Hi Vignesh,

You might want to start with some of the examples that are bundled with
Mahout. They are pretty straightforward. Tweaking them to figure out how
they work, and then adapting them to your needs, is probably the easiest
route. For example, try the GroupLens example.

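On the alphanumeric-ID point in Bejoy's reply quoted below, here is a
minimal sketch of what such an input/output formatter could look like. It
is not Mahout code: the IdMapper class, the encode_preferences function
and the sample IDs are all hypothetical names made up for illustration,
and it assumes the recommender reads plain "userID,itemID,value" CSV
lines with numeric IDs.

```python
import csv
import io


class IdMapper:
    """Hypothetical helper: gives each distinct string ID a numeric ID and maps back."""

    def __init__(self):
        self._to_num = {}   # original string ID -> numeric ID
        self._to_str = []   # numeric ID (list index) -> original string ID

    def to_numeric(self, key):
        # Assign the next free integer the first time a key is seen.
        if key not in self._to_num:
            self._to_num[key] = len(self._to_str)
            self._to_str.append(key)
        return self._to_num[key]

    def to_original(self, num):
        return self._to_str[num]


def encode_preferences(rows, users, items):
    """Turn (user, item, value) rows into 'userID,itemID,value' CSV text."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for user, item, value in rows:
        writer.writerow([users.to_numeric(user), items.to_numeric(item), value])
    return out.getvalue()


# Made-up sample data: user name, product code, rating or visit count.
users, items = IdMapper(), IdMapper()
rows = [("alice", "sku-A1", 4.0), ("bob", "sku-A1", 2.0), ("alice", "sku-B7", 5.0)]
encoded = encode_preferences(rows, users, items)
print(encoded)
```

The same two mappers would be kept around (or persisted) so that the
numeric IDs in the recommender output can be turned back into the
original IDs with to_original().
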
On Thu, Nov 10, 2011 at 9:22 AM, bejoy ks <[email protected]> wrote:
>
> Hey Vignesh
> You can refer to the book Mahout in Action. It would give you a deep
> knowledge of the available algorithms and help you choose the best one
> for your data set. If your user IDs and item IDs are numeric, then the
> distributed implementation of item-based similarity is straightforward:
> just use the packaged jar available with Mahout and provide the
> appropriate inputs. The IBM developerWorks article is really good. In
> addition you can refer to the following as well:
> https://cwiki.apache.org/MAHOUT/itembased-collaborative-filtering.html
> http://kickstarthadoop.blogspot.com/2011/05/mahout-recommendations-in-distributed.html
>
> A couple of points to take care of:
> - I'm not sure why you chose Hive to store the final result. If you are
>   looking for a distributed database within the Hadoop ecosystem with
>   low-latency access, that is HBase, not Hive. You need such a
>   distributed database only if the recommendation results are too large,
>   running to a few terabytes, and hence not scalable with a traditional
>   RDBMS. If the result size is manageable, get it from HDFS and store it
>   in an RDBMS, which would be the better choice for legacy/existing
>   applications to consume.
> - On top of that, if the user IDs or item IDs in your input data set are
>   alphanumeric, it would be better to wrap your distributed recommender
>   in an input and output formatter. It would in turn convert
>   alphanumeric IDs to numeric on the data the recommender consumes, and
>   back again on the data it produces.
>
> Hope it helps!
>
> Thanks and Regards
> Bejoy.K.S
>
> > Date: Thu, 10 Nov 2011 15:16:11 +0100
> > Subject: Re: Help needed for Recommendation engine
> > From: [email protected]
> > To: [email protected]
> >
> > This new article by Grant Ingersoll is really good to get started in my
> > opinion.
> >
> > http://www.ibm.com/developerworks/java/library/j-mahout-scaling/index.html?ca=drs-
> >
> > Pascal
> >
> > 2011/11/10 VIGNESH PRAJAPATI <[email protected]>
> >
> > > Ya Ted,
> > > I am new to the Mahout world, with Java knowledge and the basics of
> > > Hadoop and Mahout.
> > > I want to develop a recommender system with the k-means clustering
> > > algorithm, so that it takes little time to generate item
> > > recommendations for millions of users based on their past activity,
> > > such as users' ratings of particular products and the total number
> > > of product page visits by users. So I have three inputs:
> > > 1. userid
> > > 2. productid
> > > 3. rating or number of visits by the userid given in 1
> > >
> > > Based on this, it will provide a number of items, or the best item,
> > > appropriate to the user as a recommendation. After that, all this
> > > information is stored in Hadoop.
> > >
> > > -----Original message-----
> > > From: Ted Dunning
> > > Sent: 10/11/2011, 7:09 pm
> > > To: [email protected]
> > > Cc: user
> > > Subject: Re: Help needed for Recommendation engine
> > >
> > > Hive is not a database.
> > >
> > > You should test different algorithms. If you want suggestions you
> > > should say a lot more about what you are doing.
> > >
> > > Sent from my iPhone
> > >
> > > On Nov 10, 2011, at 5:39, VIGNESH PRAJAPATI <[email protected]> wrote:
> > >
> > > > So how do I start, and which Mahout algorithm is suitable for this?
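
On Bejoy's suggestion to pull a manageable result set out of HDFS and
store it in an RDBMS, here is a rough sketch of the loading step. It
rests on assumptions: that the recommender output has already been copied
to local text lines, and that each line looks like
"userID<TAB>[itemID:score,itemID:score]" (check this against the output
of your own job). The table name, the parse_line and load functions and
the sample lines are hypothetical, and SQLite stands in for whatever
RDBMS the consuming application really uses.

```python
import sqlite3


def parse_line(line):
    """Parse 'user<TAB>[item:score,item:score]' into (user, item, score) tuples."""
    user, _, payload = line.strip().partition("\t")
    payload = payload.strip("[]")
    rows = []
    # filter(None, ...) skips the empty string produced by an empty '[]' list.
    for pair in filter(None, payload.split(",")):
        item, _, score = pair.partition(":")
        rows.append((int(user), int(item), float(score)))
    return rows


def load(lines, conn):
    """Create the (hypothetical) recommendations table and insert all parsed rows."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS recommendations "
        "(user_id INTEGER, item_id INTEGER, score REAL)"
    )
    for line in lines:
        conn.executemany(
            "INSERT INTO recommendations VALUES (?, ?, ?)", parse_line(line)
        )
    conn.commit()


# Made-up sample lines in the assumed output format.
sample = ["1\t[101:4.5,102:3.9]", "2\t[103:4.1]"]
conn = sqlite3.connect(":memory:")
load(sample, conn)
top = conn.execute(
    "SELECT item_id, score FROM recommendations "
    "WHERE user_id = ? ORDER BY score DESC",
    (1,),
).fetchall()
print(top)
```

An existing application can then read recommendations for a user with an
ordinary SQL query, without touching Hadoop at request time.
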
