Hi, I am also very new to Mahout.
My understanding is that Mahout aims to implement the 10 machine learning algorithms described in this paper:
http://cs.stanford.edu/people/ang/papers/nips06-mapreducemulticore.pdf

For classification, try running the classify-20newsgroups example. It's a good way to make sure the system is working and to get familiar with Mahout's input and output.

-Kevin

On 7/24/13 09:12, "Sebastian Schelter" <[email protected]> wrote:

Jason,

You should also search the literature for "link prediction"; that's the academic term for the problem you describe. This paper might be a good starting point:

"The Link Prediction Problem for Social Networks"
http://www.cs.cornell.edu/home/kleinber/link-pred.pdf?

2013/7/24 Ted Dunning <[email protected]>

> I don't see the contact list of the potential connection. Overlap of
> connection lists should be an extremely strong signal.
>
> You are correct that this tends to be implemented as a classification
> problem. The target variable is a binary variable that indicates whether
> the person knows or does not know the potential connection. Predictor
> variables include what you have described as well as many variants of
> the same.
>
> On Tue, Jul 23, 2013 at 9:28 PM, Jason Lee <[email protected]> wrote:
>
> > Hi all,
> >
> > Currently I am working on a recommendation system for an SNS site.
> > There are 15M+ registered members on our site. We already have a PYMK
> > implementation (it does not use Mahout or any machine learning
> > library), but the accuracy of the recommendations it produces is not
> > as good as we expected, so I'm looking for a better way to implement
> > this feature.
> >
> > Here are some rules that should be considered when recommending
> > "People You May Know" to the current member (anything to add?):
> > Contacts list imported by current member;
> > Same company:
> >     overlap of employment date range between current member and
> >     recommended members;
> >     size of company;
> >     function of current member and recommended members;
> > Same login IP
> > Same school
> > Mutual Friends
> >
> > As far as I know, Mahout focuses on CF (collaborative filtering), but
> > PYMK is more like a content-based recommendation, because the
> > information held in a member's profile is the basis of PYMK
> > processing.
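
To make the classification framing above a bit more concrete, here is a minimal Python sketch of how predictor variables for one candidate pair might be computed. It is only an illustration under stated assumptions: the `contacts` and `profiles` dictionaries, their field names, and the `pair_features` helper are all hypothetical, and nothing here is Mahout API. It covers the connection-list overlap Ted mentions plus two of the profile rules from Jason's list.

```python
# Hypothetical toy data: member id -> set of contact ids.
contacts = {
    "alice": {"bob", "carol", "dave"},
    "bob": {"alice", "carol"},
    "erin": {"carol", "dave", "frank"},
}

# Hypothetical profile fields; the names are illustrative only.
profiles = {
    "alice": {"company": "acme", "school": "mit"},
    "bob": {"company": "acme", "school": "cmu"},
    "erin": {"company": "initech", "school": "mit"},
}


def pair_features(a, b):
    """Return predictor variables for the candidate pair (a, b)."""
    # Exclude the two members themselves so an existing link between
    # them does not inflate the overlap.
    contacts_a = contacts.get(a, set()) - {b}
    contacts_b = contacts.get(b, set()) - {a}
    mutual = contacts_a & contacts_b
    union = contacts_a | contacts_b
    return {
        # Raw count of shared contacts ("Mutual Friends").
        "mutual_contacts": len(mutual),
        # Jaccard overlap of the two connection lists, in [0, 1].
        "jaccard": len(mutual) / len(union) if union else 0.0,
        # Simple profile matches, encoded as 0/1.
        "same_company": int(profiles[a]["company"] == profiles[b]["company"]),
        "same_school": int(profiles[a]["school"] == profiles[b]["school"]),
    }


if __name__ == "__main__":
    print(pair_features("alice", "erin"))
```

Each feature row would then be paired with a 0/1 label (whether the two members are actually connected) and handed to whichever binary classifier you choose. The remaining rules, such as employment date overlap, company size, and login IP, could be added as further columns in the same way.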
