Jason, you should also search the literature for "link prediction"; that's the academic term for the problem you describe.
This paper might be a good starting point: "The Link Prediction Problem for Social Networks"
http://www.cs.cornell.edu/home/kleinber/link-pred.pdf

2013/7/24 Ted Dunning <[email protected]>

> I don't see the contact list of the potential connection. Overlap of
> connection lists should be an extremely strong signal.
>
> You are correct that this tends to be implemented as a classification problem.
> The target variable is a binary variable that indicates whether the person
> knows or does not know the potential connection. Predictor variables
> include what you have described as well as many variants of the same.
>
> On Tue, Jul 23, 2013 at 9:28 PM, Jason Lee <[email protected]> wrote:
>
> > Hi all,
> >
> > Currently I am working on a recommendation system for an SNS site. There are
> > 15M+ registered members on our site. We already have a PYMK
> > implementation (not using Mahout or any machine learning libraries), but
> > the accuracy of the recommendations produced by the current implementation
> > is not as good as we expected, so I'm looking for a better way to implement
> > this feature.
> >
> > Here are some rules that should be considered when recommending "People You
> > May Know" to the current member (any additions?):
> > Contacts list imported by current member;
> > Same company:
> >   overlap of employed date range between current member and recommended
> >   members;
> >   size of company;
> >   function of current member and recommended members;
> > Same login IP
> > Same school
> > Mutual Friends
> >
> > As far as I know, Mahout is focused on CF (collaborative filtering), but
> > PYMK is more like content-based recommendation, because the information
> > held in a member's profile is the basis of PYMK processing.
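
To make the "overlap of connection lists" signal and the "Mutual Friends" rule from the thread concrete, here is a rough sketch of three neighborhood-overlap scores commonly used in the link-prediction literature (common neighbors, Jaccard, Adamic-Adar). The toy graph, the member names, and the helper names such as `rank_candidates` are made up for illustration, and the graph is assumed to be undirected and small enough to fit in memory. It is not Mahout code and not a description of any existing PYMK implementation.

```python
import math

# Hypothetical toy friendship graph: member -> set of direct connections.
# Assumed undirected, so every edge appears in both adjacency sets.
graph = {
    "ann": {"bob", "cat"},
    "bob": {"ann", "dan", "eve"},
    "cat": {"ann", "dan"},
    "dan": {"bob", "cat"},
    "eve": {"bob"},
}


def common_neighbors(graph, a, b):
    """Members connected to both a and b (the 'mutual friends')."""
    return graph[a] & graph[b]


def jaccard(graph, a, b):
    """Size of the connection-list overlap divided by size of the union."""
    union = graph[a] | graph[b]
    return len(graph[a] & graph[b]) / len(union) if union else 0.0


def adamic_adar(graph, a, b):
    """Sum over mutual friends of 1 / log(degree), so that a mutual friend
    with few connections counts more than a very well-connected one."""
    return sum(
        1.0 / math.log(len(graph[z]))
        for z in graph[a] & graph[b]
        if len(graph[z]) > 1  # guard against log(1) == 0
    )


def rank_candidates(graph, member, score=adamic_adar, top_n=10):
    """Score every friend-of-a-friend who is not already a connection."""
    candidates = set()
    for friend in graph[member]:
        candidates |= graph[friend]
    candidates -= graph[member]
    candidates.discard(member)
    scored = [(score(graph, member, c), c) for c in candidates]
    # Highest score first; ties broken by name to keep the order stable.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [(c, s) for s, c in scored[:top_n]]


if __name__ == "__main__":
    for candidate, s in rank_candidates(graph, "ann"):
        print(candidate, round(s, 3))
```

In this toy graph "dan" ranks above "eve" for "ann", since dan shares two mutual friends with her and eve only one. In the classification setup Ted describes, scores like these would probably be used as predictor variables next to the profile-based ones (same company, same school, and so on) rather than as the final ranking.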
