Hi Josh,

One thing to consider is that CF approaches will typically ignore "similarity" between items/articles except for implied similarity based on stars/ratings. I.e. if you want your model to account for textual similarity as well as star/rating relations, a basic CF model probably isn't what you want. Instead, you might consider jointly solving many classification problems (one for each user) where the item/article feature set is the text. Here's an example I worked on which was a bit more general (5-star ratings rather than the on/off input it sounds like you have):

http://people.csail.mit.edu/jrennie/papers/ijcai05-preference.pdf

With text, you may need to be a bit careful about the size of the feature set (words) so that your parameter set doesn't become intractable.

Note that if you want the system to exhibit real-time feedback, Mahout may not be what you want since it is intended for batch-processing, IIUC.

Jason

On Mon, Mar 30, 2009 at 5:07 PM, Joshua Bronson <[email protected]> wrote:

> I'm working on an experimental web-based feed reader[1], and in our next
> release we would like to feature collaborative filtering-based article
> recommendation. For starters, articles will be recommended to you based on
> how similar they are to other articles that either you or people you're
> following have starred. I am just getting started reading up on mahout and
> the problem space in general[2], and thought I would inquire here about
> whether it would be a good choice for us.
>
> Thanks!
> Josh
>
> P.S. Do you guys hang out in an IRC channel by any chance?
>
>
> [1] http://melkjug.org, http://melkjug.openplans.org/about
> [2] http://oreilly.com/catalog/9780596529321/
>

--
Jason Rennie
Research Scientist, ITA Software
http://www.itasoftware.com/
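
As a rough illustration of the per-user classification idea in the reply above, here is a minimal Python sketch. It trains one logistic-regression classifier per user over binary word features, with the vocabulary capped to keep the parameter count small. This is a simplification under stated assumptions: the classifiers are trained independently rather than solved jointly, and it is not the method from the linked paper. The toy articles, the user names, and every function name (build_vocab, featurize, train_user, score) are made up for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase, split on whitespace, keep purely alphabetic tokens."""
    return [w for w in text.lower().split() if w.isalpha()]


def build_vocab(articles, max_words=1000):
    """Keep only the max_words most common words (by document frequency)."""
    counts = Counter(w for text in articles.values() for w in set(tokenize(text)))
    words = sorted(counts, key=lambda w: (-counts[w], w))[:max_words]
    return {w: i for i, w in enumerate(words)}


def featurize(text, vocab):
    """Binary bag-of-words vector: 1.0 if the word occurs in the text."""
    x = [0.0] * len(vocab)
    for w in tokenize(text):
        if w in vocab:
            x[vocab[w]] = 1.0
    return x


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_user(examples, n_features, epochs=200, lr=0.5, l2=0.01):
    """L2-regularized logistic regression fit by stochastic gradient descent.

    examples is a list of (feature_vector, label) with label 1 = starred.
    """
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        for x, y in examples:
            p = sigmoid(b + sum(wi * xi for wi, xi in zip(w, x)))
            g = p - y  # gradient of the log loss w.r.t. the linear score
            w = [wi - lr * (g * xi + l2 * wi) for wi, xi in zip(w, x)]
            b -= lr * g
    return w, b


def score(model, x):
    """Estimated probability that the user would star an article."""
    w, b = model
    return sigmoid(b + sum(wi * xi for wi, xi in zip(w, x)))


# Hypothetical toy data: article texts and on/off star feedback per user.
articles = {
    "a1": "python machine learning tutorial",
    "a2": "learning python for data analysis",
    "a3": "football match results",
    "a4": "football transfer news",
    "a5": "python data tutorial",
    "a6": "football league results",
}
stars = {
    "alice": {"a1": 1, "a2": 1, "a3": 0, "a4": 0},
    "bob": {"a1": 0, "a2": 0, "a3": 1, "a4": 1},
}

vocab = build_vocab(articles, max_words=1000)

# One classifier per user, each trained only on that user's feedback.
models = {}
for user, feedback in stars.items():
    examples = [(featurize(articles[a], vocab), y) for a, y in feedback.items()]
    models[user] = train_user(examples, len(vocab))

# Rank the articles each user has not rated yet.
for user, model in models.items():
    unseen = [a for a in articles if a not in stars[user]]
    ranked = sorted(unseen, key=lambda a: -score(model, featurize(articles[a], vocab)))
    print(user, ranked)
```

Because each user has a separate weight vector over the vocabulary, the total parameter count is roughly (number of users) x (vocabulary size), which is why capping max_words matters. A joint formulation would additionally share information across users, which this sketch does not attempt.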
