On 11 April 2011 16:37, Mathieu sgard <[email protected]> wrote:
> Hello,
>
> I'm working on a recommender feature in e-commerce.
> Is it possible to train the mahout recommender in incremental way or the
> only way is compute entire dataset when new items are added ?

Yes. For example, see the mention of update ("delta") files for the Taste subsystem in
http://search-lucene.com/jd/mahout/core/org/apache/mahout/cf/taste/impl/model/file/FileDataModel.html

Excerpt:

"This class will also look for update "delta" files in the same directory, with file names that start the same way (up to the first period). These files have the same format, and provide updated data that supersedes what is in the main data file. This is a mechanism that allows an application to push updates to FileDataModel without re-copying the entire data file.

One small format difference exists. Update files must also be able to express deletes. This is done by ending with a blank preference value, as in "123,456,"."

(I've not investigated similar mechanisms for the other kinds of DataModel implementation, e.g. JDBC-backed.)

If you have the 'Mahout in Action' book, search (the PDF!) for 'update' or 'update file' (around p.27). Brief excerpt:

"Because scale is a pervasive theme of this book, here we should emphasize another useful feature of FileDataModel: “update files”. Data changes, and usually the data that changes is only a tiny subset of all the data – maybe even just a few new data points, in comparison to a billion existing ones. Pushing around a brand new copy of a file containing a billion preferences just to push a few updates is wildly inefficient."

Oh, and if you don't have the book and you're building an ecommerce system with Mahout and value your own time, just get the book. It'll pay for itself within an evening's reading :)

cheers,

Dan
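
To make the rule in the Javadoc excerpt concrete, here is a minimal sketch in plain Java of how delta lines could be folded over a main data file: later lines supersede earlier ones, and a blank preference value (as in "123,456,") means delete. This is only an illustration of the quoted semantics, not Mahout's actual implementation. The class name DeltaFileSketch, its methods, and the sample data are hypothetical, and the sketch assumes simple "userID,itemID,value" lines. On disk you might keep the main data in ratings.csv and the updates in ratings.1.csv (both names are made up here; per the excerpt, they only need to start the same way up to the first period).

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only: mimics the update-file rule quoted above,
// it does not use or reproduce Mahout's FileDataModel code.
public class DeltaFileSketch {

    /** Applies "userID,itemID,value" lines to prefs; a blank value deletes the entry. */
    static void apply(Map<String, Float> prefs, List<String> lines) {
        for (String line : lines) {
            // limit -1 keeps the trailing empty field of a delete line such as "123,456,"
            String[] fields = line.split(",", -1);
            String key = fields[0].trim() + "," + fields[1].trim();
            String value = fields[2].trim();
            if (value.isEmpty()) {
                prefs.remove(key);                      // blank preference value: delete
            } else {
                prefs.put(key, Float.parseFloat(value)); // new or superseding value
            }
        }
    }

    /** Loads the main lines first, then lets the delta lines supersede them. */
    static Map<String, Float> merge(List<String> mainLines, List<String> deltaLines) {
        Map<String, Float> prefs = new LinkedHashMap<>();
        apply(prefs, mainLines);
        apply(prefs, deltaLines);
        return prefs;
    }

    public static void main(String[] args) {
        List<String> mainFile = List.of("123,456,3.0", "123,789,4.5");
        // delete (123,456), change (123,789), add (200,456)
        List<String> deltaFile = List.of("123,456,", "123,789,2.0", "200,456,5.0");
        System.out.println(merge(mainFile, deltaFile));
    }
}
```

With Mahout itself you would not write this merge by hand; the point of the excerpt is that FileDataModel picks up the update files for you, so only the small delta has to be pushed.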
