On May 23, 2011, at 2:40 AM, Uwe Reimann wrote:

> Hi,
>
> I'm currently integrating Mahout's recommendation engine into a site.
>
> I'm not quite clear what DataModel to use. PostgresJdbcDataModel looks handy,
> but seems to produce way too many queries. ReloadFromJDBCDataModel seems to
> address that problem but still needs to calculate the similarity of a given
> user to every other user in the system.
>
> Would it be possible and performant to use Lucene to perform the search for
> the top n most similar users, provided an index exists where the user id is
> the document id and the preferences of the users are the term vectors?

It is certainly possible, but I don't know that term vectors will give you the performance you are looking for. You might find http://www.lucidimagination.com/search/document/c82c577e1e28259f/problems_with_itembasedrecommender_with_lucene#c82c577e1e28259f helpful, as I think it describes a better way of leveraging Lucene for this problem.

That said, doesn't Mahout's recommender already have the necessary pieces to do what you want?

-Grant
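
To make the question concrete, here is a rough sketch of finding the top n most similar users through an inverted index rather than by comparing one user against every other user. It is plain Python and uses neither Lucene's nor Mahout's API. The names `build_inverted_index` and `top_n_similar_users`, the toy preference data, and the choice of cosine similarity over boolean preferences are all assumptions made for illustration.

```python
from collections import defaultdict
from math import sqrt


def build_inverted_index(prefs):
    """Map each item id to the set of users who have a preference for it."""
    index = defaultdict(set)
    for user, items in prefs.items():
        for item in items:
            index[item].add(user)
    return index


def top_n_similar_users(user, prefs, index, n=2):
    """Return up to n (user, score) pairs, most similar first.

    Only users sharing at least one item with `user` are ever scored,
    which is the saving an inverted index gives over a full scan.
    The score is cosine similarity over boolean preferences (an assumption).
    """
    shared_counts = defaultdict(int)
    for item in prefs[user]:
        for other in index.get(item, ()):
            if other != user:
                shared_counts[other] += 1

    own_size = len(prefs[user])
    scored = [
        (other, shared / sqrt(own_size * len(prefs[other])))
        for other, shared in shared_counts.items()
    ]
    # Highest score first; ties broken by user id for a stable order.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored[:n]


# Hypothetical toy data: user id -> set of preferred item ids.
prefs = {
    "alice": {"i1", "i2", "i3"},
    "bob": {"i2", "i3"},
    "carol": {"i3", "i4"},
    "dave": {"i5"},
}

index = build_inverted_index(prefs)
similar = top_n_similar_users("alice", prefs, index, n=2)
print(similar)
```

The work per query grows with the number of users who share an item with the target user, not with the total number of users. Whether that is fast enough in practice depends on how many users hold the popular items, so it would need measuring on real data.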
