I would also be willing to provide guidance and advice for anyone taking this on. I can especially help with the offline analysis part.
--sebastian

2013/7/19 Ted Dunning <[email protected]>

> I would be happy to supervise a project to implement a demo of this if
> anybody is willing to do the grunt work of gluing things together.
>
> Sooo, if you would like to work on this, here is a suggested project.
>
> This project would entail:
>
> a) build a synthetic data source
>
> b) write scripts to do the off-line analysis
>
> c) write scripts to export to Solr
>
> d) write a very quick web facade over Solr to make it look like a
> recommendation engine. This would include
>
> d.1) a "most popular page" that does combined popularity rise and
> recommendation
>
> d.2) a "personal recommendation page" that does just recommendation with
> dithering
>
> d.3) item pages with "related items" at the bottom
>
> e) work with others to provide high quality system walk-through and install
> directions
>
> If you want to bite on this, we should arrange a weekly video hangout. I
> am willing to commit to guiding and providing detailed technical
> approaches. You should be willing to commit to actually doing stuff.
>
> The goal would be to provide a fully worked out scaffolding of a practical
> recommendation system that presumably would become an example module in
> Mahout.
>
>
> On Fri, Jul 19, 2013 at 1:08 PM, B Lyon <[email protected]> wrote:
>
> > +1 as well. Sounds fun.
> >
> > On Fri, Jul 19, 2013 at 4:06 PM, Dominik Hübner <[email protected]> wrote:
> >
> > > +1 for getting something like that in a future release of Mahout
> > >
> > > On Jul 19, 2013, at 10:02 PM, Sebastian Schelter <[email protected]> wrote:
> > >
> > > > It would be awesome if we could get a nice, easily deployable
> > > > implementation of that approach into Mahout before 1.0
> > > >
> > > > 2013/7/19 Ted Dunning <[email protected]>
> > > >
> > > >> My current advice is to use Hadoop (if necessary) to build a sparse
> > > >> item-item matrix based on each kind of behavior you have and then drop
> > > >> those similarities into a search engine to deliver the actual
> > > >> recommendations. This allows lots of flexibility in terms of which kinds
> > > >> of inputs you use for the recommendation and lets you blend recommendations
> > > >> with search and geo-location.
> > > >>
> > > >> On Fri, Jul 19, 2013 at 12:33 PM, Helder Martins <[email protected]> wrote:
> > > >>
> > > >>> Hi,
> > > >>> I'm a dev working for a web portal in Brazil and I'm particularly
> > > >>> interested in building an item-based collaborative filtering recommender
> > > >>> for our database of news articles.
> > > >>> After some coding, I was able to get some recommendations using a
> > > >>> GenericItemBasedRecommender, a CassandraDataModel and some custom
> > > >>> classes that store item similarities and migrated item IDs into
> > > >>> Cassandra. But now I'm unsure what is normally done with this
> > > >>> recommender: Should I run this as a daemon, cache the recommendations
> > > >>> in memory and set up a web service to consult it online? Should I
> > > >>> pre-process these recommendations for each recent user and store them
> > > >>> somewhere?
> > > >>> My first idea was storing all these recs back into Cassandra,
> > > >>> but looking into some classes it seems to me that the norm is to read
> > > >>> the input data and store the output always using files. Is this a common
> > > >>> practice that benefits from HDFS?
> > > >>> My use case here is something around 70k recommendation requests per
> > > >>> second.
> > > >>>
> > > >>> Thanks in advance,
> > > >>>
> > > >>> --
> > > >>>
> > > >>> Best regards,
> > > >>> Helder Martins
> > > >>> Portal Architecture and Backend Systems
> > > >>> +55 (51) 3284-4475
> > > >>> Terra
> >
> > --
> > BF Lyon
> > http://www.nowherenearithaca.com
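
For anyone picking this up, the advice quoted above (build a sparse item-item matrix from behavior, then drop the similarities into a search engine) could be prototyped in memory along these lines. This is only a rough sketch under stated assumptions: the thread does not say which similarity measure to use, so plain co-occurrence counts with a cutoff stand in for it here, and the function names, the `min_count` and `top_k` parameters, and the `indicators` field name are all made up for illustration. A real run at scale would presumably use Hadoop, as suggested in the thread.

```python
from collections import defaultdict
from itertools import combinations


def cooccurrence_indicators(interactions, min_count=2, top_k=10):
    """Build a sparse item-item co-occurrence map from (user, item) pairs.

    Returns {item: [related_item, ...]}, keeping at most top_k related items
    that co-occur with the item in at least min_count user histories.
    Raw co-occurrence counting is an assumption, not the thread's method.
    """
    # Group the distinct items each user interacted with.
    items_by_user = defaultdict(set)
    for user, item in interactions:
        items_by_user[user].add(item)

    # Count how many users interacted with both items of each pair.
    counts = defaultdict(lambda: defaultdict(int))
    for items in items_by_user.values():
        for a, b in combinations(sorted(items), 2):
            counts[a][b] += 1
            counts[b][a] += 1

    # Keep only the strong pairs so the matrix stays sparse.
    indicators = {}
    for item, related in counts.items():
        strong = [(other, n) for other, n in related.items() if n >= min_count]
        strong.sort(key=lambda pair: (-pair[1], pair[0]))
        if strong:
            indicators[item] = [other for other, _ in strong[:top_k]]
    return indicators


def to_search_documents(indicators):
    """Turn each item's indicator list into a document a search engine could index.

    The "indicators" field name is hypothetical; item IDs must be strings.
    """
    return [
        {"id": item, "indicators": " ".join(related)}
        for item, related in sorted(indicators.items())
    ]
```

The idea would then be to index one such document per item and, at request time, query the indicator field with the IDs of the items a user recently touched, so the search engine's ranking does the recommending.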

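
Item d.2 in the suggested project mentions dithering without spelling it out. One way it is sometimes done is to perturb a ranked list with noise on the log of each rank, so the top results mostly stay put while deeper results occasionally surface. The sketch below assumes that formulation; the `dither` function, the `epsilon` parameter and the Gaussian noise model are illustrative choices, not something stated in the thread.

```python
import math
import random


def dither(ranked_items, epsilon=1.5, rng=None):
    """Reorder a ranked list by adding Gaussian noise to the log of each rank.

    epsilon >= 1 controls the amount of mixing: epsilon == 1.0 gives a noise
    standard deviation of 0 and leaves the order unchanged. This noise model
    is an assumption made for illustration.
    """
    if epsilon < 1.0:
        raise ValueError("epsilon must be >= 1.0")
    rng = rng or random.Random()
    sigma = math.log(epsilon)

    # Score each item by log(rank) plus noise; the original rank breaks ties,
    # so the items themselves are never compared.
    scored = [
        (math.log(rank) + rng.gauss(0.0, sigma), rank, item)
        for rank, item in enumerate(ranked_items, start=1)
    ]
    scored.sort()
    return [item for _, _, item in scored]
```

Calling `dither(results, epsilon=2.0)` on each page load would give a slightly different ordering every time, and passing a seeded `random.Random` makes the reordering repeatable for testing.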