I am very interested in collaborating on the off-line to Solr part. Just let me know how we want to get going.
Thanks,
Andrew

On 7/19/13 4:45 PM, "Ted Dunning" <[email protected]> wrote:

>OK. I think the crux here is the off-line to Solr part so let's see who
>else pops up.
>
>Having a Solr maven could be very helpful.
>
>
>On Fri, Jul 19, 2013 at 3:39 PM, Luis Carlos Guerrero Covo
><[email protected]> wrote:
>
>> I'm currently working for a portal that has a similar use case and I was
>> thinking of implementing this in a similar way. I'm generating
>> recommendations using Python scripts based on similarity measures
>> (content-based recommendation), using only Euclidean distance and some
>> weights for each attribute. I want to use Mahout's
>> GenericItemBasedRecommender to generate these same recommendations
>> without user data (no tracking right now of user-to-item relationships).
>> I was thinking of pushing the generated recommendations to Solr using
>> atomic updates, since my fields are all stored right now. Since this is
>> very similar to what I'm trying to accomplish, I would sign up to
>> collaborate in any way I can, since I'm fairly familiar with Solr and
>> I'm starting to learn my way around Mahout.
>>
>>
>> On Fri, Jul 19, 2013 at 5:12 PM, Sebastian Schelter <[email protected]>
>> wrote:
>>
>> > I would also be willing to provide guidance and advice for anyone
>> > taking this on. I can especially help with the offline analysis part.
>> >
>> > --sebastian
>> >
>> >
>> > 2013/7/19 Ted Dunning <[email protected]>
>> >
>> > > I would be happy to supervise a project to implement a demo of this
>> > > if anybody is willing to do the grunt work of gluing things together.
>> > >
>> > > Sooo, if you would like to work on this, here is a suggested project.
>> > >
>> > > This project would entail:
>> > >
>> > > a) build a synthetic data source
>> > >
>> > > b) write scripts to do the off-line analysis
>> > >
>> > > c) write scripts to export to Solr
>> > >
>> > > d) write a very quick web facade over Solr to make it look like a
>> > > recommendation engine. This would include
>> > >
>> > > d.1) a "most popular page" that does combined popularity rise and
>> > > recommendation
>> > >
>> > > d.2) a "personal recommendation page" that does just recommendation
>> > > with dithering
>> > >
>> > > d.3) item pages with "related items" at the bottom
>> > >
>> > > e) work with others to provide high quality system walk-through and
>> > > install directions
>> > >
>> > > If you want to bite on this, we should arrange a weekly video
>> > > hangout. I am willing to commit to guiding and providing detailed
>> > > technical approaches. You should be willing to commit to actually
>> > > doing stuff.
>> > >
>> > > The goal would be to provide a fully worked out scaffolding of a
>> > > practical recommendation system that presumably would become an
>> > > example module in Mahout.
>> > >
>> > >
>> > > On Fri, Jul 19, 2013 at 1:08 PM, B Lyon <[email protected]> wrote:
>> > >
>> > > > +1 as well. Sounds fun.
>> > > >
>> > > > On Fri, Jul 19, 2013 at 4:06 PM, Dominik Hübner <[email protected]>
>> > > > wrote:
>> > > >
>> > > > > +1 for getting something like that in a future release of Mahout
>> > > > >
>> > > > > On Jul 19, 2013, at 10:02 PM, Sebastian Schelter <[email protected]>
>> > > > > wrote:
>> > > > >
>> > > > > > It would be awesome if we could get a nice, easily deployable
>> > > > > > implementation of that approach into Mahout before 1.0
>> > > > > >
>> > > > > >
>> > > > > > 2013/7/19 Ted Dunning <[email protected]>
>> > > > > >
>> > > > > >> My current advice is to use Hadoop (if necessary) to build a
>> > > > > >> sparse item-item matrix based on each kind of behavior you
>> > > > > >> have and then drop those similarities into a search engine to
>> > > > > >> deliver the actual recommendations. This allows lots of
>> > > > > >> flexibility in terms of which kinds of inputs you use for the
>> > > > > >> recommendation and lets you blend recommendations with search
>> > > > > >> and geo-location.
>> > > > > >>
>> > > > > >>
>> > > > > >> On Fri, Jul 19, 2013 at 12:33 PM, Helder Martins
>> > > > > >> <[email protected]> wrote:
>> > > > > >>
>> > > > > >>> Hi,
>> > > > > >>> I'm a dev working for a web portal in Brazil and I'm
>> > > > > >>> particularly interested in building an item-based
>> > > > > >>> collaborative filtering recommender for our database of news
>> > > > > >>> articles.
>> > > > > >>> After some coding, I was able to get some recommendations
>> > > > > >>> using a GenericItemBasedRecommender, a CassandraDataModel and
>> > > > > >>> some custom classes that store item similarities and migrated
>> > > > > >>> item IDs into Cassandra. But now I'm in doubt about what is
>> > > > > >>> normally done with this recommender: Should I run this as a
>> > > > > >>> daemon, cache the recommendations in memory and set up a web
>> > > > > >>> service to consult it online? Should I pre-process these
>> > > > > >>> recommendations for each recent user and store them
>> > > > > >>> somewhere? My first idea was storing all these recs back into
>> > > > > >>> Cassandra, but looking into some classes it seems to me that
>> > > > > >>> the norm is to read the input data and store the output
>> > > > > >>> always using files. Is this a common practice that benefits
>> > > > > >>> from HDFS?
>> > > > > >>> My use case here is something around 70k recommendation
>> > > > > >>> requests per second.
>> > > > > >>>
>> > > > > >>> Thanks in advance,
>> > > > > >>>
>> > > > > >>> --
>> > > > > >>>
>> > > > > >>> Regards,
>> > > > > >>> Helder Martins
>> > > > > >>> Portal Architecture and Backend Systems
>> > > > > >>> +55 (51) 3284-4475
>> > > > > >>> Terra
>> > > > > >>>
>> > > > > >>
>> > > > >
>> > > >
>> > > > --
>> > > > BF Lyon
>> > > > http://www.nowherenearithaca.com
>> > > >
>> > >
>> >
>>
>> --
>> Luis Carlos Guerrero Covo
>> M.S. Computer Engineering
>> (57) 3183542047
>>
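
For anyone picking up the off-line to Solr part, here is a minimal sketch of steps (a) through (c) of the suggested project in Python. It is only an illustration under stated assumptions, not code from Mahout or from anyone on this thread: the user histories are made up, raw co-occurrence counts stand in for whatever similarity measure the off-line analysis would really use, and the Solr field name `related_items` is hypothetical. The only Solr-specific part is the shape of the documents, which follows Solr's JSON atomic-update syntax (`{"field": {"set": value}}`).

```python
import json
from collections import defaultdict
from itertools import combinations


def cooccurrence(histories):
    """Count how often each pair of items appears in the same user's history."""
    counts = defaultdict(lambda: defaultdict(int))
    for items in histories.values():
        # De-duplicate and sort so each unordered pair is visited once per user.
        for a, b in combinations(sorted(set(items)), 2):
            counts[a][b] += 1
            counts[b][a] += 1
    return counts


def top_related(counts, k=2):
    """Keep only the k strongest neighbours per item, so the matrix stays sparse."""
    return {
        item: [other for other, _ in
               sorted(row.items(), key=lambda kv: (-kv[1], kv[0]))[:k]]
        for item, row in counts.items()
    }


def solr_atomic_updates(related, field="related_items"):
    """Build one atomic-update document per item; `field` is a hypothetical name."""
    return [{"id": item, field: {"set": neighbours}}
            for item, neighbours in sorted(related.items())]


# Synthetic data source: user id -> items the user interacted with (made up).
histories = {
    "u1": ["a", "b", "c"],
    "u2": ["a", "b"],
    "u3": ["b", "c"],
    "u4": ["a", "d"],
}

docs = solr_atomic_updates(top_related(cooccurrence(histories)))
payload = json.dumps(docs)  # JSON body that would be sent to Solr's update handler
print(payload)
```

The sketch stops at building the payload; actually sending it to a Solr update handler, and querying the `related_items` field with a user's recent items to get recommendations back, is left out because the collection name and schema depend on the deployment.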
