Hi Sebastian,

The use case is as follows: I have a catalog containing around 100,000 items. I want to set up a web application providing recommendations to users.

When a user subscribes, I ask them to rate a random subset of the catalog, for example 20 random items. I store the userId, itemId and score in a file (or db). Once the user has finished rating the 20 items, I was planning to suggest a list of recommended items. I would like the recommendations to be computed with minimal delay (so that the user does not wait too long after the initial subscription before a recommendation is displayed).

I'm looking for the simplest solution (no application server, no weblayers such as kornakapi/myrrix, and if possible no database). Writing Java is not a real problem, but I'd really like to have the simplest thing possible, with no dependencies.

I've not been able to use the examples to:
- issue recommendations for one specific userId
- return a recommendation with a latency < 6 minutes on a laptop VM.

Hope this is clearer now, thanks again for your time.

Sekine

2013/1/17 Sebastian Schelter <[email protected]>

> Hi Sekine,
>
> I'm not sure I understand your problem correctly. What exactly is your
> use case, how many users and items do you have?
>
> The mahout commandline tools only offer Hadoop-based recommenders that
> are designed to recommend in batch for millions of users and will
> usually take minutes to hours to run.
>
> Mahout also offers a Java framework that allows flexible, online
> recommendation for single users. For people who don't want to dive into
> the framework, there are simple, easy to use weblayers available like
> kornakapi [1] or myrrix [2].
>
> Did you look at those? You don't need to write a single line of Java
> code for using them and both offer a very convenient way to use an
> ALS-based recommender. Also they are shipped with an easy to use
> webservice that should be callable from PHP with minimal effort.
> Furthermore they should respond to requests concerning single user
> recommendations in a few milliseconds.
>
> Best,
> Sebastian
>
> [1] https://github.com/plista/kornakapi
> [2] http://myrrix.com/
>
> On 17.01.2013 17:55, Sékine Coulibaly wrote:
> > Sebastian,
> >
> > This sounds reasonable. However, I observe that running the
> > factorize-movielens script computes recommendations for *all* users.
> > Is there a way to compute the recommendation for only one user?
> >
> > The recommenditembased recommender allows for using an external file
> > containing the user id, however that algorithm is so slow compared to
> > the factorize script (6 minutes to run, compared to 6 minutes but for
> > thousands of recommendations). But I didn't find such an option in
> > the factorize script (besides, it seems that some of the ALS results
> > are precomputed and cached, so that the recommendation job is
> > quicker).
> >
> > Thank you!
> >
> > 2013/1/17 Sékine Coulibaly <[email protected]>
> >
> >> Sebastian,
> >>
> >> This sounds reasonable. However, I observe that running the
> >> factorize-movielens script computes recommendations for *all* users.
> >> Is there a way to compute the recommendation for only one user?
> >>
> >> The recommenditembased recommender allows for using an external file
> >> containing the user id, however that algorithm is so slow compared
> >> to the factorize script (6 minutes to run, compared to 6 minutes but
> >> for thousands of recommendations).
> >>
> >> Thank you!
> >>
> >> 2013/1/14 Sebastian Schelter <[email protected]>
> >>
> >>> Then I would suggest that you modify the shell script to
> >>> periodically precompute the recommendations and put them into a
> >>> database afterwards, which you can query via PHP.
> >>>
> >>> It makes no sense IMO to call a webservice that starts a Hadoop job
> >>> and wait for the results.
> >>>
> >>> /s
> >>>
> >>> On 14.01.2013 10:12, Sékine Coulibaly wrote:
> >>>> Ibrahim, Sebastian,
> >>>>
> >>>> I am precisely trying to create a PHP webservice to deliver
> >>>> recommendations.
> >>>>
> >>>> On a webpage, I would call that webservice, and I was imagining
> >>>> having that webservice call the factorize-movielens script itself
> >>>> and transform the latter's output into something like
> >>>>
> >>>> [{itemID:557,value:5.988698},{itemID:578,value:5.0461025},{itemID:1149,value:4.9268165},{itemID:572,value:4.9265957},{itemID:3245,value:4.8139095}],
> >>>>
> >>>> a JSON I could easily parse in my front-end.
> >>>>
> >>>> I don't want (if possible) to involve any Java application or http
> >>>> server as suggested (kornakapi, myrrix), although I understand
> >>>> these would be a sensible way to go (I'm interested in learning
> >>>> Mahout, so hiding that part is something I'd like to avoid).
> >>>>
> >>>> Regards
> >>>>
> >>>> 2013/1/14 Sebastian Schelter <[email protected]>
> >>>>
> >>>>> This blog post might be useful for you:
> >>>>>
> >>>>> http://ssc.io/a-recommendation-webservice-in-10-minutes/
> >>>>>
> >>>>> On 14.01.2013 09:31, Sékine Coulibaly wrote:
> >>>>>> Hi Ibrahim,
> >>>>>>
> >>>>>> Actually, for now, I would like to use it locally, in other
> >>>>>> words without using the Hadoop framework. I've succeeded in
> >>>>>> launching:
> >>>>>> factorize-movielens-1M.sh ratings.dat
> >>>>>>
> >>>>>> I would like to launch that very same command from PHP. The
> >>>>>> Apache user is indeed www-data. The /tmp/mahout-work-www-data
> >>>>>> directory is created but only contains the ratings.csv file.
> >>>>>>
> >>>>>> Regards
> >>>>>>
> >>>>>> 2013/1/14 Ibrahim Yakti <[email protected]>
> >>>>>>
> >>>>>>> Your PHP scripts run as the Apache user, which most probably
> >>>>>>> doesn't have the HADOOP_HOME, HADOOP_CONF_DIR, etc. variables
> >>>>>>> defined. Please try to define them in the PHP script before
> >>>>>>> making the call.
> >>>>>>>
> >>>>>>> I hope it works.
> >>>>>>>
> >>>>>>> --
> >>>>>>> Ibrahim
> >>>>>>>
> >>>>>>> On Sun, Jan 13, 2013 at 11:38 PM, Sékine Coulibaly <[email protected]> wrote:
> >>>>>>>
> >>>>>>>> Hi there,
> >>>>>>>>
> >>>>>>>> I've been able to run the factorize-movielens script locally.
> >>>>>>>> What I'd like to do is basically create a PHP webservice able
> >>>>>>>> to start that very same script and return the recommendations.
> >>>>>>>>
> >>>>>>>> I'm using Apache2, and I use PHP's shell_exec to start the
> >>>>>>>> script as follows:
> >>>>>>>>
> >>>>>>>> putenv("JAVA_HOME=" .'/usr/local/jvm/jdk1.7.0_05');
> >>>>>>>> $output = shell_exec('/home/scoulibaly/Téléchargements/mahout-distribution-0.6/examples/bin/factorize-movielens-1M.sh /home/scoulibaly/Téléchargements/mahout-distribution-0.6/examples/bin/ratings.dat');
> >>>>>>>> echo $output;
> >>>>>>>>
> >>>>>>>> Unfortunately the output I get is as follows:
> >>>>>>>>
> >>>>>>>> creating work directory at /tmp/mahout-work-www-data
> >>>>>>>>
> >>>>>>>> Converting ratings...
> >>>>>>>>
> >>>>>>>> MAHOUT_LOCAL is not set; adding HADOOP_CONF_DIR to classpath.
> >>>>>>>> no HADOOP_HOME set, running locally
> >>>>>>>> MAHOUT_LOCAL is not set; adding HADOOP_CONF_DIR to classpath.
> >>>>>>>> no HADOOP_HOME set, running locally
> >>>>>>>> MAHOUT_LOCAL is not set; adding HADOOP_CONF_DIR to classpath.
> >>>>>>>> no HADOOP_HOME set, running locally
> >>>>>>>> MAHOUT_LOCAL is not set; adding HADOOP_CONF_DIR to classpath.
> >>>>>>>> no HADOOP_HOME set, running locally
> >>>>>>>>
> >>>>>>>> RMSE is:
> >>>>>>>>
> >>>>>>>> Sample recommendations:
> >>>>>>>>
> >>>>>>>> removing work directory
> >>>>>>>>
> >>>>>>>> I know this is not strictly a Mahout issue, but if someone could
> >>>>>>>> point me to a way to start Mahout jobs from a PHP script, I'd be
> >>>>>>>> very grateful!
> >>>>>>>>
> >>>>>>>> Thank you
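
For the single-user, dependency-free case described in the newest message at the top of this thread, one possible direction is to keep the ratings in memory and score unrated items for just the requested userId. The following is only a minimal sketch in plain Java under stated assumptions: it is not Mahout's API and not the ALS factorization used by the factorize-movielens script, but a simple item-based scheme with cosine similarity swapped in for illustration. The names SingleUserRecommender, recommend and cosine are hypothetical, ratings are assumed to be positive numbers, and all ratings are assumed to fit in memory.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration only; not part of Mahout.
public class SingleUserRecommender {

    /** Cosine similarity between two sparse item vectors (userId -> rating). */
    static double cosine(Map<Long, Double> a, Map<Long, Double> b) {
        double dot = 0.0, normA = 0.0, normB = 0.0;
        for (Map.Entry<Long, Double> e : a.entrySet()) {
            normA += e.getValue() * e.getValue();
            Double other = b.get(e.getKey());
            if (other != null) {
                dot += e.getValue() * other;
            }
        }
        for (double v : b.values()) {
            normB += v * v;
        }
        return (normA == 0.0 || normB == 0.0) ? 0.0 : dot / Math.sqrt(normA * normB);
    }

    /**
     * Returns up to howMany itemIds that userId has not rated, best first.
     * ratingsByUser maps userId -> (itemId -> rating).
     */
    static List<Long> recommend(Map<Long, Map<Long, Double>> ratingsByUser,
                                long userId, int howMany) {
        // Invert the data into itemId -> (userId -> rating).
        Map<Long, Map<Long, Double>> byItem = new HashMap<>();
        for (Map.Entry<Long, Map<Long, Double>> u : ratingsByUser.entrySet()) {
            for (Map.Entry<Long, Double> r : u.getValue().entrySet()) {
                byItem.computeIfAbsent(r.getKey(), k -> new HashMap<>())
                      .put(u.getKey(), r.getValue());
            }
        }

        Map<Long, Double> mine = ratingsByUser.getOrDefault(userId, new HashMap<>());
        Map<Long, Double> scores = new HashMap<>();
        for (Map.Entry<Long, Map<Long, Double>> candidate : byItem.entrySet()) {
            if (mine.containsKey(candidate.getKey())) {
                continue; // skip items the user already rated
            }
            // Predicted score: similarity-weighted average of the user's own ratings.
            double weighted = 0.0, totalSim = 0.0;
            for (Map.Entry<Long, Double> rated : mine.entrySet()) {
                double sim = cosine(candidate.getValue(), byItem.get(rated.getKey()));
                weighted += sim * rated.getValue();
                totalSim += sim;
            }
            if (totalSim > 0.0) {
                scores.put(candidate.getKey(), weighted / totalSim);
            }
        }

        List<Long> ranked = new ArrayList<>(scores.keySet());
        ranked.sort(Comparator.comparingDouble((Long id) -> scores.get(id)).reversed());
        return new ArrayList<>(ranked.subList(0, Math.min(howMany, ranked.size())));
    }
}
```

Calling recommend(ratings, userId, 5) right after a new user has submitted their 20 ratings only compares each candidate item against those 20 items, so the cost grows with the catalog size rather than with a full batch job over all users. Whether this is fast or accurate enough for a 100,000-item catalog is untested here.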
