Space: Apache Mahout (https://cwiki.apache.org/confluence/display/MAHOUT)
Page: RecommendationExamples
(https://cwiki.apache.org/confluence/display/MAHOUT/RecommendationExamples)
Edited by Sean Owen:
---------------------------------------------------------------------
h1. Introduction
This quick start page describes how to run the recommendation examples provided
by Mahout. Mahout comes with four recommendation mining examples, based on the
Netflix, Jester, GroupLens and BookCrossing data sets respectively.
h1. Steps
h2. Testing on a single machine without a cluster
In the examples directory, type one of the following:
{code}
mvn -q exec:java \
  -Dexec.mainClass="org.apache.mahout.cf.taste.example.bookcrossing.BookCrossingRecommenderEvaluatorRunner" \
  -Dexec.args="<OPTIONS>"

mvn -q exec:java \
  -Dexec.mainClass="org.apache.mahout.cf.taste.example.netflix.NetflixRecommenderEvaluatorRunner" \
  -Dexec.args="<OPTIONS>"

mvn -q exec:java \
  -Dexec.mainClass="org.apache.mahout.cf.taste.example.netflix.TransposeToByUser" \
  -Dexec.args="<OPTIONS>"

mvn -q exec:java \
  -Dexec.mainClass="org.apache.mahout.cf.taste.example.jester.JesterRecommenderEvaluatorRunner" \
  -Dexec.args="<OPTIONS>"

mvn -q exec:java \
  -Dexec.mainClass="org.apache.mahout.cf.taste.example.grouplens.GroupLensRecommenderEvaluatorRunner" \
  -Dexec.args="<OPTIONS>"
{code}
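The commands above differ only in the main class, so a small wrapper can pick the class by data set name. The following is a minimal sketch, not part of Mahout: the function names {{runner_class}} and {{run_example}} and the {{DRY_RUN}} switch are hypothetical, and by default the sketch only prints the Maven command instead of running it.

```shell
#!/bin/sh
# Sketch only: runner_class, run_example and DRY_RUN are hypothetical helpers,
# not part of Mahout. The class names are the ones listed on this page.

PKG="org.apache.mahout.cf.taste.example"

# Map a short data set name to the evaluator runner class.
runner_class() {
  case "$1" in
    bookcrossing) echo "$PKG.bookcrossing.BookCrossingRecommenderEvaluatorRunner" ;;
    netflix)      echo "$PKG.netflix.NetflixRecommenderEvaluatorRunner" ;;
    jester)       echo "$PKG.jester.JesterRecommenderEvaluatorRunner" ;;
    grouplens)    echo "$PKG.grouplens.GroupLensRecommenderEvaluatorRunner" ;;
    *)            echo "unknown example: $1" >&2; return 1 ;;
  esac
}

# run_example <name> [options...]
# Prints the Maven command when DRY_RUN is 1 (the default), runs it otherwise.
run_example() {
  name="$1"; shift
  class=$(runner_class "$name") || return 1
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "mvn -q exec:java -Dexec.mainClass=\"$class\" -Dexec.args=\"$*\""
  else
    mvn -q exec:java -Dexec.mainClass="$class" -Dexec.args="$*"
  fi
}
```

For instance, {{run_example netflix -i <PATH TO DATA>}} prints the full command for the Netflix evaluator; setting {{DRY_RUN=0}} and calling it from the examples directory would execute it.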
Note that the GroupLens example is designed for the "1 million" data set,
available at http://www.grouplens.org/node/73 . This file has an unusual format
and so has a special parser. The example code here can be easily modified to
use a regular FileDataModel and thus work on more standard input, including the
other data sets available at this site.
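One way to get "more standard input" is to convert the ratings file before loading it. The sketch below assumes that each line of the "1 million" ratings file has the form {{UserID::MovieID::Rating::Timestamp}} and that a regular FileDataModel reads comma-delimited {{userID,itemID,preference}} lines; the function name {{grouplens_to_csv}} is hypothetical, and the timestamp column is simply dropped.

```shell
#!/bin/sh
# Sketch only: grouplens_to_csv is a hypothetical helper, not part of Mahout.
# Assumed input line:  UserID::MovieID::Rating::Timestamp
# Emitted output line: userID,itemID,preference   (timestamp dropped)

# Reads "::"-delimited ratings on stdin and writes comma-delimited lines to
# stdout. Lines with fewer than three fields are skipped.
grouplens_to_csv() {
  awk -F'::' 'NF >= 3 { print $1 "," $2 "," $3 }'
}
```

A typical call would be {{grouplens_to_csv < ratings.dat > ratings.csv}}, where both file names are placeholders for your own paths.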
h2. Running it on the cluster
* In $MAHOUT_HOME/, build the jar containing the job (mvn install). The job will
be generated in $MAHOUT_HOME/core/target/ and its name will contain the Mahout
version number. For example, when using the Mahout 0.1 release, the job will be
mahout-core-0.1.job
* (Optional) Start up Hadoop: $HADOOP_HOME/bin/start-all.sh
* Put the data: $HADOOP_HOME/bin/hadoop fs -put <PATH TO DATA> testdata
* Run the Job: $HADOOP_HOME/bin/hadoop jar
$MAHOUT_HOME/core/target/mahout-core-<MAHOUT VERSION>.job
org.apache.mahout.cf.taste.example.<JOB> <OPTIONS>
* Get the data out of HDFS and have a look. Use bin/hadoop fs -lsr output to
view all outputs.
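The put, run and list steps above can be strung together in a small script. This is a sketch under stated assumptions: the job file has already been built, {{HADOOP_HOME}} and {{MAHOUT_HOME}} are set, and Hadoop is running. The helper names {{run}} and {{cluster_run}} and the {{DRY_RUN}} switch are hypothetical; by default the commands are only printed.

```shell
#!/bin/sh
# Sketch only: run, cluster_run and DRY_RUN are hypothetical helpers.
# Assumes HADOOP_HOME and MAHOUT_HOME are set and the job file is built.

# Print the command when DRY_RUN is 1 (the default), execute it otherwise.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# cluster_run <mahout-version> <local-data-path> <job> [options...]
# <job> is the class name relative to org.apache.mahout.cf.taste.example.
cluster_run() {
  version="$1"; data="$2"; job="$3"; shift 3
  hadoop="$HADOOP_HOME/bin/hadoop"

  # Copy the input data into HDFS under "testdata".
  run "$hadoop" fs -put "$data" testdata &&
  # Run the job from the generated job file.
  run "$hadoop" jar "$MAHOUT_HOME/core/target/mahout-core-$version.job" \
      "org.apache.mahout.cf.taste.example.$job" "$@" &&
  # List everything the job wrote to "output".
  run "$hadoop" fs -lsr output
}
```

Calling {{cluster_run "<MAHOUT VERSION>" "<PATH TO DATA>" "<JOB>" <OPTIONS>}} with the default {{DRY_RUN}} prints the three commands in order, which makes it easy to check the paths before running them with {{DRY_RUN=0}}.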
h1. Command line options
{code}
Usage: <JOB>
  --input (-i) input    The path for input preferences. This argument is
                        optional except for the netflix example.
  --help (-h)           Print out help
{code}