Kamal, factorize-movielens-1M.sh is a small example that invokes Hadoop locally to showcase how ALS factorizes a small dataset sitting in the local filesystem.

parallelALS runs on a cluster; that is its main purpose. Just invoke it via "mahout parallelALS", or directly via "hadoop jar ...", like any other Hadoop job. Be aware that it's slow, as it needs a lot of iterations, which impose a lot of overhead on Hadoop. If your data is small, you can also use an SVDRecommender with an ALSWRFactorizer to factorize your data in-memory on a single machine.

/s

On 19.01.2013 01:35, Kamal Ali wrote:
> thanks sebastian, the key was this line:
> export MAHOUT_LOCAL=true
>
> upon your advice, i just set that in my bash and everything worked. THANKS!
> the main thing in factorize*sh is invocation of parallelALS which i had hoped
> would run over mahout when not running just in local.
>
> is it well known that parallelALS doesnt run over a distributed hadoop cluster
> or am i misunderstanding what MAHOUT_LOCAL=true does ?
> [its name "parallel" made me think it ran on a hadoop cluster]
> if you know of a way of making parallelALS run on a true cluster, if you
> could send me the link(s), i would really appreciate it.
> thanks,
> kamal.
>
> http://svn.apache.org/repos/asf/mahout/trunk/bin/mahout
> says:
>
> # MAHOUT_LOCAL set to anything other than an empty string to force
> # mahout to run locally even if
> # HADOOP_CONF_DIR and HADOOP_HOME are set
>
> On Fri, Jan 18, 2013 at 3:43 PM, Sebastian Schelter <[email protected]> wrote:
>
>> The example should work, I tested it yesterday. The simplest way to
>> execute it is to first build mahout using
>>
>> $ mvn -DskipTests clean install
>>
>> Then download the movielens1M dataset from
>> http://www.grouplens.org/node/73 and unzip it.
>>
>> After that, go to examples/bin and point the script to the ratings.dat
>> file found in the movielens dataset.
>>
>> $ export MAHOUT_LOCAL=true
>> $ bash factorize-movielens-1M.sh /path/to/ratings.dat
>>
>> Best,
>> Sebastian
>>
>> On 19.01.2013 00:20, Kamal Ali wrote:
>>> I'm a newbie trying to get some mahout commandline examples to work.
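The local-vs-cluster decision happens in bin/mahout, keyed on the MAHOUT_LOCAL environment variable. Here is a simplified sketch of that check (the real script also consults HADOOP_CONF_DIR and HADOOP_HOME to locate the hadoop binary, and the function name mahout_mode is made up for illustration; the exact wording of the local-mode message may differ):

```shell
# Simplified sketch of the mode check in bin/mahout: any non-empty
# value of MAHOUT_LOCAL forces local execution, even when
# HADOOP_CONF_DIR and HADOOP_HOME point at a running cluster.
# (mahout_mode is a made-up name; bin/mahout does this inline.)
mahout_mode() {
  if [ -n "$1" ]; then
    # hypothetical wording for the local branch
    echo "MAHOUT_LOCAL is set, running locally"
  else
    # this message matches the transcript below
    echo "MAHOUT_LOCAL is not set; adding HADOOP_CONF_DIR to classpath."
  fi
}

mahout_mode "$MAHOUT_LOCAL"
```

So running `export MAHOUT_LOCAL=true` before invoking the example script keeps everything on the local filesystem, which is what the example expects.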
>>>
>>> I tried executing factorize-movielens-1M.sh but get an error "input path
>>> does not exist: /tmp/mahout-work-kali/movielens/ratings.csv"
>>> even after i manually created /tmp/mahout-work-ali/ and all its descendant
>>> directories and chmod'd them to 777.
>>>
>>> even after i modified factorize-movielens-1M.sh to do a "ls -l" on the
>>> ratings.csv which show /tmp/mahout-work-kali/movielens/ratings.csv
>>> exists.
>>>
>>> [the input file u1.base already has "::" instead of \t as delimiters.]
>>>
>>> i'm wondering if the error is something else and is being mis-reported and
>>> some intermediate script/program is just getting a non-zero
>>> return status and falling back on a stock error message.
>>>
>>> i am on 64bit mac, jdk1.7. my ssh keys were generated using user "kali".
>>>
>>> has anyone had success running factorize-movielens-1M.sh ?
>>>
>>> does this factorize*sh only run in mahout local mode ?
>>>
>>> is factorize-movielens-1M.sh cruddy and old and some other way
>>> should be used??
>>>
>>> i'm primarily interested in getting ALS methods to work,
>>> if someone knows where in the mahout distribution one can find the
>>> latest or most tested ALS implementation (and the maven command to run it)
>>> pls let me know .
>>>
>>> THANK YOU!
>>> kamal.
>>>
>>> my hadoop-env.sh is at the end of this email.
>>> ================================================
>>> ./factorize-movielens-1M.sh $grouplens/ml-100k/u1.base  # grouplens
>>> points to a directory containing the file u1.base
>>> creating work directory at /tmp/mahout-work-kali
>>> kamal: doing ls -l on movie lens dir:
>>> total 1544
>>> drwxrwxrwx 3 kali wheel    102 Jan 18 12:20 dataset
>>> -rwxrwxrwx 1 kali wheel 786544 Jan 18 13:46 ratings.csv
>>> kamal: doing wc -l on ratings.csv
>>> 80000 /tmp/mahout-work-kali/movielens/ratings.csv
>>> Converting ratings...
>>> after sed
>>> -rwxrwxrwx 1 kali wheel 786544 Jan 18 13:47 /tmp/mahout-work-kali/movielens/ratings.csv
>>> kamal: doing head on ratings.csv
>>> 1,1,5
>>> 1,2,3
>>> MAHOUT_LOCAL is not set; adding HADOOP_CONF_DIR to classpath.
>>> Warning: $HADOOP_HOME is deprecated.
>>>
>>> Running on hadoop, using /Users/kali/hadoop/hadoop-1.0.4/bin/hadoop and
>>> HADOOP_CONF_DIR=/Users/kali/hadoop/hadoop-1.0.4/conf
>>> MAHOUT-JOB: /users/kali/mahout/mahout0.7/examples/target/mahout-examples-0.7-job.jar
>>> Warning: $HADOOP_HOME is deprecated.
>>>
>>> 13/01/18 13:47:24 INFO common.AbstractJob: Command line arguments:
>>> {--endPhase=[2147483647], --input=[/tmp/mahout-work-kali/movielens/ratings.csv],
>>> --output=[/tmp/mahout-work-kali/dataset], --probePercentage=[0.1],
>>> --startPhase=[0], --tempDir=[/tmp/mahout-work-kali/dataset/tmp],
>>> --trainingPercentage=[0.9]}
>>> 2013-01-18 13:47:24.918 java[53562:1703] Unable to load realm info from SCDynamicStore
>>> 13/01/18 13:47:25 INFO mapred.JobClient: Cleaning up the staging area
>>> hdfs://localhost:9000/tmp/hadoop-kali/mapred/staging/kali/.staging/job_201301151900_0035
>>> 13/01/18 13:47:25 ERROR security.UserGroupInformation: PriviledgedActionException as:kali
>>> cause:org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input
>>> path does not exist: /tmp/mahout-work-kali/movielens/ratings.csv
>>> Exception in thread "main" org.apache.hadoop.mapreduce.lib.input.InvalidInputException:
>>> Input path does not exist: /tmp/mahout-work-kali/movielens/ratings.csv
>>> at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:235)
>>> at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:252)
>>> at org.apache.hadoop.mapred.JobClient.writeNewSplits(JobClient.java:962)
>>> at org.apache.hadoop.mapred.JobClient.writeSplits(JobClient.java:979)
>>> at org.apache.hadoop.mapred.JobClient.access$600(JobClient.java:174)
>>> at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:897)
>>> at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:850)
>>> at java.security.AccessController.doPrivileged(Native Method)
>>> at javax.security.auth.Subject.doAs(Subject.java:415)
>>> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
>>> at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:850)
>>> at org.apache.hadoop.mapreduce.Job.submit(Job.java:500)
>>> at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:530)
>>> at org.apache.mahout.cf.taste.hadoop.als.DatasetSplitter.run(DatasetSplitter.java:90)
>>> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>>> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
>>> at org.apache.mahout.cf.taste.hadoop.als.DatasetSplitter.main(DatasetSplitter.java:64)
>>> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>>> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>> at java.lang.reflect.Method.invoke(Method.java:601)
>>> at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
>>> at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
>>> at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:195)
>>> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>>> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>> at java.lang.reflect.Method.invoke(Method.java:601)
>>> at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
>>> after splitDataset
>>> -rwxrwxrwx 1 kali wheel 786544 Jan 18 13:47 /tmp/mahout-work-kali/movielens/ratings.csv
>>> MAHOUT_LOCAL is not set; adding HADOOP_CONF_DIR to classpath.
>>> Warning: $HADOOP_HOME is deprecated.
>>>
>>> Running on hadoop, using /Users/kali/hadoop/hadoop-1.0.4/bin/hadoop and
>>> HADOOP_CONF_DIR=/Users/kali/hadoop/hadoop-1.0.4/conf
>>> MAHOUT-JOB: /users/kali/mahout/mahout0.7/examples/target/mahout-examples-0.7-job.jar
>>> Warning: $HADOOP_HOME is deprecated.
>>>
>>> 13/01/18 13:47:31 INFO common.AbstractJob: Command line arguments:
>>> {--alpha=[40], --endPhase=[2147483647], --implicitFeedback=[false],
>>> --input=[/tmp/mahout-work-kali/dataset/trainingSet/], --lambda=[0.065],
>>> --numFeatures=[20], --numIterations=[10],
>>> --output=[/tmp/mahout-work-kali/als/out], --startPhase=[0],
>>> --tempDir=[/tmp/mahout-work-kali/als/tmp]}
>>> 2013-01-18 13:47:31.259 java[53605:1703] Unable to load realm info from SCDynamicStore
>>> 13/01/18 13:47:32 INFO mapred.JobClient: Cleaning up the staging area
>>> hdfs://localhost:9000/tmp/hadoop-kali/mapred/staging/kali/.staging/job_201301151900_0036
>>> 13/01/18 13:47:32 ERROR security.UserGroupInformation: PriviledgedActionException as:kali
>>> cause:org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input
>>> path does not exist: /tmp/mahout-work-kali/dataset/trainingSet
>>> Exception in thread "main" org.apache.hadoop.mapreduce.lib.input.InvalidInputException:
>>> Input path does not exist: /tmp/mahout-work-kali/dataset/trainingSet
>>> at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:235)
>>> at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:252)
>>> at org.apache.hadoop.mapred.JobClient.writeNewSplits(JobClient.java:962)
>>> at org.apache.hadoop.mapred.JobClient.writeSplits(JobClient.java:979)
>>> at org.apache.hadoop.mapred.JobClient.access$600(JobClient.java:174)
>>> at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:897)
>>> at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:850)
>>> at java.security.AccessController.doPrivileged(Native Method)
>>> at javax.security.auth.Subject.doAs(Subject.java:415)
>>> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
>>> at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:850)
>>> at org.apache.hadoop.mapreduce.Job.submit(Job.java:500)
>>> at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:530)
>>> at org.apache.mahout.cf.taste.hadoop.als.ParallelALSFactorizationJob.run(ParallelALSFactorizationJob.java:137)
>>> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>>> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
>>> at org.apache.mahout.cf.taste.hadoop.als.ParallelALSFactorizationJob.main(ParallelALSFactorizationJob.java:98)
>>> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>>> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>> at java.lang.reflect.Method.invoke(Method.java:601)
>>> at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
>>> at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
>>> at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:195)
>>> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>>> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>> at java.lang.reflect.Method.invoke(Method.java:601)
>>> at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
>>> MAHOUT_LOCAL is not set; adding HADOOP_CONF_DIR to classpath.
>>> Warning: $HADOOP_HOME is deprecated.
>>>
>>> Running on hadoop, using /Users/kali/hadoop/hadoop-1.0.4/bin/hadoop and
>>> HADOOP_CONF_DIR=/Users/kali/hadoop/hadoop-1.0.4/conf
>>> MAHOUT-JOB: /users/kali/mahout/mahout0.7/examples/target/mahout-examples-0.7-job.jar
>>> Warning: $HADOOP_HOME is deprecated.
>>>
>>> 13/01/18 13:47:38 INFO common.AbstractJob: Command line arguments:
>>> {--endPhase=[2147483647], --input=[/tmp/mahout-work-kali/dataset/probeSet/],
>>> --itemFeatures=[/tmp/mahout-work-kali/als/out/M/],
>>> --output=[/tmp/mahout-work-kali/als/rmse/], --startPhase=[0],
>>> --tempDir=[/tmp/mahout-work-kali/als/tmp],
>>> --userFeatures=[/tmp/mahout-work-kali/als/out/U/]}
>>> 2013-01-18 13:47:38.142 java[53645:1703] Unable to load realm info from SCDynamicStore
>>> 13/01/18 13:47:38 INFO mapred.JobClient: Cleaning up the staging area
>>> hdfs://localhost:9000/tmp/hadoop-kali/mapred/staging/kali/.staging/job_201301151900_0037
>>> 13/01/18 13:47:38 ERROR security.UserGroupInformation: PriviledgedActionException as:kali
>>> cause:org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input
>>> path does not exist: /tmp/mahout-work-kali/dataset/probeSet
>>> Exception in thread "main" org.apache.hadoop.mapreduce.lib.input.InvalidInputException:
>>> Input path does not exist: /tmp/mahout-work-kali/dataset/probeSet
>>> at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:235)
>>> at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:252)
>>> at org.apache.hadoop.mapred.JobClient.writeNewSplits(JobClient.java:962)
>>> at org.apache.hadoop.mapred.JobClient.writeSplits(JobClient.java:979)
>>> at org.apache.hadoop.mapred.JobClient.access$600(JobClient.java:174)
>>> at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:897)
>>> at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:850)
>>> at java.security.AccessController.doPrivileged(Native Method)
>>> at javax.security.auth.Subject.doAs(Subject.java:415)
>>> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
>>> at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:850)
>>> at org.apache.hadoop.mapreduce.Job.submit(Job.java:500)
>>> at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:530)
>>> at org.apache.mahout.cf.taste.hadoop.als.FactorizationEvaluator.run(FactorizationEvaluator.java:91)
>>> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>>> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
>>> at org.apache.mahout.cf.taste.hadoop.als.FactorizationEvaluator.main(FactorizationEvaluator.java:68)
>>> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>>> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>> at java.lang.reflect.Method.invoke(Method.java:601)
>>> at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
>>> at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
>>> at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:195)
>>> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>>> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>> at java.lang.reflect.Method.invoke(Method.java:601)
>>> at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
>>> MAHOUT_LOCAL is not set; adding HADOOP_CONF_DIR to classpath.
>>> Warning: $HADOOP_HOME is deprecated.
>>>
>>> Running on hadoop, using /Users/kali/hadoop/hadoop-1.0.4/bin/hadoop and
>>> HADOOP_CONF_DIR=/Users/kali/hadoop/hadoop-1.0.4/conf
>>> MAHOUT-JOB: /users/kali/mahout/mahout0.7/examples/target/mahout-examples-0.7-job.jar
>>> Warning: $HADOOP_HOME is deprecated.
>>>
>>> 13/01/18 13:47:44 INFO common.AbstractJob: Command line arguments:
>>> {--endPhase=[2147483647], --input=[/tmp/mahout-work-kali/als/out/userRatings/],
>>> --itemFeatures=[/tmp/mahout-work-kali/als/out/M/], --maxRating=[5],
>>> --numRecommendations=[6], --output=[/tmp/mahout-work-kali/recommendations/],
>>> --startPhase=[0], --tempDir=[temp], --userFeatures=[/tmp/mahout-work-kali/als/out/U/]}
>>> 2013-01-18 13:47:44.859 java[53687:1703] Unable to load realm info from SCDynamicStore
>>> 13/01/18 13:47:45 INFO mapred.JobClient: Cleaning up the staging area
>>> hdfs://localhost:9000/tmp/hadoop-kali/mapred/staging/kali/.staging/job_201301151900_0038
>>> 13/01/18 13:47:45 ERROR security.UserGroupInformation: PriviledgedActionException as:kali
>>> cause:org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input
>>> path does not exist: /tmp/mahout-work-kali/als/out/userRatings
>>> Exception in thread "main" org.apache.hadoop.mapreduce.lib.input.InvalidInputException:
>>> Input path does not exist: /tmp/mahout-work-kali/als/out/userRatings
>>> at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:235)
>>> at org.apache.hadoop.mapreduce.lib.input.SequenceFileInputFormat.listStatus(SequenceFileInputFormat.java:55)
>>> at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:252)
>>> at org.apache.hadoop.mapred.JobClient.writeNewSplits(JobClient.java:962)
>>> at org.apache.hadoop.mapred.JobClient.writeSplits(JobClient.java:979)
>>> at org.apache.hadoop.mapred.JobClient.access$600(JobClient.java:174)
>>> at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:897)
>>> at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:850)
>>> at java.security.AccessController.doPrivileged(Native Method)
>>> at javax.security.auth.Subject.doAs(Subject.java:415)
>>> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
>>> at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:850)
>>> at org.apache.hadoop.mapreduce.Job.submit(Job.java:500)
>>> at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:530)
>>> at org.apache.mahout.cf.taste.hadoop.als.RecommenderJob.run(RecommenderJob.java:95)
>>> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>>> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
>>> at org.apache.mahout.cf.taste.hadoop.als.RecommenderJob.main(RecommenderJob.java:69)
>>> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>>> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>> at java.lang.reflect.Method.invoke(Method.java:601)
>>> at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
>>> at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
>>> at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:195)
>>> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>>> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>> at java.lang.reflect.Method.invoke(Method.java:601)
>>> at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
>>>
>>> RMSE is:
>>>
>>> cat: /tmp/mahout-work-kali/als/rmse/rmse.txt: No such file or directory
>>>
>>> Sample recommendations:
>>>
>>> cat: /tmp/mahout-work-kali/recommendations/part-m-00000: No such file or directory
>>>
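As an aside, the "Converting ratings..." step in the transcript above is just a delimiter rewrite: MovieLens files use "::" as a separator, while the Mahout ALS jobs expect comma-separated user,item,rating triples. A rough sketch (the exact sed/cut pipeline in factorize-movielens-1M.sh may differ, and the /tmp paths here are illustrative):

```shell
# Sketch of the "Converting ratings..." delimiter rewrite.
# Sample input in MovieLens format: user::item::rating::timestamp.
# (Illustrative paths; not the script's actual work directory.)
printf '1::1::5::978300760\n1::2::3::978302109\n' > /tmp/ratings.dat

# Rewrite "::" to "," and keep only the first three fields,
# dropping the timestamp column.
sed -e 's/::/,/g' /tmp/ratings.dat | cut -d, -f1-3 > /tmp/ratings.csv

head -2 /tmp/ratings.csv
# 1,1,5
# 1,2,3
```

This matches the `head` output shown in the transcript (1,1,5 / 1,2,3).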
>>> ==================================================
>>> # Set Hadoop-specific environment variables here.
>>>
>>> # The only required environment variable is JAVA_HOME. All others are
>>> # optional. When running a distributed configuration it is best to
>>> # set JAVA_HOME in this file, so that it is correctly defined on
>>> # remote nodes.
>>>
>>> # The java implementation to use. Required.
>>> export JAVA_HOME=/Library/Java/JavaVirtualMachines/jdk1.7.0_10.jdk/Contents/Home/jre
>>>
>>> # Extra Java CLASSPATH elements. Optional.
>>> # export HADOOP_CLASSPATH=
>>>
>>> # The maximum amount of heap to use, in MB. Default is 1000.
>>> # export HADOOP_HEAPSIZE=2000
>>>
>>> # Extra Java runtime options. Empty by default.
>>> # export HADOOP_OPTS=-server
>>>
>>> # Command specific options appended to HADOOP_OPTS when specified
>>> export HADOOP_NAMENODE_OPTS="-Dcom.sun.management.jmxremote $HADOOP_NAMENODE_OPTS"
>>> export HADOOP_SECONDARYNAMENODE_OPTS="-Dcom.sun.management.jmxremote $HADOOP_SECONDARYNAMENODE_OPTS"
>>> export HADOOP_DATANODE_OPTS="-Dcom.sun.management.jmxremote $HADOOP_DATANODE_OPTS"
>>> export HADOOP_BALANCER_OPTS="-Dcom.sun.management.jmxremote $HADOOP_BALANCER_OPTS"
>>> export HADOOP_JOBTRACKER_OPTS="-Dcom.sun.management.jmxremote $HADOOP_JOBTRACKER_OPTS"
>>> # export HADOOP_TASKTRACKER_OPTS=
>>> # The following applies to multiple commands (fs, dfs, fsck, distcp etc)
>>> # export HADOOP_CLIENT_OPTS
>>>
>>> # Extra ssh options. Empty by default.
>>> # export HADOOP_SSH_OPTS="-o ConnectTimeout=1 -o SendEnv=HADOOP_CONF_DIR"
>>>
>>> # Where log files are stored. $HADOOP_HOME/logs by default.
>>> # export HADOOP_LOG_DIR=${HADOOP_HOME}/logs
>>>
>>> # File naming remote slave hosts. $HADOOP_HOME/conf/slaves by default.
>>> # export HADOOP_SLAVES=${HADOOP_HOME}/conf/slaves
>>>
>>> # host:path where hadoop code should be rsync'd from. Unset by default.
>>> # export HADOOP_MASTER=master:/home/$USER/src/hadoop
>>>
>>> # Seconds to sleep between slave commands. Unset by default. This
>>> # can be useful in large clusters, where, e.g., slave rsyncs can
>>> # otherwise arrive faster than the master can service them.
>>> # export HADOOP_SLAVE_SLEEP=0.1
>>>
>>> # The directory where pid files are stored. /tmp by default.
>>> # export HADOOP_PID_DIR=/var/hadoop/pids
>>>
>>> # A string representing this instance of hadoop. $USER by default.
>>> # export HADOOP_IDENT_STRING=$USER
>>>
>>> # The scheduling priority for daemon processes. See 'man nice'.
>>> # export HADOOP_NICENESS=10
>>
>>
>
