Aha! Thanks for catching that.

On Aug 3, 2012, at 3:13 PM, Sean Owen <[email protected]> wrote:
> Ah that's the ticket. The stack trace shows it is failing in the driver
> program, which runs client-side. It's not getting to launch a job.
>
> It looks like it's running out of memory creating a new dense vector in the
> random seed generator process. I don't know anything more than that about
> why it happens, whether your input is funny, etc. but that is why it is not
> getting to Hadoop.
>
> On Fri, Aug 3, 2012 at 5:04 PM, Sears Merritt <[email protected]> wrote:
>
>> Exactly. There isn't an error. The job just runs on a single machine and
>> eventually crashes when it exhausts the JVM's memory. I never see it show
>> up in the job tracker and never get any map-reduce status output. The full
>> output is here:
>>
>> -bash-4.1$ bin/mahout kmeans -i /users/merritts/rvs -o /users/merritts/kmeans_output -c /users/merritts/clusters -k 10000 -x 10
>> MAHOUT_LOCAL is not set; adding HADOOP_CONF_DIR to classpath.
>> Running on hadoop, using /usr/lib/hadoop/bin/hadoop and HADOOP_CONF_DIR=/usr/lib/hadoop/conf
>> MAHOUT-JOB: /home/merritts/trunk/examples/target/mahout-examples-0.8-SNAPSHOT-job.jar
>> 12/08/03 14:26:52 INFO common.AbstractJob: Command line arguments: {--clusters=[/users/merritts/clusters], --convergenceDelta=[0.5], --distanceMeasure=[org.apache.mahout.common.distance.SquaredEuclideanDistanceMeasure], --endPhase=[2147483647], --input=[/users/merritts/rvs], --maxIter=[10], --method=[mapreduce], --numClusters=[10000], --output=[/users/merritts/kmeans_output], --startPhase=[0], --tempDir=[temp]}
>> 12/08/03 14:26:52 INFO common.HadoopUtil: Deleting /users/merritts/clusters
>> 12/08/03 14:26:53 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
>> 12/08/03 14:26:53 INFO compress.CodecPool: Got brand-new compressor
>> 12/08/03 14:26:53 INFO compress.CodecPool: Got brand-new decompressor
>> 12/08/03 14:26:53 INFO compress.CodecPool: Got brand-new decompressor
>> 12/08/03 14:26:53 INFO compress.CodecPool: Got brand-new decompressor
>> 12/08/03 14:26:53 INFO compress.CodecPool: Got brand-new decompressor
>> Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
>>     at org.apache.mahout.math.DenseVector.<init>(DenseVector.java:54)
>>     at org.apache.mahout.math.DenseVector.like(DenseVector.java:115)
>>     at org.apache.mahout.math.DenseVector.like(DenseVector.java:28)
>>     at org.apache.mahout.math.AbstractVector.times(AbstractVector.java:478)
>>     at org.apache.mahout.clustering.AbstractCluster.observe(AbstractCluster.java:273)
>>     at org.apache.mahout.clustering.AbstractCluster.observe(AbstractCluster.java:248)
>>     at org.apache.mahout.clustering.kmeans.RandomSeedGenerator.buildRandom(RandomSeedGenerator.java:93)
>>     at org.apache.mahout.clustering.kmeans.KMeansDriver.run(KMeansDriver.java:94)
>>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>>     at org.apache.mahout.clustering.kmeans.KMeansDriver.main(KMeansDriver.java:48)
>>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>     at java.lang.reflect.Method.invoke(Method.java:597)
>>     at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
>>     at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
>>     at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:195)
>>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>     at java.lang.reflect.Method.invoke(Method.java:597)
>>     at org.apache.hadoop.util.RunJar.main(RunJar.java:197)
>>
>> On Aug 3, 2012, at 3:00 PM, Sean Owen <[email protected]> wrote:
>>
>>> I don't see an error here...? the warning is an ignorable message from
>>> hadoop.
>>>
>>> On Fri, Aug 3, 2012 at 4:56 PM, Sears Merritt <[email protected]> wrote:
>>>
>>>> Hi All,
>>>>
>>>> I'm trying to run a kmeans job using mahout 0.8 on my hadoop cluster
>>>> (Cloudera's 0.20.2-cdh3u3) and am running into an odd problem where the
>>>> mahout job connects to HDFS for reading/writing data but only runs hadoop
>>>> on a single machine, not the entire cluster. To the best of my knowledge I
>>>> have all the environment variables configured properly, as you will see
>>>> from the output below.
>>>>
>>>> When I launch the job using the command line tools as follows:
>>>>
>>>> bin/mahout kmeans -i /users/merritts/rvs -o /users/merritts/kmeans_output -c /users/merritts/clusters -k 100 -x 10
>>>>
>>>> I get the following output:
>>>>
>>>> MAHOUT_LOCAL is not set; adding HADOOP_CONF_DIR to classpath.
>>>> Running on hadoop, using /usr/lib/hadoop/bin/hadoop and HADOOP_CONF_DIR=/usr/lib/hadoop/conf
>>>> MAHOUT-JOB: /home/merritts/trunk/examples/target/mahout-examples-0.8-SNAPSHOT-job.jar
>>>> 12/08/03 14:26:52 INFO common.AbstractJob: Command line arguments: {--clusters=[/users/merritts/clusters], --convergenceDelta=[0.5], --distanceMeasure=[org.apache.mahout.common.distance.SquaredEuclideanDistanceMeasure], --endPhase=[2147483647], --input=[/users/merritts/rvs], --maxIter=[10], --method=[mapreduce], --numClusters=[10000], --output=[/users/merritts/kmeans_output], --startPhase=[0], --tempDir=[temp]}
>>>> 12/08/03 14:26:52 INFO common.HadoopUtil: Deleting /users/merritts/clusters
>>>> 12/08/03 14:26:53 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
>>>> 12/08/03 14:26:53 INFO compress.CodecPool: Got brand-new compressor
>>>> 12/08/03 14:26:53 INFO compress.CodecPool: Got brand-new decompressor
>>>>
>>>> Has anyone run into this before? If so, how did you fix the issue?
>>>>
>>>> Thanks for your time,
>>>> Sears Merritt
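[Editor's note: the OutOfMemoryError above is thrown client-side while RandomSeedGenerator materializes all k dense cluster centers, so a workaround is to give the launcher JVM more heap rather than touching the cluster. A minimal sketch follows; the dimensionality `d` and the 3x accumulator overhead factor are illustrative assumptions, not values from this thread, and the `bin/mahout` invocation is commented out since it depends on a local Mahout install.]

```shell
# Back-of-the-envelope heap estimate for the seed-generation step:
# k dense centroids of dimension d, 8 bytes per double, and a rough
# 3x factor for the per-cluster accumulators updated in observe()
# (the factor is a guess, not measured).
k=10000
d=10000          # hypothetical input dimensionality; substitute your own
bytes=$(( k * d * 8 * 3 ))
echo "approx heap needed: $(( bytes / 1024 / 1024 )) MB"

# The bin/mahout launcher script reads MAHOUT_HEAPSIZE (in MB) to size
# the client-side JVM that runs KMeansDriver:
export MAHOUT_HEAPSIZE=4096
# bin/mahout kmeans -i /users/merritts/rvs -o /users/merritts/kmeans_output \
#   -c /users/merritts/clusters -k 10000 -x 10
```

Alternatively, a smaller `-k` shrinks the client-side footprint, since the seed generator holds all k centers in memory at once.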
