Hi Sebastian,

As promised, you can find some results from testing your ALS code on 64
high-performance Amazon EC2 machines (with up to 1,024 cores):
http://bickson.blogspot.com/2011/03/tunning-hadoop-configuration-for-high.html
I would love to get any feedback you or others may have about the setup of
this experiment.

Best,

Danny Bickson

On Wed, Feb 23, 2011 at 4:41 PM, Sebastian Schelter <[email protected]> wrote:
> Hi Danny,
>
> please send all mails to [email protected] instead of directly
> sending them to me; there are a lot of smart people on that list who might
> join with advice.
>
> I'm very excited that you are testing this code so intensively, and I'm
> positively surprised to see it give good results. Thank you for the effort
> you put into that!
>
> The exception seems to occur when ALSEvaluator is run. The code uses a
> quick-and-dirty approach to compute the error of the model: it simply
> loads the user and item feature matrices completely into memory. With an
> increasing number of features, memory consumption gets too large.
>
> The code of that evaluator step needs to be changed so that each
> (user,item) pair for which the rating shall be predicted is joined with
> the corresponding user and item feature vectors, in such a way that they
> are mapped to the same key and go to the same reducer, which can then
> compute the error.
>
> I already started implementing something like this, but unfortunately I
> don't have a lot of time these days. I could update the patch during the
> next week, if that's OK for you.
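[Editor's note: to make the join-by-key idea in the quoted mail concrete, here is a minimal in-memory sketch. This is not the actual MAHOUT-542 patch and all class and method names are made up; it only simulates the grouping that a Hadoop shuffle would perform, so that each reducer touches one user's feature vector instead of the whole matrix.]

```java
import java.util.*;

// Hypothetical sketch (not Mahout code): join each (user,item,rating)
// triple with the matching user and item feature vectors by key, then
// accumulate the squared prediction error per group.
public class JoinBasedRmseSketch {

  static double dot(double[] a, double[] b) {
    double s = 0;
    for (int i = 0; i < a.length; i++) s += a[i] * b[i];
    return s;
  }

  /** ratings: rows of {userId, itemId, rating}. */
  static double rmse(double[][] ratings,
                     Map<Integer, double[]> userFeatures,
                     Map<Integer, double[]> itemFeatures) {
    // "Map/shuffle" phase: group the ratings by user id, playing the role
    // of the shuffle that routes each rating to the reducer that also
    // receives that user's feature vector.
    Map<Integer, List<double[]>> byUser = new HashMap<>();
    for (double[] r : ratings) {
      byUser.computeIfAbsent((int) r[0], k -> new ArrayList<>()).add(r);
    }
    // "Reduce" phase: per user, look up only that user's vector, join with
    // the item vector, and accumulate the squared error.
    double sum = 0;
    int n = 0;
    for (Map.Entry<Integer, List<double[]>> e : byUser.entrySet()) {
      double[] u = userFeatures.get(e.getKey());
      for (double[] r : e.getValue()) {
        double[] m = itemFeatures.get((int) r[1]);
        double err = dot(u, m) - r[2];
        sum += err * err;
        n++;
      }
    }
    return Math.sqrt(sum / n);
  }

  public static void main(String[] args) {
    Map<Integer, double[]> u = new HashMap<>();
    u.put(1, new double[] {1.0, 0.0});
    u.put(2, new double[] {0.0, 1.0});
    Map<Integer, double[]> m = new HashMap<>();
    m.put(10, new double[] {4.0, 0.0});
    m.put(20, new double[] {0.0, 3.0});
    double[][] ratings = { {1, 10, 4.0}, {2, 20, 5.0} };
    // user 1/item 10 predicts 4.0 (error 0); user 2/item 20 predicts 3.0
    // (error 2), so RMSE = sqrt((0 + 4) / 2) = sqrt(2)
    System.out.println("RMSE = " + rmse(ratings, u, m));
  }
}
```

In a real MapReduce job each phase would be its own mapper/reducer pair, but the key point is the same: no step ever holds a full feature matrix in one JVM's heap.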
> --sebastian
>
> On 23.02.2011 21:57, Danny Bickson wrote:
>> Another exception I am getting:
>>
>> 11/02/23 20:45:34 INFO common.AbstractJob: Command line arguments:
>> {--endPhase=2147483647, --itemFeatures=/tmp/als/out/M/,
>> --probes=/user/ubuntu/myout/probeSet/, --startPhase=0, --tempDir=temp,
>> --userFeatures=/tmp/als/out/U/}
>> Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
>>     at org.apache.mahout.math.map.OpenIntDoubleHashMap.rehash(OpenIntDoubleHashMap.java:433)
>>     at org.apache.mahout.math.map.OpenIntDoubleHashMap.put(OpenIntDoubleHashMap.java:387)
>>     at org.apache.mahout.math.RandomAccessSparseVector.setQuick(RandomAccessSparseVector.java:134)
>>     at org.apache.mahout.math.VectorWritable.readFields(VectorWritable.java:113)
>>     at org.apache.hadoop.io.SequenceFile$Reader.getCurrentValue(SequenceFile.java:1751)
>>     at org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:1879)
>>     at org.apache.mahout.utils.eval.ALSEvaluator.readMatrix(ALSEvaluator.java:113)
>>     at org.apache.mahout.utils.eval.ALSEvaluator.run(ALSEvaluator.java:71)
>>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
>>     at org.apache.mahout.utils.eval.ALSEvaluator.main(ALSEvaluator.java:52)
>>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>     at java.lang.reflect.Method.invoke(Method.java:616)
>>     at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
>>     at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
>>     at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:174)
>>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>     at java.lang.reflect.Method.invoke(Method.java:616)
>>     at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
>>
>> THANKS!
>>
>> ---------- Forwarded message ----------
>> From: Danny Bickson <[email protected]>
>> Date: Wed, Feb 23, 2011 at 3:05 PM
>> Subject: Another mahout ALS question
>> To: [email protected]
>>
>> Hi!
>> I successfully ran 10 iterations of your ALS code with D=20 and
>> lambda=0.065, and I get a very impressive RMSE of 0.93.
>> However, when I try to increase D, I get various out-of-memory errors,
>> even with a small Netflix subsample of 3M values.
>>
>> One of the errors I am getting is in the evaluateALS step:
>>
>> 11/02/23 19:04:11 WARN driver.MahoutDriver: No evaluateALS.props found
>> on classpath, will use command-line arguments only
>> 11/02/23 19:04:12 INFO common.AbstractJob: Command line arguments:
>> {--endPhase=2147483647, --itemFeatures=/tmp/als/out/M/,
>> --probes=/user/ubuntu/myout/probeSet/, --startPhase=0, --tempDir=temp,
>> --userFeatures=/tmp/als/out/U/}
>> Exception in thread "main" java.lang.OutOfMemoryError: GC overhead limit exceeded
>>     at org.apache.mahout.math.map.OpenIntDoubleHashMap.rehash(OpenIntDoubleHashMap.java:433)
>>     at org.apache.mahout.math.map.OpenIntDoubleHashMap.put(OpenIntDoubleHashMap.java:387)
>>     at org.apache.mahout.math.RandomAccessSparseVector.setQuick(RandomAccessSparseVector.java:134)
>>     at org.apache.mahout.math.VectorWritable.readFields(VectorWritable.java:113)
>>     at org.apache.hadoop.io.SequenceFile$Reader.getCurrentValue(SequenceFile.java:1751)
>>     at org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:1879)
>>     at org.apache.mahout.utils.eval.ALSEvaluator.readMatrix(ALSEvaluator.java:113)
>>     at org.apache.mahout.utils.eval.ALSEvaluator.run(ALSEvaluator.java:71)
>>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
>>     at org.apache.mahout.utils.eval.ALSEvaluator.main(ALSEvaluator.java:52)
>>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>     at java.lang.reflect.Method.invoke(Method.java:616)
>>     at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
>>     at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
>>     at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:174)
>>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>     at java.lang.reflect.Method.invoke(Method.java:616)
>>     at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
>>
>> There is no related exception in the Hadoop logs.
>>
>> I am running with Java child opts of -Xmx2048M.
>>
>> Do you have any tips for me? Do you want me to post this to the
>> MAHOUT-542 thread?
>>
>> thanks,
>>
>> DB
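[Editor's note: the stack traces point at building RandomAccessSparseVector entries while reading the full feature matrices, which is consistent with memory growing linearly in D. The back-of-envelope sketch below shows why a 2048 MB heap runs out as D increases; the per-entry costs and the 2x hash-table overhead factor are assumptions for illustration, not measured Mahout numbers.]

```java
// Rough heap estimate for holding both full feature matrices in memory.
// Assumed costs (hypothetical): each feature is an (int key, double value)
// entry, i.e. 4 + 8 bytes of payload plus a state byte, and open-addressing
// hash tables keep slack capacity, guessed here as a 2x overhead factor.
public class FeatureMemorySketch {
  static long estimateBytes(long numRows, int numFeatures) {
    long payloadPerEntry = 4 + 8 + 1; // int key + double value + state byte
    double tableSlack = 2.0;          // assumed open-addressing overhead
    return (long) (numRows * (long) numFeatures * payloadPerEntry * tableSlack);
  }

  public static void main(String[] args) {
    long users = 480_000, items = 18_000; // roughly Netflix-sized
    for (int d : new int[] {20, 100, 200}) {
      long bytes = estimateBytes(users + items, d);
      System.out.printf("D=%d -> ~%d MB%n", d, bytes / (1024 * 1024));
    }
  }
}
```

Under these assumptions D=20 fits comfortably in a 2 GB heap, but by D=200 the two matrices alone approach the whole -Xmx2048M budget before any other allocations, which matches the reported behavior of the evaluator failing only as D grows.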
