Hi again,
I think I found some problems in my setup and will rerun the experiments
soon. With 32 or 64 machines, it looks like not enough mappers/reducers
are being allocated.
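For concreteness, these are the 0.20-era Hadoop knobs I mean (the values below are illustrative placeholders, not measured recommendations):

```xml
<!-- mapred-site.xml: illustrative values only -->
<property>
  <name>mapred.tasktracker.map.tasks.maximum</name>
  <value>8</value>   <!-- map slots per node -->
</property>
<property>
  <name>mapred.tasktracker.reduce.tasks.maximum</name>
  <value>8</value>   <!-- reduce slots per node -->
</property>
<property>
  <name>mapred.reduce.tasks</name>
  <value>256</value> <!-- default reducers per job -->
</property>
```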
Regarding the patch, I still need it: I ran all experiments with D=20,
but at D=30 and above I get memory errors.

Thanks!

On Sun, Mar 6, 2011 at 4:02 PM, Sebastian Schelter <[email protected]> wrote:

> Hi Danny,
>
> thanks for the nice writeup! I'm a little bit disappointed about the
> performance though...
>
> Seems you got around those memory problems from last week without my patch,
> which is good, since I unfortunately didn't have the time to finish that one
> yet.
>
>
> On 05.03.2011 01:33, Danny Bickson wrote:
>
>> Hi Sebastian,
>> As promised, you can find some results from testing your ALS code on 64
>> high-performance Amazon EC2 machines (with up to 1,024 cores):
>>
>> http://bickson.blogspot.com/2011/03/tunning-hadoop-configuration-for-high.html
>>
>> I would love to get any feedback you or others may have about the setup
>> of this experiment.
>>
>> Best,
>>
>> Danny Bickson
>>
>> On Wed, Feb 23, 2011 at 4:41 PM, Sebastian Schelter <[email protected]> wrote:
>>
>>    Hi Danny,
>>
>>    please send all mails to [email protected] instead of directly
>>    to me; there are a lot of smart people on that list who might join
>>    in with advice.
>>
>>    I'm very excited that you are testing this code so intensively, and
>>    I'm positively surprised to see it give good results. Thank you for
>>    the effort you put into this!
>>
>>    The exception seems to occur when ALSEvaluator is run. The code uses
>>    a quick-and-dirty approach to compute the error of the model: it
>>    simply loads the user and item feature matrices completely into
>>    memory, so as the number of features grows, memory consumption
>>    becomes too large.
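As a rough illustration of why this blows up (the dimensions below are assumptions, roughly those of the full Netflix set, not numbers from this thread):

```python
# Back-of-envelope: raw payload of holding both feature matrices in memory.
# Dimensions are assumed (approximately the Netflix data set), and the
# 12 bytes/entry (int key + double value) ignores JVM object and hash-map
# overhead, which in practice multiplies this several times over.
USERS, ITEMS = 480_000, 17_700
BYTES_PER_ENTRY = 12

def feature_matrix_bytes(rows, d):
    """Raw key/value bytes for a dense rows x d feature matrix."""
    return rows * d * BYTES_PER_ENTRY

for d in (20, 30, 50):
    total = feature_matrix_bytes(USERS, d) + feature_matrix_bytes(ITEMS, d)
    print(f"D={d}: ~{total / 2**20:.0f} MiB before JVM overhead")
```

The raw payload alone grows linearly with D, and the per-object overhead of the in-memory vectors makes the effective footprint far larger than these numbers suggest.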
>>
>>    The code of that evaluator step needs to be changed so that each
>>    (user,item) pair whose rating is to be predicted is joined with the
>>    corresponding user and item feature vectors: they are mapped to the
>>    same key and sent to the same reducer, which can then compute the
>>    error.
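A plain-Python sketch of that join (my own simplification for illustration, not the actual patch; `evaluate` and the helper names are made up): stage one keys the probes and user vectors by user id, stage two keys the result and item vectors by item id, so the final "reducer" sees both vectors for each probe and can accumulate squared error without ever holding a full matrix.

```python
from collections import defaultdict

def group_by_key(records):
    """Simulate the MapReduce shuffle: group (key, value) pairs by key."""
    groups = defaultdict(list)
    for key, value in records:
        groups[key].append(value)
    return groups

def evaluate(probes, user_features, item_features):
    """RMSE over (user, item, rating) probes via a two-stage join."""
    # Stage 1: join each probe with its user's feature vector by user id.
    stage1 = [(u, ('rating', (i, r))) for u, i, r in probes]
    stage1 += [(u, ('features', f)) for u, f in user_features.items()]
    by_item = []
    for _, values in group_by_key(stage1).items():
        uf = next(v for tag, v in values if tag == 'features')
        for tag, v in values:
            if tag == 'rating':
                i, r = v
                by_item.append((i, ('rating', (uf, r))))
    # Stage 2: join with the item feature vectors by item id and reduce.
    stage2 = by_item + [(i, ('features', f)) for i, f in item_features.items()]
    sq_err = n = 0
    for _, values in group_by_key(stage2).items():
        itf = next(v for tag, v in values if tag == 'features')
        for tag, v in values:
            if tag == 'rating':
                uf, r = v
                pred = sum(a * b for a, b in zip(uf, itf))
                sq_err += (pred - r) ** 2
                n += 1
    return (sq_err / n) ** 0.5
```

In real MapReduce each stage would be a job whose reducer streams over one key's values at a time, so memory use per reducer stays bounded by a single feature vector plus one probe.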
>>
>>    I already started implementing something like this, but I don't have
>>    a lot of time these days unfortunately. I could update the patch
>>    during the next week if that's ok for you.
>>
>>    --sebastian
>>
>>
>>    On 23.02.2011 21:57, Danny Bickson wrote:
>>
>>        Another exception I am getting:
>>
>>        11/02/23 20:45:34 INFO common.AbstractJob: Command line arguments:
>>        {--endPhase=2147483647, --itemFeatures=/tmp/als/out/M/, --probes=/user/ubuntu/myout/probeSet/, --startPhase=0, --tempDir=temp, --userFeatures=/tmp/als/out/U/}
>>        Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
>>                at org.apache.mahout.math.map.OpenIntDoubleHashMap.rehash(OpenIntDoubleHashMap.java:433)
>>                at org.apache.mahout.math.map.OpenIntDoubleHashMap.put(OpenIntDoubleHashMap.java:387)
>>                at org.apache.mahout.math.RandomAccessSparseVector.setQuick(RandomAccessSparseVector.java:134)
>>                at org.apache.mahout.math.VectorWritable.readFields(VectorWritable.java:113)
>>                at org.apache.hadoop.io.SequenceFile$Reader.getCurrentValue(SequenceFile.java:1751)
>>                at org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:1879)
>>                at org.apache.mahout.utils.eval.ALSEvaluator.readMatrix(ALSEvaluator.java:113)
>>                at org.apache.mahout.utils.eval.ALSEvaluator.run(ALSEvaluator.java:71)
>>                at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>>                at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
>>                at org.apache.mahout.utils.eval.ALSEvaluator.main(ALSEvaluator.java:52)
>>                at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>                at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>>                at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>                at java.lang.reflect.Method.invoke(Method.java:616)
>>                at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
>>                at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
>>                at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:174)
>>                at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>                at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>>                at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>                at java.lang.reflect.Method.invoke(Method.java:616)
>>                at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
>>
>>        THANKS!
>>        ---------- Forwarded message ----------
>>        From: Danny Bickson <[email protected]>
>>        Date: Wed, Feb 23, 2011 at 3:05 PM
>>        Subject: Another mahout ALS question
>>        To: [email protected]
>>
>>
>>        Hi!
>>        I successfully ran 10 iterations of your ALS code with D=20 and
>>        lambda=0.065, and I get a very impressive RMSE of 0.93.
>>        However, when I try to increase D, I get various out-of-memory
>>        errors, even with a small Netflix subsample of 3M values.
>>
>>        One of the errors I am getting is in the evaluateALS step:
>>        11/02/23 19:04:11 WARN driver.MahoutDriver: No evaluateALS.props found on classpath, will use command-line arguments only
>>        11/02/23 19:04:12 INFO common.AbstractJob: Command line arguments:
>>        {--endPhase=2147483647, --itemFeatures=/tmp/als/out/M/, --probes=/user/ubuntu/myout/probeSet/, --startPhase=0, --tempDir=temp, --userFeatures=/tmp/als/out/U/}
>>        Exception in thread "main" java.lang.OutOfMemoryError: GC overhead limit exceeded
>>                at org.apache.mahout.math.map.OpenIntDoubleHashMap.rehash(OpenIntDoubleHashMap.java:433)
>>                at org.apache.mahout.math.map.OpenIntDoubleHashMap.put(OpenIntDoubleHashMap.java:387)
>>                at org.apache.mahout.math.RandomAccessSparseVector.setQuick(RandomAccessSparseVector.java:134)
>>                at org.apache.mahout.math.VectorWritable.readFields(VectorWritable.java:113)
>>                at org.apache.hadoop.io.SequenceFile$Reader.getCurrentValue(SequenceFile.java:1751)
>>                at org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:1879)
>>                at org.apache.mahout.utils.eval.ALSEvaluator.readMatrix(ALSEvaluator.java:113)
>>                at org.apache.mahout.utils.eval.ALSEvaluator.run(ALSEvaluator.java:71)
>>                at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>>                at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
>>                at org.apache.mahout.utils.eval.ALSEvaluator.main(ALSEvaluator.java:52)
>>                at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>                at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>>                at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>                at java.lang.reflect.Method.invoke(Method.java:616)
>>                at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
>>                at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
>>                at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:174)
>>                at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>                at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>>                at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>                at java.lang.reflect.Method.invoke(Method.java:616)
>>                at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
>>
>>
>>        There is no related exception in the Hadoop logs.
>>
>>        I am running with java child opts of -Xmx2048M.
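For reference, a sketch of how that child setting is typically applied, assuming the 0.20-era property name (the value is just the one mentioned above):

```xml
<!-- mapred-site.xml: heap for each spawned map/reduce task JVM -->
<property>
  <name>mapred.child.java.opts</name>
  <value>-Xmx2048M</value>
</property>
```

Note this only affects task JVMs; the OutOfMemoryError above is thrown in the client's main thread, so the driver-side heap (e.g. HADOOP_HEAPSIZE in hadoop-env.sh) is governed separately.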
>>
>>        Do you have any tips for me? Do you want me to post this to the
>>        MAHOUT-542 issue?
>>
>>        thanks,
>>
>>
>>        DB
>>
>>
>>
>>
>
