You need to give much more memory than 200 MB to your mappers. What are the dimensions of your input in terms of users and items?
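For example (a sketch only, untested here; the paths are placeholders and the heap size should be matched to what your task trackers can actually give each mapper), passing the heap setting on the command line should reach the task JVMs, since the Mahout drivers run through ToolRunner and accept generic -D options:

  mahout parallelALS \
    -Dmapred.child.java.opts=-Xmx4g \
    -Dmapred.map.child.java.opts=-Xmx4g \
    --input /path/to/ratings \
    --output /path/to/als-output \
    --numFeatures 20 \
    --numIterations 15 \
    --lambda <your lambda>

Whichever of the two properties your Hadoop version honours, check the task logs to confirm that the larger -Xmx actually reaches the mappers; a property marked final in the cluster-side mapred-site.xml can silently override it.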
--sebastian

On 19.11.2012 09:28, Abramov Pavel wrote:
> Thanks for your replies.
>
> 1)
>> Can you describe your failure or give us a stack trace?
>
> Here is the job log:
>
> 12/11/19 09:54:07 INFO als.ParallelALSFactorizationJob: Recomputing U (iteration 0/15)
> …
> 12/11/19 10:03:31 INFO mapred.JobClient: Job complete: job_201211150152_1671
> 12/11/19 10:03:31 INFO als.ParallelALSFactorizationJob: Recomputing M (iteration 0/15)
> …
> 12/11/19 10:10:04 INFO mapred.JobClient: Task Id : attempt_201211150152_<*ALL*>, Status : FAILED
> …
> 12/11/19 10:40:40 INFO mapred.JobClient: Failed map tasks=1
>
> All of these mappers (Recomputing M in the 1st iteration) fail with a "Java heap space" error.
>
> Here is the Hadoop job memory config:
>
> mapred.map.child.java.opts = -Xmx5024m -XX:-UseGCOverheadLimit
> mapred.child.java.opts = -Xmx200m
> mapred.job.reuse.jvm.num.tasks = -1
>
> mapred.cluster.reduce.memory.mb = -1
> mapred.cluster.map.memory.mb = -1
> mapred.cluster.max.reduce.memory.mb = -1
> mapred.job.reduce.memory.mb = -1
> mapred.job.map.memory.mb = -1
> mapred.cluster.max.map.memory.mb = -1
>
> Are any tweaks possible? Is mapred.map.child.java.opts OK?
>
> 2) As far as I understand, ALS cannot load the U matrix into RAM (20M users), while M is OK (150k items). Can I split the input matrix R (keep all items, split by user) into R1, R2, ..., Rn, then compute M and U1 on R1 (many iterations, then fix M), then compute U2, U3, ..., Un using the existing M (a half iteration, without recomputing M)? I want to do this to avoid memory issues (train on part of the data at a time).
> My question is: will all the users from U1, U2, ..., Un "exist" in the same feature space? Can I then compare users from U1 with users from U2 using their features?
> Is any tweak possible here?
>
> 3) How can I calculate the maximum matrix size for a given item count and memory limit? For example, my matrix has 20M users and I want to factorize it with 20 features: 20M * 20 * 8 bytes = 3.2 GB. On the one hand I want to avoid "Java heap space" errors, on the other hand I want to give my model as much training data as possible. I understand that minor changes to parallelALS may be needed.
>
> Have a nice day!
>
> Regards,
> Pavel
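PS: as a rough sanity check on the sizes in (3) above (assuming 8-byte doubles, the dimensions you quote, and that a mapper has to hold the fixed factor matrix in memory while the other one is recomputed):

  M (items): 150,000    * 20 features * 8 bytes ≈  24 MB  -> fits easily, so recomputing U succeeds
  U (users): 20,000,000 * 20 features * 8 bytes ≈ 3.2 GB  -> far above a 200 MB heap, so recomputing M fails

These figures only count the raw double values; the actual in-memory representation carries additional Java object overhead on top.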
