Thanks for your replies.

1) 
> Can you describe your failure or give us a stack trace?


Here is the job log:

12/11/19 09:54:07 INFO als.ParallelALSFactorizationJob: Recomputing U
(iteration 0/15)
…
12/11/19 10:03:31 INFO mapred.JobClient: Job complete:
job_201211150152_1671
12/11/19 10:03:31 INFO als.ParallelALSFactorizationJob: Recomputing M
(iteration 0/15)
…
12/11/19 10:10:04 INFO mapred.JobClient: Task Id :
attempt_201211150152_<*ALL*>, Status : FAILED
…
12/11/19 10:40:40 INFO mapred.JobClient:     Failed map tasks=1



All of these mappers (recomputing M in the 1st iteration) fail with a "Java
heap space" error.

Here is the Hadoop job memory config:

mapred.map.child.java.opts = -Xmx5024m -XX:-UseGCOverheadLimit
mapred.child.java.opts = -Xmx200m
mapred.job.reuse.jvm.num.tasks = -1


mapred.cluster.reduce.memory.mb = -1
mapred.cluster.map.memory.mb = -1
mapred.cluster.max.reduce.memory.mb = -1
mapred.job.reduce.memory.mb = -1
mapred.job.map.memory.mb = -1
mapred.cluster.max.map.memory.mb = -1

Are any tweaks possible? Is mapred.map.child.java.opts OK?
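For reference, this is what I think mapred-site.xml should look like if per-map opts are honored (property names are from Hadoop 1.x, so please verify against your distribution; the reduce heap value is just my guess, not something I have measured):

```xml
<!-- Sketch only: Hadoop 1.x property names, values are assumptions.
     Per-map opts override the generic mapred.child.java.opts, which
     otherwise caps map tasks at 200 MB. -->
<property>
  <name>mapred.map.child.java.opts</name>
  <value>-Xmx5g -XX:-UseGCOverheadLimit</value>
</property>
<property>
  <name>mapred.reduce.child.java.opts</name>
  <value>-Xmx2g</value>
</property>
```

If the mappers still die at 200 MB, that would suggest my version only reads mapred.child.java.opts and ignores the per-map variant.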

2) As far as I understand, ALS cannot load the U matrix into RAM (20M
users), while M fits fine (150k items). Can I split the input matrix R by
user (keeping all items) into R1, R2, …, Rn, then compute M and U1 on R1
(many iterations, then fix M), and then compute U2, U3, …, Un against the
existing M (half an iteration each: solve for the user side only, without
recomputing M)? I want to do this to avoid memory issues by training on
one part at a time.
My question is: will all the users from U1, U2, …, Un "live" in the same
feature space? Can I then compare users from U1 with users from U2 using
their feature vectors?
Is any tweak possible here?
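To make the idea concrete, here is a small numpy sketch (my own toy code, not Mahout) of what I mean by "half an iteration": solving each user vector as a regularized least-squares problem against a fixed M. If that is all the user-side step does, then every split's users should land in the same feature space, because they are all expressed in M's basis:

```python
import numpy as np

def solve_user(M, ratings, item_ids, lam=0.065):
    """Solve one user's feature vector against a fixed item matrix M
    (the weighted-lambda-regularized least-squares step of ALS)."""
    Mi = M[item_ids]                 # rows of M for the items this user rated
    n_i = len(item_ids)              # number of ratings for this user
    A = Mi.T @ Mi + lam * n_i * np.eye(M.shape[1])
    b = Mi.T @ ratings
    return np.linalg.solve(A, b)

rng = np.random.default_rng(0)
M = rng.standard_normal((150, 20))   # toy item matrix: 150 items, 20 features

# Two users from two different "splits": each is solved independently
# against the SAME fixed M, so their vectors are directly comparable.
u1 = solve_user(M, np.array([5.0, 3.0]), np.array([0, 1]))
u2 = solve_user(M, np.array([4.0, 1.0]), np.array([2, 3]))
print(u1.shape, u2.shape)            # both (20,)
```

So as long as M is frozen after training on R1, comparing users across U1, …, Un should be legitimate.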

3) How can I calculate the maximum matrix size for a given item count and
memory limit? For example, my matrix has 20M users and I want to factorize
it with 20 features: 20M × 20 × 8 bytes = 3.2 GB. On the one hand I want
to avoid "Java heap space" errors; on the other hand I want to feed my
model as much training data as possible. I understand that minor changes
to ParallelALSFactorizationJob may be needed.
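My back-of-envelope arithmetic, with a rough factor for JVM object overhead (the factor of 2 is purely my assumption, not a measured number):

```python
# Back-of-envelope heap sizing for the dense user-feature matrix.
# Assumes 8 bytes per double; the overhead factor accounts for JVM
# object headers and vector bookkeeping, which can easily double
# the raw payload (my guess, not measured).
def max_rows(heap_bytes, n_features, bytes_per_value=8, overhead=2.0):
    return int(heap_bytes // (n_features * bytes_per_value * overhead))

heap = 5 * 1024**3                       # ~5 GB map-task heap
print(20_000_000 * 20 * 8 / 1e9)         # raw payload in GB -> 3.2
print(max_rows(heap, 20))                # users that fit under the assumption
```

By that estimate only about 16–17M of the 20M user vectors would fit in a 5 GB heap, which matches the failures I am seeing.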

Have a nice day!


Regards, 
Pavel