DistributedLanczosSolver has been deprecated (and the blog post you mention is old). Use Stochastic SVD (SSVD) instead.
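For a sense of scale, here is a rough sketch of how much heap the allocation that fails in the quoted trace (DenseVector.<init>, called from getInitialVector) would need. It assumes the initial vector holds one double per matrix column, i.e. --numCols entries; that is my reading of the trace, not something the log states. The class and method names are made up for illustration and are not part of Mahout.

```java
public class DenseVectorFootprint {

    // Bytes needed for the backing double[] of a dense vector of the given length.
    // Ignores the small fixed object and array header overhead.
    static long denseVectorBytes(long cardinality) {
        return cardinality * Double.BYTES;
    }

    public static void main(String[] args) {
        // --numCols value taken from the command line arguments in the quoted log.
        long numCols = 44_543_104L;
        long bytes = denseVectorBytes(numCols);
        System.out.printf("%d bytes (~%.0f MiB)%n", bytes, bytes / (1024.0 * 1024.0));
    }
}
```

Under that assumption the vector needs about 356 MB (roughly 340 MiB). That would fit comfortably in a 10GB heap, so it may be that the larger heap setting is not reaching the JVM that runs the driver. That is only a guess; the trace alone does not confirm it.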
On Friday, December 20, 2013 12:41 AM, Partha Pratim Talukdar <[email protected]> wrote:

Hello,

I am running mahout (v0.8) svd over a sparse matrix of size 5,064,569 x 44,543,104 with the matrix in the format as per [1]. However, I get OOM error immediately as given below. I have tried increasing the JAVA_HEAP_MAX and MAHOUT_HEAPSIZE in bin.mahout to 10GB, but to no effect. Anyone knows a way out?

13/12/19 23:23:36 INFO common.AbstractJob: Command line arguments: {--endPhase=[2147483647], --inMemory=[false], --input=[/user/ppt/data/pra_svo/openie/input/openiev4_filtered_pred_arg1arg2_l2norm_center_sorted_col_mahout_inp.bin], --maxError=[0.05], --minEigenvalue=[0.0], --numCols=[44543104], --numRows=[5064569], --output=[/user/ppt/data/pra_svo/openie/output/], --rank=[200], --startPhase=[0], --symmetric=[false], --tempDir=[/user/ppt/data/pra_svo/openie/temp/], --workingDir=[/user/ppt/data/pra_svo/openie/scratch/]}
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
    at org.apache.mahout.math.DenseVector.<init>(DenseVector.java:53)
    at org.apache.mahout.math.hadoop.decomposer.DistributedLanczosSolver.getInitialVector(DistributedLanczosSolver.java:68)
    at org.apache.mahout.math.hadoop.decomposer.DistributedLanczosSolver.run(DistributedLanczosSolver.java:203)
    at org.apache.mahout.math.hadoop.decomposer.DistributedLanczosSolver.run(DistributedLanczosSolver.java:131)
    at org.apache.mahout.math.hadoop.decomposer.DistributedLanczosSolver$DistributedLanczosSolverJob.run(DistributedLanczosSolver.java:291)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
    at org.apache.mahout.math.hadoop.decomposer.DistributedLanczosSolver.main(DistributedLanczosSolver.java:297)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
    at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
    at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:194)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:160)

[1] http://bickson.blogspot.com/2011/02/mahout-svd-matrix-factorization.html
