Hi,

I have a 1M x 6 matrix stored as a sequence file, and I am trying to run RowSimilarityJob on it. When I run the job, I get a Java heap space error in the second step (RowSimilarityJob-CooccurrencesMapper-Reducer). My raw sequence file is around 700 MB, and I have already set MAHOUT_OPTS to (say) 7 GB, but I still see the error. My command line args are:

    hadoop jar /usr/lib/mahout/mahout-examples-0.8-cdh5.0.0-job.jar org.apache.mahout.math.hadoop.similarity.cooccurrence.RowSimilarityJob -i $INPUT -o $OUTPUT -r 6 -s SIMILARITY_COSINE -m 15 --tempDir $TEMP -ess

Also, is the "-r" option a typo? The help text says it is the column length. Is it the column dimension or the row dimension?

Thanks
--
Mohit

"When you want success as badly as you want the air, then you will get it. There is no other secret of success." -Socrates
