I saw this issue too with RowSimilarityJob (RSJ) up through 0.8. Switch to
Mahout 0.9: downsampling was introduced in RSJ there, which should avoid
this error.
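
As a rough illustration of why downsampling matters here (a sketch only, not
Mahout's actual code): the cooccurrence step pairs up the entries that share
a column, so the work grows roughly with the square of the entries per
column. The sketch assumes the matrix is dense, so each column has about 1M
entries, and the cap of 500 and the helper names are hypothetical values
chosen for illustration.

```python
import random


def pairs_per_column(n):
    """Number of unordered row pairs among n non-zero entries in one column."""
    return n * (n - 1) // 2


def downsample(entries, max_observations, seed=0):
    """Keep at most max_observations entries, chosen uniformly at random."""
    if len(entries) <= max_observations:
        return list(entries)
    return random.Random(seed).sample(entries, max_observations)


rows = 1_000_000  # rows in the input matrix, taken from the question
cap = 500         # hypothetical per-column cap, for illustration only

# Without downsampling: every pair of rows in a dense column cooccurs.
full = pairs_per_column(rows)

# With downsampling: only the sampled entries of the column are paired.
column = list(range(rows))
sampled = pairs_per_column(len(downsample(column, cap)))

print(f"pairs per column without downsampling: {full}")
print(f"pairs per column with a cap of {cap}: {sampled}")
```

With a cap in place, the number of pairs per column no longer depends on the
number of rows, which is why raising the heap size alone may not help on 0.8.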


On Fri, May 23, 2014 at 2:59 PM, Mohit Singh <[email protected]> wrote:

> Hi,
>    I have a 1M x 6 dimensional matrix stored as a sequence file and I am
> trying to use rowSimilarity for this job...
> But when I try to run the job, I see a Java heap space error in the second
> step (RowSimilarityJob-CooccurrencesMapper-Reducer).
> My raw sequence file is around 700MB and I have already set
> MAHOUT_OPTS to (say) 7GB,
> but I am still seeing that error.
> My command line args are:
>
> hadoop jar /usr/lib/mahout/mahout-examples-0.8-cdh5.0.0-job.jar
> org.apache.mahout.math.hadoop.similarity.cooccurrence.RowSimilarityJob -i
> $INPUT -o $OUTPUT -r 6 -s SIMILARITY_COSINE -m 15 --tempDir $TEMP -ess
>
> Also, is this "r" a typo.. the help file says that this is column length?
> Is it column or row dimension ?
>
> Thanks
>
> --
> Mohit
>
> "When you want success as badly as you want the air, then you will get it.
> There is no other secret of success."
> -Socrates
>
