Hello, I've tried spark-rowsimilarity with an out-of-the-box setup (downloaded the Mahout distribution and Spark, and set up the PATH), and I ran into a Java heap space error. My input file is ~100 MB. None of the parameters I tried seems to change this. The command I run is:
~/mahout-distribution-0.10.0/bin/mahout spark-rowsimilarity --input ~/query_result.tsv --output ~/work/result -sem 24g -D:spark.executor.memory=24g

Do I just need to give it more memory, or is there another step I can take to solve this?