Hi,

I'm trying to create vectors with Mahout as explained in http://cwiki.apache.org/confluence/display/MAHOUT/Creating+Vectors+from+Text, but I keep running out of heap space. My heap is already set to 2 GB, and I run the driver with these parameters:

java org.apache.mahout.utils.vectors.Driver --dir /LUCENE/ind --output /user/florian/index-vectors-01 --field content --dictOut /user/florian/index-dict-01 --weight TF
My index is currently about 6 GB. Is there any way to compute the vectors in a distributed manner? What's the largest index anyone has created vectors from? Thanks! Florian
