I am trying to train a Naive Bayes model in Mahout and I keep getting a Java heap space error. This is strange because I am using a hashing vectorizer that maps 1-gram and 2-gram tokens into a vector of size 2^20. My cluster consists of 7 nodes with 16 GB of RAM, and each time I run the job I hit this heap space error. I have reduced the vector dimensionality to only 16 and I still get the error, so it seems to be a more systemic issue.
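In case it helps, this is roughly how I build each hashed feature vector (a simplified sketch of what I am doing, not my exact code; the tokenization and the use of Mahout's `StaticWordValueEncoder` here are just my approximation of the standard hashing-encoder approach):

```java
import org.apache.mahout.math.RandomAccessSparseVector;
import org.apache.mahout.math.Vector;
import org.apache.mahout.vectorizer.encoders.StaticWordValueEncoder;

public class HashingSketch {
  // 2^20 in my normal runs; I also tried tiny values like 16 while debugging.
  private static final int CARDINALITY = 1 << 20;

  // Hash 1-grams and adjacent 2-grams into a sparse vector of fixed size.
  public static Vector encode(String[] tokens) {
    StaticWordValueEncoder encoder = new StaticWordValueEncoder("ngrams");
    Vector vector = new RandomAccessSparseVector(CARDINALITY);
    for (int i = 0; i < tokens.length; i++) {
      encoder.addToVector(tokens[i], vector);                        // 1-gram
      if (i + 1 < tokens.length) {
        encoder.addToVector(tokens[i] + " " + tokens[i + 1], vector); // 2-gram
      }
    }
    return vector;
  }
}
```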
I am running this on Cloudera CDH 5.3. Is there anything I could adjust in terms of heap space that would allow this to work?
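To make the question concrete, the only heap-related knobs I know of are the MR2/YARN task memory properties below. This is a hypothetical driver (values and paths are placeholders, and the arguments just mirror what I pass to `mahout trainnb`); I am not sure these are the right properties to change on CDH 5.3, or whether the error is even coming from the task JVMs rather than the client:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.util.ToolRunner;
import org.apache.mahout.classifier.naivebayes.training.TrainNaiveBayesJob;

public class TrainDriver {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Guessed values: YARN container sizes and the heap given to each task JVM.
    conf.set("mapreduce.map.memory.mb", "4096");
    conf.set("mapreduce.map.java.opts", "-Xmx3584m");
    conf.set("mapreduce.reduce.memory.mb", "4096");
    conf.set("mapreduce.reduce.java.opts", "-Xmx3584m");

    int exitCode = ToolRunner.run(conf, new TrainNaiveBayesJob(), new String[] {
        "-i", "/user/me/train-vectors",  // placeholder input path
        "-o", "/user/me/model",          // placeholder model output path
        "-el",                           // extract labels from the input
        "-li", "/user/me/labelindex",    // placeholder label index path
        "-ow"                            // overwrite previous output
    });
    System.exit(exitCode);
  }
}
```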