Note, especially, that there is no override for amd64, which is the architecture you are likely using if you have 4+ GB of RAM, so you're stuck with the idiotic default NewRatio of 2 unless you set the option explicitly. That is, if you set a 1 GB heap, only 683 MB of it (two thirds) can be used for "old" allocation. And for this kind of work just about all your data is "old", so your max heap needs to be set about 50% higher than what you actually need (unless you set NewRatio to a sane value). Any value between 8 and 12 should be fine for Mahout; these are in the right ballpark and aren't very different from a practical perspective (8 corresponds to 11% new; 12 corresponds to 7.7% new).
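To make the arithmetic above easy to check, here is a rough sketch in Python. It assumes the usual reading of NewRatio as the old:new size ratio, so the old generation gets NewRatio/(NewRatio+1) of the max heap, and it ignores everything else the JVM does with memory. The function names are made up for illustration and are not part of any JVM or Mahout API.

```python
def old_gen_fraction(new_ratio):
    """Fraction of the heap given to the old generation when old:new = new_ratio:1."""
    return new_ratio / (new_ratio + 1)


def new_gen_fraction(new_ratio):
    """Fraction of the heap given to the new (young) generation."""
    return 1 / (new_ratio + 1)


def heap_needed_mb(old_data_mb, new_ratio):
    """Max heap (MB) required so that the old generation can hold old_data_mb."""
    return old_data_mb / old_gen_fraction(new_ratio)


if __name__ == "__main__":
    heap_mb = 1024  # a 1 GB max heap
    for ratio in (2, 8, 12):
        old_mb = heap_mb * old_gen_fraction(ratio)
        print(
            f"NewRatio={ratio}: new={new_gen_fraction(ratio):.1%}, "
            f"old space in a {heap_mb}m heap={old_mb:.0f}m, "
            f"heap needed for 1000m of old data={heap_needed_mb(1000, ratio):.0f}m"
        )
```

On the java command line the option itself is spelled -XX:NewRatio=8 (or whatever value you pick), next to your usual -Xmx setting.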
More grumblings on java gc/memory:

http://javaquirks.blogspot.com/2008/08/garbage-collection-or-outofmemoryerror.html
http://javaquirks.blogspot.com/2009/04/usecompressedoops.html

Cheers,

Jason

--
Jason Rennie
Research Scientist, ITA Software
617-714-2645
http://www.itasoftware.com/
