Hi Suneel,

I have the exact same problem with the following values:

No. of docs: 25.904.599
Command line params: -k 1000 -km 17070
Reducer Xmx is 6GB, running in full Map/Reduce mode.

Do you have any other idea what to try?

Thanks,
Roland

On Tue, Mar 25, 2014 at 7:13 PM, Suneel Marthi <[email protected]> wrote:

> What's your value for -km?
> Based on what you had provided, -km should be = 10000 * ln(2000000) =
> 145090
>
> Try reducing your no. of clusters to 1000 and -km = 14509
>
> On Tuesday, March 25, 2014 2:45 AM, fx MA XIAOJUN <
> [email protected]> wrote:
>
> I am using Mahout Streamingkmeans in sequential mode.
> With a dataset of 2000000 objects, 128 variables, I would like to get
> 10000 clusters.
>
> "GC overhead limit exceeded" error occurred.
> How to set java memory limit for sequential mode?
>
> Yours Sincerely,
> Ma
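
For reference, the -km values quoted in this thread all follow the rule of thumb given above, km = k * ln(n), where k is the number of clusters and n the number of data points. Below is a minimal Python sketch of that arithmetic. The function name estimate_km is made up for illustration and is not part of Mahout; rounding to the nearest integer is also an assumption, since the thread only gives approximate figures.

```python
import math


def estimate_km(num_clusters: int, num_points: int) -> int:
    """Rule-of-thumb value for -km: k * ln(n), rounded to the nearest integer.

    estimate_km is a hypothetical helper for illustration, not a Mahout API.
    """
    if num_clusters <= 0 or num_points <= 0:
        raise ValueError("num_clusters and num_points must be positive")
    return round(num_clusters * math.log(num_points))


if __name__ == "__main__":
    # Values taken from the messages in this thread.
    print(estimate_km(10000, 2000000))   # original question: 10000 clusters, 2000000 objects
    print(estimate_km(1000, 2000000))    # suggested reduction to 1000 clusters
    print(estimate_km(1000, 25904599))   # Roland's run: -k 1000 on 25.904.599 docs
```

With Roland's numbers this gives 17070, which matches the -km he is already using, so his settings are consistent with the formula and the memory problem seems to lie elsewhere.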
