60 x 40K = 2,400K entries, which works out to roughly 2GB. How much memory does each reducer get? If it is significantly more than that, say 3GB, you should be fine.
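For concreteness, a back-of-the-envelope version of that arithmetic (a minimal sketch, not Mahout code; the ~1KB-per-entry figure is an assumption chosen to match the 2GB estimate above, so substitute whatever you actually measure in your job):

    // Rough reducer working-set estimate: number of entries held in the
    // cluster models times an assumed per-entry footprint.
    public class ReducerSizing {
        public static void main(String[] args) {
            long entries = 60L * 40_000L;   // 60 x 40K = 2,400K entries
            long bytesPerEntry = 1024;      // assumption: ~1KB per entry
            double gb = entries * (double) bytesPerEntry / (1L << 30);
            System.out.printf("Estimated working set: %.1f GB%n", gb); // ~2.3 GB
            // Compare against the reducer heap (mapred.child.java.opts -Xmx);
            // you want comfortable headroom, e.g. 3GB for a ~2GB working set.
        }
    }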
On Thu, Feb 24, 2011 at 3:09 PM, Jeff Eastman <[email protected]> wrote:
> The reducers then accumulate the posterior statistics for one or more
> clusters. You can try increasing the number of reducers (up to k), which
> can help with this step. Again, if most of your points are being assigned
> to a single cluster, that reducer will be bogged down observing them all.
> Also, since the models accumulate Gaussian statistics to compute the mean
> and std posterior values, these values will tend to become denser as many
> vectors are summed, and this can drive up memory consumption during the
> reduce step.
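To see why those statistics densify: each input vector may be sparse, but the running sums take the union of all non-zero positions across every vector they observe. A self-contained illustration in plain Java (a hypothetical DensificationDemo using a HashMap in place of Mahout's sparse vectors; the sparsity and point counts are made-up parameters):

    import java.util.HashMap;
    import java.util.Map;
    import java.util.Random;

    // Each input vector is sparse (few non-zeros), but the running sum s1
    // (and likewise the sum of squares s2 behind the std posterior)
    // accumulates the union of all non-zero positions, so it densifies.
    public class DensificationDemo {
        public static void main(String[] args) {
            int cardinality = 40_000;       // dimensionality from the thread
            int nonZerosPerVector = 100;    // assumed sparsity of each point
            Random rnd = new Random(42);
            Map<Integer, Double> s1 = new HashMap<>(); // running sum
            for (int n = 1; n <= 10_000; n++) {
                for (int i = 0; i < nonZerosPerVector; i++) {
                    s1.merge(rnd.nextInt(cardinality), rnd.nextDouble(), Double::sum);
                }
                if (n % 2000 == 0) {
                    System.out.printf("after %5d points: %d of %d dimensions non-zero%n",
                            n, s1.size(), cardinality);
                }
            }
        }
    }

Once the sums are effectively dense, each one costs about 8 bytes per dimension (~320KB for a 40K-dimensional double array), times the number of accumulators per cluster, times the number of clusters a reducer is responsible for.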
