60 × 40K = 2400K = 2GB.

How much memory does each reducer get? If it is significantly larger than
3GB, you should be fine.
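
On the Hadoop releases of that era, child-task heap (and thus reducer heap) is set through the `mapred.child.java.opts` property; a minimal sketch of raising it to 4GB, assuming your job inherits the cluster-wide setting rather than overriding it:

```xml
<!-- mapred-site.xml: give map and reduce child JVMs a 4GB max heap -->
<property>
  <name>mapred.child.java.opts</name>
  <value>-Xmx4096m</value>
</property>
```

The same value can usually be passed per-job on the command line with `-D mapred.child.java.opts=-Xmx4096m` instead of editing the cluster config.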

On Thu, Feb 24, 2011 at 3:09 PM, Jeff Eastman <[email protected]> wrote:

> The reducers then accumulate the posterior statistics for one or more
> clusters. You can try increasing the number of reducers (up to k), which
> can help with this step. Again, if most of your points are being assigned
> to a single cluster, that reducer will be bogged down observing them all.
> Also, since the models accumulate Gaussian statistics to compute posterior
> mean and standard-deviation values, these accumulators tend to become
> denser as many vectors are summed, and this can drive up memory
> consumption during the reduce step.
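
The densification effect above is easy to demonstrate: summing many sparse vectors into one running-sum accumulator touches more and more dimensions, until the accumulator is effectively dense. The sketch below is a hypothetical illustration (plain Java, not Mahout's actual vector classes), with the dimension counts chosen arbitrarily:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Random;

// Illustration only: a reducer keeping a per-cluster running sum (the "s1"
// statistic used for the posterior mean) as a sparse index->value map.
public class DenseAccumulation {

    // Fraction of dimensions that end up nonzero after summing `vectors`
    // random sparse vectors of the given cardinality into one accumulator.
    static double nonzeroFraction(int dims, int nonZerosPerVector,
                                  int vectors, long seed) {
        Random rnd = new Random(seed);
        Map<Integer, Double> s1 = new HashMap<>();
        for (int v = 0; v < vectors; v++) {
            for (int i = 0; i < nonZerosPerVector; i++) {
                // Each input vector contributes a few nonzero entries;
                // merge adds into any existing running sum at that index.
                s1.merge(rnd.nextInt(dims), rnd.nextDouble(), Double::sum);
            }
        }
        return s1.size() / (double) dims;
    }

    public static void main(String[] args) {
        // 10,000 vectors, each with only 50 of 100,000 entries nonzero:
        // the accumulated sum is nonetheless almost fully dense.
        System.out.printf("nonzero fraction: %.3f%n",
                nonzeroFraction(100_000, 50, 10_000, 42L));
    }
}
```

Even though each input vector is 99.95% sparse here, the accumulated sum ends up nonzero in the vast majority of dimensions, so its memory footprint approaches that of a dense `double[dims]` per cluster, which is why memory grows during the reduce step.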
