Hi,

The algorithm itself only uses memory proportional to the number of
centers. By default, however, "k.means.caching.enabled" is set to true,
which caches the vectors to be clustered on the heap, and in that case you
would need 1 TB of RAM.
I would suggest setting it to false (you will need to recompile the
KMeansBSP class in the ml package; the line you have to change is 347).
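
To give a feeling for the difference, here is a rough back-of-the-envelope
sketch. It is not code from Hama: the class name, the number of centers and
the vector dimension are made-up values for illustration, and it ignores JVM
object overhead.

```java
// Rough memory estimate for k-means with and without caching the input.
// All names and numbers here are hypothetical, not taken from KMeansBSP.
class KMeansMemoryEstimate {

    // Bytes needed to hold k centers of dimension d as raw doubles.
    static long centerBytes(long k, long d) {
        return k * d * Double.BYTES;
    }

    public static void main(String[] args) {
        long k = 1_000;               // assumed number of centers
        long d = 100;                 // assumed vector dimension
        long datasetBytes = 1L << 40; // 1 TB of input, as in the question

        long withoutCaching = centerBytes(k, d);
        long withCaching = datasetBytes + centerBytes(k, d);

        System.out.println("caching off, centers only: " + withoutCaching + " bytes");
        System.out.println("caching on, centers + input: " + withCaching + " bytes");
    }
}
```

With caching off, the heap only has to hold the centers, which stays small
even for many centers; with caching on, the input itself dominates.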

Good luck and let us know if you have problems.

2012/8/27 HuYuesheng <[email protected]>

> Hi,
>
>     I want to know: if I want to test K-means on a 1 TB dataset, does it
> mean I need at least 1 TB of RAM (across the whole cluster)?
>     Thank you!
>
>    Best Regards!
>
> Yuesheng Hu
> China
>
