Re: OutofMemory problem in ClusterDumper

Paritosh Ranjan Thu, 17 Nov 2011 22:54:30 -0800

We are trying to create a cluster output post processor which will write cluster-specific data. You can apply the latest patch available on https://issues.apache.org/jira/browse/MAHOUT-843 and use ClusterOutputPostProcessor's distribute method. You won't get an out-of-memory error there, if this is what you want.


Paritosh


On 18-11-2011 12:09, zou.cl wrote:

Hi guys,

      I just noticed an out-of-memory problem in the ClusterDumper class. It
seems that it loads all the data (for example, the clusteredPoints) into a
Map container, which costs a huge amount of memory if we have gigabytes of
data. I think we could also use MapReduce to print the results instead of
loading everything into memory.
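
To illustrate the difference between collecting every point in a Map and streaming them out, here is a minimal single-process sketch in plain Java. It is not Mahout code and does not use the MAHOUT-843 patch: the class name StreamingClusterWriter, its distribute method, and the "cluster-<id>.txt" file naming are all hypothetical, and points are assumed to arrive as (clusterId, text) pairs from some iterator. The only state kept in memory is one open writer and one counter per cluster.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

// Hypothetical helper, for illustration only; not part of Mahout.
public class StreamingClusterWriter {

    /**
     * Streams (clusterId, point) pairs into one text file per cluster.
     * Each point is written as soon as it is read, so memory use grows
     * with the number of clusters rather than the number of points.
     * Returns the number of points written for each cluster.
     */
    public static Map<Integer, Long> distribute(
            Iterator<Map.Entry<Integer, String>> points, Path outDir) throws IOException {
        Files.createDirectories(outDir);
        Map<Integer, BufferedWriter> writers = new HashMap<>();
        Map<Integer, Long> counts = new HashMap<>();
        try {
            while (points.hasNext()) {
                Map.Entry<Integer, String> point = points.next();
                Integer clusterId = point.getKey();
                // Open the per-cluster file lazily, the first time the cluster is seen.
                BufferedWriter writer = writers.get(clusterId);
                if (writer == null) {
                    Path file = outDir.resolve("cluster-" + clusterId + ".txt");
                    writer = Files.newBufferedWriter(file, StandardCharsets.UTF_8);
                    writers.put(clusterId, writer);
                }
                writer.write(point.getValue());
                writer.newLine();
                counts.merge(clusterId, 1L, Long::sum);
            }
        } finally {
            // Close every writer so buffered data is flushed to disk.
            for (BufferedWriter writer : writers.values()) {
                writer.close();
            }
        }
        return counts;
    }
}
```

This keeps one file handle open per cluster, which is fine for a modest number of clusters but would need a different approach (for example, a MapReduce job keyed by cluster id) when there are very many.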








zou.cl via foxmail



