Try turning on the Kryo serializer as described at 
http://spark.apache.org/docs/latest/tuning.html. Also, are there any exceptions 
in the driver program’s log before this happens?
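
For reference, a minimal sketch of switching the serializer through SparkConf. The object name and application name here are hypothetical placeholders, and this assumes a Spark version that provides SparkConf (0.9 or later); the exact setup for your job may differ.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object KryoExample {
  def main(args: Array[String]): Unit = {
    // Hypothetical app name; the master URL is assumed to be supplied
    // externally (or add .setMaster(...) for your cluster).
    val conf = new SparkConf()
      .setAppName("KMeansLargeK")
      // Replace the default Java serializer with Kryo.
      .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")

    val sc = new SparkContext(conf)
    // ... load the data and run K-means here ...
    sc.stop()
  }
}
```

The tuning guide also describes registering your own classes with Kryo, which may help further if custom types are being shuffled.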

Matei

On Apr 28, 2014, at 9:19 AM, Buttler, David <buttl...@llnl.gov> wrote:

> Hi,
> I am trying to run the K-means code in mllib, and it works very nicely with 
> small K (less than 1000).  However, when I try for a larger K (I am looking 
> for 2000-4000 clusters), it seems like the code gets part way through 
> (perhaps just the initialization step) and freezes.  The compute nodes stop 
> doing any CPU / network / IO and nothing happens for hours.  I had done 
> something similar back in the days of Spark 0.6, and I didn’t have any 
> trouble going up to 4000 clusters with similar data.
>  
> This happens with both a standalone cluster, and in local multi-core mode 
> (with the node given 200GB of heap), but eventually completes in local 
> single-core mode.
>  
> Data statistics:
> Rows: 166248
> Columns: 108
>  
> This is a test run before trying it out on much larger data.
>  
> Any ideas on what might be the cause of this?
>  
> Thanks,
> Dave
