Re: [Numpy-discussion] MemoryError : with scipy.spatial.distance

2012-04-05 Thread Abhishek Pratap
Hi Gael, the MemoryError exception I am getting comes from scikit's DBSCAN implementation. I can check the mini-batch implementation of K-means. Best, -Abhi On Wed, Apr 4, 2012 at 10:33 PM, Gael Varoquaux gael.varoqu...@normalesup.org wrote: On Wed, Apr 04, 2012 at 04:41:51PM -0700, Abhishek

Re: [Numpy-discussion] MemoryError : with scipy.spatial.distance

2012-04-05 Thread Gael Varoquaux
On Thu, Apr 05, 2012 at 01:05:01PM -0700, Abhishek Pratap wrote: Also, in my case I don't really have a good estimate of the value of K in K-means. That's a hard problem, for which I have no answer, sorry :$ G

[Numpy-discussion] MemoryError : with scipy.spatial.distance

2012-04-04 Thread Abhishek Pratap
Hey guys, I am new to Python and even more so to NumPy. I am trying to cluster close to 900K points using the DBSCAN algorithm. My input is a list of ~900k tuples, each holding the (x, y) coordinates of one point. I am converting them to a NumPy array and passing them to the pdist method of scipy.spatial.distance for
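A rough size estimate shows why this call runs out of memory. The sketch below assumes pdist returns its usual condensed result, one 8-byte float per unordered pair of points, i.e. n*(n-1)/2 values. The helper name pdist_memory_bytes is made up for illustration and is not part of SciPy.

```python
def pdist_memory_bytes(n_points, bytes_per_value=8):
    """Bytes needed to hold one distance per unordered pair of points.

    Assumes float64 output (8 bytes per value) and the condensed layout
    of n * (n - 1) / 2 entries.
    """
    n_pairs = n_points * (n_points - 1) // 2
    return n_pairs * bytes_per_value


n = 900_000
needed = pdist_memory_bytes(n)
print(f"{needed / 1e12:.2f} TB needed for {n} points")  # prints 3.24 TB
```

At roughly 3.24 TB the full set of pairwise distances cannot be held in RAM on an ordinary machine, so any approach that materialises it will fail regardless of how the input array is built.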

Re: [Numpy-discussion] MemoryError : with scipy.spatial.distance

2012-04-04 Thread Chris Barker
On Wed, Apr 4, 2012 at 4:17 PM, Abhishek Pratap wrote: I am trying to cluster close to 900K points using the DBSCAN algorithm. My input is a list of ~900k tuples, each holding the (x, y) coordinates of one point. I am converting them to a NumPy array and passing them to the pdist method of scipy.spatial.distance for calculating distance between

Re: [Numpy-discussion] MemoryError : with scipy.spatial.distance

2012-04-04 Thread Abhishek Pratap
Thanks, Chris. So I guess the question becomes: how can I efficiently cluster 1 million (x, y) coordinates? -Abhi On Wed, Apr 4, 2012 at 4:35 PM, Chris Barker chris.bar...@noaa.gov wrote: On Wed, Apr 4, 2012 at 4:17 PM, Abhishek Pratap wrote: I am trying to cluster close to 900K points using the DBSCAN algorithm. My input is a list of
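One possible way to avoid the all-pairs distance computation, not something proposed in the thread itself, is a spatial index that only visits points within a given radius. The sketch below uses scipy.spatial.cKDTree and its query_ball_point method. The function name neighbor_counts, the radius eps and the random stand-in data are assumptions for illustration.

```python
import numpy as np
from scipy.spatial import cKDTree


def neighbor_counts(points, eps):
    """Count, for every point, how many points lie within distance eps.

    Each point counts itself, so every entry is at least 1.
    """
    points = np.asarray(points, dtype=float)
    tree = cKDTree(points)
    # One list of neighbour indices per query point; no n x n matrix is built.
    neighbors = tree.query_ball_point(points, r=eps)
    return np.array([len(idx) for idx in neighbors])


# Stand-in data: 10,000 random (x, y) points in the unit square.
rng = np.random.default_rng(0)
sample = rng.random((10_000, 2))
counts = neighbor_counts(sample, eps=0.01)
print(counts.shape, counts.min())
```

Memory use then grows with the number of neighbours actually found rather than with the square of the number of points, which is the property a density-based method such as DBSCAN needs.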

Re: [Numpy-discussion] MemoryError : with scipy.spatial.distance

2012-04-04 Thread Gael Varoquaux
On Wed, Apr 04, 2012 at 04:41:51PM -0700, Abhishek Pratap wrote: Thanks Chris. So I guess the question becomes how can I efficiently cluster 1 million x,y coordinates. Did you try scikit-learn's implementation of DBSCAN (http://scikit-learn.org/stable/modules/clustering.html#dbscan)? I am
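For reference, a minimal sketch of calling scikit-learn's DBSCAN on 2-D points follows. It assumes a recent scikit-learn release in which DBSCAN accepts an algorithm argument; releases from the time of this thread may behave differently with respect to memory. The eps and min_samples values and the synthetic blobs are placeholders, not tuned for the poster's data.

```python
import numpy as np
from sklearn.cluster import DBSCAN

# Synthetic stand-in data: two tight, well-separated blobs of (x, y) points.
rng = np.random.default_rng(0)
blob_a = rng.normal(loc=(0.0, 0.0), scale=0.05, size=(200, 2))
blob_b = rng.normal(loc=(5.0, 5.0), scale=0.05, size=(200, 2))
points = np.vstack([blob_a, blob_b])

# eps and min_samples are illustrative values only.
# algorithm="kd_tree" asks for tree-based neighbour search.
model = DBSCAN(eps=0.5, min_samples=5, algorithm="kd_tree").fit(points)
labels = model.labels_

# The label -1 marks noise points and is not a cluster.
n_clusters = len(set(labels)) - (1 if -1 in labels else 0)
print(n_clusters)
```

Unlike K-means, DBSCAN does not need the number of clusters up front, but the result depends strongly on eps and min_samples, so those would have to be chosen for the real data.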