Re: DBSCAN implementation in Mahout

2014-11-30 Thread Ted Dunning
On Sat, Nov 29, 2014 at 8:31 PM, 3316 Chirag Nagpal chiragnagpal_12...@aitpune.edu.in wrote: Since density-based clustering algorithms are being utilised extensively, especially by the GIS research groups, it is a bit sad that there isn't a MapReduce implementation available. I think I

Re: DBSCAN implementation in Mahout

2014-11-30 Thread 3316 Chirag Nagpal
Hi Ted, Thanks for the reply. I have been using DBSCAN in Python, the one implemented in the scikit-learn package. For a dataset with about 8k points, the running time on my Intel i7-4700QM comes to around 300 seconds. I have implemented a parallel version using the Python multiprocessing library,
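
For readers following the thread, here is a minimal sketch of the DBSCAN procedure being discussed, assuming a brute-force neighbour search over plain Python tuples. It is not the scikit-learn implementation mentioned above and not the poster's parallel version; the names region_query, dbscan and NOISE are hypothetical, chosen only for illustration. Because every neighbour query scans all points, the total work grows roughly with the square of the dataset size.

```python
import math

NOISE = -1  # label for points that belong to no cluster (illustrative constant)


def region_query(points, i, eps):
    """Return indices of all points within eps of points[i] (brute-force scan)."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]


def dbscan(points, eps, min_samples):
    """Label each point with a cluster id (0, 1, ...) or NOISE."""
    labels = [None] * len(points)  # None means "not visited yet"
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_samples:
            labels[i] = NOISE  # may later be relabelled as a border point
            continue
        # points[i] is a core point: start a new cluster and grow it
        cluster += 1
        labels[i] = cluster
        queue = list(neighbors)
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins cluster, not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_samples:
                queue.extend(j_neighbors)  # j is also a core point
    return labels


if __name__ == "__main__":
    sample = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (50, 50)]
    print(dbscan(sample, eps=1.5, min_samples=3))
```

A spatial index (for example a k-d tree) or a partitioning scheme would replace region_query in any version meant for large inputs; the brute-force scan is kept here only to keep the sketch short.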

Re: DBSCAN implementation in Mahout

2014-11-30 Thread Ted Dunning
What happens with 8 million points and 1000 threads? Or 8 billion points?

On Sun, Nov 30, 2014 at 1:10 PM, 3316 Chirag Nagpal chiragnagpal_12...@aitpune.edu.in wrote:
Hi Ted, Thanks for the reply. I have been using DBSCAN in Python, the one implemented in the scikit-learn package. For a