Hi Andreas,

Looking through the literature, DBCLASD has been a constant reference in both practical applications and benchmark papers as recent as 2014 (see for example "A Survey of Clustering Algorithms for Big Data: Taxonomy and Empirical Analysis" or "A new clustering algorithm with adaptive attractor for LIDAR points"). The paper itself has been cited over 260 times. The most interesting feature of this algorithm is its non-parametric nature, which allows it to address clustering problems without imposing hard constraints such as the number of clusters. It can be used precisely as a way to estimate K for KMeans (and related problems) :-)
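To make the "estimate K for KMeans" idea concrete, here is a minimal sketch. Since DBCLASD is not available in scikit-learn, DBSCAN stands in for it purely for illustration; unlike DBCLASD, DBSCAN is not parameter-free, so the `eps` and `min_samples` values below are hand-picked assumptions for this synthetic dataset, not recommendations.

```python
from sklearn.cluster import DBSCAN, KMeans
from sklearn.datasets import make_blobs

# Synthetic data: three tight, well-separated blobs (illustrative only).
centers = [[0.0, 0.0], [10.0, 10.0], [-10.0, 10.0]]
X, _ = make_blobs(n_samples=300, centers=centers, cluster_std=0.3,
                  random_state=0)

# Density-based pass. DBSCAN is a stand-in for DBCLASD here;
# eps and min_samples are assumptions tuned to this toy data.
labels = DBSCAN(eps=1.0, min_samples=5).fit(X).labels_

# Number of clusters found, ignoring the noise label (-1).
k_estimate = len(set(labels) - {-1})

# Use the estimate as K for KMeans.
km = KMeans(n_clusters=k_estimate, n_init=10, random_state=0).fit(X)
print("estimated K:", k_estimate)
```

With a real DBCLASD implementation the first step would need no such parameters, which is exactly what makes it attractive for this purpose.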
Comparing clustering algorithms is always a tricky task, but according to the 2014 survey mentioned above, DBCLASD is quite similar to DBSCAN and OPTICS in terms of the dataset sizes it handles; it copes with noise (as opposed to DBSCAN, BIRCH and K-Means), and it has a complexity of O(3n^2), which compares with DBSCAN's O(n^2).

Regards,
Sebastian

On 31 July 2015 at 18:43, Andreas Mueller <t3k...@gmail.com> wrote:

> Hi Sebastian.
> Have you seen this used much recently? How does it compare against DBSCAN,
> BIRCH, OPTICS or just KMeans?
>
> Cheers,
> Andy
>
> On 07/31/2015 10:28 AM, Sebastián Palacio wrote:
>
> Hello all,
>
> I've been investigating clustering algorithms with special interest in
> non-parametric methods and, one that is being mentioned quite often is
> DBCLASD [1]. I've looked around but I haven't been able to find one single
> implementation of this algorithm whatsoever so I decided to implement my
> own.
>
> My first running version is already on GitHub: https://goo.gl/V4HOVH
> I tried to make it as simple as possible for anyone to run it: it's all
> written in Python, requires only "standard" python packages (numpy,
> scikit-learn, scipy and matplotlib) and it comes with a main routine that
> runs an example.
>
> I would really appreciate some feedback from the community, regarding the
> correctness of this implementation (if you happen to have some experience
> with the algorithm) and perhaps a discussion about how useful this
> algorithm may be in order to decide whether it makes sense to integrate it
> into a future version of scikit-learn or not. Thanks in advance for your
> time :-)
>
> Regards,
> Sebastian
------------------------------------------------------------------------------
_______________________________________________
Scikit-learn-general mailing list
Scikit-learn-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general