[Scikit-learn-general] misleading example for DBSCAN?

Johannes Knopp Thu, 04 Apr 2013 10:01:01 -0700

Hi everyone,


I just stumbled upon the example plot_dbscan.py at [1]. As far as I
understand, the similarity matrix S is computed from the data in X and
then used for clustering with DBSCAN. What confused me is that the
documentation for DBSCAN.fit(X) [2] says that it takes a *distance*
matrix.

Here is the code snippet:

------------------------
# Compute similarities
D = distance.squareform(distance.pdist(X))
S = 1 - (D / np.max(D))

# Compute DBSCAN
db = DBSCAN(eps=0.95, min_samples=10).fit(S)
------------------------

Shouldn't it be "[...].fit(D)" instead?
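
To make the question concrete, here is a minimal sketch of what I would
have expected. It rests on assumptions that are not in the example: the
toy data, the value eps=0.5 (a distance threshold chosen for this toy
data, not the 0.95 used above), and the use of metric="precomputed" to
tell DBSCAN that the input is already a distance matrix. I have not
checked how fit() interprets S in the version the example was written
for.

```python
import numpy as np
from scipy.spatial import distance
from sklearn.cluster import DBSCAN

# Toy data (an assumption for illustration): two tight, well-separated blobs.
rng = np.random.RandomState(0)
X = np.vstack([
    rng.normal(loc=0.0, scale=0.1, size=(30, 2)),
    rng.normal(loc=5.0, scale=0.1, size=(30, 2)),
])

# Pairwise Euclidean distances, computed as in the example.
D = distance.squareform(distance.pdist(X))

# Pass the distance matrix itself. eps is now a distance threshold;
# 0.5 is an arbitrary choice that suits this toy data.
db_precomputed = DBSCAN(eps=0.5, min_samples=10, metric="precomputed").fit(D)

# Reference run: let DBSCAN compute Euclidean distances from X directly.
db_raw = DBSCAN(eps=0.5, min_samples=10).fit(X)

labels_precomputed = db_precomputed.labels_
labels_raw = db_raw.labels_
```

If my understanding is right, both runs should assign the same labels,
because both work on the same Euclidean distances.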

I would be happy if anybody could explain whether my understanding is
wrong or the example is flawed.

Best regards,

Johannes

[1] http://scikit-learn.org/dev/auto_examples/cluster/plot_dbscan.html#example-cluster-plot-dbscan-py
[2] http://scikit-learn.org/dev/modules/generated/sklearn.cluster.DBSCAN.html#sklearn.cluster.DBSCAN

-- 
Knowledge Representation & Knowledge Management Research Group
68159 Mannheim
B6, 26, Room C 1.10
