Hi Lars.
All I can say is that it worked for me when passing X directly:
http://scikit-learn.org/dev/auto_examples/cluster/plot_cluster_comparison.html
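
For what it's worth, here is a minimal sketch of the two input modes the
docstring describes. It assumes a recent scikit-learn release where eps and
min_samples are constructor arguments (the 0.10 demo passed them to fit);
the toy data and the parameter values are made up for illustration and are
not taken from the example page.

```python
import numpy as np
from scipy.spatial import distance
from sklearn.cluster import DBSCAN

# Toy data (hypothetical): two tight, well-separated 2-D blobs.
rng = np.random.RandomState(0)
X = np.vstack([
    rng.normal(0.0, 0.1, size=(50, 2)),
    rng.normal(5.0, 0.1, size=(50, 2)),
])

# Mode 1: pass the feature array directly; DBSCAN computes the
# (euclidean, by default) distances itself.
labels_features = DBSCAN(eps=1.0, min_samples=5).fit(X).labels_

# Mode 2: pass a square distance matrix and say so explicitly.
# Without metric="precomputed" the matrix would be treated as a
# feature array with n_samples columns.
D = distance.squareform(distance.pdist(X))
labels_precomputed = DBSCAN(
    eps=1.0, min_samples=5, metric="precomputed"
).fit(D).labels_
```

On data like this both calls should produce the same labelling, since they
see the same distances. Note that eps is a distance threshold, so a
similarity matrix such as S = 1 - D / max(D) would need to be converted back
to distances before being used with metric="precomputed".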

I'm on a deadline right now; hopefully I'll have time to work on Olivier's
"quadratic_fit" (or whatever) proposal afterward.

Cheers,
Andy


On 03/13/2012 10:05 PM, Lars Buitinck wrote:
> Hi all,
>
> A colleague approached me today asking how the scikit-learn DBSCAN
> algorithm should be applied, and I must admit that the documentation
> and example were confusing even to me. The fit docstring says
>
>      X: array [n_samples, n_samples] or [n_samples, n_features]
>          Array of distances between samples, or a feature array.
>          The array is treated as a feature array unless the metric is given as
>          'precomputed'.
>
> However, the online demo [1] does the following:
>
>      D = distance.squareform(distance.pdist(X))
>      S = 1 - (D / np.max(D))
>
>      db = DBSCAN().fit(S, eps=0.95, min_samples=10)
>
> which uses a similarity matrix rather than a feature matrix as input
> without passing metric="precomputed". Am I missing some interesting
> clustering trick here, or is this a bug? I tried running the example
> with the original feature matrix X (without tuning the parameters) and
> it gave different output: all points were considered a single cluster
> with no outliers.
>
> TIA,
> Lars
>
> [1] http://scikit-learn.org/0.10/auto_examples/cluster/plot_dbscan.html
>
>    


_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
