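See the sample_size parameter: silhouette_score can be computed on a
random subset of the data, presumably for efficiency, and random_state
seeds that subsampling. With sample_size=None (the default) random_state
has no effect. Feel free to submit a PR improving the docstring.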
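
A quick way to check this, as a rough sketch rather than anything taken
from the library's tests: the data below comes from make_blobs and is
purely illustrative, and the variable names are my own.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_samples, silhouette_score

# Synthetic data for illustration only; y is used as the cluster labels.
X, y = make_blobs(n_samples=300, centers=3, random_state=0)

# Without sample_size, the score is the mean of the per-sample coefficients.
full = silhouette_score(X, y, metric='euclidean')
mean_of_samples = np.mean(silhouette_samples(X, y, metric='euclidean'))

# With sample_size, the score is computed on a random subset of the rows;
# random_state fixes which rows are drawn, so repeated calls agree.
sub_a = silhouette_score(X, y, metric='euclidean',
                         sample_size=100, random_state=0)
sub_b = silhouette_score(X, y, metric='euclidean',
                         sample_size=100, random_state=0)

print(full, mean_of_samples)
print(sub_a, sub_b)
```

The first two numbers should match up to floating-point error, and the
last two should be identical because they use the same seed. A different
random_state would generally give a slightly different subsampled score.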

On 16 June 2015 at 13:54, Sebastian Raschka <se.rasc...@gmail.com> wrote:

> Hi, all,
>
> I am a little bit confused about the two related metrics silhouette_score
> and silhouette_samples. The silhouette_samples calculates the silhouette
> coefficient for each sample and returns an array of those. However, I am
> wondering if I interpret the silhouette_score correctly. Based on the
> documentation at
> http://scikit-learn.org/stable/modules/generated/sklearn.metrics.silhouette_score.html
> I assume that it's just the average of the silhouette coefficients, which
> can be confirmed by running, e.g.,
>
> np.mean(silhouette_samples(X, y, metric='euclidean'))
>
> Now, I am wondering why silhouette_score has this additional random_state
> parameter?
>
> Best,
> Sebastian
>
> ------------------------------------------------------------------------------
> _______________________________________________
> Scikit-learn-general mailing list
> Scikit-learn-general@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
>