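See the sample_size parameter: silhouette_score can be computed on a random subset of the data, presumably for efficiency, and random_state controls how that subset is drawn. With the default sample_size=None, all samples are used and random_state has no effect. Feel free to submit a PR improving the docstring.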
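To make this concrete, here is a minimal sketch. The toy data from make_blobs, the sizes, and the seed values are my own assumptions for illustration and are not taken from the original question. It checks that the full score matches the mean of silhouette_samples, and that random_state only matters once sample_size is set.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_samples, silhouette_score

# Toy data: make_blobs and the sizes here are illustrative assumptions.
X, y = make_blobs(n_samples=300, centers=3, random_state=0)

# Default sample_size=None: the score is the mean of the per-sample coefficients.
full_score = silhouette_score(X, y, metric='euclidean')
mean_of_samples = np.mean(silhouette_samples(X, y, metric='euclidean'))
print(np.isclose(full_score, mean_of_samples))

# With sample_size set, the score is computed on a random subset of the data;
# random_state fixes which subset is drawn, so repeated calls agree.
sub_a = silhouette_score(X, y, metric='euclidean', sample_size=100, random_state=42)
sub_b = silhouette_score(X, y, metric='euclidean', sample_size=100, random_state=42)
print(sub_a == sub_b)
```

The subsampled value is only an estimate of the full score, so it will generally differ a little from full_score.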

On 16 June 2015 at 13:54, Sebastian Raschka <se.rasc...@gmail.com> wrote:
> Hi, all,
>
> I am a little bit confused about the two related metrics silhouette_score
> and silhouette_samples. The silhouette_samples calculates the silhouette
> coefficient for each sample and returns an array of those. However, I am
> wondering if I interpret the silhouette_score correctly. Based on the
> documentation at
> http://scikit-learn.org/stable/modules/generated/sklearn.metrics.silhouette_score.html
> I assume that it's just the average of the silhouette coefficients, which
> can be confirmed by running, e.g.,
>
> np.mean(silhouette_samples(X, y, metric='euclidean'))
>
> Now, I am wondering why silhouette_score has this additional random_state
> parameter?
>
> Best,
> Sebastian
>
> ------------------------------------------------------------------------------
> _______________________________________________
> Scikit-learn-general mailing list
> Scikit-learn-general@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general