Re: [scikit-learn] Silhouette example - performance issue
And we can reduce any substantial performance issues by merging
https://github.com/scikit-learn/scikit-learn/pull/7177 ... :)

On 15 October 2016 at 00:55, Michael Eickenberg <michael.eickenb...@gmail.com> wrote:

> Dear Anaël,
>
> if you wish, you could add a line to the example verifying this
> correspondence. E.g. by moving the print function from between the two
> silhouette evaluations to after and also evaluating that average and
> printing it in parentheses.
>
> Probably not necessary though. A comment would do also. Or nothing :)
>
> Michael

___
scikit-learn mailing list
scikit-learn@python.org
https://mail.python.org/mailman/listinfo/scikit-learn
Re: [scikit-learn] Silhouette example - performance issue
Dear Anaël,

if you wish, you could add a line to the example verifying this
correspondence. E.g. by moving the print function from between the two
silhouette evaluations to after and also evaluating that average and
printing it in parentheses.

Probably not necessary though. A comment would do also. Or nothing :)

Michael

On Fri, Oct 14, 2016 at 3:38 PM, Raghav R V wrote:

> Hi,
>
> When I wrote it, I intended it to be demonstrative of the two methods.
>
> Not sure if we should worry about performance issues there.
>
> --
> Raghav RV
> https://github.com/raghavrv
Re: [scikit-learn] Silhouette example - performance issue
On Fri, Oct 14, 2016 at 3:27 PM, Anaël Bonneton wrote:

> Hi,
>
> In the silhouette example
> (http://scikit-learn.org/stable/auto_examples/cluster/plot_kmeans_silhouette_analysis.html#sphx-glr-auto-examples-cluster-plot-kmeans-silhouette-analysis-py),
> the silhouette values of each sample are computed twice: once with
> *silhouette_score* and once with *silhouette_samples*. The call to
> *silhouette_score* can easily be avoided by computing the average of the
> result of *silhouette_samples*.
>
> Do you think we should remove the call to *silhouette_score* to improve
> performance? Or is it better to keep the two functions to show how to
> use them?

Hi,

When I wrote it, I intended it to be demonstrative of the two methods.

Not sure if we should worry about performance issues there.

--
Raghav RV
https://github.com/raghavrv
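
For readers of the archive, the correspondence discussed in this thread (the mean of silhouette_samples equals silhouette_score) can be checked with a minimal sketch like the one below. It is not the code of the gallery example: the dataset, the number of clusters, and the variable names are assumptions chosen for illustration. The equality is expected only when silhouette_score is called without sample_size, since that option scores a random subsample.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_samples, silhouette_score

# Illustrative synthetic data (assumed parameters, not those of the example).
X, _ = make_blobs(n_samples=300, centers=4, random_state=0)
labels = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(X)

# One silhouette value per sample; this is the expensive computation.
sample_values = silhouette_samples(X, labels)

# Average derived from the per-sample values, with no second pass.
avg_from_samples = float(np.mean(sample_values))

# The separate call that the thread proposes to avoid.
avg_from_score = float(silhouette_score(X, labels))

# Print after both evaluations, with the derived average in parentheses.
print("silhouette_score: %.6f (mean of silhouette_samples: %.6f)"
      % (avg_from_score, avg_from_samples))
```

Reusing sample_values for both the plot and the average would save one full pass over the pairwise distances, at the cost of no longer demonstrating silhouette_score itself.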