Re: [scikit-learn] Silhouette example - performance issue

2016-10-18 Thread Joel Nothman
And we can reduce any substantial performance issues by merging
https://github.com/scikit-learn/scikit-learn/pull/7177 ... :)
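
For readers of the archive, the suggestion quoted below (print after both silhouette evaluations, with the average of the per-sample values in parentheses) might look roughly like the following sketch. The helper name report_silhouettes, the synthetic data and the variable names are assumptions made for illustration; this is not the code of the gallery example.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_samples, silhouette_score


def report_silhouettes(X, cluster_counts=(2, 3, 4), random_state=0):
    """Print the average silhouette per cluster count, with a cross-check.

    Hypothetical helper for this sketch; it is not part of scikit-learn
    or of the gallery example.
    """
    results = []
    for n_clusters in cluster_counts:
        clusterer = KMeans(n_clusters=n_clusters, n_init=10,
                           random_state=random_state)
        cluster_labels = clusterer.fit_predict(X)

        # The two evaluations discussed in the thread.
        silhouette_avg = silhouette_score(X, cluster_labels)
        sample_silhouette_values = silhouette_samples(X, cluster_labels)

        # Print after both evaluations, with the mean of the per-sample
        # values in parentheses so the correspondence is visible.
        samples_mean = sample_silhouette_values.mean()
        print("For n_clusters = %d the average silhouette_score is %.6f "
              "(mean of silhouette_samples: %.6f)"
              % (n_clusters, silhouette_avg, samples_mean))
        results.append((n_clusters, silhouette_avg, samples_mean))
    return results


# Arbitrary synthetic data, chosen only to make the sketch runnable.
X, _ = make_blobs(n_samples=200, centers=3, random_state=0)
results = report_silhouettes(X)
```

The two numbers printed on each line should agree up to floating-point rounding, which is the correspondence the quoted message proposes to verify.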

On 15 October 2016 at 00:55, Michael Eickenberg <
michael.eickenb...@gmail.com> wrote:

> Dear Anaël,
>
> If you wish, you could add a line to the example verifying this
> correspondence, e.g. by moving the print call from between the two
> silhouette evaluations to after them, also computing the average of the
> silhouette_samples result, and printing it in parentheses.
>
> Probably not necessary though. A comment would do also. Or nothing :)
>
> Michael
>
>
> On Fri, Oct 14, 2016 at 3:38 PM, Raghav R V wrote:
>
>> On Fri, Oct 14, 2016 at 3:27 PM, Anaël Bonneton wrote:
>>
>>> Hi,
>>>
>>> In the silhouette example
>>> (http://scikit-learn.org/stable/auto_examples/cluster/plot_kmeans_silhouette_analysis.html#sphx-glr-auto-examples-cluster-plot-kmeans-silhouette-analysis-py),
>>> the silhouette value of each sample is computed twice: once with
>>> *silhouette_score* and once with *silhouette_samples*. The call to
>>> *silhouette_score* can easily be avoided by computing the average of
>>> the result of *silhouette_samples*.
>>>
>>> Do you think we should remove the call to *silhouette_score* to improve
>>> performance? Or is it better to keep both functions to show how to use
>>> them?
>>>
>> Hi,
>>
>> When I wrote it, I intended it to be demonstrative of the two methods.
>>
>> Not sure if we should worry about performance issues there.
>>
>>
>> --
>> Raghav RV
>> https://github.com/raghavrv
>>
>>
>> ___
>> scikit-learn mailing list
>> scikit-learn@python.org
>> https://mail.python.org/mailman/listinfo/scikit-learn
>>
>>


Re: [scikit-learn] Silhouette example - performance issue

2016-10-14 Thread Raghav R V
On Fri, Oct 14, 2016 at 3:27 PM, Anaël Bonneton wrote:

> Hi,
>
> In the silhouette example
> (http://scikit-learn.org/stable/auto_examples/cluster/plot_kmeans_silhouette_analysis.html#sphx-glr-auto-examples-cluster-plot-kmeans-silhouette-analysis-py),
> the silhouette value of each sample is computed twice: once with
> *silhouette_score* and once with *silhouette_samples*. The call to
> *silhouette_score* can easily be avoided by computing the average of
> the result of *silhouette_samples*.
>
> Do you think we should remove the call to *silhouette_score* to improve
> performance? Or is it better to keep both functions to show how to use
> them?
>
Hi,

When I wrote it, I intended it to be demonstrative of the two methods.

Not sure if we should worry about performance issues there.
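
For reference, the single-call variant proposed in the quoted question can be sketched as below: silhouette_samples is called once and the overall score is taken as the mean of its result. The data, the number of clusters and the variable names are assumptions chosen for illustration. The equivalence relies on silhouette_score being called with its default sample_size=None; with a sample_size set, silhouette_score works on a random subsample and the two values would generally differ.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_samples, silhouette_score

# Arbitrary synthetic data and clustering, only to make the sketch runnable.
X, _ = make_blobs(n_samples=300, centers=4, random_state=0)
cluster_labels = KMeans(n_clusters=4, n_init=10,
                        random_state=0).fit_predict(X)

# Single pass: per-sample coefficients, then their mean as the overall score.
sample_silhouette_values = silhouette_samples(X, cluster_labels)
silhouette_avg = float(np.mean(sample_silhouette_values))

# Cross-check only; this is the call the proposal would drop.
reference = silhouette_score(X, cluster_labels)
matches = bool(np.isclose(silhouette_avg, reference))
print("mean of silhouette_samples:", silhouette_avg,
      "| matches silhouette_score:", matches)
```

Whether dropping the second call is worth it in a demonstrative example is the open question of this thread; the sketch only shows that the numbers coincide.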


-- 
Raghav RV
https://github.com/raghavrv