Re: [scikit-learn] Silhouette example - performance issue

2016-10-18 Thread Joel Nothman
And we can reduce any substantial performance issues by merging
https://github.com/scikit-learn/scikit-learn/pull/7177 ... :)

___
scikit-learn mailing list
scikit-learn@python.org
https://mail.python.org/mailman/listinfo/scikit-learn


Re: [scikit-learn] Silhouette example - performance issue

2016-10-14 Thread Michael Eickenberg
Dear Anaël,

if you wish, you could add a line to the example verifying this
correspondence, e.g. by moving the print call from between the two
silhouette evaluations to after them, and also computing the average of the
silhouette_samples result and printing it in parentheses.

Probably not necessary, though. A comment would also do. Or nothing :)

Michael
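
A minimal sketch of the check suggested above, not the actual example code: both evaluations run first, then a single print reports the score with the mean of the per-sample values in parentheses. The toy data, the variable names, and the message wording are assumptions made for illustration. Only `silhouette_score` and `silhouette_samples` come from the thread.

```python
import numpy as np
from sklearn.metrics import silhouette_samples, silhouette_score

# Toy data (assumed for illustration): two well-separated groups of three points.
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
              [10.0, 10.0], [10.0, 11.0], [11.0, 10.0]])
cluster_labels = np.array([0, 0, 0, 1, 1, 1])

# The two silhouette evaluations, kept side by side.
silhouette_avg = silhouette_score(X, cluster_labels)
sample_silhouette_values = silhouette_samples(X, cluster_labels)

# Print after both evaluations, with the mean of the per-sample values
# in parentheses so the correspondence is visible in the output.
samples_mean = sample_silhouette_values.mean()
print("The average silhouette_score is: %.4f (mean of silhouette_samples: %.4f)"
      % (silhouette_avg, samples_mean))
```

The two numbers should agree as long as `silhouette_score` is called with its default `sample_size=None`. With subsampling, the score is computed on a random subset and need not match.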




Re: [scikit-learn] Silhouette example - performance issue

2016-10-14 Thread Raghav R V
On Fri, Oct 14, 2016 at 3:27 PM, Anaël Bonneton wrote:

> Hi,
>
> In the silhouette example
> (http://scikit-learn.org/stable/auto_examples/cluster/plot_kmeans_silhouette_analysis.html#sphx-glr-auto-examples-cluster-plot-kmeans-silhouette-analysis-py),
> the silhouette values of each sample are computed twice: once with
> *silhouette_score* and once with *silhouette_samples*. The call to
> *silhouette_score* can easily be avoided by computing the average of the
> result of *silhouette_samples*.
>
> Do you think we should remove the call to *silhouette_score* to improve
> performance? Or is it better to keep both functions to show how to use
> them?
>
Hi,

When I wrote it, I intended it to be demonstrative of the two methods.

I'm not sure we should worry about performance issues there.


-- 
Raghav RV
https://github.com/raghavrv


[scikit-learn] Silhouette example - performance issue

2016-10-14 Thread Anaël Bonneton
Hi,

In the silhouette example
(http://scikit-learn.org/stable/auto_examples/cluster/plot_kmeans_silhouette_analysis.html#sphx-glr-auto-examples-cluster-plot-kmeans-silhouette-analysis-py),
the silhouette values of each sample are computed twice: once with
*silhouette_score* and once with *silhouette_samples*. The call to
*silhouette_score* can easily be avoided by computing the average of the
result of *silhouette_samples*.

Do you think we should remove the call to *silhouette_score* to improve
performance? Or is it better to keep both functions to show how to use
them?

Anaël Bonneton
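
The proposal above could look roughly like the following sketch, which calls `silhouette_samples` once per clustering and takes the mean of its result instead of also calling `silhouette_score`. The synthetic data, the range of cluster counts, the seeds, and the `average_by_k` dictionary are assumptions for illustration, not taken from the example itself.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_samples

# Synthetic data; sizes and seeds are arbitrary choices for illustration.
X, _ = make_blobs(n_samples=300, centers=4, cluster_std=1.0, random_state=0)

average_by_k = {}  # hypothetical container: n_clusters -> average silhouette
for n_clusters in (2, 3, 4, 5):
    clusterer = KMeans(n_clusters=n_clusters, n_init=10, random_state=0)
    cluster_labels = clusterer.fit_predict(X)

    # One silhouette computation: the per-sample values used for plotting...
    sample_silhouette_values = silhouette_samples(X, cluster_labels)
    # ...and their mean, used in place of a separate silhouette_score call.
    average_by_k[n_clusters] = float(np.mean(sample_silhouette_values))

for n_clusters, silhouette_avg in average_by_k.items():
    print("For n_clusters = %d, the average silhouette is %.3f"
          % (n_clusters, silhouette_avg))
```

This saves one silhouette computation per value of `n_clusters`. The trade-off raised in the thread still applies: the example would then no longer show how to call `silhouette_score`.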