Upgraded from Python 2.6 -> 2.7, then installed numpy, scipy, and
scikit-learn on that. Now starting to get to grips with sklearn.
Thanks for advice, Nigel.
Regards,
Nigel Legg
07722 652866
http://twitter.com/nigellegg
http://uk.linkedin.com/in/nigellegg
To support the footyvoice app, please visi
Hi Alexandre,
That sounds great. I will try it and let you know soon.
Regards,
T.Bao
On Fri, May 10, 2013 at 6:19 PM, Alexandre ABRAHAM <
abraham.alexan...@gmail.com> wrote:
> Bao,
>
> Sorry for the delay. I have pushed a new version of the code to the gist
> (there is now an n_jobs keyword p
Bao,
Sorry for the delay. I have pushed a new version of the code to the gist
(there is now an n_jobs keyword parameter). It should use a bit more memory.
Quick benchmark (see main in the gist):
Scikit silhouette (113.294149s): -0.013992
Block silhouette (23.485517s): -0.013992
Block silhouette parallel
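
For readers without access to the gist, the idea behind a block-wise silhouette can be sketched as follows. This is not the code from the gist: the names `euclidean_block` and `silhouette_by_blocks` are made up for illustration, the Euclidean metric is an assumption, and there is no n_jobs support. The point is only that distances are computed one pair of clusters at a time, so the full n x n distance matrix is never held in memory.

```python
import numpy as np


def euclidean_block(A, B):
    """Pairwise Euclidean distances between the rows of A and the rows of B."""
    sq = (A ** 2).sum(axis=1)[:, None] + (B ** 2).sum(axis=1)[None, :] - 2.0 * (A @ B.T)
    return np.sqrt(np.maximum(sq, 0.0))  # clip tiny negatives caused by round-off


def silhouette_by_blocks(X, labels):
    """Mean silhouette, computed one cluster-pair distance block at a time."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    clusters = [np.flatnonzero(labels == k) for k in np.unique(labels)]
    if len(clusters) < 2:
        raise ValueError("the silhouette needs at least two clusters")

    a = np.zeros(len(X))         # mean distance to the other points of the same cluster
    b = np.full(len(X), np.inf)  # mean distance to the nearest other cluster

    for i, idx_i in enumerate(clusters):
        if len(idx_i) > 1:
            within = euclidean_block(X[idx_i], X[idx_i])
            np.fill_diagonal(within, 0.0)  # a point is at distance 0 from itself
            a[idx_i] = within.sum(axis=1) / (len(idx_i) - 1)
        for j, idx_j in enumerate(clusters):
            if i != j:
                between = euclidean_block(X[idx_i], X[idx_j]).mean(axis=1)
                b[idx_i] = np.minimum(b[idx_i], between)

    s = (b - a) / np.maximum(a, b)
    for idx in clusters:
        if len(idx) == 1:
            s[idx] = 0.0  # common convention: singleton clusters score 0
    return float(s.mean())


if __name__ == "__main__":
    X_demo = np.array([[0.0], [1.0], [10.0], [11.0]])
    labels_demo = np.array([0, 0, 1, 1])
    print("block silhouette:", silhouette_by_blocks(X_demo, labels_demo))
```

The largest array alive at any time is one cluster-by-cluster block, which is why the number of clusters matters so much for both speed and memory.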
the dataset is clustered into 50 clusters
>
OK, so each cluster contains approximately 5K elements, which means
per-cluster distance matrices of about 25 000K entries (5K x 5K).
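
To put that figure in bytes, here is a small back-of-the-envelope calculation. It assumes distances are stored as 8-byte floats (float64), which the thread does not state; the helper name `distance_matrix_bytes` is made up for illustration. The total of 260486 points is the dataset size reported elsewhere in this thread.

```python
BYTES_PER_FLOAT64 = 8  # assumption: distances stored as float64


def distance_matrix_bytes(n_points, bytes_per_entry=BYTES_PER_FLOAT64):
    """Size in bytes of a dense n_points x n_points distance matrix."""
    return n_points * n_points * bytes_per_entry


cluster_size = 5_000   # roughly 5K elements per cluster, as estimated above
n_samples = 260_486    # dataset size reported in the thread

block_entries = cluster_size ** 2
block_bytes = distance_matrix_bytes(cluster_size)
full_bytes = distance_matrix_bytes(n_samples)

print("entries per cluster block:", block_entries)
print("one cluster block: %.0f MB" % (block_bytes / 1e6))
print("full distance matrix: %.0f GB" % (full_bytes / 1e9))
```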
> I have not monitored the memory usage. But the computation time here is
> the real CPU time, not the elapsed time
>
OK.
> I can only run the
Hi Alexandre,
I have a few questions on your experiment though:
> - how many clusters do you have? (The block method's speed and memory
> consumption depend on the number of clusters.)
>
the dataset is clustered into 50 clusters
> - have you monitored memory usage? In particular, did you
Hi Bao,
Thanks for your feedback! I am not surprised that the sampling method
saves time and gives a good approximation, especially considering the size
of your data.
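
As an illustration of the sampling approach, scikit-learn's `silhouette_score` accepts a `sample_size` argument that scores a random subset of the points instead of all of them. The sketch below uses synthetic blobs with hand-picked centres and their generating labels as a stand-in for real cluster assignments; the data, sizes, and seeds are assumptions chosen only to keep the example small.

```python
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

# Synthetic stand-in data: three well-separated 2-D blobs (illustrative only).
X, labels = make_blobs(
    n_samples=2000,
    centers=[[0, 0], [10, 10], [-10, 10]],
    random_state=0,
)

# Exact score: uses all pairwise distances.
full_score = silhouette_score(X, labels)

# Approximate score: computed on a random subset of 500 points.
sampled_score = silhouette_score(X, labels, sample_size=500, random_state=0)

print("full:", full_score)
print("sampled:", sampled_score)
```

On real data the gap between the two scores depends on the sample size and on how uneven the clusters are, so it is worth repeating the sampled run with a few different seeds.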
I have a few questions on your experiment though:
- how many clusters do you have (as the block method speed and memory
consumpti
Hi Alexandre,
I ran silhouette_score_block on my dataset, and this is the result:
dataset size |X| = 260486, dimension 40, RAM 4GB
Trial | Original Ward (whole data) (1) | Original Ward (sub_sample=50K) (2)
      | Silhouette Score | Time (s)    | Silhouette Score | Time (s)
1st   | 0.19045893       | 6250.758648 | 0.189+/