uhhh +1. any chance of using theano with it?
On Sun, May 12, 2013 at 7:35 AM, Alexandre ABRAHAM <
abraham.alexan...@gmail.com> wrote:
> Hey scikit people,
>
> I know that scikit's primary purpose is not to handle big data, but
> would you be interested in a PR of my block silhouette implementation? My
> benchmarks show that it is a bit slower than the scikit one when the data is
> small, but it divides memory usage by n_cluster ^ 2. It can also be
> parallelized. The code is, obviously, less readable.
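
For anyone following the thread, here is a minimal sketch of what a
block-wise silhouette could look like. It is not the code from the gist: the
names block_silhouette and _pairwise_dist are made up for illustration, and it
assumes Euclidean distance and at least two clusters. Only one cluster-pair
distance block is held in memory at a time, which is where a saving on the
order of n_cluster ^ 2 would come from when clusters are roughly balanced.

```python
import numpy as np


def _pairwise_dist(A, B):
    # Euclidean distances between rows of A and rows of B, using the
    # |a|^2 + |b|^2 - 2 a.b expansion so only one (len(A), len(B)) block exists.
    sq = (A ** 2).sum(axis=1)[:, None] + (B ** 2).sum(axis=1)[None, :]
    sq -= 2.0 * A @ B.T
    np.maximum(sq, 0.0, out=sq)  # clip tiny negatives caused by rounding
    return np.sqrt(sq)


def block_silhouette(X, labels):
    """Mean silhouette coefficient, one cluster-pair block at a time (sketch)."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    members = [np.flatnonzero(labels == c) for c in np.unique(labels)]
    if len(members) < 2:
        raise ValueError("need at least two clusters")

    n = X.shape[0]
    a = np.zeros(n)         # mean distance to the other points of the own cluster
    b = np.full(n, np.inf)  # smallest mean distance to any other cluster

    for i, ii in enumerate(members):
        for j in range(i, len(members)):
            jj = members[j]
            D = _pairwise_dist(X[ii], X[jj])
            if i == j:
                np.fill_diagonal(D, 0.0)  # distance of a point to itself
                if len(ii) > 1:
                    a[ii] = D.sum(axis=1) / (len(ii) - 1)
            else:
                # The same block serves both clusters of the pair.
                b[ii] = np.minimum(b[ii], D.mean(axis=1))
                b[jj] = np.minimum(b[jj], D.mean(axis=0))

    denom = np.maximum(a, b)
    s = np.zeros(n)
    np.divide(b - a, denom, out=s, where=denom > 0)
    for ii in members:
        if len(ii) == 1:
            s[ii] = 0.0  # usual convention for singleton clusters
    return s.mean()
```

Each (i, j) block is independent of the others, so the double loop is the
natural place to parallelize, with the per-block minima merged at the end.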
>
> I am currently working with data that does not fit in memory, so I try to
> minimize memory usage as much as I can. I have also implemented an online
> variance (and explained variance) object based on the [Chan79] approach (there
> may be better ones, I haven't checked). This is not hard to code, but it can
> be useful for some people.
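
As a rough illustration of the online variance idea, here is a minimal sketch
of a chunk-wise accumulator that uses the pairwise merge formula from
[Chan79]. The class name OnlineVariance and the partial_fit method are
assumptions made for this example, not the object described above, and the
explained variance part is left out.

```python
import numpy as np


class OnlineVariance:
    """Per-feature mean and variance accumulated over chunks (sketch)."""

    def __init__(self):
        self.n = 0        # number of samples seen so far
        self.mean = None  # running per-feature mean
        self.m2 = None    # running sum of squared deviations from the mean

    def partial_fit(self, X):
        X = np.asarray(X, dtype=float)
        n_b = X.shape[0]
        if n_b == 0:
            return self
        mean_b = X.mean(axis=0)
        m2_b = ((X - mean_b) ** 2).sum(axis=0)
        if self.n == 0:
            self.n, self.mean, self.m2 = n_b, mean_b, m2_b
        else:
            # Pairwise merge of the running statistics with the chunk's.
            n = self.n + n_b
            delta = mean_b - self.mean
            self.m2 = self.m2 + m2_b + delta ** 2 * self.n * n_b / n
            self.mean = self.mean + delta * n_b / n
            self.n = n
        return self

    @property
    def variance(self):
        # Population variance (ddof=0), matching the default of numpy.var.
        return self.m2 / self.n
```

Feeding the data chunk by chunk, for example
`for chunk in np.array_split(X, 10): ov.partial_fit(chunk)`, should give the
same `ov.variance` as `X.var(axis=0)` up to rounding, without ever holding
more than one chunk in memory.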
>
> Alexandre.
>
> [Chan79] Chan, Golub, and LeVeque, "Updating Formulae and a Pairwise
> Algorithm for Computing Sample Variances," 1979.
>
>
>
> On Fri, May 10, 2013 at 6:26 PM, Bao Thien <ntba...@gmail.com> wrote:
>
>> Hi Alexandre,
>>
>> That sounds great. I will try it and let you know soon.
>>
>> Regards,
>>
>> T.Bao
>>
>>
>> On Fri, May 10, 2013 at 6:19 PM, Alexandre ABRAHAM <
>> abraham.alexan...@gmail.com> wrote:
>>
>>> Bao,
>>>
>>> Sorry for the delay. I have pushed a new version of the code to the gist
>>> (there is now an n_jobs keyword parameter). It should use a bit more memory.
>>>
>>> Quick benchmark (see main in the gist):
>>> Scikit silhouette (113.294149s): -0.013992
>>> Block silhouette (23.485517s): -0.013992
>>> Block silhouette parallel (23.351142s): -0.013992
>>>
>>> I only have 2 cores, so this is not very significant. If you have more,
>>> feedback is welcome!
>>>
>>> Alexandre.
>>>
>>>
>>>
>>>
>>> ------------------------------------------------------------------------------
>>> Learn Graph Databases - Download FREE O'Reilly Book
>>> "Graph Databases" is the definitive new guide to graph databases and
>>> their applications. This 200-page book is written by three acclaimed
>>> leaders in the field. The early access version is available now.
>>> Download your free book today! http://p.sf.net/sfu/neotech_d2d_may
>>> _______________________________________________
>>> Scikit-learn-general mailing list
>>> Scikit-learn-general@lists.sourceforge.net
>>> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
>>>
>>>
>>
>>
>> --
>> Nguyen Thien Bao
>>
>> NeuroInformatics Laboratory (NILab),
>> Fondazione Bruno Kessler (FBK), Trento, Italy
>> Centro Interdipartimentale Mente e Cervello (CIMeC)
>> Università degli Studi di Trento, Italy
>> Email: ntba...@gmail.com or ntbao...@yahoo.com
>> Cellphone: +39.345.293.1006 (Italy)
>> Cellphone: +84.996.352.452 (VietNam)
>>
>>
>
>