Theano for the parallelization?
From what I understand, your PR computes things on the fly to reduce memory
usage instead of computing everything at once. Wouldn't Theano help? As in,
could you perhaps 'theano-ize' the parallel calculation? I consider heavy
numerical processing to be (at least now) mostly the domain of the GPU, as
long as it parallelizes reasonably well.
Also, do you know whether scikit-learn is opposed to adding more dependencies
(just wondering)?
Thanks
On Sun, May 12, 2013 at 11:14 AM, Alexandre ABRAHAM <
abraham.alexan...@gmail.com> wrote:
> Hi Ronnie,
>
> I have never used Theano. Could you be a little more specific? What do
> you want to compute? What is your input data? Basically, all these
> metrics are independent of scikit-learn and take numpy arrays as input, so
> you can use them with any data in that format.
>
> Now, if you want to integrate some online computation directly into
> Theano, this is another story...
>
> Alexandre.
>
>
> On Sun, May 12, 2013 at 4:35 PM, Ronnie Ghose <ronnie.gh...@gmail.com> wrote:
>
>> uhhh +1. any chance of using theano with it?
>>
>>
>> On Sun, May 12, 2013 at 7:35 AM, Alexandre ABRAHAM <
>> abraham.alexan...@gmail.com> wrote:
>>
>>> Hey scikit people,
>>>
>>> I know that handling big data is not the primary purpose of scikit-learn,
>>> but would you be interested in a PR of my block silhouette implementation?
>>> My benchmarks show that it is a bit slower than the scikit-learn one when
>>> the data is small, but it divides memory usage by n_cluster ^ 2. Plus, it
>>> can be parallelized. But, obviously, the code is less readable.
>>>
>>> I am currently working with data that does not fit in memory, so I try to
>>> minimize memory usage as much as I can. I have also implemented an online
>>> variance (and explained variance) object based on the [Chan79] approach
>>> (there may be better ones, I haven't checked). This is not hard to code, but
>>> it can be useful for some people.
>>>
>>> Alexandre.
>>>
>>> [Chan79] Chan, Golub, LeVeque. "Updating Formulae and a Pairwise
>>> Algorithm for Computing Sample Variances." 1979.
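
For readers following along, the online variance idea mentioned above can be sketched as follows. This is a minimal illustration of the pairwise update from [Chan79], not the object from Alexandre's code: the class name `RunningVariance` and its methods are hypothetical, and explained variance is left out. Each batch is summarized by its count, mean, and sum of squared deviations, and two such summaries are merged without revisiting the data.

```python
from dataclasses import dataclass


@dataclass
class RunningVariance:
    """Streaming mean/variance using a pairwise merge (hypothetical helper)."""

    n: int = 0          # number of samples seen so far
    mean: float = 0.0   # running mean
    m2: float = 0.0     # running sum of squared deviations from the mean

    def update(self, batch):
        """Fold one batch (any iterable of numbers) into the running state."""
        batch = [float(x) for x in batch]
        if not batch:
            return self
        nb = len(batch)
        mean_b = sum(batch) / nb
        m2_b = sum((x - mean_b) ** 2 for x in batch)
        return self.merge(RunningVariance(nb, mean_b, m2_b))

    def merge(self, other):
        """Combine two summaries without touching the original samples."""
        if other.n == 0:
            return self
        n = self.n + other.n
        delta = other.mean - self.mean
        # Update the mean first; self.n still holds the old count here.
        self.mean += delta * other.n / n
        self.m2 += other.m2 + delta * delta * self.n * other.n / n
        self.n = n
        return self

    def variance(self, ddof=0):
        """Variance of everything seen so far (ddof=1 for the sample variance)."""
        return self.m2 / (self.n - ddof)


if __name__ == "__main__":
    rv = RunningVariance()
    data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
    for start in range(0, len(data), 3):   # feed the data in chunks of 3
        rv.update(data[start:start + 3])
    print(rv.mean, rv.variance())
```

Because `merge` only needs the three summary numbers, partial results computed on separate chunks (or separate processes) can be combined afterwards, which is what makes this usable when the data does not fit in memory.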
>>>
>>>
>>>
>>> On Fri, May 10, 2013 at 6:26 PM, Bao Thien <ntba...@gmail.com> wrote:
>>>
>>>> Hi Alexandre,
>>>>
>>>> That sounds great. I will try it and let you know soon.
>>>>
>>>> Regards,
>>>>
>>>> T.Bao
>>>>
>>>>
>>>> On Fri, May 10, 2013 at 6:19 PM, Alexandre ABRAHAM <
>>>> abraham.alexan...@gmail.com> wrote:
>>>>
>>>>> Bao,
>>>>>
>>>>> Sorry for the delay. I have pushed a new version of the code to the gist
>>>>> (there is now an n_jobs keyword parameter). It should use a bit more
>>>>> memory.
>>>>>
>>>>> Quick benchmark (see main in the gist):
>>>>> Scikit silhouette (113.294149s): -0.013992
>>>>> Block silhouette (23.485517s): -0.013992
>>>>> Block silhouette parallel (23.351142s): -0.013992
>>>>>
>>>>> I only have 2 cores, so this is not very significant. If you have more,
>>>>> feedback is welcome!
>>>>>
>>>>> Alexandre.
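
The gist itself is not reproduced in this thread, so the following is only a sketch of one plausible reading of a "block" silhouette, assuming Euclidean distances and numpy input. Instead of building the full n x n distance matrix, it visits one pair of clusters at a time, so the largest array in memory is the distance block between two clusters. The function names `block_silhouette` and `_pairwise_euclidean` are hypothetical, and there is no `n_jobs` parameter here.

```python
import numpy as np


def _pairwise_euclidean(a, b):
    """Euclidean distances between the rows of a and the rows of b."""
    sq = (a ** 2).sum(axis=1)[:, None] + (b ** 2).sum(axis=1)[None, :]
    sq -= 2.0 * a @ b.T
    np.maximum(sq, 0.0, out=sq)  # clip tiny negatives caused by rounding
    return np.sqrt(sq)


def block_silhouette(X, labels):
    """Mean silhouette coefficient, computed one cluster pair at a time."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    classes = np.unique(labels)
    if len(classes) < 2:
        raise ValueError("silhouette needs at least 2 clusters")
    members = [np.flatnonzero(labels == c) for c in classes]

    a = np.zeros(len(X))         # mean distance to own cluster
    b = np.full(len(X), np.inf)  # mean distance to nearest other cluster
    for i, rows in enumerate(members):
        # Within-cluster block: exclude each point's distance to itself.
        d = _pairwise_euclidean(X[rows], X[rows])
        np.fill_diagonal(d, 0.0)
        if len(rows) > 1:
            a[rows] = d.sum(axis=1) / (len(rows) - 1)
        # Between-cluster blocks: each block serves both clusters by symmetry.
        for cols in members[i + 1:]:
            d = _pairwise_euclidean(X[rows], X[cols])
            b[rows] = np.minimum(b[rows], d.mean(axis=1))
            b[cols] = np.minimum(b[cols], d.mean(axis=0))

    denom = np.maximum(a, b)
    s = np.divide(b - a, denom, out=np.zeros_like(a), where=denom > 0)
    for rows in members:
        if len(rows) == 1:       # convention: singleton clusters score 0
            s[rows] = 0.0
    return float(s.mean())


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 5))
    labels = np.arange(200) % 4
    print(block_silhouette(X, labels))
```

The inner loop over cluster pairs is independent per pair apart from the final minimum, so it is the natural place to distribute work across processes. Whether that matches the parallelization in the gist is an assumption.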
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> ------------------------------------------------------------------------------
>>>>> Learn Graph Databases - Download FREE O'Reilly Book
>>>>> "Graph Databases" is the definitive new guide to graph databases and
>>>>> their applications. This 200-page book is written by three acclaimed
>>>>> leaders in the field. The early access version is available now.
>>>>> Download your free book today! http://p.sf.net/sfu/neotech_d2d_may
>>>>> _______________________________________________
>>>>> Scikit-learn-general mailing list
>>>>> Scikit-learn-general@lists.sourceforge.net
>>>>> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>> Nguyen Thien Bao
>>>>
>>>> NeuroInformatics Laboratory (NILab),
>>>> Fondazione Bruno Kessler (FBK), Trento, Italy
>>>> Centro Interdipartimentale Mente e Cervello (CIMeC)
>>>> Università degli Studi di Trento, Italy
>>>> Email: ntba...@gmail.com or ntbao...@yahoo.com
>>>> Cellphone: +39.345.293.1006 (Italy)
>>>> Cellphone: +84.996.352.452 (VietNam)
>>>>
>>>>
>>>
>>>
>>
>>
>
>