Well, C-Means is pretty established, but I'm not sure about the benefits
compared to a diagonal covariance GMM.
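
To make the comparison concrete, here is a minimal sketch of the two fuzzy c-means updates on 1-D data. The function names and the default fuzzifier m=2 are illustrative assumptions, not scikit-learn API. A diagonal covariance GMM also yields soft assignments (posterior responsibilities) but additionally fits per-feature variances and mixing weights, which c-means does not.

```python
def fuzzy_memberships(points, centers, m=2.0):
    """Membership u[i][j] of point i in cluster j, for fuzzifier m > 1."""
    exponent = 2.0 / (m - 1.0)
    memberships = []
    for x in points:
        dists = [abs(x - c) for c in centers]
        if 0.0 in dists:
            # Point coincides with a center: all weight goes to the matching center(s).
            hits = [1.0 if d == 0.0 else 0.0 for d in dists]
            total = sum(hits)
            memberships.append([h / total for h in hits])
            continue
        # u_ij = 1 / sum_k (d_ij / d_ik) ** (2 / (m - 1))
        row = [1.0 / sum((d_j / d_k) ** exponent for d_k in dists)
               for d_j in dists]
        memberships.append(row)
    return memberships


def update_centers(points, memberships, m=2.0):
    """New center j is the mean of the points weighted by u_ij ** m."""
    n_clusters = len(memberships[0])
    centers = []
    for j in range(n_clusters):
        weights = [u[j] ** m for u in memberships]
        weighted_sum = sum(w * x for w, x in zip(weights, points))
        centers.append(weighted_sum / sum(weights))
    return centers
```

Alternating these two steps until the centers stop moving gives the basic algorithm; each row of memberships sums to one, much like GMM responsibilities.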
Regarding my continuation of Issam's work:
https://github.com/scikit-learn/scikit-learn/pull/3939
I added momentum and did some refactoring. Olivier would like to see
Nesterov's momentum, callbacks, and built-in online scaling.
I'm not sure if we want to do automatic scaling in online algorithms,
and we haven't really settled on a callback scheme yet (I think GBRT is
the only class that currently allows that).
As for Nesterov's momentum, I just haven't had the time yet.
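
For reference, a minimal sketch of how classical and Nesterov momentum differ, with a per-iteration callback thrown in. This is not the code in the PR: the function name, its parameters, and the "return True to stop" callback protocol are all assumptions made for illustration (the protocol is loosely modelled on the `monitor` argument of gradient boosting's `fit`).

```python
def sgd_momentum(grad, w0, lr=0.1, momentum=0.9, nesterov=False,
                 n_iter=100, callback=None):
    """Minimise a 1-D function given its gradient function `grad`."""
    w, velocity = w0, 0.0
    for i in range(n_iter):
        # Nesterov evaluates the gradient at the look-ahead point
        # w + momentum * velocity; classical momentum evaluates it at w.
        lookahead = w + momentum * velocity if nesterov else w
        velocity = momentum * velocity - lr * grad(lookahead)
        w += velocity
        # Hypothetical callback protocol: a truthy return value stops training.
        if callback is not None and callback(i, w):
            break
    return w
```

The only difference between the two variants is where the gradient is evaluated, so supporting both costs one extra line in the update loop.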
On 03/16/2015 06:19 AM, Michael Eickenberg wrote:
Dear Patrick,
On Mon, Mar 16, 2015 at 11:00 AM, Patrick Urbanke
<[email protected]> wrote:
Hi,
thank you for your responses.
I did take a look at your previous work in this regard as well as the
todo-list and it seems you've made quite some progress. That's great and
I really look forward to seeing the final result, but to be quite honest
with you, I'm afraid "I've added a couple of minor features to a module
that was otherwise pretty well developed" doesn't really make for a good
Bachelor's thesis.
Indeed, it may be too difficult, because it requires reading and
understanding other people's code and the motivation behind design
choices as well as everything about the method.
What I would like my students to do is to develop one
algorithm (or several algorithms) from beginning to end.
So how about we implement some more standard clustering algorithms, for
instance c-means, DIANA or fuzzy subspace clustering? I've searched for
these and a couple of other related keywords in the pull requests and it
appears no one is working on that yet. Correct me if I'm wrong.
Any new contribution needs to be motivated thoroughly. See
http://scikit-learn.org/stable/faq.html for criteria. The methods you
mention do appear in the mainstream literature, but it would be
necessary to pick one of them and make a case for the utility of
adding it to the code base. Acceptance criteria are so stringent
because the cost of maintenance of the code base is already very high.
I'll talk to my student, but my guess is he'll like the idea more than
the multilayer perceptron.
Maybe your student can apply to one of the Google Summer of Code 2015
projects?
Michael
Raghav R V wrote on 2015/3/16 at 07:54 AM:
> Also there is a PR by Andy working towards completing the same (MLP)
> here - https://github.com/scikit-learn/scikit-learn/pull/3939
>
> BTW, that PR does have a nice todo list, which you might want to take
> a look at :)
>
>
>
> R
>
> On Mon, Mar 16, 2015 at 2:39 AM, Joel Nothman
> <[email protected]> wrote:
>> I think #3306 (Extreme Learning Machines) needs review, and after that's
>> merged, focus should return to the MLP PR. I've not been following either of
>> those PRs extremely closely, but I gather that both are quite mature, though
>> not small items for review.
>>
>> On 16 March 2015 at 07:53, Michael Eickenberg
>> <[email protected]> wrote:
>>> Maybe others can comment on the status of this PR and to what extent help
>>> may be needed to finish it?
>>>
>>> Michael
>>>
>>> On Sun, Mar 15, 2015 at 9:47 PM, Michael Eickenberg
>>> <[email protected]> wrote:
>>>> Dear Patrick,
>>>>
>>>> there is an almost finished pull request for multilayer perceptrons from
>>>> last year's GSoC by Issam Laradji:
>>>> https://github.com/scikit-learn/scikit-learn/pull/3204
>>>>
>>>> Michael
>>>>
>>>> On Sun, Mar 15, 2015 at 8:57 PM, Patrick Urbanke
>>>> <[email protected]> wrote:
>>>>> Hello,
>>>>>
>>>>>
>>>>> I'm writing because I would like to contribute a multilayer perceptron
>>>>> module to scikit-learn. On your website it says that I should contact
>>>>> you to avoid duplicating work, so here I am.
>>>>>
>>>>> I'm a research associate and PhD candidate at the University of
>>>>> Göttingen, Germany. All of my research is related to machine learning
>>>>> and I often use scikit-learn to benchmark my own algorithms. I also use
>>>>> scikit-learn for teaching, so thank you all for your great work.
>>>>>
>>>>> I've noticed that scikit-learn still lacks a multilayer perceptron.
>>>>> Since this is a very popular algorithm, I've decided that it would be a
>>>>> good idea to have one of my students develop such a module for his
>>>>> Bachelor's thesis under my supervision. He is very talented and I have
>>>>> no doubt that he can do it. Also, he can build on some code I have
>>>>> already written.
>>>>>
>>>>> Here are the functionalities we would implement:
>>>>> - Classifier and regressor
>>>>> - Trained using SGD with minibatch updating
>>>>> - One hidden layer
>>>>> - Different activation functions (sigmoid, tanh, Gaussian RBF,
>>>>> multiquadric RBF, linear) and the ability to mix them (so you could have
>>>>> a neural network with 5 sigmoid functions, 10 Gaussian RBF and 5
>>>>> multiquadric RBF)
>>>>> - L2 regularization
>>>>>
>>>>> Nice to have:
>>>>> - Support for scipy sparse matrices
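
As a rough illustration of the feature list above, here is a minimal sketch of a forward pass through one hidden layer whose units each use their own activation, plus the L2 penalty term. All names are hypothetical, and applying the Gaussian and multiquadric functions elementwise to the pre-activation is an assumption about what is meant; training by minibatch SGD is not shown.

```python
import math

# Elementwise activations; the Gaussian and multiquadric forms are assumed.
ACTIVATIONS = {
    "sigmoid": lambda z: 1.0 / (1.0 + math.exp(-z)),
    "tanh": math.tanh,
    "gaussian": lambda z: math.exp(-z * z),
    "multiquadric": lambda z: math.sqrt(1.0 + z * z),
    "linear": lambda z: z,
}


def forward(x, hidden_weights, hidden_biases, hidden_kinds,
            out_weights, out_bias):
    """Regression output of a one-hidden-layer network with mixed activations."""
    hidden = []
    for w_row, b, kind in zip(hidden_weights, hidden_biases, hidden_kinds):
        z = sum(w * xi for w, xi in zip(w_row, x)) + b
        hidden.append(ACTIVATIONS[kind](z))
    return sum(w * h for w, h in zip(out_weights, hidden)) + out_bias


def l2_penalty(hidden_weights, out_weights, alpha):
    """0.5 * alpha * (sum of squared weights); biases are left unpenalised."""
    squares = sum(w * w for row in hidden_weights for w in row)
    squares += sum(w * w for w in out_weights)
    return 0.5 * alpha * squares
```

Storing the activation name per hidden unit keeps the mixing explicit; a vectorised implementation would more likely group units of the same kind into blocks.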
>>>>>
>>>>> We would develop the main functionalities in C++ and then write an
>>>>> interface using Cython. Obviously, we would adhere to the coding
>>>>> guidelines
>>>>> (http://scikit-learn.org/stable/developers/index.html#coding-guidelines).
>>>>>
>>>>> Anything else we should consider?
>>>>>
>>>>>
>>>>> Greetings,
>>>>> Patrick Urbanke
>>>>>
>>>>>
>>>>>
>>>>>
------------------------------------------------------------------------------
>>>>> Dive into the World of Parallel Programming The Go Parallel
Website,
>>>>> sponsored
>>>>> by Intel and developed in partnership with Slashdot Media,
is your hub
>>>>> for all
>>>>> things parallel software development, from weekly thought
leadership
>>>>> blogs to
>>>>> news, videos, case studies, tutorials and more. Take a look
and join the
>>>>> conversation now. http://goparallel.sourceforge.net/
>>>>> _______________________________________________
>>>>> Scikit-learn-general mailing list
>>>>> [email protected]
<mailto:[email protected]>
>>>>>
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
>>>>
>>>
>>>
>>>