Well, C-Means is pretty established, but I'm not sure about the benefits compared to
a diagonal covariance GMM.
Regarding my continuation of Issam's work:
https://github.com/scikit-learn/scikit-learn/pull/3939

I added momentum and did some refactoring. Olivier would also like to see Nesterov's momentum,
callbacks, and built-in online scaling.
I'm not sure if we want to do automatic scaling in online algorithms, and we haven't really settled on a callback scheme yet (I think GBRT is the only class that currently allows that).
For Nesterov's momentum, I just haven't had the time yet.
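
To make the difference concrete, here is a rough sketch of classical versus Nesterov's momentum on a toy quadratic. This is not the code from the PR: the function names (`momentum_step`, `minimize`, `quadratic_grad`), the learning rate, and the momentum value are all made up for illustration, and it uses plain Python lists instead of NumPy arrays.

```python
def momentum_step(w, v, grad_fn, lr=0.1, momentum=0.9, nesterov=False):
    """One update of weights w and velocity v (both lists of floats)."""
    if nesterov:
        # Nesterov: evaluate the gradient at the look-ahead point w + momentum * v.
        point = [wi + momentum * vi for wi, vi in zip(w, v)]
    else:
        # Classical momentum: evaluate the gradient at the current point.
        point = w
    g = grad_fn(point)
    v = [momentum * vi - lr * gi for vi, gi in zip(v, g)]
    w = [wi + vi for wi, vi in zip(w, v)]
    return w, v


def minimize(grad_fn, w0, n_steps=300, **kwargs):
    """Run n_steps momentum updates starting from w0 with zero velocity."""
    w = list(w0)
    v = [0.0] * len(w)
    for _ in range(n_steps):
        w, v = momentum_step(w, v, grad_fn, **kwargs)
    return w


def quadratic_grad(w):
    # Gradient of the toy loss 0.5 * (1 * w[0]**2 + 10 * w[1]**2), minimum at the origin.
    return [1.0 * w[0], 10.0 * w[1]]


w_classic = minimize(quadratic_grad, [1.0, 1.0], nesterov=False)
w_nesterov = minimize(quadratic_grad, [1.0, 1.0], nesterov=True)
```

The only difference between the two variants is where the gradient is evaluated, so in this sketch a single `nesterov` flag is enough to switch between them.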

On 03/16/2015 06:19 AM, Michael Eickenberg wrote:
Dear Patrick,

On Mon, Mar 16, 2015 at 11:00 AM, Patrick Urbanke <[email protected] <mailto:[email protected]>> wrote:

    Hi,


    thank you for your responses.

    I did take a look at your previous work in this regard as well as the
    todo-list and it seems you've made quite some progress. That's great and
    I really look forward to seeing the final result, but to be quite honest
    with you, I'm afraid "I've added a couple of minor features to a module
    that was otherwise pretty well developed" doesn't really make for a good
    Bachelor's thesis.


Indeed, it may be too difficult, because it requires reading and understanding other people's code and the motivation behind design choices as well as everything about the method.

    What I would like my students to do is to develop one
    algorithm (or several algorithms) from beginning to end.

    So how about we implement some more standard clustering algorithms, for
    instance c-means, DIANA or fuzzy subspace clustering? I've searched for
    these and a couple of other related keywords in the pull requests and it
    appears no one is working on that yet. Correct me if I'm wrong.


Any new contribution needs to be motivated thoroughly. See http://scikit-learn.org/stable/faq.html for criteria. The methods you mention do appear in the mainstream literature, but it would be necessary to pick one of them and make a case for the utility of adding it to the code base. Acceptance criteria are so stringent because the cost of maintenance of the code base is already very high.

    I'll talk to my student, but my guess is he'll like the idea more than
    the multilayer perceptron.


Maybe your student can apply to one of the Google Summer of Code 2015 projects?

Michael



    On 2015/3/16 at 07:54 AM, Raghav R V wrote:
    > Also there is a PR by Andy working towards completing the same (MLP)
    > here - https://github.com/scikit-learn/scikit-learn/pull/3939
    >
    > BTW, that PR does have a nice todo list, which you might want to
    > take a look at :)
    >
    >
    >
    > R
    >
    > On Mon, Mar 16, 2015 at 2:39 AM, Joel Nothman
    <[email protected] <mailto:[email protected]>> wrote:
    >> I think #3306 (Extreme Learning Machines) needs review, and
    >> after that's merged, focus should return to the MLP PR. I've not been
    >> following either of those PRs extremely closely, but I gather that both
    >> are quite mature, but not small items for review.
    >>
    >> On 16 March 2015 at 07:53, Michael Eickenberg
    <[email protected] <mailto:[email protected]>>
    >> wrote:
    >>> Maybe others can comment on the status of this PR and to what
    >>> extent help may be needed to finish it?
    >>>
    >>> Michael
    >>>
    >>> On Sun, Mar 15, 2015 at 9:47 PM, Michael Eickenberg
    >>> <[email protected]
    <mailto:[email protected]>> wrote:
    >>>> Dear Patrick,
    >>>>
    >>>> there is an almost finished pull request for multilayer
    >>>> perceptrons from last year's GSoC by Issam Laradji:
    >>>> https://github.com/scikit-learn/scikit-learn/pull/3204
    >>>>
    >>>> Michael
    >>>>
    >>>> On Sun, Mar 15, 2015 at 8:57 PM, Patrick Urbanke
    >>>> <[email protected]
    <mailto:[email protected]>> wrote:
    >>>>> Hello,
    >>>>>
    >>>>>
    >>>>> I'm writing, because I would like to contribute a multilayer
    >>>>> perceptron module to scikit-learn. On your website it says that I
    >>>>> should contact you to avoid duplicating work, so here I am.
    >>>>>
    >>>>> I'm a research associate and PhD candidate at the University of
    >>>>> Göttingen, Germany. All of my research is related to machine
    >>>>> learning and I often use scikit-learn to benchmark my own algorithms.
    >>>>> I also use scikit-learn for teaching, so thank you all for your
    >>>>> great work.
    >>>>>
    >>>>> I've noticed that scikit-learn still lacks a multilayer
    >>>>> perceptron. Since this is a very popular algorithm, I've decided that
    >>>>> it would be a good idea to have one of my students develop such a
    >>>>> module for his Bachelor's thesis under my supervision. He is very
    >>>>> talented and I have no doubt that he can do it. Also, he can build on
    >>>>> some code I have already written.
    >>>>>
    >>>>> Here are the functionalities we would implement:
    >>>>> - Classifier and regressor
    >>>>> - Trained using SGD with minibatch updating
    >>>>> - One hidden layer
    >>>>> - Different activation functions (sigmoid, tanh, Gaussian RBM,
    >>>>> multiquadric RBM, linear) and the ability to mix them (so
    >>>>> you could have a neural network with 5 sigmoid functions, 10 Gaussian
    >>>>> RBM and 5 multiquadric RBM)
    >>>>> - L2 regularization
    >>>>>
    >>>>> Nice to have:
    >>>>> - Support for scipy sparse matrices
    >>>>>
    >>>>> We would develop the main functionalities in C++ and then
    >>>>> write an interface using Cython. Obviously, we would adhere to the
    >>>>> coding guidelines
    >>>>> (http://scikit-learn.org/stable/developers/index.html#coding-guidelines).
    >>>>>
    >>>>> Anything else we should consider?
    >>>>>
    >>>>>
    >>>>> Greetings,
    >>>>> Patrick Urbanke
    >>>>>
    >>>>>
    >>>>>
    >>>>>
    
------------------------------------------------------------------------------
    >>>>> Dive into the World of Parallel Programming The Go Parallel
    Website,
    >>>>> sponsored
    >>>>> by Intel and developed in partnership with Slashdot Media,
    is your hub
    >>>>> for all
    >>>>> things parallel software development, from weekly thought
    leadership
    >>>>> blogs to
    >>>>> news, videos, case studies, tutorials and more. Take a look
    and join the
    >>>>> conversation now. http://goparallel.sourceforge.net/
    >>>>> _______________________________________________
    >>>>> Scikit-learn-general mailing list
    >>>>> [email protected]
    <mailto:[email protected]>
    >>>>>
    https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
    >>>>
    >>>
    >>>
    >>>
    
    >>>
    >>
    >>
    
    >>
    >
    


    




