Re: [Scikit-learn-general] (Deep learning) pre-proposal for (GSOC) 2013

Issam Thu, 02 May 2013 08:13:16 -0700

Hi,

Thanks a lot for the comment. I hope this doesn't open a new thread :), I'm pretty new to using mailing lists.

You are right that I'm underestimating the development time needed to craft efficient, usable DL algorithms.

Given that, I would like to ask your opinion: which deep models do you recommend I focus on within the given time frame?


Thanks a lot,
yours truly
--Issam

On 5/2/2013 5:20 PM, Frédéric Bastien wrote:

Hi,

I have no doubt you are a great programmer, but even people in my lab, which specializes in deep learning, wouldn't be able to do the full list while respecting the scikit-learn code quality, documentation and performance/efficiency level.

I think you should keep 1 or 2 deep models. Just look at the time it took for the MLP and RBM PRs to be done. I don't expect less time for yours.

Other people mentioned the problem of usability of deep learning techniques. I think you should focus on that instead of doing many models. That is what will differentiate your implementation from Theano/Pylearn2/DLT. For this, you could check James Bergstra's email on this list that talks about automatic hyper-parameter selection. I think that could solve a big part of the usability problem of deep learning. I suppose good documentation could do the rest.
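
To make the idea of automatic hyper-parameter selection concrete, here is a minimal sketch of one simple form of it, random search, in plain Python. It is only an illustration: the parameter names, their ranges, and the stand-in validation_error function are hypothetical and come from neither scikit-learn nor any email on this list. A real objective would train a model and score it on held-out data.

```python
import math
import random


def sample_params(rng):
    """Draw one hypothetical configuration; names and ranges are illustrative."""
    return {
        # Log-uniform draw, since learning rates span orders of magnitude.
        "learning_rate": 10 ** rng.uniform(-4, -1),
        "n_hidden": rng.choice([64, 128, 256, 512]),
    }


def validation_error(params):
    """Stand-in objective (assumption): lower is better.

    A real objective would train a model with `params` and return its
    error on a validation set.
    """
    lr_term = (math.log10(params["learning_rate"]) + 2.0) ** 2
    size_term = abs(params["n_hidden"] - 256) / 256.0
    return lr_term + size_term


def random_search(objective, n_trials=50, seed=0):
    """Return the (params, score) pair with the lowest score over n_trials draws."""
    rng = random.Random(seed)
    best_params, best_score = None, float("inf")
    for _ in range(n_trials):
        params = sample_params(rng)
        score = objective(params)
        if score < best_score:
            best_params, best_score = params, score
    return best_params, best_score


if __name__ == "__main__":
    params, score = random_search(validation_error)
    print(params, score)
```

The point for usability is that the user supplies only a search budget (n_trials) rather than hand-tuned values; smarter strategies can replace the sampling step without changing that interface.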

Also, shared variables are a Theano-only thing. For GPU without Theano, you can look at NumbaPro, PyCUDA or PyOpenCL. scikit-learn doesn't want Theano as a dependency (and I understand that).


HTH

Frédéric Bastien

Disclaimer: I'm a core Theano developer. I never contributed to scikit-learn, so treat other people's comments from this list as more important than mine.

On Thu, May 2, 2013 at 5:34 AM, Issam <issamo...@gmail.com> wrote:


    Hi Vladn,

    Here is the updated proposal. I have added the current challenges and
    proposed solutions to the abstract:

    https://google-melange.appspot.com/gsoc/proposal/review/google/gsoc2013/issamou/1#

    Thank you!

    On 5/2/2013 11:34 AM, Vlad Niculae wrote:
    > Sorry, I just saw that your submission is on Melange.
    >
    > I think the proposal could use some discussion of what issues might be
    > faced.  Many people here have expressed concerns about including "deep
    > stuff": the difficulty of having sensible defaults, and the difficulty
    > of having a general-purpose efficient implementation that can be used
    > on different domains without hacking the code.  In the very simple RBM,
    > the example is still unsatisfactory because it is hard to show off the
    > algorithm on too small a dataset.  This might be even trickier with
    > deeper things.
    >
    > In tuning a good neural model some know-how and tricks are needed;
    > many times you need to look over the training process and measure
    > statistics.  It would be useful to describe these kinds of difficulties
    > and how we might be able to avoid them, what kind of hyperparameter
    > heuristics / initialization should be used, etc.  It is early to go
    > into it too deeply (pun intended) but I think the proposal can benefit
    > from your embracing the skeptical side.
    >
    > Hope this helps,
    > Vlad
    >
    >
    > On Thu, May 2, 2013 at 5:20 PM, Vlad Niculae <zephy...@gmail.com> wrote:
    >> Hi Issam,
    >>
    >> The deadline is fast approaching.  How is your proposal going?  Could
    >> you share a version so we can give some feedback?
    >>
    >> Yours,
    >> Vlad
    >>
    >>> On Sat, Apr 20, 2013 at 3:57 AM, amir rahimi <noname01....@gmail.com> wrote:
    >>> Sorry, I didn't see Andy's note ;)
    >>>
    >>>
    >>> On Fri, Apr 19, 2013 at 11:23 PM, amir rahimi <noname01....@gmail.com>
    >>> wrote:
    >>>> Hi,
    >>>> I recommend Theano if you want to use python with GPU for deep
    >>>> learning. It is tightly integrated with numpy....
    >>>>
    >>>> Best,
    >>>> Amir
    >>>>
    >>>>
    >>>> On Thu, Apr 18, 2013 at 9:21 PM, Wei LI <kuant...@gmail.com> wrote:
    >>>>> @Andy What do you mean by a "blackbox" algorithm? Does that mean
    >>>>> something similar to pylearn2?
    >>>>>
    >>>>> @Issam, it seems to me that scalability is a key factor in training
    >>>>> deep models and making them work. Do you have any suggestions on how
    >>>>> to make it scalable while still fitting in the sklearn framework? I
    >>>>> think sklearn cannot support GPUs easily. I would like to know: is
    >>>>> training a deep model at a mid-level scale (maybe like CIFAR?)
    >>>>> painful on CPU only with numpy?
    >>>>>
    >>>>> Best,
    >>>>> Wei
    >>>>>
    >>>>> On Fri, Apr 19, 2013 at 12:27 AM, Andreas Mueller
    >>>>> <amuel...@ais.uni-bonn.de> wrote:
    >>>>>> Hi Issam.
    >>>>>> Thank you for your interest. Have you looked at the
    >>>>>> MLP and RBM pull requests that are currently open?
    >>>>>> How would your project relate to those?
    >>>>>>
    >>>>>> A real problem is that we don't want to replicate theano
    >>>>>> and rather have a somewhat "black box" algorithm that people can
    >>>>>> apply....
    >>>>>>
    >>>>>> Cheers,
    >>>>>> Andy
    >>>>>>
    >>>>>>
    >>>>>> On 04/18/2013 06:07 PM, Issam wrote:
    >>>>>>> Hi scikit,
    >>>>>>>
    >>>>>>> Here I am proposing to work on a deep learning topic for GSOC 2013.
    >>>>>>> Deep learning is a relatively new research area that is progressing
    >>>>>>> fast, with a lot of potential for contributions. It involves an
    >>>>>>> interesting idea of trying to imitate the brain, as it uses many
    >>>>>>> levels (hidden layers) of processing, where the levels are in
    >>>>>>> decreasing order of abstraction!
    >>>>>>>
    >>>>>>> In this project, I'm planning to work on each step carefully: first
    >>>>>>> I will look into "Deep Boltzmann machines", then "Deep belief
    >>>>>>> networks", "Deep auto-encoders", "Stacked denoising auto-encoders",
    >>>>>>> and more. I could create a complete plan for this once I get your
    >>>>>>> feedback :)
    >>>>>>>
    >>>>>>> I have been involved in quite a number of machine learning projects,
    >>>>>>> from dealing with imbalanced datasets (software quality prediction)
    >>>>>>> to XML classification, and from recognizing gender from handwriting
    >>>>>>> to breast cancer prediction using mammograms. I'm in my second
    >>>>>>> semester as a graduate student (MSc), and machine learning is my
    >>>>>>> research area. My thesis would involve deep learning, which I will
    >>>>>>> apply to bioinformatics and face recognition.
    >>>>>>>
    >>>>>>> I would be more than happy to work with a mentor on this!
    >>>>>>>
    >>>>>>> Thank you!
    >>>>>>>
    >>>>>>> Best regards,
    >>>>>>> --Issam Laradji
    >>>>>>>
    >>>>>>>
    >>>>>>>
    
------------------------------------------------------------------------------
    >>>>>>> Precog is a next-generation analytics platform capable of
    advanced
    >>>>>>> analytics on semi-structured data. The platform includes
    APIs for
    >>>>>>> building
    >>>>>>> apps and a phenomenal toolset for data science. Developers
    can use
    >>>>>>> our toolset for easy data analysis & visualization. Get a free
    >>>>>>> account!
    >>>>>>> http://www2.precog.com/precogplatform/slashdotnewsletter
    >>>>>>> _______________________________________________
    >>>>>>> Scikit-learn-general mailing list
    >>>>>>> Scikit-learn-general@lists.sourceforge.net
    <mailto:Scikit-learn-general@lists.sourceforge.net>
    >>>>>>>
    https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
    >>>>>
    >>>>>
    >>>>>
    >>>>> --
    >>>>> LI, Wei
    >>>>> Tsinghua/CUHK
    >>>>> http://kuantkid.github.com/
    >>>>>
    >>>>
    >>>>
    >>>> --
    >>>>
    ----------------------------------------------------------------------
    >>>> #include <stdio.h>
    >>>> double d[]={9299037773.178347,2226415.983937417,307.0};
    >>>> main(){d[2]--?d[0]*=4,d[1]*=5,main():printf((char*)d);}
    >>>>
    ----------------------------------------------------------------------
    >


    
------------------------------------------------------------------------------
    Introducing AppDynamics Lite, a free troubleshooting tool for
    Java/.NET
    Get 100% visibility into your production application - at no cost.
    Code-level diagnostics for performance bottlenecks with <2% overhead
    Download for free and get started troubleshooting in minutes.
    http://p.sf.net/sfu/appdyn_d2d_ap1
    _______________________________________________
    Scikit-learn-general mailing list
    Scikit-learn-general@lists.sourceforge.net
    <mailto:Scikit-learn-general@lists.sourceforge.net>
    https://lists.sourceforge.net/lists/listinfo/scikit-learn-general

_______________________________________________
Scikit-learn-general mailing list
Scikit-learn-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
