Re: [Scikit-learn-general] (Deep learning) pre-proposal for (GSOC) 2013

Frédéric Bastien Thu, 02 May 2013 09:35:42 -0700

Sorry, but I'm not a DL expert and can't make such a recommendation.

Maybe someone else here can, but you can also ask on the pylearn2 mailing
list.


Fred


On Thu, May 2, 2013 at 11:12 AM, Issam <issamo...@gmail.com> wrote:

>  Hi,
>
> Thanks a lot for the comment. I hope this doesn't open a new thread :)
> I'm pretty new to using mailing lists.
>
> You are right that I'm underestimating the development time needed to craft
> efficient, usable DL algorithms.
>
> So I would like to ask your opinion: which deep models do you recommend I
> focus on within the given time frame?
>
> Thanks a lot,
> yours truly
> --Issam
>
> On 5/2/2013 5:20 PM, Frédéric Bastien wrote:
>
>    Hi,
>
>  I have no doubt you are a great programmer, but even people in my lab,
> which specializes in deep learning, wouldn't be able to do the full list
> while respecting the scikit-learn code quality, documentation and
> performance/efficiency level.
>
>  I think you should keep 1 or 2 deep models. Just look at the time it took
> for the MLP and RBM PRs to be done. I don't expect yours to take less time.
>
>  Other people mentioned the problem of usability of deep learning
> techniques. I think you should focus on that instead of doing many models.
> That is what will differentiate your implementation from
> Theano/Pylearn2/DLT. For this, you could check James Bergstra's email on this
> list that talks about automatic hyper-parameter selection. I think that could
> solve a big part of the usability problem of deep learning. I suppose good
> documentation could do the rest.
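
To make "automatic hyper-parameter selection" concrete, here is a minimal
sketch using plain random search. It is only one simple strategy and is not
necessarily the method discussed in the email referred to above. Every name in
it (sample_config, toy_validation_error, random_search) and the search ranges
are assumptions made for illustration; the objective is a synthetic stand-in
for "train a model and return its validation error".

```python
import math
import random


def sample_config(rng):
    """Draw one hypothetical hyper-parameter setting."""
    return {
        # Learning rate sampled log-uniformly between 1e-4 and 1e-1.
        "learning_rate": 10 ** rng.uniform(-4, -1),
        # Hidden layer size picked from a small fixed menu.
        "n_hidden": rng.choice([32, 64, 128, 256]),
    }


def toy_validation_error(config):
    """Synthetic stand-in for 'train a model, return validation error'.

    By construction the minimum is at learning_rate=1e-2, n_hidden=128.
    """
    lr_term = (math.log10(config["learning_rate"]) + 2.0) ** 2
    size_term = (math.log2(config["n_hidden"]) - 7.0) ** 2
    return lr_term + 0.1 * size_term


def random_search(objective, n_trials=50, seed=0):
    """Return (best_error, best_config) over n_trials random draws."""
    rng = random.Random(seed)
    best_error, best_config = float("inf"), None
    for _ in range(n_trials):
        config = sample_config(rng)
        error = objective(config)
        if error < best_error:
            best_error, best_config = error, config
    return best_error, best_config


if __name__ == "__main__":
    error, config = random_search(toy_validation_error)
    print(error, config)
```

In a real estimator the objective would wrap fit-and-score on held-out data;
the point of the sketch is that the user only supplies a trial budget instead
of hand-tuning each value.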
>
>  Also, shared variables are a Theano-only thing. For GPU without Theano, you
> can look at NumbaPro, PyCUDA or PyOpenCL. scikit-learn doesn't want Theano
> as a dependency (and I understand that).
>
>  HTH
>
> Frédéric Bastien
>
> Disclaimer: I'm a core Theano developer. I have never contributed to
> scikit-learn, so give other people's comments on this list more weight
> than mine.
>
>
> On Thu, May 2, 2013 at 5:34 AM, Issam <issamo...@gmail.com> wrote:
>
>> Hi Vlad,
>>
>> Here is the updated proposal. I have added the current challenges and
>> proposed solutions to the abstract:
>>
>>
>> https://google-melange.appspot.com/gsoc/proposal/review/google/gsoc2013/issamou/1#
>>
>> Thank you!
>>
>> On 5/2/2013 11:34 AM, Vlad Niculae wrote:
>> > Sorry, I just saw that your submission is on Melange.
>> >
>> > I think the proposal could use some discussion of what issues might be
>> > faced.  Many people here have expressed concerns about including "deep
>> > stuff": the difficulty of having sensible defaults, and the difficulty of
>> > having a general-purpose efficient implementation that can be used on
>> > different domains without hacking the code.  In the very simple RBM,
>> > the example is still unsatisfactory because it is hard to show off the
>> > algorithm on too small a dataset.  This might be even trickier with
>> > deeper things.
>> >
>> > Tuning a good neural model requires some know-how and tricks; many
>> > times you need to look over the training process and measure
>> > statistics.  It would be useful to describe these kinds of difficulties
>> > and how we might be able to avoid them, what kind of hyperparameter
>> > heuristics / initialization should be used, etc.  It is early to go
>> > into it too deeply (pun intended) but I think the proposal can benefit
>> > from your embracing the skeptic side.
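
As an illustration of "looking over the training process and measuring
statistics", here is a minimal sketch that records the loss and gradient norm
at every epoch of a toy gradient-descent fit. The model (a one-dimensional
linear fit), the function names train_with_monitoring and looks_divergent, and
the divergence heuristic are all assumptions chosen to keep the example small;
none of this is scikit-learn API.

```python
import math


def train_with_monitoring(xs, ys, learning_rate=0.1, n_epochs=50):
    """Fit y = w*x + b by batch gradient descent, logging statistics.

    Returns (w, b, history), where history holds one dict per epoch.
    """
    w, b = 0.0, 0.0
    n = len(xs)
    history = []
    for epoch in range(n_epochs):
        residuals = [w * x + b - y for x, y in zip(xs, ys)]
        # Mean squared error and its gradient with respect to w and b.
        loss = sum(r * r for r in residuals) / n
        grad_w = 2.0 * sum(r * x for r, x in zip(residuals, xs)) / n
        grad_b = 2.0 * sum(residuals) / n
        history.append({
            "epoch": epoch,
            "loss": loss,
            "grad_norm": math.hypot(grad_w, grad_b),
        })
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b, history


def looks_divergent(history):
    """Heuristic: the loss ended up non-finite or above where it started."""
    first, last = history[0]["loss"], history[-1]["loss"]
    return not math.isfinite(last) or last > first


if __name__ == "__main__":
    xs, ys = [0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0]
    for lr in (0.1, 1.0):
        _, _, history = train_with_monitoring(xs, ys, learning_rate=lr)
        print(lr, looks_divergent(history))
```

A check like looks_divergent is the kind of thing an implementation could run
automatically to warn about a learning rate that is too large, instead of
leaving the user to inspect the curves by hand.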
>> >
>> > Hope this helps,
>> > Vlad
>> >
>> >
>> > On Thu, May 2, 2013 at 5:20 PM, Vlad Niculae <zephy...@gmail.com> wrote:
>> >> Hi Issam,
>> >>
>> >> The deadline is fast approaching.  How is your proposal going? Could
>> >> you share a version so we can give some feedback?
>> >>
>> >> Yours,
>> >> Vlad
>> >>
>> >> On Sat, Apr 20, 2013 at 3:57 AM, amir rahimi <noname01....@gmail.com> wrote:
>> >>> Sorry, I didn't see Andy's note ;)
>> >>>
>> >>>
>> >>> On Fri, Apr 19, 2013 at 11:23 PM, amir rahimi <noname01....@gmail.com> wrote:
>> >>>> Hi,
>> >>>> I recommend Theano if you want to use Python with a GPU for deep
>> >>>> learning. It is tightly integrated with numpy....
>> >>>>
>> >>>> Best,
>> >>>> Amir
>> >>>>
>> >>>>
>> >>>> On Thu, Apr 18, 2013 at 9:21 PM, Wei LI <kuant...@gmail.com> wrote:
>> >>>>> @Andy What do you mean by a "black box" algorithm? Does that mean
>> >>>>> something similar to pylearn2?
>> >>>>>
>> >>>>> @Issam, it seems to me that scalability is a key factor in training deep
>> >>>>> models and making them work. Do you have any suggestions for how to make
>> >>>>> it scalable while still fitting in the sklearn framework? I think sklearn
>> >>>>> cannot support GPUs easily. I'd like to know: is training a deep model at
>> >>>>> a mid-level scale (maybe like CIFAR?) painful on CPU only with numpy?
>> >>>>>
>> >>>>> Best,
>> >>>>> Wei
>> >>>>>
>> >>>>> On Fri, Apr 19, 2013 at 12:27 AM, Andreas Mueller
>> >>>>> <amuel...@ais.uni-bonn.de> wrote:
>> >>>>>> Hi Issam.
>> >>>>>> Thank you for your interest. Have you looked at the
>> >>>>>> MLP and RBM pull requests that are currently open?
>> >>>>>> How would your project relate to those?
>> >>>>>>
>> >>>>>> A real problem is that we don't want to replicate Theano;
>> >>>>>> we would rather have a somewhat "black box" algorithm that people can
>> >>>>>> apply....
>> >>>>>>
>> >>>>>> Cheers,
>> >>>>>> Andy
>> >>>>>>
>> >>>>>>
>> >>>>>> On 04/18/2013 06:07 PM, Issam wrote:
>> >>>>>>> Hi scikit,
>> >>>>>>>
>> >>>>>>> Here I am proposing to work on a deep learning topic for GSOC 2013.
>> >>>>>>> Deep learning is a relatively new research area that is progressing
>> >>>>>>> fast, with a lot of potential for contributions. It involves an
>> >>>>>>> interesting idea: trying to imitate the brain by using many levels
>> >>>>>>> (hidden layers) of processing, where the levels are in decreasing
>> >>>>>>> order of abstraction!
>> >>>>>>>
>> >>>>>>> In this project, I'm planning to work on each step carefully: first I
>> >>>>>>> will look into "Deep Boltzmann machines", then "Deep belief
>> >>>>>>> networks", "Deep auto-encoders", "Stacked denoising auto-encoders",
>> >>>>>>> and more. I could create a complete plan for this once I get your
>> >>>>>>> feedback :)
>> >>>>>>>
>> >>>>>>> I have been involved in quite a number of machine learning projects,
>> >>>>>>> from dealing with imbalanced datasets (software quality prediction)
>> >>>>>>> to XML classification, and from recognizing gender from handwriting
>> >>>>>>> to breast cancer prediction using mammograms. I'm in my second
>> >>>>>>> semester as a graduate student (MSc), and machine learning is my
>> >>>>>>> research area. My thesis will involve deep learning, which I will
>> >>>>>>> apply to bioinformatics and face recognition.
>> >>>>>>>
>> >>>>>>> I would be more than happy to work with a mentor on this!
>> >>>>>>>
>> >>>>>>> Thank you!
>> >>>>>>>
>> >>>>>>> Best regards,
>> >>>>>>> --Issam Laradji
>> >>>>>>>
>> >>>>>>>
>> >>>>>>>
>> ------------------------------------------------------------------------------
>> >>>>>>> Precog is a next-generation analytics platform capable of advanced
>> >>>>>>> analytics on semi-structured data. The platform includes APIs for
>> >>>>>>> building
>> >>>>>>> apps and a phenomenal toolset for data science. Developers can use
>> >>>>>>> our toolset for easy data analysis & visualization. Get a free
>> >>>>>>> account!
>> >>>>>>> http://www2.precog.com/precogplatform/slashdotnewsletter
>> >>>>>>> _______________________________________________
>> >>>>>>> Scikit-learn-general mailing list
>> >>>>>>> Scikit-learn-general@lists.sourceforge.net
>> >>>>>>> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
>> >>>>>>
>> >>>>>>
>> >>>>>>
>> >>>>>
>> >>>>>
>> >>>>>
>> >>>>> --
>> >>>>> LI, Wei
>> >>>>> Tsinghua/CUHK
>> >>>>> http://kuantkid.github.com/
>> >>>>>
>> >>>>>
>> >>>>>
>> >>>>>
>> >>>>>
>> >>>>
>> >>>>
>> >>>> --
>> >>>>
>> ----------------------------------------------------------------------
>> >>>> #include <stdio.h>
>> >>>> double d[]={9299037773.178347,2226415.983937417,307.0};
>> >>>> main(){d[2]--?d[0]*=4,d[1]*=5,main():printf((char*)d);}
>> >>>>
>> ----------------------------------------------------------------------
>> >>>
>> >>>
>> >>>
>> >>>
>> >>>
>> >>>
>> >
>>
>>
>>
>> ------------------------------------------------------------------------------
>> Introducing AppDynamics Lite, a free troubleshooting tool for Java/.NET
>> Get 100% visibility into your production application - at no cost.
>> Code-level diagnostics for performance bottlenecks with <2% overhead
>> Download for free and get started troubleshooting in minutes.
>> http://p.sf.net/sfu/appdyn_d2d_ap1
>> _______________________________________________
>> Scikit-learn-general mailing list
>> Scikit-learn-general@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
>>
>
>
>
